Hi,
I am in the process of writing my first bulk loading job. I use Cloudera
CDH3U3 with HBase 0.90.4.
After running the job I can see the HFiles it created, but there
are no entries in HBase: in the hbase shell, count 'uu_bulk' returns 0.
Here is my job configuration:
Configuration conf = HBaseConfiguration.create();
Job job = new Job(conf, getClass().getSimpleName());
job.setJarByClass(UuPushMapReduceJobFactory.class);
job.setMapperClass(UuPushMapper.class);
job.setMapOutputKeyClass(ImmutableBytesWritable.class);
job.setMapOutputValueClass(KeyValue.class);
job.setOutputFormatClass(HFileOutputFormat.class);
String path = uuAggregationContext.getUuInputPath();
String outputPath = "/bulk_loading_hbase/output/" + System.currentTimeMillis();
LOG.info("path = " + path);
LOG.info("outputPath = " + outputPath);
final String tableName = "uu_bulk";
LOG.info("hbase tableName: " + tableName);
createRegions(conf , Bytes.toBytes(tableName));
FileInputFormat.addInputPath(job, new Path(path));
FileOutputFormat.setOutputPath(job, new Path(outputPath));
HFileOutputFormat.configureIncrementalLoad(job, new HTable(conf, tableName));
//=====================================================================================
The reducer log ends with:
2012-08-28 11:53:40,643 INFO org.apache.hadoop.mapred.Merger: Down to
the last merge-pass, with 10 segments left of total size: 222885367
bytes
2012-08-28 11:53:54,137 INFO
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat:
Writer=hdfs://hdn16/bulk_loading_hbase/output/1346194117045/_temporary/_attempt_201208260949_0026_r_000005_0/d/3908303205246218823,
wrote=268435455
2012-08-28 11:54:11,966 INFO org.apache.hadoop.mapred.Task:
Task:attempt_201208260949_0026_r_000005_0 is done. And is in the
process of commiting
2012-08-28 11:54:12,975 INFO org.apache.hadoop.mapred.Task: Task
attempt_201208260949_0026_r_000005_0 is allowed to commit now
2012-08-28 11:54:13,007 INFO
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved
output of task 'attempt_201208260949_0026_r_000005_0' to
/bulk_loading_hbase/output/1346194117045
2012-08-28 11:54:13,009 INFO org.apache.hadoop.mapred.Task: Task
'attempt_201208260949_0026_r_000005_0' done.
2012-08-28 11:54:13,010 INFO
org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs'
truncater with mapRetainSize=-1 and reduceRetainSize=-1
As I understand it, the HFiles were written
to /bulk_loading_hbase/output/1346194117045, but I don't see any activity
related to moving the HFiles into HBase.
What am I doing wrong? What should I do to get the results written to
HBase?
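From what I can tell from the bulk load documentation, HFileOutputFormat only writes the HFiles, and a separate "completebulkload" step (which I believe is backed by the LoadIncrementalHFiles class) is what moves them into the table. Here is a minimal sketch of how I would build that command for my output directory. The class name CompleteBulkLoadCommand and the jar path are my own assumptions for illustration, not something taken from my job or from HBase:

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: builds the argument list for HBase's completebulkload tool.
// The class name and the jar path used in main() are hypothetical.
class CompleteBulkLoadCommand {

    // Returns: hadoop jar <hbaseJar> completebulkload <hfileDir> <tableName>
    static List<String> build(String hbaseJar, String hfileDir, String tableName) {
        return Arrays.asList("hadoop", "jar", hbaseJar, "completebulkload", hfileDir, tableName);
    }

    public static void main(String[] args) {
        List<String> cmd = build(
                "/usr/lib/hbase/hbase.jar",                  // assumed jar location, adjust to the install
                "/bulk_loading_hbase/output/1346194117045",  // job output directory from the reducer log
                "uu_bulk");                                  // target table
        // Print the command for review; it could instead be launched with
        // new ProcessBuilder(cmd).inheritIO().start().
        System.out.println(String.join(" ", cmd));
    }
}
```

If this is right, the directory passed in should be the job output directory itself (the one containing the "d" family subdirectory seen in the log), and the command would be run only after the MapReduce job has finished successfully.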
Thanks in advance
Oleg.