Thanks St.Ack, it solved the problem.

On Fri, Nov 12, 2010 at 7:41 PM, Stack <[email protected]> wrote:
> Fix your classpath. Add the google library. See
> http://hbase.apache.org/docs/r0.89.20100924/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description
> for more on classpath.
>
> St.Ack
>
> On Fri, Nov 12, 2010 at 5:07 AM, Shuja Rehman <[email protected]> wrote:
> > Hi
> >
> > I am trying to use the configureIncrementalLoad() function to handle the
> > total order partitioning, but it throws this exception:
> >
> > 10/11/12 05:02:21 INFO zookeeper.ClientCnxn: Opening socket connection to server /10.10.10.2:2181
> > 10/11/12 05:02:21 INFO zookeeper.ClientCnxn: Socket connection established to app4.hsd1.wa.comcast.net./10.10.10.2:2181, initiating session
> > 10/11/12 05:02:21 INFO zookeeper.ClientCnxn: Session establishment complete on server app4.hsd1.wa.comcast.net./10.10.10.2:2181, sessionid = 0x12c401bfdae0008, negotiated timeout = 40000
> > 10/11/12 05:02:21 INFO mapreduce.HFileOutputFormat: Looking up current regions for table org.apache.hadoop.hbase.client.hta...@21e554
> > 10/11/12 05:02:21 INFO mapreduce.HFileOutputFormat: Configuring 1 reduce partitions to match current region count
> > 10/11/12 05:02:21 INFO mapreduce.HFileOutputFormat: Writing partition information to hdfs://app4.hsd1.wa.comcast.net./user/root/partitions_1289566941504
> > Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/base/Preconditions
> >         at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.writePartitions(HFileOutputFormat.java:185)
> >         at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.configureIncrementalLoad(HFileOutputFormat.java:258)
> >         at ParserDriver.runJob(ParserDriver.java:162)
> >         at ParserDriver.main(ParserDriver.java:109)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >         at java.lang.reflect.Method.invoke(Method.java:597)
> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> > Caused by: java.lang.ClassNotFoundException: com.google.common.base.Preconditions
> >         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> >         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> >         ... 9 more
> > 10/11/12 05:02:21 INFO zookeeper.ZooKeeper: Session: 0x12c401bfdae0008 closed
> >
> > Here is the code:
> >
> > Configuration conf = HBaseConfiguration.create();
> > Job job = new Job(conf, "j");
> > HTable table = new HTable(conf, "mytab");
> >
> > FileInputFormat.setInputPaths(job, input);
> > job.setJarByClass(ParserDriver.class);
> > job.setMapperClass(MyParserMapper.class);
> > job.setInputFormatClass(XmlInputFormat.class);
> > job.setReducerClass(PutSortReducer.class);
> >
> > Path outPath = new Path(output);
> > FileOutputFormat.setOutputPath(job, outPath);
> >
> > job.setMapOutputValueClass(Put.class);
> > job.setMapOutputKeyClass(ImmutableBytesWritable.class);
> > HFileOutputFormat.configureIncrementalLoad(job, table);
> > TableMapReduceUtil.addDependencyJars(job);
> > job.waitForCompletion(true);
> >
> > I guess there are some jar files missing. If yes, then from where can I get these?
> >
> > Thanks
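
For anyone hitting the same NoClassDefFoundError: the classpath fix above means making the Guava jar (it ships in $HBASE_HOME/lib) visible to the driver JVM, typically by adding it to HADOOP_CLASSPATH before invoking "hadoop jar". Note that TableMapReduceUtil.addDependencyJars(job) only ships jars to the map and reduce tasks, while configureIncrementalLoad() runs client side, which is why the error above surfaces in the driver. A minimal fail-fast guard that could sit at the top of ParserDriver.main(); the jar location and environment variable are typical-setup assumptions, not details from the original post:

    // Inside ParserDriver (runJob() is the job setup shown in this thread;
    // its exact signature is in the original code, not shown here).
    public static void main(String[] args) throws Exception {
        // Guava must be on the *client* classpath, e.g.
        //   export HADOOP_CLASSPATH=$HBASE_HOME/lib/guava-<version>.jar
        // before running "hadoop jar". Probe for it up front so the job
        // fails with a clear message instead of a NoClassDefFoundError
        // thrown from deep inside HFileOutputFormat:
        try {
            Class.forName("com.google.common.base.Preconditions");
        } catch (ClassNotFoundException e) {
            System.err.println("Guava is missing from the client classpath; "
                + "add the guava jar from $HBASE_HOME/lib to HADOOP_CLASSPATH.");
            System.exit(1);
        }
        runJob(args);
    }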
> >
> > On Thu, Nov 11, 2010 at 12:57 AM, Stack <[email protected]> wrote:
> >
> >> On Wed, Nov 10, 2010 at 11:53 AM, Shuja Rehman <[email protected]> wrote:
> >> > Oh! I think you have not read the full post. The essay has 3 paragraphs :)
> >> >
> >> > Should I also need to add the following line?
> >> >
> >> > job.setPartitionerClass(TotalOrderPartitioner.class);
> >>
> >> You need to specify other than the default partitioner, so yes, the above
> >> seems necessary. (Be aware that if there is only one reducer, all may
> >> appear to work even though your partitioner is bad... it's when you have
> >> multiple reducers that a bad partitioner will show.)
> >>
> >> > Which book? OReilly.Hadoop.The.Definitive.Guide.Jun.2009?
> >>
> >> Yes. Or the 2nd edition, October 2010.
> >>
> >> St.Ack
> >>
> >> > On Thu, Nov 11, 2010 at 12:49 AM, Stack <[email protected]> wrote:
> >> >
> >> >> Which two questions? (You wrote an essay that looked like one big
> >> >> question -- smile.)
> >> >> St.Ack
> >> >>
> >> >> On Wed, Nov 10, 2010 at 11:44 AM, Shuja Rehman <[email protected]> wrote:
> >> >> > Yeah, I tried it and it did not fail. Can you answer the other 2
> >> >> > questions as well?
> >> >> >
> >> >> > On Thu, Nov 11, 2010 at 12:15 AM, Stack <[email protected]> wrote:
> >> >> >
> >> >> >> All below looks reasonable (I did not do a detailed review of your
> >> >> >> code posting). Have you tried it? Did it fail?
> >> >> >> St.Ack
> >> >> >>
> >> >> >> On Wed, Nov 10, 2010 at 11:12 AM, Shuja Rehman <[email protected]> wrote:
> >> >> >> > On Wed, Nov 10, 2010 at 9:20 PM, Stack <[email protected]> wrote:
> >> >> >> >
> >> >> >> >> What do you need? Bulk upload, in the scheme of things, is a
> >> >> >> >> well-documented feature. It's also one that has had some exercise
> >> >> >> >> and is known to work well. For the 0.89 release and trunk,
> >> >> >> >> documentation is here:
> >> >> >> >> http://hbase.apache.org/docs/r0.89.20100924/bulk-loads.html
> >> >> >> >> The unit test you refer to below is good for figuring how to run
> >> >> >> >> a job. (Bulk upload was redone for 0.89/trunk and is much improved
> >> >> >> >> over what was available in 0.20.x.)
> >> >> >> >>
> >> >> >> >
> >> >> >> > I need to load data into HBase using HFiles.
> >> >> >> >
> >> >> >> > Ok, let me tell you what I understand from all these things.
> >> >> >> > Basically, there are two ways to bulk load into HBase:
> >> >> >> >
> >> >> >> > 1. Using command line tools (importtsv, completebulkload)
> >> >> >> > 2. A MapReduce job using HFileOutputFormat
> >> >> >> >
> >> >> >> > At the moment, I generate the HFiles using HFileOutputFormat and
> >> >> >> > load these files into HBase using the completebulkload command
> >> >> >> > line tool. Here is my basic code skeleton. Correct me if I do
> >> >> >> > anything wrong.
> >> >> >> >
> >> >> >> > Configuration conf = new Configuration();
> >> >> >> > Job job = new Job(conf, "myjob");
> >> >> >> >
> >> >> >> > FileInputFormat.setInputPaths(job, input);
> >> >> >> > job.setJarByClass(ParserDriver.class);
> >> >> >> > job.setMapperClass(MyParserMapper.class);
> >> >> >> > job.setNumReduceTasks(1);
> >> >> >> > job.setInputFormatClass(XmlInputFormat.class);
> >> >> >> > job.setOutputFormatClass(HFileOutputFormat.class);
> >> >> >> > job.setOutputKeyClass(ImmutableBytesWritable.class);
> >> >> >> > job.setOutputValueClass(Put.class);
> >> >> >> > job.setReducerClass(PutSortReducer.class);
> >> >> >> >
> >> >> >> > Path outPath = new Path(output);
> >> >> >> > FileOutputFormat.setOutputPath(job, outPath);
> >> >> >> > job.waitForCompletion(true);
> >> >> >> >
> >> >> >> > and here is the mapper skeleton:
> >> >> >> >
> >> >> >> > public class MyParserMapper extends
> >> >> >> >     Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
> >> >> >> >     while (true) {
> >> >> >> >         Put put = new Put(rowId);
> >> >> >> >         put.add(...);
> >> >> >> >         context.write(rowId, put);
> >> >> >> >     }
> >> >> >> > }
> >> >> >> >
> >> >> >> > The link says:
> >> >> >> >
> >> >> >> > "In order to function efficiently, HFileOutputFormat must be
> >> >> >> > configured such that each output HFile fits within a single
> >> >> >> > region. In order to do this, jobs use Hadoop's
> >> >> >> > TotalOrderPartitioner class to partition the map output into
> >> >> >> > disjoint ranges of the key space, corresponding to the key ranges
> >> >> >> > of the regions in the table."
> >> >> >> >
> >> >> >> > Now, according to my configuration above, where do I need to set
> >> >> >> > TotalOrderPartitioner? Should I also add the following line?
> >> >> >> >
> >> >> >> > job.setPartitionerClass(TotalOrderPartitioner.class);
> >> >> >> >
> >> >> >> >> On TotalOrderPartitioner: this is a partitioner class from
> >> >> >> >> Hadoop. The MR partitioner -- the class that dictates which
> >> >> >> >> reducers get what map outputs -- is pluggable. The default
> >> >> >> >> partitioner does a hash of the output key to figure which
> >> >> >> >> reducer. This won't work if you are looking to have your HFile
> >> >> >> >> output totally sorted.
> >> >> >> >>
> >> >> >> >> If you can't figure out what it's about, I'd suggest you check
> >> >> >> >> out the Hadoop book where it gets a good explication.
> >> >> >> >>
> >> >> >> > Which book? OReilly.Hadoop.The.Definitive.Guide.Jun.2009?
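
Two points are worth pinning down here. First, judging from the log output at the top of this thread ("Configuring 1 reduce partitions to match current region count", "Writing partition information to ..."), HFileOutputFormat.configureIncrementalLoad() installs the TotalOrderPartitioner, writes the partitions file, and matches the reducer count to the table's region count itself, so the manual setPartitionerClass()/setNumReduceTasks(1) calls belong only to the do-it-yourself path sketched above. Second, a compilable version of the mapper skeleton might look as follows; the column family, qualifier, and row-key derivation are illustrative placeholders, not anything specified in the thread:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MyParserMapper
        extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

        // Placeholder schema -- substitute the table's real family/qualifier.
        private static final byte[] FAMILY = Bytes.toBytes("cf");
        private static final byte[] QUALIFIER = Bytes.toBytes("record");

        @Override
        protected void map(LongWritable offset, Text xmlRecord, Context context)
                throws IOException, InterruptedException {
            // Deriving the row key from the parsed XML is application
            // specific; reusing the raw record keeps the sketch runnable.
            byte[] rowId = Bytes.toBytes(xmlRecord.toString());
            Put put = new Put(rowId);
            put.add(FAMILY, QUALIFIER, Bytes.toBytes(xmlRecord.toString()));
            // The map output key must be the row key, wrapped in
            // ImmutableBytesWritable, so TotalOrderPartitioner can route each
            // row to the reducer covering its region's key range.
            context.write(new ImmutableBytesWritable(rowId), put);
        }
    }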
> >> >> >> >> On incremental upload, the doc suggests you look at the output
> >> >> >> >> of the LoadIncrementalHFiles command. Have you done that? You
> >> >> >> >> run the command and it'll add in whatever is ready for loading.
> >> >> >> >
> >> >> >> > I just use the command line tool for bulk upload, but I have not
> >> >> >> > yet looked at the LoadIncrementalHFiles class to do it through a
> >> >> >> > program.
> >> >> >> >
> >> >> >> >> St.Ack
> >> >> >> >>
> >> >> >> >> On Wed, Nov 10, 2010 at 6:47 AM, Shuja Rehman <[email protected]> wrote:
> >> >> >> >> > Hey Community,
> >> >> >> >> >
> >> >> >> >> > Well... it seems that nobody has experience with the bulk load
> >> >> >> >> > option. I have found one class which might help in writing the
> >> >> >> >> > code for it:
> >> >> >> >> >
> >> >> >> >> > https://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
> >> >> >> >> >
> >> >> >> >> > From this, you can get the idea of how to write a MapReduce job
> >> >> >> >> > that outputs in HFile format. But there is a little confusion
> >> >> >> >> > about these two things:
> >> >> >> >> >
> >> >> >> >> > 1. TotalOrderPartitioner
> >> >> >> >> > 2. configureIncrementalLoad
> >> >> >> >> >
> >> >> >> >> > Does anybody have an idea of how these work and how to
> >> >> >> >> > configure them for the job?
> >> >> >> >> >
> >> >> >> >> > Thanks
> >> >> >> >> >
> >> >> >> >> > On Wed, Nov 10, 2010 at 1:02 AM, Shuja Rehman <[email protected]> wrote:
> >> >> >> >> >
> >> >> >> >> >> Hi
> >> >> >> >> >>
> >> >> >> >> >> I am trying to investigate the bulk load option as described
> >> >> >> >> >> in the following link:
> >> >> >> >> >>
> >> >> >> >> >> http://hbase.apache.org/docs/r0.89.20100621/bulk-loads.html
> >> >> >> >> >>
> >> >> >> >> >> Does anybody have sample code, or have you used it before?
> >> >> >> >> >> Can it be helpful to insert data into an existing table? In
> >> >> >> >> >> my scenario, I have one table with 1 column family in which
> >> >> >> >> >> data will be inserted every 15 minutes.
> >> >> >> >> >>
> >> >> >> >> >> Kindly share your experiences.
> >> >> >> >> >>
> >> >> >> >> >> Thanks

-- 
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>
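
One loose end in the exchange above is how to drive LoadIncrementalHFiles from code rather than through the completebulkload tool. A minimal sketch, assuming the 0.89/0.90-era API of org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles (a Configuration constructor plus a doBulkLoad(Path, HTable) method); the class name and the argument handling are illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

    public class BulkLoadRunner {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "mytab");
            // args[0] is the HFileOutputFormat output directory written by
            // the MapReduce job. doBulkLoad moves each HFile into the region
            // serving its key range -- the same work the completebulkload
            // tool does from the command line.
            new LoadIncrementalHFiles(conf).doBulkLoad(new Path(args[0]), table);
        }
    }

Run after each MapReduce job completes; because the HFiles are moved into the existing table's regions, this would fit the insert-every-15-minutes scenario described at the start of the thread.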
