There's something odd about this jar list. You said you are running HBase 0.90.1, yet you register a Cloudera HBase 0.20.3 jar. You are also registering an ancient ZooKeeper jar. It doesn't sound like you are actually running either HBase 0.90.1 or Pig 8 from the tip of the svn branch.
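If the cluster really is on HBase 0.90.1, the registered jars need to match it. A hypothetical sketch of what the registrations might look like instead (the jar names and paths are illustrative guesses for a 0.90.1 install, not taken from this thread):

```pig
-- Illustrative only: register the jars shipped with the HBase version that
-- is actually running, rather than an old 0.20.3 Cloudera build and a
-- mismatched ZooKeeper jar.
REGISTER /path/to/hbase-0.90.1/hbase-0.90.1.jar;
REGISTER /path/to/hbase-0.90.1/lib/zookeeper-3.3.2.jar;
REGISTER /path/to/hbase-0.90.1/lib/guava-r06.jar;
```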
D

On Tue, Mar 29, 2011 at 6:34 AM, Jameson Lopp <[email protected]> wrote:
> Just to follow up: I'm running Pig 0.8 from SVN. I finally got it working,
> though I'm not sure why this was required. I resolved the Class Not Found
> errors by manually registering the jars in my Pig script:
>
> REGISTER /path/to/pig_0.8/piggybank.jar;
> REGISTER /path/to/pig_0.8/lib/google-collections-1.0.jar;
> REGISTER /path/to/pig_0.8/lib/hbase-0.20.3-1.cloudera.jar;
> REGISTER /path/to/pig_0.8/lib/zookeeper-hbase-1329.jar;
>
> We had these jars placed in the hadoop /lib directory on all of our hadoop
> machines, and thus figured that they would get loaded for the map reduce
> jobs. Apparently this is not the case...
>
> --
> Jameson Lopp
> Software Engineer
> Bronto Software, Inc.
>
> On 03/25/2011 04:53 PM, Dmitriy Ryaboy wrote:
>> Pig 8 distribution or Pig 8 from svn?
>> You want the latter (soon-to-be Pig 0.8.1).
>>
>> D
>>
>> On Fri, Mar 25, 2011 at 1:02 PM, Jameson Lopp <[email protected]> wrote:
>>> Alright, I set up HBase 0.90.1 and Pig 0.8.0 and feel like everything is
>>> configured, but my pig script hangs after connecting to zookeeper... my
>>> map reduce job doesn't get scheduled and the process looks frozen.
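Pulling Jameson's workaround together: the fix amounts to shipping the dependency jars with the job itself, since jars dropped into the workers' hadoop /lib directory are not necessarily on the task classpath (one common cause: the TaskTracker JVMs were never restarted after the jars were copied in, so they never picked them up). A consolidated sketch of the working script, with the jar paths, table name, and columns taken from this thread:

```pig
-- REGISTER ships each jar to the cluster along with the job, so the map
-- tasks can deserialize HBase's TableSplit class at runtime.
REGISTER /path/to/pig_0.8/piggybank.jar;
REGISTER /path/to/pig_0.8/lib/google-collections-1.0.jar;
REGISTER /path/to/pig_0.8/lib/hbase-0.20.3-1.cloudera.jar;
REGISTER /path/to/pig_0.8/lib/zookeeper-hbase-1329.jar;

-- Load three columns from the 'track' table's open: column family.
data = LOAD 'hbase://track'
       USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
           'open:browser open:ip open:os', '-caching 1000')
       AS (browser:chararray, ipAddress:chararray, os:chararray);
```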
>>> Some debug output:
>>>
>>> 2011-03-25 15:51:07,344 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged MR job 285 into MR job 282
>>> 2011-03-25 15:51:07,344 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged MR job 293 into MR job 282
>>> 2011-03-25 15:51:07,344 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged MR job 313 into MR job 282
>>> 2011-03-25 15:51:07,345 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Requested parallelism of splitter: -1
>>> 2011-03-25 15:51:07,345 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 3 map-reduce splittees.
>>> 2011-03-25 15:51:07,345 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 3 out of total 4 MR operators.
>>> 2011-03-25 15:51:07,345 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 8
>>> 2011-03-25 15:51:07,423 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
>>> 2011-03-25 15:51:07,434 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
>>> 2011-03-25 15:51:11,014 [main] DEBUG org.apache.pig.impl.io.InterStorage - Pig Internal storage in use
>>> 2011-03-25 15:51:11,014 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up multi store job
>>> 2011-03-25 15:51:11,021 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
>>> 2011-03-25 15:51:11,022 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
>>> 2011-03-25 15:51:11,103 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
>>> 2011-03-25 15:51:11,504 [Thread-3] DEBUG org.apache.pig.impl.io.InterStorage - Pig Internal storage in use
>>> 2011-03-25 15:51:11,611 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
>>>
>>> [snipped] ...
>>> 2011-03-25 15:47:08,617 [Thread-3-SendThread] INFO org.apache.zookeeper.ClientCnxn - Attempting connection to server 10.202.61.184:2181
>>> 2011-03-25 15:47:08,625 [Thread-3-SendThread] INFO org.apache.zookeeper.ClientCnxn - Priming connection to java.nio.channels.SocketChannel[connected local=/10.220.25.162:34767 remote=10.202.61.184:2181]
>>> 2011-03-25 15:47:08,627 [Thread-3-SendThread] INFO org.apache.zookeeper.ClientCnxn - Server connection successful
>>>
>>> I found a few threads about people having problems connecting to hbase
>>> through zookeeper due to misconfiguration / network issues, but don't see
>>> any where it claims to connect successfully and then hangs... weird.
>>>
>>> --
>>> Jameson Lopp
>>> Software Engineer
>>> Bronto Software, Inc.
>>>
>>> On 03/25/2011 12:06 PM, Bill Graham wrote:
>>>> The Pig trunk and Pig 0.8.0 branch both require HBase >= 0.89 (see
>>>> PIG-1680). The Pig 0.8.0 release requires < 0.89 though, so you should
>>>> focus on that version of Pig. Or better yet, upgrade HBase to 0.90.1
>>>> if possible.
>>>>
>>>> On Fri, Mar 25, 2011 at 6:59 AM, Jameson Lopp <[email protected]> wrote:
>>>>> Running HBase 0.20.3-1.cloudera - I've tried running this with Pig 0.8
>>>>> from August 2010 and from trunk on March 25, 2011. Do I need to use an
>>>>> older version?
>>>>> My pig script is trying to load from hbase via this command:
>>>>>
>>>>> data = LOAD 'hbase://track' USING
>>>>> org.apache.pig.backend.hadoop.hbase.HBaseStorage('open:browser open:ip
>>>>> open:os', '-caching 1000') as (browser:chararray, ipAddress:chararray,
>>>>> os:chararray);
>>>>>
>>>>> But the job fails trying to load the data:
>>>>>
>>>>> Input(s):
>>>>> Failed to read data from "hbase://track"
>>>>>
>>>>> When I look at my map reduce job, it fails every time with a
>>>>> ClassNotFoundException:
>>>>>
>>>>> java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableSplit
>>>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:197)
>>>>>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>>>>>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>>>>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:586)
>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>>>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>>> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableSplit
>>>>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>>>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>>>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>>>>         at java.lang.Class.forName0(Native Method)
>>>>>         at java.lang.Class.forName(Class.java:247)
>>>>>         at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:907)
>>>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:185)
>>>>>         ... 5 more
>>>>>
>>>>> Now, perhaps this issue is better suited for a hadoop / map reduce /
>>>>> cloudera mailing list, but every node in my hadoop cluster has
>>>>> /usr/local/hadoop/lib/hbase-0.20.3-1.cloudera.jar, which includes the
>>>>> TableSplit class... so it seems to me that it should have no problem
>>>>> loading it.
>>>>>
>>>>> I've run out of ideas at this point - anyone have suggestions? Thanks!
>>>>>
>>>>> --
>>>>> Jameson Lopp
>>>>> Software Engineer
>>>>> Bronto Software, Inc.
