Re: Looking for an example of using HBaseStorage with Pig
I moved forward, but I am still not able to do anything. Here's what I am doing:

grunt> register /usr/local/hadoop/pigtmp/pig-0.6.0/lib/ElephantBird.jar
grunt> register /usr/local/hadoop/pigtmp/pig-0.6.0/lib/hbase-0.20.0.jar
grunt> register /usr/local/hadoop/pigtmp/pig-0.6.0/lib/hbase-0.20.0-test.jar
grunt> register /usr/local/hadoop/pigtmp/pig-0.6.0/lib/zookeeper-hbase-1329.jar
grunt> a = load 'hbase://silk1' USING com.twitter.elephantbird.pig.load.HBaseLoader('f1:destination_port') AS (destination_port);
2010-06-30 13:44:02,583 [main] INFO com.twitter.elephantbird.pig.load.HBaseLoader - no-arg constructor
2010-06-30 13:44:02,602 [main] INFO com.twitter.elephantbird.pig.load.HBaseLoader - no-arg constructor
grunt> dump a;
2010-06-30 13:44:05,321 [main] INFO com.twitter.elephantbird.pig.load.HBaseLoader - no-arg constructor
2010-06-30 13:44:05,337 [main] INFO com.twitter.elephantbird.pig.load.HBaseLoader - no-arg constructor
2010-06-30 13:44:05,452 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2010-06-30 13:44:05,452 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2010-06-30 13:44:08,361 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2010-06-30 13:44:08,394 [Thread-12] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2010-06-30 13:44:08,597 [Thread-12] INFO com.twitter.elephantbird.pig.load.HBaseLoader - no-arg constructor
2010-06-30 13:44:08,597 [Thread-12] INFO com.twitter.elephantbird.pig.load.HBaseLoader - tablename: hbase://silk1
2010-06-30 13:44:08,651 [Thread-12] ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper - no clientPort found in zoo.cfg
2010-06-30 13:44:08,652 [Thread-12] ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper - no clientPort found in zoo.cfg
2010-06-30 13:44:08,652 [Thread-12] ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper - no clientPort found in zoo.cfg
2010-06-30 13:44:08,653 [Thread-12] ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper - no clientPort found in zoo.cfg
2010-06-30 13:44:09,390 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Cannot get jobid for this job
2010-06-30 13:44:09,391 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2010-06-30 13:44:09,391 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map reduce job(s) failed!
2010-06-30 13:44:09,402 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed to produce result in: hdfs://fchadoop01:54310/tmp/temp642740681/tmp668990886
2010-06-30 13:44:09,403 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2010-06-30 13:44:09,405 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hbase://silk1
Details at logfile: /usr/local/hadoop/pigtmp/pig-0.6.0/pig_1277919819066.log
grunt>

Here's the log file:

Backend error message during job submission
---
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hbase://silk1
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:269)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Could not read quorum servers from zoo.cfg
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.init(ZooKeeperWrapper.java:81)
    at org.apache.hadoop.hbase.client.HConnectionManager$ClientZKWatcher.getZooKeeperWrapper(HConnectionManager.java:199)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getZooKeeperWrapper(HConnectionManager.java:878)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:894)
    at
Re: Looking for an example of using HBaseStorage with Pig
Thank you for trying to help me out. Here's the error that's in my log file:

ERROR 2998: Unhandled internal error. org/apache/pig/Slicer
java.lang.NoClassDefFoundError: org/apache/pig/Slicer
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:296)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:422)
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:452)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.NonEvalFuncSpec(QueryParser.java:5087)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1434)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
    at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
    at org.apache.pig.Main.main(Main.java:357)
Caused by: java.lang.ClassNotFoundException: org.apache.pig.Slicer
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    ... 32 more

On Mon, Jun 28, 2010 at 7:56 PM, Dmitriy Ryaboy dvrya...@gmail.com wrote:

Pavel,

What does the log say? I am guessing you need to a) make sure that all the hbase config stuff is on the classpath and b) load 'hbase://silk1' (no host:port)

-D

On Mon, Jun 28, 2010 at 10:37 AM, Pavel Gutin pavelgu...@gmail.com wrote:

My apologies, I pasted the wrong line. I was testing to see if Pig was able to locate the JAR by misspelling the name on purpose. Here's the correct error.

grunt> a = load 'hbase://localhost:6/silk1' USING com.twitter.elephantbird.pig.load.HBaseLoader('f1:destination_port') AS (destination_port);
2010-06-28 13:19:01,288 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/pig/Slicer
Details at logfile: /usr/local/hadoop/pigtmp/pig-0.7.0/pig_1277744862785.log
grunt>

On Mon, Jun 28, 2010 at 1:34 PM, Pavel Gutin pavelgu...@gmail.com wrote:

This seems like it might work for me. I downloaded it, compiled it, and added the JAR to PIG_CLASSPATH. However, when I try to run the following command, I get an error:

grunt> a = load 'hbase://myTable' USING co.twitter.elephantbird.pig.load.HBaseLoader('f1:col1') AS (col1);
2010-06-28 13:13:59,607 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve co.twitter.elephantbird.pig.load.HBaseLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Details at logfile: /usr/local/hadoop/pigtmp/pig-0.7.0/pig_1277744862785.log
grunt>

I have a feeling I am not referencing the table the right way.

On Mon, Jun 28, 2010 at 11:43 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote:

There's an HBase LoadFunc that works with 0.6 in Elephant-Bird.
http://github.com/kevinweil/elephant-bird

There are slides here that show
Re: Looking for an example of using HBaseStorage with Pig
My apologies, I pasted the wrong line. I was testing to see if Pig was able to locate the JAR by misspelling the name on purpose. Here's the correct error.

grunt> a = load 'hbase://localhost:6/silk1' USING com.twitter.elephantbird.pig.load.HBaseLoader('f1:destination_port') AS (destination_port);
2010-06-28 13:19:01,288 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/pig/Slicer
Details at logfile: /usr/local/hadoop/pigtmp/pig-0.7.0/pig_1277744862785.log
grunt>

On Mon, Jun 28, 2010 at 1:34 PM, Pavel Gutin pavelgu...@gmail.com wrote:

This seems like it might work for me. I downloaded it, compiled it, and added the JAR to PIG_CLASSPATH. However, when I try to run the following command, I get an error:

grunt> a = load 'hbase://myTable' USING co.twitter.elephantbird.pig.load.HBaseLoader('f1:col1') AS (col1);
2010-06-28 13:13:59,607 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve co.twitter.elephantbird.pig.load.HBaseLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Details at logfile: /usr/local/hadoop/pigtmp/pig-0.7.0/pig_1277744862785.log
grunt>

I have a feeling I am not referencing the table the right way.

On Mon, Jun 28, 2010 at 11:43 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote:

There's an HBase LoadFunc that works with 0.6 in Elephant-Bird.
http://github.com/kevinweil/elephant-bird

There are slides here that show usage:
http://squarecog.wordpress.com/2010/05/20/pig-hbase-hadoop-and-twitter-hug-talk-slides/

-D

On Mon, Jun 28, 2010 at 7:59 AM, Pavel Gutin pavelgu...@gmail.com wrote:

I am trying to get Pig to query my HBase table, but I cannot find any examples on the web. Can anyone provide me with a simple example? The best I could find so far was a little blurb on the following page:
http://wiki.apache.org/pig/PigMultiQueryPerformanceSpecification
but that didn't help much.

Thanks in advance.

- Pavel
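
A note on the org/apache/pig/Slicer error above: the log path shows pig-0.7.0, while the Elephant-Bird loader is described in this thread as working with 0.6. One plausible reading, offered as an assumption rather than a confirmed diagnosis, is that the Pig jar in use does not contain org.apache.pig.Slicer. A minimal shell sketch for checking that follows. The pig.jar file name and the PIG_JAR variable are hypothetical; the helper relies only on `unzip -l`, since a jar is a zip archive.

```shell
# Return success if the given jar lists the given class file.
# A jar is a zip archive, so `unzip -l` can list its entries.
jar_has_class() {
  unzip -l "$1" 2>/dev/null | grep -qF "$2"
}

# Hypothetical path: point this at the Pig jar your grunt shell actually loads.
PIG_JAR="${PIG_JAR:-/usr/local/hadoop/pigtmp/pig-0.6.0/pig.jar}"

if jar_has_class "$PIG_JAR" 'org/apache/pig/Slicer.class'; then
  echo "org.apache.pig.Slicer is present in $PIG_JAR"
else
  echo "org.apache.pig.Slicer not found in $PIG_JAR"
fi
```

If the class is reported as not found, the loader was probably built against a different Pig version than the one running it.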
Re: Looking for an example of using HBaseStorage with Pig
Pavel,

What does the log say? I am guessing you need to a) make sure that all the hbase config stuff is on the classpath and b) load 'hbase://silk1' (no host:port)

-D

On Mon, Jun 28, 2010 at 10:37 AM, Pavel Gutin pavelgu...@gmail.com wrote:

My apologies, I pasted the wrong line. I was testing to see if Pig was able to locate the JAR by misspelling the name on purpose. Here's the correct error.

grunt> a = load 'hbase://localhost:6/silk1' USING com.twitter.elephantbird.pig.load.HBaseLoader('f1:destination_port') AS (destination_port);
2010-06-28 13:19:01,288 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/pig/Slicer
Details at logfile: /usr/local/hadoop/pigtmp/pig-0.7.0/pig_1277744862785.log
grunt>

On Mon, Jun 28, 2010 at 1:34 PM, Pavel Gutin pavelgu...@gmail.com wrote:

This seems like it might work for me. I downloaded it, compiled it, and added the JAR to PIG_CLASSPATH. However, when I try to run the following command, I get an error:

grunt> a = load 'hbase://myTable' USING co.twitter.elephantbird.pig.load.HBaseLoader('f1:col1') AS (col1);
2010-06-28 13:13:59,607 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve co.twitter.elephantbird.pig.load.HBaseLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Details at logfile: /usr/local/hadoop/pigtmp/pig-0.7.0/pig_1277744862785.log
grunt>

I have a feeling I am not referencing the table the right way.

On Mon, Jun 28, 2010 at 11:43 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote:

There's an HBase LoadFunc that works with 0.6 in Elephant-Bird.
http://github.com/kevinweil/elephant-bird

There are slides here that show usage:
http://squarecog.wordpress.com/2010/05/20/pig-hbase-hadoop-and-twitter-hug-talk-slides/

-D

On Mon, Jun 28, 2010 at 7:59 AM, Pavel Gutin pavelgu...@gmail.com wrote:

I am trying to get Pig to query my HBase table, but I cannot find any examples on the web. Can anyone provide me with a simple example? The best I could find so far was a little blurb on the following page:
http://wiki.apache.org/pig/PigMultiQueryPerformanceSpecification
but that didn't help much.

Thanks in advance.

- Pavel
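
Dmitriy's point (a), getting the HBase configuration onto the classpath, might look roughly like this from the shell before starting grunt. This is only a sketch under assumptions: the /usr/local/hbase location and the HBASE_HOME and HBASE_CONF_DIR variables are placeholders chosen for illustration and do not come from the thread. PIG_CLASSPATH is the variable Pavel mentions. The idea is that the Pig client can then find hbase-site.xml, which is where settings such as hbase.zookeeper.quorum and hbase.zookeeper.property.clientPort normally live. The "no clientPort found in zoo.cfg" errors reported in this thread would be consistent with that file not being visible, though the thread does not confirm it.

```shell
# Hypothetical install locations; adjust to your cluster.
HBASE_HOME="${HBASE_HOME:-/usr/local/hbase}"
HBASE_CONF_DIR="${HBASE_CONF_DIR:-$HBASE_HOME/conf}"

# Prepend the HBase conf directory to PIG_CLASSPATH so that
# hbase-site.xml is visible to the Pig client. Any existing
# PIG_CLASSPATH value is kept after the new entry.
export PIG_CLASSPATH="$HBASE_CONF_DIR${PIG_CLASSPATH:+:$PIG_CLASSPATH}"

# Sanity check: warn if the config file is not where we expect it.
if [ -f "$HBASE_CONF_DIR/hbase-site.xml" ]; then
  echo "found $HBASE_CONF_DIR/hbase-site.xml"
else
  echo "warning: no hbase-site.xml in $HBASE_CONF_DIR" >&2
fi
```

With that in place, point (b) is the load statement with the bare table name, 'hbase://silk1', with no host:port.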