Re: Looking for an example of using HBaseStorage with Pig

2010-06-30 Thread Pavel Gutin
I got past the previous error, but the load still fails.
Here's what I am doing:

grunt> register /usr/local/hadoop/pigtmp/pig-0.6.0/lib/ElephantBird.jar
grunt> register /usr/local/hadoop/pigtmp/pig-0.6.0/lib/hbase-0.20.0.jar
grunt> register /usr/local/hadoop/pigtmp/pig-0.6.0/lib/hbase-0.20.0-test.jar
grunt> register /usr/local/hadoop/pigtmp/pig-0.6.0/lib/zookeeper-hbase-1329.jar
grunt> a = load 'hbase://silk1' USING com.twitter.elephantbird.pig.load.HBaseLoader('f1:destination_port') AS (destination_port);
2010-06-30 13:44:02,583 [main] INFO  com.twitter.elephantbird.pig.load.HBaseLoader - no-arg constructor
2010-06-30 13:44:02,602 [main] INFO  com.twitter.elephantbird.pig.load.HBaseLoader - no-arg constructor
grunt> dump a;
2010-06-30 13:44:05,321 [main] INFO  com.twitter.elephantbird.pig.load.HBaseLoader - no-arg constructor
2010-06-30 13:44:05,337 [main] INFO  com.twitter.elephantbird.pig.load.HBaseLoader - no-arg constructor
2010-06-30 13:44:05,452 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2010-06-30 13:44:05,452 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2010-06-30 13:44:08,361 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2010-06-30 13:44:08,394 [Thread-12] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2010-06-30 13:44:08,597 [Thread-12] INFO  com.twitter.elephantbird.pig.load.HBaseLoader - no-arg constructor
2010-06-30 13:44:08,597 [Thread-12] INFO  com.twitter.elephantbird.pig.load.HBaseLoader - tablename: hbase://silk1
2010-06-30 13:44:08,651 [Thread-12] ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper - no clientPort found in zoo.cfg
2010-06-30 13:44:08,652 [Thread-12] ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper - no clientPort found in zoo.cfg
2010-06-30 13:44:08,652 [Thread-12] ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper - no clientPort found in zoo.cfg
2010-06-30 13:44:08,653 [Thread-12] ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper - no clientPort found in zoo.cfg
2010-06-30 13:44:09,390 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Cannot get jobid for this job
2010-06-30 13:44:09,391 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2010-06-30 13:44:09,391 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map reduce job(s) failed!
2010-06-30 13:44:09,402 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed to produce result in: hdfs://fchadoop01:54310/tmp/temp642740681/tmp668990886
2010-06-30 13:44:09,403 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2010-06-30 13:44:09,405 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hbase://silk1
Details at logfile: /usr/local/hadoop/pigtmp/pig-0.6.0/pig_1277919819066.log
grunt>


Here's the log file:



Backend error message during job submission
---
org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input slice for: hbase://silk1
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:269)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Could not read quorum servers from zoo.cfg
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.init(ZooKeeperWrapper.java:81)
at org.apache.hadoop.hbase.client.HConnectionManager$ClientZKWatcher.getZooKeeperWrapper(HConnectionManager.java:199)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getZooKeeperWrapper(HConnectionManager.java:878)
at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:894)
at
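
The "no clientPort found in zoo.cfg" and "Could not read quorum servers from zoo.cfg" messages suggest that the HBase client did not find a usable ZooKeeper configuration when Pig computed the input splits. One quick thing to check is whether a zoo.cfg with a clientPort entry exists where you expect it. The sketch below is only an illustration: the path /usr/local/hbase/conf/zoo.cfg, the ZOO_CFG variable, and the has_client_port helper are assumptions, not something from this thread.

```shell
# Assumed location of zoo.cfg; override by setting ZOO_CFG before running.
ZOO_CFG="${ZOO_CFG:-/usr/local/hbase/conf/zoo.cfg}"

# Hypothetical helper: succeeds if the file has an uncommented clientPort=... line.
has_client_port() {
  grep -Eq '^[[:space:]]*clientPort[[:space:]]*=' "$1"
}

if [ -r "$ZOO_CFG" ] && has_client_port "$ZOO_CFG"; then
  echo "clientPort is set in $ZOO_CFG"
else
  echo "no readable zoo.cfg with a clientPort entry at $ZOO_CFG"
fi
```

Even if the file is fine, the directory that contains it still has to be on the classpath of the Pig client for the loader to see it.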

Re: Looking for an example of using HBaseStorage with Pig

2010-06-29 Thread Pavel Gutin
Thank you for trying to help me out. Here's the error that's in my log file:

ERROR 2998: Unhandled internal error. org/apache/pig/Slicer

java.lang.NoClassDefFoundError: org/apache/pig/Slicer
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:296)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:422)
at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:452)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.NonEvalFuncSpec(QueryParser.java:5087)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1434)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
at org.apache.pig.Main.main(Main.java:357)
Caused by: java.lang.ClassNotFoundException: org.apache.pig.Slicer
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
... 32 more



Re: Looking for an example of using HBaseStorage with Pig

2010-06-28 Thread Pavel Gutin
My apologies, I pasted the wrong line. I was testing whether Pig was able
to locate the JAR by misspelling the name on purpose.

Here's the correct error:

grunt> a = load 'hbase://localhost:6/silk1' USING com.twitter.elephantbird.pig.load.HBaseLoader('f1:destination_port') AS (destination_port);
2010-06-28 13:19:01,288 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/pig/Slicer
Details at logfile: /usr/local/hadoop/pigtmp/pig-0.7.0/pig_1277744862785.log
grunt>


On Mon, Jun 28, 2010 at 1:34 PM, Pavel Gutin pavelgu...@gmail.com wrote:

 This seems like it might work for me. I downloaded it, compiled it, and
 added the JAR to PIG_CLASSPATH.

 However, when I try to run the following command, I get an error:

 grunt> a = load 'hbase://myTable' USING co.twitter.elephantbird.pig.load.HBaseLoader('f1:col1') AS (col1);
 2010-06-28 13:13:59,607 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve co.twitter.elephantbird.pig.load.HBaseLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
 Details at logfile: /usr/local/hadoop/pigtmp/pig-0.7.0/pig_1277744862785.log
 grunt>

 I have a feeling I am not referencing the table the right way.

 On Mon, Jun 28, 2010 at 11:43 AM, Dmitriy Ryaboy dvrya...@gmail.com wrote:

 There's an HBase LoadFunc that works with 0.6 in Elephant-Bird.
 http://github.com/kevinweil/elephant-bird

 There are slides here that show usage:

 http://squarecog.wordpress.com/2010/05/20/pig-hbase-hadoop-and-twitter-hug-talk-slides/

 -D

 On Mon, Jun 28, 2010 at 7:59 AM, Pavel Gutin pavelgu...@gmail.com
 wrote:

  I am trying to get Pig to query my HBase table, but I cannot find any
  examples on the web. Can anyone provide me with a simple example?
 
  The best I could find so far was a little blurb on the following page:
  http://wiki.apache.org/pig/PigMultiQueryPerformanceSpecification
  but that didn't help much.
 
  Thanks in advance.
 
   - Pavel
 





Re: Looking for an example of using HBaseStorage with Pig

2010-06-28 Thread Dmitriy Ryaboy
Pavel,
What does the log say?

I am guessing you need to a) make sure that all the HBase config files are on
the classpath and b) load 'hbase://silk1' (no host:port).

-D
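
For point (a), a minimal shell sketch of putting the HBase configuration directory on Pig's classpath before starting grunt. The default path /usr/local/hbase/conf is an assumption for illustration and does not come from this thread; PIG_CLASSPATH is the variable mentioned earlier in the thread. Adjust the directory to wherever your HBase configuration files actually live.

```shell
# Assumed HBase configuration directory; override by setting HBASE_CONF_DIR first.
HBASE_CONF_DIR="${HBASE_CONF_DIR:-/usr/local/hbase/conf}"

# Prepend it to PIG_CLASSPATH, keeping any entries that were already set.
export PIG_CLASSPATH="${HBASE_CONF_DIR}${PIG_CLASSPATH:+:${PIG_CLASSPATH}}"

# Show the result so it can be checked before launching pig.
echo "PIG_CLASSPATH=${PIG_CLASSPATH}"
```

Run this in the same shell you then start pig from, and load with 'hbase://silk1' as in point (b). Whether this alone clears the error depends on the rest of the setup.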
