Hello,
I have installed Pig 0.8.0 and Cassandra 0.7.4, but I'm not able to read data
from Cassandra. I wrote a simple query just to test:
grunt> A = LOAD 'cassandra://msg_keyspace/messages' USING
org.apache.cassandra.hadoop.pig.CassandraStorage();
grunt> dump A;
I'm getting the following error:
==========================================================================
2011-04-05 15:33:57,669 [main] INFO org.apache.pig.tools.pigstats.ScriptState
- Pig features used in the script: UNKNOWN
2011-04-05 15:33:57,669 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-04-05 15:33:57,819 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: A:
Store(hdfs://localhost/tmp/temp2037710644/tmp-29784200:org.apache.pig.impl.io.InterStorage)
- scope-1 Operator Key: scope-1)
2011-04-05 15:33:57,850 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File
concatenation threshold: 100 optimistic? false
2011-04-05 15:33:57,877 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size before optimization: 1
2011-04-05 15:33:57,877 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size after optimization: 1
2011-04-05 15:33:57,969 [main] INFO org.apache.pig.tools.pigstats.ScriptState
- Pig script settings are added to the job
2011-04-05 15:33:57,990 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-04-05 15:34:03,376 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Setting up single store job
2011-04-05 15:34:03,416 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 1 map-reduce job(s) waiting for submission.
2011-04-05 15:34:03,929 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2011-04-05 15:34:04,597 [Thread-5] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
paths (combined) to process : 1
2011-04-05 15:34:05,942 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- HadoopJobId: job_201104051459_0008
2011-04-05 15:34:05,943 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- More information at:
http://localhost:50030/jobdetails.jsp?jobid=job_201104051459_0008
2011-04-05 15:34:35,912 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- job job_201104051459_0008 has failed! Stop running all dependent jobs
2011-04-05 15:34:35,918 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2011-04-05 15:34:35,931 [main] ERROR org.apache.pig.tools.pigstats.PigStats -
ERROR 2997: Unable to recreate exception from backed error:
java.lang.NumberFormatException: null
2011-04-05 15:34:35,931 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil
- 1 map reduce job(s) failed!
2011-04-05 15:34:35,933 [main] INFO org.apache.pig.tools.pigstats.PigStats -
Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
0.20.2-CDH3B4 0.8.0-SNAPSHOT root 2011-04-05 15:33:57 2011-04-05
15:34:35 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_201104051459_0008 A MAP_ONLY Message: Job failed! Error - NA
hdfs://localhost/tmp/temp2037710644/tmp-29784200,
Input(s):
Failed to read data from "cassandra://msg_keyspace/messages"
Output(s):
Failed to produce result in "hdfs://localhost/tmp/temp2037710644/tmp-29784200"
==========================================================================
Any idea what causes the NumberFormatException and how to fix it?
Cheers