Russell Jurney created PIG-2792:
-----------------------------------

             Summary: Wonderdog stopped working in Pig 0.10.0 (worked in 0.9.2)
                 Key: PIG-2792
                 URL: https://issues.apache.org/jira/browse/PIG-2792
             Project: Pig
          Issue Type: Bug
          Components: piggybank
    Affects Versions: 0.10.0, 0.11, 0.10.1
         Environment: Pig with Wonderdog 
https://github.com/infochimps-labs/wonderdog for elasticsearch integration. 
Elasticsearch 0.18.6. Pig local mode.
            Reporter: Russell Jurney
            Priority: Blocker
             Fix For: 0.10.1


The Pig UDFs in Wonderdog for ElasticSearch integration, which worked in 0.9.2, 
stopped working in 0.10.0.

In 0.10.0 the job fails with a NullPointerException because Wonderdog is 
unable to read its configuration from the hadoop cache: the log shows both 
es.config and es.plugins.dir as [null].
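
The stack trace in this report ends in java.util.Properties.setProperty, 
called via System.setProperty from the ElasticSearchRecordWriter constructor, 
just after the log reports [null] for es.config. A minimal sketch of that 
failure mode and one possible guard follows. It assumes the null value comes 
from the configuration lookup; the class EsConfigGuard, its methods, and the 
file path are hypothetical and are not Wonderdog or Pig code. It uses a local 
Properties object rather than the real system properties.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of Wonderdog or Pig.
class EsConfigGuard {

    /** Returns true if storing value under key throws a NullPointerException. */
    static boolean rejects(Properties props, String key, String value) {
        try {
            // Properties does not accept null values, so a null here throws,
            // matching the "Caused by" frames in the trace.
            props.setProperty(key, value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    /** Stores value under key only when it is non-null; reports whether it did. */
    static boolean setIfPresent(Properties props, String key, String value) {
        if (value == null) {
            return false; // caller can log a clear message or fall back to a default
        }
        props.setProperty(key, value);
        return true;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // A null es.config value, as reported in the log, is rejected.
        System.out.println(rejects(props, "es.config", null));
        // The guarded variant skips the null instead of throwing.
        System.out.println(setIfPresent(props, "es.config", null));
        // Assumed example path; any non-null value is stored normally.
        System.out.println(setIfPresent(props, "es.config", "/etc/elasticsearch/elasticsearch.yml"));
    }
}
```

A guard like this would only turn the NullPointerException into a clearer 
failure. It does not explain why the value is missing under Pig 0.10.0.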

If someone can help identify what the issue is, or advise how Wonderdog or Pig 
can be modified so that Wonderdog works with Pig 0.10, it would be greatly 
appreciated.

This issue is duplicated in the Wonderdog project here: 
https://github.com/infochimps-labs/wonderdog/issues/5, 
https://github.com/infochimps-labs/wonderdog/issues/6 and 
https://github.com/infochimps-labs/wonderdog/issues/7

The error is below:

2012-07-06 16:50:51,501 [main] INFO  org.apache.pig.Main - Apache Pig version 
0.10.0-SNAPSHOT (rexported) compiled Jun 22 2012, 15:56:16
2012-07-06 16:50:51,502 [main] INFO  org.apache.pig.Main - Logging error 
messages to: /private/tmp/pig_1341618651472.log
2012-07-06 16:50:51,829 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to 
hadoop file system at: file:///
{"ok":true}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100    11  100    11    0     0    647      0 --:--:-- --:--:-- --:--:--   733
2012-07-06 16:50:53,206 [main] INFO  org.apache.pig.tools.pigstats.ScriptState 
- Pig features used in the script: UNKNOWN
2012-07-06 16:50:53,379 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File 
concatenation threshold: 100 optimistic? false
2012-07-06 16:50:53,403 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
 - MR plan size before optimization: 1
2012-07-06 16:50:53,403 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
 - MR plan size after optimization: 1
2012-07-06 16:50:53,441 [main] INFO  org.apache.pig.tools.pigstats.ScriptState 
- Pig script settings are added to the job
2012-07-06 16:50:53,449 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2012-07-06 16:50:53,494 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler 
- Setting up single store job
2012-07-06 16:50:53,560 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 1 map-reduce job(s) waiting for submission.
2012-07-06 16:50:53,587 [Thread-7] WARN  
org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
2012-07-06 16:50:53,597 [Thread-7] WARN  org.apache.hadoop.mapred.JobClient - 
No job jar file set.  User classes may not be found. See JobConf(Class) or 
JobConf#setJar(String).
****file:/tmp/emails.json
2012-07-06 16:50:53,711 [Thread-7] INFO  
org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to 
process : 2
2012-07-06 16:50:53,711 [Thread-7] INFO  
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input 
paths to process : 2
2012-07-06 16:50:53,734 [Thread-7] WARN  
org.apache.hadoop.io.compress.snappy.LoadSnappy - Snappy native library not 
loaded
2012-07-06 16:50:53,737 [Thread-7] INFO  
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input 
paths (combined) to process : 3
2012-07-06 16:50:54,008 [Thread-8] INFO  org.apache.hadoop.mapred.Task -  Using 
ResourceCalculatorPlugin : null
2012-07-06 16:50:54,023 [Thread-8] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - 
Current split being processed file:/tmp/emails.json/part-m-00000:0+33554432
2012-07-06 16:50:54,029 [Thread-8] INFO  
com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using 
field:[message_id] for document ids
2012-07-06 16:50:54,029 [Thread-8] INFO  
com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using [null] as 
es.config
2012-07-06 16:50:54,029 [Thread-8] INFO  
com.infochimps.elasticsearch.ElasticSearchOutputFormat - Using [null] as 
es.plugins.dir
2012-07-06 16:50:54,033 [Thread-8] WARN  
org.apache.hadoop.mapred.FileOutputCommitter - Output path is null in cleanup
2012-07-06 16:50:54,034 [Thread-8] WARN  
org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
java.lang.RuntimeException: java.lang.NullPointerException
        at 
com.infochimps.elasticsearch.ElasticSearchOutputFormat$ElasticSearchRecordWriter.<init>(ElasticSearchOutputFormat.java:133)
        at 
com.infochimps.elasticsearch.ElasticSearchOutputFormat.getRecordWriter(ElasticSearchOutputFormat.java:262)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:84)
        at 
org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:628)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:753)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: java.lang.NullPointerException
        at java.util.Hashtable.put(Hashtable.java:394)
        at java.util.Properties.setProperty(Properties.java:143)
        at java.lang.System.setProperty(System.java:746)
        at 
com.infochimps.elasticsearch.ElasticSearchOutputFormat$ElasticSearchRecordWriter.<init>(ElasticSearchOutputFormat.java:130)
        ... 6 more
2012-07-06 16:50:54,506 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- HadoopJobId: job_local_0001
2012-07-06 16:50:54,506 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 0% complete
2012-07-06 16:50:59,022 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- job job_local_0001 has failed! Stop running all dependent jobs
2012-07-06 16:50:59,023 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 100% complete
2012-07-06 16:50:59,024 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil 
- 1 map reduce job(s) failed!
2012-07-06 16:50:59,024 [main] INFO  
org.apache.pig.tools.pigstats.SimplePigStats - Detected Local mode. Stats 
reported below may be incomplete
2012-07-06 16:50:59,025 [main] INFO  
org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics: 

HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      Features
1.0.2   0.10.0-SNAPSHOT rjurney 2012-07-06 16:50:53     2012-07-06 16:50:59     
UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_local_0001  json_emails     MAP_ONLY        Message: Job failed! Error - NA 
es://email/email?id=message_id&json=true&size=1000,

Input(s):
Failed to read data from "/tmp/emails.json"

Output(s):
Failed to produce result in "es://email/email?id=message_id&json=true&size=1000"

Job DAG:
job_local_0001


2012-07-06 16:50:59,025 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- Failed!
2012-07-06 16:50:59,029 [main] ERROR org.apache.pig.tools.grunt.GruntParser - 
ERROR 2244: Job failed, hadoop does not return any error message
2012-07-06 16:50:59,029 [main] ERROR org.apache.pig.tools.grunt.GruntParser - 
org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, 
hadoop does not return any error message
        at 
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
        at 
org.apache.pig.tools.grunt.GruntParser.processShCommand(GruntParser.java:1025)
        at 
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:167)
        at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
        at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
        at org.apache.pig.Main.run(Main.java:555)
        at org.apache.pig.Main.main(Main.java:111)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Details also at logfile: /private/tmp/pig_1341618651472.log
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   193  100   193    0     0   2475      0 --:--:-- --:--:-- --:--:--  2539
{
  "took" : 75,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

2012-07-06 16:50:59,140 [main] ERROR org.apache.pig.tools.grunt.GruntParser - 
ERROR 2244: Job failed, hadoop does not return any error message
2012-07-06 16:50:59,140 [main] ERROR org.apache.pig.tools.grunt.GruntParser - 
org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, 
hadoop does not return any error message
        at 
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
        at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
        at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
        at org.apache.pig.Main.run(Main.java:555)
        at org.apache.pig.Main.main(Main.java:111)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
