queries on Spork (Pig on Spark)

2015-11-24 Thread Divya Gehlot
Hi,

As a beginner, I have the following questions about Spork (Pig on Spark).
I cloned the spark branch with: git clone https://github.com/apache/pig -b spark
1. Which versions of Pig and Spark is Spork built against?
2. I followed the steps described in https://issues.apache.org/jira/browse/PIG-4059 and tried to run a simple Pig script that just loads a file and dumps/stores it.
I get the following errors:

grunt> A = load '/tmp/words_tb.txt' using PigStorage('\t') as (empNo:chararray,empName:chararray,salary:chararray);
grunt> Store A into '/tmp/spork';

2015-11-25 05:35:52,502 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2015-11-25 05:35:52,875 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2015-11-25 05:35:52,883 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - Not MR mode. RollupHIIOptimizer is disabled
2015-11-25 05:35:52,894 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2015-11-25 05:35:52,966 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-11-25 05:35:52,983 [main] INFO  org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher - add Files Spark Job
2015-11-25 05:35:53,137 [main] INFO  org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher - Added jar pig-0.15.0-SNAPSHOT-core-h2.jar
2015-11-25 05:35:53,138 [main] INFO  org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher - Added jar pig-0.15.0-SNAPSHOT-core-h2.jar
2015-11-25 05:35:53,138 [main] INFO  org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher - Converting operator POLoad (Name: A: Load(/tmp/words_tb.txt:PigStorage(' ')) - scope-29 Operator Key: scope-29)
2015-11-25 05:35:53,205 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. Could not initialize class org.apache.spark.rdd.RDDOperationScope$
Details at logfile: /home/pig/pig_1448425672112.log


Could you please help me figure out what is wrong?

I appreciate your help.

Thanks,

Regards,

Divya


Re: queries on Spork (Pig on Spark)

2015-11-24 Thread Divya Gehlot
Log file contents:
Pig Stack Trace
---
ERROR 2998: Unhandled internal error. Could not initialize class org.apache.spark.rdd.RDDOperationScope$
java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.rdd.RDDOperationScope$
 at org.apache.spark.SparkContext.withScope(SparkContext.scala:681)
 at org.apache.spark.SparkContext.newAPIHadoopRDD(SparkContext.scala:1094)
 at org.apache.pig.backend.hadoop.executionengine.spark.converter.LoadConverter.convert(LoadConverter.java:91)
 at org.apache.pig.backend.hadoop.executionengine.spark.converter.LoadConverter.convert(LoadConverter.java:61)
 at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:666)
 at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:633)
 at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.physicalToRDD(SparkLauncher.java:633)
 at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.sparkOperToRDD(SparkLauncher.java:585)
 at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.sparkPlanToRDD(SparkLauncher.java:534)
 at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:209)
 at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:301)
 at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
 at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
 at org.apache.pig.PigServer.storeEx(PigServer.java:1034)
 at org.apache.pig.PigServer.store(PigServer.java:997)
 at org.apache.pig.PigServer.openIterator(PigServer.java:910)
 at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:754)
 at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:376)
 at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
 at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
 at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
 at org.apache.pig.Main.run(Main.java:558)
 at org.apache.pig.Main.main(Main.java:170)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:136)


I don't understand what is causing this error.
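
A note on reading the trace above: in the JVM generally, a NoClassDefFoundError whose message reads "Could not initialize class X" usually means the class was found, but its static initialization failed on an earlier attempt. That first failure is reported as an ExceptionInInitializerError carrying the real cause, and later uses of the class report only the NoClassDefFoundError. Whether an earlier error of that kind exists in this session cannot be told from the excerpt. The sketch below reproduces the general behaviour with a hypothetical class named Flaky; it does not involve Spark or Pig, and all names in it are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class InitFailureDemo {

    // Hypothetical stand-in for any class whose static initializer fails,
    // for example because something it needs at load time is unavailable.
    static class Flaky {
        static final int VALUE = compute();

        private static int compute() {
            throw new IllegalStateException("simulated failure during static init");
        }
    }

    // Touches Flaky twice and records the class name of whatever is thrown.
    static List<String> probe() {
        List<String> seen = new ArrayList<>();
        for (int attempt = 0; attempt < 2; attempt++) {
            try {
                seen.add("value=" + Flaky.VALUE);
            } catch (Throwable t) {
                seen.add(t.getClass().getName());
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // In a fresh JVM the first attempt surfaces the underlying failure
        // as ExceptionInInitializerError; the second attempt reports only
        // NoClassDefFoundError ("Could not initialize class ...").
        for (String s : probe()) {
            System.out.println(s);
        }
    }
}
```

If the same pattern applies here, the informative entry would be an earlier ExceptionInInitializerError (or its "Caused by" section) rather than the NoClassDefFoundError itself.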

Thanks,
Regards,
Divya

On 25 November 2015 at 14:00, Jeff Zhang <zjf...@gmail.com> wrote:

> >>> Details at logfile: /home/pig/pig_1448425672112.log
>
> You need to check the log file for details
>
>
>
>
> On Wed, Nov 25, 2015 at 1:57 PM, Divya Gehlot <divya.htco...@gmail.com>
> wrote:
>
>> Hi,
>>
>>
>> As a beginner, I have the following questions about Spork (Pig on Spark).
>> I cloned the spark branch with: git clone https://github.com/apache/pig -b spark
>> 1. Which versions of Pig and Spark is Spork built against?
>> 2. I followed the steps described in https://issues.apache.org/jira/browse/PIG-4059 and tried to run a simple Pig script that just loads a file and dumps/stores it.
>> I get the following errors:
>>
>> grunt> A = load '/tmp/words_tb.txt' using PigStorage('\t') as (empNo:chararray,empName:chararray,salary:chararray);
>> grunt> Store A into '/tmp/spork';
>>
>> 2015-11-25 05:35:52,502 [main] INFO
>> org.apache.pig.tools.pigstats.ScriptState - Pig features used in the
>> script: UNKNOWN
>> 2015-11-25 05:35:52,875 [main] WARN
>> org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already
>> been initialized
>> 2015-11-25 05:35:52,883 [main] INFO
>> org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - Not MR
>> mode. RollupHIIOptimizer is disabled
>> 2015-11-25 05:35:52,894 [main] INFO
>> org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer -
>> {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator,
>> GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter,
>> MergeFilter, MergeForEach, PartitionFilterOptimizer,
>> PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter,
>> SplitFilter, StreamTypeCastInserter]}
>> 2015-11-25 05:35:52,966 [main] INFO
>> org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not
>> set... will not generate code.
>> 2015-11-25 05:35:52,983 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher - add
>> Files Spark Job
>> 2015-11-25 05:35:53,137 [main] INFO
>>