[ 
https://issues.apache.org/jira/browse/PIG-4765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-4765:
----------------------------------
    Attachment: PIG-4765.patch

[~mohitsabharwal], [~pallavi.rao] and [~kexianda]:
The root cause of the TestPoissonSampleLoader#testInstantiation failure is that 
PigConfiguration.PIG_SKEWEDJOIN_REDUCE_MEMUSAGE cannot be read from the 
configuration in 
[PoissonSampleLoader|https://github.com/apache/pig/blob/spark/src/org/apache/pig/impl/builtin/PoissonSampleLoader.java#L176],
because 
[PigSplit#conf|https://github.com/apache/pig/blob/spark/src/org/apache/pig/impl/builtin/PoissonSampleLoader.java#L174]
isn't initialized correctly in spark mode. In spark mode, PigSplit#conf is 
initialized through the following call stack:
{code}
       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.setConf(PigSplit.java:356)
       at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
       at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
       at org.apache.hadoop.io.WritableFactories.newInstance(WritableFactories.java:57)
       at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:284)
       at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:77)
       at org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply$mcV$sp(SerializableWritable.scala:45)
       // org.apache.spark.SerializableWritable#readObject
       at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1239)
       at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:41)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:606)
       at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
       at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
       at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:744)
{code}

From the stack trace above, we can see that the default configuration (which 
does not include the Pig properties) is used in 
[SerializableWritable#readObject|https://github.com/apache/spark/blob/d83c2f9f0b08d6d5d369d9fae04cdb15448e7f0d/core/src/main/scala/org/apache/spark/SerializableWritable.scala#L44]
to initialize PigSplit.

PIG-4765.patch initializes PigSplit#conf from 
TaskAttemptContext#getConfiguration(), which contains both the Pig and the 
Hadoop properties.
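
To illustrate the failure mode and the idea behind the fix, here is a minimal, dependency-free sketch. It is not the patch itself: java.util.Properties stands in for the Hadoop Configuration, and SplitConfSketch, Split, deserializeWithDefaults and withTaskConf are hypothetical names made up for this example. The property key is assumed to be the value of PigConfiguration.PIG_SKEWEDJOIN_REDUCE_MEMUSAGE.

```java
import java.util.Properties;

public class SplitConfSketch {
    // Assumed to be the key behind PigConfiguration.PIG_SKEWEDJOIN_REDUCE_MEMUSAGE.
    static final String MEMUSAGE_KEY = "pig.skewedjoin.reduce.memusage";

    /** Hypothetical stand-in for PigSplit: it only holds the conf it is given. */
    static class Split {
        private Properties conf;

        void setConf(Properties conf) {
            this.conf = conf;
        }

        String get(String key) {
            return conf == null ? null : conf.getProperty(key);
        }
    }

    /** Mimics deserialization on the executor: the split gets a fresh default conf. */
    static Split deserializeWithDefaults() {
        Split split = new Split();
        split.setConf(new Properties()); // defaults only, no Pig properties
        return split;
    }

    /** Mimics the idea of the fix: replace the conf with the task's configuration. */
    static Split withTaskConf(Split split, Properties taskConf) {
        split.setConf(taskConf);
        return split;
    }

    public static void main(String[] args) {
        // Stand-in for what TaskAttemptContext#getConfiguration() would return.
        Properties taskConf = new Properties();
        taskConf.setProperty(MEMUSAGE_KEY, "0.5");

        Split split = deserializeWithDefaults();
        System.out.println("before: " + split.get(MEMUSAGE_KEY)); // property is missing

        withTaskConf(split, taskConf);
        System.out.println("after: " + split.get(MEMUSAGE_KEY)); // property is visible
    }
}
```

With the default conf the lookup returns null, which corresponds to the loader not seeing the property in the test; once the conf comes from the task context, the same lookup succeeds.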




> Enable TestPoissonSampleLoader in spark mode
> --------------------------------------------
>
>                 Key: PIG-4765
>                 URL: https://issues.apache.org/jira/browse/PIG-4765
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-4765.patch
>
>
> The test report at 
> https://builds.apache.org/job/Pig-spark/292/testReport/junit/org.apache.pig.test/
> shows that TestPoissonSampleLoader fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
