[
https://issues.apache.org/jira/browse/PIG-4765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
liyunzhang_intel updated PIG-4765:
----------------------------------
Attachment: PIG-4765.patch
[~mohitsabharwal], [~pallavi.rao] and [~kexianda]:
TestPoissonSampleLoader#testInstantiation fails because
PigConfiguration.PIG_SKEWEDJOIN_REDUCE_MEMUSAGE cannot be read from the
configuration in
[PoissonSampleLoader|https://github.com/apache/pig/blob/spark/src/org/apache/pig/impl/builtin/PoissonSampleLoader.java#L176],
since
[PigSplit#conf|https://github.com/apache/pig/blob/spark/src/org/apache/pig/impl/builtin/PoissonSampleLoader.java#L174]
isn't initialized correctly in Spark mode. In Spark mode, PigSplit#conf is
initialized through the following call stack:
{code}
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.setConf(PigSplit.java:356)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.io.WritableFactories.newInstance(WritableFactories.java:57)
at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:284)
at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:77)
at org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply$mcV$sp(SerializableWritable.scala:45)
//org.apache.spark.SerializableWritable#readObject
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1239)
at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:41)
at sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
{code}
From the stack trace above, we can see that the default configuration (which
does not include Pig properties) is used in
[SerializableWritable#readObject|https://github.com/apache/spark/blob/d83c2f9f0b08d6d5d369d9fae04cdb15448e7f0d/core/src/main/scala/org/apache/spark/SerializableWritable.scala#L44]
to initialize PigSplit.
PIG-4765.patch initializes PigSplit#conf from
TaskAttemptContext#getConfiguration(), which contains both the Pig and the
Hadoop properties.
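To make the cause and the fix easier to follow, here is a minimal standalone sketch. It does not use the real Pig, Hadoop or Spark classes: Split is a hypothetical stand-in for PigSplit, java.util.Properties stands in for the Hadoop Configuration, deserializeWithDefaults() only mimics what the stack trace shows, and the key string and the value "0.5" are assumptions chosen for illustration. It is not the code in the patch.

```java
import java.util.Properties;

public class SplitConfSketch {
    // Assumed to match the value of PigConfiguration.PIG_SKEWEDJOIN_REDUCE_MEMUSAGE.
    static final String MEMUSAGE_KEY = "pig.skewedjoin.reduce.memusage";

    /** Hypothetical stand-in for PigSplit: it only carries a configuration. */
    static class Split {
        private Properties conf;

        void setConf(Properties conf) {
            this.conf = conf;
        }

        Properties getConf() {
            return conf;
        }
    }

    /** Mimics deserialization on the executor: the split receives a fresh default configuration. */
    static Split deserializeWithDefaults() {
        Split split = new Split();
        split.setConf(new Properties()); // default configuration, no Pig properties
        return split;
    }

    /** Mimics the sampler reading the setting; returns null when the property is missing. */
    static String readMemUsage(Split split) {
        return split.getConf().getProperty(MEMUSAGE_KEY);
    }

    public static void main(String[] args) {
        // Stand-in for the configuration returned by the task context (Pig + Hadoop properties).
        Properties taskConf = new Properties();
        taskConf.setProperty(MEMUSAGE_KEY, "0.5"); // illustrative value only

        Split split = deserializeWithDefaults();
        System.out.println("before: " + readMemUsage(split)); // prints "before: null"

        // The idea of the fix: replace the default configuration with the task's configuration.
        split.setConf(taskConf);
        System.out.println("after: " + readMemUsage(split)); // prints "after: 0.5"
    }
}
```

The point of the sketch is only that a split deserialized with a default configuration cannot see Pig settings until its configuration is replaced by the one from the task context.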
> Enable TestPoissonSampleLoader in spark mode
> --------------------------------------------
>
> Key: PIG-4765
> URL: https://issues.apache.org/jira/browse/PIG-4765
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: liyunzhang_intel
> Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-4765.patch
>
>
> The test report at
> https://builds.apache.org/job/Pig-spark/292/testReport/junit/org.apache.pig.test/
> shows that TestPoissonSampleLoader fails.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)