[jira] Subscription: PIG patch available

2017-12-13 Thread jira
Issue Subscription
Filter: PIG patch available (37 issues)

Subscriber: pigdaily

Key       Summary
PIG-5320  TestCubeOperator#testRollupBasic is flaky on Spark 2.2
https://issues.apache.org/jira/browse/PIG-5320
PIG-5317  Upgrade old dependencies: commons-lang, hsqldb, commons-logging
https://issues.apache.org/jira/browse/PIG-5317
PIG-5312  Uids not set in inner schemas after UNION ONSCHEMA
https://issues.apache.org/jira/browse/PIG-5312
PIG-5300  hashCode for Bag needs to be order independent
https://issues.apache.org/jira/browse/PIG-5300
PIG-5273  _SUCCESS file should be created at the end of the job
https://issues.apache.org/jira/browse/PIG-5273
PIG-5267  Review of org.apache.pig.impl.io.BufferedPositionedInputStream
https://issues.apache.org/jira/browse/PIG-5267
PIG-5256  Bytecode generation for POFilter and POForeach
https://issues.apache.org/jira/browse/PIG-5256
PIG-5191  Pig HBase 2.0.0 support
https://issues.apache.org/jira/browse/PIG-5191
PIG-5160  SchemaTupleFrontend.java is not thread safe, cause PigServer thrown NPE in multithread env
https://issues.apache.org/jira/browse/PIG-5160
PIG-5115  Builtin AvroStorage generates incorrect avro schema when the same pig field name appears in the alias
https://issues.apache.org/jira/browse/PIG-5115
PIG-5106  Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
https://issues.apache.org/jira/browse/PIG-5106
PIG-5081  Can not run pig on spark source code distribution
https://issues.apache.org/jira/browse/PIG-5081
PIG-5080  Support store alias as spark table
https://issues.apache.org/jira/browse/PIG-5080
PIG-5057  IndexOutOfBoundsException when pig reducer processOnePackageOutput
https://issues.apache.org/jira/browse/PIG-5057
PIG-5029  Optimize sort case when data is skewed
https://issues.apache.org/jira/browse/PIG-5029
PIG-4926  Modify the content of start.xml for spark mode
https://issues.apache.org/jira/browse/PIG-4926
PIG-4913  Reduce jython function initiation during compilation
https://issues.apache.org/jira/browse/PIG-4913
PIG-4849  pig on tez will cause tez-ui to crash,because the content from timeline server is too long.
https://issues.apache.org/jira/browse/PIG-4849
PIG-4750  REPLACE_MULTI should compile Pattern once and reuse it
https://issues.apache.org/jira/browse/PIG-4750
PIG-4684  Exception should be changed to warning when job diagnostics cannot be fetched
https://issues.apache.org/jira/browse/PIG-4684
PIG-4656  Improve String serialization and comparator performance in BinInterSedes
https://issues.apache.org/jira/browse/PIG-4656
PIG-4598  Allow user defined plan optimizer rules
https://issues.apache.org/jira/browse/PIG-4598
PIG-4551  Partition filter is not pushed down in case of SPLIT
https://issues.apache.org/jira/browse/PIG-4551
PIG-4539  New PigUnit
https://issues.apache.org/jira/browse/PIG-4539
PIG-4515  org.apache.pig.builtin.Distinct throws ClassCastException
https://issues.apache.org/jira/browse/PIG-4515
PIG-4323  PackageConverter hanging in Spark
https://issues.apache.org/jira/browse/PIG-4323
PIG-4313  StackOverflowError in LIMIT operation on Spark
https://issues.apache.org/jira/browse/PIG-4313
PIG-4251  Pig on Storm
https://issues.apache.org/jira/browse/PIG-4251
PIG-4002  Disable combiner when map-side aggregation is used
https://issues.apache.org/jira/browse/PIG-4002
PIG-3952  PigStorage accepts '-tagSplit' to return full split information
https://issues.apache.org/jira/browse/PIG-3952
PIG-3911  Define unique fields with @OutputSchema
https://issues.apache.org/jira/browse/PIG-3911
PIG-3877  Getting Geo Latitude/Longitude from Address Lines
https://issues.apache.org/jira/browse/PIG-3877
PIG-3873  Geo distance calculation using Haversine
https://issues.apache.org/jira/browse/PIG-3873
PIG-3864  ToDate(userstring, format, timezone) computes DateTime with strange handling of Daylight Saving Time with location based timezones
https://issues.apache.org/jira/browse/PIG-3864
PIG-3668  COR built-in function when atleast one of the coefficient values is NaN
https://issues.apache.org/jira/browse/PIG-3668
PIG-3587  add functionality for rolling over dates
https://issues.apache.org/jira/browse/PIG-3587
PIG-1804  Alow Jython function to implement Algebraic and/or Accumulator interfaces
https://issues.apache.org/jira/browse/PIG-1804

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328=12322384


[jira] Subscription: PIG patch available

2017-12-13 Thread jira
Issue Subscription
Filter: PIG patch available (38 issues)

Subscriber: pigdaily

Key       Summary
PIG-5317  Upgrade old dependencies: commons-lang, hsqldb, commons-logging
https://issues-test.apache.org/jira/browse/PIG-5317
PIG-5316  Initialize mapred.task.id property for PoS jobs
https://issues-test.apache.org/jira/browse/PIG-5316
PIG-5312  Uids not set in inner schemas after UNION ONSCHEMA
https://issues-test.apache.org/jira/browse/PIG-5312
PIG-5310  MergeJoin throwing NullPointer Exception
https://issues-test.apache.org/jira/browse/PIG-5310
PIG-5300  hashCode for Bag needs to be order independent
https://issues-test.apache.org/jira/browse/PIG-5300
PIG-5273  _SUCCESS file should be created at the end of the job
https://issues-test.apache.org/jira/browse/PIG-5273
PIG-5267  Review of org.apache.pig.impl.io.BufferedPositionedInputStream
https://issues-test.apache.org/jira/browse/PIG-5267
PIG-5256  Bytecode generation for POFilter and POForeach
https://issues-test.apache.org/jira/browse/PIG-5256
PIG-5191  Pig HBase 2.0.0 support
https://issues-test.apache.org/jira/browse/PIG-5191
PIG-5160  SchemaTupleFrontend.java is not thread safe, cause PigServer thrown NPE in multithread env
https://issues-test.apache.org/jira/browse/PIG-5160
PIG-5115  Builtin AvroStorage generates incorrect avro schema when the same pig field name appears in the alias
https://issues-test.apache.org/jira/browse/PIG-5115
PIG-5106  Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
https://issues-test.apache.org/jira/browse/PIG-5106
PIG-5081  Can not run pig on spark source code distribution
https://issues-test.apache.org/jira/browse/PIG-5081
PIG-5080  Support store alias as spark table
https://issues-test.apache.org/jira/browse/PIG-5080
PIG-5057  IndexOutOfBoundsException when pig reducer processOnePackageOutput
https://issues-test.apache.org/jira/browse/PIG-5057
PIG-5029  Optimize sort case when data is skewed
https://issues-test.apache.org/jira/browse/PIG-5029
PIG-4926  Modify the content of start.xml for spark mode
https://issues-test.apache.org/jira/browse/PIG-4926
PIG-4913  Reduce jython function initiation during compilation
https://issues-test.apache.org/jira/browse/PIG-4913
PIG-4849  pig on tez will cause tez-ui to crash,because the content from timeline server is too long.
https://issues-test.apache.org/jira/browse/PIG-4849
PIG-4750  REPLACE_MULTI should compile Pattern once and reuse it
https://issues-test.apache.org/jira/browse/PIG-4750
PIG-4684  Exception should be changed to warning when job diagnostics cannot be fetched
https://issues-test.apache.org/jira/browse/PIG-4684
PIG-4656  Improve String serialization and comparator performance in BinInterSedes
https://issues-test.apache.org/jira/browse/PIG-4656
PIG-4598  Allow user defined plan optimizer rules
https://issues-test.apache.org/jira/browse/PIG-4598
PIG-4551  Partition filter is not pushed down in case of SPLIT
https://issues-test.apache.org/jira/browse/PIG-4551
PIG-4539  New PigUnit
https://issues-test.apache.org/jira/browse/PIG-4539
PIG-4515  org.apache.pig.builtin.Distinct throws ClassCastException
https://issues-test.apache.org/jira/browse/PIG-4515
PIG-4323  PackageConverter hanging in Spark
https://issues-test.apache.org/jira/browse/PIG-4323
PIG-4313  StackOverflowError in LIMIT operation on Spark
https://issues-test.apache.org/jira/browse/PIG-4313
PIG-4251  Pig on Storm
https://issues-test.apache.org/jira/browse/PIG-4251
PIG-4002  Disable combiner when map-side aggregation is used
https://issues-test.apache.org/jira/browse/PIG-4002
PIG-3952  PigStorage accepts '-tagSplit' to return full split information
https://issues-test.apache.org/jira/browse/PIG-3952
PIG-3911  Define unique fields with @OutputSchema
https://issues-test.apache.org/jira/browse/PIG-3911
PIG-3877  Getting Geo Latitude/Longitude from Address Lines
https://issues-test.apache.org/jira/browse/PIG-3877
PIG-3873  Geo distance calculation using Haversine
https://issues-test.apache.org/jira/browse/PIG-3873
PIG-3864  ToDate(userstring, format, timezone) computes DateTime with strange handling of Daylight Saving Time with location based timezones
https://issues-test.apache.org/jira/browse/PIG-3864
PIG-3668  COR built-in function when atleast one of the coefficient values is NaN
https://issues-test.apache.org/jira/browse/PIG-3668
PIG-3587  add functionality for rolling over dates
https://issues-test.apache.org/jira/browse/PIG-3587
PIG-1804  Alow Jython function to implement Algebraic and/or Accumulator interfaces
https://issues-test.apache.org/jira/browse/PIG-1804

[jira] [Commented] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290299#comment-16290299
 ] 

liyunzhang commented on PIG-5320:
-

The patch is OK.
But why does it need to change from
{code}
private final Set<Integer> finishedJobIds = Sets.newHashSet();
{code}
to
{code}
private final Set<Integer> finishedJobIds = Sets.newTreeSet();
{code}

{quote}
My only concern is whether SparkListener#onJobEnd() is called when the job fails; 
if not, Pig would get stuck in an infinite loop.
{quote}
I think SparkListener#onJobEnd() is called even when the job fails. According to the 
[link|https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-webui-JobProgressListener.html#onJobEnd],
onJobEnd is called even if the job fails. I have not read the source code or run 
test cases to verify this, though.
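
For context, a minimal sketch of the behaviour this question turns on, assuming Spark 2.x's SparkListener abstract class: onJobEnd is delivered for failed jobs as well as successful ones, so a listener can record every finished job id and wake any waiter regardless of outcome. The class and method names below are hypothetical, not Pig's actual JobStatisticCollector code.
{code}
import java.util.Set;
import java.util.TreeSet;

import org.apache.spark.scheduler.SparkListener;
import org.apache.spark.scheduler.SparkListenerJobEnd;

// Hypothetical listener, for illustration only.
public class FinishedJobListener extends SparkListener {
    private final Set<Integer> finishedJobIds = new TreeSet<>();

    @Override
    public void onJobEnd(SparkListenerJobEnd jobEnd) {
        synchronized (this) {
            // jobEnd.jobResult() distinguishes success from failure,
            // but the job id is recorded in either case.
            finishedJobIds.add(jobEnd.jobId());
            // Wake up any thread blocked in a wait() loop on this listener.
            notifyAll();
        }
    }

    public synchronized boolean isFinished(int jobId) {
        return finishedJobIds.contains(jobId);
    }
}
{code}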

> TestCubeOperator#testRollupBasic is flaky on Spark 2.2
> --
>
> Key: PIG-5320
> URL: https://issues.apache.org/jira/browse/PIG-5320
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5320_1.patch
>
>
> TestCubeOperator#testRollupBasic occasionally fails with
> {code}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to 
> store alias c
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110)
>   at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:781)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:858)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:821)
>   at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972)
>   at 
> org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get 
> the rdds of this spark operator: 
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at 
> org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237)
>   at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
>   at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
>   at org.apache.pig.PigServer.execute(PigServer.java:1449)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {code}
> I think the problem is that in JobStatisticCollector#waitForJobToEnd 
> {{sparkListener.wait()}} is not inside a loop, as suggested in wait's 
> javadoc:
> {code}
>  * As in the one argument version, interrupts and spurious wakeups are
>  * possible, and this method should always be used in a loop:
> {code}
> Thus, due to a spurious wakeup, the wait might return without notify ever 
> having been called.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to normal : Pig-trunk-commit #2555

2017-12-13 Thread Apache Jenkins Server
See 




Build failed in Jenkins: Pig-trunk #2062

2017-12-13 Thread Apache Jenkins Server
See 

Changes:

[szita] PIG-5318: Unit test failures on Pig on Spark with Spark 2.2 (nkollar 
via szita)

--
[...truncated 19.78 KB...]
   [javacc] File "Token.java" does not exist.  Will create one.
   [javacc] File "SimpleCharStream.java" does not exist.  Will create one.
   [javacc] Parser generated successfully.
 [move] Moving 1 file to 


prepare:
[mkdir] Created dir: 


genLexer:

genParser:

genTreeParser:

gen:

compile:
 [echo] *** Building Main Sources ***
 [echo] *** To compile with all warnings enabled, supply -Dall.warnings=1 
on command line ***
 [echo] *** Else, you will only be warned about deprecations ***
 [echo] *** Hadoop version used: 2 ; HBase version used: 1 ; Spark version 
used: 1 ***
[javac] Compiling 1105 source files to 

[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
 [copy] Copying 1 file to 

 [copy] Copying 1 file to 

 [copy] Copying 2 files to 

 [copy] Copying 1 file to 


ivy-buildJar:

jar-simple:
 [echo] svnString 1818071
  [jar] Building jar: 

 [echo] svnString 1818071
  [jar] Building jar: 

  [jar] META-INF/native/linux32/libjansi.so already added, skipping
  [jar] META-INF/native/linux64/libjansi.so already added, skipping
  [jar] META-INF/native/osx/libjansi.jnilib already added, skipping
  [jar] META-INF/native/windows32/jansi.dll already added, skipping
  [jar] META-INF/native/windows64/jansi.dll already added, skipping
  [jar] org/fusesource/hawtjni/runtime/Callback.class already added, 
skipping
  [jar] org/fusesource/hawtjni/runtime/Library.class already added, skipping
  [jar] org/fusesource/hawtjni/runtime/PointerMath.class already added, 
skipping
  [jar] org/fusesource/jansi/Ansi$1.class already added, skipping
  [jar] org/fusesource/jansi/Ansi$2.class already added, skipping
  [jar] org/fusesource/jansi/Ansi$Attribute.class already added, skipping
  [jar] org/fusesource/jansi/Ansi$Color.class already added, skipping
  [jar] org/fusesource/jansi/Ansi$Erase.class already added, skipping
  [jar] org/fusesource/jansi/Ansi$NoAnsi.class already added, skipping
  [jar] org/fusesource/jansi/Ansi.class already added, skipping
  [jar] org/fusesource/jansi/AnsiConsole$1.class already added, skipping
  [jar] org/fusesource/jansi/AnsiConsole.class already added, skipping
  [jar] org/fusesource/jansi/AnsiOutputStream.class already added, skipping
  [jar] org/fusesource/jansi/AnsiRenderWriter.class already added, skipping
  [jar] org/fusesource/jansi/AnsiRenderer$Code.class already added, skipping
  [jar] org/fusesource/jansi/AnsiRenderer.class already added, skipping
  [jar] org/fusesource/jansi/AnsiString.class already added, skipping
  [jar] org/fusesource/jansi/HtmlAnsiOutputStream.class already added, 
skipping
  [jar] org/fusesource/jansi/WindowsAnsiOutputStream.class already added, 
skipping
  [jar] org/fusesource/jansi/internal/CLibrary.class already added, skipping
  [jar] 
org/fusesource/jansi/internal/Kernel32$CONSOLE_SCREEN_BUFFER_INFO.class already 
added, skipping
  [jar] org/fusesource/jansi/internal/Kernel32$COORD.class already added, 
skipping
  [jar] org/fusesource/jansi/internal/Kernel32$INPUT_RECORD.class already 
added, skipping
  [jar] org/fusesource/jansi/internal/Kernel32$KEY_EVENT_RECORD.class 
already added, skipping
  [jar] org/fusesource/jansi/internal/Kernel32$SMALL_RECT.class already 
added, skipping
  [jar] org/fusesource/jansi/internal/Kernel32.class already added, skipping
  [jar] org/fusesource/jansi/internal/WindowsSupport.class already added, 
skipping
Trying to override old definition of task propertycopy
Trying to override old definition of task propertycopy

copyCommonDependencies:
[mkdir] Created dir: 
 [copy] Copying 47 files to 
Trying to override old definition 

[jira] [Commented] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289136#comment-16289136
 ] 

Nandor Kollar commented on PIG-5320:


I think this is a problem with Spark 1.6.x too; checking the condition in a loop 
should solve it. I also changed the map and set implementations to sorted ones: 
since we use integer job ids, I hope this slightly improves performance when there 
are many jobs. [~kellyzly], [~szita], could you please have a look at my patch? 
My only concern is whether SparkListener#onJobEnd() is called when the job fails; 
if not, Pig would get stuck in an infinite loop.
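
As a rough illustration of the approach described above (the wait guarded by a loop that re-checks the condition, and the finished job ids kept in a sorted set of integers), here is a sketch; the names are illustrative and this is not the actual patch:
{code}
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: the waiter re-checks its condition in a loop, so a spurious
// wakeup (or a notify for a different job) cannot end the wait prematurely.
public class JobEndWaiter {
    private final Set<Integer> finishedJobIds = new TreeSet<>();

    public synchronized void markFinished(int jobId) {
        finishedJobIds.add(jobId);
        notifyAll();
    }

    public synchronized void waitForJobToEnd(int jobId) throws InterruptedException {
        while (!finishedJobIds.contains(jobId)) {
            wait();
        }
    }
}
{code}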

> TestCubeOperator#testRollupBasic is flaky on Spark 2.2
> --
>
> Key: PIG-5320
> URL: https://issues.apache.org/jira/browse/PIG-5320
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5320_1.patch
>
>
> TestCubeOperator#testRollupBasic occasionally fails with
> {code}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to 
> store alias c
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110)
>   at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:781)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:858)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:821)
>   at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972)
>   at 
> org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get 
> the rdds of this spark operator: 
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at 
> org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237)
>   at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
>   at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
>   at org.apache.pig.PigServer.execute(PigServer.java:1449)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {code}
> I think the problem is that in JobStatisticCollector#waitForJobToEnd 
> {{sparkListener.wait()}} is not inside a loop, as suggested in wait's 
> javadoc:
> {code}
>  * As in the one argument version, interrupts and spurious wakeups are
>  * possible, and this method should always be used in a loop:
> {code}
> Thus, due to a spurious wakeup, the wait might return without notify ever 
> having been called.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5320:
---
Attachment: PIG-5320_1.patch

> TestCubeOperator#testRollupBasic is flaky on Spark 2.2
> --
>
> Key: PIG-5320
> URL: https://issues.apache.org/jira/browse/PIG-5320
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5320_1.patch
>
>
> TestCubeOperator#testRollupBasic occasionally fails with
> {code}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to 
> store alias c
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110)
>   at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:781)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:858)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:821)
>   at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972)
>   at 
> org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get 
> the rdds of this spark operator: 
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at 
> org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237)
>   at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
>   at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
>   at org.apache.pig.PigServer.execute(PigServer.java:1449)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {code}
> I think the problem is that in JobStatisticCollector#waitForJobToEnd 
> {{sparkListener.wait()}} is not inside a loop, as suggested in wait's 
> javadoc:
> {code}
>  * As in the one argument version, interrupts and spurious wakeups are
>  * possible, and this method should always be used in a loop:
> {code}
> Thus, due to a spurious wakeup, the wait might return without notify ever 
> having been called.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5320:
---
Attachment: (was: PIG-5320_1.patch)

> TestCubeOperator#testRollupBasic is flaky on Spark 2.2
> --
>
> Key: PIG-5320
> URL: https://issues.apache.org/jira/browse/PIG-5320
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5320_1.patch
>
>
> TestCubeOperator#testRollupBasic occasionally fails with
> {code}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to 
> store alias c
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110)
>   at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:781)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:858)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:821)
>   at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972)
>   at 
> org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get 
> the rdds of this spark operator: 
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at 
> org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237)
>   at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
>   at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
>   at org.apache.pig.PigServer.execute(PigServer.java:1449)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {code}
> I think the problem is that in JobStatisticCollector#waitForJobToEnd 
> {{sparkListener.wait()}} is not inside a loop, as suggested in wait's 
> javadoc:
> {code}
>  * As in the one argument version, interrupts and spurious wakeups are
>  * possible, and this method should always be used in a loop:
> {code}
> Thus, due to a spurious wakeup, the wait might return without notify ever 
> having been called.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-13 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated PIG-5318:

   Resolution: Fixed
Fix Version/s: 0.18.0
   Status: Resolved  (was: Patch Available)

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Fix For: 0.18.0
>
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch, 
> PIG-5318_4.patch, PIG-5318_5.patch, PIG-5318_6.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> The TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause; it looks like on 
> Spark 2.2 the exception is wrapped in an additional layer.
> The TestStore and TestStoreLocal failures are also test-related problems: it 
> looks like SPARK-7953 is fixed in Spark 2.2.
> The root cause of the TestStoreInstances failure is yet to be found.
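
As an illustration of the "assert on the root cause" approach mentioned in the description, a minimal hypothetical sketch (not the actual test code; the helper name and the expected message text are placeholders): walk the cause chain before asserting, so the extra wrapping layer added on Spark 2.2 does not break the check.
{code}
public final class RootCauseAssert {
    private RootCauseAssert() {}

    // Walk the cause chain to the innermost exception.
    public static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null && cur.getCause() != cur) {
            cur = cur.getCause();
        }
        return cur;
    }

    // Usage in a JUnit test (placeholder message and placeholder failing call):
    //   try {
    //       runPipelineThatShouldFail();
    //   } catch (Exception e) {
    //       org.junit.Assert.assertTrue(
    //           rootCause(e).getMessage().contains("expected error text"));
    //   }
}
{code}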



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2

2017-12-13 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289058#comment-16289058
 ] 

Adam Szita commented on PIG-5318:
-

[~nkollar], +1 for [^PIG-5318_6.patch], committed to trunk.
I think we should also upgrade the Spark 2 minor version in Pig on Spark to 2.2. 
We don't want to maintain support for 1.6.1, 2.1.1, and 2.2.0 at the same time; 
rather, we should have one minor version per major version.
Created PIG-5321 to track the upgrade.

> Unit test failures on Pig on Spark with Spark 2.2
> -
>
> Key: PIG-5318
> URL: https://issues.apache.org/jira/browse/PIG-5318
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch, 
> PIG-5318_4.patch, PIG-5318_5.patch, PIG-5318_6.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
>  org.apache.pig.test.TestAssert#testNegativeWithoutFetch
>  org.apache.pig.test.TestAssert#testNegative
>  org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
>  org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
>  org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
>  org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
>  org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> The TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed 
> by asserting on the message of the exception's root cause; it looks like on 
> Spark 2.2 the exception is wrapped in an additional layer.
> The TestStore and TestStoreLocal failures are also test-related problems: it 
> looks like SPARK-7953 is fixed in Spark 2.2.
> The root cause of the TestStoreInstances failure is yet to be found.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (PIG-5321) Upgrade Spark 2 version to 2.2.0 for Pig on Spark

2017-12-13 Thread Adam Szita (JIRA)
Adam Szita created PIG-5321:
---

 Summary: Upgrade Spark 2 version to 2.2.0 for Pig on Spark
 Key: PIG-5321
 URL: https://issues.apache.org/jira/browse/PIG-5321
 Project: Pig
  Issue Type: Improvement
  Components: spark
Reporter: Adam Szita


Right now we maintain support for 2 versions of Spark for PoS jobs:
spark1.version=1.6.1
spark2.version=2.1.1

I believe we should move the latter forward to 2.2.0.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5320:
---
Status: Patch Available  (was: Open)

> TestCubeOperator#testRollupBasic is flaky on Spark 2.2
> --
>
> Key: PIG-5320
> URL: https://issues.apache.org/jira/browse/PIG-5320
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5320_1.patch
>
>
> TestCubeOperator#testRollupBasic occasionally fails with
> {code}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to 
> store alias c
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110)
>   at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:781)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:858)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:821)
>   at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972)
>   at 
> org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get 
> the rdds of this spark operator: 
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at 
> org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237)
>   at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
>   at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
>   at org.apache.pig.PigServer.execute(PigServer.java:1449)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {code}
> I think the problem is that in JobStatisticCollector#waitForJobToEnd 
> {{sparkListener.wait()}} is not inside a loop, as suggested in wait's 
> javadoc:
> {code}
>  * As in the one argument version, interrupts and spurious wakeups are
>  * possible, and this method should always be used in a loop:
> {code}
> Thus, due to a spurious wakeup, the wait might return without notify ever 
> having been called.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-5320:
---
Attachment: PIG-5320_1.patch

> TestCubeOperator#testRollupBasic is flaky on Spark 2.2
> --
>
> Key: PIG-5320
> URL: https://issues.apache.org/jira/browse/PIG-5320
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5320_1.patch
>
>
> TestCubeOperator#testRollupBasic occasionally fails with
> {code}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to 
> store alias c
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110)
>   at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:781)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:858)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:821)
>   at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972)
>   at 
> org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get 
> the rdds of this spark operator: 
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at 
> org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237)
>   at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
>   at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
>   at org.apache.pig.PigServer.execute(PigServer.java:1449)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {code}
> I think the problem is that in JobStatisticCollector#waitForJobToEnd 
> {{sparkListener.wait()}} is not inside a loop, as suggested in wait's 
> javadoc:
> {code}
>  * As in the one argument version, interrupts and spurious wakeups are
>  * possible, and this method should always be used in a loop:
> {code}
> Thus, due to a spurious wakeup, the wait might return without notify ever 
> having been called.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar reassigned PIG-5320:
--

Assignee: Nandor Kollar

> TestCubeOperator#testRollupBasic is flaky on Spark 2.2
> --
>
> Key: PIG-5320
> URL: https://issues.apache.org/jira/browse/PIG-5320
> Project: Pig
>  Issue Type: Bug
>  Components: spark
>Reporter: Nandor Kollar
>Assignee: Nandor Kollar
> Attachments: PIG-5320_1.patch
>
>
> TestCubeOperator#testRollupBasic occasionally fails with
> {code}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to 
> store alias c
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779)
>   at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
>   at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110)
>   at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512)
>   at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:781)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:858)
>   at org.apache.pig.PigServer.registerScript(PigServer.java:821)
>   at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972)
>   at 
> org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124)
> Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get 
> the rdds of this spark operator: 
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
>   at 
> org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237)
>   at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
>   at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
>   at org.apache.pig.PigServer.execute(PigServer.java:1449)
>   at org.apache.pig.PigServer.access$500(PigServer.java:119)
>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774)
> Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
>   at 
> org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
>   at 
> org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
> {code}
> I think the problem is that in JobStatisticCollector#waitForJobToEnd 
> {{sparkListener.wait()}} is not inside a loop, as suggested in wait's 
> javadoc:
> {code}
>  * As in the one argument version, interrupts and spurious wakeups are
>  * possible, and this method should always be used in a loop:
> {code}
> Thus, due to a spurious wakeup, the wait might return without notify ever 
> having been called.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (PIG-5320) TestCubeOperator#testRollupBasic is flaky on Spark 2.2

2017-12-13 Thread Nandor Kollar (JIRA)
Nandor Kollar created PIG-5320:
--

 Summary: TestCubeOperator#testRollupBasic is flaky on Spark 2.2
 Key: PIG-5320
 URL: https://issues.apache.org/jira/browse/PIG-5320
 Project: Pig
  Issue Type: Bug
  Components: spark
Reporter: Nandor Kollar


TestCubeOperator#testRollupBasic occasionally fails with
{code}
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store 
alias c
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1779)
at org.apache.pig.PigServer.registerQuery(PigServer.java:708)
at 
org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1110)
at 
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:512)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.PigServer.registerScript(PigServer.java:781)
at org.apache.pig.PigServer.registerScript(PigServer.java:858)
at org.apache.pig.PigServer.registerScript(PigServer.java:821)
at org.apache.pig.test.Util.registerMultiLineQuery(Util.java:972)
at 
org.apache.pig.test.TestCubeOperator.testRollupBasic(TestCubeOperator.java:124)
Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 0: fail to get the 
rdds of this spark operator: 
at 
org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:115)
at 
org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:140)
at 
org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkOperator.visit(SparkOperator.java:37)
at 
org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:87)
at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:46)
at 
org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:237)
at 
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:293)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1475)
at 
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1460)
at org.apache.pig.PigServer.execute(PigServer.java:1449)
at org.apache.pig.PigServer.access$500(PigServer.java:119)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1774)
Caused by: java.lang.RuntimeException: Unexpected job execution status RUNNING
at 
org.apache.pig.tools.pigstats.spark.SparkStatsUtil.isJobSuccess(SparkStatsUtil.java:138)
at 
org.apache.pig.tools.pigstats.spark.SparkPigStats.addJobStats(SparkPigStats.java:75)
at 
org.apache.pig.tools.pigstats.spark.SparkStatsUtil.waitForJobAddStats(SparkStatsUtil.java:59)
at 
org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.sparkOperToRDD(JobGraphBuilder.java:225)
at 
org.apache.pig.backend.hadoop.executionengine.spark.JobGraphBuilder.visitSparkOp(JobGraphBuilder.java:112)
{code}

I think the problem is that in JobStatisticCollector#waitForJobToEnd 
{{sparkListener.wait()}} is not inside a loop, as suggested in wait's javadoc:
{code}
 * As in the one argument version, interrupts and spurious wakeups are
 * possible, and this method should always be used in a loop:
{code}

Thus, due to a spurious wakeup, the wait might return without notify ever having 
been called.
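
To make the failure mode concrete, a hypothetical sketch of the problematic shape, a bare wait() that is not guarded by a condition loop; this is illustrative, not the actual JobStatisticCollector source. A spurious wakeup returns from wait() even though no notify() was issued, so the caller proceeds while the Spark job is still RUNNING, matching the error in the stack trace above.
{code}
import java.util.HashSet;
import java.util.Set;

// Illustrative anti-pattern only (hypothetical names).
public class NaiveJobEndWaiter {
    private final Set<Integer> finishedJobIds = new HashSet<>();

    public synchronized void markFinished(int jobId) {
        finishedJobIds.add(jobId);
        notifyAll();
    }

    // Bug: the condition is checked only once, so a spurious wakeup ends the
    // wait early. The fix is to re-check the condition in a while loop:
    //   while (!finishedJobIds.contains(jobId)) { wait(); }
    public synchronized void waitForJobToEnd(int jobId) throws InterruptedException {
        if (!finishedJobIds.contains(jobId)) {
            wait();
        }
    }
}
{code}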



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)