Author: knoguchi
Date: Tue May 14 20:42:51 2024
New Revision: 1917725

URL: http://svn.apache.org/viewvc?rev=1917725&view=rev
Log:
PIG-5448: All TestHBaseStorage tests failing on pig-on-spark3 (knoguchi)

Modified:
    pig/trunk/CHANGES.txt
    pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java

Modified: pig/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/pig/trunk/CHANGES.txt?rev=1917725&r1=1917724&r2=1917725&view=diff
==============================================================================
--- pig/trunk/CHANGES.txt (original)
+++ pig/trunk/CHANGES.txt Tue May 14 20:42:51 2024
@@ -34,6 +34,8 @@ PIG-5416: Spark unit tests failing rando
 
 PIG-5447: Pig-on-Spark TestSkewedJoin.testSkewedJoinOuter failing with NoSuchElementException (knoguchi)
 
+PIG-5448: All TestHBaseStorage tests failing on pig-on-spark3 (knoguchi)
+
 Release 0.18.0 - Unreleased
 
 INCOMPATIBLE CHANGES

Modified: pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java
URL: http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java?rev=1917725&r1=1917724&r2=1917725&view=diff
==============================================================================
--- pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java (original)
+++ pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java Tue May 14 20:42:51 2024
@@ -591,6 +591,13 @@ public class SparkLauncher extends Launc
 
         sparkConf.setMaster(master);
         sparkConf.setAppName(pigCtxtProperties.getProperty(PigContext.JOB_NAME,"pig"));
+
+        // For non-hdfs inputs, PigSplit may show up as empty but still
+        // contain inputs when accessed. These splits should not be
+        // skipped.
+        sparkConf.set("spark.hadoopRDD.ignoreEmptySplits", "false");
+
+
         // On Spark 1.6, Netty file server doesn't allow adding the same file with the same name twice
         // This is a problem for streaming using a script + explicit ship the same script combination (PIG-5134)
         // HTTP file server doesn't have this restriction, it overwrites the file if added twice