[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289058#comment-16289058 ]

Adam Szita commented on PIG-5318:
---------------------------------

[~nkollar], +1 for [^PIG-5318_6.patch], committed to trunk.

I think we should also upgrade the Spark 2 minor version in Pig on Spark to 2.2. We don't want to maintain support for 1.6.1, 2.1.1, and 2.2.0 at the same time; we should rather have one minor version per major version. Created PIG-5321 to track the upgrade.

> Unit test failures on Pig on Spark with Spark 2.2
> -------------------------------------------------
>
>                 Key: PIG-5318
>                 URL: https://issues.apache.org/jira/browse/PIG-5318
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: Nandor Kollar
>            Assignee: Nandor Kollar
>         Attachments: PIG-5318_1.patch, PIG-5318_2.patch, PIG-5318_3.patch, PIG-5318_4.patch, PIG-5318_5.patch, PIG-5318_6.patch
>
>
> There are several failing cases when executing the unit tests with Spark 2.2:
> {code}
> org.apache.pig.test.TestAssert#testNegativeWithoutFetch
> org.apache.pig.test.TestAssert#testNegative
> org.apache.pig.test.TestEvalPipeline2#testNonStandardDataWithoutFetch
> org.apache.pig.test.TestScalarAliases#testScalarErrMultipleRowsInInput
> org.apache.pig.test.TestStore#testCleanupOnFailureMultiStore
> org.apache.pig.test.TestStoreInstances#testBackendStoreCommunication
> org.apache.pig.test.TestStoreLocal#testCleanupOnFailureMultiStore
> {code}
> All of these are related to fixes/changes in Spark.
> The TestAssert, TestScalarAliases and TestEvalPipeline2 failures could be fixed by asserting on the message of the exception's root cause; it looks like on Spark 2.2 the exception is wrapped in an additional layer.
> The TestStore and TestStoreLocal failures are also test-related problems: it looks like SPARK-7953 is fixed in Spark 2.2.
> The root cause of the TestStoreInstances failure is yet to be found.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
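The issue description above proposes asserting on the message of the exception's root cause, because Spark 2.2 appears to add one more wrapping layer. A minimal sketch of that idea follows; it is not taken from the attached patches, and the class name, method name and exception messages are hypothetical.

```java
// Hypothetical helper for illustration only; not code from PIG-5318_*.patch.
final class RootCauseSketch {

    /** Follows getCause() links until the innermost exception is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Made-up messages; the real ones depend on the test and the Spark version.
        Exception inner = new IllegalStateException("Assertion violated");
        Exception oneLayer = new RuntimeException("Job failed", inner);
        Exception twoLayers = new RuntimeException("Stage aborted", oneLayer);

        // Both lines print the innermost message, however many wrappers exist.
        System.out.println(rootCause(oneLayer).getMessage());
        System.out.println(rootCause(twoLayers).getMessage());
    }
}
```

A test that checks `rootCause(e).getMessage()` instead of `e.getCause().getMessage()` does not depend on how many layers a given engine version adds.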
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16284463#comment-16284463 ]

Rohini Palaniswamy commented on PIG-5318:
-----------------------------------------

+1 on the patch from my side.

bq. found a universal way to tell the current Spark version

I did not suggest that because [~gezapeti] has mentioned earlier that it is internal to Spark - https://issues.apache.org/jira/browse/OOZIE-2606?focusedCommentId=15528793=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15528793. It is ok here though, as it is only for tests.

bq. testKeepGoingFailed is excluded because of SPARK-7953

As you mentioned, SPARK-7953 still does not seem to be marked as fixed. Can you try to find which jira actually fixed it, and close SPARK-7953 if it is not required anymore? Identifying what behavior change caused this might also help find other places in Pig on Spark that have to be fixed or changed for the new behavior.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283321#comment-16283321 ]

Nandor Kollar commented on PIG-5318:
------------------------------------

Attached PIG-5318_6.patch. I found a universal way to tell the current Spark version that works with both Spark 1.6.x and Spark 2.x, and there's no need to start a SparkContext. (thanks [~gezapeti] :) )
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283255#comment-16283255 ]

Nandor Kollar commented on PIG-5318:
------------------------------------

[~kellyzly] thanks for the explanation. In this case I think enabling this test is fine, and there's no need to check the Spark version, since we don't support older Spark versions.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283023#comment-16283023 ]

liyunzhang commented on PIG-5318:
---------------------------------

[~nkollar]: testKeepGoingFailed was excluded because of SPARK-7953. At that time we used Spark 1.3, and after upgrading to 1.6 we did not enable this test again.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281815#comment-16281815 ]

Nandor Kollar commented on PIG-5318:
------------------------------------

Attached PIG-5318_5.patch, which includes fixes for the TestAssert, TestScalarAliases, TestEvalPipeline2, TestStore and TestStoreLocal test cases, but doesn't fix the TestStoreInstances failure. The Spark version is determined the way Rohini suggested.

I also noticed that testKeepGoigFailed (I fixed the typo in the method name, it is now testKeepGoingFailed) was excluded from the spark exec type. I enabled this test case, since it passed in my environment with Spark 1.6, 2.1 and 2.2. [~kellyzly], do you remember why this was excluded? It looks like the Jira it refers to is not yet fixed; despite this, the test passes with Spark 1.6.x.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16280635#comment-16280635 ]

Rohini Palaniswamy commented on PIG-5318:
-----------------------------------------

bq. Should we open a separate Jira for fixing TestStoreInstances in spark mode?

Sure. It will require more time for you to come up with a solution. We can get the other ones fixed in this jira.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277122#comment-16277122 ]

Rohini Palaniswamy commented on PIG-5318:
-----------------------------------------

bq. it looks like the way I wanted to tell Spark version doesn't work on Spark 1.x

Missed this earlier. If the spark-version-info.properties file is not there, you could just return false for isSpark2_2_plus, which will be easier.
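The fallback suggested above could be sketched roughly as follows. This is an illustration under assumptions, not the code from any attached patch: the resource is assumed to be on the classpath under exactly that name, the `version` property key is an assumption, and the class and the `fromStream` helper are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical sketch; resource location and property key are assumptions.
final class SparkVersionFileSketch {
    static final String VERSION_FILE = "spark-version-info.properties";
    static final String VERSION_KEY = "version"; // assumed key name

    /** Looks the file up on the classpath; a missing file simply yields false. */
    static boolean isSpark2_2_plus() {
        ClassLoader loader = SparkVersionFileSketch.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(VERSION_FILE)) {
            return fromStream(in);
        } catch (IOException e) {
            return false;
        }
    }

    /** Returns true only when a parsable version of 2.2 or later is found. */
    static boolean fromStream(InputStream in) {
        if (in == null) {
            return false; // file not present: treat as "not 2.2+"
        }
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            return false;
        }
        String version = props.getProperty(VERSION_KEY);
        if (version == null) {
            return false;
        }
        String[] parts = version.trim().split("\\.");
        try {
            int major = Integer.parseInt(parts[0]);
            int minor = Integer.parseInt(parts[1]);
            return major > 2 || (major == 2 && minor >= 2);
        } catch (NumberFormatException | ArrayIndexOutOfBoundsException e) {
            return false; // unparsable version string
        }
    }
}
```

Treating every failure mode (missing file, missing key, unparsable value) as `false` keeps the helper free of checked exceptions, which is convenient when it is only used to pick test expectations.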
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277120#comment-16277120 ]

Rohini Palaniswamy commented on PIG-5318:
-----------------------------------------

bq. but how about modifying PigOutputFormat, like I did in the patch (making the relevant variables static)?

This cannot be done. It is hacky and will break Pig local mode and Tez. In local mode, the same JVM is used to execute the whole script, which can have parallel STORE statements. Tez also allows storing to multiple outputs from the same vertex in a DAG, i.e. multiple PigOutputFormat instances in the same JVM.

bq. isSpark2_1_minus

Can you make it isSpark2_2_plus, which is slightly more intuitive than 2_1_minus? Also, instantiating SparkContext just to get the version seems overkill. I prefer the previous logic you had. Is there any reason that could not be used?
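To illustrate the objection above, the sketch below uses two toy classes (hypothetical stand-ins, not the real PigOutputFormat) to show what happens when per-store state lives in a static field and two instances exist in the same JVM.

```java
// Toy stand-ins for illustration; these are not Pig classes.
final class StaticStateSketch {

    /** Keeps its store location in a static field, shared by every instance. */
    static final class StaticFormat {
        private static String storeLocation;

        StaticFormat(String location) {
            storeLocation = location;
        }

        String location() {
            return storeLocation;
        }
    }

    /** Keeps its store location per instance. */
    static final class InstanceFormat {
        private final String storeLocation;

        InstanceFormat(String location) {
            this.storeLocation = location;
        }

        String location() {
            return storeLocation;
        }
    }

    public static void main(String[] args) {
        // Two STORE targets handled in one JVM, as in local mode or one Tez vertex.
        StaticFormat a = new StaticFormat("out/a");
        StaticFormat b = new StaticFormat("out/b");
        System.out.println(a.location()); // out/b: the second store overwrote the first

        InstanceFormat c = new InstanceFormat("out/a");
        InstanceFormat d = new InstanceFormat("out/b");
        System.out.println(c.location()); // out/a: each instance keeps its own state
    }
}
```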
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16276747#comment-16276747 ]

Nandor Kollar commented on PIG-5318:
------------------------------------

Attached PIG-5318_4.patch. It looks like the way I wanted to tell the Spark version doesn't work on Spark 1.x, so I am using SparkContext#version instead.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16276557#comment-16276557 ]

Nandor Kollar commented on PIG-5318:
------------------------------------

bq. You should just do isSpark2_x (sparkVersion.startsWith("2.")) instead of isSpark2_2_x . If Spark 2.3 gets released, then code will have to change.

You're right, but matching on 2.x is not good enough. On Spark 2.1, abortTask and abortJob are not called (see SPARK-7953), but in Spark 2.2 this looks like it is fixed. I'll update the patch soon; we should match Spark 2.2+.

bq. Spark should consistently use the same OutputFormat instance in this case

Ok, so I guess this should be a new Jira for Spark. However, Spark 2.2 is already released, and it creates multiple OutputFormat instances, as described before. Indeed, we shouldn't modify the test case, but how about modifying PigOutputFormat, like I did in the patch (making the relevant variables static)?
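The difference between the two checks discussed above can be sketched as follows. The method names mirror the ones mentioned in the thread, but the bodies are an illustrative assumption rather than the code in any attached patch.

```java
// Illustrative only; not the implementation from the PIG-5318 patches.
final class SparkVersionMatchSketch {

    /** Prefix check: true for every 2.x release, including 2.1. */
    static boolean isSpark2_x(String version) {
        return version.startsWith("2.");
    }

    /** Numeric check: true for 2.2 and anything newer, false for 2.1 and older. */
    static boolean isSpark2_2_plus(String version) {
        String[] parts = version.split("\\.");
        try {
            int major = Integer.parseInt(parts[0]);
            int minor = Integer.parseInt(parts[1]);
            return major > 2 || (major == 2 && minor >= 2);
        } catch (NumberFormatException | ArrayIndexOutOfBoundsException e) {
            return false; // unparsable input is treated as "older"
        }
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.6.1", "2.1.1", "2.2.0", "2.3.0"}) {
            System.out.println(v + " 2.x=" + isSpark2_x(v) + " 2.2+=" + isSpark2_2_plus(v));
        }
    }
}
```

Comparing the parsed numbers rather than the strings also keeps a hypothetical "2.10.0" on the correct side of the threshold.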
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274766#comment-16274766 ]

Rohini Palaniswamy commented on PIG-5318:
-----------------------------------------

You should just do isSpark2_x (sparkVersion.startsWith("2.")) instead of isSpark2_2_x. If Spark 2.3 gets released, then the code will have to change.

bq. Not sure if it is a bug in Pig, or in Spark, should Spark consistently use the same OutputFormat instance in this case?

Spark should consistently use the same OutputFormat instance in this case. We should not be modifying the test case. There will be users who use local variables in StoreFunc for some computation at least.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274385#comment-16274385 ]

Nandor Kollar commented on PIG-5318:
------------------------------------

Attached PIG-5318_2.patch; I addressed Rohini's comments there.

As for the {{TestStoreInstances}} failure, it looks like Spark (unlike Tez and MapReduce) creates multiple instances of {{PigOutputFormat}} while setting up the output committers: [setupCommitter|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L74] is called from both [setupJob|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L138] and [setupTask|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala#L165], and {{setupCommitter}} creates a new {{PigOutputFormat}} each time, saving it in a private variable. In addition, when Spark writes to files, a new {{PigOutputFormat}} is [getting created|https://github.com/apache/spark/blob/branch-2.2/core/src/main/scala/org/apache/spark/internal/io/SparkHadoopMapReduceWriter.scala#L75] too. Since POStores are saved in and deserialized from the configuration, but the StoreFuncInterface inside a store is [transient|https://github.com/apache/pig/blob/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POStore.java#L53], a new instance of {{STFuncCheckInstances}} is created each time, so {{putNext}} and {{commitTask}} use different array instances.

I'm not sure whether this is a bug in Pig or in Spark: should Spark consistently use the same OutputFormat instance in this case? Making {{reduceStores}}, {{mapStores}} and {{currentConf}} static inside {{TestStoreInstances}} would solve the problem. [~rohini], [~kellyzly], what do you think about this solution?
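The mechanism described above (a transient field that is rebuilt after every deserialization) can be reproduced with plain Java serialization. The classes below are simplified, hypothetical stand-ins; they are not POStore or STFuncCheckInstances, and the real code path goes through the Hadoop configuration rather than a direct round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration; not the Pig classes named in the comment.
final class TransientStoreSketch {

    /** Plays the role of a store function that collects rows in an instance field. */
    static final class RecordingStoreFunc {
        final List<String> rows = new ArrayList<>();
    }

    /** Plays the role of a store operator whose store function is transient. */
    static final class Store implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient RecordingStoreFunc storeFunc;

        /** The transient field is null after deserialization, so it is re-created lazily. */
        RecordingStoreFunc storeFunc() {
            if (storeFunc == null) {
                storeFunc = new RecordingStoreFunc();
            }
            return storeFunc;
        }
    }

    /** Serializes and deserializes a store, as happens when it travels via configuration. */
    static Store roundTrip(Store store) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(store);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Store) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Store original = new Store();
        Store usedByWriter = roundTrip(original);    // copy that would receive putNext
        Store usedByCommitter = roundTrip(original); // copy that would receive commitTask

        usedByWriter.storeFunc().rows.add("row1");
        // The committer's copy has its own, empty list.
        System.out.println(usedByCommitter.storeFunc().rows.isEmpty()); // true
    }
}
```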
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272501#comment-16272501 ]

Nandor Kollar commented on PIG-5318:
------------------------------------

Thanks [~rohini] and [~kellyzly] for your review! Hm, I think I understand the point of TestStoreInstances now, and indeed, my change to that test looks pointless. I'm afraid this might be a bug and not a test issue. I'll continue investigating why it is failing and how to fix it. So far it looks like commitTask is not called on the correct OutputCommitterTestInstances instance: the array is empty.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271998#comment-16271998 ]

liyunzhang commented on PIG-5318:
---------------------------------

[~nkollar]: thanks for working on it, and thanks for [~rohini]'s comments. I just did a quick scan and have several questions:
1. Do the modifications to {{TestAssert}}, {{TestEvalPipeline}} and {{TestScalarAliases}} also suit Pig on MR and Pig on Tez? I guess they will not hurt the other engines; I just want to confirm.
2. I don't quite understand the purpose of the modification to TestStoreInstances. Were there problems with the previous code?
Before:
{code}
private ArrayList outRows;
{code}
After:
{code}
private static Map outRows = new HashMap<>();
private String location;
{code}
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271252#comment-16271252 ]

Rohini Palaniswamy commented on PIG-5318:
-----------------------------------------

A few comments:
1) TestStoreBase
 - Please add an isSpark2_x() method to Util.java after isHadoop1_x() and use that
 - mode.toString().startsWith("SPARK") -> Util.isSparkExecType(mode)
2) Changing the test case of TestStoreInstances defeats the purpose of the test.
[jira] [Commented] (PIG-5318) Unit test failures on Pig on Spark with Spark 2.2
[ https://issues.apache.org/jira/browse/PIG-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270990#comment-16270990 ]

Nandor Kollar commented on PIG-5318:
------------------------------------

[~szita], [~kellyzly] could you please review?