[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 @gatorsmile Thanks! I will close it. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr Thank you for your contribution! The PR has been merged using your Github account. Could you close this?
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20568 I think we can close this now.
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20568 Submitted the PR https://github.com/apache/spark/pull/20630 to take this over.
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20568 To speed up the work here, I will take this over. All credit for the contribution should go to @mrkm4ntr. Thanks for your work, @mrkm4ntr!
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/20568 I think this should block RC4 :( For ML, it's really important that MurmurHash3 behave consistently across platforms. However, for ML, we'll need to maintain the old implementation of MurmurHash3 to maintain the behavior of ML Pipelines exported from previous versions of Spark. I'll create & link a JIRA here in a moment.
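The point about keeping the old implementation for previously exported pipelines can be illustrated with a minimal Python sketch. It assumes the saved model metadata records the Spark version it was written with; the `sparkVersion` key, the `cutoff` value, and both hash functions (`zlib.crc32` and `zlib.adler32` as stand-ins) are assumptions for illustration, not Spark's actual loader logic.

```python
import zlib


def legacy_hash(data: bytes) -> int:
    # Stand-in for the old hash that previously exported pipelines rely on.
    return zlib.crc32(data)


def new_hash(data: bytes) -> int:
    # Stand-in for a corrected hash used by newly trained pipelines.
    return zlib.adler32(data)


def parse_version(version: str) -> tuple:
    # "2.2.1" -> (2, 2); only major and minor are compared.
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def hash_for_saved_model(metadata: dict, cutoff: tuple = (2, 3)):
    # Hypothetical rule: models saved before `cutoff` keep the legacy hash
    # so their feature indices stay the same after loading.
    saved = parse_version(metadata["sparkVersion"])
    return new_hash if saved >= cutoff else legacy_hash
```

With this kind of dispatch, a pipeline exported by an older release keeps producing the indices it was trained with, while newly trained pipelines can use the corrected hash.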
Github user sameeragarwal commented on the issue: https://github.com/apache/spark/pull/20568 @hvanhovell just to make sure, given the dependency on `FeatureHasher`, should this block RC4?
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr this is a legitimate failure. Can you fix the Python tests?
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20568 Jenkins, retest this please.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87509/ Test FAILed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Merged build finished. Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87509 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87509/testReport)** for PR 20568 at commit [`c20cd97`](https://github.com/apache/spark/commit/c20cd97d7ce5690993b4490bb7cca955e7703d90). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87509 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87509/testReport)** for PR 20568 at commit [`c20cd97`](https://github.com/apache/spark/commit/c20cd97d7ce5690993b4490bb7cca955e7703d90).
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20568 Jenkins, retest this please.
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20568 retest this please.
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20568 Jenkins, retest this please.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87501/ Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87501 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87501/testReport)** for PR 20568 at commit [`c20cd97`](https://github.com/apache/spark/commit/c20cd97d7ce5690993b4490bb7cca955e7703d90). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87501/testReport)** for PR 20568 at commit [`c20cd97`](https://github.com/apache/spark/commit/c20cd97d7ce5690993b4490bb7cca955e7703d90).
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20568 Jenkins, retest this please.
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20568 Retest this please
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr Do not worry about these failures. We know that some tests are unstable, and the community is trying to fix them. For now, we have to keep re-triggering the tests.
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 I cannot reproduce this failure of the test in my environment. It seems to me that this is not related to this change...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20568 Retest this please
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Merged build finished. Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87472 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87472/testReport)** for PR 20568 at commit [`336bce0`](https://github.com/apache/spark/commit/336bce0d38d2068d12c4ba647da084e65bf30c93). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87472/ Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87472 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87472/testReport)** for PR 20568 at commit [`336bce0`](https://github.com/apache/spark/commit/336bce0d38d2068d12c4ba647da084e65bf30c93).
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20568 Jenkins, test this please
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 @hvanhovell I added a new method and changed the code so that it is called only from `FeatureHasher`.
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 @hvanhovell I sent an e-mail to the topic `[VOTE] Spark 2.3.0 (RC3)`.
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 I registered on the dev list with the same user name.
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr I see your point. Adding a method to Murmur3 would work. The problem is that we are now going to release a `FeatureHasher` in Spark 2.3 that uses the current Murmur3 implementation. If we change this to use the correct Murmur3 implementation after the release of Spark 2.3 we will break all models using feature hashing created using Spark 2.3. This might be a blocker. Can you send an e-mail to the dev list? cc @sameeragarwal @srowen for more visibility.
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20568 How about adding a new config to control whether to use the new Murmur3 hash function, with the default turned off? We would also have to document the change explicitly. WDYT @gatorsmile @hvanhovell @cloud-fan?
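The config proposal can be pictured as a flag that defaults to the existing behavior. The following is a minimal Python sketch, not a Spark API: the `use_new_hash` flag name is hypothetical, and `zlib.crc32` and `zlib.adler32` stand in for the old and new hash functions.

```python
import zlib
from dataclasses import dataclass


@dataclass
class HashConfig:
    # Hypothetical flag name; off by default so existing behavior is unchanged.
    use_new_hash: bool = False


def legacy_hash(data: bytes) -> int:
    # Stand-in for the current hash implementation.
    return zlib.crc32(data)


def new_hash(data: bytes) -> int:
    # Stand-in for the corrected hash implementation.
    return zlib.adler32(data)


def select_hash(config: HashConfig):
    # Callers opt in explicitly; everyone else keeps the legacy function.
    return new_hash if config.use_new_hash else legacy_hash
```

A default of off keeps existing data and models readable, at the cost that users only get the corrected hash if they know to enable the flag, which is why the comment also asks for explicit documentation.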
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 @hvanhovell The main motivation is to enable online prediction with parameters trained using FeatureHasher in MLlib. If the generated hash value differs from that of implementations in other languages, the indices of the coefficients do not match and predictions are incorrect. But I agree that backward compatibility is more important. Since FeatureHasher will first be added in Spark 2.3.0, how about adding a new method with this change to Murmur3 and using it only from FeatureHasher?
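To make the index mismatch concrete, here is a minimal Python sketch. It compares a standard 32-bit MurmurHash3 (x86 variant) with a variant that mixes each trailing byte as if it were a full block. The second function is an assumption about the kind of divergence being discussed, written for illustration only; it is not copied from Spark's code, and `feature_index` and its `num_features` default are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def _rotl32(x, r):
    return ((x << r) | (x >> (32 - r))) & MASK32


def _mix_k1(k1):
    k1 = (k1 * 0xCC9E2D51) & MASK32
    k1 = _rotl32(k1, 15)
    return (k1 * 0x1B873593) & MASK32


def _mix_h1(h1, k1):
    h1 ^= k1
    h1 = _rotl32(h1, 13)
    return (h1 * 5 + 0xE6546B64) & MASK32


def _fmix(h1, length):
    h1 ^= length & MASK32
    h1 ^= h1 >> 16
    h1 = (h1 * 0x85EBCA6B) & MASK32
    h1 ^= h1 >> 13
    h1 = (h1 * 0xC2B2AE35) & MASK32
    return h1 ^ (h1 >> 16)


def _hash_blocks(data, seed):
    # Mix all complete 4-byte little-endian blocks; return state and tail.
    h1 = seed & MASK32
    aligned = len(data) - len(data) % 4
    for i in range(0, aligned, 4):
        k1 = int.from_bytes(data[i:i + 4], "little")
        h1 = _mix_h1(h1, _mix_k1(k1))
    return h1, data[aligned:]


def murmur3_standard(data, seed=0):
    # Standard tail handling: remaining 1-3 bytes form one partial block
    # that is XORed into the state without the extra rotate/multiply step.
    h1, tail = _hash_blocks(data, seed)
    if tail:
        h1 ^= _mix_k1(int.from_bytes(tail, "little"))
    return _fmix(h1, len(data))


def murmur3_bytewise_tail(data, seed=0):
    # Assumed divergent variant: every trailing byte is mixed as a full block.
    # (Bytes are treated as unsigned here, which is a simplification.)
    h1, tail = _hash_blocks(data, seed)
    for b in tail:
        h1 = _mix_h1(h1, _mix_k1(b))
    return _fmix(h1, len(data))


def feature_index(hash_fn, name, num_features=1 << 18):
    # Hypothetical helper: map a feature name to a coefficient index.
    return hash_fn(name.encode("utf-8")) % num_features
```

The two functions agree whenever the input length is a multiple of 4 and can disagree otherwise, so a model trained with one and served with the other would look up coefficients at the wrong indices for many feature names.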
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr The change itself looks pretty reasonable. However I am very hesitant to merge this because this will probably break bucketing (it uses murmur3 to create the buckets); for example a bucketed table written by Spark 2.2 cannot be safely read by Spark after the change. Can you explain what problem you are trying to fix here?
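The bucketing concern can be sketched in a few lines of Python. This is a toy model, not Spark's bucketing code: the bucket of a row is taken to be the hash of its key modulo the bucket count, and `zlib.crc32` and `zlib.adler32` are stand-ins for the hash before and after a change. All function names here are hypothetical.

```python
import zlib


def bucket_id(hash_fn, key, num_buckets):
    # Bucket assignment: hash of the key modulo the number of buckets.
    return hash_fn(key.encode("utf-8")) % num_buckets


def write_bucketed(keys, hash_fn, num_buckets):
    # "Write" a table by placing each key in the bucket its hash selects.
    buckets = {b: set() for b in range(num_buckets)}
    for key in keys:
        buckets[bucket_id(hash_fn, key, num_buckets)].add(key)
    return buckets


def lookup(buckets, hash_fn, key):
    # A reader only scans the single bucket that its own hash points to.
    return key in buckets[bucket_id(hash_fn, key, len(buckets))]


old_hash = zlib.crc32    # stand-in for the hash used at write time
new_hash = zlib.adler32  # stand-in for a changed hash used at read time

keys = [f"user-{i}" for i in range(100)]
table = write_bucketed(keys, old_hash, num_buckets=8)

found_with_old = sum(lookup(table, old_hash, k) for k in keys)
found_with_new = sum(lookup(table, new_hash, k) for k in keys)
```

Reading with the same hash that wrote the table finds every key, while reading with a different hash looks in the wrong bucket for most keys, which is why a table bucketed by an older release could not be read safely after the hash changes.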
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 @kiszk Thank you for your review! I fixed it.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Can one of the admins verify this patch?