[jira] [Commented] (SPARK-23381) Murmur3 hash generates a different value from other implementations

Joseph K. Bradley (JIRA) Fri, 16 Feb 2018 12:34:25 -0800

    [ https://issues.apache.org/jira/browse/SPARK-23381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367845#comment-16367845 ]


Joseph K. Bradley commented on SPARK-23381:
-------------------------------------------

Copying my comment from the PR:
{quote}
For ML, I actually don't think this has to be a blocker. It's not great, but 
it's not a regression.

However, we should definitely fix this in the future and soon: For ML, it's 
really important that MurmurHash3 behave consistently across platforms.

To fix this, we'll need to keep the old implementation of MurmurHash3 to 
preserve the behavior of ML Pipelines exported from previous versions of Spark.
{quote}

> Murmur3 hash generates a different value from other implementations
> -------------------------------------------------------------------
>
>                 Key: SPARK-23381
>                 URL: https://issues.apache.org/jira/browse/SPARK-23381
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.1
>            Reporter: Shintaro Murakami
>            Priority: Major
>
> Murmur3 hash generates a different value from the original and other 
> implementations (such as the Scala standard library and Guava) when the length 
> of a byte array is not a multiple of 4.
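
For illustration, here is a minimal Python sketch of how such a mismatch can
arise. It is not Spark's code. The function names are hypothetical, and the
"per-byte tail" variant is an assumption about the affected behavior: each
leftover byte is sign-extended and mixed as if it were a full 4-byte block,
whereas reference Murmur3 (x86, 32-bit) packs the leftover bytes into one
partial word and only XORs it into the state.

```python
MASK = 0xFFFFFFFF
C1 = 0xCC9E2D51
C2 = 0x1B873593


def rotl32(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK


def mix_k1(k1):
    k1 = (k1 * C1) & MASK
    k1 = rotl32(k1, 15)
    return (k1 * C2) & MASK


def mix_h1(h1, k1):
    h1 ^= k1
    h1 = rotl32(h1, 13)
    return (h1 * 5 + 0xE6546B64) & MASK


def fmix(h1, length):
    """Final avalanche step; the input length is folded in first."""
    h1 ^= length
    h1 ^= h1 >> 16
    h1 = (h1 * 0x85EBCA6B) & MASK
    h1 ^= h1 >> 13
    h1 = (h1 * 0xC2B2AE35) & MASK
    h1 ^= h1 >> 16
    return h1


def hash_aligned_blocks(data, seed):
    """Mix all complete little-endian 4-byte blocks; return (state, bytes consumed)."""
    h1 = seed & MASK
    aligned = len(data) - len(data) % 4
    for i in range(0, aligned, 4):
        k1 = int.from_bytes(data[i:i + 4], "little")
        h1 = mix_h1(h1, mix_k1(k1))
    return h1, aligned


def murmur3_reference(data, seed=0):
    """Reference tail handling: pack the 1-3 leftover bytes into one word, XOR only."""
    h1, aligned = hash_aligned_blocks(data, seed)
    tail = data[aligned:]
    if tail:
        h1 ^= mix_k1(int.from_bytes(tail, "little"))
    return fmix(h1, len(data))


def murmur3_per_byte_tail(data, seed=0):
    """Assumed divergent variant: every leftover byte is mixed as its own block."""
    h1, aligned = hash_aligned_blocks(data, seed)
    for b in data[aligned:]:
        signed = b - 256 if b >= 128 else b  # sign-extend, like a JVM byte
        h1 = mix_h1(h1, mix_k1(signed & MASK))
    return fmix(h1, len(data))


if __name__ == "__main__":
    for sample in (b"abcd", b"hello"):
        print(sample, hex(murmur3_reference(sample)), hex(murmur3_per_byte_tail(sample)))
```

The two functions share all of the block-mixing code, so they agree whenever
the input length is a multiple of 4 and can only diverge on the 1 to 3
trailing bytes, which matches the condition described in the report.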



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
