[GitHub] [spark] MaxGekk commented on issue #20793: [WIP][SPARK-23643] Shrinking the buffer in hashSeed up to size of the seed parameter

GitBox Wed, 13 Mar 2019 14:32:06 -0700

MaxGekk commented on issue #20793: [WIP][SPARK-23643] Shrinking the buffer in 
hashSeed up to size of the seed parameter
URL: https://github.com/apache/spark/pull/20793#issuecomment-472613858
 
 
   @shahidki31 At least it looks strange. `PowerIterationClustering` uses the 
partition index as the seed for `XORShiftRandom`: 
https://github.com/apache/spark/blob/25bcf59b3b566b77bfc8a40a4f4253b81f340aa4/mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala#L318
    Neither the partition index nor the number of partitions is controlled by 
the test, yet the test expects deterministic results. I would disable the test 
for now. @srowen WDYT?
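
To illustrate the concern, here is a minimal sketch in plain Python, not the Spark code itself. It assumes the initialization draws one Gaussian value per vertex from a generator seeded with the index of the partition that holds the vertex. `random.Random` stands in for `XORShiftRandom`, and `split_into_partitions` and `random_init` are hypothetical helpers invented for this example.

```python
import random


def split_into_partitions(items, num_partitions):
    """Hypothetical helper: split items into contiguous, roughly equal chunks."""
    size = -(-len(items) // num_partitions)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def random_init(items, num_partitions):
    """Give each item a Gaussian value from a generator seeded by its partition index."""
    result = {}
    for index, part in enumerate(split_into_partitions(items, num_partitions)):
        # Assumption: stand-in for seeding XORShiftRandom with the partition index.
        rng = random.Random(index)
        for item in part:
            result[item] = rng.gauss(0.0, 1.0)
    return result


vertices = list(range(8))
two = random_init(vertices, 2)   # partitions: [0..3], [4..7]
four = random_init(vertices, 4)  # partitions: [0,1], [2,3], [4,5], [6,7]

# Same partitioning gives the same values; different partitioning does not.
print(random_init(vertices, 2) == two)
print(two == four)
```

Under these assumptions the values are reproducible only while the partitioning stays fixed. A test that does not pin the number of partitions and the placement of vertices can therefore see different initial values from one run or environment to the next.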


