[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20793
  
**[Test build #88160 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88160/testReport)** for PR 20793 at commit [`177afcc`](https://github.com/apache/spark/commit/177afcc4277b604b783aef40d86d93d6a9add6fc).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-11 Thread MaxGekk
Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/20793
  
The question is whether the existing output of the pseudo-random/sample
functions is guaranteed by the public API. It seems that it is not.


---




[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-11 Thread MaxGekk
Github user MaxGekk commented on the issue:

https://github.com/apache/spark/pull/20793
  
At least some tests expect particular values as the result of sample/random:
https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala#L550-L564


---




[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-10 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20793
  
Ah, the results differ because the number of operations differs. It may be
an issue like #20630.

I am curious why the tests fail when the seed changes. Of course, I
understand that the sequence produced by rand must be reproducible for a given
seed value within a package or implementation.
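
To illustrate the reproducibility point, here is a minimal sketch. It uses `scala.util.Random` as a stand-in rather than Spark's own generator, and the names `SeedReproducibilityDemo` and `firstValues` are hypothetical. It shows why a test that hard-codes expected sampled values is sensitive to any change in how the seed is derived: the same seed always replays the same sequence, so altering the effective seed alters every expected value.

```scala
import scala.util.Random

object SeedReproducibilityDemo {
  // Draw the first n pseudo-random ints in [0, 100) from a generator
  // initialized with the given seed.
  def firstValues(seed: Long, n: Int): Seq[Int] = {
    val rng = new Random(seed)
    Seq.fill(n)(rng.nextInt(100))
  }

  def main(args: Array[String]): Unit = {
    val a = firstValues(42L, 5)
    val b = firstValues(42L, 5)
    val c = firstValues(43L, 5)
    // Same seed: the two sequences are identical.
    println(s"seed 42 twice, equal: ${a == b}")
    // Different seed: the sequence is generally different, which is what
    // breaks tests that compare against fixed expected values.
    println(s"seed 42: $a")
    println(s"seed 43: $c")
  }
}
```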


---




[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20793
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20793
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/88156/
Test FAILed.


---




[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20793
  
**[Test build #88156 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88156/testReport)** for PR 20793 at commit [`bb40ef2`](https://github.com/apache/spark/commit/bb40ef2e8d337508d60903a6a824b5aa45d87326).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-10 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20793
  
Does the `hashSeed` method produce the same hash value after this change?

```scala
scala> import java.nio.ByteBuffer
import java.nio.ByteBuffer

scala> import scala.util.hashing.MurmurHash3
import scala.util.hashing.MurmurHash3

scala> def hashSeed(seed: Long): Long = {
     |   // Before the change: Long.SIZE is 64, so the buffer is 64 bytes long.
     |   val bytes = ByteBuffer.allocate(java.lang.Long.SIZE).putLong(seed).array()
     |   val lowBits = MurmurHash3.bytesHash(bytes)
     |   val highBits = MurmurHash3.bytesHash(bytes, lowBits)
     |   (highBits.toLong << 32) | (lowBits.toLong & 0xFFFFFFFFL)
     | }
hashSeed: (seed: Long)Long

scala> hashSeed(100)
res3: Long = 852394178374189935

scala> def hashSeed2(seed: Long): Long = {
     |   // After the change: Long.BYTES is 8, so the buffer holds exactly one Long.
     |   val bytes = ByteBuffer.allocate(java.lang.Long.BYTES).putLong(seed).array()
     |   val lowBits = MurmurHash3.bytesHash(bytes)
     |   val highBits = MurmurHash3.bytesHash(bytes, lowBits)
     |   (highBits.toLong << 32) | (lowBits.toLong & 0xFFFFFFFFL)
     | }
hashSeed2: (seed: Long)Long

scala> hashSeed2(100)
res7: Long = 1088402058313200430
```
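
A minimal sketch of why the two variants hash different input, assuming only the standard `java.nio.ByteBuffer` and `scala.util.hashing.MurmurHash3` APIs (the names `HashSeedBufferDemo` and `hashSeedWith` are hypothetical and not part of Spark): `java.lang.Long.SIZE` is the width of a `Long` in bits (64), while `java.lang.Long.BYTES` is its width in bytes (8). Allocating with `Long.SIZE` therefore produces a 64-byte array whose first 8 bytes hold the seed and whose remaining 56 bytes are zero padding, and that padding is included in the hash.

```scala
import java.nio.ByteBuffer
import scala.util.hashing.MurmurHash3

object HashSeedBufferDemo {
  // Same hashing steps as in the transcript above, parameterized by the
  // buffer size in bytes (hypothetical helper for illustration only).
  def hashSeedWith(bufferSize: Int, seed: Long): Long = {
    val bytes = ByteBuffer.allocate(bufferSize).putLong(seed).array()
    val lowBits = MurmurHash3.bytesHash(bytes)
    val highBits = MurmurHash3.bytesHash(bytes, lowBits)
    (highBits.toLong << 32) | (lowBits.toLong & 0xFFFFFFFFL)
  }

  def main(args: Array[String]): Unit = {
    val wide = ByteBuffer.allocate(java.lang.Long.SIZE).putLong(100L).array()
    val tight = ByteBuffer.allocate(java.lang.Long.BYTES).putLong(100L).array()

    // The backing arrays have different lengths: 64 bytes versus 8 bytes.
    println(s"Long.SIZE  = ${java.lang.Long.SIZE}, array length = ${wide.length}")
    println(s"Long.BYTES = ${java.lang.Long.BYTES}, array length = ${tight.length}")

    // The first 8 bytes carry the seed in both cases; the rest of the wide
    // array is zero padding that still feeds into the hash.
    println(s"same leading 8 bytes: ${wide.take(8).sameElements(tight)}")
    println(s"padding is all zeros: ${wide.drop(8).forall(_ == 0)}")

    println(s"hash with 64-byte buffer: ${hashSeedWith(java.lang.Long.SIZE, 100L)}")
    println(s"hash with 8-byte buffer:  ${hashSeedWith(java.lang.Long.BYTES, 100L)}")
  }
}
```

Since the hashed byte arrays differ, the derived seed differs as well, which is consistent with the two REPL results above.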


---




[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20793
  
**[Test build #88156 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88156/testReport)** for PR 20793 at commit [`bb40ef2`](https://github.com/apache/spark/commit/bb40ef2e8d337508d60903a6a824b5aa45d87326).


---




[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-10 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/20793
  
Jenkins, ok to test


---




[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-10 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20793
  
Good catch, LGTM


---




[GitHub] spark issue #20793: [SPARK-23643] Shrinking the buffer in hashSeed up to siz...

2018-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20793
  
Can one of the admins verify this patch?


---



