[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59686863
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21906/consoleFull)
 for   PR 2824 at commit 
[`be0533a`](https://github.com/apache/spark/commit/be0533a88f6b624629ac66cfeb9989337c002cfd).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-20 Thread jerryshao
Github user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59688035
  
Hi @JoshRosen , I just set `transferToEnabled` to false as default value, 
unless users explicitly set it to true, `transferTo` will not be enabled. 

Currently, only `ExternalSorter` use this API as file to file copying and 
this is controlled by configuration `spark.file.transferTo`, other uses of 
`copyStream` in Spark code are all not file to file copying, so this parameter 
will not take effect.

If future uses of `copyStream`, user have to get `transferToEnabled` from 
configuration, I add some usage notes here.  Still user can bypass 
`spark.file.transferTo` and directly set this parameter to true, but they have 
to be responsible for the correctness of usage.

The reason I didn't take `SparkConf` as a parameter to control the behavior 
is that it should modify lots of the current codes to get `SparkConf` in which 
it calls `copyStream`.

So what is your opinion? Thanks a lot.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-20 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59688343
  
HI @jerryshao,

Changing the default is exactly what I had in mind.  This looks good to me! 
 (Going to bed now; I'll merge this tomorrow and backport to `branch-1.1`)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-20 Thread jerryshao
Github user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59688531
  
Thanks a lot :).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59691398
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21906/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59691389
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21906/consoleFull)
 for   PR 2824 at commit 
[`be0533a`](https://github.com/apache/spark/commit/be0533a88f6b624629ac66cfeb9989337c002cfd).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/2824


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-20 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59805036
  
I've merged this into `master` and `branch-1.1`.

 Thanks a lot :).

Thank YOU (and @mridulm) for helping to diagnose this really subtle bug!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/2824#discussion_r19065298
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -272,20 +272,38 @@ private[spark] object Utils extends Logging {
   /** Copy all data from an InputStream to an OutputStream */
   def copyStream(in: InputStream,
  out: OutputStream,
- closeStreams: Boolean = false): Long =
+ closeStreams: Boolean = false,
+ transferToEnabled: Boolean = true): Long =
   {
 var count = 0L
 try {
-  if (in.isInstanceOf[FileInputStream]  
out.isInstanceOf[FileOutputStream]) {
+  if (in.isInstanceOf[FileInputStream]  
out.isInstanceOf[FileOutputStream]
+ transferToEnabled) {
 // When both streams are File stream, use transferTo to improve 
copy performance.
 val inChannel = in.asInstanceOf[FileInputStream].getChannel()
 val outChannel = out.asInstanceOf[FileOutputStream].getChannel()
+val initialPos = outChannel.position()
 val size = inChannel.size()
 
 // In case transferTo method transferred less data than we have 
required.
 while (count  size) {
   count += inChannel.transferTo(count, size - count, outChannel)
 }
+
+// Check the position after transferTo loop to see if it is in the 
right position and
+// give user information if not.
+// Position will not be increased to the expected length after 
calling transferTo in
+// kernel version 2.6.32, this issue can be seen in
+// https://bugs.openjdk.java.net/browse/JDK-7052359
+// This will lead to stream corruption issue when using sort-based 
shuffle (SPARK-3948).
+val finalPos = outChannel.position()
+assert(finalPos == initialPos + size,
+  s
+ |Current position $finalPos do not equal to expected position 
${initialPos + count}
--- End diff --

I think normally `size` would be equal to `count`. I will change to `size` 
to keep consistency.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/2824#discussion_r19065507
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -272,20 +272,38 @@ private[spark] object Utils extends Logging {
   /** Copy all data from an InputStream to an OutputStream */
   def copyStream(in: InputStream,
  out: OutputStream,
- closeStreams: Boolean = false): Long =
+ closeStreams: Boolean = false,
+ transferToEnabled: Boolean = true): Long =
--- End diff --

IMHO, I think NIO way will gain much better performance, especially in 
shuffle. Though we met some issues with specific kernel version, mostly this 
exception will not be met, I think we should balance the trade-off between 
seldom met exception and performance gain. 

If we hopes people to never use `transferTo`, why not directly remove this 
code path and configuration, this will be more straightforward.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-19 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2824#discussion_r19066318
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -272,20 +272,38 @@ private[spark] object Utils extends Logging {
   /** Copy all data from an InputStream to an OutputStream */
   def copyStream(in: InputStream,
  out: OutputStream,
- closeStreams: Boolean = false): Long =
+ closeStreams: Boolean = false,
+ transferToEnabled: Boolean = true): Long =
--- End diff --

To be clear: I agree that `transferTo` should be enabled.  My concern here 
was simply whether setting `spark.file.transfterTo` will _always_ control the 
use of `transferTo`.  I'm fine having transferTo as the default, but I'd like 
it if there was an option to disable all uses of it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-19 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/2824#discussion_r19066582
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -272,20 +272,38 @@ private[spark] object Utils extends Logging {
   /** Copy all data from an InputStream to an OutputStream */
   def copyStream(in: InputStream,
  out: OutputStream,
- closeStreams: Boolean = false): Long =
+ closeStreams: Boolean = false,
+ transferToEnabled: Boolean = true): Long =
--- End diff --

Thanks Josh, I think I start to know your concern. My previous modification 
may not well control the behavior of `transferTo` in all uses of it, but this 
brings less impact to the current code path. If we'd like to control it in all 
the places, we have to pass and verify the configuration in all the places 
where use `copyStream`, which is not so elegant in some codes which have to to 
get SparkConf. I will think a better way to address this. Thanks a lot.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-17 Thread jerryshao
Github user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59475885
  
Hi @JoshRosen , I just add a configuration that can bypass the NIO way of 
copying stream. Would you mind taking a look at it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59475982
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21854/consoleFull)
 for   PR 2824 at commit 
[`a82b184`](https://github.com/apache/spark/commit/a82b18423f57c5f584d93d2702d710d1cde843c2).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59483494
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21854/consoleFull)
 for   PR 2824 at commit 
[`a82b184`](https://github.com/apache/spark/commit/a82b18423f57c5f584d93d2702d710d1cde843c2).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59483499
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21854/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-17 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2824#discussion_r19034004
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -272,20 +272,38 @@ private[spark] object Utils extends Logging {
   /** Copy all data from an InputStream to an OutputStream */
   def copyStream(in: InputStream,
  out: OutputStream,
- closeStreams: Boolean = false): Long =
+ closeStreams: Boolean = false,
+ transferToEnabled: Boolean = true): Long =
   {
 var count = 0L
 try {
-  if (in.isInstanceOf[FileInputStream]  
out.isInstanceOf[FileOutputStream]) {
+  if (in.isInstanceOf[FileInputStream]  
out.isInstanceOf[FileOutputStream]
+ transferToEnabled) {
 // When both streams are File stream, use transferTo to improve 
copy performance.
 val inChannel = in.asInstanceOf[FileInputStream].getChannel()
 val outChannel = out.asInstanceOf[FileOutputStream].getChannel()
+val initialPos = outChannel.position()
 val size = inChannel.size()
 
 // In case transferTo method transferred less data than we have 
required.
 while (count  size) {
   count += inChannel.transferTo(count, size - count, outChannel)
 }
+
+// Check the position after transferTo loop to see if it is in the 
right position and
+// give user information if not.
+// Position will not be increased to the expected length after 
calling transferTo in
+// kernel version 2.6.32, this issue can be seen in
+// https://bugs.openjdk.java.net/browse/JDK-7052359
+// This will lead to stream corruption issue when using sort-based 
shuffle (SPARK-3948).
+val finalPos = outChannel.position()
+assert(finalPos == initialPos + size,
+  s
+ |Current position $finalPos do not equal to expected position 
${initialPos + count}
--- End diff --

This `assert` checks whether `finalPos` is `initialPos + size`, but this 
error message uses `initialPos + count`; could this lead to confusion?

I suppose that `count = size` here, so it's probably fine, but it might be 
confusing if `count` was ever greater than `size`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-17 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/2824#discussion_r19034247
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -272,20 +272,38 @@ private[spark] object Utils extends Logging {
   /** Copy all data from an InputStream to an OutputStream */
   def copyStream(in: InputStream,
  out: OutputStream,
- closeStreams: Boolean = false): Long =
+ closeStreams: Boolean = false,
+ transferToEnabled: Boolean = true): Long =
--- End diff --

Should we change this default to `false`?  This would help us to guard 
against cases in which new code uses `copyStream` on File[Input|Output]Streams 
but the programmer forgot to read the configuration option.  I guess my concern 
is that I want to ensure that if `spark.file.transferTo` is false, then we are 
guaranteed to _never_ use the `transferTo` path in this function.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread jerryshao
GitHub user jerryshao opened a pull request:

https://github.com/apache/spark/pull/2824

[SPARK-3948][Shuffle]Fix stream corruption bug in sort-based shuffle

Kernel 2.6.32 bug will lead to unexpected behavior of transferTo in 
copyStream, and this will corrupt the shuffle output file in sort-based 
shuffle, which will somehow introduce PARSING_ERROR(2), deserialization error 
or offset out of range. Here fix this by adding append flag, also add some 
position checking code. Details can be seen in 
[SPARK-3948](https://issues.apache.org/jira/browse/SPARK-3948).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jerryshao/apache-spark SPARK-3948

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2824.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2824


commit b47cc9f8f32e01868288ec82eedc1f421a717a8b
Author: jerryshao saisai.s...@intel.com
Date:   2014-10-16T08:45:51Z

Fix kernel 2.6.32 bug led unexpected behavior of transferTo




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread CodingCat
Github user CodingCat commented on a diff in the pull request:

https://github.com/apache/spark/pull/2824#discussion_r18956465
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -280,12 +280,29 @@ private[spark] object Utils extends Logging {
 // When both streams are File stream, use transferTo to improve 
copy performance.
 val inChannel = in.asInstanceOf[FileInputStream].getChannel()
 val outChannel = out.asInstanceOf[FileOutputStream].getChannel()
+val initialPos = outChannel.position()
 val size = inChannel.size()
 
 // In case transferTo method transferred less data than we have 
required.
 while (count  size) {
   count += inChannel.transferTo(count, size - count, outChannel)
 }
+
+// Check the position after transferTo loop to see if it is in the 
right position and
+// give user information if not.
+// Position will not be increased to the expected length after 
calling transferTo in
+// kernel version 2.6.32, this issue can be seen in
+// scalastyle:off
+// 
https://bugs.openjdk.java.net/browse/JDK-7052359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel)
--- End diff --

en...I guess this line will trigger scalastyle checker error


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread jerryshao
Github user jerryshao commented on a diff in the pull request:

https://github.com/apache/spark/pull/2824#discussion_r18957257
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -280,12 +280,29 @@ private[spark] object Utils extends Logging {
 // When both streams are File stream, use transferTo to improve 
copy performance.
 val inChannel = in.asInstanceOf[FileInputStream].getChannel()
 val outChannel = out.asInstanceOf[FileOutputStream].getChannel()
+val initialPos = outChannel.position()
 val size = inChannel.size()
 
 // In case transferTo method transferred less data than we have 
required.
 while (count  size) {
   count += inChannel.transferTo(count, size - count, outChannel)
 }
+
+// Check the position after transferTo loop to see if it is in the 
right position and
+// give user information if not.
+// Position will not be increased to the expected length after 
calling transferTo in
+// kernel version 2.6.32, this issue can be seen in
+// scalastyle:off
+// 
https://bugs.openjdk.java.net/browse/JDK-7052359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel)
--- End diff --

I'm not sure, I found some code in KafkaUtils also use this 
`scalastyle:off` to turn off scalacheck.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59367830
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21805/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59368192
  
  [QA tests have 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21807/consoleFull)
 for   PR 2824 at commit 
[`e17ada2`](https://github.com/apache/spark/commit/e17ada2629b7ac0cc6e888cd1a0da6827159ea3b).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread CodingCat
Github user CodingCat commented on a diff in the pull request:

https://github.com/apache/spark/pull/2824#discussion_r18957913
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -280,12 +280,29 @@ private[spark] object Utils extends Logging {
 // When both streams are File stream, use transferTo to improve 
copy performance.
 val inChannel = in.asInstanceOf[FileInputStream].getChannel()
 val outChannel = out.asInstanceOf[FileOutputStream].getChannel()
+val initialPos = outChannel.position()
 val size = inChannel.size()
 
 // In case transferTo method transferred less data than we have 
required.
 while (count  size) {
   count += inChannel.transferTo(count, size - count, outChannel)
 }
+
+// Check the position after transferTo loop to see if it is in the 
right position and
+// give user information if not.
+// Position will not be increased to the expected length after 
calling transferTo in
+// kernel version 2.6.32, this issue can be seen in
+// scalastyle:off
+// 
https://bugs.openjdk.java.net/browse/JDK-7052359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel)
--- End diff --

I see, I didn't notice that line, I think that shall be good


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59380188
  
  [QA tests have 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21807/consoleFull)
 for   PR 2824 at commit 
[`e17ada2`](https://github.com/apache/spark/commit/e17ada2629b7ac0cc6e888cd1a0da6827159ea3b).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59380201
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21807/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/2824#discussion_r1823
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -280,12 +280,29 @@ private[spark] object Utils extends Logging {
 // When both streams are File stream, use transferTo to improve 
copy performance.
 val inChannel = in.asInstanceOf[FileInputStream].getChannel()
 val outChannel = out.asInstanceOf[FileOutputStream].getChannel()
+val initialPos = outChannel.position()
 val size = inChannel.size()
 
 // In case transferTo method transferred less data than we have 
required.
 while (count  size) {
   count += inChannel.transferTo(count, size - count, outChannel)
 }
+
+// Check the position after transferTo loop to see if it is in the 
right position and
+// give user information if not.
+// Position will not be increased to the expected length after 
calling transferTo in
+// kernel version 2.6.32, this issue can be seen in
+// scalastyle:off
+// 
https://bugs.openjdk.java.net/browse/JDK-7052359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel)
--- End diff --

You can remove url part after the '?' here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59461464
  
The patch works fine for me, on my 2.6.32 cluster. Thanks! 
@dbtsai  You should also try this, for your SPARK-3630 issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread pwendell
Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59468226
  
@adrian-wang, Just so I understand - were you seeing the issue before 
applying this patch, and then the patch made it go away?

@jerryshao Could we also have a branch-1.1 version of this that simply logs 
an error instead of throwing an exception (via `assert`)? I just want to be 
more conservative with the existing branch and not cause programs to terminate.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59468593
  
Maybe this is being overly-conservative, but could we add an undocumented 
configuration option that allows users to bypass the `transferTo` here?  If we 
ship 1.1.1 or 1.2 and this bug resurfaces, it would be nice if there was a way 
for users to revert back to the older version of this code without having to 
recompile.  A lot of users can't simply upgrade their kernel, so I think we 
need to offer an easy workaround for them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread adrian-wang
Github user adrian-wang commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59469437
  
@pwendell yes, I was suffering from running some certain queries over 
sort-base shuffle, just like the discussion in SPARK-3630. And with this patch 
the issue is gone. Thanks! -- My cluster is 4-node redhat 6.2, with kernel 
2.6.32.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-3948][Shuffle]Fix stream corruption bug...

2014-10-16 Thread jerryshao
Github user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/2824#issuecomment-59469537
  
I think if bug is occurred when running job, even if we do not throw an 
exception here, we will still meet other exceptions in reduce side, so I use 
`assert` here. I think an undocumented config is a good idea, but since we need 
SparkConf object to pass it, some other code path which use `copyStream` also 
need to change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org