[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20519
  
I hope we can go with the error message one for now if we are all fine 
here. Will keep my eyes on JIRA and mailing list and check if similar issues 
are raised again ..


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-13 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/20519
  
I'm ok with either approach; the stdout approach can still suffer from 
people doing crazy things in their customizations (like messing with what 
stdout is), but at that point I'm ok with not supporting those use cases.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-12 Thread bersprockets
Github user bersprockets commented on the issue:

https://github.com/apache/spark/pull/20519
  
>Can't we just go with the error message for now?

Sure, but before we decide not to fix, did you see my comment 
[above](https://github.com/apache/spark/pull/20519#issuecomment-364677070) on a 
much less invasive fix to this issue (very little will change but will fix it 
in most cases). Summary of that comment: daemon.py should write some expected 
header to stdout before writing the port number to let PythonWorkerFactory know 
when the daemon's output starts.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20519
  
For the change itself in this PR. it looks a overkill .. Can't we just go 
with the error message for now?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-10 Thread bersprockets
Github user bersprockets commented on the issue:

https://github.com/apache/spark/pull/20519
  
>yea but we can't simply flush and ignore the stdout specifically from 
sitecustomize unless we define a kind of an additional protocol like this 
because we can't simply distinguish if the output

We might be able to distinguish between sitecustomize.py output and 
daemon.py output. Assuming the code in the sitecustomize.py is not 
multi-threaded, we can assume all output from sitecustomize.py comes *before* 
any output from daemon.py. Therefore, if daemon.py first prints a "magic 
number" or some other string that is unlikely to show up in sitecustomize.py 
output, PythonWorkerFactory.startDaemon() will know when daemon.py output 
starts. daemon.py would print the port number only after printing this magic 
value. For example:


daemon port: ^@^@\325


Once the scala code sees "daemon port: " in the launched process's stdout, 
it knows the next 4 bytes are the port number.

However, if sitecustomize.py starts multi-threaded code (and if that's even 
possible, that's a corner-corner-corner case), its output could potentially be 
interleaved with the daemon's output. Also, I am not sure sitecustomize.py 
output is guaranteed to show up first in stdout, but it seems reasonable that 
it would.




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20519
  
Hm .. yea but we can't simply flush and ignore the stdout specifically from 
sitecustomize unless we define a kind of an additional protocol like this 
because we can't simply distinguish if the output is from sitecustomize or the 
daemon but they are put int stdout together.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20519
  
I am not yet fully sure about how sitecustomize fills some bytes to stdout 
but I believe it doesn't always happen because at least we have it for Python 
coverage - 
https://github.com/apache/spark/blob/master/python/test_coverage/sitecustomize.py.

The only way I used to verify this PR and 
https://github.com/apache/spark/pull/20424 was to manually add some bytes to 
stdout.

So, I am still thinking it's a corner case .. Can we otherwise simply flush 
and remove the stdout right before starting the daemon?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20519
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20519
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87267/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20519
  
**[Test build #87267 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87267/testReport)**
 for PR 20519 at commit 
[`e07806e`](https://github.com/apache/spark/commit/e07806e4180a79badfb86ac67fd5707317525456).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20519
  
**[Test build #87267 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87267/testReport)**
 for PR 20519 at commit 
[`e07806e`](https://github.com/apache/spark/commit/e07806e4180a79badfb86ac67fd5707317525456).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20519
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87262/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20519
  
**[Test build #87262 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87262/testReport)**
 for PR 20519 at commit 
[`3dc4f9a`](https://github.com/apache/spark/commit/3dc4f9a62c5b0559fa5d9601f17516dcb9e16530).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20519
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20519
  
**[Test build #87262 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87262/testReport)**
 for PR 20519 at commit 
[`3dc4f9a`](https://github.com/apache/spark/commit/3dc4f9a62c5b0559fa5d9601f17516dcb9e16530).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20519
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87173/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20519
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20519
  
**[Test build #87173 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87173/testReport)**
 for PR 20519 at commit 
[`6c38829`](https://github.com/apache/spark/commit/6c38829ea5c81db0d8bd62fde249f2c01ff780e7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20519
  
**[Test build #87173 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87173/testReport)**
 for PR 20519 at commit 
[`6c38829`](https://github.com/apache/spark/commit/6c38829ea5c81db0d8bd62fde249f2c01ff780e7).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20519
  
**[Test build #87171 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87171/testReport)**
 for PR 20519 at commit 
[`c63e8f4`](https://github.com/apache/spark/commit/c63e8f4b91e32ef5c3eb87ff5fafb0e7ad30261e).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20519
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87171/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20519
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20519
  
**[Test build #87171 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87171/testReport)**
 for PR 20519 at commit 
[`c63e8f4`](https://github.com/apache/spark/commit/c63e8f4b91e32ef5c3eb87ff5fafb0e7ad30261e).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-07 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/20519
  
ok to test


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-06 Thread bersprockets
Github user bersprockets commented on the issue:

https://github.com/apache/spark/pull/20519
  
>corner case described in the JIRA

I don't have a good feeling for how often people use python site 
customizations. I did encounter a real life case that caused the failure 
described in the Jira.

This is your area, I will leave it up to you. I still have the first PR 
where we just better report this case rather than try to fix it.

The logic of the code is largely the same:
- launch the daemon
- read the port number from a DataInputStream connected to the daemon.

The steps to set up a socket, and the try/catches required, add quite a few 
lines, even though the end result is the same (read an int from a 
DataInputStream connected to the daemon).

That being said, the code does add some complexities and timing differences.



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20519
  
Isn't it too a invasive change to deal with a corner case described in the 
JIRA .. ? Did I maybe miss something?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20519
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20519: [Spark-23240][python] Don't let python site customizatio...

2018-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20519
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org