[
https://issues.apache.org/jira/browse/SPARK-5004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388680#comment-14388680
]
Eric O. LEBIGOT (EOL) commented on SPARK-5004:
----------------------------------------------
Since this bug is unfortunately still present, I am adding some potentially
useful information:
In a Python shell, _starting_ with network settings that use _no proxy_ allows
Spark to start (that is the normal behavior).
The new observation is that switching back to network settings _with a proxy_
afterwards lets Spark run normally.
Thus, the proxy problem (likely Spark trying to reach localhost through the
proxy even though localhost is in the list of exceptions) seems to appear only
at startup, as far as I can tell.
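In case it helps others hitting this: the JVM routes plain java.net.Socket
connections through the SOCKS proxy configured by the socksProxyHost /
socksProxyPort system properties, and Java's SOCKS support has no documented
equivalent of http.nonProxyHosts, which would explain why the localhost
exception list is ignored. A possible (untested) workaround sketch is to clear
the SOCKS settings for the Spark JVMs only, leaving the OS/shell proxy
configuration untouched:

```shell
# Untested workaround sketch: unset the JVM-level SOCKS proxy for the Spark
# driver and executor JVMs, so that the loopback socket to the Python daemon
# is opened directly instead of through the SOCKS proxy.
pyspark \
  --conf "spark.driver.extraJavaOptions=-DsocksProxyHost=" \
  --conf "spark.executor.extraJavaOptions=-DsocksProxyHost="
```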
> PySpark does not handle SOCKS proxy
> -----------------------------------
>
> Key: SPARK-5004
> URL: https://issues.apache.org/jira/browse/SPARK-5004
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.2.0, 1.3.0
> Reporter: Eric O. LEBIGOT (EOL)
>
> PySpark cannot run even the quick start examples when a SOCKS proxy is used.
> Turning off the SOCKS proxy makes PySpark work.
> The Scala-shell version is not affected and works even when a SOCKS proxy is
> used.
> Is there a quick workaround, while waiting for this to be fixed?
> Here is the error message (printed, e.g., when .count() is called):
> {code}
> >>> 14/12/30 17:13:44 WARN PythonWorkerFactory: Failed to open socket to Python daemon:
> java.net.SocketException: Malformed reply from SOCKS server
> at java.net.SocksSocketImpl.readSocksReply(SocksSocketImpl.java:129)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:503)
> at java.net.Socket.connect(Socket.java:579)
> at java.net.Socket.connect(Socket.java:528)
> at java.net.Socket.<init>(Socket.java:425)
> at java.net.Socket.<init>(Socket.java:241)
> at org.apache.spark.api.python.PythonWorkerFactory.createSocket$1(PythonWorkerFactory.scala:75)
> at org.apache.spark.api.python.PythonWorkerFactory.liftedTree1$1(PythonWorkerFactory.scala:90)
> at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:89)
> at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
> at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:102)
> at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
> at org.apache.spark.scheduler.Task.run(Task.scala:56)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> {code}
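The stack trace above fails inside java.net.SocksSocketImpl, i.e. the plain
new Socket(...) call in PythonWorkerFactory is being routed through the SOCKS
proxy even for a loopback connection. A minimal standalone sketch (not Spark
code; class name and structure are hypothetical) of the distinction: passing
java.net.Proxy.NO_PROXY to the Socket constructor forces a direct connection
regardless of the JVM's socksProxyHost setting.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.Socket;

public class NoProxySocket {
    public static void main(String[] args) throws Exception {
        // Local listener to connect to, on an ephemeral port.
        try (ServerSocket server = new ServerSocket(0)) {
            int port = server.getLocalPort();

            // A plain `new Socket(host, port)` honors -DsocksProxyHost and
            // would attempt the SOCKS proxy even for 127.0.0.1, which is the
            // failure mode seen in the stack trace above.

            // With Proxy.NO_PROXY the JVM always connects directly,
            // ignoring any proxy system properties.
            try (Socket direct = new Socket(Proxy.NO_PROXY)) {
                direct.connect(new InetSocketAddress("127.0.0.1", port));
                System.out.println("connected=" + direct.isConnected());
            }
        }
    }
}
```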
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)