It's possible that Spark sets the executor environment explicitly, which
would prevent the http_proxy and https_proxy environment variables from
being passed along to the executor. You could try using the
`--executor_environment_variables` command-line flag when running the agent
to specify these environment variables explicitly, ensuring that they get
passed through.
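
For example, the agent could be started along these lines (just a sketch;
the master address and the Squid host/port are placeholders for your own
values):

  # Placeholders: substitute your own master address and Squid host/port.
  mesos-slave \
    --master=zk://<master-host>:2181/mesos \
    --executor_environment_variables='{
      "http_proxy": "http://<squid-host>:<squid-port>",
      "https_proxy": "http://<squid-host>:<squid-port>",
      "no_proxy": "localhost,127.0.0.1"
    }'

As far as I know, executors then receive exactly these variables instead of
inheriting the agent's environment, so include anything else they need;
`mesos-slave --help` should confirm the exact semantics for your version.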

On Sat, Oct 31, 2015 at 12:06 AM, Zhongyue Luo <[email protected]>
wrote:

> Any advice on this issue? I'm having the same problem.
>
> On Fri, Oct 9, 2015 at 4:13 AM, David M <[email protected]> wrote:
>
>> Hi everyone.
>>
>> I have a Mesos cluster (0.24.1) for running Spark (1.5.2) that runs great.
>>
>> I have a requirement to move my Mesos cluster nodes behind a Squid HTTP
>> proxy.
>> All cluster nodes previously had direct outbound Internet access, so
>> accessing SPARK_EXECUTOR_URI from a public source was not a problem.
>>
>> System-wide, I have the http_proxy and https_proxy environment variables set.
>> Command-line tools like curl and wget operate just fine against Internet
>> resources.
>> After configuring Maven's proxy settings, the Mesos build completed
>> successfully.
>>
>> I copied my /etc/hosts file to HDFS and attempted the WordCount example
>> from:
>> http://documentation.altiscale.com/spark-shell-examples-1-1
>>
>> It failed with this in the executor's stderr file:
>>
>> I1008 15:39:48.417644 21698 logging.cpp:172] INFO level logging started!
>> I1008 15:39:48.417819 21698 fetcher.cpp:414] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/20151007-154648-2701359370-5050-25191-S3\/spark","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"http:\/\/d3kbcqa49mib13.cloudfront.net\/spark-1.5.1-bin-hadoop2.6.tgz"}}],"sandbox_directory":"\/var\/run\/mesos\/slaves\/20151007-154648-2701359370-5050-25191-S3\/frameworks\/20151008-123957-2701359370-5050-6382-0001\/executors\/20151007-154648-2701359370-5050-25191-S3\/runs\/507827fb-cfb0-4a1d-977d-9b9afb972c29","user":"spark"}
>> I1008 15:39:48.418918 21698 fetcher.cpp:369] Fetching URI 'http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz'
>> I1008 15:39:48.418936 21698 fetcher.cpp:243] Fetching directly into the sandbox directory
>> I1008 15:39:48.418949 21698 fetcher.cpp:180] Fetching URI 'http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz'
>> I1008 15:39:48.418958 21698 fetcher.cpp:127] Downloading resource from 'http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz' to '/var/run/mesos/slaves/20151007-154648-2701359370-5050-25191-S3/frameworks/20151008-123957-2701359370-5050-6382-0001/executors/20151007-154648-2701359370-5050-25191-S3/runs/507827fb-cfb0-4a1d-977d-9b9afb972c29/spark-1.5.1-bin-hadoop2.6.tgz'
>> Failed to fetch 'http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz': Error downloading resource, received HTTP return code 400
>> Failed to synchronize with slave (it's probably exited)
>>
>> Long troubleshooting story short, it appears that libcurl isn't picking
>> up my proxy settings.
>>
>> In ./3rdparty/libprocess/3rdparty/stout/include/stout/posix/net.hpp
>>
>> I added
>>
>>   curl_easy_setopt(curl, CURLOPT_VERBOSE, true);
>>   curl_easy_setopt(curl, CURLOPT_PROXY, "<my squid server hostname here>");
>>   curl_easy_setopt(curl, CURLOPT_PROXYPORT, <my squid server port here>);
>>
>> before
>>
>> CURLcode curlErrorCode = curl_easy_perform(curl);
>>
>> I then recompiled Mesos, and the WordCount example now succeeds.
>>
>> What is the correct way to set the proxy so that libcurl will make use of it?
>>
>> Thank you.
>> David
>>
>>
>>
>>
>
>
> --
> *Intel SSG/STO/BDT*
> 880 Zixing Road, Zizhu Science Park, Minhang District, 200241, Shanghai,
> China
> +862161166500
>
