I think redirecting the subprocess's out/err streams to files would
definitely be a worthwhile change.
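
Something like the following is roughly what I have in mind. This is only a
sketch, assuming we control the Popen call that launches the worker; the
launch command and log paths are made up for illustration, not the actual
PySpark internals:

    import subprocess

    # Open per-worker log files so that anything the Python process prints,
    # including crashes that happen before the usual error forwarding kicks
    # in, still ends up somewhere we can look at.
    worker_out = open("/tmp/pyspark-worker.out", "a")
    worker_err = open("/tmp/pyspark-worker.err", "a")

    # Hypothetical launch command; the real one is whatever Spark currently
    # uses to start the worker process.
    worker = subprocess.Popen(
        ["python", "-m", "pyspark.worker"],
        stdout=worker_out,
        stderr=worker_err,
    )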


On Fri, Dec 20, 2013 at 11:53 AM, Josh Rosen <[email protected]> wrote:

> We could definitely improve PySpark's failure reporting mechanisms.  Right
> now, the worker has a try-catch block that forwards Python exceptions to
> Java, but there are still a few failures that can occur after the worker
> starts up and before we enter that block that may go unreported in the
> worker's own logs (see
> https://github.com/apache/incubator-spark/blob/master/python/pyspark/worker.py).
>  For example, I think you might see problems if the UDF or broadcast
> variables can't be deserialized properly.
>
> We should move more of the worker's code into the try block.  It would
> also be helpful to redirect the Python subprocesses' stderr and stdout to a
> log file.
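>
> Roughly along these lines; just an untested sketch, not the real worker.py.
> The run_worker and report_to_java names are placeholders for however the
> worker actually does its work and writes errors back to the JVM over its
> socket:
>
>     import traceback
>
>     def run_worker():
>         # Placeholder for the real worker loop: deserialize the command and
>         # broadcast variables, then process the stream of records.
>         pass
>
>     def report_to_java(message):
>         # Placeholder for writing the error string back over the socket so
>         # the Java side can include it in the task failure.
>         pass
>
>     def main():
>         try:
>             # Keeping everything after startup inside the try block means
>             # deserialization failures get reported too, instead of killing
>             # the process before the except clause can run.
>             run_worker()
>         except Exception:
>             report_to_java(traceback.format_exc())
>             raise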
>
>
> On Fri, Dec 20, 2013 at 11:50 AM, Sandy Ryza <[email protected]> wrote:
>
>> Yeah, only using numpy.  Strange, it must be an issue with my setup.
>>  Will let you know if I figure it out.
>>
>> -Sandy
>>
>>
>> On Fri, Dec 20, 2013 at 6:03 AM, Michael Ronquest <[email protected]> wrote:
>>
>>> Sandy,
>>>         Are you just using numpy? numexpr (fast math for numpy arrays)
>>> has issues on workers.
>>> Cheers,
>>> Mike
>>>
>>> On 12/19/2013 06:04 PM, Sandy Ryza wrote:
>>>
>>>> Verified that python is installed on the worker. When I simplify my job
>>>> I'm able to get more stuff in stderr, but it's just the Java log4j
>>>> messages.
>>>>
>>>> I narrowed it down and I'm pretty sure the error is coming from my use
>>>> of numpy - I'm trying to pass around records that hold numpy arrays.  I've
>>>> verified that numpy is installed on the workers and that the job works
>>>> locally on the master.  Is there anything else I need to do for accessing
>>>> numpy from workers?
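>>>>
>>>> For reference, a check like this should separate "numpy isn't importable
>>>> on the workers" from "records holding numpy arrays don't survive
>>>> serialization" (sc being the existing SparkContext; just a smoke test,
>>>> not a fix):
>>>>
>>>>     import numpy
>>>>
>>>>     # If this fails, numpy itself isn't usable on the workers.
>>>>     print(sc.parallelize(range(4))
>>>>           .map(lambda x: float(numpy.arange(10).sum()))
>>>>           .collect())
>>>>
>>>>     # If this fails but the one above works, the problem is shipping the
>>>>     # arrays themselves back and forth.
>>>>     print(sc.parallelize(range(4)).map(lambda x: numpy.zeros(3)).collect())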
>>>>
>>>> thanks,
>>>> Sandy
>>>>
>>>>
>>>>
>>>> On Thu, Dec 19, 2013 at 2:23 PM, Matei Zaharia <[email protected]> wrote:
>>>>
>>>>     It might also mean you don’t have Python installed on the worker.
>>>>
>>>>     On Dec 19, 2013, at 1:17 PM, Jey Kottalam <[email protected]> wrote:
>>>>
>>>>     > That's pretty unusual; normally the executor's stderr output would
>>>>     > contain a stacktrace and any other error messages from your Python
>>>>     > code. Is it possible that the PySpark worker crashed in C code or was
>>>>     > OOM killed?
>>>>     >
>>>>     > On Thu, Dec 19, 2013 at 11:10 AM, Sandy Ryza <[email protected]> wrote:
>>>>     >> Hey All,
>>>>     >>
>>>>     >> Where are python logs in PySpark supposed to go?  My job is getting an
>>>>     >> org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
>>>>     >> but when I look at the stdout/stderr logs in the web UI, nothing interesting
>>>>     >> shows up (stdout is empty and stderr just has the spark executor command).
>>>>     >>
>>>>     >> Is this the expected behavior?
>>>>     >>
>>>>     >> thanks in advance for any guidance,
>>>>     >> Sandy
>>>>
>>>>
>>>>
>>>
>>
>
