Sandy,
Are you just using numpy? numexpr (fast math for numpy arrays)
has issues on workers.
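If you want to rule that out, here is a quick check (a minimal sketch,
assuming a live SparkContext named `sc`):

    # Probe each worker: is numexpr importable, and which version?
    def probe(_):
        try:
            import numexpr
            return "numexpr " + numexpr.__version__
        except ImportError:
            return "no numexpr"

    print(sc.parallelize(range(100)).map(probe).distinct().collect())

If any task reports a different answer than the master does locally,
that's your mismatch.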
Cheers,
Mike
On 12/19/2013 06:04 PM, Sandy Ryza wrote:
Verified that Python is installed on the worker. When I simplify my
job I'm able to get more output in stderr, but it's just the Java
log4j messages.
I narrowed it down and I'm pretty sure the error is coming from my use
of numpy - I'm trying to pass around records that hold numpy arrays.
I've verified that numpy is installed on the workers and that the job
works locally on the master. Is there anything else I need to do for
accessing numpy from workers?
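For reference, this is roughly the shape of what I'm doing (a minimal
sketch; the record layout here is made up, and it assumes a live
SparkContext `sc`):

    import numpy as np

    # Records holding numpy arrays; the (key, array) layout is illustrative.
    records = [("row-%d" % i, np.arange(5) * i) for i in range(10)]

    # Ship the arrays to the workers and reduce them back to plain floats.
    sums = sc.parallelize(records).mapValues(lambda a: float(a.sum())).collect()
    print(sums)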
thanks,
Sandy
On Thu, Dec 19, 2013 at 2:23 PM, Matei Zaharia
<[email protected]> wrote:
It might also mean you don’t have Python installed on the worker.
On Dec 19, 2013, at 1:17 PM, Jey Kottalam <[email protected]> wrote:
> That's pretty unusual; normally the executor's stderr output would
> contain a stacktrace and any other error messages from your Python
> code. Is it possible that the PySpark worker crashed in C code or
> was OOM killed?
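>
> If you want to be sure a Python-level exception makes it into the
> executor's stderr, you can wrap your function yourself; a rough
> sketch (logged/my_func/rdd are placeholder names):
>
>     import sys, traceback
>
>     def logged(f):
>         # Print any Python exception to stderr before re-raising so
>         # it lands in the executor log even if the worker then dies.
>         def wrapper(x):
>             try:
>                 return f(x)
>             except Exception:
>                 traceback.print_exc(file=sys.stderr)
>                 sys.stderr.flush()
>                 raise
>         return wrapper
>
>     rdd.map(logged(my_func))
>
> That won't help with a C-level crash or an OOM kill, though.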
>
> On Thu, Dec 19, 2013 at 11:10 AM, Sandy Ryza
> <[email protected]> wrote:
>> Hey All,
>>
>> Where are Python logs in PySpark supposed to go? My job is getting a
>> org.apache.spark.SparkException: Python worker exited unexpectedly
>> (crashed), but when I look at the stdout/stderr logs in the web UI,
>> nothing interesting shows up (stdout is empty and stderr just has the
>> Spark executor command).
>>
>> Is this the expected behavior?
>>
>> thanks in advance for any guidance,
>> Sandy