I was unable to reproduce the error. I tried the testcase on a
standalone cluster with 1 worker, and after I changed
line.split('\t')  to  line.split(',')  it worked and I got this
output:

[{'trial': 1, 'F': array([-0.360479,  0.151966, -0.924236,  0.716188,
        -0.359643,  1.214568,  1.816294,  1.205595,  0.71207 ])}]
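For reference, the only change I made was in parse_sotw (the sample
line you pasted is comma-separated rather than tab-separated):

---
def parse_sotw(line):
    # the sample input uses commas, not tabs
    tokens = line.split(',')
    return {'trial': int(tokens[0]),
            'F': np.asarray(map(float, tokens[1:]))}
---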


I ran it on Spark v0.8.0-incubating, with numpy.__version__ ==
'1.7.1', and my Python is "Python 2.7.4 (default, Sep 26 2013,
03:20:26)". I ran both the master and the worker on the same machine,
which is running Ubuntu 13.04 on amd64.

If you're comfortable with debugging at the level of C and system
calls, you could try adding a time.sleep(...) call in parse_sotw() and
using the extra time to attach gdb or strace to the Python worker
process to find the cause of the crash.
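Something along these lines should work (just a sketch -- adjust the
sleep duration so you have enough time to find the worker PID and
attach):

---
import time

def parse_sotw(line):
    # Pause so there's time to attach a debugger to the Python worker
    # from another shell, e.g.:
    #   strace -p <worker pid>    or    gdb -p <worker pid>
    time.sleep(120)
    tokens = line.split(',')
    return {'trial': int(tokens[0]),
            'F': np.asarray(map(float, tokens[1:]))}
---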

-Jey

On Thu, Dec 19, 2013 at 3:30 PM, Sandy Ryza <[email protected]> wrote:
> Here's a self-contained testcase.  It works when I take out the line with
> the np.asarray.
>
> ---
> import sys
> import numpy as np
> from pyspark import SparkContext
>
> def parse_sotw(line):
>     tokens = line.split('\t')
>     return {'trial': int(tokens[0]),
>             'F': np.asarray(map(float, tokens[1:]))}
>
> sc = SparkContext(sys.argv[2], "Risk MonteCarlo")
>
> # Load data
> sotw_raw = sc.textFile(sys.argv[1])
> sotw = sotw_raw.map(parse_sotw)
> print sotw.collect()
> ---
> Here's a sample input line:
> 1,-0.360479,0.151966,-0.924236,0.716188,-0.359643,1.214568,1.816294,1.205595,0.71207
> ---
>
> thanks a ton for your help,
> Sandy
>
>
> On Thu, Dec 19, 2013 at 3:09 PM, Jey Kottalam <[email protected]> wrote:
>>
>> Hi Sandy,
>>
>> Do you have a self-contained testcase that we could use to reproduce the
>> error?
>>
>> -Jey
>>
>> On Thu, Dec 19, 2013 at 3:04 PM, Sandy Ryza <[email protected]>
>> wrote:
>> > Verified that Python is installed on the worker.  When I simplify my
>> > job I'm able to get more stuff in stderr, but it's just the Java log4j
>> > messages.
>> >
>> > I narrowed it down and I'm pretty sure the error is coming from my use
>> > of numpy - I'm trying to pass around records that hold numpy arrays.
>> > I've verified that numpy is installed on the workers and that the job
>> > works locally on the master.  Is there anything else I need to do for
>> > accessing numpy from workers?
>> >
>> > thanks,
>> > Sandy
>> >
>> >
>> >
>> > On Thu, Dec 19, 2013 at 2:23 PM, Matei Zaharia <[email protected]>
>> > wrote:
>> >>
>> >> It might also mean you don’t have Python installed on the worker.
>> >>
>> >> On Dec 19, 2013, at 1:17 PM, Jey Kottalam <[email protected]> wrote:
>> >>
>> >> > That's pretty unusual; normally the executor's stderr output would
>> >> > contain a stacktrace and any other error messages from your Python
>> >> > code. Is it possible that the PySpark worker crashed in C code or was
>> >> > OOM killed?
>> >> >
>> >> > On Thu, Dec 19, 2013 at 11:10 AM, Sandy Ryza
>> >> > <[email protected]>
>> >> > wrote:
>> >> >> Hey All,
>> >> >>
>> >> >> Where are Python logs in PySpark supposed to go?  My job is getting
>> >> >> an org.apache.spark.SparkException: Python worker exited unexpectedly
>> >> >> (crashed), but when I look at the stdout/stderr logs in the web UI,
>> >> >> nothing interesting shows up (stdout is empty and stderr just has the
>> >> >> Spark executor command).
>> >> >>
>> >> >> Is this the expected behavior?
>> >> >>
>> >> >> thanks in advance for any guidance,
>> >> >> Sandy
>> >>
>> >
>
>
