And... I found the problem:

21709 clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD,
parent_tidptr=0xffb78fd0) = -1 ENOMEM (Cannot allocate memory)

It looks like I'm getting hit by memory fragmentation at the OS level: the
clone() call behind the spawn is failing with ENOMEM.

This is a generic problem with long-running processes, and I can think of a
variety of ways of dealing with it. For my current project I think I'll be
fine with killing and restarting J (I'm already checkpointing my work to
disk, and it restarts cleanly enough). I'll throw in a short delay before
each restart to avoid some pathological edge cases, and that'll get me where
I need to be.
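
For anyone wanting to do the same, here is a minimal sketch of the
restart-with-delay idea as a small Python wrapper. It is not what I actually
run. The name supervise, its parameters, and the convention that the J
session exits with a nonzero status once spawn starts failing are all
assumptions for illustration.

```python
import subprocess
import sys
import time


def supervise(cmd, restart_delay=5.0, max_restarts=None, sleep=time.sleep):
    """Run cmd; whenever it exits with a nonzero status, wait and restart it.

    Assumes (hypothetically) that the supervised program exits nonzero when
    it hits an unrecoverable error and can resume from its own checkpoints.
    Returns the number of restarts that were needed.
    """
    restarts = 0
    while True:
        status = subprocess.call(cmd)
        if status == 0:
            return restarts  # clean exit: the job finished
        if max_restarts is not None and restarts >= max_restarts:
            raise RuntimeError("giving up after %d restarts" % restarts)
        # Pause before restarting so a process that fails immediately
        # does not turn into a tight respawn loop.
        sleep(restart_delay)
        restarts += 1


if __name__ == "__main__":
    # The command to supervise is taken from the wrapper's own arguments.
    if len(sys.argv) > 1:
        supervise(sys.argv[1:])
```

The wrapper stays small and short-lived in memory terms, so its own calls to
start the child should not run into the same ENOMEM. The optional
max_restarts cap is there so a job that can never succeed eventually stops.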

Thanks,

-- 
Raul



On Wed, Mar 12, 2014 at 3:35 PM, Raul Miller <[email protected]> wrote:

> Running on ec2 linux, I'm seeing
>
> |interface error: spawn
>
> after only tens of thousands of calls out to the operating system.
>
> I can (and have) engineered around this, but it's annoying, and I'd like
> to understand the cause.
>
> How can I determine the underlying error, so I can start tracing this back?
>
> The description of 2!:0 at
> http://www.jsoftware.com/help/dictionary/dx002.htm currently does not
> suggest how I could isolate underlying OS issues.
>
> Thanks,
>
> --
> Raul
>
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
