On Mon, Oct 8, 2012 at 11:19 AM, Owen Mackwood <owen.mackw...@bccn-berlin.de> wrote:
> Hi Anthony,
>
> On 8 October 2012 15:54, Anthony Scopatz <scop...@gmail.com> wrote:
>
>> Hmmm, are you actually copying the data (f.root.data[:]) or simply
>> passing a reference as an argument (f.root.data)?
>>
>
> I call f.root.data.read() on any arrays to load them into the args
> dictionary for the process target. I had assumed this returns a copy of
> the data, but the documentation doesn't say, nor whether there is any
> difference from __getitem__. The pattern is roughly as sketched below.
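>
> (A minimal sketch with hypothetical names; 'data' stands in for whatever
> array node is being read, and 'worker' for my process target. This is
> the PyTables 2.x API.)
>
>     import tables
>     from multiprocessing import Process
>
>     def worker(args):
>         # args['data'] is, presumably, a plain in-memory NumPy array,
>         # so the worker never touches the master's file handle.
>         print args['data'].sum()
>
>     f = tables.openFile('master.h5', mode='r')
>     args = {'data': f.root.data.read()}  # assumed to be a copy
>     f.close()
>     Process(target=worker, args=(args,)).start()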
>
>> So if you are opening a file in the master process and then
>> writing/creating/flushing from the workers, this may cause a problem.
>> multiprocessing forks the original process, so the workers inherit the
>> master's file handle and you are relying on that handle not changing
>> out from under you. Can you try to open the files in the workers rather
>> than in the master? I hope that this clears up the issue.
>>
>
> I am not accessing the master file from the worker processes, at least
> not by design, though as you say some strange behaviour could arise from
> Linux's copy-on-fork semantics. In principle, each process has its own
> file and no files are shared between processes.
>
>
>> Basically, I am advocating a more conservative approach where all data
>> that is read or written in a worker must come from that worker, rather
>> than being generated by the master. If you are *still* experiencing
>> these problems, then we know we have a real problem.
>>
>
> I'm being about as conservative as I can with my system. Unless read()
> returns a reference into the master file, there should be absolutely no
> sharing between processes. And even if my args dictionary did contain a
> reference to the open HDF5 file, how could reading from it possibly
> trigger a call to openFile?
>
> Can you clarify the semantics of read() vs. __getitem__()? Thanks.
Hello Owen,

__getitem__() calls read() on the items it needs, so both return an
in-memory copy of the data on disk; only the bare node (f.root.data)
remains a reference to the file. See the sketch below.
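A minimal sketch of the distinction (assuming a file containing an array
node /data; the file and node names are illustrative):

    import tables

    f = tables.openFile('example.h5', mode='r')
    node = f.root.data      # a reference to the on-disk node; no data read
    a = f.root.data.read()  # an in-memory NumPy copy of the whole array
    b = f.root.data[:]      # __getitem__ -> also an in-memory NumPy copy
    a[0] = -1               # modifying the copy never touches the file
    f.close()               # 'node' is now unusable, but a and b remain
                            # valid, independent in-memory arrays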
Frankly, I am not sure what is going on, given what you have said; a
minimal example that reproduces the error would be very helpful. From the
error you provided, though, the only thing I can think of is that it is
related to file opening in the worker processes.
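For what it's worth, the conservative pattern I have in mind looks roughly
like this (a sketch with hypothetical file names; each worker opens and
closes its own file after the fork, and receives only plain Python/NumPy
objects from the master):

    import numpy
    import tables
    from multiprocessing import Process

    def worker(filename, seed):
        # The file is opened *inside* the worker, after the fork, so no
        # handle is ever shared with the master process.
        f = tables.openFile(filename, mode='w')
        f.createArray(f.root, 'result', numpy.arange(10) * seed)
        f.close()

    procs = [Process(target=worker, args=('worker_%d.h5' % i, i))
             for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()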
Be Well
Anthony
>
> Regards,
> Owen
>
>