Christiaan,

On Saturday 04 April 2009, Christiaan Putter wrote:
> Hi,
>
> Just realised I was applying the wrong label in my message filter, 
> no wonder I wasn't seeing messages from the pytables mailing list :-)
>
> 2009/4/3 Francesc Alted <fal...@pytables.org>:
> > Hi Christiaan,
> >
> >
> > Your message was indeed successfully sent to the PyTables list.  So
> > you must be subscribed (I don't know why you are not receiving
> > messages from it, though).  I'm attaching my answer below.
> >
> > Also, could you try to set the parameter NODE_CACHE_SLOTS to 0 in
> > your tables/parameters.py?  Or pass `NODE_CACHE_SLOTS=0` to the
> > openFile function, which works similarly, but on a per-file basis
> > only.
> >
> > On Friday 03 April 2009, Christiaan Putter wrote:
> >> Hi guys,
> >>
> >> I'm having some issues with multiple threads and pytables.  I have
> >> a single lock attached to the file and use it when both reading
> >> and writing to the file.  When I'm only running one thread there
> >> aren't any problems, though several threads cause some strange
> >> and unpredictable behaviour.  It seems to be more stable on Linux
> >> than Windows for some reason.
> >>
> >> Symptoms include:
> >> - program crash: no exceptions, no traceback, the app simply disappears
> >> - unexpected exceptions
> >> - write failures
> >>
> >>
> >> It would be nice if someone could advise me on some best practices
> >> for using threads with PyTables.  What should I consider as an
> >> "atomic" operation from my side?
> >
> > Yeah.  Seems like you are trying to delete objects from the object
> > tree from different threads, and apparently this makes the LRU
> > cache for objects choke.
> >
> > Mmh, how are you locking your file?  It would help a lot if you could
> > provide a small script that exposes the problem.
> >
> > Cheers,
>
> All access to a file is done through a single class.  Instances
> of that class can be on any thread.  The file is opened when the app
> is started and the file object is then made available as a service to
> any instances that need it.
>
> The file is opened as such:
>
> filters = Filters(complevel=self.complevel, complib=self.complib,
>                   shuffle=self.shuffle, fletcher32=self.fletcher)
> self.engine = openFile(self.path, mode='a',
>                        title="HasTraits HDF5 Database",
>                        filters=filters, NODE_CACHE_SLOTS=0)
> self.engine.Rlock = threading.RLock()
> self.engine.Wlock = threading.Lock()
>
>
> I used a reentrant lock at first, but restructured my code later on
> to use a normal lock hoping that would solve the problem...   So any
> instance that uses that 'engine' would use the engine.Wlock
>
> So the class that reads and writes to the file has a read method:
>
> ## self.h5f is a reference to the file object returned by openFile above
>
>     def read_records(self):
>         if not self.isactive():
>             return {}
>
>         with self.h5f.Wlock:
>             node_exists = self.path in self.h5f
>
>         if not node_exists:
>             return {}
>
>         res = {}
>         try:
>             table = self.getTable()
>
>             if table != None:
>                 with self.h5f.Wlock:
>                     for name in table.colnames:
>                         arr = table.cols._f_col(name)[:]
>                         res[name] = arr
>         except Exception, e:
>             print 'Read failure: %s | %s' % (e, self.msg)
>
>         return res
>
>
> A nasty write method:
>
>     def overwrite(self):
>         try:
>             with self.h5f.Wlock:
>                 if self.isactive():
>                     if self.path in self.h5f:
>                         self.h5f.removeNode(self.path)
>                 else:
>                     return
>
>             table = self.getTable()
>
>             if table != None:
>                 with self.h5f.Wlock:
>                     row = table.row
>                     for i in xrange(len(self.time)):
>                         for key, val in self.data.items():
>                             row[key] = val[i]
>                         row.append()
>                     table.flush()
>                     table._f_close()
>         except Exception, e:
>             print 'Write failure: %s | %s' % (e, self.msg)
>
>
> So writing to a node always deletes it first.  That's because columns
> can be added and removed arbitrarily.
>
> A new table is created as such:
>
>     def createTable(self):
>         with self.h5f.Wlock:
>             res = None
>             if self.isactive() and len(self.data.items()) > 0:
>                 res = self.h5f.createTable(self.where, self.name,
>                                            self.map_columns(), '',
>                                            createparents=True)
>             return res
>
>     def getTable(self):
>         res = None
>         if self.isactive():
>             with self.h5f.Wlock:
>                 newtable = self.path not in self.h5f
>
>             if newtable:
>                 res = self.createTable()
>             else:
>                 with self.h5f.Wlock:
>                     res = self.h5f.getNode(self.path)
>         return res
>
>
>
> It looked much prettier when I was using a reentrant lock...
>
> So that's basically it.  I'm sure that I go through the lock for all
> read and write access to the file (though maybe I missed something),
> which means I'm pretty sure that even though several threads might
> read/write at the same time they all have to wait for the lock to be
> released.

Although your code may be locking the file correctly, the problem is 
almost certainly the PyTables LRU cache for nodes.  This is why I 
suggested disabling it.

> I tested setting NODE_CACHE_SLOTS=0, though on Windows that causes
> openFile to raise an unexpected keyword argument exception (guessing
> I'll have to update PyTables...)

Yes, you need PyTables 2.1 in order to use NODE_CACHE_SLOTS=0 as an 
argument, but you can always modify it in tables/parameters.py.
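
For example, something like this (just a sketch; the file name here is 
only illustrative):

    from tables import openFile

    # PyTables 2.1+: disable the LRU node cache for this file only by
    # passing NODE_CACHE_SLOTS=0 directly to openFile().
    h5f = openFile('database.h5', mode='a', NODE_CACHE_SLOTS=0)

    # With older versions the keyword is not accepted, so edit
    # NODE_CACHE_SLOTS = 0 in tables/parameters.py instead.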

> On Linux it sometimes crashes and
> prints out:
>
> HDF5-DIAG: Error detected in HDF5 (1.8.2) thread 0:
>   #000: H5F.c line 1724 in H5Fflush(): flush failed
>     major: File accessability
>     minor: Unable to initialize object
>   #001: H5F.c line 1816 in H5F_flush(): unable to flush metadata cache
>     major: Object cache
>     minor: Unable to flush data from cache
>   #002: H5AC.c line 1097 in H5AC_flush(): Can't flush entry.
>     major: Object cache
>     minor: Unable to flush data from cache
>   #003: H5C.c line 3922 in H5C_flush_cache(): cache has protected items
>     major: Object cache
>     minor: Unable to flush data from cache
>

Mmh, this seems to be a problem with threads accessing the metadata cache 
in HDF5 itself (similar to the LRU node cache in PyTables).  However, 
it is great to see that the PyTables LRU cache problem has vanished.

> Do you think I can somehow restructure my read / write methods to get
> this to work?

I've two suggestions here:

1. Set the parameter METADATA_CACHE_SIZE=0 in tables/parameters.py (or, 
if using PyTables 2.1 or higher, pass it to the `openFile()` function).  
This should disable the HDF5 metadata cache (see the sketch after this 
list).

2. If 1. does not work, you may try recompiling HDF5 in thread-safe mode.  
This can be achieved on Unix by passing the '--enable-threadsafe' flag 
to the 'configure' script.  More info about HDF5 thread-safe mode can be 
found at:

http://www.hdfgroup.org/HDF5/doc/TechNotes/ThreadSafeLibrary.html
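
Regarding 1., here is a minimal sketch of what I mean (it assumes 
PyTables 2.1 so that both parameters can be passed as keywords; the 
file name is only illustrative):

    from tables import openFile

    # Disable both the PyTables LRU node cache and the HDF5 metadata
    # cache when opening the file.
    h5f = openFile('database.h5', mode='a',
                   NODE_CACHE_SLOTS=0, METADATA_CACHE_SIZE=0)

    # With older versions, set METADATA_CACHE_SIZE = 0 in
    # tables/parameters.py instead.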

HTH,

-- 
Francesc Alted

"One would expect people to feel threatened by the 'giant
brains or machines that think'.  In fact, the frightening
computer becomes less frightening if it is used only to
simulate a familiar noncomputer."

-- Edsger W. Dijkstra
   "On the cruelty of really teaching computer science"
