After doing a lot of testing and thinking I came to a conclusion as to
what I think was causing by errors, and have come up with a fix. The
basic idea behind my issues were that multiple threads could can not
in any way change the ctypes function pointers. With the current code
it is possible to have these function pointers modified in the wappers
__call__ method. Also while calling certain functions in the Shapely
API it changed the errcheck function at run-time. I think the main
reason for this in most cases was the need to bind objects into these
error checking functions before calling. This is not strictly needed
because the parameters to the errcheck method provide all needed
information. What I have come up with does not look the greatest
because of all of the errcheck methods get defined when the module is
imported, but it seems to run better. Please let me know your thoughts
on the attached patch.

Thanks,
Aron

On Thu, Feb 26, 2009 at 10:53 AM, Sean Gillies <[email protected]> wrote:
> Give it a try. I like the optimization idea too.
>
> On Feb 25, 2009, at 1:25 PM, Aron Bierbaum wrote:
>
>> It would be part of the function wrapper returned by the LGEOS
>> object. The idea being that we have to check before every call. In
>> order to improve performance a little I was also wondering about
>> storing the function wrappers in the LGEOS object using setattr so
>> that __getattr__ wouldn't get called the next time the user tries to
>> call the geos method.
>>
>> Thanks,
>> Aron
>>
>> class LGEOS(object):
>>
>>     def __getattr__(self, name):
>>         func = getattr(self._lgeos, name)
>>         if self._geos_c_version >= (1,5,0):
>>             ob = getattr(self._lgeos, name + '_r')
>>             class wrapper(object):
>>                 __name__ = None
>>                 errcheck = None
>>                 restype = None
>>                 def __init__(self):
>>                     self.func = ob
>>                     self.__name__ = ob.__name__
>>                     self.func.restype = func.restype
>>                     if func.argtypes is None:
>>                         self.func.argtypes = None
>>                     else:
>>                         self.func.argtypes = [c_void_p] +
>> func.argtypes
>>                 def __call__(self, *args):
>>                     if not hasattr(thread_data, 'geos_handle'):
>>                         thread_data.geos_handle =
>> lgeos._lgeos.initGEOS_r(notice_h, error_h)
>>
>>                     if self.errcheck is not None:
>>                         self.func.errcheck = self.errcheck
>>                     if self.restype is not None:
>>                         self.func.restype = self.restype
>>                     return self.func(thread_data.geos_handle, *args)
>>             attr = wrapper()
>>
>>
>> -Aron
>>
>> On Wed, Feb 25, 2009 at 1:51 PM, Sean Gillies <[email protected]>
>> wrote:
>> That would be a method of what class?
>>
>> On Feb 25, 2009, at 10:43 AM, Aron Bierbaum wrote:
>>
>> > I guess I don't see where these discussions would effect what we are
>> > doing. I am sure I am missing something, but in order to allow no
>> > changes by users, can't we just do something like:
>> >
>> >                 def __call__(self, *args):
>> >                     if not hasattr(thread_data, 'geos_handle'):
>> >                         thread_data.geos_handle =
>> > lgeos._lgeos.initGEOS_r(notice_h, error_h)
>> >
>> > -Aron
>> >
>> > On Wed, Feb 25, 2009 at 10:26 AM, Sean Gillies <[email protected]>
>> > wrote:
>> > Aron,
>> >
>> > r1194 is indeed the last stable revision for a while.
>> >
>> > Importing modules from a thread is fraught with all kinds of issues.
>> > See this interesting discussion:
>> >
>> > http://bugs.python.org/issue1720705
>> >
>> > And the conclusion at the end of
>> >
>> > http://svn.python.org/view/python/trunk/Doc/library/threading.rst?view=markup&pathrev=61365
>> >
>> > I think we're going to need a module level function to initialize
>> > thread local data.
>> >
>> > On Feb 24, 2009, at 5:23 PM, Aron Bierbaum wrote:
>> >
>> > > Well I finally got around to testing these changes. It appears
>> that
>> > > there are a lot of changes being made currently, which revision
>> > should
>> > > I be testing with? I have tried running test_threads.py with both
>> > > trunk and r1194 and the test fails. I briefly looked at r1194
>> and it
>> > > looks like the thread specific geos_handle is not getting
>> created in
>> > > each thread. It is created correctly in the primordially thread,
>> but
>> > > not the others. I believe it is because the threading.local
>> > > constructor is only called once, not once per thread. Any ideas
>> on a
>> > > fix?
>> > >
>> > > Thanks,
>> > > Aron
>> > >
>> > > On Thu, Jan 29, 2009 at 5:59 PM, Aron Bierbaum
>> > > <[email protected]> wrote:
>> > >> Wow, this looks really nice. I should have time to test it either
>> > >> later tonight or tomorrow. It's also interesting that this should
>> > >> allow an easy transition if/when they move to a thread safe only
>> > >> interface.
>> > >>
>> > >> Thanks,
>> > >> Aron
>> > >>
>> > >> On Thu, Jan 29, 2009 at 4:33 PM, Sean Gillies <[email protected]>
>> > >> wrote:
>> > >>> Aron,
>> > >>>
>> > >>> I did a bit of work along those lines and made some pretty good
>> > >>> progress. Tests are passing except for a couple I had to
>> disable.
>> > >>> Updating of coordinate sequences through geos_linestring and
>> > >>> geos_linearring is currently broken. I'm not sure whether it's
>> my
>> > >>> fault or a bug in the reentrant API. Probably the former.
>> > >>>
>> > >>> Checkout r1194 if you're interested in testing. I haven't
>> tried it
>> > >>> in
>> > >>> a threaded situation yet, and suspect it will be pretty hard to
>> > >>> trigger a fault -- Shapely itself doesn't expose any means of
>> > >>> modifying the GEOS context data, and I don't see it being done
>> > >>> inside
>> > >>> GEOS.
>> > >>>
>> > >>> Cheers,
>> > >>> Sean
>> > >>>
>> > >>> On Jan 28, 2009, at 10:44 PM, Aron Bierbaum wrote:
>> > >>>
>> > >>>> This is a really long shot but what do people think of
>> something
>> > >>>> like
>> > >>>> the following. I have run into a few problems with ctypes while
>> > >>>> trying
>> > >>>> to run with it, but I thought it might start people thinking.
>> > >>>>
>> > >>>> import threading
>> > >>>> import functools
>> > >>>>
>> > >>>> class LocalStorage(threading.local):
>> > >>>>   def __getattr__(self, name):
>> > >>>>      if "handle" == name:
>> > >>>>         self.handle = geos_lib.initGEOS_r(notice_h, error_h)
>> > >>>>         #print "Construct handle:",
>> > threading.currentThread().name
>> > >>>>         #print "              id:", id(self.handle)
>> > >>>>         return self.handle
>> > >>>>      raise AttributeError("Local storage does not have
>> attribute
>> > >>>> '%s'" % name)
>> > >>>>
>> > >>>> local_storage = LocalStorage()
>> > >>>>
>> > >>>> class ThreadSafeGeos(object):
>> > >>>>   def __getattr__(self, name):
>> > >>>>      new_name = "%s_r" % (name)
>> > >>>>      r_func = getattr(geos_lib, new_name)
>> > >>>>      handle = local_storage.handle
>> > >>>>      return functools.partial(r_func, handle)
>> > >>>>
>> > >>>> tsgeos = ThreadSafeGeos()
>> > >>>>
>> > >>>> lgeos = tsgeos
>> > >>>>
>> > >>>>
>> > >>>> -Aron
>> > >>>>
>> > >>>>
>> > >>>> On Tue, Jan 27, 2009 at 8:16 AM, Justin Bronn
>> <[email protected]>
>> > >>>> wrote:
>> > >>>>> Aron Bierbaum wrote:
>> > >>>>>> The only idea that I have off the top of my head is to use
>> > >>>>>> threading.local in order to keep track of a handle for each
>> > >>>>>> thread
>> > >>>>>> that calls any function in GEOS. I tried implementing
>> something
>> > >>>>>> like
>> > >>>>>> this quickly and ran into problems. Any other ideas?
>> > >>>>>
>> > >>>>> Since I'm going to have to do the same for GeoDjango's
>> bindings,
>> > >>>>> I've
>> > >>>>> been kicking around some ideas on how to do this -- and I also
>> > >>>>> thought
>> > >>>>> of something similar.  I think this idea would work, but it's
>> > >>>>> going
>> > >>>>> to
>> > >>>>> require some extensive implementation.
>> > >>>>>
>> > >>>>> Basically you'd have a global dict keyed by
>> > >>>>> threading.currentThread()
>> > >>>>> and a corresponding value be the GEOS context handle.
>> However,
>> > >>>>> you'd
>> > >>>>> have to (have a way to determine whether you're running GEOS
>> 3.1
>> > >>>>> or
>> > >>>>> not
>> > >>>>> and use the appropriate threading/non-threading function
>> > >>>>> signature --
>> > >>>>> including dynamically insert the context handle into the
>> GEOS C
>> > >>>>> API
>> > >>>>> function arguments.  I'm wondering how much overhead this is
>> and
>> > >>>>> what
>> > >>>>> the effect on performance would be -- but I haven't had time
>> > (and
>> > >>>>> won't
>> > >>>>> for the next several weeks) to mock up and try it out.
>> > >>>>>
>> > >>>>>>> Thread safety hasn't been a problem up to now, and we've
>> > >>>>>>> hammered
>> > >>>>>>> on
>> > >>>>>>> it quite a bit. We're using PyDLL to load libgeos_c, which
>> > means
>> > >>>>>>> the
>> > >>>>>>> GIL isn't released.
>> > >>>>>>>
>> > >>>>>
>> > >>>>> Yeah, it's a problem that's not seen until two routines enter
>> > the
>> > >>>>> same
>> > >>>>> critical areas in GEOS at the same time.  For web apps the
>> > problem
>> > >>>>> (segfaulting Apache/FCGI processes) becomes more apparent as
>> > >>>>> traffic
>> > >>>>> increases for the site.  I consider it a serious problem.
>> > >>>>>
>> > >>>>> -Justin
>> > >>>>> _______________________________________________
>> > >>>>> Community mailing list
>> > >>>>> [email protected]
>> > >>>>> http://lists.gispython.org/mailman/listinfo/community
>> > >>>>>
>> > >>>> _______________________________________________
>> > >>>> Community mailing list
>> > >>>> [email protected]
>> > >>>> http://lists.gispython.org/mailman/listinfo/community
>> > >>>
>> > >>> _______________________________________________
>> > >>> Community mailing list
>> > >>> [email protected]
>> > >>> http://lists.gispython.org/mailman/listinfo/community
>> > >>>
>> > >>
>> > > _______________________________________________
>> > > Community mailing list
>> > > [email protected]
>> > > http://lists.gispython.org/mailman/listinfo/community
>> >
>> > _______________________________________________
>> > Community mailing list
>> > [email protected]
>> > http://lists.gispython.org/mailman/listinfo/community
>> >
>> > _______________________________________________
>> > Community mailing list
>> > [email protected]
>> > http://lists.gispython.org/mailman/listinfo/community
>>
>> _______________________________________________
>> Community mailing list
>> [email protected]
>> http://lists.gispython.org/mailman/listinfo/community
>>
>> _______________________________________________
>> Community mailing list
>> [email protected]
>> http://lists.gispython.org/mailman/listinfo/community
>
> _______________________________________________
> Community mailing list
> [email protected]
> http://lists.gispython.org/mailman/listinfo/community
>

Attachment: thread.diff
Description: Binary data

_______________________________________________
Community mailing list
[email protected]
http://lists.gispython.org/mailman/listinfo/community

Reply via email to