Aron,

I've committed most of your patch, with a few changes, to the Shapely  
(1.1) trunk.

http://trac.gispython.org/lab/changeset/1226

If you try test_threads.py you'll see that the individual threads now  
get their own GEOS handles.

I like the use of functools.partial. Thanks for the help!

On Feb 26, 2009, at 4:32 PM, Aron Bierbaum wrote:

> After doing a lot of testing and thinking I came to a conclusion as to
> what I think was causing by errors, and have come up with a fix. The
> basic idea behind my issues were that multiple threads could can not
> in any way change the ctypes function pointers. With the current code
> it is possible to have these function pointers modified in the wappers
> __call__ method. Also while calling certain functions in the Shapely
> API it changed the errcheck function at run-time. I think the main
> reason for this in most cases was the need to bind objects into these
> error checking functions before calling. This is not strictly needed
> because the parameters to the errcheck method provide all needed
> information. What I have come up with does not look the greatest
> because of all of the errcheck methods get defined when the module is
> imported, but it seems to run better. Please let me know your thoughts
> on the attached patch.
>
> Thanks,
> Aron
>
> On Thu, Feb 26, 2009 at 10:53 AM, Sean Gillies <[email protected]>  
> wrote:
>> Give it a try. I like the optimization idea too.
>>
>> On Feb 25, 2009, at 1:25 PM, Aron Bierbaum wrote:
>>
>>> It would be part of the function wrapper returned by the LGEOS
>>> object. The idea being that we have to check before every call. In
>>> order to improve performance a little I was also wondering about
>>> storing the function wrappers in the LGEOS object using setattr so
>>> that __getattr__ wouldn't get called the next time the user tries to
>>> call the geos method.
>>>
>>> Thanks,
>>> Aron
>>>
>>> class LGEOS(object):
>>>
>>>     def __getattr__(self, name):
>>>         func = getattr(self._lgeos, name)
>>>         if self._geos_c_version >= (1,5,0):
>>>             ob = getattr(self._lgeos, name + '_r')
>>>             class wrapper(object):
>>>                 __name__ = None
>>>                 errcheck = None
>>>                 restype = None
>>>                 def __init__(self):
>>>                     self.func = ob
>>>                     self.__name__ = ob.__name__
>>>                     self.func.restype = func.restype
>>>                     if func.argtypes is None:
>>>                         self.func.argtypes = None
>>>                     else:
>>>                         self.func.argtypes = [c_void_p] +
>>> func.argtypes
>>>                 def __call__(self, *args):
>>>                     if not hasattr(thread_data, 'geos_handle'):
>>>                         thread_data.geos_handle =
>>> lgeos._lgeos.initGEOS_r(notice_h, error_h)
>>>
>>>                     if self.errcheck is not None:
>>>                         self.func.errcheck = self.errcheck
>>>                     if self.restype is not None:
>>>                         self.func.restype = self.restype
>>>                     return self.func(thread_data.geos_handle, *args)
>>>             attr = wrapper()
>>>
>>>
>>> -Aron
>>>
>>> On Wed, Feb 25, 2009 at 1:51 PM, Sean Gillies <[email protected]>
>>> wrote:
>>> That would be a method of what class?
>>>
>>> On Feb 25, 2009, at 10:43 AM, Aron Bierbaum wrote:
>>>
>>>> I guess I don't see where these discussions would effect what we  
>>>> are
>>>> doing. I am sure I am missing something, but in order to allow no
>>>> changes by users, can't we just do something like:
>>>>
>>>>                 def __call__(self, *args):
>>>>                     if not hasattr(thread_data, 'geos_handle'):
>>>>                         thread_data.geos_handle =
>>>> lgeos._lgeos.initGEOS_r(notice_h, error_h)
>>>>
>>>> -Aron
>>>>
>>>> On Wed, Feb 25, 2009 at 10:26 AM, Sean Gillies <[email protected]>
>>>> wrote:
>>>> Aron,
>>>>
>>>> r1194 is indeed the last stable revision for a while.
>>>>
>>>> Importing modules from a thread is fraught with all kinds of  
>>>> issues.
>>>> See this interesting discussion:
>>>>
>>>> http://bugs.python.org/issue1720705
>>>>
>>>> And the conclusion at the end of
>>>>
>>>> http://svn.python.org/view/python/trunk/Doc/library/threading.rst?view=markup&pathrev=61365
>>>>
>>>> I think we're going to need a module level function to initialize
>>>> thread local data.
>>>>
>>>> On Feb 24, 2009, at 5:23 PM, Aron Bierbaum wrote:
>>>>
>>>>> Well I finally got around to testing these changes. It appears
>>> that
>>>>> there are a lot of changes being made currently, which revision
>>>> should
>>>>> I be testing with? I have tried running test_threads.py with both
>>>>> trunk and r1194 and the test fails. I briefly looked at r1194
>>> and it
>>>>> looks like the thread specific geos_handle is not getting
>>> created in
>>>>> each thread. It is created correctly in the primordially thread,
>>> but
>>>>> not the others. I believe it is because the threading.local
>>>>> constructor is only called once, not once per thread. Any ideas
>>> on a
>>>>> fix?
>>>>>
>>>>> Thanks,
>>>>> Aron
>>>>>
>>>>> On Thu, Jan 29, 2009 at 5:59 PM, Aron Bierbaum
>>>>> <[email protected]> wrote:
>>>>>> Wow, this looks really nice. I should have time to test it either
>>>>>> later tonight or tomorrow. It's also interesting that this should
>>>>>> allow an easy transition if/when they move to a thread safe only
>>>>>> interface.
>>>>>>
>>>>>> Thanks,
>>>>>> Aron
>>>>>>
>>>>>> On Thu, Jan 29, 2009 at 4:33 PM, Sean Gillies <[email protected]>
>>>>>> wrote:
>>>>>>> Aron,
>>>>>>>
>>>>>>> I did a bit of work along those lines and made some pretty good
>>>>>>> progress. Tests are passing except for a couple I had to
>>> disable.
>>>>>>> Updating of coordinate sequences through geos_linestring and
>>>>>>> geos_linearring is currently broken. I'm not sure whether it's
>>> my
>>>>>>> fault or a bug in the reentrant API. Probably the former.
>>>>>>>
>>>>>>> Checkout r1194 if you're interested in testing. I haven't
>>> tried it
>>>>>>> in
>>>>>>> a threaded situation yet, and suspect it will be pretty hard to
>>>>>>> trigger a fault -- Shapely itself doesn't expose any means of
>>>>>>> modifying the GEOS context data, and I don't see it being done
>>>>>>> inside
>>>>>>> GEOS.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Sean
>>>>>>>
>>>>>>> On Jan 28, 2009, at 10:44 PM, Aron Bierbaum wrote:
>>>>>>>
>>>>>>>> This is a really long shot but what do people think of
>>> something
>>>>>>>> like
>>>>>>>> the following. I have run into a few problems with ctypes while
>>>>>>>> trying
>>>>>>>> to run with it, but I thought it might start people thinking.
>>>>>>>>
>>>>>>>> import threading
>>>>>>>> import functools
>>>>>>>>
>>>>>>>> class LocalStorage(threading.local):
>>>>>>>>   def __getattr__(self, name):
>>>>>>>>      if "handle" == name:
>>>>>>>>         self.handle = geos_lib.initGEOS_r(notice_h, error_h)
>>>>>>>>         #print "Construct handle:",
>>>> threading.currentThread().name
>>>>>>>>         #print "              id:", id(self.handle)
>>>>>>>>         return self.handle
>>>>>>>>      raise AttributeError("Local storage does not have
>>> attribute
>>>>>>>> '%s'" % name)
>>>>>>>>
>>>>>>>> local_storage = LocalStorage()
>>>>>>>>
>>>>>>>> class ThreadSafeGeos(object):
>>>>>>>>   def __getattr__(self, name):
>>>>>>>>      new_name = "%s_r" % (name)
>>>>>>>>      r_func = getattr(geos_lib, new_name)
>>>>>>>>      handle = local_storage.handle
>>>>>>>>      return functools.partial(r_func, handle)
>>>>>>>>
>>>>>>>> tsgeos = ThreadSafeGeos()
>>>>>>>>
>>>>>>>> lgeos = tsgeos
>>>>>>>>
>>>>>>>>
>>>>>>>> -Aron
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jan 27, 2009 at 8:16 AM, Justin Bronn
>>> <[email protected]>
>>>>>>>> wrote:
>>>>>>>>> Aron Bierbaum wrote:
>>>>>>>>>> The only idea that I have off the top of my head is to use
>>>>>>>>>> threading.local in order to keep track of a handle for each
>>>>>>>>>> thread
>>>>>>>>>> that calls any function in GEOS. I tried implementing
>>> something
>>>>>>>>>> like
>>>>>>>>>> this quickly and ran into problems. Any other ideas?
>>>>>>>>>
>>>>>>>>> Since I'm going to have to do the same for GeoDjango's
>>> bindings,
>>>>>>>>> I've
>>>>>>>>> been kicking around some ideas on how to do this -- and I also
>>>>>>>>> thought
>>>>>>>>> of something similar.  I think this idea would work, but it's
>>>>>>>>> going
>>>>>>>>> to
>>>>>>>>> require some extensive implementation.
>>>>>>>>>
>>>>>>>>> Basically you'd have a global dict keyed by
>>>>>>>>> threading.currentThread()
>>>>>>>>> and a corresponding value be the GEOS context handle.
>>> However,
>>>>>>>>> you'd
>>>>>>>>> have to (have a way to determine whether you're running GEOS
>>> 3.1
>>>>>>>>> or
>>>>>>>>> not
>>>>>>>>> and use the appropriate threading/non-threading function
>>>>>>>>> signature --
>>>>>>>>> including dynamically insert the context handle into the
>>> GEOS C
>>>>>>>>> API
>>>>>>>>> function arguments.  I'm wondering how much overhead this is
>>> and
>>>>>>>>> what
>>>>>>>>> the effect on performance would be -- but I haven't had time
>>>> (and
>>>>>>>>> won't
>>>>>>>>> for the next several weeks) to mock up and try it out.
>>>>>>>>>
>>>>>>>>>>> Thread safety hasn't been a problem up to now, and we've
>>>>>>>>>>> hammered
>>>>>>>>>>> on
>>>>>>>>>>> it quite a bit. We're using PyDLL to load libgeos_c, which
>>>> means
>>>>>>>>>>> the
>>>>>>>>>>> GIL isn't released.
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yeah, it's a problem that's not seen until two routines enter
>>>> the
>>>>>>>>> same
>>>>>>>>> critical areas in GEOS at the same time.  For web apps the
>>>> problem
>>>>>>>>> (segfaulting Apache/FCGI processes) becomes more apparent as
>>>>>>>>> traffic
>>>>>>>>> increases for the site.  I consider it a serious problem.
>>>>>>>>>
>>>>>>>>> -Justin
>>>>>>>>> _______________________________________________
>>>>>>>>> Community mailing list
>>>>>>>>> [email protected]
>>>>>>>>> http://lists.gispython.org/mailman/listinfo/community
>>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Community mailing list
>>>>>>>> [email protected]
>>>>>>>> http://lists.gispython.org/mailman/listinfo/community
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Community mailing list
>>>>>>> [email protected]
>>>>>>> http://lists.gispython.org/mailman/listinfo/community
>>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> Community mailing list
>>>>> [email protected]
>>>>> http://lists.gispython.org/mailman/listinfo/community
>>>>
>>>> _______________________________________________
>>>> Community mailing list
>>>> [email protected]
>>>> http://lists.gispython.org/mailman/listinfo/community
>>>>
>>>> _______________________________________________
>>>> Community mailing list
>>>> [email protected]
>>>> http://lists.gispython.org/mailman/listinfo/community
>>>
>>> _______________________________________________
>>> Community mailing list
>>> [email protected]
>>> http://lists.gispython.org/mailman/listinfo/community
>>>
>>> _______________________________________________
>>> Community mailing list
>>> [email protected]
>>> http://lists.gispython.org/mailman/listinfo/community
>>
>> _______________________________________________
>> Community mailing list
>> [email protected]
>> http://lists.gispython.org/mailman/listinfo/community
>>
> <thread.diff>_______________________________________________
> Community mailing list
> [email protected]
> http://lists.gispython.org/mailman/listinfo/community

_______________________________________________
Community mailing list
[email protected]
http://lists.gispython.org/mailman/listinfo/community

Reply via email to