Aron, I've committed most of your patch, with a few changes, to the Shapely (1.1) trunk.
http://trac.gispython.org/lab/changeset/1226 If you try test_threads.py you'll see that the individual threads now get their own GEOS handles. I like the use of functools.partial. Thanks for the help! On Feb 26, 2009, at 4:32 PM, Aron Bierbaum wrote: > After doing a lot of testing and thinking I came to a conclusion as to > what I think was causing by errors, and have come up with a fix. The > basic idea behind my issues were that multiple threads could can not > in any way change the ctypes function pointers. With the current code > it is possible to have these function pointers modified in the wappers > __call__ method. Also while calling certain functions in the Shapely > API it changed the errcheck function at run-time. I think the main > reason for this in most cases was the need to bind objects into these > error checking functions before calling. This is not strictly needed > because the parameters to the errcheck method provide all needed > information. What I have come up with does not look the greatest > because of all of the errcheck methods get defined when the module is > imported, but it seems to run better. Please let me know your thoughts > on the attached patch. > > Thanks, > Aron > > On Thu, Feb 26, 2009 at 10:53 AM, Sean Gillies <[email protected]> > wrote: >> Give it a try. I like the optimization idea too. >> >> On Feb 25, 2009, at 1:25 PM, Aron Bierbaum wrote: >> >>> It would be part of the function wrapper returned by the LGEOS >>> object. The idea being that we have to check before every call. In >>> order to improve performance a little I was also wondering about >>> storing the function wrappers in the LGEOS object using setattr so >>> that __getattr__ wouldn't get called the next time the user tries to >>> call the geos method. >>> >>> Thanks, >>> Aron >>> >>> class LGEOS(object): >>> >>> def __getattr__(self, name): >>> func = getattr(self._lgeos, name) >>> if self._geos_c_version >= (1,5,0): >>> ob = getattr(self._lgeos, name + '_r') >>> class wrapper(object): >>> __name__ = None >>> errcheck = None >>> restype = None >>> def __init__(self): >>> self.func = ob >>> self.__name__ = ob.__name__ >>> self.func.restype = func.restype >>> if func.argtypes is None: >>> self.func.argtypes = None >>> else: >>> self.func.argtypes = [c_void_p] + >>> func.argtypes >>> def __call__(self, *args): >>> if not hasattr(thread_data, 'geos_handle'): >>> thread_data.geos_handle = >>> lgeos._lgeos.initGEOS_r(notice_h, error_h) >>> >>> if self.errcheck is not None: >>> self.func.errcheck = self.errcheck >>> if self.restype is not None: >>> self.func.restype = self.restype >>> return self.func(thread_data.geos_handle, *args) >>> attr = wrapper() >>> >>> >>> -Aron >>> >>> On Wed, Feb 25, 2009 at 1:51 PM, Sean Gillies <[email protected]> >>> wrote: >>> That would be a method of what class? >>> >>> On Feb 25, 2009, at 10:43 AM, Aron Bierbaum wrote: >>> >>>> I guess I don't see where these discussions would effect what we >>>> are >>>> doing. I am sure I am missing something, but in order to allow no >>>> changes by users, can't we just do something like: >>>> >>>> def __call__(self, *args): >>>> if not hasattr(thread_data, 'geos_handle'): >>>> thread_data.geos_handle = >>>> lgeos._lgeos.initGEOS_r(notice_h, error_h) >>>> >>>> -Aron >>>> >>>> On Wed, Feb 25, 2009 at 10:26 AM, Sean Gillies <[email protected]> >>>> wrote: >>>> Aron, >>>> >>>> r1194 is indeed the last stable revision for a while. >>>> >>>> Importing modules from a thread is fraught with all kinds of >>>> issues. >>>> See this interesting discussion: >>>> >>>> http://bugs.python.org/issue1720705 >>>> >>>> And the conclusion at the end of >>>> >>>> http://svn.python.org/view/python/trunk/Doc/library/threading.rst?view=markup&pathrev=61365 >>>> >>>> I think we're going to need a module level function to initialize >>>> thread local data. >>>> >>>> On Feb 24, 2009, at 5:23 PM, Aron Bierbaum wrote: >>>> >>>>> Well I finally got around to testing these changes. It appears >>> that >>>>> there are a lot of changes being made currently, which revision >>>> should >>>>> I be testing with? I have tried running test_threads.py with both >>>>> trunk and r1194 and the test fails. I briefly looked at r1194 >>> and it >>>>> looks like the thread specific geos_handle is not getting >>> created in >>>>> each thread. It is created correctly in the primordially thread, >>> but >>>>> not the others. I believe it is because the threading.local >>>>> constructor is only called once, not once per thread. Any ideas >>> on a >>>>> fix? >>>>> >>>>> Thanks, >>>>> Aron >>>>> >>>>> On Thu, Jan 29, 2009 at 5:59 PM, Aron Bierbaum >>>>> <[email protected]> wrote: >>>>>> Wow, this looks really nice. I should have time to test it either >>>>>> later tonight or tomorrow. It's also interesting that this should >>>>>> allow an easy transition if/when they move to a thread safe only >>>>>> interface. >>>>>> >>>>>> Thanks, >>>>>> Aron >>>>>> >>>>>> On Thu, Jan 29, 2009 at 4:33 PM, Sean Gillies <[email protected]> >>>>>> wrote: >>>>>>> Aron, >>>>>>> >>>>>>> I did a bit of work along those lines and made some pretty good >>>>>>> progress. Tests are passing except for a couple I had to >>> disable. >>>>>>> Updating of coordinate sequences through geos_linestring and >>>>>>> geos_linearring is currently broken. I'm not sure whether it's >>> my >>>>>>> fault or a bug in the reentrant API. Probably the former. >>>>>>> >>>>>>> Checkout r1194 if you're interested in testing. I haven't >>> tried it >>>>>>> in >>>>>>> a threaded situation yet, and suspect it will be pretty hard to >>>>>>> trigger a fault -- Shapely itself doesn't expose any means of >>>>>>> modifying the GEOS context data, and I don't see it being done >>>>>>> inside >>>>>>> GEOS. >>>>>>> >>>>>>> Cheers, >>>>>>> Sean >>>>>>> >>>>>>> On Jan 28, 2009, at 10:44 PM, Aron Bierbaum wrote: >>>>>>> >>>>>>>> This is a really long shot but what do people think of >>> something >>>>>>>> like >>>>>>>> the following. I have run into a few problems with ctypes while >>>>>>>> trying >>>>>>>> to run with it, but I thought it might start people thinking. >>>>>>>> >>>>>>>> import threading >>>>>>>> import functools >>>>>>>> >>>>>>>> class LocalStorage(threading.local): >>>>>>>> def __getattr__(self, name): >>>>>>>> if "handle" == name: >>>>>>>> self.handle = geos_lib.initGEOS_r(notice_h, error_h) >>>>>>>> #print "Construct handle:", >>>> threading.currentThread().name >>>>>>>> #print " id:", id(self.handle) >>>>>>>> return self.handle >>>>>>>> raise AttributeError("Local storage does not have >>> attribute >>>>>>>> '%s'" % name) >>>>>>>> >>>>>>>> local_storage = LocalStorage() >>>>>>>> >>>>>>>> class ThreadSafeGeos(object): >>>>>>>> def __getattr__(self, name): >>>>>>>> new_name = "%s_r" % (name) >>>>>>>> r_func = getattr(geos_lib, new_name) >>>>>>>> handle = local_storage.handle >>>>>>>> return functools.partial(r_func, handle) >>>>>>>> >>>>>>>> tsgeos = ThreadSafeGeos() >>>>>>>> >>>>>>>> lgeos = tsgeos >>>>>>>> >>>>>>>> >>>>>>>> -Aron >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jan 27, 2009 at 8:16 AM, Justin Bronn >>> <[email protected]> >>>>>>>> wrote: >>>>>>>>> Aron Bierbaum wrote: >>>>>>>>>> The only idea that I have off the top of my head is to use >>>>>>>>>> threading.local in order to keep track of a handle for each >>>>>>>>>> thread >>>>>>>>>> that calls any function in GEOS. I tried implementing >>> something >>>>>>>>>> like >>>>>>>>>> this quickly and ran into problems. Any other ideas? >>>>>>>>> >>>>>>>>> Since I'm going to have to do the same for GeoDjango's >>> bindings, >>>>>>>>> I've >>>>>>>>> been kicking around some ideas on how to do this -- and I also >>>>>>>>> thought >>>>>>>>> of something similar. I think this idea would work, but it's >>>>>>>>> going >>>>>>>>> to >>>>>>>>> require some extensive implementation. >>>>>>>>> >>>>>>>>> Basically you'd have a global dict keyed by >>>>>>>>> threading.currentThread() >>>>>>>>> and a corresponding value be the GEOS context handle. >>> However, >>>>>>>>> you'd >>>>>>>>> have to (have a way to determine whether you're running GEOS >>> 3.1 >>>>>>>>> or >>>>>>>>> not >>>>>>>>> and use the appropriate threading/non-threading function >>>>>>>>> signature -- >>>>>>>>> including dynamically insert the context handle into the >>> GEOS C >>>>>>>>> API >>>>>>>>> function arguments. I'm wondering how much overhead this is >>> and >>>>>>>>> what >>>>>>>>> the effect on performance would be -- but I haven't had time >>>> (and >>>>>>>>> won't >>>>>>>>> for the next several weeks) to mock up and try it out. >>>>>>>>> >>>>>>>>>>> Thread safety hasn't been a problem up to now, and we've >>>>>>>>>>> hammered >>>>>>>>>>> on >>>>>>>>>>> it quite a bit. We're using PyDLL to load libgeos_c, which >>>> means >>>>>>>>>>> the >>>>>>>>>>> GIL isn't released. >>>>>>>>>>> >>>>>>>>> >>>>>>>>> Yeah, it's a problem that's not seen until two routines enter >>>> the >>>>>>>>> same >>>>>>>>> critical areas in GEOS at the same time. For web apps the >>>> problem >>>>>>>>> (segfaulting Apache/FCGI processes) becomes more apparent as >>>>>>>>> traffic >>>>>>>>> increases for the site. I consider it a serious problem. >>>>>>>>> >>>>>>>>> -Justin >>>>>>>>> _______________________________________________ >>>>>>>>> Community mailing list >>>>>>>>> [email protected] >>>>>>>>> http://lists.gispython.org/mailman/listinfo/community >>>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Community mailing list >>>>>>>> [email protected] >>>>>>>> http://lists.gispython.org/mailman/listinfo/community >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Community mailing list >>>>>>> [email protected] >>>>>>> http://lists.gispython.org/mailman/listinfo/community >>>>>>> >>>>>> >>>>> _______________________________________________ >>>>> Community mailing list >>>>> [email protected] >>>>> http://lists.gispython.org/mailman/listinfo/community >>>> >>>> _______________________________________________ >>>> Community mailing list >>>> [email protected] >>>> http://lists.gispython.org/mailman/listinfo/community >>>> >>>> _______________________________________________ >>>> Community mailing list >>>> [email protected] >>>> http://lists.gispython.org/mailman/listinfo/community >>> >>> _______________________________________________ >>> Community mailing list >>> [email protected] >>> http://lists.gispython.org/mailman/listinfo/community >>> >>> _______________________________________________ >>> Community mailing list >>> [email protected] >>> http://lists.gispython.org/mailman/listinfo/community >> >> _______________________________________________ >> Community mailing list >> [email protected] >> http://lists.gispython.org/mailman/listinfo/community >> > <thread.diff>_______________________________________________ > Community mailing list > [email protected] > http://lists.gispython.org/mailman/listinfo/community _______________________________________________ Community mailing list [email protected] http://lists.gispython.org/mailman/listinfo/community
