Le vendredi 22 août 2014 21:50:56, Blake Thompson a écrit : > Even, > > On Fri, Aug 22, 2014 at 1:33 PM, Even Rouault <[email protected]> > > wrote: > > Note: after re-reading, I realize that I misread your above sentence as > > "it will work to translate multiple datasets in a parallel way with a > > *per- dataset* cache"). > > > > So, even if you didn't write it, I'm afraid that people will assume that > > calling CreateCopy() on the same source dataset handle would be > > thread-safe (imagine that one thread translates to format F1, while the > > other one to format F2), whereas in the current state of the RFC it is > > not. Because CreateCopy() will call GetGeoTransform(), > > GetProjectionRef() etc which are generally thread unsafe. For example > > GTiff has a lazy loading approach for those 2 methods, and it is not the > > only one. > > If we claim thread-safety, we should likely offer full thread-safety (at > > > least > > in reading scenarios), not partial one. Otherwise I'm afraid no one but > > the few people that have taken part to that discussion or read the RFC > > will know > > the limits. > > Perhaps the goal of the RFC should be to allow CreateCopy to be threadsafe? > I have no tried yet to venture into other parts of thread safety on the > dataset (already bit off a very large chunk). > > > I'm wondering if it wouldn't be worth having GDALDatasetThreadSafe and > > GDALRasterBandThreadSafe classes (or whatever name is appropriate), that > > would > > follow the decorator pattern, i.e. they will own a thread unsafe "real" > > dataset/band and override the methods to lock them. Similarly to what I > > have > > done in OGR with ogr/ogrsf_frmts/generic/ogrmutexeddatasource.cpp and > > ogrmutexedlayer.cpp, needed for the FGDB driver (the one that depends on > > the > > ESRI SDK). > > > > My idea would be to have an open flag GDAL_OF_THREADSAFE for > > GDALOpenEx(). When set, GDALOpen() would do the usual job and get a (in > > most cases) unsafe > > dataset object. Then it would query a virtual method of the dataset to > > return > > a thread-safe version of it (GetThreadSafe()). > > - If not defined by the driver, the base implementation of > > GetThreadSafe() would return the dataset wrapped in > > GDALDatasetThreadSafe > > - If the implementation of the dataset is already thread-safe, it's > > GetThreadSafe() would return just self > > - For the in-between situations, it could for example override > > GDALDatasetThreadSafe/GDALRasterBandThreadSafe base implementation to > > specialize the methods that don't need locking. > > > > This idea might not work (and I've not though how it would combine with > > my previous IReadBlock_thread_safe/IReadBlock approach). I have just > > written it > > as it came to my mind. The main motivation is to make it easy to have > > thread- > > safe versions by default, without the driver having to care about that, > > while > > being flexible to make drivers that need finer control to do it. > > This approach worries me, there is already a lot of inheritance of the > GDALDataset, its really hard to learn already.
Well, the learning phase would be only be people who want to tweak thread safety of their drivers, i.e. not a lot of people I bet. The user wouldn't have to pay a lot (just specifying GDAL_OF_THREADSAFE, or nothing at all if the price of thread safety is not too big) > Adding another layer would > make it even harder to maintain in my opinion. It also provides one more > config option in an already overwhelming number of config options. I think > we might be better off if we made a long term plan on how we wanted to make > GDAL thread safe. Perhaps make a list of parts needing thread safety and > progress towards it over a period of time. As part of it maintain a webpage > that describes what is thread safe currently, so people are not confused on > what is and is not thread safe. People are already confused... and have a hard-time really understanding the implication of the few lines at http://trac.osgeo.org/gdal/wiki/FAQMiscellaneous#IstheGDALlibrarythread-safe I'm afraid more documentation of a more complex situation will not make it better. It is not reasonable to have to document that "GetProjectionRef() is thread-safe in drivers X and Y, whereas GetGeoTransform() is in Y and Z". Because that's the situation we will have for sure if we don't adopt a global default mechanism (unless someone spends weeks in adding locks in every virtual method of every driver). The external locking I suggested is not completely a new approach. I could mention Collections.synchronizedList() used in Java to make their List implementations thread safe... Sorry for mentionning Java ;-) The ideal would be to have a solution were by default we have a guarantee thread-safety and people (GDAL driver developers) who want to tune it can do it if they are brave enough. > > > If, through benchmark, we determine that the cost of the thread safe > > version > > is neglectable, then GDAL_OF_THREADSAFE might be useless, and GDALOpen() > > would > > always return the thread-safe version. > > I wish there were lots of benchmarks in the test suite (if there are, I > don't know where they are), They don't yet exist. Most code is functionnal code in Python, so not appropriate for performance testing, especially regarding threading. The autotest/cpp should likely be enhanced to have such tests (the only performance benchmark I can think of is testperfcopywords that was created to justify the templated approach of GDALCopyWords) To be mentionned, there's also apps/multireadtest.cpp (not compiled by default). It just ensures that opening the same file in N dataset objects and reading them in N threads works. > so that we can even roughly measure changes to > the internals of the library. (I realize this is something that could be > difficult to do well). > > Thanks, > > Blake -- Spatialys - Geospatial professional services http://www.spatialys.com _______________________________________________ gdal-dev mailing list [email protected] http://lists.osgeo.org/mailman/listinfo/gdal-dev
