On Friday 22 August 2014 17:53:50, Blake Thompson wrote:
> Jeff,
>
> > Thanks Blake for the detailed response. I did not realize that I did not
> > do a reply all in my previous email I sent.
>
> Not an issue, glad you guys are interested in my changes.
>
> > --> I thought that this was not possible using current trunk GDAL because
> > of the global cache. At least the writing side using multiple threads can
> > cause issues.
> > See this response on an older topic I found in GDAL dev list:
> > http://lists.osgeo.org/pipermail/gdal-dev/2013-January/035215.html
> > The response there was for a similar question as the one I posted about
> > supporting of batch translating of dataset with this RFC.
>
> I do believe it is possible in the current trunk of GDAL (and I have
> written some code to do it without seeing any issues). There is a "lock"
> on raster blocks currently that prevents them from being removed from the
> cache if they are currently being utilized.
>
> > --> Ok, I was under the wrong impression that in addition to a cache per
> > dataset there is still a cap of global cache that oversees the total cache
> > used by GDAL.
>
> This really isn't possible, because if something needs to be removed from
> the cache and there is nothing to remove from the current dataset, it would
> have to go to another dataset to lower the cache size. At this point the
> per dataset cache really has limited meaning, because a lot of the same
> issues will occur.
>
> > --> Does this mean that in addition to the per dataset cache one can still
> > use the global cache (only not per dataset one) and have the scenario of
> > translating multiple datasets in a parallel way
> > works ok (without threading issues due to current implementation global
> > cache).
>
> Yes, it will work to translate multiple datasets in a parallel way with a
> global cache.
Note: after re-reading, I realize that I misread your above sentence as "it
will work to translate multiple datasets in a parallel way with a
*per-dataset* cache".

So, even if you didn't write it, I'm afraid that people will assume that
calling CreateCopy() on the same source dataset handle would be thread-safe
(imagine that one thread translates to format F1, while the other one
translates to format F2), whereas in the current state of the RFC it is not.
That is because CreateCopy() will call GetGeoTransform(), GetProjectionRef(),
etc., which are generally thread-unsafe. For example, GTiff has a lazy loading
approach for those 2 methods, and it is not the only one.

If we claim thread-safety, we should likely offer full thread-safety (at
least in reading scenarios), not a partial one. Otherwise I'm afraid no one
but the few people that have taken part in that discussion or read the RFC
will know the limits.

I'm wondering if it wouldn't be worth having GDALDatasetThreadSafe and
GDALRasterBandThreadSafe classes (or whatever name is appropriate), that
would follow the decorator pattern, i.e. they would own a thread-unsafe
"real" dataset/band and override the methods to lock them. Similarly to what
I have done in OGR with ogr/ogrsf_frmts/generic/ogrmutexeddatasource.cpp and
ogrmutexedlayer.cpp, needed for the FGDB driver (the one that depends on the
ESRI SDK).

My idea would be to have an open flag GDAL_OF_THREADSAFE for GDALOpenEx().
When set, GDALOpen() would do the usual job and get a (in most cases) unsafe
dataset object. Then it would query a virtual method of the dataset to return
a thread-safe version of it (GetThreadSafe()).
- If not defined by the driver, the base implementation of GetThreadSafe()
  would return the dataset wrapped in GDALDatasetThreadSafe.
- If the implementation of the dataset is already thread-safe, its
  GetThreadSafe() would just return self.
- For the in-between situations, it could for example override the
  GDALDatasetThreadSafe/GDALRasterBandThreadSafe base implementation to
  specialize the methods that don't need locking.

This idea might not work (and I've not thought about how it would combine
with my previous IReadBlock_thread_safe/IReadBlock approach). I have just
written it as it came to my mind. The main motivation is to make it easy to
have thread-safe versions by default, without the driver having to care about
that, while remaining flexible enough to let drivers that need finer control
have it.

If, through benchmarks, we determine that the cost of the thread-safe version
is negligible, then GDAL_OF_THREADSAFE might be useless, and GDALOpen() would
always return the thread-safe version.

>
> Thanks,
>
> Blake

--
Spatialys - Geospatial professional services
http://www.spatialys.com
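
To make the decorator idea above a bit more concrete, here is a rough,
GDAL-independent sketch in C++. Everything in it is a hypothetical stand-in:
Dataset plays the role of GDALDataset, DatasetThreadSafe that of the proposed
GDALDatasetThreadSafe, LazyDataset mimics a driver with lazy loading, and
MakeThreadSafe() stands for what opening with the proposed GDAL_OF_THREADSAFE
flag might do. None of these exist in GDAL as written here, and the
GetThreadSafe() signature (taking ownership of the dataset) is only one
possible choice.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for GDALDataset; this is NOT the real GDAL class.
class Dataset {
public:
    virtual ~Dataset() = default;
    virtual std::string GetProjectionRef() = 0;

    // Takes ownership of the dataset itself ("self") and returns a
    // thread-safe equivalent. The base implementation, defined below,
    // wraps the dataset in the locking decorator.
    virtual std::unique_ptr<Dataset> GetThreadSafe(std::unique_ptr<Dataset> self);
};

// Decorator: owns the thread-unsafe "real" dataset and serializes every call.
class DatasetThreadSafe final : public Dataset {
public:
    explicit DatasetThreadSafe(std::unique_ptr<Dataset> real)
        : real_(std::move(real)) {}

    std::string GetProjectionRef() override {
        std::lock_guard<std::mutex> lock(mutex_);
        return real_->GetProjectionRef();
    }

    // Already thread-safe: hand back self unchanged instead of double-wrapping.
    std::unique_ptr<Dataset> GetThreadSafe(std::unique_ptr<Dataset> self) override {
        return self;
    }

private:
    std::mutex mutex_;
    std::unique_ptr<Dataset> real_;
};

// Base behaviour for drivers that do not define their own GetThreadSafe().
inline std::unique_ptr<Dataset> Dataset::GetThreadSafe(std::unique_ptr<Dataset> self) {
    return std::make_unique<DatasetThreadSafe>(std::move(self));
}

// Hypothetical driver dataset with lazy loading, unsafe without a lock.
class LazyDataset final : public Dataset {
public:
    std::string GetProjectionRef() override {
        if (!loaded_) {              // unsynchronized check-then-load
            projection_ = "EPSG:4326";
            ++loads_;
            loaded_ = true;
        }
        return projection_;
    }
    int loads() const { return loads_; }

private:
    bool loaded_ = false;
    int loads_ = 0;
    std::string projection_;
};

// Hypothetical analogue of opening with a GDAL_OF_THREADSAFE-style flag.
inline std::unique_ptr<Dataset> MakeThreadSafe(std::unique_ptr<Dataset> ds) {
    assert(ds != nullptr);
    Dataset* raw = ds.get();         // keep a pointer before ownership moves
    return raw->GetThreadSafe(std::move(ds));
}
```

Several threads could then share the handle returned by MakeThreadSafe(); the
mutex serializes their calls into the wrapped dataset. A driver that is only
partly unsafe would provide its own decorator that skips the lock on the
methods that don't need it.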
