On Mon, Jul 19, 2010 at 7:54 PM, Frank Warmerdam <[email protected]> wrote:
> Martin Dobias wrote:
>> One note to avoid confusion: the suggestion I've made above relates
>> only to shapefile driver in OGR and doesn't impose any changes to the
>> API. The suggested patch reuses OGRShape instances which are passed
>> between OGR shapefile driver and shapelib. These OGRShape instances
>> never get to the user, so it's just a matter of internal working of
>> the shapefile driver. Please take a look at the patch if still
>> unclear.
>
> I'm not sure what an OGRShape is. Perhaps you are referring to
> OGRFeature? Or SHPObject? If the optimization is to reuse
> a SHPObject in repeated calls to Shapelib then this is indeed
> something that could be pursued without impact on the broader
> OGR API though I'd be amazed to find it makes a really big
> difference.

Oops, sorry! I meant SHPObject. In my tests, reusing the SHPObject (and
the coordinate arrays in it) cuts the time to retrieve 100 thousand line
features from about 125 ms to about 95 ms. For those 100 thousand
features it saves roughly 700 thousand alloc/free pairs.

>> GetFeature() returns a new instance and DestroyFeature() deletes that
>> instance. My idea is that DestroyFeature() call would save the
>> instance in a pool (list) of "returned" feature instances. These
>> returned features could be reused by the GetFeature() - it will take
>> one from the list instead of creating a new instance. I think this
>> doesn't make any influence on the public OGR API, because the
>> semantics will be the same. Only the OGR internals will be modified so
>> that it will not destroy OGRFeature instance immediately, because it
>> will assume that more GetFeature() calls will be issued.
>>
>> If the pool would be specific for each OGRLayer, many
>> allocations/deallocations of OGRFeature and OGRField instances could
>> be saved, because the features contain the same fields, they would
>> only have to be cleaned (but the array would stay as-is). A layer has
>> usually the same type of geometry for all features, so even geometries
>> could be kept and only the size of the coordinate array would be
>> altered between the calls.
>
> This seems *possible* but pretty complicated and if not done very
> carefully could introduce additional problems. I can't help but
> wonder if you aren't just using a poor heap implementation which
> is making allocations and deallocations unnecessarily expensive.
> Reworking huge amounts of code around the assumption that
> new/delete are terribly expensive does not seem entirely prudent.

I use the stock heap allocator from libc (on Ubuntu), nothing fancy.
Anyway, reusing the objects is not very high on my list; it's just
something worth considering.
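
To make the pool idea quoted above a bit more concrete, here is a rough
C++ sketch. Feature and FeaturePool are made-up names for illustration,
not OGR classes, and the real OGRFeature carries much more state than two
arrays. It only shows the mechanism: release() stands where
DestroyFeature() would delete the instance, and acquire() stands where
GetFeature() would create one.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for OGRFeature: field values plus a coordinate array.
struct Feature {
    std::vector<double> fields;
    std::vector<double> coords;  // x0, y0, x1, y1, ...
};

// Hypothetical per-layer pool; none of these names exist in OGR.
class FeaturePool {
public:
    // Hand out a recycled feature if one is idle, otherwise allocate a new one.
    std::unique_ptr<Feature> acquire() {
        if (free_.empty()) {
            ++allocations_;
            return std::unique_ptr<Feature>(new Feature());
        }
        std::unique_ptr<Feature> f = std::move(free_.back());
        free_.pop_back();
        // clear() drops the contents but keeps the allocated capacity, so
        // refilling the arrays usually needs no further heap allocation.
        f->fields.clear();
        f->coords.clear();
        return f;
    }

    // Keep the instance for later reuse instead of deleting it.
    void release(std::unique_ptr<Feature> f) {
        assert(f != nullptr);
        free_.push_back(std::move(f));
    }

    std::size_t allocations() const { return allocations_; }  // features ever created
    std::size_t idle() const { return free_.size(); }         // features awaiting reuse

private:
    std::vector<std::unique_ptr<Feature>> free_;
    std::size_t allocations_ = 0;
};
```

With one pool per layer, a read loop that releases each feature before
acquiring the next one allocates a single Feature for the whole scan.
Whether that pays off in practice depends on the allocator, which is the
open question here.
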
If I have some spare time, I will look into the complexity and the
possible gains, but for now I'd rather focus on getting the shapefile
driver faster using the stuff I have ready.

Regards
Martin
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
