Hi,

There has been some recent discussion on the QGIS list about an old ticket: https://hub.qgis.org/issues/11007
Basically, the issue seems to be that many, perhaps most, shapefile readers that are not based on shapelib/OGR don't understand the way OGR deletes features in shapefiles.

When OGR/shapelib deletes a feature, it simply marks the corresponding record in the DBF as deleted (technically, by putting a '*' character in the first byte of the DBF record) and that's all. This is very fast, and OGR handles it consistently (with the small restriction that the feature count reports the deleted features as still existing, whereas iteration and getting features by id do not report them).

This way of deleting a DBF record is the documented one: http://www.clicketyclick.dk/databases/xbase/format/dbf.html#DBF_STRUCT

"""
Deleted flag:

Value          Description
2Ah (*)        Record is deleted
20h (blank)    Record is valid
"""

However, other GIS packages, among them a famous proprietary one (let's call it "LineGIS"), do not recognize the deleted feature as deleted when reading such shapefiles, and display both its geometry and its attributes. More annoying, when "LineGIS" deletes another record in such a shapefile and saves the result, the shapefile can no longer be opened afterwards: an error message reports an inconsistency between the number of shapes and the number of records. On inspection, the shp/shx indeed contain N - 1 records and the dbf N - 2, so it looks like it is semi-aware of deleted DBF records.

When "LineGIS" starts with a "clean" shapefile and deletes a record in it, it removes the corresponding entries in the .dbf, .shp and .shx files, which is the same result as the REPACK operation that the shapefile driver can do if explicitly asked.

"LineGIS" isn't the only one to have trouble with deleted DBF records.
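To make the deleted flag concrete, here is a minimal Python sketch (standard library only, not OGR/shapelib code) that scans the records of a .dbf and lists those whose first byte is '*'. The function names `deleted_record_indices` and `make_demo_dbf` are made up for this illustration, and the demo file is a hypothetical single-field table built in memory. The header layout assumed here (record count at offset 4, header length at offset 8, record length at offset 10) is the one given on the xBase format page linked above.

```python
import struct


def deleted_record_indices(dbf_bytes):
    """Return (record_count_from_header, indices of records flagged as deleted)."""
    # DBF header: uint32 record count at offset 4, then uint16 header length
    # and uint16 record length at offsets 8 and 10 (all little-endian).
    n_records, header_len, record_len = struct.unpack_from("<IHH", dbf_bytes, 4)
    deleted = []
    for i in range(n_records):
        # The first byte of each record is the deleted flag:
        # 0x2A ('*') means deleted, 0x20 (blank) means valid.
        if dbf_bytes[header_len + i * record_len] == 0x2A:
            deleted.append(i)
    return n_records, deleted


def make_demo_dbf(names, deleted=()):
    """Build a tiny in-memory DBF with one 10-character field (demo data only)."""
    field_len = 10
    record_len = 1 + field_len        # deleted flag + the single field
    header_len = 32 + 32 + 1          # file header + one field descriptor + 0x0D
    # Version byte 0x03, an arbitrary last-update date, then counts and sizes.
    header = struct.pack("<BBBBIHH20x", 0x03, 116, 1, 1,
                         len(names), header_len, record_len)
    field = struct.pack("<11sc4xBB14x", b"NAME", b"C", field_len, 0)
    records = b"".join(
        (b"*" if i in deleted else b" ")
        + name.encode("ascii").ljust(field_len)[:field_len]
        for i, name in enumerate(names)
    )
    return header + field + b"\r" + records + b"\x1a"


if __name__ == "__main__":
    dbf = make_demo_dbf(["first", "second", "third"], deleted={1})
    count, flagged = deleted_record_indices(dbf)
    print(count, flagged)
```

In this demo the header still announces 3 records although one is flagged, which is the same mismatch as the feature count restriction mentioned above. A reader that ignores the first byte would presumably treat all 3 records as valid.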
From what I can see, GeoTools (just picking a random example) has only fully handled them since 2014:
https://osgeo-org.atlassian.net/browse/GEOT-4539
https://github.com/geotools/geotools/commit/e7333ccb284d137f3240ce5d0d09b3d7195f1890

The shapefile specification ( http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf ) doesn't say how deleted records should be handled, in particular whether the requirement "The table must contain one record per shape feature" (page 25) allows DBF records marked as deleted...

Anyway, theory/spec and practice are two different things. What surprises me is that such an issue didn't raise louder complaints before, as the OGR / shapelib behaviour has been the same since forever AFAIK.

I'm wondering if OGR shouldn't automatically run REPACK when closing a shapefile in which deletions (as well as edits of existing features leading to holes in the .shp) have happened. The side effects would be slower closing (creation-only scenarios wouldn't be affected) and a renumbering of the FIDs of the features after the deleted feature(s).

Thoughts?

(Regarding the QGIS issue: as QGIS explicitly runs REPACK after editing/deleting, it is not clear why the issue would persist. But some reports might be with older QGIS/GDAL versions.)

Even

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
