Hi,

Extending shapefile capabilities: the one reason that might persuade anyone is the still abundant use of shapefiles for data exchange. Shapefile does seem to play a vital role there. I don't know of anything equally 'tradeable'. However, an extension would also endanger the interoperability/data exchange, so little of this would work short of defining a new shapefile variant specification, working title .shp2. I can see the outlines, even though I doubt it will be possible to introduce such a thing. The mods would not be that hard, though.

Deleting: the bit I've implemented quickly for internal use does use the ghost-feature approach, so I agree there. At the end of the session I pack the .dbf; now I'll consider packing the .shp as well, to avoid problems with packages that do not read the .shx. Thanks for the advice.

As done now internally, a request for a deleted record returns null ("does not exist"), but does not affect the numbering (fid) of the other records/shapes/rows. In the .shx, only the size is set to 0 for a delete, so an undelete is possible. For the .dbf, the only overhead is maintaining a list of deleted record numbers.

To properly operate a shapefile-like system in which a delete actually reduces the number of records would require 2 things:
1) another definition of the record identifiers (fid as part of the record). This can be done in the shapefile, of course, since the record number is already stored (1-based); it is just never used afaik.
2) maintaining a (kind of) .dbx in memory, to index the valid records in much the same way the .shx does. One could store it, or build it when reading the .dbf file. A more efficient scheme should be possible.

You can indeed not throw such a thing at existing shapefile drivers. Just sharing my thoughts here.

Jan

On Wed, Apr 30, 2014 at 4:30 PM, Even Rouault <[email protected]> wrote:

> On Tuesday 29 April 2014 19:46:46, Jan Heckman wrote:
> > Hi,
> > It appears I have to do some homework on ogr's shapefile functions as it
> > stands now.
> >
> > 8GB: If interoperability is more of a priority than capacity, that's a
> > valid consideration. I've not really needed anything > 4GB so far.
>
> I'm not sure there's a point in "extending" shapefile capabilities when
> there are other formats, more capable, that don't have a 32bit offset
> limitation.
>
> > Delete:
> > By delete I mean leaving the information in the file but (shapefile) taking
> > it out of the index chain (.shx), and (.dbf) marking the record with an
> > asterisk in its first byte.
> >
> > As far as arcgis, I did a delete in this way and tried to load it. When I
> > do not reduce the record count in the dbf header, arcgis will not load it;
> > when I do reduce the record count in the header, arcgis will load the
> > shapefile but the attributes will not match the shapes. As a cross-check,
> > you can open the .dbf in open office or excel: the delete will be
> > recognized.
> >
> > My guess is that arcgis maps the shaperecords to the physical records of
> > the dbf only.
> >
> > To allow use of the shapefile in arcgis, I have to compact the .dbf. The
> > shape will then be handled correctly.
> >
> > A recipe to try this out:
> > create a new empty point shapefile, load it in arcgis. Using arccatalog to
> > create the shapefile, it will have a single ID integer attribute. That's
> > the starting point.
> > Create 3 points and give them ID's 1 - 3.
> > Now to 'delete' the second record using a diskeditor:
> > Copy the shapefile. Open the .shx. The .shx has a header and records
> > consisting of offset-length pairs. A pair takes 8 bytes. Change the 2nd
> > offset to be identical to the last (00000040 -> 0000004E). Diminish the
> > filelength indicator in the header (offset 0x18) by 4 (0000003E to
> > 0000003A). Copy the file, except the last 8 bytes to the new .shx file.
> > DBF: open in editor, change the first byte of the second record (at offset
> > 0x48) to an asterisk. The recordcount in the header is at offset 4 (little
> > endian).
> >
> > Load in arcgis, will fail.
>
> Yes I'm not surprised at all and I would say that ArcGIS behaviour is sane.
> And I guess that OGR would be defeated too if you tried to open such a
> shapefile with it.
> If you delete record ID 2, you just have to mark the .dbf record with '*'. For
> uniqueness purposes, feature 3 should remain feature 3. And feature 2 be a
> "ghost feature". That's what OGR does when you use the DeleteFeature() API. I
> don't think it touches the .shx at all. It could possibly change the offset to
> be 0 as a marker for invalid, but we don't even do that.
>
> Otherwise you have to compact both the .dbf and .shx (and possibly .shp for
> software not using .shx as Jukka mentioned) to move data.
>
> Even
>
> --
> Geospatial professional services
> http://even.rouault.free.fr/services.html

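For illustration, the ghost-feature behaviour and the in-memory ".dbx" idea from this thread can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not what OGR or ArcGIS does internally: the helper names (build_dbf, mark_deleted, read_record, live_index) are hypothetical, fids are taken as 0-based, and the table has a single numeric ID field assumed to be 6 characters wide, which happens to put the second record at offset 0x48 as in the recipe.

```python
import struct

DELETED = ord("*")   # dBASE deletion flag; a live record starts with a space


def build_dbf(ids, width=6):
    """Build a tiny in-memory .dbf with one numeric field (width is an assumption)."""
    header_len = 32 + 32 + 1      # file header + one field descriptor + 0x0D terminator
    record_len = 1 + width        # deletion flag byte + field data
    header = struct.pack("<BBBBIHH20x", 0x03, 114, 4, 30,
                         len(ids), header_len, record_len)
    field = struct.pack("<11sc4xBB14x", b"ID", b"N", width, 0)
    rows = b"".join(b" " + str(i).rjust(width).encode("ascii") for i in ids)
    return bytearray(header + field + b"\r" + rows + b"\x1a")


def _record_start(dbf, fid):
    """Return (byte offset, record length) of physical record `fid` (0-based)."""
    count, header_len, record_len = struct.unpack_from("<IHH", dbf, 4)
    if not 0 <= fid < count:
        raise IndexError(fid)
    return header_len + fid * record_len, record_len


def mark_deleted(dbf, fid):
    """Ghost-feature delete: flag the record, leave the header count untouched."""
    start, _ = _record_start(dbf, fid)
    dbf[start] = DELETED


def read_record(dbf, fid):
    """Return the ID value, or None when the record is a ghost."""
    start, record_len = _record_start(dbf, fid)
    if dbf[start] == DELETED:
        return None
    return int(dbf[start + 1:start + record_len].decode("ascii"))


def live_index(dbf):
    """The '.dbx' idea: physical record numbers of the records still valid."""
    count = struct.unpack_from("<I", dbf, 4)[0]
    return [fid for fid in range(count)
            if dbf[_record_start(dbf, fid)[0]] != DELETED]


dbf = build_dbf([1, 2, 3])
mark_deleted(dbf, 1)              # fid 1 is the second record
print(read_record(dbf, 1), read_record(dbf, 2), live_index(dbf))
```

Here read_record(dbf, 1) returns None while the third record keeps fid 2, so the numbering of the other features is unaffected. live_index(dbf) gives [0, 2], which is the kind of lookup table one would need if deleted rows should no longer count as records.
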
_______________________________________________ gdal-dev mailing list [email protected] http://lists.osgeo.org/mailman/listinfo/gdal-dev
