Thanks Nyall for raising this again.

The way I see it, the fid is very similar to a rowid or a shapefile fid: a semi-stable unique identifier. The big difference is that those are not part of the data, hence the system can transparently deal with duplicates and can fill holes once in a while (shp -> repack, sqlite rowid -> vacuum).
If I could choose, I would just make fids disappear (not only from the interface but from the whole gpkg implementation) and replace them with rowid if there's a good reason for that (which I still fail to see).

So, just to put the idea on the table: could we make fids optional for newly created gpkgs? Or is my fear correct that this would affect interoperability in a bad way?

Matthias

On Tue, Oct 13, 2020 at 11:45 PM Nyall Dawson <[email protected]> wrote:
> Hi list,
>
> (Linus Torvalds-style harsh truths incoming, read only after
> coffee/alcohol!)
>
> Having spent an incredibly frustrating day fighting with the
> limitations of GPKG and the horrible workflow that they mandate, I'd
> love to start brainstorming on how we can fix this.
>
> While previous discussions have related to the GPKG sqlite wal mess,
> that has (to the extent of my experience) been resolved in the latest
> release. So I'd like to focus on what I see as the biggest pain point
> of GPKG: the FID column.
>
> This is a pain point for numerous reasons:
>
> - The type constraint on the fid column makes it really hard to
> translate datasets with an existing, non-numeric "fid" column into
> geopackage. E.g. GML files often have a textual fid string, and
> attempting to convert these to gpkg results in a string of errors
> about string values not being usable as fid values, and an empty
> result layer. The only workaround here is to translate first to an
> alternative format (such as shp!), delete the fid column, and THEN
> save as gpkg.
>
> - The fid unique constraint, while understandable, results in a HUGE
> raft of issues while working with these. It's SO easy to get a
> situation where you have duplicate fids in an edit buffer, and no way
> to save these features back to the gpkg. You get a series of 1000s of
> errors about duplicate fid, and then an ambiguous state where you're
> completely unsure exactly what's been saved and what's about to be
> lost. This isn't just attributable to a single tool in QGIS -- it's
> possible to end up with duplicate fids through so many different
> operations, including really simple stuff like copying and pasting
> features!
>
> I've fought with this since we've really started to push GPKG and,
> frankly, I've given up. I don't think there's any way to fix the
> current situation and leave fids as they currently behave.
>
> So what I propose is a radical re-think about how GPKG fids are
> handled and exposed by QGIS (and by GDAL).
>
> I propose that we
>
> 1. demote fids to being only a "semi-permanent" row identifier, with
> the message being that "sometimes these WILL change and you can't rely
> on them as a permanent id field for joins and row identification". If
> users require a permanent unique identifier (i.e. a primary key) on
> their table then THEY have to make and manage that themselves, just
> like shapefiles etc.
>
> 2. expose fids as a read-only field. Users can still see them if they
> want, but they cannot edit them.
>
> 3. make QGIS (or GDAL?) ALWAYS generate a completely new fid whenever
> a row is changed or added. Throw away the old value, make a new one on
> EVERY edit/addition.
>
> 4. We COMPLETELY ignore any existing fid value set for features added
> to a GPKG layer. I.e. in the case of translating a GML with a text fid
> field, we completely ignore the incoming GML fid values and instead
> use the "always generate a new fid" rule.
>
> Yes, these changes will break existing workflows, and possibly break
> existing tools/scripts. But honestly, in my experience and the
> experience of my customers, there's a COMPLETE lack of faith and trust
> in GPKG at this stage. EVERYONE has their horror stories of lost data
> and mangled datasets. We've got to do something drastic, and we've got
> to do it sooner rather than later to salvage what little hope does
> remain for this format.
>
> Thoughts?
>
> Nyall
> _______________________________________________
> QGIS-Developer mailing list
> [email protected]
> List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer
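
To make the duplicate-fid problem and the "always generate a new fid" idea from the quoted proposal a bit more concrete, here is a minimal sketch using plain Python sqlite3, not GDAL or QGIS. The table name `features`, the helper `insert_features` and the sample rows are all hypothetical, and the fid column is only assumed to be declared the way a typical gpkg feature table declares it. It shows that keeping the incoming fid rejects the pasted duplicate, while passing NULL lets SQLite assign a fresh value.

```python
import sqlite3


def make_table():
    # Hypothetical stand-in for a gpkg feature table: an integer, unique fid.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE features (fid INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
    )
    return conn


def insert_features(conn, rows, keep_incoming_fid):
    """Insert (fid, name) rows and return (saved, rejected) counts."""
    saved, rejected = 0, 0
    for fid, name in rows:
        try:
            # A NULL fid makes SQLite generate a new value for the primary key.
            conn.execute(
                "INSERT INTO features (fid, name) VALUES (?, ?)",
                (fid if keep_incoming_fid else None, name),
            )
            saved += 1
        except sqlite3.IntegrityError:
            # Raised when the incoming fid collides with an existing one.
            rejected += 1
    return saved, rejected


# Two features plus a copy/paste of the first one, which carries the same fid.
pasted = [(1, "a"), (2, "b"), (1, "copy of a")]

strict = insert_features(make_table(), pasted, keep_incoming_fid=True)

relaxed_conn = make_table()
relaxed = insert_features(relaxed_conn, pasted, keep_incoming_fid=False)
fids = [row[0] for row in relaxed_conn.execute("SELECT fid FROM features ORDER BY fid")]

print(strict, relaxed, fids)
```

The trade-off is the one already named in the proposal: in the relaxed case nothing is lost on save, but the fid a feature ends up with no longer says anything about where it came from, so anything joining on fid would need its own key column.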
