(forwarding back to list)

---------- Forwarded message ---------
From: Alexis R.L. <[email protected]>
Date: Thu, 15 Oct 2020 at 07:07
Subject: Re: [QGIS-Developer] GPKG and FID -- can we fix this mess?
To: Nyall Dawson <[email protected]>

> Greetings,
>
> Can't we have the full reset option as a failsafe method? This way we work
> normally and when there is an error related to fids we simply scrap them.
>
> Personally I don't get fid issues these days and never rely on the fids. I
> also don't use GML so the other aspects don't affect me.
>
> Thanks,
> Alex

I think this is a great idea -- we could retain any compliant fid values
without change, but as soon as we hit a duplicate fid or non-integer fid
value then we discard it and generate a new one. What do you think, Even?

Nyall

On Tue, 13 Oct 2020 at 17:45, Nyall Dawson <[email protected]> wrote:
>
> Hi list,
>
> (Linus Torvalds-style harsh truths incoming, read only after coffee/alcohol!)
>
> Having spent an incredibly frustrating day fighting with the
> limitations of GPKG and the horrible workflow that they mandate, I'd
> love to start brainstorming on how we can fix this.
>
> While previous discussions have related to the GPKG SQLite WAL mess,
> that has (to the extent of my experience) been resolved in the latest
> release. So I'd like to focus on what I see as the biggest pain point
> of GPKG: the FID column.
>
> This is a pain point for numerous reasons:
>
> - The type constraint on the fid column makes it really hard to
> translate datasets with an existing, non-numeric "fid" column into
> geopackage. E.g. GML files often have a textual fid string, and
> attempting to convert these to gpkg results in a string of errors
> about string values not being usable as fid values, and an empty
> result layer. The only workaround here is to translate first to an
> alternative format (such as shp!), delete the fid column, and THEN
> save as gpkg.
>
> - The fid unique constraint, while understandable, results in a HUGE
> raft of issues while working with these. It's SO easy to get a
> situation where you have duplicate fids in an edit buffer, and no way
> to save these features back to the gpkg.
> You get a series of 1000s of
> errors about duplicate fid, and then an ambiguous state where you're
> completely unsure exactly what's been saved and what's about to be
> lost. This isn't just attributable to a single tool in QGIS -- it's
> possible to end up with duplicate fids through so many different
> operations, including really simple stuff like copying and pasting
> features!
>
> I've fought with this since we've really started to push GPKG and,
> frankly, I've given up. I don't think there's any way to fix the
> current situation and leave fids as they currently behave.
>
> So what I propose is a radical re-think about how GPKG fids are
> handled and exposed by QGIS (and by GDAL).
>
> I propose that we
>
> 1. demote fids to being only a "semi-permanent" row identifier, with
> the message being that "sometimes these WILL change and you can't rely
> on them as a permanent id field for joins and row identification". If
> users require a permanent unique identifier (i.e. a primary key) on
> their table then THEY have to make and manage that themselves, just
> like shapefiles etc.
>
> 2. expose fids as a read-only field. Users can still see them if they
> want, but they cannot edit them.
>
> 3. make QGIS (or GDAL?) ALWAYS generate a completely new fid whenever
> a row is changed or added. Throw away the old value, make a new one on
> EVERY edit/addition.
>
> 4. We COMPLETELY ignore any existing fid value set for features added
> to a GPKG layer. I.e. in the case of translating a GML with a text fid
> field, we completely ignore the incoming GML fid values and instead
> use the "always generate a new fid" rule.
>
> Yes, these changes will break existing workflows, and possibly break
> existing tools/scripts. But honestly, in my experience and the
> experience of my customers, there's a COMPLETE lack of faith and trust
> in GPKG at this stage. EVERYONE has their horror stories of lost data
> and mangled datasets.
> We've got to do something drastic, and we've got
> to do it sooner rather than later to salvage what little hope does
> remain for this format.
>
> Thoughts?
>
> Nyall
> _______________________________________________
> QGIS-Developer mailing list
> [email protected]
> List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer
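
To make the failsafe idea from the reply above concrete (keep compliant fid
values, discard and regenerate on a duplicate or non-integer value), here is
a minimal sketch in plain Python. It is not QGIS or GDAL code: the function
name assign_fids and the rule "new fids start above the largest retained
fid" are assumptions made for illustration only, and integer-like strings
such as "3" are deliberately treated as non-compliant.

```python
from typing import Any, Iterable, List, Tuple


def assign_fids(incoming: Iterable[Any]) -> List[Tuple[Any, int]]:
    """Return (incoming_value, final_fid) pairs, one per feature, in order.

    Hypothetical helper: a value is "compliant" if it is an int (bools
    excluded) and has not been seen earlier. Everything else gets a
    freshly generated fid.
    """
    values = list(incoming)

    # Pass 1: decide which incoming values can be retained unchanged.
    used = set()
    keep = []
    for value in values:
        ok = (
            isinstance(value, int)
            and not isinstance(value, bool)
            and value not in used
        )
        if ok:
            used.add(value)
        keep.append(ok)

    # Pass 2: generate new fids above every retained one, so a generated
    # fid can never collide with a compliant value that appears later.
    next_fid = max(used, default=0) + 1
    result = []
    for value, ok in zip(values, keep):
        if ok:
            result.append((value, value))
        else:
            result.append((value, next_fid))
            next_fid += 1
    return result


if __name__ == "__main__":
    # A textual GML-style fid, a duplicate, and a missing value.
    for original, fid in assign_fids([1, "gml.7", 1, 5, None]):
        print(repr(original), "->", fid)
```

The two passes are a design choice in this sketch: assigning replacements in
a single pass could hand out a number that a later, perfectly valid feature
already carries, which would force that feature to be renumbered too.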
