Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-06-29 Thread Nyall Dawson
On Tue, 30 Jun 2020 at 15:52, Valérian Lebert  wrote:
>
> Hi,
>
> For my information, what is the downside of using SpatiaLite instead of
> GeoPackage, apart from the file size? Do these issues also affect
> SpatiaLite?

Yes. They come from the underlying SQLite format, which both SpatiaLite
and GeoPackage use.
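
A minimal sketch of that point, using only Python's standard library: every
SQLite 3 database, whatever its file extension, starts with the same 16-byte
header. The file name `demo.gpkg` and the helper `is_sqlite_file` are made up
for illustration, and the file built here is a plain SQLite database, not a
valid GeoPackage.

```python
import os
import sqlite3
import tempfile

# First 16 bytes of every SQLite 3 database file.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_sqlite_file(path):
    """Return True if the file starts with the SQLite 3 header string."""
    with open(path, "rb") as handle:
        return handle.read(16) == SQLITE_MAGIC


# Build a throwaway database; the .gpkg extension is only a naming choice here.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.gpkg")
conn = sqlite3.connect(demo_path)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.commit()
conn.close()

print(is_sqlite_file(demo_path))
```

The same check should hold for a SpatiaLite file, since both formats store
their tables in an ordinary SQLite container.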

Nyall
___
QGIS-Developer mailing list
QGIS-Developer@lists.osgeo.org
List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-06-29 Thread Valérian Lebert
Hi,

For my information, what is the downside of using SpatiaLite instead of
GeoPackage, apart from the file size? Do these issues also affect
SpatiaLite?

Regards
Valérian

On Fri, 8 May 2020 at 11:30, Matthias Kuhn wrote:

> Hi list,
>
> I wondered about the state of GeoPackage. Personally, since it has been
> introduced to QGIS and even more since it has been selected as the default
> format, I have never grown to fully and completely.
>
> I do not want to trigger an evangelical discussion here. I'd like to see
> where we are and what we can reasonably do to have a default file format
> which can be recommended with no bad feelings.
>
>
> Here follow a couple of observations over the years, some of them
> properties of the specs I believe:
>
>
> * The fid requirement
>
>   I sometimes want my features to be identified by UUIDs or other keys. They
> also tend to accumulate if derived datasets are created (through processing
> etc). If I need some pseudo-stable primary key, there is a rowid built
> into SQLite; we don't need a second one.
>
>   Possible mitigation: alter the OGR implementation, possibly alter the
> standard (required?)
>
> * The modification on r/o open
>
>   Has caused too much pain on git.
>
>   Possible mitigation: a) switch to journal mode=delete (not an easy
> option because of https://issues.qgis.org/issues/15351) b) only switch to
> wal mode when layers are put into edit mode (I have strong doubts this is a
> safe thing to do)
>
> * The network share freeze
>
>   Our default file should play nicely with (windows) network shares. It's
> clear to everyone that we can't expect concurrent writes. But it should
> "just work" for concurrent read by many.
>
>   Possible mitigation: switch to journal mode=delete for network shares
> (we are looking into this)
>
> * The wal file appearing next to the file
>
>   It is confusing to newcomers and looks almost like a sidecar file. I
> would care less if it was put into some system cache folder instead of just
> into my data folder. Or at least if it was a hidden file.
>
>   Possible mitigation: switch to journal mode=delete (not an easy option
> because of https://issues.qgis.org/issues/15351)
>
> * The couple of corrupted files I have received over the years which
> could only be repaired by a command line "dump contents as sql and execute
> into new file"
>
>   I have not found a way to reproduce this. Some of them were produced by
> older qgis versions making it easy to violate foreign key constraints and
> hard to recover. This has been fixed.
>
>   Possible mitigation: offer a "repair" option in qgis. Through processing
> or "on the fly" upon detection.
>
> * Default value magic replaces values on insert (with no possibility to
> pre-evaluate them)
>
>   E.g. a global sequence like on postgres would be nice. Can be worked
> around through default values in qgis though.
>
>   Possible mitigation: a)add it as a feature to sqlite. b) use qgis
> default values. c) live with it.
>
> * The requirement for a single geometry column per table
>
>   I just don't see a good reason to forbid that
>
>   Possible mitigation: a) alter the standard. b) ignore the standard and
> patch the ogr implementation.
>
>
> I wonder how others feel about these topics.
>
>
> - Are there more pain points I forgot to list?
>
> - Do you see more approaches to mitigate these problems?
>
> - Is someone already working on these issues?
>
>
> It would be great to have a standard file format that we can fully trust.
> Let's make a reality check if GeoPackage can be this format.
>
> Best regards
> --
> Matthias Kuhn
> matth...@opengis.ch
> +41 (0)76 435 67 63 <+41764356763>

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-06-05 Thread Paul Wittle
Hi,

The file we have had problems with was a GeoPackage based on planning polygons,
so would you like me to see whether I can create some sort of fake version
for testing? Alternatively, you may already have one that is suitable, as ours
is effectively just a table with ~1700 polygon entries.

We have switched that project over to using the database instead and it might 
be difficult to create an authentic fake version as I suspect the polygons need 
to be irregular to achieve the rendering delay you refer to.

I think our issues were always when someone was editing at the same time as 
someone else viewing the data.

Despite switching to the database we are still very interested to ensure the 
deadlocks issue is fixed because there will always be some datasets where you 
just want to share them read only and sometimes a network file is just more 
convenient; especially when it is only a few users.
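
For the read-only sharing case, a minimal sketch of how a plain SQLite client
can open a file so that writes on that connection are refused. This uses
Python's standard `sqlite3` module and SQLite's `mode=ro` URI parameter; the
file name and table are invented for the example. It is not how QGIS opens
GeoPackages, and it is not claimed to fix the deadlocks discussed in this
thread.

```python
import os
import pathlib
import sqlite3
import tempfile

# Create a small stand-in database (hypothetical name and table).
path = os.path.join(tempfile.mkdtemp(), "shared.gpkg")
writer = sqlite3.connect(path)
writer.execute("CREATE TABLE parcels (fid INTEGER PRIMARY KEY, name TEXT)")
writer.execute("INSERT INTO parcels (name) VALUES ('a')")
writer.commit()
writer.close()

# mode=ro asks SQLite to refuse any write on this connection.
uri = pathlib.Path(path).as_uri() + "?mode=ro"
reader = sqlite3.connect(uri, uri=True)
rows = reader.execute("SELECT name FROM parcels").fetchall()

try:
    reader.execute("INSERT INTO parcels (name) VALUES ('b')")
    write_refused = False
except sqlite3.OperationalError:
    # SQLite reports an attempt to write a read-only database.
    write_refused = True
finally:
    reader.close()

print(rows, write_refused)
```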

Anyway; I'd be interested to know if it is fixed.

Thanks,

Paul Wittle
Business Solutions Analyst (GIS)
ICT Operations
Dorset Council
01305 228473 

dorsetcouncil.gov.uk


Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-06-02 Thread Even Rouault
On Tuesday 2 June 2020 15:58:38 CEST Olivier Dalang wrote:
> Hi !
> 
> About WAL related issues :
> 
> I just tried to reproduce the original issue that led to using WAL on
> local files (https://github.com/qgis/QGIS/issues/23283) but could not
> reproduce it (recent master, Windows 10). Does the issue still exist
> on other platforms ? (this can be tested by changing the
> QgsSetting qgis/walForSqlite3 to false and following the steps in the
> issue).

Did you also try running the tests that were added per
https://github.com/qgis/QGIS/commit/b6b8759efbeb833d0d3dbf6df008087701361ad3
with WAL disabled ?

To reproduce the issue, you might need a sufficiently large GeoPackage file so
that the redraw is not instantaneous, and/or possibly pan the map quickly just
before validating an edit, so that a redraw is in progress while changes are
committed to the database.
Unfortunately I don't remember if or how I reproduced the issue. Perhaps I
just came up with the unit tests first.

-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-06-02 Thread Olivier Dalang
Hi !

About WAL related issues :

I just tried to reproduce the original issue that led to using WAL on
local files (https://github.com/qgis/QGIS/issues/23283) but could not
reproduce it (recent master, Windows 10). Does the issue still exist
on other platforms ? (this can be tested by changing the
QgsSetting qgis/walForSqlite3 to false and following the steps in the
issue).

If not, maybe we could consider switching back to journal_mode=DELETE, as
it seems it could solve some of the issues with gpkgs. If yes, would
sharing just one sqlite connection handle for all accesses to a sqlite file
be a sensible approach to avoid the deadlock without using WAL?
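
To make the two journal modes concrete, here is a minimal sketch using plain
SQLite from Python's standard library. It is independent of QGIS and of the
`qgis/walForSqlite3` setting; the file name is made up. It shows the `-wal`
sidecar file appearing once the database is written to in WAL mode, and the
`PRAGMA` that switches back to the rollback journal.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.gpkg")
# isolation_level=None puts the connection in autocommit mode, so the
# PRAGMA statements below are never wrapped in an open transaction.
conn = sqlite3.connect(path, isolation_level=None)

# The PRAGMA returns the journal mode that is actually in effect.
wal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO t DEFAULT VALUES")

# While the connection is open, the write-ahead log sits next to the file.
wal_sidecar_present = os.path.exists(path + "-wal")

# Switch back to the classic rollback journal.
delete_mode = conn.execute("PRAGMA journal_mode=DELETE").fetchone()[0]
wal_sidecar_after_switch = os.path.exists(path + "-wal")
conn.close()

print(wal_mode, wal_sidecar_present, delete_mode, wal_sidecar_after_switch)
```

Note that WAL mode is recorded in the database file itself, so a file switched
to WAL by one program stays in WAL mode for the next program that opens it.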

Cheers !

Olivier



Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Jeff McKenna

On 2020-05-08 12:55 p.m., Even Rouault wrote:

> Actually that might be just my past reading of the below blog post that
> Jeff Yutzler, the editor of the spec, just reminded me about:
>
> http://geopackage.blogspot.com/2015/10/one-geometry-per-table.html
>
> The link to the modeling guidelines in this post is now dead. They are at:
>
> http://www.geopackage.org/guidance/modeling.html


All are encouraged to engage Jeff Yutzler directly in the thread at
https://twitter.com/mapserving/status/1258732673465552897
He has stated that he has read our mailing list thread here and will take these
thoughts to the next working group meeting for the specification
(thanks to Matthias for opening this discussion today).


-jeff



--
Jeff McKenna
MapServer Consulting and Training Services
https://gatewaygeo.com/




Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Even Rouault
> > *The requirement for a single geometry column per table
> > 
> >I just don't see a good reason to forbid that
> >
> >Possible mitigation: a) alter the standard. b) ignore the standard
> > 
> > and patch the ogr implementation.
> 
> I don't think the Geopackage OGC SWG would be keen to change that. I believe
> I floated the idea around a few years ago, but they wanted GeoPackage core
> to remain simple and adding multiple geometry columns goes against that.

Actually that might be just my past reading of the below blog post that Jeff
Yutzler, the editor of the spec, just reminded me about:

http://geopackage.blogspot.com/2015/10/one-geometry-per-table.html

The link to the modeling guidelines in this post is now dead. They are at:

http://www.geopackage.org/guidance/modeling.html

Even


-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Richard Duivenvoorde
On 5/8/20 3:13 PM, Tobias Wendorff wrote:
> On 08.05.2020 at 15:10, Andreas Neumann wrote:
>> To me, this is not a downside, but a big, big plus! Less mess on the
>> file system.
> My students and co-workers start to save each layer in a new GPKG, so I
> don't get any benefit at all ;)
> 
>> And it is the whole point  of a Geopackage, to package many data sets
>> into one file, so it can be easily shared.
> Just ZIP or TAR them ;)

Thanks for raising this Matthias! But I think we should not try to
create a silver bullet here. One should use gpkg where it fits (and
that is apparently NOT in a networked environment; that is where databases
are a better fit?).

About Tobias' FlatGeobuf: instead of a shp/gpkg file alternative, would
this not be a very good candidate to store our intermediate processing
steps in (which was shp, not sure what it is now)?

May I second the above mentioned good thing about gpkg: that you can
have more (related) sets of data in one file, that you can create joins
in these to create views, that you can save styles and projects in it
etc etc

The fact that most people use some programs/formats to do things they
could equally do with a txt/csv file does not make a more advanced format
a bad format :-)

Today I actually hit an issue in which selection at first seemed not to work:
https://github.com/qgis/QGIS/issues/36291
from which my conclusion is: views are cool but be careful with primary
keys!
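
A minimal sketch of why a view needs care with its key column, using plain
SQLite from Python. The tables, the view names and the data are invented for
illustration, and this does not reproduce the linked issue; it only shows that
a join can repeat the "one" side's id, so that column no longer identifies a
single row of the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parcels (fid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE owners (fid INTEGER PRIMARY KEY, parcel_fid INTEGER, owner TEXT);
    INSERT INTO parcels VALUES (1, 'north'), (2, 'south');
    INSERT INTO owners VALUES (1, 1, 'Ann'), (2, 1, 'Bob'), (3, 2, 'Cy');

    -- Key taken from the "one" side of the join: repeats per owner.
    CREATE VIEW parcel_owners_bad AS
        SELECT p.fid AS fid, p.name, o.owner
        FROM parcels p JOIN owners o ON o.parcel_fid = p.fid;

    -- Key taken from the "many" side of the join: unique per view row.
    CREATE VIEW parcel_owners_ok AS
        SELECT o.fid AS fid, p.name, o.owner
        FROM parcels p JOIN owners o ON o.parcel_fid = p.fid;
    """
)


def key_values(view):
    """Return the fid column of a view (view name is a trusted literal)."""
    return [row[0] for row in conn.execute(f"SELECT fid FROM {view} ORDER BY fid")]


bad_keys = key_values("parcel_owners_bad")
ok_keys = key_values("parcel_owners_ok")
print(bad_keys, ok_keys)
```

With a repeated key, a client that addresses features by id cannot tell the
two 'north' rows apart, which is one plausible way for selection to misbehave.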

So in the case of gpkg we should probably have a dialog like we have for
Postgis/Oracle in which you can define a Primary Key, etc etc?

Which brings me to: should we handle gpkg files as actual databases and not as
simple files? Maybe we/OGC are actually trying to mix up concepts:
databases vs simple files?

Regards,

Richard Duivenvoorde

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Tobias Wendorff
On 08.05.2020 at 15:10, Andreas Neumann wrote:
> To me, this is not a downside, but a big, big plus! Less mess on the
> file system.

My students and co-workers start to save each layer in a new GPKG, so I
don't get any benefit at all ;)

> And it is the whole point  of a Geopackage, to package many data sets
> into one file, so it can be easily shared.

Just ZIP or TAR them ;)

> If you want to discuss this, please open a separate thread on it.

Did so in the past, that's why I'm using FlatGeobuf. But I'm quiet now.

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Saber Razmjooei
Hi all,

Interesting discussion!

A standard does not care (and should not care to any great extent) about use
cases. Creating revisions and incremental changelogs is a common use case
(unless you work with
my-spatial_data_v010_final_draft_final_002_really-final_005.gpkg :) ).

Providing a tool similar to geodiff or the "Detect dataset changes"
algorithm can help users overcome the limitations of standards...
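
To illustrate the idea of a changelog between two copies of a table, here is a
small sketch in plain Python and SQLite. It is not geodiff's algorithm or API
and not the QGIS processing algorithm; the helpers `table_rows` and
`diff_table`, the `roads` table and the `fid` key are all assumptions made for
the example.

```python
import sqlite3


def table_rows(conn, table, key):
    """Return {key value: row tuple} for one table (identifiers are trusted)."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    key_index = columns.index(key)
    return {row[key_index]: row for row in cursor}


def diff_table(base, changed, table, key="fid"):
    """Compare one table in two databases by primary key."""
    old = table_rows(base, table, key)
    new = table_rows(changed, table, key)
    return {
        "inserted": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "updated": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


def make_db(rows):
    """Build an in-memory stand-in for one copy of the dataset."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE roads (fid INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO roads VALUES (?, ?)", rows)
    return conn


base = make_db([(1, "A1"), (2, "B2"), (3, "C3")])
changed = make_db([(1, "A1"), (2, "B2 renamed"), (4, "D4")])
changes = diff_table(base, changed, "roads")
print(changes)
```

A comparison like this depends on a stable key per feature, which ties back to
the fid discussion earlier in the thread.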

Regards
Saber




On Fri, 8 May 2020 at 11:48, Matthias Kuhn  wrote:

> Hi Tim
> On 5/8/20 12:25 PM, Tim Sutton wrote:
>
> > Hi
> >
> > On 8 May 2020, at 10:34, Werner Macho wrote:
> >
> > > Hi Matthias,
> > >
> > > May I add another pain here?
> > >
> > > Storing gpkg in the cloud means syncing even after the file was only
> > > opened in QGIS and nothing has changed inside - it was only opened to
> > > view something.
> > > On very large gpkg files this is also not really nice.
> > > (Maybe I am just using gpkg wrong, but at least this happens on my
> > > installation)
> >
> > In some cases you may be able to work around this by using geodiff from
> > our fine friends over at Lutra. It will let you extract just the changes
> > (if you have the original and the changed copy locally), then upload them
> > to the server.
>
> Thanks for mentioning that. This might indeed be good for very controlled
> environments but adds quite a bit of additional requirements on logic and
> services. In those cases you are often better off setting up a postgres
> database directly.
>
> In this thread, let's focus on "how to get things to work properly for our
> standard geo container file."
>
> Matthias
>
> > [1]https://github.com/lutraconsulting/geodiff
> >
> > Regards
> >
> > Tim



-- 
Saber Razmjooei
www.lutraconsulting.co.uk

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Andreas Neumann

Hi Tobias,

Please see my comment below.

On 08.05.20 at 13:52, Tobias Wendorff wrote:

> On 08.05.2020 at 11:30, Matthias Kuhn wrote:
>
> > I do not want to trigger an evangelical discussion here. I'd like to see
> > where we are and what we can reasonably do to have a default file format
> > which can be recommended with no bad feelings.
>
> Two downsides of GPKG, which I've experienced with many users who
> worked with Shapefiles in the past:
>
> 1. A GPKG can contain multiple layers, but has one filename only. This
> confuses many users. They're expecting single files, like
> "houses_poly.gpkg" and "houses_point.gpkg". When they only see
> "houses.gpkg", they think they're missing something. From the normal
> Windows Explorer (don't know about Nautilus etc.), you also cannot look
> inside the GPKG file to check its content. It would be nice to make
> Windows parse GPKG's/sqlite3's metadata.


To me, this is not a downside, but a big, big plus! Less mess on the
file system.


And it is the whole point of a GeoPackage to package many data sets
into one file, so it can be easily shared.
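
For readers who want to see which layers a single .gpkg holds without opening
QGIS, a minimal sketch: the GeoPackage standard keeps a `gpkg_contents` table
with one row per layer, and any SQLite client can query it. To keep the example
self-contained, the database below is a reduced stand-in with only three of
that table's columns and invented layer names; `list_layers` is a hypothetical
helper.

```python
import sqlite3


def list_layers(conn):
    """Return (table_name, data_type) for every entry in gpkg_contents."""
    return conn.execute(
        "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
    ).fetchall()


# Reduced stand-in for a real GeoPackage: only the columns used above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gpkg_contents ("
    "table_name TEXT PRIMARY KEY, data_type TEXT NOT NULL, identifier TEXT)"
)
conn.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?, ?)",
    [
        ("houses_poly", "features", "House footprints"),
        ("houses_point", "features", "House centroids"),
    ],
)

layers = list_layers(conn)
for table_name, data_type in layers:
    print(table_name, data_type)
```

Against a real file, the same query should work after replacing the in-memory
connection with `sqlite3.connect("houses.gpkg")`.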


I think we should concentrate in this discussion on the real technical
issues, and not on philosophical differences or different workflows,
which certainly have their pros and cons.


If you want to discuss this, please open a separate thread on it.

Thanks,

Andreas


Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Tobias Wendorff
On 08.05.2020 at 11:59, Andreas Neumann wrote:
> If we don't like gpkg as default format - the question is: what is the
> alternative?

Please don't forget about Björn's FlatGeobuf:
https://bjornharrtell.github.io/flatgeobuf/

Since I'm not a fan of the bloat sqlite3 has, FlatGeobuf is in heavy
use on my systems.

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Tobias Wendorff
Hi Ismail,

On 08.05.2020 at 14:04, Ismail Sunni wrote:
> You can right-click on the gpkg file in the file browser panel and you
> can see the `Compact Database (VACUUM)`.

thanks for the hint, but I've disabled QGIS' file browser in our
environment (a university research department) and at home. Let me
explain the reason:

QGIS' file browser by default tries to scan ZIP files and other (big)
files. This had a noticeable effect on our network traffic and the
client performance. Via VPN (home office right now) this effect got even
worse, since the VPN performance is degraded, too. Also, we had some
crashes on student laptops while scanning many shapefiles in a
directory. I'm very afraid that someone will try a VACUUM over the
network (or over the VPN).

Our users normally browse the directory with Windows Explorer, then drag & drop
the file into QGIS or copy it to the local system first. This also has
to do with GPKG's bad network performance.

Best regards,
Tobias

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Ismail Sunni
Hi Tobias,

> VACUUM doesn't help here and isn't easily reachable using a button in
> QGIS. Seems like there's much empty space in the db schema. Perhaps
> someone can explain this?


You can right-click on the gpkg file in the file browser panel and you
will see the `Compact Database (VACUUM)` option.

Best regards




-- 
Ismail Sunni
Software Engineer
ismailsunni.id
ismailsunni.wordpress.com

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Tobias Wendorff
On 08.05.2020 at 11:30, Matthias Kuhn wrote:
>
> I do not want to trigger an evangelical discussion here. I'd like to see
> where we are and what we can reasonably do to have a default file format
> which can be recommended with no bad feelings.

Two downsides of GPKG, which I've experienced with many users who
worked with Shapefiles in the past:

1. A GPKG can contain multiple layers, but has one filename only. This
confuses many users. They're expecting single files, like
"houses_poly.gpkg" and "houses_point.gpkg". When they only see
"houses.gpkg", they think they're missing something. From the normal
Windows Explorer (don't know about Nautilus etc.), you also cannot look
inside the GPKG file to check its content. It would be nice to make
Windows parse GPKG's/sqlite3's metadata.

2. As pointed out in the past, a big downside of GPKG or sqlite3 is the
huge bloat. I often have single-table GPKGs which are 12 GiB. When
compressing them with zstd, I can get them down to 3 GiB or even less in
almost no time.

VACUUM doesn't help here and isn't easily reachable using a button in
QGIS. Seems like there's much empty space in the db schema. Perhaps
someone can explain this?
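
One way to check whether a file really contains reclaimable empty space is to
ask SQLite for its free-page count before running VACUUM. The sketch below uses
Python's standard library with an invented table and random blobs standing in
for geometries; it only demonstrates the mechanism and says nothing about the
12 GiB files mentioned above.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bloat.gpkg")
# Autocommit mode: VACUUM cannot run inside an open transaction.
conn = sqlite3.connect(path, isolation_level=None)
conn.execute("CREATE TABLE t (fid INTEGER PRIMARY KEY, geom BLOB)")

# Fill the table with about 1 MB of blobs in one explicit transaction.
conn.execute("BEGIN")
conn.executemany(
    "INSERT INTO t (geom) VALUES (?)",
    [(os.urandom(2000),) for _ in range(500)],
)
conn.execute("COMMIT")

# Deleting rows frees pages inside the file but does not shrink it.
conn.execute("DELETE FROM t")


def free_pages(connection):
    """Number of unused pages that VACUUM could give back."""
    return connection.execute("PRAGMA freelist_count").fetchone()[0]


free_before, size_before = free_pages(conn), os.path.getsize(path)
conn.execute("VACUUM")
free_after, size_after = free_pages(conn), os.path.getsize(path)
conn.close()

print(free_before, size_before, free_after, size_after)
```

If `freelist_count` is already close to zero, VACUUM has little to reclaim. In
that case a large gain from zstd would more likely come from compressible
content than from empty pages; that is an inference, not something established
in this thread.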

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Andreas Neumann

Very well. I am glad we are in the same boat here ;-)

Andreas

On 08.05.20 at 12:43, Matthias Kuhn wrote:


Hi Andreas

Thanks for also pointing out this question. I would like to completely 
focus on b) for now to not disperse the discussion too much.


If it comes down to unresolvable problems we can still go for d).

But let's keep that for later.

Matthias

On 5/8/20 11:59 AM, Andreas Neumann wrote:


Hi Matthias,

Thank you for listing all the open issues/problem with gpkg that you 
know of. It really helps.


If we don't like gpkg as default format - the question is: what is 
the alternative?


a) stay with ESRI shapefile (I think no one would like this)
b) work with the SQLite, gpkg/OGC community to fix the gpkg issues 
(my preference)

c) use ESRI FGDB format (then we are at the mercy of ESRI)
d) invent something new (risky, if only QGIS uses that, 
interoperability would suck)


I would prefer option b) a lot, and if that is not feasible, then 
maybe d). d) will also be risky.


a) would equally suck as the current state of gpkg - I've seen far
too many corrupt shape files, people complaining about
interoperability issues (ArcGIS would show features that had been
deleted in QGIS), and I don't need to repeat the list of the
numerous restrictions of the ESRI shp format.


Andreas

On 08.05.20 at 11:30, Matthias Kuhn wrote:


Hi list,

I wondered about the state of GeoPackage. Personally, since it has
been introduced to QGIS and even more since it has been selected as
the default format, I have never grown to fully and completely.


I do not want to trigger an evangelical discussion here. I'd like to
see where we are and what we can reasonably do to have a default
file format which can be recommended with no bad feelings.



Here follow a couple of observations over the years, some of them 
properties of the specs I believe:



* The fid requirement

  I sometimes want my features to be identified by UUIDs or other keys.
They also tend to accumulate if derived datasets are created
(through processing etc). If I need some pseudo-stable primary key,
there is a rowid built into SQLite; we don't need a second one.


  Possible mitigation: alter the ogr implementation. possibly alter 
the standard (required?)
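
As background for the fid point, a small sketch of how SQLite treats such a
column: one declared `INTEGER PRIMARY KEY` is an alias for the built-in rowid
rather than a second stored key, and a UUID can sit next to it as an ordinary
unique column. The table layout and the UUID value are invented for the
example; this does not change what the standard or OGR require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features ("
    "fid INTEGER PRIMARY KEY AUTOINCREMENT, "  # alias for SQLite's rowid
    "uuid TEXT UNIQUE NOT NULL, "              # application-level identifier
    "name TEXT)"
)
conn.execute(
    "INSERT INTO features (uuid, name) VALUES (?, ?)",
    ("7c9e6679-7425-40de-944b-e07fc1f90ae7", "a"),
)

# rowid and fid are the same value read under two names.
rowid, fid, uuid = conn.execute(
    "SELECT rowid, fid, uuid FROM features"
).fetchone()
print(rowid, fid, uuid)
```

So the cost of the required integer key is mostly that client software treats
it as the feature id, not that SQLite stores a duplicate value.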


* The modification on r/o open

  Has caused too much pain on git.

  Possible mitigation: a) switch to journal mode=delete (not an easy 
option because of https://issues.qgis.org/issues/15351) b) only 
switch to wal mode when layers are put into edit mode (I have strong 
doubts this is a safe thing to do)


* The network share freeze

  Our default file should play nicely with (windows) network shares. 
It's clear to everyone that we can't expect concurrent writes. But 
it should "just work" for concurrent read by many.


  Possible mitigation: switch to journal mode=delete for network 
shares (we are looking into this)


* The wal file appearing next to the file

  It is confusing to newcomers and looks almost like a sidecar file. 
I would care less if it was put into some system cache folder 
instead of just into my data folder. Or at least if it was a hidden 
file.


  Possible mitigation: switch to journal mode=delete (not an easy 
option because of https://issues.qgis.org/issues/15351)


* The couple of corrupted files I have received over the years which 
could only be repaired by a command line "dump contents as sql and 
execute into new file"


  I have not found a way to reproduce this. Some of them were 
produced by older qgis versions making it easy to violate foreign 
key constraints and hard to recover. This has been fixed.


  Possible mitigation: offer a "repair" option in qgis. Through 
processing or "on the fly" upon detection.


* Default value magic replaces values on insert (with no possibility
to pre-evaluate them)


  E.g. a global sequence like on postgres would be nice. Can be
worked around through default values in qgis though.


  Possible mitigation: a) add it as a feature to sqlite. b) use qgis
default values. c) live with it.


* The requirement for a single geometry column per table

  I just don't see a good reason to forbid that

  Possible mitigation: a) alter the standard. b) ignore the standard 
and patch the ogr implementation.
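
For readers who have not done the "dump contents as sql and execute into new file" repair mentioned in the list above, here is a minimal sketch using only Python's standard sqlite3 module. The function name rebuild_database and its arguments are hypothetical, and the sketch assumes a plain SQLite file: a real GeoPackage also carries R-tree virtual tables and triggers that may depend on functions provided by GDAL or SpatiaLite, so treat this as an illustration of the idea rather than a tested repair tool.

```python
import sqlite3


def rebuild_database(src_path: str, dst_path: str) -> None:
    """Copy everything readable from src_path into a fresh file at dst_path."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # iterdump() yields the SQL statements that recreate schema and data,
        # much like the ".dump" command of the sqlite3 command line shell.
        dst.executescript("\n".join(src.iterdump()))
    finally:
        src.close()
        dst.close()
```

It would be called as rebuild_database("broken.gpkg", "rebuilt.gpkg"), where both file names are placeholders.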



I wonder how others feel about these topics.


- Are there more pain points I forgot to list?

- Do you see more approaches to mitigate these problems?

- Is someone already working on these issues?


It would be great to have a standard file format that we can fully
trust. Let's do a reality check on whether GeoPackage can be this format.


Best regards

--
Matthias Kuhn
matth...@opengis.ch 
+41 (0)76 435 67 63 

___
QGIS-Developer mailing list
QGIS-Developer@lists.osgeo.org
List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer



Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Matthias Kuhn

Hi Even,

On 5/8/20 12:30 PM, Even Rouault wrote:


> * The fid requirement
>
>   I sometimes want my features to be identified by uuids or others.
> They also tend to accumulate if derived datasets are created (through
> processing etc). If I need some pseudo stable primary key there is a
> rowid builtin into sqlite, we don't need a second one.
>
>   Possible mitigation: alter the ogr implementation. possibly alter the
> standard (required?)

I think the main issue here is that we expose the FID as a normal 
column in QGIS. This is a QGIS only behaviour, that could be easily 
removed.



Can we still change course on this now?

Many will meanwhile depend on this behavior (it's not always bad). So if
we treat it as a system attribute and hide it away at the provider level, we
risk breaking workflows. Also I'm unsure about further processing: do we
remove these columns from resulting files or not?


Also, will it then be possible to add another "user id column" with
autoincrement? (I couldn't get SQLite to create a second autoincrement
column here in a quick test.)
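
The quick test mentioned above can be reproduced with Python's standard sqlite3 module: SQLite only accepts AUTOINCREMENT on the INTEGER PRIMARY KEY column. The sketch below also shows one possible workaround, a trigger that fills a second counter column. The table, column and trigger names (features, user_id, features_user_id) are made up for illustration, and the MAX()+1 approach is an assumption of this sketch: unlike AUTOINCREMENT it can hand out a value again after the highest row is deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# AUTOINCREMENT is only accepted on the INTEGER PRIMARY KEY column, so a
# second auto-incrementing column is rejected when the table is created.
try:
    conn.execute(
        "CREATE TABLE bad (fid INTEGER PRIMARY KEY AUTOINCREMENT, "
        "user_id INTEGER AUTOINCREMENT)"
    )
    second_autoincrement_rejected = False
except sqlite3.OperationalError:
    second_autoincrement_rejected = True

# Workaround sketch: fill the second column from a trigger instead.
conn.executescript(
    """
    CREATE TABLE features (
        fid INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER,
        name TEXT
    );
    CREATE TRIGGER features_user_id AFTER INSERT ON features
    WHEN NEW.user_id IS NULL
    BEGIN
        UPDATE features
        SET user_id = (SELECT COALESCE(MAX(user_id), 0) + 1 FROM features)
        WHERE fid = NEW.fid;
    END;
    """
)
conn.execute("INSERT INTO features (name) VALUES ('a')")
conn.execute("INSERT INTO features (name) VALUES ('b')")
rows = conn.execute(
    "SELECT fid, user_id, name FROM features ORDER BY fid"
).fetchall()
print(second_autoincrement_rejected, rows)
```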


Maybe we need a gpkg extension "fid_is_visible" to make this cleaner? or 
allow creation of gpkgs out of spec?


Keeping an integer primary key column is indeed a requirement of the
standard, and can be useful in circumstances where you run VACUUM on
the DB, which would alter rowid in case of deleted features and
possibly confuse QGIS identification by feature id if that happened
after a layer has been loaded (but that's a bit of an edge scenario).


I think that's an application-level problem to solve and not a
format-level requirement to impose.


>
> * The modification on r/o open
>
>   Has caused too much pain on git.
>
>   Possible mitigation: a) switch to journal mode=delete (not an easy
> option because of https://issues.qgis.org/issues/15351)
> b) only switch to wal mode when layers are put into edit mode (I have
> strong doubts this is a safe thing to do)

That should be investigated/tested, but indeed it's not completely
obvious that a connection that has been opened in the default journal
mode=delete will "see" that it has been turned to WAL by another. I
believe I looked in the SQLite doc about that scenario, but didn't
find anything.


>
> * The network share freeze
>
>   Our default file should play nicely with (windows) network shares.
> It's clear to everyone that we can't expect concurrent writes. But it
> should "just work" for concurrent read by many.
>
>   Possible mitigation: switch to journal mode=delete for network shares
> (we are looking into this)

I don't think journal mode=delete will prevent SQLite3 from creating
locks (or trying to), to detect potential writes. I'm not sure if
that can result in deadlock scenarios if there are network issues. One
possibility to avoid all (dead)lock issues is to use the
SQLITE_USE_OGR_VFS=YES env variable that will use the GDAL I/O layer
instead of the SQLite3 built-in one. The side effect of this is that as
the GDAL I/O layer doesn't implement locking, no locking attempt is done.
It has been observed in
https://github.com/qgis/QGIS/issues/27899#issuecomment-535413602 that
it actually resulted in speedups.


Alternate implementation with identical effects: use the 
SQLite3 uri syntax with the immutable=1 or nolock=1. See 
https://www.sqlite.org/uri.html


Of course, in such a case, edits should be disabled, or enabled only
with a big big red warning, since database corruption would occur for
sure if 2 people tried to edit the DB simultaneously. I also think
that in a scenario with 1 writer and other readers, the readers could
possibly see an inconsistent/broken state in a transient way. That could
actually arise if there's a single machine editing & viewing the database,
for example if a rendering thread reads during the time the DB is written.


(but the same could probably be seen with some editing scenarios on
shapefiles)


I think we can accept these constraints. It's our job to do our best to
prevent users from shooting themselves in the foot. But not at the expense
of preventing valid use cases.


I.e. it should be clear to everyone that a file based format is meant 
for single write access only. As you said, it's probably the case with 
shapefiles as well (hmm, I wanted to keep them out of this discussion) 
and the limitation is obvious to most of us.



>
> * The wal file appearing next to the file
>
>   It is confusing to newcomers and looks almost like a sidecar file. I
> would care less if it was put into some system cache folder instead of
> just into my data folder. Or at least if it was a hidden file.

By overriding the sqlite3 I/O callbacks, I'm wondering if we couldn't move
the .wal file somewhere else in the filesystem (of course that would
only work for OGR enabled consumers, but probably good enough). That
said, the modification of the first 16 bytes of the main .gpkg file,
which causes issues for file synchronization, would remain, as it is

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Matthias Kuhn

Hi Tim

On 5/8/20 12:25 PM, Tim Sutton wrote:

Hi

On 8 May 2020, at 10:34, Werner Macho wrote:


Hi Matthias,

May I add another pain here?

Storing gpkg in the cloud means syncing even after the file was only 
opened in QGIS and nothing has changed inside - it was only opened to 
view something.

On very large gpkg files this is also not really nice.
(Maybe I am just using gpkg wrong, but at least this happens on my 
installation)


In some cases you may be able to work around this by using geodiff 
from our fine friends over at Lutra. It will let you extract just the 
changes (if you have the original and the changed copy locally), then 
upload them to the server.


Thanks for mentioning that. This might indeed be good for very 
controlled environments but adds quite a bit of additional requirements 
on logic and services. In those cases you are often better off setting 
up a postgres database directly.


In this thread, let's focus on "how to get things to work properly for 
our standard geo container file."


Matthias



[1]https://github.com/lutraconsulting/geodiff

Regards

Tim

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Matthias Kuhn

Hi Andreas

Thanks for also pointing out this question. I would like to completely 
focus on b) for now to not disperse the discussion too much.


If it comes down to unresolvable problems we can still go for d).

But let's keep that for later.

Matthias

On 5/8/20 11:59 AM, Andreas Neumann wrote:


Hi Matthias,

Thank you for listing all the open issues/problems with gpkg that you
know of. It really helps.


If we don't like gpkg as default format - the question is: what is the
alternative?


a) stay with ESRI shapefile (I think no one would like this)
b) work with the SQLite, gpkg/OGC community to fix the gpkg issues (my
preference)

c) use ESRI FGDB format (then we are at the mercy of ESRI)
d) invent something new (risky, if only QGIS uses that,
interoperability would suck)


I would prefer option b) a lot, and if that is not feasible, then
maybe d). d) will also be risky.


a) would suck just as much as the current state of gpkg - I've seen far
too many corrupt shapefiles, people complaining about interoperability
issues (ArcGIS would show features that had been deleted in QGIS)
and I don't need to repeat the list of the numerous restrictions of
the ESRI shp format.


Andreas


Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Even Rouault
> * The fid requirement
> 
>I sometimes want my features to be identified by uuids or others.
> They also tend to accumulate if derived datasets are created (through
> processing etc). If I need some pseudo stable primary key there is a
> rowid builtin into sqlite, we don't need a second one.
> 
>Possible mitigation: alter the ogr implementation. possibly alter the
> standard (required?)

I think the main issue here is that we expose the FID as a normal column in
QGIS. This is a QGIS only behaviour, that could be easily removed.
Keeping an integer primary key column is indeed a requirement of the
standard, and can be useful in circumstances where you run VACUUM on the DB,
which would alter rowid in case of deleted features and possibly confuse QGIS
identification by feature id if that happened after a layer has been loaded
(but that's a bit of an edge scenario).
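
To make the VACUUM point concrete, here is a small sketch with Python's standard sqlite3 module. The SQLite documentation states that VACUUM may change the rowids of tables that have no explicit INTEGER PRIMARY KEY, while a declared integer primary key (like the GeoPackage fid) is stored data and is preserved. The table names implicit_rowid and explicit_fid are invented for this example, and whether the implicit rowids are actually renumbered depends on the SQLite version and the table's indexes, so the script prints what it observes instead of promising a result.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "rowid_demo.db")
# isolation_level=None gives autocommit; VACUUM cannot run inside a transaction.
conn = sqlite3.connect(path, isolation_level=None)

conn.execute("CREATE TABLE implicit_rowid (name TEXT)")
conn.execute("CREATE TABLE explicit_fid (fid INTEGER PRIMARY KEY, name TEXT)")
for table in ("implicit_rowid", "explicit_fid"):
    conn.executemany(
        f"INSERT INTO {table} (name) VALUES (?)", [("a",), ("b",), ("c",)]
    )
    # Deleting the first row leaves a gap at the start of the id sequence.
    conn.execute(f"DELETE FROM {table} WHERE name = 'a'")


def ids(table, column):
    """Return the identifier column of a table, ordered by feature name."""
    query = f"SELECT {column} FROM {table} ORDER BY name"
    return [row[0] for row in conn.execute(query)]


rowids_before = ids("implicit_rowid", "rowid")
fids_before = ids("explicit_fid", "fid")
conn.execute("VACUUM")
rowids_after = ids("implicit_rowid", "rowid")
fids_after = ids("explicit_fid", "fid")

print("implicit rowid:", rowids_before, "->", rowids_after)
print("explicit fid:  ", fids_before, "->", fids_after)
conn.close()
```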

> 
> * The modification on r/o open
> 
>Has caused too much pain on git.
> 
>Possible mitigation: a) switch to journal mode=delete (not an easy
> option because of https://issues.qgis.org/issues/15351) 

> b) only switch
> to wal mode when layers are put into edit mode (I have strong doubts
> this is a safe thing to do)

That should be investigated/tested, but indeed it's not completely obvious
that a connection that has been opened in the default journal mode=delete
will "see" that it has been turned to WAL by another. I believe I looked in
the SQLite doc about that scenario, but didn't find anything.
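
As background for that investigation, the part that is documented can be shown with Python's standard sqlite3 module: the WAL setting is persistent, because switching modes rewrites the file format version bytes at offsets 18 and 19 of the database header (1 for rollback journal, 2 for WAL). This sketch only covers connections opened after the switch. It does not settle the open question of what an already open connection sees. The helper name header_versions and the file name are made up for the example.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "journal_demo.db")


def header_versions(db_path):
    """Return the file format write/read version bytes (offsets 18 and 19)."""
    with open(db_path, "rb") as f:
        header = f.read(20)
    return header[18], header[19]


# Autocommit mode, so the journal_mode pragma is not issued inside a transaction.
conn = sqlite3.connect(path, isolation_level=None)
conn.execute("CREATE TABLE t (fid INTEGER PRIMARY KEY, name TEXT)")
initial_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
wal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
conn.close()
versions_in_wal = header_versions(path)

# A brand-new connection picks the mode up from the file itself.
conn = sqlite3.connect(path, isolation_level=None)
reopened_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
back_to_delete = conn.execute("PRAGMA journal_mode=DELETE").fetchone()[0]
conn.close()
versions_in_delete = header_versions(path)

print(initial_mode, wal_mode, versions_in_wal)
print(reopened_mode, back_to_delete, versions_in_delete)
```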

> 
> * The network share freeze
> 
>Our default file should play nicely with (windows) network shares.
> It's clear to everyone that we can't expect concurrent writes. But it
> should "just work" for concurrent read by many.
> 
>Possible mitigation: switch to journal mode=delete for network shares
> (we are looking into this)

I don't think journal mode=delete will prevent SQLite3 from creating locks (or
trying to), to detect potential writes. I'm not sure if that can result in
deadlock scenarios if there are network issues. One possibility to avoid all
(dead)lock issues is to use the SQLITE_USE_OGR_VFS=YES env variable that will
use the GDAL I/O layer instead of the SQLite3 built-in one. The side effect of
this is that as the GDAL I/O layer doesn't implement locking, no locking
attempt is done. It has been observed in
https://github.com/qgis/QGIS/issues/27899#issuecomment-535413602 that it
actually resulted in speedups.
Alternative implementation with identical effects: use the SQLite3 uri syntax
with immutable=1 or nolock=1. See https://www.sqlite.org/uri.html
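
For illustration, this is roughly what the URI variant looks like from Python's standard sqlite3 module (not from OGR or QGIS, whose connection handling is not shown here). The file name is a placeholder, and combining mode=ro with immutable=1 is a choice made for this sketch so that write attempts are clearly refused. With immutable=1 SQLite assumes the file cannot change, so it skips locking and change detection, which is only safe if nobody writes to the file in the meantime.

```python
import pathlib
import sqlite3
import tempfile

path = pathlib.Path(tempfile.mkdtemp()) / "readonly_demo.db"

# Prepare a small database with an ordinary read/write connection.
conn = sqlite3.connect(str(path))
conn.execute("CREATE TABLE t (fid INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO t (name) VALUES ('a')")
conn.commit()
conn.close()

# mode=ro opens read-only; immutable=1 additionally tells SQLite that the
# file cannot change, so no locks are taken on it.
uri = path.as_uri() + "?mode=ro&immutable=1"
reader = sqlite3.connect(uri, uri=True)
names = [row[0] for row in reader.execute("SELECT name FROM t")]

try:
    reader.execute("INSERT INTO t (name) VALUES ('b')")
    write_rejected = False
except sqlite3.OperationalError:
    # Writing through a read-only connection is refused by SQLite.
    write_rejected = True
reader.close()

print(names, write_rejected)
```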

Of course, in such a case, edits should be disabled, or enabled only with a
big big red warning, since database corruption would occur for sure if 2
people tried to edit the DB simultaneously. I also think that in a scenario
with 1 writer and other readers, the readers could possibly see an
inconsistent/broken state in a transient way. That could actually arise if
there's a single machine editing & viewing the database, for example if a
rendering thread reads during the time the DB is written.
(but the same could probably be seen with some editing scenarios on shapefiles)

> 
> * The wal file appearing next to the file
> 
>It is confusing to newcomers and looks almost like a sidecar file. I
> would care less if it was put into some system cache folder instead of
> just into my data folder. Or at least if it was a hidden file.

By overriding the sqlite3 I/O callbacks, I'm wondering if we couldn't move the
.wal file somewhere else in the filesystem (of course that would only work for
OGR enabled consumers, but probably good enough). That said, the modification
of the first 16 bytes of the main .gpkg file, which causes issues for file
synchronization, would remain, as it is a design constraint of WAL.

>Possible mitigation: switch to journal mode=delete (not an easy
> option because of https://issues.qgis.org/issues/15351)

I was wondering if we could consult the SQLite3 main author
(http://www.hwaci.com/drh/) on all those locking and concurrency issues, as he
probably has experience with similar scenarios.

> *The requirement for a single geometry column per table
> 
>I just don't see a good reason to forbid that
> 
>Possible mitigation: a) alter the standard. b) ignore the standard
> and patch the ogr implementation.

I don't think the GeoPackage OGC SWG would be keen to change that. I believe I
floated the idea around a few years ago, but they wanted GeoPackage core to
remain simple and adding multiple geometry columns goes against that.
That said, GeoPackage has an extension mechanism, and it would be possible on
the OGR side to define one to flag tables with multiple geometry columns as
extended (I've just verified for example that the definition of the
gpkg_geometry_columns system table has a unique constraint on the tuple
(table_name, column_name), and not just table_name, so this is a provision
for that

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Tim Sutton
Hi

> On 8 May 2020, at 10:34, Werner Macho  wrote:
> 
> Hi Matthias,
> 
> May I add another pain here?
> 
> Storing gpkg in the cloud means syncing even after the file was only opened 
> in QGIS and nothing has changed inside - it was only opened to view something.
> On very large gpkg files this is also not really nice.
> (Maybe I am just using gpkg wrong, but at least this happens on my 
> installation)

In some cases you may be able to work around this by using geodiff from our 
fine friends over at Lutra. It will let you extract just the changes (if you 
have the original and the changed copy locally), then upload them to the server.

[1]https://github.com/lutraconsulting/geodiff

Regards

Tim

Tim Sutton
Co-founder: Kartoza
Ex Project chair: QGIS.org

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Andreas Neumann

Hi Matthias,

Thank you for listing all the open issues/problems with gpkg that you
know of. It really helps.


If we don't like gpkg as default format - the question is: what is the
alternative?


a) stay with ESRI shapefile (I think no one would like this)
b) work with the SQLite, gpkg/OGC community to fix the gpkg issues (my
preference)

c) use ESRI FGDB format (then we are at the mercy of ESRI)
d) invent something new (risky, if only QGIS uses that, interoperability
would suck)


I would prefer option b) a lot, and if that is not feasible, then maybe
d). d) will also be risky.


a) would suck just as much as the current state of gpkg - I've seen far
too many corrupt shapefiles, people complaining about interoperability
issues (ArcGIS would show features that had been deleted in QGIS) and
I don't need to repeat the list of the numerous restrictions of the ESRI
shp format.


Andreas


Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Paolo Cavallini
Hi Matthias,
thanks for this very good review. Agreed on all points. As Werner
mentioned, the issue also impacts on incremental backups, causing lots
of wasted disk space.
Cheers.

On 08/05/20 11:34, Werner Macho wrote:
> Hi Matthias,
> 
> May I add another pain here?
> 
> Storing gpkg in the cloud means syncing even after the file was only
> opened in QGIS and nothing has changed inside - it was only opened to
> view something.
> On very large gpkg files this is also not really nice.
> (Maybe I am just using gpkg wrong, but at least this happens on my
> installation)
> 
> regards
> Werner

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Alessandro Pasotti
On Fri, May 8, 2020 at 11:30 AM Matthias Kuhn  wrote:

> Hi list,
>
> I wondered about the state of GeoPackage. Personally, since it has been
> introduced to QGIS, and even more since it has been selected as the default
> format, I have never grown to trust it fully and completely.
>
> I do not want to trigger an evangelical discussion here. I'd like to see
> where we are and what we can reasonably do to have a default file format
> which can be recommended with no bad feelings.
>
>
>
Thanks for starting this discussion, we clearly need a path forward.

I just added a note on one of the topics (and cut the rest):

[...]

> * The couple of corrupted files I have received over the years which
> could only be repaired by a command line "dump contents as sql and execute
> into new file"
>
>   I have not found a way to reproduce this. Some of them were produced by
> older qgis versions making it easy to violate foreign key constraints and
> hard to recover. This has been fixed.
>
>   Possible mitigation: offer a "repair" option in qgis. Through processing
> or "on the fly" upon detection.
>

If I'm not mistaken, the fix was to disable foreign key constraints
altogether for the whole QGIS application for all GPKGs, no questions asked.

This was IMHO a bad decision because it may turn a correct GPKG into a
corrupted GPKG.

A better approach in case of corrupted files would probably have been to
disable foreign key constraints only when a file is corrupted and restore
the constraints after the file has been repaired or, as you already
mentioned, to offer an automatic or semi-automatic repair function.
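
A rough sketch of that "disable only while repairing" idea, using Python's standard sqlite3 module on a toy schema (the parent and child tables are invented, and this is not how QGIS or OGR currently handle it): enforcement is switched off, PRAGMA foreign_key_check reports the orphaned rows, a repair step runs, and enforcement is switched back on. Deleting the orphans is just one possible repair and is an assumption of this example.

```python
import sqlite3

# Autocommit mode: PRAGMA foreign_keys has no effect inside a transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript(
    """
    CREATE TABLE parent (fid INTEGER PRIMARY KEY);
    CREATE TABLE child (
        fid INTEGER PRIMARY KEY,
        parent_fid INTEGER REFERENCES parent (fid)
    );
    """
)

# With enforcement off, an orphaned row can be written silently.
conn.execute("PRAGMA foreign_keys = OFF")
conn.execute("INSERT INTO child (fid, parent_fid) VALUES (1, 42)")

# foreign_key_check lists each violating row as
# (table, rowid, referenced table, foreign key index).
violations = conn.execute("PRAGMA foreign_key_check").fetchall()

# Repair step (here: simply drop the orphans), then switch enforcement back on.
conn.execute(
    "DELETE FROM child WHERE parent_fid NOT IN (SELECT fid FROM parent)"
)
remaining = conn.execute("PRAGMA foreign_key_check").fetchall()
conn.execute("PRAGMA foreign_keys = ON")

try:
    conn.execute("INSERT INTO child (fid, parent_fid) VALUES (2, 42)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(violations, remaining, orphan_rejected)
```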


Kind regards.

-- 
Alessandro Pasotti
QCooperative:  www.qcooperative.net
ItOpen:   www.itopen.it

Re: [QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Werner Macho
Hi Matthias,

May I add another pain here?

Storing gpkg in the cloud means syncing even after the file was only opened
in QGIS and nothing has changed inside - it was only opened to view
something.
On very large gpkg files this is also not really nice.
(Maybe I am just using gpkg wrong, but at least this happens on my
installation)

regards
Werner

On Fri, May 8, 2020 at 11:31 AM Matthias Kuhn  wrote:

> Hi list,
>
> I wondered about the state of GeoPackage. Personally, since it has been
> introduced to QGIS, and even more since it has been selected as the default
> format, I have never grown to trust it fully and completely.
>
> I do not want to trigger an evangelical discussion here. I'd like to see
> where we are and what we can reasonably do to have a default file format
> which can be recommended with no bad feelings.
>
>
> Here follow a couple of observations over the years, some of them
> properties of the specs I believe:
>
>
> * The fid requirement
>
>   I sometimes want my features to be identified by uuids or others. fid
> columns also tend to accumulate when derived datasets are created (through
> processing etc). If I need a pseudo-stable primary key, there is a rowid
> built into sqlite; we don't need a second one.
>
>   Possible mitigation: alter the ogr implementation. possibly alter the
> standard (required?)
>
> * The modification on r/o open
>
>   Has caused too much pain on git.
>
>   Possible mitigation: a) switch to journal mode=delete (not an easy
> option because of https://issues.qgis.org/issues/15351) b) only switch to
> wal mode when layers are put into edit mode (I have strong doubts this is a
> safe thing to do)
>
> * The network share freeze
>
>   Our default file should play nicely with (windows) network shares. It's
> clear to everyone that we can't expect concurrent writes. But it should
> "just work" for concurrent read by many.
>
>   Possible mitigation: switch to journal mode=delete for network shares
> (we are looking into this)
>
> * The wal file appearing next to the file
>
>   It is confusing to newcomers and looks almost like a sidecar file. I
> would care less if it was put into some system cache folder instead of just
> into my data folder. Or at least if it was a hidden file.
>
>   Possible mitigation: switch to journal mode=delete (not an easy option
> because of https://issues.qgis.org/issues/15351)
>
> * The couple of corrupted files I have received over the years which
> could only be repaired by a command line "dump contents as sql and execute
> into new file"
>
>   I have not found a way to reproduce this. Some of them were produced by
> older qgis versions making it easy to violate foreign key constraints and
> hard to recover. This has been fixed.
>
>   Possible mitigation: offer a "repair" option in qgis. Through processing
> or "on the fly" upon detection.
>
> * Default value magic replaces values on insert (with no possibility to
> pre-evaluate them)
>
>   E.g. a global sequence like on postgres would be nice. Can be worked
> around through default values in qgis though.
>
>   Possible mitigation: a) add it as a feature to sqlite. b) use qgis
> default values. c) live with it.
>
> * The requirement for a single geometry column per table
>
>   I just don't see a good reason to forbid that
>
>   Possible mitigation: a) alter the standard. b) ignore the standard and
> patch the ogr implementation.
>
>
> I wonder how others feel about these topics.
>
>
> - Are there more pain points I forgot to list?
>
> - Do you see more approaches to mitigate these problems?
>
> - Is someone already working on these issues?
>
>
> It would be great to have a standard file format that we can fully trust.
> Let's do a reality check on whether GeoPackage can be this format.
>
> Best regards
> --
> Matthias Kuhn
> matth...@opengis.ch
> +41 (0)76 435 67 63

[QGIS-Developer] GeoPackage - where are we -where do we go

2020-05-08 Thread Matthias Kuhn

Hi list,

I wondered about the state of GeoPackage. Personally, since it was
introduced to qgis, and even more since it was selected as the
default format, I have never grown to trust it fully and completely.


I do not want to trigger an evangelical discussion here. I'd like to see
where we are and what we can reasonably do to have a default file format 
which can be recommended with no bad feelings.



Here follow a couple of observations over the years, some of them 
properties of the specs I believe:



* The fid requirement

  I sometimes want my features to be identified by uuids or others.
fid columns also tend to accumulate when derived datasets are created
(through processing etc). If I need a pseudo-stable primary key, there
is a rowid built into sqlite; we don't need a second one.


  Possible mitigation: alter the ogr implementation. possibly alter the 
standard (required?)


* The modification on r/o open

  Has caused too much pain on git.

  Possible mitigation: a) switch to journal mode=delete (not an easy 
option because of https://issues.qgis.org/issues/15351) b) only switch 
to wal mode when layers are put into edit mode (I have strong doubts 
this is a safe thing to do)


* The network share freeze

  Our default file should play nicely with (windows) network shares. 
It's clear to everyone that we can't expect concurrent writes. But it 
should "just work" for concurrent read by many.


  Possible mitigation: switch to journal mode=delete for network shares 
(we are looking into this)


* The wal file appearing next to the file

  It is confusing to newcomers and looks almost like a sidecar file. I 
would care less if it was put into some system cache folder instead of 
just into my data folder. Or at least if it was a hidden file.


  Possible mitigation: switch to journal mode=delete (not an easy 
option because of https://issues.qgis.org/issues/15351)


* The couple of corrupted files I have received over the years which 
could only be repaired by a command line "dump contents as sql and 
execute into new file"


  I have not found a way to reproduce this. Some of them were produced 
by older qgis versions making it easy to violate foreign key constraints 
and hard to recover. This has been fixed.


  Possible mitigation: offer a "repair" option in qgis. Through 
processing or "on the fly" upon detection.


* Default value magic replaces values on insert (with no possibility to
pre-evaluate them)


  E.g. a global sequence like on postgres would be nice. Can be worked 
around through default values in qgis though.


  Possible mitigation: a) add it as a feature to sqlite. b) use qgis
default values. c) live with it.


* The requirement for a single geometry column per table

  I just don't see a good reason to forbid that

  Possible mitigation: a) alter the standard. b) ignore the standard 
and patch the ogr implementation.
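
Several of the mitigations above come down to the journal mode of the
underlying sqlite file. As a minimal sketch of what that switch looks like
at the sqlite level (Python's sqlite3 module; get_journal_mode and
set_journal_mode are hypothetical helper names, and this says nothing about
how qgis or ogr would expose it):

```python
import sqlite3


def get_journal_mode(path):
    """Return the journal mode sqlite reports for the file, e.g. 'delete' or 'wal'."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("PRAGMA journal_mode").fetchone()[0]
    finally:
        conn.close()


def set_journal_mode(path, mode):
    """Ask sqlite to switch the journal mode and return the mode now in effect.

    Hypothetical helper. sqlite may refuse the switch (for example
    while other connections hold the file open), so callers should
    compare the return value with the mode they asked for.
    """
    if mode.lower() not in {"delete", "wal"}:
        raise ValueError("this sketch only handles 'delete' and 'wal'")
    conn = sqlite3.connect(path)
    try:
        return conn.execute(f"PRAGMA journal_mode = {mode}").fetchone()[0]
    finally:
        conn.close()
```

wal mode is persistent: once set, it is recorded in the file and applies to
later connections until it is switched back. That persistence is why the
choice matters for a default format.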



I wonder how others feel about these topics.


- Are there more pain points I forgot to list?

- Do you see more approaches to mitigate these problems?

- Is someone already working on these issues?


It would be great to have a standard file format that we can fully
trust. Let's do a reality check on whether GeoPackage can be this format.


Best regards

--
Matthias Kuhn
matth...@opengis.ch 
+41 (0)76 435 67 63 