Re: [QGIS-Developer] Python script to count the number of feature for each unique values

2019-02-26 Thread kimaidou
Thanks a lot Nyall, setting the NoGeometry flag and a subset of fields
improves the speed by a factor of 3 for my test layer (300k features and
80 (!) fields).

Regards
Michaël

On Mon, 25 Feb 2019 at 23:48, Nyall Dawson wrote:

> On Tue, 26 Feb 2019 at 03:31, kimaidou wrote:
> >
> > Hi all,
> >
> > I needed to gather basic stats for a layer containing multiple fields. I
> > created the following gist to compute the number of features for each
> > unique value in a list of given fields:
> > https://gist.github.com/mdouchin/a234efb7e67ebd8dae3a04cb26cf5e72
>
> Check out trap #3 from
> http://nyalldawson.net/2016/10/speeding-up-your-pyqgis-scripts/.
> Specifically, since you aren’t doing anything with the geometry and
> are only using a single attribute from the layer, calling
> setFlags(QgsFeatureRequest.NoGeometry) and setSubsetOfAttributes()
> tells QGIS that you don’t need the geometry and only require a single
> attribute’s value in the feature request. This will give a huge speed
> boost to the feature fetching.
>
> > I know I could use PostgreSQL to do it (see the
> > https://twitter.com/kimaidou/status/1100053546978983936 conversation),
> > which is much faster, but I needed a quick way to do it with Python.
> > I tested virtual layers too, but my source layer has many fields (with
> > some heavy data) and the copy/pasting into SpatiaLite was not efficient
> > here.
> >
> > The method uniqueValues(fieldIndex) of QgsVectorLayer is pretty fast
> > compared to my iteration through the layer features. Any hint on how to
> > improve the speed of my calculation?
>
> Potentially this could be a new method in QgsVectorDataProvider which
> sends a native query to be executed on the backend (like uniqueValues
> does -- that's why it's so fast). But that would require an addition
> to the C++ QgsVectorDataProvider class, and implementations in the
> popular vector data providers.
>
> Nyall
>
> >
> > Regards,
> > Michaël
> > ___
> > QGIS-Developer mailing list
> > QGIS-Developer@lists.osgeo.org
> > List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
> > Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer
>
___
QGIS-Developer mailing list
QGIS-Developer@lists.osgeo.org
List info: https://lists.osgeo.org/mailman/listinfo/qgis-developer
Unsubscribe: https://lists.osgeo.org/mailman/listinfo/qgis-developer

Re: [QGIS-Developer] Python script to count the number of feature for each unique values

2019-02-25 Thread Nyall Dawson
On Tue, 26 Feb 2019 at 03:31, kimaidou wrote:
>
> Hi all,
>
> I needed to gather basic stats for a layer containing multiple fields. I
> created the following gist to compute the number of features for each unique
> value in a list of given fields:
> https://gist.github.com/mdouchin/a234efb7e67ebd8dae3a04cb26cf5e72

Check out trap #3 from
http://nyalldawson.net/2016/10/speeding-up-your-pyqgis-scripts/.
Specifically, since you aren’t doing anything with the geometry and
are only using a single attribute from the layer, calling
setFlags(QgsFeatureRequest.NoGeometry) and setSubsetOfAttributes()
tells QGIS that you don’t need the geometry and only require a single
attribute’s value in the feature request. This will give a huge speed
boost to the feature fetching.
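
As a minimal sketch of the advice above (not the code from the gist), the
request could be built as follows. The helper names count_values and
count_layer_values are made up for illustration. The sketch assumes the
QGIS 3 Python API, where setSubsetOfAttributes() accepts a list of field
names together with the layer's fields, and where a feature's attribute
can be read with feature[field_name].

```python
from collections import Counter


def count_values(features, field_name):
    """Count how many features carry each value of field_name.

    `features` is any iterable whose items support item[field_name],
    which is how a QgsFeature exposes its attributes in PyQGIS.
    """
    return Counter(feature[field_name] for feature in features)


def count_layer_values(layer, field_name):
    """Count values on a QgsVectorLayer, skipping geometry and unused fields."""
    # Imported here so count_values() above also works outside QGIS.
    from qgis.core import QgsFeatureRequest

    request = QgsFeatureRequest()
    # Do not fetch geometries: only attribute values are needed.
    request.setFlags(QgsFeatureRequest.NoGeometry)
    # Fetch only the attribute that is being counted.
    request.setSubsetOfAttributes([field_name], layer.fields())
    return count_values(layer.getFeatures(request), field_name)
```

In the QGIS Python console the call would look like
count_layer_values(iface.activeLayer(), 'some_field'), where 'some_field'
is a placeholder field name. How NULL attribute values behave as counter
keys is worth checking on a real layer.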

> I know I could use PostgreSQL to do it (see the
> https://twitter.com/kimaidou/status/1100053546978983936 conversation), which
> is much faster, but I needed a quick way to do it with Python.
> I tested virtual layers too, but my source layer has many fields (with some
> heavy data) and the copy/pasting into SpatiaLite was not efficient here.
>
> The method uniqueValues(fieldIndex) of QgsVectorLayer is pretty fast compared
> to my iteration through the layer features. Any hint on how to improve the speed
> of my calculation?

Potentially this could be a new method in QgsVectorDataProvider which
sends a native query to be executed on the backend (like uniqueValues
does -- that's why it's so fast). But that would require an addition
to the C++ QgsVectorDataProvider class, and implementations in the
popular vector data providers.
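
To illustrate what "a native query executed on the backend" could look
like for this count, here is a small sketch that lets a database engine
do the grouping. It uses Python's built-in sqlite3 module purely as a
stand-in for a provider backend such as PostgreSQL. The function name
count_values_sql and the table and column names (parcels, landuse) are
hypothetical, and this is not an existing QgsVectorDataProvider method.

```python
import sqlite3


def count_values_sql(connection, table, column):
    """Return {value: feature_count} computed by the database engine."""
    # Identifiers cannot be bound as parameters, so quote them explicitly.
    quoted_table = '"' + table.replace('"', '""') + '"'
    quoted_column = '"' + column.replace('"', '""') + '"'
    query = (
        f"SELECT {quoted_column}, COUNT(*) "
        f"FROM {quoted_table} GROUP BY {quoted_column}"
    )
    return dict(connection.execute(query).fetchall())


# Tiny in-memory table standing in for a real layer's attribute table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE parcels (id INTEGER, landuse TEXT)")
connection.executemany(
    "INSERT INTO parcels VALUES (?, ?)",
    [(1, "forest"), (2, "forest"), (3, "urban")],
)
counts = count_values_sql(connection, "parcels", "landuse")
```

Only one row per distinct value travels back to Python, instead of one
row per feature, which is the reason a backend-side query can be so much
faster than iterating over the features.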

Nyall

>
> Regards,
> Michaël

[QGIS-Developer] Python script to count the number of feature for each unique values

2019-02-25 Thread kimaidou
Hi all,

I needed to gather basic stats for a layer containing multiple fields. I
created the following gist to compute the number of features for each
unique value in a list of given fields:
https://gist.github.com/mdouchin/a234efb7e67ebd8dae3a04cb26cf5e72
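
For readers who do not want to open the gist, the kind of computation
described here can be sketched as a single pass over the features that
fills one counter per field. This is an illustration only and not the
gist's actual code: the function name count_values_per_field is made up,
and plain dicts stand in for QgsFeature objects, which are assumed to be
readable with feature[field_name] in the same way.

```python
from collections import Counter


def count_values_per_field(features, field_names):
    """Return {field_name: Counter({value: feature_count})} in one pass."""
    counts = {name: Counter() for name in field_names}
    for feature in features:
        for name in field_names:
            counts[name][feature[name]] += 1
    return counts


# Plain dicts stand in for QgsFeature objects here.
sample = [
    {"landuse": "forest", "owner": "state"},
    {"landuse": "forest", "owner": "private"},
    {"landuse": "urban", "owner": "private"},
]
result = count_values_per_field(sample, ["landuse", "owner"])
```

With a real layer, the features argument would be the iterator returned
by layer.getFeatures().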

I know I could use PostgreSQL to do it (see
https://twitter.com/kimaidou/status/1100053546978983936 conversation), which
is much faster, but I needed a quick way to do it with Python.
I tested virtual layers too, but my source layer has many fields (with some
heavy data) and the copy/pasting into spatialite was not efficient here.

The method uniqueValues(fieldIndex) of QgsVectorLayer is pretty fast
compared to my iteration through the layer features. Any hint how to
improve the speed of my calculation?

Regards,
Michaël