On 2014/10/26 19:22, Even Rouault wrote:
On Sunday 26 October 2014 16:44:37, Zoltan Szecsei wrote:
Hi,
I just want to clear up my mindset as to how a VRT is implemented in QGIS.
Zoltan,

In fact those are more OGR questions than QGIS questions. QGIS makes no
difference when reading a plain shapefile (through OGR) or a VRT.

I'd like to understand when QGIS opens a file, when it reads the
contents, and when it writes (if need be) and closes a file.
In this context, I am thinking about SHP files - especially the NGI
dataset which comes out "cut up" into degree squares.
Let's just deal with 1 feature type: Rivers lines. My VRT looks like:
<OGRVRTDataSource>
    <OGRVRTUnionLayer name="Rivers">
    <OGRVRTLayer name="2730_RIVER_LINE_2006_06">
      <SrcDataSource
relativeToVRT="1">2730/2730_RIVER_LINE_2006_06.shp</SrcDataSource>
    </OGRVRTLayer>
    <OGRVRTLayer name="2731_RIVER_LINE_2006_04">
      <SrcDataSource
relativeToVRT="1">2731/2731_RIVER_LINE_2006_04.shp</SrcDataSource>
    </OGRVRTLayer>
    <OGRVRTLayer name="2732_RIVER_LINE_2006_04">
      <SrcDataSource
relativeToVRT="1">2732/2732_RIVER_LINE_2006_04.shp</SrcDataSource>
    </OGRVRTLayer>
    </OGRVRTUnionLayer>
</OGRVRTDataSource>

   * When I open the VRT in QGIS, does QGIS open ALL the VRT files and
     look for the extent of each of the files?
If QGIS issues a GetExtent() on the VRT, then with the above definition, it
will query the 3 shapefiles to find the extent of each. But on shapefiles this
is a fast operation.
You could define <ExtentXMin>, etc. as child elements of OGRVRTUnionLayer if you
really want a fast GetExtent().

       o If my VRT had the extents included for each of the files, would
         this stop QGIS from (at this stage) opening the files and
         reading the extents?
Yes, but QGIS probably calls GetFeatureCount(), so it will need to open each
shapefile, unless you define <FeatureCount> as well.
But QGIS will also ask for the field definitions, and will need to open each ....,
unless you define <Field>
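
For a large set of tiles, the VRT with these optional elements could be generated by a script rather than written by hand. The following is only a minimal sketch using the Python standard library: the function name build_union_vrt is hypothetical, and the extent, feature count and field list in the usage lines are made-up placeholder values, not figures from the NGI dataset. Whether these elements alone are enough to keep the sources closed is not verified here.

```python
import xml.etree.ElementTree as ET


def build_union_vrt(name, sources, extent=None, feature_count=None, fields=None):
    """Return the text of a union VRT, optionally with precomputed metadata.

    sources: list of (layer_name, path_relative_to_vrt) pairs.
    extent: (xmin, ymin, xmax, ymax) or None.
    fields: list of (field_name, ogr_type_name) pairs or None.
    """
    root = ET.Element("OGRVRTDataSource")
    union = ET.SubElement(root, "OGRVRTUnionLayer", name=name)
    for layer_name, path in sources:
        layer = ET.SubElement(union, "OGRVRTLayer", name=layer_name)
        src = ET.SubElement(layer, "SrcDataSource", relativeToVRT="1")
        src.text = path
    # Optional elements that spare OGR from asking the sources.
    for field_name, ogr_type in fields or []:
        ET.SubElement(union, "Field", name=field_name, type=ogr_type)
    if feature_count is not None:
        ET.SubElement(union, "FeatureCount").text = str(feature_count)
    if extent is not None:
        tags = ("ExtentXMin", "ExtentYMin", "ExtentXMax", "ExtentYMax")
        for tag, value in zip(tags, extent):
            ET.SubElement(union, tag).text = repr(float(value))
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Placeholder values for illustration only.
print(build_union_vrt(
    "Rivers",
    [("2730_RIVER_LINE_2006_06", "2730/2730_RIVER_LINE_2006_06.shp")],
    extent=(30.0, -28.0, 31.0, -27.0),
    feature_count=1000,
    fields=[("NAME", "String")],
))
```

Note that values written this way are static: if a source shapefile changes, the VRT has to be regenerated.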

   * Before rendering the VRT, does QGIS look at the extents of my
     viewport and only physically open my files and render their contents?
QGIS will call SetSpatialFilter() on the layer with the extent, so that the
layer can use a spatial index if it has one. Reviewing my code in the VRT union
layer, I can see that the spatial filter will be forwarded to each source
layer. So it will need to open them, but the shapefile driver won't scan any
feature if the spatial filter does not intersect the extent of the
shapefile, so that should be fast. A possible optimization could be done in the
VRT union layer to take into account the extent of the source layer, to avoid
iterating on it if the spatial filter on the union layer doesn't intersect that
extent.
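
The idea behind that possible optimization can be sketched as a plain rectangle test. This is not the OGR code, only an illustration: the function names and the tile extents below are hypothetical, with extents given as (xmin, ymin, xmax, ymax).

```python
def extents_intersect(a, b):
    """True if rectangles a and b, each (xmin, ymin, xmax, ymax), touch or overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def sources_to_scan(source_extents, spatial_filter):
    """Names of the sources whose extent intersects the filter rectangle."""
    return [name for name, extent in source_extents.items()
            if extents_intersect(extent, spatial_filter)]


# Hypothetical one-degree tiles and a viewport inside the middle one.
tiles = {
    "2730": (30.0, -28.0, 31.0, -27.0),
    "2731": (31.0, -28.0, 32.0, -27.0),
    "2732": (32.0, -28.0, 33.0, -27.0),
}
print(sources_to_scan(tiles, (31.2, -27.8, 31.6, -27.4)))  # ['2731']
```

With known source extents, only the sources returned by such a test would need to be opened and iterated for a given viewport.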

To be efficient, you likely need to compute a .qix spatial index on each
shapefile.
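
One way to build the .qix files in bulk is to run ogrinfo with the shapefile driver's CREATE SPATIAL INDEX statement on each file. The sketch below assumes ogrinfo is on the PATH and that each layer name equals the file name without its extension; the helper names are hypothetical.

```python
import os
import subprocess


def spatial_index_command(shp_path):
    """Build the ogrinfo command line that creates a .qix index for one shapefile."""
    layer = os.path.splitext(os.path.basename(shp_path))[0]
    return ["ogrinfo", "-sql", "CREATE SPATIAL INDEX ON " + layer, shp_path]


def build_spatial_indexes(shp_paths, run=subprocess.run):
    """Run the index command for each path; 'run' can be replaced for a dry run."""
    for path in shp_paths:
        run(spatial_index_command(path), check=True)


print(spatial_index_command("2730/2730_RIVER_LINE_2006_06.shp"))
```

Calling build_spatial_indexes() with the list of source paths from the VRT would index all of them in one pass.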


     In other words, if I first zoom into a known area, then open my VRT,
     will QGIS at this stage still open all the subfiles (instead of
     waiting until a specific subfile needs opening)?

   * As I pan around my map, does QGIS open and close the VRT subfiles
     that are out of my current viewing region?
The VRT driver will maintain a pool of a maximum of 100 open source layers by
default (that number can be altered by setting the OGR_VRT_MAX_OPENED
configuration option) and will transparently close the older ones.
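
The behaviour of such a bounded pool can be illustrated with a small model. This is not the driver's implementation: the LayerPool class is hypothetical, and it assumes that "older" means least recently used.

```python
from collections import OrderedDict


class LayerPool:
    """Toy model of a pool that keeps at most max_opened sources open."""

    def __init__(self, max_opened=100):
        self.max_opened = max_opened
        self._open = OrderedDict()  # name -> handle, least recently used first
        self.closed = []            # names closed so far, kept for illustration

    def get(self, name, opener):
        """Return the handle for name, opening it (and closing another) if needed."""
        if name in self._open:
            self._open.move_to_end(name)  # mark as most recently used
            return self._open[name]
        if len(self._open) >= self.max_opened:
            old_name, _ = self._open.popitem(last=False)  # drop least recently used
            self.closed.append(old_name)
        handle = opener(name)
        self._open[name] = handle
        return handle

    def open_names(self):
        return list(self._open)


pool = LayerPool(max_opened=2)
for tile in ["2730", "2731", "2730", "2732"]:
    pool.get(tile, opener=lambda n: "handle:" + n)
print(pool.open_names(), pool.closed)  # ['2730', '2732'] ['2731']
```

In this model a closed source is simply reopened the next time it is requested, which is what makes the closing transparent to the caller.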

   * Presumably if any of my VRT subfiles touch or overlap my current
     viewport, they would be "processed" depending on what I am doing?

   * Is there a way to structure a VRT file so that you can have access
     to the underlying files that make up the VRT? (Even edit access?)
Not sure what you mean by "have access to". But a union VRT can be opened in
update mode and the update mode will be forwarded to the source layers
(provided they support it). You can delete or modify features. For creation of
new features, you need to specify <SourceLayerFieldName> as documented in
http://gdal.org/drv_vrt.html


Or, is the VRT just an easy way to bunch a whole lot of maps under one
name, and there is no processing benefit depending on the area you are
viewing or working in?
Your above VRT should work reasonably fast, unless you have several hundred
or thousands of source layers. In that case, you may need to define more
optional elements in the VRT to avoid the scans, and there would perhaps be a
need for some enhancements in the OGRUnionLayer class.

Even


Hi Even,
Thanks for the detailed thought, and for the effort of reviewing your code.
I'm fiddling with setting up quite a big dataset - likely to have over 1000 shapefiles in the VRT - maybe even up to 3000 - but I will experiment and see what is both logical and practical. My goal with the above questions is to try to avoid opening all the shapefiles at the time the VRT is opened, so that there won't be a "million and one" physical disk IOs. If the user then loads my VRT with rendering off, it should load very quickly (if I can supply all the details needed, in the VRT file). Once the user has zoomed into his/her area of interest, and turns rendering on for the VRT, then (hopefully) only the underlying shapefiles in that AOI need to be physically accessed.

So, how close is the current code to eliminating the need to open any of the underlying files when a VRT is opened, before the user does any rendering or other operations? And if the user is "zoomed in", how close is it to limiting file access to only those underlying files affected by the current view?

ogrinfo -al -so gives a lot of info that could be added to the static VRT file, but is it enough to stop QGIS's implementation of VRT from physically querying the underlying files until absolutely necessary?

Also, when a VRT is opened, do you really need all the knowledge (like the feature count) at this stage? One drawback of putting the feature count into the VRT XML is that someone could change that shapefile, and the actual feature count would then differ from that in the XML file.

So, probably to negate the direction I am hoping to go in (like putting details into the VRT file so that opening the VRT would cause minimal disk io), the correct way would be to optimise the QGIS code so that the information about the underlying files is only read by QGIS when absolutely necessary.


Regards & thanks again for your interest.
Zoltan



--

===========================================
Zoltan Szecsei PrGISc [PGP0031]
Geograph (Pty) Ltd.
GIS and Photogrammetric Services

P.O. Box 7, Muizenberg 7950, South Africa.

Mobile: +27-83-6004028
Fax:    +27-86-6115323     www.geograph.co.za
===========================================
