Hi,

I'm seeing some weird behavior with virtual raster (VRT) datasets opened 
simultaneously from multiple processes. I hope I can explain this so that it 
makes sense. Here's an excerpt of my Python code:

        http://dpaste.com/hold/515217/

Line 8 is where I make a change to the dataset:

        source_ds.SetProjection(source_ds.GetGCPProjection())

I do that so that the projection for the ground control points is available for 
a later call to gdal.ReprojectImage(); it wasn't working until I started to use 
SetProjection() in this way. All of this is being called from the context of a 
multi-process web server, running as unprivileged user "www-data" under Ubuntu 
(this is important later). My web server error log fills up with these:

        ERROR 1: Failed to write .vrt file in FlushCache().

My assumption here is that because the unprivileged user can't write to the 
dataset file, GDAL emits an error complaining that it can't flush the 
dataset cache back to the original file. So far, this is just an annoyance, but 
one that I expected to go away when I switched from gdal.Open() to 
gdal.OpenShared() with the read-only flag, like this:

        gdal.OpenShared(src_path, gdal.GA_ReadOnly)

I'm still getting the same errors.
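
One possible way to sidestep the write-back, sketched here under assumptions 
rather than confirmed: gdal.ReprojectImage() accepts the source WKT as its 
third argument, so the GCP projection could be passed in directly instead of 
being stored on the dataset with SetProjection(). The helper names below are 
hypothetical, and the sketch assumes dest_ds already has its projection set.

```python
def gcp_source_wkt(source_ds):
    """Pick a source WKT without calling SetProjection() on the dataset.

    Uses the dataset's own projection if it has one, otherwise falls
    back to the projection attached to its ground control points.
    """
    return source_ds.GetProjection() or source_ds.GetGCPProjection()


def reproject_without_modifying(source_ds, dest_ds):
    # Imported lazily so gcp_source_wkt() stays usable without GDAL installed.
    from osgeo import gdal

    src_wkt = gcp_source_wkt(source_ds)
    # Assumption: since source_ds is never changed here, it should not be
    # marked dirty, so FlushCache() should have nothing to write to the .vrt.
    return gdal.ReprojectImage(source_ds, dest_ds, src_wkt,
                               dest_ds.GetProjection(), gdal.GRA_Bilinear)
```

Whether this also silences "ERROR 1" would need to be checked against the 
actual datasets; the bilinear resampling choice is only illustrative.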

Meanwhile, I made a switch in web servers, from an Apache-based CGI environment 
to the multi-worker WSGI server Gunicorn. When I initially ran my code under 
Gunicorn using my normal, privileged user account, I immediately started to see 
failures from gdal.Open() and gdal.OpenShared(), specifically the assertion 
errors on line 4 of the dpaste above. I tried placing exclusive file locks 
(using fcntl.flock) around each access to a given VRT dataset, but this didn't 
seem to help at all. There were frequent, unpredictable errors opening datasets 
in a multi-process environment *until* I switched from the privileged user to 
the unprivileged user. Once I did that, everything began to work normally, but 
I got all the old "ERROR 1" reports again.
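
For reference, the locking approach can be sketched roughly as follows. This is 
a minimal illustration, not the code from the dpaste: the locked() helper and 
the idea of locking a separate sidecar file are assumptions. Note that flock 
locks are advisory, so they only help if every process goes through the same 
helper, and the dataset would have to be both opened and released inside the 
locked block for the lock to cover any write-back.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked(lock_path, exclusive=True):
    """Hold an advisory flock on lock_path for the duration of the block."""
    # Lock a sidecar file (hypothetical naming) rather than the VRT itself,
    # so taking the lock does not depend on write access to the data file.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o666)
    try:
        # Blocks until the lock is available.
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield fd
    finally:
        # Release explicitly, then close the descriptor.
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

Usage would look like "with locked(src_path + '.lock'): ..." with the open, 
the reprojection, and the release of the dataset all inside the block.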

It seems to me that gdal.OpenShared() with the read-only flag isn't doing what 
it promises, and that it's trying to write back to the files, potentially 
modifying them even as competing processes are accessing them. Is it possible 
that the overlapping processes in my privileged user scenario are seeing 
temporarily empty VRT files? I'm also confused by the lack of a gdal.Close() 
function or something similar, and by the fact that I can't seem to make a 
change to a dataset in memory without gdal attempting to push that change back 
to disk via FlushCache().

What's the right thing to do here? Make temporary copies of small VRT datasets 
prior to each use so they can be safely written to and disposed of? Build a 
wrapper class that encapsulates copying and disposal? Figure out some way to 
make gdal release datasets when asked, or open them in real read-only mode?
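
The temporary-copy idea could look something like the sketch below. The 
temporary_copy() name is hypothetical, and the sketch assumes the VRT refers to 
its source rasters by absolute path; a VRT that uses paths relative to its own 
location would no longer resolve them from the temporary directory.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_copy(src_path):
    """Yield the path of a private, writable copy of src_path."""
    # Each caller gets its own directory, so concurrent processes never
    # share (or overwrite) the same copy of the file.
    tmp_dir = tempfile.mkdtemp(prefix="vrt-")
    tmp_path = os.path.join(tmp_dir, os.path.basename(src_path))
    try:
        shutil.copyfile(src_path, tmp_path)
        yield tmp_path
    finally:
        # Dispose of the copy, including anything written back to it.
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

Inside the "with temporary_copy(src_path) as tmp_path:" block, the copy would 
be opened with gdal.Open(tmp_path), and the last reference to the dataset 
dropped (for example by setting the variable to None) before the block ends, so 
that any flush happens while the copy still exists.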

Any advice greatly appreciated!

-mike.

----------------------------------------------------------------
michal migurski- [email protected]
                 415.558.1610



_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
