I'm working on multi-core VMs in a cloud environment that access their data on a central data server via NFS. Parallelizing jobs for different map sheets gives huge accelerations for C programs like gdaladdo, but there seems to be a problem with Python-based programs like rgb2pct.py. Consider the following:

(
    rgb2pct.py file1.tif file1_256.tif
    gdaladdo file1_256.tif 2 4 8 16
)&
(
    rgb2pct.py file2.tif file2_256.tif
    gdaladdo file2_256.tif 2 4 8 16
)&
... etc., one such block per available core
wait

When running this on a 16-core VM I first see 16 Python processes, each with a CPU load of around 20% per processor, and then 16 gdaladdo processes with CPU loads around 95%. When I replace the tif input files for rgb2pct.py with equivalent jpg files, the load of the 16 rgb2pct.py processes increases to about 80% and the overall computing time is more than halved.

So my impression is that one Python I/O process blocks all the others. I have read something about Python's GIL (Global Interpreter Lock, http://docs.python.org/faq/library#can-t-we-get-rid-of-the-global-interpreter-lock) and the multiprocessing module, but I don't see an easy way to apply this to my setup. Does anyone have a simple solution for this problem?
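For what it's worth, since each rgb2pct.py invocation is a separate OS process with its own interpreter, the GIL should not be shared between them; the contention is more likely in I/O. One way to drive the same per-sheet pipeline from a single Python script with the multiprocessing module is sketched below. The sheet names, the pool size of 16, and the exact rgb2pct.py/gdaladdo arguments are assumptions mirroring the shell snippet above, not a tested recipe:

```python
import subprocess
from multiprocessing import Pool

def run_pipeline(commands):
    """Run a list of commands sequentially, aborting on the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)

def process_sheet(name):
    # Hypothetical per-sheet job, mirroring the shell subshells above:
    # color-quantize the sheet, then build overview levels on the result.
    run_pipeline([
        ["rgb2pct.py", f"{name}.tif", f"{name}_256.tif"],
        ["gdaladdo", f"{name}_256.tif", "2", "4", "8", "16"],
    ])
    return name

if __name__ == "__main__":
    sheets = [f"file{i}" for i in range(1, 17)]  # assumed sheet names
    # One worker process per core; map() blocks until all sheets are done,
    # like the shell "wait" above.
    with Pool(processes=16) as pool:
        done = pool.map(process_sheet, sheets)
    print(f"finished {len(done)} sheets")
```

This does not by itself remove any NFS bottleneck, of course; it only replaces the hand-written subshells with a worker pool.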

Jan

_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev