Dear Lorenzo,

On 03/03/2016 12:44 AM, Lorenzo Bottaccioli wrote:
> If I run the code without parallelization, it takes around 650s to
> complete the calculation. Each iteration of the for loop is executed in
> ~10s. If I run it with parallelization, it takes ~900s to complete the
> process, and each iteration of the for loop takes ~30s.
>
> How is that? How can I fix this?

If I am not mistaken, you are splitting your work into 8 concurrent processes. Does your machine have at least 8 cores, and can you observe that the processes do indeed run in parallel? If so, the overhead may come from the processes trying to access input and output files at the same time. Although your computations may run in parallel, the I/O will still happen sequentially, and less efficiently than during your non-concurrent run, because multiple processes are now frequently competing for access to files at the same time and blocking each other.
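A quick way to check the first point from Python is sketched below. It assumes your script is written in Python and that the pool size is 8, as I understood from your message; the names `usable_cores` and `requested_workers` are only illustrative.

```python
import os


def usable_cores():
    """Return the number of CPU cores this process is allowed to run on."""
    try:
        # Linux only: respects CPU affinity masks (e.g. taskset, some containers).
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Other platforms: fall back to the total number of logical cores.
        return os.cpu_count() or 1


requested_workers = 8  # assumption: the number of processes used in the question
cores = usable_cores()
workers = min(requested_workers, cores)
print(f"{cores} usable cores, running {workers} workers")
```

If `cores` is smaller than 8, the processes have to share cores and cannot all run at the same time.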

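You should probably instrument your code to figure out where the time is spent: time the reading, the computation and the writing separately, in both the sequential and the parallel run.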
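A minimal sketch of such instrumentation, assuming a Python script, could look like the following. The `timed` helper and the three phases inside `process_item` are hypothetical stand-ins; you would wrap your real read, compute and write steps instead.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)  # phase name -> accumulated wall-clock seconds


@contextmanager
def timed(phase):
    """Add the wall-clock time spent inside the with-block to timings[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase] += time.perf_counter() - start


def process_item(item):
    # Hypothetical stand-ins for one iteration of the for loop.
    with timed("read"):
        data = list(range(item))            # stand-in for reading the input
    with timed("compute"):
        result = sum(x * x for x in data)   # stand-in for the calculation
    with timed("write"):
        _ = str(result)                     # stand-in for writing the output
    return result


for item in (1000, 2000, 3000):
    process_item(item)

for phase, seconds in sorted(timings.items()):
    print(f"{phase:8s} {seconds:.6f} s")
```

Note that with multiple processes each worker has its own copy of `timings`, so each worker needs to print or return its own numbers. If the "read" and "write" totals grow a lot in the parallel run while "compute" stays about the same, I/O is the likely culprit.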

If I/O is indeed the bottleneck, you might get better results by distributing the files the processes work on over multiple disks (and multiple controllers). This removes some synchronization points, allowing multiple processes to continue running in parallel even while doing I/O.

As a general rule, you don't want large-scale I/O to happen concurrently on non-parallel hardware.
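If you only have a single disk, one thing you could try is to let the computation run in parallel but allow only one process at a time to touch the disk. The sketch below does this with a shared lock from Python's `multiprocessing` module. It is not taken from your code: `process_tile`, `init_worker`, the pool size of 4 and the workload are all made up for illustration, and whether this helps at all depends on what the instrumentation shows.

```python
import multiprocessing as mp
import os
import tempfile

_io_lock = None  # set in every worker process by init_worker


def init_worker(lock):
    """Pool initializer: store the shared lock in a per-process global."""
    global _io_lock
    _io_lock = lock


def process_tile(args):
    index, out_dir = args
    # Compute phase: runs concurrently in all workers (stand-in workload).
    result = sum(i * i for i in range(index * 1000))
    # I/O phase: the lock ensures only one worker writes at any time.
    path = os.path.join(out_dir, f"tile_{index}.txt")
    with _io_lock:
        with open(path, "w") as f:
            f.write(str(result))
    return path


def main():
    lock = mp.Lock()
    with tempfile.TemporaryDirectory() as out_dir:
        jobs = [(i, out_dir) for i in range(8)]
        with mp.Pool(processes=4, initializer=init_worker,
                     initargs=(lock,)) as pool:
            paths = pool.map(process_tile, jobs)
        print(f"wrote {len(paths)} files")


if __name__ == "__main__":
    main()
```

The lock is passed through the pool initializer because a `multiprocessing.Lock` cannot be sent to workers as an ordinary task argument.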

Best regards,
Kor

_______________________________________________
gdal-dev mailing list
http://lists.osgeo.org/mailman/listinfo/gdal-dev
