Hi all!

I would like to pick your brains for suggestions on how to modify my 
image analysis pipeline.

I am analyzing terabytes of image stacks generated with a microscope. The 
code I wrote relies heavily on scikit-image, numpy and scipy. To speed up 
the analysis, the code runs on an HPC cluster 
(https://www.nsc.liu.se/systems/triolith/) with MPI (mpi4py) for 
parallelization and HDF5 (h5py) for file storage. The development cycle of 
the code has been pretty painful, mainly because of my unfamiliarity with 
MPI and problems compiling parallel HDF5 (with many file open/close bugs). 
However, the big drawback is that each core has only 2 GB of RAM (no shared 
RAM across nodes). To run some of the processing steps I ended up reserving 
one node (16 cores) but running on only 3 cores in order to have enough RAM 
(image chunking won't work in this case). As you can imagine, this is 
extremely inefficient and I end up getting low priority in the queue 
system.
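
To put numbers on that inefficiency, here is a quick back-of-the-envelope 
calculation. Only the 16 cores per node and the 2 GB per core come from the 
system described above. The function name is made up for illustration, and 
the calculation assumes the whole node's RAM is available to the processes 
I actually run.

```python
def node_usage(cores_per_node=16, ram_per_core_gb=2, cores_used=3):
    """Rough memory and core usage when only a few cores of a node run."""
    # Assumption: total node RAM is simply cores * RAM per core.
    node_ram_gb = cores_per_node * ram_per_core_gb
    # RAM each running process can use if the others stay idle.
    ram_per_process_gb = node_ram_gb / cores_used
    # Fraction of the reserved cores that do any work.
    core_utilization = cores_used / cores_per_node
    return node_ram_gb, ram_per_process_gb, core_utilization


node_ram, per_proc, util = node_usage()
print(f"{node_ram} GB per node, {per_proc:.1f} GB per process, "
      f"{util:.0%} of reserved cores busy")
```

So each process gets roughly a third of the node's memory, while most of 
the reserved cores sit idle.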


Our lab recently bought a new 4-node server with shared RAM running 
Hadoop. My goal is to move the parallelization of the processing to dask. I 
tested it before on another system and it works great. The drawback is that, 
if I understood correctly, parallel HDF5 works only with MPI 
(driver='mpio'). HDF5 gave me quite a bit of headache, but it works well for 
keeping the data well structured, and I can save everything as numpy 
arrays, which is very handy.
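
For concreteness, this is roughly the kind of dask + h5py usage I have in 
mind. It is only a minimal sketch under several assumptions: the file paths, 
the dataset name "stack" and the function names are hypothetical, the 
per-plane step is a placeholder for the real scikit-image processing, and it 
uses serial h5py on a single machine (no driver='mpio'), with dask's default 
lock serializing the writes.

```python
import os
import tempfile

import dask.array as da
import h5py
import numpy as np


def subtract_plane_mean(block):
    # Placeholder for a real processing step: remove each plane's mean.
    return block - block.mean(axis=(1, 2), keepdims=True)


def process_stack(in_path, out_path, name="stack"):
    """Process a (planes, rows, cols) HDF5 dataset one plane at a time."""
    with h5py.File(in_path, "r") as fin, h5py.File(out_path, "w") as fout:
        dset = fin[name]
        # One dask chunk per image plane, so only a few planes are in RAM.
        stack = da.from_array(dset, chunks=(1,) + dset.shape[1:])
        result = stack.map_blocks(subtract_plane_mean, dtype=np.float64)
        out = fout.create_dataset(name, shape=result.shape,
                                  dtype=result.dtype)
        # da.store computes and writes chunk by chunk; by default it takes
        # a lock around each write, so HDF5 is never written concurrently.
        da.store(result, out)
```

Calling process_stack("in.h5", "out.h5") would then read, process and write 
the stack without ever holding the whole array in memory. What I do not know 
is how well this pattern scales once the workers are spread over several 
nodes.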


If I move to Hadoop/dask, what do you think would be a good solution for 
data storage? Do you have any additional suggestions that could improve the 
layout of the pipeline? Any help will be greatly appreciated.
