Hi Michael,
On Nov 27, 2012, at 6:33 PM, Michael Jackson <[email protected]>
wrote:
> I have a project coming up where we are going to store multiple modalities of
> data acquired from Scanning Electron Microscopes (EBSD, EDS and ISE**).
>
> For each modality the data is acquired in a grid fashion. For the ISE data
> the data is actually a gray scale image so those are easy to store and think
> about. If the acquired grid is 2048 x 2048 then you have a gray scale image of
> that same size (unsigned char). This is where things start getting
> interesting. For the EBSD data there are several signals collected at *each*
> pixel. Some of the signals are simple scalar values (floats) and we have been
> storing those also in a 2D array just like the ISE image. But one of the
> signals is actually itself a 2D image (60x80 pixels). So, for example, if the
> EBSD sampling grid is 100 x 75, then at each grid point there is a 60x80
> array of data.
>
> The EDS data is much the same except we have a 2048-element 1D array at each
> pixel and the dimensions of the EDS sampling grid are 512 x 384.
>
> I am trying to figure out a balance between efficient storage and easy
> access. One thought was to store each grid point as its own "group" but that
> would be hundreds of thousands of groups and I don't think HDF5 is going to
> react well to that. So the other end of that would be to continue to think of
> each modality of data as an "Image" and store all the data under a group such
> as "EDS" as a large multi-dimensional array. So for example in the EBSD data
> acquisition from above I would have a 4D array (100x75x80x60). What
> attributes should I store with the data set so that later, when we are
> reading through the data, we can efficiently grab hyperslabs without
> having to read the entire data set into memory?

Actually, groups with hundreds of thousands of links should be fine.
However, I would lean toward keeping the image structure and either
using an array datatype (80x60, in the case you gave) or a compound datatype
for the "pixels". Another useful option is to create a group for each "image"
and then store a separate dataset for each field in the array.

Quincey

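As a rough illustration of keeping the image structure, here is a minimal sketch using h5py. It uses the plain 4D layout (100x75x80x60) from the question rather than an array or compound datatype. Several things are assumptions that do not come from the thread: the uint8 element type, the dataset name "Patterns", the attribute name "grid_dims", the file name, and the use of a chunked layout with a 5x5 block of grid points per chunk. Chunking is one common way to let partial reads touch only part of the file; the chunk shape would need tuning against the real access pattern.

```python
import h5py
import numpy as np

GRID = (100, 75)     # EBSD sampling grid from the question
PATTERN = (80, 60)   # per-point pattern, in the 4D order given in the question

# In-memory file so the sketch leaves nothing on disk; the name is a placeholder.
with h5py.File("ebsd_sketch.h5", "w", driver="core", backing_store=False) as f:
    ebsd = f.create_group("EBSD")
    patterns = ebsd.create_dataset(
        "Patterns",                  # hypothetical dataset name
        shape=GRID + PATTERN,        # (100, 75, 80, 60)
        dtype=np.uint8,              # assumed element type
        chunks=(5, 5) + PATTERN,     # assumed chunk shape, to be tuned
    )
    patterns.attrs["grid_dims"] = GRID   # hypothetical attribute name

    # Write a single pattern at grid point (10, 20).
    patterns[10, 20] = np.full(PATTERN, 7, dtype=np.uint8)

    # Hyperslab reads: only the selected region is returned, not the whole dataset.
    one = patterns[10, 20]        # one pattern
    block = patterns[10, 20:25]   # patterns at five neighbouring grid points

    one_shape, block_shape = one.shape, block.shape
    one_value = int(one[0, 0])
    untouched_value = int(block[1, 0, 0])   # never written, so the default fill value
```

With this layout each slice expression maps to a hyperslab selection, so reading one pattern or a small neighbourhood does not require loading the full array. The same idea should carry over to the EDS data as a 3D dataset (512x384x2048), again with an assumed chunk shape.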
> I hope all of that was clear enough to elicit some advice on storage. Thanks
> for any help. Just for clarification the sizes of the data sets are for our
> "experimental" data sets where we are just trying to figure this out. The
> real data sets will likely be multi-gigabytes in size for each "slice" of
> data where we may have 250 slices.
>
>
> ** EBSD - Electron Backscatter Diffraction
> EDS - Energy Dispersive Spectroscopy
> ISE - Ion Induced Secondary Electron Image
>
> Thanks for any help or advice.
> ___________________________________________________________
> Mike Jackson Principal Software Engineer
> BlueQuartz Software Dayton, Ohio
> [email protected] www.bluequartz.net
>
>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org