Hello,

since everybody seems really helpful here, I'm going to ask for some help with this:

I'm reading imaging data that looks like the following:

(header+2*x,y,z, ...)

In other words, each line of complex data (pairs of floats) is preceded by a header (128 bytes long, 47 fields of different types: bitmaps, ushort, long, ...).
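For concreteness, here is a minimal sketch of pulling a few fields out of such a header with unpack. The field names, offsets, and the 'v2 l' template are made up, since the real 47-field layout isn't given; 'v' assumes little-endian ushorts:

```perl
use strict;
use warnings;

my $hdrlen = 128;
open my $fh, '<:raw', 'scan.dat' or die "open: $!";

read($fh, my $hdr, $hdrlen) == $hdrlen or die "short header";

# Hypothetical layout: two ushorts (index fields), then a signed long.
my ($field_y, $field_z, $navg) = unpack 'v2 l', $hdr;
```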

Up until now, I have had a loop that first reads the header with read (then unpack), followed by readflex to fetch a line. This requires two file accesses per line, plus a Perl loop. For 12 GB of data that takes a while.
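One easy halving of the syscall count is to read header and data line together in a single sysread per record, then split the buffer in Perl. A sketch, where $npts (complex points per line) and native-float endianness are assumptions:

```perl
use strict;
use warnings;
use PDL;

my $hdrlen = 128;
my $npts   = 256;                      # complex points per line (assumed)
my $reclen = $hdrlen + 2 * 4 * $npts;  # header + 2 floats per point

open my $fh, '<:raw', 'scan.dat' or die "open: $!";
while (sysread($fh, my $rec, $reclen) == $reclen) {
    my $hdr  = substr $rec, 0, $hdrlen;
    my $line = pdl(float, [ unpack 'f*', substr $rec, $hdrlen ]);
    # ... unpack $hdr, scatter $line into the big data piddle ...
}
```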

The header contains fields giving the indices in the big data matrix where the line should go. So I use something like this to populate a big piddle:

data(,x,header(field_y),header(field_z), ...)+=line

The += is there because some lines are acquired multiple times; the header lookup is necessary since lines can arrive in arbitrary order and some are missing entirely (intentionally).
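One caveat if this accumulation is ever vectorized over many lines at once: assigning through a dataflow slice does not guarantee correct results when the same target index occurs more than once in a single call, whereas indadd (PDL::Primitive) accumulates repeated indices correctly. A toy sketch on a flattened (y,z) target, with all sizes and indices invented:

```perl
use PDL;

my ($ny, $nz) = (8, 8);               # toy dimensions
my $data = zeroes(float, $ny * $nz);  # flattened (y,z) target
my $vals = ones(float, 5);            # values to scatter-add
my $y    = pdl(long, 1, 1, 2, 3, 3);  # note repeated (y,z) pairs
my $z    = pdl(long, 0, 0, 4, 5, 5);

# flatten (y,z) -> linear offset, then duplicate-safe scatter-add
indadd($vals, $y + $ny * $z, $data);
```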

Now I have the following issues:

1. The data piddle will be bigger than 2GB (see my previous request), each dimension will max out around 10^5!

2. Do you see an effective pdlish way to avoid a loop?

ad 1) For the moment, I can/have to split the data into multiple piddles; not nice, but doable. Since the data will be processed further (FFT, filtering, ...), I have to find the best place to split, which can be hard since the structure of the data varies a lot depending on the experiment.
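Regarding 2), one way to get the per-line Perl loop out of the hot path is to slurp a big chunk of whole records and peel all the data lines (and the index fields of all headers) out with grouped unpack templates in one go. A sketch only: $npts, the little-endian ushort at header offset 0, and the chunk size are assumptions, and since unpack builds a Perl list this should be done per chunk, not on all 12 GB at once:

```perl
use strict;
use warnings;
use PDL;

my $hdrlen = 128;
my $npts   = 256;                      # complex points per line (assumed)
my $nfloat = 2 * $npts;
my $reclen = $hdrlen + 4 * $nfloat;

open my $fh, '<:raw', 'scan.dat' or die "open: $!";
read($fh, my $buf, $reclen * 1024);    # one chunk of up to 1024 records
my $nrec = int(length($buf) / $reclen);

# All data lines at once: skip each 128-byte header, take the floats
# that follow, repeat to the end of the buffer.
my $lines = pdl(float, [ unpack "(x${hdrlen} f${nfloat})*", $buf ])
              ->reshape($nfloat, $nrec);

# Likewise, one index field per header (ushort at offset 0, invented):
my $ys = pdl(long, [ unpack '(v x' . ($reclen - 2) . ')*', $buf ]);
```

With $lines and the index piddles in hand, the scatter-add into the big piddle can then be done with indadd on flattened offsets rather than in a Perl loop.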


Thanks

Ingo

_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl