On Tue, Jun 28, 2011 at 10:50 AM, Joe Harrington <j...@physics.ucf.edu> wrote:
> As with Travis, I have not had time to wade through the 150+ messages > on masked arrays, but I'd like to raise a concept I've mentioned in > the past that would enable a broader use if done slightly differently. > That is, the "masked array" problem is a subset of this more-general > problem. Please respond on this thread if you want me to see the > replies. > > The problem is of data accumulation and averaging, and it comes up all > the time in astronomy, medical imaging, and other fields that handle > stacks of images. Say that I observe a planet, taking images at, say, > 30 Hz for several hours, as with many video cameras. I want to > register or map those images to a common coordinate system, stack > them, and average them. I must recognize: > > 1. transient there are bad pixels due to cosmic ray hits on the CCD array > 2. some pixels are permanently dead and always produce bad data > 3. some data in the image is not part of the planet > 4. some images have too much atmospheric blurring to be used > 5. sometimes moons of planets like Jupiter pass in front > 6. etc. > > For each original image: > Make a mask where 1 is good and 0 is bad. This trims both bad > pixels and pixels that are not part of the planet. > Evaluate the images and zero the entire image (or regions of an > image) if quality is low (or there is an obstruction). > Shift (or map, in the case of lon-lat mapping) the image AND THE > MASK to a common coordinate grid. > Append the image and the mask to the ends of the stacks of prior > images and masks. > > For each x and y position in the image/mask stacks: > Perform some averaging of all pixels in the stack that are not masked, > or otherwise use the mask info to decide how to use the image info. > > The most common of these averages is to multiply the stack of images > by the stack of masks, zeroing the bad data, then sum both images and > masks down the third axis. This produces an accumulator image and a > count image, where the count image says how many data points went into > the accumulator at each pixel position. Divide the accumulator by the > count. This produces an image with no missing data and the highest > SNR per pixel possible. > > Two things differ from the current ma package: the sense of TRUE/FALSE > and the ability to have numbers other than 0 or 1. For example, if > data quality were part of the assessment, each pixel could get a data > quality number and the accumulation would only include pixels with > better than a certain number, or there could be weighted averaging. > > Currently, we just implement this as described, without using the > masked array capability at all. However, it's cumbersome to hand > around pairs of objects, and making a new object to handle them > together means breaking it apart when calling standard routines that > don't handle masks. There are probably some very fast ways to do > interesting kinds of averages that we haven't explored but would if > this kind of capability were in the infrastructure. > > So, I hope that if some kind of masking goes into the array object, it > allows TRUE=good, FALSE=bad, and at least an integer count if not a > floating quality number. One bit is not enough. Of course, these > could be options. Certainly adding 7 or 31 more bits to the mask will > make space problems for some, as would flipping the sense of the > mask. However, speed matters. Some of those doing this kind of > processing are handling hours of staring video data, which is 1e6 > frames a night from each camera. > > This might be described as masks as weights, or perhaps from the Bayesian point of view, as measures of just how unknown unknown data is. One thing I like about masks is that they allow ideas like this to creep in and thus encourage future innovations. But as you also point out, 8 bits might be a bit on the low side for that and a consistent idea of how to interpret things would help ala the missing or unknown models. Chuck
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion