Hi all, during the mini-hackathon last week in Madison, I tried to play with some ideas of a framework that could possibly unify the currently existing Functions, Ops, and Region-Of-Interest concepts in imglib2. At this point it is all very experimental and there are several open problems (see end of this mail). Anyway, here are some details on what I tried. I realize that it's a very long mail, it would be great if some of you grind through and comment anyway :-)
Goals ===== The goals of the stuff in net.imglib2.ops.expressions were the following: * Make ops easier to manipulate (i.e. concatenate, iterate, ...) than with the current net.imglib2.ops.operations design. * The same design should be apply on all levels, for example pixels, images, lists of images, and mixtures of those things. * It should be possible to make this nearly as fast as hand-written code. The goal is that you actually would use a pixelwise operation with an iterate construct to apply it to images instead of handcoding loops. (Unfortunately, there is still much room for performance improvement...) Basic Ops ========= The basic idea is that an Op producing a result of type T is a Sampler<T>. public interface Op< O > extends Sampler< O > {...} Calling op.get() triggers the computation and returns the result. The inputs to an Op should also be Sampler<T>. This makes it easy to connect either cursors etc or other Ops to the inputs. If you have something like op1 = A + 10 op2 = B + C op3 = op1 + op2 then I think we need some concept of a "port" to refer to the inputs. We want to be able to say: Here is a binary operation "op3" which has two inputs "A" and "C", iterate that over (two input and one output) images! Note, that we leave port "C" open. Maybe later we want to connect a list of constants there when we iterate over sets of images. I made the following interface public interface Port< T > { public void set( Sampler< T > sampler ); public void setConst( T t ); } where setConst() is just a convenience method supposed to create a Sampler<T> wrapper if you just want to connect a value directly. The interfaces for unary and binary operations can be asked for input and output ports: public interface Op< O > extends Sampler< O > { public Port< O > output(); } public interface UnaryOp< O, I1 > extends Op< O > { public Port< I1 > input1(); } public interface BinaryOp< O, I1, I2 > extends Op< O > { public Port< I1 > input1(); public Port< I2 > input2(); } Lets look at an example of creating and using a binary op. (All examples can be found in net.imglib2.ops.expression.examples on the roi-experiments branch.) public static void addExample1( final Img< FloatType > inputA, final Img< FloatType > inputB, final Img< FloatType > output ) { final Cursor< FloatType > ca = inputA.cursor(); final Cursor< FloatType > cb = inputB.cursor(); final Cursor< FloatType > cc = output.cursor(); final Add< FloatType > op = new Add< FloatType >(); op.input1().set( ca ); op.input2().set( cb ); op.output().set( cc ); while ( cc.hasNext() ) { ca.fwd(); cb.fwd(); cc.fwd(); op.get(); } } The op is created using new Add<FloatType>() which creates a BinaryOp<FloatType,FloatType,FloatType> with undefined inputs and outputs. Then we connect cursors over the input and output images to the ops ports and manually iterate over all pixels in the input and output image using a while loop and fwd()ing all cursors. Alternatively, we could have connected the cursors when constructing the op final Add< FloatType > op = new Add< FloatType >( cc, ca, cb ); or even used a static Ops creator method final Add< FloatType > op = Ops.add( cc, ca, cb ); If we look at the implementation of Add it is actually quite harmless. Boilerplate code for implementing Ports etc is inherited from AbstractBinaryOp. public final class Add< T extends NumericType< T > > extends AbstractBinaryOp< T, T, T > { public Add() {} public Add( final Sampler< T > output, final Sampler< T > input1, final Sampler< T > input2 ) { super( output, input1, input2 ); } @Override public T get() { final T t = output.get(); t.set( input1.get() ); t.add( input2.get() ); return t; } protected Add( final Add< T > expression ) { super( expression ); } @Override public Add< T > copy() { return new Add< T >( this ); } } Iterating Ops ============= Next, lets use an Iteration construct to apply the pixel op to the images instead of doing the iteration manually. public static void addExample2( final Img< FloatType > inputA, final Img< FloatType > inputB, final Img< FloatType > output ) { final Add< FloatType > pixelOp = new Add< FloatType >(); final BinaryIterate< FloatType, FloatType, FloatType > op = new BinaryIterate<FloatType, FloatType, FloatType>( pixelOp ); // setConst(t) creates a new Sampler<T> object whose get() returns t. op.input1().setConst( inputA ); op.input2().setConst( inputB ); op.output().setConst( output ); op.get(); } We "lift" the binary pixel op to a binary operation on IterableIntervals using a BinaryIterate operation. BinaryIterate< A, B, C > is a BinaryOp< IterableInterval<A>, IterableInterval<B>, IterableInterval<C> >. So next we connect the input and output images to its ports, call get() once and are done. Alternatively, Ops.compute( op, output, inputA, inputB ); would do the connecting and get()ing for us. We can even write the whole addExample2 in one line: Ops.compute( Ops.iterate( Ops.<FloatType>add() ), output, inputA, inputB ); Chaining Ops ============ Lets create a slightly more complex pixelwise binary operation on inputs i1 and i2 that computes ( ( 100 + i1 ) + i2 ). Then we use the BinaryIterate operation and run it on the input and output images. public static void addExample3( final Img< FloatType > inputA, final Img< FloatType > inputB, final Img< FloatType > output ) { final Add< FloatType > innerpixelop = Ops.add( Ops.floatTemp(), Ops.floatConstant( 100 ), null ); final Add< FloatType > outerpixelop = Ops.add( null, innerpixelop, null ); final WrapBinary<FloatType,FloatType,FloatType> pixelop = Ops.wrap( outerpixelop, innerpixelop.input2(), outerpixelop.input2() ); final BinaryIterate< FloatType, FloatType, FloatType > op = Ops.iterate( pixelop ); Ops.compute( op, output, inputA, inputB ); } First we create innerpixelop to compute (100 + i1) and store the result in a temporary variable (created using Ops.floatTemp()). Note that the second input port is null for now. Later we want BinaryIterate to connect a cursor to it. Next we create outerpixelop to compute (innerpixelop + i2). Again, the second input port is null. Now we have a tree of ops that should be presented to BinaryIterate as a BinaryOp. Therefore, we need to create a wrapper by specifying one Op (output) and two input ports. This happens in Ops.wrap(). outerpixelop is the op that provides the result, and the two ports we left set to null are the input ports. It would be nice to be able to write the pixelop construction compactly in one line. However, above we need the innerpixelop and outerpixel intermediate variables because we need to access innerpixelop.input2() and outerpixelop.input2() in order to construct the WrapBinary! Therefore, I have PortRef objects, which can act as names for ports. Introducing PortRefs i1 and i2, final PortRef< FloatType > i1 = new PortRef< FloatType >(); final PortRef< FloatType > i2 = new PortRef< FloatType >(); we can rewrite the above to final Add< FloatType > innerpixelop = Ops.add( Ops.floatTemp(), Ops.floatConstant( 100 ), i1 ); final Add< FloatType > outerpixelop = Ops.add( null, innerpixelop, i2 ); final WrapBinary< FloatType, FloatType, FloatType > pixelop = Ops.wrap( outerpixelop, i1, i2 ); or in one (very long) line final BinaryIterate< FloatType, FloatType, FloatType > op = Ops.iterate( Ops.wrap( Ops.add(null, Ops.add(Ops.floatTemp(), Ops.floatConstant(100), i1), i2), i1, i2 ) ); Now if only we could get rid of having to create the PortRefs... More "Lifting" Ops ================== One more helpful tool I played with is a Reduce operation. Given a binary operation *:(x,y)->z, it is applied recursively to a sequence [a,b,d,...] as ((((a*b)*c)*d)*...) Example: public static void main( final String[] args ) { final float[] data = new float[] { 12, 7, 4, 100, 9, 30, 34, 97.2f, 13.4f }; final Img< FloatType > imgA = ArrayImgs.floats( data, data.length ); final IntervalReduce< FloatType > reduceMin = Ops.reduce( Ops.<FloatType>min() ); reduceMin.input1().setConst( imgA ); reduceMin.output().setConst( new FloatType() ); System.out.println( "minimum is " + reduceMin.get().get() ); System.out.println( "maximum is " + Ops.compute( Ops.reduce( Ops.<FloatType>max() ), new FloatType(), imgA )); System.out.println( "sum is " + Ops.compute( Ops.reduce( Ops.<FloatType>add() ), new FloatType(), imgA )); } First we create a 1D test image containing some values. Then we compute the minimum, maximum, and sum of the values by reduce with pixelwise min, max, and add. For the minimum operation, I have written it a bit more verbose than for the other two: IntervalReduce<T> is an Op that implements the reduce operation. It is a UnaryOp<T, IterableInterval<T>>. It is constructed with a BinaryOp<T,T,T>, done here with the Ops.reduce() static convenience method. Final example. A reduce operation as the one discussed before, takes an IterableInterval<T> and computes a T. Now assume that the IterableInterval is a subset of pixels in an image, for instance all pixels within a hypersphere centered at X. The reduce the would compute for example the minimum of those pixels. Now if we can move the Hypersphere to different center X, compute the minimum there etc, and write the minima to another image, we have a minimum filter. With the region-of-iterest stuff (that the roi-experiments branch was actually about) we can do that easily. public static void main( final String[] args ) throws ImgIOException { final String fn = "/Users/pietzsch/workspace/data/DrosophilaWing.tif"; final int span = 2; final ArrayImgFactory<FloatType> factory = new ArrayImgFactory<FloatType>(); final FloatType type = new FloatType(); final Img<FloatType> imgInput = new ImgOpener().openImg(fn, factory, type); final Img<FloatType> imgOutput = factory.create(imgInput, type); // a min filter using 2D hypersphere neighborhoods of radius 3. final RegionFilter< FloatType, BoolType, HyperSphereRegion > op = Ops.regionfilter( new HyperSphereRegion( 2, 3 ), Ops.reduce( Ops.<FloatType>min() ) ); Ops.compute( op, imgOutput, Views.extendBorder( imgInput ) ); // ... or a sum filter using 5x5 hyperbox neighborhoods ... Ops.compute( Ops.regionfilter( new HyperBoxRegion( Intervals.createMinMax( -2, -2, 2, 2 ) ), Ops.reduce( Ops.<FloatType>add() ) ), imgOutput, Views.extendBorder( imgInput ) ); // ... or a max filter using hypersphere neighborhoods of radius 5 ... Ops.compute( Ops.regionfilter( new HyperSphereRegion( 2, 5 ), Ops.reduce( Ops.<FloatType>max() ) ), imgOutput, Views.extendBorder( imgInput ) ); ImageJFunctions.show( imgInput, "input" ); ImageJFunctions.show( imgOutput, "filtered" ); } Note that in the RegionFilter op that we construct we lift the reduce operation that is already a lifted pixel operation to obtain a UnaryOp< IterableInterval<T>, IterableInterval<IterableInterval<T> > >. I think that this also shows that the "ROI stuff" we proposed in the Dundee hackathon fits very nicely into this framework (however that ROI stuff is still completely unoptimized, so the examples above are ... slow). Problems ======== Performance: We have seen that we now are at nesting levels where the JIT becomes quite unpredictable. So performance tuning for this stuff is not easy. In Madison, Johannes showed that for special cases we can become very fast using javassist to basically compile it ourselves. Temporaries: One problem that is completely unsolved in this proposal is the creation of temporary results (pixels, images, etc...). I had discussions with Christian Dietz about this, because they really need this in KNIME. I mainly remember, that there are all kinds of problems and we couldn't come up with a satisfying solution. Christian, maybe you can comment a bit, on what the requirements for temporary creation are. Copying Ops: For parallelization it is necessary that we copy Ops (which can be rather complicated nested trees) in order to run on partitions of the input data in parallel. This is not trivial, because the copying mechanism has to decide which temporaries to recreate, reassign ports between copied ops, etc... Samplers everywhere: Ok, so this was the original idea of the design, but sometimes it introduces an additional layer of indirection. Not always though: For instance if you connect accessors to the input and output of an op this actually avoids copying the data. I couldn't see a nice way around it, but maybe someone else does! Ok, if you've read this far, thanks a lot! If you want to take a closer look, it's all in the "roi-experiments" branch of imglib2, in package net.imglib2.ops.expression. Comments and suggestions are very welcome. best regards, Tobias _______________________________________________ ImageJ-devel mailing list ImageJ-devel@imagej.net http://imagej.net/mailman/listinfo/imagej-devel