Re: [Numbers] Arrays of "Complex" objects and RAM

Gilles Sadowski Thu, 07 Nov 2019 06:09:34 -0800

> >>>> [...]
> >>>
> >>> public class ComplexArray {
> >>>     private final MultidimensionalCounter counter;
> >>>     // ...
> >>>
> >>>     public ComplexArray(int... dimensions) {
> >>>         counter = new MultidimensionalCounter(dimensions);
> >>>         data = new data[counter.getSize()];
> >>>     }
> >>>
> >>>     /**
> >>>      * @param indices Storage location.
> >>>      * @return the complex number stored at the given location.
> >>>      */
> >>>     public Complex get(int... indices) {
> >>>         return data[counter.getCount(indices)];
> >>>     }
> >>> }
> >>>
> >>> Or perhaps the data cube should be an additional abstraction on top of
> >>> "ComplexArray" (?).
> >> I think adding the abstraction on top is easier.
> >>
> >> Array can be renamed to Collection/List? Thus the count is the only
> >> thing that matters, and optionally a constant time access get(int index)
> >> method.
> >
> > +1
> >
> >>
> >> This can later be reshaped to a Matrix representation using the
> >> MultidimensionalCounter pattern.
> >>
> >> We have two use cases:
> >>
> >> 1. You know the final count of Complex objects
> >> 2. You do not know the count of Complex objects
> >>
> >> Use case 2 is used in streams where the ComplexList must be expandable
> >> for use as a collector. This can be satisfied with a constructor with an
> >> initial capacity. The ComplexList would thus offer a specialisation of
> >> ArrayList<Complex> by storing the Complex objects using primitive array(s).
> >>
> >> Use case 2 rules out the possibility of immutability. This is the type
> >> of functionality under discussion and could be served using an interface
> >> to allow different implementations:
> >>
> >> public interface ComplexList {
> >> [...]
> >> }
> >
> > +1
> >
> >> This could be separated to put the set and add methods in a
> >> sub-interface allowing the top level interface to be an immutable object.
> >
> > Perhaps better to follow the JDK convention and provide a
> >    public static ComplexList unmodifiableList(ComplexList delegate)
> > method.
> >
> >> However the apply functions currently are specified using Complex.
> >> Efficiency would be gained using primitive specialisations for operating
> >> on 1 or 2 complex numbers and outputting the results to a provided 
> >> consumer:
> >>
> >>     public interface ComplexProvider {
> >>         /**
> >>          * Gets the real component of the complex number.
> >>          *
> >>          * @return the real component
> >>          */
> >>         double getReal();
> >>
> >>         /**
> >>          * Gets the imaginary component of the complex number.
> >>          *
> >>          * @return the imaginary component
> >>          */
> >>         double getImaginary();
> >>     }
> >>
> >>     public interfaceComplexProvider {
> >>         /**
> >>          * Accept the components of the complex number.
> >>          * Optionally return a new ComplexProvider with the components.
> >>          *
> >>          * @param real the real
> >>          * @param imaginary the imaginary
> >>          * @return the complex provider (or null)
> >>          */
> >>         ComplexProvider accept(double real, double imaginary);
> >>     }
> >
> > What is gained wrt using "Complex" instances directly?
>
> The above is a typo. The accept method should be in a ComplexConsumer 
> interface.
>
> If the data is in a primitive array you would like to be able to access it, 
> and then set it back to the same array or a different array without going 
> through creation of a Complex for each (real, imaginary) pair. So the 
> ComplexProvider and ComplexConsumer provide an input and output for any 
> function that changes the complex data. I’ve made it a bit more complicated 
> with the ComplexConsumer returning a ComplexProvider just to allow all the 
> methods in Complex to be rewritten to provide static methods that can use 
> these interfaces.


I can see that one pass through the primitive array could yield the "real"
and "imaginary" arguments to "accept"; but it must return an instance of
a class (that implements "ComplexProvider"); hence calling a constructor.
In a chain of operations, only the first call to the constructor is avoided.

>
> A simpler approach is for the ComplexList to allow modification in-place by 
> providing a list cursor that can traverse the list (like an iterator) and 
> also skip to positions. It would allow get and set of the data at the current 
> position. This cursor could be used by another class to apply functions in 
> place to the data. The functions could be a user specified generic function 
> using the ComplexFunction interface I suggested.

Still, I don't see what is the gain wrt to creating a result "ComplexList" and
let the caller decide whether to replace the input data with the result:
    ComplexList in = ...
    ComplexList out = in.apply(function);
    in = out;

> I think that performance would have to be assessed for various prototypes. 
> Some of the methods in Complex have a lot of work. So the creation of a 
> Complex for each position may not be a big overhead

I'd be inclined to think so.

> allowing a stream approach to build a custom process pipeline which is then 
> collected into a new list at the end. For operations that are fast such as 
> negate, add, subtract, conjugate these would be easy to code into the 
> ComplexList to operate directly on the data array.

At the cost of code duplication, and once allowed for some operations, there
is no stopping of requests for others...

> >
> >> [...]
> >>
> >> Unfortunately processing the stream as encountered may result in out of
> >> order elements if parallelisation occurs so the results may not be
> >> collected back in the same order.
> >
> > IIUC, this would be a non-starter for the data cube abstraction.
> >
> >> Here there are solutions to different problems. So which are we trying
> >> to solve:
> >>
> >> 1. Efficient storage of large amounts of complex numbers.
> >>
> >> 2. Efficient computation on large amounts of complex numbers.
> >
> > Those two IMO.
> >
> >>
> >> 3. Custom computation on complex numbers.
> >
> > Not sure what you mean.
>
> Passing a function that will accept the real and imaginary components, do a 
> lot of work and send it to an output consumer.

So this falls in the category where one additional intermediate creation
of an instance would not matter too much.

> >
> >>
> >> 4. Computation on large amounts of complex numbers in-place.
> >
> > This would assume very, very, large amounts of data.
> > For applications that would manage with data that fill at most half the
> > available memory, this could be emulated with
> >   ComplexList data = ...
> >   data = function.apply(data);
> >
> > And, with a "data cube" abstraction with several underlying blocks
> > of data, the amount of needed "spare" RAM could be reduced at will:
> > Instead of storing one block of 1000 x 1000, one could store 4 of
> > 500 x 500; process them sequentially would require four times less
> > spare RAM.
> >
> >> For 1 we can use the abstraction to a collection (e.g. ComplexList).
> >>
> >> For 2 we can duplicate all methods in Complex to work directly on the
> >> underlying data.
> >>
> >> For 2/3 we can rewrite computation of Complex to work on input providers
> >> and output consumers and allow functions to be applied to the collection.
> >>
> >> For 4 we require different mutability of the collection.
> >
> > I don't think that immutability is a very useful requirement when handling
> > very large amounts of data.  E.g. the data cube should probably allow this
> > kinf of usage:
> >   ComplexCube c = ...
> >   c.transformInPlace(function);
> >
> >> I will continue to think about this as the solution to all issues is not
> >> straightforward.
> >
> > I didn't want to push this too far.  The sole use-case I was aware of is an
> > "OutOfMemoryError" due to storing "Complex" instances in an array of
> > arrays of arrays of...
> >
> > I think that an interesting example application would be how to compute
> > the FFT of the stored data (e.g. using "JTransforms”).
>
> JTransforms just creates FFT objects and then works on provided arrays for 
> forward and reverse transforms. Here’s some code for a float forward 
> transform of some image pixels. Here it is square (more efficient) but it 
> doesn’t have to be.
>
>       // In reality pixels is actually an image
>       float[] pixels = new float[size * size];
>
>       // FFT data is twice the size
>       final float[] data = new float[size * size * 2];
>       final FloatFFT_2D fft = new FloatFFT_2D(size, size);
>
>       System.arraycopy(pixels, 0, data, 0, pixels.length);
>       fft.realForwardFull(data);
>
> So to use JTransforms the data has to be packed in the correct shape. It uses 
> alternating real and imaginary components of successive numbers. It would 
> mean that the ComplexList should have constructor that can wrap an existing 
> complex array. You can then pass it the data output from JTransforms.

So IIUC, there is nothing that the ComplexList abstraction can bring to
simplify that use-case, e.g. like
  ComplexList in = ...
  // Hypothetical function object.
  ComplexFunction fft = new JTransformsComplexFunction(in.size());
  ComplexList out = in.apply(fft);

> > Another nice example would be fractal computations.
>
> Not familiar with this.
>
> My use case is two complex arrays created from FFT of real data which are 
> used to compute a conjugate multiplication and the absolute magnitude of each 
> array. This is used to compute a cross correlation.

This is also what started this thread: The user called the Commons Math's
FFT utilities using arrays of "Complex" objects and got the "OutOfMemory"
error.  Hence the question of whether storing many "Complex" objects is
ever useful, as compared to the "ComplexList", backed with an array of
primitives, and instantiating transient "Complex" instances on-demand.

Regards,
Gilles

> >>> [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Re: [Numbers] Arrays of "Complex" objects and RAM

Reply via email to