This is a fabulous idea! We do this in Lucene, "manually", in the indexer, where we need a simple struct to hold details for each unique term we've seen. We maintain our own (1D) parallel arrays for this...
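
To make "parallel arrays" concrete, here is a minimal sketch of the pattern, not Lucene's actual indexer code. The class name `TermTable` and the two per-term fields (`docFreq`, `postingsStart`) are hypothetical, chosen only for illustration. Each logical "struct" is just an int id that indexes every array:

```java
import java.util.Arrays;

/** Hypothetical per-term record store backed by parallel 1D arrays (illustration only). */
final class TermTable {
    private int[] docFreq;        // field 1 of every record
    private long[] postingsStart; // field 2 of every record
    private int size;             // number of records in use

    TermTable(int initialCapacity) {
        int cap = Math.max(1, initialCapacity);
        docFreq = new int[cap];
        postingsStart = new long[cap];
    }

    /** Appends a record and returns its id, the shared index into all arrays. */
    int add(int df, long start) {
        if (size == docFreq.length) {
            // Grow all arrays together so that they stay parallel.
            int newCap = size * 2;
            docFreq = Arrays.copyOf(docFreq, newCap);
            postingsStart = Arrays.copyOf(postingsStart, newCap);
        }
        docFreq[size] = df;
        postingsStart[size] = start;
        return size++;
    }

    int docFreq(int id) { return docFreq[id]; }
    long postingsStart(int id) { return postingsStart[id]; }
    int size() { return size; }

    public static void main(String[] args) {
        TermTable terms = new TermTable(1);
        terms.add(3, 0L);
        int id = terms.add(7, 128L); // forces one resize
        System.out.println(terms.docFreq(id) + " " + terms.postingsStart(id));
    }
}
```

The bookkeeping (growing every array in step, one accessor per field) is exactly the boilerplate that a generated stub class would take off our hands.
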
Mike

http://blog.mikemccandless.com

On Fri, Mar 25, 2011 at 10:36 AM, Dawid Weiss <[email protected]> wrote:
>
> Hi guys,
>
> This is not directly related to Mahout, but since most of you deal with
> computations, I think it is relevant, and I seek feedback and improvement
> ideas.
>
> If you've ever had to create a large array (or worse: a multidimensional
> array) of a relatively simple structure-like data holder class, then you
> probably know the pain of initializing sub-arrays and the memory overhead
> that jagged arrays incur. The idea of generating stub code to handle such
> cases has been in my head for a long time, but I finally managed to find
> some time and implement it. I really like the results so far; especially in
> the multidimensional case, the code is so much nicer. Even if you have a
> relatively simple array of byte[][], you can do this:
>
> @Struct(dimensions = 2)
> public final class Byte {
>   public byte value;
> }
>
> This will generate a stub class ByteArray2D (if javac has access to
> apt-processor in hppc-struct, that is; or if your maven project is
> configured properly, see hppc-examples for how to do this) with a single
> byte[] field and flattened Byte objects, including accessors to individual
> fields and valuetype-copying methods for handling entire structures. More is
> here:
>
> http://issues.carrot2.org/browse/HPPC-54
>
> And a trivial sample here:
>
> https://github.com/carrotsearch/hppc/blob/master/hppc-examples/src/main/java/com/carrotsearch/hppc/examples/StructExample.java
>
> Again, if you have any ideas or improvements, they are most welcome (use the
> JIRA issue above for comments or fork the code on github).
>
> Dawid
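
For reference, the flattening described in the quoted message can be sketched by hand as below. This is not the code that hppc-struct generates; the class name `FlatByteGrid` and its `get`/`set` methods are hypothetical, and the row-major layout is an assumption. It only shows how a single `byte[]` can stand in for a `byte[][]`, avoiding one array object per row:

```java
/** Hand-written illustration of a 2D byte array flattened into one byte[] (not generated code). */
final class FlatByteGrid {
    private final byte[] data; // row-major: element (i, j) lives at i * cols + j
    private final int rows;
    private final int cols;

    FlatByteGrid(int rows, int cols) {
        if (rows < 0 || cols < 0) {
            throw new IllegalArgumentException("negative dimension");
        }
        this.rows = rows;
        this.cols = cols;
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        this.data = new byte[Math.multiplyExact(rows, cols)];
    }

    /** Maps a 2D coordinate to its offset in the flat array, with bounds checks. */
    private int index(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ")");
        }
        return i * cols + j;
    }

    byte get(int i, int j) { return data[index(i, j)]; }

    void set(int i, int j, byte value) { data[index(i, j)] = value; }

    public static void main(String[] args) {
        FlatByteGrid grid = new FlatByteGrid(3, 4);
        grid.set(2, 1, (byte) 42);
        System.out.println(grid.get(2, 1));
    }
}
```

A jagged `byte[3][4]` allocates four array objects (one outer, three inner); the flat version allocates one, and the saving grows with the number of rows.
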
