Hi,

I have a schema for storing linear models from machine learning; the relevant part looks like this:

{
  "type": "record",
  "name": "LinearModel",
  "fields": [
    {"name": "weights", "type": {"type": "array", "items": "double"}}
  ]
}

I understand that the actual serialized form of this should be rather efficient. What worries me is how the Java (specific) API for the weights plays out:

public class LinearModel{...
  public GenericArray<Double> weights;
...}

This means that I have to wrap each and every double in my double[] into a Double object and add it to the GenericArray, right?
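To make the concern concrete, here is a minimal sketch of the copy I think I would end up making. It uses a plain java.util.List<Double> as a stand-in for the generated GenericArray<Double> field; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Avro: shows the boxing copy in isolation.
final class BoxingCopy {

    // Copies a primitive double[] into a list of boxed Doubles.
    // The List<Double> stands in for the GenericArray<Double> field (assumption).
    static List<Double> box(double[] weights) {
        List<Double> boxed = new ArrayList<>(weights.length);
        for (double w : weights) {
            // Autoboxing allocates one Double object per element, so the copy
            // costs an object header and a reference on top of each 8-byte value.
            boxed.add(w);
        }
        return boxed;
    }
}
```

While the boxed copy is being built, the original double[] is still live, so both have to fit in memory at the same time.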

The trouble is that the double[] I intend to store may very well be sized to max out the available memory of the machine, so I don't really have room for a more-than-life-size copy of the data.

Is there a way to "stream" the doubles into the output without holding a copy in memory? Or is there another way to encode a double[] in a schema?
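To illustrate what I mean by "streaming", here is a rough sketch that writes the array straight from the double[] by hand. It follows my reading of the Avro binary encoding: an array is a series of blocks, each a zigzag variable-length count followed by that many items, ended by a zero count, and a double is 8 bytes, little-endian. The class name, the method names and the blockSize parameter are all made up for illustration; none of this is Avro's own API, and I don't know whether the specific API offers a hook for it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical hand-rolled writer, not part of Avro.
final class StreamDoubles {

    // Writes a long as a zigzag-encoded variable-length integer.
    static void writeLong(OutputStream out, long n) throws IOException {
        long z = (n << 1) ^ (n >> 63);          // zigzag: small magnitudes stay small
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // 7 payload bits, continuation bit set
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Writes a double as its 8-byte IEEE 754 bit pattern, least significant byte first.
    static void writeDouble(OutputStream out, double d) throws IOException {
        long bits = Double.doubleToLongBits(d);
        for (int i = 0; i < 8; i++) {
            out.write((int) (bits >>> (8 * i)) & 0xFF);
        }
    }

    // Writes the array block by block, reading directly from the primitive array.
    // No Double objects are allocated and no second copy of the data is made.
    static void writeDoubleArray(OutputStream out, double[] values, int blockSize)
            throws IOException {
        for (int start = 0; start < values.length; start += blockSize) {
            int count = Math.min(blockSize, values.length - start);
            writeLong(out, count);
            for (int i = start; i < start + count; i++) {
                writeDouble(out, values[i]);
            }
        }
        writeLong(out, 0); // a zero count terminates the array
    }

    // Convenience wrapper for small inputs: encodes into an in-memory buffer.
    static byte[] encode(double[] values, int blockSize) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try {
            writeDoubleArray(buf, values, blockSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }
}
```

In real use the OutputStream would be a buffered file or socket stream rather than the in-memory buffer in encode, so the only extra memory needed is the stream's buffer.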

Thanks for any pointers,

Markus
