Hi, I need to add col1: Array[String], col2: Array[Int] and col3: Array[Float] to doc values.

col1: Array[String] is a sparse dimension from the OLAP world.
col2, col3: Array[Int] + Array[Float] together represent a sparse vector for a sparse measure from the OLAP world, with the dictionary encoding for col1 mapped to col2.

I have a few options to implement it:

1. Use a SortedSetDocValuesField for each one of them, with String, Int and Float mapped to bytes.
2. Generate a byte array from Array[String], Array[Int] and Array[Float] and save it as a byte blob using BinaryDocValuesField.

I know for sure that Array[Int] and Array[Float] will compress better if I save them using a type-specific encoding, but I am confused about whether to use 1 or 2 to implement the idea. Option 1 has a limit on the number of bytes I can save per value, and I am not sure that pushing a Set to be serialized to disk is a good idea (I am not sure yet whether a Set is actually being serialized to disk; most likely not).

I am open to coming up with a specific encoding for the Array data type that re-uses the String, Int and Float encodings we already have. It would be great if the experts could provide some pointers on using SortedSetDocValues, or on serializing/deserializing with BinaryDocValuesField.

The idea of sparse dimensions and measures comes from Oracle Essbase, and I believe we may bring in tensors as well in the future.

Thanks.
Deb
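
P.S. To make option 2 concrete, here is a minimal sketch of the byte-blob approach in plain Java. It is only an illustration under my own assumptions: the class name SparseRowCodec, the Decoded holder and the layout (count-prefixed UTF-8 strings, then count-prefixed ints and floats, all fixed width) are hypothetical and not an existing Lucene encoding. It uses only java.nio, so it does no compression at all; a real version would presumably swap the fixed-width writes for variable-length or delta encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical codec: packs col1/col2/col3 of one document into a single byte[].
// Layout (big-endian): [n][len, utf8 bytes]*n [m][int]*m [float]*m
final class SparseRowCodec {
    private SparseRowCodec() {}

    static byte[] encode(String[] dims, int[] indices, float[] values) {
        if (indices.length != values.length) {
            throw new IllegalArgumentException("indices and values must have the same length");
        }
        // Pre-encode the strings so the exact buffer size is known up front.
        byte[][] utf8 = new byte[dims.length][];
        int size = Integer.BYTES; // number of strings
        for (int i = 0; i < dims.length; i++) {
            utf8[i] = dims[i].getBytes(StandardCharsets.UTF_8);
            size += Integer.BYTES + utf8[i].length;
        }
        size += Integer.BYTES + indices.length * (Integer.BYTES + Float.BYTES);

        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(dims.length);
        for (byte[] b : utf8) {
            buf.putInt(b.length);
            buf.put(b);
        }
        buf.putInt(indices.length);
        for (int v : indices) buf.putInt(v);
        for (float v : values) buf.putFloat(v);
        return buf.array();
    }

    static Decoded decode(byte[] blob) {
        ByteBuffer buf = ByteBuffer.wrap(blob);
        String[] dims = new String[buf.getInt()];
        for (int i = 0; i < dims.length; i++) {
            byte[] b = new byte[buf.getInt()];
            buf.get(b);
            dims[i] = new String(b, StandardCharsets.UTF_8);
        }
        int m = buf.getInt();
        int[] indices = new int[m];
        for (int i = 0; i < m; i++) indices[i] = buf.getInt();
        float[] values = new float[m];
        for (int i = 0; i < m; i++) values[i] = buf.getFloat();
        return new Decoded(dims, indices, values);
    }

    // Simple holder for the three decoded arrays.
    static final class Decoded {
        final String[] dims;
        final int[] indices;
        final float[] values;

        Decoded(String[] dims, int[] indices, float[] values) {
            this.dims = dims;
            this.indices = indices;
            this.values = values;
        }
    }
}
```

On the Lucene side, I assume the resulting blob would be wrapped in a BytesRef and added to the document with something like new BinaryDocValuesField("sparse_row", new BytesRef(blob)), where "sparse_row" is just a placeholder field name. Unlike a sorted set, this keeps the element order and any duplicates, which the Array[Int]/Array[Float] pairing depends on.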