paul-rogers commented on issue #1870: DRILL-7359: Add support for DICT type in RowSet Framework
URL: https://github.com/apache/drill/pull/1870#issuecomment-541440586

Let's continue to assume that `DICT` is, essentially, `DICT<KEY_TYPE, VALUE_TYPE>` and that we can think of the `DICT`, when writing, as a pair of arrays: one for keys, one for values. (Because vectors are write-once, we have to add (key, value) pairs one by one.)

If so, then we need a new form of writer, a `DictWriter`, that has semantics such as:

```
ObjectWriter key();
ObjectWriter value();
void save();
```

The `key()` method returns an object writer that gives us access to the key writer. If we restrict keys to be scalars, then we can just do:

```
ScalarWriter key();
```

Values can, I imagine, be of any type. So, the `ObjectWriter` lets us work with them generically.

Suppose we had a `DICT<VARCHAR,DOUBLE>`. We could do:

```
DictWriter dictWriter = rowWriter.dict("myDict");
dictWriter.key().setString("fred");
dictWriter.value().scalar().setDouble(123.45);
dictWriter.save();
dictWriter.key().setString("barney");
dictWriter.value().scalar().setDouble(98.76);
dictWriter.save();
```

This means that we need to add a new method to `ObjectWriter`, `dict()`, which will return a `DictWriter`, and we need to define the `DictWriter` interface.

There is quite a bit of commentary in the column accessor package that tries to explain the structure behind these writers. Perhaps that might help explain the ideas here.
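To make the "pair of arrays plus `save()`" idea concrete, here is a minimal, self-contained sketch of the semantics proposed above. It is a toy in-memory model, not Drill's implementation: the `ScalarWriter`, `ObjectWriter`, and `DictWriter` interfaces below are simplified stand-ins declared locally, and `PendingScalar` and `InMemoryDictWriter` are hypothetical names invented for illustration. The two `ArrayList`s stand in for the key and value vectors.

```java
import java.util.ArrayList;
import java.util.List;

public class DictWriterSketch {

  // Simplified stand-ins for the writer interfaces discussed above.
  // These are assumptions for illustration, not Drill's real interfaces.
  interface ScalarWriter {
    void setString(String value);
    void setDouble(double value);
  }

  interface ObjectWriter {
    ScalarWriter scalar();
  }

  interface DictWriter {
    ScalarWriter key();      // keys restricted to scalars
    ObjectWriter value();    // values may be of any type
    void save();             // commit the current (key, value) pair
  }

  /** Holds one pending value until save() commits it (hypothetical helper). */
  static class PendingScalar implements ScalarWriter, ObjectWriter {
    Object pending;

    @Override public void setString(String value) { pending = value; }
    @Override public void setDouble(double value) { pending = value; }
    @Override public ScalarWriter scalar() { return this; }
  }

  /** Toy dict writer: two parallel lists play the role of the two vectors. */
  static class InMemoryDictWriter implements DictWriter {
    final List<Object> keys = new ArrayList<>();
    final List<Object> values = new ArrayList<>();
    private final PendingScalar keyWriter = new PendingScalar();
    private final PendingScalar valueWriter = new PendingScalar();

    @Override public ScalarWriter key() { return keyWriter; }
    @Override public ObjectWriter value() { return valueWriter; }

    @Override public void save() {
      if (keyWriter.pending == null) {
        throw new IllegalStateException("key must be set before save()");
      }
      // Append-only: each pair is added once, in order, and never rewritten.
      keys.add(keyWriter.pending);
      values.add(valueWriter.pending);
      keyWriter.pending = null;
      valueWriter.pending = null;
    }
  }

  public static void main(String[] args) {
    InMemoryDictWriter dictWriter = new InMemoryDictWriter();
    dictWriter.key().setString("fred");
    dictWriter.value().scalar().setDouble(123.45);
    dictWriter.save();
    dictWriter.key().setString("barney");
    dictWriter.value().scalar().setDouble(98.76);
    dictWriter.save();
    System.out.println(dictWriter.keys + " -> " + dictWriter.values);
  }
}
```

The point of the sketch is only that `key()` and `value()` stage a single entry and `save()` appends it to both arrays together, so the two stay the same length. A real writer would write into vectors directly rather than staging boxed objects.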