Hey all, while looking at the Drill options, and specifically while trying to understand the Parquet options, I realized that the naming of the options raised some questions for me. What do I mean? Consider:
+--------------------------------------------+
|                    name                    |
+--------------------------------------------+
| store.parquet.block-size                   |
| store.parquet.compression                  |
| store.parquet.dictionary.page-size         |
| store.parquet.enable_dictionary_encoding   |
| store.parquet.page-size                    |
| store.parquet.use_new_reader               |
| store.parquet.vector_fill_check_threshold  |
| store.parquet.vector_fill_threshold        |
+--------------------------------------------+

I will drop the "store.parquet" prefix as I refer to them here.

use_new_reader: this seems fairly obviously an "on read" option, and (maybe?) does not affect the Parquet writer. Yet "enable_dictionary_encoding" is likely ONLY an "on write" option... correct? I mean, if the Parquet file was written somewhere else, and written with dictionary encoding, Drill will still read it OK regardless of this setting. The same goes for compression: if the Parquet file was created with gzip and this setting is snappy, Drill will still read it. The same goes for block size. Thus, those seem to be "writer" settings rather than reader settings.

So what about the vector settings? Write or read (or both)?

For JSON there is this setting:

| store.json.writer.uglify

Its name makes it obvious that it applies to the writer. For the other settings, though, knowing what a setting applies to (on write, on read, neither, or both) could be very useful for troubleshooting and for knowing which settings to play with.

Now, changing these settings as they exist today is not something I would recommend. Even in my test clusters I have scripts that alter them for specific ETLs, and I would hate to have things break. But how hard would it be to add a string column to sys.options, something like "applies_to", with write, read, both, neither, and n/a as the possible values? I think this could be valuable for users and administrators of Drill.

One other note: in addition to applies_to, would it be horrifically difficult to add a "description" field for options? Self-documenting settings sure would be handy.... :)

John
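
P.S. To make the "applies_to" idea a bit more concrete, here is a rough Python sketch of what such a column might let a script do. Everything in it is hypothetical: sys.options has no applies_to column today, the table and function names are made up for illustration, and the values are only my guesses from above, not confirmed Drill behavior. "unknown" is just a placeholder for the settings I am unsure about.

```python
# Hypothetical sketch only: sys.options has no "applies_to" column today.
# The values are guesses taken from this email, not confirmed Drill behavior.
PROPOSED_APPLIES_TO = {
    "store.parquet.use_new_reader": "read",
    "store.parquet.enable_dictionary_encoding": "write",
    "store.parquet.compression": "write",
    "store.parquet.block-size": "write",
    "store.parquet.vector_fill_threshold": "unknown",
    "store.parquet.vector_fill_check_threshold": "unknown",
    "store.json.writer.uglify": "write",
}


def options_for(phase, table=PROPOSED_APPLIES_TO):
    """Return the option names tagged for a phase ("read" or "write").

    Options tagged "both" match either phase; everything else is skipped.
    """
    return sorted(
        name for name, applies in table.items() if applies in (phase, "both")
    )


if __name__ == "__main__":
    print("read: ", options_for("read"))
    print("write:", options_for("write"))
```

With something like this available, an ETL script that only writes Parquet could look up just the "write" settings and leave everything else alone.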
