es1220 opened a new pull request #7282: Enhance orc-extensions - use orc file schema
URL: https://github.com/apache/incubator-druid/pull/7282

`orc-extensions` relies on a custom struct `typeString`, which is either configured by the user or generated automatically by the Druid parser. The `typeString` is fragile and error-prone: it is easy to get the column order or a column type wrong.

To avoid this, I created **`DruidOrcNewInputFormat`** and the **`druid_orc`** parser type. Now, by changing only the `inputFormat` and the parser `type`, you can ingest ORC files as easily as with `parquet-extensions`, without any `typeString` errors.

- `DruidOrcNewInputFormat`
  - holds an `OrcNewInputFormat`
  - creates a `DruidOrcRecordReader` and stores the file schema
- `DruidOrcRecordReader`
  - converts `OrcStruct` to `Map<String, Object>` using the stored file schema. (This logic was moved from the existing `OrcHadoopInputRowParser`.)
- `DruidOrcHadoopInputRowParser`
  - converts the `Map` to a `MapBasedInputRow`.
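
The schema-driven conversion described for `DruidOrcRecordReader` can be illustrated with a minimal sketch in plain Java. It does not use the ORC or Druid APIs and is not the code from this PR: `SchemaRowSketch`, `toMap`, and the sample field names are hypothetical, an ordered list of field names stands in for the stored file schema, and an `Object[]` stands in for an `OrcStruct`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of orc-extensions.
final class SchemaRowSketch {

    // Pairs each positional value with the field name taken from the schema,
    // so the caller never has to spell out column order or types by hand.
    static Map<String, Object> toMap(List<String> schemaFieldNames, Object[] structValues) {
        if (schemaFieldNames.size() != structValues.length) {
            throw new IllegalArgumentException(
                "schema has " + schemaFieldNames.size()
                + " fields but the row has " + structValues.length + " values");
        }
        // LinkedHashMap keeps the columns in schema order.
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < structValues.length; i++) {
            row.put(schemaFieldNames.get(i), structValues[i]);
        }
        return row;
    }

    public static void main(String[] args) {
        // Assumed sample schema and row, for demonstration.
        List<String> schema = Arrays.asList("timestamp", "page", "added");
        Object[] struct = {"2019-03-18T00:00:00Z", "Main_Page", 42L};
        System.out.println(toMap(schema, struct));
    }
}
```

Because the field names come from the file itself, a mismatch between the declared and actual column order cannot occur in this arrangement, which is the class of `typeString` error the PR aims to eliminate.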
