Brad Kish created ORC-247:
-----------------------------
Summary: Need a way for ConvertTool to rename fields on conversion to Orc
Key: ORC-247
URL: https://issues.apache.org/jira/browse/ORC-247
Project: ORC
Issue Type: Improvement
Components: tools
Affects Versions: 1.4.0
Reporter: Brad Kish
Attachments: ConvertTool.java
We are trying to convert a large store of JSON data to ORC format using
ConvertTool, so that it can be queried with Athena, Presto, and other engines.
However, some of the existing JSON field names are not compatible with the
query engines that we want to use. We need to rename the fields during the
conversion process.
We have hacked together a workaround in which the tool takes two schemas: a
source schema and a target schema. The two schemas must describe the same
fields, but the field names can differ. I am not sure that this is a
reasonable solution, but it is one possibility.
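To make the two-schema idea concrete, here is a minimal sketch of how a source schema and a target schema could be paired by position to produce a rename map. It is not the attached ConvertTool.java change and does not use the ORC API. The class FieldRenameSketch, its Field type, and the example field names are all hypothetical, and it assumes flat (non-nested) schemas.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: pairs two flat schemas by position.
final class FieldRenameSketch {

    /** One flat field: a name plus a type name such as "bigint". */
    static final class Field {
        final String name;
        final String type;

        Field(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    /**
     * Builds a source-name to target-name map. The schemas must have the
     * same number of fields, and the types at each position must match.
     */
    static Map<String, String> buildRenameMap(List<Field> source, List<Field> target) {
        if (source.size() != target.size()) {
            throw new IllegalArgumentException("schemas have different field counts");
        }
        Map<String, String> renames = new LinkedHashMap<>();
        for (int i = 0; i < source.size(); i++) {
            Field s = source.get(i);
            Field t = target.get(i);
            if (!s.type.equals(t.type)) {
                throw new IllegalArgumentException(
                    "type mismatch at position " + i + ": " + s.type + " vs " + t.type);
            }
            if (renames.put(s.name, t.name) != null) {
                throw new IllegalArgumentException("duplicate source field: " + s.name);
            }
        }
        return renames;
    }

    /** Copies a parsed JSON row into a new row keyed by the target names. */
    static Map<String, Object> renameRow(Map<String, Object> row, Map<String, String> renames) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : renames.entrySet()) {
            // Fields missing from the source row come through as null.
            out.put(e.getValue(), row.get(e.getKey()));
        }
        return out;
    }
}
```

Pairing by position keeps the rule "same fields, different names" easy to check, since only the types have to agree at each index. Nested structs would need the same comparison applied recursively.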
I have attached our changes to ConvertTool. They do not do any error checking;
this was more of a proof of concept.