Brad Kish created ORC-247:
-----------------------------

             Summary: Need a way for ConvertTool to rename fields on conversion to Orc
                 Key: ORC-247
                 URL: https://issues.apache.org/jira/browse/ORC-247
             Project: ORC
          Issue Type: Improvement
          Components: tools
    Affects Versions: 1.4.0
            Reporter: Brad Kish
         Attachments: ConvertTool.java

We are trying to convert a large store of JSON data to ORC format using 
ConvertTool, so that it can be queried with Athena/Presto and other engines.

However, some of the existing JSON field names are not compatible with the 
query engines that we want to use. We need to rename the fields during the 
conversion process.

We have hacked a workaround, so that the tool takes two schemas: a source 
schema and a target schema. The two schemas must describe the same fields, but 
they can have different names for the fields. Not sure that this is a 
reasonable solution, but it is one possibility.
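
To make the two-schema idea concrete, here is a minimal sketch in plain Java. 
It is not the attached ConvertTool change and does not use the ORC API. It 
assumes the source and target schemas list the same fields in the same order, 
so names can be paired by position. The class and method names 
(FieldRenameSketch, buildRenameMap, renameFields) and the sample field names 
are hypothetical, and a JSON record is modeled as a simple Map.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FieldRenameSketch {

    /**
     * Pairs source and target field names by position.
     * Assumption: both schemas describe the same fields in the same order.
     */
    static Map<String, String> buildRenameMap(List<String> sourceNames,
                                              List<String> targetNames) {
        if (sourceNames.size() != targetNames.size()) {
            throw new IllegalArgumentException(
                "Source and target schemas must have the same number of fields");
        }
        Map<String, String> renames = new LinkedHashMap<>();
        Set<String> seenTargets = new HashSet<>();
        for (int i = 0; i < sourceNames.size(); i++) {
            String source = sourceNames.get(i);
            String target = targetNames.get(i);
            if (renames.containsKey(source)) {
                throw new IllegalArgumentException("Duplicate source field: " + source);
            }
            if (!seenTargets.add(target)) {
                throw new IllegalArgumentException("Duplicate target field: " + target);
            }
            renames.put(source, target);
        }
        return renames;
    }

    /**
     * Returns a copy of the record keyed by target names.
     * Fields that are absent from the source schema are dropped in this sketch.
     */
    static Map<String, Object> renameFields(Map<String, Object> record,
                                            Map<String, String> renames) {
        Map<String, Object> renamed = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : record.entrySet()) {
            String target = renames.get(entry.getKey());
            if (target != null) {
                renamed.put(target, entry.getValue());
            }
        }
        return renamed;
    }

    public static void main(String[] args) {
        // Hypothetical field names that a query engine might reject, and their replacements.
        Map<String, String> renames = buildRenameMap(
            Arrays.asList("user-id", "event.time"),
            Arrays.asList("user_id", "event_time"));

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("user-id", 42);
        record.put("event.time", "2017-09-01T00:00:00Z");

        System.out.println(renameFields(record, renames));
    }
}
```

Pairing by position keeps the mapping implicit in the two schemas, at the cost 
of silently mislabeling data if the field orders ever diverge. An explicit 
source-to-target name list would be an alternative.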

I have attached our changes to ConvertTool. They do not do any error checking; 
this was more of a proof of concept.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
