[GitHub] orc pull request #308: Deliver a lower-case schema to OrcFile

ddrinka Thu, 13 Sep 2018 16:11:30 -0700

GitHub user ddrinka opened a pull request:

    https://github.com/apache/orc/pull/308


    Deliver a lower-case schema to OrcFile

    Mixed-case struct field names don't work in Hive.  There should be a way
    to convert a camel-cased JSON document into ORC without having to
    pre-process the JSON.

    This pull request is a proof of concept that generates two schemas: one
    in the original case, which is provided to the JsonReader as usual, and
    a lower-cased one, which is provided to OrcFile.

    TypeDescription is immutable and non-trivial to clone manually through
    its public accessors, so to make the idea clear, I do the conversion at
    schema ingest rather than at the point where the schema is provided to
    OrcFile.  The downside of this approach is that automatic schema
    detection doesn't benefit from these changes.  A more experienced
    implementer could certainly do better.
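
    The two-schema idea described above can be sketched as follows. This is
    not the code from the pull request: it is a minimal illustration that
    works on the schema string alone, using only the Java standard library.
    The class name SchemaCase and the method lowerCaseFieldNames are
    hypothetical, and the sketch assumes unquoted field names written
    directly before the ':' with no whitespace.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ORC: derives a lower-cased copy of a
// schema string while leaving the original untouched.
final class SchemaCase {
    // An identifier immediately followed by ':' is treated as a struct field name.
    // Assumption: field names are unquoted and have no whitespace before the ':'.
    private static final Pattern FIELD_NAME =
        Pattern.compile("([A-Za-z_][A-Za-z0-9_]*)(?=:)");

    private SchemaCase() {}

    // Returns the schema string with every struct field name lower-cased.
    static String lowerCaseFieldNames(String schema) {
        Matcher m = FIELD_NAME.matcher(schema);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String lowered = m.group(1).toLowerCase(Locale.ROOT);
            m.appendReplacement(out, Matcher.quoteReplacement(lowered));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

    Under these assumptions, the original string would still be parsed for
    the JSON reader, and the result of lowerCaseFieldNames would be parsed
    separately (for example with TypeDescription.fromString) for the writer.
    As with the approach in the pull request, a schema that is detected
    automatically rather than supplied as a string is not covered.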

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ddrinka/orc ddrinka-pr-lowercase-schema

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/orc/pull/308.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #308
    
----
commit cc7e909725d059b69f9a8c384aca2691b52ce0ff
Author: Douglas Drinka <ddrinka@...>
Date:   2018-09-13T22:59:11Z

    Deliver a lower-case schema to OrcFile

----

