GitHub user tkcode123 opened a pull request:
https://github.com/apache/orc/pull/217
Provide additional constructor to JsonReader (java orc tools)
Provide an additional constructor for JsonReader so that embedding code can
use its own JsonParser implementation. It is intended for plugging in a parser
that transforms JSON while reading (flattening nested structs, renaming, and
filtering).
Rationale: Our application often gets JSON files that have deeply nested
arrays with structs where the innermost elements are generic like
.
I would like to be able to move the value element into separate, correctly
typed elements that hold bigints, doubles, strings, booleans, etc., so that
compression and value handling are improved. The plan is to leverage JOLT
(https://github.com/bazaarvoice/jolt) for this: read the original files,
transform them in memory into JSON objects of the target shape, and then
create ORC files from that representation.
Adding just one more constructor would allow us to implement such a
transformation step.
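To illustrate the kind of transformation meant here, the following is a minimal
sketch only, not the code from this pull request. It assumes records are plain
Java maps and lists, that the generic element is stored under a key named
"value", and that target names such as "long_value" are acceptable. The class
name TypedValueIterator and all field names are hypothetical. It uses hand-written
retyping rather than JOLT, and it does not show the new JsonReader constructor,
whose signature is not given in this message.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch: wraps a source of parsed records and moves every
 * generic "value" field into a field whose name reflects its runtime type.
 */
final class TypedValueIterator implements Iterator<Object> {
    private final Iterator<Object> source;

    TypedValueIterator(Iterator<Object> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        return source.hasNext();
    }

    @Override
    public Object next() {
        // Transform each record lazily, as it is read.
        return transform(source.next());
    }

    /** Recursively walks maps and lists, retyping scalar "value" fields. */
    static Object transform(Object node) {
        if (node instanceof Map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                String key = String.valueOf(e.getKey());
                Object v = e.getValue();
                if (!"value".equals(key) || v == null
                        || v instanceof Map || v instanceof List) {
                    // Not a scalar "value" field: keep the key, recurse.
                    out.put(key, transform(v));
                } else if (v instanceof Integer || v instanceof Long) {
                    out.put("long_value", ((Number) v).longValue());
                } else if (v instanceof Number) {
                    out.put("double_value", ((Number) v).doubleValue());
                } else if (v instanceof Boolean) {
                    out.put("boolean_value", v);
                } else {
                    out.put("string_value", v.toString());
                }
            }
            return out;
        }
        if (node instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) node) {
                out.add(transform(item));
            }
            return out;
        }
        return node; // scalars outside a "value" field are left unchanged
    }
}
```

In the setup proposed above, an iterator of this kind would presumably yield the
JSON library's own element type instead of maps, and would be handed to the new
constructor in place of the default parser.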
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tkcode123/orc master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/orc/pull/217.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #217
commit f8134e167718035eea0b3a1796162c74a667adf0
Author: Thomas Krüger
Date: 2018-02-11T22:33:49Z
Provide additional constructor to JsonReader so that embedding code can
use its own JsonParser implementation. Intended to plug in a parser
that transforms JSON while reading (flattening nested structs, renaming
and filtering capabilities).
---