Alex Levenson created PARQUET-65:
------------------------------------

             Summary: Create a jackson integration module for pojo support
                 Key: PARQUET-65
                 URL: https://issues.apache.org/jira/browse/PARQUET-65
             Project: Parquet
          Issue Type: New Feature
          Components: parquet-mr
            Reporter: Alex Levenson
            Priority: Minor


There's currently a PR for pojo support:
https://github.com/apache/incubator-parquet-mr/pull/21

And it occurred to me that one way we could do this without re-inventing the 
wheel is to use Jackson. Jackson can essentially take a parse tree, whether it 
comes from parsing XML, JSON, or anything else (for example, there's a YAML 
plugin), and then lets you do one of three things with that tree. You can visit 
the nodes in the tree (they call this streaming), you can map the tree onto the 
data structures built into Java (essentially get a Map<Object, Object>), or you 
can map the tree onto a user-defined class. The last option lets you work with 
a well-typed class, and also lets you use Jackson's annotations to control how 
the tree -> POJO mapping works (renaming fields and so on).
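
To make the three options concrete, here is a rough sketch using Jackson 2.x 
(the com.fasterxml.jackson packages). The User class, its field names, and the 
sample JSON string are made up for illustration; they are not taken from the PR.

```java
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Map;

public class JacksonModes {
    // Hypothetical user-defined class; the annotation renames a field.
    public static class User {
        @JsonProperty("user_name")
        public String name;
        public int age;
    }

    // Made-up sample input.
    static final String JSON = "{\"user_name\":\"alice\",\"age\":30}";
    static final ObjectMapper MAPPER = new ObjectMapper();

    // Option 1, streaming: walk the tokens one by one and count field names.
    static int countFields(String json) throws Exception {
        int fields = 0;
        try (JsonParser parser = MAPPER.getFactory().createParser(json)) {
            JsonToken token;
            while ((token = parser.nextToken()) != null) {
                if (token == JsonToken.FIELD_NAME) {
                    fields++;
                }
            }
        }
        return fields;
    }

    // Option 2, built-in data structures: bind to a plain java.util.Map.
    static Map<String, Object> asMap(String json) throws Exception {
        return MAPPER.readValue(json, new TypeReference<Map<String, Object>>() {});
    }

    // Option 3, user-defined class: bind to the annotated POJO.
    static User asPojo(String json) throws Exception {
        return MAPPER.readValue(json, User.class);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countFields(JSON));
        System.out.println(asMap(JSON));
        User user = asPojo(JSON);
        System.out.println(user.name + " " + user.age);
    }
}
```

The third option is the one that matters for POJO support, since the 
annotations do the field renaming without any custom mapping code.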

We could leverage all of that by creating something that goes from Parquet data 
to the Jackson parse tree, and then leave the rest of the work to Jackson. 
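
A minimal sketch of that idea, assuming Jackson 2.x. A plain Map<String, Object> 
stands in for a Parquet record here, because the actual parquet-mr read API is 
not shown in this issue; toTree, the User class, and the sample values are all 
hypothetical. The point is only that once the data is a Jackson tree, 
ObjectMapper.treeToValue does the POJO binding.

```java
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.util.LinkedHashMap;
import java.util.Map;

public class RecordToPojoSketch {
    // Hypothetical target class, with a Jackson annotation renaming a field.
    public static class User {
        @JsonProperty("user_name")
        public String name;
        public int age;
    }

    // Hypothetical converter: a flat record (stand-in for Parquet data)
    // becomes a Jackson ObjectNode. Only a few value types are handled.
    static ObjectNode toTree(Map<String, Object> record) {
        ObjectNode node = JsonNodeFactory.instance.objectNode();
        for (Map.Entry<String, Object> entry : record.entrySet()) {
            String key = entry.getKey();
            Object value = entry.getValue();
            if (value == null) {
                node.putNull(key);
            } else if (value instanceof Integer) {
                node.put(key, ((Integer) value).intValue());
            } else if (value instanceof Long) {
                node.put(key, ((Long) value).longValue());
            } else if (value instanceof String) {
                node.put(key, (String) value);
            } else {
                throw new IllegalArgumentException(
                        "Unsupported type for field " + key + ": " + value.getClass());
            }
        }
        return node;
    }

    // Build the tree, then let Jackson bind it to the POJO.
    static User demo() throws Exception {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("user_name", "alice");
        record.put("age", 30);
        ObjectNode tree = toTree(record);
        return new ObjectMapper().treeToValue(tree, User.class);
    }

    public static void main(String[] args) throws Exception {
        User user = demo();
        System.out.println(user.name + " " + user.age);
    }
}
```

A real module would also need to handle nested groups and repeated fields 
(ObjectNode and ArrayNode children), which this sketch leaves out.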



--
This message was sent by Atlassian JIRA
(v6.2#6252)
