[
https://issues.apache.org/jira/browse/PARQUET-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16816053#comment-16816053
]
Victor commented on PARQUET-65:
-------------------------------
Is this still being considered?
It would be great to be able to generate a Parquet schema from a POJO (in the
same way as
https://github.com/FasterXML/jackson-dataformats-binary/tree/master/avro#generating-avro-schema-from-pojo-definition)
and then write it to a Parquet file, but without the overhead of going
through Avro (which means serializing to bytes, then reading them back as an
Avro generic record before using the AvroParquetWriter, cf.
https://github.com/FasterXML/jackson-dataformats-binary/issues/9#issuecomment-325685012).
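
To make the request concrete, here is a rough sketch of deriving a Parquet
message schema directly from a POJO by reflection, with no Avro step. It
uses only the JDK. The class PojoSchemaSketch, the User POJO, the type
mapping, and the rule "primitive = required, reference = optional" are all
assumptions made for illustration; none of this is an existing parquet-mr
or Jackson API, and newer Parquet versions may print the string annotation
as (STRING) rather than (UTF8).

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class PojoSchemaSketch {

    // Hypothetical POJO used only for illustration.
    static class User {
        long id;
        String name;
        double score;
        Boolean active;
    }

    // Assumed mapping from Java field types to Parquet primitive type names.
    static String parquetType(Class<?> t) {
        if (t == int.class || t == Integer.class) return "int32";
        if (t == long.class || t == Long.class) return "int64";
        if (t == float.class || t == Float.class) return "float";
        if (t == double.class || t == Double.class) return "double";
        if (t == boolean.class || t == Boolean.class) return "boolean";
        if (t == String.class) return "binary";
        throw new IllegalArgumentException("Unsupported field type: " + t.getName());
    }

    // Builds a textual Parquet schema from the declared fields of a flat POJO.
    // Field order follows getDeclaredFields(), which the JDK does not guarantee.
    static String toParquetSchema(Class<?> pojo) {
        StringBuilder sb = new StringBuilder();
        sb.append("message ").append(pojo.getSimpleName()).append(" {\n");
        for (Field f : pojo.getDeclaredFields()) {
            // Skip static and compiler-generated fields.
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            Class<?> t = f.getType();
            // Assumption: primitives cannot be null, so they become "required".
            String repetition = t.isPrimitive() ? "required" : "optional";
            sb.append("  ").append(repetition).append(' ')
              .append(parquetType(t)).append(' ').append(f.getName());
            if (t == String.class) sb.append(" (UTF8)");
            sb.append(";\n");
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toParquetSchema(User.class));
    }
}
```

A real module would presumably rely on Jackson's introspection instead of
raw reflection, so that annotations such as renamed or ignored properties
are honoured, and would also need to handle nested objects and collections,
which this sketch rejects.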
> Create a jackson integration module for pojo support
> ----------------------------------------------------
>
> Key: PARQUET-65
> URL: https://issues.apache.org/jira/browse/PARQUET-65
> Project: Parquet
> Issue Type: New Feature
> Components: parquet-mr
> Reporter: Alex Levenson
> Priority: Minor
>
> There's currently a PR for pojo support:
> https://github.com/apache/incubator-parquet-mr/pull/21
> And it occurred to me that one way we could do this without re-inventing the
> wheel is to use jackson. Jackson can essentially take a parse tree, either
> the result of parsing XML, or json, or anything (for example there's a yaml
> plugin), and then there are 3 things jackson lets you do with that tree. You
> can either visit the nodes in the tree (they call this streaming), you can
> map the tree onto the data structures built into java (essentially get a
> Map<Object, Object>), or you can map the tree onto a user defined class. The
> latter lets you work with a well typed class, and also lets you use jackson's
> annotations for controlling how the tree -> pojo mapping works (renaming
> fields and so on).
> We could leverage all of that by creating something that goes from parquet
> data to the jackson parse tree, and then leave the rest of the work to
> jackson.