[
https://issues.apache.org/jira/browse/HAWQ-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219495#comment-15219495
]
Ian Hellstrom commented on HAWQ-178:
------------------------------------
I understand that that is fine for most 'social network' purposes; most of the
examples I see are based on Twitter, at least. For many manufacturing and
healthcare companies, though, it won't do. They have highly nested data
structures, and a lot of those are well-structured. I am working on many use
cases (with Spark) that involve arrays of structs, with several further layers
of arrays of structs within. Unnesting these is a must for some purposes, for
instance when feeding the data to a BI tool.
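To make "unnesting" concrete, here is a minimal sketch in plain Python. It is
not Spark, Hive, or PXF code, and the function name `unnest`, the dotted column
names, and the sample record are all hypothetical, chosen only for
illustration. It assumes that every array should be exploded into one row per
element and that an empty array should keep the parent row with a null value.

```python
from itertools import product


def unnest(value, prefix=""):
    """Return a list of flat rows (dicts) for a nested JSON-like value.

    Hypothetical helper: column names are dotted paths, arrays are exploded
    into one row per element, and sibling arrays combine as a cross product.
    """
    if isinstance(value, dict):
        # Flatten each field separately, then combine the per-field rows.
        per_field = [
            unnest(child, f"{prefix}.{key}" if prefix else key)
            for key, child in value.items()
        ]
        rows = []
        for combo in product(*per_field):
            row = {}
            for part in combo:
                row.update(part)
            rows.append(row)
        return rows
    if isinstance(value, list):
        # Assumption: an empty array keeps the parent row with a null column.
        if not value:
            return [{prefix: None}]
        rows = []
        for item in value:
            rows.extend(unnest(item, prefix))
        return rows
    # Scalars become a single one-column row.
    return [{prefix: value}]


# Made-up sample record: an array of structs containing an array of structs.
record = {
    "plant": "A",
    "machines": [
        {
            "id": 1,
            "sensors": [
                {"name": "temp", "value": 71.5},
                {"name": "rpm", "value": 1200},
            ],
        },
        {"id": 2, "sensors": [{"name": "temp", "value": 68.0}]},
    ],
}

rows = unnest(record)
for row in rows:
    print(row)
```

Each output row is flat (plant, machines.id, machines.sensors.name,
machines.sensors.value), which is the shape a BI tool typically expects.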
Support for unnesting very complex JSON documents is also hit-and-miss in Hive.
Plus, when you already do the bulk of the work in Hive, many won't like the
idea of using HAWQ (or something else) on top of that.
Having these structures as TEXT requires lots of messy regexes. I mention this
only so you know where I'm coming from.
> Add JSON plugin support in code base
> ------------------------------------
>
> Key: HAWQ-178
> URL: https://issues.apache.org/jira/browse/HAWQ-178
> Project: Apache HAWQ
> Issue Type: New Feature
> Components: PXF
> Reporter: Goden Yao
> Assignee: Christian Tzolov
> Fix For: backlog
>
> Attachments: PXFJSONPluginforHAWQ2.0andPXF3.0.0.pdf,
> PXFJSONPluginforHAWQ2.0andPXF3.0.0v.2.pdf,
> PXFJSONPluginforHAWQ2.0andPXF3.0.0v.3.pdf
>
>
> JSON has been a popular format used in HDFS as well as in the community.
> There have been a few JSON PXF plugins developed by the community, and we'd
> like to see that work incorporated into the code base as an optional package.