[ 
https://issues.apache.org/jira/browse/HIVE-25188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355908#comment-17355908
 ] 

David Mollitor edited comment on HIVE-25188 at 6/2/21, 6:29 PM:
----------------------------------------------------------------

[~dengzh] I've formatted the JSON to make it easier to read for discussion's 
sake.  FYI, there are a few stray characters at the end of your example that 
were causing problems during formatting.

{code:json}
{
        "data": {
                "H": {
                        "event": "track_active",
                        "platform": "Android"
                },
                "B": {
                        "device_type": "Phone",
                        "uuid": "[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"
                }
        },
        "messageId": "2475185636801962",
        "publish_time": 1622514629783,
        "attributes": {
                "region": "IN"
        }
}
{code}

create table json_table(data string, messageid string, publish_time bigint, 
attributes string);

The {{data}} field is not a String type.  In this JSON it is a nested object, 
which corresponds to a struct type in Hive.  If you intend to stuff arbitrary 
data into that field, then {{data}} should be a Base-64 encoded string, which 
you can then declare as a Binary type in Hive.  I think that's the preferred 
approach, rather than allowing an overloaded String type.
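
To make the type mismatch concrete, here is a small Python sketch (outside Hive, standard library only) that reports the JSON kind of each top-level field in the record above, and then shows one way a producer might Base-64 encode the nested payload so that {{data}} becomes a single string value.  The helper names {{field_kinds}} and {{wrap_data_as_base64}} are made up for illustration; they are not part of Hive or JsonSerDe.

```python
import base64
import json

# The record from the example above, as one JSON document.
RECORD = '''{"data": {"H": {"event": "track_active", "platform": "Android"},
"B": {"device_type": "Phone", "uuid": "[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},
"messageId": "2475185636801962", "publish_time": 1622514629783, "attributes": {"region": "IN"}}'''

# Python type of a parsed value -> name of the JSON kind it came from.
JSON_KINDS = {dict: "object", list: "array", str: "string", int: "number",
              float: "number", bool: "boolean", type(None): "null"}


def field_kinds(record_text):
    """Map each top-level field name to the JSON kind of its value."""
    return {name: JSON_KINDS[type(value)]
            for name, value in json.loads(record_text).items()}


def wrap_data_as_base64(record_text):
    """Hypothetical producer-side step: replace the nested "data" object
    with the Base-64 encoding of its JSON text."""
    record = json.loads(record_text)
    payload = json.dumps(record["data"]).encode("utf-8")
    record["data"] = base64.b64encode(payload).decode("ascii")
    return json.dumps(record)


if __name__ == "__main__":
    print(field_kinds(RECORD))
    print(field_kinds(wrap_data_as_base64(RECORD)))
```

With the record as posted, {{data}} and {{attributes}} are both JSON objects, not JSON strings.  After the wrapping step, {{data}} is a plain JSON string holding Base-64 text.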

If you need to parse/query specific data from there, you would base64 decode 
the data value and use the {{get_json_object}} or {{json_tuple}} UDFs to read 
it.
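
As a rough illustration of that read path, the Python sketch below Base-64 decodes a stored value and then walks the JSON by key, which is loosely what a path such as {{$.H.event}} would ask {{get_json_object}} to do.  It is only an analogy run outside Hive: {{read_path}} is a hypothetical helper, the payload is a trimmed copy of the example, and nothing here reflects how the Hive UDFs are implemented.

```python
import base64
import json


def read_path(encoded_data, *keys):
    """Base-64 decode a field, parse it as JSON, and follow the given keys.

    Returns None if any key along the way is missing.
    """
    node = json.loads(base64.b64decode(encoded_data).decode("utf-8"))
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Hypothetical stored value: a trimmed copy of the "data" payload,
# serialized to JSON text and then Base-64 encoded.
payload = {
    "H": {"event": "track_active", "platform": "Android"},
    "B": {"device_type": "Phone"},
}
encoded = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")

if __name__ == "__main__":
    print(read_path(encoded, "H", "event"))
    print(read_path(encoded, "B", "device_type"))
```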




> JsonSerDe: Unable to read the string value from a nested json
> -------------------------------------------------------------
>
>                 Key: HIVE-25188
>                 URL: https://issues.apache.org/jira/browse/HIVE-25188
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 4.0.0
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
> create table json_table(data string, messageid string, publish_time bigint, 
> attributes string);
>  
> if the data of the table stored like:
> {code:java}
> {"data":{"H":{"event":"track_active","platform":"Android"},"B":{"device_type":"Phone","uuid":"[36ffec24-f6a4-4f5d-aa39-72e5513d2cae,11883bee-a7aa-4010-8a66-6c3c63a73f16]"}},"messageId":"2475185636801962","publish_time":1622514629783,"attributes":{"region":"IN"}}"}}{code}
> Exception will be thrown when trying to deserialize the data:
>  
> Caused by: java.lang.IllegalArgumentException
>  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:108)
>  at 
> org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitLeafNode(HiveJsonReader.java:374)
>  at 
> org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitNode(HiveJsonReader.java:216)
>  at 
> org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitStructNode(HiveJsonReader.java:327)
>  at 
> org.apache.hadoop.hive.serde2.json.HiveJsonReader.visitNode(HiveJsonReader.java:221)
>  at 
> org.apache.hadoop.hive.serde2.json.HiveJsonReader.parseStruct(HiveJsonReader.java:198)
>  at org.apache.hadoop.hive.serde2.JsonSerDe.deserialize(JsonSerDe.java:181)


