[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16777516#comment-16777516
 ] 

BELUGA BEHR edited comment on HIVE-21240 at 2/26/19 2:54 AM:
-------------------------------------------------------------

[~bslim]

 

Thanks for the update.

Here is the diff I'm looking at: [^kafka_storage_handler.diff]

Passing the test with this diff requires the {{JsonSerDe}} from my local 
branch, which fixes the {{timestamp with local timezone}} handling.  As you 
can see, the values are now populated with timestamps.  Are you expecting 
all values to be lost (null)?

 

Regarding {{KafkaJsonSerDe}}, if you wish to keep it around, I recommend we 
move it to the 'test' directory so that it does not ship with the actual 
product.  If it's not meant for production, we shouldn't make it available, 
because there's always that one person who will find it and use it.  However, 
the Hive {{JsonSerDe}} is already the default in the Kafka project, so what is 
the LOE to use the one included with Hive rather than this test implementation?



> JSON SerDe Re-Write
> -------------------
>
>                 Key: HIVE-21240
>                 URL: https://issues.apache.org/jira/browse/HIVE-21240
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>    Affects Versions: 4.0.0, 3.1.1
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.10.patch, HIVE-21240.2.patch, HIVE-21240.3.patch, 
> HIVE-21240.4.patch, HIVE-21240.5.patch, HIVE-21240.6.patch, 
> HIVE-21240.7.patch, HIVE-21240.9.patch, HIVE-24240.8.patch, 
> kafka_storage_handler.diff
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues; I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O(n) for 
> each row processed, for each column in the row
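
To illustrate the last item in the list above, here is a minimal sketch of a column-name to column-index cache. It is not taken from the attached patches: the class name {{ColumnIndexCache}}, the method {{indexOf}}, and the case-insensitive matching are all assumptions made for the example. The idea is to build a hash map once per table schema so that each field lookup no longer scans the column list.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical helper (not part of Hive): resolves a JSON field name to a
 * column index with a single hash lookup instead of a scan over all columns.
 */
final class ColumnIndexCache {
    private final Map<String, Integer> indexByName = new HashMap<>();

    /** Builds the map once, e.g. when the SerDe is initialized with the schema. */
    ColumnIndexCache(List<String> columnNames) {
        for (int i = 0; i < columnNames.size(); i++) {
            // putIfAbsent keeps the first occurrence, like a left-to-right scan would
            indexByName.putIfAbsent(normalize(columnNames.get(i)), i);
        }
    }

    /** Returns the column index for the field name, or -1 if there is no such column. */
    int indexOf(String fieldName) {
        return indexByName.getOrDefault(normalize(fieldName), -1);
    }

    // Assumption: names are matched case-insensitively
    private static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }
}
```

With a cache like this, the per-field cost becomes an expected constant-time lookup, so processing a row is roughly linear in the number of fields rather than fields times columns.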



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
