[ 
https://issues.apache.org/jira/browse/HIVE-7777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203915#comment-14203915
 ] 

Alon Goldshuv commented on HIVE-7777:
-------------------------------------

Either way should work (adding OpenCSV parsing to LazySimpleSerde or adding 
type support to this new CSV serde). 

IMO the deciding factor should be performance. If adding quote stripping to 
LazySimpleSerde slows down simple non-quoted parsing (e.g., because the parser 
must examine its state after each byte instead of seeking quickly to the next 
line terminator), I'd say the solution is best implemented as two separate 
serdes (as proposed in this JIRA). If that isn't the case, though, a single 
serde (as proposed by [~rstokes]) is more elegant and user-friendly. 
[~rstokes], can you share information in that respect, or share the code for 
your modified LazySimpleSerde?
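
To make the performance concern concrete, here is a rough sketch of the two 
parsing styles. It is purely illustrative: the class and method names 
(SplitSketch, splitSimple, splitQuoted) are hypothetical, and none of this is 
taken from LazySimpleSerde or OpenCSV. The unquoted path does one comparison 
per byte, while the quote-aware path has to track state for every byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not code from Hive or OpenCSV.
class SplitSketch {

    // Unquoted path: a field ends at the next separator byte, so the loop
    // does a single comparison per byte and slices the input directly.
    public static List<String> splitSimple(byte[] row, byte sep) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= row.length; i++) {
            if (i == row.length || row[i] == sep) {
                fields.add(new String(row, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        return fields;
    }

    // Quote-aware path: every byte is checked against the current state
    // (inside or outside quotes), and field bytes are copied so that the
    // enclosing quotes can be stripped.
    public static List<String> splitQuoted(byte[] row, byte sep, byte quote) {
        List<String> fields = new ArrayList<>();
        ByteArrayOutputStream current = new ByteArrayOutputStream();
        boolean inQuotes = false;
        for (int i = 0; i < row.length; i++) {
            byte b = row[i];
            if (b == quote) {
                if (inQuotes && i + 1 < row.length && row[i + 1] == quote) {
                    current.write(quote); // doubled quote inside quotes is a literal quote
                    i++;
                } else {
                    inQuotes = !inQuotes; // opening or closing quote is dropped
                }
            } else if (b == sep && !inQuotes) {
                fields.add(new String(current.toByteArray(), StandardCharsets.UTF_8));
                current.reset();
            } else {
                current.write(b);
            }
        }
        fields.add(new String(current.toByteArray(), StandardCharsets.UTF_8));
        return fields;
    }

    public static void main(String[] args) {
        byte[] row = "\"a,b\",c".getBytes(StandardCharsets.UTF_8);
        System.out.println(splitSimple(row, (byte) ','));
        System.out.println(splitQuoted(row, (byte) ',', (byte) '"'));
    }
}
```

Whether the extra per-byte state check costs anything measurable on 
non-quoted data is exactly what would need to be benchmarked before deciding 
between one serde and two.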

> Add CSV Serde based on OpenCSV
> ------------------------------
>
>                 Key: HIVE-7777
>                 URL: https://issues.apache.org/jira/browse/HIVE-7777
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Ferdinand Xu
>            Assignee: Ferdinand Xu
>              Labels: TODOC14
>             Fix For: 0.14.0
>
>         Attachments: HIVE-7777.1.patch, HIVE-7777.2.patch, HIVE-7777.3.patch, 
> HIVE-7777.patch, csv-serde-master.zip
>
>
> Hive has no official CSV serde, although there is an open source project 
> on GitHub (https://github.com/ogrodnek/csv-serde). CSV is a very commonly 
> used data format.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
