[ 
https://issues.apache.org/jira/browse/HIVE-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500526#comment-13500526
 ] 

Shreepadma Venugopalan commented on HIVE-3678:
----------------------------------------------

With the changes from HIVE-3712, the column schema has *no* dependency on any 
specific DB. It uses only simple data types, which are supported across DBs. 
The primary motivation for the schema change in HIVE-3712 was to avoid storing 
the column statistics fields as a BLOB. Using a BLOB has two problems: a) BLOBs 
are designed to store large volumes of data, on the order of GBs, and are hence 
stored outside the row. A consequence of this design is that BLOBs don't 
perform well for storing small amounts of data. While some DBs, such as Oracle, 
inline small BLOBs, not all DBs do. BLOBs are the only practical choice for 
storing data whose size is not known in advance, but they are overkill for 
storing around 100 bytes of data. b) There is no uniform BLOB support across 
DB vendors and versions. Hence I don't really see the value in storing this as 
a JSON BLOB.
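
To make the contrast concrete, here is a minimal sketch of the two layouts 
discussed above. It is an illustration only, not the HIVE-3712 schema: the 
table names and column names are hypothetical, and an in-memory SQLite DB 
stands in for the metastore DB, so it shows the difference in access pattern 
but not the out-of-row storage cost of BLOBs.

```python
import json
import sqlite3

# In-memory SQLite stands in for the metastore DB (assumption for this sketch).
conn = sqlite3.connect(":memory:")

# Hypothetical typed layout: one simple, portable column per statistic.
conn.execute(
    """
    CREATE TABLE col_stats_typed (
        table_name    VARCHAR(128) NOT NULL,
        column_name   VARCHAR(128) NOT NULL,
        num_nulls     BIGINT,
        num_distincts BIGINT,
        low_value     BIGINT,
        high_value    BIGINT
    )
    """
)

# Hypothetical BLOB layout: all statistics serialized as JSON into one BLOB.
conn.execute(
    """
    CREATE TABLE col_stats_blob (
        table_name  VARCHAR(128) NOT NULL,
        column_name VARCHAR(128) NOT NULL,
        stats       BLOB
    )
    """
)

# Made-up statistics for a made-up column "c" of table "t".
stats = {"num_nulls": 3, "num_distincts": 42, "low_value": 1, "high_value": 900}

conn.execute(
    "INSERT INTO col_stats_typed VALUES (?, ?, ?, ?, ?, ?)",
    ("t", "c", stats["num_nulls"], stats["num_distincts"],
     stats["low_value"], stats["high_value"]),
)
conn.execute(
    "INSERT INTO col_stats_blob VALUES (?, ?, ?)",
    ("t", "c", json.dumps(stats).encode("utf-8")),
)

# Typed layout: the DB returns the single field that was asked for.
typed_distincts = conn.execute(
    "SELECT num_distincts FROM col_stats_typed "
    "WHERE table_name = ? AND column_name = ?",
    ("t", "c"),
).fetchone()[0]

# BLOB layout: the whole payload is fetched and decoded on the client.
raw = conn.execute(
    "SELECT stats FROM col_stats_blob "
    "WHERE table_name = ? AND column_name = ?",
    ("t", "c"),
).fetchone()[0]
blob_distincts = json.loads(raw.decode("utf-8"))["num_distincts"]
```

Both layouts hold the same values. The typed one relies only on integer and 
string columns, while the BLOB one depends on each DB's BLOB handling and 
requires client-side decoding for every read.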
                
> Add metastore upgrade scripts for column stats schema changes
> -------------------------------------------------------------
>
>                 Key: HIVE-3678
>                 URL: https://issues.apache.org/jira/browse/HIVE-3678
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Shreepadma Venugopalan
>            Assignee: Shreepadma Venugopalan
>             Fix For: 0.10.0
>
>         Attachments: HIVE-3678.1.patch.txt
>
>
> Add upgrade script for column statistics schema changes for 
> Postgres/MySQL/Oracle/Derby

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
