[ 
https://issues.apache.org/jira/browse/HIVE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537364#comment-16537364
 ] 

Alan Gates commented on HIVE-17852:
-----------------------------------

In the thrift interface, we can't take out the optional arguments from the 
storage descriptor for stored as directories.  This could break clients still 
sending those values.  Instead your patch should change HiveMetaStore to simply 
ignore those values.
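
A minimal sketch of the "keep the field, ignore the value" approach, under 
stated assumptions: StorageDescriptorStub is a hypothetical stand-in for the 
Thrift-generated storage descriptor, and ignoreStoredAsDirs is an illustrative 
helper name, not an existing HiveMetaStore method.  The field stays in the 
wire format so older clients can keep sending it, and the server clears it 
before doing anything else with the descriptor.

```java
// Illustrative only: these types are hypothetical stand-ins, not Hive classes.
final class IgnoreStoredAsDirsSketch {

    /** Hypothetical stand-in for the Thrift-generated storage descriptor. */
    static final class StorageDescriptorStub {
        String location;
        // Optional field that remains in the interface; null means "not set".
        Boolean storedAsSubDirectories;
    }

    /**
     * Returns a copy of the descriptor with the stored-as-directories flag
     * cleared. The caller's object is left untouched, so a client that still
     * sets the flag is accepted rather than rejected.
     */
    static StorageDescriptorStub ignoreStoredAsDirs(StorageDescriptorStub in) {
        StorageDescriptorStub out = new StorageDescriptorStub();
        out.location = in.location;
        out.storedAsSubDirectories = null; // value is dropped, field still exists
        return out;
    }

    public static void main(String[] args) {
        StorageDescriptorStub fromOldClient = new StorageDescriptorStub();
        fromOldClient.location = "/warehouse/t";
        fromOldClient.storedAsSubDirectories = Boolean.TRUE;

        StorageDescriptorStub stored = ignoreStoredAsDirs(fromOldClient);
        System.out.println(stored.location + " " + stored.storedAsSubDirectories);
    }
}
```

Whether the real change should clear the value silently or also log a warning 
is a design choice this sketch does not settle.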

We should never drop columns from the databases in upgrade scripts.  If a user 
upgrades a database, then decides to roll back to a previous version (maybe 
something fails downstream in the upgrade), this will cause the old version to 
not work, since the columns are now missing.  So removing the columns from the 
install scripts is fine, but we shouldn't do a drop in the upgrade scripts.

Other than those two points the metastore changes look fine to me.  I'll let 
Sergey or others who understand the plan pieces better review that part.

 

> remove support for list bucketing "stored as directories" in 3.0
> ----------------------------------------------------------------
>
>                 Key: HIVE-17852
>                 URL: https://issues.apache.org/jira/browse/HIVE-17852
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Laszlo Bodor
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: HIVE-17852.01.patch, HIVE-17852.02.patch, 
> HIVE-17852.03.patch, HIVE-17852.04.patch, HIVE-17852.05.patch, 
> HIVE-17852.06.patch, HIVE-17852.07.patch, HIVE-17852.08.patch, 
> HIVE-17852.09.patch, HIVE-17852.10.patch, HIVE-17852.11.patch, 
> HIVE-17852.12.patch, HIVE-17852.13.patch, HIVE-17852.14.patch, 
> HIVE-17852.15.patch, HIVE-17852.16.patch, HIVE-17852.17.patch
>
>
> From the email thread:
> 1) LB, when stored as directories, adds a lot of low-level complexity to Hive 
> tables that has to be accounted for in many places in the code where the 
> files are written or modified - from FSOP to ACID/replication/export.
> 2) While working on some FSOP code I noticed that some of that logic is 
> broken - e.g. the duplicate file removal from tasks, a pretty fundamental 
> correctness feature in Hive, may be broken. LB also doesn’t appear to be 
> compatible with e.g. regular bucketing.
> 3) The feature hasn’t seen development activity in a while; it also doesn’t 
> appear to be used a lot.
> Keeping with the theme of cleaning up “legacy” code for 3.0, I was proposing 
> we remove it.
> (2) also suggested that, if needed, it might be easier to implement similar 
> functionality by adding some flexibility to partitions (which LB directories 
> look like anyway); that would also keep the logic on a higher level of 
> abstraction (split generation, partition pruning) as opposed to many 
> low-level places like FSOP, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
