Eugene Koifman created HIVE-20313:
-------------------------------------

             Summary: consider making ROW__ID a 1st class object
                 Key: HIVE-20313
                 URL: https://issues.apache.org/jira/browse/HIVE-20313
             Project: Hive
          Issue Type: Improvement
          Components: Transactions
    Affects Versions: 0.11.0
            Reporter: Eugene Koifman


ROW__ID, a struct that represents a unique row ID within a partition 
of a full CRUD transactional table, is currently modeled as a {{VirtualColumn}}. 
However, the ACID metadata columns from which ROW__ID is built are actually 
stored in the data file.
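
For illustration only, here is a rough sketch of the shape being discussed. It 
assumes ROW__ID carries a write ID, a bucket ID and a row ID, which is how I 
understand recent Hive versions expose it; the class name {{RowIdSketch}} and 
its methods are hypothetical and are not Hive's actual classes.

```java
import java.util.Objects;

/** Hypothetical model of the ROW__ID struct; NOT a class from the Hive code base. */
final class RowIdSketch implements Comparable<RowIdSketch> {
    // Assumed components, each derived from an ACID metadata column in the data file.
    final long writeId;
    final int bucketId;
    final long rowId;

    RowIdSketch(long writeId, int bucketId, long rowId) {
        this.writeId = writeId;
        this.bucketId = bucketId;
        this.rowId = rowId;
    }

    /** Orders by write ID, then bucket ID, then row ID. */
    @Override
    public int compareTo(RowIdSketch other) {
        int c = Long.compare(writeId, other.writeId);
        if (c != 0) {
            return c;
        }
        c = Integer.compare(bucketId, other.bucketId);
        if (c != 0) {
            return c;
        }
        return Long.compare(rowId, other.rowId);
    }

    /** Two IDs are equal only when all three components match. */
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RowIdSketch)) {
            return false;
        }
        RowIdSketch r = (RowIdSketch) o;
        return writeId == r.writeId && bucketId == r.bucketId && rowId == r.rowId;
    }

    @Override
    public int hashCode() {
        return Objects.hash(writeId, bucketId, rowId);
    }
}
```

If the struct were a regular column, a value like this would simply be read 
from the file like any other field rather than assembled by special-case code.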

Making this work requires special handling of the ACID metadata columns 
throughout the code.

Perhaps a better approach is to add a struct column to an ACID table at 
creation time and make it a 1st class citizen visible in the metastore.  
'select count(*) ....' would need special handling to remove the column from 
the query.  There may also need to be a way to make these columns read-only.

For data added via Load Data, Add Partition, etc. (i.e. original files in a 
CRUD table), the ACID reader would have to fill in the values as it does today.
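
As a minimal sketch of what "fill in the values" means, assuming (as I 
understand the current reader) that rows in original files get write ID 0, a 
bucket taken from the file name, and a row ID based on the row's position: the 
class {{OriginalFileRowIdSketch}}, its methods, and the file naming convention 
shown are illustrative assumptions, not Hive code. Using the raw bucket number 
is also a simplification.

```java
/** Hypothetical helper for illustration; NOT part of Hive. */
final class OriginalFileRowIdSketch {
    /** Assumed write ID for rows written before the table held ACID metadata. */
    static final long ORIGINAL_WRITE_ID = 0L;

    /**
     * Parses the bucket number from a name such as "000001_0" or
     * "000001_0_copy_1" (an assumed naming convention).
     */
    static int bucketFromFileName(String fileName) {
        int underscore = fileName.indexOf('_');
        String digits = underscore < 0 ? fileName : fileName.substring(0, underscore);
        return Integer.parseInt(digits);
    }

    /**
     * Returns {writeId, bucket, rowId} for a row in an original file.
     * rowOffset is the number of rows in earlier files of the same bucket,
     * so that row IDs stay unique across those files.
     */
    static long[] synthesize(String fileName, long rowOffset, long positionInFile) {
        return new long[] {
            ORIGINAL_WRITE_ID,
            bucketFromFileName(fileName),
            rowOffset + positionInFile
        };
    }
}
```

With a first-class column the reader would still need logic like this, but 
only for original files; everything else would be an ordinary column read.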

This would make schema evolution, PPD (predicate push-down), and projection 
pruning work seamlessly.
This should also make it easy to add formats other than ORC to full CRUD tables.

This will likely be painful but should be investigated.





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
