[ https://issues.apache.org/jira/browse/HIVE-23871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Panagiotis Garefalakis updated HIVE-23871:
------------------------------------------
    Description: 
HIVE-23281 optimizes StorageDescriptor conversion in the ObjectStore by
skipping certain Table properties such as SkewInfo, bucketCols, and ordering.
However, it applies this shortcut to all Transactional Tables, not only ACID
ones, which causes MicroManaged Tables to behave abnormally.
MicroManaged (insert_only) tables can end up without properties they need,
such as the Storage Desc Params that define how the fields in a line are
delimited (as in the example below):

To reproduce the issue:
{code:sql}
-- Create an insert_only (MicroManaged) table with a tab field delimiter.
CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
-- 'table1' is the data file attached to this issue.
LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans;
DESCRIBE FORMATTED delim_table_trans;
SELECT * FROM delim_table_trans;
{code}
Result (no Storage Desc Params are listed, and every column is read as NULL):
{code:java}
Table Type:             MANAGED_TABLE            
Table Parameters:                
        bucketing_version       2                   
        numFiles                1                   
        numRows                 0                   
        rawDataSize             0                   
        totalSize               72                  
        transactional           true                
        transactional_properties        insert_only         
#### A masked pattern was here ####
                 
# Storage Information            
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No                       
Num Buckets:            -1                       
Bucket Columns:         []                       
Sort Columns:           []                       
PREHOOK: query: SELECT * FROM delim_table_trans
PREHOOK: type: QUERY
PREHOOK: Input: default@delim_table_trans
#### A masked pattern was here ####
POSTHOOK: query: SELECT * FROM delim_table_trans
POSTHOOK: type: QUERY
POSTHOOK: Input: default@delim_table_trans
#### A masked pattern was here ####
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
 {code}
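
The distinction drawn above between ACID and MicroManaged (insert_only) tables can be sketched in a few lines of Java. This is only an illustration of the idea, not the actual ObjectStore code or the patch for this issue: the class and method names are hypothetical, and the rule that only ACID tables may skip storage details is an assumption drawn from the description. The parameter keys `transactional` and `transactional_properties` and the value `insert_only` are the ones shown in the DESCRIBE FORMATTED output.

```java
import java.util.Map;

// Hypothetical helper for illustration only; it is not part of Hive.
final class TransactionalTableKind {
    private TransactionalTableKind() {}

    /** True when the table parameters mark the table as transactional at all. */
    static boolean isTransactional(Map<String, String> params) {
        return "true".equalsIgnoreCase(params.get("transactional"));
    }

    /** True for MicroManaged tables: transactional with transactional_properties=insert_only. */
    static boolean isInsertOnly(Map<String, String> params) {
        return isTransactional(params)
            && "insert_only".equalsIgnoreCase(params.get("transactional_properties"));
    }

    /** True for transactional tables that are not insert_only, i.e. ACID tables. */
    static boolean isAcid(Map<String, String> params) {
        return isTransactional(params) && !isInsertOnly(params);
    }

    /**
     * Assumed policy: the storage-descriptor shortcut is applied to ACID tables only,
     * so insert_only tables keep properties such as the field delimiter.
     */
    static boolean maySkipStorageDetails(Map<String, String> params) {
        return isAcid(params);
    }

    public static void main(String[] args) {
        Map<String, String> insertOnly =
            Map.of("transactional", "true", "transactional_properties", "insert_only");
        Map<String, String> acid = Map.of("transactional", "true");

        System.out.println("insert_only may skip: " + maySkipStorageDetails(insertOnly));
        System.out.println("acid may skip: " + maySkipStorageDetails(acid));
    }
}
```

Checking `transactional=true` alone would treat both kinds of table the same way, which matches the behavior reported here; the extra check on `transactional_properties` is what separates them.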

  was:
HIVE-23281 optimizes StorageDescriptor conversion as part of the ObjectStore by 
skipping particular Table properties like SkewInfo, bucketCols, ordering etc.
 However, it does that for all Transactional Tables – not only ACID – causing 
MicroManaged Tables to behave abnormally.
MicroManaged (insert_only) tables may miss needed properties such as Storage 
Desc Params – that may define how lines are delimited (like in the example 
below):

To repro the issue:
{code:java}
CREATE TRANSACTIONAL TABLE delim_table_trans(id INT, name STRING, safety INT) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
LOAD DATA INPATH 'table1' OVERWRITE INTO TABLE delim_table_trans;
describe formatted delim_table_trans;
SELECT * FROM delim_table_trans;
{code}
Result:
{code:java}
# Storage Information            
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No                       
Num Buckets:            -1                       
Bucket Columns:         []                       
Sort Columns:           []                       
PREHOOK: query: SELECT * FROM delim_table_trans
PREHOOK: type: QUERY
PREHOOK: Input: default@delim_table_trans
#### A masked pattern was here ####
POSTHOOK: query: SELECT * FROM delim_table_trans
POSTHOOK: type: QUERY
POSTHOOK: Input: default@delim_table_trans
#### A masked pattern was here ####
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
NULL    NULL    NULL
{code}


> ObjectStore should properly handle MicroManaged Table properties
> ----------------------------------------------------------------
>
>                 Key: HIVE-23871
>                 URL: https://issues.apache.org/jira/browse/HIVE-23871
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Panagiotis Garefalakis
>            Assignee: Panagiotis Garefalakis
>            Priority: Major
>         Attachments: table1
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
