[jira] [Updated] (HIVE-11804) Different describe formatted behavior depending on whether the table name is qualified with database name or not

Mark Grover (JIRA) Fri, 11 Sep 2015 12:20:25 -0700

     [ https://issues.apache.org/jira/browse/HIVE-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Mark Grover updated HIVE-11804:
-------------------------------
    Description: 
I have a simple text file based managed table on HDFS:
{quote}
show create table src;
+-------------------------------------------------------------------------------+--+
|                                createtab_stmt                                 |
+-------------------------------------------------------------------------------+--+
| CREATE TABLE `src`(                                                           |
|   `first` string,                                                             |
|   `word` string)                                                              |
| PARTITIONED BY (                                                              |
|   `length` int)                                                               |
| ROW FORMAT SERDE                                                              |
|   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'                        |
| STORED AS INPUTFORMAT                                                         |
|   'org.apache.hadoop.mapred.TextInputFormat'                                  |
| OUTPUTFORMAT                                                                  |
|   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'                |
| LOCATION                                                                      |
|   'hdfs://name-node:8020/user/hive/warehouse/my.db/src'                       |
| TBLPROPERTIES (                                                               |
|   'transient_lastDdlTime'='1441921577')                                       |
+-------------------------------------------------------------------------------+--+
{quote}

The describe formatted with the database name returns:
{quote}
describe formatted my.src first partition(length=1);
+-------------------------+-----------------------+-----------------------+-------+------------+-----------------+--------------+--------------+------------+-------------+----------+--+
|        col_name         |       data_type       |          min          |  max  | num_nulls  | distinct_count  | avg_col_len  | max_col_len  | num_trues  | num_falses  | comment  |
+-------------------------+-----------------------+-----------------------+-------+------------+-----------------+--------------+--------------+------------+-------------+----------+--+
| # col_name              | data_type             | comment               |       | NULL       | NULL            | NULL         | NULL         | NULL       | NULL        | NULL     |
|                         | NULL                  | NULL                  | NULL  | NULL       | NULL            | NULL         | NULL         | NULL       | NULL        | NULL     |
| first                   | string                | from deserializer     | NULL  | NULL       | NULL            | NULL         | NULL         | NULL       | NULL        | NULL     |
+-------------------------+-----------------------+-----------------------+-------+------------+-----------------+--------------+--------------+------------+-------------+----------+--+
{quote}

while without it returns:
{quote}
describe formatted src first partition(length=1);
+-------------------------------+---------------------------------------------------------------------------+-----------------------+--+
|           col_name            |                                 data_type                                 |        comment        |
+-------------------------------+---------------------------------------------------------------------------+-----------------------+--+
| # col_name                    | data_type                                                                 | comment               |
|                               | NULL                                                                      | NULL                  |
| first                         | string                                                                    |                       |
| word                          | string                                                                    |                       |
|                               | NULL                                                                      | NULL                  |
| # Partition Information       | NULL                                                                      | NULL                  |
| # col_name                    | data_type                                                                 | comment               |
|                               | NULL                                                                      | NULL                  |
| length                        | int                                                                       |                       |
|                               | NULL                                                                      | NULL                  |
| # Detailed Table Information  | NULL                                                                      | NULL                  |
| Database:                     | my                                                                        | NULL                  |
| Owner:                        | hive                                                                      | NULL                  |
| CreateTime:                   | Thu Sep 10 14:46:17 PDT 2015                                              | NULL                  |
| LastAccessTime:               | UNKNOWN                                                                   | NULL                  |
| Protect Mode:                 | None                                                                      | NULL                  |
| Retention:                    | 0                                                                         | NULL                  |
| Location:                     | hdfs://name-node:8020/user/hive/warehouse/my.db/src                       | NULL                  |
| Table Type:                   | MANAGED_TABLE                                                             | NULL                  |
| Table Parameters:             | NULL                                                                      | NULL                  |
|                               | transient_lastDdlTime                                                     | 1441921577            |
|                               | NULL                                                                      | NULL                  |
| # Storage Information         | NULL                                                                      | NULL                  |
| SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe                        | NULL                  |
| InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat                                  | NULL                  |
| OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat                | NULL                  |
| Compressed:                   | No                                                                        | NULL                  |
| Num Buckets:                  | -1                                                                        | NULL                  |
| Bucket Columns:               | []                                                                        | NULL                  |
| Sort Columns:                 | []                                                                        | NULL                  |
| Storage Desc Params:          | NULL                                                                      | NULL                  |
|                               | serialization.format                                                      | 1                     |
+-------------------------------+---------------------------------------------------------------------------+-----------------------+--+
{quote}

In particular, I was looking for column stats information, and it took me a 
while to figure out that the output differs depending on whether the db name 
is used. I think fixing this would be a huge time saver.
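
Until the behavior is consistent, a script that consumes this output can at least detect which of the two forms it received. The following is only a minimal sketch in Python: the helper names (`header_cells`, `describe_kind`) are hypothetical, and the column names are copied from the two header rows shown above rather than taken from any Hive API.

```python
# Statistics column names, copied from the header of the
# database-qualified output shown in this report.
STATS_COLUMNS = {"min", "max", "num_nulls", "distinct_count",
                 "avg_col_len", "max_col_len", "num_trues", "num_falses"}


def header_cells(header_line):
    """Split one beeline table row such as '| a | b |' into stripped cells."""
    return [cell.strip() for cell in header_line.strip().strip("|").split("|")]


def describe_kind(header_line):
    """Return 'column_stats' if the header carries the statistics columns,
    otherwise 'table_info' (hypothetical labels used only in this sketch)."""
    cells = set(header_cells(header_line))
    return "column_stats" if STATS_COLUMNS <= cells else "table_info"


# Header rows as they appear in the two outputs above (whitespace trimmed).
qualified = ("| col_name | data_type | min | max | num_nulls | distinct_count "
             "| avg_col_len | max_col_len | num_trues | num_falses | comment |")
unqualified = "| col_name | data_type | comment |"

print(describe_kind(qualified))    # column_stats
print(describe_kind(unqualified))  # table_info
```

This only classifies the header row; it does not work around the underlying difference in behavior.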


  was:
I have a simple text file based managed table on HDFS:
{quote}
show create table src;
+-------------------------------------------------------------------------------+--+
|                                createtab_stmt                                 |
+-------------------------------------------------------------------------------+--+
| CREATE TABLE `src`(                                                           |
|   `first` string,                                                             |
|   `word` string)                                                              |
| PARTITIONED BY (                                                              |
|   `length` int)                                                               |
| ROW FORMAT SERDE                                                              |
|   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'                        |
| STORED AS INPUTFORMAT                                                         |
|   'org.apache.hadoop.mapred.TextInputFormat'                                  |
| OUTPUTFORMAT                                                                  |
|   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'                |
| LOCATION                                                                      |
|   'hdfs://name-node:8020/user/hive/warehouse/my.db/src'                       |
| TBLPROPERTIES (                                                               |
|   'transient_lastDdlTime'='1441921577')                                       |
+-------------------------------------------------------------------------------+--+
{quote}

The describe formatted with the database name returns:
{quote}
describe formatted my.src first partition(length=1);
+-------------------------+-----------------------+-----------------------+-------+------------+-----------------+--------------+--------------+------------+-------------+----------+--+
|        col_name         |       data_type       |          min          |  max  | num_nulls  | distinct_count  | avg_col_len  | max_col_len  | num_trues  | num_falses  | comment  |
+-------------------------+-----------------------+-----------------------+-------+------------+-----------------+--------------+--------------+------------+-------------+----------+--+
| # col_name              | data_type             | comment               |       | NULL       | NULL            | NULL         | NULL         | NULL       | NULL        | NULL     |
|                         | NULL                  | NULL                  | NULL  | NULL       | NULL            | NULL         | NULL         | NULL       | NULL        | NULL     |
| first                   | string                | from deserializer     | NULL  | NULL       | NULL            | NULL         | NULL         | NULL       | NULL        | NULL     |
+-------------------------+-----------------------+-----------------------+-------+------------+-----------------+--------------+--------------+------------+-------------+----------+--+
{quote}

while without it returns:
{quote}
describe formatted src first partition(length=1);
+-------------------------------+---------------------------------------------------------------------------+-----------------------+--+
|           col_name            |                                 data_type                                 |        comment        |
+-------------------------------+---------------------------------------------------------------------------+-----------------------+--+
| # col_name                    | data_type                                                                 | comment               |
|                               | NULL                                                                      | NULL                  |
| first                         | string                                                                    |                       |
| word                          | string                                                                    |                       |
|                               | NULL                                                                      | NULL                  |
| # Partition Information       | NULL                                                                      | NULL                  |
| # col_name                    | data_type                                                                 | comment               |
|                               | NULL                                                                      | NULL                  |
| length                        | int                                                                       |                       |
|                               | NULL                                                                      | NULL                  |
| # Detailed Table Information  | NULL                                                                      | NULL                  |
| Database:                     | spark_hive                                                                | NULL                  |
| Owner:                        | hive                                                                      | NULL                  |
| CreateTime:                   | Thu Sep 10 14:46:17 PDT 2015                                              | NULL                  |
| LastAccessTime:               | UNKNOWN                                                                   | NULL                  |
| Protect Mode:                 | None                                                                      | NULL                  |
| Retention:                    | 0                                                                         | NULL                  |
| Location:                     | hdfs://name-node:8020/user/hive/warehouse/my.db/src                       | NULL                  |
| Table Type:                   | MANAGED_TABLE                                                             | NULL                  |
| Table Parameters:             | NULL                                                                      | NULL                  |
|                               | transient_lastDdlTime                                                     | 1441921577            |
|                               | NULL                                                                      | NULL                  |
| # Storage Information         | NULL                                                                      | NULL                  |
| SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe                        | NULL                  |
| InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat                                  | NULL                  |
| OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat                | NULL                  |
| Compressed:                   | No                                                                        | NULL                  |
| Num Buckets:                  | -1                                                                        | NULL                  |
| Bucket Columns:               | []                                                                        | NULL                  |
| Sort Columns:                 | []                                                                        | NULL                  |
| Storage Desc Params:          | NULL                                                                      | NULL                  |
|                               | serialization.format                                                      | 1                     |
+-------------------------------+---------------------------------------------------------------------------+-----------------------+--+
{quote}

In particular, I was looking for column stats information, and it took me a 
while to figure out that the output differs depending on whether the db name 
is used. I think fixing this would be a huge time saver.



> Different describe formatted behavior depending on whether the table name is 
> qualified with database name or not
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-11804
>                 URL: https://issues.apache.org/jira/browse/HIVE-11804
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Mark Grover



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
