[ https://issues.apache.org/jira/browse/HIVE-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Grover updated HIVE-11804:
-------------------------------
Description:
I have a simple text-file-based managed table on HDFS:
{quote}
show create table src;
+----------------------------------------------------------------+--+
| createtab_stmt                                                 |
+----------------------------------------------------------------+--+
| CREATE TABLE `src`(                                            |
|   `first` string,                                              |
|   `word` string)                                               |
| PARTITIONED BY (                                               |
|   `length` int)                                                |
| ROW FORMAT SERDE                                               |
|   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'         |
| STORED AS INPUTFORMAT                                          |
|   'org.apache.hadoop.mapred.TextInputFormat'                   |
| OUTPUTFORMAT                                                   |
|   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
| LOCATION                                                       |
|   'hdfs://name-node:8020/user/hive/warehouse/my.db/src'        |
| TBLPROPERTIES (                                                |
|   'transient_lastDdlTime'='1441921577')                        |
+----------------------------------------------------------------+--+
{quote}
Running describe formatted with the database-qualified table name returns:
{quote}
describe formatted my.src first partition(length=1);
+------------+-----------+-------------------+------+-----------+----------------+-------------+-------------+-----------+------------+---------+--+
| col_name   | data_type | min               | max  | num_nulls | distinct_count | avg_col_len | max_col_len | num_trues | num_falses | comment |
+------------+-----------+-------------------+------+-----------+----------------+-------------+-------------+-----------+------------+---------+--+
| # col_name | data_type | comment           |      | NULL      | NULL           | NULL        | NULL        | NULL      | NULL       | NULL    |
|            | NULL      | NULL              | NULL | NULL      | NULL           | NULL        | NULL        | NULL      | NULL       | NULL    |
| first      | string    | from deserializer | NULL | NULL      | NULL           | NULL        | NULL        | NULL      | NULL       | NULL    |
+------------+-----------+-------------------+------+-----------+----------------+-------------+-------------+-----------+------------+---------+--+
{quote}
while without it returns:
{quote}
describe formatted src first partition(length=1);
+------------------------------+------------------------------------------------------------+------------+--+
| col_name                     | data_type                                                  | comment    |
+------------------------------+------------------------------------------------------------+------------+--+
| # col_name                   | data_type                                                  | comment    |
|                              | NULL                                                       | NULL       |
| first                        | string                                                     |            |
| word                         | string                                                     |            |
|                              | NULL                                                       | NULL       |
| # Partition Information      | NULL                                                       | NULL       |
| # col_name                   | data_type                                                  | comment    |
|                              | NULL                                                       | NULL       |
| length                       | int                                                        |            |
|                              | NULL                                                       | NULL       |
| # Detailed Table Information | NULL                                                       | NULL       |
| Database:                    | my                                                         | NULL       |
| Owner:                       | hive                                                       | NULL       |
| CreateTime:                  | Thu Sep 10 14:46:17 PDT 2015                               | NULL       |
| LastAccessTime:              | UNKNOWN                                                    | NULL       |
| Protect Mode:                | None                                                       | NULL       |
| Retention:                   | 0                                                          | NULL       |
| Location:                    | hdfs://name-node:8020/user/hive/warehouse/my.db/src        | NULL       |
| Table Type:                  | MANAGED_TABLE                                              | NULL       |
| Table Parameters:            | NULL                                                       | NULL       |
|                              | transient_lastDdlTime                                      | 1441921577 |
|                              | NULL                                                       | NULL       |
| # Storage Information        | NULL                                                       | NULL       |
| SerDe Library:               | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe         | NULL       |
| InputFormat:                 | org.apache.hadoop.mapred.TextInputFormat                   | NULL       |
| OutputFormat:                | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL       |
| Compressed:                  | No                                                         | NULL       |
| Num Buckets:                 | -1                                                         | NULL       |
| Bucket Columns:              | []                                                         | NULL       |
| Sort Columns:                | []                                                         | NULL       |
| Storage Desc Params:         | NULL                                                       | NULL       |
|                              | serialization.format                                       | 1          |
+------------------------------+------------------------------------------------------------+------------+--+
{quote}
In particular, I was looking for column stats information, and it took me a
while to figure out that the output differs depending on whether the table
name is qualified with the db name. I think fixing this would be a huge time
saver.
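To make the difference concrete, here is a minimal Python sketch (not part of Hive; the names header_columns and extra_columns are hypothetical helpers) that compares the header rows of the two outputs above and lists the columns that appear only in the database-qualified form. It assumes the Beeline table layout shown in this report, with cells separated by "|".

```python
def header_columns(table_text):
    """Return the cell names from the first '|'-delimited row of a Beeline-style table."""
    for raw in table_text.splitlines():
        line = raw.strip()
        # Border rows start with '+', so the first row framed by '|' is the header.
        if line.startswith("|") and line.endswith("|"):
            return [cell.strip() for cell in line[1:-1].split("|")]
    return []


def extra_columns(left_table, right_table):
    """Return the columns present in left_table's header but not in right_table's."""
    right = set(header_columns(right_table))
    return [col for col in header_columns(left_table) if col not in right]


# Header rows copied from the two describe formatted outputs in this report.
QUALIFIED_HEADER = (
    "| col_name | data_type | min | max | num_nulls | distinct_count "
    "| avg_col_len | max_col_len | num_trues | num_falses | comment |"
)
UNQUALIFIED_HEADER = "| col_name | data_type | comment |"

if __name__ == "__main__":
    # Columns that only show up when the table name is qualified with the db name.
    print(extra_columns(QUALIFIED_HEADER, UNQUALIFIED_HEADER))
```

With the headers above, the qualified form carries the column-statistics fields (min through num_falses), while the unqualified form has only col_name, data_type and comment.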
was:
I have a simple text-file-based managed table on HDFS:
{quote}
show create table src;
+----------------------------------------------------------------+--+
| createtab_stmt                                                 |
+----------------------------------------------------------------+--+
| CREATE TABLE `src`(                                            |
|   `first` string,                                              |
|   `word` string)                                               |
| PARTITIONED BY (                                               |
|   `length` int)                                                |
| ROW FORMAT SERDE                                               |
|   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'         |
| STORED AS INPUTFORMAT                                          |
|   'org.apache.hadoop.mapred.TextInputFormat'                   |
| OUTPUTFORMAT                                                   |
|   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
| LOCATION                                                       |
|   'hdfs://name-node:8020/user/hive/warehouse/my.db/src'        |
| TBLPROPERTIES (                                                |
|   'transient_lastDdlTime'='1441921577')                        |
+----------------------------------------------------------------+--+
{quote}
Running describe formatted with the database-qualified table name returns:
{quote}
describe formatted my.src first partition(length=1);
+------------+-----------+-------------------+------+-----------+----------------+-------------+-------------+-----------+------------+---------+--+
| col_name   | data_type | min               | max  | num_nulls | distinct_count | avg_col_len | max_col_len | num_trues | num_falses | comment |
+------------+-----------+-------------------+------+-----------+----------------+-------------+-------------+-----------+------------+---------+--+
| # col_name | data_type | comment           |      | NULL      | NULL           | NULL        | NULL        | NULL      | NULL       | NULL    |
|            | NULL      | NULL              | NULL | NULL      | NULL           | NULL        | NULL        | NULL      | NULL       | NULL    |
| first      | string    | from deserializer | NULL | NULL      | NULL           | NULL        | NULL        | NULL      | NULL       | NULL    |
+------------+-----------+-------------------+------+-----------+----------------+-------------+-------------+-----------+------------+---------+--+
{quote}
while without it returns:
{quote}
describe formatted src first partition(length=1);
+------------------------------+------------------------------------------------------------+------------+--+
| col_name                     | data_type                                                  | comment    |
+------------------------------+------------------------------------------------------------+------------+--+
| # col_name                   | data_type                                                  | comment    |
|                              | NULL                                                       | NULL       |
| first                        | string                                                     |            |
| word                         | string                                                     |            |
|                              | NULL                                                       | NULL       |
| # Partition Information      | NULL                                                       | NULL       |
| # col_name                   | data_type                                                  | comment    |
|                              | NULL                                                       | NULL       |
| length                       | int                                                        |            |
|                              | NULL                                                       | NULL       |
| # Detailed Table Information | NULL                                                       | NULL       |
| Database:                    | spark_hive                                                 | NULL       |
| Owner:                       | hive                                                       | NULL       |
| CreateTime:                  | Thu Sep 10 14:46:17 PDT 2015                               | NULL       |
| LastAccessTime:              | UNKNOWN                                                    | NULL       |
| Protect Mode:                | None                                                       | NULL       |
| Retention:                   | 0                                                          | NULL       |
| Location:                    | hdfs://name-node:8020/user/hive/warehouse/my.db/src        | NULL       |
| Table Type:                  | MANAGED_TABLE                                              | NULL       |
| Table Parameters:            | NULL                                                       | NULL       |
|                              | transient_lastDdlTime                                      | 1441921577 |
|                              | NULL                                                       | NULL       |
| # Storage Information        | NULL                                                       | NULL       |
| SerDe Library:               | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe         | NULL       |
| InputFormat:                 | org.apache.hadoop.mapred.TextInputFormat                   | NULL       |
| OutputFormat:                | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL       |
| Compressed:                  | No                                                         | NULL       |
| Num Buckets:                 | -1                                                         | NULL       |
| Bucket Columns:              | []                                                         | NULL       |
| Sort Columns:                | []                                                         | NULL       |
| Storage Desc Params:         | NULL                                                       | NULL       |
|                              | serialization.format                                       | 1          |
+------------------------------+------------------------------------------------------------+------------+--+
{quote}
In particular, I was looking for column stats information, and it took me a
while to figure out that the output differs depending on whether the table
name is qualified with the db name. I think fixing this would be a huge time
saver.
> Different describe formatted behavior depending on whether the table name is
> qualified with database name or not
> ----------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-11804
> URL: https://issues.apache.org/jira/browse/HIVE-11804
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Reporter: Mark Grover
>
