[
https://issues.apache.org/jira/browse/IMPALA-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Quanlong Huang updated IMPALA-3782:
-----------------------------------
Epic Link: IMPALA-12887
> 'describe formatted' output does not match Hive's output for Avro tables
> ------------------------------------------------------------------------
>
> Key: IMPALA-3782
> URL: https://issues.apache.org/jira/browse/IMPALA-3782
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 2.6.0
> Reporter: Lars Volker
> Assignee: Alexander Behm
> Priority: Minor
> Labels: usability
>
> The comment in
> [DescribeResultFactory.java#L174|https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/com/cloudera/impala/service/DescribeResultFactory.java#L174]
> suggests that {{describe formatted}} is meant to produce the same output as
> Hive. However, the column comments differ for {{functional_avro.alltypes}}
> (and probably for other Avro tables): Impala prints "from deserializer" for
> the non-partition columns, while Hive leaves the comment empty.
> Hive:
> {noformat}
> $ hive -e "describe formatted functional_avro.alltypes"
> 16/06/23 20:24:45 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree module jar containing PrefixTreeCodec is not present. Continuing without it.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/home/lv/i6/thirdparty/hbase-1.2.0-cdh5.8.0/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/home/lv/i6/thirdparty/hadoop-2.6.0-cdh5.8.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 16/06/23 20:24:46 WARN conf.HiveConf: HiveConf of name hive.access.conf.url does not exist
> Logging initialized using configuration in file:/home/lv/i6/fe/src/test/resources/hive-log4j.properties
> OK
> # col_name data_type comment
> id int
> bool_col boolean
> tinyint_col int
> smallint_col int
> int_col int
> bigint_col bigint
> float_col float
> double_col double
> date_string_col string
> string_col string
> timestamp_col string
> # Partition Information
> # col_name data_type comment
> year int
> month int
> # Detailed Table Information
> Database: functional_avro
> Owner: lv
> CreateTime: Thu May 26 22:01:50 CEST 2016
> LastAccessTime: UNKNOWN
> Protect Mode: None
> Retention: 0
> Location: hdfs://localhost:20500/test-warehouse/alltypes_avro
> Table Type: EXTERNAL_TABLE
> Table Parameters:
> EXTERNAL TRUE
> avro.schema.url hdfs://localhost:20500//test-warehouse/avro_schemas/functional/alltypes.json
> transient_lastDdlTime 1464293060
> # Storage Information
> SerDe Library: org.apache.hadoop.hive.serde2.avro.AvroSerDe
> InputFormat: org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
> Compressed: No
> Num Buckets: 0
> Bucket Columns: []
> Sort Columns: []
> Storage Desc Params:
> escape.delim \\
> field.delim ,
> serialization.format ,
> Time taken: 1.897 seconds, Fetched: 46 row(s)
> {noformat}
> Impala:
> {noformat}
> $ impala-shell.sh -q "describe formatted functional_avro.alltypes"
> Starting Impala Shell without Kerberos authentication
> Connected to localhost:21000
> Server version: impalad version 2.6.0-cdh5-INTERNAL DEBUG (build 338ba601c8b801bb83ecedd7306a5f7e2062797a)
> Query: describe formatted functional_avro.alltypes
> +------------------------------+-------------------------------------------------------------+------------------------------------------------------------------------------+
> | name | type | comment |
> +------------------------------+-------------------------------------------------------------+------------------------------------------------------------------------------+
> | # col_name | data_type | comment |
> | | NULL | NULL |
> | id | int | from deserializer |
> | bool_col | boolean | from deserializer |
> | tinyint_col | int | from deserializer |
> | smallint_col | int | from deserializer |
> | int_col | int | from deserializer |
> | bigint_col | bigint | from deserializer |
> | float_col | float | from deserializer |
> | double_col | double | from deserializer |
> | date_string_col | string | from deserializer |
> | string_col | string | from deserializer |
> | timestamp_col | string | from deserializer |
> | | NULL | NULL |
> | # Partition Information | NULL | NULL |
> | # col_name | data_type | comment |
> | | NULL | NULL |
> | year | int | NULL |
> | month | int | NULL |
> | | NULL | NULL |
> | # Detailed Table Information | NULL | NULL |
> | Database: | functional_avro | NULL |
> | Owner: | lv | NULL |
> | CreateTime: | Thu May 26 22:01:50 CEST 2016 | NULL |
> | LastAccessTime: | UNKNOWN | NULL |
> | Protect Mode: | None | NULL |
> | Retention: | 0 | NULL |
> | Location: | hdfs://localhost:20500/test-warehouse/alltypes_avro | NULL |
> | Table Type: | EXTERNAL_TABLE | NULL |
> | Table Parameters: | NULL | NULL |
> | | EXTERNAL | TRUE |
> | | avro.schema.url | hdfs://localhost:20500//test-warehouse/avro_schemas/functional/alltypes.json |
> | | transient_lastDdlTime | 1464293060 |
> | | NULL | NULL |
> | # Storage Information | NULL | NULL |
> | SerDe Library: | org.apache.hadoop.hive.serde2.avro.AvroSerDe | NULL |
> | InputFormat: | org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat | NULL |
> | OutputFormat: | org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat | NULL |
> | Compressed: | No | NULL |
> | Num Buckets: | 0 | NULL |
> | Bucket Columns: | [] | NULL |
> | Sort Columns: | [] | NULL |
> | Storage Desc Params: | NULL | NULL |
> | | escape.delim | \\ |
> | | field.delim | , |
> | | serialization.format | , |
> +------------------------------+-------------------------------------------------------------+------------------------------------------------------------------------------+
> Fetched 46 row(s) in 0.01s
> {noformat}
> [~alex.behm], what is the expected behavior here? Should Impala omit these
> comments so that its output matches Hive's?
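
If the answer is to omit them, one possible approach is to blank out the placeholder text before the comment column is rendered. The following is only a minimal sketch under that assumption: the class DescribeCommentSketch and the method normalizeComment are hypothetical names used for illustration and are not part of DescribeResultFactory or any Impala API. The only detail taken from the report is the literal string "from deserializer" seen in the Impala output.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not Impala code.
public class DescribeCommentSketch {
    // Placeholder text observed in the Impala output for the Avro columns.
    public static final String PLACEHOLDER = "from deserializer";

    /**
     * Returns an empty string for a null comment or for the placeholder text,
     * and returns any other comment unchanged.
     */
    public static String normalizeComment(String comment) {
        if (comment == null || comment.trim().equals(PLACEHOLDER)) {
            return "";
        }
        return comment;
    }

    public static void main(String[] args) {
        // Sample column comments: placeholder, missing, and user-supplied.
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "from deserializer");
        columns.put("year", null);
        columns.put("note", "a real comment");
        for (Map.Entry<String, String> e : columns.entrySet()) {
            System.out.println(e.getKey() + " -> '" + normalizeComment(e.getValue()) + "'");
        }
    }
}
```

Whether an empty string or NULL is the right replacement depends on how closely the output is meant to follow Hive, which is the open question above. Note also that this sketch would hide a comment that a user deliberately set to the same text.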
