Jarek Jarcec Cecho created HIVE-7174:
----------------------------------------
Summary: Do not accept string as scale and precision when reading Avro schema
Key: HIVE-7174
URL: https://issues.apache.org/jira/browse/HIVE-7174
Project: Hive
Issue Type: Bug
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
I've noticed that the current AvroSerDe will happily accept a schema that uses
strings instead of integers for scale and precision, e.g. the fragment
{{"precision":"4","scale":"1"}} in the following table:
{code}
CREATE TABLE `avro_dec1`(
`name` string COMMENT 'from deserializer',
`value` decimal(4,1) COMMENT 'from deserializer')
COMMENT 'just drop the schema right into the HQL'
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
'numFiles'='1',
'avro.schema.literal'='{\"namespace\":\"com.howdy\",\"name\":\"some_schema\",\"type\":\"record\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"value\",\"type\":{\"type\":\"bytes\",\"logicalType\":\"decimal\",\"precision\":\"4\",\"scale\":\"1\"}}]}'
);
{code}
However, the decimal spec defined in AVRO-1402 requires both attributes to be
integers, so only the following fragment is valid:
{{"precision":4,"scale":1}} (i.e. no double quotes around the numbers).
Because Hive can propagate this incorrect schema to new files, and thereby
create files with an invalid schema, I think we should change the behavior and
insist on the correct schema.
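
To illustrate the kind of strictness proposed above, here is a minimal Python sketch. It is not the AvroSerDe code and not a proposed patch. The function name check_decimal_attrs and the two sample constants are hypothetical, and the schema JSON is written without the backslash escaping that appears in the table properties. It walks a schema literal and reports any decimal logical type whose precision or scale is not a JSON integer.

```python
import json

# Hypothetical sample schemas: the first mirrors the string-valued fragment
# from the table above, the second uses the integer form.
BAD_SCHEMA = (
    '{"namespace":"com.howdy","name":"some_schema","type":"record",'
    '"fields":[{"name":"name","type":"string"},'
    '{"name":"value","type":{"type":"bytes","logicalType":"decimal",'
    '"precision":"4","scale":"1"}}]}'
)
GOOD_SCHEMA = BAD_SCHEMA.replace('"4"', "4").replace('"1"', "1")


def check_decimal_attrs(schema_literal):
    """Return a list of messages for non-integer precision/scale values."""
    problems = []

    def is_strict_int(value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)

    def walk(node, path):
        if isinstance(node, dict):
            if node.get("logicalType") == "decimal":
                for attr in ("precision", "scale"):
                    if attr in node and not is_strict_int(node[attr]):
                        problems.append(
                            f"{path}: {attr} must be an integer, "
                            f"got {node[attr]!r}"
                        )
            # Recurse into nested types (record fields, unions, etc.).
            for key, value in node.items():
                walk(value, f"{path}/{key}")
        elif isinstance(node, list):
            for index, item in enumerate(node):
                walk(item, f"{path}[{index}]")

    walk(json.loads(schema_literal), "")
    return problems


if __name__ == "__main__":
    for message in check_decimal_attrs(BAD_SCHEMA):
        print(message)
```

With a check like this, the string-valued schema would be rejected with one message per offending attribute, while the integer form would pass without complaints.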