SerDe should support "null" column values
-----------------------------------------
Key: HIVE-226
URL: https://issues.apache.org/jira/browse/HIVE-226
Project: Hadoop Hive
Issue Type: Improvement
Components: Serializers/Deserializers
Reporter: Josh Ferguson
Currently, loading data that contains null values succeeds, but selecting that
data back out of the table fails with a NullPointerException during
deserialization.
Suppose we have a generic users table with ^A-separated fields:
CREATE TABLE users
(id STRING, properties MAP<STRING, STRING>)
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '44'
MAP KEYS TERMINATED BY '58'
STORED AS TEXTFILE;
We might insert this data (where spaces stand for ^A characters):
1 key:value
2
3 key:value
Then the following queries will fail:
SELECT id FROM users;
SELECT id, properties FROM users;
SELECT properties FROM users;
But they should not fail.
Any time the field delimiter occurs twice in a row, or is immediately followed
by the line delimiter, a null (non-existent) value should be assumed for the
corresponding column.
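The rule above can be illustrated with a minimal sketch. This is not Hive's SerDe code; it is a standalone Python illustration, and the names FIELD_DELIM and split_row are hypothetical. It also assumes two things the report does not state: that an empty field and a null are treated the same, and that a row with fewer fields than columns is padded with nulls.

```python
FIELD_DELIM = "\x01"  # ^A, the field delimiter used in the example table


def split_row(line, num_columns):
    """Split one text row into column values, using None for missing values.

    Hypothetical helper for illustration only; not part of Hive.
    """
    # Drop the line delimiter, then split on the field delimiter.
    # str.split keeps empty strings between adjacent delimiters.
    raw = line.rstrip("\n").split(FIELD_DELIM)
    # Assumption: a row that ends early is padded with empty fields.
    raw += [""] * (num_columns - len(raw))
    # An empty field (delimiter twice in a row, or delimiter followed by
    # the line delimiter) becomes None instead of raising an error.
    return [value if value != "" else None for value in raw[:num_columns]]


rows = ["1\x01key:value\n", "2\x01\n", "3\x01key:value\n"]
for row in rows:
    print(split_row(row, 2))
```

With this behaviour the second row yields a null in the properties column rather than failing.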
To work around this in my application, I have been substituting my own
"reserved" word NULL and the key/value pair NULL:NULL to indicate to my
application that particular fields currently have no value.
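A rough sketch of that workaround, under stated assumptions: the function names encode_row and decode_properties are hypothetical, and the comma and colon are the characters with ASCII codes 44 and 58 from the table definition. The reporter's actual application code is not shown in this issue.

```python
FIELD_DELIM = "\x01"      # ^A field delimiter
NULL_WORD = "NULL"        # reserved word standing in for a missing scalar
NULL_PAIR = "NULL:NULL"   # reserved key/value pair standing in for a missing map


def encode_row(user_id, properties):
    """Build one text row, writing sentinels where a value is missing."""
    id_field = NULL_WORD if user_id is None else user_id
    if not properties:
        map_field = NULL_PAIR
    else:
        # Items separated by ',' (ASCII 44), keys from values by ':' (ASCII 58).
        map_field = ",".join(f"{k}:{v}" for k, v in properties.items())
    return id_field + FIELD_DELIM + map_field


def decode_properties(properties):
    """Map the sentinel pair read back from the table to 'no value'."""
    return None if properties == {"NULL": "NULL"} else properties
```

The cost of this approach is that NULL can no longer be used as a real key or value, which is why support in the SerDe itself is preferable.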