Bennie Schut created HIVE-3308:
----------------------------------

             Summary: Mixing avro and snappy gives null values
                 Key: HIVE-3308
                 URL: https://issues.apache.org/jira/browse/HIVE-3308
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.10.0
            Reporter: Bennie Schut


By default, Hive uses LazySimpleSerDe for output.
When I enable compression and run "select count(*) from avrotable", the output 
is a file with the .avro extension. Reading it back yields null values, because 
the file is not actually an Avro file: it was written by LazySimpleSerDe with 
compression enabled, so it should be a .snappy file.
This causes any job to show null values (the exception is "select * from 
avrotable", which is not truly a job).
If you use any SerDe other than Avro, you can temporarily fix this with 
"set hive.output.file.extension=.snappy" and it will work correctly again, but 
this won't work with Avro, since it overwrites hive.output.file.extension 
during initialization.
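
One way to confirm the mismatch is to look at the first bytes of the output 
file. Avro object container files begin with the three bytes "Obj" followed by 
the byte 0x01, so a file that carries the .avro extension but lacks that header 
was not written by an Avro writer. A minimal Python sketch, assuming the file 
has first been copied to the local filesystem; the function name 
looks_like_avro is hypothetical and not part of Hive or Avro:

```python
# Magic header of an Avro object container file: 'O', 'b', 'j', 0x01.
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro(path):
    """Return True if the file at `path` starts with the Avro container magic.

    Hypothetical helper for illustration; it only inspects the header and
    does not validate the rest of the file.
    """
    with open(path, "rb") as f:
        return f.read(len(AVRO_MAGIC)) == AVRO_MAGIC
```

If this returns False for a result file named *.avro, the extension does not 
match the content, which would be consistent with the behaviour described above.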

When you dump the query result into a table with "create table bla as", you can 
rename the .avro file to .snappy and "select from bla" will also 
magically work again.

Input and output SerDes don't always match, so using Avro as an input format 
should not set hive.output.file.extension.
Once it is set, all subsequent queries use it and fail, which makes the 
connection useless to reuse.
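
The leak can be illustrated with a toy model. This is a simplified Python 
sketch and not Hive's actual code: the session configuration is modelled as a 
dict, and the functions init_avro_serde and output_extension are hypothetical 
names. It assumes that an explicitly set extension takes precedence over the 
codec's own extension.

```python
def init_avro_serde(session_conf):
    # Models the reported behaviour: initializing the Avro SerDe
    # overwrites the session-wide output file extension.
    session_conf["hive.output.file.extension"] = ".avro"


def output_extension(session_conf, codec_extension):
    # Assumption: an explicitly set extension wins; otherwise the
    # compression codec's extension is used.
    return session_conf.get("hive.output.file.extension", codec_extension)


session_conf = {}

# Before any Avro table is read, compressed output gets the codec extension.
before = output_extension(session_conf, ".snappy")

# Reading an Avro table initializes the SerDe and changes the session state.
init_avro_serde(session_conf)

# Every later query on the same connection now inherits the Avro extension.
after = output_extension(session_conf, ".snappy")
```

In this model "before" is ".snappy" and "after" is ".avro", even though the 
later output is not written by an Avro writer.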
