Hadoop 0.20.1 and Hive trunk from this week. Monday I'll try to use an older version of Hive to see if that helps, and perhaps also "gz" to see if it's compression in general.

Yongqiang He wrote:
Hi Bennie,
Can you post your hadoop version and hive version?

Thanks
Yongqiang


On 2/5/10 10:05 AM, "Zheng Shao" <[email protected]> wrote:

That seems to be a bug.
Are you using hive trunk or any release?


On 2/5/10, Bennie Schut <[email protected]> wrote:
I have a tab-separated file that I loaded with "load data inpath",
and then I run:

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
select distinct login_cldr_id as cldr_id from chatsessions_load;
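
As a hedged sketch of the "gz" comparison mentioned earlier in the thread: swapping in the stock Hadoop gzip codec for the same query would show whether the NULLs are specific to LzoCodec or to compressed output in general (the gzip codec class names below are the standard Hadoop ones, not taken from the original mail):

```sql
-- Same query, but with Hadoop's built-in gzip codec instead of LZO,
-- to isolate whether the bug is LZO-specific or compression-general.
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
SET mapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
select distinct login_cldr_id as cldr_id from chatsessions_load;
```

If this returns 2283 rather than NULLs, the problem points at the LZO codec integration rather than compressed output as such.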

Ended Job = job_201001151039_1641
OK
NULL
NULL
NULL
Time taken: 49.06 seconds

However, if I run it without the SET commands I get this:
Ended Job = job_201001151039_1642
OK
2283
Time taken: 45.308 seconds

Which is the correct result.

When I do an "insert overwrite" into an RCFile table, it actually
compresses the data correctly.
When I disable compression and query this new table, the result is correct.
When I enable compression, it's wrong again.
I see no errors in the logs.
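
For reference, the RCFile round-trip described above could look something like this (a sketch only; the table name chatsessions_rc and its single column are illustrative, not from the original mail):

```sql
-- Hypothetical RCFile table to receive the compressed insert overwrite.
CREATE TABLE chatsessions_rc (login_cldr_id INT)
STORED AS RCFILE;

-- Write with LZO compression enabled; per the report, the stored data
-- itself comes out correctly compressed.
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
INSERT OVERWRITE TABLE chatsessions_rc
SELECT login_cldr_id FROM chatsessions_load;

-- Querying with compression disabled returns correct results;
-- re-enabling compression reproduces the NULLs.
SET hive.exec.compress.output=false;
select distinct login_cldr_id from chatsessions_rc;
```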

Any ideas why this might happen?
