[jira] Commented: (HIVE-1838) Add quickLZ compression codec for Hive.

2010-12-09 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969855#action_12969855
 ] 

He Yongqiang commented on HIVE-1838:


Just found that there is already a JIRA issue for this on the Hadoop side:

https://issues.apache.org/jira/browse/HADOOP-6349

 Add quickLZ compression codec for Hive.
 ---

 Key: HIVE-1838
 URL: https://issues.apache.org/jira/browse/HIVE-1838
 Project: Hive
  Issue Type: New Feature
Reporter: He Yongqiang



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1838) Add quickLZ compression codec for Hive.

2010-12-07 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969144#action_12969144
 ] 

He Yongqiang commented on HIVE-1838:


No, I mean a compression codec for Hive. It could be used to compress 
intermediate data.

Here are some results:

5. Hadoop compression with native library (COMPRESSLEVEL=BEST_SPEED)
time java -Djava.library.path=/data/users/heyongqiang/hadoop-0.20/build/native/Linux-amd64-64/lib/ CompressFile

real    0m34.179s
user    0m29.031s
sys     0m1.607s

compressed size: 275M

6. LZF
[heyongqi...@dev782 compress_test]$ time lzf -c 00_0 

real    0m39.031s
user    0m8.727s
sys     0m2.231s
compressed size: 393M

7. FastLZ
time fastlz/6pack -1 00_0 00_0.fastlz
real    0m19.020s
user    0m18.083s
sys     0m0.935s

compressed size: 391M

8. QuickLZ
time ./compress_file ../00_0 ../00_0.quicklz

real    0m15.652s
user    0m14.047s
sys     0m1.603s

compressed size: 334M

I modified QuickLZ's compress_file code to use a buffer, for fairness. It turns 
out the result is very close to FastLZ: the modified version of QuickLZ is just 
one second faster.
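
For anyone who wants to repeat a buffered timing run like the ones above, here is 
a minimal sketch. It is not the CompressFile program used in test 5 and does not 
go through Hadoop's codec classes. It assumes the JDK's own java.util.zip 
Deflater at BEST_SPEED as a stand-in, and the class name BufferedCompressBench 
and the 64 KB buffer size are made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Hypothetical helper, not part of Hive or Hadoop.
class BufferedCompressBench {

    /**
     * Streams everything from in through a zlib compressor at BEST_SPEED,
     * reading and writing in chunks of bufferSize bytes.
     * Returns the number of compressed bytes written to out.
     */
    static long compress(InputStream in, OutputStream out, int bufferSize) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            DeflaterOutputStream dos = new DeflaterOutputStream(out, deflater, bufferSize);
            byte[] buf = new byte[bufferSize];
            int n;
            while ((n = in.read(buf)) != -1) {
                dos.write(buf, 0, n);
            }
            dos.finish();   // emit the remaining compressed data, leave out open
            dos.flush();
            return deflater.getBytesWritten();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            deflater.end(); // release the native zlib resources
        }
    }

    // Usage: java BufferedCompressBench <input file> <output file>
    public static void main(String[] args) throws IOException {
        int bufferSize = 64 * 1024; // assumed buffer size, tune as needed
        long start = System.nanoTime();
        long written;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(Paths.get(args[0])));
             OutputStream out = new BufferedOutputStream(Files.newOutputStream(Paths.get(args[1])))) {
            written = compress(in, out, bufferSize);
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("compressed size: %d bytes, elapsed: %.3f s%n", written, seconds);
    }
}
```

Timing inside the JVM excludes JVM startup, so the numbers are not directly 
comparable with the `time` figures above. Wrapping the run in `time java ...` 
gives the same kind of real/user/sys breakdown.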

