GitHub user shardul-cr7 opened a pull request:

    https://github.com/apache/carbondata/pull/2847

    [WIP]Support Gzip as column compressor

    Gzip-compressed files are smaller than Snappy-compressed ones, but loading takes longer.
    
    Data generated by tpch-dbgen(lineitem)
    
    **Load Performance Comparisons (Compression)**
    
    *Test Case 1*
    *File Size 3.9G*
    *Records ~30M*
    
    | Codec Used | Load Time | File Size After Load |
    | ------ | ------ | ------ |
    | Snappy | 156s | 101M |
    | Zstd | 153s | 2.2M |
    | Gzip | 163s | 12.1M |
    
    *Test Case 2*
    *File Size 7.8G*
    *Records ~60M*
    
    | Codec Used | Load Time | File Size After Load |
    | ------ | ------ | ------ |
    | Snappy | 336s | 203.6M |
    | Zstd | 352s | 4.3M |
    | Gzip | 354s | 12.1M |
    
    **Query Performance (Decompression)**
    
    *Test Case 1*
    
    | Codec Used | Full Scan Time |
    | ------ | ------ |
    | Snappy | 16.108s |
    | Zstd | 14.595s |
    | Gzip | 14.313s |
    
    *Test Case 2*
    
    | Codec Used | Full Scan Time |
    | ------ | ------ |
    | Snappy | 23.559s |
    | Zstd | 23.913s |
    | Gzip | 26.741s |
    
    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     - [ ] Any interfaces changed?
     
     - [ ] Any backward compatibility impacted?
     
     - [ ] Document update required?
    
     - [x] Testing done
          Added some test cases.
           
     - [ ] For large changes, please consider breaking it into sub-tasks under
           an umbrella JIRA.
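
The size-versus-time trade-off reported in the description can be tried on a small scale with the JDK's built-in gzip streams. The sketch below is only an illustration and is not taken from the patch: the class name `GzipRoundTrip`, its helper methods, and the sample data are all hypothetical, and it does not use any CarbonData compressor interface. It compresses a repetitive byte array, restores it, and prints the sizes and the elapsed compression time.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration; not part of CarbonData.
public class GzipRoundTrip {

  // Compress a byte array using the JDK gzip stream (default compression level).
  static byte[] compress(byte[] input) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
      gzip.write(input);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // The stream is closed at this point, so the gzip trailer has been written.
    return buffer.toByteArray();
  }

  // Restore the original bytes from gzip-compressed input.
  static byte[] decompress(byte[] compressed) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (GZIPInputStream gzip =
             new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] chunk = new byte[4096];
      int read;
      while ((read = gzip.read(chunk)) != -1) {
        buffer.write(chunk, 0, read);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  public static void main(String[] args) {
    // Made-up, highly repetitive sample standing in for column data.
    StringBuilder sample = new StringBuilder();
    for (int i = 0; i < 10000; i++) {
      sample.append("DELIVER IN PERSON|");
    }
    byte[] raw = sample.toString().getBytes(StandardCharsets.UTF_8);

    long start = System.nanoTime();
    byte[] packed = compress(raw);
    long elapsedMicros = (System.nanoTime() - start) / 1000;

    byte[] restored = decompress(packed);
    System.out.println("raw bytes:        " + raw.length);
    System.out.println("compressed bytes: " + packed.length);
    System.out.println("compress time:    " + elapsedMicros + " us");
    System.out.println("round trip ok:    " + Arrays.equals(raw, restored));
  }
}
```

Timings from a toy run like this are not comparable to the load and scan figures in the tables above; it only shows the mechanics of a gzip round trip on a byte array.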
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shardul-cr7/carbondata b010

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2847.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2847
    
----
commit 6ad88ccc5663353d16372d91878d7efb223b16d6
Author: shardul-cr7 <shardulsingh22@...>
Date:   2018-10-23T11:57:47Z

    [WIP]Support Gzip

----

