GitHub user jinossy commented on the pull request:

    https://github.com/apache/tajo/pull/311#issuecomment-76879983
  
    I’ve successfully tested this with real data on my company’s cluster.
    * ENV
     * 2 TajoMaster + 4 TajoWorker 
     * JDK 1.7.0_67
     * 1G Network
    
    ```
    Json Table 
    2TB compressed by snappy
    7.3TB Actual bytes
    
    select count(*) from (select id from table1 group by id) t1;
    Progress: 100%, response time: 3546.781 sec
    ?count
    -------------------------------
    2802809536
    (1 rows, 3546.781 sec, 11 B selected)
    
    
    Parquet table
    8.1TB compressed by snappy
    select count(*) from table2;
    Progress: 100%, response time: 374.358 sec
    ?count
    -------------------------------
    16090817643
    (1 rows, 374.358 sec, 12 B selected)
    ```
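
For a rough sense of scale, the reported figures can be turned into aggregate rates. This is only a back-of-the-envelope sketch: it assumes "TB" means 10**12 bytes (the report does not say TB or TiB), that the JSON query read the full 7.3 TB, and that work was spread evenly over the 4 TajoWorkers. The helper `rate` and the variable names are illustrative, not part of Tajo.

```python
# Back-of-the-envelope rates derived only from the figures reported above.
# Assumption: "TB" means 10**12 bytes; the report does not specify TB vs. TiB.
TB = 10**12
WORKERS = 4  # TajoWorker count from the ENV list


def rate(amount: float, seconds: float) -> float:
    """Return amount processed per second."""
    return amount / seconds


# JSON table: group-by query, assumed to scan all 7.3 TB of actual bytes.
json_secs = 3546.781
json_bytes_per_sec = rate(7.3 * TB, json_secs)

# Parquet table: count(*) reported 16,090,817,643 rows.
parquet_secs = 374.358
parquet_rows_per_sec = rate(16_090_817_643, parquet_secs)

print(f"JSON: {json_bytes_per_sec / 1e9:.2f} GB/s aggregate, "
      f"{json_bytes_per_sec / WORKERS / 1e9:.2f} GB/s per worker")
print(f"Parquet: {parquet_rows_per_sec / 1e6:.1f} M rows/s aggregate")
```

The two numbers are not directly comparable: the JSON query groups by `id` over a row-oriented format, while a bare `count(*)` on Parquet may not need to decode column data at all.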


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
