Github user jinossy commented on the pull request:
https://github.com/apache/tajo/pull/311#issuecomment-76879983
I've successfully tested this with real data on my company cluster.
* ENV
  * 2 TajoMaster + 4 TajoWorker
  * JDK 1.7.0_67
  * 1G Network
```
Json Table
2TB compressed by snappy
7.3TB Actual bytes
select count(*) from (select id from table1 group by id) t1;
Progress: 100%, response time: 3546.781 sec
?count
-------------------------------
2802809536
(1 rows, 3546.781 sec, 11 B selected)
Parquet table
8.1TB compressed by snappy
select count(*) from table2
Progress: 100%, response time: 374.358 sec
?count
-------------------------------
16090817643
(1 rows, 374.358 sec, 12 B selected)
```
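For a rough sense of scale, the figures above can be turned into throughput numbers. This is only a sketch: it assumes decimal terabytes (1 TB = 10^12 bytes) and that the reported response time covers the whole query. The helper names are illustrative and not part of Tajo.

```python
# Rough throughput estimates derived from the reported figures.
# Assumption: 1 TB = 10**12 bytes (the report does not say decimal or binary).
TB = 10**12


def bytes_per_second(total_bytes: float, seconds: float) -> float:
    """Average scan rate in bytes per second."""
    return total_bytes / seconds


def rows_per_second(rows: int, seconds: float) -> float:
    """Average number of rows processed per second."""
    return rows / seconds


# JSON table: 7.3 TB of actual bytes, query took 3546.781 sec.
json_scan_rate = bytes_per_second(7.3 * TB, 3546.781)

# Parquet table: 16090817643 rows counted in 374.358 sec.
parquet_row_rate = rows_per_second(16090817643, 374.358)

print(f"JSON table: {json_scan_rate / 1e9:.2f} GB/s of uncompressed data")
print(f"Parquet table: {parquet_row_rate / 1e6:.1f} M rows/s counted")
```

The two numbers are not directly comparable: the JSON query groups by `id` before counting, while the Parquet query is a plain `count(*)`.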