[ 
https://issues.apache.org/jira/browse/TAJO-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153890#comment-14153890
 ] 

Hyunsik Choi commented on TAJO-1081:
------------------------------------

Thank you for sharing your investigation.

I also investigated the problem because I could not figure out the main 
cause. 

Your investigation is partially right. We use the number of rows stored in the 
catalog, which can be checked by executing "\d". Sometimes it is not available, 
especially when users register external tables. As you mentioned, in that case 
we need {{select count(*) from ..}} queries to obtain it.

However, the Tajo client uses the number of rows only as supplementary 
information, for the LIMIT clause or for display. Even when the number is not 
available, the Tajo client still works correctly because it can read rows until 
the scanner reaches the end of the tuples.
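
To illustrate the point, here is a minimal sketch of reading rows without 
knowing the row count in advance. It is not the actual Tajo client code: 
{{RowScanner}}, {{fetch}} and {{of}} are hypothetical names, and I assume a 
scanner whose {{next()}} returns null once no tuples are left.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ScanUntilEndSketch {
    // Hypothetical scanner (not the Tajo interface): next() returns null
    // once there are no more tuples.
    public interface RowScanner {
        String next();
    }

    // Reads rows until the scanner is exhausted or the limit is reached.
    // A negative limit means "no LIMIT clause"; no row count is needed.
    public static List<String> fetch(RowScanner scanner, long limit) {
        List<String> rows = new ArrayList<>();
        String row;
        while ((limit < 0 || rows.size() < limit) && (row = scanner.next()) != null) {
            rows.add(row);
        }
        return rows;
    }

    // Wraps an in-memory list as a scanner, for demonstration only.
    public static RowScanner of(List<String> data) {
        Iterator<String> it = data.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("AFRICA", "AMERICA", "ASIA");
        System.out.println(fetch(of(data), -1).size()); // no limit: prints 3
        System.out.println(fetch(of(data), 2).size());  // limit 2: prints 2
    }
}
```

In this sketch the row count would only be a hint, for example for display.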

I also found a bug in the PlannerUtil::getNonZeroLengthDataFiles() method. This 
method should have used AbstractStorageManager.hiddenFileFilter in order to 
skip hidden files, which have the prefix {{.}}. The current implementation 
reads all files, even those that are not valid data files.
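
The intended behavior can be sketched roughly as follows. This is only an 
illustration using java.nio so that it runs standalone; it is not the real 
PlannerUtil or AbstractStorageManager code. The class name, {{isVisible}} and 
{{nonZeroLengthDataFiles}} are hypothetical, and the file names in {{main}} 
are made up.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class HiddenFileFilterSketch {
    // Hypothetical stand-in for a hidden-file filter: accepts a path
    // only if its file name does not start with ".".
    public static boolean isVisible(Path path) {
        Path name = path.getFileName();
        return name != null && !name.toString().startsWith(".");
    }

    // Hypothetical analogue of getNonZeroLengthDataFiles(): returns the
    // regular files in tableDir that are visible and non-empty.
    public static List<Path> nonZeroLengthDataFiles(Path tableDir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(tableDir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && isVisible(p) && Files.size(p) > 0) {
                    result.add(p);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Made-up table directory with one data file, one hidden file,
        // and one empty file; only the data file should be listed.
        Path dir = Files.createTempDirectory("region");
        Files.write(dir.resolve("region.tbl"), "0|AFRICA|sample comment\n".getBytes());
        Files.write(dir.resolve(".hidden"), new byte[] {1, 2, 3});
        Files.createFile(dir.resolve("empty.tbl"));
        System.out.println(nonZeroLengthDataFiles(dir));
    }
}
```

Without the {{isVisible}} check, the hidden file would also be returned and 
scanned as if it contained rows.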

If you want, you can keep working on this issue. Otherwise, I can take it over.

Best regards,
Hyunsik

> Non-forwarded (simple) query shows wrong rows.
> ----------------------------------------------
>
>                 Key: TAJO-1081
>                 URL: https://issues.apache.org/jira/browse/TAJO-1081
>             Project: Tajo
>          Issue Type: Bug
>          Components: client, tajo master
>            Reporter: Hyunsik Choi
>            Assignee: Mai Hai Thanh
>            Priority: Blocker
>             Fix For: 0.9.0
>
>
> Non-forward queries show wrong rows. This is a very urgent and critical bug 
> that must be resolved before the 0.9.0 release.
> {code}
> default> \d region
> table name: default.region
> table path: file:/Users/hyunsik/tpch/region
> store type: CSV
> number of rows: 0
> volume: 494 B
> Options: 
>       'csvfile.delimiter'='|'
> schema: 
> r_regionkey   INT8
> r_name        TEXT
> r_comment     TEXT
> default> 
> default> select * from region;
> r_regionkey,  r_name,  r_comment
> -------------------------------
> ,  ,  
> 2,  ,  
> ,  ,  
> ,  ,  
> ,  ,  
> ,  ,  
> ,  ,  
> ,  ,  
> ,  ,  
> ,  "
>    � (0,  
> 0,  AFRICA,  lar deposits. blithely final packages cajole. regular waters are 
> final requests. regular accounts are according to 
> 1,  AMERICA,  hs use ironic, even requests. s
> 2,  ASIA,  ges. thinly even pinto beans ca
> 3,  EUROPE,  ly final courts cajole furiously final excuse
> 4,  MIDDLE EAST,  uickly special accounts cajole carefully blithely close 
> requests. carefully final asymptotes haggle furiousl
> (15 rows, 0.03 sec, 494 B selected)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
