[ 
https://issues.apache.org/jira/browse/TAJO-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061903#comment-14061903
 ] 

ASF GitHub Bot commented on TAJO-933:
-------------------------------------

Github user hyunsik commented on the pull request:

    https://github.com/apache/tajo/pull/75#issuecomment-49011928
  
    Hi @davidzchen,
    
    We just need the current read offset in order to compute the task progress. 
We also need the write offset and the current memory buffer size in order to 
estimate the size of the output file to be written. We have no other purpose 
beyond these. If you know a better approach, feel free to suggest it.
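
    As a rough sketch of what we would compute from those values (the class and 
method names below are hypothetical and are not part of the Parquet or Tajo API; 
the offsets and buffer size are assumed to be exposed by the forked classes):

```java
/** Hypothetical helper for illustration only; not a Parquet or Tajo class. */
final class ScanStats {
    private ScanStats() {}

    /** Fraction of the input already read, clamped to the range [0, 1]. */
    static float progress(long startOffset, long currentOffset, long length) {
        if (length <= 0) {
            return 1.0f; // empty input: treat it as already finished
        }
        double done = (double) (currentOffset - startOffset) / length;
        return (float) Math.max(0.0, Math.min(1.0, done));
    }

    /** Bytes already written to the file plus bytes still held in memory. */
    static long estimatedOutputSize(long writeOffset, long bufferedBytes) {
        return writeOffset + bufferedBytes;
    }
}
```

    The size is only an estimate, because the bytes buffered in memory may 
differ from what is actually flushed to the file.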
    
    In addition, we will ask the Parquet community for these features. BTW, I 
think that we need this feature right now.


> Fork some classes of Parquet as builtin third-party classes
> -----------------------------------------------------------
>
>                 Key: TAJO-933
>                 URL: https://issues.apache.org/jira/browse/TAJO-933
>             Project: Tajo
>          Issue Type: Improvement
>          Components: storage
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>            Priority: Minor
>              Labels: newbie, parquet
>             Fix For: 0.9.0
>
>
> Parquet has strict access modifiers and an encapsulated design. This is well 
> designed, but it does not allow us to add desired features to Parquet. For 
> example, it is hard to get the written file size and how much of the memory 
> buffer is filled.
> I propose forking some classes of the Parquet file format as embedded 
> third-party classes. After that, we can revise them to suit Tajo.



--
This message was sent by Atlassian JIRA
(v6.2#6252)