[ https://issues.apache.org/jira/browse/PARQUET-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793530#comment-16793530 ]
Tham commented on PARQUET-1022:
-------------------------------

Uhm, after looking into the C++ API, I think it would take a lot of work either to concatenate multiple files by binary blocks, as [~xhochy] suggested, or to write data and metadata separately, as [~wesmckinn] suggested. The current implementation does not seem to be designed for this purpose, and doing either looks tricky.

I also saw that the merge-tool commit in Java has recently been reverted [https://github.com/apache/parquet-mr/commit/ab42fe5180366120336fb3f8b9e6540aadb5da1b] :(

Correct me if I'm wrong. Any ideas are welcome.

> [C++] Append mode in parquet-cpp
> --------------------------------
>
>                 Key: PARQUET-1022
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1022
>             Project: Parquet
>          Issue Type: New Feature
>          Components: parquet-cpp
>    Affects Versions: cpp-1.1.0
>            Reporter: yugu
>            Assignee: Wes McKinney
>            Priority: Major
>
> As said, currently trying to work out an append feature for parquet files in
> c++.
> (been searching through the repo etc., can't find an example tho..)
> Current solution is to (assuming no schema changes, that is):
> Read in metadata
> Change metadata based on appended rows + original rows
> Append a new row group (or multiple row group writer)
> Write the new rows.
> ---
> The problem is that, if approached this way, the original last row group may
> not be completely filled. Was wondering if there is a fix or I'm using the api
> wrong...
> Thanks ! : D

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
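
The append steps in the quoted description (read the metadata, update it for the original plus appended rows, add new row groups) can be sketched with a toy model. This is only an illustration under stated assumptions: `RowGroupMeta`, `FileMeta`, and `AppendRows` are hypothetical names, not parquet-cpp types or functions, and no file I/O is performed. It shows why the original last row group stays partially filled when new rows only ever go into new row groups.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for footer metadata; NOT parquet-cpp types.
struct RowGroupMeta {
  int64_t num_rows;
};

struct FileMeta {
  std::vector<RowGroupMeta> row_groups;
  int64_t num_rows = 0;  // total rows across all row groups
};

// Returns updated metadata after appending new_rows rows, split into new
// row groups of at most max_rows_per_group rows each. Existing row groups
// are left untouched, so a partially filled last group stays partial.
FileMeta AppendRows(FileMeta meta, int64_t new_rows,
                    int64_t max_rows_per_group) {
  while (new_rows > 0) {
    const int64_t n = std::min(new_rows, max_rows_per_group);
    meta.row_groups.push_back({n});
    meta.num_rows += n;
    new_rows -= n;
  }
  return meta;
}
```

For example, starting from one row group of 70 rows and appending 250 rows with a limit of 100 rows per group gives row groups of 70, 100, 100, and 50 rows: the first group is never topped up, which is the situation the reporter describes.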