[
https://issues.apache.org/jira/browse/PARQUET-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793530#comment-16793530
]
Tham commented on PARQUET-1022:
-------------------------------
Uhm, after looking into the C++ API, I think it would take a lot of work either to
concatenate multiple files by binary blocks, as [~xhochy] suggested, or to write data and
metadata separately, as [~wesmckinn] suggested. The current implementation does not
seem to be designed for this purpose, so either approach looks tricky.
I also saw that the merge-tool commit in Java was recently reverted
[https://github.com/apache/parquet-mr/commit/ab42fe5180366120336fb3f8b9e6540aadb5da1b]
:(
Correct me if I'm wrong. Any ideas are welcome.
> [C++] Append mode in parquet-cpp
> --------------------------------
>
> Key: PARQUET-1022
> URL: https://issues.apache.org/jira/browse/PARQUET-1022
> Project: Parquet
> Issue Type: New Feature
> Components: parquet-cpp
> Affects Versions: cpp-1.1.0
> Reporter: yugu
> Assignee: Wes McKinney
> Priority: Major
>
> As the title says, I'm currently trying to work out an append feature for Parquet
> files in C++.
> (I've been searching through the repo etc. but can't find an example.)
> My current solution (assuming no schema changes) is to:
> 1. Read in the metadata
> 2. Change the metadata based on the appended rows + original rows
> 3. Append a new row group (or multiple row group writers)
> 4. Write the new rows
> ---
> The problem is that, if approached this way, the original last row group may
> not be completely filled. I was wondering if there is a fix or if I'm using the
> API wrong...
> Thanks! :D
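
To make the four steps in the description concrete, here is a minimal sketch on a toy footer-at-end container. It is NOT the Parquet format and does not use the parquet-cpp API: the layout, the "TOY1" magic, and the names RowGroupMeta, WriteFooter, ReadFooter and AppendRowGroup are all hypothetical and exist only for illustration. It only borrows the general idea that the metadata sits at the end of the file, so an append can drop the old footer, add a row group, and write an updated footer.

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Toy layout (an assumption for illustration, not Parquet):
//   [row group bytes]... [footer entries] [uint32 footer size] ["TOY1"]
struct RowGroupMeta {
  uint64_t offset;    // byte offset of the row group in the buffer
  uint64_t length;    // size of the row group in bytes
  uint64_t num_rows;  // number of rows stored in the row group
};

static const char kMagic[4] = {'T', 'O', 'Y', '1'};

// Serialize the footer (entries, then size, then magic) at the end of buf.
void WriteFooter(std::string& buf, const std::vector<RowGroupMeta>& groups) {
  const uint32_t size =
      static_cast<uint32_t>(groups.size() * sizeof(RowGroupMeta));
  if (!groups.empty()) {
    buf.append(reinterpret_cast<const char*>(groups.data()), size);
  }
  buf.append(reinterpret_cast<const char*>(&size), sizeof(size));
  buf.append(kMagic, sizeof(kMagic));
}

// Step 1: read the metadata and report where the footer begins.
std::vector<RowGroupMeta> ReadFooter(const std::string& buf,
                                     size_t* footer_start) {
  const size_t tail = sizeof(uint32_t) + sizeof(kMagic);
  if (buf.size() < tail ||
      std::memcmp(buf.data() + buf.size() - sizeof(kMagic), kMagic,
                  sizeof(kMagic)) != 0) {
    throw std::runtime_error("not a toy container");
  }
  uint32_t size = 0;
  std::memcpy(&size, buf.data() + buf.size() - tail, sizeof(size));
  if (size % sizeof(RowGroupMeta) != 0 || size > buf.size() - tail) {
    throw std::runtime_error("corrupt footer");
  }
  *footer_start = buf.size() - tail - size;
  std::vector<RowGroupMeta> groups(size / sizeof(RowGroupMeta));
  if (size != 0) {
    std::memcpy(groups.data(), buf.data() + *footer_start, size);
  }
  return groups;
}

// Steps 2-4: update the metadata, append one new row group, rewrite the footer.
void AppendRowGroup(std::string& buf, const std::string& rows,
                    uint64_t num_rows) {
  size_t footer_start = 0;
  std::vector<RowGroupMeta> groups = ReadFooter(buf, &footer_start);
  buf.resize(footer_start);  // drop the old footer; old row groups stay as-is
  groups.push_back({static_cast<uint64_t>(buf.size()),
                    static_cast<uint64_t>(rows.size()), num_rows});
  buf.append(rows);
  WriteFooter(buf, groups);
}
```

Note that AppendRowGroup never touches the existing row groups, which is exactly the drawback raised above: if the original last row group was only partially filled, it stays that way, and the appended rows land in a separate row group.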
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)