[ https://issues.apache.org/jira/browse/PARQUET-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572225#comment-16572225 ]
Anatoli Shein commented on PARQUET-1372:
----------------------------------------
[~renato2099], thanks! The current plan is to make each row group as close
to the given size as possible without going over, while ensuring that each
row group contains at least one record.
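
To make the plan concrete, here is a minimal sketch of the splitting rule on
its own, outside of any writer. It is not the parquet-cpp API: PlanRowGroups,
record_bytes and target_bytes are hypothetical names, and the per-record sizes
are assumed to be estimates supplied by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of parquet-cpp. Given an estimated encoded
// size for each record, returns how many records go into each row group.
std::vector<std::size_t> PlanRowGroups(
    const std::vector<std::int64_t>& record_bytes, std::int64_t target_bytes) {
  assert(target_bytes > 0);
  std::vector<std::size_t> rows_per_group;
  std::size_t rows = 0;    // records in the row group being filled
  std::int64_t bytes = 0;  // estimated bytes in the row group being filled
  for (std::int64_t size : record_bytes) {
    // Close the current group if this record would push it over the target,
    // but only when the group already holds at least one record.
    if (rows > 0 && bytes + size > target_bytes) {
      rows_per_group.push_back(rows);
      rows = 0;
      bytes = 0;
    }
    ++rows;
    bytes += size;
  }
  // Flush the last, possibly partial, group.
  if (rows > 0) rows_per_group.push_back(rows);
  return rows_per_group;
}
```

For example, record sizes {40, 40, 40, 200, 10} with a target of 100 give row
groups of 2, 1, 1 and 1 records. The 200-byte record ends up alone in a group
that exceeds the target, because the at-least-one-record rule takes priority
in that case.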
> [C++] Add an API to allow writing RowGroups based on their size rather than
> num_rows
> ------------------------------------------------------------------------------------
>
> Key: PARQUET-1372
> URL: https://issues.apache.org/jira/browse/PARQUET-1372
> Project: Parquet
> Issue Type: Task
> Reporter: Anatoli Shein
> Assignee: Anatoli Shein
> Priority: Major
> Fix For: 1.5.0
>
>
> The current API allows writing RowGroups with a specified number of rows,
> but does not allow writing RowGroups of a specified size. To write
> RowGroups of a specified size, we need to write rows in chunks while
> checking total_bytes_written after each chunk is written. This is
> currently impossible because the call to NextColumn() closes the current
> column writer.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)