[ https://issues.apache.org/jira/browse/PARQUET-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572225#comment-16572225 ]
Anatoli Shein commented on PARQUET-1372:
----------------------------------------

[~renato2099], thanks! The current plan is to make each row group approximately the given size without going over, while ensuring that each row group contains at least one record.

> [C++] Add an API to allow writing RowGroups based on their size rather than num_rows
> ------------------------------------------------------------------------------------
>
>                 Key: PARQUET-1372
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1372
>             Project: Parquet
>          Issue Type: Task
>            Reporter: Anatoli Shein
>            Assignee: Anatoli Shein
>            Priority: Major
>             Fix For: 1.5.0
>
>
> The current API allows writing RowGroups with a specified number of rows,
> but does not allow writing RowGroups of a specified size. To write
> RowGroups of a specified size, we need to write rows in chunks while
> checking total_bytes_written after each chunk is written. This is
> currently impossible because the call to NextColumn() closes the current
> column writer.
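
To make the plan in the comment concrete, here is a rough sketch of the grouping rule only: fill a row group up to the size limit without exceeding it, but always keep at least one record per group. It is not the proposed parquet-cpp API. `PlanRowGroups`, `record_bytes`, and `max_row_group_bytes` are hypothetical names, and the sketch assumes each record's encoded size is known up front, whereas the real writer would have to check total_bytes_written after each chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of parquet-cpp. Given the encoded size of each
// record in bytes, returns the number of records assigned to each row group.
std::vector<std::size_t> PlanRowGroups(const std::vector<std::int64_t>& record_bytes,
                                       std::int64_t max_row_group_bytes) {
  assert(max_row_group_bytes > 0);
  std::vector<std::size_t> rows_per_group;
  std::size_t rows_in_group = 0;
  std::int64_t bytes_in_group = 0;

  for (std::int64_t size : record_bytes) {
    // Close the current group if this record would push it over the limit,
    // but only when the group already holds at least one record.
    if (rows_in_group > 0 && bytes_in_group + size > max_row_group_bytes) {
      rows_per_group.push_back(rows_in_group);
      rows_in_group = 0;
      bytes_in_group = 0;
    }
    ++rows_in_group;
    bytes_in_group += size;
  }

  // Flush the last, possibly partial, group.
  if (rows_in_group > 0) {
    rows_per_group.push_back(rows_in_group);
  }
  return rows_per_group;
}
```

With record sizes {40, 40, 40, 150, 10} and a 100-byte limit, this yields groups of 2, 1, 1, and 1 records. The 150-byte record gets a group of its own, which is the one case where the "at least one record" rule overrides the size limit.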