[ 
https://issues.apache.org/jira/browse/PARQUET-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572225#comment-16572225
 ] 

Anatoli Shein commented on PARQUET-1372:
----------------------------------------

[~renato2099], thanks! The current plan is to make each row group 
approximately the given size without exceeding it, while ensuring that each 
row group contains at least one record.
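
To make the plan concrete, here is a minimal sketch of the grouping rule in plain C++. It does not use the Parquet C++ API: PlanRowGroups, record_sizes and max_row_group_bytes are hypothetical names for illustration, and the per-record sizes are assumed to be known up front. In a real writer they would come from a bytes-written counter checked after each chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the Parquet C++ API.
// Given the size in bytes of each record, returns the number of records to
// place in each row group so that a group stays at or under
// max_row_group_bytes, while every group holds at least one record.
std::vector<std::size_t> PlanRowGroups(
    const std::vector<std::int64_t>& record_sizes,
    std::int64_t max_row_group_bytes) {
  assert(max_row_group_bytes > 0);  // assumption: the size limit is positive

  std::vector<std::size_t> rows_per_group;
  std::size_t rows_in_group = 0;
  std::int64_t bytes_in_group = 0;

  for (std::int64_t size : record_sizes) {
    // Close the current group if this record would push it over the limit,
    // but only when the group already holds at least one record.
    if (rows_in_group > 0 && bytes_in_group + size > max_row_group_bytes) {
      rows_per_group.push_back(rows_in_group);
      rows_in_group = 0;
      bytes_in_group = 0;
    }
    ++rows_in_group;
    bytes_in_group += size;
  }

  // Flush the last, possibly partially filled, group.
  if (rows_in_group > 0) {
    rows_per_group.push_back(rows_in_group);
  }
  return rows_per_group;
}
```

With this rule, a group goes over the limit only when a single record is itself larger than the limit. That record is placed in a group of its own, which satisfies the "at least one record" requirement.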

> [C++] Add an API to allow writing RowGroups based on their size rather than 
> num_rows
> ------------------------------------------------------------------------------------
>
>                 Key: PARQUET-1372
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1372
>             Project: Parquet
>          Issue Type: Task
>            Reporter: Anatoli Shein
>            Assignee: Anatoli Shein
>            Priority: Major
>             Fix For: 1.5.0
>
>
> The current API allows writing RowGroups with a specified number of rows, 
> but does not allow writing RowGroups of a specified size. To write RowGroups 
> of a specified size, we need to write rows in chunks while checking 
> total_bytes_written after each chunk is written. This is currently 
> impossible because the call to NextColumn() closes the current column 
> writer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
