[ https://issues.apache.org/jira/browse/HUDI-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan closed HUDI-3110.
-------------------------------------
    Resolution: Invalid

> parquet max file size not honored
> ---------------------------------
>
>                 Key: HUDI-3110
>                 URL: https://issues.apache.org/jira/browse/HUDI-3110
>             Project: Apache Hudi
>          Issue Type: Bug
>    Affects Versions: 0.11.0
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>              Labels: sev:high
>             Fix For: 0.11.0
>
>
> Setting hoodie.parquet.max.file.size is not honored.
> I still see file sizes reach roughly 120 MB even though I configured the max
> parquet file size to 50 MB (52428800 bytes).
> This happens in both the row writer path and the non-row-writer path.
>  
>  df.write.format("hudi").
>    option(PRECOMBINE_FIELD_OPT_KEY, "other").
>    option(RECORDKEY_FIELD_OPT_KEY, "id").
>    option(PARTITIONPATH_FIELD_OPT_KEY, "type").
>    option(OPERATION_OPT_KEY, "bulk_insert").
>    option("hoodie.bulkinsert.shuffle.parallelism", "4").
>    option("hoodie.parquet.max.file.size", "52428800").
>    option(TABLE_NAME, tableName).
>    option("hoodie.datasource.write.row.writer.enable", "false").
>    mode(Overwrite).
>    save(basePath)
>  
>  ls -ltr /tmp/hudi_trips_cow/PullRequestEvent
> total 754048
> -rw-r--r--  1 nsb  wheel  121847456 Dec 27 19:14 e199774a-ceec-47bb-883e-4e669877f778-3_1-34-192_20211227191149448.parquet
> -rw-r--r--  1 nsb  wheel  119741276 Dec 27 19:14 e199774a-ceec-47bb-883e-4e669877f778-4_1-34-192_20211227191149448.parquet
> -rw-r--r--  1 nsb  wheel  114652047 Dec 27 19:14 e199774a-ceec-47bb-883e-4e669877f778-5_1-34-192_20211227191149448.parquet
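
To put numbers on the gap described above, here is a minimal sketch in plain Scala that compares the configured limit with the three file sizes from the listing. It uses only values that appear in the report. The object and helper names (MaxFileSizeCheck, toMiB, oversized) are made up for illustration and are not part of Hudi. It does not explain why the files came out larger; it only does the arithmetic.

```scala
// Hypothetical helper, not a Hudi API: plain arithmetic on the reported numbers.
object MaxFileSizeCheck {
  // Value passed to hoodie.parquet.max.file.size in the report (bytes).
  val configuredMaxBytes: Long = 52428800L

  // Sizes copied from the `ls -ltr` listing in the report (bytes).
  val observedBytes: Seq[Long] = Seq(121847456L, 119741276L, 114652047L)

  // Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes).
  def toMiB(bytes: Long): Double = bytes.toDouble / (1024L * 1024L)

  // Return the sizes that exceed the given limit.
  def oversized(sizes: Seq[Long], limitBytes: Long): Seq[Long] =
    sizes.filter(_ > limitBytes)

  def main(args: Array[String]): Unit = {
    println(f"configured limit: ${toMiB(configuredMaxBytes)}%.1f MiB")
    observedBytes.foreach { b =>
      val ratio = b.toDouble / configuredMaxBytes
      println(f"$b bytes = ${toMiB(b)}%.1f MiB ($ratio%.2f x the limit)")
    }
    val over = oversized(observedBytes, configuredMaxBytes)
    println(s"${over.size} of ${observedBytes.size} files exceed the limit")
  }
}
```

The configured value 52428800 is exactly 50 MiB (50 * 1024 * 1024), so each listed file is more than twice the requested limit.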



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
