[ 
https://issues.apache.org/jira/browse/FLINK-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Young reassigned FLINK-22472:
----------------------------------

    Assignee: luoyuxia

> The real partition data produced time is behind meta(_SUCCESS) file produced
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-22472
>                 URL: https://issues.apache.org/jira/browse/FLINK-22472
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / FileSystem, Connectors / Hive
>            Reporter: Leonard Xu
>            Assignee: luoyuxia
>            Priority: Major
>         Attachments: image-2021-05-25-14-27-40-563.png
>
>
> I tested writing some data to a CSV file with the Flink filesystem 
> connector. After the success file was produced, the data file was still 
> uncommitted (in progress), which looks wrong to me.
> {code:java}
> bang@mac db1.db $ll /var/folders/55/cw682b314gn8jhfh565hp7q00000gp/T/junit8642959834366044048/junit484868942580135598/test-partition-time-commit/d\=2020-05-03/e\=12/
> total 8
> drwxr-xr-x  4 bang  staff  128  4 25 19:57 ./
> drwxr-xr-x  8 bang  staff  256  4 25 19:57 ../
> -rw-r--r--  1 bang  staff   12  4 25 19:57 .part-b703d4b9-067a-4dfe-935e-3afc723aed56-0-4.inprogress.b7d9cf09-0f72-4dce-8591-b61b1d23ae9b
> -rw-r--r--  1 bang  staff    0  4 25 19:57 _MY_SUCCESS
> {code}
>  
> After some debugging I found that I have to set 
> {{sink.rolling-policy.file-size}} or 
> {{sink.rolling-policy.rollover-interval}}. The default values of the two 
> parameters are quite large (128M and 30min), which is inconvenient for 
> tests and demos. I think we can improve this.
>  
> As the doc [1] describes, for row formats (csv, json) you can set the 
> parameter {{sink.rolling-policy.file-size}} or 
> {{sink.rolling-policy.rollover-interval}} in the connector properties, 
> together with the parameter {{execution.checkpointing.interval}} in 
> flink-conf.yaml, if you don’t want to wait a long time before the data 
> becomes visible in the file system.
> [1] 
> https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/filesystem/#rolling-policy

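To illustrate why the 12-byte part file in the listing above can stay in progress, here is a minimal sketch of a size/time rolling decision. It is a simplified model, not Flink's actual implementation: the class name, the method name and the 10-second file age are assumptions made for illustration, and only the two defaults (128M and 30min) come from the description.

```java
import java.time.Duration;

// Simplified, hypothetical model of a size/time rolling decision.
// This is NOT Flink's implementation; all names here are illustrative.
final class RollingPolicySketch {
    // Defaults quoted in the issue description: 128M and 30min.
    static final long DEFAULT_FILE_SIZE_BYTES = 128L * 1024 * 1024;
    static final Duration DEFAULT_ROLLOVER_INTERVAL = Duration.ofMinutes(30);

    // A part file is rolled once it is large enough or old enough.
    static boolean shouldRoll(long partSizeBytes, Duration partAge,
                              long maxSizeBytes, Duration rolloverInterval) {
        return partSizeBytes >= maxSizeBytes
                || partAge.compareTo(rolloverInterval) >= 0;
    }

    public static void main(String[] args) {
        long partSize = 12;                       // size of the .inprogress file in the listing
        Duration partAge = Duration.ofSeconds(10); // assumed age, for illustration only

        // With the defaults, neither threshold is reached.
        System.out.println(shouldRoll(partSize, partAge,
                DEFAULT_FILE_SIZE_BYTES, DEFAULT_ROLLOVER_INTERVAL));

        // With a short (hypothetical) 5-second rollover interval, the age threshold is reached.
        System.out.println(shouldRoll(partSize, partAge,
                DEFAULT_FILE_SIZE_BYTES, Duration.ofSeconds(5)));
    }
}
```

Under these assumptions the first call returns false and the second returns true, which matches the observation that a small test file is not rolled with the default settings.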


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
