[
https://issues.apache.org/jira/browse/FLINK-20538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Flink Jira Bot updated FLINK-20538:
-----------------------------------
Labels: auto-deprioritized-major stale-minor (was:
auto-deprioritized-major)
I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help
the community manage its development. I see this issue has been marked as
Minor but is unassigned and neither itself nor its Sub-Tasks have been updated
for 180 days. I have gone ahead and marked it "stale-minor". If this ticket is
still Minor, please either assign yourself or give an update. Afterwards,
please remove the label or in 7 days the issue will be deprioritized.
> sink.rolling-policy.file-size does not work in filesystem connector
> -------------------------------------------------------------------
>
> Key: FLINK-20538
> URL: https://issues.apache.org/jira/browse/FLINK-20538
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem, Table SQL / Ecosystem
> Affects Versions: 1.11.1
> Reporter: zhuxiaoshang
> Priority: Minor
> Labels: auto-deprioritized-major, stale-minor
>
> When I use the SQL filesystem connector to write data to HDFS with
> sink.rolling-policy.file-size set to 50MB, the setting does not seem to take
> effect: files larger than 100MB are still produced.
> My table DDL is:
>
> {code:sql}
> CREATE TABLE cpc_bd_recall_log_hdfs (
>   log_timestamp BIGINT,
>   ip STRING,
>   `raw` STRING,
>   `day` STRING,
>   `hour` STRING,
>   `minute` STRING
> ) PARTITIONED BY (`day`, `hour`, `minute`) WITH (
>   'connector' = 'filesystem',
>   'path' = 'hdfs://xxx/test.db/hdfs_test',
>   'format' = 'parquet',
>   'parquet.compression' = 'SNAPPY',
>   'sink.rolling-policy.file-size' = '50MB',
>   'sink.partition-commit.policy.kind' = 'success-file',
>   'sink.partition-commit.delay' = '60s'
> );
> {code}
> The HDFS files are:
>
>
> {code:java}
> 0 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/_SUCCESS
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-0-2500
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-0-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-1-2499
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-1-2500
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-10-2501
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-10-2502
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-11-2500
> -rw-r--r-- 3 hadoop hadoop 122.2 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-11-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-12-2500
> -rw-r--r-- 3 hadoop hadoop 122.2 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-12-2501
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-13-2499
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-13-2500
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-14-2500
> -rw-r--r-- 3 hadoop hadoop 122.1 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-14-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-15-2498
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-15-2499
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-16-2501
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-16-2502
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-17-2500
> -rw-r--r-- 3 hadoop hadoop 122.5 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-17-2501
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-18-2500
> -rw-r--r-- 3 hadoop hadoop 121.7 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-18-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-19-2501
> -rw-r--r-- 3 hadoop hadoop 121.7 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-19-2502
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-2-2499
> -rw-r--r-- 3 hadoop hadoop 121.6 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-2-2500
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-3-2500
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-3-2501
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-4-2499
> -rw-r--r-- 3 hadoop hadoop 122.1 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-4-2500
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-5-2499
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-5-2500
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-6-2499
> -rw-r--r-- 3 hadoop hadoop 121.5 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-6-2500
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-7-2500
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-7-2501
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-8-2501
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-8-2502
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-9-2501
> -rw-r--r-- 3 hadoop hadoop 121.9 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-9-2502
> {code}
>
>
> However, when I dig into the source code, I see that writing an element to a
> bucket invokes `shouldRollOnEvent` in TableRollingPolicy, so I would expect
> the size limit to be checked on every record.
> I don't understand how this can happen. Is it a bug, or am I getting
> something wrong?
>
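For readers following the question above, here is a minimal, self-contained sketch of a per-record size check. It is not Flink's TableRollingPolicy source: the class SizeRollingSketch, the method bytesWrittenBeforeRoll and the 128 MB flush chunk are hypothetical names and values chosen for illustration. It shows one possible way such a check could still allow oversized files, namely if the size it sees only grows when a buffering writer flushes. The ticket does not confirm that this is the cause.

```java
// Hypothetical sketch; NOT Flink's TableRollingPolicy implementation.
final class SizeRollingSketch {

    /** Returns true when the size reported for the open part file exceeds the limit. */
    static boolean shouldRollOnEvent(long reportedSizeBytes, long maxPartSizeBytes) {
        return reportedSizeBytes > maxPartSizeBytes;
    }

    /**
     * Simulates a writer that buffers records in memory and only adds them to the
     * reported size once the buffer reaches flushThresholdBytes. The buffering is
     * an assumption standing in for a columnar format that flushes in large chunks.
     * Returns the total bytes handed to the part file at the moment it rolls.
     * recordBytes must be positive, otherwise the loop never terminates.
     */
    static long bytesWrittenBeforeRoll(long recordBytes,
                                       long flushThresholdBytes,
                                       long maxPartSizeBytes) {
        long flushed = 0;   // bytes visible to the size check
        long buffered = 0;  // bytes held in memory, invisible to the size check
        while (!shouldRollOnEvent(flushed, maxPartSizeBytes)) {
            buffered += recordBytes;
            if (buffered >= flushThresholdBytes) {
                flushed += buffered;
                buffered = 0;
            }
        }
        return flushed + buffered;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Writer that flushes every record: rolls just past the 50 MB limit.
        System.out.println(bytesWrittenBeforeRoll(mb, mb, 50 * mb) / mb);       // prints 51
        // Writer that flushes in 128 MB chunks: the first size the check sees is 128 MB.
        System.out.println(bytesWrittenBeforeRoll(mb, 128 * mb, 50 * mb) / mb); // prints 128
    }
}
```

In the second call the check runs on every record, yet the file is more than twice the configured limit when it rolls, because the reported size stays at zero until the first flush.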
--
This message was sent by Atlassian Jira
(v8.20.1#820001)