[GitHub] Myasuka commented on issue #7281: [FLINK-11107][state] Avoid memory stateBackend to create arbitrary folders under HA path when no checkpoint path configured
Myasuka commented on issue #7281: [FLINK-11107][state] Avoid memory stateBackend to create arbitrary folders under HA path when no checkpoint path configured
URL: https://github.com/apache/flink/pull/7281#issuecomment-468532245

Since Flink-1.8 is about to be released, @StephanEwen @StefanRRichter could either of you take a look at this problem?

I submitted the same job with the same configuration (HA configured, but no checkpoint path) to the released Flink-1.3.2 and Flink-1.7.2. Flink-1.3.2 does not yet contain the `MemoryStateBackend` code that creates random checkpoint paths, so it should be treated as the `old behavior`. Flink-1.7.2 already contains that code.

As you can see, `Flink-1.3.2` has a blob service folder, a completed checkpoint file and a submitted job graph file. I think this is the `old behavior`.
https://user-images.githubusercontent.com/1709104/53614879-67353c80-3c16-11e9-8fac-0dee85b676d4.png

However, `Flink-1.7.2` has many checkpoint paths created by `MemoryStateBackend` on the task side. As you might guess, `41a7c8b8e62d81225868d2a5a60846f7` is the actual job ID of this job. These checkpoint paths should be useless, and they might lead to a `MaxDirectoryItemsExceededException` under the high-availability folder.
https://user-images.githubusercontent.com/1709104/53614943-9e0b5280-3c16-11e9-81c4-868c3187a09b.png

Moreover, as you can see, I don't think this would `keep supporting the old behavior`, given the large difference in directory structure.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services
[GitHub] Myasuka commented on issue #7281: [FLINK-11107][state] Avoid memory stateBackend to create arbitrary folders under HA path when no checkpoint path configured
Myasuka commented on issue #7281: [FLINK-11107][state] Avoid memory stateBackend to create arbitrary folders under HA path when no checkpoint path configured
URL: https://github.com/apache/flink/pull/7281#issuecomment-447208854

@klion26 Hmm, I don't think it's the same thing. Even if you could clean up checkpoint directories promptly after a job finishes or fails, you cannot prevent running jobs from creating so many directories (each operator has its own memory state backend, and each state backend creates a unique directory). If several really big jobs are running and have created too many sub-directories under the HA storage directory, a newly submitted job might also be unable to create its own sub-directories.
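To make the directory growth described above concrete, here is a minimal Python sketch (not Flink code). It assumes, as the comment states, one memory state backend per operator instance and one unique directory per backend. The function names, the job sizes, and the directory-item limit are hypothetical values chosen for illustration only.

```python
import uuid


def backend_directories(operator_instances_per_job):
    """Return one unique directory name per memory state backend.

    Assumption (from the comment above): every operator instance has its
    own backend, and every backend creates one UUID-named directory.
    """
    return [
        str(uuid.uuid4())
        for instances in operator_instances_per_job
        for _ in range(instances)
    ]


def can_create_more(operator_instances_per_job, max_directory_items):
    """True if the HA directory still has room for another sub-directory."""
    used = len(backend_directories(operator_instances_per_job))
    return used < max_directory_items


# Hypothetical numbers: three running jobs and an illustrative item limit.
running_jobs = [4000, 2500, 3500]
print(len(backend_directories(running_jobs)))   # 10000
print(can_create_more(running_jobs, 10000))     # False
```

The point of the sketch is that the count depends only on jobs that are currently running, so cleaning up after finished or failed jobs does not bound it.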
[GitHub] Myasuka commented on issue #7281: [FLINK-11107][state] Avoid memory stateBackend to create arbitrary folders under HA path when no checkpoint path configured
Myasuka commented on issue #7281: [FLINK-11107][state] Avoid memory stateBackend to create arbitrary folders under HA path when no checkpoint path configured
URL: https://github.com/apache/flink/pull/7281#issuecomment-446478434

BTW, if we do not create UUID directories for the memory state backend in this situation, the job can still restore from high-availability storage. The only difference is that the `Latest Restore` information under the `Checkpoints` tab of the web UI would show the path as ``.
[GitHub] Myasuka commented on issue #7281: [FLINK-11107][state] Avoid memory stateBackend to create arbitrary folders under HA path when no checkpoint path configured
Myasuka commented on issue #7281: [FLINK-11107][state] Avoid memory stateBackend to create arbitrary folders under HA path when no checkpoint path configured
URL: https://github.com/apache/flink/pull/7281#issuecomment-446297350

@StephanEwen, you left the following comment in the code:
~~~
to keep supporting the old behavior where default (JobManager) Backend + HA mode = checkpoints in HA store we add the HA persistence dir as the checkpoint directory if none other is set
~~~
However, I wonder whether this really keeps the same behavior as before. In Flink-1.3, (JobManager) Backend + HA mode only creates a `completedCheckpoint` file under the HA folder. In Flink-1.6, on the other hand, it creates an additional `job-id/chk-x/_metadata` in addition to the `completedCheckpoint` file.
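The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the entries named in the comment above, not output from a real cluster: `ha_folder_entries` is a hypothetical helper, `completedCheckpoint` stands in for the completed checkpoint file, and `job-id` and the checkpoint numbers are placeholders.

```python
def ha_folder_entries(creates_metadata_dirs, job_id, checkpoint_ids):
    """List the checkpoint-related entries expected under the HA folder.

    creates_metadata_dirs=False models the Flink-1.3 layout described above;
    creates_metadata_dirs=True models the Flink-1.6 layout.
    """
    entries = ["completedCheckpoint"]  # placeholder for the completed checkpoint file
    if creates_metadata_dirs:
        entries += [f"{job_id}/chk-{cid}/_metadata" for cid in checkpoint_ids]
    return entries


old_layout = ha_folder_entries(False, "job-id", [1, 2])
new_layout = ha_folder_entries(True, "job-id", [1, 2])

# Entries that exist only in the newer layout.
extra = sorted(set(new_layout) - set(old_layout))
print(extra)  # ['job-id/chk-1/_metadata', 'job-id/chk-2/_metadata']
```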