xvrl opened a new issue #7886: Middlemanager fails startup due to corrupt task files
URL: https://github.com/apache/incubator-druid/issues/7886

### Affected Version

0.15.0-SNAPSHOT (git sha d99f77a01b5f4e0abde0ec85)

### Description

An unclean middle-manager shutdown may leave empty task files in `${druid.indexer.task.baseTaskDir}/completedTasks/`. This may be an issue on its own, but it could also happen for reasons beyond our control.

Those empty (corrupt) files cause the middlemanager to fail on a subsequent startup, due to https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/worker/WorkerTaskManager.java#L430 rethrowing a JsonProcessingException.

The exception message also looks incorrect: it says the file is ignored, but instead the exception interrupts the entire startup sequence, requiring user intervention to remove the corrupt files before startup can resume.

```
2019-06-13T16:39:50,861 ERROR [main] org.apache.druid.cli.CliMiddleManager - Error when starting up. Failing.
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_212]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_212]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_212]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_212]
    at org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler.start(Lifecycle.java:443) ~[druid-core-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
    at org.apache.druid.java.util.common.lifecycle.Lifecycle.start(Lifecycle.java:339) ~[druid-core-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
    at org.apache.druid.guice.LifecycleModule$2.start(LifecycleModule.java:140) ~[druid-core-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
    at org.apache.druid.cli.GuiceRunnable.initLifecycle(GuiceRunnable.java:106) [druid-services-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
    at org.apache.druid.cli.ServerRunnable.run(ServerRunnable.java:57) [druid-services-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
    at org.apache.druid.cli.Main.main(Main.java:118) [druid-services-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
Caused by: org.apache.druid.java.util.common.ISE: Failed to read completed task from disk at [/var/druid/task/completedTasks/index_kafka_metrics_opencensus_ebffe52caf33afb_lcgaddpn]. Ignored.
    at org.apache.druid.indexing.worker.WorkerTaskManager.initCompletedTasks(WorkerTaskManager.java:430) ~[druid-indexing-service-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
    at org.apache.druid.indexing.worker.WorkerTaskManager.start(WorkerTaskManager.java:135) ~[druid-indexing-service-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
    at org.apache.druid.indexing.worker.WorkerTaskMonitor.start(WorkerTaskMonitor.java:94) ~[druid-indexing-service-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
    ... 10 more
Caused by: com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
 at [Source: var/druid/task/completedTasks/index_kafka_<datasource>_ebffe52caf33afb_lcgaddpn; line: 1, column: 1]
    at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148) ~[jackson-databind-2.6.7.jar:2.6.7]
    at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3781) ~[jackson-databind-2.6.7.jar:2.6.7]
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3721) ~[jackson-databind-2.6.7.jar:2.6.7]
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2620) ~[jackson-databind-2.6.7.jar:2.6.7]
    at org.apache.druid.indexing.worker.WorkerTaskManager.initCompletedTasks(WorkerTaskManager.java:421) ~[druid-indexing-service-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
    at org.apache.druid.indexing.worker.WorkerTaskManager.start(WorkerTaskManager.java:135) ~[druid-indexing-service-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
    at org.apache.druid.indexing.worker.WorkerTaskMonitor.start(WorkerTaskMonitor.java:94) ~[druid-indexing-service-0.15.0-incubating-SNAPSHOT.jar:0.15.0-incubating-SNAPSHOT]
```
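
For illustration, the behavior the "Ignored." message implies might look roughly like the sketch below: log each unreadable file and keep loading the rest instead of rethrowing. This is not the actual `WorkerTaskManager` code. The class `CompletedTaskLoader`, the `TaskParser` interface, and the method `loadCompletedTasks` are hypothetical names, and the parser stands in for a JSON deserializer such as Jackson's `ObjectMapper.readValue`.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Druid's WorkerTaskManager implementation.
final class CompletedTaskLoader {

    // Hypothetical stand-in for a JSON deserializer. Jackson's
    // JsonProcessingException extends IOException, so it fits this signature.
    interface TaskParser<T> {
        T parse(byte[] content) throws IOException;
    }

    // Reads every file in the completed-tasks directory. A file that cannot be
    // read or parsed is logged and skipped, so one corrupt file does not abort
    // the whole load.
    static <T> Map<String, T> loadCompletedTasks(Path completedTaskDir, TaskParser<T> parser)
            throws IOException {
        Map<String, T> completed = new HashMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(completedTaskDir)) {
            for (Path file : files) {
                String taskId = file.getFileName().toString();
                try {
                    completed.put(taskId, parser.parse(Files.readAllBytes(file)));
                } catch (IOException e) {
                    // Log and continue rather than rethrowing.
                    System.err.println("Failed to read completed task from disk at ["
                            + file + "]. Ignored. Cause: " + e.getMessage());
                }
            }
        }
        return completed;
    }
}
```

Whether skipped files should also be deleted or moved aside is a separate design question that this sketch leaves open.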
