himanshug commented on issue #8061: Native parallel batch indexing with shuffle
URL: https://github.com/apache/incubator-druid/issues/8061#issuecomment-510694723

> To handle middleManager failure, a sort of self-cleanup can be triggered when some amount of time is elapsed since the last access to any partition for a supervisorTask. Does this sound good?

Anything is good as long as data is not left behind forever in error scenarios :). That said, it might be more work to track the access time of the partition...

- Are you planning on using the OS-managed file access time? Note that many file systems are configured not to update access time on reads because of the associated IO overhead.
- Are you planning to track access time in MM memory? What happens if the MM process gets restarted for some reason and loses track of it?

I was imagining a simpler world where the MM just periodically scans the directories where partition files could be stored, accumulates a list of supervisor taskIds (for which some data is stored), makes a call to the overlord to get the state of all those tasks, and then, based on the state returned (failed, completed, etc.), deletes the data for those tasks.

That said, anything is fine really.

It would be extra nice to document the failure cases and expected behavior, e.g.

- supervisor task process (or the MM running it) crashed while phase1/2 tasks were running
- one or more of the phase1 tasks crashed
- one or more of the phase2 tasks crashed

Will we try to recover in some of those cases? For example, if one or more phase1/2 tasks crashed, will the supervisor task retry them? Are any/all of these tasks restorable, i.e. do they return true for `canRestore()`?

It is not necessary for things to be recoverable, as most of this would be iterated upon later, but we can just document what to expect in such failure scenarios.
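
To make the "simpler world" idea above concrete, here is a minimal sketch of one scan-and-delete pass. It is only an illustration, not Druid code: the directory layout (one subdirectory per supervisor taskId), the state names in `TERMINAL_STATES`, and the `fetch_task_states` callback standing in for the call to the overlord are all assumptions made for this example.

```python
import shutil
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical terminal states; the values the overlord actually returns may differ.
TERMINAL_STATES = {"SUCCESS", "FAILED"}


def find_supervisor_task_ids(base_dirs: Iterable[Path]) -> Set[str]:
    """Collect supervisor taskIds, assuming one subdirectory per supervisor task."""
    task_ids: Set[str] = set()
    for base in base_dirs:
        if not base.is_dir():
            continue
        for child in base.iterdir():
            if child.is_dir():
                task_ids.add(child.name)
    return task_ids


def cleanup_once(
    base_dirs: Iterable[Path],
    fetch_task_states: Callable[[Set[str]], Dict[str, str]],
) -> List[str]:
    """Run one cleanup pass and return the taskIds whose data was deleted.

    fetch_task_states is a stand-in for the overlord call: it takes a set of
    taskIds and returns a mapping of taskId to state.
    """
    dirs = list(base_dirs)
    task_ids = find_supervisor_task_ids(dirs)
    if not task_ids:
        return []

    states = fetch_task_states(task_ids)
    removed: List[str] = []
    for task_id in sorted(task_ids):
        # Keep data when the state is non-terminal or unknown, so a partial
        # or failed overlord response never causes live data to be deleted.
        if states.get(task_id) not in TERMINAL_STATES:
            continue
        for base in dirs:
            target = base / task_id
            if target.is_dir():
                shutil.rmtree(target)
        removed.append(task_id)
    return removed
```

Because each pass rebuilds its view from what is on disk, this approach keeps no state in MM memory, so it is unaffected by an MM restart, and it does not depend on file access times. A real implementation would run it on a timer and would need to decide how to treat taskIds the overlord no longer knows about; the sketch conservatively leaves their data in place.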
