I need to run through some server maintenance on my data nodes, including a reboot. My splitlogs, though, only seem to have a replication factor of 1 (when a data node is taken offline, I sometimes see missing blocks for them). I know I can decommission data nodes with the exclude.dfs file, but that takes many days to complete and seems like overkill for a quick reboot.
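For anyone wanting to confirm which files are actually sitting at replication 1 before a reboot, here is a minimal sketch. It assumes the listing comes from `hadoop fs -ls -R <dir>`, where the second column is the replication factor and directories show `-` there. The function name `under_replicated` and the sample paths are hypothetical, made up for illustration only.

```python
def under_replicated(ls_output, max_replication=1):
    """Return paths of files whose replication column is <= max_replication.

    Expects text in the usual `hadoop fs -ls` layout:
    perms, replication, owner, group, size, date, time, path.
    """
    paths = []
    for line in ls_output.splitlines():
        # Split into at most 8 fields so a path containing spaces stays whole.
        fields = line.split(None, 7)
        # Skip the "Found N items" header and blank lines.
        if len(fields) < 8:
            continue
        perms, repl = fields[0], fields[1]
        # Directories have "-" in the replication column; ignore them.
        if perms.startswith("d") or not repl.isdigit():
            continue
        if int(repl) <= max_replication:
            paths.append(fields[7])
    return paths


# Hypothetical listing, not taken from a real cluster.
SAMPLE = """Found 3 items
drwxr-xr-x   - hbase supergroup          0 2012-01-01 10:00 /hbase/splitlog/dir
-rw-r--r--   1 hbase supergroup       1234 2012-01-01 10:01 /hbase/splitlog/a
-rw-r--r--   3 hbase supergroup       5678 2012-01-01 10:02 /hbase/splitlog/b
"""

if __name__ == "__main__":
    for path in under_replicated(SAMPLE):
        print(path)
```

If files do turn up, `hadoop fs -setrep` can raise the replication of individual paths, though whether that is appropriate for splitlogs is part of what I am asking.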
Is there a better way to go about this? It seems like splitlogs should have a replication factor of at least 2 in case a server dies, which would also let me reboot a server more safely. I don't know what performance hit this would cause, though. Or maybe this is more of an HDFS question, and there's a way to tell a data node to shed only its replication=1 blocks. Any thoughts would be appreciated.

Thanks,
Patrick
