We have a cluster running in the AWS EMR environment, on which we have installed the DataTorrent gateway, etc.
I had to restart the cluster today, only to find that the HDFS services are taking forever to come up: the NameNode is replaying every edit in the edits files to apply it to the fsimage, and it appears to hold all the edits since we launched the server 6 months ago. Isn't HDFS set up by default to checkpoint automatically every hour and clean up the old edits files? Or is there something I need to enable to start automatic checkpointing? If so, please include any parameters I would need to set in the hdfs-site.xml config file.

Thanks,
Jim