Reposting here to see if any of the HDFS developers have some good insight
into this.
The deep dive is in the original message below. The gist is that after
upgrading to 2.7.2 on a ~260-node cluster, the active NN's fsimage download
and edit log rolls seem to get stuck in native FileChannel.force
like that. In one of our large clusters (5000+ nodes, 2.7.3ish, jdk8),
> rollEdits() takes less than 30ms consistently.
>
> Kihwal
>
>
> ------
> *From:* Joey Paskhay <joey.pask...@gmail.com>
> *To:* hdfs-dev@hadoop.apache.org
> *Sent:* Tuesday, Se