The namenode is already a serious bottleneck for meta-data updates. If you allow some of the block map or meta-data to page out to disk, then the bottleneck is going to get much worse.
The only way to avoid this is to make the data much more cacheable and to have a viable cache coherency strategy. Cache coherency at the meta-data level is difficult. Cache coherency at the block level is also difficult (though less so), because many blocks get moved around for balancing purposes.

The MapR approach is a useful counter-example here, since that architecture was specifically designed so that the only centralized data can be cached indefinitely because coherency can be checked on access (a rough sketch of that pattern is at the end of this message). This dramatically increases the distribution of the location information, which in turn makes the centralized copy much smaller and more pageable. The virtuous cycle continues by making the distributed resources read/write, so that meta-data needn't be centralized at all.

It is very hard for me to see how the current HDFS architecture could be migrated, evolutionarily, to something that admits paging of meta-data to disk. The problem is that there are logical circularities in the current approach that force either the current design or a major rebuild from the ground up.

On Mon, Sep 5, 2011 at 9:29 AM, Sesha Kumar <sesha...@gmail.com> wrote:

> 1. Namenode stores blockmaps for all the blocks in its main memory. This
> can be used to keep an up-to-date snapshot of total filesystem. But what i
> feel is this blockmap is not a constant data and hence storing it in main
> memory all the time can be avoided in order to save main memory space. On a
> request for a file from the client the blockmap details can be fetched.
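To make the check-on-access idea above concrete, here is a minimal sketch in Java. This is not MapR or HDFS code; every class, interface, and method name is invented for illustration. The only point is the shape of the pattern: cached location entries carry a version stamp, and the authoritative copy only has to answer a cheap "is this version still current?" question on each access, so cached entries never need to be proactively invalidated and can be kept indefinitely.

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of "coherency checked on access" for block locations.
public class VersionedLocationCache {

    // A cached answer: where a block lives, plus the version it was read at.
    public static final class Entry {
        final List<String> locations;  // datanode addresses (illustrative)
        final long version;            // version stamp issued by the authority
        Entry(List<String> locations, long version) {
            this.locations = locations;
            this.version = version;
        }
    }

    // Minimal interface to the authoritative (centralized) copy.
    public interface Authority {
        Entry lookup(long blockId);                    // full lookup (expensive)
        boolean isCurrent(long blockId, long version); // cheap validity check
    }

    private final Map<Long, Entry> cache = new ConcurrentHashMap<>();
    private final Authority authority;

    public VersionedLocationCache(Authority authority) {
        this.authority = authority;
    }

    // Return locations for a block. A cached entry is reused as long as the
    // authority confirms its version is still current; otherwise we refresh.
    // Nothing expires on a timer -- staleness is only detected at access time,
    // which is what makes indefinite caching safe.
    public List<String> locate(long blockId) {
        Entry cached = cache.get(blockId);
        if (cached != null && authority.isCurrent(blockId, cached.version)) {
            return cached.locations;              // cache hit, coherency confirmed
        }
        Entry fresh = authority.lookup(blockId);  // miss or stale: re-fetch
        cache.put(blockId, fresh);
        return fresh.locations;
    }
}

With a scheme like this, the centralized side mostly answers cheap version checks rather than serving full location maps, which is what lets the detailed location data live out at the edges instead of in one large in-memory structure.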