Hi,

We are running HBase with hbase.regionserver.checksum.verify set to true, but we are seeing an equal number of seeks on the HDFS .meta files and on the data blocks. This is rather puzzling, and I don't know if it's broken. The HBase jar is compiled against Hadoop 2.0.3-alpha, and this behaviour occurs with both 0.94.3 and 0.94.7. Short-circuit local reads are enabled and working well, since only the region server is accessing the disk.
We run strace limited to lseek calls and get the following:

28162 lseek(668, 0, SEEK_SET) = 0
28162 lseek(635, 57479463, SEEK_SET) = 57479463
28162 lseek(2255, 0, SEEK_SET) = 0
28162 lseek(1938, 29285843, SEEK_SET) = 29285843

Then we use lsof to find the underlying files and match them against the corresponding file descriptors:

java 27947 hbase 668u REG 202,32 1048583 36176608 /data/xvdc/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir54/blk_5081211948968918615_597521.meta
java 27947 hbase 635u REG 202,32 134217728 36176607 /data/xvdc/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir54/blk_5081211948968918615
java 27947 hbase 2255u REG 202,16 802375 32768850 /mnt/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir40/blk_2670783290218647110_614641.meta
java 27947 hbase 1938u REG 202,16 102702747 32768849 /mnt/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir40/blk_2670783290218647110

The pattern in strace is pretty clear: first the .meta file is read, and then the block is accessed. I am wondering whether there are other places, apart from checksum verification, where the .meta file for an HDFS block is accessed, or whether the checksum feature is simply broken. From more strace output, it seems we are reading 7-byte values from these .meta files. Is there a way I can find out whether the checksums were actually written out to the HFiles in the first place?

Thanks
Varun
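
P.S. In case it helps anyone reproduce the matching step, here is a rough Python sketch of how the lseek calls from strace could be paired with the lsof listing to count seeks on .meta files versus data blocks. The function names (parse_lsof, count_seeks) are made up for illustration. It assumes lsof's default column layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME) and strace lines with plain numeric file descriptors, as in the output above. It is not a definitive tool.

```python
import re
from collections import Counter

# Matches strace lines such as: "28162 lseek(668, 0, SEEK_SET) = 0"
LSEEK_RE = re.compile(r"lseek\((\d+),\s*(\d+),\s*SEEK_SET\)")


def parse_lsof(lsof_text):
    """Map fd number -> file path, assuming lsof's default column layout."""
    fd_to_path = {}
    for line in lsof_text.splitlines():
        # Split into at most 9 fields so a NAME containing spaces stays whole.
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue
        # The FD column looks like "668u"; entries such as "cwd" are skipped.
        match = re.match(r"(\d+)[a-zA-Z]*$", fields[3])
        if match:
            fd_to_path[int(match.group(1))] = fields[8].strip()
    return fd_to_path


def count_seeks(strace_text, fd_to_path):
    """Count lseek calls per category: 'meta', 'data', or 'unknown' fd."""
    counts = Counter()
    for match in LSEEK_RE.finditer(strace_text):
        path = fd_to_path.get(int(match.group(1)))
        if path is None:
            counts["unknown"] += 1
        elif path.endswith(".meta"):
            counts["meta"] += 1
        else:
            counts["data"] += 1
    return counts


if __name__ == "__main__":
    base = ("/data/xvdc/hadoop/dfs/data/current/"
            "BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir54/")
    lsof_text = (
        "java 27947 hbase 668u REG 202,32 1048583 36176608 "
        + base + "blk_5081211948968918615_597521.meta\n"
        "java 27947 hbase 635u REG 202,32 134217728 36176607 "
        + base + "blk_5081211948968918615\n"
    )
    strace_text = (
        "28162 lseek(668, 0, SEEK_SET) = 0\n"
        "28162 lseek(635, 57479463, SEEK_SET) = 57479463\n"
    )
    print(dict(count_seeks(strace_text, parse_lsof(lsof_text))))
```

Run against the full strace and lsof captures, roughly equal "meta" and "data" counts would match the pattern described above. A large "unknown" count would suggest that files were opened or closed between the two captures.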
