Hi, I am trying to backport the HLog group commit functionality to HBase 0.20. For proper reliability, I am working with Dhruba to get the 0.21 syncFs() changes from HDFS ported back to HDFS 0.20 as well. During a peer review of the modified code, my group had a question about SequenceFileLogReader.java (WALReader). I am hoping that you guys can be of assistance.
I know that there is an open issue [HBASE-2069] where HLog::splitLog() does not call DFSDataInputStream::getVisibleLength(), which would properly sync the lengths of files that have been hflushed but not yet closed. I believe the current workaround is to open the HDFS file in append mode and then close it, which causes the namenode to get updates from the datanodes. However, I don’t see that shim present in HLog::splitLog() on the 0.21 trunk. Is this a pending issue to fix, or is calling FSDataInputStream::available() within WALReaderFsDataInputStream::getPos() sufficient to force the namenode to sync up with the datanodes?

Nicolas Spiegelberg