----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://review.hbase.org/r/74/#review40 -----------------------------------------------------------
Looks pretty good. If possible it would be awesome to move all of these log splitting methods into a new class, but we can do that in a followup. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment207> why not make these Class<? extends HLog.Reader>? src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment208> can we use conf.getClass here? That way we'd have the appropriate generic types src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment209> having looked at this function a few times and been very confused, I'd love to see a javadoc that explains the entire process - eg what the "batching" does, what the output looks like, etc src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment211> this conf is incorrectly named, right? it's not a number of threads, but rather a number of logs to read in in each round? src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment210> this conf is very vaguely named src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment212> should add e as a second parameter, so you get the full stack trace in this warning src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment213> typo, and we should fix this before committing. we can use the org.apache.hadoop.util.VersionInfo class for this, or use reflection to look for the "hflush()" function src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment214> javadoc src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment215> we should use a namingthreadfactory here. Consider: http://guava-libraries.googlecode.com/svn-history/r13/trunk/javadoc/com/google/common/util/concurrent/NamingThreadFactory.html (I have some patches coming in which use guava, it's very handy) src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment216> we're calling execute(Thread) whereas execute takes a Runnable. By some quirk of java, Thread extends Runnable, but this is somewhat strange - we should make createNewSplitThread be createNewSplitter or createNewSplitRunnable src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment217> if j > 30 (or something), escalate log to info level. Also preferable to log the number of seconds, not the iteration. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment218> yes, I think so. The RS could have crashed right after opening but before writing any data, and if the master failed to recover that, then we'd never recover that region. I say ignore with a WARN src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment219> see above logic - writer could have crashed after writing only part of the sequencefile header, etc, so we should just warn and continue src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment220> I think we need to handle EOF specially here too, though OK to leave this for another JIRA. IIRC one of the FB guys opened this already src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment221> this thread name is not very descriptive src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment222> wouldn't it make more sense to just add them in the other order? LinkedList is doubly linked, so adding to the beginning or end are both constant time, and then we wouldn't have to do this contortion to iterate backwards here. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment223> we can move this out of the inner loop, right? src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment224> add comment about what this is doing - presumably the point here is that some previous master may have started splitting the logs, then crashed in the middle. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment225> no point to call .sync() here, it just wastes a bunch of IO to write "sync markers" which we don't make any real use of. src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment226> I'd think we need to fail here. This ties into the comment above about getting Futures out of the threadpool and checking any uncaught exceptions src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment227> javadoc src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment228> the .exists checks are redundant, since HDFS mkdirs acts like mkdir -p, we're just wasting RPCs src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java <http://review.hbase.org/r/74/#comment229> this log message has never been english src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java <http://review.hbase.org/r/74/#comment230> apache license src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java <http://review.hbase.org/r/74/#comment231> this is the case I take issue with - trailing garbage should still proceed (with warnings) even if we've told it not to skip errors. - Todd On 2010-05-21 12:11:59, stack wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://review.hbase.org/r/74/ > ----------------------------------------------------------- > > (Updated 2010-05-21 12:11:59) > > > Review request for hbase. > > > Summary > ------- > > This is a version of Cosmin's patch that applies to trunk doctored to work w/ > hadoop 0.20. > > > This addresses bug hbase-2437. > http://issues.apache.org/jira/browse/hbase-2437 > > > Diffs > ----- > > src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java e479eae > src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java > PRE-CREATION > > Diff: http://review.hbase.org/r/74/diff > > > Testing > ------- > > All his new tests past. > > > Thanks, > > stack > >
