> On Aug. 9, 2012, 9:38 p.m., Hari Shreedharan wrote: > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/ReplayHandler.java, > > line 175 > > <https://reviews.apache.org/r/6411/diff/4/?file=135708#file135708line175> > > > > If I understand this correctly, what is being done here is: > > * Pick up files not in in logRecordBuffer > > * Pick up the next record from each of those files. > > * Pick up the record with smallest write order id, and remove the file > > from the buffer. > > > > Why not simply get next from all files and add it to logRecordBuffer, > > pick up the logRecord(with lowest orderID) from logRecordBuffer and add the > > next record from the file that this record belonged to into > > logRecordBuffer. Remove the file from readers if reader.next is null. That > > seems like it would be more efficient. > > Hari Shreedharan wrote: > You could use a priority queue to make this more efficient too. Each time > a record is removed from the queue, insert one from the same file. (sort of > like sorted file-merge)
In regard to your first comment, we cannot read a record from each file each time because some files maybe tens of 10GB into the future. Regarding the priority queue, I thought about that but at that point LogRecord did not have the file id associated with it. I realized later I need that so I think a PQ makes sense here. > On Aug. 9, 2012, 9:38 p.m., Hari Shreedharan wrote: > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java, > > line 104 > > <https://reviews.apache.org/r/6411/diff/4/?file=135702#file135702line104> > > > > Nit : Is this really required? Since this is a big patch it would be > > better if we avoided changes that we really don't need. I didn't actually move this method. For a time I had deleted TRANSACTION_ID field and when I replaced it, I did so in a more sensible place (fields above methods). If you would it removed, feel free to say so. - Brock ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6411/#review10086 ----------------------------------------------------------- On Aug. 8, 2012, 5:51 p.m., Brock Noland wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/6411/ > ----------------------------------------------------------- > > (Updated Aug. 8, 2012, 5:51 p.m.) > > > Review request for Flume. > > > Description > ------- > > This resolves FLUME-1432 which describes a problem with how we replayed logs. > Instead of replaying one log at a time, we replay all the logs in the order > they were written. We do this by changing the "timestamp" field to be a > globally ordered number which allows us to do a merge of all the log files. > This should be faster and also fix some problems where we could not replay > some checkpoints. > > Its worth noting that the condition in testRaceFoundInFLUME1432 was observed > prior to this patch and will not pass if we simply used the transaction id to > order edits due to that being generated on transaction create or the previous > value of the field, timestamp, because they were observed to be non-unique on > a 2 core host. > > If committed FLUME-1433 can be closed as well since this change fixes that > test as well. > > > This addresses bug FLUME-1432. > https://issues.apache.org/jira/browse/FLUME-1432 > > > Diffs > ----- > > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Commit.java > 03b1060 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java > cc8f89a > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java > e692934 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java > 2b733b1 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFile.java > 5615c6d > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogRecord.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Put.java > bcd37ab > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/ReplayHandler.java > da2d068 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Rollback.java > b42501f > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Take.java > 42b197f > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/TransactionEventRecord.java > c222bd1 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/WriteOrderOracle.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestCheckpoint.java > 17a7cf9 > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannel.java > 681ebcc > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLogFile.java > 8995089 > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLogRecord.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestTransactionEventRecord.java > a46526d > > flume-ng-channels/flume-file-channel/src/test/resources/fileformat-v2-checkpoint.gz > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/resources/fileformat-v2-log-1.gz > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/resources/fileformat-v2-log-2.gz > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/resources/fileformat-v2-log-3.gz > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/resources/fileformat-v2-pre-FLUME-1432-checkpoint.gz > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/resources/fileformat-v2-pre-FLUME-1432-log-1.gz > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/resources/fileformat-v2-pre-FLUME-1432-log-2.gz > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/resources/fileformat-v2-pre-FLUME-1432-log-3.gz > PRE-CREATION > > flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestAsyncHBaseSink.java > 641952a > > Diff: https://reviews.apache.org/r/6411/diff/ > > > Testing > ------- > > All unit tests pass. A unit test is added, testRaceFoundInFLUME1432 which > found a problem with the way we previously replayed logs. It passes with > these changes. I also tested on a two node flume installation stopping and > restarting a node several times. > > > Thanks, > > Brock Noland > >
