----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6377/#review10681 -----------------------------------------------------------
Overall this looks good and it is going to be a very useful addition. I had time tonight for about half of a review, will try to follow up on the rest tomorrow. I've added some suggestions below, a few of which may be a little bit nitpicky. One thing that is kind of nagging me is whether we can simplify the logic of SpoolingFileLineReader, since right now I feel it's a little bit hard to follow. Maybe I'm just tired, so I'm going to stare at it some more later. flume-ng-core/src/main/java/org/apache/flume/client/avro/AvroCLIClient.java <https://reviews.apache.org/r/6377/#comment22987> Nit: typo in simultaneously flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java <https://reviews.apache.org/r/6377/#comment22988> It would be safer to use File.createTempFile() for this instead of using a semi hardcoded filename. The 3-arg form accepts a directory as an argument. flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java <https://reviews.apache.org/r/6377/#comment22994> Nit: please add a @throws comment and throws clause for FlumeException (even though it's a RuntimeException) flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java <https://reviews.apache.org/r/6377/#comment22995> It is unfortunate that we have to check for this case, but it looks like there is a Windows atomicity issue with renameTo() for Case #1 (re. the comments). Maybe it would be more reliable to always throw an exception when existing.exists() occurs on a non-Windows OS. flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java <https://reviews.apache.org/r/6377/#comment22997> Any reason not to make this method idempotent? Why throw? flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java <https://reviews.apache.org/r/6377/#comment22991> What do you think of changing the contract to process the files in File.lastModified() time order, ascending? (oldest first) flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java <https://reviews.apache.org/r/6377/#comment22992> Might consider using listFiles(FileFilter) for this to avoid allocating & traversing the list twice. Would likely only become noticeable in the case of a directory with hundreds or thousands of files flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java <https://reviews.apache.org/r/6377/#comment22993> Might be better to simply throw an IllegalStateException here with an appropriate message, e.g. "Line reader has already been stopped" or something flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java <https://reviews.apache.org/r/6377/#comment22989> This inner class should be declared as private static flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java <https://reviews.apache.org/r/6377/#comment22996> In general I am wondering if we can get by without this exception type flume-ng-core/src/main/java/org/apache/flume/source/SpoolDirectorySource.java <https://reviews.apache.org/r/6377/#comment22998> Can we make this a static inner class by passing shared objects to the constructor as params? - Mike Percy On Aug. 14, 2012, 10:02 p.m., Patrick Wendell wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/6377/ > ----------------------------------------------------------- > > (Updated Aug. 14, 2012, 10:02 p.m.) > > > Review request for Flume. > > > Description > ------- > > This patch adds a spooling directory based source. The idea is that a user > can have a spool directory where files are deposited for ingestion into > flume. Once ingested, the files are clearly renamed and the implementation > guarantees at-least-once delivery semantics similar to those achieved within > flume itself, even across failures and restarts of the JVM running the code. > > This helps fill the gap for people who want a way to get reliable delivery of > events into flume, but don't want to directly write their application against > the flume API. They can simply drop log files off in a spooldir and let flume > ingest asynchronously (using some shell scripts or other automated process). > > Unlike the prior iteration, this patch implements a first-class source. It > also extends the avro client to support spooling in a similar manner. > > > This addresses bug FlUME-1425. > https://issues.apache.org/jira/browse/FlUME-1425 > > > Diffs > ----- > > > flume-ng-configuration/src/main/java/org/apache/flume/conf/source/SourceConfiguration.java > da804d7 > > flume-ng-configuration/src/main/java/org/apache/flume/conf/source/SourceType.java > abbbf1c > flume-ng-core/src/main/java/org/apache/flume/client/avro/AvroCLIClient.java > 4a5ecae > > flume-ng-core/src/main/java/org/apache/flume/client/avro/BufferedLineReader.java > PRE-CREATION > flume-ng-core/src/main/java/org/apache/flume/client/avro/LineReader.java > PRE-CREATION > > flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java > PRE-CREATION > > flume-ng-core/src/main/java/org/apache/flume/source/SpoolDirectorySource.java > PRE-CREATION > > flume-ng-core/src/main/java/org/apache/flume/source/SpoolDirectorySourceConfigurationConstants.java > PRE-CREATION > > flume-ng-core/src/test/java/org/apache/flume/client/avro/TestBufferedLineReader.java > PRE-CREATION > > flume-ng-core/src/test/java/org/apache/flume/client/avro/TestSpoolingFileLineReader.java > PRE-CREATION > > flume-ng-core/src/test/java/org/apache/flume/source/TestSpoolDirectorySource.java > PRE-CREATION > flume-ng-doc/sphinx/FlumeUserGuide.rst 45dd7cc > > Diff: https://reviews.apache.org/r/6377/diff/ > > > Testing > ------- > > Extensive unit tests and I also built and played with this using a stub flume > agent. If you look at the JIRA I have a configuration file for an agent that > will print out Avro events to the command line - that's helpful when testing > this. > > > Thanks, > > Patrick Wendell > >
