> On Oct. 30, 2012, 6:22 a.m., Mike Percy wrote:
> > flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java,
> >  line 300
> > <https://reviews.apache.org/r/6377/diff/5/?file=178797#file178797line300>
> >
> >     This should be something like:
> >     
> >     int bufferSize = bufferMaxLines * bufferMaxLineLength;
> >     BufferedReader reader = new BufferedReader(new FileReader(nextFile), 
> > bufferSize);
> >     reader.mark(bufferSize);
> >     
> >     ...since the default buffer size in the BufferedReader constructor is 
> > 8124 in JDK 6 and that may not be enough to hold your bufferSize (or it may 
> > be too much). Also, the edge-case behavior of BufferedReader (w.r.t. mark 
> > and reset) is different in situations where the constructor arg is 
> > different than the arg passed to mark(), so rather than deal with that 
> > inconsistency it makes sense to do this.

Ya good catch.


> On Oct. 30, 2012, 6:22 a.m., Mike Percy wrote:
> > flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java,
> >  line 188
> > <https://reviews.apache.org/r/6377/diff/5/?file=178797#file178797line188>
> >
> >     If the line length exceeds the length of the BufferedReader, this 
> > reset() call will throw a java.io.IOException: Mark invalid
> >     
> >     Also, committed = true when you hit this case, since committed = false 
> > does not get set until right before the return statement.
> >     
> >     Therefore, assuming the client catches the exception and recovers from 
> > it, the next call to readLines(n) will simply skip over those lines and the 
> > data will be lost.
> >     
> >     So if there is an error in which the line is too long, since we have 
> > decided that we going to fail then we should fail explicitly and 
> > permanently:
> >     1. Throw a FlumeException indicating a too-long line was seen (do not 
> > try to reset the reader) and also:
> >     2. Permanently disable the SpoolingFileLineReader object so that the 
> > next time someone tries to call any operation on it, such as readLines(), 
> > commit(), or close(), throw an IllegalStateException indicating that it is 
> > no longer in a usable state.
> >     
> >     In a future revision of this implementation, we can do something a 
> > little more graceful, but we have to ensure that we don't lose data in any 
> > case.

Yep this should stop the world when it fails... my mistake. Added some tests to 
make sure this works.


> On Oct. 30, 2012, 6:22 a.m., Mike Percy wrote:
> > flume-ng-core/src/test/java/org/apache/flume/client/avro/TestSpoolingFileLineReader.java,
> >  line 363
> > <https://reviews.apache.org/r/6377/diff/5/?file=178801#file178801line363>
> >
> >     This test is failing for me. Not sure why, haven't dug into it much yet.

I can't yet reproduce this failure.


- Patrick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6377/#review12907
-----------------------------------------------------------


On Oct. 22, 2012, 8:36 p.m., Patrick Wendell wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6377/
> -----------------------------------------------------------
> 
> (Updated Oct. 22, 2012, 8:36 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Description
> -------
> 
> This patch adds a spooling directory based source. The  idea is that a user 
> can have a spool directory where files are deposited for ingestion into 
> flume. Once ingested, the files are clearly renamed and the implementation 
> guarantees at-least-once delivery semantics similar to those achieved within 
> flume itself, even across failures and restarts of the JVM running the code.
> 
> This helps fill the gap for people who want a way to get reliable delivery of 
> events into flume, but don't want to directly write their application against 
> the flume API. They can simply drop log files off in a spooldir and let flume 
> ingest asynchronously (using some shell scripts or other automated process).
> 
> Unlike the prior iteration, this patch implements a first-class source. It 
> also extends the avro client to support spooling in a similar manner.
> 
> 
> This addresses bug FlUME-1425.
>     https://issues.apache.org/jira/browse/FlUME-1425
> 
> 
> Diffs
> -----
> 
>   
> flume-ng-configuration/src/main/java/org/apache/flume/conf/source/SourceConfiguration.java
>  da804d7 
>   
> flume-ng-configuration/src/main/java/org/apache/flume/conf/source/SourceType.java
>  abbbf1c 
>   flume-ng-core/src/main/java/org/apache/flume/client/avro/AvroCLIClient.java 
> 4a5ecae 
>   
> flume-ng-core/src/main/java/org/apache/flume/client/avro/BufferedLineReader.java
>  PRE-CREATION 
>   flume-ng-core/src/main/java/org/apache/flume/client/avro/LineReader.java 
> PRE-CREATION 
>   
> flume-ng-core/src/main/java/org/apache/flume/client/avro/SpoolingFileLineReader.java
>  PRE-CREATION 
>   
> flume-ng-core/src/main/java/org/apache/flume/source/SpoolDirectorySource.java 
> PRE-CREATION 
>   
> flume-ng-core/src/main/java/org/apache/flume/source/SpoolDirectorySourceConfigurationConstants.java
>  PRE-CREATION 
>   
> flume-ng-core/src/test/java/org/apache/flume/client/avro/TestBufferedLineReader.java
>  PRE-CREATION 
>   
> flume-ng-core/src/test/java/org/apache/flume/client/avro/TestSpoolingFileLineReader.java
>  PRE-CREATION 
>   
> flume-ng-core/src/test/java/org/apache/flume/source/TestSpoolDirectorySource.java
>  PRE-CREATION 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 953a670 
> 
> Diff: https://reviews.apache.org/r/6377/diff/
> 
> 
> Testing
> -------
> 
> Extensive unit tests and I also built and played with this using a stub flume 
> agent. If you look at the JIRA I have a configuration file for an agent that 
> will print out Avro events to the command line - that's helpful when testing 
> this.
> 
> 
> Thanks,
> 
> Patrick Wendell
> 
>

Reply via email to