> On June 7, 2016, 1:44 a.m., Mike Percy wrote: > > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java, > > line 164 > > <https://reviews.apache.org/r/48161/diff/1/?file=1404557#file1404557line164> > > > > This sorting function seems problematic to me. It can call stat() up to > > n^2 times (assuming quicksort). Shouldn't we get the last modified time of > > each file in the list once and then sort it? Why are we sorting, anyway? > > > > By the way, I just checked the OpenJDK source code and indeed every > > File.lastModified() maps to FileSystem.getLastModifiedTime(f) which maps to > > stat(2). See > > https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/solaris/native/java/io/UnixFileSystem_md.c#L205 > > Attila Simon wrote: > Sorting was part of the original implementation (my change didn't make it > worse). I wanted to be non-intrusive.
If you want to maintain the sorting behavior (and I still don't know how it is used or why it is required) then please do the stat() call on each of the files in the list, and cache the mtime of each file, then sort the files based on those cached mtimes. It just doesn't make any sense to do IO to read the metadata O(n^2) times. > On June 7, 2016, 1:44 a.m., Mike Percy wrote: > > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java, > > line 49 > > <https://reviews.apache.org/r/48161/diff/1/?file=1404560#file1404560line49> > > > > style nit: leave spaces around your brackets. We use the java style > > guide in Flume (admittedly not as consistently as I wish we did, though) > > Attila Simon wrote: > Related to coding style nits: is there any importable flume related style > configuration for intellij somewhere I haven't found already? If yes could > you please point me to its direction? Nope, but a checkstyle file would be welcome that enforced Java style with 2-space indent. - Mike ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/48161/#review136086 ----------------------------------------------------------- On June 13, 2016, 2:14 p.m., Attila Simon wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/48161/ > ----------------------------------------------------------- > > (Updated June 13, 2016, 2:14 p.m.) > > > Review request for Flume. > > > Bugs: FLUME-2918 > https://issues.apache.org/jira/browse/FLUME-2918 > > > Repository: flume-git > > > Description > ------- > > The way TailDir source checks which files should be tracked was improved. > Existing implementation caused unneccessary high CPU usage for huge (+50K > files) directories. This fix allows users to eliminate continous listing of > parent directory (on each Source.process invocation) and introduce a more > performant method for listing&matching files. > > used java.nio.file.DirectoryStream to filter files > made pattern match calculation optionally cached > added junit tests > added javadoc > added license > > > Diffs > ----- > > > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/ReliableTaildirEventReader.java > 5b6d465 > > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirMatcher.java > PRE-CREATION > > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirSource.java > 8816327 > > flume-ng-sources/flume-taildir-source/src/main/java/org/apache/flume/source/taildir/TaildirSourceConfigurationConstants.java > 6165276 > > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirMatcher.java > PRE-CREATION > > flume-ng-sources/flume-taildir-source/src/test/java/org/apache/flume/source/taildir/TestTaildirSource.java > f9e614c > > Diff: https://reviews.apache.org/r/48161/diff/ > > > Testing > ------- > > mvn clean install -DskipTests -> built > junit tests for flume-taildir-source module -> passed > > > Thanks, > > Attila Simon > >
