[ 
https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150124#comment-13150124
 ] 

Sebb commented on IO-288:
-------------------------

Good to know that it's easy to unambiguously detect CR and LF.

There seem to be a lot of spurious files in the zip archive.

I'm not sure that getNewLineMatchByteCount() is as efficient as 
BufferedReader.readLine(): it seems to process characters multiple times. It 
could probably be improved by checking only the current and previous chars. 
Also, I don't think it's necessary to encode \n or \r; just use the 
appropriate characters.
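
As a rough illustration of the current/previous check (a sketch only, not the 
attached code; the class and method names are made up, and it assumes an 
encoding such as ASCII, ISO-8859-1 or UTF-8, where CR and LF are single bytes 
that cannot occur inside another character):

```java
public final class NewlineCheck {
    /**
     * Returns the length in bytes of the line terminator that ends at
     * index i (inclusive): 2 for CRLF, 1 for a lone LF or CR, 0 otherwise.
     * Each call looks at the current byte and at most one previous byte.
     */
    static int newlineLengthEndingAt(byte[] data, int i) {
        byte current = data[i];
        if (current == '\n') {
            // LF: check whether the previous byte makes it a CRLF pair.
            return (i > 0 && data[i - 1] == '\r') ? 2 : 1;
        }
        return current == '\r' ? 1 : 0;
    }
}
```

When scanning backwards, the caller would skip the returned number of bytes, 
so the CR of a CRLF pair is never examined a second time.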

There are no tests for multi-block files where lines may span blocks. Indeed, 
the CRLF pair may span blocks; I'm not convinced that the code handles that 
correctly. In order for getNewLineMatchByteCount() to detect all CRLF pairs, 
it generally needs at least 2 characters to be present; this does not seem to 
be guaranteed.
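
One possible way to cope with a CRLF pair split across two blocks is to carry 
one bit of state from block to block. The sketch below is hypothetical (the 
class, its fields and the block layout are assumptions, not the attached 
implementation) and only counts terminators, but it shows the idea:

```java
import java.util.Arrays;

public final class CrossBlockNewlines {
    // True if the byte seen just before in scan order (i.e. the next byte
    // in file order) was LF. Survives across block boundaries.
    private boolean followedByLf;
    private int terminators;

    /** Scans one block from its last byte to its first. */
    void scanBlockBackward(byte[] block) {
        for (int i = block.length - 1; i >= 0; i--) {
            byte b = block[i];
            // A CR directly followed by LF belongs to an already counted CRLF.
            if (b == '\n' || (b == '\r' && !followedByLf)) {
                terminators++;
            }
            followedByLf = (b == '\n');
        }
    }

    /** Counts line terminators, reading blocks from the end of the data. */
    static int count(byte[] file, int blockSize) {
        CrossBlockNewlines scanner = new CrossBlockNewlines();
        int end = file.length;
        while (end > 0) {
            // Blocks are aligned from the start, so the last one may be short.
            int start = ((end - 1) / blockSize) * blockSize;
            scanner.scanBlockBackward(Arrays.copyOfRange(file, start, end));
            end = start;
        }
        return scanner.terminators;
    }
}
```

With this approach the result should not depend on the block size, which is 
easy to assert in a test by running the same input with several block sizes.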

Note: could use a smaller block size to make the test files smaller; probably 
sensible to compare the results with a forward line reader. It would then be 
simple to have a directory of various different test files - read the file 
forward and store the lines; ensure that the reverse reader matches the 
reversed lines.
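
A sketch of what such a comparison could look like. LineSource is a 
hypothetical stand-in for whatever readLine() method the reverse reader 
exposes, and the class and method names are invented for illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public final class ReverseReaderCheck {
    /** Hypothetical minimal view of the reader under test. */
    interface LineSource {
        String readLine() throws IOException; // null at end of input
    }

    /** Reads the file forward with BufferedReader and reverses the lines. */
    static List<String> expectedReversedLines(Path file, Charset cs) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(file, cs)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        Collections.reverse(lines);
        return lines;
    }

    /** Collects everything the source returns until it reports null. */
    static List<String> drain(LineSource source) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = source.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    /** True if the reverse reader yields exactly the reversed forward lines. */
    static boolean matches(Path file, Charset cs, LineSource reverse) throws IOException {
        return expectedReversedLines(file, cs).equals(drain(reverse));
    }
}
```

A test could then loop over every file in the test directory and call 
matches() with a reader opened on that file, ideally once per block size.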

The field totalBlockCount needs to be a long, not an int.
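
For example, an 8 TiB file read in 4096-byte blocks has 2^31 blocks, one more 
than Integer.MAX_VALUE, and small block sizes reach the limit much sooner. A 
sketch of the arithmetic (how the attached class computes the count is not 
reproduced here; the ceiling division and the names are assumptions):

```java
public final class BlockCountDemo {
    /** Number of blocks needed to cover fileLength bytes, in long arithmetic. */
    static long blockCount(long fileLength, int blockSize) {
        // Ceiling division; blockSize is widened to long before the addition.
        return (fileLength + blockSize - 1) / blockSize;
    }
}
```

Narrowing that result to int for the 8 TiB case wraps around to 
Integer.MIN_VALUE instead of failing loudly.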

Might simplify the code to use empty arrays rather than null.
                
> Supply a ReversedLinesFileReader 
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.2.zip
>
>
> I needed to analyse a log file today and I was looking for a 
> ReversedLinesFileReader: A class that behaves exactly like BufferedReader 
> except that it goes from bottom to top when readLine() is called. I didn't 
> find it in IOUtils and the internet didn't help a lot either, e.g. 
> http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is fairly 
> inefficient - the log files I'm analysing are huge and it is not a good idea 
> to load the whole content into memory. 
> So I ended up writing an implementation myself using little memory and the 
> class RandomAccessFile - see attached file. It's used as follows:
> // only that much memory is needed, no matter how big the file is
> int blockSize = 4096;
> ReversedLinesFileReader reversedLinesFileReader =
>     new ReversedLinesFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while ((line = reversedLinesFileReader.readLine()) != null) {
>   ... // use the line
>   if (enoughLinesSeen) {
>     break;
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!

