> On 30 Apr 2020, at 07:38, Robert Scholte <rfscho...@apache.org> wrote:
> 
> I prefer to see an in memory solution.

Well, if it’s reasonable to assume that filtered files are always small, then we 
could replace the temporary file in my solution with an in-memory buffer... 
but I’m not sure that’s what you’re shooting at?
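
For illustration, a minimal sketch of the in-memory variant, assuming the 
filtered output fits comfortably in memory and is UTF-8. The class and method 
names (InMemoryFilterWrite, drain, writeIfChanged) are hypothetical and not part 
of maven-filtering; the Reader stands in for whatever filtering reader the 
plugin wraps around the source.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not an existing maven-filtering class.
final class InMemoryFilterWrite {

    /** Collects the whole filtered stream in memory (assumes UTF-8 output). */
    static byte[] drain(Reader filtered) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = filtered.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Writes the filtered bytes only when they differ from the current target.
     * Returns true when the target was (re)written.
     */
    static boolean writeIfChanged(byte[] filtered, Path target) throws IOException {
        if (Files.exists(target) && Arrays.equals(filtered, Files.readAllBytes(target))) {
            return false; // identical content: leave the file and its timestamp alone
        }
        Files.write(target, filtered);
        return true;
    }
}
```

The trade-off is the same as with the temporary file: one full read of the 
target, but with memory use proportional to the file size.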

> Key should be to detect if filtering is applied, which is done in the 
> MultiDelimiterInterpolatorFilterReaderLineEnding[1]
> Once a value has been interpolated, you must rewrite the file, otherwise you 
> shouldn't.

Again though, this appears to miss the subtlety: “if filtering is applied” is 
insufficient; the condition needs to be “if filtering is applied with different 
results than the previous run”. This requires keeping some state between runs. 

We could scan the source file for filtered values and store just that state (or 
checksum) in a file between runs. The cost would be an extra read of the source 
file + state comparison + writing out the state of each filtering. Is this what 
you’re thinking?
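
A rough sketch of that first option, under some loud assumptions: only the 
default ${...} delimiter is handled, properties come from a plain Map, and the 
state is a SHA-256 digest kept in a side file. FilterState, digest and 
needsRewrite are hypothetical names, not existing maven-filtering API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an existing maven-filtering class.
final class FilterState {

    // Assumption: only the default ${...} delimiter is recognised here.
    private static final Pattern TOKEN = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Digest over the key=value pairs that the source actually references. */
    static String digest(String source, Map<String, String> props) {
        Map<String, String> used = new TreeMap<>(); // sorted, so the digest is stable
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            String key = m.group(1);
            if (props.containsKey(key)) {
                used.put(key, props.get(key));
            }
        }
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (Map.Entry<String, String> e : used.entrySet()) {
                md.update((e.getKey() + "=" + e.getValue() + "\n").getBytes(StandardCharsets.UTF_8));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every JVM
        }
    }

    /** Returns true when the stored state differs; the state file is then updated. */
    static boolean needsRewrite(Path source, Path stateFile, Map<String, String> props)
            throws IOException {
        String src = new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
        String current = digest(src, props);
        if (Files.exists(stateFile)
                && current.equals(new String(Files.readAllBytes(stateFile), StandardCharsets.UTF_8))) {
            return false;
        }
        Files.write(stateFile, current.getBytes(StandardCharsets.UTF_8));
        return true;
    }
}
```

Note that this only detects changed property values; a change to the source 
file itself would still have to be caught separately, for example by the 
usual timestamp check.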

The alternative is to just use the target file as the (not minimal) state. We 
could read and filter the source file once, while reading and comparing with 
the target file in parallel. As soon as the contents start to differ, we 
truncate the target file at that point and append the rest of the filtered 
source to it. The cost here would be an extra full read of the target. Is this 
what you’re after?
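
To make the second option concrete, here is a sketch of the 
compare-then-truncate loop. It assumes the filtered content is available as an 
InputStream and that the target is a regular file; CompareAndTruncate and sync 
are hypothetical names, not existing maven-filtering API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.file.Path;

// Hypothetical helper, not an existing maven-filtering class.
final class CompareAndTruncate {

    /**
     * Streams already-filtered content over the target, writing only from the
     * first differing byte onwards. Returns true when the target was modified.
     */
    static boolean sync(InputStream filtered, Path target) throws IOException {
        try (RandomAccessFile out = new RandomAccessFile(target.toFile(), "rw")) {
            byte[] fresh = new byte[8192];
            byte[] old = new byte[8192];
            long pos = 0;
            int n;
            while ((n = filtered.read(fresh)) > 0) {
                // Read up to n bytes of the existing target at the same offset.
                out.seek(pos);
                int have = 0;
                while (have < n) {
                    int r = out.read(old, have, n - have);
                    if (r < 0) {
                        break; // target is shorter than the filtered content
                    }
                    have += r;
                }
                int same = 0;
                while (same < have && fresh[same] == old[same]) {
                    same++;
                }
                if (same < n) {
                    // Contents diverge here: truncate, then append the remainder.
                    out.seek(pos + same);
                    out.setLength(pos + same);
                    out.write(fresh, same, n - same);
                    while ((n = filtered.read(fresh)) > 0) {
                        out.write(fresh, 0, n);
                    }
                    return true;
                }
                pos += n;
            }
            if (out.length() > pos) {
                out.setLength(pos); // target has stale trailing bytes
                return true;
            }
            return false; // identical: nothing was written
        }
    }
}
```

When nothing differs, the file is opened but never written or truncated, which 
should leave its modification time alone on common file systems; I have not 
verified that on every platform.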

Otherwise I’m at a loss to understand what would be acceptable. 

Thanks!

Rob
