-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60568/
-----------------------------------------------------------
Review request for Ambari, Oliver Szabo and Robert Nettleton.
Bugs: AMBARI-21387
https://issues.apache.org/jira/browse/AMBARI-21387
Repository: ambari
Description
-------
tail = true should mean that only the latest log file is loaded
tail = false should mean that the previous rolled over log files are loaded
too, but not followed
AbstractInputFile must be prepared for previous rolled over files to check in
even when the next part is actively handled, as the checking in is done when
the data was moved to solr, so all file related fields were moved to maps, so
the checkin can identify which file's status is checked in.
Log files are sorted alphabetically after identified, thus ensureing that the
first one is the active one. In practice they were sorted already, but the
File.listFiles function doesn't ensures it according to it's documentation, so
it's better to do so explicitly.
The thread that is responsible for deleting unused checkpoint files should do
one more check before deleting a checkpoint file because the file it belongs to
is no more there, or it's file key doesn't matches the one in the checkpoint
file: first it should look up if there is any other file in the same folder
which have the same file key, in which case it is assumed that the file was
renamed since. As the code that's looking for a checkpoint file to determine
the next line to load does that by the name of the checkpoint file, which is
generated from the file key it will successfuly identify it irrelevant of the
name of the file.
I know, it's a bit complicated :) But I couldn't put it into words less
complicated.
Also fixed Log Search Config ZK bug not to handle not node data change related
events.
Diffs
-----
ambari-logsearch/ambari-logsearch-config-zookeeper/src/main/java/org/apache/ambari/logsearch/config/zookeeper/LogSearchConfigZK.java
26375e1
ambari-logsearch/ambari-logsearch-logfeeder/src/main/java/org/apache/ambari/logfeeder/input/AbstractInputFile.java
ab50eb7
ambari-logsearch/ambari-logsearch-logfeeder/src/main/java/org/apache/ambari/logfeeder/input/Input.java
c36f96b
ambari-logsearch/ambari-logsearch-logfeeder/src/main/java/org/apache/ambari/logfeeder/input/InputFile.java
fc40ca4
ambari-logsearch/ambari-logsearch-logfeeder/src/main/java/org/apache/ambari/logfeeder/input/InputManager.java
19894ae
ambari-logsearch/ambari-logsearch-logfeeder/src/main/java/org/apache/ambari/logfeeder/input/InputS3File.java
2b19503
ambari-logsearch/ambari-logsearch-logfeeder/src/test/java/org/apache/ambari/logfeeder/input/InputManagerTest.java
e9bbe7e
Diff: https://reviews.apache.org/r/60568/diff/1/
Testing
-------
Tested on local vagrant cluster, by loading previous files again and again, and
by renaming rolled over files. In both cases no duplicate data were loaded,
checkpoints were handled correctly.
Thanks,
Miklos Gergely