Currently, the monitor has an HA-standby behavior as a side-effect of the way it handles logging.
The way this works is that the other services read the monitor's lock in ZK to get the current active monitor's host and port. They use these to set some system properties, which are referenced in the default example generic_logger.xml config file. Whenever they detect a change in the active monitor, they set these system properties again and reset the log4j system. This has the effect of forcing all logging to go to a particular socket open on the monitor, wherever it is currently running.

The ZK lock is currently being used to restrict monitor functionality to a single monitor instance, but this isn't really necessary; there isn't any reason to restrict concurrent monitors. The real purpose of the ZK lock here, as I've described, is service advertisement: we are hijacking the lock mechanism because it also happens to advertise where the active monitor is running.

This is a bit convoluted, makes a lot of assumptions, and has a lot of issues. It also could be impeding some possible avenues of simplification under ACCUMULO-3005.

1. It locks us in to using Log4J (probably a specific range of versions).
2. It sends logs across the network insecurely.
3. It assumes that you only want a single monitor service running.
4. The code assumes particular configuration files with particular variables embedded in them.
5. Extra threads are needed to track changes.
6. It adds code complexity.
7. It assumes the user wishes to use the monitor to watch logs, when other tools are better suited for log aggregation, monitoring, and alerting.

This whole thing would be simpler if we just eliminated the monitor log-watching feature entirely. However, we also have some options short of that. For example:

(1) We could use a different service-advertisement mechanism that doesn't lock.
(2) We could stop doing this variable injection and still leave the socket appender running, so that users would have to configure the destination in their own configuration files if they want to emit logs to that socket.
(3) We could use an Accumulo RPC to send logs, rather than the log4j API.

Lots of options. What do you think? -- Christopher
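
For reference, here is a minimal sketch of the current mechanism as described above. It is not the actual Accumulo code. The class name, the two property names, and the "host:port" format of the data read from the ZK lock are all assumptions made for illustration, and the logging reset is passed in as a plain callback rather than a real log4j call.

```java
import java.util.Objects;

// Hypothetical illustration of the "watch the monitor's lock, inject
// system properties, reset logging" pattern. Not Accumulo's real class.
class MonitorLogForwarder {
    // Assumed property names; the real ones are whatever the default
    // generic_logger.xml references.
    static final String HOST_PROP = "example.monitor.log.host";
    static final String PORT_PROP = "example.monitor.log.port";

    private final Runnable reconfigureLogging; // stands in for the log4j reset
    private String lastAddress;                // last address we acted on

    MonitorLogForwarder(Runnable reconfigureLogging) {
        this.reconfigureLogging = reconfigureLogging;
    }

    /**
     * Called with the address read from the monitor's ZK lock (assumed to be
     * "host:port"). Returns true only if the address changed and logging
     * was reconfigured.
     */
    synchronized boolean onMonitorAddress(String address) {
        if (address == null || Objects.equals(address, lastAddress)) {
            return false; // no active monitor, or nothing changed
        }
        int colon = address.lastIndexOf(':');
        if (colon <= 0 || colon == address.length() - 1) {
            return false; // malformed address; leave logging untouched
        }
        // Inject the values that the logging config file refers to.
        System.setProperty(HOST_PROP, address.substring(0, colon));
        System.setProperty(PORT_PROP, address.substring(colon + 1));
        lastAddress = address;
        reconfigureLogging.run(); // re-read the config so the new values apply
        return true;
    }
}
```

Even in this reduced form, every service needs something watching for address changes (item 5), and the code is tied to specific variable names in the config file (item 4).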
