cshannon opened a new pull request, #4133: URL: https://github.com/apache/accumulo/pull/4133
This PR updates the Manager to create two Fate instances, one for store FATE operations against tables in the accumulo namespace (meta) in the existing zookeeper store and another instance to store FATE operations against user tables (user). The thrift API has been modified so that when acquiring a new transaction, the type is returned of the instance used because future operations with the same transaction will need to know which instance was used. A new table has been created `accumulo.fate` that is used to store FATE data for the accumulo store. I also included a second commit which starts the process of reworking the Fate Admin command to be able to use the new Accumulo store as well. This was needed to fix the FateSummaryIT and FateConcurrentIT. The work here is not complete and there is more to do to make it fully featured (such as being able to filter by fate instance type in the command) so I put it into its own commit as we might (probably) want to move this work to a follow on PR but I included it for now to show the direction I was headed. We could also keep what's here so the tests work and create a follow on PR to fix the rest. For testing, I ran through some of the ITs but not all so I need to kick off a full build. One interesting thing of note was after making changes here the SplitIT (specifically concurrentSplit) test intermittently was causing one of the tservers to die with OOM error. I bumped the default to 384M from 256M and that seemed to solve it. I marked this as a draft for now as I wanted to get some feedback here first on how things look and also how much to include in this PR (such as if I should just move the fate admin command to another PR). **Issues to still address (likely as follow on Issues):** 1. As noted above, need to finish fixing the Fate Admin command api to make sure it's fully compatible with ZK and Accumulo instances/stores and will support filtering correctly by instance type. 2. Conditional mutations need to be added to the Accumulo store which is being addressed already by: https://github.com/apache/accumulo/issues/4114 3. Upgrades need to be handled. Existing instances will need to create the new `accumulo.fate` table on upgrade so it exists. The current code handles initializing new instances with the table. 4. Currently the Accumulo store uses the AgeOff store (just like zookeeper) so on initialization the list of existing transactions is loaded. This means while the Manager is still starting up and initializing the store it will scan the fate table which causes the Tablet sever that accepts the scan to log an exception. This exception is due to the tablet server loading a tablet and trying to send a message back to the Manager over the manager RPC channel that the tablet was loaded but because the Manager hasn't finished loading, the Tsever can't connect back to the Manager yet so an error is logged. Eventually a retry happens and the message is passed back successfully so this is not really a huge issue as the scan completes successfully and eventually the message gets sent back on retry but it does cause an ugly error message on start up. I haven't not addressed this yet because this behavior is very likely to change with the update to #3964 anyways -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
