keith-turner opened a new pull request, #4129: URL: https://github.com/apache/accumulo/pull/4129
As we change Accumulo to use FATE for per-tablet operations, it's important to avoid reading all of FATE's persisted data into memory. This commit modifies FATE to use Streams internally instead of Collections. For the Accumulo implementation of FATE storage, this makes it possible to have a Java stream backed by a scanner, which avoids reading all of the FATE ids into memory. The ZooKeeper storage implementation will still read everything into memory.

Another change made in this PR was optimizing the Accumulo storage layer to read the status while reading the id. Before this change, ids were read from a scanner, and then a new scanner was created for each id to read its status. Now the status and id are read together in a stream from the same scanner, which should be much faster. This change was not possible for ZooKeeper; it will still make an RPC to get each status. It's OK that the ZooKeeper store is less efficient, as the Accumulo store will likely hold orders of magnitude more data. It's probably not possible to make the same optimizations for speed and memory in the ZooKeeper store.

A bug in the Fate integration test was fixed by using the Unknown status, which represents the status of a transaction that does not exist in the persisted store. Ran into this bug while testing these changes.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
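
The idea described in the PR text above, a stream backed by a scanner that yields each id together with its status, can be illustrated with a minimal Java sketch. This is not the code from the PR: `FateStreamSketch`, `FakeScanner`, `IdStatus`, `Status`, `list`, and `getStatus` are all hypothetical names invented for illustration, and an in-memory `TreeMap` stands in for the persisted table. It only assumes the standard `java.util.stream` API.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical sketch; none of these names come from the Accumulo code base.
class FateStreamSketch {

  // Stand-in for a transaction status; UNKNOWN means "not in the persisted store".
  enum Status { NEW, IN_PROGRESS, SUCCESSFUL, UNKNOWN }

  // An id paired with the status that was read alongside it.
  static final class IdStatus {
    final long id;
    final Status status;

    IdStatus(long id, Status status) {
      this.id = id;
      this.status = status;
    }
  }

  // Stand-in for a scanner: iterates persisted rows lazily and must be closed.
  static final class FakeScanner
      implements Iterator<Map.Entry<Long, Status>>, AutoCloseable {
    private final Iterator<Map.Entry<Long, Status>> rows;
    final AtomicInteger rowsRead = new AtomicInteger();
    boolean closed = false;

    FakeScanner(Map<Long, Status> table) {
      this.rows = table.entrySet().iterator();
    }

    @Override
    public boolean hasNext() {
      return rows.hasNext();
    }

    @Override
    public Map.Entry<Long, Status> next() {
      rowsRead.incrementAndGet(); // count rows actually pulled from the "table"
      return rows.next();
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  // Lazily maps scanner rows to (id, status) pairs in a single pass.
  // Closing the returned stream closes the scanner.
  static Stream<IdStatus> list(FakeScanner scanner) {
    Spliterator<Map.Entry<Long, Status>> split =
        Spliterators.spliteratorUnknownSize(scanner, Spliterator.ORDERED);
    return StreamSupport.stream(split, false)
        .map(e -> new IdStatus(e.getKey(), e.getValue()))
        .onClose(scanner::close);
  }

  // Looks up one id; an id absent from the store is reported as UNKNOWN.
  static Status getStatus(Map<Long, Status> table, long id) {
    return table.getOrDefault(id, Status.UNKNOWN);
  }

  public static void main(String[] args) {
    Map<Long, Status> table = new TreeMap<>();
    for (long i = 0; i < 1000; i++) {
      table.put(i, i % 2 == 0 ? Status.NEW : Status.IN_PROGRESS);
    }

    FakeScanner scanner = new FakeScanner(table);
    List<IdStatus> firstFew;
    // try-with-resources ensures the scanner behind the stream is released.
    try (Stream<IdStatus> stream = list(scanner)) {
      firstFew = stream.limit(3).collect(Collectors.toList());
    }

    System.out.println("pairs collected: " + firstFew.size());
    System.out.println("rows pulled from scanner: " + scanner.rowsRead.get());
    System.out.println("scanner closed: " + scanner.closed);
    System.out.println("status of missing id: " + getStatus(table, 5000L));
  }
}
```

Because the stream pulls rows on demand, a consumer that stops early does not force every row into memory, and using try-with-resources keeps the underlying scanner from leaking. A store that cannot iterate lazily could still satisfy the same `Stream` return type by streaming over a fully loaded list, at the cost of holding everything in memory.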
