Creating and/or deleting a lot of files is a pattern that occurs in multiple places in the Accumulo code (such as GC, create table, and delete table). HDFS does not offer any primitives to make this efficient. However, I think doing these ops in a thread pool can be efficient, because the HDFS client batches calls from multiple threads into single NN RPC calls. It might be nice to offer high-level abstractions for batch NN ops in the Accumulo code that are implemented using thread pools. These could be implemented in Accumulo's volume manager code. This issue was created based on the following discussion:
https://github.com/apache/accumulo/pull/575#discussion_r215355339
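
To make the idea concrete, here is a rough sketch of what a batch delete abstraction backed by a thread pool might look like. Everything in it is an assumption for illustration: `BatchFileOps`, `deleteAll`, and the `deleter` callback are hypothetical names, not existing Accumulo or Hadoop APIs, and paths are plain strings so the example runs without HDFS on the classpath.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical helper for illustration only; not part of Accumulo or Hadoop.
public class BatchFileOps {

  // Runs the supplied delete operation for every path on a fixed-size thread
  // pool and returns the paths whose delete returned false or threw an
  // exception, in the order the paths were given.
  public static List<String> deleteAll(Collection<String> paths,
      Predicate<String> deleter, int numThreads) {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<String> ordered = new ArrayList<>(paths);
      List<Future<Boolean>> results = new ArrayList<>();

      // Submit one task per path so many NN calls are in flight at once.
      for (String path : ordered) {
        Callable<Boolean> task = () -> deleter.test(path);
        results.add(pool.submit(task));
      }

      // Wait for every task and collect the paths that were not deleted.
      List<String> failed = new ArrayList<>();
      for (int i = 0; i < ordered.size(); i++) {
        try {
          if (!results.get(i).get()) {
            failed.add(ordered.get(i));
          }
        } catch (ExecutionException e) {
          // The delete operation itself threw; report the path as failed.
          failed.add(ordered.get(i));
        } catch (InterruptedException e) {
          // Restore the interrupt flag; this path and any still-pending
          // paths end up reported as failed.
          Thread.currentThread().interrupt();
          failed.add(ordered.get(i));
        }
      }
      return failed;
    } finally {
      pool.shutdown();
    }
  }
}
```

In real code the `deleter` callback would presumably wrap a call such as Hadoop's `FileSystem.delete(Path, boolean)`, and the pool would more likely be shared and sized by configuration than created per call. A similar method could cover batch creates.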
