Hello, Folks. As I look at the following tickets, I thought it might be useful to share how we are using the BatchWriter, some of the challenges we've had, some thoughts about its redesign, and how we might get involved.
https://issues.apache.org/jira/browse/ACCUMULO-4154
https://issues.apache.org/jira/browse/ACCUMULO-2589
https://issues.apache.org/jira/browse/ACCUMULO-2990

One of our primary use cases of the BatchWriter is from within a Storm topology, reading from Kafka. Generally speaking, Storm might be persisting a single mutation or a small set of mutations at a time (low latency), or larger batches with Trident (higher throughput).

In addition to ACCUMULO-2990 (any TimedOutException, which then surfaces as a MutationsRejectedException and requires a new connection to be made), one of our requirements is to ensure that a given thread's mutations, and no others, are the ones flushed (pseudo transactions). Otherwise, we might get a failure for a mutation which belongs to another thread and was already ACKed by Storm, which means we no longer have a 'handle' on that offset in Kafka to replay the failure, i.e. the message could be 'lost'.

Although the BatchWriter is thread-safe, we end up using a single BatchWriter per thread to make reasoning about the above simpler, but this creates a resource issue: the number of connections to Accumulo and ZooKeeper.

This all makes me wonder what the design goals might have been for the current version of the driver, and whether the efforts to rewrite it might benefit from incorporating elements to address some of the use cases above. What can we learn from how drivers for other "NoSQL" databases are implemented?

Would it make sense to remove all the global variables ("somethingFailed"), the thread sleep/notify, and the frequent calls to "checkForFailures()", and consider a 'connection pool' model where writes are single-threaded, linearized and isolated during the connection lease? Could we make the client non-blocking, with optional pipelining, so multiple writes could share a connection and allow interleaving of operations (with individual acks)?

Looking forward to hearing everyone's thoughts.

-Mike
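
P.S. To make the 'connection pool' idea a bit more concrete, here is a rough, self-contained sketch of what a lease could look like. It is only an illustration under stated assumptions: SimpleWriter, InMemoryWriter, WriterPool and Lease are hypothetical names invented for this example, not part of the Accumulo API, and the in-memory writer merely stands in for a real connection. The point is that a caller holds one writer exclusively for the duration of the lease, so a failure at flush time can only concern mutations written under that lease.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical single-threaded writer; NOT the Accumulo BatchWriter API. */
interface SimpleWriter {
    void add(String mutation);
    /** Fails only for mutations added since the previous flush. */
    void flush() throws Exception;
}

/** Toy in-memory writer (an assumption for the example): rejects mutations starting with "bad". */
final class InMemoryWriter implements SimpleWriter {
    final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    public void add(String mutation) {
        pending.add(mutation);
    }

    public void flush() throws Exception {
        try {
            for (String m : pending) {
                if (m.startsWith("bad")) {
                    throw new Exception("rejected: " + m);
                }
            }
            committed.addAll(pending); // all-or-nothing for this batch
        } finally {
            pending.clear(); // never leak state into the next lease
        }
    }
}

/** Fixed-size pool; each lease gives one caller exclusive use of one writer. */
final class WriterPool {
    private final BlockingQueue<SimpleWriter> idle;

    WriterPool(List<? extends SimpleWriter> writers) {
        idle = new ArrayBlockingQueue<>(writers.size(), false, writers);
    }

    final class Lease implements AutoCloseable {
        private final SimpleWriter writer;

        Lease(SimpleWriter writer) {
            this.writer = writer;
        }

        void write(String mutation) {
            writer.add(mutation);
        }

        /** Flushes this lease's mutations, then returns the writer to the pool. */
        @Override
        public void close() throws Exception {
            try {
                writer.flush();
            } finally {
                idle.add(writer);
            }
        }
    }

    /** Blocks until a writer is free. */
    Lease lease() throws InterruptedException {
        return new Lease(idle.take());
    }
}
```

A Storm bolt could then wrap each tuple (or Trident batch) in a try-with-resources block around pool.lease(), and ACK only if close() returns without throwing. The pool size, rather than the thread count, would bound the number of underlying connections.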
