Thanks for taking the time to write this up, Dylan.

I'm a little worried about the RemoteWriteIterator. Using a BatchWriter implies that you'll need some sort of resource management - both ensuring that the BatchWriter is close()'ed whenever a compaction/procedure ends and handling rejected mutations. Have you put any thought into how you would address these?

I'm not familiar enough with the internals anymore, but I remember having some trouble writing to another table during compactions when I was working on replication. I think that as long as it's not triggered off of the metadata table, it wouldn't have any deadlock issues.

Architecturally, it's a little worrisome, because it feels a bit like using a wrench as a hammer -- iterators are great for performing some passing computation, but not really for doing arbitrary reads/writes. It gets back to the Accumulo/HBase comparisons where people try to compare Iterators and Coprocessors. They can sometimes do the same thing, but they're definitely different features.

Anyways, I need to stew on it some more and give it a few more reads. Thanks again for sharing!

Dylan Hutchison wrote:
Hello all,

As promised
<https://mail-archives.apache.org/mod_mbox/accumulo-user/201502.mbox/%3CCAPx%3DJkakO3ice7vbH%2BeUo%2B6AP1JPebVbTDu%2Bg71KV8SvQ4J9WA%40mail.gmail.com%3E>,
here is a design doc open for comments on implementing server-side
computation in Accumulo.

https://github.com/Accla/accumulo_stored_procedure_design

Would love to hear your opinion, especially if the proposed design
pattern matches one of /your use cases/.

Regards,
Dylan Hutchison
