Hi Dylan,

It's a clever way to leverage existing Accumulo behavior (full major compaction) so that iterators act as clients and perform a parallel operation to populate a new table. Have you tried this method in practice yet? What pitfalls have you run into, perhaps regarding client-side static state in the JVM, or resource management issues within the tablet servers? What do you think the similarities and differences are with other parallel execution methods that one could use to achieve the same results (like Map/Reduce)?

Also, do you have any code available for an example RemoteSourceIterator that we might be able to try? The Transpose one seemed simple enough, but any others would be neat to try as well. Do you have any thoughts on whether there should be some abstract base class available in Accumulo (vs. as part of the contrib) to support these iterators and handle the boilerplate of setting up and serializing the client configuration when the procedure executes, or utilities to help create a stored procedure table?

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Thu, Feb 26, 2015 at 12:42 AM, Dylan Hutchison <[email protected]> wrote:
> Hello all,
>
> As promised
> <https://mail-archives.apache.org/mod_mbox/accumulo-user/201502.mbox/%3CCAPx%3DJkakO3ice7vbH%2BeUo%2B6AP1JPebVbTDu%2Bg71KV8SvQ4J9WA%40mail.gmail.com%3E>,
> here is a design doc open for comments on implementing server-side
> computation in Accumulo.
>
> https://github.com/Accla/accumulo_stored_procedure_design
>
> Would love to hear your opinion, especially if the proposed design pattern
> matches one of *your use cases*.
>
> Regards,
> Dylan Hutchison
>
