Thanks for the feedback and "points in the right direction". I will create a JIRA ticket and coordinate status from that point. Additionally, if I have any more questions, I will submit them to the mailing list.
Again, thanks all! I definitely feel welcome!

Wyatt

On Thu, Jan 12, 2017 at 12:45 PM Stephen Sisk <[email protected]> wrote:

> Hi Wyatt!
>
> Some other info you might find useful:
> * You might be tempted to implement a Sink - it's the obvious thing in the
>   API for writing to external data stores. However, we're finding it less
>   useful these days and generally discouraging its use unless you're writing
>   to files (which you're not). Instead, if you can, just implement a DoFn
>   that does the write. As Davor mentioned, BigTableIO is a good example of
>   this.
> * It's useful to understand the lifecycle of DoFns
>   (setup/startBundle/finishBundle/teardown). For example, you'll likely want
>   to batch writes for efficiency - BigTableIO does this by flushing writes
>   stored locally in finishBundle.
> * BigTableIO uses a separate "service" class - that's useful for making
>   your tests simpler by abstracting out the network retry/etc. logic.
>
> As you'll have noticed by the multiple replies to your message, people are
> eager to answer questions you might have - feel free to pipe up on the
> mailing list (dev@ might be more appropriate in that case).
>
> S
>
> On Wed, Jan 11, 2017 at 9:14 PM Jean-Baptiste Onofré <[email protected]>
> wrote:
>
> Welcome and fully agree with Davor.
>
> You can count on me to do the review!
>
> Regards
> JB
>
> On Jan 12, 2017, at 06:12, Davor Bonaci <[email protected]> wrote:
>
> Hi Wyatt -- welcome!
>
> If you'd like to write a PCollection to Apache Accumulo's key/value
> store, writing a new IO connector would be the best path forward. Accumulo
> has somewhat similar concepts to BigTable, so you can use the existing
> BigTableIO as an inspiration.
>
> You are thinking about it exactly right -- a connector written in Beam
> would be runner-independent, and thus can run anywhere.
>
> I'm not aware that anybody has started on this one yet -- feel free to
> file a JIRA to have a place to coordinate if someone else is interested.
> And, if you get stuck or need help in any way, there are plenty of people
> on the Beam mailing lists happy to help!
>
> Once again, welcome!
>
> Davor
>
> On Wed, Jan 11, 2017 at 6:04 PM, Wyatt Frelot <[email protected]> wrote:
>
> All,
>
> Being new to Apache Beam, I want to ensure that I approach things the
> "right way".
>
> My goal:
>
> I want to be able to write a PCollection to Apache Accumulo. Something
> like this:
>
> PCollection.apply( AccumuloIO.Write.to("AccumuloTable"));
>
> While I am sure I can create a custom class to do so, it has me thinking
> about identifying the best way forward.
>
> I want to use the Apex Runner to run my applications. Apex has Malhar
> libraries that are already written that would be really useful. But, I
> don't think that is the point. The goal is to develop IO Connectors that
> are able to be applied to any runner. Am I thinking about this correctly?
>
> Is there any work being done to develop an IO Connector for Apache
> Accumulo?
>
> Wyatt
>
> wa
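
For anyone following along, the pattern Stephen describes in the thread (a DoFn that buffers writes, flushes them in finishBundle, and talks to the store through a separate "service" class) might look roughly like the sketch below. It is deliberately dependency-free: it does not use the Beam or Accumulo APIs, and every name in it (BatchingWriteSketch, WriteService, InMemoryService, BatchingWriter, maxBatchSize) is hypothetical. The method names only mirror the lifecycle steps mentioned above; a real connector would extend Beam's DoFn and have the service wrap an actual Accumulo client.

```java
import java.util.ArrayList;
import java.util.List;

/** Dependency-free sketch of the "buffer in the DoFn, flush in finishBundle" pattern. */
public class BatchingWriteSketch {

    /** Hypothetical service interface: hides connection and retry logic from the writer. */
    interface WriteService {
        void writeBatch(List<String> records);
    }

    /** Test double that records each batch instead of touching the network. */
    static class InMemoryService implements WriteService {
        final List<List<String>> batches = new ArrayList<>();

        @Override
        public void writeBatch(List<String> records) {
            // Copy the batch, because the writer reuses and clears its buffer.
            batches.add(new ArrayList<>(records));
        }
    }

    /** Stand-in for a DoFn; the method names mirror the lifecycle steps, not Beam's real API. */
    static class BatchingWriter {
        private final WriteService service;
        private final int maxBatchSize;
        private List<String> buffer;

        BatchingWriter(WriteService service, int maxBatchSize) {
            this.service = service;
            this.maxBatchSize = maxBatchSize;
        }

        /** Called once before a bundle of elements: start with an empty buffer. */
        void startBundle() {
            buffer = new ArrayList<>();
        }

        /** Called per element: buffer locally, flushing early if the batch is full. */
        void processElement(String record) {
            buffer.add(record);
            if (buffer.size() >= maxBatchSize) {
                flush();
            }
        }

        /** Called once after the bundle: write out whatever is still buffered. */
        void finishBundle() {
            flush();
        }

        private void flush() {
            if (!buffer.isEmpty()) {
                service.writeBatch(buffer);
                buffer.clear();
            }
        }
    }

    public static void main(String[] args) {
        InMemoryService service = new InMemoryService();
        BatchingWriter writer = new BatchingWriter(service, 2);

        writer.startBundle();
        for (String record : new String[] {"a", "b", "c"}) {
            writer.processElement(record);
        }
        writer.finishBundle();

        System.out.println(service.batches); // prints [[a, b], [c]]
    }
}
```

Keeping the writer behind the WriteService interface is what makes the test double possible: the batching logic can be checked without a running data store, which is the benefit Stephen attributes to BigTableIO's separate service class.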
