Awesome, thanks Wyatt!

On Thu, Jan 12, 2017 at 10:08 AM, Wyatt Frelot <[email protected]> wrote:

> Thanks for the feedback and "points in the right direction". I will create
> a JIRA ticket and coordinate status from that point. Additionally, if I
> have any more questions, I will submit them to the mailing list.
>
> Again, thanks all! I definitely feel welcome!
>
> Wyatt
>
> On Thu, Jan 12, 2017 at 12:45 PM Stephen Sisk <[email protected]> wrote:
>
>> Hi Wyatt!
>>
>> some other info you might find useful:
>> * You might be tempted to implement a Sink - it's the obvious thing in
>> the API for writing to external data stores. However, we're finding it less
>> useful these days and generally discouraging its use unless you're writing
>> to files (which you're not). Instead, if you can, just implement a DoFn
>> that does the write. As Davor mentioned, BigTableIO is a good example of
>> this.
>> * It's useful to understand the lifecycle of DoFns 
>> (setup/startbundle/finishbundle/teardown.)
>> For example, you'll likely want to batch writes for efficiency - BigTableIO
>> does this by flushing writes stored locally in finishBundle.
>> * BigTableIO uses a separate "service" class - that's useful for making
>> your tests simpler by abstracting out the network retry/etc logic
>>
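[Editor's sketch] The batching and "service" class points above can be illustrated roughly as follows. This is a plain-Java sketch with no Beam dependency, so it is not the BigTableIO implementation: `WriteService`, `BufferingWriter`, and the `maxBatchSize` cap are hypothetical names and choices made for illustration. In an actual connector the three lifecycle methods would presumably be the DoFn methods annotated with `@StartBundle`, `@ProcessElement`, and `@FinishBundle`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the store client; network and retry logic would live behind it. */
interface WriteService {
    void writeBatch(List<String> records);
}

/** Mirrors the DoFn bundle lifecycle in plain Java (assumption: not the real Beam API). */
class BufferingWriter {
    private final WriteService service;
    private final int maxBatchSize; // illustrative cap, not taken from BigTableIO
    private List<String> buffer;

    BufferingWriter(WriteService service, int maxBatchSize) {
        this.service = service;
        this.maxBatchSize = maxBatchSize;
    }

    // Would correspond to @StartBundle: start each bundle with an empty buffer.
    void startBundle() {
        buffer = new ArrayList<>();
    }

    // Would correspond to @ProcessElement: store the record locally instead of writing at once.
    void processElement(String record) {
        buffer.add(record);
        if (buffer.size() >= maxBatchSize) {
            flush();
        }
    }

    // Would correspond to @FinishBundle: write whatever is still buffered.
    void finishBundle() {
        flush();
    }

    private void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Hand the service a copy so clearing the buffer cannot affect the batch.
        service.writeBatch(new ArrayList<>(buffer));
        buffer.clear();
    }
}
```

Because the writer only talks to the `WriteService` interface, a test can pass in an in-memory fake (for example a lambda that appends to a list) and check what was flushed without touching the network.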
>> As you'll have noticed by the multiple replies to your message, people
>> are eager to answer questions you might have - feel free to pipe up on the
>> mailing list (dev@ might be more appropriate in that case.)
>>
>> S
>>
>> On Wed, Jan 11, 2017 at 9:14 PM Jean-Baptiste Onofré <[email protected]>
>> wrote:
>>
>> Welcome and fully agree with Davor.
>>
>> You can count on me to do the review !
>>
>> Regards
>> JB
>> On Jan 12, 2017, at 06:12, Davor Bonaci <[email protected]> wrote:
>>
>> Hi Wyatt -- welcome!
>>
>> If you'd like to write a PCollection to Apache Accumulo's key/value
>> store, writing a new IO connector would be the best path forward. Accumulo's
>> concepts are somewhat similar to BigTable's, so you can use the existing
>> BigTableIO as inspiration.
>>
>> You are thinking about it exactly right -- a connector written in Beam would be
>> runner-independent, and thus can run anywhere.
>>
>> I'm not aware that anybody has started on this one yet -- feel free to
>> file a JIRA to have a place to coordinate if someone else is interested.
>> And, if you get stuck or need help in any way, there are plenty of people
>> on the Beam mailing lists happy to help!
>>
>> Once again, welcome!
>>
>> Davor
>>
>> On Wed, Jan 11, 2017 at 6:04 PM, Wyatt Frelot <[email protected]> wrote:
>>
>> All,
>>
>> Being new to Apache Beam...I want to ensure that I approach things the
>> "right way".
>>
>> My goal:
>>
>> I want to be able to write a PCollection to Apache Accumulo. Something
>> like this:
>>
>>               PCollection.apply( AccumuloIO.Write.to("AccumuloTable"));
>>
>>
>> While I am sure I can create a custom class to do so, it has me thinking
>> about identifying the best way forward.
>>
>> I want to use the Apex Runner to run my applications. Apex has Malhar
>> libraries that are already written that would be really useful. But, I
>> don't think that is the point. The goal is to develop IO Connectors that
>> can be applied to any runner.  Am I thinking about this correctly?
>>
>> Is there any work being done to develop an IO Connector for Apache
>> Accumulo?
>>
>> Wyatt
>>
