Steps:

1. Implement the IBackingMap interface for DynamoDB.
2. Use OpaqueTridentKafkaSpout to read the Kafka stream.
3. Use OpaqueMap with your IBackingMap implementation within the topology.
4. All your logic will now have exactly-once semantics inside your Trident
topology.

This is all on the Trident state page linked below. Let me know if you
have more specific questions.
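
To make the steps above more concrete, here is a minimal sketch of the
bookkeeping that an opaque transactional state does on top of a backing
key/value store. It deliberately does not use the Storm API: the class and
method names (OpaqueStateSketch, InMemoryBackingStore, applyBatch) are
hypothetical, and the in-memory map is only a stand-in for a DynamoDB table.
The idea it illustrates is the one described on the Trident state page:
store the batch txid together with the previous and current value, and if a
batch with the same txid shows up again, recompute from the previous value
instead of applying the update twice.

```java
import java.util.HashMap;
import java.util.Map;

public class OpaqueStateSketch {

    /** Per-key record: the txid of the batch that last wrote it, plus the value before and after. */
    static final class OpaqueValue {
        final long txid;
        final long prev; // value before batch `txid` was applied
        final long curr; // value after batch `txid` was applied

        OpaqueValue(long txid, long prev, long curr) {
            this.txid = txid;
            this.prev = prev;
            this.curr = curr;
        }
    }

    /** Hypothetical stand-in for a DynamoDB-backed store (assumption, not a real client). */
    static final class InMemoryBackingStore {
        private final Map<String, OpaqueValue> table = new HashMap<>();

        OpaqueValue get(String key) {
            return table.get(key);
        }

        void put(String key, OpaqueValue value) {
            table.put(key, value);
        }
    }

    /** Adds `delta` to the counter stored under `key` as part of batch `txid`. */
    static long applyBatch(InMemoryBackingStore store, String key, long txid, long delta) {
        OpaqueValue stored = store.get(key);
        long base;
        if (stored == null) {
            base = 0L;            // first write for this key
        } else if (stored.txid == txid) {
            base = stored.prev;   // replayed batch: discard its earlier result
        } else {
            base = stored.curr;   // new batch: build on the latest value
        }
        long updated = base + delta;
        store.put(key, new OpaqueValue(txid, base, updated));
        return updated;
    }

    public static void main(String[] args) {
        InMemoryBackingStore store = new InMemoryBackingStore();
        System.out.println(applyBatch(store, "events", 1L, 3L)); // 3
        System.out.println(applyBatch(store, "events", 2L, 2L)); // 5
        // Batch 2 is replayed with different contents; its first result is replaced.
        System.out.println(applyBatch(store, "events", 2L, 4L)); // 7
        System.out.println(applyBatch(store, "events", 3L, 1L)); // 8
    }
}
```

In a real topology you would not write this logic yourself: OpaqueMap does
the txid handling, and your IBackingMap implementation only needs to do the
bulk reads and writes (multiGet and multiPut) against DynamoDB.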



On Wed, Jan 13, 2016 at 12:11 PM, Ajay <[email protected]> wrote:

> I am new to Storm and not able to understand clearly how to use Trident
> state to achieve exactly once semantics with external database. Can you
> please elaborate?
>
> Maybe a sample use case with one external database, as below:
>
> 1) Read an event from Kafka
> 2) Parse the JSON
> 3) Process the data
> 4) Write it to DynamoDB
>
> If I need exactly-once semantics, what steps do I have to take?
>
> I am also confused about what state is stored in Zookeeper and what state
> is stored using the Trident state implementation.
>
>
> Thanks
> Ajay
>
>
> On Wed, Jan 13, 2016 at 11:53 AM, Nathan Marz <[email protected]>
> wrote:
>
>> That information is not accurate. Trident can have exactly once semantics
>> with *any* external database. See this page for the details:
>> https://storm.apache.org/documentation/Trident-state.html
>>
>>
>> On Tue, Jan 12, 2016 at 3:55 AM, Matthias J. Sax <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> Storm only guarantees exactly once *within* the system. If you write
>>> to an external consumer you need to ensure that writes are idempotent or
>>> de-duplicate externally.
>>>
>>> About Trident's internals, start by reading the documentation here:
>>>
>>> https://storm.apache.org/documentation/Transactional-topologies.html
>>>
>>> If you have further questions, don't hesitate to follow up on this
>>> thread.
>>>
>>> -Matthias
>>>
>>>
>>> On 01/12/2016 06:26 AM, Ajay wrote:
>>> > Hi,
>>> >
>>> > We are evaluating Storm with Trident to process events from Kafka in an
>>> > exactly once manner so that the consumer application need not worry
>>> > about the de-duplication logic.
>>> >
>>> >
>>> > One sample use case could be as below:
>>> >
>>> > 1) Read an event from Kafka
>>> > 2) Parse the JSON
>>> > 3) Process the data
>>> > 4) Write it to DynamoDB and Apache Solr
>>> >
>>> > So I want to understand how Trident handles exactly-once
>>> > semantics when the processing updates one or more distributed systems,
>>> > as in this case. What happens if a write to one of them fails?
>>> >
>>> > I would also like to know the implementation details of how Trident
>>> > supports exactly-once semantics.
>>> >
>>> > Thanks
>>> > Ajay
>>>
>>>
>>
>>
>> --
>> Twitter: @nathanmarz
>> http://nathanmarz.com
>>
>
>


-- 
Twitter: @nathanmarz
http://nathanmarz.com
