Perfect. Thanks for the quick response Nathan.

That is exactly what we had hoped.

-brian

---
Brian O'Neill
Chief Technology Officer


Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> • healthmarketscience.com




From:  Nathan Marz <[email protected]>
Reply-To:  <[email protected]>
Date:  Sunday, February 16, 2014 at 3:07 PM
To:  <[email protected]>
Subject:  Re: Consistent partitions?

Yes, batchId + partitionIndex consistently represents the same data as long
as:

1. Any repartitioning you do is deterministic (partitionBy is deterministic,
but shuffle is not)
2. You're using a spout that replays the exact same batch each time (which
is true of transactional spouts but not of opaque transactional spouts)
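
To illustrate the first condition, here is a minimal standalone sketch in Python. It does not use Storm or Trident at all: partition_by and shuffle below are hypothetical stand-ins that only model the idea (hash the key field and mod by the partition count, versus random assignment), and batch_42 is made-up data. It assumes the second condition already holds, i.e. the spout replays exactly the same tuples for the batch.

```python
import random
import zlib


def partition_by(batch, key, num_partitions):
    """Deterministic stand-in: a tuple's partition depends only on its key field."""
    partitions = [[] for _ in range(num_partitions)]
    for tup in batch:
        # crc32 is stable across processes, unlike Python's salted hash().
        index = zlib.crc32(str(tup[key]).encode("utf-8")) % num_partitions
        partitions[index].append(tup)
    return partitions


def shuffle(batch, num_partitions, rng):
    """Non-deterministic stand-in: each tuple goes to a randomly chosen partition."""
    partitions = [[] for _ in range(num_partitions)]
    for tup in batch:
        partitions[rng.randrange(num_partitions)].append(tup)
    return partitions


# Hypothetical batch; a transactional spout would emit these same tuples on replay.
batch_42 = [
    {"user": "alice", "n": 1},
    {"user": "bob", "n": 2},
    {"user": "carol", "n": 3},
    {"user": "alice", "n": 4},
]

first_attempt = partition_by(batch_42, "user", 3)
replay = partition_by(batch_42, "user", 3)

# Same batch + deterministic partitioning: every partition index holds the same data.
print(first_attempt == replay)

# With random assignment, two attempts at the same batch may disagree.
print(shuffle(batch_42, 3, random.Random()) == shuffle(batch_42, 3, random.Random()))
```

The first comparison always holds because the partition index is a pure function of the key. The second may or may not hold on any given run, which is why a (batchId, partitionIndex) pair cannot be relied on after a random repartitioning.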


On Sun, Feb 16, 2014 at 5:23 AM, Brian O'Neill <[email protected]>
wrote:
> I don't see an answer to the final question in this thread:
> https://groups.google.com/forum/#!topic/storm-user/m86grqSXjtQ
> 
> We have a similar use case and require consistent partitioning such that a
> batch partition always contains the same data.
> 
> Like David, I want to double check that the partitioning is consistent across
> replays, even in the event of host failures, etc.
> 
> Does a batchId + partitionIndex consistently represent the same data?
> Does TransactionalTridentKafkaSpout make such a guarantee?
> 
> -brian
> 
> -- 
> Brian ONeill
> CTO, Health Market Science (http://healthmarketscience.com)
> mobile:215.588.6024 <tel:215.588.6024>
> blog: http://brianoneill.blogspot.com/
> twitter: @boneill42



-- 
Twitter: @nathanmarz
http://nathanmarz.com <http://nathanmarz.com/>

