Awesome, Aaron. I can send you what we have done offline!

- Erik

On Thu, Jan 7, 2016 at 11:12 AM, Aaron.Dossett <[email protected]>
wrote:

> Thanks, Erik.  Your “Partitioner” is exactly what I had in mind and even
> what I named my stubbed out interface :-)  Since Target has decided against
> this approach for other reasons, it will have to be a side project for me
> for now.
>
> Best, Aaron
>
> From: Erik Weathers <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Wednesday, January 6, 2016 at 5:48 PM
> To: "[email protected]" <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: HDFS Bolts -- partitioning output
>
> hey Aaron,
>
> We've also written a similar bolt at Groupon, though we aren't super
> satisfied with the implementation. :)  We are begrudgingly using it because
> there is no partitioning support in the OSS storm-hdfs bolt.
>
> Though one thing I do like about our implementation is having the ability
> to define your own "Partitioner" in each topology to do various types of
> partitioning (date-based, message ID-based, topic-based, whatever).  It
> would be great if your implementation had such logic too.  E.g., when the
> bolt needs the HDFS path for a tuple's data, it calls the Partitioner to
> determine that path.  The Partitioner can take the Tuple object and an
> opaque key/value Configuration hash that can pass items like a Kafka topic
> name to be included in the HDFS path.
>
> - Erik
>
> On Tue, Dec 29, 2015 at 7:12 AM, Aaron.Dossett <[email protected]>
> wrote:
>
>> Hi,
>>
>> My team was exploring changes to the HDFS bolts that would allow for
>> partitioning the output, for example into directories corresponding to
>> day.  This is different from the existing functionality to rotate files
>> based on a set length of time.  For unrelated reasons, we are probably not
>> going to pursue this further.  However, I have some code changes that
>> implement most of this functionality for at least some partitioning use
>> cases.  If there is interest from the user or developer community for this
>> feature, I could get it in shape for a PR to get feedback about our
>> implementation approach.
>>
>> Any feedback on this idea is welcome.  Thanks! -Aaron
>>
>
>
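
The "Partitioner" idea discussed in this thread (a per-topology hook that is given a tuple plus an opaque key/value configuration and returns the HDFS path for that tuple's data) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the interface name and method signature, the `DatePartitioner` class, the `timestamp` field, and the `topic` configuration key are hypothetical and are neither the Groupon code nor part of the storm-hdfs API. A plain `Map` stands in for Storm's `Tuple` so the example compiles without Storm on the classpath.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Map;

// Hypothetical interface: not the Groupon implementation and not a storm-hdfs API.
interface Partitioner {
    // tupleFields stands in for a Storm Tuple (field name -> value).
    // conf is the opaque key/value configuration, e.g. carrying a Kafka topic name.
    String partitionPath(Map<String, Object> tupleFields, Map<String, String> conf);
}

// Date-based partitioning: builds /<topic>/<yyyy>/<MM>/<dd> from a tuple timestamp.
final class DatePartitioner implements Partitioner {
    // Days are computed in UTC so every worker agrees on the directory name.
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy/MM/dd").withZone(ZoneOffset.UTC);

    @Override
    public String partitionPath(Map<String, Object> tupleFields, Map<String, String> conf) {
        // Assumed configuration key; falls back to "unknown" when it is absent.
        String topic = conf.getOrDefault("topic", "unknown");
        // Assumed tuple field holding epoch milliseconds.
        long millis = ((Number) tupleFields.get("timestamp")).longValue();
        return "/" + topic + "/" + DAY.format(Instant.ofEpochMilli(millis));
    }
}
```

A bolt built this way would call `partitionPath` for each tuple and write to a file under the returned directory. Message-ID-based or topic-based schemes would then be other implementations of the same interface, chosen per topology.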
