Thanks Jacob.

I understand what you are saying, but that applies only when a bolt-to-bolt
connection is in place. When a previous bolt sends a file to this bolt,
grouped on some field, each file is passed on to exactly one thread.

In my case, a few files are already sitting in a directory when the
topology comes up. The init method of bolt x checks whether there are any
files in this directory, which is shared across the topology. All 4 threads
of bolt x will be able to see these files, but I want to ensure that each
file is handled by exactly one thread.
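
To make the requirement concrete, here is a rough sketch of one lock-free way
to split the directory listing, assuming each task knows its own index and the
total number of tasks for bolt x. The class and method names (FilePartitioner,
ownedBy, filesForTask) are hypothetical and only for illustration. In Storm,
I believe the two numbers can be read from the TopologyContext given to the
bolt (getThisTaskIndex() and getComponentTasks(getThisComponentId()).size()),
but that wiring is left out here and should be checked against the API docs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: decides which task handles which file, without locks.
class FilePartitioner {

    // True if the task with this index should process the given file.
    public static boolean ownedBy(String fileName, int taskIndex, int numTasks) {
        // floorMod keeps the bucket in [0, numTasks) even for negative hash codes.
        return Math.floorMod(fileName.hashCode(), numTasks) == taskIndex;
    }

    // Filters a directory listing down to the files owned by one task.
    public static List<String> filesForTask(List<String> fileNames, int taskIndex, int numTasks) {
        List<String> mine = new ArrayList<>();
        for (String name : fileNames) {
            if (ownedBy(name, taskIndex, numTasks)) {
                mine.add(name);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        // Made-up file names standing in for the directory contents.
        List<String> files = Arrays.asList("a.log", "b.log", "c.log", "d.log", "e.log");
        int numTasks = 4;
        for (int task = 0; task < numTasks; task++) {
            System.out.println("task " + task + " -> " + filesForTask(files, task, numTasks));
        }
    }
}
```

Every file name maps to exactly one task index, so no two threads pick up the
same file and no synchronised block is needed. The split is only as even as
the hash of the names, and it assumes all tasks see the same directory listing.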


I hope this explains the situation more clearly.



On Tue, Nov 15, 2016 at 2:45 PM, Jacob Johansen <[email protected]>
wrote:

> You need to partition on something. The bolt grouping should do
> your partitioning and eliminate the need for locking. Remove synchronised,
> as it will slow down processing.
>
> Jacob Johansen
>
> On Tue, Nov 15, 2016 at 3:59 PM, Milind Vaidya <[email protected]> wrote:
>
>> Hi
>>
>> I have a use case where a few files in a directory need to be
>> processed by a certain bolt x written in Java.
>>
>> I am setting the number of executors and tasks to the same value, which
>> is > 1. Say I have 4 executors and tasks.
>>
>> As I understand, these are essentially threads in the worker process. Now
>> I want to make sure that each executor / task processes a distinct
>> file. How can I ensure that?
>>
>> Should I put a synchronised block inside the execute method to make sure
>> processing is done in a thread-safe manner?
>>
>>
>> This actually needs to happen when a topology is launched. As the worker
>> starts, the corresponding bolts will scan a specific directory and process
>> the files that were previously generated but not processed.
>>
>> In the normal scenario, another bolt will pass the file name and path to
>> be processed to this bolt.
>>
>>
>>
>>
>>
>
