If you use fields grouping on "id", it just ensures that a tuple with id=21
will always go to the same task (one of the instances of your bolt). It
doesn't let you specify which task.
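
For reference, here is a minimal sketch of how a fields grouping is wired
up (0.9.x-era package names; MySpout and BoltB are stand-ins for your own
components):

import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Fields;

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new MySpout());
// 20 tasks; tuples with equal "id" values always hash to the same task
builder.setBolt("bolt-b", new BoltB(), 20)
        .fieldsGrouping("spout", new Fields("id"));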

Are you trying to set up a topology where you have A->B and B has 20 tasks
(and you want to control which task the tuple goes to), or are you setting
up A->B and A->C and you want to choose whether the tuple goes to B or C?

In the first case, perhaps it doesn't matter, since all tasks will be
running the code for bolt B (and the same tuple will always go to the same
task). In the second case, the tuple will go to BOTH B and C, so you can
always code B to simply drop any tuples with ids it doesn't want, while C
will process them (or vice versa).
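
That "drop what you don't want" bolt can be as simple as this sketch (the
field name and id value are just placeholders):

import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Tuple;

public class BoltB extends BaseBasicBolt {
    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        long id = tuple.getLongByField("id");
        if (id != 21) {
            return; // not ours; another bolt handles these ids
        }
        // ... process tuples with id=21 here ...
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // this sketch emits nothing downstream
    }
}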

Steve


On Thu, Aug 14, 2014 at 10:13 PM, amjad khan <[email protected]>
wrote:

> I have a situation where I have seven bolts and one spout, and I want to
> distribute the tuples according to the field ID.
> For example, if ID=21 I want the tuple to be processed by the first bolt,
>              if ID=31 I want the tuple to be processed by the second bolt,
> and so on.
>
> Is there a way to implement this? I was thinking about using fields
> grouping, but with that I can only specify the field name, not the value of
> that field. So if I use fields grouping I don't think there is any
> guarantee that, say, for ID=21 the tuple would be processed by the first bolt.
> Kindly correct me if I'm wrong about fields grouping, and suggest a way to
> implement this kind of topology.
> Thanks in advance.
>
>
> On Fri, Aug 1, 2014 at 10:20 PM, amjad khan <[email protected]>
> wrote:
>
>> My bolt tries to write data to HDFS, but not all of the data is written;
>> it throws this exception:
>>
>> org.apache.hadoop.ipc.RemoteException: 
>> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
>> /storm.txt File does not exist. Holder DFSClient_attempt_storm.txt does not 
>> have any open files.
>> at 
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1557)
>> at 
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1548)
>>
>> Kindly help me if anyone has any idea about this.
>>
>>
>>
>> On Sat, Jul 26, 2014 at 12:47 PM, amjad khan <[email protected]>
>> wrote:
>>
>>> Output when using a bolt that tries to write its data to HDFS:
>>>
>>> INFO org.apache.hadoop.ipc.Client - Retrying connect to server:
>>> localhost/131.0.0.1:43785. Already tried 6 time(s).
>>> WARN Caught URI Exception
>>> java.net.ConnectException: Call to localhost/131.0.0.1:43785 failed on
>>> connect exception: java.net.ConnectException: Connection refused
>>>
>>> In my code:
>>>
>>> import org.apache.hadoop.conf.Configuration;
>>> import org.apache.hadoop.fs.FileSystem;
>>>
>>> Configuration config = new Configuration();
>>> config.set("fs.defaultFS", "hdfs://localhost:9000");
>>> FileSystem fs = FileSystem.get(config);
>>>
>>>
>>> /etc/hosts contains:
>>> 181.45.83.79 localhost
>>>
>>> core-site.xml contains:
>>>
>>> <name>fs.default.name</name>
>>> <value>hdfs://localhost:9000</value>
>>>
>>>
>>> Kindly tell me why it is trying to connect to 131.0.0.1, and why on port
>>> 43785.
>>>
>>>
>>> The same code works fine in plain Java, outside of Storm, and I'm using
>>> Hadoop 1.0.2.
>>>
>>>
>>> On Fri, Jul 18, 2014 at 11:33 AM, Parth Brahmbhatt <
>>> [email protected]> wrote:
>>>
>>>> Hi Amjad,
>>>>
>>>> Is there any reason you can not upgrade to Hadoop 2.x? Hadoop 2.x has
>>>> made many improvements over the 1.x versions, and they are source
>>>> compatible, so your MR jobs will be unaffected as long as you recompile
>>>> against 2.x.
>>>>
>>>> The code we pointed at assumes that all the classes for Hadoop 2.x are
>>>> present in your classpath. If you are not using Maven or some other build
>>>> system and would like to add jars manually, you will probably have a tough
>>>> time resolving conflicts, so I would advise against it.
>>>> If you still want to add jars manually, my best guess would be to look
>>>> under
>>>> <YOUR_HADOOP_INSTALLATION_DIR>/libexec/share/hadoop/
>>>>
>>>> Thanks
>>>> Parth
>>>> On Jul 18, 2014, at 10:56 AM, amjad khan <[email protected]>
>>>> wrote:
>>>>
>>>> Thanks for your reply, Taylor. I'm using Hadoop 1.0.2. Can you suggest
>>>> an alternative way to connect to Hadoop?
>>>>
>>>>
>>>>
>>>> On Fri, Jul 18, 2014 at 8:45 AM, P. Taylor Goetz <[email protected]>
>>>> wrote:
>>>>
>>>>> What version of Hadoop are you using? Storm-hdfs requires Hadoop 2.x.
>>>>>
>>>>> - Taylor
>>>>>
>>>>> On Jul 18, 2014, at 6:07 AM, amjad khan <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Thanks for your help, Parth.
>>>>>
>>>>> When I try to run the topology that writes data to HDFS, it throws a
>>>>> class-not-found exception:
>>>>> org.apache.hadoop.client.hdfs.HDFSDataOutputStream$SyncFlags
>>>>> Can anyone tell me which jars are needed to execute code that writes
>>>>> data to HDFS? Please list all the required jars.
>>>>>
>>>>>
>>>>> On Wed, Jul 16, 2014 at 10:46 AM, Parth Brahmbhatt <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> You can use
>>>>>>
>>>>>> https://github.com/ptgoetz/storm-hdfs
>>>>>>
>>>>>> It supports writing to HDFS with both Storm bolts and Trident states.
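>>>>>>
>>>>>> For reference, wiring up its HdfsBolt looks roughly like this, per the
>>>>>> project README (the fs URL, path, delimiter, and sizes below are
>>>>>> placeholders, and package names may differ between versions):
>>>>>>
>>>>>> import org.apache.storm.hdfs.bolt.HdfsBolt;
>>>>>> import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
>>>>>> import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
>>>>>> import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy;
>>>>>> import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy.Units;
>>>>>> import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;
>>>>>>
>>>>>> HdfsBolt bolt = new HdfsBolt()
>>>>>>         .withFsUrl("hdfs://localhost:9000")
>>>>>>         .withFileNameFormat(new DefaultFileNameFormat().withPath("/storm/"))
>>>>>>         .withRecordFormat(new DelimitedRecordFormat().withFieldDelimiter("|"))
>>>>>>         .withRotationPolicy(new FileSizeRotationPolicy(5.0f, Units.MB))
>>>>>>         .withSyncPolicy(new CountSyncPolicy(1000)); // sync every 1000 tuples
>>>>>>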
>>>>>> Thanks
>>>>>> Parth
>>>>>>
>>>>>> On Jul 16, 2014, at 10:41 AM, amjad khan <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> Can anyone provide code for a bolt that writes its data to HDFS?
>>>>>> Kindly tell me the jars required to run that bolt.
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 14, 2014 at 2:33 PM, Max Evers <[email protected]> wrote:
>>>>>>
>>>>>>> Can you expand on your use case? What is the query selecting on? Is
>>>>>>> the column you are querying on indexed? Do you really need to scan the
>>>>>>> entire 20 GB every 20 ms?
>>>>>>>  On Jul 14, 2014 6:39 AM, "amjad khan" <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I made a Storm topology where the spout was fetching data from MySQL
>>>>>>>> using a select query. The select query was fired every 30 msec, but
>>>>>>>> because the table is more than 20 GB in size the query takes more
>>>>>>>> than 10 sec to execute, so this is not working. I need to know what
>>>>>>>> the possible alternatives are for this situation. Kindly reply as
>>>>>>>> soon as possible.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>
