If A emits to both B and C and you want to route between them, you can emit on
different stream IDs and have B and C subscribe to the appropriate stream IDs.
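
Roughly, as a sketch (the stream names, field names, and the SpoutA/BoltB/BoltC
classes below are placeholders for whatever your topology actually uses):

import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

// In component A, declare one stream per destination...
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
    declarer.declareStream("to-b", new Fields("id", "value"));
    declarer.declareStream("to-c", new Fields("id", "value"));
}

// ...and emit each tuple on the stream you want it routed to,
// e.g. inside nextTuple()/execute():
if (id == 21) {
    collector.emit("to-b", new Values(id, value));
} else {
    collector.emit("to-c", new Values(id, value));
}

// When wiring the topology, subscribe B and C only to their own streams.
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("a", new SpoutA());
builder.setBolt("b", new BoltB()).shuffleGrouping("a", "to-b");
builder.setBolt("c", new BoltC()).shuffleGrouping("a", "to-c");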


On Fri, Aug 15, 2014 at 12:44 PM, Stephen Armstrong <
[email protected]> wrote:

> If you use fields grouping on "id", it just ensures that a tuple with id=21
> will always go to the same task (one of the instances of your bolt). It
> doesn't let you specify which task.
>
> Are you trying to set up a topology where you have A->B and B has 20 tasks
> (and you want to control which task the tuple goes to), or are you setting
> up A->B and A->C and want to decide whether the tuple goes to B or C?
>
> In the first case, perhaps it doesn't matter, since all tasks will be
> running the code for bolt B (and tuples with the same id will always go to
> the same task). In the second case, the tuple will go to BOTH B and C, so
> you can always code B to simply drop any tuples with ids it doesn't want,
> while C will process them (or vice versa).
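>
> For reference, fields grouping on "id" is declared roughly like this when
> building the topology (the component names, classes, and parallelism below
> are placeholders, not your actual setup):
>
> TopologyBuilder builder = new TopologyBuilder();
> builder.setSpout("spout-a", new SpoutA());
> // tuples with the same "id" always go to the same task of "bolt-b", but
> // Storm picks which task by hashing the field; you cannot choose the task
> builder.setBolt("bolt-b", new BoltB(), 20)
>        .fieldsGrouping("spout-a", new Fields("id"));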
>
> Steve
>
>
> On Thu, Aug 14, 2014 at 10:13 PM, amjad khan <[email protected]>
> wrote:
>
>> I have a situation where I have seven bolts and one spout, and I want to
>> distribute the tuples according to the field ID.
>> For example, if ID=21 I want the tuple to be processed by the first bolt,
>> and if ID=31 I want that tuple to be processed by the second bolt, and so on.
>>
>> Is there a way to implement this? I was thinking about using fields
>> grouping, but with that I can only define the field name, not the value of
>> that field, so if I use fields grouping I don't think there is any guarantee
>> that a tuple with ID=21 would be processed by the first bolt.
>> Kindly correct me if I'm wrong about fields grouping, and please suggest a
>> way to implement this kind of topology.
>> Thanks in advance.
>>
>>
>> On Fri, Aug 1, 2014 at 10:20 PM, amjad khan <[email protected]>
>> wrote:
>>
>>> My bolt tries to write data to HDFS, but not all of the data gets written;
>>> it throws this exception:
>>>
>>> org.apache.hadoop.ipc.RemoteException:
>>> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
>>> /storm.txt File does not exist. Holder DFSClient_attempt_storm.txt does not
>>> have any open files.
>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1557)
>>> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1548)
>>>
>>> Kindly help me if anyone has any idea about this.
>>>
>>>
>>>
>>> On Sat, Jul 26, 2014 at 12:47 PM, amjad khan <[email protected]>
>>> wrote:
>>>
>>>> Output when using a bolt that tries to write its data to HDFS:
>>>>
>>>> INFO org.apache.hadoop.ipc.Client - Retrying Connect to Server:
>>>> localhost/131.0.0.1:43785 Already tried 6 time(s).
>>>> WARN Caught URI Exception
>>>> java.net.ConnectException Call to localhost/131.0.0.1:43785 Failed on
>>>> Connect Exception: java.net.ConnectException: Connection Refused
>>>>
>>>> In my code:
>>>> Configuration config = new Configuration();
>>>> config.set("fs.defaultFS","hdfs://localhost:9000");
>>>> FileSystem fs = FileSystem.get(config);
>>>>
>>>>
>>>> /etc/hosts contains:
>>>> 181.45.83.79 localhost
>>>>
>>>> core-site.xml contains:
>>>>
>>>> <name>fs.default.name</name>
>>>> <value>hdfs://localhost:9000</value>
>>>>
>>>>
>>>> Kindly tell me why it is trying to connect to 131.0.0.1, and why on port
>>>> 43785.
>>>>
>>>>
>>>> The same code works fine in plain Java, outside of Storm, and I'm using
>>>> Hadoop 1.0.2.
>>>>
>>>>
>>>> On Fri, Jul 18, 2014 at 11:33 AM, Parth Brahmbhatt <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi Amjad,
>>>>>
>>>>> Is there any reason you cannot upgrade to Hadoop 2.0? Hadoop 2.0 has made
>>>>> many improvements over the 1.x versions, and the two are source compatible,
>>>>> so your MR jobs will be unaffected as long as you recompile against 2.x.
>>>>>
>>>>> The code we pointed at assumes that all the classes for Hadoop 2.x are
>>>>> present in your classpath. If you are not using Maven or some other build
>>>>> system and would like to add jars manually, you will probably have a tough
>>>>> time resolving conflicts, so I would advise against it.
>>>>> If you still want to add jars manually, my best guess would be to look
>>>>> under
>>>>> <YOUR_HADOOP_INSTALLATION_DIR>/libexec/share/hadoop/
>>>>>
>>>>> Thanks
>>>>> Parth
>>>>> On Jul 18, 2014, at 10:56 AM, amjad khan <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Thanks for your reply, Taylor. I'm using Hadoop 1.0.2. Can you suggest
>>>>> any alternative way to connect to Hadoop?
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 18, 2014 at 8:45 AM, P. Taylor Goetz <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> What version of Hadoop are you using? Storm-hdfs requires Hadoop 2.x.
>>>>>>
>>>>>> - Taylor
>>>>>>
>>>>>> On Jul 18, 2014, at 6:07 AM, amjad khan <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> Thanks for your help, Parth.
>>>>>>
>>>>>> When I try to run the topology that writes data to HDFS, it throws an
>>>>>> exception: Class Not Found:
>>>>>> org.apache.hadoop.client.hdfs.HDFSDataOutputStream$SyncFlags
>>>>>> Can anyone tell me which jars are needed to run code that writes data to
>>>>>> HDFS? Please tell me all the required jars.
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 16, 2014 at 10:46 AM, Parth Brahmbhatt <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> You can use
>>>>>>>
>>>>>>> https://github.com/ptgoetz/storm-hdfs
>>>>>>>
>>>>>>> It supports writing to HDFS with both Storm bolts and Trident
>>>>>>> states.
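>>>>>>>
>>>>>>> A minimal sketch of wiring up the HdfsBolt (the path, delimiter, sync and
>>>>>>> rotation settings below are just example values, and the class names should
>>>>>>> be checked against the README of the storm-hdfs version you pull in):
>>>>>>>
>>>>>>> import org.apache.storm.hdfs.bolt.HdfsBolt;
>>>>>>> import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
>>>>>>> import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
>>>>>>> import org.apache.storm.hdfs.bolt.format.FileNameFormat;
>>>>>>> import org.apache.storm.hdfs.bolt.format.RecordFormat;
>>>>>>> import org.apache.storm.hdfs.bolt.rotation.FileRotationPolicy;
>>>>>>> import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy;
>>>>>>> import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy.Units;
>>>>>>> import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;
>>>>>>> import org.apache.storm.hdfs.bolt.sync.SyncPolicy;
>>>>>>>
>>>>>>> // write each tuple as a pipe-delimited line
>>>>>>> RecordFormat format = new DelimitedRecordFormat().withFieldDelimiter("|");
>>>>>>> // sync to HDFS after every 1000 tuples
>>>>>>> SyncPolicy syncPolicy = new CountSyncPolicy(1000);
>>>>>>> // rotate output files once they reach 5 MB
>>>>>>> FileRotationPolicy rotationPolicy = new FileSizeRotationPolicy(5.0f, Units.MB);
>>>>>>> // directory in HDFS where the files land
>>>>>>> FileNameFormat fileNameFormat = new DefaultFileNameFormat().withPath("/storm/");
>>>>>>>
>>>>>>> HdfsBolt hdfsBolt = new HdfsBolt()
>>>>>>>         .withFsUrl("hdfs://localhost:9000")
>>>>>>>         .withFileNameFormat(fileNameFormat)
>>>>>>>         .withRecordFormat(format)
>>>>>>>         .withRotationPolicy(rotationPolicy)
>>>>>>>         .withSyncPolicy(syncPolicy);
>>>>>>>
>>>>>>> You then attach hdfsBolt to your topology with builder.setBolt(...) like
>>>>>>> any other bolt.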
>>>>>>> Thanks
>>>>>>> Parth
>>>>>>>
>>>>>>> On Jul 16, 2014, at 10:41 AM, amjad khan <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Can anyone provide the code for a bolt that writes its data to HDFS?
>>>>>>> Kindly tell me the jars required to run that bolt.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 14, 2014 at 2:33 PM, Max Evers <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Can you expand on your use case? What is the query selecting on? Is
>>>>>>>> the column you are querying on indexed? Do you really need to look at
>>>>>>>> the entire 20 GB every 20 ms?
>>>>>>>>  On Jul 14, 2014 6:39 AM, "amjad khan" <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I made a Storm topology where the spout fetches data from MySQL using
>>>>>>>>> a select query. The select query is fired every 30 msec, but because
>>>>>>>>> the table is more than 20 GB in size the query takes more than 10
>>>>>>>>> seconds to execute, so this is not working. I need to know the possible
>>>>>>>>> alternatives for this situation. Kindly reply as soon as possible.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
