If you use fields grouping on "id", it only guarantees that a tuple with id=21 will always go to the same task (one of the instances of your bolt). It doesn't let you specify which task.

Are you trying to set up a topology where you have A->B and B has 20 tasks (and you want to control which task the tuple goes to), or are you setting up A->B and A->C and want to control whether the tuple goes to B or C? In the first case it may not matter, since every task runs the same code for bolt B (and the same tuple will always go to the same task). In the second case the tuple will go to BOTH B and C, so you can code B to simply drop any tuples with ids it doesn't want while C processes them (or vice versa).
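That said, if you really do need particular ids pinned to particular bolt instances, Storm lets you plug in a custom grouping instead of fields grouping. Here's a rough, untested sketch against the 0.9.x backtype.storm API; the class name and the id-to-task table are made up for illustration, and it assumes "id" is the first field the spout emits:

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    import backtype.storm.generated.GlobalStreamId;
    import backtype.storm.grouping.CustomStreamGrouping;
    import backtype.storm.task.WorkerTopologyContext;

    // Routes each tuple to one specific task based on the value of its "id" field.
    public class IdRoutingGrouping implements CustomStreamGrouping {

        private List<Integer> targetTasks;
        private final Map<Integer, Integer> route = new HashMap<Integer, Integer>();

        public void prepare(WorkerTopologyContext context, GlobalStreamId stream,
                            List<Integer> targetTasks) {
            this.targetTasks = targetTasks;  // the consuming bolt's task ids, in a stable order
            route.put(21, 0);                // id=21 -> first task
            route.put(31, 1);                // id=31 -> second task
        }

        public List<Integer> chooseTasks(int taskId, List<Object> values) {
            int id = ((Number) values.get(0)).intValue();  // assumes "id" is field 0
            Integer idx = route.get(id);
            if (idx == null) {
                idx = Math.abs(id) % targetTasks.size();   // unknown ids: spread them out
            }
            return Arrays.asList(targetTasks.get(idx));
        }
    }

You'd wire it up with something like:

    builder.setBolt("worker", new WorkerBolt(), 7)
           .customGrouping("spout", new IdRoutingGrouping());

With seven tasks of a single bolt class you get the "first task handles id=21, second handles id=31" behavior without needing seven different bolt classes.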
Steve

On Thu, Aug 14, 2014 at 10:13 PM, amjad khan <[email protected]> wrote:

> I have a situation where I have seven bolts and one spout, and I want to
> distribute the tuples according to the field ID.
> For example, if ID=21 I want the tuple to be processed by the first bolt,
> if ID=31 I want that tuple to be processed by the second bolt, and so on.
>
> So is there a way to implement this? I was thinking about using fields
> grouping, but there I can only specify the field name, not its value, so I
> don't think there would be any guarantee that, say, a tuple with ID=21 is
> processed by the first bolt.
> Kindly correct me if I'm wrong about fields grouping, and suggest a way to
> implement this kind of topology.
> Thanks in advance.
>
>
> On Fri, Aug 1, 2014 at 10:20 PM, amjad khan <[email protected]> wrote:
>
>> My bolt tries to write data to HDFS, but the whole of the data is not
>> written; it throws this exception:
>>
>> org.apache.hadoop.ipc.RemoteException:
>> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
>> /storm.txt File does not exist. Holder DFSClient_attempt_storm.txt does
>> not have any open files.
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1557)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1548)
>>
>> Kindly help me if anyone has any idea about this.
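[Aside on the LeaseExpiredException quoted above: HDFS allows only one writer per file, so when several bolt tasks all open /storm.txt, the NameNode keeps revoking leases and you see exactly this error. The simplest fix is one file per task. A rough, untested sketch against Storm's 0.9.x API and Hadoop's FileSystem; the path scheme and NameNode URL are placeholders:

    import java.io.IOException;
    import java.util.Map;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Tuple;

    public class PerTaskHdfsBolt extends BaseRichBolt {
        private transient FSDataOutputStream out;
        private OutputCollector collector;

        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            try {
                Configuration hadoopConf = new Configuration();
                hadoopConf.set("fs.defaultFS", "hdfs://localhost:9000"); // placeholder NameNode
                FileSystem fs = FileSystem.get(hadoopConf);
                // One output file per task: no two tasks ever hold a lease on the same path.
                out = fs.create(new Path("/storm-" + context.getThisTaskId() + ".txt"));
            } catch (IOException e) {
                throw new RuntimeException("could not open HDFS output file", e);
            }
        }

        public void execute(Tuple tuple) {
            try {
                out.writeBytes(tuple.getString(0) + "\n");
                collector.ack(tuple);
            } catch (IOException e) {
                collector.fail(tuple); // let Storm replay the tuple
            }
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // writes to HDFS only; emits nothing downstream
        }
    }
]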
>>
>> On Sat, Jul 26, 2014 at 12:47 PM, amjad khan <[email protected]> wrote:
>>
>>> Output when using a bolt that tries to write its data to HDFS:
>>>
>>> INFO org.apache.hadoop.ipc.Client - Retrying connect to server:
>>> localhost/131.0.0.1:43785. Already tried 6 time(s).
>>> WARN Caught URI Exception
>>> java.net.ConnectException: Call to localhost/131.0.0.1:43785 failed on
>>> connect exception: java.net.ConnectException: Connection refused
>>>
>>> In my code:
>>>
>>>     Configuration config = new Configuration();
>>>     config.set("fs.defaultFS", "hdfs://localhost:9000");
>>>     FileSystem fs = FileSystem.get(config);
>>>
>>> /etc/hosts contains:
>>>
>>>     181.45.83.79 localhost
>>>
>>> core-site.xml contains:
>>>
>>>     <name>fs.default.name</name>
>>>     <value>hdfs://localhost:9000</value>
>>>
>>> Kindly tell me why it is trying to connect to 131.0.0.1, and why on port
>>> 43785. The same code works fine in plain Java, outside Storm, and I'm
>>> using Hadoop 1.0.2.
>>>
>>>
>>> On Fri, Jul 18, 2014 at 11:33 AM, Parth Brahmbhatt <[email protected]> wrote:
>>>
>>>> Hi Amjad,
>>>>
>>>> Is there any reason you cannot upgrade to Hadoop 2.0? Hadoop 2.0 has
>>>> made many improvements over the 1.x versions, and the two are source
>>>> compatible, so your MR jobs will be unaffected as long as you recompile
>>>> them against 2.x.
>>>>
>>>> The code we pointed at assumes that all the Hadoop 2.x classes are
>>>> present on your classpath. If you are not using maven or some other
>>>> build system and would like to add the jars manually, you will probably
>>>> have a tough time resolving conflicts, so I would advise against it.
>>>> If you still want to add jars manually, my best guess would be to look
>>>> under <YOUR_HADOOP_INSTALLATION_DIR>/libexec/share/hadoop/
>>>>
>>>> Thanks
>>>> Parth
>>>>
>>>> On Jul 18, 2014, at 10:56 AM, amjad khan <[email protected]> wrote:
>>>>
>>>> Thanks for your reply, Taylor. I'm using Hadoop 1.0.2. Can you suggest
>>>> an alternative way to connect to Hadoop?
>>>>
>>>>
>>>> On Fri, Jul 18, 2014 at 8:45 AM, P. Taylor Goetz <[email protected]> wrote:
>>>>
>>>>> What version of Hadoop are you using? storm-hdfs requires Hadoop 2.x.
>>>>>
>>>>> - Taylor
>>>>>
>>>>> On Jul 18, 2014, at 6:07 AM, amjad khan <[email protected]> wrote:
>>>>>
>>>>> Thanks for your help, Parth.
>>>>>
>>>>> When I try to run the topology that writes data to HDFS, it throws
>>>>> ClassNotFoundException:
>>>>> org.apache.hadoop.hdfs.client.HdfsDataOutputStream$SyncFlag
>>>>> Can anyone tell me which jars are needed to run code that writes data
>>>>> to HDFS? Please list all the required jars.
>>>>>
>>>>>
>>>>> On Wed, Jul 16, 2014 at 10:46 AM, Parth Brahmbhatt <[email protected]> wrote:
>>>>>
>>>>>> You can use https://github.com/ptgoetz/storm-hdfs
>>>>>>
>>>>>> It supports writing to HDFS with both Storm bolts and Trident states.
>>>>>>
>>>>>> Thanks
>>>>>> Parth
>>>>>>
>>>>>> On Jul 16, 2014, at 10:41 AM, amjad khan <[email protected]> wrote:
>>>>>>
>>>>>> Can anyone provide the code for a bolt that writes its data to HDFS?
>>>>>> Kindly tell me the jars required to run that bolt.
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 14, 2014 at 2:33 PM, Max Evers <[email protected]> wrote:
>>>>>>
>>>>>>> Can you expand on your use case? What is the query selecting on? Is
>>>>>>> the column you are querying on indexed? Do you really need to look
>>>>>>> at the entire 20 GB every 20 ms?
>>>>>>>
>>>>>>> On Jul 14, 2014 6:39 AM, "amjad khan" <[email protected]> wrote:
>>>>>>>
>>>>>>>> I made a Storm topology where the spout fetches data from MySQL
>>>>>>>> using a select query fired every 30 msec, but because the table is
>>>>>>>> more than 20 GB the query takes more than 10 seconds to execute, so
>>>>>>>> this does not work. I need to know the possible alternatives for
>>>>>>>> this situation. Kindly reply as soon as possible.
>>>>>>>>
>>>>>>>> Thanks,
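P.S. For anyone landing on this thread for the HDFS-writing part: once you're on Hadoop 2.x, Parth's storm-hdfs pointer above is the way to go rather than hand-rolling FileSystem code in a bolt. Roughly following that project's README (untested here; the fs URL and output path are placeholders you'd change for your cluster):

    import org.apache.storm.hdfs.bolt.HdfsBolt;
    import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
    import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
    import org.apache.storm.hdfs.bolt.format.FileNameFormat;
    import org.apache.storm.hdfs.bolt.format.RecordFormat;
    import org.apache.storm.hdfs.bolt.rotation.FileRotationPolicy;
    import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy;
    import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy.Units;
    import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;
    import org.apache.storm.hdfs.bolt.sync.SyncPolicy;

    // In your topology setup: write "|"-delimited records, sync the
    // filesystem every 1000 tuples, and rotate files at 5 MB.
    RecordFormat format = new DelimitedRecordFormat().withFieldDelimiter("|");
    SyncPolicy syncPolicy = new CountSyncPolicy(1000);
    FileRotationPolicy rotationPolicy = new FileSizeRotationPolicy(5.0f, Units.MB);
    FileNameFormat fileNameFormat = new DefaultFileNameFormat().withPath("/storm-out/");

    HdfsBolt hdfsBolt = new HdfsBolt()
            .withFsUrl("hdfs://localhost:9000")
            .withFileNameFormat(fileNameFormat)
            .withRecordFormat(format)
            .withRotationPolicy(rotationPolicy)
            .withSyncPolicy(syncPolicy);

The bolt handles file naming, rotation, and syncing for you, which is exactly the bookkeeping that goes wrong when you manage output streams by hand.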
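P.P.S. On the original MySQL question at the bottom of the thread: re-scanning a 20+ GB table every 30 msec can't work. The usual pattern is to poll incrementally over an indexed, monotonically increasing column, remembering the highest key already emitted and fetching only the newer rows. A rough JDBC sketch of the idea (the "events" table, "id"/"payload" columns, and connection details are all made up):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class IncrementalPoller {
        private long lastSeenId = 0; // highest primary key already emitted

        // Fetch only rows inserted since the previous poll; "id" must be
        // indexed (e.g. an auto-increment primary key) for this to be cheap.
        public void poll(Connection conn) throws SQLException {
            PreparedStatement ps = conn.prepareStatement(
                    "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT 1000");
            ps.setLong(1, lastSeenId);
            ResultSet rs = ps.executeQuery();
            while (rs.next()) {
                lastSeenId = rs.getLong("id");
                emit(rs.getString("payload"));
            }
            rs.close();
            ps.close();
        }

        private void emit(String payload) {
            // in a real spout this would be collector.emit(new Values(payload))
            System.out.println(payload);
        }

        public static void main(String[] args) throws SQLException {
            Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/mydb", "user", "pass"); // placeholder
            new IncrementalPoller().poll(conn);
            conn.close();
        }
    }

Each poll then touches only the new rows instead of the whole table, so the query cost stays proportional to the arrival rate rather than the table size.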
