Sam,

Bolts take the data emitted by your spout (or by another bolt) and do whatever you need with it (persist it to a database, aggregate it, etc.).
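
As a rough sketch of that idea in plain Python (not tied to Storm): a bolt behaves like a function that is called once per incoming tuple and may emit zero or more new tuples. The names `split_sentence` and `run_bolt` are made up for illustration. In a real multilang bolt the same logic would sit in the bolt's per-tuple method (I believe that is `process` on a `storm.BasicBolt` subclass in the `storm` module you are already using) and would call `storm.emit` instead of returning a list.

```python
def split_sentence(values):
    """Bolt logic: take one input tuple (a list holding one line of text)
    and return the list of output tuples, one per word."""
    line = values[0]
    # str.split() with no argument splits on any whitespace and drops
    # empty strings, so the trailing newline from the file is ignored.
    return [[word] for word in line.split()]


def run_bolt(bolt_logic, incoming):
    """Hypothetical stand-in for the Storm runtime: feed each incoming
    tuple to the bolt logic and collect everything it emits."""
    emitted = []
    for values in incoming:
        emitted.extend(bolt_logic(values))
    return emitted


# Tuples as a sentence spout might emit them, one line per tuple.
spout_output = [["the cow jumped\n"], ["over the moon\n"]]
words = run_bolt(split_sentence, spout_output)
print(words)
```

Keeping the per-tuple logic in a plain function like this also makes it easy to test without a running cluster.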
In your case you have a spout that emits sentences, so you need a bolt that splits each sentence into words and emits each word as a tuple. Then add a second bolt that receives each word as a tuple from the split bolt and does your processing. In the topology, use shuffleGrouping for the split-sentence bolt and fieldsGrouping for the word-processing bolt.

I highly recommend the Petrel library for Python:
https://github.com/AirSage/Petrel

Take a look at this sample, which is very similar to your task:
https://github.com/AirSage/Petrel/tree/master/samples/wordcount

The topology is defined here:
https://github.com/AirSage/Petrel/blob/master/samples/wordcount/create.py

If you want to use Python with Storm (via Petrel), read this book:
https://www.packtpub.com/big-data-and-business-intelligence/building-python-real-time-applications-storm

Thanks,
Dmitry

On Tue, Apr 4, 2017 at 1:49 AM, sam mohel <[email protected]> wrote:

> I need some help with this problem. I read that the spout is
> responsible for reading data or preparing it for processing in a bolt, so I
> wrote some code in the spout to open the file and read it line by line:
>
> class SimSpout(storm.Spout):
>     # Not much to do here for such a basic spout
>     def initialize(self, conf, context):
>         ## Open the file with read only permit
>         self.f = open('data.txt', 'r')
>         ## Read the first line
>         self._conf = conf
>         self._context = context
>         storm.logInfo("Spout instance starting...")
>     # Process the next tuple
>     def nextTuple(self):
>         # check if it reach at the EOF to close it
>         for line in self.f.readlines():
>             # Emit a random sentence
>             storm.logInfo("Emiting %s" % line)
>             storm.emit([line])
>
> # Start the spout when it's invoked
> SimSpout().run()
>
> Is that right?
> The actual problem for me now: how can I make a bolt take each line from
> the spout and process it? The processing has to read from another file
> and do some calculations to compute the vector of each word.

--
Dmitry Semenov
[email protected] | 949.200.6839 | www.saritasa.com
20411 Birch St., Suite 330, Newport Beach, CA 92660
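
P.S. A small plain-Python sketch of why the two groupings are used for different bolts. It only imitates the routing idea and is not Storm's actual implementation; `shuffle_task`, `fields_task` and the task count are invented for the example. Shuffle grouping spreads tuples over the receiving tasks without looking at their content (simplified here to a random pick), while fields grouping sends tuples with the same field value to the same task, which is what a per-word computation that keeps state needs.

```python
import random
import zlib

NUM_TASKS = 3  # hypothetical parallelism of the receiving bolt


def shuffle_task(values, rng):
    """Shuffle grouping (simplified): pick a task at random,
    ignoring the tuple content."""
    return rng.randrange(NUM_TASKS)


def fields_task(values, field_index=0):
    """Fields grouping (simplified): hash the chosen field so equal values
    always map to the same task. crc32 is an assumption made here because
    it is stable across runs; Storm's real hashing differs."""
    key = str(values[field_index]).encode("utf-8")
    return zlib.crc32(key) % NUM_TASKS


words = [["the"], ["cow"], ["the"], ["moon"], ["the"]]
rng = random.Random(0)

for values in words:
    print(values[0],
          "shuffle ->", shuffle_task(values, rng),
          "fields ->", fields_task(values))

# Every occurrence of "the" lands on a single task under fields grouping.
the_tasks = {fields_task(v) for v in words if v[0] == "the"}
```

With shuffle routing the three "the" tuples may end up on different tasks, which is fine for stateless splitting but not for anything that accumulates per-word results.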
