Yes, the defaults mentioned in the blog worked for me too.
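For reference, the settings discussed below drop into a topology config like this. A minimal sketch, assuming Storm 0.9.x's backtype.storm packages; note that these buffer sizes are tuple counts (ring-buffer slots, which must be powers of two), not bytes, so a 10K-byte message still occupies a single slot:

    import backtype.storm.Config;

    Config config = new Config();
    // All four values are slot counts (powers of two), not byte sizes.
    config.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 8);
    config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 32);
    config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
    config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);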
On Wed, Aug 27, 2014 at 2:49 AM, Kushan Maskey <[email protected]> wrote:

> These changes did help a lot. Now I don't see any failed acks for an
> almost 70k record load. Thanks a lot for your help.
>
> --
> Kushan Maskey
> 817.403.7500
>
>
> On Tue, Aug 26, 2014 at 9:45 AM, Kushan Maskey <[email protected]> wrote:
>
>> Also FYI, I do not have any failures in any of my bolts. So I am
>> guessing it has to do with the volume of messages that the spout is
>> trying to read from Kafka.
>>
>> As per the document by Michael Noll, I am trying to see if setting up
>> the config as suggested helps:
>>
>> config.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 8);
>> config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 32);
>> config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
>> config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);
>>
>> Let me know if any of you have any suggestions. Thanks.
>>
>> --
>> Kushan Maskey
>>
>>
>> On Tue, Aug 26, 2014 at 9:28 AM, Kushan Maskey <[email protected]> wrote:
>>
>>> I started looking into tuning the internal message buffers as
>>> mentioned in this link:
>>>
>>> http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/#how-to-configure-storms-internal-message-buffers
>>>
>>> I found out that my message size could be as big as 10K. So does that
>>> mean I should set the buffer size to about 10K?
>>>
>>> --
>>> Kushan Maskey
>>> 817.403.7500
>>>
>>>
>>> On Tue, Aug 26, 2014 at 7:45 AM, Kushan Maskey <[email protected]> wrote:
>>>
>>>> Thanks, Michael.
>>>>
>>>> How do you verify the reliability of the KafkaSpout? I am using the
>>>> KafkaSpout that came with Storm 0.9.2; AFAIK KafkaSpout is quite
>>>> reliable. I am guessing it is the processing time for each record in
>>>> the bolt. Yes, from the log I do see a few Cassandra exceptions while
>>>> inserting the records.
>>>>
>>>> --
>>>> Kushan Maskey
>>>> 817.403.7500
>>>>
>>>>
>>>> On Mon, Aug 25, 2014 at 9:39 PM, Michael Rose <[email protected]> wrote:
>>>>
>>>>> Hi Kushan,
>>>>>
>>>>> Depending on the Kafka spout you're using, it could be doing
>>>>> different things on failure. However, if it's running reliably, the
>>>>> Cassandra insertion failures would have forced a replay from the
>>>>> spout until they had completed.
>>>>>
>>>>> Michael Rose (@Xorlev <https://twitter.com/xorlev>)
>>>>> Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
>>>>> [email protected]
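A minimal sketch of the ack/fail contract Michael describes, for a terminal bolt writing to Cassandra; the insert helper is a hypothetical stand-in for the actual client call. Failing the tuple is what lets a reliable spout such as KafkaSpout replay it:

    import java.util.Map;

    import backtype.storm.task.OutputCollector;
    import backtype.storm.task.TopologyContext;
    import backtype.storm.topology.OutputFieldsDeclarer;
    import backtype.storm.topology.base.BaseRichBolt;
    import backtype.storm.tuple.Tuple;

    public class CassandraWriterBolt extends BaseRichBolt {
        private OutputCollector collector;

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void execute(Tuple tuple) {
            try {
                insertIntoCassandra(tuple.getString(0)); // hypothetical insert helper
                collector.ack(tuple);                    // marks the tuple tree complete
            } catch (Exception e) {
                // fail() tells a reliable spout to replay this message,
                // so a transient Cassandra error does not drop the record
                collector.fail(tuple);
            }
        }

        private void insertIntoCassandra(String record) {
            // placeholder for the actual Cassandra client call
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // terminal bolt: no output streams
        }
    }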
>>>>>
>>>>> On Mon, Aug 25, 2014 at 4:42 PM, Kushan Maskey <[email protected]> wrote:
>>>>>
>>>>>> I have set up a topology to load a very large volume of data.
>>>>>> Recently I loaded about 60K records and found out that there are
>>>>>> some failed acks on a few spouts but none on the bolts. Storm
>>>>>> completed running and seems stable. Initially I started with a
>>>>>> smaller amount of data, about 500 records, successfully, and then
>>>>>> increased it up to 60K, where I saw the failed acks.
>>>>>>
>>>>>> Questions:
>>>>>> 1. Does that mean that the spout was not able to read some messages
>>>>>> from Kafka? Since there are no failed acks on the bolts as per the
>>>>>> UI, whatever messages were received have been successfully
>>>>>> processed by the bolts.
>>>>>> 2. How do I interpret failed-ack numbers like acked: 315500 and
>>>>>> failed: 2980? Does this mean that 2980 records failed to be
>>>>>> processed? If this is the case, how do I prevent it, because I will
>>>>>> be losing 2980 records.
>>>>>> 3. I also see that a few of the records failed to be inserted into
>>>>>> the Cassandra database. What is the best way to reprocess that
>>>>>> data, as it is quite difficult to do through the batch process I am
>>>>>> currently running?
>>>>>>
>>>>>> LMK, thanks.
>>>>>>
>>>>>> --
>>>>>> Kushan Maskey
>>>>>> 817.403.7500
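On questions 2 and 3: spout-side failures with no bolt failures typically mean tuple trees timed out under load rather than records being dropped, and a reliable KafkaSpout replays failed offsets, which is also why the Cassandra insertion failures get retried without a separate batch reprocess. Two knobs that usually reduce such timeout-driven failures, shown here with illustrative values:

    import backtype.storm.Config;

    Config config = new Config();
    // Allow more time per tuple tree before Storm marks it failed
    // (the default is 30 seconds).
    config.setMessageTimeoutSecs(60);
    // Cap un-acked tuples in flight per spout task so the bolts
    // (and Cassandra) are not flooded faster than they can process.
    config.setMaxSpoutPending(1000);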
