Yes, the defaults mentioned in the blog worked for me too.

On Wed, Aug 27, 2014 at 2:49 AM, Kushan Maskey <
[email protected]> wrote:

> These changes did help a lot. Now I don't see any failed acks for an almost
> 70k data load. Thanks a lot for your help.
>
> --
> Kushan Maskey
> 817.403.7500
>
>
> On Tue, Aug 26, 2014 at 9:45 AM, Kushan Maskey <
> [email protected]> wrote:
>
>> Also FYI, I do not have any failures in any of my bolts. So I am guessing
>> it has to do with the number of messages that the spout is trying to read
>> from Kafka.
>>
>> As per the document by Michael Noll, I am trying to see if setting up the
>> config as suggested helps:
>>
>> config.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE,          8);
>> config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE,         32);
>> config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
>> config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE,    16384);
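>>
>> (For reference, a minimal sketch of how settings like these are typically
>> applied when building and submitting a topology; the ZooKeeper address, topic,
>> topology name, and parallelism below are made-up placeholders, and the custom
>> Cassandra bolt registration is elided:)
>>
>> import backtype.storm.Config;
>> import backtype.storm.StormSubmitter;
>> import backtype.storm.spout.SchemeAsMultiScheme;
>> import backtype.storm.topology.TopologyBuilder;
>> import storm.kafka.KafkaSpout;
>> import storm.kafka.SpoutConfig;
>> import storm.kafka.StringScheme;
>> import storm.kafka.ZkHosts;
>>
>> public class LoaderTopology {
>>     public static void main(String[] args) throws Exception {
>>         // KafkaSpout from the storm-kafka module shipped alongside Storm 0.9.2
>>         SpoutConfig spoutConfig = new SpoutConfig(
>>                 new ZkHosts("zkhost:2181"), "my-topic", "/kafka-spout", "loader");
>>         spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
>>
>>         TopologyBuilder builder = new TopologyBuilder();
>>         builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig), 4);
>>         // register the custom Cassandra insert bolt here (elided)
>>
>>         Config config = new Config();
>>         // these sizes are counts of messages per buffer, not bytes; the two
>>         // executor buffer sizes back Disruptor ring buffers and are powers of 2
>>         config.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE, 8);
>>         config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 32);
>>         config.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384);
>>         config.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);
>>
>>         StormSubmitter.submitTopology("loader-topology", config, builder.createTopology());
>>     }
>> }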
>>
>>
>> Let me know if any of you have any suggestions. Thanks.
>>
>> --
>> Kushan Maskey
>>
>>
>>
>> On Tue, Aug 26, 2014 at 9:28 AM, Kushan Maskey <
>> [email protected]> wrote:
>>
>>> I started looking into setting up the internal message buffers as mentioned
>>> in this link.
>>>
>>> http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/#how-to-configure-storms-internal-message-buffers
>>>
>>> I found out that my message size could be as big as 10K. So does that
>>> mean that I should set the buffer size to about 10K?
>>>
>>> --
>>> Kushan Maskey
>>> 817.403.7500
>>>
>>>
>>> On Tue, Aug 26, 2014 at 7:45 AM, Kushan Maskey <
>>> [email protected]> wrote:
>>>
>>>> Thanks, Michael,
>>>>
>>>> How do you verify the reliability of the KafkaSpout? I am using the
>>>> KafkaSpout that came with Storm 0.9.2. AFAIK the KafkaSpout is quite reliable.
>>>> I am guessing it is the processing time for each record in the bolt. Yes, from
>>>> the log I do see a few Cassandra exceptions while inserting the records.
>>>>
>>>> --
>>>> Kushan Maskey
>>>> 817.403.7500
>>>>
>>>>
>>>> On Mon, Aug 25, 2014 at 9:39 PM, Michael Rose <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Kushan,
>>>>>
>>>>> Depending on the Kafka spout you're using, it could be doing different
>>>>> things when it fails. However, if it's running reliably, the Cassandra
>>>>> insertion failures would have forced a replay from the spout until the
>>>>> inserts completed.
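>>>>>
>>>>> (To illustrate that replay path, a rough sketch of a bolt that acks the input
>>>>> tuple only after a successful insert and fails it when the Cassandra write
>>>>> throws, so the spout will re-emit the corresponding message; the "record"
>>>>> field name and the insert helper are placeholders, not the actual code:)
>>>>>
>>>>> import java.util.Map;
>>>>>
>>>>> import backtype.storm.task.OutputCollector;
>>>>> import backtype.storm.task.TopologyContext;
>>>>> import backtype.storm.topology.OutputFieldsDeclarer;
>>>>> import backtype.storm.topology.base.BaseRichBolt;
>>>>> import backtype.storm.tuple.Fields;
>>>>> import backtype.storm.tuple.Tuple;
>>>>> import backtype.storm.tuple.Values;
>>>>>
>>>>> public class CassandraInsertBolt extends BaseRichBolt {
>>>>>     private OutputCollector collector;
>>>>>
>>>>>     @Override
>>>>>     public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
>>>>>         this.collector = collector;
>>>>>         // open the Cassandra session here (omitted)
>>>>>     }
>>>>>
>>>>>     @Override
>>>>>     public void execute(Tuple input) {
>>>>>         String record = input.getStringByField("record");
>>>>>         try {
>>>>>             insertIntoCassandra(record);
>>>>>             // anchor the emitted tuple to the input so downstream failures propagate
>>>>>             collector.emit(input, new Values(record));
>>>>>             collector.ack(input);
>>>>>         } catch (Exception e) {
>>>>>             // failing the tuple tells the spout to replay it instead of
>>>>>             // silently dropping the record
>>>>>             collector.fail(input);
>>>>>         }
>>>>>     }
>>>>>
>>>>>     private void insertIntoCassandra(String record) {
>>>>>         // placeholder for the actual Cassandra write
>>>>>     }
>>>>>
>>>>>     @Override
>>>>>     public void declareOutputFields(OutputFieldsDeclarer declarer) {
>>>>>         declarer.declare(new Fields("record"));
>>>>>     }
>>>>> }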
>>>>>
>>>>> Michael Rose (@Xorlev <https://twitter.com/xorlev>)
>>>>> Senior Platform Engineer, FullContact <http://www.fullcontact.com/>
>>>>> [email protected]
>>>>>
>>>>>
>>>>> On Mon, Aug 25, 2014 at 4:42 PM, Kushan Maskey <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> I have set up a topology to load a very large volume of data. Recently
>>>>>> I loaded about 60K records and found out that there are some failed
>>>>>> acks on a few spouts but none on the bolts. Storm completed running and
>>>>>> seems stable. Initially I started successfully with a smaller amount of
>>>>>> data, about 500 records, and then increased up to 60K, where I saw the
>>>>>> failed acks.
>>>>>>
>>>>>> Questions:
>>>>>> 1. Does that mean that the spout was not able to read some messages
>>>>>> from Kafka? Since there are no failed acks on the bolts as per the UI,
>>>>>> whatever messages were received have been successfully processed by the
>>>>>> bolts.
>>>>>> 2. How do I interpret the numbers of failed acks, like acked: 315500
>>>>>> and failed: 2980? Does this mean that 2980 records failed to be processed?
>>>>>> If that is the case, how do I avoid this from happening, because I will be
>>>>>> losing 2980 records.
>>>>>> 3. I also see that a few of the records failed to be inserted into the
>>>>>> Cassandra database. What is the best way to reprocess that data, as it
>>>>>> is quite difficult to do it through the batch process that I am currently
>>>>>> running.
>>>>>>
>>>>>> LMK, thanks.
>>>>>>
>>>>>> --
>>>>>> Kushan Maskey
>>>>>> 817.403.7500
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
