Great to hear, Ishan.
We faced a similar error here.
We will test this with the fix that you propose.
Best wishes
On 01/29/2013 12:43 PM, ishan chhabra wrote:
Hi Tsuna,
As Shrijeet mentioned, we (@Rocketfuel) were experiencing this bug internally when doing cluster restarts. After some trial and error, I was able to create a set of steps that reproduce this bug in a controlled fashion on our test cluster. Using heap dumps and added debug messages, I tracked down what looks like the cause and a fix: https://github.com/OpenTSDB/asynchbase/pull/48. I have tested this repeatedly on the test cluster and things are looking fine. Please have a look and see whether this makes sense and whether the fix is correct.

Cheers,
Ishan

On Friday, 25 January 2013 22:53:17 UTC-8, tsuna wrote:

    On Fri, Jan 25, 2013 at 5:28 PM, Tianying Chang <[email protected]> wrote:
    > Thanks for the information! We have seen this a couple of times
    > recently. Last week, it lasted a very long time (40+ minutes before we
    > restarted). I will follow up on that discussion thread. Thanks a lot!!

    This is bug number 1. I haven't been able to track it down, as I've
    never been able to reproduce it in a controlled fashion :(
    https://github.com/OpenTSDB/asynchbase/issues/1

    I also spent hours manually walking references of heap dumps
    and checking state to see if anything was wrong but I haven't
    found anything, not even a clue.

-- Benoit "tsuna" Sigoure


--
Marcos Ortiz Valmaseda,
Product Manager && Data Scientist at DATEC
Blog: http://marcosluis2186.posterous.com
Twitter: @marcosluis2186 <http://twitter.com/marcosluis2186>
