Please share a screenshot of the Storm UI so that we can look at the topology stats, and also a screenshot of the topology visualization so that we can see the flow.
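
In case it helps to put a number on the minor GC pauses mentioned in the thread below, here is a minimal Python sketch that averages the young-GC pause between two `jstat -gcutil <pid>` samples, using the `YGC` (young GC count) and `YGCT` (cumulative young GC time, in seconds) columns. The function name `avg_minor_gc_pause` and the sample numbers are made up for illustration; they are not taken from the cluster discussed here.

```python
def avg_minor_gc_pause(before: str, after: str) -> float:
    """Average young-GC pause in seconds between two `jstat -gcutil` samples.

    Each sample is the raw two-line jstat output: a header line followed by
    one line of values. Columns are looked up by name, so the exact column
    set (which differs between JVM versions) does not matter as long as
    YGC and YGCT are present.
    """
    def parse(sample: str) -> dict:
        header, values = [line.split() for line in sample.strip().splitlines()[:2]]
        return dict(zip(header, values))

    first, second = parse(before), parse(after)
    collections = int(second["YGC"]) - int(first["YGC"])
    if collections <= 0:
        # No young collection happened between the two samples.
        return 0.0
    return (float(second["YGCT"]) - float(first["YGCT"])) / collections


# Hypothetical samples, taken a few seconds apart (illustrative values only).
BEFORE = """
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00 100.00  35.20  61.47  59.80    120  131.500     2    0.840  132.340
"""
AFTER = """
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
100.00   0.00  12.75  61.90  59.80    123  135.100     2    0.840  135.940
"""

if __name__ == "__main__":
    print(f"average minor GC pause: {avg_minor_gc_pause(BEFORE, AFTER):.2f} s")
```

An average well above a few tens of milliseconds would be consistent with the long minor collections described in the thread, though this alone does not say what is causing them.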
On Mon, Nov 2, 2015 at 10:25 AM, Renjie Liu <[email protected]> wrote:
> The output speed is measured by the output of dstat, which shows the worker
> traffic speed.
>
> On Mon, Nov 2, 2015 at 10:52 AM, Nathan Leung <[email protected]> wrote:
>
>> How are you measuring output speed? Is it possible that you are
>> experiencing problems with HBase?
>>
>> On Sun, Nov 1, 2015 at 9:22 PM, Renjie Liu <[email protected]>
>> wrote:
>>
>>> The result of jstat shows that it's not in a full gc cycle, but the minor
>>> gc takes more than 1s each time. However, the frequency of minor gc is
>>> quite low; it happens once every few seconds.
>>>
>>> On Mon, Nov 2, 2015 at 12:29 AM, Nathan Leung <[email protected]> wrote:
>>>
>>>> The box with no throughput might be in a gc loop. Check your heap
>>>> utilization and maybe increase worker heap if necessary. Also consider
>>>> decreasing the max spout pending; even without further details 20k seems
>>>> high.
>>>> On Nov 1, 2015 10:50 AM, "Harsha" <[email protected]> wrote:
>>>>
>>>>> Do you have any calls to external data sources which might be
>>>>> increasing the latency and causing tuple timeout?
>>>>>
>>>>> On Sun, Nov 1, 2015, at 04:49 AM, Renjie Liu wrote:
>>>>>
>>>>> Yes, I've set it to 20000
>>>>>
>>>>> On Sun, Nov 1, 2015 at 6:40 PM, Santosh Pingale <
>>>>> [email protected]> wrote:
>>>>>
>>>>> Have you set 'topology.max.spout.pending'?
>>>>>
>>>>> On Sun, Nov 1, 2015 at 2:26 PM, Renjie Liu <[email protected]>
>>>>> wrote:
>>>>>
>>>>> Hi, storm community:
>>>>>
>>>>> We have a storm cluster deployed with 15 workers and recently we often
>>>>> experience failures due to ack timeouts. Our input source is kafka and
>>>>> we use ganglia to monitor our cluster. Recently we experience failures
>>>>> every 12 hours, and the following are my observations from some
>>>>> monitoring tools when the problem happens:
>>>>>
>>>>> 1. The topology page shows that no worker was down, since the uptime of
>>>>>    each task is nearly equal to the topology uptime.
>>>>> 2. I've checked ganglia; the cpu report and mem report do not
>>>>>    give any clue about the problem. But the network report shows
>>>>>    something unusual: the in speed decreases a little while the out
>>>>>    speed decreases to nearly zero on some workers.
>>>>> 3. I've logged in to one of the machines mentioned above, and found
>>>>>    out that one of the survivor areas always remains 100% full.
>>>>> 4. dstat shows that csw goes to 4k+ every few seconds, while it
>>>>>    remains around 400 in normal conditions.
>>>>>
>>>>> Can anyone give us some hint about this problem?
>>>>>
>>>>> --
>>>>> Renjie Liu
>>>>> Department of Computer Science & Engineering
>>>>> Shanghai JiaoTong University
>>>>>
>>>>
>>>
>>> --
>>> Renjie Liu
>>> Department of Computer Science & Engineering
>>> Shanghai JiaoTong University
>>>
>>
>
> --
> Renjie Liu
> Department of Computer Science & Engineering
> Shanghai JiaoTong University
>
