Hi everyone,

We are having some issues with our Storm topology: some tuples are being lost somewhere in the topology. Right after the topology is deployed everything runs fine, but after several hours it starts to lose a significant number of tuples.
From what we have found in the logs, the tuples exit one bolt/spout and never enter the next bolt. Here is some info about the topology:
- The Storm version is 0.9.1, and Netty is used as the transport
- The spout extends BaseRichSpout, and the bolts extend BaseBasicBolt
- The spout consumes from a Kestrel message queue
- The cluster consists of 2 nodes: ZooKeeper, Nimbus and the UI run on one node, and the workers run on the other

I am attaching the content of the config files below. We have also tried running the workers on the other node (the one with Nimbus and ZooKeeper), and on both nodes, but the behavior is the same. According to the Storm UI there are no Failed tuples.

Can anybody give any idea of what might be the reason for the tuples getting lost? Thanks.

*Storm config (storm.yaml)*
(When both nodes have workers running, the configuration is the same on both; only the "storm.local.hostname" parameter changes)

storm.zookeeper.servers:
    - "zkserver1"
nimbus.host: "nimbusserver"
storm.local.dir: "/mnt/storm"
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
storm.local.hostname: "storm1server"
nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
ui.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
supervisor.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
worker.childopts: "-Xmx3548m -Djava.net.preferIPv4Stack=true"
storm.cluster.mode: "distributed"
storm.local.mode.zmq: false
storm.thrift.transport: "backtype.storm.security.auth.SimpleTransportPlugin"
storm.messaging.transport: "backtype.storm.messaging.netty.Context"
storm.messaging.netty.server_worker_threads: 1
storm.messaging.netty.client_worker_threads: 1
storm.messaging.netty.buffer_size: 5242880 #5MB buffer
storm.messaging.netty.max_retries: 30
storm.messaging.netty.max_wait_ms: 1000
storm.messaging.netty.min_wait_ms: 100

*Zookeeper config (zoo.cfg)*

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/zookeeper
clientPort=2181
autopurge.purgeInterval=24
autopurge.snapRetainCount=5
server.1=localhost:2888:3888

*Topology configuration* passed to the StormSubmitter:

Config conf = new Config();
conf.setNumAckers(6);
conf.setNumWorkers(4);
conf.setMaxSpoutPending(100);

Best regards,
Daria Mayorova
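P.S. In case it is relevant, here is roughly what the emit path in the spout looks like (a simplified sketch, not the actual code: the class name, the "item" field and the fetchNextItem() helper are placeholders). Tuples are emitted with a message ID so that a lost tuple should time out, be replayed, and be counted as Failed in the UI, which is why the absence of Failed tuples puzzles us:

```java
// Simplified sketch of a reliable spout emit path. Emitting with a
// message ID makes Storm track the tuple tree, so a tuple that times
// out is replayed and counted as Failed in the Storm UI.
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

public class KestrelSourceSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    private ConcurrentMap<Object, Values> pending; // in-flight tuples by message ID

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
        this.pending = new ConcurrentHashMap<Object, Values>();
    }

    @Override
    public void nextTuple() {
        Values item = fetchNextItem();
        if (item == null) {
            return; // queue empty
        }
        Object msgId = UUID.randomUUID().toString();
        pending.put(msgId, item);
        // The second argument is the message ID; without it the tuple
        // is untracked and a loss never shows up as Failed.
        collector.emit(item, msgId);
    }

    @Override
    public void ack(Object msgId) {
        pending.remove(msgId);
    }

    @Override
    public void fail(Object msgId) {
        Values item = pending.get(msgId);
        if (item != null) {
            collector.emit(item, msgId); // replay the failed tuple
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("item"));
    }

    private Values fetchNextItem() {
        // Placeholder for the Kestrel dequeue; returns null when empty.
        return null;
    }
}
```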