Re: TransactionalTridentKafkaSpout using only 1 executor

2014-12-04 Thread Andrew Neilson
How is the kafka topic you are reading from partitioned? By default, kafka will write to a single random partition at a time for 10 minutes before switching to another. So if you are looking at live data, you would only see data in one partition at a time unless you use a different partitioning

Re: TransactionalTridentKafkaSpout using only 1 executor

2014-12-04 Thread Andrew Neilson
used only one producer), would it be rebalanced afterward? Best regards, Huy, Le Van On Thursday, Dec 4, 2014 at 10:00 p.m., Andrew Neilson arsneil...@gmail.com, wrote: How is the kafka topic you are reading from partitioned? By default, kafka will write to a single random

Re: KafkaConfig: what is the difference between -1 and -2 offset

2014-12-04 Thread Andrew Neilson
-1 and -2 come from kafka.api.OffsetRequest:
https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/api/OffsetRequest.scala
-1 is the latest time, -2 is the earliest time. In order to be sure you always start from the most recent offset in Kafka, you need to set up your KafkaConfig

Re: KafkaConfig: what is the difference between -1 and -2 offset

2014-12-06 Thread Andrew Neilson
(Thread.java:722) [na:1.7.0_17] On Thu, Dec 4, 2014 at 6:46 PM, Andrew Neilson arsneil...@gmail.com wrote: -1 and -2 come from kafka.api.OffsetRequest: https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/api/OffsetRequest.scala -1 is the latest time, -2 is the earliest time

Re: Issues With Parallelism In Kafka Spout

2014-12-23 Thread Andrew Neilson
Harsha asked a similar question, but I would definitely make certain that messages being written to kafka are being partitioned the way you are expecting. If you are trying to consume live events coming from a lone kafka producer using default configuration, then messages are only going to appear

Re: Is there anyone who has experience of using storm with druid?

2015-02-04 Thread Andrew Neilson
I can't answer your question directly as I haven't used Druid, but I will point out that Druid was developed by Metamarkets and they are using Druid with Storm and Kafka in their architecture (https://metamarkets.com/2014/building-a-data-pipeline-that-handles-billions-of-events-in-real-time/).

Re: [Kafka Spout] - EarliestTime vs. forceFromStart

2015-03-31 Thread Andrew Neilson
Not exactly. forceFromStart=true will tell the spout to start reading from whatever is set in startOffsetTime (available options are the earliest offset or the latest offset). If forceFromStart=false then startOffsetTime is not used at all and the offset is just retrieved from ZooKeeper, if it's
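To make the decision described above concrete, here is a minimal sketch in Python rather than the spout's own Java. The function name `resolve_start` and the `fallback_time` parameter are hypothetical, and the fallback used when ZooKeeper holds no offset is an assumption, since the message is cut off before it covers that case. The constants -1 and -2 are the kafka.api.OffsetRequest values mentioned earlier in this listing.

```python
# Special "time" values from kafka.api.OffsetRequest (see the KafkaConfig thread).
LATEST_TIME = -1
EARLIEST_TIME = -2


def resolve_start(force_from_start, start_offset_time, zk_offset,
                  fallback_time=LATEST_TIME):
    """Return ("time", t) or ("offset", n) describing where reading would begin.

    force_from_start  -- mirrors the forceFromStart flag
    start_offset_time -- mirrors startOffsetTime (EARLIEST_TIME or LATEST_TIME)
    zk_offset         -- offset stored in ZooKeeper, or None if nothing is stored
    fallback_time     -- ASSUMPTION: what to use when ZooKeeper has no offset;
                         the original message does not say.
    """
    if force_from_start:
        # startOffsetTime is only consulted when forceFromStart is true.
        return ("time", start_offset_time)
    if zk_offset is not None:
        # Otherwise the stored offset wins and startOffsetTime is ignored.
        return ("offset", zk_offset)
    return ("time", fallback_time)
```

For example, `resolve_start(False, EARLIEST_TIME, 1234)` ignores the requested earliest time and resumes from the stored offset 1234.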

Re: Partitioning from Storm Trident to Kafka

2015-04-07 Thread Andrew Neilson
You'll need to make sure your Kafka producer is configured to partition the way you are expecting when you write to your topic. By default it will publish to the same partition for 10 minutes at a time then switch to a new one. It looks like you are trying to pass a partition key to the producer
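A rough simulation of the behavior described above may help show why a single producer with no partition key appears to feed only one partition at a time. This is a sketch in Python, not the Kafka producer's code: the class name `StickyPartitioner` is hypothetical, the 600 second interval is the 10 minutes quoted in the message, and CRC32 stands in for whatever hash the real partitioner applies to the key.

```python
import random
import zlib

REFRESH_INTERVAL_S = 600  # the "10 minutes" from the message above


class StickyPartitioner:
    """Toy model: unkeyed messages stick to one random partition per interval."""

    def __init__(self, num_partitions, refresh_interval_s=REFRESH_INTERVAL_S,
                 rng=None):
        self.num_partitions = num_partitions
        self.refresh_interval_s = refresh_interval_s
        self.rng = rng or random.Random()
        self._sticky = None      # partition currently used for unkeyed messages
        self._chosen_at = None   # time (seconds) at which it was picked

    def partition(self, key, now_s):
        if key is not None:
            # Keyed messages: the same key always maps to the same partition.
            # CRC32 is a stand-in hash, not the producer's actual one.
            return zlib.crc32(key.encode("utf-8")) % self.num_partitions
        expired = (self._sticky is None
                   or now_s - self._chosen_at >= self.refresh_interval_s)
        if expired:
            # Pick a new random partition and keep it until the interval ends.
            self._sticky = self.rng.randrange(self.num_partitions)
            self._chosen_at = now_s
        return self._sticky
```

With no key, every message sent within one interval lands on the same partition, so only one spout executor has anything to read. With a key per message, the load spreads across partitions right away.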

Re: passing arguments to storm topology at load

2015-10-27 Thread Andrew Neilson
An alternate way to do it is to get these values from a .properties file or with command-line args within your Topology class and then send it to your bolts by setting new values in the Config object
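The pattern above, sketched in Python with a plain dict standing in for Storm's Config object. Everything named here is hypothetical and only for illustration: the `--set KEY=VALUE` flag, the `example.batch.size` key, and the `ExampleBolt` class are not part of Storm. In a real topology the values would be put on the Config in the Topology class and read back from the conf map that each bolt receives in prepare().

```python
import argparse


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def build_topology_config(argv, properties_text=""):
    """Merge .properties values with command-line overrides into one dict."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag: repeatable --set KEY=VALUE overrides.
    parser.add_argument("--set", action="append", default=[],
                        metavar="KEY=VALUE")
    args = parser.parse_args(argv)
    conf = parse_properties(properties_text)
    for item in args.set:
        key, _, value = item.partition("=")
        conf[key] = value  # command-line values win over the file
    return conf


class ExampleBolt:
    """Stand-in for a bolt that reads its settings from the topology config."""

    def prepare(self, conf):
        self.batch_size = int(conf.get("example.batch.size", "100"))
```

Command-line values override the file here so one .properties file can be shared across environments; that ordering is a design choice of this sketch, not something the message prescribes.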

Re: Storm workers dying

2016-11-29 Thread Andrew Neilson
That error makes me think zookeeper isn't healthy. It isn't necessarily clear from your message whether you've verified ZK is totally ok but I'd at least look at the healthcheck on each ZK host ($ echo ruok | nc zookeeperhost 2181). We've run into that issue before when a ZK host runs out of disk
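If you would rather script the check than run nc by hand on every host, something like the following Python sketch does the same thing as `echo ruok | nc zookeeperhost 2181`. The function name `zk_ruok` is made up for this example. A healthy server answers `imok`. On newer ZooKeeper versions the four-letter commands may need to be enabled in the server config first, so a False result is not by itself proof of an unhealthy host.

```python
import socket


def zk_ruok(host, port=2181, timeout=2.0):
    """Send ZooKeeper's 'ruok' command and report whether it answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # Read until we have the 4-byte answer or the server closes.
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:
                    break
                reply += chunk
    except OSError:
        # Connection refused, timeout, DNS failure, etc. all count as not ok.
        return False
    return reply == b"imok"
```

Usage would be along the lines of `all(zk_ruok(h) for h in hosts)` over your ZK ensemble, where `hosts` is your own list of hostnames.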

Re: The acker does not work well.

2016-12-20 Thread Andrew Neilson
Are you calling ack() from within your topology? Make sure you read this carefully: http://storm.apache.org/releases/1.0.2/Guaranteeing-message-processing.html Sorry if your issue is deeper than this, but it would help to have more context. Andrew On Tue, Dec 20, 2016 at 3:23 AM

Re: What is the default parallelism_hint of bolt or spout ?

2019-04-23 Thread Andrew Neilson
Default is 1

On Mon, Apr 22, 2019 at 7:18 PM dumbdonkey wrote:
> Hi, guys
>
> Is there anyone can answer me the default parallelism hint of bolt or
> spout? I've tried to google but found nothing.
> Thanks!

Re: How to upgrade from 1.x to 2.x ?

2019-11-15 Thread Andrew Neilson
We're in the middle of an upgrade from 0.9.5 to 2.1 and we're doing this:
- update all topologies to be compatible with v2.1; throughout the process we're making any changes to both the 0.9.5 and 2.1 versions
- deploy Storm v2.1 cluster in parallel to the v0.9.5 one
- kill each topology in old

Re: Storm 2.0 worker heartbeat

2020-04-09 Thread Andrew Neilson
itted, executed, etc.) used on UI.
>> This timer only happens every 60s. So it shouldn't overload zookeeper. But
>> Pacemaker can still be used here. The graph at
>> https://github.com/apache/storm/pull/2389 might help to understand this.
>>
>> The code is merged at http

Re: Storm 2.0 worker heartbeat

2020-03-18 Thread Andrew Neilson
Hi Ethan, Pacemaker is not required but still can be used. > Under what circumstances would Pacemaker be used on v2.0+? It's not totally clear to me from how that ticket is written but it looks like that replaced all of the heartbeat logic that was managed by Pacemaker and ZK in older versions.

Behavior of heartbeats in 2.1

2020-03-19 Thread Andrew Neilson
We're working on moving from v0.9.5 to 2.1 right now and as you can imagine there have been quite a few changes :). One of the improvements we've been looking forward to is the different approach to heartbeats since we had observed the bottleneck in ZK's transaction log. I've seen references to

Re: Old state crashing nimbus? (v2.2.0)

2021-10-25 Thread Andrew Neilson
by a Nimbus's rolling restart should suffice.
>
> On Mon, Oct 25, 2021, 18:49 Andrew Neilson wrote:
>
>> Hi,
>>
>> We're running a v2.2.0 cluster with two nimbus hosts and recently noticed
>> storm-nimbus on the leader is effectively in a restart loop.

Old state crashing nimbus? (v2.2.0)

2021-10-25 Thread Andrew Neilson
Hi, We're running a v2.2.0 cluster with two nimbus hosts and recently noticed storm-nimbus on the leader is effectively in a restart loop. When I look at nimbus.log on that host it is full of log entries related to old versions of topologies we're running. There are the two types of exceptions I