One more question: what is the optimal number of partitions per topic?
On Wed, Aug 14, 2013 at 9:47 AM, Vadim Keylis <vkeylis2...@gmail.com> wrote:

> Joel, thanks so much.
> Do you guys have a hard limit on the maximum number of topics Kafka can
> support? Are there any other OS-level settings I should be concerned
> about that may cause Kafka to crash?
> I am still trying to understand how to recover from the failure and
> start the service.
>
> The following error prevents Kafka from restarting:
> [2013-08-13 17:20:08,992] FATAL Fatal error during KafkaServerStable
> startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
> java.lang.IllegalStateException: Found log file with no corresponding
> index file.
>
>
> On Wed, Aug 14, 2013 at 9:27 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
>
>> We use 30k as the limit. It is largely driven by the number of
>> partitions (including replicas), the retention period, and the number
>> of simultaneous producers/consumers.
>>
>> In your case it seems you have 150 topics, 36 partitions, 3x
>> replication - with that configuration you will definitely need to up
>> your file handle limit.
>>
>> Thanks,
>>
>> Joel
>>
>> On Wednesday, August 14, 2013, Vadim Keylis wrote:
>>
>> > Good morning Jun. A correction regarding the open file handle limit:
>> > I was wrong. I re-ran the command ulimit -Hn and it shows 10240.
>> > Which brings me to the next question: how do I appropriately
>> > calculate the open file handle limit required by Kafka? What are
>> > your settings for this?
>> >
>> > Thanks,
>> > Vadim
>> >
>> >
>> > On Wed, Aug 14, 2013 at 8:19 AM, Vadim Keylis <vkeylis2...@gmail.com>
>> > wrote:
>> >
>> > > Good morning Jun. We are using Kafka 0.8 that I built from trunk
>> > > in June or early July. I forgot to mention that running ulimit on
>> > > the hosts shows the open file handle limit set to unlimited. What
>> > > are the ways to recover from the last error and restart Kafka?
>> > > How can I delete a topic with the Kafka service down on all hosts?
>> > > How many topics can Kafka support without hitting the too many
>> > > open files exception? What did you set the open file handle limit
>> > > to in your cluster?
>> > >
>> > > Thanks so much,
>> > > Vadim
>> > >
>> > > Sent from my iPhone
>> > >
>> > > On Aug 14, 2013, at 7:38 AM, Jun Rao <jun...@gmail.com> wrote:
>> > >
>> > > > The first error is caused by too many open file handles. Kafka
>> > > > keeps each of the segment files open on the broker. So, the more
>> > > > topics/partitions you have, the more file handles you need. You
>> > > > probably need to increase the open file handle limit and also
>> > > > monitor the number of open file handles so that you can get an
>> > > > alert when it gets close to the limit.
>> > > >
>> > > > Not sure why you get the second error on restart. Are you using
>> > > > the 0.8 beta1 release?
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Jun
>> > > >
>> > > >
>> > > > On Tue, Aug 13, 2013 at 11:04 PM, Vadim Keylis
>> > > > <vkeylis2...@gmail.com> wrote:
>> > > >
>> > > >> We have a 3 node Kafka cluster. I initially created 4 topics.
>> > > >> I wrote a small shell script to create 150 topics:
>> > > >>
>> > > >> TOPICS=$(< $1)
>> > > >> for topic in $TOPICS
>> > > >> do
>> > > >>   echo "/usr/local/kafka/bin/kafka-create-topic.sh --replica 3 --topic $topic --zookeeper $2:2181/kafka --partition 36"
>> > > >>   /usr/local/kafka/bin/kafka-create-topic.sh --replica 3 --topic $topic --zookeeper $2:2181/kafka --partition 36
>> > > >> done
>> > > >>
>> > > >> 10 minutes later I see messages like this:
>> > > >> [2013-08-13 11:43:58,944] INFO [ReplicaFetcherManager on broker 7]
>> > > >> Removing fetcher for partition [m3_registration,0]
>> > > >> (kafka.server.ReplicaFetcherManager)
>> > > >> followed by
>> > > >> [2013-08-13 11:44:00,067] WARN [ReplicaFetcherThread-0-8], error for
>> > > >> partition [m3_registration,22] to broker 8
>> > > >> (kafka.server.ReplicaFetcherThread)
>> > > >> kafka.common.NotLeaderForPartitionException
>> > > >>
>> > > >> Then a few minutes later it was followed by the following messages,
>> > > >> which overwhelmed the logging system:
>> > > >> [2013-08-13 11:46:35,916] ERROR error in loggedRunnable
>> > > >> (kafka.utils.Utils$)
>> > > >> java.io.FileNotFoundException:
>> > > >> /home/kafka/data7/replication-offset-checkpoint.tmp (Too many open
>> > > >> files)
>> > > >>         at java.io.FileOutputStream.open(Native Method)
>> > > >>         at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
>> > > >>
>> > > >> I restarted the service after discovering the problem. After a few
>> > > >> minutes of attempting to recover, the Kafka service crashed with the
>> > > >> following error:
>> > > >>
>> > > >> [2013-08-13 17:20:08,953] INFO [Log Manager on Broker 7] Loading log
>> > > >> 'm3_registration-29' (kafka.log.LogManager)
>> > > >> [2013-08-13 17:20:08,992] FATAL Fatal error during KafkaServerStable
>> > > >> startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
>> > > >> java.lang.IllegalStateException: Found log file with no corresponding
>> > > >> index file.
>> > > >>
>> > > >> There was no activity on the cluster after the topics were added.
>> > > >> What could have caused the crash and triggered the too many open
>> > > >> files exception?
>> > > >> What is the best way to recover in order to restart the Kafka
>> > > >> service? (Not sure if the delete topic command will work in this
>> > > >> particular case, as all 3 services would not start.) How can we
>> > > >> prevent this in the future?
>> > > >>
>> > > >> Thanks so much in advance,
>> > > >> Vadim
>> > > >>
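
For a rough sense of where Joel's figure comes from, the configuration
described above can be worked out as a back-of-the-envelope estimate. This
is only a sketch: it assumes each partition holds a single log segment
(one .log plus one .index file), which is the minimum possible.

    # Rough per-broker open-file estimate for the setup in this thread.
    # Assumes one segment per partition (one .log + one .index), the minimum.
    TOPICS=150; PARTITIONS=36; REPLICATION=3; BROKERS=3; FILES_PER_SEGMENT=2
    echo $(( TOPICS * PARTITIONS * REPLICATION / BROKERS * FILES_PER_SEGMENT ))
    # => 10800 open segment files per broker, already above the 10240 hard
    #    limit reported by `ulimit -Hn`, before counting sockets, rolled
    #    segments, or checkpoint files.

Once retention keeps several segments per partition, the real number is
higher still, which is why a limit in the tens of thousands is needed for
this topic/partition count.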
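To follow Jun's suggestion of monitoring the open file handle count and
raising the limit, one possible sketch on Linux is below. It assumes the
broker was started via kafka-server-start.sh (so its command line contains
kafka.Kafka) and runs as a user named kafka; adjust both for your
environment.

    # Count the broker's currently open file descriptors and show its limit.
    BROKER_PID=$(pgrep -f kafka.Kafka)
    ls /proc/$BROKER_PID/fd | wc -l
    grep 'open files' /proc/$BROKER_PID/limits

    # Raising the limit usually means editing /etc/security/limits.conf
    # (or the service's init script) and restarting the broker, e.g. to
    # match the 30k mentioned above:
    #   kafka  soft  nofile  30000
    #   kafka  hard  nofile  30000

Feeding the descriptor count into existing alerting, with a threshold well
below the hard limit, gives the early warning Jun describes.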
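On the "Found log file with no corresponding index file" failure, a first
diagnostic step is simply to list the offending segments. A minimal sketch,
assuming the data directories follow the /home/kafka/data* layout visible
in the log paths above:

    # Report .log segment files that have no matching .index file.
    for log in /home/kafka/data*/*/*.log; do
        index="${log%.log}.index"
        [ -e "$index" ] || echo "missing index for: $log"
    done

How to repair them depends on the state of the data. Since the topics here
are replicated 3x, one option is to move the affected partition directories
aside and let the broker re-fetch them from the current leaders once it is
back up, but back the directories up first, as this assumes the other
replicas are intact.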