There is also no indication that they configured Kafka so that it won't lose data in the event of a broker failure:
http://aphyr.com/posts/293-call-me-maybe-kafka
http://blog.empathybox.com/post/62279088548/a-few-notes-on-kafka-and-jepsen
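As a rough illustration of what "configured not to lose data" could mean, here is a minimal Python sketch. The setting names (acks, min.insync.replicas, unclean.leader.election.enable) are those used by current Apache Kafka releases and may not match the version Loggly ran. The helper durability_gaps is hypothetical, and it treats any missing setting as unsafe rather than looking up Kafka's real defaults.

```python
def durability_gaps(settings):
    """Return the names of settings that could let acknowledged writes be lost.

    Hypothetical helper for illustration only. A missing setting is treated
    as unsafe; this is an assumption, not a statement about Kafka's defaults.
    """
    gaps = []

    # Producer side: only acks=all (alias -1) waits for the in-sync replicas.
    if str(settings.get("acks", "0")) not in ("all", "-1"):
        gaps.append("acks")

    # Topic/broker side: with fewer than 2 required in-sync replicas,
    # the leader alone can acknowledge a write.
    if int(settings.get("min.insync.replicas", 1)) < 2:
        gaps.append("min.insync.replicas")

    # Topic/broker side: unclean election lets an out-of-sync replica
    # become leader, discarding writes it never received.
    if str(settings.get("unclean.leader.election.enable", "true")).lower() != "false":
        gaps.append("unclean.leader.election.enable")

    return gaps


if __name__ == "__main__":
    fast = {"acks": 1}
    safe = {
        "acks": "all",
        "min.insync.replicas": 2,
        "unclean.leader.election.enable": "false",
    }
    print("fast:", durability_gaps(fast))
    print("safe:", durability_gaps(safe))
```

The stricter settings are the ones that cost throughput, which is why a benchmark that does not state them is hard to compare against Storm with acking enabled.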
The performance hit to Kafka is similar to Storm's when you enable comparable acks:
https://mail-archives.apache.org/mod_mbox/kafka-users/201402.mbox/%3c26ba2b6a63f54fe39b343fc673f21...@by2pr03mb239.namprd03.prod.outlook.com%3E

"Your best result shows close to a factor of 2 difference btw ack=-1 and ack=1, which is actually reasonable. Thanks, Jun"


On Sun, Apr 6, 2014 at 2:59 PM, Jason Jackson <[email protected]> wrote:

> It would have been far more useful if they had measured the systems in
> terms of dollars, as each system makes different tradeoffs. Certainly when
> you enable acking you may become bottlenecked on CPU at that point instead
> of being bottlenecked on disk/kafka. So one thing you can do is move to
> hardware with higher class CPUs to solve the bottleneck. The system they
> built is persisting intermediary queues between components in a topology.
> So while this will reduce CPU load by not needing an acking system, you
> will need more disks, as potentially any of the intermediate queues can
> start to fill up now; you need to reserve capacity for the worst case
> scenario. Potentially in terms of dollars the tradeoff to use more disks
> has marginally better total cost.
>
>
> On Fri, Apr 4, 2014 at 6:55 PM, Benjamin Black <[email protected]> wrote:
>
>> No part of the post made any sense to me. There is a significant
>> performance hit when moving to reliable operation in any system and Storm
>> is clearly doing a good job if a custom built solution can only manage 25%
>> more throughput.
>>
>>
>> On Fri, Apr 4, 2014 at 4:10 PM, Neelesh <[email protected]> wrote:
>>
>>> It's an interesting read. The blog is vague on some details - with ACK
>>> on, the throughput was 80K/s. With their custom solution it's 100K/s.
>>> Assuming they were both deployed on similar hardware (I do not know; the
>>> blog does not confirm either way), the difference is not something that
>>> warrants a custom framework to me. Obviously it's working better for
>>> Loggly.
>>>
>>>
>>> On Fri, Apr 4, 2014 at 8:26 AM, Otis Gospodnetic <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Apparently Loggly decided to ditch Storm when they got hit by the 2.5x
>>>> performance degradation factor after turning on ACKing:
>>>> https://www.loggly.com/what-we-learned-about-scaling-with-apache-storm/
>>>>
>>>> How does one minimize this performance hit?
>>>> Or maybe newer versions of Storm perform better with ACK? (Loggly
>>>> tested 0.82, they say)
>>>>
>>>> Thanks,
>>>> Otis
>>>> --
>>>> Performance Monitoring * Log Analytics * Search Analytics
>>>> Solr & Elasticsearch Support * http://sematext.com/
