Re: Kafka in virtualized environments

2017-12-01 Thread Viktor Somogyi
@Girish, wow, that could be a nice issue to debug. I was thinking about exactly these kinds of issues with virtualized environments. @Wim, how did you overcome the problem? Thinking about such issues, my first thought is increasing the VM's memory that can be utilized for read/write caching by the

Re: Kafka in virtualized environments

2017-11-30 Thread Girish Aher
I am no storage or ESX expert; what I was told by our storage folks is that they essentially created a dedicated storage pool in the SAN for ZooKeeper VMs plus other VMs that did not have a lot of IO activity (non-DB VMs). I assume that implies dedicated physical disks in the SAN for that pool. I

Re: Kafka in virtualized environments

2017-11-30 Thread Sean Glover
Girish, I'm curious what your solution was. Did you use locally attached storage for your ZK ensemble? Did you move it to static machines? On Thu, Nov 30, 2017 at 4:50 PM, John Yost wrote: > Great point by Girish--it's the delays of syncing with Zookeeper that are > particularly problematic. Mo

Re: Kafka in virtualized environments

2017-11-30 Thread John Yost
Great point by Girish--it's the delays of syncing with Zookeeper that are particularly problematic. Moreover, Zookeeper sync delays and session timeouts impact other systems as well, such as Storm. --John On Thu, Nov 30, 2017 at 10:14 AM, Girish Aher wrote: > We did not face any problems with kaf

Re: Kafka in virtualized environments

2017-11-30 Thread Girish Aher
We did not face any problems with the Kafka application per se, but we have faced problems with ZooKeeper in virtualized environments due to slow fsyncs. We were using shared SAN storage, with pools shared with other VMs. So every time there was some kind of considerable storage activity like D
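
A rough way to check whether a volume shows the kind of slow fsyncs described above is to time a series of small synced writes on it. The following is only a minimal sketch, not something from the thread: the function name `measure_fsync_latencies`, the sample count, and the 4 KiB payload are illustrative assumptions, and the numbers it prints are only a coarse indicator, not a substitute for a proper storage benchmark.

```python
import os
import tempfile
import time


def measure_fsync_latencies(directory, samples=20, payload_size=4096):
    """Return fsync durations (seconds) for small appended writes.

    Hypothetical helper for illustration; `directory` should live on the
    volume you want to probe (for example, the one holding a transaction log).
    """
    payload = b"\0" * payload_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # force the write through to the storage device
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


if __name__ == "__main__":
    # Assumption: the system temp directory stands in for the real data volume.
    results = measure_fsync_latencies(tempfile.gettempdir())
    print(f"max fsync: {max(results) * 1000:.2f} ms")
    print(f"avg fsync: {sum(results) / len(results) * 1000:.2f} ms")
```

Running this repeatedly while other VMs are doing heavy storage work may show whether the maximum latency spikes on shared pools.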

Re: Kafka in virtualized environments

2017-11-30 Thread Thomas Crayford
We run many thousands of clusters on EC2 without notable issues, and achieve great performance there. The real thing that matters is how good your virtualization layer is and how much of a performance impact it has. E.g. in modern EC2, the performance overhead of using virtualized IO is around 1-5%

Re: Kafka in virtualized environments

2017-11-30 Thread Wim Van Leuven
We are running Kafka on OpenStack for a testing/staging environment. It runs well and is stable, but it obviously is way slower than bare-metal. The simple reason is the distance to the disk (as with any IO batch-oriented system on virtualisation) and the virtual network. HTH -wim On Thu, 30 Nov 2017 at 1