Re: Bind9 on VMWare
On 01/13/2016 04:34 AM, Philippe Maechler wrote:

> My idea for the new setup is:
>
> caching servers
> - Setup new caching servers
> - Configure the ipv4 addresses of both (old) servers on the new servers
>   as a /32 and set up an anycast network. This way the stupid clients,
>   who won't switch to the secondary ns server when the primary is not
>   available, are happy when there is some problem with one server. If
>   we're having issues with the load in the future we can set up a new
>   server and put it into the anycast network.

Assuming you can manage the anycast effectively, that's a good architecture that works well. Many of my customers have used it.

> auth. servers
> - Setup a hidden master on the vmware
> - Setup two physical servers which are slaves of the hidden master
>
> That way we have one box which is (anytime in the future) doing the
> dnssec stuff, gets the updates that we're doing over the web interface
> and deploys the ready-to-serve zones to its slaves.

I would not hesitate to make the authoritative servers virtual as well.

> I'm not sure if it is a good thing to have physical servers, although
> we have a vmware cluster in both nodes which has enough capacity (ram,
> cpu, disk)? I once read that the vmware boxes have a performance issue
> with heavy udp based services. Did anyone of you face such an issue?
> Are your dns servers all running on physical or virtual boxes?

When I was at BlueCat we recommended to customers that they put their resolving name servers on physical boxes in order to avoid chicken-and-egg problems after a catastrophic failure. Resolvers are core infrastructure, as are virtualization clusters. It's better to avoid interdependencies between critical infrastructure wherever possible. Since you already have the physical boxes, I would continue to use them. The same argument can be made for DHCP as well, BTW.

That said, a non-zero number of our customers had all of their stuff virtualized, and were quite happy with it.

Modern VMware has little or no penalty, and certainly nothing that would slow you down at 15k qps.

hope this helps,

Doug

_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
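[For reference, the host-side piece of the anycast plan discussed above is just the shared service addresses on a loopback interface, with a routing daemon announcing the /32s. A rough sketch for the FreeBSD boxes mentioned in the thread; the 192.0.2.x addresses are placeholders standing in for the two legacy resolver IPs, not anything from the thread:]

```
# /etc/rc.conf on each anycast resolver (192.0.2.1/.2 are placeholders)
ifconfig_lo0_alias0="inet 192.0.2.1 netmask 255.255.255.255"
ifconfig_lo0_alias1="inet 192.0.2.2 netmask 255.255.255.255"

# named.conf: answer on the anycast addresses
options {
    listen-on { 127.0.0.1; 192.0.2.1; 192.0.2.2; };
};
```

[The missing half is route injection: a BGP or OSPF daemon on each box announcing the /32s, paired with a health check that withdraws the announcement when named stops answering, so a broken node drops out of the anycast group automatically.]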
Re: Bind9 on VMWare
Re vmware, I'm definitely interested in anything folks have discovered about udp performance issues, but I have no negative experience to offer. We mix vmware and hardware, but have both auth and query servers on both. Load tests didn't reveal any issues that made us reconsider.

We had an interesting time when we migrated a DNS server that doubled as our central ntp server into vmware. Later we moved the ntp server back to bare metal somewhere. But the issue was not udp; it was the virtualized "hardware" clock.

I have a personal concern about dependencies, e.g. if you ever have to deal with a problem that's taken a whole vmware cluster down. If the infrastructure, or the folks attempting to fix the infrastructure, depend on dns (or even if they merely work more efficiently when dns is there), then having that huge single point of failure that takes down dns could have costs. Same for a lot of low-level services. Overall architectures can take this into account.

John Wobus
Cornell University IT
Re: Bind9 on VMWare
Mike Hoskins (michoski) wrote:
> I've run several large DNS infras over the years. Back in 2005/6 I
> finally drank the koolaid and migrated a large caching infra
> (authoritative was kept on bare metal) to VMWare+Linux.

Amusingly our setup is the exact opposite - authoritative on VMs and recursive on metal.

> Finally after babysitting that for a few years, we moved everything back
> to bare metal in the name of "dependency reduction" -- we didn't want core
> things like DNS relying on anything more than absolutely necessary (I'd
> argue this is a sound engineering principle for any infrastructure admin
> to fight for, despite the fact most pointy hairs will value cost savings
> more and it flies in the face of NFV hotness).

For exactly this reason :-)

The recursive servers have their own copies of our zones, so they only depend on the auth servers for zone transfers; an auth outage doesn't damage local recursive service, and we have secondary servers to provide auth coverage for non-local users.

Tony.
-- 
f.anthony.n.finch  http://dotat.at/
Southwest Dover, Wight, Portland, Plymouth, North Biscay: Northwesterly 6 to gale 8, perhaps severe gale 9 later. Moderate or rough. Squally showers. Good, occasionally moderate.
Re: Bind9 on VMWare
On 1/13/16, 4:02 PM, "bind-users-boun...@lists.isc.org on behalf of Reindl Harald" wrote:

>Am 13.01.2016 um 19:54 schrieb Mike Hoskins (michoski):
>> I've run several large DNS infras over the years. Back in 2005/6 I
>> finally drank the koolaid and migrated a large caching infra
>> (authoritative was kept on bare metal) to VMWare+Linux
>
>I would be careful comparing 2005/2006 with now for a lot of reasons
>
>* before vSphere 5.0 VMkernel was a 32bit kernel, while capable of
>  running 64 bit guests with 10 GB RAM, but still a lot of magic
>
>* in 2005/2006 a large part was binary translation, while now
>  you need an x86_64 host with VT-support
>
>* in 2006 vmxnet3 was not available, nor was it for a long time
>  included in the mainline linux kernel, while now the paravirt
>  drivers are in the stock kernel

Agreed, that's what my "the past is not always the key to the future" quip tried to express. However, for the sake of posterity, during this and subsequent work I saw similar issues with vmxnet3 which vmware professional services could never fully explain. Also ran on hosts with VT support, and tried many Linux kernels including 3.x toward the end, without complete improvement.

Note that 2005/6 was the initial migration date, and actual operation continued through 2012/13 for our larger environments, with some still operating virtualized caches today (smaller environments which haven't had the same issues). So this is not an argument to never try virtualization by any means, and in many cases it could work quite well (everything has pros/cons)... just a place where I would be cautious in deployment and have a good rollback plan. Then again, as infrastructure operators that applies to pretty much everything we do. :-)
Re: Bind9 on VMWare
Am 13.01.2016 um 19:54 schrieb Mike Hoskins (michoski):
> I've run several large DNS infras over the years. Back in 2005/6 I
> finally drank the koolaid and migrated a large caching infra
> (authoritative was kept on bare metal) to VMWare+Linux

I would be careful comparing 2005/2006 with now for a lot of reasons

* before vSphere 5.0 VMkernel was a 32bit kernel, while capable of
  running 64 bit guests with 10 GB RAM, but still a lot of magic

* in 2005/2006 a large part was binary translation, while now
  you need an x86_64 host with VT-support

* in 2006 vmxnet3 was not available, nor was it for a long time
  included in the mainline linux kernel, while now the paravirt
  drivers are in the stock kernel
Re: Bind9 on VMWare
On 1/13/16, 10:28 AM, "bind-users-boun...@lists.isc.org on behalf of Reindl Harald" wrote:

>Am 13.01.2016 um 16:19 schrieb Lightner, Jeff:
>> We chose to do BIND on physical for our externally authoritative
>> servers.
>>
>> We use Windows DNS for internal.
>>
>> One thing you should do if you're doing virtual is be sure you don't
>> have your guests running on the same node of a cluster. If that node
>> fails your DNS is going down. Ideally if you have multiple VMWare
>> clusters you'd put your guests on separate clusters.
>
>while for sure you should run them on different nodes (except for
>upgrades, where you move them together to get one machine free of guests
>for a short timeframe), a real VMware cluster has a feature, "VMware HA",
>which would start the VMs automatically on the other node after a short
>period of time (node exploded or isolated from the network for whatever
>reason)
>
>it would also restart a crashed guest automatically
>
>https://www.vmware.com/products/vsphere/features/high-availability
>
>one of the things which is much harder to implement correctly with
>physical setups

I'll be the canary in the coal mine... having gone down this road before, I felt like dying as a result.

I've run several large DNS infras over the years. Back in 2005/6 I finally drank the koolaid and migrated a large caching infra (authoritative was kept on bare metal) to VMWare+Linux. It worked well for a while, and we did all the usual VMware BCPs (anti-affinity, full redundancy across storage/multipathing, etc). However, even with all the OCD nits we picked, there were still edge cases that just never performed as well (mostly high PPS) and misbehaviors stemming from VMWare or supporting infrastructure. After spending weeks tweaking every possible VMware setting, adding more VMs spread across more hosts, backend network and storage upgrades, etc, we would still find (or worse, have end users report) anomalies we hadn't seen before on the physical infra.
I was devoted to making it work, and spent a lot of time, including nights and weekends, scouring usenet groups, talking to VMware support, etc. It never got completely better. Finally, after babysitting that for a few years, we moved everything back to bare metal in the name of "dependency reduction" -- we didn't want core things like DNS relying on anything more than absolutely necessary (I'd argue this is a sound engineering principle for any infrastructure admin to fight for, despite the fact most pointy hairs will value cost savings more and it flies in the face of NFV hotness). Guess what? No more mystery behaviors, slow queries, etc. Hmm. Of course we still have issues, but now they are much more concrete (traceable to a known bug or other issue where the resolution is well understood).

This probably wouldn't be an issue in most environments... as I said, we ran virtual caches for years, and really only started seeing issues as clients ramped. However, is the cost savings really worth another complex dependency (quite possibly relying on another team, depending on your org structure), or the risk you might have to back out some day as the size of your environment increases? Your call, but I've learned the hard way not to virtualize core infrastructure functions just because a whitepaper or exec says it should work.

I also learned not to trust my own testing... because I spent a month with tools like dnsperf and real-world queryfiles from our environments pounding on VMware+Linux+BIND, and even though testing didn't reveal any obvious problems, real world usage did.

Again: it worked for a while, I understand the many justifications, it could make sense in some environments, the past is not necessarily the key to the future, and I even have colleagues still doing this... just had to rant a bit since it has caused me much pain and suffering. :-)
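[The dnsperf runs with real-world query files mentioned above are the right way to load test; for a quick ad-hoc probe of a resolver's UDP round-trip time without dnsperf, the same kind of check can be sketched with only the Python standard library. Everything below (server address, query name) is a placeholder, not something from the thread:]

```python
import random
import socket
import struct
import time

def build_query(name, qtype=1, qid=None):
    """Build a minimal DNS query in wire format (RFC 1035): header + question."""
    if qid is None:
        qid = random.randrange(65536)
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT/NSCOUNT/ARCOUNT=0
    header = struct.pack(">HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A) and QCLASS (1 = IN)
    return header + qname + struct.pack(">HH", qtype, 1)

def time_query(server, name, timeout=2.0):
    """Send one UDP query; return round-trip time in seconds, or None on timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        start = time.monotonic()
        sock.sendto(build_query(name), (server, 53))
        sock.recvfrom(4096)
        return time.monotonic() - start
    except socket.timeout:
        return None
    finally:
        sock.close()
```

[Looping `time_query("192.0.2.53", line)` over a query file and watching the tail latencies gives a crude dnsperf-style spot check; it won't generate the sustained packet rates that exposed the problems described above, which is exactly why real query replay at production PPS still matters.]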
Re: Bind9 on VMWare
Am 13.01.2016 um 16:19 schrieb Lightner, Jeff:
> We chose to do BIND on physical for our externally authoritative
> servers.
>
> We use Windows DNS for internal.
>
> One thing you should do if you're doing virtual is be sure you don't
> have your guests running on the same node of a cluster. If that node
> fails your DNS is going down. Ideally if you have multiple VMWare
> clusters you'd put your guests on separate clusters.

While for sure you should run them on different nodes (except for upgrades, where you move them together to get one machine free of guests for a short timeframe), a real VMware cluster has a feature, "VMware HA", which would start the VMs automatically on the other node after a short period of time (node exploded or isolated from the network for whatever reason).

It would also restart a crashed guest automatically.

https://www.vmware.com/products/vsphere/features/high-availability

One of the things which is much harder to implement correctly with physical setups.
RE: Bind9 on VMWare
We chose to do BIND on physical for our externally authoritative servers.

We use Windows DNS for internal.

One thing you should do if you're doing virtual is be sure you don't have your guests running on the same node of a cluster. If that node fails, your DNS is going down. Ideally, if you have multiple VMWare clusters, you'd put your guests on separate clusters.
Re: Bind9 on VMWare
Hello Philippe

>> where did you read that?
>
> I don't remember where I read that. I guess it was on a mailing list where
> the OP had issues with either a DHCP or syslog server. It all came down to
> the vmware host/switch which was not good enough for udp services. Could be
> that this was on Vmware 4.x and got better on 5.x.
>
> But as I said, I can't recall exactly where that was

Maybe this referred to some old Linux kernel. As long as you run something newish you should be fine. I remember that v2.6 was really bad, but after v3.0, v3.2 and so on it really improved dramatically. We currently have v3.10 and I don't see any problems. However, we don't use Vmware.

Daniel

-- 
SWITCH
Daniel Stirnimann, SWITCH-CERT
Werdstrasse 2, P.O. Box, 8021 Zurich, Switzerland
phone +41 44 268 15 15, direct +41 44 268 16 24
daniel.stirnim...@switch.ch, http://www.switch.ch
RE: Bind9 on VMWare
> > > Complexity?
> >
> > which complexity?
> >
> > a virtual guest is less complex because you don't need a ton of daemons
> > for hardware-monitoring, drivers and what not on the guest
>
> For me the relevant comparison is my ordinary OS vs. my ordinary OS +
> VMWare.
>
> > complex are 30 physical servers instead of two fat nodes running a
> > virtualization cluster with one powerful shared storage
>
> Ayup, lots of eggs in one basket.
>
> I absolutely believe virtualization has its place. I also believe that
> "everywhere" is not that place.

I too think that virtualization has its place, where it's a pretty good thing, but not everywhere. I saw bad setups where something went wrong and all of the VMs were affected. Yes, this is not a problem with the VM itself but with bad design/setup of the VM cluster.

Thank you for your responses. I'll run some benchmarks on a physical and a virtual server. Could be that we have one physical and one virtual server. That way we have both and can get our own experiences with it. Maybe we'll switch to only physical or only virtual servers in a second step.

Philippe
RE: Bind9 on VMWare
>> I'm not sure if it is a good thing to have physical servers, although we have
>> a vmware cluster in both nodes which has enough capacity (ram, cpu, disk)?
>> I once read that the vmware boxes have a performance issue with heavy udp
>> based services. Did anyone of you face such an issue? Are your dns servers
>> all running on physical or virtual boxes?
>
> where did you read that?

I don't remember where I read that. I guess it was on a mailing list where the OP had issues with either a DHCP or syslog server. It all came down to the vmware host/switch which was not good enough for udp services. Could be that this was on Vmware 4.x and got better on 5.x.

But as I said, I can't recall exactly where that was.
Re: Bind9 on VMWare
> > Complexity?
>
> which complexity?
>
> a virtual guest is less complex because you don't need a ton of daemons
> for hardware-monitoring, drivers and what not on the guest

For me the relevant comparison is my ordinary OS vs. my ordinary OS + VMWare.

> complex are 30 physical servers instead of two fat nodes running a
> virtualization cluster with one powerful shared storage

Ayup, lots of eggs in one basket.

I absolutely believe virtualization has its place. I also believe that "everywhere" is not that place.

bind-users is probably not the right forum to discuss virtualization, so I'll just leave the discussion at that for my part.

Steinar Haug, Nethelp consulting, sth...@nethelp.no
Re: Bind9 on VMWare
Am 13.01.2016 um 13:50 schrieb Ray Bellis:
> On 13/01/2016 12:44, Reindl Harald wrote:
>> where did you read that?
>>
>> we don't run *anything* on physical machines and all our nameservers
>> (auth, caching with a mix of bind/unbound/rbldnsd) as anything else runs
>> on top of VMware vSphere 5.5, previously 4.1/5.0 since 2008
>
> ISTR that some of the Dyn guys presented at DNS-OARC in Warsaw that they
> found substantially worse behaviour running this sort of traffic using
> virtualisation when compared to light-weight containers

*Which* virtualization?! You can't compare a type 1 hypervisor with hosted virtualization. As long as you are missing *real numbers* that's worth nothing, and one of the real numbers is your own load.

With well-done virtualization you have slaves on different physical hosts anyway, and so you need to ask yourself first whether that "substantially" is measurable at all for your environment.
Re: Bind9 on VMWare
First: no idea why you can't just respond to the list, instead of breaking "reply-list" and threading for others, where the duplicate mail gets filtered and the offlist reply without headers arrives.

Am 13.01.2016 um 14:06 schrieb sth...@nethelp.no:
>> we don't run *anything* on physical machines and all our nameservers
>> (auth, caching with a mix of bind/unbound/rbldnsd) as anything else runs
>> on top of VMware vSphere 5.5, previously 4.1/5.0 since 2008
>>
>> there is zero to no justification these days for running anything on bare
>> metal when you read articles like
>> https://blogs.vmware.com/vsphere/2015/03/virtualized-big-data-faster-bare-metal.html
>
> Complexity?

Which complexity?

A virtual guest is less complex because you don't need a ton of daemons for hardware-monitoring, drivers and what not on the guest. You don't need to maintain high availability in each system. You don't need to maintain monitoring in each system.

Complex are 30 physical servers, instead of two fat nodes running a virtualization cluster with one powerful shared storage.

> I see no particular reason to introduce VMWare when a normal OS on the
> bare metal works just great :-)

* move your bare metal setup online to new hardware
* replace the complete storage online
* run BIOS/firmware updates without downtimes
* make complete backups of your whole servers
* have 10, 20, 100 backups of the whole servers (OS, config and data)

All that you can't do with bare metal.
Re: Bind9 on VMWare
> we don't run *anything* on physical machines and all our nameservers
> (auth, caching with a mix of bind/unbound/rbldnsd) as anything else runs
> on top of VMware vSphere 5.5, previously 4.1/5.0 since 2008
>
> there is zero to no justification these days for running anything on bare
> metal when you read articles like
> https://blogs.vmware.com/vsphere/2015/03/virtualized-big-data-faster-bare-metal.html

Complexity?

I see no particular reason to introduce VMWare when a normal OS on the bare metal works just great :-)

Different strokes for different folks, etc.

Steinar Haug, Nethelp consulting, sth...@nethelp.no
Re: Bind9 on VMWare
On 13/01/2016 12:44, Reindl Harald wrote:
> where did you read that?
>
> we don't run *anything* on physical machines and all our nameservers
> (auth, caching with a mix of bind/unbound/rbldnsd) as anything else runs
> on top of VMware vSphere 5.5, previously 4.1/5.0 since 2008

ISTR that some of the Dyn guys presented at DNS-OARC in Warsaw that they found substantially worse behaviour running this sort of traffic using virtualisation when compared to light-weight containers.

Ray
Re: Bind9 on VMWare
Am 13.01.2016 um 13:34 schrieb Philippe Maechler:
> I'm not sure if it is a good thing to have physical servers, although we
> have a vmware cluster in both nodes which has enough capacity (ram, cpu,
> disk)? I once read that the vmware boxes have a performance issue with
> heavy udp based services. Did anyone of you face such an issue? Are your
> dns servers all running on physical or virtual boxes?

where did you read that?

we don't run *anything* on physical machines and all our nameservers (auth, caching with a mix of bind/unbound/rbldnsd) as anything else runs on top of VMware vSphere 5.5, previously 4.1/5.0 since 2008

there is zero to no justification these days for running anything on bare metal when you read articles like https://blogs.vmware.com/vsphere/2015/03/virtualized-big-data-faster-bare-metal.html
Bind9 on VMWare
Hello bind-users

We have to deploy new auth. and caching DNS servers in our environment and we're unsure how we should set it up.

current setup
---
We currently have two main pops, and in each one a physical auth. and caching server. All four boxes are running Bind9.x on FreeBSD.

auth. servers
- On the auth. master server is a web interface for us, where we can make changes to the zones. These changes are written into a db and are exported into bind zone files.
- The slave server gets its zone updates via zone transfer over the internal network.
- The bind configuration (zone "example.com" { type master; ... };) is written to a text file which is transferred by scp to the slave. The slave builds its config file and does an rndc reload. On rare occasions the slave does not reload the new zones properly and we have to manually start the transfer of the config file.
- At prime time we get < 1000 QPS on the auth servers.
- Most of the queries on the auth. servers are for IPv4 PTR records and for our mailservers (no ipv6 as of yet, but it's on the roadmap for Q1 2016, and no dnssec).

caching servers
- The caching servers have a small RPZ zone and nothing else (except the default empty zones).
- These servers are only for our networks, have an ipv6 address and they do dnssec validation.
- During heavy hours we have < 5'000 QPS. A few customers have these buggy netgear routers that ask 2'000 times a second for time-h.netgear.com. With these boxes on we get ~15'000 QPS. We once had a performance issue on the server because of that.

My idea for the new setup is:
---
caching servers
- Setup new caching servers.
- Configure the ipv4 addresses of both (old) servers on the new servers as a /32 and set up an anycast network. This way the stupid clients, who won't switch to the secondary ns server when the primary is not available, are happy when there is some problem with one server. If we're having issues with the load in the future we can set up a new server and put it into the anycast network.

auth. servers
- Setup a hidden master on the vmware.
- Setup two physical servers which are slaves of the hidden master.

That way we have one box which is (anytime in the future) doing the dnssec stuff, gets the updates that we're doing over the web interface and deploys the ready-to-serve zones to its slaves.

I'm not sure if it is a good thing to have physical servers, although we have a vmware cluster in both nodes which has enough capacity (ram, cpu, disk)? I once read that the vmware boxes have a performance issue with heavy udp based services. Did anyone of you face such an issue? Are your dns servers all running on physical or virtual boxes?

Best regards and tia
Philippe
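[For reference, the hidden-master arrangement described above is standard BIND configuration. A minimal sketch with placeholder zone names and addresses (192.0.2.x, not anything from the thread); the zone's published NS records would list only the two slaves, which is what keeps the master hidden:]

```
// On the hidden master (the VM); 192.0.2.11/.12 stand in for the
// two physical slaves
options {
    notify yes;
    allow-transfer { 192.0.2.11; 192.0.2.12; };
};
zone "example.com" {
    type master;
    file "master/example.com.db";
};

// On each physical slave; 192.0.2.10 stands in for the hidden master
zone "example.com" {
    type slave;
    masters { 192.0.2.10; };
    file "slave/example.com.db";
};
```

[NOTIFY plus slave zones would also replace the scp-and-reload step that occasionally fails in the current setup: the master bumps the SOA serial, notifies the slaves, and they transfer the zone themselves.]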