Time Clock Stops in FreeBSD 9.0 guest running under ESXi 5.0
I've now seen this on two different VMs on two different ESXi servers (Xeon-based hosts, but otherwise different hardware and at different facilities): everything runs fine for weeks, then (seemingly) suddenly/randomly the clock STOPS. In the first case I saw a jump backwards of about 15 minutes (and then a 'freeze' of the clock). The second time it was just 'time standing still' with no backwards jump. Logging accuracy is of course questionable given the nature of the issue, but nothing really jumps out (i.e. I don't see NTPd adjusting the time just before this happens or anything like that).

Naturally the clock stopping causes major issues, but the machine does technically stay running. My open sessions respond, but anything that relies on time moving forward hangs. I can't even gracefully reboot it because shutdown/etc. all rely on time moving forward (heh).

So I'm not sure if this is a VMware/ESXi issue, a FreeBSD issue, or some kind of interaction between the two. I manage lots of VMware-based FreeBSD VMs, but these are my only ESXi 5.0 servers and my only FreeBSD 9.0 VMs. I have never seen anything quite like this before, and last night, as mentioned above, it happened for the second time on a different VM + ESXi server combo, so I'm not thinking it's a fluke anymore. I've looked for other reports of this in both VMware and FreeBSD contexts and I'm not seeing anything.

What is interesting is that the two servers that have shown this issue perform similar tasks, which are different from the other VMs that have not shown the issue (yet). This is 2 VMs out of a dozen VMs spread over two ESXi servers on different coasts. That might be a coincidence, but it seems suspicious. These two VMs run these services (whereas the other VMs don't):

- BIND
- CouchDB
- MySQL
- NFS server
- Dovecot 2.x

I would also say that these two VMs are probably the most active, have the most RAM, and consume the most CPU because of what they do (vs. the others).
I have disabled NTPd since I am running the Open VM Tools (which I believe should keep the time in sync with the ESXi host, which itself uses NTP). My only guess is that there is some kind of collision where NTPd and the Open VM Tools were adjusting the time at the same time. I'm playing the waiting game now to see what this brings (though again, I am running NTPd and the Open VM Tools on all the other VMs, which have yet to show this issue).

Anyone seen anything like this? Ring any bells?

-- 
Adam Strohl
A-Team Systems
http://ateamsystems.com/
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
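If the suspicion is a collision between NTPd and the tools' periodic sync, one way to test the other half of that hypothesis is to turn off the tools' timesync and leave NTPd running. A sketch of the relevant commands, assuming the `vmware-toolbox-cmd` utility shipped with the installed open-vm-tools build supports the `timesync` subcommand (not something confirmed in this thread):

```shell
# Check whether host/guest time synchronization is currently active
# (prints "Enabled" or "Disabled" on builds that support this subcommand).
vmware-toolbox-cmd timesync status

# Disable the tools' periodic clock sync so NTPd is the only thing
# steering the guest clock; "enable" reverses this.
vmware-toolbox-cmd timesync disable
```

Either way, the usual advice is to let only one of the two mechanisms adjust the clock, so disabling NTPd (as above) or disabling timesync are both reasonable experiments.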
Re: Time Clock Stops in FreeBSD 9.0 guest running under ESXi 5.0
On 10. Mar 2012, at 08:07 , Adam Strohl wrote:

> I've now seen this on two different VMs on two different ESXi servers
> (Xeon based hosts but different hardware otherwise and at different
> facilities): Everything runs fine for weeks then (seemingly)
> suddenly/randomly the clock STOPS.

Apart from the ntp vs. openvm-tools thing, do you have an idea what "for weeks" means in more detail? Can you check, based on last/daily mails/.., how many days it was since the last reboot, to a) see if it's close to an integer wrap-around, or b) give anyone who wants to reproduce this a clue on how long they'll have to wait?

For that matter, is it a stock 9.0 or your own kernel? What other modules are loaded?

/bz

-- 
Bjoern A. Zeeb
You have to have visions! It does not matter how good you are. It matters what good you do!
Re: Time Clock Stops in FreeBSD 9.0 guest running under ESXi 5.0
On 3/10/2012 17:10, Bjoern A. Zeeb wrote:

> Apart from the ntp vs. openvm-tools thing, do you have an idea what
> "for weeks" means in more detail? Can you check based on last/daily
> mails/.. how many days it was since last reboot to a) see if it's
> close to an integer wrap-around or b) to give anyone who wants to
> reproduce this maybe a clue on how long they'll have to wait?

Uptime was 31 days on the first incident / server (occurred 5 days ago). Uptime was 4 days on the second incident / server (occurred last night).

One additional unique factor I just thought of: the two problem VMs have 4 cores allocated to them inside ESXi, while the rest have 2 cores.

> For that matter, is it a stock 9.0 or your own kernel? What other
> modules are loaded?

The kernel config is a copy of GENERIC (amd64) with the following lines added to the bottom. All the VMs use this same kernel, which I compiled once and then installed via NFS on the rest:

# -- Add support for nicer console
#
options VESA
options SC_PIXEL_MODE

# -- IPFW support
#
options IPFIREWALL
options IPFIREWALL_VERBOSE
options IPFIREWALL_VERBOSE_LIMIT=10
options IPDIVERT
options IPFIREWALL_FORWARD
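One diagnostic avenue not yet raised in the thread: when a FreeBSD guest's clock misbehaves, it is worth checking which timecounter the kernel selected, since some counters behave poorly under hypervisors. The sysctl names below are standard on FreeBSD; which counters actually appear (TSC, HPET, ACPI-fast, i8254, ...) depends on the hardware and hypervisor, so treat the `HPET` value as an example, not a recommendation:

```shell
# Show the timecounter currently in use, and the candidates
# the kernel detected (with their quality ratings).
sysctl kern.timecounter.hardware
sysctl kern.timecounter.choice

# Switch timecounters at runtime to see if the symptom changes;
# pick a name that kern.timecounter.choice actually listed.
sysctl kern.timecounter.hardware=HPET
```

A choice that survives reboots can be set via `kern.timecounter.hardware=HPET` in /etc/sysctl.conf.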
RE: FreeBSD root on a geli-encrypted ZFS pool
Fabian Keil writes:

> In my opinion protecting ZFS's default checksums (which cover
> non-metadata as well) with GEOM_ELI is sufficient. I don't see what
> advantage additionally enabling GEOM_ELI's integrity verification
> offers.

I follow you now. You may be right about the extra integrity checking being redundant with ZFS.

> Anyway, it's a test without a file system, so the ZFS overhead isn't
> measured. I wasn't entirely clear about it, but my assumption was that
> the ZFS overhead might be big enough to make the difference between
> HMAC/MD5 and HMAC/SHA256 a lot less significant.

Got it. That also makes sense. I'll put this on my to-test list.

> I'm currently using sector sizes between 512 and 8192, so I'm not
> actually expecting technical problems; it's just not clear to me how
> much the sector size matters and whether 4096 is actually the best
> value when using ZFS.

The geli(8) manual page claims that larger sector sizes lower the overhead of GEOM_ELI's keying initialization and encryption/decryption steps by requiring fewer of these compute-intensive setup operations per block. You can think of it in terms of networking, where it makes sense to re-use a TCP connection for multiple HTTP requests: for small HTTP requests, the bandwidth and latency cost of the TCP three-way handshake overshadows the actual data transfer.

-- 
I FIGHT FOR THE USERS
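To make the amortization argument above concrete, here is a back-of-the-envelope sketch (my own illustration in POSIX shell arithmetic, not taken from the geli(8) page): the number of per-sector setup operations for a fixed amount of data drops linearly as the sector size grows.

```shell
# Per-sector setup operations needed to process 1 GiB of data
# at 512-byte vs. 4096-byte GEOM_ELI sector sizes.
bytes=$((1024 * 1024 * 1024))
echo "512-byte sectors:  $((bytes / 512)) setup operations"   # 2097152
echo "4096-byte sectors: $((bytes / 4096)) setup operations"  # 262144
```

So 4096-byte sectors need one eighth the per-sector setup work of 512-byte sectors for the same volume of data, which is the same amortization logic as re-using one TCP connection for many small HTTP requests.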