Oh, just as an illustrative example: on s7,
1162 r...@s7:/boot# time md5sum vmlinuz-2.6.9-89.0.25.ELsmp
c4fb9036c6d660d8b5939bb597c3b8e3  vmlinuz-2.6.9-89.0.25.ELsmp

real    0m0.008s
user    0m0.006s
sys     0m0.001s

1044 r...@s8:/boot# time md5sum vmlinuz-2.6.9-89.0.25.ELsmp
c4fb9036c6d660d8b5939bb597c3b8e3  vmlinuz-2.6.9-89.0.25.ELsmp

real    0m0.070s
user    0m0.051s
sys     0m0.018s

The results are consistent. And pretty much *anything* exhibits the same
slowness on s8 vs s7.

-Bond

> -----Original Message-----
> From: [email protected]
> [mailto:linux-poweredge-[email protected]] On Behalf Of Bond Masuda
> Sent: Saturday, May 22, 2010 12:30 PM
> To: [email protected]
> Subject: any advice to find root cause of "Falling back to HPET" ?
>
> Hello,
>
> I'd appreciate any help/advice anyone can provide regarding our issue.
> I've run out of ideas on this one...
>
> We have two identical PowerEdge 2950s, one called s7 and the other s8.
> Both are web servers running Apache and PHP. We first noticed the
> problem because our benchmarking showed drastically different results
> between the two servers. With s7, we were able to get 180 requests/sec,
> while on s8 we only get 35 requests/sec (and now only 15 requests/sec -
> more on that below). After this, we became aware that almost all tasks
> on s8 were slower than on s7. Whether CPU bound or I/O bound, everything
> we tried was slower on s8 than on s7 (untar'ing archives, running md5
> hashes, etc).
>
> I started digging around. Both servers are identical in terms of
> software and configuration (other than things like hostname and IP
> addresses). Both servers are RHEL4U8, kernel-2.6.9-89.0.25.ELsmp,
> x86_64, exact same packages and exact same versions. I even ran
> 'rpm --verify' on all packages and didn't find anything unusual on
> either s7 or s8.
>
> The ONLY error messages I'm seeing that are unique to s8 are the
> following in dmesg:
>
> Losing some ticks... checking if CPU frequency changed.
> warning: many lost ticks.
> Your time source seems to be instable or some driver is hogging interupts
> rip __do_softirq+0x4d/0xd0
> Falling back to HPET
>
> Some google searching found:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=429010
>
> which refers to:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=248488
>
> But that seems to refer to problems with virtualization. This is on
> real hardware.
>
> What we don't understand is that s7 does *not* exhibit any slowness or
> the messages above, only s8. Again, both are identical.
>
> So, thinking this might be a hardware issue, we asked our hosting
> company to pull the drives out of s8 and replace the entire chassis.
> After replacing the entire chassis of s8, we are still getting the
> above messages in dmesg. Not only that, things have gotten worse... our
> benchmarking (using 'ab') now shows the server can only do 15
> requests/sec (all these tests were run locally on loopback to avoid any
> network-related issue).
>
> Since the chassis was swapped, we feel that it probably isn't a
> hardware issue. But we have s7, which is configured identically to s8
> and doesn't have this issue, so it is hard to say that it is a software
> issue.
>
> Any advice? What can I do to find the root cause?
>
> TIA,
> -Bond

_______________________________________________
Linux-PowerEdge mailing list
[email protected]
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
