Re: [Users] Shutdown problems

2009-12-07 Thread Gregor at HostGIS

To clarify further on versions:

The HN is 2.6.24 ovz009.1 on Fedora 9.

We must use 2.6.24 despite its development status because 2.6.18 lacks 
support for AMCC/3ware RAID controllers. Aside from this shutdown issue, 
we have used it for 14 months now under high loads without issue. Aside 
from this shutdown issue, we consider it stable and production-grade.


--
HostGIS, Open Source solutions for the global GIS community
Greg Allensworth - SysAdmin, Programmer, GIS Person, Security
Network+   Server+   A+   Security+

No one cares if you can back up — only if you can recover.

___
Users mailing list
Users@openvz.org
https://openvz.org/mailman/listinfo/users


Re: [Users] Shutdown problems

2009-10-11 Thread HostGIS Support

To clarify further on versions:

The HN is 2.6.24 ovz009.1 on Fedora 9.

We must use 2.6.24 despite its development status because 2.6.18 lacks 
support for AMCC/3ware RAID controllers. Aside from this shutdown issue, 
we have used it for 14 months now under high loads without issue. Aside 
from this shutdown issue, we consider it stable and production-grade.


--
HostGIS, Open Source solutions for the global GIS community
Greg Allensworth - SysAdmin, Programmer, GIS Person, Security
Network+   Server+   A+   Security+

No one cares if you can back up — only if you can recover.



Re: [Users] Shutdown problems

2009-10-11 Thread Scott Dowdle
Greetings,

- HostGIS Support supp...@hostgis.com wrote:
> To clarify further on versions:
>
> The HN is 2.6.24 ovz009.1 on Fedora 9.
>
> We must use 2.6.24 despite its development status because 2.6.18
> lacks support for AMCC/3ware RAID controllers. Aside from this shutdown
> issue, we have used it for 14 months now under high loads without issue.
> Aside from this shutdown issue, we consider it stable and production-grade.

As you may know, Red Hat updates their 2.6.18 kernel about every six months 
with their update releases (for example, going from RHEL 5.2 to 5.3) and in the 
process they backport a lot of drivers... although granted, the OpenVZ Project 
lags behind Red Hat releases.  My point, though, is this: have you tried the 
current OpenVZ RHEL5-based kernel to see if it is compatible with your hardware?

I'm a Fedora fan, in fact I'm wearing a Fedora tee-shirt as I type this... and 
I love it on the desktop (used on the laptop I'm typing this from)... but it 
isn't a server OS unless you want to upgrade at least once a year.  As you 
probably know, Fedora has a rapid 6 month release cycle and a very limited 
support cycle (2 months after a new release comes out, 2 releases back is 
EOLed).  You said you are using Fedora 9 and that was EOLed a while ago.

Now to actually address your issue... have you searched for a bug report or 
checked the forums?  Has this issue been reported?  I could have looked myself 
but I'd rather encourage you to get familiar with the OpenVZ bugzilla system 
(http://bugzilla.openvz.org/).

You mentioned you are using veth devices... which I assume you needed to 
use... since they are more of a security risk than venet devices and have 
slightly more overhead.  So, what are you running in the containers that 
requires veth?  I'm just wondering if we can find what processes, if any, are 
leading to your problem... although you did mention that all of the processes 
in the container are stopped.  What distro or distros are you running in your 
containers?

I believe the 2.6.24 kernel stays on the devel list (rather than being moved to 
retired status) because it is used in Ubuntu 8.04x LTS.  2.6.26 is available as 
a devel because it is used by Debian 5.  2.6.27 is there because the mainline 
kernel developers have stated they plan to maintain 2.6.27 for at least two 
more years.  I don't know how much commitment from OpenVZ/Parallels exists for 
these devel kernels... although 2.6.27 does look like the most natural target to 
make it to stable.  I also assume when RHEL6 comes out (whenever that might 
be), whatever kernel it is based on will also be a target for an OpenVZ stable 
kernel branch.

I think your best bet is to join an existing bug report (if one exists) or 
start a new one... and work with the kernel developers to gather information 
about the problem so they can solve it... assuming that the latest RHEL5-based 
kernel still doesn't support your RAID hardware.  Of course if the 
finding-a-fix process doesn't work out well for you within a reasonable amount 
of time, perhaps switching RAID cards would be an option... to something that 
is well supported in the RHEL5-based kernel.

TYL,
-- 
Scott Dowdle
704 Church Street
Belgrade, MT 59714
(406)388-0827 [home]
(406)994-3931 [work]


Re: [Users] Shutdown problems

2009-10-11 Thread Scott Dowdle
Greetings again,

- HostGIS Support supp...@hostgis.com wrote:

> To clarify further on versions:
>
> The HN is 2.6.24 ovz009.1 on Fedora 9.
>
> We must use 2.6.24 despite its development status because 2.6.18
> lacks support for AMCC/3ware RAID controllers. Aside from this shutdown
> issue, we have used it for 14 months now under high loads without issue.
> Aside from this shutdown issue, we consider it stable and production-grade.

One other thing... it would be a good experiment to set up a CentOS 5 box with 
the OpenVZ RHEL5-based kernel... and migrate one of your containers to it... 
and see if it will shut down properly there.   Another good test would be to 
set up a machine using the same kernel you are currently using but without the 
RAID hardware and see if you still have the same problem with container 
shutdown.  If not, that would indicate that the problem is related to the RAID... 
although I'm guessing it is unrelated to the RAID, but who knows. 
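
To make the comparison between hosts repeatable, the same timed stop could be 
run on each test machine. What follows is only a rough sketch, assuming 
Python 3.7+ is available on the hardware node and that `vzctl stop <CTID>` is 
the command being tested. The function name, the container ID 101 and the 
300-second timeout are made-up illustration values, not anything from this 
thread.

```python
import subprocess
import time


def timed_stop(ctid, timeout_s=300, command=("vzctl", "stop")):
    """Run the stop command for one container and report how it went.

    Returns a dict saying whether the command finished within timeout_s
    seconds, its exit code (None if it timed out), and the elapsed time.
    `command` is a parameter only so the same wrapper can be tried out
    on a machine without OpenVZ installed.
    """
    argv = list(command) + [str(ctid)]
    start = time.monotonic()
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this is raised.
        return {
            "ctid": ctid,
            "finished": False,
            "returncode": None,
            "seconds": time.monotonic() - start,
        }
    return {
        "ctid": ctid,
        "finished": True,
        "returncode": result.returncode,
        "seconds": time.monotonic() - start,
    }
```

Calling `timed_stop(101)` on the current HN, on the CentOS 5 box, and on the 
machine without the RAID card would give three results to compare directly. 
If `vzctl` is not installed, the call raises `FileNotFoundError` rather than 
reporting a timeout.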

TYL,
-- 
Scott Dowdle
704 Church Street
Belgrade, MT 59714
(406)388-0827 [home]
(406)994-3931 [work]


Re: [Users] Shutdown problems

2009-10-11 Thread Thorsten Schifferdecker
Hi,

can you please file a bug at bugzilla.openvz.org and add info about:

- what template OS is used
- any log entries in syslog (kern|dmesg).log
- which raid controller
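
For the log entries, one possible way to trim a kernel log down to the lines 
worth attaching is sketched below. The function name and the default keywords 
(`veth`, `3w-9xxx`, `unregister_netdevice`) are only guesses at what might be 
relevant to this report and are not something the developers asked for, so 
adjust them freely.

```python
def filter_kernel_log(log_text, keywords=("veth", "3w-9xxx", "unregister_netdevice")):
    """Return the lines of log_text that mention any keyword (case-insensitive).

    log_text is the full text of a kernel log, for example the output of
    dmesg saved to a file. The default keywords are assumptions chosen for
    illustration; they are not an official list.
    """
    lowered = [keyword.lower() for keyword in keywords]
    return [
        line
        for line in log_text.splitlines()
        if any(keyword in line.lower() for keyword in lowered)
    ]
```

The text could come from something like `open("dmesg.txt").read()` after 
saving the `dmesg` output on the HN right after a failed container stop.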

Bye,
 Thorsten

HostGIS Support wrote:
> I emailed on the topic before, and have never found a solution -- nor
> indeed, more than one other corroboration of the problem's existence.
> But now, I have freed up a whole server with OpenVZ where we can
> experiment with it at will.
>
> The problem: Shutting down a VPS gives me a timeout after several
> minutes. Although all processes in the container are dead, the container
> itself will not finish shutting down. The veth device never goes down,
> the container cannot be restarted, the phantom VPS will hang around
> until I power-cycle the server. This interrupts shutdowns too: init 0
> and reboot never, ever work; they do nothing, they don't turn anything
> off; and I have to pull the plug.
>
> Worse, this happens reliably -- I don't dare shut down a VPS unless it's
> a migration, and I can manually complete the migration and startup, then
> power-cycle the origin HN.
>
> BUT... Now we have a machine and some IPs with OpenVZ, and my current
> project is to figure this thing out so we can reboot with confidence.
> Where do we start and who's with me? :)


Re: [Users] Shutdown problems

2009-10-11 Thread HostGIS Support
> have you searched for a bug report or checked the forums?
> Has this issue been reported?


I'll take the "if you don't have anything nice to say, don't say 
anything" clause for my previous experience with forums. Maybe the 
OpenVZ forum is unlike any other, but honestly the thought of posting to 
a forum didn't occur to me with any degree of seriousness. I did post to 
this list several months ago, and got no replies.


But, I am surprised that the bug I filed last year is not visible. Boy 
do I feel like a luser; I bet I closed the wrong tab or something and 
never finished filing it -- no wonder nobody replied. Per Thorsten's 
request I will indeed file the bug report; maybe the kernel folks can 
verify that it's been solved or maybe something really weird is going on.




> Red Hat updates their 2.6.18 kernel [...]
> in the process they backport a lot of drivers
> [...] current OpenVZ RHEL5-based kernel to see if it
> is compatible with your hardware?

According to the kernel config file inside the RPM, RHEL5-2.6.18 now has 
the 3W-9000 driver. Good to know; that may indeed be a realistic option 
and an easy fix.
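
A check like this can be scripted against the config file pulled out of the 
RPM. The sketch below is only illustrative: it assumes the "3W-9000 driver" 
corresponds to the kernel config symbol `CONFIG_SCSI_3W_9XXX`, and the 
function name is made up.

```python
import re


def driver_state(config_text, symbol="CONFIG_SCSI_3W_9XXX"):
    """Report how a kernel config symbol is set.

    Returns the value after '=' (normally 'y' for built-in or 'm' for
    module), or 'n' if the symbol is absent or marked "is not set".
    The default symbol is an assumption about which option covers the
    3ware 9000-series driver.
    """
    pattern = re.compile(r"^%s=(\S+)$" % re.escape(symbol), re.MULTILINE)
    match = pattern.search(config_text)
    if match:
        return match.group(1)
    return "n"
```

Feeding it the contents of the kernel config file from the RPM would be 
expected to return `y` or `m` if the driver is enabled in that build.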


So, can you please clarify: is the RHEL5 kernel usable and recommended 
even if I'm NOT running RHEL5?




> You said you are using Fedora 9 and that was EOLed a while ago.


Yeah, and it was the latest thing 14 months ago when we deployed this 
thing. I can't say I'm completely pleased with that fast a retirement 
cycle. You say that CentOS is a fine base OS? I have worked with it on a 
few occasions, and find it similar enough to Fedora that I like it.


Changing out base OSs isn't something I'd do lightly, but is something I 
would consider as a longer-term plan if it improved long-term support or 
fixed this bug we're having. Beyond this one bug, things are going great.




> You mentioned you are using veth devices


Correct. In my initial setup of the pilot systems veth was the only 
thing that worked, so we went with it. Why do you ask, and what do you 
mean about it being a security issue? (I did ask about ARP and IP 
spoofing on the list some months back, and that one also got no replies.)




> What distro or distros are you running in your containers?


This seems to happen equally with both of our offerings currently 
deployed: Ubuntu 8.04 and HostGIS Linux 4.2 (for your purposes, think 
Slamd64 12).


--
HostGIS, Open Source solutions for the global GIS community
Greg Allensworth - SysAdmin, Programmer, GIS Person, Security
Network+   Server+   A+   Security+

No one cares if you can back up — only if you can recover.
