On 2016-03-23 9:08 AM, Robert Mustacchi wrote:
On 3/21/16 8:38 , Karl Rossing wrote:
Upgraded to 20160317T000621Z and it still hangs.

On 2016-03-21 7:09 AM, Karl Rossing wrote:
We have a Cisco Systems Inc R200-1120402 (2xE5540, 64 GB of RAM,
ST1000NM0011 drives) that we upgraded to joyent_20151104T185720Z.

The server stops responding, the console freezes and we can't ssh into
it. It did panic once with "panic message: I/O to pool 'zones'
appears to be hung" but has locked up twice since with no panic msg.

I'm wondering if there is something between 20151104T185720Z and
20160317T000621Z that might explain the problem?

The fact that you have the I/O deadman fire is not always a good sign.
That suggests that for some reason I/O has stopped. When this happens,
are you able to inject an NMI (non-maskable interrupt) into the system?
You can do this via the 'chassis power diag' ipmitool command.

That should force it to generate a dump and we can talk through how to
investigate what's going on there.

Robert

Just read this email now. I will follow https://wiki.smartos.org/pages/viewpage.action?pageId=754743 the next time it happens.

I have been able to get some stability by freeing up RAM on this particular system.

Current zonememstat
                                 ZONE  RSS(MB)  CAP(MB)    NOVER POUT(MB)
                               global        0        -        -        -
 56d619a8-0e52-464e-95f2-7be0f5521185     1085     2048        0        0
 6d9ed1cf-548c-4254-86f9-377f53d42036     1085     2048        0        0
 49686ae2-01c2-4da8-badf-c12ae6860b2c     1085     2048        0        0
 876aa5e9-9b5c-4068-840e-ac4e20a45721     8254     9216        0        0
 da7434d4-8d95-11e2-b95c-9b2eb37e2e94     2124     3072        0        0
 6121d77a-8272-42f7-9c74-1ae18140f522     4138     5120        0        0
 c2c55797-a848-4e58-a666-9ac227ed3d00     3110     4096        0        0
 b10c1d5f-b648-4c23-8b21-b2a51c8e80a0    12419    13312        0        0

Total RSS: 33300 MB
Total CAP: 40960 MB
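
For anyone who wants to check these totals, here is a rough sketch that sums the RSS and CAP columns from zonememstat-style text. It assumes the column layout shown above and treats the "-" entries on the global row as zero. SAMPLE and sum_columns are names made up for this example, not part of any SmartOS tool.

```python
# Sample input copied from the zonememstat listing in this message.
SAMPLE = """\
ZONE RSS(MB) CAP(MB) NOVER POUT(MB)
global 0 - - -
56d619a8-0e52-464e-95f2-7be0f5521185 1085 2048 0 0
6d9ed1cf-548c-4254-86f9-377f53d42036 1085 2048 0 0
49686ae2-01c2-4da8-badf-c12ae6860b2c 1085 2048 0 0
876aa5e9-9b5c-4068-840e-ac4e20a45721 8254 9216 0 0
da7434d4-8d95-11e2-b95c-9b2eb37e2e94 2124 3072 0 0
6121d77a-8272-42f7-9c74-1ae18140f522 4138 5120 0 0
c2c55797-a848-4e58-a666-9ac227ed3d00 3110 4096 0 0
b10c1d5f-b648-4c23-8b21-b2a51c8e80a0 12419 13312 0 0
"""


def sum_columns(text):
    """Return (total RSS, total CAP) in MB, skipping the header line."""
    rss_total = 0
    cap_total = 0
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or malformed line
        rss, cap = fields[1], fields[2]
        # Non-numeric entries such as "-" (no cap) contribute nothing.
        if rss.isdigit():
            rss_total += int(rss)
        if cap.isdigit():
            cap_total += int(cap)
    return rss_total, cap_total


if __name__ == "__main__":
    rss_mb, cap_mb = sum_columns(SAMPLE)
    print(f"Total RSS: {rss_mb} MB")
    print(f"Total CAP: {cap_mb} MB")
```

To run it against a live system, the text of SAMPLE could be replaced with the captured output of zonememstat.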

Previous zonememstat totals were about 20 GB higher.

So far, we have seven days of uptime.

I'm also wondering what number to set the CAP to. I thought it was linked to vmadm update UUID ram= but that doesn't seem to be the case.