It turns out that the problem Kristin was hitting was bug 10990, which is triggered by using "zoneadm clone" to clone a zone. The clone operation causes a snapshot name collision that we were not catching due to bug 11062.
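
For anyone wondering whether they are in the same situation, listing the snapshots and clone origins under the zone datasets shows whether any zones were created with "zoneadm clone". This is only a diagnostic sketch; the dataset name comes from Kristin's "zfs list" output further down and will differ on other systems:

# Look for the clone-origin snapshots that "zoneadm clone" leaves behind;
# a name collision among these is what triggers bug 10990.
zfs list -t snapshot -r local0/zones

# Zone BE datasets with a non-"-" origin are clones of another zone's BE.
zfs get -r origin local0/zones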

To work around this issue there are two possibilities:
1) delete zones that have been cloned using "zoneadm clone" (an example appears a couple of paragraphs down), or
2) detach the zones before "pkg image-update" is run (a rough sketch follows this list).
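
The second workaround looks roughly like this; the zone names are taken from Kristin's configuration, and this is a sketch rather than an exact transcript of what we ran:

# Detach each non-global zone so image-update leaves its BE datasets alone
# (repeat for every installed zone on the system)
zoneadm -z web01 detach
zoneadm -z db01 detach

# Run the update as usual, then activate and boot the new BE
pkg image-update -v

# Reattach the zones afterwards; depending on the zone brand you may also
# need to bring the zone's software in sync with the new global zone
zoneadm -z web01 attach
zoneadm -z db01 attach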

I worked with Kristin on this issue, and we were able to work around it using the second workaround listed above. We also found that the zone causing the problem on her system was a test zone that could simply be deleted; removing it also cleared the problem on her 2009.06 BE. Deleting a cloned zone that is no longer needed goes roughly like the sketch below.
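
The zone name below is only a placeholder, not the actual name of the zone on Kristin's system:

# Halt the unwanted cloned zone, remove its bits, and delete its configuration
zoneadm -z testzone halt
zoneadm -z testzone uninstall -F
zonecfg -z testzone delete -F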

This issue has been documented for the 2010.03 release.

Thanks,

-evan

On 3/7/10 6:41 AM, Oleksii Dzhulai wrote:
Hi, have a look at

http://defect.opensolaris.org/bz/show_bug.cgi?id=11062#c4

I think it's related to your problem.

--------------------------------------------------------------

http://unixinmind.blogspot.com

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Kristin Amundsen-Cubanski
Sent: Monday, March 01, 2010 9:16 PM
To: zfs-discuss@opensolaris.org; caiman-disc...@opensolaris.org
Subject: [zfs-discuss] image-update looping in libzfs or libbe

Hi all,

I originally posted this on pkg-discuss, but it seems the issue is not with pkg itself. I am trying to update our OpenSolaris 2009.06 server and the update is getting stuck in a loop.

kris...@waldorf:~# pkg image-update -v
Creating Plan / Before evaluation:
UNEVALUATED:
(bunch of packages)

After evaluation:
(bunch of packages)
Actuators:
restart_fmri: svc:/system/manifest-import:default
restart_fmri: svc:/application/desktop-cache/input-method-cache:default
restart_fmri: svc:/application/desktop-cache/pixbuf-loaders-installer:default
restart_fmri: svc:/application/desktop-cache/gconf-cache:default
restart_fmri: svc:/application/desktop-cache/icon-cache:default
None
DOWNLOAD                                 PKGS       FILES    XFER (MB)
Completed                               66/66   3927/3927 119.95/119.95

PHASE                                        ACTIONS
Removal Phase                                167/167
Install Phase                                614/614
Update Phase                               6458/6458
PHASE                                          ITEMS
Reading Existing Index                           8/8
Indexing Packages                              66/66
Optimizing Index...
PHASE                                          ITEMS
Indexing Packages                            637/637

This is where it hangs. Truss of the process shows:

ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08044CD0) Err#12 ENOMEM
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08044CD0) = 0
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08043060) = 0
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08043060) Err#12 ENOMEM
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08043060) = 0
ioctl(4, ZFS_IOC_SNAPSHOT_LIST_NEXT, 0x08043480) = 0
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08041FE0) = 0
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08044CD0) Err#12 ENOMEM
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08044CD0) = 0
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08043060) = 0
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08043060) Err#12 ENOMEM
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08043060) = 0
ioctl(4, ZFS_IOC_SNAPSHOT_LIST_NEXT, 0x08043480) = 0
ioctl(4, ZFS_IOC_OBJSET_STATS, 0x08041FE0) = 0

It just keeps looping like this forever.

I killed the process. Then beadm also hung when I was switching the active boot environment back to the current one, although it did make the change before hanging. I killed that as well. Afterwards, using beadm to unmount the new image worked fine.

I deleted the new boot environment and tried the update again, and hit exactly the same hang in the same place.
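
(The cleanup between attempts was roughly the following; the BE name was whatever "beadm list" showed for the half-built boot environment, so treat this as a sketch rather than an exact transcript.)

# Unmount the partially built boot environment and destroy it
beadm unmount <new-BE-name>
beadm destroy <new-BE-name>

# Double-check that the current BE is still active
beadm list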

Here is zfs list:

kris...@waldorf:~$ zfs list
NAME USED AVAIL REFER MOUNTPOINT
local0 84.1G 184G 20K /local0
local0/sites 10.4G 184G 19K /local0/sites
local0/sites/database 10.4G 184G 10.4G /var/mysql/5.0/data
local0/zones 73.6G 184G 25K /zones
local0/zones/db01 28.4G 184G 22K /zones/db01
local0/zones/db01/ROOT 28.4G 184G 19K legacy
local0/zones/db01/ROOT/zbe 1.50M 184G 612M legacy
local0/zones/db01/ROOT/zbe-1 28.4G 184G 28.4G legacy
local0/zones/db01/ROOT/zbe-2 72.5K 184G 28.4G legacy
local0/zones/db02 2.44G 184G 22K /zones/db02
local0/zones/db02/ROOT 2.44G 184G 19K legacy
local0/zones/db02/ROOT/zbe 2.44G 184G 2.11G legacy
local0/zones/db02/ROOT/zbe-1 75.5K 184G 2.10G legacy
local0/zones/web01 36.3G 184G 22K /zones/web01
local0/zones/web01/ROOT 36.3G 184G 19K legacy
local0/zones/web01/ROOT/zbe 11.3G 184G 11.3G legacy
local0/zones/web01/ROOT/zbe-1 25.0G 184G 24.9G legacy
local0/zones/web01/ROOT/zbe-2 87.5K 184G 24.3G legacy
local0/zones/web01_old 3.10G 184G 22K /zones/web01_old
local0/zones/web01_old/ROOT 3.10G 184G 19K legacy
local0/zones/web01_old/ROOT/zbe 3.10G 184G 14.4G legacy
local0/zones/web01_old/ROOT/zbe-1 86.5K 184G 14.4G legacy
local0/zones/web02 3.39G 184G 22K /zones/web02
local0/zones/web02/ROOT 3.39G 184G 19K legacy
local0/zones/web02/ROOT/zbe 3.39G 184G 3.39G legacy
local0/zones/web02/ROOT/zbe-1 86.5K 184G 3.38G legacy
rpool 74.6G 59.3G 76K /rpool
rpool/ROOT 13.2G 59.3G 18K legacy
rpool/ROOT/opensolaris 26.5M 59.3G 3.37G /
rpool/ROOT/opensolaris-1 40.6M 59.3G 3.79G /
rpool/ROOT/opensolaris-2 12.7G 59.3G 10.2G /
rpool/ROOT/opensolaris-3 529M 59.3G 10.3G /
rpool/dump 16.0G 59.3G 16.0G -
rpool/export 29.3G 59.3G 19K /export
rpool/export/home 29.3G 59.3G 937K /export/home
rpool/export/home/Admin 16.6M 59.3G 16.6M /export/home/Admin
rpool/export/home/kristin 29.3G 59.3G 29.3G /export/home/kristin
rpool/export/home/tcubansk 466K 59.3G 466K /export/home/tcubansk
rpool/swap 16.0G 75.3G 16K -

pstack output:

kris...@waldorf:~# pstack `pgrep pkg`
27967: /usr/bin/python2.4 /usr/bin/pkg image-update -v
fed819d5 ioctl (d8aa5e0, 5a14, 8043480, 1020) + 15
fdf5e7f0 zfs_iter_snapshots (d8aa5e0, fdf5fde8, 8045d00, 0) + 7c
fdf60112 zfs_promote (cd81730, fe0283a0, 0, 0) + 166
fe01612c be_promote_ds_callback (cd81730, 0, 0, 0) + d0
fe015dce be_promote_zone_ds (caa2b70, c996038, 8047268, fe0145cd) + 2da
fe014609 _be_activate (caa2b70, fe029008, 804728c, fe014188) + 3d9
fe0141d0 be_activate (c5f8528) + 68
fe051be5 beActivate (0, c783b4c, 0, feeceb7c) + 69
fef03125 call_function (804738c, 1, 23fe96c1, 82d5aac) + 3f5
fef00221 PyEval_EvalFrame (8142a64, 8295660, 828be84, 0) + 2b11
fef01d23 PyEval_EvalCodeEx (8295660, 828be84, 0, c4b975c, 1, c4b9760) + 903
fef032f4 fast_function (dcb3764, 804754c, 1, 1, 0, 0) + 164
fef02dff call_function (804754c, 1, 6f, 0) + cf
fef00221 PyEval_EvalFrame (c4b95e4, 807def4, 828be84, 0) + 2b11
fef01d23 PyEval_EvalCodeEx (8295720, 828be84, 0, 84789fc, 1, 8478a00) + 903
fef032f4 fast_function (829656c, 804770c, 1, 1, 0, feebadd4) + 164
fef02dff call_function (804770c, 0, 200, 0) + cf
fef00221 PyEval_EvalFrame (847888c, 827b6a0, 8279acc, 0) + 2b11
fef03288 fast_function (84dbb1c, 804784c, 1, 1, 0, feebadd4) + f8
fef02dff call_function (804784c, 0, 43b, 0) + cf
fef00221 PyEval_EvalFrame (85061bc, 833bb20, 8079824, 0) + 2b11
fef03288 fast_function (84dd56c, 804798c, 2, 2, 0, feebadd4) + f8
fef02dff call_function (804798c, 2, de333679, 0) + cf
fef00221 PyEval_EvalFrame (8124e64, 8346920, 8079824, 0) + 2b11
fef03288 fast_function (84ddaac, 8047acc, 0, 0, 0, 0) + f8
fef02dff call_function (8047acc, 0, 322, 0) + cf
fef00221 PyEval_EvalFrame (80af5fc, 8346960, 8079824, 8079824) + 2b11
fef01d23 PyEval_EvalCodeEx (8346960, 8079824, 8079824, 0, 0, 0) + 903
feefd66e PyEval_EvalCode (8346960, 8079824, 8079824, 0) + 22
fef212a1 run_node (8061338, 8047df3, 8079824, 8079824, 8047c3c, 1) + 39
fef20425 PyRun_SimpleFileExFlags (fee037e0, 8047df3, 1, 8047c3c) + 14d
fef26ebb Py_Main (4, 8047d1c, 8047d30, feffb7b4) + 86b
080509bd _start (4, 8047de0, 8047df3, 8047e00, 8047e0d, 0) + 7d

My biggest question is whether there is a way for me to fix or work around
this right now. Our main server is panicking almost every day on an IP
null pointer dereference that has already been fixed in newer builds, which
is why I need this update to go through.

Thanks,

-Kristin

--
Kristin Amundsen-Cubanski
CIO & Board of Directors
The Mommies Network
http://www.themommiesnetwork.org/



