Re: [zones-discuss] S11 zone bug with migrated rpool/export ZFS

2012-11-07 Thread Mike Gerdts

On 01/03/12 14:30, Mike Gerdts wrote:

On Tue 03 Jan 2012 at 12:23PM, John D Groenveld wrote:

In message 201201031705.q03h5uwi000...@elvis.arl.psu.edu, John D Groenveld writes:

My nightly backup consists of zone shutdown, detach, snapshot,
attach, boot.

FWIW, this is one of those cases where 'zoneadm attach -F' would
probably be reasonable.


Shortly after this thread was active, I made some changes that are now
available in Solaris 11.1.  Now, the attach in the cycle that John mentioned
above takes about the same amount of time with or without the -F option.
This is the 91 percent decrease in zone attach time that is mentioned in the
Solaris 11.1 What's New document:


http://www.oracle.com/technetwork/server-storage/solaris11/documentation/solaris11-1-whatsnew-1732377.pdf
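
The shutdown/detach/snapshot/attach/boot cycle in question looks roughly like
this (a sketch only; the zone name and snapshot naming are assumptions taken
from the logs elsewhere in this thread):

--%--
zoneadm -z search-1 shutdown
zoneadm -z search-1 detach
# snapshot the detached zone's dataset tree for backup
zfs snapshot -r rpool/var/zones/search-1@backup-$(date -u +%Y%m%dT%H%M%SZ)
zoneadm -z search-1 attach     # in 11.1, about as fast as 'attach -F'
zoneadm -z search-1 boot
--%--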

--
Mike Gerdts
Solaris Core OS / Zones http://blogs.oracle.com/zoneszone/



Re: [zones-discuss] S11 zone bug with migrated rpool/export ZFS

2012-01-05 Thread John D Groenveld
In message 20120103234407.gq24...@ultra24.us.oracle.com, Mike Gerdts writes:
  - The disk is busy doing other things such that these reads from
the zone's /export/home are pretty slow to return?

I was still sending the previous night's ZFS snapshots over the WAN.

In any case, please let me know if you start to see this problem more
regularly.  I've opened a somewhat low priority bug:

7126819 migrate_export can get EBUSY while unmounting zone's rpool/export/home dataset

If it repeats for you I'll bump the priority up.  If a fix is important
to you, please open a service request and ask for an escalation to be
filed.

Thank you for the BugId.

John
groenv...@acm.org



Re: [zones-discuss] S11 zone bug with migrated rpool/export ZFS

2012-01-03 Thread John D Groenveld
In message 201201031705.q03h5uwi000...@elvis.arl.psu.edu, John D Groenveld writes:
My nightly backup consists of zone shutdown, detach, snapshot,
attach, boot.

Here's the output from that cron:
Progress being logged to /var/log/zones/zoneadm.20120101T021243Z.search-1.attach
Attaching...
Installing: Using existing zone boot environment
  Zone BE root dataset: rpool/var/zones/search-1/rpool/ROOT/zbe-3
 Cache: Using /var/pkg/publisher.
Updating image format
  Updating non-global zone: Linking to image /.
  Updating non-global zone: Auditing packages.
No updates necessary for this image.

  Updating non-global zone: Zone updated.
Result: Attach Succeeded.
Log saved in non-global zone as 
/var/opt/zones/search-1/root/var/log/zones/zoneadm.20120101T021243Z.search-1.attach

Progress being logged to /var/log/zones/zoneadm.20120102T021110Z.search-1.attach
Attaching...
Installing: Using existing zone boot environment
Manual migration of export required.  Potential conflicts in
/var/opt/zones/search-1/root/export and rpool/var/zones/search-1/rpool/export.
  Zone BE root dataset: rpool/var/zones/search-1/rpool/ROOT/zbe-3
 Cache: Using /var/pkg/publisher.
Updating image format
  Updating non-global zone: Linking to image /.
  Updating non-global zone: Auditing packages.
No updates necessary for this image.

  Updating non-global zone: Zone updated.
Result: Attach Succeeded.
Log saved in non-global zone as 
/var/opt/zones/search-1/root/var/log/zones/zoneadm.20120102T021110Z.search-1.attach

Progress being logged to /var/log/zones/zoneadm.20120103T022057Z.search-1.attach
Attaching...
Installing: Using existing zone boot environment
ERROR: Error: Command zfs mount -o nodevices,mountpoint=/tmp/tmp.TcaOah/export 
rpool/var/zones/search-1/rpool/export exited with status 1
ERROR: ZFS temporary mount of rpool/var/zones/search-1/rpool/export on 
/tmp/tmp.TcaOah/export failed.
ERROR: Error: migration of /export from active boot environment to the zone's
rpool/export dataset failed.  Manual cleanup required.
  Zone BE root dataset: rpool/var/zones/search-1/rpool/ROOT/zbe-3
 Cache: Using /var/pkg/publisher.
Updating image format
  Updating non-global zone: Linking to image /.
  Updating non-global zone: Auditing packages.
No updates necessary for this image.

  Updating non-global zone: Zone updated.
Result: Attach Succeeded.
Log saved in non-global zone as 
/var/opt/zones/search-1/root/var/log/zones/zoneadm.20120103T022057Z.search-1.attach


Lots of evil in attach log:
[Sun Jan  1 21:11:30 EST 2012] Mounting rpool/var/zones/search-1/rpool/export/home at /tmp/tmp.7kayqJ/export/home with ZFS temporary mount
cannot unmount '/tmp/tmp.7kayqJ/export/home': Device busy
cannot unmount '/tmp/tmp.7kayqJ/export': Device busy
rmdir: directory /tmp/tmp.7kayqJ: Directory not empty

[Mon Jan  2 21:21:37 EST 2012] Mounting rpool/var/zones/search-1/rpool/export at /tmp/tmp.TcaOah/export with ZFS temporary mount
cannot mount 'rpool/var/zones/search-1/rpool/export': filesystem already mounted
[Mon Jan  2 21:21:37 EST 2012] ERROR: Error: Command zfs mount -o nodevices,mountpoint=/tmp/tmp.TcaOah/export rpool/var/zones/search-1/rpool/export exited with status 1
[Mon Jan  2 21:21:37 EST 2012] ERROR: ZFS temporary mount of rpool/var/zones/search-1/rpool/export on /tmp/tmp.TcaOah/export failed.
rmdir: directory /tmp/tmp.TcaOah: Directory not empty
[Mon Jan  2 21:21:37 EST 2012] ERROR: Error: migration of /export from active boot environment to the zone's rpool/export dataset failed.  Manual cleanup required.

John
groenv...@acm.org


Re: [zones-discuss] S11 zone bug with migrated rpool/export ZFS

2012-01-03 Thread Mike Gerdts
On Tue 03 Jan 2012 at 12:23PM, John D Groenveld wrote:
 In message 201201031705.q03h5uwi000...@elvis.arl.psu.edu, John D Groenveld writes:
 My nightly backup consists of zone shutdown, detach, snapshot,
 attach, boot.

FWIW, this is one of those cases where 'zoneadm attach -F' would
probably be reasonable.  

 
 Here's the output from that cron:
 Progress being logged to 
 /var/log/zones/zoneadm.20120101T021243Z.search-1.attach
 Attaching...
 Installing: Using existing zone boot environment
   Zone BE root dataset: rpool/var/zones/search-1/rpool/ROOT/zbe-3
  Cache: Using /var/pkg/publisher.
 Updating image format
   Updating non-global zone: Linking to image /.
   Updating non-global zone: Auditing packages.
 No updates necessary for this image.
 
   Updating non-global zone: Zone updated.
 Result: Attach Succeeded.
 Log saved in non-global zone as 
 /var/opt/zones/search-1/root/var/log/zones/zoneadm.20120101T021243Z.search-1.attach
 

Above was your last successful attach.  The failed attach starts here:

 Progress being logged to 
 /var/log/zones/zoneadm.20120102T021110Z.search-1.attach
 Attaching...
 Installing: Using existing zone boot environment
 Manual migration of export required.  Potential conflicts in
 /var/opt/zones/search-1/root/export and rpool/var/zones/search-1/rpool/export.

This error message is saying that it found two things that are supposed
to be mounted at /export.  Without understanding your zone configuration
and dataset layout, it is kind of hard to know exactly what is going on.
Can you provide the following:

%---
zfs list -o name,mountpoint,canmount,mounted -r rpool/var/zones/search-1

zonecfg -z search-1 info dataset
for ds in $(zonecfg -z search-1 info dataset | nawk '$1 == "name:" {print $2}')
do
    echo "Dataset: $ds"
    zfs list -o name,mountpoint,canmount,mounted,zone $ds
done

zonecfg -z search-1 info fs
%---

Also, any details about changes in the zone configuration and/or package
updates since the previous successful backup would be helpful.

-- 
Mike Gerdts
Solaris Core OS / Zones http://blogs.oracle.com/zoneszone/


Re: [zones-discuss] S11 zone bug with migrated rpool/export ZFS

2012-01-03 Thread John D Groenveld
In message 20120103203031.gl24...@ultra24.us.oracle.com, Mike Gerdts writes:
Can you provide the following:

%---
zfs list -o name,mountpoint,canmount,mounted -r rpool/var/zones/search-1

# zfs list -o name,mountpoint,canmount,mounted -r rpool/var/zones/search-1
NAME                                        MOUNTPOINT                                CANMOUNT  MOUNTED
rpool/var/zones/search-1                    /var/opt/zones/search-1                   on        yes
rpool/var/zones/search-1/rpool              /var/opt/zones/search-1/root/rpool        on        yes
rpool/var/zones/search-1/rpool/ROOT         legacy                                    noauto    no
rpool/var/zones/search-1/rpool/ROOT/zbe-3   /var/opt/zones/search-1/root              noauto    yes
rpool/var/zones/search-1/rpool/export       /var/opt/zones/search-1/root/export       on        yes
rpool/var/zones/search-1/rpool/export/home  /var/opt/zones/search-1/root/export/home  on        yes

I couldn't figure out why, from within the zone, zfs mount was complaining
that the export and export/home datasets were busy.
Then from the global zone I noticed that rpool/var/zones/search-1/rpool/export
and export/home still had the temporary mountpoint, which was completely
unexpected.
After I halted and detached my zone, unmounted the datasets, and attached
the zone again, the mountpoints corrected themselves.
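
In commands, the recovery was roughly this (a sketch based on the description
above, using the dataset names from the zfs list output; not an exact
transcript):

--%--
zoneadm -z search-1 halt
zoneadm -z search-1 detach
zfs unmount rpool/var/zones/search-1/rpool/export/home
zfs unmount rpool/var/zones/search-1/rpool/export
zoneadm -z search-1 attach
--%--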

zonecfg -z search-1 info dataset
for ds in $(zonecfg -z search-1 info dataset | nawk '$1 == "name:" {print $2}')
do
   echo "Dataset: $ds"
   zfs list -o name,mountpoint,canmount,mounted,zone $ds
done
zonecfg -z search-1 info fs

# zonecfg -z search-1 info
zonename: search-1
zonepath: /var/opt/zones/search-1
brand: solaris
autoboot: true
bootargs: -m verbose
file-mac-profile:
pool:
limitpriv:
scheduling-class:
ip-type: exclusive
hostid:
fs-allowed:
fs:
    dir: /ematrix
    special: tank/ematrix
    raw not specified
    type: zfs
    options: []
net:
    address not specified
    allowed-address not specified
    configure-allowed-address: true
    physical: vnic3
    defrouter not specified
capped-memory:
    physical: 3G

Also, any details about changes in the zone configuration and/or package
updates since the previous successful backup would be helpful.

I made no changes.
The other zones on the system had no issues.

John
groenv...@acm.org


Re: [zones-discuss] S11 zone bug with migrated rpool/export ZFS

2012-01-03 Thread John D Groenveld
In message 201201031723.q03hnbfr001...@elvis.arl.psu.edu, John D Groenveld 
writes:
Lots of evil in attach log:
[Sun Jan  1 21:11:30 EST 2012] Mounting 
rpool/var/zones/search-1/rpool/export/home at /tmp/tmp.7kayqJ/export/home with 
ZFS temporary mount
cannot unmount '/tmp/tmp.7kayqJ/export/home': Device busy
cannot unmount '/tmp/tmp.7kayqJ/export': Device busy
rmdir: directory /tmp/tmp.7kayqJ: Directory not empty

Backgrounded processes performing IO in those directories?
What is the purpose of these temporary mounts?

John
groenv...@acm.org


Re: [zones-discuss] S11 zone bug with migrated rpool/export ZFS

2012-01-03 Thread Mike Gerdts
On Tue 03 Jan 2012 at 04:02PM, John D Groenveld wrote:
 In message 20120103203031.gl24...@ultra24.us.oracle.com, Mike Gerdts writes:
 Can you provide the following:
 
 %---
 zfs list -o name,mountpoint,canmount,mounted -r rpool/var/zones/search-1
 
 # zfs list -o name,mountpoint,canmount,mounted -r rpool/var/zones/search-1
 NAME                                        MOUNTPOINT                                CANMOUNT  MOUNTED
 rpool/var/zones/search-1                    /var/opt/zones/search-1                   on        yes
 rpool/var/zones/search-1/rpool              /var/opt/zones/search-1/root/rpool        on        yes
 rpool/var/zones/search-1/rpool/ROOT         legacy                                    noauto    no
 rpool/var/zones/search-1/rpool/ROOT/zbe-3   /var/opt/zones/search-1/root              noauto    yes
 rpool/var/zones/search-1/rpool/export       /var/opt/zones/search-1/root/export       on        yes
 rpool/var/zones/search-1/rpool/export/home  /var/opt/zones/search-1/root/export/home  on        yes
 
 I couldn't figure why from within the zone zfs mount was complaining
 that the export and export/home datasets were busy.
 Then from global I noticed rpool/var/zones/search-1/rpool/export and
 export/home had the temporary mountpoint which was completely
 unexpected.
 After I halt'd and detach'd my zone, umount'd the datasets and
 attach'd the zone the mountpoints corrected themselves.

It kinda sounds like something from the global zone had stepped into
some filesystems that were temporarily mounted during an attach process.
This is backed up by the evil in the attach log:

   Lots of evil in attach log:
   [Sun Jan  1 21:11:30 EST 2012] Mounting 
rpool/var/zones/search-1/rpool/export/home at /tmp/tmp.7kayqJ/export/home with 
ZFS temporary mount
   cannot unmount '/tmp/tmp.7kayqJ/export/home': Device busy
   cannot unmount '/tmp/tmp.7kayqJ/export': Device busy
   rmdir: directory /tmp/tmp.7kayqJ: Directory not empty

Do you by any chance have a /tmp cleaner (or something else that does a
find or du) running at roughly the same time?  If so, the -mount option
to find or the -d option to du may help prevent a recurrence.
/tmp/tmp.7kayqJ should have been created rwx by root only.
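
For example, a sweep along these lines stays within /tmp itself rather than
descending into filesystems mounted underneath it (a sketch; the age test and
the rm are arbitrary choices):

--%--
# clean old files in /tmp without crossing into other mounted filesystems
find /tmp -mount -type f -mtime +7 -exec rm -f {} +
# likewise, du -d does not cross mount points while sizing /tmp
du -kd /tmp
--%--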

 
 zonecfg -z search-1 info dataset
 for ds in $(zonecfg -z search-1 info dataset | nawk '$1 == "name:" {print $2}')
 do
  echo "Dataset: $ds"
  zfs list -o name,mountpoint,canmount,mounted,zone $ds
 done
 zonecfg -z search-1 info fs

Going back to the beginning of the thread I see you had already given
this info.  Sorry 'bout that.

 
 # zonecfg -z search-1 info
 zonename: search-1
 zonepath: /var/opt/zones/search-1
 brand: solaris
 autoboot: true
 bootargs: -m verbose
 file-mac-profile:
 pool:
 limitpriv:
 scheduling-class:
 ip-type: exclusive
 hostid:
 fs-allowed:
 fs:
     dir: /ematrix
     special: tank/ematrix
     raw not specified
     type: zfs
     options: []
 net:
     address not specified
     allowed-address not specified
     configure-allowed-address: true
     physical: vnic3
     defrouter not specified
 capped-memory:
     physical: 3G
 
 Also, any details about changes in the zone configuration and/or package
 updates since the previous successful backup would be helpful.
 
 I made no changes.
 The other zones on the system had no issues.

It's starting to look like a race with something else on the system.

If there is something beyond your control that likes to walk through
/tmp as root, you could probably add the following to the cron job.

--%--
# Point TMPDIR at a private tmpfs so that anything sweeping /tmp as root
# cannot wander into the temporary mounts created during attach.
mkdir /var/attachtmp
mount -F tmpfs - /var/attachtmp
chmod 1777 /var/attachtmp
export TMPDIR=/var/attachtmp

# Do the stuff you normally do here

# Tear the scratch area back down afterwards.
unset TMPDIR
umount /var/attachtmp
rmdir /var/attachtmp
--%--

Adjust as your environment requires.

-- 
Mike Gerdts
Solaris Core OS / Zones http://blogs.oracle.com/zoneszone/


Re: [zones-discuss] S11 zone bug with migrated rpool/export ZFS

2012-01-03 Thread John D Groenveld
In message 20120103220311.go24...@ultra24.us.oracle.com, Mike Gerdts writes:
It kinda sounds like something from the global zone had stepped into
some filesystems that were temporarily mounted during an attach process.
This is backed up by the evil in the attach log:

   Lots of evil in attach log:
   [Sun Jan  1 21:11:30 EST 2012] Mounting 
 rpool/var/zones/search-1/rpool/export/home at /tmp/tmp.7kayqJ/export/home 
 with ZFS temporary mount
   cannot unmount '/tmp/tmp.7kayqJ/export/home': Device busy
   cannot unmount '/tmp/tmp.7kayqJ/export': Device busy
   rmdir: directory /tmp/tmp.7kayqJ: Directory not empty

Do you by any chance have a /tmp cleaner (or something else that does a
find or du) running at roughly the same time?  If so, the -mount option
to find or the -d option to du may be a help to prevent recurrence.
/tmp/tmp.7kayqJ should have been created rwx by root only.

Besides my backup cron, I don't run any custom bits in the global zone.
Nothing jumps out among the stock services that might be willy-nilly
performing I/O in /tmp.

Why shouldn't zoneadm's export migration code umount -f these mounts
once the migration has been performed?
I think that's preferable to skipping the attach checks and balances
with attach -F.

John
groenv...@acm.org


Re: [zones-discuss] S11 zone bug with migrated rpool/export ZFS

2012-01-03 Thread Mike Gerdts
On Tue 03 Jan 2012 at 05:51PM, John D Groenveld wrote:
 In message 20120103220311.go24...@ultra24.us.oracle.com, Mike Gerdts writes:
 It kinda sounds like something from the global zone had stepped into
 some filesystems that were temporarily mounted during an attach process.
 This is backed up by the evil in the attach log:
 
Lots of evil in attach log:
[Sun Jan  1 21:11:30 EST 2012] Mounting 
  rpool/var/zones/search-1/rpool/export/home at /tmp/tmp.7kayqJ/export/home 
  with ZFS temporary mount
cannot unmount '/tmp/tmp.7kayqJ/export/home': Device busy
cannot unmount '/tmp/tmp.7kayqJ/export': Device busy
rmdir: directory /tmp/tmp.7kayqJ: Directory not empty
 
 Do you by any chance have a /tmp cleaner (or something else that does a
 find or du) running at roughly the same time?  If so, the -mount option
 to find or the -d option to du may be a help to prevent recurrence.
 /tmp/tmp.7kayqJ should have been created rwx by root only.
 
 Besides my backup cron, I don't run any custom bits in global.
 Nothing jumps out among the stock services that might be willy nilly
 performing IO in /tmp.
 
 Why shouldn't zoneadm's migration update umount -f these mounts
 once the migration has been performed?
 I think that's preferred to skipping the attach checks and balances
 with attach -F.

In most cases, the use of umount -f has been avoided in this code as it
is more likely to hide some other problem that exists.  I think I may
see that other problem, but it would require a bit of investigation to
know for sure.  By any chance, is either of the following true?

  - The zone's /export/home file system has more files in it than it
    used to.  In particular, are there now enough files in it that find
    will generate more than 5120 bytes of output, whereas before that
    wasn't the case?

  - The disk is busy doing other things, such that these reads from
    the zone's /export/home are pretty slow to return?
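
The EBUSY itself is the ordinary "filesystem still in use" condition; as a
standalone illustration, unrelated to the migrate code specifically:

--%--
mkdir -p /var/tmp/demo.mnt
mount -F tmpfs - /var/tmp/demo.mnt
( cd /var/tmp/demo.mnt && sleep 60 ) &   # holds the filesystem busy
umount /var/tmp/demo.mnt                 # fails: Device busy until the subshell exits
--%--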

In any case, please let me know if you start to see this problem more
regularly.  I've opened a somewhat low priority bug:

7126819 migrate_export can get EBUSY while unmounting zone's rpool/export/home dataset

If it repeats for you I'll bump the priority up.  If a fix is important
to you, please open a service request and ask for an escalation to be
filed.

-- 
Mike Gerdts
Solaris Core OS / Zones http://blogs.oracle.com/zoneszone/