On Thu, Dec 17, 2015 at 1:19 PM, Loic Dachary <l...@dachary.org> wrote:
> Hi Ilya,
>
> I'm seeing a partprobe failure right after a disk was zapped with sgdisk 
> --clear --mbrtogpt -- /dev/vdb:
>
> partprobe /dev/vdb failed : Error: Partition(s) 1 on /dev/vdb have been 
> written, but we have been unable to inform the kernel of the change, probably 
> because it/they are in use. As a result, the old partition(s) will remain in 
> use. You should reboot now before making further changes.
>
> Waiting 60 seconds (see the log below) and trying again succeeds. The
> partprobe call is guarded by udevadm settle to prevent udev actions from
> racing, and nothing else is running on the machine.
>
> Any idea how that could happen?
>
> Cheers
>
> 2015-12-17 11:46:10,356.356 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:get_dm_uuid
>  /dev/vdb uuid path is /sys/dev/block/253:16/dm/uuid
> 2015-12-17 11:46:10,357.357 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:Zapping
>  partition table on /dev/vdb
> 2015-12-17 11:46:10,358.358 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>  command: /usr/sbin/sgdisk --zap-all -- /dev/vdb
> 2015-12-17 11:46:10,365.365 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Caution:
>  invalid backup GPT header, but valid main header; regenerating
> 2015-12-17 11:46:10,366.366 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:backup 
> header from main header.
> 2015-12-17 11:46:10,366.366 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
> 2015-12-17 11:46:10,366.366 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning!
>  Main and backup partition tables differ! Use the 'c' and 'e' options
> 2015-12-17 11:46:10,367.367 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:on the 
> recovery & transformation menu to examine the two tables.
> 2015-12-17 11:46:10,367.367 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
> 2015-12-17 11:46:10,367.367 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning!
>  One or more CRCs don't match. You should repair the disk!
> 2015-12-17 11:46:10,368.368 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:
> 2015-12-17 11:46:11,413.413 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:****************************************************************************
> 2015-12-17 11:46:11,414.414 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Caution:
>  Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
> 2015-12-17 11:46:11,414.414 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:verification
>  and recovery are STRONGLY recommended.
> 2015-12-17 11:46:11,414.414 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:****************************************************************************
> 2015-12-17 11:46:11,415.415 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning:
>  The kernel is still using the old partition table.
> 2015-12-17 11:46:11,415.415 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The new 
> table will be used at the next reboot.
> 2015-12-17 11:46:11,416.416 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:GPT 
> data structures destroyed! You may now partition the disk using fdisk or
> 2015-12-17 11:46:11,416.416 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:other 
> utilities.
> 2015-12-17 11:46:11,416.416 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>  command: /usr/sbin/sgdisk --clear --mbrtogpt -- /dev/vdb
> 2015-12-17 11:46:12,504.504 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Creating
>  new GPT entries.
> 2015-12-17 11:46:12,505.505 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:Warning:
>  The kernel is still using the old partition table.
> 2015-12-17 11:46:12,505.505 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The new 
> table will be used at the next reboot.
> 2015-12-17 11:46:12,505.505 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:The 
> operation has completed successfully.
> 2015-12-17 11:46:12,506.506 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:Calling
>  partprobe on zapped device /dev/vdb
> 2015-12-17 11:46:12,507.507 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>  command: /usr/bin/udevadm settle --timeout=600
> 2015-12-17 11:46:15,427.427 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>  command: /usr/sbin/partprobe /dev/vdb
> 2015-12-17 11:46:16,860.860 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:DEBUG:ceph-disk:partprobe
>  /dev/vdb failed : Error: Partition(s) 1 on /dev/vdb have been written, but 
> we have been unable to inform the kernel of the change, probably because 
> it/they are in use.  As a result, the old partition(s) will remain in use.  
> You should reboot now before making further changes.
> 2015-12-17 11:46:16,860.860 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:(ignored,
>  waiting 60s)
> 2015-12-17 11:47:16,925.925 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>  command: /usr/bin/udevadm settle --timeout=600
> 2015-12-17 11:47:19,681.681 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>  command: /usr/sbin/partprobe /dev/vdb
> 2015-12-17 11:47:20,125.125 
> INFO:tasks.workunit.client.0.target167114233028.stderr:DEBUG:CephDisk:INFO:ceph-disk:Running
>  command: /usr/bin/udevadm settle --timeout=600

Well, evidently something was using that partition.  This is on
OpenStack, right?  That probably makes it hard to debug, but trying to
reproduce it and doing some tracing is probably the only way to find
out what.
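
As a starting point for that tracing, here is a minimal sketch that
polls /proc for processes holding the disk or one of its partitions
open.  The helper name find_openers and the proc_root parameter are
made up for illustration.  It assumes partitions are named by appending
digits to the disk name (vdb1, not nvme0n1p1), usually needs root to
read other processes' fd tables, and cannot see in-kernel users such as
mounts or device-mapper holders.  A short-lived opener can also slip
between two polls.

```python
import os


def find_openers(dev, proc_root="/proc"):
    """Return sorted (pid, command, target) tuples for every process
    that has `dev` or one of its numbered partitions open."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # fd was closed while we were looking
            if not target.startswith(dev):
                continue
            suffix = target[len(dev):]
            # Match the disk itself or a partition (assumed digit suffix).
            if suffix == "" or suffix.isdigit():
                try:
                    with open(os.path.join(proc_root, entry, "comm")) as f:
                        comm = f.read().strip()
                except OSError:
                    comm = "?"
                found.append((int(entry), comm, target))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm, target in find_openers("/dev/vdb"):
        print(pid, comm, target)
```

Running this in a tight loop between the sgdisk and partprobe calls
might catch whatever has the device open.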

udevadm settle doesn't guarantee that a device (or one of its
partitions) isn't going to be busy - it just waits for udevd to empty
its queue.  Both sgdisk invocations also complained about a busy device
("The kernel is still using the old partition table").  Is it possible
that something external to udev was doing something with it?
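
Since settle only drains the udev queue, a caller that wants to survive
a transient busy device has to retry.  The sketch below is one way to do
that, not what ceph-disk does: the function name
reread_partition_table, the attempt count and the exponential backoff
are all assumptions chosen for illustration.  The run and sleep
parameters exist only so the logic can be exercised without a real
device.

```python
import subprocess
import time


def reread_partition_table(dev, attempts=5, delay=1.0,
                           run=subprocess.run, sleep=time.sleep):
    """Run `udevadm settle` followed by `partprobe dev`, retrying on
    failure.  Returns the 1-based number of the attempt that succeeded
    and raises RuntimeError if every attempt fails."""
    last_error = ""
    for attempt in range(1, attempts + 1):
        # Wait for udevd to empty its event queue.  This does not prove
        # the device is idle, so partprobe can still fail as busy.
        run(["udevadm", "settle", "--timeout=600"], check=False)
        result = run(["partprobe", dev],
                     capture_output=True, text=True, check=False)
        if result.returncode == 0:
            return attempt
        last_error = result.stderr.strip()
        if attempt < attempts:
            # Back off: delay, 2*delay, 4*delay, ...
            sleep(delay * 2 ** (attempt - 1))
    raise RuntimeError(
        "partprobe %s failed after %d attempts: %s"
        % (dev, attempts, last_error))
```

A retry hides the symptom without explaining it, so it is worth logging
each failed attempt while the real holder of the device is still
unknown.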

Thanks,

                Ilya