Re: [dm-devel] [PATCH] multipath-tools: unreachable controllers maintainers

2016-06-21 Thread Martin George

On 6/6/2016 1:24 AM, Xose Vazquez Perez wrote:

Replace three unreachable controller maintainers, whose email addresses
bounce, with the default maintainer (Christophe Varoqui).

@@ -790,8 +790,8 @@ static struct hwentry default_hw[] = {
/*
 * NETAPP controller family
 *
-* Maintainer : Dave Wysochanski
-* Mail : dav...@netapp.com
+* Maintainer : Christophe Varoqui
+* Mail : christophe.varo...@opensvc.com
 */
{
.vendor= "NETAPP",
@@ -835,8 +835,8 @@ static struct hwentry default_hw[] = {
/*
 * IBM NSeries (NETAPP) controller family
 *
-* Maintainer : Dave Wysochanski
-* Mail : dav...@netapp.com
+* Maintainer : Christophe Varoqui
+* Mail : christophe.varo...@opensvc.com
 */
{
.vendor= "IBM",



I can take up the maintainer's role here for the NetApp controller 
family. Please let me know if this mail suffices, or whether you need me 
to send a separate patch for this.


Thanks,
-Martin George (mart...@netapp.com)



Re: [dm-devel] [PATCH 0/6] Support DAX for device-mapper dm-linear devices

2016-06-21 Thread Mike Snitzer
On Tue, Jun 21 2016 at 11:44am -0400,
Kani, Toshimitsu  wrote:

> On Tue, 2016-06-21 at 09:41 -0400, Mike Snitzer wrote:
> > On Mon, Jun 20 2016 at  6:22pm -0400,
> > Mike Snitzer  wrote:
> > > 
> > > On Mon, Jun 20 2016 at  5:28pm -0400,
> > > Kani, Toshimitsu  wrote:
> > > 
>  :
> > > Looks good, I folded it in and tested that it works.  Pushed to my 'wip'
> > > branch.
> > > 
> > > No longer seeing any corruption in my test that was using partitions to
> > > span pmem devices with a dm-linear device.
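
For illustration, a dm-linear table of the kind such a test might use
(device names, sizes, and offsets below are made up, not the actual test
setup).  Each line is "<logical start> <length> linear <device> <offset>"
in 512-byte sectors, so this maps a 2 GiB linear device across two 1 GiB
pmem partitions:

0        2097152  linear  /dev/pmem0p1  0
2097152  2097152  linear  /dev/pmem1p1  0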
> > > 
> > > Jens, any chance you'd be open to picking up the first 2 patches in this
> > > series?  Or would you like to see them folded or something different?
> >
> > I'm now wondering if we'd be better off setting a new QUEUE_FLAG_DAX
> > rather than establishing GENHD_FL_DAX on the genhd?
> > 
> > It'd be quite a bit easier for upper layers (e.g. XFS and ext4) to
> > check for a queue flag.
> 
> I think GENHD_FL_DAX is more appropriate since DAX does not use a request
> queue, except for protecting against the underlying device being disabled
> while direct_access() is called (b2e0d1625e19).

The devices in question have a request_queue.  All bio-based devices have
a request_queue.

I don't have a big problem with GENHD_FL_DAX.  Just wanted to point out
that such block device capabilities are generally advertised in terms of
a QUEUE_FLAG.
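
To make the comparison concrete, here is a minimal sketch of what an
upper-layer capability check could look like under each scheme
(QUEUE_FLAG_DAX is hypothetical at this point in the thread, and
GENHD_FL_DAX is the flag proposed in this series):

#include <linux/blkdev.h>
#include <linux/genhd.h>

/* Sketch only: neither flag name is final in this discussion. */
static bool bdev_dax_capable_sketch(struct block_device *bdev)
{
	struct request_queue *q = bdev_get_queue(bdev);

	/*
	 * Queue-flag style: consistent with how other block device
	 * capabilities (discard, write-same, ...) are advertised.
	 */
	if (test_bit(QUEUE_FLAG_DAX, &q->queue_flags))
		return true;

	/* genhd-flag style, as proposed in this patch series. */
	return !!(bdev->bd_disk->flags & GENHD_FL_DAX);
}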
 
> About protecting direct_access, this patch assumes that the underlying
> device cannot be disabled until dtr() is called.  Is this correct?  If not,
> I will need to call dax_map_atomic().

One of the big design considerations for DM is that a DM device can be
suspended (with or without flush) and any new IO will be blocked until
the DM device is resumed.

So ideally DM should be able to have the same capability even if using
DAX.

But that is different from what commit b2e0d1625e19 is addressing.  For
DM, I wouldn't think you'd need the extra protection that
dax_map_atomic() provides, given that the underlying block device's
lifetime is managed via DM core's dm_get_device/dm_put_device (see also:
dm.c:open_table_device/close_table_device).
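
For reference, a minimal sketch of that ctr/dtr pairing, loosely modeled
on dm-linear (the struct and function names below are illustrative, not
actual DM code):

#include <linux/device-mapper.h>
#include <linux/kernel.h>
#include <linux/slab.h>

struct example_c {
	struct dm_dev *dev;	/* underlying device, pinned until dtr() */
	sector_t start;
};

static int example_ctr(struct dm_target *ti, unsigned int argc, char **argv)
{
	struct example_c *ec;
	unsigned long long start;

	if (argc != 2) {
		ti->error = "Invalid argument count";
		return -EINVAL;
	}

	ec = kzalloc(sizeof(*ec), GFP_KERNEL);
	if (!ec)
		return -ENOMEM;

	if (kstrtoull(argv[1], 10, &start)) {
		ti->error = "Invalid start sector";
		goto bad;
	}
	ec->start = start;

	/*
	 * dm_get_device() takes a reference that is only dropped by
	 * dm_put_device() in dtr(), so the underlying block device
	 * cannot go away while this table is live.
	 */
	if (dm_get_device(ti, argv[0], dm_table_get_mode(ti->table), &ec->dev)) {
		ti->error = "Device lookup failed";
		goto bad;
	}

	ti->private = ec;
	return 0;

bad:
	kfree(ec);
	return -EINVAL;
}

static void example_dtr(struct dm_target *ti)
{
	struct example_c *ec = ti->private;

	dm_put_device(ti, ec->dev);
	kfree(ec);
}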



Re: [dm-devel] Possible data corruption with dm-thin

2016-06-21 Thread Zdenek Kabelac

On 21.6.2016 at 09:56, Dennis Yang wrote:

Hi,

We have been dealing with a data corruption issue when running an I/O
test suite we built ourselves against multiple thin devices created on top
of a thin-pool. In our test suites, we create multiple thin devices,
continually write to them, check the file checksums, and delete all files
and issue DISCARDs to reclaim space if no checksum error takes place.

We found that there is one data access pattern that can corrupt the data.
Suppose there are two thin devices A and B, and device A receives
a DISCARD bio to discard physical (pool) block 100. Device A will quiesce
all previous I/O and hold both the virtual and physical data cells before it
actually removes the corresponding data mapping. After the data mapping
is removed, both data cells are released and the DISCARD bio is
passed down to the underlying devices. If device B tries to allocate
a new block at the very same moment, it can reuse block 100, which
was just discarded by device A (assuming a metadata commit has
been triggered, since a block cannot be reused in the same transaction).
In this case, we have a race between the WRITE bio coming from
device B and the DISCARD bio coming from device A. If the WRITE
bio completes before the DISCARD bio, there will be a checksum error
on device B.

So my question is, does dm-thin have any mechanism to eliminate the race
when a discarded block is reused right away by another device?

Any help would be appreciated.
Thanks,



Please provide the kernel version and the versions of the surrounding tools
(OS release version). Also, are you using 'lvm2', or are you driving the
devices directly with 'dmsetup'/ioctls? (In the latter case we would need to
see the exact sequencing of operations.)

Also, please provide a reproducer script.


Regards

Zdenek



[dm-devel] Possible data corruption with dm-thin

2016-06-21 Thread Dennis Yang
Hi,

We have been dealing with a data corruption issue when running an I/O
test suite we built ourselves against multiple thin devices created on top
of a thin-pool. In our test suites, we create multiple thin devices,
continually write to them, check the file checksums, and delete all files
and issue DISCARDs to reclaim space if no checksum error takes place.

We found that there is one data access pattern that can corrupt the data.
Suppose there are two thin devices A and B, and device A receives
a DISCARD bio to discard physical (pool) block 100. Device A will quiesce
all previous I/O and hold both the virtual and physical data cells before it
actually removes the corresponding data mapping. After the data mapping
is removed, both data cells are released and the DISCARD bio is
passed down to the underlying devices. If device B tries to allocate
a new block at the very same moment, it can reuse block 100, which
was just discarded by device A (assuming a metadata commit has
been triggered, since a block cannot be reused in the same transaction).
In this case, we have a race between the WRITE bio coming from
device B and the DISCARD bio coming from device A. If the WRITE
bio completes before the DISCARD bio, there will be a checksum error
on device B.
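
To make the interleaving concrete, here is a toy userspace model of the
race (pthreads; none of this is dm-thin code, and all names are made up).
Thread A frees a block back to the "pool" and then issues a late discard
to it; thread B reallocates the same block and writes to it first, so the
stale discard destroys B's data:

#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 4096

static unsigned char pool_block[BLOCK_SIZE];	/* stands in for pool block 100 */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static int block_reusable;			/* set once A returns the block */

static void *thin_a(void *arg)
{
	/* A removes its mapping and returns block 100 to the pool ... */
	pthread_mutex_lock(&pool_lock);
	block_reusable = 1;
	pthread_mutex_unlock(&pool_lock);

	/* ... but its DISCARD to the data device is still in flight and
	 * only lands after B has already written to the block. */
	usleep(10000);
	memset(pool_block, 0, BLOCK_SIZE);
	return NULL;
}

static void *thin_b(void *arg)
{
	int ready = 0;

	/* B allocates a new block and gets the one A just freed. */
	while (!ready) {
		pthread_mutex_lock(&pool_lock);
		ready = block_reusable;
		pthread_mutex_unlock(&pool_lock);
	}

	/* B's WRITE completes before A's DISCARD. */
	memset(pool_block, 0xAB, BLOCK_SIZE);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, thin_a, NULL);
	pthread_create(&b, NULL, thin_b, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	/* B expects 0xAB; the stale discard zeroed its freshly written data. */
	printf("block is %s\n", pool_block[0] == 0xAB ?
	       "intact" : "corrupted by the stale DISCARD");
	return 0;
}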

So my question is, does dm-thin have any mechanism to eliminate the race
when a discarded block is reused right away by another device?

Any help would be appreciated.
Thanks,

Dennis


-- 
Dennis Yang
QNAP Systems, Inc.
Skype: qnap.dennis.yang
Email: dennisy...@qnap.com
Tel: (+886)-2-2393-5152 ext. 15018
Address: 13F., No.56, Sec. 1, Xinsheng S. Rd., Zhongzheng Dist., Taipei
City, Taiwan

Re: [dm-devel] [PATCH 2/3] block: require write_same and discard requests align to logical block size

2016-06-21 Thread Bart Van Assche

On 06/17/2016 03:19 AM, Darrick J. Wong wrote:

Make sure that the offset and length arguments that we're using to
construct WRITE SAME and DISCARD requests are actually aligned to the
logical block size.  Failure to do this causes errors elsewhere in the
block layer or the SCSI layer, because disks don't support partial
logical block writes.

Signed-off-by: Darrick J. Wong 
Reviewed-by: Christoph Hellwig 
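
For reference, a sketch of the kind of alignment check the description
refers to (not necessarily the exact code in the patch;
bdev_logical_block_size() is the existing helper for the logical block
size):

#include <linux/blkdev.h>

static int blk_check_lbs_alignment(struct block_device *bdev,
				   sector_t sector, sector_t nr_sects)
{
	/* sector and nr_sects are in 512-byte units, hence the >> 9. */
	unsigned int bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;

	/* Reject offsets or lengths that straddle a logical block. */
	if ((sector | nr_sects) & bs_mask)
		return -EINVAL;

	return 0;
}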


Reviewed-by: Bart Van Assche 
