Re: Strange data corruption issue with gluster (libgfapi) and ZFS

2020-02-28 Thread Stefan Ring
On Fri, Feb 28, 2020 at 12:10 PM Kevin Wolf wrote:
>
> This sounds almost like two other bugs we got fixed recently (in the
> QEMU file-posix driver and in the XFS kernel driver) where two writes
> extending the file size were in flight in parallel, but if the shorter
> one completed last, instead

Re: Strange data corruption issue with gluster (libgfapi) and ZFS

2020-02-28 Thread Kevin Wolf
On 27.02.2020 at 23:25, Stefan Ring wrote:
> On Thu, Feb 27, 2020 at 10:12 PM Stefan Ring wrote:
> >
> > Victory! I have a reproducer in the form of a plain C libgfapi client.
> >
> > However, I have not been able to trigger corruption by just executing
> > the simple pattern in an

Re: Strange data corruption issue with gluster (libgfapi) and ZFS

2020-02-27 Thread Stefan Ring
On Thu, Feb 27, 2020 at 10:12 PM Stefan Ring wrote:
> Victory! I have a reproducer in the form of a plain C libgfapi client.
>
> However, I have not been able to trigger corruption by just executing
> the simple pattern in an artificial way. Currently, I need to feed my
> reproducer 2 GB of data

Re: Strange data corruption issue with gluster (libgfapi) and ZFS

2020-02-27 Thread Stefan Ring
On Tue, Feb 25, 2020 at 3:12 PM Stefan Ring wrote:
>
> I find many instances with the following pattern:
>
> current file length (= max position + size written): p
> write request n writes from (p + hole_size), thus leaving a hole
> request n+1 writes exactly hole_size, starting from p, thus

Re: Strange data corruption issue with gluster (libgfapi) and ZFS

2020-02-25 Thread Stefan Ring
On Mon, Feb 24, 2020 at 1:35 PM Stefan Ring wrote:
>
> What I plan to do next is look at the block ranges being written in
> the hope of finding overlaps there.

Status update: I still have not found out what is actually causing
this. I have not found concurrent writes to overlapping file areas.

Re: Strange data corruption issue with gluster (libgfapi) and ZFS

2020-02-24 Thread Stefan Ring
On Mon, Feb 24, 2020 at 2:27 PM Kevin Wolf wrote:
> > > There are quite a few machines running on this host, and we have not
> > > experienced other problems so far. So right now, only ZFS is able to
> > > trigger this for some reason. The guest has 8 virtual cores. I also
> > > tried writing

Re: Strange data corruption issue with gluster (libgfapi) and ZFS

2020-02-24 Thread Stefan Ring
On Thu, Feb 20, 2020 at 10:19 AM Stefan Ring wrote:
>
> Hi,
>
> I have a very curious problem on an oVirt-like virtualization host
> whose storage lives on gluster (as qcow2).
>
> The problem is that of the writes done by ZFS, whose sizes according
> to blktrace are a mixture of 8, 16, 24, ...

Re: Strange data corruption issue with gluster (libgfapi) and ZFS

2020-02-24 Thread Stefan Ring
On Mon, Feb 24, 2020 at 1:35 PM Stefan Ring wrote:
>
> [...]. As already stated in
> the original post, the problem only occurs with multiple parallel
> write requests happening.

Actually I did not state that. Anyway, the corruption does not happen
when I restrict the ZFS io scheduler to only 1

Re: Strange data corruption issue with gluster (libgfapi) and ZFS

2020-02-24 Thread Kevin Wolf
On 24.02.2020 at 13:35, Stefan Ring wrote:
> On Thu, Feb 20, 2020 at 10:19 AM Stefan Ring wrote:
> >
> > Hi,
> >
> > I have a very curious problem on an oVirt-like virtualization host
> > whose storage lives on gluster (as qcow2).
> >
> > The problem is that of the writes done by ZFS,