Re: [developer] Manual dedup, aka --reflink support.

2020-06-17 Thread Matthew Ahrens via openzfs-developer
On Wed, Jun 17, 2020 at 3:47 PM Pawel Jakub Dawidek wrote:
> On 6/15/20 09:18, Matthew Ahrens via openzfs-developer wrote:
> > However, even so, looking up in the BRT for every single zio_free()
> > would be a substantial cost. [...]
>
> After giving it some more thought we could avoid that cost

Re: [developer] Manual dedup, aka --reflink support.

2020-06-17 Thread Pawel Jakub Dawidek
On 6/15/20 09:18, Matthew Ahrens via openzfs-developer wrote:
> However, even so, looking up in the BRT for every single zio_free()
> would be a substantial cost. [...]

After giving it some more thought we could avoid that cost by leveraging the fact that we operate on offsets within VDEVs. We

Re: [developer] Manual dedup, aka --reflink support.

2020-06-16 Thread Sean Fagan
On Jun 15, 2020, at 1:07 PM, Pawel Jakub Dawidek wrote:
> Exactly. Plus on-disk BRT entry size is 24 bytes and in-memory is 80
> bytes for now (vs. 392 bytes of DDT entry). Smaller structure sizes just
> delay the problem, of course, but sorting can be a big win.
>
> Also note that this table
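To put the quoted sizes side by side, here is a small arithmetic sketch. The per-entry byte counts are the ones given in the message; the entry count is purely hypothetical, and the message does not say whether the 392-byte DDT figure is an on-disk or in-memory size, so the ratio below is only the same rough comparison the message makes.

```python
# Per-entry sizes quoted in the thread, in bytes.
BRT_ON_DISK = 24
BRT_IN_MEMORY = 80
DDT_ENTRY = 392  # the thread does not say if this is on-disk or in-memory

def table_bytes(entries, entry_size):
    """Total bytes for a table holding `entries` fixed-size entries."""
    return entries * entry_size

entries = 1_000_000  # hypothetical table size, for illustration only

brt_mem = table_bytes(entries, BRT_IN_MEMORY)
brt_disk = table_bytes(entries, BRT_ON_DISK)
ddt = table_bytes(entries, DDT_ENTRY)
ratio = DDT_ENTRY / BRT_IN_MEMORY

print(f"BRT in memory: {brt_mem} bytes, BRT on disk: {brt_disk} bytes")
print(f"DDT: {ddt} bytes, DDT/BRT in-memory ratio: {ratio:.1f}")
```

The smaller entry shrinks the table by a constant factor only, which matches the remark that it merely delays the problem.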

Re: [developer] Manual dedup, aka --reflink support.

2020-06-15 Thread Pawel Jakub Dawidek
Thank you Matt for your comments, answers inline.

On 6/15/20 09:18, Matthew Ahrens via openzfs-developer wrote:
> Cool!  Couple of questions/observations:
>
> Do I understand correctly that the new data structure you're proposing
> (the BRT) maps from DVA to refcount?

Correct.

> If so, and we

Re: [developer] Manual dedup, aka --reflink support.

2020-06-15 Thread Jason King
’ for the space of the file), but it seems like it’d be good to confirm this is the expectation.

From: Allan Jude
Reply: openzfs-developer
Date: June 15, 2020 at 1:38:49 PM
To: developer@lists.open-zfs.org
Subject: Re: [developer] Manual dedup, aka --reflink support.

If we used a bit

Re: [developer] Manual dedup, aka --reflink support.

2020-06-15 Thread Allan Jude
If we used a bit in the block pointer, similar to the one we have for dedup, we would only need to examine the BRT in zio_free() if the BP had the bit set. Of course the problem with that idea is that the 'original' file won't have that bit set in its BP, so you'd need to search the BRT to ensure
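A toy model may help show both halves of this point. It is a sketch under stated assumptions, not ZFS code: `BlockPointer`, its `cloned` flag, `free_block`, and the dict-based table are all hypothetical names, and the table is assumed to count extra references per DVA.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BlockPointer:
    dva: tuple            # (vdev, offset); a hypothetical simplification
    cloned: bool = False  # hypothetical flag bit, set only on the cloned BP

def free_block(bp, brt, allocated):
    """Free one reference; return True if the block was actually released.

    The table is consulted only when the flag is set, which is the
    shortcut described above.
    """
    if bp.cloned and brt.get(bp.dva, 0) > 0:
        brt[bp.dva] -= 1
        if brt[bp.dva] == 0:
            del brt[bp.dva]
        return False  # another reference still owns the block
    allocated.discard(bp.dva)
    return True

dva = (0, 4096)
original = BlockPointer(dva)             # flag not set
clone = BlockPointer(dva, cloned=True)   # flag set

# Freeing the clone first works: the table is consulted, the block survives.
allocated, brt = {dva}, {dva: 1}
clone_released = free_block(clone, brt, allocated)
still_allocated = dva in allocated

# Freeing the original first skips the table and releases a block
# that the clone still references.
allocated2, brt2 = {dva}, {dva: 1}
original_released = free_block(original, brt2, allocated2)
dangling = brt2.get(dva, 0) > 0 and dva not in allocated2
```

The second case is the problem raised above: without the flag on the original BP, the free path cannot tell that the block is shared unless it searches the table anyway.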

Re: [developer] Manual dedup, aka --reflink support.

2020-06-15 Thread Matthew Ahrens via openzfs-developer
Cool! Couple of questions/observations: Do I understand correctly that the new data structure you're proposing (the BRT) maps from DVA to refcount? If so, and we can keep this data structure sorted on disk (by DVA), we would be more likely to get multiple useful entries when reading one block
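As an illustration of why keeping the table sorted by DVA helps, here is a minimal in-memory sketch. Everything in it is an assumption made for the example: the class name, the (vdev, offset) key standing in for a DVA, and the refcount semantics. It says nothing about the actual on-disk format.

```python
import bisect

class SortedRefTable:
    """Toy table: sorted (vdev, offset) keys with parallel refcounts."""

    def __init__(self):
        self._keys = []
        self._refs = []

    def add_ref(self, vdev, offset):
        """Add one reference for the key and return its new refcount."""
        key = (vdev, offset)
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._refs[i] += 1
        else:
            self._keys.insert(i, key)
            self._refs.insert(i, 1)
        return self._refs[i]

    def lookup(self, vdev, offset):
        """Return the refcount for the key, or 0 if it is absent."""
        key = (vdev, offset)
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._refs[i]
        return 0

    def entries_in_range(self, vdev, start, end):
        """Return entries with start <= offset < end on one vdev.

        Because keys are sorted, these entries are contiguous, which is
        the locality a sorted layout would give to a single block read.
        """
        lo = bisect.bisect_left(self._keys, (vdev, start))
        hi = bisect.bisect_left(self._keys, (vdev, end))
        return list(zip(self._keys[lo:hi], self._refs[lo:hi]))

table = SortedRefTable()
table.add_ref(0, 4096)
table.add_ref(0, 4096)
table.add_ref(0, 8192)
table.add_ref(1, 0)
print(table.entries_in_range(0, 0, 16384))
```

Neighbouring offsets on the same vdev end up next to each other in the table, so a range of frees touching nearby blocks maps to one contiguous slice.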

Re: [developer] Manual dedup, aka --reflink support.

2020-06-14 Thread Sean Fagan
Pawel spent a fair amount of time discussing this with me, which is good 'cause I apparently had been confused. The idea and implementation he suggests sound reasonable to me, and will (finally!) allow offline dedup :). Sean. > On Jun 13, 2020, at 12:52 PM, Pawel Jakub Dawidek wrote: > >