Re: [PATCH v3] staging: writeboost: Add dm-writeboost

2015-02-20 Thread Akira Hayakawa
To be clear, bio semantics don't require that an I/O be written to a
persistent medium before it is acked. The borderline is that all I/Os
acked before a REQ_FLUSH request must be persistent by the time that
flush is acked.
So writing to a volatile buffer (the log chunk in this case) and then
acking is safe, as long as the data becomes persistent before some future
REQ_FLUSH request is acked. That's what dm-writeboost does.
And in general, an ack should come as quickly as possible; otherwise it
may cause problems, such as the upper layer suspending other requests.
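The flush contract described above can be sketched in userspace C. All names and the buffering scheme here are illustrative, not dm-writeboost's actual code: a write is acked as soon as it is buffered, and a REQ_FLUSH is acked only after everything acked before it has reached the persistent medium.

```c
/* Userspace sketch (NOT kernel code) of the ordering rule: a write may
 * be acked while its data still sits in a volatile buffer, as long as a
 * later REQ_FLUSH is not acked until everything acked before it has
 * been made persistent. All names are made up for this example. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUF_SLOTS 8

static int volatile_buf[BUF_SLOTS]; /* data acked but not yet persistent */
static size_t buffered = 0;
static int persistent[64];          /* stands in for the cache device */
static size_t persisted = 0;

/* Ack immediately after buffering -- allowed by bio semantics. */
static void submit_write(int data) {
    volatile_buf[buffered++] = data;
    /* the ack to the upper layer happens here, before any media write */
}

/* REQ_FLUSH: everything acked so far must hit media before we ack it. */
static void submit_flush(void) {
    for (size_t i = 0; i < buffered; i++)
        persistent[persisted++] = volatile_buf[i];
    buffered = 0;
    /* only now is the flush itself acked */
}

/* True iff every write acked before the last flush is on media. */
static bool flush_contract_held(size_t acked_before_flush) {
    return persisted >= acked_before_flush;
}
```

Under this contract a write acked after the last flush may legitimately still be volatile; only a subsequent flush pins it down.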

The bio_vecs solution works only for a tiny prototype.
If I apply that solution, the following problems appear:

1. The write to the cache device is no longer one single write.
This causes an atomicity problem and may degrade performance.
2. We need to compute the checksum of the entire log chunk before the
write. Without this, the user isn't safe from partial-write problems.
As in 1 above, atomicity has to be taken care of.
(BTW, I don't think dm-cache, which has separate data and metadata
 devices, can guarantee this level of safety.)
3. Not acking any bios until the full buffer is written is harmful.
We should ack as quickly as possible, as explained above.
4. Read caching becomes infeasible, since it needs a copy of the read data.

My conclusion is that, in practice, the write buffer should be a single
buffer, and copying is inevitable.
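A minimal userspace sketch of the single-buffer approach. The names, chunk layout, and toy checksum are all illustrative and are not dm-writeboost's real on-disk format: payloads are copied into one staging buffer, the whole chunk is checksummed, and a torn (partial) write is detected on recovery by a checksum mismatch.

```c
/* Sketch of why one private staging buffer addresses points 1 and 2:
 * incoming 4KB payloads are copied into a single buffer, a checksum is
 * computed over the whole log chunk, and the chunk can then be issued
 * as ONE write. Illustrative names and a toy checksum only. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 4096
#define CHUNK_BLOCKS 127        /* data blocks per log chunk (made up) */

struct log_chunk {
    uint32_t checksum;          /* covers all buffered data */
    uint32_t nr_blocks;
    uint8_t  data[CHUNK_BLOCKS * BLOCK];
};

/* toy additive checksum; a real driver would use something stronger */
static uint32_t sum32(const uint8_t *p, size_t len) {
    uint32_t s = 0;
    while (len--) s += *p++;
    return s;
}

/* copy one payload into the chunk; returns 1 when the chunk is full */
static int chunk_add(struct log_chunk *c, const uint8_t *payload) {
    memcpy(c->data + (size_t)c->nr_blocks * BLOCK, payload, BLOCK);
    c->nr_blocks++;
    return c->nr_blocks == CHUNK_BLOCKS;
}

/* seal the chunk: checksum first, then issue it as one single write */
static void chunk_seal(struct log_chunk *c) {
    c->checksum = sum32(c->data, (size_t)c->nr_blocks * BLOCK);
}

/* on recovery: a torn write shows up as a checksum mismatch */
static int chunk_valid(const struct log_chunk *c) {
    return c->checksum == sum32(c->data, (size_t)c->nr_blocks * BLOCK);
}
```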

From an engineering point of view, the memory copy can't be the
bottleneck (the SSD's throughput limit is hit before that), so we
shouldn't hack for such a small improvement.

- Akira

On 2015/02/21 1:17, Joe Thornber wrote:
> On Sat, Feb 21, 2015 at 01:06:08AM +0900, Akira Hayakawa wrote:
>> The size is configurable but typically 512KB (that's the default).
>>
>> Referring to the bio payload sounds really dangerous, but it may be
>> possible in some tricky way; at the moment I am not sure what the
>> implementation would look like.
>>
>> Is there some fancy function that is like memcpy but actually "moves"
>> the ownership?
> When building up your log chunk bio copy the bio_vecs (not the data)
> from the original bios.  You can't complete the original bios until
> your log chunk has been written.
>
> - Joe

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: [PATCH v3] staging: writeboost: Add dm-writeboost

2015-02-20 Thread Joe Thornber
On Sat, Feb 21, 2015 at 01:06:08AM +0900, Akira Hayakawa wrote:
> The size is configurable but typically 512KB (that's the default).
> 
> Referring to the bio payload sounds really dangerous, but it may be
> possible in some tricky way; at the moment I am not sure what the
> implementation would look like.
> 
> Is there some fancy function that is like memcpy but actually "moves"
> the ownership?

When building up your log chunk bio copy the bio_vecs (not the data)
from the original bios.  You can't complete the original bios until
your log chunk has been written.

- Joe
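A userspace analogy of the scheme Joe describes (the struct names are made up; this is not the real bio/bio_vec API): the chunk stores references to the original payloads rather than copies, which is exactly why the original requests must stay incomplete until the chunk write finishes.

```c
/* Illustrative analogy of building a log chunk by referencing payloads
 * (like copying bio_vecs, not data): completion of the originals has to
 * wait for the chunk write. All names are invented for this sketch. */
#include <assert.h>
#include <stddef.h>

struct request {                /* stands in for an incoming bio */
    const char *payload;
    int completed;
};

struct log_ref_chunk {
    struct request *reqs[16];   /* borrowed payloads, zero data copies */
    size_t nr;
};

/* fast path: reference only; the request stays incomplete */
static void chunk_ref(struct log_ref_chunk *c, struct request *r) {
    c->reqs[c->nr++] = r;
}

/* after the chunk write ends, complete everything it referenced */
static void chunk_write_done(struct log_ref_chunk *c) {
    for (size_t i = 0; i < c->nr; i++)
        c->reqs[i]->completed = 1;
    c->nr = 0;
}
```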


Re: [PATCH v3] staging: writeboost: Add dm-writeboost

2015-02-20 Thread Akira Hayakawa
The size is configurable but typically 512KB (that's the default).

Referring to the bio payload sounds really dangerous, but it may be
possible in some tricky way; at the moment I am not sure what the
implementation would look like.

Is there some fancy function that is like memcpy but actually "moves"
the ownership?

- Akira

On 2015/02/21 0:50, Joe Thornber wrote:
> On Sat, Feb 21, 2015 at 12:25:53AM +0900, Akira Hayakawa wrote:
>> Yes.
> How big are your log chunks?  Presumably they're relatively small (eg,
> 256k).  In which case you can optimise for the common case where you
> have enough bios to hand to build your log chunk by just referencing
> the bio payloads, rather than copying.  It's only the last bit of I/O
> in a burst that should be using this copying slow path.
>
> - Joe



Re: [PATCH v3] staging: writeboost: Add dm-writeboost

2015-02-20 Thread Joe Thornber
On Sat, Feb 21, 2015 at 12:25:53AM +0900, Akira Hayakawa wrote:
> Yes.

How big are your log chunks?  Presumably they're relatively small (eg,
256k).  In which case you can optimise for the common case where you
have enough bios to hand to build your log chunk by just referencing
the bio payloads, rather than copying.  It's only the last bit of I/O
in a burst that should be using this copying slow path.

- Joe
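The common-case optimisation Joe suggests could look roughly like this in userspace C (all names are made up for the example): payloads in a burst are referenced for free, and only the tail of the burst pays for a memcpy.

```c
/* Illustrative sketch of the fast-path/slow-path split: reference
 * payloads while the burst supplies them, copy only the tail so the
 * last callers can be completed immediately. Not real kernel code. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOTS   16
#define PAYLOAD 4096

struct slot {
    const unsigned char *ref;        /* fast path: borrowed payload  */
    unsigned char copy[PAYLOAD];     /* slow path: private copy      */
    int copied;
};

struct chunk {
    struct slot slots[SLOTS];
    size_t nr;
};

/* fast path: reference the caller's payload, no memcpy */
static void add_ref(struct chunk *c, const unsigned char *p) {
    c->slots[c->nr].ref = p;
    c->slots[c->nr].copied = 0;
    c->nr++;
}

/* slow path (end of burst): copy so the caller can complete right away */
static void add_copy(struct chunk *c, const unsigned char *p) {
    memcpy(c->slots[c->nr].copy, p, PAYLOAD);
    c->slots[c->nr].copied = 1;
    c->nr++;
}

static size_t copies_made(const struct chunk *c) {
    size_t n = 0;
    for (size_t i = 0; i < c->nr; i++)
        n += (size_t)c->slots[i].copied;
    return n;
}
```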


Re: [PATCH v3] staging: writeboost: Add dm-writeboost

2015-02-20 Thread Joe Thornber
On Fri, Feb 20, 2015 at 05:44:01PM +0900, Akira Hayakawa wrote:
> I will wait for ack from dm maintainers.

Are you still copying the contents of every bio to your own memory
buffer before writing it to disk?

- Joe


Re: [PATCH v3] staging: writeboost: Add dm-writeboost

2015-02-20 Thread Akira Hayakawa
Hi,

It is very sad not to have received any comments from the dm maintainers
in the past 2 months. I implemented read-caching for v3 because
they wanted to see this feature in, but still no comment...

I still hope they will reevaluate dm-writeboost, because I don't
think this driver is as bad as they claim.

They really dislike the 4KB splitting, and that's the biggest reason
dm-writeboost isn't appreciated. Let me make the case for this design.

Log-structured block-level caching isn't a brand-new idea of my own,
although the implementation is.

Back in 1992, the concept of the log-structured filesystem (LFS) was
invented. Three years later, the concept of log-structured block-level
caching appeared, inspired by LFS. The DCD paper shows the I/O being
split into 4KB chunks and then managed as cache blocks.
http://www.ele.uri.edu/research/hpcl/DCD/DCD.html

Little research followed DCD, but the idea of log-structured block-level
caching revived as SSDs emerged.

In 2010, MSR's Griffin also does the 4KB split. Griffin uses an HDD as
the cache device to extend the lifetime of the backing device, which is
an SSD.
http://research.microsoft.com/apps/pubs/default.aspx?id=115352
(So dm-writeboost can be applied in this way too.)

In 2012, NetApp's Mercury is a read cache for their storage systems
that's log-structured to be durable and to exploit the device's full
throughput. It manages cache blocks in 4KB size too.
http://storageconference.us/2012/Papers/04.Flash.1.Mercury.pdf

They all split I/O into 4KB chunks (and buffer writes to the cache
device). History says this decision isn't wrong for log-structured
block-level caching. I made this principal design decision based on the
consensus of these research papers.
Do you still say that I should change this design?

Joe started nacking after observing low throughput for large-sized reads
in a virtual environment. I reproduced the case in my KVM environment
and realized that the split chunks aren't merged on the host machine.
KVM seems to disable its I/O scheduler and delegate merging to the host.
When I ran the same experiment _without_ a virtual machine, the split
chunks were fully merged by the I/O scheduler. So I conclude this is due
to KVM interference, and that dm-writeboost isn't suitable at least for
use on a VM. This isn't a big reason to nack, because dm-writeboost is
usually used on the host machine.

I will wait for ack from dm maintainers.

- Akira

On Sat, 17 Jan 2015 16:09:52 -0800
Greg KH  wrote:

> On Thu, Jan 01, 2015 at 05:44:39PM +0900, Akira Hayakawa wrote:
> > This patch adds dm-writeboost to staging tree.
> > 
> > dm-writeboost is a log-structured SSD-caching driver.
> > It caches data in log-structured way on the cache device
> > so that the performance is maximized.
> > 
> > The merit of putting this driver in staging tree is
> > to make it possible to get more feedback from users
> > and polish the codes.
> > 
> > v2->v3
> > - rebased onto 3.19-rc2
> > - Add read-caching support (disabled by default)
> >   Several tests are pushed to dmts.
> > - A critical bug fix:
> >   flush_proc shouldn't free the work_struct it's running on.
> >   I found this bug while testing read-caching.
> >   I am not sure why it didn't exhibit before, but it's truly a bug.
> > - Fully revised the README.
> >   Now that we have read-caching support, the old README was completely obsolete.
> > - Update TODO
> >   Implementing read-caching is done.
> > - bump up the copyright year to 2015
> > - fix up comments
> > 
> > 
> > Signed-off-by: Akira Hayakawa 
> 
> I need an ack from a dm developer before I can take this.
> 
> thanks,
> greg k-h


Re: [PATCH v3] staging: writeboost: Add dm-writeboost

2015-01-17 Thread Greg KH
On Thu, Jan 01, 2015 at 05:44:39PM +0900, Akira Hayakawa wrote:
> This patch adds dm-writeboost to staging tree.
> 
> dm-writeboost is a log-structured SSD-caching driver.
> It caches data in log-structured way on the cache device
> so that the performance is maximized.
> 
> The merit of putting this driver in staging tree is
> to make it possible to get more feedback from users
> and polish the codes.
> 
> v2->v3
> - rebased onto 3.19-rc2
> - Add read-caching support (disabled by default)
>   Several tests are pushed to dmts.
> - A critical bug fix:
>   flush_proc shouldn't free the work_struct it's running on.
>   I found this bug while testing read-caching.
>   I am not sure why it didn't exhibit before, but it's truly a bug.
> - Fully revised the README.
>   Now that we have read-caching support, the old README was completely obsolete.
> - Update TODO
>   Implementing read-caching is done.
> - bump up the copyright year to 2015
> - fix up comments
> 
> 
> Signed-off-by: Akira Hayakawa 

I need an ack from a dm developer before I can take this.

thanks,
greg k-h