Re: Re : Re: Re: Re : Re: [PATCH] Squashfs: add asynchronous read support

2013-12-22 Thread Minchan Kim
On Mon, Dec 23, 2013 at 12:03:39PM +0900, Chanho Min wrote:
> 
> 
> > read_pages
> >   for (page_idx ...) {
> > if (!add_to_page_cache_lru()) { <-- 1)
> >   mapping->a_ops->readpage(filp, page)
> > squashfs_readpage
> >   for (i ...) {   2)  Here, 31 pages are inserted into page cache
> > grab_cache_page_nowait <--/
> >   add_to_page_cache_lru
> >   }
> > }
> > /*
> >  * 1) will fail with EEXIST because of 2), so every page other than the
> >  * first page in the list will be freed
> >  */
> > page_cache_release(page)
> >   }
> >
> > If you see readahead working, it is just by luck, as I told you.
> > Please simulate it with 64K dd.
> You're right. This luck happened frequently with 128k dd in my test.

Yeah, it was not intended by the MM's readahead.
If you test it with squashfs 256K compression, you wouldn't get the benefit.
If you test it with a small dd block size like 32K, you wouldn't, either.
It means it's very fragile. One more thing: your approach doesn't work once
the page cache already holds some sparse pages, because you are only solving
the direct page-copy part, which can't work if we read sparse pages from a
file after many pages have been reclaimed.

Please rethink.

I already explained the problem in your patch:
you are ignoring the VM's logic (e.g., the PageReadahead mark).
Squashfs is rather special because it is a compressed filesystem, so if we
have no other way I'd like to support your approach, but I have pointed out
the problem in your patch and suggested my solution to overcome it. It could
be silly but, at least, it's now your turn to prove why it's brain-damaged,
so the maintainer can review this thread and decide or make suggestions
easily. :)

Here goes again.

I suggest it would be better to implement squashfs_readpages(), and it
should work with the cache buffer instead of the direct page cache, so that
it copies from cache buffers into the pages passed in by the MM without
freeing them. That preserves the readahead-hinted page and would work well
with the VM's readahead: 1. even if the readahead algorithm changes,
2. even if you use a small dd block size, and 3. even if you use a different
squashfs compression block size. A minimal sketch of that direction follows.
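
As a rough illustration of that direction only (not a tested
implementation), a minimal sketch under 3.13-era APIs could look like the
following. squashfs_get_datablock(), squashfs_copy_data() and
squashfs_cache_put() are the existing cache-buffer interfaces in
fs/squashfs; squashfs_page_block() is a hypothetical helper that would map
a page index to its on-disk block location (a real implementation would
derive this from the inode's block list):

static int squashfs_readpages(struct file *file,
	struct address_space *mapping, struct list_head *pages,
	unsigned nr_pages)
{
	struct inode *inode = mapping->host;
	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
	int mask = (1 << (msblk->block_log - PAGE_CACHE_SHIFT)) - 1;
	unsigned i;

	for (i = 0; i < nr_pages; i++) {
		struct page *page = list_entry(pages->prev, struct page, lru);
		struct squashfs_cache_entry *buffer;
		u64 block;
		int bsize, offset, bytes;
		void *pageaddr;

		list_del(&page->lru);

		/* Insert the VM's own page, so PageReadahead survives. */
		if (add_to_page_cache_lru(page, mapping, page->index,
						GFP_KERNEL))
			goto skip;

		/* Hypothetical: look up the datablock covering this page. */
		if (squashfs_page_block(inode, page->index, &block, &bsize))
			goto unlock;

		/* Decompress into (or find in) the shared cache buffer. */
		buffer = squashfs_get_datablock(inode->i_sb, block, bsize);
		if (buffer->error) {
			squashfs_cache_put(buffer);
			goto unlock;
		}

		/* Copy this page's slice out of the cache buffer, instead
		 * of grabbing and filling fresh page-cache pages. */
		offset = (page->index & mask) << PAGE_CACHE_SHIFT;
		pageaddr = kmap_atomic(page);
		bytes = squashfs_copy_data(pageaddr, buffer, offset,
						PAGE_CACHE_SIZE);
		memset(pageaddr + bytes, 0, PAGE_CACHE_SIZE - bytes);
		kunmap_atomic(pageaddr);
		squashfs_cache_put(buffer);

		flush_dcache_page(page);
		SetPageUptodate(page);
unlock:
		unlock_page(page);
skip:
		page_cache_release(page);
	}
	return 0;
}

The decompression could still be made asynchronous underneath this; the
point is only that the pages the MM prepared (including the
PageReadahead-marked one) get filled rather than discarded.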

Thanks.

> 
> > I understand it, but your patch doesn't achieve it.
> >
> I think my patch can achieve it if readahead works, normally or by luck.
> 
> Thanks a lot!
> Chanho,
> 

-- 
Kind regards,
Minchan Kim


Re: [PATCH] Squashfs: add asynchronous read support

2013-12-22 Thread Phillip Lougher

On 16/12/13 05:30, Chanho Min wrote:

This patch removes the synchronous wait for buffers to become up to date at
the file-system level. Instead, all operations after submit_bh are moved
into the end-of-I/O handler and its associated workqueue. It
decompresses/copies data into pages and unlocks them asynchronously.

This patch enhances the performance of Squashfs in most cases.
Especially, large file reading is improved significantly.


Hi,

The following are the summarised results of a set of
comprehensive tests of the asynchronous patch against the current
synchronous Squashfs readpage implementation.

The following tables should be fairly self-explanatory, but
the testing methodology was:

Generate a series of Squashfs filesystems, with block size
1024K, 512K, 256K, 128K and 64K.

Then for each filesystem

Run "dd if=/mnt/file of=/dev/null bs=X"

Where X is 4K, 8K, 16K, 32K, 64K, 128K, 256K, 512K, and 1024K

For each dd, run it against six different Squashfs modules,
configured with the following different options:

1. SQUASHFS_READ_DATA_ASYNC selected, SQUASHFS_DECOMP_SINGLE selected
   i.e. Async patch and single threaded decompression
   == Asyn Single in following tables

2. SQUASHFS_READ_DATA_ASYNC *not* selected, SQUASHFS_DECOMP_SINGLE selected
   i.e. No Async patch and single threaded decompression
   == No Asyn Single in following tables

3. SQUASHFS_READ_DATA_ASYNC selected, SQUASHFS_DECOMP_MULTI selected
   i.e. Async patch and multi-threaded decompression
   == Asyn Multi in following tables

4. SQUASHFS_READ_DATA_ASYNC *not* selected, SQUASHFS_DECOMP_MULTI selected
   i.e. No Async patch and multi-threaded decompression
   == No Asyn Multi in following tables

5. SQUASHFS_READ_DATA_ASYNC selected, SQUASHFS_DECOMP_MULTI_PERCPU selected
   i.e. Async patch and percpu multi-threaded decompression
   == Asyn Percpu in following tables

6. SQUASHFS_READ_DATA_ASYNC *not* selected, SQUASHFS_DECOMP_MULTI_PERCPU selected
   i.e. No Async patch and percpu multi-threaded decompression
   == No Asyn Percpu in following tables

The figures in the following tables are the MB/s reported by dd.

The tests were performed on a KVM guest with 4 cores and 4GB of
memory, running on a Core i5 based host.

The Squashfs filesystem was on "/dev/hdb".

/mnt/file is a 3GB file, average compression 22% (635MB)

Squashfs: gzip filesystem 1024K blocks

        Asyn    No Asyn Asyn    No Asyn Asyn    No Asyn
        Single  Single  Multi   Multi   Percpu  Percpu
        -----------------------------------------------
4K:     89.4    97.5    89.9    98.1    90.6    99.1
8K:     89.9    99.0    89.7    99.4    90.3    99.4
16K:    90.6    99.8    90.8    100     90.2    97.0
32K:    90.3    98.7    90.3    98.0    89.9    101
64K:    90.3    97.6    90.2    97.1    90.1    99.7
128K:   90.4    98.6    90.2    97.6    90.7    98.5
256K:   89.7    96.9    89.8    99.2    90.2    101
512K:   89.7    98.9    90.8    98.1    89.4    97.8
1024K:  89.3    98.0    89.6    98.6    88.7    96.4

Squashfs: gzip filesystem 512K blocks   

        Asyn    No Asyn Asyn    No Asyn Asyn    No Asyn
        Single  Single  Multi   Multi   Percpu  Percpu
        -----------------------------------------------
4K:     68.5    94.9    67.6    99.0    68.9    97.0
8K:     69.3    101     68.9    94.3    69.0    97.2
16K:    68.9    98.6    69.4    98.9    68.8    98.0
32K:    68.6    96.5    69.4    98.9    69.4    108
64K:    68.7    92.9    69.7    101     68.8    98.2
128K:   67.4    102     68.7    90.3    69.4    100
256K:   68.7    95.1    68.2    99.7    68.5    97.7
512K:   69.9    114     82.0    104     74.2    94.4
1024K:  71.6    105     79.2    105     69.1    98.0

Squashfs: gzip filesystem 256K blocks

        Asyn    No Asyn Asyn    No Asyn Asyn    No Asyn
        Single  Single  Multi   Multi   Percpu  Percpu
        -----------------------------------------------
4K:     53.6    92.2    54.6    87.5    53.7    82.1
8K:     53.5    87.3    53.5    85.0    53.5    85.7
16K:    53.1    89.0    53.8    95.7    53.5    91.1
32K:    54.0    95.9    53.8    98.7    53.9    85.3
64K:    53.7    86.9    53.4    103     53.4    86.3
128K:   53.2    94.4    53.6    100     53.7    97.9
256K:   55.5    101     53.0    94.1    53.3    87.0
512K:   53.1    93.0    53.4    87.7    53.2    89.8
1024K:  53.2    91.4    52.7    91.3    53.0    95.4

A couple of points about the above are worth noting:

1. With a Squashfs block size of 256K and greater, Squashfs
   readpage() does its own readahead.  This means the asynchronous
   readpage is never called multiple times (to run in parallel), because
   there is never any more work to do after the first readpage().

   The above results therefore reflect the basic performance of
   the asynchronous readpage implementation versus the
   synchronous readpage implementation.

2. It can be seen that in all cases the asynchronous readpage
   implementation performs worse than the synchronous readpage
   implementation.

   

Re : Re: Re: Re : Re: [PATCH] Squashfs: add asynchronous read support

2013-12-22 Thread Chanho Min


> read_pages
>   for (page_idx ...) {
> if (!add_to_page_cache_lru()) { <-- 1)
>   mapping->a_ops->readpage(filp, page)
> squashfs_readpage
>   for (i ...) {   2)  Here, 31 pages are inserted into page cache
> grab_cache_page_nowait <--/
>   add_to_page_cache_lru
>   }
> }
> /*
>  * 1) will fail with EEXIST because of 2), so every page other than the
>  * first page in the list will be freed
>  */
> page_cache_release(page)
>   }
>
> If you see readahead working, it is just by luck, as I told you.
> Please simulate it with 64K dd.
You're right. This luck happened frequently with 128k dd in my test.

> I understand it, but your patch doesn't achieve it.
>
I think my patch can achieve it if readahead works, normally or by luck.

Thanks a lot!
Chanho,



Re: Re: Re : Re: [PATCH] Squashfs: add asynchronous read support

2013-12-22 Thread Minchan Kim
On Sat, Dec 21, 2013 at 11:05:51AM +0900, Chanho Min wrote:
> 
> > Please don't break thread.
> > You should reply to my mail instead of your original post.
> Sorry, it seems to be a mailer issue on my side. I'm trying to fix it.
> 
> > It's a result which isn't what I want to know.
> > What I want to know is why the upper layer issues more I/O per second.
> > For example, you read 32K, so the MM layer will prepare 8 pages to read
> > in, but on issuing the first page, squashfs makes 32 pages and fills the
> > page cache (assuming you use 128K compression), so the 7 pages the MM
> > layer already prepared would be freed without further I/O, and
> > do_generic_file_read will wait for completion via lock_page without
> > queueing further I/O. It's not surprising.
> > One of the freed pages is a READA-marked page, so readahead couldn't
> > work. If readahead works, it would be just by luck. Actually, by
> > simulating 64K dd, I found the readahead logic would be triggered, but
> > it's just by luck and not intended, I think.
> The MM layer's readahead pages would not be freed immediately.
> Squashfs can use them via grab_cache_page_nowait, and the READA-marked
> page is available.
> Intentional or not, readahead works pretty well. I checked it in an
> experiment.


read_pages
  for (page_idx ...) {
if (!add_to_page_cache_lru()) { <-- 1)
  mapping->a_ops->readpage(filp, page)
squashfs_readpage
  for (i ...) {   2)  Here, 31 pages are inserted into page cache
grab_cache_page_nowait <--/
  add_to_page_cache_lru
  }
}
/*
 * 1) will fail with EEXIST because of 2), so every page other than the
 * first page in the list will be freed
 */
page_cache_release(page)
  }

If you see readahead working, it is just by luck, as I told you.
Please simulate it with 64K dd. The actual loop in mm/readahead.c makes
this concrete; an annotated copy follows.
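
For reference, this is the loop in question from mm/readahead.c of this
era (read_pages(), abridged: the plugging and the ->readpages branch are
omitted, and list_to_page() is that file's helper macro), annotated with
where the prepared pages die:

static int read_pages(struct address_space *mapping, struct file *filp,
		struct list_head *pages, unsigned nr_pages)
{
	unsigned page_idx;

	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
		struct page *page = list_to_page(pages);

		list_del(&page->lru);
		/*
		 * 1) fails with -EEXIST for every page after the first:
		 * squashfs_readpage() already inserted them all via
		 * grab_cache_page_nowait() while decompressing the whole
		 * block for the first page.
		 */
		if (!add_to_page_cache_lru(page, mapping,
					page->index, GFP_KERNEL)) {
			mapping->a_ops->readpage(filp, page);
		}
		/* Drops the last reference, so the READA-marked page the
		 * VM prepared is freed here without ever being used. */
		page_cache_release(page);
	}
	return 0;
}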

> 
> > If the first issued I/O completes, squashfs decompresses the I/O into
> > 128K of pages, so all 4 iterations (128K/32K) would hit in the page
> > cache. If all 128K hits in the page cache, the mm layer starts to issue
> > the next I/O and repeats the above logic until you end up reading the
> > whole file. So my opinion is that the upper layer wouldn't issue more
> > I/O logically. If it worked, it's not what we expect but a side effect.
> >
> > That's why I'd like to know your thinking on why IOPS increases.
> > Please, could you explain your thought on why IOPS increased, rather
> > than a result from the low-level driver?
> It is because readahead can work asynchronously in the background.
> Suppose that you read a large file by 128k, partially and contiguously,
> like "dd bs=128k". Two IOs can be issued per 128k read:
> the first IO is for the intended pages, the second IO is for readahead.
> If the first IO hits in the cache thanks to previous readahead, there is
> no need to wait for IO completion, because the intended page is already
> up to date.
> But current squashfs waits for the second IO's completion unnecessarily.
> That is one reason why we should move marking pages up to date into the
> asynchronous path, as my patch does.

I understand it, but your patch doesn't achieve it.

> 
> > Anyway, in my opinion, we should take care of the MM layer's readahead
> > to enhance sequential I/O. For that, we should use the buffer pages
> > passed by the MM instead of freeing them and allocating new pages in
> > squashfs.
> > IMHO, it would be better to implement squashfs_readpages, but my
> > insight here is limited, so I guess Phillip will give better
> > ideas/insight about the issue.
> That's a good point. Also, I think my patch is another way, which can be
> implemented without significant impact on the current implementation,
> and I await Phillip's comment.
>
> Thanks
> Chanho
> 

-- 
Kind regards,
Minchan Kim


Re: Re: Re : Re: [PATCH] Squashfs: add asynchronous read support

2013-12-20 Thread Chanho Min

> Please don't break thread.
> You should reply to my mail instead of your original post.
Sorry, it seems to be a mailer issue on my side. I'm trying to fix it.

> It's a result which isn't what I want to know.
> What I want to know is why the upper layer issues more I/O per second.
> For example, you read 32K, so the MM layer will prepare 8 pages to read
> in, but on issuing the first page, squashfs makes 32 pages and fills the
> page cache (assuming you use 128K compression), so the 7 pages the MM
> layer already prepared would be freed without further I/O, and
> do_generic_file_read will wait for completion via lock_page without
> queueing further I/O. It's not surprising.
> One of the freed pages is a READA-marked page, so readahead couldn't
> work. If readahead works, it would be just by luck. Actually, by
> simulating 64K dd, I found the readahead logic would be triggered, but
> it's just by luck and not intended, I think.
The MM layer's readahead pages would not be freed immediately.
Squashfs can use them via grab_cache_page_nowait, and the READA-marked page
is available.
Intentional or not, readahead works pretty well. I checked it in an
experiment.

> If the first issued I/O completes, squashfs decompresses the I/O into
> 128K of pages, so all 4 iterations (128K/32K) would hit in the page
> cache. If all 128K hits in the page cache, the mm layer starts to issue
> the next I/O and repeats the above logic until you end up reading the
> whole file. So my opinion is that the upper layer wouldn't issue more
> I/O logically. If it worked, it's not what we expect but a side effect.
>
> That's why I'd like to know your thinking on why IOPS increases.
> Please, could you explain your thought on why IOPS increased, rather
> than a result from the low-level driver?
It is because readahead can work asynchronously in the background.
Suppose that you read a large file by 128k, partially and contiguously,
like "dd bs=128k". Two IOs can be issued per 128k read:
the first IO is for the intended pages, the second IO is for readahead.
If the first IO hits in the cache thanks to previous readahead, there is no
need to wait for IO completion, because the intended page is already up to
date. But current squashfs waits for the second IO's completion
unnecessarily. That is one reason why we should move marking pages up to
date into the asynchronous path, as my patch does.

> Anyway, in my opinion, we should take care of the MM layer's readahead
> to enhance sequential I/O. For that, we should use the buffer pages
> passed by the MM instead of freeing them and allocating new pages in
> squashfs.
> IMHO, it would be better to implement squashfs_readpages, but my insight
> here is limited, so I guess Phillip will give better ideas/insight about
> the issue.
That's a good point. Also, I think my patch is another way, which can be
implemented without significant impact on the current implementation, and
I await Phillip's comment.

Thanks
Chanho



Re: Re : Re: [PATCH] Squashfs: add asynchronous read support

2013-12-17 Thread Minchan Kim
Hello,

Please don't break thread.
You should reply to my mail instead of your original post.

On Wed, Dec 18, 2013 at 01:29:37PM +0900, Chanho Min wrote:
> 
> > I tested it on x86 with a USB stick and on ARM with eMMC on my Nexus 4.
> > In my experiments, I couldn't see much gain like yours on either
> > system, and it even regressed in the bs=32k test, maybe due to
> > workqueue allocation/scheduling of work per I/O.
> > Is your test rather special, or what am I missing?
> Can you share your specific test results on ARM with eMMC?

Sure.
        before  after
32K     3.6M    3.4M
64K     6.3M    8.2M
128K    11.4M   11.7M
160K    13.6M   13.8M
256K    19.8M   19M
288K    21.3M   20.8M

> 
> > Before that, I'd like to know the fundamental reason why your
> > asynchronous read implementation enhances performance. At first
> > glance, I thought it was caused by readahead from the MM layer, but
> > when I read the code, I found I was wrong.
> > The MM's readahead logic works based on the PageReadahead marker, but
> > squashfs invalidates it via grab_cache_page_nowait, so it wouldn't
> > work as we expected.
> >
> > Another possibility is block I/O merging in the block layer by the
> > plugging logic, which is what I tried a few months ago, although the
> > implementation was really bad. But it wouldn't work with your patch,
> > because do_generic_file_read will unplug the block layer via lock_page
> > without merging enough I/O.
> >
> > So, what do you think is the real actuator enhancing your experiment?
> > Then I could investigate why I can't get a benefit.
> Currently, squashfs adds requests to the block device queue
> synchronously, waiting for completion. mmc takes these requests one by
> one and pushes them to the host driver, but this lets mmc go idle
> frequently. This patch allows block requests to be added asynchronously,
> without waiting for completion, so mmcqd can fetch a lot of requests
> from the block layer at a time. As a result, mmcqd stays busy and uses
> more of the mmc bandwidth.
> For a test, I added two counter variables in mmc_queue_thread as below
> and tested the same dd transfer.
> 
> static int mmc_queue_thread(void *d)
> {
> ..
>   do {
>   if (req || mq->mqrq_prev->req) {
>   fetch++;
>   } else {
>   idle++;
>   }
>   } while (1);
> ..
> }
> 
> without patch:
>  fetch: 920, idle: 460
> 
> with patch:
>  fetch: 918, idle: 40

It's a result which isn't what I want to know.
What I want to know is why the upper layer issues more I/O per second.

For example, you read 32K, so the MM layer will prepare 8 pages to read in,
but on issuing the first page, squashfs makes 32 pages and fills the page
cache (assuming you use 128K compression), so the 7 pages the MM layer
already prepared would be freed without further I/O, and
do_generic_file_read will wait for completion via lock_page without
queueing further I/O. It's not surprising.
One of the freed pages is a READA-marked page, so readahead couldn't work.
If readahead works, it would be just by luck. Actually, by simulating
64K dd, I found the readahead logic would be triggered, but it's just by
luck and not intended, I think.

If the first issued I/O completes, squashfs decompresses the I/O into 128K
of pages, so all 4 iterations (128K/32K) would hit in the page cache.
If all 128K hits in the page cache, the mm layer starts to issue the next
I/O and repeats the above logic until you end up reading the whole file.
So my opinion is that the upper layer wouldn't issue more I/O logically.
If it worked, it's not what we expect but a side effect.

That's why I'd like to know your thinking on why IOPS increases.
Please, could you explain your thought on why IOPS increased, rather than
a result from the low-level driver?

Anyway, in my opinion, we should take care of the MM layer's readahead to
enhance sequential I/O. For that, we should use the buffer pages passed by
the MM instead of freeing them and allocating new pages in squashfs.
IMHO, it would be better to implement squashfs_readpages, but my insight
here is limited, so I guess Phillip will give better ideas/insight about
the issue.

Thanks!


> 
> Thanks
> Chanho.
> 

-- 
Kind regards,
Minchan Kim


Re : Re: [PATCH] Squashfs: add asynchronous read support

2013-12-17 Thread Chanho Min

> I tested it on x86 with a USB stick and on ARM with eMMC on my Nexus 4.
> In my experiments, I couldn't see much gain like yours on either system,
> and it even regressed in the bs=32k test, maybe due to workqueue
> allocation/scheduling of work per I/O.
> Is your test rather special, or what am I missing?
Can you share your specific test results on ARM with eMMC?

> Before that, I'd like to know the fundamental reason why your
> asynchronous read implementation enhances performance. At first glance,
> I thought it was caused by readahead from the MM layer, but when I read
> the code, I found I was wrong.
> The MM's readahead logic works based on the PageReadahead marker, but
> squashfs invalidates it via grab_cache_page_nowait, so it wouldn't work
> as we expected.
>
> Another possibility is block I/O merging in the block layer by the
> plugging logic, which is what I tried a few months ago, although the
> implementation was really bad. But it wouldn't work with your patch,
> because do_generic_file_read will unplug the block layer via lock_page
> without merging enough I/O.
>
> So, what do you think is the real actuator enhancing your experiment?
> Then I could investigate why I can't get a benefit.
Currently, squashfs adds requests to the block device queue synchronously,
waiting for completion. mmc takes these requests one by one and pushes them
to the host driver, but this lets mmc go idle frequently. This patch allows
block requests to be added asynchronously, without waiting for completion,
so mmcqd can fetch a lot of requests from the block layer at a time. As a
result, mmcqd stays busy and uses more of the mmc bandwidth.
For a test, I added two counter variables in mmc_queue_thread as below
and tested the same dd transfer.

static int mmc_queue_thread(void *d)
{
..
	do {
		if (req || mq->mqrq_prev->req) {
			fetch++;
		} else {
			idle++;
		}
	} while (1);
..
}

without patch:
 fetch: 920, idle: 460

with patch:
 fetch: 918, idle: 40

Thanks
Chanho.



Re: [PATCH] Squashfs: add asynchronous read support

2013-12-16 Thread Minchan Kim
Hello Chanho,

On Mon, Dec 16, 2013 at 02:30:26PM +0900, Chanho Min wrote:
> This patch removes the synchronous wait for buffers to become up to date
> at the file-system level. Instead, all operations after submit_bh are
> moved into the end-of-I/O handler and its associated workqueue. It
> decompresses/copies data into pages and unlocks them asynchronously.
> 
> This patch enhances the performance of Squashfs in most cases.
> Especially, large file reading is improved significantly.
> 
> dd read test:
> 
>  - ARM cortex-a9 1GHz, 2 cores, eMMC 4.5 HS200 mode.
>  - dd if=file1 of=/dev/null bs=64k
> 
> Before
>  58707718 bytes (56.0MB) copied, 1.393653 seconds, 40.2MB/s
> 
> After
>  58707718 bytes (56.0MB) copied, 0.942413 seconds, 59.4MB/s

It's really nice!

I tested it on x86 with a USB stick and on ARM with eMMC on my Nexus 4.
In my experiments, I couldn't see much gain like yours on either system,
and it even regressed in the bs=32k test, maybe due to workqueue
allocation/scheduling of work per I/O.
Is your test rather special, or what am I missing?

Before that, I'd like to know the fundamental reason why your asynchronous
read implementation enhances performance. At first glance, I thought it was
caused by readahead from the MM layer, but when I read the code, I found I
was wrong: the MM's readahead logic works based on the PageReadahead marker,
but squashfs invalidates it via grab_cache_page_nowait, so it wouldn't work
as we expected. The copy loop below shows where that happens.
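
For context, this is the loop in question from fs/squashfs/file.c
(squashfs_copy_cache(), abridged; details are recalled from the 3.13-era
source and may be slightly off). Every page of the datablock except the one
readpage() was called for is grabbed fresh with grab_cache_page_nowait(),
bypassing, and thereby dooming, the pages the VM prepared with its READA
mark:

void squashfs_copy_cache(struct page *page,
	struct squashfs_cache_entry *buffer, int bytes, int offset)
{
	struct inode *inode = page->mapping->host;
	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
	int i, mask = (1 << (msblk->block_log - PAGE_CACHE_SHIFT)) - 1;
	int start_index = page->index & ~mask;
	int end_index = start_index | mask;
	void *pageaddr;

	for (i = start_index; i <= end_index && bytes > 0; i++,
			bytes -= PAGE_CACHE_SIZE, offset += PAGE_CACHE_SIZE) {
		struct page *push_page;
		int avail = buffer ? min_t(int, bytes, PAGE_CACHE_SIZE) : 0;

		/* Grab a fresh page for every sibling of the target page. */
		push_page = (i == page->index) ? page :
			grab_cache_page_nowait(page->mapping, i);
		if (!push_page)
			continue;
		if (PageUptodate(push_page))
			goto skip_page;

		/* Fill it from the decompressed cache buffer. */
		pageaddr = kmap_atomic(push_page);
		squashfs_copy_data(pageaddr, buffer, offset, avail);
		memset(pageaddr + avail, 0, PAGE_CACHE_SIZE - avail);
		kunmap_atomic(pageaddr);
		flush_dcache_page(push_page);
		SetPageUptodate(push_page);
skip_page:
		unlock_page(push_page);
		if (i != page->index)
			page_cache_release(push_page);
	}
}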

Another possibility is block I/O merging in the block layer by the plugging
logic, which is what I tried a few months ago, although the implementation
was really bad. But it wouldn't work with your patch, because
do_generic_file_read will unplug the block layer via lock_page without
merging enough I/O.

So, what do you think is the real actuator enhancing your experiment?
Then I could investigate why I can't get a benefit.

Thanks for looking into this.

> 
> Signed-off-by: Chanho Min chanho@lge.com
> ---
>  fs/squashfs/Kconfig   |9 ++
>  fs/squashfs/block.c   |  262 +
>  fs/squashfs/file_direct.c |8 +-
>  fs/squashfs/page_actor.c  |3 +-
>  fs/squashfs/page_actor.h  |3 +-
>  fs/squashfs/squashfs.h|2 +
>  6 files changed, 284 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
> index b6fa865..284aa5a 100644
> --- a/fs/squashfs/Kconfig
> +++ b/fs/squashfs/Kconfig
> @@ -51,6 +51,15 @@ config SQUASHFS_FILE_DIRECT
> it eliminates a memcpy and it also removes the lock contention
> on the single buffer.
>  
> +config SQUASHFS_READ_DATA_ASYNC
> + bool "Read and decompress data asynchronously"
> + depends on  SQUASHFS_FILE_DIRECT
> + help
> +   By default Squashfs read data synchronously by block (default 128k).
> +   This option removes such a synchronous wait in the file system level.
> +   All works after submit IO do at the End-of-IO handler asynchronously.
> +   This enhances the performance of Squashfs in most cases, especially,
> +   large file reading.
>  endchoice
>  
>  choice
> diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
> index 0cea9b9..1517ca3 100644
> --- a/fs/squashfs/block.c
> +++ b/fs/squashfs/block.c
> @@ -212,3 +212,265 @@ read_failure:
>   kfree(bh);
>   return -EIO;
>  }
> +
> +#ifdef CONFIG_SQUASHFS_READ_DATA_ASYNC
> +
> +struct squashfs_end_io_assoc {
> + int offset;
> + int b_count;
> + int compressed;
> + int length;
> + struct squashfs_page_actor *p_actor;
> + struct buffer_head **__bh;
> + struct squashfs_sb_info *msblk;
> + struct work_struct read_work;
> +};
> +
> +static int squashfs_copy_page(struct squashfs_sb_info *msblk,
> + struct buffer_head **bh, int b, int offset, int length,
> + struct squashfs_page_actor *output)
> +{
> + /*
> +  * Block is uncompressed.
> +  */
> + int in, pg_offset = 0, avail = 0, bytes, k = 0;
> + void *data = squashfs_first_page(output);
> + for (bytes = length; k < b; k++) {
> + in = min(bytes, msblk->devblksize - offset);
> + bytes -= in;
> + while (in) {
> + if (pg_offset == PAGE_CACHE_SIZE) {
> + data = squashfs_next_page(output);
> + pg_offset = 0;
> + }
> + avail = min_t(int, in, PAGE_CACHE_SIZE -
> + pg_offset);
> + memcpy(data + pg_offset, bh[k]->b_data + offset,
> + avail);
> + in -= avail;
> + pg_offset += avail;
> + offset += avail;
> + }
> + offset = 0;
> + put_bh(bh[k]);
> + }
> + squashfs_finish_page(output);
> + return length;
> +}
> +
> +/*
> + * This is executed in workqueue for squashfs_read_data_async().
> + * - pages come decompressed/copied and unlocked asynchronously.
> + */
> +static void 


[PATCH] Squashfs: add asynchronous read support

2013-12-15 Thread Chanho Min
This patch removes the synchronous wait for buffers to become up to date at
the file-system level. Instead, all operations after submit_bh are moved
into the end-of-I/O handler and its associated workqueue. It
decompresses/copies data into pages and unlocks them asynchronously.

This patch enhances the performance of Squashfs in most cases.
Especially, large file reading is improved significantly.

dd read test:

 - ARM cortex-a9 1GHz, 2 cores, eMMC 4.5 HS200 mode.
 - dd if=file1 of=/dev/null bs=64k

Before
 58707718 bytes (56.0MB) copied, 1.393653 seconds, 40.2MB/s

After
 58707718 bytes (56.0MB) copied, 0.942413 seconds, 59.4MB/s

Signed-off-by: Chanho Min chanho@lge.com
---
 fs/squashfs/Kconfig   |9 ++
 fs/squashfs/block.c   |  262 +
 fs/squashfs/file_direct.c |8 +-
 fs/squashfs/page_actor.c  |3 +-
 fs/squashfs/page_actor.h  |3 +-
 fs/squashfs/squashfs.h|2 +
 6 files changed, 284 insertions(+), 3 deletions(-)

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index b6fa865..284aa5a 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -51,6 +51,15 @@ config SQUASHFS_FILE_DIRECT
  it eliminates a memcpy and it also removes the lock contention
  on the single buffer.
 
+config SQUASHFS_READ_DATA_ASYNC
+   bool "Read and decompress data asynchronously"
+   depends on  SQUASHFS_FILE_DIRECT
+   help
+ By default Squashfs read data synchronously by block (default 128k).
+ This option removes such a synchronous wait in the file system level.
+ All works after submit IO do at the End-of-IO handler asynchronously.
+ This enhances the performance of Squashfs in most cases, especially,
+ large file reading.
 endchoice
 
 choice
diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 0cea9b9..1517ca3 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -212,3 +212,265 @@ read_failure:
kfree(bh);
return -EIO;
 }
+
+#ifdef CONFIG_SQUASHFS_READ_DATA_ASYNC
+
+struct squashfs_end_io_assoc {
+   int offset;
+   int b_count;
+   int compressed;
+   int length;
+   struct squashfs_page_actor *p_actor;
+   struct buffer_head **__bh;
+   struct squashfs_sb_info *msblk;
+   struct work_struct read_work;
+};
+
+static int squashfs_copy_page(struct squashfs_sb_info *msblk,
+   struct buffer_head **bh, int b, int offset, int length,
+   struct squashfs_page_actor *output)
+{
+   /*
+* Block is uncompressed.
+*/
+   int in, pg_offset = 0, avail = 0, bytes, k = 0;
+   void *data = squashfs_first_page(output);
+   for (bytes = length; k < b; k++) {
+   in = min(bytes, msblk->devblksize - offset);
+   bytes -= in;
+   while (in) {
+   if (pg_offset == PAGE_CACHE_SIZE) {
+   data = squashfs_next_page(output);
+   pg_offset = 0;
+   }
+   avail = min_t(int, in, PAGE_CACHE_SIZE -
+   pg_offset);
+   memcpy(data + pg_offset, bh[k]->b_data + offset,
+   avail);
+   in -= avail;
+   pg_offset += avail;
+   offset += avail;
+   }
+   offset = 0;
+   put_bh(bh[k]);
+   }
+   squashfs_finish_page(output);
+   return length;
+}
+
+/*
+ * This is executed in workqueue for squashfs_read_data_async().
+ * - pages come decompressed/copied and unlocked asynchronously.
+ */
+static void squashfs_buffer_read_async(struct squashfs_end_io_assoc *io_assoc)
+{
+   struct squashfs_sb_info *msblk = io_assoc->msblk;
+   struct squashfs_page_actor *actor = io_assoc->p_actor;
+   struct page **page = actor->page;
+   int pages = actor->pages;
+   struct page *target_page = actor->target_page;
+   int i, length, bytes = 0;
+   void *pageaddr;
+
+   if (io_assoc->compressed) {
+   length = squashfs_decompress(msblk, io_assoc->__bh,
+   io_assoc->b_count, io_assoc->offset,
+   io_assoc->length, actor);
+   if (length < 0) {
+   ERROR("squashfs_read_data failed to read block\n");
+   goto read_failure;
+   }
+   } else
+   length = squashfs_copy_page(msblk, io_assoc->__bh,
+   io_assoc->b_count, io_assoc->offset,
+   io_assoc->length, actor);
+
+   /* Last page may have trailing bytes not filled */
+   bytes = length % PAGE_CACHE_SIZE;
+   if (bytes) {
+   pageaddr = kmap_atomic(page[pages - 1]);
+   memset(pageaddr + bytes, 0, PAGE_CACHE_SIZE - bytes);
+   kunmap_atomic(pageaddr);
+   }
+
+   /* Mark 
