Re: Re : Re: Re: Re : Re: [PATCH] Squashfs: add asynchronous read support
On Mon, Dec 23, 2013 at 12:03:39PM +0900, Chanho Min wrote:
> > read_pages
> >   for (page_idx ...) {
> >     if (!add_to_page_cache_lru()) {        <-- 1)
> >       mapping->a_ops->readpage(filp, page)
> >         squashfs_readpage
> >           for (i ...) {              2) Here, 31 pages are inserted
> >             grab_cache_page_nowait  <--/    into the page cache
> >               add_to_page_cache_lru
> >           }
> >     }
> >     /*
> >      * 1) fails with EEXIST because of 2), so every page other than
> >      * the first page in the list is freed
> >      */
> >     page_cache_release(page)
> >   }
> >
> > If you see readahead working, it is just by luck, as I told you.
> > Please simulate it with a 64K dd.
> You're right. This luck happened frequently with 128k dd in my test.

Yeah, it was not intended by the MM's readahead. If you test it with a
squashfs 256K block size, you won't get a benefit. If you test it with a
small dd block size like 32K, you won't either. It means it's very fragile.

One more thing: your approach doesn't work when the page cache already
holds some sparse pages, because you are only solving the direct page copy
part, which can't work if we read some sparse pages in a file after many
pages have been reclaimed. Please rethink.

I already explained what the problem in your patch is: you are ignoring
the VM's logic (e.g., the PageReadahead mark). Squashfs is rather special
because it is a compressed FS, so if we have no other way I'd like to
support your approach, but I pointed out the problem in your patch and
suggested my solution to overcome it. It could be silly, but at least it's
time for you to prove why it's brain-damaged, so the maintainer can review
this thread and decide or suggest easily. :)

Here it goes again. I suggest it would be better to implement
squashfs_readpages(), and it should work with the cache buffer instead of
the page cache directly, so that it can copy from cache buffers to the
pages passed in by the MM without freeing them. That preserves the
readahead-hinted page and would work well with the VM's readahead

1. even if the algorithm in readahead changes,
2. even if you use a small dd block size,
3. even if you use another squashfs compression block size.

Thanks.

> > I understand it but your patch doesn't make it.
> I think my patch can make it if readahead works normally or luckily.
>
> Thanks a lot!
> Chanho,

--
Kind regards,
Minchan Kim
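To make the suggestion above concrete, here is a rough sketch of what such
a squashfs_readpages() could look like. It is an illustration only, not a
tested implementation: the cache helpers (squashfs_get_datablock(),
squashfs_copy_data(), squashfs_cache_put()) are the existing ones, but
squashfs_lookup_datablock() is a made-up placeholder for the block-list
lookup the current readpage path already performs, and fragments, tail-end
packing and most error handling are left out.

/*
 * Sketch only: adopt the pages the MM passes in (so the PageReadahead
 * mark survives) and copy data out of the squashfs cache buffer instead
 * of grabbing and freeing private pages.
 */
static int squashfs_readpages(struct file *file, struct address_space *mapping,
	struct list_head *pages, unsigned nr_pages)
{
	struct inode *inode = mapping->host;
	struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
	int shift = msblk->block_log - PAGE_CACHE_SHIFT;

	while (!list_empty(pages)) {
		struct page *page = list_entry(pages->prev, struct page, lru);
		struct squashfs_cache_entry *buffer;
		u64 block;
		int bsize, offset;

		list_del(&page->lru);

		/* Keep the page the MM prepared, readahead mark and all */
		if (add_to_page_cache_lru(page, mapping, page->index,
				GFP_KERNEL)) {
			page_cache_release(page);
			continue;
		}

		/* Placeholder: map the file index to its datablock */
		bsize = squashfs_lookup_datablock(inode,
				page->index >> shift, &block);
		offset = (page->index & ((1 << shift) - 1)) << PAGE_CACHE_SHIFT;

		/* Read and decompress into the squashfs cache buffer... */
		buffer = squashfs_get_datablock(inode->i_sb, block, bsize);
		if (!buffer->error) {
			/* ...and copy out only the piece this page covers */
			void *pageaddr = kmap_atomic(page);

			squashfs_copy_data(pageaddr, buffer, offset,
					PAGE_CACHE_SIZE);
			kunmap_atomic(pageaddr);
			flush_dcache_page(page);
			SetPageUptodate(page);
		}
		squashfs_cache_put(buffer);

		unlock_page(page);
		page_cache_release(page);
	}
	return 0;
}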
Re: [PATCH] Squashfs: add asynchronous read support
On 16/12/13 05:30, Chanho Min wrote:
> This patch removes the synchronous wait for buffers to become up-to-date
> at the file system level. Instead, all operations after submit_bh are
> moved into the End-of-IO handler and its associated workqueue. It
> decompresses/copies data into pages and unlocks them asynchronously.
>
> This patch enhances the performance of Squashfs in most cases.
> Especially, large file reading is improved significantly.

Hi,

The following is the summarised results of a set of comprehensive tests of
the asynchronous patch against the current synchronous Squashfs readpage
implementation.

The following tables should be fairly self-explanatory, but, the testing
methodology was:

Generate a series of Squashfs filesystems, with block size 1024K, 512K,
256K, 128K and 64K.  Then for each filesystem

Run "dd if=/mnt/file of=/dev/null bs=X"

Where X is 4K, 8K, 16K, 32K, 64K, 128K, 256K, 512K, and 1024K

For each dd, run it against six different Squashfs modules, configured
with the following different options:

1. SQUASHFS_READ_DATA_ASYNC selected, SQUASHFS_DECOMP_SINGLE selected
   i.e. Async patch and single threaded decompression
   == Asyn Single in following tables

2. SQUASHFS_READ_DATA_ASYNC *not* selected, SQUASHFS_DECOMP_SINGLE selected
   i.e. No Async patch and single threaded decompression
   == No Asyn Single in following tables

3. SQUASHFS_READ_DATA_ASYNC selected, SQUASHFS_DECOMP_MULTI selected
   i.e. Async patch and multi-threaded decompression
   == Asyn Multi in following tables

4. SQUASHFS_READ_DATA_ASYNC *not* selected, SQUASHFS_DECOMP_MULTI selected
   i.e. No Async patch and multi-threaded decompression
   == No Asyn Multi in following tables

5. SQUASHFS_READ_DATA_ASYNC selected, SQUASHFS_DECOMP_MULTI_PERCPU selected
   i.e. Async patch and percpu multi-threaded decompression
   == Asyn Percpu in following tables

6. SQUASHFS_READ_DATA_ASYNC *not* selected, SQUASHFS_DECOMP_MULTI_PERCPU
   selected
   i.e. No Async patch and percpu multi-threaded decompression
   == No Asyn Percpu in following tables

The figures in the following tables are the MB/s reported by dd.

The tests were performed on a KVM guest with 4 cores and 4Gb of memory,
running on a core i5 based host.  The Squashfs filesystem was on
"/dev/hdb".

/mnt/file is a 3Gb file, average compression 22% (635 Mb)

Squashfs: gzip filesystem 1024K blocks

         Asyn    No Asyn  Asyn    No Asyn  Asyn    No Asyn
         Single  Single   Multi   Multi    Percpu  Percpu
-----------------------------------------------------------
4K:      89.4    97.5     89.9    98.1     90.6    99.1
8K:      89.9    99.0     89.7    99.4     90.3    99.4
16K:     90.6    99.8     90.8    100      90.2    97.0
32K:     90.3    98.7     90.3    98.0     89.9    101
64K:     90.3    97.6     90.2    97.1     90.1    99.7
128K:    90.4    98.6     90.2    97.6     90.7    98.5
256K:    89.7    96.9     89.8    99.2     90.2    101
512K:    89.7    98.9     90.8    98.1     89.4    97.8
1024K:   89.3    98.0     89.6    98.6     88.7    96.4

Squashfs: gzip filesystem 512K blocks

         Asyn    No Asyn  Asyn    No Asyn  Asyn    No Asyn
         Single  Single   Multi   Multi    Percpu  Percpu
-----------------------------------------------------------
4K:      68.5    94.9     67.6    99.0     68.9    97.0
8K:      69.3    101      68.9    94.3     69.0    97.2
16K:     68.9    98.6     69.4    98.9     68.8    98.0
32K:     68.6    96.5     69.4    98.9     69.4    108
64K:     68.7    92.9     69.7    101      68.8    98.2
128K:    67.4    102      68.7    90.3     69.4    100
256K:    68.7    95.1     68.2    99.7     68.5    97.7
512K:    69.9    114      82.0    104      74.2    94.4
1024K:   71.6    105      79.2    105      69.1    98.0

Squashfs: gzip filesystem 256K blocks

         Asyn    No Asyn  Asyn    No Asyn  Asyn    No Asyn
         Single  Single   Multi   Multi    Percpu  Percpu
-----------------------------------------------------------
4K:      53.6    92.2     54.6    87.5     53.7    82.1
8K:      53.5    87.3     53.5    85.0     53.5    85.7
16K:     53.1    89.0     53.8    95.7     53.5    91.1
32K:     54.0    95.9     53.8    98.7     53.9    85.3
64K:     53.7    86.9     53.4    103      53.4    86.3
128K:    53.2    94.4     53.6    100      53.7    97.9
256K:    55.5    101      53.0    94.1     53.3    87.0
512K:    53.1    93.0     53.4    87.7     53.2    89.8
1024K:   53.2    91.4     52.7    91.3     53.0    95.4

A couple of points about the above can be noticed:

1. With a Squashfs block size of 256K and greater, Squashfs readpage()
   does its own readahead.  This means the asynchronous readpage is never
   called multiply (to run in parallel), because there is never any more
   work to do after the first readpage().

   The above results therefore reflect the basic performance of the
   asynchronous readpage implementation versus the synchronous readpage
   implementation.

2. It can be seen that in all cases the asynchronous readpage
   implementation performs worse than the synchronous readpage
   implementation.
Re : Re: Re: Re : Re: [PATCH] Squashfs: add asynchronous read support
> read_pages
>   for (page_idx ...) {
>     if (!add_to_page_cache_lru()) {        <-- 1)
>       mapping->a_ops->readpage(filp, page)
>         squashfs_readpage
>           for (i ...) {              2) Here, 31 pages are inserted
>             grab_cache_page_nowait  <--/    into the page cache
>               add_to_page_cache_lru
>           }
>     }
>     /*
>      * 1) fails with EEXIST because of 2), so every page other than
>      * the first page in the list is freed
>      */
>     page_cache_release(page)
>   }
>
> If you see readahead working, it is just by luck, as I told you.
> Please simulate it with a 64K dd.

You're right. This luck happened frequently with 128k dd in my test.

> I understand it but your patch doesn't make it.

I think my patch can make it if readahead works normally or luckily.

Thanks a lot!
Chanho,
Re: Re: Re : Re: [PATCH] Squashfs: add asynchronous read support
On Sat, Dec 21, 2013 at 11:05:51AM +0900, Chanho Min wrote:
> > Please don't break the thread.
> > You should reply to my mail instead of your original post.
> Sorry, it seems to be my mailer issue. I'm trying to fix it.
>
> > It's a result which isn't what I want to know.
> > What I want to know is why the upper layer issues more I/O per second.
> > For example, you read 32K, so the MM layer will prepare 8 pages to
> > read in, but when issuing the first page, squashfs makes 32 pages and
> > fills the page cache (if we assume you use 128K compression), so the
> > 7 pages the MM layer already prepared would be freed without further
> > I/O, and do_generic_file_read will wait for completion by lock_page
> > without further I/O queueing. It's not surprising.
> > One of the pages freed is a READA-marked page, so readahead couldn't
> > work. If readahead works, it would be just by luck. Actually, by
> > simulating 64K dd, I found the readahead logic would be triggered,
> > but it's just by luck and it's not intended, I think.
> The MM layer's readahead pages would not be freed immediately.
> Squashfs can use them via grab_cache_page_nowait, and the READA-marked
> page is available.
> Intentional or not, readahead works pretty well. I checked it in an
> experiment.

read_pages
  for (page_idx ...) {
    if (!add_to_page_cache_lru()) {        <-- 1)
      mapping->a_ops->readpage(filp, page)
        squashfs_readpage
          for (i ...) {              2) Here, 31 pages are inserted
            grab_cache_page_nowait  <--/    into the page cache
              add_to_page_cache_lru
          }
    }
    /*
     * 1) fails with EEXIST because of 2), so every page other than
     * the first page in the list is freed
     */
    page_cache_release(page)
  }

If you see readahead working, it is just by luck, as I told you.
Please simulate it with a 64K dd.

> > If the first issued I/O completes, squashfs decompresses the I/O into
> > 128K of pages, so all 4 iterations (128K/32K) would hit in the page
> > cache. If all 128K hit in the page cache, the MM layer starts to
> > issue the next I/O and repeats the above logic until you end up
> > reading the whole file.
> > So my opinion is that the upper layer wouldn't issue more I/O
> > logically. If it worked, it's not what we expect but a side-effect.
> >
> > That's why I'd like to know your thought on what is increasing IOPS.
> > Please, could you say why you think IOPS increased, not a result on
> > the low-level driver?
> It is because readahead can work asynchronously in the background.
> Suppose that you read a large file by 128k, partially and contiguously,
> like "dd bs=128k". Two IOs can be issued per 128k read: the first IO is
> for the intended pages, the second IO is for readahead. If the first IO
> hits in the cache thanks to previous readahead, no wait for IO
> completion is needed, because the intended page is already up-to-date.
> But current squashfs waits for the second IO's completion
> unnecessarily. That is one of the reasons that we should move the
> page's up-to-date handling to the asynchronous area, like my patch does.

I understand it but your patch doesn't make it.

> > Anyway, in my opinion, we should take care of the MM layer's
> > readahead to enhance sequential I/O. For that, we should use the
> > buffer pages passed by the MM instead of freeing them and allocating
> > new pages in squashfs.
> > IMHO, it would be better to implement squashfs_readpages, but my
> > insight is very weak so I guess Phillip will give more good
> > ideas/insight about the issue.
> That's a good point. Also, I think my patch is another way which can be
> implemented without significant impact on the current implementation,
> and I'll wait for Phillip's comment.
>
> Thanks
> Chanho

--
Kind regards,
Minchan Kim
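For reference, the loop traced above is roughly read_pages() in
mm/readahead.c. The sketch below is paraphrased, not copied verbatim, and
leaves out the ->readpages() branch and the block-layer plugging:

static int read_pages(struct address_space *mapping, struct file *filp,
		struct list_head *pages, unsigned nr_pages)
{
	unsigned page_idx;

	for (page_idx = 0; page_idx < nr_pages; page_idx++) {
		struct page *page = list_entry(pages->prev, struct page, lru);

		list_del(&page->lru);
		/*
		 * This is step 1) in the trace: it returns -EEXIST for
		 * every page that squashfs_readpage() already inserted at
		 * step 2), so all readahead pages after the first one are
		 * simply dropped below.
		 */
		if (!add_to_page_cache_lru(page, mapping,
					page->index, GFP_KERNEL))
			mapping->a_ops->readpage(filp, page);
		page_cache_release(page);
	}
	return 0;
}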
Re: Re: Re : Re: [PATCH] Squashfs: add asynchronous read support
> Please don't break the thread.
> You should reply to my mail instead of your original post.

Sorry, it seems to be my mailer issue. I'm trying to fix it.

> It's a result which isn't what I want to know.
> What I want to know is why the upper layer issues more I/O per second.
> For example, you read 32K, so the MM layer will prepare 8 pages to read
> in, but when issuing the first page, squashfs makes 32 pages and fills
> the page cache (if we assume you use 128K compression), so the 7 pages
> the MM layer already prepared would be freed without further I/O, and
> do_generic_file_read will wait for completion by lock_page without
> further I/O queueing. It's not surprising.
> One of the pages freed is a READA-marked page, so readahead couldn't
> work. If readahead works, it would be just by luck. Actually, by
> simulating 64K dd, I found the readahead logic would be triggered, but
> it's just by luck and it's not intended, I think.

The MM layer's readahead pages would not be freed immediately.
Squashfs can use them via grab_cache_page_nowait, and the READA-marked
page is available.
Intentional or not, readahead works pretty well. I checked it in an
experiment.

> If the first issued I/O completes, squashfs decompresses the I/O into
> 128K of pages, so all 4 iterations (128K/32K) would hit in the page
> cache. If all 128K hit in the page cache, the MM layer starts to issue
> the next I/O and repeats the above logic until you end up reading the
> whole file.
> So my opinion is that the upper layer wouldn't issue more I/O logically.
> If it worked, it's not what we expect but a side-effect.
>
> That's why I'd like to know your thought on what is increasing IOPS.
> Please, could you say why you think IOPS increased, not a result on the
> low-level driver?

It is because readahead can work asynchronously in the background.
Suppose that you read a large file by 128k, partially and contiguously,
like "dd bs=128k". Two IOs can be issued per 128k read: the first IO is
for the intended pages, the second IO is for readahead. If the first IO
hits in the cache thanks to previous readahead, no wait for IO completion
is needed, because the intended page is already up-to-date. But current
squashfs waits for the second IO's completion unnecessarily. That is one
of the reasons that we should move the page's up-to-date handling to the
asynchronous area, like my patch does.

> Anyway, in my opinion, we should take care of the MM layer's readahead
> to enhance sequential I/O. For that, we should use the buffer pages
> passed by the MM instead of freeing them and allocating new pages in
> squashfs.
> IMHO, it would be better to implement squashfs_readpages, but my insight
> is very weak so I guess Phillip will give more good ideas/insight about
> the issue.

That's a good point. Also, I think my patch is another way which can be
implemented without significant impact on the current implementation, and
I'll wait for Phillip's comment.

Thanks
Chanho
Re: Re : Re: [PATCH] Squashfs: add asynchronous read support
Hello,

Please don't break the thread.
You should reply to my mail instead of your original post.

On Wed, Dec 18, 2013 at 01:29:37PM +0900, Chanho Min wrote:
> > I did test it on x86 with a USB stick and ARM with eMMC on my Nexus 4.
> > In the experiment, I couldn't see much gain like you on both systems,
> > and it even regressed at the bs=32k test, maybe due to workqueue
> > allocation/scheduling of work per I/O.
> > Is your test rather special, or what am I missing?
> Can you specify your test result on ARM with eMMC?

Sure.

        before    after
32K     3.6M      3.4M
64K     6.3M      8.2M
128K    11.4M     11.7M
160K    13.6M     13.8M
256K    19.8M     19M
288K    21.3M     20.8M

> > Before that, I'd like to know the fundamental reason why your
> > implementation for asynchronous read enhances performance. At first
> > glance, I thought it was caused by readahead from the MM layer, but
> > when I read the code, I found I was wrong.
> > The MM's readahead logic works based on the PageReadahead marker, but
> > squashfs invalidates it via grab_cache_page_nowait, so it wouldn't
> > work as we expected.
> >
> > Another possibility is block I/O merging in the block layer by the
> > plugging logic, which was what I tried a few months ago, although the
> > implementation was really bad. But it wouldn't work with your patch
> > because do_generic_file_read will unplug the block layer by lock_page
> > without merging enough I/O.
> >
> > So, what do you think is the real actuator that enhances your
> > experiment? Then, I could investigate why I can't get a benefit.
> Currently, squashfs adds requests to the block device queue
> synchronously with a wait for completion. mmc takes these requests one
> by one and pushes them to the host driver, but it allows mmc to be idle
> frequently. This patch allows block requests to be added asynchronously
> without waiting for completion, so mmcqd can fetch a lot of requests
> from the block layer at a time. As a result, mmcqd gets busy and uses
> more of the mmc bandwidth.
> For the test, I added two count variables in mmc_queue_thread as below
> and tested the same dd transfer.
>
> static int mmc_queue_thread(void *d)
> {
> 	..
> 	do {
> 		if (req || mq->mqrq_prev->req) {
> 			fetch++;	/* a request was fetched */
> 		} else {
> 			idle++;		/* the queue was empty */
> 		}
> 	} while (1);
> 	..
> }
>
> without the patch:
> fetch: 920, idle: 460
>
> with the patch:
> fetch: 918, idle: 40

It's a result which isn't what I want to know.
What I want to know is why the upper layer issues more I/O per second.

For example, you read 32K, so the MM layer will prepare 8 pages to read
in, but when issuing the first page, squashfs makes 32 pages and fills the
page cache (if we assume you use 128K compression), so the 7 pages the MM
layer already prepared would be freed without further I/O, and
do_generic_file_read will wait for completion by lock_page without further
I/O queueing. It's not surprising.
One of the pages freed is a READA-marked page, so readahead couldn't work.
If readahead works, it would be just by luck. Actually, by simulating 64K
dd, I found the readahead logic would be triggered, but it's just by luck
and it's not intended, I think.

If the first issued I/O completes, squashfs decompresses the I/O into 128K
of pages, so all 4 iterations (128K/32K) would hit in the page cache.
If all 128K hit in the page cache, the MM layer starts to issue the next
I/O and repeats the above logic until you end up reading the whole file.
So my opinion is that the upper layer wouldn't issue more I/O logically.
If it worked, it's not what we expect but a side-effect.

That's why I'd like to know your thought on what is increasing IOPS.
Please, could you say why you think IOPS increased, not a result on the
low-level driver?

Anyway, in my opinion, we should take care of the MM layer's readahead to
enhance sequential I/O. For that, we should use the buffer pages passed by
the MM instead of freeing them and allocating new pages in squashfs.
IMHO, it would be better to implement squashfs_readpages, but my insight
is very weak so I guess Phillip will give more good ideas/insight about
the issue.

Thanks!

> Thanks
> Chanho.

--
Kind regards,
Minchan Kim
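For reference, the READA/PageReadahead mark mentioned above is consumed in
do_generic_file_read() in mm/filemap.c. Paraphrased (not verbatim), the
per-page loop does roughly the following; a page freed and then re-created
by grab_cache_page_nowait() never carries the mark, so the next
asynchronous readahead window is never started:

		/* paraphrased from do_generic_file_read(), mm/filemap.c */
		page = find_get_page(mapping, index);
		if (page == NULL) {
			/* cache miss: start a synchronous readahead window */
			page_cache_sync_readahead(mapping, ra, filp,
					index, last_index - index);
			page = find_get_page(mapping, index);
			if (unlikely(page == NULL))
				goto no_cached_page;
		}
		if (PageReadahead(page)) {
			/*
			 * This is the mark squashfs loses when it drops the
			 * page and re-creates it with grab_cache_page_nowait:
			 * without it, the async window never advances.
			 */
			page_cache_async_readahead(mapping, ra, filp,
					page, index, last_index - index);
		}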
Re : Re: [PATCH] Squashfs: add asynchronous read support
> I did test it on x86 with a USB stick and ARM with eMMC on my Nexus 4.
> In the experiment, I couldn't see much gain like you on both systems,
> and it even regressed at the bs=32k test, maybe due to workqueue
> allocation/scheduling of work per I/O.
> Is your test rather special, or what am I missing?

Can you specify your test result on ARM with eMMC?

> Before that, I'd like to know the fundamental reason why your
> implementation for asynchronous read enhances performance. At first
> glance, I thought it was caused by readahead from the MM layer, but when
> I read the code, I found I was wrong.
> The MM's readahead logic works based on the PageReadahead marker, but
> squashfs invalidates it via grab_cache_page_nowait, so it wouldn't work
> as we expected.
>
> Another possibility is block I/O merging in the block layer by the
> plugging logic, which was what I tried a few months ago, although the
> implementation was really bad. But it wouldn't work with your patch
> because do_generic_file_read will unplug the block layer by lock_page
> without merging enough I/O.
>
> So, what do you think is the real actuator that enhances your
> experiment? Then, I could investigate why I can't get a benefit.

Currently, squashfs adds requests to the block device queue synchronously
with a wait for completion. mmc takes these requests one by one and pushes
them to the host driver, but it allows mmc to be idle frequently. This
patch allows block requests to be added asynchronously without waiting for
completion, so mmcqd can fetch a lot of requests from the block layer at a
time. As a result, mmcqd gets busy and uses more of the mmc bandwidth.

For the test, I added two count variables in mmc_queue_thread as below and
tested the same dd transfer.

static int mmc_queue_thread(void *d)
{
	..
	do {
		if (req || mq->mqrq_prev->req) {
			fetch++;	/* a request was fetched */
		} else {
			idle++;		/* the queue was empty */
		}
	} while (1);
	..
}

without the patch:
fetch: 920, idle: 460

with the patch:
fetch: 918, idle: 40

Thanks
Chanho.
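To make the contrast concrete, the two shapes being compared look roughly
like the fragments below. This is an illustration only, simplified from
the in-tree fs/squashfs/block.c read path and from the description of the
patch; squashfs_read_end_io() is a made-up name standing in for the
patch's end-of-IO handler, which queues io_assoc->read_work onto the
workqueue.

	/* synchronous shape: submit, then block until every bh is done */
	ll_rw_block(READ, b, bh);
	for (k = 0; k < b; k++)
		wait_on_buffer(bh[k]);
	/* only now can we decompress, copy and unlock the pages, so the
	 * device queue drains (and mmcqd idles) between squashfs blocks */

	/* asynchronous shape: hand completion to an end-of-IO callback */
	for (k = 0; k < b; k++) {
		bh[k]->b_end_io = squashfs_read_end_io;	/* runs at EOIO */
		bh[k]->b_private = io_assoc;
		submit_bh(READ, bh[k]);
	}
	/* return immediately; the callback queues io_assoc->read_work and
	 * the workqueue decompresses, copies and unlocks the pages later */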
Re: [PATCH] Squashfs: add asynchronous read support
Hello Chanho,

On Mon, Dec 16, 2013 at 02:30:26PM +0900, Chanho Min wrote:
> This patch removes the synchronous wait for buffers to become up-to-date
> at the file system level. Instead, all operations after submit_bh are
> moved into the End-of-IO handler and its associated workqueue. It
> decompresses/copies data into pages and unlocks them asynchronously.
>
> This patch enhances the performance of Squashfs in most cases.
> Especially, large file reading is improved significantly.
>
> dd read test:
>
> - ARM cortex-a9 1GHz, 2 cores, eMMC 4.5 HS200 mode.
> - dd if=file1 of=/dev/null bs=64k
>
> Before
> 58707718 bytes (56.0MB) copied, 1.393653 seconds, 40.2MB/s
>
> After
> 58707718 bytes (56.0MB) copied, 0.942413 seconds, 59.4MB/s

It's really nice!

I did test it on x86 with a USB stick and ARM with eMMC on my Nexus 4.
In the experiment, I couldn't see much gain like you on both systems, and
it even regressed at the bs=32k test, maybe due to workqueue
allocation/scheduling of work per I/O.
Is your test rather special, or what am I missing?

Before that, I'd like to know the fundamental reason why your
implementation for asynchronous read enhances performance. At first
glance, I thought it was caused by readahead from the MM layer, but when I
read the code, I found I was wrong.
The MM's readahead logic works based on the PageReadahead marker, but
squashfs invalidates it via grab_cache_page_nowait, so it wouldn't work as
we expected.

Another possibility is block I/O merging in the block layer by the
plugging logic, which was what I tried a few months ago, although the
implementation was really bad. But it wouldn't work with your patch
because do_generic_file_read will unplug the block layer by lock_page
without merging enough I/O.

So, what do you think is the real actuator that enhances your experiment?
Then, I could investigate why I can't get a benefit.

Thanks for looking at this.

> Signed-off-by: Chanho Min <chanho@lge.com>
> ---
>  fs/squashfs/Kconfig       |    9 ++
>  fs/squashfs/block.c       |  262 +
>  fs/squashfs/file_direct.c |    8 +-
>  fs/squashfs/page_actor.c  |    3 +-
>  fs/squashfs/page_actor.h  |    3 +-
>  fs/squashfs/squashfs.h    |    2 +
>  6 files changed, 284 insertions(+), 3 deletions(-)
>
> diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
> index b6fa865..284aa5a 100644
> --- a/fs/squashfs/Kconfig
> +++ b/fs/squashfs/Kconfig
> @@ -51,6 +51,15 @@ config SQUASHFS_FILE_DIRECT
>  	  it eliminates a memcpy and it also removes the lock contention
>  	  on the single buffer.
>  
> +config SQUASHFS_READ_DATA_ASYNC
> +	bool "Read and decompress data asynchronously"
> +	depends on SQUASHFS_FILE_DIRECT
> +	help
> +	  By default Squashfs read data synchronously by block (default 128k).
> +	  This option removes such a synchronous wait in the file system level.
> +	  All works after submit IO do at the End-of-IO handler asynchronously.
> +	  This enhances the performance of Squashfs in most cases, especially,
> +	  large file reading.
>  endchoice
>  
>  choice
> diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
> index 0cea9b9..1517ca3 100644
> --- a/fs/squashfs/block.c
> +++ b/fs/squashfs/block.c
> @@ -212,3 +212,265 @@ read_failure:
>  	kfree(bh);
>  	return -EIO;
>  }
> +
> +#ifdef CONFIG_SQUASHFS_READ_DATA_ASYNC
> +
> +struct squashfs_end_io_assoc {
> +	int offset;
> +	int b_count;
> +	int compressed;
> +	int length;
> +	struct squashfs_page_actor *p_actor;
> +	struct buffer_head **__bh;
> +	struct squashfs_sb_info *msblk;
> +	struct work_struct read_work;
> +};
> +
> +static int squashfs_copy_page(struct squashfs_sb_info *msblk,
> +	struct buffer_head **bh, int b, int offset, int length,
> +	struct squashfs_page_actor *output)
> +{
> +	/*
> +	 * Block is uncompressed.
> +	 */
> +	int in, pg_offset = 0, avail = 0, bytes, k = 0;
> +	void *data = squashfs_first_page(output);
> +	for (bytes = length; k < b; k++) {
> +		in = min(bytes, msblk->devblksize - offset);
> +		bytes -= in;
> +		while (in) {
> +			if (pg_offset == PAGE_CACHE_SIZE) {
> +				data = squashfs_next_page(output);
> +				pg_offset = 0;
> +			}
> +			avail = min_t(int, in, PAGE_CACHE_SIZE -
> +				pg_offset);
> +			memcpy(data + pg_offset, bh[k]->b_data + offset,
> +				avail);
> +			in -= avail;
> +			pg_offset += avail;
> +			offset += avail;
> +		}
> +		offset = 0;
> +		put_bh(bh[k]);
> +	}
> +	squashfs_finish_page(output);
> +	return length;
> +}
> +
> +/*
> + * This is executed in workqueue for squashfs_read_data_async().
> + * - pages come decompressed/copied and unlocked asynchronously.
> + */
> +static void
[PATCH] Squashfs: add asynchronous read support
This patch removes the synchronous wait for buffers to become up-to-date
at the file system level. Instead, all operations after submit_bh are
moved into the End-of-IO handler and its associated workqueue. It
decompresses/copies data into pages and unlocks them asynchronously.

This patch enhances the performance of Squashfs in most cases.
Especially, large file reading is improved significantly.

dd read test:

- ARM cortex-a9 1GHz, 2 cores, eMMC 4.5 HS200 mode.
- dd if=file1 of=/dev/null bs=64k

Before
58707718 bytes (56.0MB) copied, 1.393653 seconds, 40.2MB/s

After
58707718 bytes (56.0MB) copied, 0.942413 seconds, 59.4MB/s

Signed-off-by: Chanho Min <chanho@lge.com>
---
 fs/squashfs/Kconfig       |    9 ++
 fs/squashfs/block.c       |  262 +
 fs/squashfs/file_direct.c |    8 +-
 fs/squashfs/page_actor.c  |    3 +-
 fs/squashfs/page_actor.h  |    3 +-
 fs/squashfs/squashfs.h    |    2 +
 6 files changed, 284 insertions(+), 3 deletions(-)

diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
index b6fa865..284aa5a 100644
--- a/fs/squashfs/Kconfig
+++ b/fs/squashfs/Kconfig
@@ -51,6 +51,15 @@ config SQUASHFS_FILE_DIRECT
 	  it eliminates a memcpy and it also removes the lock contention
 	  on the single buffer.
 
+config SQUASHFS_READ_DATA_ASYNC
+	bool "Read and decompress data asynchronously"
+	depends on SQUASHFS_FILE_DIRECT
+	help
+	  By default Squashfs read data synchronously by block (default 128k).
+	  This option removes such a synchronous wait in the file system level.
+	  All works after submit IO do at the End-of-IO handler asynchronously.
+	  This enhances the performance of Squashfs in most cases, especially,
+	  large file reading.
 endchoice
 
 choice
diff --git a/fs/squashfs/block.c b/fs/squashfs/block.c
index 0cea9b9..1517ca3 100644
--- a/fs/squashfs/block.c
+++ b/fs/squashfs/block.c
@@ -212,3 +212,265 @@ read_failure:
 	kfree(bh);
 	return -EIO;
 }
+
+#ifdef CONFIG_SQUASHFS_READ_DATA_ASYNC
+
+struct squashfs_end_io_assoc {
+	int offset;
+	int b_count;
+	int compressed;
+	int length;
+	struct squashfs_page_actor *p_actor;
+	struct buffer_head **__bh;
+	struct squashfs_sb_info *msblk;
+	struct work_struct read_work;
+};
+
+static int squashfs_copy_page(struct squashfs_sb_info *msblk,
+	struct buffer_head **bh, int b, int offset, int length,
+	struct squashfs_page_actor *output)
+{
+	/*
+	 * Block is uncompressed.
+	 */
+	int in, pg_offset = 0, avail = 0, bytes, k = 0;
+	void *data = squashfs_first_page(output);
+	for (bytes = length; k < b; k++) {
+		in = min(bytes, msblk->devblksize - offset);
+		bytes -= in;
+		while (in) {
+			if (pg_offset == PAGE_CACHE_SIZE) {
+				data = squashfs_next_page(output);
+				pg_offset = 0;
+			}
+			avail = min_t(int, in, PAGE_CACHE_SIZE -
+				pg_offset);
+			memcpy(data + pg_offset, bh[k]->b_data + offset,
+				avail);
+			in -= avail;
+			pg_offset += avail;
+			offset += avail;
+		}
+		offset = 0;
+		put_bh(bh[k]);
+	}
+	squashfs_finish_page(output);
+	return length;
+}
+
+/*
+ * This is executed in workqueue for squashfs_read_data_async().
+ * - pages come decompressed/copied and unlocked asynchronously.
+ */
+static void squashfs_buffer_read_async(struct squashfs_end_io_assoc *io_assoc)
+{
+	struct squashfs_sb_info *msblk = io_assoc->msblk;
+	struct squashfs_page_actor *actor = io_assoc->p_actor;
+	struct page **page = actor->page;
+	int pages = actor->pages;
+	struct page *target_page = actor->target_page;
+	int i, length, bytes = 0;
+	void *pageaddr;
+
+	if (io_assoc->compressed) {
+		length = squashfs_decompress(msblk, io_assoc->__bh,
+			io_assoc->b_count, io_assoc->offset,
+			io_assoc->length, actor);
+		if (length < 0) {
+			ERROR("squashfs_read_data failed to read block\n");
+			goto read_failure;
+		}
+	} else
+		length = squashfs_copy_page(msblk, io_assoc->__bh,
+			io_assoc->b_count, io_assoc->offset,
+			io_assoc->length, actor);
+
+	/* Last page may have trailing bytes not filled */
+	bytes = length % PAGE_CACHE_SIZE;
+	if (bytes) {
+		pageaddr = kmap_atomic(page[pages - 1]);
+		memset(pageaddr + bytes, 0, PAGE_CACHE_SIZE - bytes);
+		kunmap_atomic(pageaddr);
+	}
+
+	/* Mark