RE: [PATCH V2 2/7] Cleancache (was Transcendent Memory): core files
> I just finished a rough (but working) implementation of in-kernel
> page cache compression backend (called zcache). During this work,
> I found some issues with cleancache, mostly related to (lack of)
> comments/documentation:

Great to hear! And excellent feedback on the missing documentation...
I am working on this right now, so your feedback is very timely.
(Documentation and function return-value comments deleted, as I will
fix all of them.)

> > +
> > +static inline int cleancache_init_fs(size_t pagesize)
> > +
>
> - It seems that returning pool_id of 0 is considered as error
> condition (as it appears from deactivate_locked_super() changes).
> This seems weird; I think only negative pool_id should be considered
> as error. Anyway, please add function comments for these.

Hmmm... this is a bug. 0 is a valid pool_id. I'll fix it for the
next rev.

> Page cache compression was a long-pending project. I'm glad it's
> coming into shape with the help of cleancache :)

Thanks!
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V2 2/7] Cleancache (was Transcendent Memory): core files
Hi,

On 05/28/2010 11:05 PM, Dan Magenheimer wrote:
> [PATCH V2 2/7] Cleancache (was Transcendent Memory): core files

I just finished a rough (but working) implementation of an in-kernel
page cache compression backend (called zcache). During this work,
I found some issues with cleancache, mostly related to (lack of)
comments/documentation:

> +
> +static inline int cleancache_init_fs(size_t pagesize)
> +

 - It is not very obvious that this function is called when an instance
of a cleancache-supported filesystem is *mounted*. Initially, I thought
it was called when any such filesystem module is loaded.

 - It seems that returning a pool_id of 0 is considered an error
condition (as it appears from the deactivate_locked_super() changes).
This seems weird; I think only a negative pool_id should be considered
an error. Anyway, please add function comments for these.

> +int __cleancache_get_page(struct page *page)
> +{
> +	int ret = 0;
> +	int pool_id = page->mapping->host->i_sb->cleancache_poolid;
> +
> +	if (pool_id >= 0) {
> +		ret = (*cleancache_ops->get_page)(pool_id,
> +					page->mapping->host->i_ino,
> +					page->index,
> +					page);
> +		if (ret == CLEANCACHE_GET_PAGE_SUCCESS)
> +			succ_gets++;
> +		else
> +			failed_gets++;
> +	}
> +	return ret;
> +}

It seems "non-standard" to use '1' as the success code. You could simply
use 0 for success and a negative error code for failure. Then you can
also get rid of CLEANCACHE_GET_PAGE_SUCCESS.

> +
> +int __cleancache_put_page(struct page *page)

What return value stands for a successful put? 1? Anyway, following the
same scheme (0 for success, negative codes for errors) seems better.

> +
> +int __cleancache_flush_page(struct address_space *mapping, struct page *page)
> +int __cleancache_flush_inode(struct address_space *mapping)

The return values of all the flush functions are ignored everywhere, so
why not make them return void instead?
> +static inline void cleancache_flush_fs(int pool_id)

Like init_fs, please document that it is called when a cleancache-aware
filesystem is unmounted (or in other cases too?).

Page cache compression was a long-pending project. I'm glad it's
coming into shape with the help of cleancache :)

Thanks,
Nitin
missing include from btrfsck.c?
I'm not a C developer, but I like to think I know enough to be
dangerous (pragmatic) :-D

Building from git master failed with:

  ..
  ..
  gcc -Wp,-MMD,./.btrfsck.o.d,-MT,btrfsck.o -Wall -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2 -g -Werror -Os -c btrfsck.c
  cc1: warnings being treated as errors
  btrfsck.c: In function 'maybe_free_inode_rec':
  btrfsck.c:323:2: error: implicit declaration of function 'S_ISDIR'
  btrfsck.c:328:2: error: implicit declaration of function 'S_ISREG'
  btrfsck.c:328:2: error: implicit declaration of function 'S_ISLNK'
  make: *** [btrfsck.o] Error 1

Grepping the source turned up several other files successfully using
those functions. After a quick search, they looked to be a part of
stat... and the other files were all including <sys/stat.h>. I'm not
sure if it's my gcc being paranoid (archlinux), but adding:

  #include <sys/stat.h>

to btrfsck.c fixed the issue for me.

C Anthony
Re: [Fwd: Re: Linking two files together][RFC]
On Wed, 9 Jun 2010, Roberto Ragusa wrote:
> I hope that ideas about btrfs are not off-topic for this mailing list.
>
> The forwarded message below was written by me on fedora-users.
> The thread is about the ability to link two files in a manner
> similar to "cat 1 2 >3 && rm 1 2" while avoiding any data
> movement on the disk.
> The implementation should just put the original extents together in
> the new file. Is there any filesystem which is capable of doing that?
> As btrfs is already based on extents and COW, couldn't this feature be
> evaluated for feasibility? I think a lot of usages will be found
> for it if actually implemented.

Btrfs already has a CLONE_RANGE ioctl that will clone a range of
(block-aligned) bytes from file A to any offset in file B. The fs just
fixes up the file metadata to reference the same bytes on disk without
reading or writing any actual file data.

sage
Re: [Fwd: Re: Linking two files together][RFC]
Roberto Ragusa writes:
> I hope that ideas about btrfs are not off-topic for this mailing list.
>
> The forwarded message below was written by me on fedora-users.
> The thread is about the ability to link two files in a manner
> similar to "cat 1 2 >3 && rm 1 2" while avoiding any data
> movement on the disk.

OCFS2 can do this today with "reflinks".

-Andi

--
a...@linux.intel.com -- Speaking for myself only.
[PATCH][RFC] Complex filesystem operations: split and join
I had a need to split a file into smaller files on a thumb drive with
no free space on it or anywhere else in the system. When the filesystem
supports sparse files (truncate_range), I could create the new files
while punching holes in the original file. But when the underlying fs
is FAT, I couldn't.

Also, why should we do needless I/O when all I want is to split/join
files? All the data are already on the disk, under the same filesystem;
I just want to make some metadata changes.

So I added two inode operations, namely split and join, that let me
tell the OS that all I want is metadata changes, so the filesystem can
avoid doing lots of I/O when only metadata changes are needed.

sys_split(fd1, n, fd2)
  1. Attach the data of the file after n bytes in fd1 to fd2.
  2. Truncate fd1 to n bytes.

Roughly the equivalent of the following commands:
  1. dd if=file1 of=file2 skip=n
  2. truncate -c -s n file1

sys_join(fd1, fd2)
  1. Extend fd1 with the data of fd2.
  2. Truncate fd2 to 0.

Roughly the equivalent of the following commands:
  1. dd if=file2 of=file1 seek=`filesize file1`
  2. truncate -c -s 0 file2

Attached is the patch that adds these new syscalls and support for them
to the FAT filesystem.

I guess this approach could be extended to a splice()-kind of call
between files, instead of pipes. On a COW fs, splice could simply set
up blocks as shared between files instead of doing I/O. It would be a
kind of explicit online data deduplication: later, when a file modifies
any of those blocks, we copy the blocks, i.e., COW.

Thanks
Nikanth

p.s: Strangely, fibrils and syslets came to my mind when thinking along
these lines. But I guess fibrils and syslets are not really related to
this.

From: Nikanth Karthikesan
Subject: vfs and vfat: add filesystem operations: split and join

Add 2 new inode operations, namely sys_split and sys_join, with the
following semantics:

sys_split(fd1, n, fd2)
  1. Attach the data of the file after n bytes in fd1 to fd2.
  2. Truncate fd1 to n bytes.
sys_join(fd1, fd2)
  1. Extend fd1 with the data of fd2.
  2. Truncate fd2 to 0.

These avoid the unnecessary I/O that would be needed if the same were
accomplished using only read, write, and truncate. Using read/write
would also require temporary additional free space on filesystems that
do not support sparse files.

The files should belong to the same super block. The split should be on
a cluster boundary, i.e., the offset should be a multiple of the
cluster size (the filesystem block size). For join, the size of the
destination file should be a multiple of the filesystem block size,
i.e., the FAT cluster size. Also, the syscalls are added only to
x86_64, for now.

Some performance numbers for splitting a file into half of its size and
then concatenating it back together, with and without these syscalls:

  filesize   Using sys_split & sys_join   Using read/write
  1GB        0.080                         56.557
  2GB        0.117                        116.140
  3GB        0.112                        144.658

All numbers are seconds.

Signed-off-by: Nikanth Karthikesan

---

diff --git a/arch/x86/include/asm/unistd_64.h b/arch/x86/include/asm/unistd_64.h
index ff4307b..0b9bdf8 100644
--- a/arch/x86/include/asm/unistd_64.h
+++ b/arch/x86/include/asm/unistd_64.h
@@ -663,6 +663,10 @@ __SYSCALL(__NR_rt_tgsigqueueinfo, sys_rt_tgsigqueueinfo)
 __SYSCALL(__NR_perf_event_open, sys_perf_event_open)
 #define __NR_recvmmsg				299
 __SYSCALL(__NR_recvmmsg, sys_recvmmsg)
+#define __NR_split				300
+__SYSCALL(__NR_split, sys_split)
+#define __NR_join				301
+__SYSCALL(__NR_join, sys_join)
 
 #ifndef __NO_STUBS
 #define __ARCH_WANT_OLD_READDIR
diff --git a/fs/fat/file.c b/fs/fat/file.c
index 990dfae..81e426c 100644
--- a/fs/fat/file.c
+++ b/fs/fat/file.c
@@ -453,7 +453,118 @@ out:
 }
 EXPORT_SYMBOL_GPL(fat_setattr);
 
+/*
+ * Join the cluster chain of tail_inode to the end of head_inode.
+ */
+int fat_join(struct inode *head_inode, struct inode *tail_inode)
+{
+	struct super_block *sb = head_inode->i_sb;
+	struct msdos_sb_info *sbi = MSDOS_SB(sb);
+	int nr_cluster;
+	int ret = 0;
+
+	nr_cluster = head_inode->i_size >> sbi->cluster_bits;
+	if (nr_cluster << sbi->cluster_bits != head_inode->i_size) {
+		return -EINVAL;
+	}
+
+	nr_cluster = tail_inode->i_size >> sbi->cluster_bits;
+
+	fat_cache_inval_inode(head_inode);
+	fat_cache_inval_inode(tail_inode);
+
+	ret = fat_chain_add(head_inode, MSDOS_I(tail_inode)->i_start, nr_cluster);
+	if (ret)
+		goto out;
+
+	MSDOS_I(tail_inode)->i_start = MSDOS_I(tail_inode)->i_logstart = 0;
+	ret = simple_setsize(head_inode, head_inode->i_size + tail_inode->i_size);
+	if (ret)
+		goto out;
+	head_inode->i_blocks = ((head_inode->i_size + tail_inode->i_size) >> sbi->cluster_bits) << (sbi->cluster_bits - 9);
Re: [Fwd: Re: Linking two files together][RFC]
On Wednesday 09 June 2010 13:53:00 Roberto Ragusa wrote:
> Hi,
>
> I hope that ideas about btrfs are not off-topic for this mailing list.
>
> The forwarded message below was written by me on fedora-users.
> The thread is about the ability to link two files in a manner
> similar to "cat 1 2 >3 && rm 1 2" while avoiding any data
> movement on the disk.
> The implementation should just put the original extents together in
> the new file. Is there any filesystem which is capable of doing that?
> As btrfs is already based on extents and COW, couldn't this feature be
> evaluated for feasibility? I think a lot of usages will be found
> for it if actually implemented.

It will come naturally with online data deduplication -- though, at the
moment, the only FS I know of that can do this is ZFS. Otherwise, we
would need completely new system calls to perform those operations.

> Read the following part if interested.
>
> Thanks.
>
> -------- Original Message --------
> From: - Thu May 27 20:44:26 2010
> X-Mozilla-Status: 0001
> X-Mozilla-Status2:
> Message-ID: <4bfe537b.8050...@robertoragusa.it>
> Date: Thu, 27 May 2010 13:11:55 +0200
> From: Roberto Ragusa
> User-Agent: Thunderbird 2.0.0.23 (X11/20090825)
> MIME-Version: 1.0
> To: Community support for Fedora users
> Subject: Re: Linking two files together
> References: <7f593570d3366e4e85c76baf70fd0eed0106dbf31...@cvmmbx.vetmed.wsu.edu>
>  <4bfd589f.7090...@kjchome.homeip.net>
> In-Reply-To: <4bfd589f.7090...@kjchome.homeip.net>
> X-Enigmail-Version: 0.96.0
> Content-Type: text/plain; charset=ISO-8859-1
> Content-Transfer-Encoding: 7bit
>
> Kevin J. Cummings wrote:
> > On 05/26/2010 01:16 PM, Rector, David wrote:
> >> Hello,
> >>
> >> I have studied various filesystems, and am fairly familiar with how
> >> they are structured. However, I am currently stuck on trying to do
> >> what seems like a simple thing.
> >>
> >> I would like to join two files together without having to physically
> >> copy bytes (i.e. I have very large files, so I don't want to use
> >> 'cat'). It seems to me that it should be possible to simply modify
> >> the file entry in the filesystem such that the last inode of the
> >> first file points to the first inode of the second file. I guess this
> >> is similar to a "hard link", but used to join files rather than
> >> simply have another pointer to one file.
> >>
> >> I have seen 'mmv' and 'lxsplit' and they all seem to do the same
> >> thing, namely they want to physically copy the bytes in order to join
> >> two files together.
> >>
> >> Is there any such utility in linux to perform such a hard link to
> >> join or connect two files together without having to copy bytes?
> >
> > If you could guarantee that the last extent used by the first file was
> > completely full of data with no extraneous bytes, it might be possible
> > to "merge" the extent maps of the 2 files into a single file entry. If
> > you cannot guarantee that, then you will have to copy bytes from the
> > 2nd file to the end of the first file.
>
> But everything becomes possible if the filesystem permits partially
> empty blocks in the middle of the file. No filesystem does it AFAIK,
> but it is not a big issue, as partial blocks (or compacted tails) are
> already permitted at the end of the file. New filesystems use extents
> rather than blocks, so if the extents are measured in bytes instead of
> 512b blocks you can just use a smaller extent in the middle of the file
> where the join happened.
>
> At this point, you can support in-place joining, in-place inflating
> (add 1 byte in this file at position 30), in-place erasure (remove
> 1 byte at position 30) and data shuffling (swap the first 50meg
> of the file with the last 50meg).
>
> With heavy usage you have just created a new kind of fragmentation,
> which can be corrected with the usual defragmentation tools (including
> "cp"). (Add that fragmentation is losing importance with the spread
> of SSDs.)
>
> Considering that sparse files have been a reality for decades and that
> the implementation of operations with inside-file byte-grained extents
> is not more difficult than truncate, I wonder if we will see something
> of this kind in some advanced filesystem (btrfs?).
>
> There are a lot of possible uses:
> - delete/replace mail in mbox format repositories
> - smart packaging (delete from tar, delete from zip)
> - in-place iso creation
> and just imagine:
> - video editing (!) add/remove/replace frames inside a 150GiB
>   captured video
>
> Where can you submit ideas to btrfs?
> It also has COW, so everything becomes even more exciting...

-- 
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl
System Zarządzania Jakością zgodny z normą ISO 9001:2000
[Fwd: Re: Linking two files together][RFC]
Hi,

I hope that ideas about btrfs are not off-topic for this mailing list.

The forwarded message below was written by me on fedora-users.
The thread is about the ability to link two files in a manner
similar to "cat 1 2 >3 && rm 1 2" while avoiding any data
movement on the disk.
The implementation should just put the original extents together in
the new file. Is there any filesystem which is capable of doing that?
As btrfs is already based on extents and COW, couldn't this feature be
evaluated for feasibility? I think a lot of usages will be found
for it if actually implemented.

Read the following part if interested.

Thanks.

-------- Original Message --------
From: - Thu May 27 20:44:26 2010
X-Mozilla-Status: 0001
X-Mozilla-Status2:
Message-ID: <4bfe537b.8050...@robertoragusa.it>
Date: Thu, 27 May 2010 13:11:55 +0200
From: Roberto Ragusa
User-Agent: Thunderbird 2.0.0.23 (X11/20090825)
MIME-Version: 1.0
To: Community support for Fedora users
Subject: Re: Linking two files together
References: <7f593570d3366e4e85c76baf70fd0eed0106dbf31...@cvmmbx.vetmed.wsu.edu>
 <4bfd589f.7090...@kjchome.homeip.net>
In-Reply-To: <4bfd589f.7090...@kjchome.homeip.net>
X-Enigmail-Version: 0.96.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Kevin J. Cummings wrote:
> On 05/26/2010 01:16 PM, Rector, David wrote:
>> Hello,
>>
>> I have studied various filesystems, and am fairly familiar with how
>> they are structured. However, I am currently stuck on trying to do
>> what seems like a simple thing.
>>
>> I would like to join two files together without having to physically
>> copy bytes (i.e. I have very large files, so I don't want to use
>> 'cat'). It seems to me that it should be possible to simply modify the
>> file entry in the filesystem such that the last inode of the first
>> file points to the first inode of the second file. I guess this is
>> similar to a "hard link", but used to join files rather than simply
>> have another pointer to one file.
>>
>> I have seen 'mmv' and 'lxsplit' and they all seem to do the same
>> thing, namely they want to physically copy the bytes in order to join
>> two files together.
>>
>> Is there any such utility in linux to perform such a hard link to join
>> or connect two files together without having to copy bytes?
>
> If you could guarantee that the last extent used by the first file was
> completely full of data with no extraneous bytes, it might be possible
> to "merge" the extent maps of the 2 files into a single file entry. If
> you cannot guarantee that, then you will have to copy bytes from the
> 2nd file to the end of the first file.

But everything becomes possible if the filesystem permits partially
empty blocks in the middle of the file. No filesystem does it AFAIK,
but it is not a big issue, as partial blocks (or compacted tails) are
already permitted at the end of the file. New filesystems use extents
rather than blocks, so if the extents are measured in bytes instead of
512b blocks you can just use a smaller extent in the middle of the file
where the join happened.

At this point, you can support in-place joining, in-place inflating
(add 1 byte in this file at position 30), in-place erasure (remove
1 byte at position 30) and data shuffling (swap the first 50meg of the
file with the last 50meg).

With heavy usage you have just created a new kind of fragmentation,
which can be corrected with the usual defragmentation tools (including
"cp"). (Add that fragmentation is losing importance with the spread of
SSDs.)

Considering that sparse files have been a reality for decades and that
the implementation of operations with inside-file byte-grained extents
is not more difficult than truncate, I wonder if we will see something
of this kind in some advanced filesystem (btrfs?).

There are a lot of possible uses:
- delete/replace mail in mbox format repositories
- smart packaging (delete from tar, delete from zip)
- in-place iso creation
and just imagine:
- video editing (!) add/remove/replace frames inside a 150GiB captured
  video

Where can you submit ideas to btrfs?
It also has COW, so everything becomes even more exciting...

-- 
Roberto Ragusa    mail at robertoragusa.it
compression type per extent
I am trying to add some hooks to support different compression types.

In end_compressed_bio_read(struct bio *bio, int err) (compression.c),
does anyone know how I can read the compression_type flag set for the
current extent being read, so I can pick the decompressor? I was
looking at using btrfs_get_file_extent_compression, but I am unsure if
this is correct in this function.

At the moment btrfs_zlib_decompress_biovec is called, but obviously
this is because only zlib is supported right now.

Philip