Re: recommendations and contraindications of using btrfs for Oracle Database Server
On Thu, Jan 11, 2018 at 6:23 PM, Nikolay Borisov wrote:
>
> On 11.01.2018 12:51, Ext-Strii-Houttemane Philippe wrote:
>> Hello,
>>
>> We are using the btrfs filesystem on local disks (RAID 1) as the
>> underlying filesystem to host our Oracle 12c datafiles.
>> This allows us to cold-backup databases via snapshot in a few seconds
>> and benefit from higher performance than other Linux filesystems.
>> The problem we hit is that Oracle regularly crashes with errors of
>> these two types; the errors occur on different physical machines
>> running the same software:
>>
>> ORA-63999: data file suffered media failure
>> ORA-01114: IO error writing block to file 99 (block # 99968)
>> ORA-01110: data file 99: '/oradata/PS92PRD/data/pcapp.dbf'
>> ORA-27072: File I/O error
>> Linux-x86_64 Error: 17: File exists
>> Mount options: defaults,nofail,nodatacow,nobarrier,noatime
>>
>> uname -a:
>> Linux 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017
>> x86_64 x86_64 x86_64 GNU/Linux
>
> You are using a vendor-specific kernel. It's best if you turn to them
> for support, since it's very likely their code doesn't match what is in
> upstream, let alone the fact that you are using an ancient kernel.

3.10 is a Red Hat-compatible kernel. They ship btrfs as a tech preview
and will deprecate it after 7.4, so it's probably useless to ask Red Hat
for support on that.

@Philippe, you might want to try Oracle's UEK R4. At least they're still
promoting btrfs improvements in their kernel, and you might be able to
ask them in case of problems (assuming you have a proper subscription
and support). However, IIRC Oracle only supports btrfs for its
application binaries, not for the data files (again, you should contact
Oracle support to be sure).

If you simply want to use the latest kernel (with the latest btrfs
fixes) to see if your problem still occurs, and don't care about
support, you can try kernel-ml (http://elrepo.org/tiki/kernel-ml) or
compile your own.
-- Fajar

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?
On Thu, Aug 3, 2017 at 1:44 AM, Chris Mason wrote:
>
> On 08/02/2017 04:38 AM, Brendan Hide wrote:
>>
>> The title seems alarmist to me - and I suspect it is going to be
>> misconstrued. :-/
>
> Supporting any filesystem is a huge amount of work. I don't have a
> problem with Redhat or any distro picking and choosing the projects
> they want to support.

It'd help a lot of people if things like
https://btrfs.wiki.kernel.org/index.php/Status were kept up-to-date and
'promoted', so at least users are more informed about what they're
getting into and can choose which features (stable / still in dev /
likely to destroy your data) they want to use.

For example, https://btrfs.wiki.kernel.org/index.php/Status says
compression is 'mostly OK' ('auto-repair and compression may crash'
looks pretty scary, as from a newcomer's perspective it might be
interpreted as 'potential data loss'), while
https://en.opensuse.org/SDB:BTRFS#Compressed_btrfs_filesystems says they
support compression on newer openSUSE versions.

> At least inside of FB, our own internal btrfs usage is continuing to
> grow. Btrfs is becoming a big part of how we ship containers and other
> workloads where snapshots improve performance.

Ubuntu also supports btrfs as part of their container implementation
(lxd), and (reading the lxd mailing list) some people use lxd+btrfs in
their production environments. IIRC the last problem posted on the lxd
list about btrfs was about how 'btrfs send/receive (used by lxd copy) is
slower than rsync for a full/initial copy'.

-- Fajar
Re: BTRFS, remarkable problem: filesystem turns to read-only caused by firefox download
On Wed, Jun 15, 2016 at 1:29 PM, Paul Verreth wrote:
> Dear all.
>
> When I download a video using the Firefox DownloadHelper addon, the
> filesystem suddenly turns read-only. Not a coincidence: I tried it
> several times, and it happened every time.
>
> Info:
> Linux wolfgang 4.2.0-35-generic #40-Ubuntu SMP Tue Mar 15 22:15:45 UTC
> 2016 x86_64 x86_64 x86_64 GNU/Linux
> Segmentation fault
>
> Jun 5 15:03:15 ubuntu kernel: [ 2062.544303] BTRFS info (device
> sdb5): relocating block group 383447465984 flags 17
>
> What can I do to repair this problem?

The usual starting advice would be "try with the latest kernel and see
if you can still reproduce the problem".

Is it Ubuntu wily? It goes end-of-life in July anyway, so you might want
to upgrade to xenial (or at least just the kernel, for the purpose of
troubleshooting your problem). Or even try
http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/ (should be
usable, but might report some errors/warnings due to missing Ubuntu
patches).

-- Fajar
Re: btrfs subvolume clone or fork (btrfs-progs feature request)
On Thu, Jul 9, 2015 at 8:20 AM, james harvey wrote:
> Request for new btrfs subvolume subcommand:
>
> clone or fork [-i []
>    Create a subvolume in , which is a clone or fork of source.
>    If is not given, subvolume will be created in the
>    current directory.
>    Options
>    -i
>        Add the newly created subvolume to a qgroup. This option can
>        be given multiple times.
>
> Would (I think):
> * btrfs subvolume create
> * cp -ax --reflink=always /* /

What's wrong with "btrfs subvolume snapshot"?

-- Fajar
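For illustration, the manual "clone" the request describes can be
sketched like this (paths are hypothetical; `--reflink=auto` is used
here so the example runs on any filesystem, falling back to a plain copy
where reflinks aren't supported, while on btrfs `--reflink=always`
shares extents much like a snapshot would):

```shell
# Sketch of the proposed "clone": make a reflink copy of a source tree.
mkdir -p /tmp/clone-demo/src
echo "some data" > /tmp/clone-demo/src/file
# Roughly what "btrfs subvolume create" + "cp -ax --reflink=always"
# would amount to, minus the subvolume boundary:
cp -ax --reflink=auto /tmp/clone-demo/src /tmp/clone-demo/dest
cat /tmp/clone-demo/dest/file
```

On an actual btrfs mount, `btrfs subvolume snapshot` achieves the same
result atomically and in one step, which is the point of the reply.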
Re: CoW with webserver databases: innodb_file_per_table and dedicated tables for blobs?
On Tue, Jun 16, 2015 at 2:06 PM, Ingvar Bogdahn wrote:
> Hi again,
>
> Benchmarking over time seems a good idea, but what if I see that a
> particular database does indeed degrade in performance? How can I then
> selectively improve performance for that file, since disabling cow only
> works for new empty files?

You might be overcomplicating things.

> Is it correct that bundling small random writes into groups of writes
> reduces fragmentation? If so, some form of write-caching should help?
> I'm still investigating, but one solution might be:
> 1) identify which exact tables do have frequent writes
> 2) decrease the system-wide write-caching (vm.dirty_background_ratio
> and vm.dirty_ratio) to lower levels, because this wastes lots of RAM by
> indiscriminately caching writes of the whole system, and tends to cause
> spikes where suddenly the entire cache gets written to disk and blocks
> the system. Rather use that RAM selectively to cache only the critical
> files.

IIRC innodb uses O_DIRECT by default, which should bypass the fs cache,
so the above should be irrelevant.

> 4) create a software RAID-1 made up of a ramdisk and a mounted image,
> using mdadm.
> 5) Setting up mdadm using rather large value for "write-behind="
> 6) put only those tables on that disk-backed ramdisk which do have
> frequent writes.

raid1 writes everything to both devices, so your write performance would
still be limited by the disk. As for reads, instead of using a ramdisk
for half of the md, I would just use that amount of RAM for
innodb_buffer_pool.

> What do you think?

I would say "determine your priorities".

If you absolutely need btrfs + innodb, then I would:
- increase innodb_buffer_pool
- not mess with nocow; leave it as is
- not mess with autodefrag
- enable compression on btrfs
- use the latest known-good kernel (AFAIK 4.0.5 should be good)

If you absolutely must have high performance with innodb, then I would
look at using a raw block device directly for innodb.
You'd lose all btrfs features of course (e.g. snapshots), but it's a
tradeoff for performance.

If you don't HAVE to use innodb but still want to use btrfs, then I
would use the tokudb engine instead (available in tokudb's mysql fork
and mariadb >= 10), with compression handled by tokudb (disable
compression in btrfs). tokudb doesn't support foreign key constraints,
but other than that it should be able to replace innodb for your
purposes. Among other things, tokudb uses a larger block size (4MB), so
it should help reduce fragmentation compared to innodb.

If you don't HAVE to use either btrfs or innodb, but just want a "mysql
db that supports transactions with an fs that supports snapshot/clone",
then I would use zfs + tokudb. And read
http://blog.delphix.com/matt/2014/06/06/zfs-stripe-width/ (with the
exception that compression should be used in tokudb instead of zfs).

-- Fajar

> Ingvar
>
> Am 15.06.15 um 11:57 schrieb Hugo Mills:
>
>> On Mon, Jun 15, 2015 at 11:34:35AM +0200, Ingvar Bogdahn wrote:
>>>
>>> Hello there,
>>>
>>> I'm planning to use btrfs for a medium-sized webserver. It is
>>> commonly recommended to set nodatacow for database files to avoid
>>> performance degradation. However, apparently nodatacow disables some
>>> of my main motivations for using btrfs: checksumming and (probably)
>>> incremental backups with send/receive (please correct me if I'm
>>> wrong on this). Also, the databases are among the most important
>>> data on my webserver, so it is particularly there that I would like
>>> those features working.
>>>
>>> My question is, are there strategies to avoid nodatacow on databases
>>> that are suitable and safe in a production server?
>>> I thought about the following:
>>> - in mysql/mariadb: setting "innodb_file_per_table" should avoid
>>> having few very big database files.
>>
>> It's not so much about the overall size of the files, but about the
>> write patterns, so this probably won't be useful.
>>
>>> - in mysql/mariadb: adapting database schema to store blobs into
>>> dedicated tables.
>>
>> Probably not an issue -- each BLOB is (likely) to be written in a
>> single unit, which won't cause the fragmentation problems.
>>
>>> - btrfs: set autodefrag or some cron job to regularly defrag only
>>> database files to avoid performance degradation due to fragmentation
>>
>> Autodefrag is a good idea, and I would suggest trying that first,
>> before anything else, to see if it gives you good enough performance
>> over time.
>>
>> Running an explicit defrag will break any CoW copies you have (like
>> snapshots), causing them to take up additional space. For example,
>> start with a 10 GB subvolume. Snapshot it, and you will still only
>> have 10 GB of disk usage. Defrag one (or both) copies, and you'll
>> suddenly be using 20 GB.
>>
>>> - turn on compression on either btrfs or mariadb
>>
>> Again, won't help. The issue is not the size of the data, it's the
>> write patterns: small random writes into the middle of existing files
>> w
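A minimal sketch of the InnoDB-side settings discussed in this thread
(the values are illustrative placeholders, not recommendations; adjust
to your RAM and workload):

```ini
# my.cnf fragment (hypothetical values)
[mysqld]
# Larger buffer pool so hot pages are served from RAM, instead of
# building a ramdisk/mdadm construct:
innodb_buffer_pool_size = 4G
# The behavior the thread attributes to InnoDB's default: O_DIRECT
# bypasses the OS page cache, which is why tuning vm.dirty_* for
# InnoDB data files has little effect:
innodb_flush_method = O_DIRECT
```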
Re: should I use btrfs on Centos 7 for a new production server?
On Wed, Dec 31, 2014 at 1:04 PM, Eric Sandeen wrote:
> On 12/30/14 10:06 PM, Wang Shilong wrote:
>>> I used CentOS7 btrfs myself, just doing some tests.. it crashed
>>> easily. I don't know how much effort Redhat puts into btrfs for the
>>> 7 series.
>>
>> Maybe SUSE enterprise will be a better choice for btrfs; they offer
>> better support for btrfs as far as i know.
>
> I believe SuSE's most recent support statement on btrfs is here:
>
> https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-317221

Wow. SUSE uses btrfs for root by default, but actively prevents users
from using compression (unless specifically overridden using a module
parameter)? Weird, since IIRC compression has been around and stable for
a long time.

-- Fajar
Re: Kernel panic / Ubuntu 12.04.4
On Tue, Jul 1, 2014 at 2:44 PM, Tomasz Torcz wrote:
>
> On Tue, Jul 01, 2014 at 08:45:59AM +0200, Tor Houghton wrote:
>> Well, I probably should have looked at the logs first and not tried
>> to delete some old data, but as the command (rm -rf) hung, I got
>> suspicious:
>>
>> Jun 30 23:51:06 moonshade kernel: [ 1440.454721] btrfs: relocating
>> block group 2939934474240 flags 1
>> Jun 30 23:51:07 moonshade kernel: [ 1440.637248] btrfs: relocating
>> block group 2999510827008 flags 36
>> Jun 30 23:51:07 moonshade kernel: [ 1440.897153] btrfs: relocating
>> block group 2997363343360 flags 36
>> Jun 30 23:51:07 moonshade kernel: [ 1441.147110] btrfs: relocating
>> block group 2995215859712 flags 36
>
> Looks like a balance running?
>
>> Jul 1 01:31:47 moonshade kernel: [ 7480.673992] Pid: 17884, comm:
>> btrfs Not tainted 3.5.0-40-generic #62~precise1-Ubuntu Dell Inc.
>> Dell DM051 /0HJ054
>
> Kernel 3.5 is extremely old and lacks fixes from 11 kernel releases
> done afterwards. Please contact your vendor (Canonical?) for support.

Ubuntu precise can use linux-generic-lts-trusty, which brings kernel
3.13.

-- Fajar
Re: latest btrfs-progs and asciidoc dependency
On Thu, Jun 5, 2014 at 9:41 PM, Marc MERLIN wrote:
> On Thu, Jun 05, 2014 at 12:52:04PM +0100, Tomasz Chmielewski wrote:
>> And it looks like the dependency is ~1 GB of new packages? O_o
>
> That seems painful, but at the same time, the alternative, nroff/troff,
> sucks.
>
> Part of your problem however seems to be runaway dependencies.
> You are getting x11 and stuff like libdrm which clearly you shouldn't
> need. If your disk space is more valuable than your time, I recommend
> you build asciidoc yourself and you should hopefully end up with less.
>
> Or you can also remove asciidoc from the makefile and read the raw
> files, which are readable.

... or try this:

# apt-get install --no-install-recommends asciidoc

If that still doesn't work, AND you have lots of free time, AND are
familiar with debian packaging, then you can take the latest available
debian source, adapt it for the latest version, and use the openSUSE
Build Service to compile it.

-- Fajar
Re: Very slow filesystem
(resending to the list as plain text, the original reply was rejected
due to HTML format)

On Thu, Jun 5, 2014 at 10:05 AM, Duncan <1i5t5.dun...@cox.net> wrote:
>
> Igor M posted on Thu, 05 Jun 2014 00:15:31 +0200 as excerpted:
>
>> Why does btrfs become EXTREMELY slow after some time (months) of
>> usage? This has now happened a second time; the first time I thought
>> it was a hard drive fault, but now the drive seems ok.
>> The filesystem is mounted with compress-force=lzo and is used for
>> MySQL databases; files are mostly big, 2G-8G.
>
> That's the problem right there, database access pattern on files over
> 1 GiB in size, but the problem along with the fix has been repeated
> over and over and over and over... again on this list, and it's
> covered on the btrfs wiki as well

Which part of the wiki? It's not on
https://btrfs.wiki.kernel.org/index.php/FAQ or
https://btrfs.wiki.kernel.org/index.php/UseCases

> so I guess you haven't checked existing answers
> before you asked the same question yet again.
>
> Never-the-less, here's the basic answer yet again...
>
> Btrfs, like all copy-on-write (COW) filesystems, has a tough time with
> a particular file rewrite pattern, that being frequently changed and
> rewritten data internal to an existing file (as opposed to appended to
> it, like a log file). In the normal case, such an internal-rewrite
> pattern triggers copies of the rewritten blocks every time they
> change, *HIGHLY* fragmenting this type of file after only a relatively
> short period. While compression changes things up a bit (filefrag
> doesn't know how to deal with it yet and its report isn't reliable),
> it's not unusual to see people with several-gig files with this sort
> of write pattern on btrfs without compression find filefrag reporting
> literally hundreds of thousands of extents!
>
> For smaller files with this access pattern (think firefox/thunderbird
> sqlite database files and the like), typically up to a few hundred MiB
> or so, btrfs' autodefrag mount option works reasonably well, as when
> it sees a file fragmenting due to rewrite, it'll queue up that file
> for background defrag via sequential copy, deleting the old fragmented
> copy after the defrag is done.
>
> For larger files (say a gig plus) with this access pattern, typically
> larger database files as well as VM images, autodefrag doesn't scale
> so well, as the whole file must be rewritten each time, and at that
> size the changes can come faster than the file can be rewritten. So a
> different solution must be used for them.

If COW and rewrite are the main issue, why doesn't zfs experience the
same extreme slowdown (that is, as long as you have sufficient free
space available, like 20% or so)?

-- Fajar
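The fragmentation Duncan describes can be checked directly with
`filefrag` from e2fsprogs (with the caveat he notes: its extent count is
not reliable on compressed btrfs). A minimal sketch, with a hypothetical
/tmp path:

```shell
# Create a file and ask the kernel how many extents back it.
# On an aged CoW database file this count can reach the hundreds of
# thousands mentioned above; on a fresh file it should be tiny.
dd if=/dev/zero of=/tmp/frag-demo.bin bs=1M count=4 2>/dev/null
filefrag /tmp/frag-demo.bin
```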
Re: Very slow filesystem
On Thu, Jun 5, 2014 at 5:15 AM, Igor M wrote:
> Hello,
>
> Why btrfs becames EXTREMELY slow after some time (months) of usage ?
> # btrfs fi show
> Label: none  uuid: b367812a-b91a-4fb2-a839-a3a153312eba
>         Total devices 1 FS bytes used 2.36TiB
>         devid 1 size 2.73TiB used 2.38TiB path /dev/sde
> # btrfs fi df /mnt/old
> Data, single: total=2.36TiB, used=2.35TiB

Is that the fs that is slow? It's almost full. Most filesystems exhibit
really bad performance when close to full due to fragmentation issues
(thresholds vary, but 80-90% full usually means you need to start adding
space). You should free up some space (e.g. add a new disk so it becomes
multi-device, or delete some files) and rebalance/defrag.

-- Fajar
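A sketch of the suggested recovery (the device name, mountpoint, and
balance filter value are hypothetical; the `btrfs` commands need root,
so they are shown commented):

```shell
# First check how full the filesystem is (runs anywhere):
df -h .
# With root privileges, one could then grow the filesystem and
# rebalance so free space is spread across block groups:
#   btrfs device add /dev/sdf /mnt/old
#   btrfs balance start -dusage=75 /mnt/old
```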
Re: Convert btrfs software code to ASIC
On Mon, May 19, 2014 at 8:09 PM, Le Nguyen Tran wrote:
> I now need to understand the operation of btrfs source code to
> determine. I hope that one of you can help me

Have you read the wiki link?

-- Fajar
Re: Convert btrfs software code to ASIC
On Mon, May 19, 2014 at 3:40 PM, Le Nguyen Tran wrote:
> Hi,
>
> I am Nguyen. I am not a software development engineer but an IC (chip)
> development engineer. I have a plan to develop an IC controller for
> Network Attached Storage (NAS). The main idea is converting software
> code into a hardware implementation. Because the chip is customized
> for NAS, its performance is high, and its cost is lower than using a
> micro processor like Atom or Xeon (for servers).
>
> I plan to use btrfs as the file system specification for my NAS. The
> main point is that I need to understand the btrfs software code in
> order to convert it into a hardware implementation. I am wondering if
> any of you can help me. If we can make the chip in good shape, we can
> start up a company and have our own business.

I'm not sure if that's a good idea. AFAIK btrfs depends a lot on other
linux subsystems (e.g. vfs, block, etc). Rather than
converting/reimplementing everything, if your aim is lower cost, you
might have an easier time using something like a mediatek SOC (the ones
used on smartphones) and running a custom-built linux with btrfs support
on it.

For documentation,
https://btrfs.wiki.kernel.org/index.php/Main_Page#Developer_documentation
is probably the best place to start.

-- Fajar
Re: Which companies contribute to Btrfs?
On Thu, Apr 24, 2014 at 6:39 PM, David Sterba wrote:
> On Wed, Apr 23, 2014 at 06:18:34PM -0700, Marc MERLIN wrote:
>> I'm writing slides about btrfs for an upcoming talk (at linuxcon) and
>> I was trying to gather a list of companies that contribute code to
>> btrfs.
>
> https://btrfs.wiki.kernel.org/index.php/Main_Page
>
> "[...] Jointly developed at Oracle, Red Hat, Fujitsu, Intel, SUSE,
> STRATO [...]"
>
>> Are there other companies I missed?

The page now says "... Jointly developed at Facebook, Oracle, Red Hat"
:D

-- Fajar
Re: Can anyone boot a system using btrfs root with linux 3.14 or newer?
On Thu, Apr 24, 2014 at 10:23 AM, Chris Murphy wrote:
>
> It sounds like either a grub.cfg misconfiguration, or a failure to
> correctly build the initrd/initramfs. So I'd post the grub.cfg kernel
> command line for the boot entry that works and the entry that fails,
> for comparison.
>
> And then also check and see if whatever utility builds your initrd has
> been upgraded along with your kernel; maybe there's a bug/regression.

I believe the OP mentioned that he's using a distro without an initrd,
and that all required modules are built in.

-- Fajar

> Chris Murphy
Re: btrfs and ECC RAM
On Mon, Jan 20, 2014 at 10:13 AM, Austin S Hemmelgarn wrote:
>
> AFAIK, ZFS does background data scrubbing without user intervention

No, it doesn't.

> BTRFS however works differently, it only scrubs data when you tell it
> to. If it encounters a checksum or read error on a data block, it
> first tries to find another copy of that block elsewhere (usually on
> another disk), if it still sees a wrong checksum there, or gets
> another read error, or can't find another copy, then it returns a read
> error to userspace,

zfs does the same thing.

-- Fajar
Re: drawbacks of non-ECC RAM
On Sat, Jan 18, 2014 at 1:33 AM, valleysmail-l...@yahoo.de wrote:
>
> I'd like to know if there are drawbacks in using btrfs with non-ECC
> RAM instead of using ext4 with non-ECC RAM.

Non-ECC RAM can cause problems no matter what fs you use.

> I know that some features of btrfs may rely on ECC RAM, but is the
> chance of data corruption or even a damaged filesystem higher than
> when i use ext4 instead of btrfs?

Not really. In the past the occurrence of corrupted-btrfs reports on
this list (regardless of RAM) was somewhat high, but I don't see much of
that with recent versions.

> I want to know this because i would like to use the snapshot feature
> of btrfs, and ext4 does not support that. I will not use btrfs for
> fixing silent data corruption nor for using RAID-like features or
> encryption. ZFS however checks files in the background (even if i
> don't want it to)

zfs does not "check files in the background" by default. When checksums
are enabled (the default), zfs only checks file integrity when you
access the file, and when you run the "scrub" command. It does not run
background scrubs automatically. AFAIK btrfs behaves the same way.

> and if it thinks there is an error it will fix it, and i cannot
> disable this feature. So errors in RAM may corrupt my files or even
> more.

You can disable checksums on both btrfs and zfs. See
https://btrfs.wiki.kernel.org/index.php/FAQ#Can_data_checksumming_be_turned_off.3F
and https://btrfs.wiki.kernel.org/index.php/Mount_options

-- Fajar
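For reference, disabling data checksums on btrfs is just a mount option;
a hypothetical /etc/fstab line might look like this (device and
mountpoint are placeholders, and note that nodatasum gives up exactly
the corruption detection discussed above):

```
/dev/sdb1  /data  btrfs  nodatasum,noatime  0  0
```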
Re: Two identical copies of an image mounted result in changes to both images if only one is modified
On Thu, Jun 20, 2013 at 3:47 PM, Clemens Eisserer wrote:
> Hi,
>
> I've observed rather strange behaviour while trying to mount two
> identical copies of the same image at different mount points.
> Each modification to one image is also performed on the second one.
>
> Example:
> dd if=/dev/sda? of=image1 bs=1M
> cp image1 image2
> mount -o loop image1 m1
> mount -o loop image2 m2
>
> touch m2/hello
> ls -la m1  # will now also include a file called "hello"

What do you get if you unmount BOTH m1 and m2, and THEN mount m1 again?
Is the file still there?

> Is this behaviour intentional and known, or should I create a
> bug-report? I've deleted quite a bunch of files on my production
> system because of this...

I'm pretty sure this is a known behavior in btrfs (both images have the
same filesystem UUID).
http://markmail.org/message/i522sdkrhlxhw757#query:+page:1+mid:ksdi5d4v26eqgxpi+state:results

-- Fajar
Re: lvm volume like support
On Tue, Feb 26, 2013 at 9:30 PM, Martin Steigerwald wrote:
> Am Dienstag, 26. Februar 2013 schrieb Fajar A. Nugraha:
>> On Tue, Feb 26, 2013 at 11:59 AM, Mike Fleetwood wrote:
>> > On 25 February 2013 23:35, Suman C wrote:
>> >> Hi,
>> >>
>> >> I think it would be great if there is a lvm volume or zfs zvol
>> >> type support in btrfs.
>> >
>> > Btrfs already has capabilities to add and remove block devices on
>> > the fly. Data can be striped or mirrored or both. Raid 5/6 is in
>> > testing at the moment.
>> > https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
>> > https://btrfs.wiki.kernel.org/index.php/UseCases#RAID
>> >
>> > Which specific features do you think btrfs is lacking?
>>
>> I think he's talking about a zvol-like feature.
>>
>> In zfs, instead of creating a
>> filesystem-that-is-accessible-as-a-directory, you can create a zvol
>> which behaves just like any other standard block device (e.g. you can
>> use it as swap, or create an ext4 filesystem on top of it). But it
>> would also have most of the benefits that a normal zfs filesystem
>> has, like:
>> - thin provisioning (sparse allocation, snapshot & clone)
>> - compression
>> - integrity check (via checksum)
>>
>> Typical use cases would be:
>> - swap in a pure-zfs system
>> - virtualization (xen, kvm, etc)
>> - NAS which exports the block device using iscsi/AoE
>>
>> AFAIK no such feature exists in btrfs yet.
>
> Sounds like the RADOS block device stuff for Ceph.

Exactly. While using files + a loopback device mostly works, there were
problems regarding performance and data integrity. Not to mention the
hassle of accessing the data if it resides on a partition inside the
file (e.g. you need losetup + kpartx to access it, and you must remember
to do the reverse when you're finished with it). In zfsonlinux it's very
easy, since a zvol is treated pretty much like a disk, and whenever
there's a partition inside a zvol, a corresponding device node is also
created automatically.
-- Fajar
Re: lvm volume like support
On Tue, Feb 26, 2013 at 11:59 AM, Mike Fleetwood wrote:
> On 25 February 2013 23:35, Suman C wrote:
>> Hi,
>>
>> I think it would be great if there is a lvm volume or zfs zvol type
>> support in btrfs.
>
> Btrfs already has capabilities to add and remove block devices on the
> fly. Data can be striped or mirrored or both. Raid 5/6 is in testing
> at the moment.
> https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
> https://btrfs.wiki.kernel.org/index.php/UseCases#RAID
>
> Which specific features do you think btrfs is lacking?

I think he's talking about a zvol-like feature.

In zfs, instead of creating a
filesystem-that-is-accessible-as-a-directory, you can create a zvol
which behaves just like any other standard block device (e.g. you can
use it as swap, or create an ext4 filesystem on top of it). But it would
also have most of the benefits that a normal zfs filesystem has, like:
- thin provisioning (sparse allocation, snapshot & clone)
- compression
- integrity check (via checksum)

Typical use cases would be:
- swap in a pure-zfs system
- virtualization (xen, kvm, etc)
- NAS which exports the block device using iscsi/AoE

AFAIK no such feature exists in btrfs yet.

-- Fajar
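Until such a feature exists, the closest btrfs-side approximation is the
one discussed in this thread: a sparse file exported through a loop
device. A rough sketch (paths hypothetical; `losetup` needs root, so it
is shown commented):

```shell
# Thin-provisioned backing file, created on a btrfs mount:
truncate -s 1G /tmp/vol.img
# Apparent size is 1G but almost no blocks are allocated yet:
ls -lsh /tmp/vol.img
# With root, expose it as a block device (usable for swap, ext4, an
# iSCSI export, etc.), though without the kernel-level integration,
# snapshot/clone ergonomics, and device nodes a zvol gets for free:
#   losetup -f --show /tmp/vol.img
```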
Re: Production use with vanilla 3.6.6
On Mon, Nov 5, 2012 at 7:07 PM, Stefan Priebe - Profihost AG wrote:
> Hello list,
>
> is btrfs ready for production use in 3.6.6? Or should i backport fixes
> from 3.7-rc?
>
> Is it planned to have a stable kernel which will get all btrfs fixes
> backported?

I would say "no" to both, but you should check with the distros that
support btrfs (Oracle Linux and SLES). In particular, whether they
backport fixes, and what exactly "supported" status gives you when you
buy support for that distro.

-- Fajar
Re: [Request for review] [RFC] Add label support for snapshots and subvols
On Fri, Nov 2, 2012 at 5:32 AM, Hugo Mills wrote:
> On Fri, Nov 02, 2012 at 05:28:01AM +0700, Fajar A. Nugraha wrote:
>> On Fri, Nov 2, 2012 at 5:16 AM, cwillu wrote:
>> >> btrfs fi label -t /btrfs/snap1-sv1 Prod-DB-sand-box-testing
>> >
>> > Why is this better than:
>> >
>> > # btrfs su snap /btrfs/Prod-DB /btrfs/Prod-DB-sand-box-testing
>> > # mv /btrfs/Prod-DB-sand-box-testing /btrfs/Prod-DB-production-test
>> > # ls /btrfs/
>> > Prod-DB Prod-DB-production-test
>>
>> ... because it would mean the possibility to decouple the subvol name
>> from whatever-data-you-need (in this case, a label).
>>
>> My request, though, is to just implement properties, and USER
>> properties, like what we have in zfs. This seems to be a cleaner,
>> saner approach. For example, this is on Ubuntu + zfsonlinux:
>>
>> # zfs create rpool/u
>> # zfs set user:label="Some test filesystem" rpool/u
>> # zfs get creation,user:label rpool/u
>> NAME     PROPERTY    VALUE                 SOURCE
>> rpool/u  creation    Fri Nov  2  5:24 2012  -
>> rpool/u  user:label  Some test filesystem  local
>
> Don't we already have an equivalent to that with user xattrs?
>
> Hugo.

Anand did say one way to implement the label is by using attrs, so +1
from me for that approach.

-- Fajar
Re: [Request for review] [RFC] Add label support for snapshots and subvols
On Fri, Nov 2, 2012 at 5:16 AM, cwillu wrote:
>> btrfs fi label -t /btrfs/snap1-sv1 Prod-DB-sand-box-testing
>
> Why is this better than:
>
> # btrfs su snap /btrfs/Prod-DB /btrfs/Prod-DB-sand-box-testing
> # mv /btrfs/Prod-DB-sand-box-testing /btrfs/Prod-DB-production-test
> # ls /btrfs/
> Prod-DB Prod-DB-production-test

... because it would mean the possibility to decouple the subvol name
from whatever-data-you-need (in this case, a label).

My request, though, is to just implement properties, and USER
properties, like what we have in zfs. This seems to be a cleaner, saner
approach. For example, this is on Ubuntu + zfsonlinux:

# zfs create rpool/u
# zfs set user:label="Some test filesystem" rpool/u
# zfs get creation,user:label rpool/u
NAME     PROPERTY    VALUE                 SOURCE
rpool/u  creation    Fri Nov  2  5:24 2012  -
rpool/u  user:label  Some test filesystem  local

More info about zfs user properties here:
http://docs.oracle.com/cd/E19082-01/817-2271/gdrcw/index.html

-- Fajar
Re: Naming of (bootable) subvolumes
On Sun, Oct 28, 2012 at 12:22 AM, Chris Murphy wrote: > > On Oct 26, 2012, at 9:03 PM, Fajar A. Nugraha wrote: > >> >> So back to the original question, I'd suggest NOT to use either >> send/receive or set-default. Instead, setup multiple boot environment >> (e.g. old version, current version) and let user choose which one to >> boot using a menu. > > Is it possible to make a functioning symbolic or hard link of a subvolume? > Nope, I don't think so. > I'm fine with "current" and "previous" options. More than that seems > unnecessary. But then, how does the user choose? With up and down arrow :) > What's the UI? Grub boot menu. > Is this properly the domain of GRUB2 or something else? In my setup I use grub2's "configfile" ability. Which basically does a "go evaluate this other menu config file". Each boot environment (BE, the term that solaris uses) has a different entry on the "main" grub.cfg, which loads the BE's corresponding grub.cfg. > > On BIOS machines, perhaps GRUB. On UEFI, I'd say distinctly not GRUB (I think > it's a distinctly bad idea to have a combined boot manager and bootloader in > a UEFI context, but that's a separate debate). I don't use UEFI. But the general idea is to have one bootloader which can load additional config files. And the location of that additional config file depends on which BE the user wants to boot. > On this system, grub-mkconfig produces a grub.cfg only for the system I'm > currently booted from. It does not include any entries for fedora18/boot, > fedora18/root, even though they are well within the normal search path. And > the reference used is relative, i.e. the kernel parameter in the grub.cfg is > rootflags=subvol=root > > If it were to create entries potentially for every snapshotted system, it > would be a very messy grub.cfg indeed. > > It stands to reason that each distro will continue to have their own grub.cfg. > No arguments there.
Even in my setup, when I run "update-grub", it will only update its own grub.cfg, and leave the "main" grub.cfg untouched. This is what my "main" grub.cfg looks like: #=== set timeout=2 menuentry 'Ubuntu - 20120905 boot menu' { configfile /ROOT/precise-5/@/boot/grub/grub.cfg } menuentry 'Ubuntu - 20120814 boot menu' { configfile /ROOT/precise-4/@/boot/grub/grub.cfg } #=== each BE's grub cfg (e.g. the one under the ROOT/precise-5 dataset) is just your typical Ubuntu grub.cfg, with only references to kernel/initrd under that dataset. > For BIOS machines, it could be useful if a single core.img containing a > single standardized prefix specifying a grub location could be agreed upon. > And then merely changing the set-default subvolume would allow different > distro grub.cfg's to be found, read and workable with the relative references > now in place, (except for home which likely needs to be mounted using > subvolid). IMHO the biggest difference is that grub support for zfsonlinux, even though it has the bootfs pool property, has a way to reference ALL versions of a file (including grub.cfg/kernel/initrd) during boot time. This way you don't even need to change bootfs whenever you want to change to a boot environment; you simply choose (or write) a different grub stanza to boot. If we continue to rely on current btrfs grub support, unfortunately we can't have the same thing. And the closest thing would be "set-default". Which IMHO is VERY messy. -- Fajar
Re: Naming of subvolumes
On Sat, Oct 27, 2012 at 8:58 AM, cwillu wrote: >> I haven't tried btrfs send/receive for this purpose, so I can't compare. But >> btrfs subvolume set-default is faster than the release of my finger from the >> return key. And it's easy enough the user could do it themselves if they had >> reasons for regression to a snapshot that differ from the automagic >> determination of the upgrade pass/fail. >> >> The one needed change, however, is to get /etc/fstab to use an absolute >> reference for home. >> >> >> Chris Murphy > I'd argue that everything should be absolute references to subvolumes > (/@home, /@, etc), and neither set-default nor subvolume id's should > be touched. There's no need, as you can simply mv those around (even > while mounted). More importantly, it doesn't result in a case where > the fstab in one snapshot points its mountpoint to a different > snapshot, with all the hilarity that would cause over time, and also > allows multiple distros to be installed on the same filesystem without > having them stomp on each others set-defaults: /@fedora, /@rawhide, > /@ubuntu, /@home, etc. What I do with zfs, which might also be applicable on btrfs: - Have a separate dataset to install grub: poolname/boot. This can also be a dedicated partition, if you want. The sole purpose of this partition/dataset is to select which dataset's grub.cfg to load next (using the "configfile" directive). The grub.cfg here is edited manually. - Have different datasets for each versioned OS (e.g. before and after upgrades): poolname/ROOT/ubuntu-1, poolname/ROOT/ubuntu-2, etc. Each dataset is independent of the others and contains its own /boot (complete with grub/grub.cfg, kernel, and initrd). grub.cfg on each dataset selects its own dataset to boot using the "bootfs" kernel command line. - Have a common home for all environments: poolname/home - Have zfs set the mountpoint (or mount it in the initramfs, in the root case), so I can get away with an empty fstab.
- Do upgrades/modifications in the currently-booted root environment, but create a clone of the current environment (and give it a different name) so I can roll back to it if needed. It works great for me so far, since: - each boot environment is portable enough to move around when needed, with only about four config files (e.g. grub.cfg) needing to be changed when moving between different computers, or when renaming a root dataset. - I can rename each root environment easily, or even move it to a different pool/disk when needed. - I can move back and forth between multiple versions of the boot environment (all are ubuntu so far, because IMHO it currently has the best zfs root support). So back to the original question, I'd suggest NOT using either send/receive or set-default. Instead, set up multiple boot environments (e.g. old version, current version) and let the user choose which one to boot using a menu. However for this to work, grub (the bootloader, and userland programs like "update-grub") needs to be able to refer to each grub.cfg/kernel/initrd in a global manner regardless of what the current default subvolume is (zfs' grub code uses something like /poolname/dataset_name/@/path/to/file/in/dataset). -- Fajar
Re: btrfs causing reboots and kernel oops on SL 6 (RHEL 6)
On Sat, Jun 4, 2011 at 11:33 AM, Joel Pearson wrote: > Hi, > > I'm using SL 6 (RHEL 6) and I've been playing around with running > PostgreSQL on btrfs. Snapshotting works ok, but the computer keeps > rebooting without warning (can be 5 mins or 1.5 hours), finally I > actually managed to get a Kernel Crash instead of just a reboot. > > I took a picture of the screen: > http://imageshack.us/photo/my-images/716/img0143y.jpg/ > > The important bits are: > > IP: [] btrfs_print_leaf +0x31/0x820 [btrfs] > PGD 0 > Oops: [#1] SMP > last sysfs file: /sys/devices/virtual/block/dm-3/dm/name > > The crashes aren't predictable either. Like it doesn't always happen > when I do a snapshot or anything like that. > > Is this a known problem, that is fixed in a later kernel or something like > that? Which kernel is this? If it's the default SL/RHEL 2.6.32 kernel, then you should try upgrading first. http://elrepo.org/tiki/kernel-ml is a good choice. It's highly unlikely that anyone would be willing to look at bugs on that "archaic" (in btrfs world) kernel. -- Fajar
Re: Tunning - cache write (database)
On Tue, Oct 2, 2012 at 3:16 AM, Clemens Eisserer wrote: >> I suggest you start by reading >> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg18827.html >> >> After that, PROBABLY start your database by preloading libeatmydata to >> disable fsync completely. > > Which will cure the symptoms, not the issue itself - I remember the > same advice was given for Reiser4 back then ;) > Usually for non-toy use-cases data is too valuable to just disable fsync. The OP DID say he doesn't really care about security, recovery, nor integrity (or at least, it's not an obligation) :D Other than trying the latest -rc and using libeatmydata, I can't see what else can be done to improve current db performance on btrfs. As the list archive shows, zfs is currently MUCH more suitable for that. -- Fajar
Re: Tunning - cache write (database)
On Mon, Oct 1, 2012 at 8:27 PM, Cesar Inacio Martins wrote: > My problem: > * Using btrfs + compression , a flush of 60 MB/s takes 4 minutes > (in these 4 minutes they keep constantly I/O of +- 4MB/s on disks) > (flush from Informix database) > * OpenSuse 12.1 64bits, running over VmWare ESXi 5 > * Btrfs version : btrfsprogs-0.19-43.1.2.x86_64 > * Kernel : Linux jdivm06 3.1.10-1.16-desktop #1 SMP PREEMPT Wed Jun 27 > My question, what I believed will help to avoid this long flush : > * Have some way to force this flush all in memory cache and then use the > btrfs background process to flush to disk ... > Security and recovery aren't a priority for now, because this is part of a > database bulkload ... after it finishes, integrity will be desirable (not an > obligation, since this is a test environment) > > For now, performance is the main requirement... I suggest you start by reading http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg18827.html After that, PROBABLY start your database by preloading libeatmydata to disable fsync completely. On a side note, zfs has a "sync" property, which when set to "disabled" has pretty much the same effect as libeatmydata. -- Fajar
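To make the libeatmydata suggestion concrete: what the preload does is turn fsync() into a no-op, so the kernel can batch writes instead of forcing each one to stable storage. A rough, hypothetical Python illustration of the difference (file names and record counts here are made up, not from Cesar's workload):

```python
import os, time

def bulk_load(path, records, fsync_each=True):
    """Write records; optionally fsync after each one, the way a database
    doing durable commits would."""
    t0 = time.time()
    with open(path, "wb") as f:
        for i in range(records):
            f.write(b"row-%06d\n" % i)
            if fsync_each:
                f.flush()
                # force to stable storage; this is the call that
                # libeatmydata turns into a no-op via LD_PRELOAD
                os.fsync(f.fileno())
    return time.time() - t0
```

On rotating disks the fsync_each=True run is typically orders of magnitude slower, which is why disabling fsync (and accepting the integrity risk) speeds up a bulk load so dramatically.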
Re: Experiences: Why BTRFS had to yield for ZFS
On Wed, Sep 19, 2012 at 2:28 PM, Casper Bang wrote: >> Anand Jain oracle.com> writes: >> archive-log-apply script - if you could, can you share the >> script itself ? or provide more details about the script. >> (It will help to understand the work-load in question). > > Our setup entails a whole bunch of scripts, but the apply script looks like > this > (orion is the production environment, pandium is the shadow): > http://pastebin.com/k4T7deap > > The script invokes rman passing rman_recover_database.rcs: IIRC there were some patches post-3.0 which relate to sync. If oracle db uses sync writes (or calls sync somewhere, which it should), it might help to re-run the test with a more recent kernel. The kernel-ml repository might help. > Ext4 starts out with a realtime to SCN ratio of about 3.4 and ends down around a > factor 2.2. > > ZFS starts out with a realtime to SCN ratio of about 7.5 and ends down around a > factor 4.4. So zfsonlinux is actually faster than ext4 for that purpose? Cool! > > Btrfs starts out with a realtime to SCN ratio of about 2.2 and ends down around > a factor 0.8. This of course means we will never be able to catch up with > production, as btrfs can't apply these as fast as they're created. > > It was even worse with btrfs on our 10xSSD server, where 20 min. of realtime > work would end up taking some 5h to get applied (factor 0.06), obviously useless > to us. Just wondering, did you use the "discard" option by any chance? In my experience it makes btrfs MUCH slower. -- Fajar
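For readers puzzling over the ratios: the "realtime to SCN ratio" is how many seconds' worth of production redo get applied per second of wall-clock time on the shadow. A small sketch of the catch-up arithmetic (my own formulation, not taken from Casper's scripts):

```python
def lag_after(initial_lag, apply_ratio, wall_seconds):
    """Redo lag (in seconds) remaining after applying logs for wall_seconds.

    Production generates 1 second of redo per wall-clock second, while the
    shadow consumes apply_ratio seconds of redo per wall-clock second, so
    any ratio below 1.0 means the lag grows without bound.
    """
    return max(0.0, initial_lag + wall_seconds * (1.0 - apply_ratio))
```

With btrfs's factor 0.8, an hour of "catching up" actually adds 0.2 * 3600 = 720 more seconds of lag; with ext4's 2.2, the same hour removes 2160 seconds.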
Re: specify UUID for btrfs
On Thu, Sep 13, 2012 at 1:07 PM, ching lu wrote: > Is it possible to specify UUID for btrfs when creating the filesystem? Not that I know of > or changing it when it is offline? This one is a definite no. > i have several script/setting file which have hardcoded UUID and i do > not want to update them every time when restore backup. Using label would probably make more sense for that purpose. It can be set and changed later. -- Fajar
Re: Workaround for hardlink count problem?
On Mon, Sep 10, 2012 at 4:12 PM, Martin Steigerwald wrote: > On Saturday, 8 September 2012, Marc MERLIN wrote: >> I was migrating a backup disk to a new btrfs disk, and the backup had a >> lot of hardlinks to collapse identical files to cut down on inode >> count and disk space. >> >> Then, I started seeing: > […] >> Has someone come up with a cool way to work around the too many link >> error and only when that happens, turn the hardlink into a file copy >> instead? (that is when copying an entire tree with millions of files). > > What about: > > - copy first backup version > - btrfs subvol create first next > - copy next backup version > - btrfs subvol create previous next Wouldn't "btrfs subvolume snapshot" plus "rsync --inplace" be more useful here? That is, if the original hardlinks are caused by multiple backup versions of the same file. Personally, if I need a feature not yet implemented in btrfs, I'd just switch to something else for now, like zfs, and revisit btrfs later once the needed features have been merged. -- Fajar
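The fallback Marc asked about ("turn the hardlink into a file copy only when the too-many-links error happens") can be sketched in a few lines. This is a hypothetical helper, not an existing tool:

```python
import errno, os, shutil

def link_or_copy(src, dst):
    """Recreate dst as a hardlink of src; fall back to a real copy only
    when the filesystem's per-inode link limit is hit (EMLINK)."""
    try:
        os.link(src, dst)
        return "linked"
    except OSError as e:
        if e.errno != errno.EMLINK:
            raise
        shutil.copy2(src, dst)  # preserves mtime/permissions
        return "copied"
```

A tree-copying script would call this instead of os.link, so only the files that actually exceed btrfs's link limit get duplicated.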
Re: enquiry about defrag
On Sun, Sep 9, 2012 at 2:49 PM, ching wrote: > On 09/09/2012 08:30 AM, Jan Steffens wrote: >> On Sun, Sep 9, 2012 at 2:03 AM, ching wrote: >>> 2. Is there any command for the fragmentation status of a file/dir ? e.g. >>> fragment size, number of fragments. >> Use the "filefrag" command, part of e2fsprogs. >> > > my image is a 16G sparse file, after defragment, it still has 101387 extents, > is it normal? Is compression enabled? If so, yes, it's normal. -- Fajar
oops with btrfs on zvol
Hi, I'm experimenting with btrfs on top of a zvol block device (using zfsonlinux), and got an oops on a simple mount test. While I'm sure that zfsonlinux is somehow also at fault here (since the same test with zram works fine), the oops only shows things btrfs-related without any usable mention of zfs/zvol. Could anyone help me interpret the kernel logs, i.e. which btrfs-zvol interaction is at fault, so I can pass it on to the zfs guys to work on their side as well? Thanks. The test is creating a sparse 100G block device (zfs create -V 100G -s -o volblocksize=4k rpool/vbd/test1), formatting it (mkfs.btrfs /dev/zvol/rpool/vbd/test1), and mounting it. An oops occurred, and the mount process got stuck. The same thing happens on ubuntu precise's kernel 3.2 and quantal's 3.5. What's interesting is: - if I use ext4 (instead of btrfs) on the zvol, it works just fine - if I add a layer on top of the zvol (losetup, or iscsi export-import) then btrfs works just fine. Syslog shows this (from Ubuntu's 3.2 kernel): #= Aug 31 20:30:13 DELL kernel: [34307.828311] zd0: unknown partition table Aug 31 20:30:34 DELL kernel: [34328.129249] device fsid cfd88ff9-def8-4d1f-9435-65becd5fa2b7 devid 1 transid 4 /dev/zd0 Aug 31 20:30:34 DELL kernel: [34328.134001] btrfs: disk space caching is enabled Aug 31 20:30:34 DELL kernel: [34328.135701] BUG: unable to handle kernel NULL pointer dereference at (null) Aug 31 20:30:34 DELL kernel: [34328.137200] IP: [] extent_range_uptodate+0x59/0xe0 [btrfs] Aug 31 20:30:34 DELL kernel: [34328.138759] PGD 0 Aug 31 20:30:34 DELL kernel: [34328.140248] Oops: [#1] SMP Aug 31 20:30:34 DELL kernel: [34328.141777] CPU 3 Aug 31 20:30:34 DELL kernel: [34328.141811] Modules linked in: ses enclosure ppp_mppe ppp_async crc_ccitt pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) arc4 ath9k mac80211 radeon uvcvideo snd_hda_codec_hdmi ath9k_common snd_hda_codec_realtek ath9k_hw videodev ipt_MASQUERADE xt_state iptable_nat nf_nat v4l2_compat_ioctl32 i915 nf_conntrack_ipv4 nf_conntrack
iptable_filter nf_defrag_ipv4 ip_tables dm_multipath dummy x_tables bnep ath3k btusb bridge rfcomm bluetooth snd_hda_intel snd_hda_codec stp joydev ath snd_hwdep ttm snd_pcm mei(C) drm_kms_helper drm snd_seq_midi snd_rawmidi snd_seq_midi_event dell_wmi sparse_keymap snd_seq dell_laptop wmi snd_timer i2c_algo_bit video psmouse snd_seq_device cfg80211 snd mac_hid serio_raw soundcore dcdbas snd_page_alloc parport_pc ppdev lp parport binfmt_misc zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl(O) ums_realtek uas r8169 btrfs zlib_deflate libcrc32c usb_storage Aug 31 20:30:34 DELL kernel: [34328.155820] Aug 31 20:30:34 DELL kernel: [34328.157974] Pid: 15887, comm: btrfs-endio-met Tainted: P C O 3.2.0-29-generic #46-Ubuntu Dell Inc. Dell System Inspiron N4110/03NKW8 Aug 31 20:30:34 DELL kernel: [34328.160283] RIP: 0010:[] [] extent_range_uptodate+0x59/0xe0 [btrfs] Aug 31 20:30:34 DELL kernel: [34328.162700] RSP: 0018:8800351dfde0 EFLAGS: 00010246 Aug 31 20:30:34 DELL kernel: [34328.165099] RAX: RBX: 01401000 RCX: Aug 31 20:30:34 DELL kernel: [34328.167548] RDX: 0001 RSI: 1401 RDI: Aug 31 20:30:34 DELL kernel: [34328.169989] RBP: 8800351dfe00 R08: R09: 880067021418 Aug 31 20:30:34 DELL kernel: [34328.172474] R10: 8800b680d010 R11: 1000 R12: 88011d997bf0 Aug 31 20:30:34 DELL kernel: [34328.174922] R13: 01401fff R14: 880031c45c00 R15: 88011aedc9b0 Aug 31 20:30:34 DELL kernel: [34328.177401] FS: () GS:88013e6c() knlGS: Aug 31 20:30:34 DELL kernel: [34328.179904] CS: 0010 DS: ES: CR0: 8005003b Aug 31 20:30:34 DELL kernel: [34328.182426] CR2: CR3: 0001291e CR4: 000406e0 Aug 31 20:30:34 DELL kernel: [34328.185005] DR0: DR1: DR2: Aug 31 20:30:34 DELL kernel: [34328.187602] DR3: DR6: 0ff0 DR7: 0400 Aug 31 20:30:34 DELL kernel: [34328.190246] Process btrfs-endio-met (pid: 15887, threadinfo 8800351de000, task 880031c45c00) Aug 31 20:30:34 DELL kernel: [34328.193171] Stack: Aug 31 20:30:34 DELL kernel: [34328.196542] 8800351dfdf0 880088ff6638 8800b61953c0 88011cbbb000 Aug 31
20:30:34 DELL kernel: [34328.199469] 8800351dfe10 a004224d 8800351dfe40 a00422d6 Aug 31 20:30:34 DELL kernel: [34328.202295] 8800351dfe88 88011aedc960 8800351dfe88 8800351dfe98 Aug 31 20:30:34 DELL kernel: [34328.204685] Call Trace: Aug 31 20:30:34 DELL kernel: [34328.206645] [] bio_ready_for_csum.isra.107+0xbd/0xc0 [btrfs] Aug 31 20:30:34 DELL kernel: [34328.208591] [] end_workqueue_fn+0x86/0xa0 [btrfs] Aug 31 20:30:34 DELL kernel: [34328.210565] [] worker_loop+0xa0/0x2b0 [btrfs] Aug 31 20:30:34 DELL kernel: [34328.212531] [] ? __schedule+0x3cc/0
Re: raw partition or LV for btrfs?
On Tue, Aug 14, 2012 at 9:09 PM, cwillu wrote: >>> If I understand correctly, if I don't use LVM, then such move and resize >>> operations can't be done for an online filesystem and it has more risk. >> >> You can resize, add, and remove devices from btrfs online without the >> need for LVM. IIRC LVM has finer granularity though, you can do >> something like "move only the first 10GB now, I'll move the rest >> later". > > You can certainly resize the filesystem itself, but without lvm I > don't believe you can resize the underlying partition online. I'm pretty sure you can do that with parted. At least, when your version of parted is NOT 2.2. -- Fajar
Re: raw partition or LV for btrfs?
On Tue, Aug 14, 2012 at 8:28 PM, Daniel Pocock wrote: > Can you just elaborate on the qgroups feature? > - Does this just mean I can make the subvolume sizes rigid, like LV sizes? Pretty much. > - Or is it per-user restrictions or some other more elaborate solution? No > > If I create 10 LVs today, with btrfs on each, can I merge them all into > subvolumes on a single btrfs later? No > > If I just create a 1TB btrfs with subvolumes now, can I upgrade to > qgroups later? Yes > Or would I have to recreate the filesystem? No > If I understand correctly, if I don't use LVM, then such move and resize > operations can't be done for an online filesystem and it has more risk. You can resize, add, and remove devices from btrfs online without the need for LVM. IIRC LVM has finer granularity though, you can do something like "move only the first 10GB now, I'll move the rest later". -- Fajar
Re: raw partition or LV for btrfs?
On Mon, Aug 13, 2012 at 11:19 AM, Kyle Gates wrote: > Also, I think the current grub2 has lzo support. You're right grub2 (1.99-18) unstable; urgency=low [ Colin Watson ] ... * Backport from upstream: - Add support for LZO compression in btrfs (LP: #727535). so Ubuntu has it since precise, which is roughly the time I switched to zfs for rootfs :P Thanks for letting us know about that. -- Fajar
Re: I want to try something on the BTR file system,...
On Mon, Aug 13, 2012 at 8:22 AM, Ben Leverett wrote: > could you please send me a copy of the btr driver/kernel? I wonder if using a "live.com" email has something to do with how you ask that question :P Anyway, depending on what you want to use it for, you might find it easier to just download the latest version of Ubuntu or whatever-your-favorite-linux-distro. Or, if you want to modify the source code, the link that Michael sent provides a good starting point. What is it that you want to try? If your question is more specific, you can get a more specific answer. -- Fajar
Re: raw partition or LV for btrfs?
On Sun, Aug 12, 2012 at 11:46 PM, Daniel Pocock wrote: > > > I notice this question on the wiki/faq: > > > https://btrfs.wiki.kernel.org/index.php/UseCases#What_is_best_practice_when_partitioning_a_device_that_holds_one_or_more_btr-filesystems > > and as it hasn't been answered, can anyone make any comments on the subject > > Various things come to mind: > > a) partition the disk, create an LVM partition, and create lots of small > LVs, format each as btrfs > > b) partition the disk, create an LVM partition, and create one big LV, > format as btrfs, make subvolumes > > c) what about using btrfs RAID1? Does either approach (a) or (b) seem > better for someone who wants the RAID1 feature? IMHO when the qgroup feature is "stable" (i.e. adopted by distros, or at least in a stable kernel) then simply creating one big partition (and letting btrfs handle RAID1, if you use it) is better. When 3.6 is out, perhaps? Until then I'd use LVM. > > d) what about booting from a btrfs system? Is it recommended to follow > the ages-old practice of keeping a real partition of 128-500MB, > formatting it as btrfs, even if all other data is in subvolumes as per (b)? You can have one single partition only and boot directly from that. However btrfs has the same problems as zfs in this regard: - grub can read both, but can't write to either. In other words, no support for grubenv - the "best" compression method (gzip for zfs, lzo for btrfs) is not supported by grub For the first problem, an easy workaround is just to disable the grub configuration that uses grubenv. Easy enough, and no major functionality loss. The second one is harder for btrfs. zfs allows you to have a separate dataset (i.e. subvolume, in btrfs terms) with different compression, so you can have a dedicated dataset for /boot with a different compression setting from the rest of the datasets. With btrfs you're currently stuck with using the same compression setting for everything, so if you love lzo this might be a major setback.
There's also a btrfs-specific problem: it's hard to have a system which has /boot on a separate subvol while managing it with current automatic tools (e.g. update-grub). Due to the second and third problems, I'd recommend you just use a separate partition with ext2/4 for now. -- Fajar
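As a hedged illustration of the grubenv workaround above (exact knob names vary by distro and grub version; these two are from Ubuntu/Debian's grub2 packaging, so treat them as an assumption to verify against your own /etc/grub.d scripts):

```shell
# /etc/default/grub fragment -- keep grub from needing a writable grubenv,
# which it cannot have on btrfs or zfs (grub can read but not write them).
GRUB_SAVEDEFAULT=false        # never "savedefault" into grubenv
GRUB_RECORDFAIL_TIMEOUT=5     # Ubuntu-specific: avoid the recordfail/save_env path
```

After editing, re-run update-grub so the generated grub.cfg stops calling save_env.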
Re: How can btrfs take 23sec to stat 23K files from an SSD?
On Wed, Aug 1, 2012 at 1:01 PM, Marc MERLIN wrote: > So, clearly, there is something wrong with the samsung 830 SSD with linux > If it were a random crappy SSD from a random vendor, I'd blame the SSD, but > I have a hard time believing that samsung is selling SSDs that are slower > than hard drives at random IO and 'seeks'. You'd be surprised at how badly some vendors can screw up :) > First: btrfs is the slowest: > gandalfthegreat:/mnt/ssd/var/local# grep /mnt/ssd/var /proc/mounts > /dev/mapper/ssd /mnt/ssd/var btrfs > rw,noatime,compress=lzo,ssd,discard,space_cache 0 0 Just checking, did you explicitly activate "discard"? Cause on my setup (with a corsair SSD) it made things MUCH slower. Also, try adding "noatime" (just in case the slowdown was because "du" caused many access-time updates) -- Fajar
Re: Upgrading from 2.6.38, how?
On Wed, Jul 25, 2012 at 11:39 AM, Gareth Pye wrote: > My proposed upgrade method is: > Boot from a live CD with the latest kernel I can find so I can do a few tests: > A - run the fsck in read only mode to confirm things look good > B - mount read only, confirm that I can read files well > C - mount read write, confirm working > Install latest OS, upgrade to latest kernel, then repeat above steps. > > Any likely hiccups with the above procedure and suggested alternatives? I'd simply install the new OS on a new partition/subvol. This is what I did when upgrading from natty -> oneiric -> precise. IIRC there are some incompatibilities (e.g. space/inode cache disk format?) but newer kernels will just do the right thing, drop the old cache and create a new one. -- Fajar
Re: Very slow samba file transfer speed... any ideas ?
On Fri, Jul 20, 2012 at 5:23 PM, Shavi N wrote: > Hence I'm asking.. I know that I get fast copy/write speeds on the > btrfs volume from real life situations, How do you know that? So far none of your posted test results has shown that the btrfs vol in your system is FAST. -- Fajar
Re: Very slow samba file transfer speed... any ideas ?
On Thu, Jul 19, 2012 at 7:39 PM, Shavi N wrote: > So btrfs gives a massive difference locally, but that still doesn't > explain the slow transfer speeds. > Is there a way to test this? I'd try with real data, not /dev/zero. e.g: dd_rescue -b 1M -m 1.4G /dev/sda testfile.img ... or use whatever non-zero data source you have. dd_rescue will give a nice progress bar and speed indicator. Also, run "iostat -mx 3" while you're running dd, and while accessing it from samba. In my experience, btrfs is simply slower than ext4. Period. There's no way around it for now. -- Fajar
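If dd_rescue isn't installed, plain dd can generate the same kind of non-zero test data (the size here is arbitrary; the point of /dev/urandom is that the data can't be trivially compressed or sparse-allocated the way /dev/zero output can):

```shell
# Write 16 MiB of incompressible data, syncing at the end so the timing
# reflects actual disk writes rather than just the page cache.
dd if=/dev/urandom of=testfile.img bs=1M count=16 conv=fsync
ls -l testfile.img
```

Watching "iostat -mx 3" in another terminal while this runs shows whether the disk, and not samba, is the bottleneck.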
Re: brtfs on top of dmcrypt with SSD -> Trim or no Trim
On Thu, Jul 19, 2012 at 1:13 AM, Marc MERLIN wrote: > TL;DR: > I'm going to change the FAQ to say people should use TRIM with dmcrypt > because not doing so definitely causes some lesser SSDs to suck, or > possibly even fail and lose our data. > > > Longer version: > Ok, so several months later I can report back with useful info. > > Not using TRIM on my Crucial RealSSD C300 256GB is most likely what caused > its garbage collection algorithm to fail (killing the drive and all its > data), and it was also causing BTRFS to hang badly when I was getting > within 10GB of the drive getting full. > > I reported some problems I had with btrfs being very slow and hanging when I > only had 10GB free, and I'm now convinced that it was the SSD that was at > fault. > > On the Crucial RealSSD C300 256GB, and from talking to their tech support > and other folks who happened to have gotten that 'drive' at work and also > got weird unexplained failures, I'm convinced that even with its latest 007 > firmware (the firmware it shipped with would just hang the system for a few > seconds every so often so I did upgrade to 007 early on), the drive does > very poorly without TRIM when it's getting close to full. If you're going to edit the wiki, I'd suggest you say "SOME SSDs might need to use TRIM with dmcrypt". That's because some SSD controllers (e.g. sandforce) perform just fine without TRIM, and in my case TRIM made performance worse. -- Fajar
Re: file system corruption removal / documentation quandry
On Thu, Jul 12, 2012 at 12:13 PM, eric gisse wrote: > Basically, phoronix showed there is a --repair option. After enabling > snapshotting and playing around with the various discussed options, I > discovered that --repair and no special mount options was sufficient > to get the files removable. I'm curious whether running it directly on a newer kernel (e.g. latest ubuntu kernel-ppa/mainline) would be able to "solve" the problem, even without btrfsck. Also note that if by "snapshotting" you mean "create LVM snapshots", then you might be in for another surprise, as btrfs doesn't play nice with block devices with the same fs UUID. Don't rely on that as a backup option. > > Now what I'm hoping for is better documentation on btrfsck even if it > just boils down to a brief enumeration of the options as that would be > better than nothing which is what we have now. Do I need to file a bug > or is this sufficient? Edit https://btrfs.wiki.kernel.org/index.php/Btrfsck ? -- Fajar
Re: BTRFS fsck apparent errors
On Wed, Jul 4, 2012 at 8:42 PM, David Sterba wrote: > On Wed, Jul 04, 2012 at 07:40:05AM +0700, Fajar A. Nugraha wrote: >> Are there any known btrfs regressions in 3.4? I'm using 3.4.0-3-generic >> from a ppa, but a normal mount - umount cycle seems MUCH longer >> compared to how it was on 3.2, and iostat shows the disk is >> read-IOPS-bound > > Is it just mount/umount without any other activity? Yes > Is the fs > fragmented Not sure how to check that quickly > (or aged), Over 1 year, so yes > almost full, df says 83% used, so probably yes (depending on how you define "almost") ~ $ df -h /media/WD-root Filesystem Size Used Avail Use% Mounted on /dev/sdc2 922G 733G 155G 83% /media/WD-root ~ $ sudo btrfs fi df /media/WD-root/ Data: total=883.95GB, used=729.68GB System, DUP: total=8.00MB, used=104.00KB System: total=4.00MB, used=0.00 Metadata, DUP: total=18.75GB, used=1.49GB Metadata: total=8.00MB, used=0.00 > has lots of files? it's a "normal" 1 TB usb disk, with docs, movies, vm images, etc. No particular lots-of-small-files load like maildir or anything like that. >> # time umount /media/WD-root/ >> >> real 0m22.419s >> user 0m0.000s >> sys 0m0.064s >> >> # /proc/10142/stack <--- the PID of umount process > > The process(es) actually doing the work are the btrfs workers, usual > suspects are btrfs-cache (free space cache) or btrfs-ino (inode cache) > that are writing the cache states back to disk. Not sure about that, since iostat shows it's mostly read, not write. Will try iotop later. I also tested with Chris' for-linus on top of 3.4, same result (really long time to umount). Reverting back to ubuntu's 3.2.0-26-generic, umount only took less than 1 s :P So I guess I'm switching back to 3.2 for now. -- Fajar
Re: BTRFS fsck apparent errors
On Tue, Jul 3, 2012 at 10:22 PM, Hugo Mills wrote: > On Tue, Jul 03, 2012 at 05:10:13PM +0200, Swâmi Petaramesh wrote: >> After I had shifted, I tried to defragment and compress my FS using >> commands such as : >> >> find /mnt/STORAGEFS/STORAGE/ -exec btrfs fi defrag -clzo -v {} \; >> >> During execution of such commands, my kernel oopsed, so I restarted. > I would also suggest using a 3.4 kernel. There's at least one FS > corruption bug known to exist in 3.2 that's been fixed in 3.4. Are there any known btrfs regressions in 3.4? I'm using 3.4.0-3-generic from a ppa, but a normal mount - umount cycle seems MUCH longer compared to how it was on 3.2, and iostat shows the disk is read-IOPS-bound # time mount LABEL=WD-root real 0m10.400s user 0m0.000s sys 0m0.060s # time umount /media/WD-root/ real 0m22.419s user 0m0.000s sys 0m0.064s # /proc/10142/stack <--- the PID of umount process [] sleep_on_page+0xe/0x20 [] wait_on_page_bit+0x78/0x80 [] filemap_fdatawait_range+0x10c/0x1a0 [] btrfs_wait_marked_extents+0x6b/0xc0 [btrfs] [] btrfs_write_and_wait_marked_extents+0x3b/0x60 [btrfs] [] btrfs_write_and_wait_transaction+0x2b/0x50 [btrfs] [] btrfs_commit_transaction+0x759/0x960 [btrfs] [] btrfs_commit_super+0xbb/0x110 [btrfs] [] close_ctree+0x2a0/0x310 [btrfs] [] btrfs_put_super+0x19/0x20 [btrfs] [] generic_shutdown_super+0x62/0xf0 [] kill_anon_super+0x16/0x30 [] btrfs_kill_super+0x1a/0x90 [btrfs] [] deactivate_locked_super+0x3c/0xa0 [] deactivate_super+0x4e/0x70 [] mntput_no_expire+0xdc/0x130 [] sys_umount+0x66/0xe0 [] system_call_fastpath+0x16/0x1b -- Fajar
Re: Kernel panic from "btrfs subvolume delete"
On Fri, Jun 29, 2012 at 9:23 PM, Richard Cooper wrote: >>> If so, how? >> >> https://blogs.oracle.com/linux/entry/oracle_unbreakable_enterprise_kernel_release >> http://elrepo.org/tiki/kernel-ml > > Perfect, thank you! I was looking for a mainline kernel yum repo but my > google-fu was failing me. That looks like just what I need. > > I've installed kernel v3.4.4 from http://elrepo.org/tiki/kernel-ml and that > seems to have fixed my kernel panic. I'm still using the default Cent OS 6 > versions of the btrfs userspace programs (v0.19). Any reason why that might > be a bad idea? At the very least, a newer version of btrfsck has --repair, which you might need in the future. There are also features like forcing a certain compression (e.g. zlib) on a file as part of the "btrfs filesystem defrag" command. Just grab updated btrfs-progs (or whatever it's called) from Oracle's repo. -- Fajar
Re: Kernel panic from "btrfs subvolume delete"
On Fri, Jun 29, 2012 at 5:11 PM, Richard Cooper wrote: > Hi All, > > I have two machines where I've been testing various btrfs based backup > strategies. They are both Cent OS 6 with the standard kernel and btrfs-progs > RPMs from the CentOS repos. > > - kernel-2.6.32-220.17.1.el6.x86_64 > - btrfs-progs-0.19-12.el6.x86_64 In btrfs terms, 2.6.32 is ... stone age :P > What should I do now? Do I need to upgrade to a more recent btrfs? Yep > If so, how? https://blogs.oracle.com/linux/entry/oracle_unbreakable_enterprise_kernel_release http://elrepo.org/tiki/kernel-ml -- Fajar
Re: System Policy for Filenames
On Wed, Jun 27, 2012 at 1:28 AM, Aaron Peterson wrote: > Billy, > > Thank you! I will look into FUSE. > > Ultimately, I want my / to be mounted with these rules, I will need a > boot loader to be able to handle it. Try looking at how the ubuntu live cd works. Last time I checked, it can use unionfs-fuse as "/" to make the read-only cd media appear as a "writable" live session. Something similar should be applicable to your needs. > I am wondering if filesystem software has hooks for AppArmor or > SELinux, or some other Linux Security Module would be appropriate to > add to filesystem code? Not that I know of. -- Fajar
Re: Subvolumes and /proc/self/mountinfo
On Wed, Jun 20, 2012 at 10:22 AM, H. Peter Anvin wrote: > a. Make a snapshot of the current root; > b. Mount said snapshot; > c. Install the new distro on the snapshot; > d. Change the bootloader configuration *inside* the snapshot to point > to the snapshot as the root; > e. Install the bootloader on the snapshot, thereby making the boot > block point to it and making it "live". IMHO a more elegant solution would be similar to what (open)solaris/indiana does: make the boot parts (bootloader, configuration) a separate area, separate from root snapshots. In the solaris case IIRC this will be /rpool/grub. A similar approach should be implementable in linux, at least on certain configurations, since if you put /boot as part of "/" (thus, also on btrfs), AND you don't change the default subvolume, AND the roots are on their own subvolume, the paths to vmlinuz and initrd in grub.cfg will have the subvol name in them. So it's possible to have a single grub.cfg with several entries that point to different subvols. So you don't need to install a new bootloader to make a particular subvol live, you only need to select it from the boot menu. I'm currently doing this with ubuntu precise, though with a manually-created grub.cfg. Still haven't found a way to manage this automatically. -- Fajar
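The multi-entry grub.cfg described above might look like the following sketch. The subvolume names (@precise, @precise-snap), the filesystem UUID placeholder, and the kernel version are all hypothetical; adjust them to the actual system:

```
menuentry 'Ubuntu (subvol @precise)' {
    search --no-floppy --fs-uuid --set=root <filesystem-uuid>
    linux  /@precise/boot/vmlinuz-3.2.0-26-generic root=UUID=<filesystem-uuid> rootflags=subvol=@precise ro
    initrd /@precise/boot/initrd.img-3.2.0-26-generic
}
menuentry 'Ubuntu (snapshot @precise-snap)' {
    search --no-floppy --fs-uuid --set=root <filesystem-uuid>
    linux  /@precise-snap/boot/vmlinuz-3.2.0-26-generic root=UUID=<filesystem-uuid> rootflags=subvol=@precise-snap ro
    initrd /@precise-snap/boot/initrd.img-3.2.0-26-generic
}
```

Because the default subvolume is unchanged, the kernel/initrd paths start from the filesystem's top level, so each menuentry only differs in which subvol prefix it uses and which subvol it passes via rootflags=.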
Re: Subvolumes and /proc/self/mountinfo
On Wed, Jun 20, 2012 at 6:35 AM, H. Peter Anvin wrote: > On 06/19/2012 07:22 AM, Calvin Walton wrote: >> >> All subvolumes are accessible from the volume mounted when you use -o >> subvolid=0. (Note that 0 is not the real ID of the root volume, it's >> just a shortcut for mounting it.) >> > > Could you clarify this bit? Specifically, what is the real ID of the > root volume, then? 5 -- Fajar
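To illustrate the answer above, the top-level volume (subvolume ID 5) can be mounted explicitly by ID; a hypothetical /etc/fstab line (the UUID and mount point are placeholders):

```
# Mount the real top-level volume (subvolume ID 5), regardless of the
# currently configured default subvolume.
UUID=<filesystem-uuid>  /mnt/btrfs-top  btrfs  subvolid=5  0  0
```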
Re: Moving top level to a subvolume
On Wed, Jun 13, 2012 at 4:44 PM, C Anthony Risinger wrote: > On Wed, Jun 13, 2012 at 2:21 AM, Arne Jansen wrote: >> On 13.06.2012 09:04, C Anthony Risinger wrote: >>> ... because in a), data will be *copied* the slow way >> What I don't understand is why you think data will be copied. > at one point i tried to create a new subvol and `mv` files there, and > it took quite some time to complete > (cross-link-device-what-have-you?), but maybe things changed ... will > try it out. IIRC it hasn't. Not in upstream anyway. Some distros (e.g. opensuse) carry their own patch which allows cross-subvolume links (cp --reflink ...). But it shouldn't matter anyway, since you can SNAPSHOT the old subvol (even the root subvol), instead of creating a new subvol. Which means nothing needs to be copied. You'd still have to do "rm" manually though. -- Fajar
Re: Moving top level to a subvolume
On Wed, Jun 13, 2012 at 2:23 PM, Duncan <1i5t5.dun...@cox.net> wrote: > Fajar A. Nugraha posted on Wed, 13 Jun 2012 08:49:47 +0700 as excerpted: > >> As for "lose their filesystems", are there recent ones that use one of >> the three distros above, and are purely btrfs' "fault"? The ones I can >> remember (from the posts to this list) were broken on earlier kernels, or >> caused by bad disks. > My system's old and has a bit of a problem with overheating in the > Phoenix summer, so has been suffering SATA resets > it's exactly this sort of > corner-case that filesystems need to be able to deal with IIRC XFS had corruption problems when used on top of LVM (or other block devices that don't support barriers correctly), while using ext2/3/4 on the same block device will be "fine". Yet XFS doesn't have the mark of "unstable, highly experimental, do not use". People simply use the right (for them) fs for the right job. My point is yes, btrfs is new. And it's being developed at a much faster rate than any other more-mature fs out there. And there are known cases of data loss on certain configurations of corner cases/"buggy" hardware and/or old versions of the kernel. But when used in the correct environment, btrfs can be a good choice, even for critical data. Of course IF the data were REALLY critical, and I REALLY needed btrfs' features, and it were in an enterprise environment, I would've bought support from oracle linux (or SLES 12, when it's out, or whatever enterprise distro supporting btrfs which sells support contracts) so I'd have someone to turn to in case of problems, and (in some cases) transfer the risk/blame :D -- Fajar
Re: Moving top level to a subvolume
On Tue, Jun 12, 2012 at 9:52 PM, Randy Barlow wrote: > I personally run Gentoo, but I've been told by some coworkers that the Ubuntu > installer offers btrfs as an option to the users without marking it as > experimental, unstable, or under development. I wonder if that is why we see > so many people surprised when they lose their filesystems. Can anyone verify > whether that is true of Ubuntu, or of any other Linux distributions? Oracle linux (when used with UEK2) officially supports btrfs. Opensuse also supports btrfs, and uses its functionality for snapper. I haven't found any updated (i.e. released post 12.04) official support status statement from Ubuntu, but they do offer btrfs as an installation option. As for "lose their filesystems", are there recent ones that use one of the three distros above, and are purely btrfs' "fault"? The ones I can remember (from the posts to this list) were broken on earlier kernels, or caused by bad disks. -- Fajar
Re: Preparing single-disk setup for future multi-disk usage
On Thu, May 24, 2012 at 1:05 PM, Björn Wüst wrote: > > Unfortunately, I do not have a disk to test it right now. The disk I am > planning to use is with the post service still :) . You can use sparse files. Possibly with losetup, if necessary. > Thank you for your replies to this email (bjoern.wu...@gmx.net, That's not the email address you sent from > I am not subscribed to the mailing lists, thus please do a 'reply all'). IMHO asking something on a list and then saying "I am not subscribed" and "send your reply to this other email address that I'm not using to send" is rude. -- Fajar
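A minimal sketch of the sparse-file approach suggested above. The file names and sizes are arbitrary; the loop-device and mkfs steps need root and btrfs-progs, so they are shown as comments rather than executed:

```shell
# Create two 1 GiB sparse backing files -- they occupy almost no real
# disk space until written to, so they are cheap stand-ins for disks.
truncate -s 1G /tmp/btrfs-disk-1.img /tmp/btrfs-disk-2.img
du -h --apparent-size /tmp/btrfs-disk-1.img

# As root, you would then attach them as loop devices and build a
# multi-device filesystem on top:
#   losetup -f --show /tmp/btrfs-disk-1.img    # prints e.g. /dev/loop0
#   losetup -f --show /tmp/btrfs-disk-2.img    # prints e.g. /dev/loop1
#   mkfs.btrfs -d raid1 -m raid1 /dev/loop0 /dev/loop1
#   mount /dev/loop0 /mnt/test
```

This lets you rehearse device add/remove/balance operations before the real disk arrives.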
Re: kernel 3.3.4 damages filesystem (?)
On Tue, May 8, 2012 at 2:39 PM, Helmut Hullen wrote: >> And you can use three BTRFS filesystems the same way as three Ext4 >> filesystems if you prefer such a setup if the time spent for >> restoring the backup does not make up the cost for one additional >> disk for you. > > But where's the gain? If a disk fails I have a lot of tools for > repairing an ext2/3/4 system. It won't work if you use it in RAID0 (e.g. with LVM spanning three disks, then ext4 on top of the LV). Which is basically the same thing that you did (using btrfs in raid0 mode). As others said, if your only concern is "if a disk is dead, I want to be able to access data on the other disks", then simply use btrfs as three different filesystems, mounted on three directories. btrfs will shine when: - you need checksums and self-healing in raid10 mode - you have lots of small files - you have highly compressible content - you need the snapshot/clone feature Since you don't need any of those, IMHO it's actually better if you just use ext4. -- Fajar
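The three-independent-filesystems setup suggested above could look like this in /etc/fstab (the UUIDs and mount points are placeholders; each disk carries its own single-device btrfs filesystem):

```
# Three separate single-disk btrfs filesystems: a dead disk only takes
# down its own mount point, leaving the other two fully accessible.
UUID=<uuid-of-disk1>  /srv/media1  btrfs  defaults,nofail  0  0
UUID=<uuid-of-disk2>  /srv/media2  btrfs  defaults,nofail  0  0
UUID=<uuid-of-disk3>  /srv/media3  btrfs  defaults,nofail  0  0
```

The nofail option keeps the system booting even when one of the disks is missing.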
Re: Can btrfs silently repair read-error in raid1
On Tue, May 8, 2012 at 2:13 PM, Clemens Eisserer wrote: > Hi, > > I have a quite unreliable SSD here which develops some bad blocks from > time to time which result in read-errors. > Once the block is written to again, it's remapped internally and > everything is fine again for that block. > > Would it be possible to create 2 btrfs partitions on that drive and > use it in RAID1 - with btrfs silently repairing read-errors when they > occur? > Would it require special settings, to not fall back to read-only mode > when a read-error occurs? The problem would be how the SSD (and linux) behaves when it encounters bad blocks (not bad disks, which is easier). If it does "oh, I can't read this block. I'll just return an error immediately", then it's good. However, in most situations, it would be like "hmmm, I can't read this block, let me retry that again. What? still an error? then let's retry it again, and again.", which could take several minutes for a single bad block. And during that time linux (the kernel) would do something like "hey, the disk is not responding. Why don't we try some stuff? Let's try resetting the link. If it doesn't work, try downgrading the link speed". In short, if you KNOW the SSD is already showing signs of bad blocks, better just throw it away. -- Fajar
Re: kernel 3.3.4 damages filesystem (?)
On Mon, May 7, 2012 at 5:46 PM, Helmut Hullen wrote: > For some months I ran btrfs under kernel 3.2.5 and 3.2.9, without > problems. > > Yesterday I compiled kernel 3.3.4, and this morning I started the > machine with this kernel. There may be some ugly problems. > Data, RAID0: total=5.29TB, used=4.29TB Raid0? Yaiks! > System, RAID1: total=8.00MB, used=352.00KB > System: total=4.00MB, used=0.00 > Metadata, RAID1: total=149.00GB, used=5.00GB > > Label: 'MMedia' uuid: 9adfdc84-0fbe-431b-bcb1-cabb6a915e91 > Total devices 3 FS bytes used 4.29TB > devid 3 size 2.73TB used 1.98TB path /dev/sdi1 > devid 2 size 2.73TB used 1.94TB path /dev/sdf1 > devid 1 size 1.82TB used 1.63TB path /dev/sdc1 > > May 7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr > 0x1 action 0xe frozen > May 7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg } > May 7 06:55:26 Arktur kernel: ata5: hard resetting link > May 7 06:55:31 Arktur kernel: ata5: COMRESET failed (errno=-19) > May 7 06:55:31 Arktur kernel: ata5: reset failed (errno=-19), retrying in 6 > secs > May 7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device > May 7 07:15:19 Arktur kernel: end_request: I/O error, dev sdf, sector 0 > May 7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device > May 7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1 That looks like a bad disk to me, and it shouldn't be related to the kernel version you use. Your best chance might be: - unmount the fs - get another disk to replace /dev/sdf, copy the content over with dd_rescue. ATA resets can be a PITA, so you might be better off moving the failed disk to a usb external adapter, and doing some creative combination of plug-unplug and selectively skipping bad sectors manually (by passing "-s" to dd_rescue). - reboot, with the bad disk unplugged - (optional) run "btrfs filesystem scrub" (you might need to build btrfs-progs manually from git source). or simply read the entire fs (e.g. 
using tar to /dev/null, or whatever). It should check the checksums of all files and print out which files are damaged (either on stdout or in syslog). I don't think there's anything you can do to recover the damaged files (other than restore from backup), but at least you know which files are NOT damaged. -- Fajar
Re: Can't mount
On Thu, May 3, 2012 at 10:31 PM, Hugo Mills wrote: > On Thu, May 03, 2012 at 03:18:01PM +, Yo'av Moshe wrote: >> Is there anything else I can try? >> >> I'm using kernel 3.2 on Ubuntu 12.04. > > In approximate order: > > * Try a 3.3 or 3.4-rc5 kernel. I don't think those will do anything > to fix this particular issue, but it's worth a try. > > * You have the last-but-one generation listed by find-root: > > Well block 216926195712 seems great, but generation doesn't match, > have=135713, want=135714 > > so you can use the restore tool[1] with that block number (and -t) > to copy off any data that you need that isn't backed up. Is btrfs-zero-log still relevant? I imagine losing the last several transactions is MUCH more convenient than having to recreate the entire fs (even if restore managed to salvage everything). And what about mount -o ro,recover? -- Fajar
Re: How are files stored when using Btrfs on multiple devices? What happens when a device fails?
On Thu, May 3, 2012 at 1:46 PM, Chu Duc Minh wrote: > Hi, i have some questions when using Btrfs on multi-devices: > 1. will a large file always be stored wholly on a device, or may it > spread over some devices/partitions? IIRC: - in raid1 mode, it will be written on all disks (or was it TWO disks, regardless of how many devices are in a mirror? can't remember which). - in raid10 and raid0, it will always be spread, on a minimum of two devices > Btrfs has an option to specify it > explicitly? Not that I know of. > 2. suppose i have a directory tree like that: > Dir_1 > |--> file_1A > |--> file_1B > |--> Dir_2 > |--> file_2C > |--> file_2D > > If Dir_2, file_2C are on a failed device, can i still have access to file_2D? Unless you're using raid10, my guess is you'll be screwed, as each file will be spread on multiple devices (including the one that fails). > If i use GlusterFS (mirror mode) on two nodes, each node runs Btrfs on > multi-device. When a device on a node fails and I replace it, then > GlusterFS resyncs it, can i have troubles with data consistency? This question might be more suitable for the glusterfs list. My guess is that glusterfs will discard all data on the failed node. After you recreate the storage backend (the btrfs, on a new device), you can tell glusterfs to copy everything from the good node. Of course, if you use raid10 mode in btrfs, and only one device fails, it should be transparent to end users. -- Fajar
Re: btrfs across a mix of SSDs & HDDs
On Wed, May 2, 2012 at 12:00 PM, Bardur Arantsson wrote: > On 05/02/2012 06:28 AM, Fajar A. Nugraha wrote: >>> From Kconfig: >>> >>> "Btrfs filesystem (EXPERIMENTAL) Unstable disk format" >>> ^^ >>> >>> Btrfs is too immature to use in ANY kind of production-like scenario >>> where >>> you cannot afford to lose a certain amount of data (i.e. be forced to >>> restore from backup) AND suffer downtime. >>> >>> I don't think email users are going to be thrilled about the prospect of >>> "lossy" email. >> >> >> Oracle fully supports btrfs for production environment: >> http://oss.oracle.com/ol6/docs/RELEASE-NOTES-UEK2-en.html >> >> http://www.zdnet.com/blog/open-source/oracles-unbreakable-enterprise-kernel-2-arrives-with-linux-30-kernel-btrfs/10588 >> http://www.oracle.com/us/technologies/linux/index.html >> > > What does "fully supports" mean? Does it mean that it's actually stable > (considerably more stable that mainline), or does it mean that you can pay > them to help fix a broken FS, for example? Does the included btrfsck > actually work reliably? Is there some non-legalese official statement of > what, exactly, "fully supported" means and whether OL's btrfs falls under > this rubric? That question would be best addressed to Oracle directly. Or other distro vendors supporting btrfs (IIRC SLES also supports it). > > Also, AFAIUI the 3.0.x kernels (which OL claims to use in the release notes) > are woefully outdated wrt. btrfs reliability/stability. Have all the more > recent stability improvements been backported? Chris or other devs from oracle might be able to comment more on that. I know that it's quite common for an OSS vendor to have a supported version of something, based on a version that is more thoroughly tested, and have another version (in this case the version of btrfs in mainline) that has newer, bleeding-edge code, with more features, but possibly also more bugs. > > Is the OP using Oracle Linux? He didn't say. 
But he didn't say he WON'T be using oracle linux (or another distro which supports btrfs) either. Plus the kernel can be installed on top of RHEL/Centos 5 and 6, so he can easily choose either the supported version or the mainline version, each with its own consequences. > Given the semi-regular posts about FS corruption on this list(*) and the > "EXPERIMENTAL" status in the KConfig it would be unwise to use btrfs for > anything called "production" (unless you can actually afford downtime/data > loss). Fair opinion. Personally I'm quite happy with the version that is included in Ubuntu Precise (kernel 3.2). It has actually helped me recover from a bad SSD. It was a somewhat old SSD, and about 1GB (out of 50GB) of data became unreadable (reading directly from the block device). "btrfs scrub" was helpful enough to help me find out which files were corrupted, something I wouldn't be able to do with ext4. -- Fajar
Re: btrfs across a mix of SSDs & HDDs
On Wed, May 2, 2012 at 9:22 AM, Bardur Arantsson wrote: > On 05/01/2012 09:35 PM, Martin wrote: >> >> How well does btrfs perform across a mix of: >> >> 1 SSD and 1 HDD for 'raid' 1 mirror for both data and metadata? >> The idea is to gain the random access speed of the SSDs but have the >> HDDs as backup in case the SSDs fail due to wear... AFAIK only zfs officially supports that configuration, using L2ARC and SLOG >> >> The usage is to support a few hundred Maildirs + imap for users that >> often have many thousands of emails in the one folder for their inbox... Some mail programs use hardlinks, and btrfs has a low limit on the maximum number of hardlinks in a directory. If you use one of those programs, better stay away for now. Plus, from my experience, when using the same disk, btrfs will use more disk I/O compared to ext4, so if you're already I/O-starved, better stick with ext4. >> Or is btrfs yet too premature to suffer such use? >> > > From Kconfig: > > "Btrfs filesystem (EXPERIMENTAL) Unstable disk format" > ^^ > > Btrfs is too immature to use in ANY kind of production-like scenario where > you cannot afford to lose a certain amount of data (i.e. be forced to > restore from backup) AND suffer downtime. > > I don't think email users are going to be thrilled about the prospect of > "lossy" email. Oracle fully supports btrfs for production environments: http://oss.oracle.com/ol6/docs/RELEASE-NOTES-UEK2-en.html http://www.zdnet.com/blog/open-source/oracles-unbreakable-enterprise-kernel-2-arrives-with-linux-30-kernel-btrfs/10588 http://www.oracle.com/us/technologies/linux/index.html -- Fajar
Re: snapper for Ubuntu? (WAS: btrfs auto snapshot)
On Tue, Apr 10, 2012 at 9:35 PM, Matthias G. Eckermann wrote: > On 2012-04-10 T 20:48 +0700 Fajar A. Nugraha wrote: >> How can I create config for /data or other directories (other than >> manually creating the config file and .snapshots directory)? > > This should do it: > > sudo snapper -c home create-config /home > sudo snapper -c data create-config /data > > The reason for the extra "-c " is that you have to > tell snapper which name to choose for the configuration > you want to create. This name is the one you can reference > in future actions such as create/modify/delete. Great! That works, thanks. Is there an opposite of create-config, i.e. delete for just one subvolume? delete-config seems to delete everything (configs for all subvolumes and all snapshots). Also, one minor detail: I noticed that the cron configuration file is /etc/sysconfig/snapper. It should be /etc/default/snapper on ubuntu/debian. -- Fajar
Re: snapper for Ubuntu? (WAS: btrfs auto snapshot)
On Tue, Apr 10, 2012 at 6:46 PM, Arvin Schnell wrote: > On Mon, Apr 09, 2012 at 08:18:45AM +0700, Fajar A. Nugraha wrote: >> I noticed that openSUSE buildservice now provides debs for ubuntu as >> well. I can't seem to find a way to add it to apt source list though, >> using the usual line >> >> deb uri distribution [component1] > > You can use these commands: > > echo 'deb > http://download.opensuse.org/repositories/filesystems:/snapper/Debian_6.0/ /' > >> /etc/apt/sources.list I didn't know you could use that format :D Just tested it, and it works, although the command I use is echo 'deb http://download.opensuse.org/repositories/filesystems:/snapper/Debian_6.0/ /' | sudo tee /etc/apt/sources.list.d/opensuse-snapper.list > > apt-get update That got me the error W: GPG error: http://download.opensuse.org Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 2DA6FAF4175BFA4E easily fixed though, using $ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 2DA6FAF4175BFA4E ... and then another apt-get update after that. > apt-get install snapper That resulted in a warning WARNING: The following packages cannot be authenticated! libsnapper snapper Install these packages without verification [y/N]? Did the package creation process somehow omit the signing step, perhaps? Or is there something else I missed? Anyway, I got snapper-0.0.10-0 installed now, but I'm having a small problem. I use different subvolumes for multiple directories. For example, /home and /data. Creating the config for both results in an error $ sudo snapper list-configs Config | Subvolume ---+-- $ sudo snapper create-config /home $ sudo snapper create-config /data Creating config failed (config already exists). $ sudo snapper list-configs Config | Subvolume ---+-- root | /home How can I create config for /data or other directories (other than manually creating the config file and .snapshots directory)? 
-- Fajar
Re: Snapper packages for Ubuntu
On Tue, Apr 10, 2012 at 6:50 PM, Arvin Schnell wrote: > On Tue, Apr 10, 2012 at 05:37:38PM +0700, Fajar A. Nugraha wrote: >> Hi, >> >> I've created snapper packages for Ubuntu, available on >> https://launchpad.net/~snapper/+archive/stable. For those new to >> snapper, it's a tool for managing btrfs snapshots >> (http://en.opensuse.org/Portal:Snapper). It depends on libblocxx > libblocxx is not required for snapper anymore since about a > month. It's checked during configure. You're right. I just tested it, and not having libblocxx during compilation results in fewer dependencies (namely libblocxx itself, plus libssl, libcrypto, and libpcre). What functionality, if any, is not available when not using libblocxx? Since it's still used when present during configure, I assume it's good for something. Thanks. Fajar
[PATCH] Snapper: Always create .snapshots dir unconditionally
Current version of snapper (commit 50dec40) bails out with this error if .snapshots directory doesn't exist (as is the case on new snapper install): 2012-04-10 16:15:30,241 ERROR libsnapper(17784) Snapshot.cc(nextNumber):362 - mkdir failed errno:2 (No such file or directory) This patch tries to create .snapshots dir unconditionally. Signed-off-by: Fajar A. Nugraha --- snapper/Snapshot.cc |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/snapper/Snapshot.cc b/snapper/Snapshot.cc index 8e9cc37..277fad7 100644 --- a/snapper/Snapshot.cc +++ b/snapper/Snapshot.cc @@ -353,6 +353,9 @@ namespace snapper if (snapper->getFilesystem()->checkSnapshot(num)) continue; + // try to create .snapshots dir unconditionally + mkdir(snapper->infosDir().c_str(), 0711); + if (mkdir((snapper->infosDir() + "/" + decString(num)).c_str(), 0777) == 0) break; -- 1.7.9.1
Snapper packages for Ubuntu
Hi, I've created snapper packages for Ubuntu, available on https://launchpad.net/~snapper/+archive/stable. For those new to snapper, it's a tool for managing btrfs snapshots (http://en.opensuse.org/Portal:Snapper). It depends on libblocxx available from https://launchpad.net/~bjoern-esser-n/+archive/blocxx , and currently uses git source up to commit 50dec40. I've done some limited testing and it seems to work correctly so far. There's a small, distro-independent patch needed for it to work correctly though. I'm sending it as a separate mail. @Arvin, @MGE, I don't know the correct list for snapper development so I'm cc-ing you both. If there's a dedicated list for snapper please let me know and I'll post further updates there. -- Fajar
snapper for Ubuntu? (WAS: btrfs auto snapshot)
On Thu, Mar 1, 2012 at 8:48 PM, Arvin Schnell wrote: > We have now created a project in the openSUSE buildservice were > we provide snapper packages for various distributions, e.g. RHEL6 > and Fedora 16. Please find the downloads at: > > http://download.opensuse.org/repositories/filesystems:/snapper/ > > I'll also add a link from the snapper home page: > > http://en.opensuse.org/Portal:Snapper. > > I have tested snapper on Fedora 16 and found no problems. Hi Arvin, I noticed that openSUSE buildservice now provides debs for ubuntu as well. I can't seem to find a way to add it to apt source list though, using the usual line deb uri distribution [component1] Is there a howto somewhere, or is it download-all-debs-manually-and-install-with-dpkg for now? Thanks, Fajar
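For what it's worth, openSUSE Build Service Debian/Ubuntu repositories are usually published "flat", so the sources.list line takes a bare "/" as the distribution and no component at all. A sketch of such an entry (the xUbuntu_11.10 subdirectory is an assumption; check what actually exists under the repository URL first):

```
# /etc/apt/sources.list.d/snapper.list -- illustrative only; the
# xUbuntu_11.10 path is a guess, verify it in a browser first.
# OBS repos are flat: the distribution field is "/" and there is
# no component after it.
deb http://download.opensuse.org/repositories/filesystems:/snapper/xUbuntu_11.10/ /
```

After adding the file, a normal "apt-get update && apt-get install snapper" should see the packages.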
Re: btrfsck integration with userlevel API for fsck
On Sat, Mar 31, 2012 at 3:35 AM, Avi Miller wrote: > > On 30/03/2012, at 2:22 PM, Fajar A. Nugraha wrote: > >> On Fri, Mar 30, 2012 at 5:08 AM, member graysky wrote: >>> Are there plans to integrate btrfsck with the userlevel API for fsck? >> >> There isn't even a stable, working, fixing btrfsck yet :) > > Yes, there is. Chris merged the btrfsck changes into the btrfs-progs master > in git a few days ago and we shipped it with the Oracle Linux UEK2 update as > well. Ah, OK. I must've missed the announcement. Thanks for the update. Now if only UEK2 fully supports LXC as well instead of tech preview ... :D -- Fajar
Re: btrfsck integration with userlevel API for fsck
On Fri, Mar 30, 2012 at 5:08 AM, member graysky wrote: > Are there plans to integrate btrfsck with the userlevel API for fsck? There isn't even a stable, working, fixing btrfsck yet :) > AFAIK, it currently does not work as such (i.e. `shutdown -rF now` > does not trigger a check on the next boot). What is the recommended > method to check a btrfs root filesystem? Live media? Currently? None. Set the last part of the root line in fstab to "0" to disable fsck. Newer kernels should be smart enough to recover from unclean shutdown automatically, kinda like what zfs does, or what ext3/4 does with its journal replay. -- Fajar
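In fstab terms, the field in question is the sixth one (fs_passno); setting it to 0 tells the boot-time fsck wrapper to skip the filesystem entirely. A sketch of such a root entry (the UUID is made up):

```
# /etc/fstab -- illustrative entry only; the UUID is a placeholder.
# <device>                                   <mnt>  <type>  <opts>    <dump> <pass>
UUID=291f66d7-xxxx-xxxx-xxxx-xxxxxxxxxxxx    /      btrfs   defaults  0      0
```

The last 0 (fs_passno) is what disables the boot-time check; the fifth field is the unrelated dump(8) flag.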
Re: Create subvolume from a directory?
On Wed, Mar 28, 2012 at 5:24 AM, Matthias G. Eckermann wrote: > While the time measurement might be flawed due to the subvol > actions inbetween, caching etc.: I tried several times, and > "cp --reflinks" always is multiple times faster than "mv" in > my environment. So this is cross-subvolume reflinks? I thought the code for that wasn't merged yet? -- Fajar
Re: btrfs and backups
On Mon, Mar 26, 2012 at 3:56 PM, Felix Blanke wrote: > On 3/26/12 10:30 AM, James Courtier-Dutton wrote: >> Is there some tool like rsync that I could copy all the data and >> snapshots to a backup system, but still only use the same amount of >> space as the source filesystem. > I'm not sure if I understand your problem right, but I would suggest: > > 1) Snapshot the subvolume on the source > 2) rsync the snapshot to the destination > 3) Snapshot the destination James did say "only use the same amount of space as the source filesystem." Your approach would increase the usage when one or more subvolumes share the same space (e.g. when one subvolume starts as a snapshot). AFAIK the (planned) way to do this is using "btrfs send | receive", which is not available yet. -- Fajar
Re: Can't mount, power failure - recoverable?
On Mon, Mar 26, 2012 at 3:49 PM, Skylar Burtenshaw wrote: > Fajar A. Nugraha fajar.net> writes: > >> Didn't Chris' last response basically say "use kernel 3.2 or newer, >> mount the fs (possibly with -o ro), and copy the data elsewhere"? > > Why yes, yes it did actually. I appreciate your spotlighting it, just in case > I > somehow managed to miss it, though. > >> Have you done that? > > I have. In fact, in my first message, I stated that in all kernels up to > present > 3.2 kernels, I get several minutes of disk churning, then a stack trace. Also > present in my messages is the fact that the filesystem will not mount, as well > as data output from the recovery program etc which fail to recognize things in > the filesystem that they require in order to fix it. Did you have something > you > wished to suggest, in order to help me? If so, I'd gladly listen to any > proposed > ideas. Since you apparently tried "-o ro" (which I missed), then my last suggestion is probably kernel 3.3 with "-o ro", just in case :) -- Fajar
Re: Can't mount, power failure - recoverable?
On Mon, Mar 26, 2012 at 3:34 PM, Skylar Burtenshaw wrote: > Hey - been a few days, not meaning to pester but I wanted to make sure my > previous message didn't slip through the cracks. If I offended, I apologize - > I > certainly didn't mean to, and my attempts at joviality can come across as > abrasive. If you simply haven't had time to look into this yet, or it's > bizarre > enough that it's taking time to isolate, take all the time you need. Thank > you. Didn't Chris' last response basically say "use kernel 3.2 or newer, mount the fs (possibly with -o ro), and copy the data elsewhere"? Have you done that? -- Fajar
Re: compressed btrfs "No space left on device"
On Thu, Nov 17, 2011 at 12:59 AM, Arnd Hannemann wrote: > Am 14.11.2011 19:24, schrieb Arnd Hannemann: >> Am 14.11.2011 15:57, schrieb Arnd Hannemann: >> >>> I'm using btrfs for my /usr/share/ partition and keep getting the following >>> error >>> while installing a debian package which should take no more than 228 MB: >>> >>> Unpacking texlive-fonts-extra (from >>> .../texlive-fonts-extra_2009-10ubuntu1_all.deb) ... >>> dpkg: error processing >>> /var/cache/apt/archives/texlive-fonts-extra_2009-10ubuntu1_all.deb >>> (--unpack): >>> unable to install new version of >>> `/usr/share/texmf-texlive/fonts/type1/public/allrunes/frutlt.pfb': No space >>> left on device >> FYI: The problem is the same with mainline kernel v3.1.1. > > JFYI: the problem went away in 3.2-rc2 so someone must > have fixed something. I just experienced the same thing in Ubuntu precise, 3.2.0-17-generic, so I don't think it's fixed yet.

$ sudo btrfs fi df /
Data: total=43.47GB, used=38.47GB
System, DUP: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=3.25GB, used=912.47MB
Metadata: total=8.00MB, used=0.00

$ df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda6        50G   41G  5.1G  89% /

The problem occurs when copying the precise lxc root template (322M, 13759 files/directories). It only happens when using zlib compression though; using lzo works fine. -- Fajar
Re: [PATCH] [RFC] Add btrfs autosnap feature
On Mon, Mar 5, 2012 at 1:51 PM, Anand Jain wrote: > >> (notably the direct modification of >> crontab files, which is considered to be an internal detail if I >> understand correctly, and I'm fairly certain is broken as written), > > > I did came across that point of view however, using crontab cli in the > program wasn't convincing either, (library call would have been better). > any other better ways to manage cron entries ? > /etc/cron.{d,daily,hourly} ? -- Fajar
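For example, instead of rewriting a user's crontab, the tool could drop a fragment into /etc/cron.d, which the package owns outright. A sketch (the file name, tool path, and flags are all made up for illustration):

```
# /etc/cron.d/btrfs-autosnap -- hypothetical packaged cron fragment.
# Unlike a user crontab, /etc/cron.d entries carry an extra user field
# ("root" below), and the file can be installed/removed by the package
# without ever touching the admin's own crontab.
*/15 * * * *  root  /usr/sbin/btrfs-autosnap --tag @15min /btrfs/sv1
@daily        root  /usr/sbin/btrfs-autosnap --tag @daily /btrfs/sv1
```

Scripts in /etc/cron.{daily,hourly} work too, but give less control over the schedule than a cron.d fragment.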
Re: filesystem full when it's not? out of inodes? huh?
On Fri, Mar 2, 2012 at 6:50 PM, Brian J. Murrell wrote: > Is 2010-06-01 really the last time the tools were considered > stable or are Ubuntu just being conservative and/or lazy about updating? The last one :) Or probably no one has bugged them enough and pointed out that they're already using a git snapshot anyway and there are many new features in the "current" git version of btrfs-tools. I have been compiling my own kernel (just recently switched to Precise's kernel though) and btrfs-progs for quite some time, so even if Ubuntu doesn't provide an updated package it wouldn't matter much to me. If it's important for you, you could file a bug report in launchpad asking for an update. Even debian testing has an updated version (which you might be able to use: http://packages.debian.org/btrfs-tools) Or create your own ppa with an updated version (or at least a rebuild of Debian's version). -- Fajar
Re: [RFC] btrfs auto snapshot
On Thu, Mar 1, 2012 at 8:48 PM, Arvin Schnell wrote: > On Thu, Feb 23, 2012 at 04:54:06PM +0700, Fajar A. Nugraha wrote: >> On Thu, Aug 18, 2011 at 12:38 AM, Matthias G. Eckermann >> wrote: > >> > are available in the openSUSE buildservice at: >> > >> > http://download.opensuse.org/repositories/home:/mge1512:/snapper/ >> > >> >> Hi Matthias, >> >> I'm testing your packages on top of RHEL6 + kernel 3.2.7. A small >> suggestion, you should include /etc/sysconfig/snapper in the package >> (at least for RHEL6, haven't tested the other ones). Even if it just >> contains >> >> SNAPPER_CONFIGS="" > > Hi Fajar, > > thanks for reporting that issue, I have fixed it now. Great! Thanks. > > We have now created a project in the openSUSE buildservice were > we provide snapper packages for various distributions, e.g. RHEL6 > and Fedora 16. Please find the downloads at: > > http://download.opensuse.org/repositories/filesystems:/snapper/ > > I'll also add a link from the snapper home page: > > http://en.opensuse.org/Portal:Snapper. > > I have tested snapper on Fedora 16 and found no problems. When I installed it back then, the first thing that came to mind was "there's no documentation on how to get started". http://en.opensuse.org/openSUSE:Snapper_Tutorial is good, but that is assuming root is btrfs, and snapper is already configured to snapshot root. For other distros, you need to first create the config manually, e.g. as shown for home in http://en.opensuse.org/openSUSE:Snapper_FAQ Could you update the tutorial, or perhaps create a new "quickstart" page? I'm kinda reluctant to do it myself since I don't use opensuse, and some of my edits might not reflect the "correct" way to do it in opensuse. If that's not possible, I'll put up the documentation somewhere else (perhaps the semi-official http://btrfs.ipv5.de/ , or my own wiki). 
Two other things that I haven't found are:
- how to add pre and post hooks, so (for example) snapper could create the same pre-post snapshot whenever user runs "yum", similar to when a user runs "yast" in opensuse,
- whether a rollback REALLY rolls back everything (including binary and new/missing files), or is it git-like behavior, or if it only processes text files.
... but those two aren't as important as the getting-started documentation. -- Fajar
Re: Btrfs Storage Array Corrupted
On Wed, Feb 29, 2012 at 7:13 AM, Travis Shivers wrote: > # ./btrfs-zero-log /dev/sdh > parent transid verify failed on 5568194695168 wanted 43477 found 43151 > parent transid verify failed on 5568194695168 wanted 43477 found 43151 > parent transid verify failed on 5568194695168 wanted 43477 found 43151 > parent transid verify failed on 5568194695168 wanted 43477 found 43151 > Ignoring transid failure Did you try a read-only mount (-o ro) after you run btrfs-zero-log? -- Fajar
Re: [RFC] btrfs auto snapshot
On Thu, Feb 23, 2012 at 7:02 PM, Anand Jain wrote: > > > autosnap code is available either end of this week or early > next week I thought you stopped working on this :D Alternatives are good though. Will test yours when it's out. FWIW, I also have another one, based on zfsonlinux's autosnapshot script :D > and what you will notice is autosnap snapshots > are named using uuid. > > Main reason to drop time-stamp based names is that, > - test (clicking on Take-snapshot button) which took more > than one snapshot per second was failing. > - a more descriptive creation time is available using a > command line option as in the example below. > - > # btrfs su list -t tag=@minute,parent=/btrfs/sv1 /btrfs > /btrfs/.autosnap/6c0dabfa-5ddb-11e1-a8c1-0800271feb99 Thu Feb 23 13:01:18 > 2012 /btrfs/sv1 @minute > /btrfs/.autosnap/5669613e-5ddd-11e1-a644-0800271feb99 Thu Feb 23 13:15:01 > 2012 /btrfs/sv1 @minute > - > As of now code for time-stamp as autosnap snapshot name is > commented out, if more people wanted it to be a time-stamp > based names, I don't mind having that way. Please do let me know. For me the main bonus point of having the timestamp in names is the ability to sort by creation date by simply using "ls". As for the more-than-one-click-per-second problem, in my script I simply let it fail and return informative-enough error message. A workaround would be adding nanosecond timestamp, or put the UUID AFTER the time stamp, e.g: /btrfs/.autosnap/@minute_20120223_131501_123456_5669613e-5ddd-11e1-a644-0800271feb99 -- Fajar
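The tag-then-timestamp-then-UUID scheme sketched above can be put together in a few lines of shell. This is a minimal sketch, not snapper/autosnap code; the tag and target paths are made up, and it assumes GNU date (for %N nanoseconds) plus a Linux /proc for the UUID:

```shell
#!/bin/sh
# Sketch: build a snapshot name that sorts chronologically with plain
# "ls" yet stays unique even when several snapshots land in one second.
tag="@minute"                              # hypothetical schedule tag
stamp=$(date +%Y%m%d_%H%M%S_%N)            # sortable, nanosecond resolution
uuid=$(cat /proc/sys/kernel/random/uuid)   # uniqueness tie-breaker
name="${tag}_${stamp}_${uuid}"
echo "$name"
# A real tool would then run something like:
#   btrfs subvolume snapshot /btrfs/sv1 "/btrfs/.autosnap/$name"
```

Because the tag and timestamp lead, `ls /btrfs/.autosnap` lists snapshots in creation order per tag, while the trailing UUID keeps even same-nanosecond collisions apart.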
Re: [RFC] btrfs auto snapshot
On Thu, Aug 18, 2011 at 12:38 AM, Matthias G. Eckermann wrote: > Ah, sure. Sorry. Packages for "blocxx" for: > Fedora_14 Fedora_15 > RHEL-5 RHEL-6 > SLE_11_SP1 > openSUSE_11.4 openSUSE_Factory > > are available in the openSUSE buildservice at: > > http://download.opensuse.org/repositories/home:/mge1512:/snapper/ > Hi Matthias, I'm testing your packages on top of RHEL6 + kernel 3.2.7. A small suggestion, you should include /etc/sysconfig/snapper in the package (at least for RHEL6, haven't tested the other ones). Even if it just contains SNAPPER_CONFIGS="" Thanks, Fajar
PATCH: Fix incorrect "error checking ... mount status" in mkfs.btrfs
Originally reported on linux-btrfs list: http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg08086.html

Fix suggested on: http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg08386.html

loop_info.lo_name is limited to LO_NAME_SIZE (currently 64) characters. This can cause a problem if a file whose full path is longer than LO_NAME_SIZE is currently mounted.

This patch changes resolve_loop_device() to:
* Check /sys/block/loopX/loop/backing_file first
* If that fails, fall back to original behaviour using loop_info.lo_name

Patch is both inline and attached (in case mail client mungles it).

Signed-off-by: Fajar A. Nugraha
---
 utils.c | 18 +-
 1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/utils.c b/utils.c
index 178d1b9..a62 100644
--- a/utils.c
+++ b/utils.c
@@ -649,6 +649,9 @@ int resolve_loop_device(const char* loop_dev, char* loop_file, int max_len)
 {
 	int loop_fd;
 	int ret_ioctl;
+	int sysfs_fd;
+	char sysfs_path[PATH_MAX];
+	const char* sysfs_path_format = "/sys/block/loop%d/loop/backing_file";
 	struct loop_info loopinfo;

 	if ((loop_fd = open(loop_dev, O_RDONLY)) < 0)
@@ -658,7 +661,20 @@ int resolve_loop_device(const char* loop_dev, char* loop_file, int max_len)
 	close(loop_fd);

 	if (ret_ioctl == 0)
-		strncpy(loop_file, loopinfo.lo_name, max_len);
+	{
+		snprintf(sysfs_path, PATH_MAX, sysfs_path_format, loopinfo.lo_number);
+		sysfs_fd = open(sysfs_path, O_RDONLY);
+		if (sysfs_fd < 0)
+		{
+			strncpy(loop_file, loopinfo.lo_name, max_len);
+		}
+		else
+		{
+			read(sysfs_fd, loop_file, max_len);
+			loop_file[strlen(loop_file)-1] = '\0';
+			close(sysfs_fd);
+		}
+	}
 	else
 		return -errno;
--
1.7.9
Re: btrfs-convert processing time
On Mon, Feb 20, 2012 at 9:29 PM, Olivier Bonvalet wrote: > On 20/02/2012 15:00, Fajar A. Nugraha wrote: >> >> On Mon, Feb 20, 2012 at 8:50 PM, Hubert Kario wrote: >>> >>> On Monday 20 of February 2012 14:41:33 Olivier Bonvalet wrote: >>>> >>>> Lot of small files (like compressed email from Maildir), and lot of >>>> hardlinks, and probably low free space (near 15% I suppose). >>>> >>>> >>>> So I think I have my answer :) >>>> >>> >>> Yes, this is probably the worst possible combination. >>> >>> Plese keep us updated. Just to have exact numbers for new users. >> >> >> >> ... although it would probably fail anyway due to btrfs hardlink limit >> in the same directory. >> > > And in that case, btrfs-convert will abort, or ignore the error, or just > hang ? On my simple test with ubuntu precise, loop-mounted ext4, 8k hardlinks:

$ sudo btrfs-convert /dev/loop0
creating btrfs metadata.
$ echo $?
139

so no useful error message, but it doesn't crash. And when mounted the device still shows ext4. A successful conversion would look like this:

$ sudo btrfs-convert /dev/loop1
creating btrfs metadata.
creating ext2fs image file.
cleaning up system chunk.
conversion complete.

-- Fajar
Re: btrfs-convert processing time
On Mon, Feb 20, 2012 at 8:50 PM, Hubert Kario wrote: > On Monday 20 of February 2012 14:41:33 Olivier Bonvalet wrote: >> Lot of small files (like compressed email from Maildir), and lot of >> hardlinks, and probably low free space (near 15% I suppose). >> >> >> So I think I have my answer :) >> > > Yes, this is probably the worst possible combination. > > Plese keep us updated. Just to have exact numbers for new users. ... although it would probably fail anyway due to btrfs hardlink limit in the same directory. -- Fajar
Re: btrfs open_ctree failed (after recent Ubuntu update)
On Mon, Feb 20, 2012 at 10:34 AM, Curtis Jones wrote: > Chris, > > Thank you for those kernel-update instructions. That was the least painful > kernel update I could have imagined. I rebooted and verified (via uname) that > I am in fact running the new kernel. After looking at dmesg I can confirm > that the exact same error is still occurring though. I re-read your previous > email and saw that you recommended the 3.3-rc release if 3.2.6 didn't > suffice. So I did the same thing with 3.3-rc. And I found the same error (or > what appears to be the same error), again: > >> [ 186.982910] device label StoreW devid 1 transid 37077 /dev/sdb >> [ 187.015081] parent transid verify failed on 79466496 wanted 33999 found >> 36704 >> [ 187.015088] parent transid verify failed on 79466496 wanted 33999 found >> 36704 >> [ 187.015091] parent transid verify failed on 79466496 wanted 33999 found >> 36704 >> [ 187.015094] parent transid verify failed on 79466496 wanted 33999 found >> 36704 >> [ 187.015764] btrfs: open_ctree failed > > uname now reports: > >> Linux veriton 3.3.0-030300rc4-generic-pae #201202181935 SMP Sun Feb 19 >> 00:53:06 UTC 2012 i686 i686 i386 GNU/Linux > > I'm not sure what to try next; I'd try with latest tools now. IIRC there are two programs you can try:
- btrfs-zero-log, which (as the name implies) zeroes out the transaction log
- restore, which would try to read files from a broken btrfs and copy it elsewhere

Try the first one. If you can, dump the content of the disk to a file first (with dd or dd_rescue) and try it on that file. Just in case something goes horribly wrong :) -- Fajar
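Taking the image first and experimenting on the copy can be sketched like this. The block below works on an ordinary file so it is self-contained; with a real disk you would substitute a device node for the input, and the device names in the trailing comments are hypothetical:

```shell
# Sketch: image the (possibly damaged) source and run any risky repair
# against the copy, never the original. A plain file stands in for a
# real block device here so the demo runs anywhere.
src=./fake-device.img                    # stands in for /dev/sdX
img=./fake-device.backup.img
dd if=/dev/urandom of="$src" bs=1M count=4 2>/dev/null     # fake "disk"
dd if="$src" of="$img" bs=1M conv=noerror,sync 2>/dev/null # the backup
cmp -s "$src" "$img" && echo "image matches source"
# Against a real disk, roughly:
#   dd if=/dev/sdX of=/mnt/backup/sdX.img bs=1M conv=noerror,sync
#   losetup /dev/loop0 /mnt/backup/sdX.img
#   btrfs-zero-log /dev/loop0      # the risky step runs on the copy
```

conv=noerror,sync keeps dd going past read errors on a dying disk, padding unreadable blocks with zeros instead of aborting mid-image.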
Re: subvolume info in /proc/mounts
On Sun, Feb 5, 2012 at 5:30 PM, Nikos Voutsinas wrote: >>> If not, what is the formal way to find out which subvolume is mounted; >> >> Not right now, see detailed answer to a similar question: >> http://permalink.gmane.org/gmane.comp.file-systems.btrfs/15385 > Assuming that pools with multiple subvolumes and/or snapshots is the way > of doing things in btrfs, the subvolume info is required for the every day > administration. For simple administration, /etc/mtab (or the output of mount) as mentioned in David's link should work. -- Fajar
Re: Setting options permanently?
On Sat, Jan 28, 2012 at 7:49 AM, Hadmut Danisch wrote: > Am 28.01.2012 00:20, schrieb Chester: >> It should be okay to mount with compress or without compress. Even if >> you mount a volume with compressed data without '-o compress' you will >> still be able to correctly read the data (but newly written data will >> not be compressed) > > But having both compressed and uncompressed files in the filesystem is > exactly what I want to avoid. Not because of reading problems, but to > avoid wasting disk space. I don't have a reading problem. I have a > writing problem. If you've been using -o compress, then you should know that even then not ALL data is compressed. If btrfs predicts that data won't benefit from compression, it will store it uncompressed. The problem is the prediction is not always right, which is why there's -o compress-force. Anyway, for the removable media case, there's a workaround that you can use (at least it works with gnome). Put the entry for the usb block device (e.g. /dev/sdb1) in fstab, with appropriate mount options, and the options will be used when you mount it using nautilus. -- Fajar
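In fstab terms the workaround might look like this (device name and mountpoint are made up); whether the desktop automounter honours the entry depends on the desktop, which is why the paragraph above hedges with "at least it works with gnome":

```
# /etc/fstab -- hypothetical entry for a removable btrfs stick.
# "noauto" keeps it from mounting at boot, "user" lets the desktop
# session mount it, and compress-force applies the compression policy
# discussed above to everything written to it.
/dev/sdb1  /media/usbdisk  btrfs  noauto,user,compress-force=zlib  0  0
```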
Re: Problem with 3.3.0-rc1+: Target filesystem cannot find /sbin/init
On Sun, Jan 22, 2012 at 7:21 PM, Swapnil Pimpale wrote: > I can successfully boot into Ubuntu 11.10 (3.0.0-14-generic-pae) with > a btrfs root filesystem and an ext2 /boot partition. > But when I installed the latest vanilla (3.3.0-rc1+) and booted into where did you get the kernel from? kernel.org snapshot? git? third party package? > it, the first time the system froze. > Next time onwards, I get the following error every time: > > [ 0.427443] [drm:i915_init] *ERROR* drm/i915 cannot work without > intel_agp module! > mount: mounting udev on /dev failed: No such device > W: devtmpfs not available, falling back to tmpfs for /dev > mount: mounting /dev/disk/by-uuid/f43fdd7a-8ad7-4e96-ab1c-14ba82a4324d > on /root failed: No such device Do you know how to use your own custom kernel? That error is common when a driver is missing (i.e. not built-in, and not included in initrd). The easiest way to test that is to look at what's in /proc/partitions and /dev/disk/by-id during normal system boot (I assume you still have the old, working Ubuntu kernel?) and during failed boot when you're dropped to busybox. If your root device (sda8?) is not in /proc/partitions, then it's definitely a block device driver problem. -- Fajar
Re: Btrfsck gives me errors
On Fri, Jan 20, 2012 at 11:24 AM, Jérôme Poulin wrote: > On Wed, Jan 18, 2012 at 11:59 PM, Fajar A. Nugraha wrote: >> some files, unmount, and mount it again. If second mount does not show >> any error message then I'm pretty sure you're safe. > > I just upgraded from 3.0 to 3.2.1 and mounted the filesystem, tried > find > /dev/null and only got messages about old space inode. That's normal. You'll also get the message if you switch back to 3.0, but it should be harmless. > I then > used btrfsck again for the same exact result, I'll ignore them for > now, let's see what the shiny new btrfsck will do about them! who knows when it will be available :) Then again, most fsck features have been implemented in kernel space so a mount will automatically "fix" some types of problems (somewhat similar to what zfs does, which has no fsck whatsoever). So just watch syslog for any unusual error messages. -- Fajar
Re: Btrfsck gives me errors
On Thu, Jan 19, 2012 at 9:02 AM, Jérôme Poulin wrote: > I did a preemptive fsck after a RAID crash and got many errors, is > there something I should do if everything I use works? Probably just ignore it. Recent kernels (e.g. 3.1 or 3.2) are smart enough to automatically fix certain types of errors. Watch syslog when you mount the fs, access some files, unmount, and mount it again. If the second mount does not show any error message then I'm pretty sure you're safe. -- Fajar
Re: Encryption implementation like ZFS?
On Sun, Jan 1, 2012 at 12:12 AM, Niels de Carpentier wrote: >>> ... and depending on which SSD you use, it shouldn't matter. Really. >>> >>> Last time I tried with sandforce SSD + btrfs + -o discard, forcing >>> trim actually made things slower. Sandforce (and probably other modern >>> SSD) controllers can work just fine even without explicit trim fs >>> support. >> >> What command did you use to test this? Normal usage, and some random i/o test tool like fio. >> >> I have an OCZ Agility 3 SSD, which have the latest Sandforce >> controller, so I would really like to try reproduce your test setup. Yours should be newer. Mine is a somewhat-old corsair force 60 GB with btrfs on top. When I activated -o discard, it actually became slower. Also, when I used fstrim, the IOPS were capped at 100, so probably the slowdown is because of that (i.e. an IOPS limit on TRIM somewhere, possibly the controller) > > Ok, the sandforce controller makes things interesting. > > First of all, sandforce controllers have a very high failure rate, so make > sure you have backups!! Yes, but even knowing that I can't imagine going back to HDD for this particular system. It'd be too slow to bear :P > Sandforce controllers also use compression and deduplication to increase > performance. Encryption will make your data incompressible and random, so > this can have a big impact on performance, depending on the > characteristics of your data. In my case I use compress=lzo, so it shouldn't be compressible by the controllers. > Sandforce controllers also have life time throttling, which will throttle > writes heavily if it thinks you will wear out the flash within the > warranty period. If you have a very heavy write workload this can be an > issue. That's new. Is there a link/reference for that? > > If you don't have a working trim it is a good idea to leave part of your > drive unused. 
(Make sure you either do this after a full write erase of > the drive, or do a manual trim of that area, otherwise it won't work). > This will make sure the drive has enough spare sectors to do garbage > collection and can greatly improve performance if your drive is full. True. But on my last test I couldn't get fstrim to trim everything. It could only trim about 2GB out of 12GB free space. -- Fajar
Re: Encryption implementation like ZFS?
On Sat, Dec 31, 2011 at 3:12 AM, Sandra Schlichting wrote: >> How is this advantageous over dmcrypt-LUKS? > > TRIM pass-through for SSDs. With dmcrypt on an SSD, write performance > is very slow. ... and depending on which SSD you use, it shouldn't matter. Really. Last time I tried with a sandforce SSD + btrfs + -o discard, forcing trim actually made things slower. Sandforce (and probably other modern SSD) controllers can work just fine even without explicit trim fs support. -- Fajar
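For readers who do want TRIM through dm-crypt: discard pass-through can be enabled explicitly (it is off by default, since allowing discards reveals which blocks of the encrypted device are unused). A configuration sketch, with hypothetical device and mapping names:

```shell
# Open the LUKS volume with discard pass-through enabled
# (cryptsetup 1.4+ with a sufficiently recent kernel).
cryptsetup luksOpen --allow-discards /dev/sdb2 cryptroot

# Or make it persistent via /etc/crypttab:
#   cryptroot  /dev/sdb2  none  luks,discard

# Then either mount with -o discard, or run fstrim periodically:
mount -o discard /dev/mapper/cryptroot /mnt
```

Whether this actually helps depends on the drive, as the reply above argues; on controllers with good internal garbage collection the difference may be small or even negative.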
Re: Re: Re: Two way mirror in BRTFS
2011/12/30 Jaromir Zdrazil : >> > And if I am not mistaken, the current version does not yet support a mountable >> filesystem. >> >> You're mistaken :) With some extra work, you can even use it as root: >> - http://zfsonlinux.org/example-zpl.html >> - >> https://github.com/dajhorn/pkg-zfs/wiki/HOWTO-install-Ubuntu-to-a-Native-ZFS-Root-Filesystem >> > It seems I trust the web pages too much - on http://zfsonlinux.org/ it is > written that it does not ;O)) otherwise I would be using it already. The web page is correct. http://zfsonlinux.org/: "Please keep in mind the current 0.5.2 stable release does not yet support a mountable filesystem. This functionality is currently available only in the 0.6.0-rc6 release candidate." http://zfsonlinux.org/example-zpl.html: "However, all the core functionality is in place and most of the advanced features are working. Stability of the latest release candidates has been very good and performance is respectible. Many people are successfully using the ZFS on Linux release candidates." Most zfsonlinux users use 0.6.0-rc6, and a big part of those are using the easy-to-install package from the ubuntu ppa. >> Either way, neither zfs nor the (planned) btrfs send/receive supports a >> two-way/active-active setup. Both should (or will) work just fine for >> one-way replication. >> > That is what I needed to know! Thank you very much! You're welcome. -- Fajar
Re: Re: Two way mirror in BRTFS
2011/12/30 Jaromir Zdrazil : >> > Just to add, I would like to see a two-way mirror solution, but if it will >> > not >> work now/is not implemented yet, I would probably choose between drbd in >> asynchronous mode or making some kind of "incremental" snapshot to a remotely >> mapped disk (I do not know yet if btrfs supports it) - it means having one >> snapshot and, let's say, a daily incremental update of this snapshot. >> >> You mean like "zfs send -i"? If yes, why not just use zfs? There's the >> zfsonlinux project, with an easy-to-install ppa for ubuntu. Or you could >> compile it manually. >> > Thank you for your suggestion. As far as I know, not everything is ported yet, > and one of the important missing features I plan to use is an encrypted fs. Correct. But btrfs doesn't do encryption either. And if you're thinking of using luks/dm-crypt to provide encryption for btrfs, there's nothing preventing you from using the same thing with zfs. > And if I am not mistaken, the current version does not yet support a mountable > filesystem. You're mistaken :) With some extra work, you can even use it as root: - http://zfsonlinux.org/example-zpl.html - https://github.com/dajhorn/pkg-zfs/wiki/HOWTO-install-Ubuntu-to-a-Native-ZFS-Root-Filesystem >> > >> > How would you do it? >> >> If you DO mean zfs-send-like functionality, then you should ask about >> "btrfs send and receive", not "two way mirror" (which is not an >> accurate way to describe what you want). Also, send/receive ability >> does not mean it can act as a two-way mirror. It CAN be an alternative >> to drbd async though. > > If I understand it correctly, the difference between send/receive and a two-way > mirror is that one is synchronous and the other is not (it sends the signal that > the file has been successfully written after all/one instance has been > successfully written). > Maybe you can explain it a bit more.
Two way: A replicates changes to B, and B can replicate its own changes to A
One way: A replicates changes to B, but B can not replicate its own changes to A
While drbd only supports synchronous mode for an active-active setup, generic "two-way replication" does not have to be so. Also, just because something is synchronous does not automatically mean it supports two-way replication. Either way, neither zfs nor the (planned) btrfs send/receive supports a two-way/active-active setup. Both should (or will) work just fine for one-way replication. -- Fajar
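For completeness, a sketch of what one-way replication looks like with "zfs send -i" (the functionality the planned btrfs send/receive is being compared to). Pool, dataset, and host names are hypothetical:

```shell
# One-way replication, A -> B: hostB receives A's changes
# but never sends its own changes back.

# Initial full copy of a read-only snapshot:
zfs snapshot tank/data@monday
zfs send tank/data@monday | ssh hostB zfs receive backup/data

# Daily incremental: only the delta between the two snapshots travels.
zfs snapshot tank/data@tuesday
zfs send -i tank/data@monday tank/data@tuesday | ssh hostB zfs receive backup/data
```

Note how this is inherently asynchronous and directional: B is a replica, not an active peer, which is exactly the distinction drawn above between one-way replication and a two-way/active-active setup.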
Re: Two way mirror in BRTFS
2011/12/30 Jaromir Zdrazil : > Sorry for the typo in the subject! > > Just to add, I would like to see a two-way mirror solution, but if it will > not work now/is not implemented yet, I would probably choose between drbd in > asynchronous mode or making some kind of "incremental" snapshot to a remotely > mapped disk (I do not know yet if btrfs supports it) - it means having one > snapshot and, let's say, a daily incremental update of this snapshot. You mean like "zfs send -i"? If yes, why not just use zfs? There's the zfsonlinux project, with an easy-to-install ppa for ubuntu. Or you could compile it manually. > > How would you do it? If you DO mean zfs-send-like functionality, then you should ask about "btrfs send and receive", not "two way mirror" (which is not an accurate way to describe what you want). Also, send/receive ability does not mean it can act as a two-way mirror. It CAN be an alternative to drbd async though. I don't think there's any publicly available code for it yet, though. -- Fajar
Re: fstrim on BTRFS
On Fri, Dec 30, 2011 at 1:19 PM, Li Zefan wrote: >> Or could some data >> block groups be converted to metadata, and vice versa? >> > > This won't happen. Also, empty block groups won't be reclaimed, but it's > on the TODO list. Ah, OK. 6G for metadata out of 50G total seems a bit much, but I can live with it for now. Thanks, Fajar
Re: Compression, on filesystem or volume?
On Thu, Dec 29, 2011 at 5:51 PM, Remco Hosman wrote: > Hi, > > Something I could not find in the documentation I managed to find: > if you mount with compress=lzo and rebalance, is compression on for the whole > filesystem or only a single volume? > > e.g., can I have a @boot volume uncompressed and @ and @home compressed. Last time I asked a similar question, the answer was no. It's per filesystem. However, you can change the compression of individual files between zlib/lzo using "btrfs fi defragment -c", regardless of what the filesystem is currently mounted with. -- Fajar
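A sketch of that per-file workaround (file paths here are hypothetical examples):

```shell
# Rewrite a single file compressed with lzo, regardless of the
# filesystem's mount-time compression setting:
btrfs filesystem defragment -clzo /home/user/bigfile

# Or with zlib, for a better ratio at higher CPU cost:
btrfs filesystem defragment -czlib /var/log/old-log
```

Since defragment rewrites the file's extents, this only affects existing data; new writes to the file still follow the mount options.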
Re: fstrim on BTRFS
On Thu, Dec 29, 2011 at 4:39 PM, Li Zefan wrote: > Fajar A. Nugraha wrote: >> On Wed, Dec 28, 2011 at 11:57 PM, Martin Steigerwald >> wrote: >>> But BTRFS does not: >>> >>> merkaba:~> fstrim -v / >>> /: 4431613952 bytes were trimmed >>> merkaba:~> fstrim -v / >>> /: 4341846016 bytes were trimmed >> >> and apparently it can't trim everything. Or maybe my kernel is >> just too old. >> >> >> $ sudo fstrim -v / >> 2258165760 Bytes was trimmed >> >> $ df -h / >> Filesystem Size Used Avail Use% Mounted on >> /dev/sda6 50G 34G 12G 75% / >> >> $ mount | grep "/ " >> /dev/sda6 on / type btrfs (rw,noatime,subvolid=258,compress-force=lzo) >> >> so only about 2G out of the 12G can be trimmed. This is on kernel 3.1.4. >> > > That's because only free space in block groups will be trimmed. Btrfs > allocates space from block groups, and when there's no space available, > it will allocate a new block group from the pool. In your case there's > ~10G in the pool. Thanks for your response. > > You can do a "btrfs fi df /", and you'll see the total size of existing > block groups.

$ sudo btrfs fi df /
Data: total=43.47GB, used=31.88GB
System, DUP: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=3.25GB, used=619.88MB
Metadata: total=8.00MB, used=0.00

That should mean existing block groups total at least 46GB, right? In which case my pool (a 50G partition) should only have about 4GB of space not allocated to block groups. The numbers don't seem to match. > > You can empty the pool by: > > # dd if=/dev/zero of=/mytmpfile bs=1M > > Then release the space (but it won't return back to the pool): > > # rm /mytmpfile > # sync Is there a bad side effect of doing so? For example, since all free space in the pool would be allocated to data block groups, would that mean my metadata block group is capped at 3.25GB? Or could some data block groups be converted to metadata, and vice versa?
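One possible reading of the mismatch, not confirmed anywhere in the thread: "btrfs fi df" reports logical block-group sizes, but DUP profiles keep two copies on disk, so each DUP gigabyte consumes two raw gigabytes. Totalling the figures above with that doubling accounts for essentially the whole 50G partition:

```shell
# Sanity check: sum the "btrfs fi df" totals quoted above, doubling the
# DUP profiles (DUP stores two on-disk copies of each block group).
allocated=$(awk 'BEGIN {
  data     = 43.47       # Data: total=43.47GB (single copy)
  meta_dup = 3.25 * 2    # Metadata, DUP: total=3.25GB -> 6.5GB raw
  sys_dup  = 0.008 * 2   # System, DUP: total=8.00MB -> 16MB raw
  sys      = 0.004       # System: total=4.00MB
  meta     = 0.008       # Metadata: total=8.00MB
  printf "%.2f", data + meta_dup + sys_dup + sys + meta
}')
echo "raw allocation: ${allocated} GB of the 50 GB partition"
```

If that reading is right, the arithmetic does match after all: nearly the entire partition was already allocated to block groups, leaving almost nothing in the pool.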
-- Fajar
Re: fstrim on BTRFS
On Thu, Dec 29, 2011 at 11:37 AM, Roman Mamedov wrote: > On Thu, 29 Dec 2011 11:21:14 +0700 > "Fajar A. Nugraha" wrote: > >> I'm trying fstrim and my disk is now pegged at write IOPS. Just >> wondering if maybe a "btrfs fi balance" would be more useful, since: > Modern controllers (like the SandForce you mentioned) do their own wear > leveling 'under the hood', i.e. the same user-visible sectors DO NOT > necessarily map to the same locations on the flash at all times; and > introducing 'manual' wear leveling by additional rewriting is not a good > idea, it's just going to wear it out more. I know that modern controllers have their own wear leveling, but AFAIK they basically: (1) reserve a certain amount of space for wear-leveling purposes (2) when a write request comes in, use new sectors from that pool and put the "old" sectors back into the pool (doing garbage collection like trim/rewrite in the process) (3) can't re-use sectors that are currently in use and never rewritten (e.g. sectors used by OS files) If (3) is still valid, then the only way to reuse those sectors is by forcing a rewrite (e.g. using "btrfs fi defrag"). So the question is: is (3) still valid? -- Fajar
Re: fstrim on BTRFS
On Thu, Dec 29, 2011 at 11:21 AM, Fajar A. Nugraha wrote: > I'm trying fstrim and my disk is now pegged at write IOPS. Just > wondering if maybe a "btrfs fi balance" would be more useful, Sorry, I meant "btrfs fi defrag" -- Fajar
Re: fstrim on BTRFS
On Wed, Dec 28, 2011 at 11:57 PM, Martin Steigerwald wrote: > But BTRFS does not: > > merkaba:~> fstrim -v / > /: 4431613952 bytes were trimmed > merkaba:~> fstrim -v / > /: 4341846016 bytes were trimmed and apparently it can't trim everything. Or maybe my kernel is just too old.

$ sudo fstrim -v /
2258165760 Bytes was trimmed

$ df -h /
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda6    50G   34G   12G  75% /

$ mount | grep "/ "
/dev/sda6 on / type btrfs (rw,noatime,subvolid=258,compress-force=lzo)

so only about 2G out of the 12G can be trimmed. This is on kernel 3.1.4. -- Fajar