Re: BTRFS as image store for KVM?

2015-09-25 Thread Jim Salter
very well at first but then degrades rapidly. FWIW I've been using KVM + ZFS in wide production (>50 hosts) for 5+ years now. On 09/25/2015 08:48 AM, Rich Freeman wrote: On Sat, Sep 19, 2015 at 9:26 PM, Jim Salter <j...@jrs-s.net> wrote: ZFS, by contrast, works like absolute gangbusters for…

Re: BTRFS as image store for KVM?

2015-09-25 Thread Jim Salter
Pretty much bog-standard, as ZFS goes. Nothing different than what's recommended for any generic ZFS use. * set recordsize/volblocksize to match the hardware blocksize - 4K drives get 4K, 8K drives get 8K (Samsung SSDs) * LZ4 compression is a win. But it's not like anything sucks without
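For reference, a sketch of those settings on the ZFS side, assuming a zvol-backed VM image on a hypothetical pool named tank (lz4 is ZFS's native lightweight compression; volblocksize can only be chosen at volume-creation time):

```shell
# Hypothetical pool/dataset names.
zfs create tank/images
# Compression set on the parent dataset is inherited by new children.
zfs set compression=lz4 tank/images
# For a zvol backing a VM image, match volblocksize to the drive's
# hardware sector size (4K here); for file-backed images the
# analogous property is recordsize.
zfs create -V 100G -o volblocksize=4k tank/images/vm1
```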

Re: BTRFS as image store for KVM?

2015-09-19 Thread Jim Salter
I can't recommend btrfs+KVM, and I speak from experience. Performance will be fantastic... except when it's completely abysmal. When I tried it, I also ended up with a completely borked (btrfs-raid1) filesystem that would only mount read-only and read at hideously reduced speeds after about

Re: Announcement: buttersink - like rsync for btrfs snapshots

2014-08-11 Thread Jim Salter
How has it been for reliability? I wrote a btrsync app a while back, and the app /itself/ worked fine, but the btrfs send / btrfs receive itself proved problematic. Since btrfs would keep a partial receive - with no easy way to tell whether a receive WAS partial or full - I would inevitably

Re: Announcement: buttersink - like rsync for btrfs snapshots

2014-08-11 Thread Jim Salter
To any core btrfs devs who are listening and care - the unreliability of btrfs send/receive is IMO the single biggest roadblock to adoption of btrfs as a serious next-gen FS. I can live with occasional corner-case performance issues, I can even live with (very) occasional filesystem

Re: VM nocow, should VM software set +C by default?

2014-02-25 Thread Jim Salter
Put me in on Team Justin on this particular issue. I get and grant that in some use cases you might get pathological behavior out of DB or VM binaries which aren't set NODATACOW, but in my own use - including several near-terabyte-size VM images being used by ten+ people all day long for

Re: No space left on device (again)

2014-02-25 Thread Jim Salter
370GB of 410GB used isn't really fine, it's over 90% usage. That said, I'd be interested to know why btrfs fi show /dev/sda3 shows 412.54G used, but btrfs fi df /home shows 379G used... On 02/25/2014 11:49 AM, Marcus Sundman wrote: Hi I get No space left on device and it is unclear why:

btrfs-raid10 - stripes or blocks?

2014-02-20 Thread Jim Salter
Hi list - Can anybody tell me whether btrfs-raid10 reads and writes a stripe at a time, like traditional raid10, or whether it reads and writes individual redundant blocks like btrfs-raid1, but just locks particular disks as mirror pairs, and/or locks the order in which mirror pairs should be

Re: btrfs send problems

2014-02-18 Thread Jim Salter
Bacik wrote: I'm on my phone so apologies for top posting but please try btrfs-next, I recently fixed a pretty epic performance problem with send which should help you, I'd like to see how much. Thanks, Josef Jim Salter j...@jrs-s.net wrote: Hi list - I'm having problems with btrfs send

Re: BTRFS send: exclude directories

2014-02-16 Thread Jim Salter
Simplest way to do it is to make the directories that you don't want to replicate be separate subvolumes. I do this to keep from replicating copies of snapshot directories. Example: mv /home/GEO/.cache /home/GEO/.cache.tmp btrfs sub create /home/GEO/.cache cp -a --reflink=always
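The quoted recipe is cut off mid-command; the presumable remainder (my reconstruction, so treat the final two steps as an assumption) reflinks the old contents back into the new subvolume and removes the temporary copy:

```shell
# Replace a plain directory with a subvolume so snapshots skip it.
mv /home/GEO/.cache /home/GEO/.cache.tmp
btrfs sub create /home/GEO/.cache
# --reflink is a cheap CoW copy on the same btrfs filesystem.
cp -a --reflink=always /home/GEO/.cache.tmp/. /home/GEO/.cache/
rm -rf /home/GEO/.cache.tmp
```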

btrfs send problems

2014-02-15 Thread Jim Salter
Hi list - I'm having problems with btrfs send in general, and incremental send in particular. 1. Performance: in kernel 3.11, btrfs send would send data at 500+MB/sec from a Samsung 840 series solid state drive. In kernel 3.12 and up, btrfs send will only send 30-ish MB/sec from the same

btrfs-RAID(3 or 5/6/etc) like btrfs-RAID1?

2014-02-13 Thread Jim Salter
This might be a stupid question but... Are there any plans to make parity RAID levels in btrfs similar to the current implementation of btrfs-raid1? It took me a while to realize how different and powerful btrfs-raid1 is from traditional raid1. The ability to string together virtually any

Re: btrfs-RAID(3 or 5/6/etc) like btrfs-RAID1?

2014-02-13 Thread Jim Salter
That is FANTASTIC news. Thank you for wielding the LART gently. =) I do a fair amount of public speaking and writing about next-gen filesystems (example: http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/) and I will be VERY sure to talk

Re: Snapshots – noob questions

2014-01-27 Thread Jim Salter
Well... you can't /create/ snapshots on different partitions / cloud storage / whatever, but you can certainly /send/ them there once you've created them. Generally the best way to do this kind of thing is with incremental replication of snapshots to another btrfs filesystem, which may be
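A sketch of that incremental-replication pattern (hostnames and paths are hypothetical):

```shell
# Day 1: take a read-only snapshot and do one full send.
btrfs sub snapshot -r /home /home/.snap/home-2014-01-26
btrfs send /home/.snap/home-2014-01-26 | ssh backuphost btrfs receive /backup/home

# Day 2: snapshot again and send only the delta relative to the
# previous snapshot with -p.
btrfs sub snapshot -r /home /home/.snap/home-2014-01-27
btrfs send -p /home/.snap/home-2014-01-26 /home/.snap/home-2014-01-27 \
  | ssh backuphost btrfs receive /backup/home
```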

Re: Scrubbing with BTRFS Raid 5

2014-01-21 Thread Jim Salter
Would it be reasonably accurate to say btrfs' RAID5 implementation is likely working well enough, and safe enough, provided you back up regularly and are willing and able to restore from backup if a device failure goes horribly wrong? This is a reasonably serious question.

Re: Scrubbing with BTRFS Raid 5

2014-01-21 Thread Jim Salter
There are different values of testing and of production - in my world, at least, they're not atomically defined categories. =) On 01/21/2014 12:38 PM, Chris Murphy wrote: It's for testing purposes. If you really want to commit a production machine for testing a file system, and you're prepared

Re: btrfs send: page allocation failure

2014-01-14 Thread Jim Salter
fragmentation AFAICT. On 01/14/2014 08:13 AM, David Sterba wrote: On Mon, Jan 13, 2014 at 02:03:52PM -0500, Jim Salter wrote: OK, thanks. If kernel memory fragmentation is a big factor, that would also explain why it succeeds after a reboot but does not succeed after weeks of uptime... Yes, that's

btrfs send: page allocation failure

2014-01-13 Thread Jim Salter
Hi list - Getting sporadic page allocation failures in btrfs send. This happened once several weeks ago but was fine after a reboot; yesterday I did not reboot, but had the failure back-to-back trying to send two different snapshots. These are full sends, not incremental, of a bit over 600G

Re: btrfs send: page allocation failure

2014-01-13 Thread Jim Salter
Er... I can't use incremental send if I can't get one full send to go through first. =) I'm hoping the problem will go away for long enough to get a full send completed once I reboot the box, but I can't do that until (much) later in the day. On 01/13/2014 10:17 AM, Wang Shilong wrote:

Re: btrfs send: page allocation failure

2014-01-13 Thread Jim Salter
upstream kernel and see if problem still exists? Thanks, Wang. On 2014-01-13, at 11:20 PM, Jim Salter j...@jrs-s.net wrote: Er... I can't use incremental send if I can't get one full send to go through first. =) sorry, I mean one approach is to use the '-p' option, you can use: # btrfs sub create subv # btrfs
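Wang's suggestion is cut off above; the usual shape of the '-p' trick is to seed the receiving side with a near-empty baseline so every later send can be a (smaller) incremental. A reconstruction under that assumption, with hypothetical names:

```shell
# Seed with an empty read-only snapshot (a tiny full send)...
btrfs sub create /data/seed
btrfs sub snapshot -r /data/seed /data/seed-ro
btrfs send /data/seed-ro | btrfs receive /backup
# ...then send the real snapshot relative to that parent, so no
# single send stream has to carry the whole dataset at once.
btrfs send -p /data/seed-ro /data/daily_2014-01-12 | btrfs receive /backup
```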

Re: btrfs send: page allocation failure

2014-01-13 Thread Jim Salter
kernel? If yes, can you have a try at the latest upstream kernel and see if problem still exists? Thanks, Wang. On 2014-01-13, at 11:20 PM, Jim Salter j...@jrs-s.net wrote: Er... I can't use incremental send if I can't get one full send to go through first. =) sorry, I mean one approach is to use the '-p'…

Re: btrfs send: page allocation failure

2014-01-13 Thread Jim Salter
What makes you believe that? The bug filed there appears to be related to defragging, which I am not doing either manually or automatically. On 01/13/2014 01:23 PM, David Sterba wrote: On Mon, Jan 13, 2014 at 07:58:48AM -0500, Jim Salter wrote: Getting sporadic page allocation failures

Re: btrfs send: page allocation failure

2014-01-13 Thread Jim Salter
OK, thanks. If kernel memory fragmentation is a big factor, that would also explain why it succeeds after a reboot but does not succeed after weeks of uptime... On 01/13/2014 01:56 PM, David Sterba wrote: On Mon, Jan 13, 2014 at 01:37:31PM -0500, Jim Salter wrote: What makes you believe

Re: btrfs raid1 and btrfs raid10 arrays NOT REDUNDANT

2014-01-06 Thread Jim Salter
FWIW, Ubuntu (and I presume Debian) will work just fine with a single / on btrfs, single or multi disk. I currently have two machines booting to a btrfs-raid10 / with no separate /boot, one booting to a btrfs single disk / with no /boot, and one booting to a btrfs-raid10 / with an

correct way to rollback a root filesystem?

2014-01-06 Thread Jim Salter
Hi list - I tried a kernel upgrade with moderately disastrous (non-btrfs-related) results this morning; after the kernel upgrade Xorg was completely borked beyond my ability to get it working properly again through any normal means. I do have hourly snapshots being taken by cron, though, so
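One common root-rollback pattern, sketched under two assumptions about this box's layout: the root subvolume is named @ (Ubuntu convention) and the hourly snapshots live in a snapshots/ directory at the filesystem's top level:

```shell
# Mount the top level of the filesystem (above the root subvolume).
mount -o subvolid=5 /dev/sda1 /mnt
# Set the broken root aside for forensics, then promote a writable
# copy of a known-good hourly snapshot in its place.
mv /mnt/@ /mnt/@.broken
btrfs sub snapshot /mnt/snapshots/@_hourly_2014-01-06_09 /mnt/@
reboot
```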

Re: btrfs raid1 and btrfs raid10 arrays NOT REDUNDANT

2014-01-06 Thread Jim Salter
will scan all drives and find any that are there. The only hitch is the need to mount degraded that I Chicken Littled about earlier so loudly. =) On 01/06/2014 05:05 PM, Chris Murphy wrote: On Jan 6, 2014, at 12:25 PM, Jim Salter j...@jrs-s.net wrote: FWIW, Ubuntu (and I presume Debian

Re: btrfs-transaction blocked for more than 120 seconds

2014-01-05 Thread Jim Salter
On 01/05/2014 12:09 PM, Chris Murphy wrote: I haven't read anything so far indicating defrag applies to the VM container use case, rather nodatacow via xattr +C is the way to go. At least for now. Can you elaborate on the rationale behind database or VM binaries being set nodatacow? I
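For reference, the +C route works per-directory and is inherited by new files, but it only takes effect on files created empty — a sketch (the libvirt path is just an example):

```shell
# Mark the images directory NOCOW before any images exist in it;
# chattr +C on an already-written file has no effect.
mkdir -p /var/lib/libvirt/images
chattr +C /var/lib/libvirt/images
# Files created here now inherit NODATACOW.
qemu-img create -f raw /var/lib/libvirt/images/vm1.img 40G
```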

Re: how to properly mount an external usb hard drive other questions

2014-01-05 Thread Jim Salter
On 01/05/2014 12:50 PM, Justus Seifert wrote: oh i forgot: if you want to mount it without su privileges you have to use: /dev/sdc /path/to/your/favorite/mountpoint btrfs compress,noauto,users 0 0 If you want LZO compression, as you specified: /dev/sdc /path/to/mountpoint

Re: how to properly mount an external usb hard drive other questions

2014-01-05 Thread Jim Salter
On 01/05/2014 01:02 PM, Jim Salter wrote: If you want LZO compression, as you specified: /dev/sdc /path/to/mountpoint btrfs compress=lzo,noauto,users 0 0 Better yet, if your btrfs is actually on /dev/sdc right now, let's get that fstab entry mounting it by UUID instead
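A sketch of the UUID-based entry (the UUID below is a placeholder; read the real one with blkid):

```shell
# Get the filesystem UUID:
blkid -s UUID -o value /dev/sdc
# Then in /etc/fstab (placeholder UUID):
# UUID=01234567-89ab-cdef-0123-456789abcdef  /path/to/mountpoint  btrfs  compress=lzo,noauto,users  0  0
```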

Re: btrfs raid1 and btrfs raid10 arrays NOT REDUNDANT

2014-01-04 Thread Jim Salter
On 01/04/2014 02:18 PM, Chris Murphy wrote: I'm not sure what else you're referring to?(working on boot environment of btrfs) Just the string of caveats regarding mounting at boot time - needing to monkeypatch 00_header to avoid the bogus sparse file error (which, worse, tells you to press

Re: btrfs raid1 and btrfs raid10 arrays NOT REDUNDANT

2014-01-04 Thread Jim Salter
On 01/04/2014 01:10 AM, Duncan wrote: The example given in the OP was of a 4-device raid10, already the minimum number to work undegraded, with one device dropped out, to below the minimum required number to mount undegraded, so of /course/ it wouldn't mount without that option. The issue

btrfs raid1 and btrfs raid10 arrays NOT REDUNDANT

2014-01-03 Thread Jim Salter
I'm using Ubuntu 12.04.3 with an up-to-date 3.11 kernel, and the btrfs-progs from Debian Sid (since the ones from Ubuntu are ancient). I discovered to my horror during testing today that neither raid1 nor raid10 arrays are fault tolerant of losing an actual disk. mkfs.btrfs -d raid10 -m
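The mkfs command is cut off above; for context, a plausible completion of the setup plus the degraded-mount step the thread goes on to discuss (device names are examples, and '-m raid10' is my assumption for the truncated metadata flag):

```shell
# Four-device raid10 for both data and metadata.
mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde
mount /dev/sdb /mnt
# After physically removing one device, a normal mount fails;
# the array must be mounted with -o degraded instead.
mount -o degraded /dev/sdc /mnt
```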

Re: btrfs raid1 and btrfs raid10 arrays NOT REDUNDANT

2014-01-03 Thread Jim Salter
enough if you ask me. =) On 01/03/2014 05:43 PM, Joshua Schüler wrote: Am 03.01.2014 23:28, schrieb Jim Salter: I'm using Ubuntu 12.04.3 with an up-to-date 3.11 kernel, and the btrfs-progs from Debian Sid (since the ones from Ubuntu are ancient). I discovered to my horror during testing today

Re: btrfs raid1 and btrfs raid10 arrays NOT REDUNDANT

2014-01-03 Thread Jim Salter
Sorry - where do I put this in GRUB? /boot/grub/grub.cfg is still kinda black magic to me, and I don't think I'm supposed to be editing it directly at all anymore anyway, if I remember correctly... HOWEVER - this won't allow a root filesystem to mount. How do you deal with this if you'd set up

Re: btrfs raid1 and btrfs raid10 arrays NOT REDUNDANT

2014-01-03 Thread Jim Salter
, whether the disks are all present and working fine or not. That's kind of blecchy. =\ On 01/03/2014 06:18 PM, Hugo Mills wrote: On Fri, Jan 03, 2014 at 06:13:25PM -0500, Jim Salter wrote: Sorry - where do I put this in GRUB? /boot/grub/grub.cfg is still kinda black magic to me, and I don't think

Re: btrfs raid1 and btrfs raid10 arrays NOT REDUNDANT

2014-01-03 Thread Jim Salter
For anybody else interested, if you want your system to automatically boot a degraded btrfs array, here are my crib notes, verified working: * boot degraded 1. edit /etc/grub.d/10_linux, add degraded to the rootflags

Re: btrfs raid1 and btrfs raid10 arrays NOT REDUNDANT

2014-01-03 Thread Jim Salter
Minor correction: you need to close the double-quotes at the end of the GRUB_CMDLINE_LINUX line: GRUB_CMDLINE_LINUX="rootflags=degraded,subvol=${rootsubvol} ${GRUB_CMDLINE_LINUX}" On 01/03/2014 06:42 PM, Jim Salter wrote: For anybody else interested, if you want your system

Re: btrfs raid1 and btrfs raid10 arrays NOT REDUNDANT

2014-01-03 Thread Jim Salter
On 01/03/2014 07:27 PM, Chris Murphy wrote: This is the wrong way to solve this. /etc/grub.d/10_linux is subject to being replaced on updates. It is not recommended it be edited, same as for grub.cfg. The correct way is as I already stated, which is to edit the GRUB_CMDLINE_LINUX= line in
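The file name is truncated out of the quote, but GRUB_CMDLINE_LINUX normally lives in /etc/default/grub, which package updates leave alone — a sketch of that route:

```shell
# In /etc/default/grub (survives updates, unlike /etc/grub.d/10_linux):
#   GRUB_CMDLINE_LINUX="rootflags=degraded"
# Then regenerate grub.cfg:
update-grub        # Debian/Ubuntu wrapper around grub-mkconfig
```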

ERROR: send ioctl failed with -12: Cannot allocate memory

2013-12-04 Thread Jim Salter
Sending a 585G snapshot from box1 to box2: # ionice -c3 btrfs send daily*2013-12-01* | pv -L40m -s585G | ssh -c arcfour 10.0.0.40 btrfs receive /data/.snapshots/data/images At subvol daily_[1385956801]_2013-12-01_23:00:01 At subvol daily_[1385956801]_2013-12-01_23:00:01 ERROR:

btrfs send size

2013-11-27 Thread Jim Salter
Hi list - Long time ZFS guy here trying to move everything over from ZFS to btrfs, which entails a lot of re-scripting and re-learning. Question of the day: how can I determine the size of a btrfs send operation before hand? I'd like to be able to provide a progress bar (I'm accustomed to
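Absent a native dry-run/size mode for btrfs send, one workable approximation is to size the snapshot with du and hand that figure to pv as the progress-bar estimate (it is only approximate — compression, metadata, and shared extents all skew it; paths and host are hypothetical):

```shell
# Rough stream-size estimate from the snapshot's apparent usage.
EST=$(du -sb /data/.snapshots/daily_2013-11-27 | cut -f1)
btrfs send /data/.snapshots/daily_2013-11-27 \
  | pv -s "$EST" | ssh backuphost btrfs receive /backup
```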

Re: btrfs scrub ioprio

2013-11-25 Thread Jim Salter
Can you elaborate on this please? I'm not directly familiar with cgroups, I'd greatly appreciate a quick-and-dirty example of using BIO cgroup to limit I/O bandwidth. Limiting bandwidth definitely would ameliorate the problem for me; I already use pv's bw-limiting feature to make btrfs send
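A quick-and-dirty cgroup-v1 blkio example of the kind being asked about (the 8:0 major:minor and the ~40 MB/s figure are examples; note also that scrub's reads are largely issued by kernel worker threads, so how much of that I/O a cgroup on the userspace process actually captures is uncertain):

```shell
# Cap reads from /dev/sda (major:minor 8:0) at ~40 MB/s for this cgroup.
mkdir /sys/fs/cgroup/blkio/slowscrub
echo "8:0 41943040" > /sys/fs/cgroup/blkio/slowscrub/blkio.throttle.read_bps_device
# Move the current shell into the cgroup, then scrub in the foreground.
echo $$ > /sys/fs/cgroup/blkio/slowscrub/tasks
btrfs scrub start -B /mnt
```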

btrfs scrub ioprio

2013-11-24 Thread Jim Salter
TL;DR scrub's ioprio argument isn't really helpful - a scrub murders system performance til it's done. My system: 3.11 kernel (from Ubuntu Saucy) btrfs-tools from 2013-07 (from Debian Sid) Opteron 8-core CPU 32GB RAM 4 WD 1TB Black drives in a btrfs RAID10 (data and metadata). iotop shows