Re: recommendations and contraindications of using btrfs for Oracle Database Server

2018-01-11 Thread Fajar A. Nugraha
On Thu, Jan 11, 2018 at 6:23 PM, Nikolay Borisov  wrote:
>
>
> On 11.01.2018 12:51, Ext-Strii-Houttemane Philippe wrote:
>> Hello,
>>
>> We are using the btrfs filesystem on local disks (RAID 1) as the underlying
>> filesystem to host our Oracle 12c datafiles.
>> This allows us to cold-backup databases via snapshot in a few seconds and
>> benefit from higher performance than other Linux filesystem formats.
>> This is the problem we hit: Oracle regularly crashes with errors of these two
>> types; the errors occur on different physical machines with the same software:
>>
>> ORA-63999: data file suffered media failure
>> ORA-01114: IO error writing block to file 99 (block # 99968)
>> ORA-01110: data file 99: '/oradata/PS92PRD/data/pcapp.dbf'
>> ORA-27072: File I/O error
>> Linux-x86_64 Error: 17: File exists



>> Mount options: defaults,nofail,nodatacow,nobarrier,noatime
>>
>> uname -a:
>> Linux 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 
>> x86_64 x86_64 GNU/Linux
>


> You are using a vendor-specific kernel. It's best if you turn to them
> for support since it's very likely their code doesn't match what is in
> upstream, let alone the fact you are using an ancient kernel.


3.10 is the Red Hat-compatible kernel. They shipped btrfs as a tech
preview and will deprecate it after 7.4, so it's probably useless to
ask Red Hat for support on that.

@Philippe, you might want to try Oracle's UEK R4. At least they're
still promoting btrfs improvements in their kernel, and you might be
able to ask them in case of problems (assuming you have a proper
subscription & support). However, IIRC Oracle only supports btrfs for
its application binaries, and not for the data files (again, you should
contact Oracle support to be sure).

If you simply want to use the latest kernel (with the latest btrfs
fixes) to see if your problem still occurs, and don't care about
support, you can try kernel-ml (http://elrepo.org/tiki/kernel-ml) or
compile your own.
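For reference, installing kernel-ml on CentOS/RHEL 7 usually looks roughly like this (a sketch; the exact elrepo-release package URL and version change over time, so treat them as assumptions and check elrepo.org first):

```shell
# Import the ELRepo signing key (verify the URL on elrepo.org yourself)
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

# Install the elrepo-release package for EL7 (package name is illustrative)
yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm

# Install the mainline kernel from the elrepo-kernel repository
yum --enablerepo=elrepo-kernel install kernel-ml

# Reboot and pick the new kernel from the boot menu
```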

-- 
Fajar
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-02 Thread Fajar A. Nugraha
On Thu, Aug 3, 2017 at 1:44 AM, Chris Mason  wrote:
>
> On 08/02/2017 04:38 AM, Brendan Hide wrote:
>>
>> The title seems alarmist to me - and I suspect it is going to be 
>> misconstrued. :-/
>
>
> Supporting any filesystem is a huge amount of work.  I don't have a problem 
> with Redhat or any distro picking and choosing the projects they want to 
> support.
>

It'd help a lot of people if pages like
https://btrfs.wiki.kernel.org/index.php/Status were kept up-to-date and
'promoted', so at least users are better informed about what they're
getting into and can choose which features (stable / still in dev /
likely to destroy your data) they want to use.

For example, https://btrfs.wiki.kernel.org/index.php/Status says
compression is 'mostly OK' ('auto-repair and compression may crash'
looks pretty scary, as from a newcomer's perspective it might be
interpreted as 'potential data loss'), while
https://en.opensuse.org/SDB:BTRFS#Compressed_btrfs_filesystems says
they support compression on newer openSUSE versions.


>
> At least inside of FB, our own internal btrfs usage is continuing to grow.  
> Btrfs is becoming a big part of how we ship containers and other workloads 
> where snapshots improve performance.
>

Ubuntu also supports btrfs as part of its container implementation
(lxd), and (reading the lxd mailing list) some people use lxd+btrfs in
their production environments. IIRC the last btrfs problem posted on
the lxd list was about how 'btrfs send/receive (used by lxd copy) is
slower than rsync for a full/initial copy'.

-- 
Fajar


Re: BTRFS, remarkable problem: filesystem turns to read-only caused by firefox download

2016-06-15 Thread Fajar A. Nugraha
On Wed, Jun 15, 2016 at 1:29 PM, Paul Verreth  wrote:
> Dear all.
>
> When I download a video using the Firefox DownloadHelper addon, the
> filesystem suddenly turns read-only. Not a coincidence: I tried it
> several times, and it happened every time.
>
> Info:
> Linux wolfgang 4.2.0-35-generic #40-Ubuntu SMP Tue Mar 15 22:15:45 UTC
> 2016 x86_64 x86_64 x86_64 GNU/Linux

> Segmentation fault
>
> Jun  5 15:03:15 ubuntu kernel: [ 2062.544303] BTRFS info (device
> sdb5): relocating block group 383447465984 flags 17


> What can I do to repair this problem?

The usual starting advice would be "try with the latest kernel and see
if you can still reproduce the problem". Is it ubuntu wily? It goes
end-of-life in July anyway, so you might want to upgrade to xenial (or
at least upgrade just the kernel, for the purpose of troubleshooting
your problem).

Or even try http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/
(should be usable, but might report some errors/warnings due to missing
ubuntu patches).

-- 
Fajar


Re: btrfs subvolume clone or fork (btrfs-progs feature request)

2015-07-09 Thread Fajar A. Nugraha
On Thu, Jul 9, 2015 at 8:20 AM, james harvey jamespharve...@gmail.com wrote:
> Request for new btrfs subvolume subcommand:
>
> clone or fork [-i <qgroupid>] <source> [<dest>]<name>
>     Create a subvolume <name> in <dest>, which is a clone or fork of <source>.
>     If <dest> is not given, subvolume <name> will be created in the
>     current directory.
> Options
>     -i <qgroupid>
>         Add the newly created subvolume to a qgroup.  This option can be
>         given multiple times.
>
> Would (I think):
> * btrfs subvolume create dest-subvolume
> * cp -ax --reflink=always source-subvolume/* dest-subvolume/

What's wrong with btrfs subvolume snapshot?
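For reference, a snapshot already gives the proposed clone semantics in one step (a sketch; the paths are hypothetical):

```shell
# A writable snapshot shares all extents with the source, like a
# whole-subvolume reflink copy, but completes without copying file data:
btrfs subvolume snapshot /mnt/source-subvolume /mnt/dest-subvolume

# Read-only variant (e.g. as a basis for btrfs send):
btrfs subvolume snapshot -r /mnt/source-subvolume /mnt/dest-subvolume-ro
```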

-- 
Fajar


Re: CoW with webserver databases: innodb_file_per_table and dedicated tables for blobs?

2015-06-16 Thread Fajar A. Nugraha
On Tue, Jun 16, 2015 at 2:06 PM, Ingvar Bogdahn
ingvar.bogd...@googlemail.com wrote:
> Hi again,
>
> Benchmarking over time seems a good idea, but what if I see that a
> particular database does indeed degrade in performance? How can I then
> selectively improve performance for that file, since disabling cow only
> works for new, empty files?


You might be overcomplicating things.

> Is it correct that bundling small random writes into groups of writes
> reduces fragmentation? If so, some form of write-caching should help? I'm
> still investigating, but one solution might be:
> 1) identify which exact tables do have frequent writes
> 2) decrease the system-wide write-caching (vm.dirty_background_ratio and
> vm.dirty_ratio) to lower levels, because this wastes lots of RAM by
> indiscriminately caching writes of the whole system, and tends to cause
> spikes where suddenly the entire cache gets written to disk and blocks the
> system. Rather, use that RAM selectively to cache only the critical files.

IIRC innodb uses O_DIRECT by default, which should bypass the fs cache,
so the above should be irrelevant.


> 4) create a software RAID-1 made up of a ramdisk and a mounted image, using
> mdadm.
> 5) set up mdadm using a rather large value for write-behind=
> 6) put only those tables on that disk-backed ramdisk which do have frequent
> writes.


RAID-1 writes everything to both devices, so your write performance
would still be limited by the disk.
As for reads, instead of using a ramdisk for half of the md array, I
would just use that amount of RAM for innodb_buffer_pool.


> What do you think?


I would say determine your priorities.

If you absolutely need btrfs + innodb, then I would:
- increase innodb_buffer_pool
- not mess with nocow; leave it as is
- not mess with autodefrag
- enable compression on btrfs
- use the latest known-good kernel (AFAIK 4.0.5 should be good)
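On the compression point, enabling it on btrfs is just a mount option (a sketch; the mount point and UUID are placeholders, and only data written after the change gets compressed):

```shell
# Remount an existing btrfs mount with lzo compression; already-written
# data stays uncompressed until rewritten (or defragmented with -clzo):
mount -o remount,compress=lzo /var/lib/mysql

# Make it persistent via /etc/fstab (UUID is a placeholder):
# UUID=<fs-uuid>  /var/lib/mysql  btrfs  compress=lzo,noatime  0 0
```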

If you absolutely must have high performance with innodb, then I would
look at using raw block device directly for innodb. You'd lose all
btrfs features of course (e.g. snapshots), but it's a tradeoff for
performance.

If you don't HAVE to use innodb but still want to use btrfs, then I
would use the tokudb engine instead (available in tokudb's mysql fork
and mariadb >= 10), with compression handled by tokudb (disable
compression in btrfs). tokudb doesn't support foreign key constraints,
but other than that it should be able to replace innodb for your
purposes. Among other things, tokudb uses a larger block size (4MB), so
it should help reduce fragmentation compared to innodb.

If you don't HAVE to use either btrfs or innodb, but just want a mysql
db that supports transactions with an fs that supports
snapshot/clone, then I would use zfs + tokudb. And read
http://blog.delphix.com/matt/2014/06/06/zfs-stripe-width/ (with the
exception that compression should be handled by tokudb instead of zfs).

-- 
Fajar


> Ingvar
>
> On 15.06.15 at 11:57, Hugo Mills wrote:

>> On Mon, Jun 15, 2015 at 11:34:35AM +0200, Ingvar Bogdahn wrote:
>>
>>> Hello there,
>>>
>>> I'm planning to use btrfs for a medium-sized webserver. It is
>>> commonly recommended to set nodatacow for database files to avoid
>>> performance degradation. However, apparently nodatacow disables some
>>> of my main motivations for using btrfs: checksumming and (probably)
>>> incremental backups with send/receive (please correct me if I'm
>>> wrong on this). Also, the databases are among the most important
>>> data on my webserver, so it is particularly there that I would like
>>> those features working.
>>>
>>> My question is, are there strategies to avoid nodatacow for databases
>>> that are suitable and safe on a production server?
>>> I thought about the following:
>>> - in mysql/mariadb: setting innodb_file_per_table should avoid
>>> having few very big database files.
>>
>> It's not so much about the overall size of the files, but about the
>> write patterns, so this probably won't be useful.
>>
>>> - in mysql/mariadb: adapting the database schema to store blobs in
>>> dedicated tables.
>>
>> Probably not an issue -- each BLOB is (likely) to be written in a
>> single unit, which won't cause the fragmentation problems.
>>
>>> - btrfs: set autodefrag or some cron job to regularly defrag only
>>> database files to avoid performance degradation due to fragmentation
>>
>> Autodefrag is a good idea, and I would suggest trying that first,
>> before anything else, to see if it gives you good enough performance
>> over time.
>>
>> Running an explicit defrag will break any CoW copies you have (like
>> snapshots), causing them to take up additional space. For example,
>> start with a 10 GB subvolume. Snapshot it, and you will still only
>> have 10 GB of disk usage. Defrag one (or both) copies, and you'll
>> suddenly be using 20 GB.
>>
>>> - turn on compression in either btrfs or mariadb
>>
>> Again, won't help. The issue is not the size of the data, it's the
>> write patterns: small random writes into the middle of existing files
>> will eventually cause those files to fragment, which causes lots of
>> seeks and short reads, which degrades performance.

Re: should I use btrfs on Centos 7 for a new production server?

2014-12-30 Thread Fajar A. Nugraha
On Wed, Dec 31, 2014 at 1:04 PM, Eric Sandeen sand...@redhat.com wrote:
> On 12/30/14 10:06 PM, Wang Shilong wrote:
>> I used CentOS7 btrfs myself, just doing some tests... it crashed easily.
>> I don't know how much effort Redhat puts into btrfs for the 7 series.
>>
>> Maybe using SUSE enterprise for btrfs would be a better choice; they
>> offered better support for btrfs as far as I know.
>
> I believe SuSE's most recent support statement on btrfs is here, I think.
>
> https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-317221

Wow. SUSE uses btrfs for root by default, but actively prevents users
from using compression (unless specifically overridden using a module
parameter)?

Weird, since IIRC compression has been around and stable for a long time.

-- 
Fajar


Re: latest btrfs-progs and asciidoc dependency

2014-06-05 Thread Fajar A. Nugraha
On Thu, Jun 5, 2014 at 9:41 PM, Marc MERLIN m...@merlins.org wrote:
> On Thu, Jun 05, 2014 at 12:52:04PM +0100, Tomasz Chmielewski wrote:
>> And it looks like the dependency is ~1 GB of new packages? O_o
>
> That seems painful, but at the same time, the alternative, nroff/troff, sucks.
>
> Part of your problem however seems to be runaway dependencies.
> You are getting X11 and stuff like libdrm which clearly you shouldn't need.
> If your disk space is more valuable than your time, I recommend you build
> asciidoc yourself and you should hopefully end up with less.
>
> Or you can also remove asciidoc from the makefile and read the raw files,
> which are readable.


... or try this

# apt-get install --no-install-recommends asciidoc

If that still doesn't work, AND you have lots of free time, AND are
familiar with debian packaging, then you can take the latest available
debian source, adapt it for the latest version, and use the opensuse
build service to compile it.

-- 
Fajar


Re: Very slow filesystem

2014-06-04 Thread Fajar A. Nugraha
On Thu, Jun 5, 2014 at 5:15 AM, Igor M igor...@gmail.com wrote:
> Hello,
>
> Why does btrfs become EXTREMELY slow after some time (months) of usage?
>
> # btrfs fi show
> Label: none  uuid: b367812a-b91a-4fb2-a839-a3a153312eba
>         Total devices 1 FS bytes used 2.36TiB
>         devid 1 size 2.73TiB used 2.38TiB path /dev/sde
>
> # btrfs fi df /mnt/old
> Data, single: total=2.36TiB, used=2.35TiB

Is that the fs that is slow?

It's almost full. Most filesystems exhibit really bad performance when
close to full due to fragmentation issues (thresholds vary, but 80-90%
full usually means you need to start adding space). You should free up
some space (e.g. add a new disk so it becomes multi-device, or delete
some files) and rebalance/defrag.

-- 
Fajar


Re: Very slow filesystem

2014-06-04 Thread Fajar A. Nugraha
(resending to the list as plain text, the original reply was rejected
due to HTML format)

On Thu, Jun 5, 2014 at 10:05 AM, Duncan 1i5t5.dun...@cox.net wrote:

> Igor M posted on Thu, 05 Jun 2014 00:15:31 +0200 as excerpted:
>
>> Why does btrfs become EXTREMELY slow after some time (months) of usage?
>> This has now happened a second time; the first time I thought it was a
>> hard drive fault, but now the drive seems ok.
>> The filesystem is mounted with compress-force=lzo and is used for MySQL
>> databases; files are mostly big, 2G-8G.
>
> That's the problem right there, database access pattern on files over 1
> GiB in size, but the problem along with the fix has been repeated over
> and over and over and over... again on this list, and it's covered on the
> btrfs wiki as well

Which part of the wiki? It's not on
https://btrfs.wiki.kernel.org/index.php/FAQ or
https://btrfs.wiki.kernel.org/index.php/UseCases

> so I guess you haven't checked existing answers
> before you asked the same question yet again.
>
> Never-the-less, here's the basic answer yet again...
>
> Btrfs, like all copy-on-write (COW) filesystems, has a tough time with a
> particular file rewrite pattern, that being frequently changed and
> rewritten data internal to an existing file (as opposed to appended to
> it, like a log file).  In the normal case, such an internal-rewrite
> pattern triggers copies of the rewritten blocks every time they change,
> *HIGHLY* fragmenting this type of file after only a relatively short
> period.  While compression changes things up a bit (filefrag doesn't know
> how to deal with it yet and its report isn't reliable), it's not unusual
> to see people with several-gig files with this sort of write pattern on
> btrfs without compression find filefrag reporting literally hundreds of
> thousands of extents!
>
> For smaller files with this access pattern (think firefox/thunderbird
> sqlite database files and the like), typically up to a few hundred MiB or
> so, btrfs' autodefrag mount option works reasonably well, as when it sees
> a file fragmenting due to rewrite, it'll queue up that file for
> background defrag via sequential copy, deleting the old fragmented copy
> after the defrag is done.
>
> For larger files (say a gig plus) with this access pattern, typically
> larger database files as well as VM images, autodefrag doesn't scale so
> well, as the whole file must be rewritten each time, and at that size the
> changes can come faster than the file can be rewritten.  So a different
> solution must be used for them.


If COW and rewrite are the main issue, why doesn't zfs experience the
same extreme slowdown (that is, as long as you have sufficient free
space available, like 20% or so)?

-- 
Fajar


Re: Convert btrfs software code to ASIC

2014-05-19 Thread Fajar A. Nugraha
On Mon, May 19, 2014 at 3:40 PM, Le Nguyen Tran lntran...@gmail.com wrote:
 Hi,

 I am Nguyen. I am not a software development engineer but an IC (chip)
 development engineer. I have a plan to develop an IC controller for
 Network Attached Storage (NAS). The main idea is converting software
 code into hardware implementation. Because the chip is customized for
 NAS, its performance is high, and its cost is lower than using micro
 processor like Atom or Xeon (for servers).

 I plan to use btrfs as the file system specification for my NAS. The
 main point is that I need to understand the btrfs sofware code in
 order to covert them into hardware implementation. I am wandering if
 any of you can help me. If we can make the chip in a good shape, we
 can start up a company and have our own business.

I'm not sure if that's a good idea.

AFAIK btrfs depends a lot on other linux subsystems (e.g. vfs, block,
etc). Rather than converting/reimplementing everything, if your aim is
lower cost, you might have an easier time using something like a
MediaTek SoC (the ones used in smartphones) and running a custom-built
linux with btrfs support on it.

For documentation,
https://btrfs.wiki.kernel.org/index.php/Main_Page#Developer_documentation
is probably the best place to start

-- 
Fajar


Re: Convert btrfs software code to ASIC

2014-05-19 Thread Fajar A. Nugraha
On Mon, May 19, 2014 at 8:09 PM, Le Nguyen Tran lntran...@gmail.com wrote:
 I now need to understand the operation of btrfs source code to
 determine. I hope that one of you can help me


Have you read the wiki link?

-- 
Fajar


Re: Can anyone boot a system using btrfs root with linux 3.14 or newer?

2014-04-24 Thread Fajar A. Nugraha
On Thu, Apr 24, 2014 at 10:23 AM, Chris Murphy li...@colorremedies.com wrote:

> It sounds like either a grub.cfg misconfiguration, or a failure to correctly
> build the initrd/initramfs. So I'd post the grub.cfg kernel command line for
> the boot entry that works and the entry that fails, for comparison.
>
> And then also check and see if whatever utility builds your initrd has been
> upgraded along with your kernel, maybe there's a bug/regression.


I believe the OP mentioned that he's using a distro without initrd,
and that all required modules are built in.

-- 
Fajar



> Chris Murphy


Re: Which companies contribute to Btrfs?

2014-04-24 Thread Fajar A. Nugraha
On Thu, Apr 24, 2014 at 6:39 PM, David Sterba dste...@suse.cz wrote:
> On Wed, Apr 23, 2014 at 06:18:34PM -0700, Marc MERLIN wrote:
>> I'm writing slides about btrfs for an upcoming talk (at linuxcon) and I was
>> trying to gather a list of companies that contribute code to btrfs.
>>
>> https://btrfs.wiki.kernel.org/index.php/Main_Page
>>
>> [...] Jointly developed at Oracle, Red Hat, Fujitsu, Intel, SUSE, STRATO
>> [...]
>>
>> Are there other companies I missed?


The page now says "... Jointly developed at Facebook, Oracle, Red Hat ..."  :D

-- 
Fajar


Re: btrfs and ECC RAM

2014-01-20 Thread Fajar A. Nugraha
On Mon, Jan 20, 2014 at 10:13 AM, Austin S Hemmelgarn
ahferro...@gmail.com wrote:

> AFAIK, ZFS does background data scrubbing without user intervention

No, it doesn't.

> BTRFS however works differently, it only scrubs data when you tell it
> to.  If it encounters a checksum or read error on a data block, it
> first tries to find another copy of that block elsewhere (usually on
> another disk), if it still sees a wrong checksum there, or gets
> another read error, or can't find another copy, then it returns a read
> error to userspace,


zfs does the same thing.

-- 
Fajar


Re: drawbacks of non-ECC RAM

2014-01-17 Thread Fajar A. Nugraha
On Sat, Jan 18, 2014 at 1:33 AM, valleysmail-l...@yahoo.de
valleysmail-l...@yahoo.de wrote:

> I'd like to know if there are drawbacks in using btrfs with non-ECC RAM
> instead of using ext4 with non-ECC RAM.

Non-ECC RAM can cause problems no matter what fs you use.

> I know that some features of btrfs may rely on ECC RAM, but is the chance of
> data corruption or even a damaged filesystem higher than when I use ext4
> instead of btrfs?

Not really.

In the past the occurrence of corrupted-btrfs reports on this list
(regardless of RAM) was somewhat high, but I don't see much of that in
recent versions.

> I want to know this because I would like to use the snapshot feature of btrfs,
> and ext4 does not support that. I will not use btrfs for fixing silent data
> corruption nor for using RAID-like features or encryption. ZFS however checks
> files in the background (even if I don't want it to),

zfs does not check files in the background by default. When
checksumming is enabled (the default), zfs only checks file
integrity when you access the data, and when you run the scrub
command. It does not run background scrubs automatically.

AFAIK btrfs behaves the same way.

> and if it thinks there is an error it will fix it, and I cannot disable this
> feature. So errors in RAM may corrupt my files or even more.

You can disable checksum on both btrfs and zfs. See
https://btrfs.wiki.kernel.org/index.php/FAQ#Can_data_checksumming_be_turned_off.3F
,  https://btrfs.wiki.kernel.org/index.php/Mount_options
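Concretely, per those links, disabling checksums looks roughly like this (a sketch; device, mount point and dataset names are hypothetical, and turning checksums off is generally discouraged):

```shell
# btrfs: mount with checksumming disabled for newly written data
mount -o nodatasum /dev/sdb1 /mnt/data

# zfs: disable checksums on a specific dataset
zfs set checksum=off rpool/mydata
```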

-- 
Fajar


Re: Two identical copies of an image mounted result in changes to both images if only one is modified

2013-06-20 Thread Fajar A. Nugraha
On Thu, Jun 20, 2013 at 3:47 PM, Clemens Eisserer linuxhi...@gmail.com wrote:
> Hi,
>
> I've observed rather strange behaviour while trying to mount two
> identical copies of the same image at different mount points.
> Each modification to one image is also performed in the second one.
>
> Example:
> dd if=/dev/sda? of=image1 bs=1M
> cp image1 image2
> mount -o loop image1 m1
> mount -o loop image2 m2
>
> touch m2/hello
> ls -la m1  # will now also include a file called hello

What do you get if you unmount BOTH m1 and m2, and THEN mount m1
again? Is the file still there?


> Is this behaviour intentional and known, or should I create a bug report?
> I've deleted quite a bunch of files on my production system because of this...

I'm pretty sure this is a known behavior in btrfs.
http://markmail.org/message/i522sdkrhlxhw757#query:+page:1+mid:ksdi5d4v26eqgxpi+state:results
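The underlying issue is that both images carry the same filesystem UUID, and btrfs identifies devices by UUID. A workaround sketch, assuming your btrfs-progs is new enough to include btrfstune's -u option and the image is not mounted:

```shell
cp image1 image2

# Give the copy a new random filesystem UUID so btrfs treats it as a
# separate filesystem (run only on the unmounted image):
btrfstune -u image2

mount -o loop image1 m1
mount -o loop image2 m2
```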

-- 
Fajar


Re: lvm volume like support

2013-02-26 Thread Fajar A. Nugraha
On Tue, Feb 26, 2013 at 9:30 PM, Martin Steigerwald mar...@lichtvoll.de wrote:
> On Tuesday, 26 February 2013, Fajar A. Nugraha wrote:
>> On Tue, Feb 26, 2013 at 11:59 AM, Mike Fleetwood
>> mike.fleetw...@googlemail.com wrote:
>>> On 25 February 2013 23:35, Suman C schakr...@gmail.com wrote:
>>>> Hi,
>>>>
>>>> I think it would be great if there is lvm volume or zfs zvol type
>>>> support in btrfs.
>>>
>>> Btrfs already has capabilities to add and remove block devices on the
>>> fly.  Data can be striped or mirrored or both.  Raid 5/6 is in
>>> testing at the moment.
>>> https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
>>> https://btrfs.wiki.kernel.org/index.php/UseCases#RAID
>>>
>>> Which specific features do you think btrfs is lacking?
>>
>> I think he's talking about a zvol-like feature.
>>
>> In zfs, instead of creating a
>> filesystem-that-is-accessible-as-a-directory, you can create a zvol
>> which behaves just like any other standard block device (e.g. you can
>> use it as swap, or create an ext4 filesystem on top of it). But it would
>> also have most of the benefits that a normal zfs filesystem has, like:
>> - thin provisioning (sparse allocation, snapshot & clone)
>> - compression
>> - integrity check (via checksum)
>>
>> Typical use cases would be:
>> - swap in a pure-zfs system
>> - virtualization (xen, kvm, etc)
>> - NAS which exports the block device using iscsi/AoE
>>
>> AFAIK no such feature exists in btrfs yet.
>
> Sounds like the RADOS block device stuff for Ceph.

Exactly.

While using files + a loopback device mostly works, there were problems
regarding performance and data integrity. Not to mention the hassle of
accessing the data if it resides on a partition inside the file (e.g.
you need losetup + kpartx to access it, and you must remember to do
the reverse when you're finished with it).

In zfsonlinux it's very easy, since a zvol is treated pretty
much like a disk, and whenever there's a partition inside a zvol, a
corresponding device node is also created automatically.

--
Fajar


Re: lvm volume like support

2013-02-25 Thread Fajar A. Nugraha
On Tue, Feb 26, 2013 at 11:59 AM, Mike Fleetwood
mike.fleetw...@googlemail.com wrote:
> On 25 February 2013 23:35, Suman C schakr...@gmail.com wrote:
>> Hi,
>>
>> I think it would be great if there is lvm volume or zfs zvol type
>> support in btrfs.
>
> Btrfs already has capabilities to add and remove block devices on the
> fly.  Data can be striped or mirrored or both.  Raid 5/6 is in
> testing at the moment.
> https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
> https://btrfs.wiki.kernel.org/index.php/UseCases#RAID
>
> Which specific features do you think btrfs is lacking?


I think he's talking about a zvol-like feature.

In zfs, instead of creating a
filesystem-that-is-accessible-as-a-directory, you can create a zvol
which behaves just like any other standard block device (e.g. you can
use it as swap, or create an ext4 filesystem on top of it). But it would
also have most of the benefits that a normal zfs filesystem has, like:
- thin provisioning (sparse allocation, snapshot & clone)
- compression
- integrity check (via checksum)

Typical use cases would be:
- swap in a pure-zfs system
- virtualization (xen, kvm, etc)
- NAS which exports the block device using iscsi/AoE

AFAIK no such feature exists in btrfs yet.
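To illustrate the zvol workflow described above (a sketch on zfsonlinux; the pool and volume names are hypothetical):

```shell
# Create a sparse (thin-provisioned) 8 GiB zvol:
zfs create -s -V 8G rpool/vol1

# It shows up as a block device:
ls -l /dev/zvol/rpool/vol1

# ... which can then be used like any disk, e.g. as swap:
mkswap /dev/zvol/rpool/vol1
swapon /dev/zvol/rpool/vol1

# ... or formatted with another filesystem (on a second zvol):
mkfs.ext4 /dev/zvol/rpool/vol2
```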

-- 
Fajar


Re: Production use with vanilla 3.6.6

2012-11-05 Thread Fajar A. Nugraha
On Mon, Nov 5, 2012 at 7:07 PM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:
> Hello list,
>
> is btrfs ready for production use in 3.6.6? Or should I backport fixes from
> 3.7-rc?
>
> Is it planned to have a stable kernel which will get all btrfs fixes
> backported?

I would say no to both, but you should check with distros that
support btrfs (Oracle Linux and SLES). In particular, whether they
backport fixes, and what exactly "supported" status gives you
when you buy support for that distro.

-- 
Fajar


Re: [Request for review] [RFC] Add label support for snapshots and subvols

2012-11-01 Thread Fajar A. Nugraha
On Fri, Nov 2, 2012 at 5:16 AM, cwillu cwi...@cwillu.com wrote:
>> btrfs fi label -t /btrfs/snap1-sv1
>> Prod-DB-sand-box-testing
>
> Why is this better than:
>
> # btrfs su snap /btrfs/Prod-DB /btrfs/Prod-DB-sand-box-testing
> # mv /btrfs/Prod-DB-sand-box-testing /btrfs/Prod-DB-production-test
> # ls /btrfs/
> Prod-DB  Prod-DB-production-test


... because it would mean the possibility to decouple the subvol name from
whatever-data-you-need (in this case, a label).

My request, though, is to just implement properties, and USER
properties, like what we have in zfs. This seems to be a cleaner,
saner approach. For example, this is on Ubuntu + zfsonlinux:

# zfs create rpool/u
# zfs set user:label="Some test filesystem" rpool/u
# zfs get creation,user:label rpool/u
NAME PROPERTYVALUE  SOURCE
rpool/u  creationFri Nov  2  5:24 2012  -
rpool/u  user:label  Some test filesystem   local

More info about zfs user properties here:
http://docs.oracle.com/cd/E19082-01/817-2271/gdrcw/index.html

--
Fajar


Re: [Request for review] [RFC] Add label support for snapshots and subvols

2012-11-01 Thread Fajar A. Nugraha
On Fri, Nov 2, 2012 at 5:32 AM, Hugo Mills h...@carfax.org.uk wrote:
 On Fri, Nov 02, 2012 at 05:28:01AM +0700, Fajar A. Nugraha wrote:
 On Fri, Nov 2, 2012 at 5:16 AM, cwillu cwi...@cwillu.com wrote:
   btrfs fi label -t /btrfs/snap1-sv1
  Prod-DB-sand-box-testing
 
  Why is this better than:
 
  # btrfs su snap /btrfs/Prod-DB /btrfs/Prod-DB-sand-box-testing
  # mv /btrfs/Prod-DB-sand-box-testing /btrfs/Prod-DB-production-test
  # ls /btrfs/
  Prod-DB  Prod-DB-production-test


 ... because it would mean possibilty to decouple subvol name from
 whatever-data-you-need (in this case, a label).

 My request, though, is to just implement properties, and USER
 properties, like what we have in zfs. This seems to be a cleaner,
 saner approach. For example, this is on Ubuntu + zfsonlinux:

 # zfs create rpool/u
 # zfs set user:label="Some test filesystem" rpool/u
 # zfs get creation,user:label rpool/u
 NAME PROPERTYVALUE  SOURCE
 rpool/u  creationFri Nov  2  5:24 2012  -
 rpool/u  user:label  Some test filesystem   local

Don't we already have an equivalent to that with user xattrs?

Hugo.


Anand did say one way to implement the label is by using xattrs, so +1
from me for that approach.
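
For illustration, the xattr approach already works today with plain user
extended attributes (setfattr/getfattr are from the attr package; the
attribute name "user.label" below is just an example, not an agreed-upon
convention):

```shell
# setfattr -n user.label -v "Prod-DB-sand-box-testing" /btrfs/snap1-sv1
# getfattr -n user.label /btrfs/snap1-sv1
```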

-- 
Fajar


Re: Naming of (bootable) subvolumes

2012-10-28 Thread Fajar A. Nugraha
On Sun, Oct 28, 2012 at 12:22 AM, Chris Murphy li...@colorremedies.com wrote:

 On Oct 26, 2012, at 9:03 PM, Fajar A. Nugraha l...@fajar.net wrote:


 So back to the original question, I'd suggest NOT using either
 send/receive or set-default. Instead, set up multiple boot environments
 (e.g. old version, current version) and let the user choose which one to
 boot using a menu.

 Is it possible to make a functioning symbolic or hard link of a subvolume?


Nope, I don't think so.

 I'm fine with current and previous options. More than that seems 
 unnecessary. But then, how does the user choose?

With the up and down arrows :)

 What's the UI?

Grub boot menu.

 Is this properly the domain of GRUB2 or something else?

In my setup I use grub2's configfile ability, which basically means
"go evaluate this other menu config file".

Each boot environment (BE, the term that solaris uses) has a different
entry on the main grub.cfg, which loads the BE's corresponding
grub.cfg.


 On BIOS machines, perhaps GRUB. On UEFI, I'd say distinctly not GRUB (I think 
 it's a distinctly bad idea to have a combined boot manager and bootloader in 
 a UEFI context, but that's a separate debate).

I don't use UEFI. But the general idea is to have one bootloader which
can load additional config files. And the location of that additional
config file depends on which BE user wants to boot.

 On this system, grub-mkconfig produces a grub.cfg only for the system I'm 
 currently booted from. It does not include any entries for fedora18/boot, 
 fedora18/root, even though they are well within the normal search path. And 
 the reference used is relative,  i.e. the kernel parameter in the grub.cfg is 
 rootflags=subvol=root

 If it were to create entries potentially for every snapshotted system, it 
 would be a very messy grub.cfg indeed.

 It stands to reason that each distro will continue to have their own grub.cfg.


No arguments there. Even in my setup, when I run update-grub, it
will only update its own grub.cfg, and leave the main grub.cfg
untouched. This is what my main grub.cfg looks like:


#===
set timeout=2

menuentry 'Ubuntu - 20120905 boot menu' {
configfile  /ROOT/precise-5/@/boot/grub/grub.cfg
}
menuentry 'Ubuntu - 20120814 boot menu' {
configfile  /ROOT/precise-4/@/boot/grub/grub.cfg
}
#===

Each BE's grub.cfg (e.g. the one under the ROOT/precise-5 dataset) is just
your typical Ubuntu grub.cfg, with only references to kernel/initrd
under that dataset.

 For BIOS machines, it could be useful if a single core.img containing a 
 single standardized prefix specifying a grub location could be agreed upon. 
 And then merely changing the set-default subvolume would allow different 
 distro grub.cfg's to be found, read and workable with the relative references 
 now in place, (except for home which likely needs to be mounted using 
 subvolid).

IMHO the biggest difference is that grub's zfsonlinux support, even
though zfs has a bootfs pool property, has a way to reference ALL
versions of a file (including grub.cfg/kernel/initrd) at boot
time. This way you don't even need to change bootfs whenever you want
to switch boot environments; you simply choose (or write) a
different grub stanza to boot.

If we continue to rely on current btrfs grub support, unfortunately we
can't have the same thing. The closest equivalent would be
set-default, which IMHO is VERY messy.

-- 
Fajar


Re: Naming of subvolumes

2012-10-26 Thread Fajar A. Nugraha
On Sat, Oct 27, 2012 at 8:58 AM, cwillu cwi...@cwillu.com wrote:
 I haven't tried btrfs send/receive for this purpose, so I can't compare. But 
 btrfs subvolume set-default is faster than the release of my finger from the 
 return key. And it's easy enough the user could do it themselves if they had 
 reasons for regression to a snapshot that differ than the automagic 
 determination of the upgrade pass/fail.

 The one needed change, however, is to get /etc/fstab to use an absolute 
 reference for home.


 Chris Murphy

 I'd argue that everything should be absolute references to subvolumes
 (/@home, /@, etc), and neither set-default nor subvolume id's should
 be touched.  There's no need, as you can simply mv those around (even
 while mounted).  More importantly, it doesn't result in a case where
 the fstab in one snapshot points its mountpoint to a different
 snapshot, with all the hilarity that would cause over time, and also
 allows multiple distros to be installed on the same filesystem without
 having them stomp on each others set-defaults: /@fedora, /@rawhide,
 /@ubuntu, /@home, etc.


What I do with zfs, which might also be applicable to btrfs:

- Have a separate dataset to install grub: poolname/boot. This can
also be a dedicated partition, if you want. The sole purpose for this
partition/dataset is to select which dataset's grub.cfg to load next
(using configfile directive). The grub.cfg here is edited manually.

- Have different datasets for each versioned OS (e.g. before and after
upgrades): poolname/ROOT/ubuntu-1, poolname/ROOT/ubuntu-2, etc. Each
dataset is independent of the others and contains its own /boot
(complete with grub/grub.cfg, kernel, and initrd). grub.cfg on each
dataset selects its own dataset to boot using the bootfs kernel command
line parameter.

- Have a common home for all environments: poolname/home

- Have zfs set the mountpoint (or mount it in the initramfs, in the
root's case), so I can get away with an empty fstab.

- Do upgrades/modifications in the currently-booted root environment,
but create a clone of the current environment (and give it a different
name) so I can roll back to it if needed.
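
As a rough sketch of that clone-before-upgrade step (the dataset names
are examples from my own layout, adjust to taste):

```shell
# zfs snapshot rpool/ROOT/ubuntu-1@pre-upgrade
# zfs clone rpool/ROOT/ubuntu-1@pre-upgrade rpool/ROOT/ubuntu-2
# ... then add a menuentry for ubuntu-2's grub.cfg to the main grub.cfg;
# rolling back is just booting the old environment again
```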


It has worked great for me so far, since:
- each boot environment is portable enough to move around when needed,
with only about four config files needing to be changed (e.g. grub.cfg)
when moving between different computers, or when renaming a root
dataset.

- I can rename each root environment easily, or even move it to
different pool/disk when needed.

- I can move back and forth between multiple versions of the boot
environment (all are Ubuntu so far, because IMHO it currently has the
best zfs root support).


So back to the original question, I'd suggest NOT using either
send/receive or set-default. Instead, set up multiple boot environments
(e.g. old version, current version) and let the user choose which one to
boot using a menu. However for this to work, grub (the bootloader, and
the userland programs like update-grub) needs to be able to refer to
each grub.cfg/kernel/initrd in a global manner regardless of what the
current default subvolume is (zfs' grub code uses something like
/poolname/dataset_name/@/path/to/file/in/dataset).
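
For example, a per-BE grub stanza in my setup looks roughly like this
(names come from my own layout, and the exact kernel parameters depend
on the zfs initramfs scripts in use, so treat the details as
assumptions):

```shell
# menuentry 'Ubuntu (precise-5)' {
#     linux  /ROOT/precise-5/@/boot/vmlinuz-3.2.0-29-generic boot=zfs bootfs=rpool/ROOT/precise-5
#     initrd /ROOT/precise-5/@/boot/initrd.img-3.2.0-29-generic
# }
```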

-- 
Fajar


Re: btrfs causing reboots and kernel oops on SL 6 (RHEL 6)

2012-10-04 Thread Fajar A. Nugraha
On Sat, Jun 4, 2011 at 11:33 AM, Joel Pearson
japear...@agiledigital.com.au wrote:
 Hi,

 I'm using SL 6 (RHEL 6) and I've been playing around with running
 PostgreSQL on btrfs. Snapshotting works ok, but the computer keeps
 rebooting without warning (can be 5 mins or 1.5 hours), finally I
 actually managed to get a Kernel Crash instead of just a reboot.

 I took a picture of the screen:
 http://imageshack.us/photo/my-images/716/img0143y.jpg/

 The important bits are:

 IP: [a032c471] btrfs_print_leaf +0x31/0x820 [btrfs]
 PGD 0
 Oops:  [#1] SMP
 last sysfs file: /sys/devices/virtual/block/dm-3/dm/name

 The crashes aren't predictable either. Like it doesn't always happen
 when I do a snapshot or anything like that.

 Is this a known problem, that is fixed in a later kernel or something like 
 that?


Which kernel is this?

If it's the default SL/RHEL 2.6.32 kernel, then you should try upgrading
first. http://elrepo.org/tiki/kernel-ml is a good choice.

It's highly unlikely that anyone would be willing to look at bugs on
that archaic (in btrfs world) kernel.

-- 
Fajar


Re: Tunning - cache write (database)

2012-10-01 Thread Fajar A. Nugraha
On Mon, Oct 1, 2012 at 8:27 PM, Cesar Inacio Martins
cesar_inacio_mart...@yahoo.com.br wrote:

 My problem:
  Using btrfs + compression, a flush of 60 MB takes 4 minutes
  (during these 4 minutes there is constantly I/O of +-4 MB/s on the disks)
  (the flush is from an Informix database)


 * OpenSuse 12.1 64bits, running over VmWare ESXi 5
 * Btrfs version : btrfsprogs-0.19-43.1.2.x86_64
 * Kernel : Linux jdivm06 3.1.10-1.16-desktop #1 SMP PREEMPT Wed Jun 27


 My question, i.e. what I believe will help avoid this long flush:
 * Is there some way to force this flush to go entirely to the in-memory
 cache and then use the btrfs background process to flush it to disk?
   Security and recovery aren't a priority for now, because this is part of a
 database bulk load... after it finishes, integrity will be desirable (not an
 obligation, since this is a test environment).

 For now, performance is the main requirement...


I suggest you start by reading
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg18827.html

After that, PROBABLY start your database by preloading libeatmydata to
disable fsync completely.
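
Something along these lines, for example (the library path varies by
distro, and "oninit" here just stands for however you start the Informix
server, so both are assumptions; newer packages also ship an eatmydata
wrapper script):

```shell
# LD_PRELOAD=/usr/lib/libeatmydata/libeatmydata.so oninit
# ... or, with the wrapper script:
# eatmydata oninit
```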

On a side note, zfs has a sync property which, when set to disabled,
has pretty much the same effect as libeatmydata.
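
For comparison, the zfs equivalent is just:

```shell
# zfs set sync=disabled poolname/dataset
# zfs inherit sync poolname/dataset     # revert to the default later
```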

-- 
Fajar


Re: Tunning - cache write (database)

2012-10-01 Thread Fajar A. Nugraha
On Tue, Oct 2, 2012 at 3:16 AM, Clemens Eisserer linuxhi...@gmail.com wrote:
 I suggest you start by reading
 http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg18827.html

 After that, PROBABLY start your database by preloading libeatmydata to
 disable fsync completely.

 Which will cure the symptoms, not the issue itself - I remember the
 same advice was given for Reiser4 back then ;)
 Usually for non-toy use-cases data is too valuable to just disable fsync.

The OP DID say he doesn't really care about security, recovery, or
integrity (or at least, it's not an obligation) :D

Other than trying latest -rc and using libeatmydata, I can't see what
else can be done to improve current db performance on btrfs. As the
list archive shows, zfs is currently MUCH more suitable for that.

-- 
Fajar


Re: Experiences: Why BTRFS had to yield for ZFS

2012-09-19 Thread Fajar A. Nugraha
On Wed, Sep 19, 2012 at 2:28 PM, Casper Bang casper.b...@gmail.com wrote:
 Anand Jain Anand.Jain at oracle.com writes:
   archive-log-apply script - if you could, can you share the
   script itself ? or provide more details about the script.
   (It will help to understand the work-load in question).

 Our setup entails a whole bunch of scripts, but the apply script looks like 
 this
 (orion is the production environment, pandium is the shadow):
 http://pastebin.com/k4T7deap

 The script invokes rman passing rman_recover_database.rcs:

IIRC there were some patches post-3.0 which relates to sync. If oracle
db uses sync writes (or call sync somewhere, which it should), it
might help to re-run the test with more recent kernel. kernel-ml
repository might help.

 Ext4 starts out with a realtime to SCN ratio of about 3.4 and ends down 
 around a
 factor 2.2.

 ZFS starts out with a realtime to SCN ratio of about 7.5 and ends down around 
 a
 factor 4.4.

So zfsonlinux is actually faster than ext4 for that purpose? Cool!


 Btrfs starts out with a realtime to SCN ratio of about 2.2 and ends down 
 around
 a factor 0.8. This of course means we will never be able to catch up with
 production, as btrfs can't apply these as fast as they're created.

 It was even worse with btrfs on our 10xSSD server, where 20 min. of realtime
 work would end up taking some 5h to get applied (factor 0.06), obviously 
 useless
 to us.

Just wondering, did you use the discard mount option by any chance? In my
experience it makes btrfs MUCH slower.

-- 
Fajar


Re: specify UUID for btrfs

2012-09-13 Thread Fajar A. Nugraha
On Thu, Sep 13, 2012 at 1:07 PM, ching lu lschin...@gmail.com wrote:
 Is it possible to specify UUID for btrfs when creating the filesystem?

Not that I know of

 or changing it when it is offline?

This one is a definite no.

 i have several script/setting file which have hardcoded UUID and i do
 not want to update them every time when restore backup.

Using a label would probably make more sense for that purpose. It can be
set and changed later.
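
A sketch of the label workflow (the device name is a placeholder, and
changing the label of an existing filesystem may require it to be
unmounted, depending on your btrfs-progs version):

```shell
# mkfs.btrfs -L backup2012 /dev/sdb1          # set the label at creation time
# btrfs filesystem label /dev/sdb1            # print the current label
# btrfs filesystem label /dev/sdb1 backup2013 # change it later
# mount LABEL=backup2013 /mnt/backup          # scripts can refer to the label
```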

-- 
Fajar


Re: Workaround for hardlink count problem?

2012-09-10 Thread Fajar A. Nugraha
On Mon, Sep 10, 2012 at 4:12 PM, Martin Steigerwald mar...@lichtvoll.de wrote:
 Am Samstag, 8. September 2012 schrieb Marc MERLIN:
 I was migrating a backup disk to a new btrfs disk, and the backup had a
 lot of hardlinks to collapse identical files to cut down on inode
 count and disk space.

 Then, I started seeing:
 […]
 Has someone come up with a cool way to work around the too many link
 error and only when that happens, turn the hardlink into a file copy
 instead? (that is when copying an entire tree with millions of files).

 What about:

 - copy first backup version
 - btrfs subvol create first next
 - copy next backup version
 - btrfs subvol create previous next

Wouldn't btrfs subvolume snapshot plus rsync --inplace be more
useful here? That is, if the original hardlinks are caused by multiple
backup versions of the same file.
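
A minimal sketch of that rotation, assuming the backups live in a
subvolume /backup/current (the paths are examples):

```shell
# btrfs subvolume snapshot /backup/current /backup/2012-09-10
# rsync -a --inplace --delete /source/ /backup/current/
```

Each snapshot then plays the role the hardlink trees played before,
without any per-inode link-count limit.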

Personally, if I need a feature not yet implemented in btrfs,
I'd just switch to something else for now, like zfs, and revisit btrfs
later once the needed features have been merged.

-- 
Fajar


Re: enquiry about defrag

2012-09-09 Thread Fajar A. Nugraha
On Sun, Sep 9, 2012 at 2:49 PM, ching lschin...@gmail.com wrote:
 On 09/09/2012 08:30 AM, Jan Steffens wrote:
 On Sun, Sep 9, 2012 at 2:03 AM, ching lschin...@gmail.com wrote:
 2. Is there any command for the fragmentation status of a file/dir ? e.g. 
 fragment size, number of fragments.
 Use the filefrag command, part of e2fsprogs.


 my image is a 16G sparse file; after defragmenting, it still has 101387 extents.
 Is that normal?

Is compression enabled? If so, yes, it's normal.
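
To illustrate why: with compression enabled, btrfs stores data in
compressed extents of at most 128K each, so filefrag on a large file
will report many extents even right after a defrag. A quick way to look:

```shell
# filefrag image.img                 # extent count summary
# filefrag -v image.img | head -20   # per-extent detail
```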

-- 
Fajar


oops with btrfs on zvol

2012-08-31 Thread Fajar A. Nugraha
Hi,

I'm experimenting with btrfs on top of zvol block device (using
zfsonlinux), and got oops on a simple mount test.

While I'm sure that zfsonlinux is somehow also at fault here (since
the same test with zram works fine), the oops only shows things
btrfs-related without any usable mention of zfs/zvol. Could anyone
help me interpret the kernel logs, which btrfs-zvol interaction is at
fault, so I can pass it on to zfs guys to work on their side as well?
Thanks.

The test is creating a sparse 100G block device (zfs create -V 100G -s
-o volblocksize=4k rpool/vbd/test1), format it (mkfs.btrfs
/dev/zvol/rpool/vbd/test1), and mount it. The oops occurred, and the mount
process stuck. Same thing happens on ubuntu precise's kernel 3.2 and
quantal's 3.5.

What's interesting is:
- if I use ext4 (instead of btrfs) on the zvol, it works just fine
- if I add a layer on top of zvol (losetup, or iscsi export-import)
then btrfs works just fine.

Syslog shows this (from Ubuntu's 3.2 kernel):

#=
Aug 31 20:30:13 DELL kernel: [34307.828311]  zd0: unknown partition table
Aug 31 20:30:34 DELL kernel: [34328.129249] device fsid
cfd88ff9-def8-4d1f-9435-65becd5fa2b7 devid 1 transid 4 /dev/zd0
Aug 31 20:30:34 DELL kernel: [34328.134001] btrfs: disk space caching is enabled
Aug 31 20:30:34 DELL kernel: [34328.135701] BUG: unable to handle
kernel NULL pointer dereference at   (null)
Aug 31 20:30:34 DELL kernel: [34328.137200] IP: [a0068439]
extent_range_uptodate+0x59/0xe0 [btrfs]
Aug 31 20:30:34 DELL kernel: [34328.138759] PGD 0
Aug 31 20:30:34 DELL kernel: [34328.140248] Oops:  [#1] SMP
Aug 31 20:30:34 DELL kernel: [34328.141777] CPU 3
Aug 31 20:30:34 DELL kernel: [34328.141811] Modules linked in: ses enclosure
ppp_mppe ppp_async crc_ccitt pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O)
vboxdrv(O) arc4 ath9k mac80211 radeon uvcvideo snd_hda_codec_hdmi ath9k_common
snd_hda_codec_realtek ath9k_hw videodev ipt_MASQUERADE xt_state iptable_nat
nf_nat v4l2_compat_ioctl32 i915 nf_conntrack_ipv4 nf_conntrack iptable_filter
nf_defrag_ipv4 ip_tables dm_multipath dummy x_tables bnep ath3k btusb bridge
rfcomm bluetooth snd_hda_intel snd_hda_codec stp joydev ath snd_hwdep ttm
snd_pcm mei(C) drm_kms_helper drm snd_seq_midi snd_rawmidi snd_seq_midi_event
dell_wmi sparse_keymap snd_seq dell_laptop wmi snd_timer i2c_algo_bit video
psmouse snd_seq_device cfg80211 snd mac_hid serio_raw soundcore dcdbas
snd_page_alloc parport_pc ppdev lp parport binfmt_misc zfs(P) zcommon(P)
znvpair(P) zavl(P) zunicode(P) spl(O) ums_realtek uas r8169 btrfs zlib_deflate
libcrc32c usb_storage
Aug 31 20:30:34 DELL kernel: [34328.155820]
Aug 31 20:30:34 DELL kernel: [34328.157974] Pid: 15887, comm: btrfs-endio-met
Tainted: P C O 3.2.0-29-generic #46-Ubuntu Dell Inc. Dell System Inspiron
N4110/03NKW8
Aug 31 20:30:34 DELL kernel: [34328.160283] RIP: 0010:[a0068439]
[a0068439] extent_range_uptodate+0x59/0xe0 [btrfs]
Aug 31 20:30:34 DELL kernel: [34328.162700] RSP: 0018:8800351dfde0  EFLAGS: 00010246
Aug 31 20:30:34 DELL kernel: [34328.165099] RAX:  RBX: 01401000 RCX: 
Aug 31 20:30:34 DELL kernel: [34328.167548] RDX: 0001 RSI: 1401 RDI: 
Aug 31 20:30:34 DELL kernel: [34328.169989] RBP: 8800351dfe00 R08:  R09: 880067021418
Aug 31 20:30:34 DELL kernel: [34328.172474] R10: 8800b680d010 R11: 1000 R12: 88011d997bf0
Aug 31 20:30:34 DELL kernel: [34328.174922] R13: 01401fff R14: 880031c45c00 R15: 88011aedc9b0
Aug 31 20:30:34 DELL kernel: [34328.177401] FS: () GS:88013e6c() knlGS:
Aug 31 20:30:34 DELL kernel: [34328.179904] CS:  0010 DS:  ES:  CR0: 8005003b
Aug 31 20:30:34 DELL kernel: [34328.182426] CR2:  CR3: 0001291e CR4: 000406e0
Aug 31 20:30:34 DELL kernel: [34328.185005] DR0:  DR1:  DR2: 
Aug 31 20:30:34 DELL kernel: [34328.187602] DR3:  DR6: 0ff0 DR7: 0400
Aug 31 20:30:34 DELL kernel: [34328.190246] Process btrfs-endio-met (pid: 15887, threadinfo 8800351de000, task 880031c45c00)
Aug 31 20:30:34 DELL kernel: [34328.193171] Stack:
Aug 31 20:30:34 DELL kernel: [34328.196542]  8800351dfdf0 880088ff6638 8800b61953c0 88011cbbb000
Aug 31 20:30:34 DELL kernel: [34328.199469]  8800351dfe10 a004224d 8800351dfe40 a00422d6
Aug 31 20:30:34 DELL kernel: [34328.202295]  8800351dfe88 88011aedc960 8800351dfe88 8800351dfe98
Aug 31 20:30:34 DELL kernel: [34328.204685] Call Trace:
Aug 31 20:30:34 DELL kernel: [34328.206645]  [a004224d] bio_ready_for_csum.isra.107+0xbd/0xc0 [btrfs]
Aug 31 20:30:34 DELL kernel: [34328.208591]  [a00422d6] end_workqueue_fn+0x86/0xa0 [btrfs]
Aug 31 20:30:34 DELL kernel: [34328.210565]  [a00714e0]

Re: raw partition or LV for btrfs?

2012-08-14 Thread Fajar A. Nugraha
On Tue, Aug 14, 2012 at 8:28 PM, Daniel Pocock dan...@pocock.com.au wrote:
 Can you just elaborate on the qgroups feature?
 - Does this just mean I can make the subvolume sizes rigid, like LV sizes?

Pretty much.

 - Or is it per-user restrictions or some other more elaborate solution?

No


 If I create 10 LVs today, with btrfs on each, can I merge them all into
 subvolumes on a single btrfs later?

No


 If I just create a 1TB btrfs with subvolumes now, can I upgrade to
 qgroups later?

Yes

  Or would I have to recreate the filesystem?

No

 If I understand correctly, if I don't use LVM, then such move and resize
 operations can't be done for an online filesystem and it has more risk.

You can resize, add, and remove devices from btrfs online without the
need for LVM. IIRC LVM has finer granularity though: you can do
something like "move only the first 10GB now, I'll move the rest
later".
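
Rough examples of those online operations (device names and the size are
placeholders; syntax as of 2012-era btrfs-progs):

```shell
# btrfs filesystem resize +10g /mnt   # grow the mounted fs by 10G online
# btrfs device add /dev/sdc /mnt      # add a device to the fs
# btrfs device delete /dev/sdb /mnt   # remove one; data is migrated off it
```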

-- 
Fajar


Re: raw partition or LV for btrfs?

2012-08-14 Thread Fajar A. Nugraha
On Tue, Aug 14, 2012 at 9:09 PM, cwillu cwi...@cwillu.com wrote:
 If I understand correctly, if I don't use LVM, then such move and resize
 operations can't be done for an online filesystem and it has more risk.

 You can resize, add, and remove devices from btrfs online without the
 need for LVM. IIRC LVM has finer granularity though, you can do
 something like move only the first 10GB now, I'll move the rest
 later.

 You can certainly resize the filesystem itself, but without lvm I
 don't believe you can resize the underlying partition online.

I'm pretty sure you can do that with parted. At least, when your
version of parted is NOT 2.2.

-- 
Fajar


Re: raw partition or LV for btrfs?

2012-08-12 Thread Fajar A. Nugraha
On Sun, Aug 12, 2012 at 11:46 PM, Daniel Pocock dan...@pocock.com.au wrote:


 I notice this question on the wiki/faq:


 https://btrfs.wiki.kernel.org/index.php/UseCases#What_is_best_practice_when_partitioning_a_device_that_holds_one_or_more_btr-filesystems

 and as it hasn't been answered, can anyone make any comments on the subject

 Various things come to mind:

 a) partition the disk, create an LVM partition, and create lots of small
 LVs, format each as btrfs

 b) partition the disk, create an LVM partition, and create one big LV,
 format as btrfs, make subvolumes

 c) what about using btrfs RAID1?  Does either approach (a) or (b) seem
 better for someone who wants the RAID1 feature?

IMHO when the qgroup feature is stable (i.e. adopted by distros, or
at least in a stable kernel), then simply creating one big partition (and
letting btrfs handle RAID1, if you use it) is better. When 3.6 is out,
perhaps?

Until then I'd use LVM.


 d) what about booting from a btrfs system?  Is it recommended to follow
 the ages-old practice of keeping a real partition of 128-500MB,
 formatting it as btrfs, even if all other data is in subvolumes as per (b)?

You can have one single partition only and boot directly from that.
However btrfs has the same problems as zfs in this regard:
- grub can read both, but can't write to either. In other words, no
support for grubenv
- the best compression method (gzip for zfs, lzo for btrfs) is not
supported by grub

For the first problem, an easy workaround is just to disable the grub
configuration that uses grubenv. Easy enough, and no major
functionality loss.

The second one is harder for btrfs. zfs allows you to have separate
dataset (i.e. subvolume, in btfs terms) with different compression, so
you can have a dedicated dataset for /boot with different compression
setting from the rest of the dataset. With btrfs you're currently
stuck with using the same compression setting for everything, so if
you love lzo this might be a major setback.

There's also a btrfs-specific problem: it's hard to have a system
with /boot on a separate subvol while managing it with the current
automatic tools (e.g. update-grub).

Due to the second and third problems, I'd recommend you just use a
separate partition with ext2/4 for now.
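
For example, a conventional layout would look something like this in
/etc/fstab (the device names and subvolume name are assumptions):

```shell
# /dev/sda1  /boot  ext4   defaults,noatime                0  2
# /dev/sda2  /      btrfs  defaults,compress=lzo,subvol=@  0  0
```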

-- 
Fajar


Re: I want to try something on the BTR file system,...

2012-08-12 Thread Fajar A. Nugraha
On Mon, Aug 13, 2012 at 8:22 AM, Ben Leverett ben...@live.com wrote:
 could you please send me a copy of the btr driver/kernel?

I wonder if using live.com email has something to do with how you
ask that question :P

Anyway, depending on what you want to use it for, you might find it
easier to just download the latest version of Ubuntu or
whatever your favorite Linux distro is. Or, if you want to modify the
source code, the link that Michael sent provides a good starting
point.

What is it that you want to try? If your question is more specific,
you can get more specific answer.

-- 
Fajar


Re: raw partition or LV for btrfs?

2012-08-12 Thread Fajar A. Nugraha
On Mon, Aug 13, 2012 at 11:19 AM, Kyle Gates kylega...@hotmail.com wrote:
 Also, I think the current grub2 has lzo support.

You're right

grub2 (1.99-18) unstable; urgency=low

  [ Colin Watson ]
...
  * Backport from upstream:
- Add support for LZO compression in btrfs (LP: #727535).

so Ubuntu has it since precise, which is roughly the time I switched
to zfs for rootfs :P

Thanks for letting us know about that.

-- 
Fajar


Re: How can btrfs take 23sec to stat 23K files from an SSD?

2012-08-01 Thread Fajar A. Nugraha
On Wed, Aug 1, 2012 at 1:01 PM, Marc MERLIN m...@merlins.org wrote:

 So, clearly, there is something wrong with the samsung 830 SSD with linux


 If it were a random crappy SSD from a random vendor, I'd blame the SSD, but
 I have a hard time believing that samsung is selling SSDs that are slower
 than hard drives at random IO and 'seeks'.

You'd be surprised on how badly some vendors can screw up :)


 First: btrfs is the slowest:

 gandalfthegreat:/mnt/ssd/var/local# grep /mnt/ssd/var /proc/mounts
 /dev/mapper/ssd /mnt/ssd/var btrfs 
 rw,noatime,compress=lzo,ssd,discard,space_cache 0 0

Just checking, did you explicitly activate discard? Because on my
setup (with a Corsair SSD) it made things MUCH slower. Also, try adding
noatime (just in case the slowdown was because du causes many
access-time updates).

-- 
Fajar


Re: Upgrading from 2.6.38, how?

2012-07-24 Thread Fajar A. Nugraha
On Wed, Jul 25, 2012 at 11:39 AM, Gareth Pye gar...@cerberos.id.au wrote:
 My proposed upgrade method is:
 Boot from a live CD with the latest kernel I can find so I can do a few tests:
  A - run the fsck in read only mode to confirm things look good
  B - mount read only, confirm that I can read files well
  C - mount read write, confirm working
 Install latest OS, upgrade to latest kernel, then repeat above steps.

 Any likely hiccups with the above procedure and suggested alternatives?

I'd simply install the new OS on a new partition/subvol. This is what
I did when upgrading from natty to oneiric to precise.

IIRC there are some incompatibilities (e.g. the space/inode cache disk
format?) but newer kernels will just do the right thing: drop the old
cache and create a new one.

-- 
Fajar


Re: Very slow samba file transfer speed... any ideas ?

2012-07-19 Thread Fajar A. Nugraha
On Thu, Jul 19, 2012 at 7:39 PM, Shavi N shav...@gmail.com wrote:
 So btrfs gives a massive difference locally, but that still doesn't
 explain the slow transfer speeds.
 Is there a way to test this?

I'd try with real data, not /dev/zero. e.g:
dd_rescue -b 1M -m 1.4G /dev/sda testfile.img

... or use whatever non-zero data source you have. dd_rescue will give
a nice progress bar and speed indicator.

Also, run iostat -mx 3 while you're running dd, and while accessing
it from samba. In my experience, btrfs is simply slower than ext4.
Period. There's no way around it for now.

-- 
Fajar


Re: brtfs on top of dmcrypt with SSD - Trim or no Trim

2012-07-18 Thread Fajar A. Nugraha
On Thu, Jul 19, 2012 at 1:13 AM, Marc MERLIN m...@merlins.org wrote:
 TL;DR:
 I'm going to change the FAQ to say people should use TRIM with dmcrypt
 because not doing so definitely causes some lesser SSDs to suck, or
 possibly even fail and lose our data.


 Longer version:
 Ok, so several months later I can report back with useful info.

 Not using TRIM on my Crucial RealSSD C300 256GB is most likely what caused
 its garbage collection algorithm to fail (killing the drive and all its
 data), and it was also causing BRTFS to hang badly when I was getting
 within 10GB of the drive getting full.

 I reported some problems I had with btrfs being very slow and hanging when I
 only had 10GB free, and I'm now convinced that it was the SSD that was at
 fault.

 On the Crucial RealSSD C300 256GB, and from talking to their tech support
 and other folks who happened to have gotten that 'drive' at work and also
 got weird unexplained failures, I'm convinced that even its latest 007
 firmware (the firmware it shipped with would just hang the system for a few
 seconds every so often so I did upgrade to 007 early on), the drive does
 very poorly without TRIM when it's getting close to full.


If you're going to edit the wiki, I'd suggest you say SOME SSDs might
need to use TRIM with dmcrypt. That's because some SSD controllers
(e.g. SandForce) perform just fine without TRIM, and in my case TRIM
made performance worse.

-- 
Fajar


Re: file system corruption removal / documentation quandry

2012-07-11 Thread Fajar A. Nugraha
On Thu, Jul 12, 2012 at 12:13 PM, eric gisse jowr...@gmail.com wrote:

 Basically, phoronix showed there is a --repair option. After enabling
 snapshotting and playing around with the various discussed options, I
 discovered that --repair and no special mount options was sufficient
 to get the files removable.

I'm curious whether running it directly on a newer kernel (e.g. the
latest ubuntu kernel-ppa/mainline) would be able to solve the problem,
even without btrfsck.

Also note that if by snapshotting you mean creating LVM snapshots,
you might be in for another surprise, as btrfs doesn't play nice
with block devices that share the same fs UUID. Don't rely on that as
a backup option.


 Now what I'm hoping for is better documentation on btrfsck even if it
 just boils down to a brief enumeration of the options as that would be
 better than nothing which is what we have now. Do I need to file a bug
 or is this sufficient?

Edit https://btrfs.wiki.kernel.org/index.php/Btrfsck ?

-- 
Fajar


Re: BTRFS fsck apparent errors

2012-07-04 Thread Fajar A. Nugraha
On Wed, Jul 4, 2012 at 8:42 PM, David Sterba d...@jikos.cz wrote:
 On Wed, Jul 04, 2012 at 07:40:05AM +0700, Fajar A. Nugraha wrote:
 Are there any known btrfs regression in 3.4? I'm using 3.4.0-3-generic
 from a ppa, but a normal mount - umount cycle seems MUCH longer
 compared to how it was on 3.2, and iostat shows the disk is
 read-IOPS-bound

 Is it just mount/umount without any other activity?

Yes

 Is the fs
 fragmented

Not sure how to check that quickly

 (or aged),

Over 1 year, so yes

 almost full,

df says 83% used, so probably yes (depending on how you define almost)

~ $ df -h /media/WD-root
Filesystem  Size  Used Avail Use% Mounted on
/dev/sdc2   922G  733G  155G  83% /media/WD-root

~ $ sudo btrfs fi df /media/WD-root/
Data: total=883.95GB, used=729.68GB
System, DUP: total=8.00MB, used=104.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=18.75GB, used=1.49GB
Metadata: total=8.00MB, used=0.00

 has lots of files?

It's a normal 1 TB usb disk, with docs, movies, vm images, etc. No
particular lots-of-small-files workload like maildir or anything like that.


 # time umount /media/WD-root/

 real  0m22.419s
 user  0m0.000s
 sys   0m0.064s

 # /proc/10142/stack  --- the PID of umount process

 The process(es) actually doing the work are the btrfs workers, usual
 sucspects are btrfs-cache (free space cache) or btrfs-ino (inode cache)
 that are writing the cache states back to disk.

Not sure about that, since iostat shows it's mostly reads, not writes.
Will try iotop later.
I also tested with Chris' for-linus branch on top of 3.4; same result
(a really long time to umount).

Reverting back to ubuntu's 3.2.0-26-generic, umount only took less than 1 s :P
So I guess I'm switching back to 3.2 for now.

-- 
Fajar


Re: BTRFS fsck apparent errors

2012-07-03 Thread Fajar A. Nugraha
On Tue, Jul 3, 2012 at 10:22 PM, Hugo Mills h...@carfax.org.uk wrote:
 On Tue, Jul 03, 2012 at 05:10:13PM +0200, Swâmi Petaramesh wrote:

 After I had shifted, I tried to defragment and compress my FS using
 commands such as :

 find /mnt/STORAGEFS/STORAGE/ -exec btrfs fi defrag -clzo -v {} \;

 During execution of such commands, my kernel oopsed, so I restarted.

I would also suggest using a 3.4 kernel. There's at least one FS
 corruption bug known to exist in 3.2 that's been fixed in 3.4.


Are there any known btrfs regression in 3.4? I'm using 3.4.0-3-generic
from a ppa, but a normal mount - umount cycle seems MUCH longer
compared to how it was on 3.2, and iostat shows the disk is
read-IOPS-bound

# time mount LABEL=WD-root

real0m10.400s
user0m0.000s
sys 0m0.060s

# time umount /media/WD-root/

real0m22.419s
user0m0.000s
sys 0m0.064s

# /proc/10142/stack  --- the PID of umount process
[8111dd1e] sleep_on_page+0xe/0x20
[8111de88] wait_on_page_bit+0x78/0x80
[8111e08c] filemap_fdatawait_range+0x10c/0x1a0
[a00744eb] btrfs_wait_marked_extents+0x6b/0xc0 [btrfs]
[a007457b] btrfs_write_and_wait_marked_extents+0x3b/0x60 [btrfs]
[a00745cb] btrfs_write_and_wait_transaction+0x2b/0x50 [btrfs]
[a0074e69] btrfs_commit_transaction+0x759/0x960 [btrfs]
[a00700db] btrfs_commit_super+0xbb/0x110 [btrfs]
[a0071490] close_ctree+0x2a0/0x310 [btrfs]
[a004b6c9] btrfs_put_super+0x19/0x20 [btrfs]
[811810b2] generic_shutdown_super+0x62/0xf0
[811811d6] kill_anon_super+0x16/0x30
[a004df3a] btrfs_kill_super+0x1a/0x90 [btrfs]
[811816ac] deactivate_locked_super+0x3c/0xa0
[81181f9e] deactivate_super+0x4e/0x70
[8119df9c] mntput_no_expire+0xdc/0x130
[8119f296] sys_umount+0x66/0xe0
[8169e129] system_call_fastpath+0x16/0x1b
[] 0x

-- 
Fajar


Re: Kernel panic from btrfs subvolume delete

2012-06-29 Thread Fajar A. Nugraha
On Fri, Jun 29, 2012 at 5:11 PM, Richard Cooper
rich...@richardcooper.net wrote:
 Hi All,

 I have two machines where I've been testing various btrfs based backup 
 strategies. They are both Cent OS 6 with the standard kernel and btrfs-progs 
 RPMs from the CentOS repos.

 - kernel-2.6.32-220.17.1.el6.x86_64
 - btrfs-progs-0.19-12.el6.x86_64

In btrfs terms, 2.6.32 is ... stone age :P

 What should I do now? Do I need to upgrade to a more recent btrfs?

Yep

 If so, how?

https://blogs.oracle.com/linux/entry/oracle_unbreakable_enterprise_kernel_release
http://elrepo.org/tiki/kernel-ml

-- 
Fajar


Re: Kernel panic from btrfs subvolume delete

2012-06-29 Thread Fajar A. Nugraha
On Fri, Jun 29, 2012 at 9:23 PM, Richard Cooper
rich...@richardcooper.net wrote:
 If so, how?

 https://blogs.oracle.com/linux/entry/oracle_unbreakable_enterprise_kernel_release
 http://elrepo.org/tiki/kernel-ml

 Perfect, thank you! I was looking for a mainline kernel yum repo but my 
 google-fu was failing me. That looks like just what I need.

 I've installed kernel v3.4.4 from http://elrepo.org/tiki/kernel-ml and that 
 seems to have fixed my kernel panic. I'm still using the default Cent OS 6 
 versions of the btrfs userspace programs (v0.19). Any reason why that might 
 be a bad idea?

At the very least, newer versions of btrfsck have --repair, which you
might need in the future.
There are also features like forcing a certain compression (e.g. zlib) on
a file as part of the btrfs filesystem defrag command.

Just grab updated btrfs-progs (or whatever it's called) from Oracle's repo.

-- 
Fajar


Re: System Policy for Filenames

2012-06-26 Thread Fajar A. Nugraha
On Wed, Jun 27, 2012 at 1:28 AM, Aaron Peterson
myusualnickn...@gmail.com wrote:
 Billy,

 Thank you! I will look into FUSE.

 Ultimately, I want my / to be mounted with these rules,  I will need a
 boot loader to be able to handle it.

Try looking at how the ubuntu live CD works. Last time I checked, it can
use unionfs-fuse as / to make the read-only CD media appear writable
during the live session. Something similar should be applicable to your needs.

  I am wondering if filesystem software has hooks for AppArmor or
 SELinux, or some other Linux Security Module would be appropriated to
 add to filesystem code?

Not that I know of.

-- 
Fajar


Re: Subvolumes and /proc/self/mountinfo

2012-06-20 Thread Fajar A. Nugraha
On Wed, Jun 20, 2012 at 10:22 AM, H. Peter Anvin h...@zytor.com wrote:
 a. Make a snapshot of the current root;
 b. Mount said snapshot;
 c. Install the new distro on the snapshot;
 d. Change the bootloader configuration *inside* the snapshot to point
   to the snapshot as the root;
 e. Install the bootloader on the snapshot, thereby making the boot
   block point to it and making it live.


IMHO a more elegant solution would be similar to what
(open)solaris/indiana does: keep the boot parts (bootloader,
configuration) in a separate area, apart from the root snapshots. In
Solaris' case, IIRC, this is /rpool/grub.

A similar approach should be implementable in linux, at least in
certain configurations: if you put /boot as part of / (thus, also on
btrfs), AND you don't change the default subvolume, AND the roots are
on their own subvolumes, then the paths to vmlinuz and initrd in
grub.cfg will have the subvol names in them. So it's possible to have
a single grub.cfg with several entries that point to different
subvols, and you don't need to install a new bootloader to make a
particular subvol live; you only need to select it from the boot menu.

I'm currently doing this with ubuntu precise, though with a
manually-created grub.cfg. I still haven't found a way to manage this
automatically.
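For illustration, such a manually-maintained grub.cfg can be sketched like
this (the subvolume names, UUID placeholder and kernel version are all
made up, not taken from a real system):

```
# Hypothetical grub.cfg fragment: one menu entry per root subvolume.
# Because /boot lives inside each subvolume, the kernel/initrd paths
# carry the subvolume name, and rootflags= selects the matching root.
menuentry "Ubuntu precise (subvol @root-a)" {
    linux  /@root-a/boot/vmlinuz-3.2.0-26-generic root=UUID=<fs-uuid> rootflags=subvol=@root-a ro
    initrd /@root-a/boot/initrd.img-3.2.0-26-generic
}
menuentry "Ubuntu precise (subvol @root-b)" {
    linux  /@root-b/boot/vmlinuz-3.2.0-26-generic root=UUID=<fs-uuid> rootflags=subvol=@root-b ro
    initrd /@root-b/boot/initrd.img-3.2.0-26-generic
}
```

Selecting either entry at the boot menu makes that subvolume the live
root, with no bootloader reinstall involved.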

-- 
Fajar


Re: Subvolumes and /proc/self/mountinfo

2012-06-19 Thread Fajar A. Nugraha
On Wed, Jun 20, 2012 at 6:35 AM, H. Peter Anvin h...@zytor.com wrote:
 On 06/19/2012 07:22 AM, Calvin Walton wrote:

 All subvolumes are accessible from the volume mounted when you use -o
 subvolid=0. (Note that 0 is not the real ID of the root volume, it's
 just a shortcut for mounting it.)


 Could you clarify this bit?  Specifically, what is the real ID of the
 root volume, then?

5

-- 
Fajar


Re: Moving top level to a subvolume

2012-06-13 Thread Fajar A. Nugraha
On Wed, Jun 13, 2012 at 4:44 PM, C Anthony Risinger anth...@xtfx.me wrote:
 On Wed, Jun 13, 2012 at 2:21 AM, Arne Jansen sensi...@gmx.net wrote:
 On 13.06.2012 09:04, C Anthony Risinger wrote:

 ... because in a), data will *copied* the slow way

 What I don't understand is why you think data will be copied.

 at one point i tried to create a new subvol and `mv` files there, and
 it took quite some time to complete
 (cross-link-device-what-have-you?), but maybe things changed ... will
 try it out.

IIRC it hasn't. Not in upstream, anyway. Some distros (e.g. opensuse)
carry their own patch which allows cross-subvolume reflinks (cp --reflink
...).

But it shouldn't matter anyway, since you can SNAPSHOT the old subvol
(even the root subvol) instead of creating a new subvol, which means
nothing needs to be copied.

You'd still have to do the rm manually, though.

-- 
Fajar


Re: Moving top level to a subvolume

2012-06-12 Thread Fajar A. Nugraha
On Tue, Jun 12, 2012 at 9:52 PM, Randy Barlow
ra...@electronsweatshop.com wrote:
 I personally run Gentoo, but I've been told by some coworkers that the Ubuntu
 installer offers btrfs as an option to the users without marking it as
 experimental, unstable, or under development. I wonder if that is why we see
 so many people surprised when they lose their filesystems. Can anyone verify
 whether that is true of Ubuntu, or of any other Linux distributions?

Oracle Linux (when used with UEK2) officially supports btrfs. openSUSE
also supports btrfs, and uses its functionality for snapper.

I haven't found any updated (i.e. released post-12.04) official
support-status statement from Ubuntu, but they do offer btrfs as an
installation option.

As for "lose their filesystems": are there recent cases that involve one
of the three distros above and are purely btrfs' fault? The ones I
can remember (from posts to this list) were broken on earlier
kernels, or caused by bad disks.

-- 
Fajar


Re: Preparing single-disk setup for future multi-disk usage

2012-05-24 Thread Fajar A. Nugraha
On Thu, May 24, 2012 at 1:05 PM, Björn Wüst bjoern.wu...@iteratec.de wrote:

 Unfortunately, I do not have a disk to test it right now. The disk I am 
 planning to use is with the post service still :) .

You can use sparse files, possibly with losetup if necessary.
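A minimal sketch of the sparse-file approach (the file paths are
arbitrary; the mkfs.btrfs/losetup steps are shown as comments because
they need root):

```shell
# Create two 1 GiB sparse "disks"; apparent size is 1 GiB each, but
# they consume almost no real disk space until written to.
truncate -s 1G /tmp/btrfs-disk1.img /tmp/btrfs-disk2.img
ls -ls /tmp/btrfs-disk?.img   # first column shows blocks actually used

# mkfs.btrfs can use the image files directly (as root), e.g.:
#   mkfs.btrfs -d raid1 -m raid1 /tmp/btrfs-disk1.img /tmp/btrfs-disk2.img
# If a tool insists on a block device, attach loop devices first:
#   losetup -f --show /tmp/btrfs-disk1.img
```

This lets you rehearse multi-device layouts and replacement procedures
before the real disk arrives.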

 Thank you for your replies to this email (bjoern.wu...@gmx.net,

That's not the email address you used to send this.

 I am not subscribed to the mailing lists, thus please do a 'reply all').

IMHO asking something on a list and then saying "I am not subscribed;
send your reply to this other email address, which is not the one I'm
posting from" is rude.

-- 
Fajar


Re: Can btrfs silently repair read-error in raid1

2012-05-08 Thread Fajar A. Nugraha
On Tue, May 8, 2012 at 2:13 PM, Clemens Eisserer linuxhi...@gmail.com wrote:
 Hi,

 I have a quite unreliable SSD here which develops some bad blocks from
 time to time which result in read-errors.
 Once the block is written to again, its remapped internally and
 everything is fine again for that block.

 Would it be possible to create 2 btrfs partitions on that drive and
 use it in RAID1 - with btrfs silently repairing read-errors when they
 occur?
 Would it require special settings, to not fallback to read-only mode
 when a read-error occurs?

The problem would be how the SSD (and linux) behave when they encounter
bad blocks (not bad disks, which are easier to handle).

If the SSD goes "oh, I can't read this block; I'll just return an error
immediately", then it's good.

However, in most situations it would be more like "hmmm, I can't read
this block, let me retry that. What? Still an error? Then let's retry it
again, and again.", which could take several minutes for a single bad
block. And during that time linux (the kernel) would do something like
"hey, the disk is not responding. Why don't we try some stuff? Let's
try resetting the link. If that doesn't work, try downgrading the link
speed."

In short, if you KNOW the SSD is already showing signs of bad blocks,
better just throw it away.

-- 
Fajar


Re: kernel 3.3.4 damages filesystem (?)

2012-05-08 Thread Fajar A. Nugraha
On Tue, May 8, 2012 at 2:39 PM, Helmut Hullen hul...@t-online.de wrote:

 And you can use three BTRFS filesystems the same way as three Ext4
 filesystems if you prefer such a setup if the time spent for
 restoring the backup does not make up the cost for one additional
 disk for you.

 But where's the gain? If a disk fails I have a lot of tools for
 repairing an ext2/3/4 system.

It won't work if you use it in RAID0 (e.g. with LVM spanning three
disks, then ext4 on top of the LV), which is basically the same
thing you did (using btrfs in raid0 mode).

As others said, if your only concern is "if a disk is dead, I want to
be able to access the data on the other disks", then simply use btrfs as
three different filesystems, mounted on three directories.

btrfs will shine when:
- you need checksum and self-healing in raid10 mode
- you have lots of small files
- you have highly compressible content
- you need snapshot/clone feature

Since you don't need any of those, IMHO it's actually better if you just use ext4.
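For illustration, the three-independent-filesystems layout could be
expressed in fstab like this (device names follow the btrfs fi show
output earlier in the thread; the mount points are made up):

```
# /etc/fstab -- hypothetical: three independent btrfs filesystems,
# one per disk, instead of one multi-device RAID0 pool.  If one disk
# dies, only its own mount is lost.
/dev/sdc1  /srv/media1  btrfs  defaults  0  0
/dev/sdf1  /srv/media2  btrfs  defaults  0  0
/dev/sdi1  /srv/media3  btrfs  defaults  0  0
```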

-- 
Fajar


Re: kernel 3.3.4 damages filesystem (?)

2012-05-07 Thread Fajar A. Nugraha
On Mon, May 7, 2012 at 5:46 PM, Helmut Hullen hul...@t-online.de wrote:

 For some months I run btrfs unter kernel 3.2.5 and 3.2.9, without
 problems.

 Yesterday I compiled kernel 3.3.4, and this morning I started the
 machine with this kernel. There may be some ugly problems.


 Data, RAID0: total=5.29TB, used=4.29TB

Raid0? Yaiks!

 System, RAID1: total=8.00MB, used=352.00KB
 System: total=4.00MB, used=0.00
 Metadata, RAID1: total=149.00GB, used=5.00GB

 Label: 'MMedia'  uuid: 9adfdc84-0fbe-431b-bcb1-cabb6a915e91
        Total devices 3 FS bytes used 4.29TB
        devid    3 size 2.73TB used 1.98TB path /dev/sdi1
        devid    2 size 2.73TB used 1.94TB path /dev/sdf1
        devid    1 size 1.82TB used 1.63TB path /dev/sdc1



 May  7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 
 0x1 action 0xe frozen
 May  7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
 May  7 06:55:26 Arktur kernel: ata5: hard resetting link
 May  7 06:55:31 Arktur kernel: ata5: COMRESET failed (errno=-19)
 May  7 06:55:31 Arktur kernel: ata5: reset failed (errno=-19), retrying in 6 
 secs


 May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
 May  7 07:15:19 Arktur kernel: end_request: I/O error, dev sdf, sector 0
 May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
 May  7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1


That looks like a bad disk to me, and it shouldn't be related to the
kernel version you use.

Your best chance might be:
- unmount the fs
- get another disk to replace /dev/sdf, and copy the content over with
dd_rescue. ATA resets can be a PITA, so you might be better off moving
the failed disk to a USB external adapter and doing some creative
combination of plug-unplug, selectively skipping bad sectors manually
(by passing -s to dd_rescue).
- reboot, with the bad disk unplugged
- (optional) run btrfs filesystem scrub (you might need to build
btrfs-progs manually from the git source), or simply read the entire fs
(e.g. using tar to /dev/null, or whatever). Either should check the
checksums of all files and print out which files are damaged (either on
stdout or in syslog).

I don't think there's anything you can do to recover the damaged files
(other than restoring from backup), but at least you'll know which files
are NOT damaged.

-- 
Fajar


Re: How file store when using Btrfs on multi-devices? What happen when a device fail?

2012-05-03 Thread Fajar A. Nugraha
On Thu, May 3, 2012 at 1:46 PM, Chu Duc Minh chu.ducm...@gmail.com wrote:
 Hi, i have some questions when using Btrfs on multi-devices:
 1. a large file will always be stored wholely on a device or it may
 spread on some devices/partitions?

IIRC:
- in raid1 mode, it will be written to all disks (or was it TWO disks,
regardless of how many devices are in the mirror? I can't remember which).
- in raid10 and raid0, it will always be spread over a minimum of two devices.

 Btrfs has option to specify it
 explicitly?

Not that I know of.

 2. suppose i have a directory tree like that:
 Dir_1
  |-- file_1A
  |-- file_1B
  |-- Dir_2
  |-- file_2C
  |-- file_2D

 If Dir_2, file_2C  on a failed device, can i still have access to file_2D?

Unless you're using raid10, my guess is you'll be screwed, as each
file will be spread over multiple devices (including the one that
fails).

 If i use GlusterFS (mirror mode) on two nodes, each nodes run Btrfs on
 multi-device. When a device on a node fail and I replace it, then
 GlusterFS resync it, can i have troubles with data consistency?

This question might be more suitable for the glusterfs list. My guess is
that glusterfs will discard all data on the failed node. After you
recreate the storage backend (the btrfs, on a new device), you can
tell glusterfs to copy everything from the good node.

Of course, if you use raid10 mode in btrfs and only one device fails,
it should be transparent to end users.

-- 
Fajar


Snapper packages for Ubuntu

2012-04-10 Thread Fajar A. Nugraha
Hi,

I've created snapper packages for Ubuntu, available on
https://launchpad.net/~snapper/+archive/stable. For those new to
snapper, it's a tool for managing btrfs snapshots
(http://en.opensuse.org/Portal:Snapper). It depends on libblocxx
available from https://launchpad.net/~bjoern-esser-n/+archive/blocxx ,
and currently uses git source up to commit 50dec40. I've done some
limited testing and it seems to work correctly so far.

There's a small, distro-independent patch needed for it to work
correctly though. I'm sending it as a separate mail.

@Arvin, @MGE, I don't know the correct list for snapper development so
I'm cc-ing you both. If there's a dedicated list for snapper please
let me know and I'll post further updates there.

-- 
Fajar


[PATCH] Snapper: Always create .snapshots dir unconditionally

2012-04-10 Thread Fajar A. Nugraha
The current version of snapper (commit 50dec40) bails out with this error
if the .snapshots directory doesn't exist (as is the case on a new
snapper install):

2012-04-10 16:15:30,241 ERROR libsnapper(17784)
Snapshot.cc(nextNumber):362 - mkdir failed errno:2 (No such file or
directory)

This patch tries to create .snapshots dir unconditionally.

Signed-off-by: Fajar A. Nugraha git...@fajar.net
---
 snapper/Snapshot.cc |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/snapper/Snapshot.cc b/snapper/Snapshot.cc
index 8e9cc37..277fad7 100644
--- a/snapper/Snapshot.cc
+++ b/snapper/Snapshot.cc
@@ -353,6 +353,9 @@ namespace snapper
 	if (snapper->getFilesystem()->checkSnapshot(num))
 	    continue;
 
+	// try to create .snapshots dir unconditionally
+	mkdir(snapper->infosDir().c_str(), 0711);
+
 	if (mkdir((snapper->infosDir() + "/" + decString(num)).c_str(),
 		  0777) == 0)
 	    break;

-- 
1.7.9.1


Re: Snapper packages for Ubuntu

2012-04-10 Thread Fajar A. Nugraha
On Tue, Apr 10, 2012 at 6:50 PM, Arvin Schnell aschn...@suse.de wrote:
 On Tue, Apr 10, 2012 at 05:37:38PM +0700, Fajar A. Nugraha wrote:
 Hi,

 I've created snapper packages for Ubuntu, available on
 https://launchpad.net/~snapper/+archive/stable. For those new to
 snapper, it's a tool for managing btrfs snapshots
 (http://en.opensuse.org/Portal:Snapper). It depends on libblocxx

 libblocxx is not required for snapper anymore since about a
 month. It's checked during configure.

You're right. I just tested it, and building without libblocxx results
in fewer dependencies (namely libblocxx itself, plus libssl, libcrypto,
and libpcre).

What functionality, if any, is missing when libblocxx is not used?
Since it's still picked up when present during configure, I assume it's
good for something.

Thanks.

Fajar


Re: snapper for Ubuntu? (WAS: btrfs auto snapshot)

2012-04-10 Thread Fajar A. Nugraha
On Tue, Apr 10, 2012 at 6:46 PM, Arvin Schnell aschn...@suse.de wrote:
 On Mon, Apr 09, 2012 at 08:18:45AM +0700, Fajar A. Nugraha wrote:
 I noticed that openSUSE buildservice now provides debs for ubuntu as
 well. I can't seem to find a way to add it to apt source list though,
 using the usual line

 deb uri distribution [component1] 

 You can use these commands:

 echo 'deb 
 http://download.opensuse.org/repositories/filesystems:/snapper/Debian_6.0/ /' 
  /etc/apt/sources.list

I didn't know you could use that format :D Just tested it, and it
works, although the command I use is

echo 'deb 
http://download.opensuse.org/repositories/filesystems:/snapper/Debian_6.0/
/' | sudo tee /etc/apt/sources.list.d/opensuse-snapper.list


 apt-get update

That got me the error

W: GPG error: http://download.opensuse.org  Release: The following
signatures couldn't be verified because the public key is not
available: NO_PUBKEY 2DA6FAF4175BFA4E

easily fixed though, using

$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 2DA6FAF4175BFA4E

... and then another apt-get update after that.


 apt-get install snapper

That resulted in a warning

WARNING: The following packages cannot be authenticated!
  libsnapper snapper
Install these packages without verification [y/N]?

Did the package creation process somehow omit the signing step,
perhaps? Or is there something else I missed?

Anyway, I got snapper-0.0.10-0 installed now, but I'm having a small
problem. I use different subvolumes for multiple directories, for
example /home and /data. Creating a config for both results in an
error:

$ sudo snapper list-configs
Config | Subvolume
---+--
$ sudo snapper create-config /home
$ sudo snapper create-config /data
Creating config failed (config already exists).
$ sudo snapper list-configs
Config | Subvolume
---+--
root   | /home

How can I create a config for /data or other directories (other than by
manually creating the config file and .snapshots directory)?

-- 
Fajar


Re: snapper for Ubuntu? (WAS: btrfs auto snapshot)

2012-04-10 Thread Fajar A. Nugraha
On Tue, Apr 10, 2012 at 9:35 PM, Matthias G. Eckermann m...@suse.com wrote:
 On 2012-04-10 T 20:48 +0700 Fajar A. Nugraha wrote:
 How can I create config for /data or other directories (other than
 manually creating the config file and .snapshots directory)?

 This should do it:

 sudo snapper -c home create-config /home
 sudo snapper -c data create-config /data

 The reasons for the extra -c name is that you have to
 tell snapper, which name to choose for the configuration
 you want to create. This name is the one you can reference
 in future actions such as create/modify/delete.

Great! That works, thanks.

Is there an opposite of create-config, i.e. a delete for just one subvolume?
delete-config seems to delete everything (the configs for all subvolumes
and all snapshots).

Also, one minor detail: I noticed that the cron configuration file is
/etc/sysconfig/snapper. It should be /etc/default/snapper on
ubuntu/debian.

-- 
Fajar


snapper for Ubuntu? (WAS: btrfs auto snapshot)

2012-04-08 Thread Fajar A. Nugraha
On Thu, Mar 1, 2012 at 8:48 PM, Arvin Schnell aschn...@suse.de wrote:
 We have now created a project in the openSUSE buildservice were
 we provide snapper packages for various distributions, e.g. RHEL6
 and Fedora 16. Please find the downloads at:

  http://download.opensuse.org/repositories/filesystems:/snapper/

 I'll also add a link from the snapper home page:

  http://en.opensuse.org/Portal:Snapper.

 I have tested snapper on Fedora 16 and found no problems.

Hi Arvin,

I noticed that the openSUSE buildservice now provides debs for ubuntu as
well. I can't seem to find a way to add it to the apt source list though,
using the usual line

deb uri distribution [component1] 

Is there a howto somewhere, or is it
download-all-debs-manually-and-install-with-dpkg for now?

Thanks,

Fajar


Re: btrfsck integration with userlevel API for fsck

2012-03-30 Thread Fajar A. Nugraha
On Sat, Mar 31, 2012 at 3:35 AM, Avi Miller avi.mil...@oracle.com wrote:

 On 30/03/2012, at 2:22 PM, Fajar A. Nugraha wrote:

 On Fri, Mar 30, 2012 at 5:08 AM, member graysky gray...@archlinux.us wrote:
 Are there plans to integrate btrfsck with the userlevel API for fsck?

 There isn't even a stable, working, fixing btrfsck yet :)

 Yes, there is. Chris merged the btrfsck changes into the btrfs-progs master 
 in git a few days ago and we shipped it with the Oracle Linux UEK2 update as 
 well.

Ah, OK. I must've missed the announcement. Thanks for the update.

Now if only UEK2 fully supported LXC as well, instead of keeping it as a
tech preview ... :D

-- 
Fajar


Re: btrfsck integration with userlevel API for fsck

2012-03-29 Thread Fajar A. Nugraha
On Fri, Mar 30, 2012 at 5:08 AM, member graysky gray...@archlinux.us wrote:
 Are there plans to integrate btrfsck with the userlevel API for fsck?

There isn't even a stable, working, fixing btrfsck yet :)

 AFAIK, it currently does not work as such (i.e. `shutdown -rF now`
 does not trigger a check on the next boot).  What is the recommended
 method to check a btrfs root filesystem?  Live media?

Currently? None. Set the last field of the root line in fstab to 0 to
disable fsck.
Newer kernels should be smart enough to recover from an unclean shutdown
automatically, kind of like what zfs does, or what ext3/4 does with its
journal replay.
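For instance, a hypothetical root entry with fsck disabled (the UUID is a
placeholder):

```
# /etc/fstab -- hypothetical btrfs root; fs_passno (the 6th field) is 0,
# so the boot-time fsck is skipped for this filesystem
UUID=<fs-uuid>  /  btrfs  defaults  0  0
```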

-- 
Fajar


Re: Create subvolume from a directory?

2012-03-27 Thread Fajar A. Nugraha
On Wed, Mar 28, 2012 at 5:24 AM, Matthias G. Eckermann m...@suse.com wrote:
 While the time measurement might be flawed due to the subvol
 actions inbetween, caching etc.: I tried several times, and
 cp --reflinks always is multiple times faster than mv in
 my environment.

So this is cross-subvolume reflinks? I thought the code for that
wasn't merged yet?

-- 
Fajar


Re: Can't mount, power failure - recoverable?

2012-03-26 Thread Fajar A. Nugraha
On Mon, Mar 26, 2012 at 3:34 PM, Skylar Burtenshaw daninfu...@gmail.com wrote:
 Hey - been a few days, not meaning to pester but I wanted to make sure my
 previous message didn't slip through the cracks. If I offended, I apologize - 
 I
 certainly didn't mean to, and my attempts at joviality can come across as
 abrasive. If you simply haven't had time to look into this yet, or it's 
 bizarre
 enough that it's taking time to isolate, take all the time you need. Thank 
 you.

Didn't Chris' last response basically say use kernel 3.2 or newer,
mount the fs (possibly with -o ro), and copy the data elsewhere? Have
you done that?

-- 
Fajar


Re: Can't mount, power failure - recoverable?

2012-03-26 Thread Fajar A. Nugraha
On Mon, Mar 26, 2012 at 3:49 PM, Skylar Burtenshaw daninfu...@gmail.com wrote:
 Fajar A. Nugraha list at fajar.net writes:

 Didn't Chris' last response basically say use kernel 3.2 or newer,
 mount the fs (possibly with -o ro), and copy the data elsewhere?

 Why yes, yes it did actually. I appreciate your spotlighting it, just in case
 I somehow managed to miss it, though.

 Have you done that?

 I have. In fact, in my first message, I stated that in all kernels up to
 present 3.2 kernels, I get several minutes of disk churning, then a stack
 trace. Also present in my messages is the fact that the filesystem will not
 mount, as well as data output from the recovery programs etc which fail to
 recognize things in the filesystem that they require in order to fix it. Did
 you have something you wished to suggest, in order to help me? If so, I'd
 gladly listen to any proposed ideas.

Since you apparently tried -o ro (which I missed), then my last
suggestion is probably kernel 3.3 with -o ro, just in case :)

-- 
Fajar


Re: btrfs and backups

2012-03-26 Thread Fajar A. Nugraha
On Mon, Mar 26, 2012 at 3:56 PM, Felix Blanke felixbla...@gmail.com wrote:
 On 3/26/12 10:30 AM, James Courtier-Dutton wrote:
 Is there some tool like rsync that I could copy all the data and
 snapshots to a backup system, but still only use the same amount of
 space as the source filesystem.


 I'm not sure if I understand your problem right, but I would suggest:

 1) Snapshot the subvolume on the source
 2) rsync the snapshot to the destination
 3) Snapshot the destination

James did say only use the same amount of space as the source
filesystem. Your approach would increase the usage when one or more
subvolumes share the same space (e.g. when a subvolume starts as a
snapshot).
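For reference, the three-step approach quoted above might look roughly like this (all paths, subvolume names, and the host are made up; and as noted, the destination copy will not share space with other subvolumes):

```shell
# 1) take a snapshot on the source so rsync sees a stable tree
btrfs subvolume snapshot /srv/data /srv/.snap-for-backup

# 2) copy the snapshot to the backup filesystem
rsync -a --delete /srv/.snap-for-backup/ backuphost:/backup/data/

# 3) snapshot the destination to keep this point in time
ssh backuphost "btrfs subvolume snapshot /backup/data /backup/data-$(date +%F)"
```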

AFAIK the (planned) way to do this is using btrfs send | receive,
which is not available yet.

-- 
Fajar


Re: compressed btrfs No space left on device

2012-03-06 Thread Fajar A. Nugraha
On Thu, Nov 17, 2011 at 12:59 AM, Arnd Hannemann a...@arndnet.de wrote:
 Am 14.11.2011 19:24, schrieb Arnd Hannemann:
 Am 14.11.2011 15:57, schrieb Arnd Hannemann:

 I'm using btrfs for my /usr/share/ partition and keep getting the following
 error while installing a debian package which should take no more than 228 MB:

 Unpacking texlive-fonts-extra (from .../texlive-fonts-extra_2009-10ubuntu1_all.deb) ...
 dpkg: error processing /var/cache/apt/archives/texlive-fonts-extra_2009-10ubuntu1_all.deb (--unpack):
 unable to install new version of `/usr/share/texmf-texlive/fonts/type1/public/allrunes/frutlt.pfb': No space left on device


 FYI: The problem is the same with mainline kernel v3.1.1.

 JFYI: the problem went away in 3.2-rc2  so someone must
 have fixed something.

I just experienced the same thing in Ubuntu precise, 3.2.0-17-generic,
so I don't think it's fixed yet.

$ sudo btrfs fi df /
Data: total=43.47GB, used=38.47GB
System, DUP: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=3.25GB, used=912.47MB
Metadata: total=8.00MB, used=0.00

$ df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda6        50G   41G  5.1G  89% /

The problem occurs when copying the precise lxc root template (322M,
13759 files/directories).

It only happens when using zlib compression though, using lzo works fine.

-- 
Fajar


Re: [PATCH] [RFC] Add btrfs autosnap feature

2012-03-04 Thread Fajar A. Nugraha
On Mon, Mar 5, 2012 at 1:51 PM, Anand Jain anand.j...@oracle.com wrote:

 (notably the direct modification of
 crontab files, which is considered to be an internal detail if I
 understand correctly, and I'm fairly certain is broken as written),


  I did came across that point of view however, using crontab cli in the
  program wasn't convincing either, (library call would have been better).
  any other better ways to manage cron entries ?


/etc/cron.{d,daily,hourly} ?
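For example, a drop-in file under /etc/cron.d instead of editing user crontabs directly (the filename, schedule, and helper command below are all hypothetical):

```
# /etc/cron.d/btrfs-autosnap -- note the extra user field vs. a user crontab
# m  h  dom mon dow  user  command
0    *  *   *   *    root  /usr/local/sbin/btrfs-autosnap --hourly --keep 24
```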

-- 
Fajar


Re: filesystem full when it's not? out of inodes? huh?

2012-03-02 Thread Fajar A. Nugraha
On Fri, Mar 2, 2012 at 6:50 PM, Brian J. Murrell br...@interlinx.bc.ca wrote:
 Is  2010-06-01 really the last time the tools were considered
 stable or are Ubuntu just being conservative and/or lazy about updating?

The last one :)

Or probably no one has bugged them enough and pointed out that they're
already shipping a git snapshot anyway, and that there are many new
features in the current git version of btrfs-tools.

I have been compiling my own kernel (just recently switched to
Precise's kernel though) and btrfs-progs for quite some time, so even
if Ubuntu doesn't provide updated package it wouldn't matter much to
me. If it's important for you, you could file a bug report in
launchpad asking for an update. Even debian testing has an updated
version (which you might be able to use:
http://packages.debian.org/btrfs-tools)

Or create your own ppa with an updated version (or at least rebuilt of
Debian's version).

-- 
Fajar


Re: [RFC] btrfs auto snapshot

2012-03-01 Thread Fajar A. Nugraha
On Thu, Mar 1, 2012 at 8:48 PM, Arvin Schnell aschn...@suse.de wrote:
 On Thu, Feb 23, 2012 at 04:54:06PM +0700, Fajar A. Nugraha wrote:
 On Thu, Aug 18, 2011 at 12:38 AM, Matthias G. Eckermann m...@suse.com 
 wrote:

  are available in the openSUSE buildservice at:
 
         http://download.opensuse.org/repositories/home:/mge1512:/snapper/
 

 Hi Matthias,

 I'm testing your packages on top of RHEL6 + kernel 3.2.7. A small
 suggestion, you should include /etc/sysconfig/snapper in the package
 (at least for RHEL6, haven't tested the other ones). Even if it just
 contains

 SNAPPER_CONFIGS=

 Hi Fajar,

 thanks for reporting that issue, I have fixed it now.

Great! Thanks.


 We have now created a project in the openSUSE buildservice were
 we provide snapper packages for various distributions, e.g. RHEL6
 and Fedora 16. Please find the downloads at:

  http://download.opensuse.org/repositories/filesystems:/snapper/

 I'll also add a link from the snapper home page:

  http://en.opensuse.org/Portal:Snapper.

 I have tested snapper on Fedora 16 and found no problems.

When I installed it back then, the first thing that came to mind was
that there's no documentation on how to get started.

http://en.opensuse.org/openSUSE:Snapper_Tutorial is good, but it
assumes root is btrfs and snapper is already configured to snapshot
root. On other distros, you need to first create the config manually,
e.g. as shown for home in http://en.opensuse.org/openSUSE:Snapper_FAQ
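For reference, creating a config manually looks roughly like this (the config name and subvolume path are examples; check your snapper version's help output for the exact syntax):

```shell
# create a snapper configuration named 'home' for the /home subvolume
snapper -c home create-config /home

# take a manual snapshot and list existing ones
snapper -c home create --description "before upgrade"
snapper -c home list
```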

Could you update the tutorial, or perhaps create a new quickstart
page? I'm kinda reluctant to do it myself since I don't use opensuse,
and some of my edits might not reflect the correct way to do it in
opensuse. If that's not possible, I'll put up the documentation
somewhere else (perhaps the semi-official http://btrfs.ipv5.de/ , or
my own wiki).

Two other things I haven't found are:
- how to add pre and post hooks, so (for example) snapper could create
the same pre/post snapshot pair whenever a user runs yum, similar to
when a user runs yast in opensuse,
- whether a rollback REALLY rolls back everything (including binary and
new/missing files), or whether it's git-like behavior, or only
processes text files.

... but those two aren't as important as the getting-started documentation.

-- 
Fajar


Re: Btrfs Storage Array Corrupted

2012-02-28 Thread Fajar A. Nugraha
On Wed, Feb 29, 2012 at 7:13 AM, Travis Shivers ttshiv...@gmail.com wrote:
 # ./btrfs-zero-log /dev/sdh
 parent transid verify failed on 5568194695168 wanted 43477 found 43151
 parent transid verify failed on 5568194695168 wanted 43477 found 43151
 parent transid verify failed on 5568194695168 wanted 43477 found 43151
 parent transid verify failed on 5568194695168 wanted 43477 found 43151
 Ignoring transid failure

Did you try a read-only mount (-o ro) after you run btrfs-zero-log?

-- 
Fajar


Re: [RFC] btrfs auto snapshot

2012-02-23 Thread Fajar A. Nugraha
On Thu, Aug 18, 2011 at 12:38 AM, Matthias G. Eckermann m...@suse.com wrote:
 Ah, sure. Sorry.  Packages for blocxx for:
        Fedora_14       Fedora_15
        RHEL-5          RHEL-6
        SLE_11_SP1
        openSUSE_11.4   openSUSE_Factory

 are available in the openSUSE buildservice at:

        http://download.opensuse.org/repositories/home:/mge1512:/snapper/


Hi Matthias,

I'm testing your packages on top of RHEL6 + kernel 3.2.7. A small
suggestion, you should include /etc/sysconfig/snapper in the package
(at least for RHEL6, haven't tested the other ones). Even if it just
contains

SNAPPER_CONFIGS=

Thanks,

Fajar


Re: btrfs-convert processing time

2012-02-20 Thread Fajar A. Nugraha
On Mon, Feb 20, 2012 at 8:50 PM, Hubert Kario h...@qbs.com.pl wrote:
 On Monday 20 of February 2012 14:41:33 Olivier Bonvalet wrote:
 Lot of small files (like compressed email from Maildir), and lot of
 hardlinks, and probably low free space (near 15% I suppose).


 So I think I have my answer :)


 Yes, this is probably the worst possible combination.

 Please keep us updated. Just to have exact numbers for new users.


... although it would probably fail anyway due to the btrfs hardlink
limit within a single directory.

-- 
Fajar


Re: btrfs-convert processing time

2012-02-20 Thread Fajar A. Nugraha
On Mon, Feb 20, 2012 at 9:29 PM, Olivier Bonvalet btrfs.l...@daevel.fr wrote:
 On 20/02/2012 15:00, Fajar A. Nugraha wrote:

 On Mon, Feb 20, 2012 at 8:50 PM, Hubert Karioh...@qbs.com.pl  wrote:

 On Monday 20 of February 2012 14:41:33 Olivier Bonvalet wrote:

 Lot of small files (like compressed email from Maildir), and lot of
 hardlinks, and probably low free space (near 15% I suppose).


 So I think I have my answer :)


 Yes, this is probably the worst possible combination.

  Please keep us updated. Just to have exact numbers for new users.



 ... although it would probably fail anyway due to btrfs hardlink limit
 in the same directory.


 And in that case, btrfs-convert will abort, or ignore the error, or just
 hang ?

On my simple test with ubuntu precise, loop-mounted ext4, 8k hardlinks:

$ sudo btrfs-convert /dev/loop0
creating btrfs metadata.
$ echo $?
139

so no useful error message (exit code 139 indicates a segfault), but
the filesystem is left untouched: when mounted, the device still shows ext4.

A successful conversion would look like this:

$ sudo btrfs-convert /dev/loop1
creating btrfs metadata.
creating ext2fs image file.
cleaning up system chunk.
conversion complete.
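The failing test case above can be reproduced roughly like this (image size, paths, and the 8200 link count are arbitrary; anything over the per-directory hardlink limit should do):

```shell
# build a small ext4 image with >8192 hardlinks to one file in one directory
dd if=/dev/zero of=/tmp/ext4.img bs=1M count=512
mkfs.ext4 -F /tmp/ext4.img
DEV=$(sudo losetup --show -f /tmp/ext4.img)
sudo mount "$DEV" /mnt
sudo touch /mnt/file
for i in $(seq 1 8200); do sudo ln /mnt/file /mnt/link.$i; done
sudo umount /mnt

# conversion is expected to fail with a non-zero exit code
sudo btrfs-convert "$DEV"; echo "exit code: $?"
```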

-- 
Fajar


Re: btrfs open_ctree failed (after recent Ubuntu update)

2012-02-19 Thread Fajar A. Nugraha
On Mon, Feb 20, 2012 at 10:34 AM, Curtis Jones curtis.jo...@gmail.com wrote:
 Chris,

 Thank you for those kernel-update instructions. That was the least painful
 kernel update I could have imagined. I rebooted and verified (via uname) that
 I am in fact running the new kernel. After looking at dmesg I can confirm
 that the exact same error is still occurring though. I re-read your previous
 email and saw that you recommended the 3.3-rc release if 3.2.6 didn't
 suffice. So I did the same thing with 3.3-rc. And I found the same error (or
 what appears to be the same error), again:

 [  186.982910] device label StoreW devid 1 transid 37077 /dev/sdb
 [  187.015081] parent transid verify failed on 79466496 wanted 33999 found 36704
 [  187.015088] parent transid verify failed on 79466496 wanted 33999 found 36704
 [  187.015091] parent transid verify failed on 79466496 wanted 33999 found 36704
 [  187.015094] parent transid verify failed on 79466496 wanted 33999 found 36704
 [  187.015764] btrfs: open_ctree failed

 uname now reports:

 Linux veriton 3.3.0-030300rc4-generic-pae #201202181935 SMP Sun Feb 19 00:53:06 UTC 2012 i686 i686 i386 GNU/Linux

 I'm not sure what to try next;

I'd try with the latest tools now. IIRC there are two programs you can try:
- btrfs-zero-log, which (as the name implies) zeroes out the transaction log, and
- restore, which tries to read files from a broken btrfs and copy
them elsewhere.

Try the first one. If you can, dump the content of the disk to a file
first (with dd or dd_rescue) and try it on that file. Just in case
something goes horribly wrong :)
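Working on a copy first might look like this (the device and paths are examples):

```shell
# image the whole device first, so the original stays untouched
sudo dd if=/dev/sdb of=/backup/sdb.img bs=4M conv=noerror,sync

# attach the image to a loop device and try the repair on the copy
LOOP=$(sudo losetup --show -f /backup/sdb.img)
sudo btrfs-zero-log "$LOOP"
sudo mount -o ro "$LOOP" /mnt    # see whether the copy now mounts read-only
```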

-- 
Fajar


Re: Setting options permanently?

2012-01-28 Thread Fajar A. Nugraha
On Sat, Jan 28, 2012 at 7:49 AM, Hadmut Danisch had...@danisch.de wrote:
 Am 28.01.2012 00:20, schrieb Chester:
 It should be okay to mount with compress or without compress. Even if
 you mount a volume with compressed data without '-o compress' you will
 still be able to correctly read the data (but newly written data will
 not be compressed)

 But having both compressed and uncompressed files in the filesystem is
 exactly what I want to avoid. Not because of reading problems, but to
 avoid wasting disk space. I don't have a reading problem. I have a
 writing problem.

If you've been using -o compress, then you should know that even then
not ALL data is compressed. If btrfs predicts that data won't benefit
from compression, it stores it uncompressed. The problem is the
prediction is not always right, which is why there's -o
compress-force.

Anyway, for removable media case, there's a workaround that you can
use (at least it works with gnome). Put the entry for the usb block
device (e.g. /dev/sdb1) in fstab, with appropriate mount option, and
the option will be used when you mount it using nautilus.
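A hypothetical fstab line for that (the `noauto,user` options keep it from mounting at boot while still letting a desktop session mount it, and the compression option then applies):

```
# /etc/fstab
/dev/sdb1  /media/usb  btrfs  noauto,user,compress-force=zlib  0  0
```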

-- 
Fajar


Re: Problem with 3.3.0-rc1+: Target filesystem cannot find /sbin/init

2012-01-22 Thread Fajar A. Nugraha
On Sun, Jan 22, 2012 at 7:21 PM, Swapnil Pimpale swapnil.p...@gmail.com wrote:
 I can successfully boot into Ubuntu 11.10 (3.0.0-14-generic-pae) with
 a btrfs root filesystem and an ext2 /boot partition.
 But when I installed the latest vanilla (3.3.0-rc1+) and booted into

where did you get the kernel from? kernel.org snapshot? git? third
party package?

 it, the first time the system froze.
 Next time onwards, I get the following error every time:

 [   0.427443] [drm:i915_init]  *ERROR* drm/i915 cannot work without
 intel_agp module!
 mount: mounting udev on /dev failed: No such device
 W: devtmpfs not available, falling back to tmpfs for /dev
 mount: mounting /dev/disk/by-uuid/f43fdd7a-8ad7-4e96-ab1c-14ba82a4324d
 on /root failed: No such device

Do you know how to debug your own custom kernel? That error is common
when a driver is missing (i.e. not built in, and not included in the
initrd). The easiest way to test is to compare what's in
/proc/partitions and /dev/disk/by-uuid during a normal system boot (I
assume you still have the old, working Ubuntu kernel?) with a failed
boot, when you're dropped to busybox. If your root device (sda8?) is
not in /proc/partitions, then it's definitely a block device driver
problem.
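Concretely, from the busybox shell of the failed boot (the device name is an example):

```shell
# is the disk visible to this kernel at all?
cat /proc/partitions

# are the by-uuid links populated?
ls -l /dev/disk/by-uuid/ 2>/dev/null

# compare against the same commands under the working Ubuntu kernel;
# if sda8 is missing here, the disk controller driver never loaded
```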

-- 
Fajar


Re: Btrfsck gives me errors

2012-01-19 Thread Fajar A. Nugraha
On Fri, Jan 20, 2012 at 11:24 AM, Jérôme Poulin jeromepou...@gmail.com wrote:
 On Wed, Jan 18, 2012 at 11:59 PM, Fajar A. Nugraha l...@fajar.net wrote:
 some files, unmount, and mount it again. If second mount does not show
 any error message then I'm pretty sure you're safe.

 I just upgraded from 3.0 to 3.2.1 and mounted the filesystem, tried
 find > /dev/null and only got messages about old space inode.

That's normal. You'll also get the message if you switch back to 3.0,
but it should be harmless.

 I then
 used btrfsck again for the same exact result, I'll ignore them for
 now, let's see what the shiny new btrfsck will do about them!

who knows when it will be available :)

Then again, most fsck features have been implemented in kernel space,
so a mount will automatically fix some types of problems (somewhat
similar to zfs, which has no fsck whatsoever). So just watch
syslog for any unusual error messages.

-- 
Fajar


Re: Btrfsck gives me errors

2012-01-18 Thread Fajar A. Nugraha
On Thu, Jan 19, 2012 at 9:02 AM, Jérôme Poulin jeromepou...@gmail.com wrote:
 I did a preemptive fsck after a RAID crash and got many errors, is
 there something I should do if everything I use works?

Probably just ignore it.

Recent kernels (e.g. 3.1 or 3.2) are smart enough to automatically fix
certain types of errors. Watch syslog when you mount the fs, access
some files, unmount, and mount it again. If second mount does not show
any error message then I'm pretty sure you're safe.

-- 
Fajar


Re: Encryption implementation like ZFS?

2011-12-31 Thread Fajar A. Nugraha
On Sat, Dec 31, 2011 at 3:12 AM, Sandra Schlichting
littlesandr...@gmail.com wrote:
 How is this advantageous over dmcrypt-LUKS?

 TRIM pass-through for SSD's. With dmcrypt on an SSD write performance
 is very slow.

... and depending on which SSD you use, it shouldn't matter. Really.

Last time I tried with sandforce SSD + btrfs + -o discard, forcing
trim actually made things slower. Sandforce (and probably other modern
SSD) controllers can work just fine even without explicit trim fs
support.

-- 
Fajar


Re: Encryption implementation like ZFS?

2011-12-31 Thread Fajar A. Nugraha
On Sun, Jan 1, 2012 at 12:12 AM, Niels de Carpentier
ni...@decarpentier.com wrote:
 ... and depending on which SSD you use, it shouldn't matter. Really.

 Last time I tried with sandforce SSD + btrfs + -o discard, forcing
 trim actually made things slower. Sandforce (and probably other modern
 SSD) controllers can work just fine even without explicit trim fs
 support.

 What command did you use to test this?

Normal usage, and some random i/o test tool like fio.


 I have an OCZ Agility 3 SSD, which have the latest Sandforce
 controller, so I would really like to try reproduce your test setup.

Yours should be newer. Mine is a somewhat-old Corsair Force 60 GB with
btrfs on top. When I activated -o discard, it actually became slower.
Also, when I used fstrim, the IOPS were capped at 100, so the slowdown
is probably because of that (i.e. an IOPS limit on TRIM somewhere,
possibly in the controller).


 Ok, the sandforce controller makes things interesting.

 First of all, sandforce controllers have a very high failure rate, so make
 sure you have backups!!

Yes, but even knowing that I can't imagine going back to HDD for this
particular system. It'd be too slow to bear :P

 Sandforce controllers also use compression and deduplication to increase
 performance. Encryption will make your data incompressible and random, so
 this can have a big impact on performance, depending on the
 characteristics of your data.

In my case I use compress=lzo, so the data shouldn't be further
compressible by the controller.

 Sandforce controllers also have life time throttling, which will throttle
 writes heavily if it thinks you will wear out the  flash within the
 warranty period. If you have a very heavy write workload this can be an
 issue.

That's new. Is there a link/reference for that?


 If you don't have a working trim it is a good idea to leave part of your
 drive unused. (Make sure you either do this after a full write erase of
 the drive, or do a manual trim of that area, otherwise it won't work).
 This will make sure the drive has enough spare sectors to do garbage
 collection and can greatly improve performance if your drive is full.

True. But on my last test I couldn't get fstrim to trim everything. It
could only trim about 2GB out of 12GB of free space.

-- 
Fajar


Re: Two way mirror in BRTFS

2011-12-30 Thread Fajar A. Nugraha
2011/12/30 Jaromir Zdrazil jaromir.zdra...@email.cz:
 Sorry fo the typo in the subject!

 Just to add, I would like to see a two way mirror solution, but if it will
 not work now/is not implemented yet, I would probably choose between drbd in
 asynchronous mode or making some kind of incremental snapshot to a remote
 mapped disk (I do not know yet if btrfs supports it) - it means have one
 snapshot and, let's say, a daily incremental update of this snapshot.

You mean like zfs send -i? If yes, why not just use zfs? There's
zfsonlinux project, with easy-to-install ppa for ubuntu. Or you could
compile it manually.


 How would you do it?

If you DO mean zfs-send-like-functionality, then you should ask about
btrfs send and receive, not two way mirror (which is not an
accurate way to describe what you want). Also, send/receive ability
does not mean it can act as two-way mirror. It CAN be an alternative
to drbd async though.

I don't think there's any publicly available code for it yet though.

-- 
Fajar


Re: Re: Two way mirror in BRTFS

2011-12-30 Thread Fajar A. Nugraha
2011/12/30 Jaromir Zdrazil jaromir.zdra...@email.cz:
  Just to add, I would like to see a two way mirror solution, but if it will
 not work now/is not implemented yet, I would probably choose between drbd in
 asynchronous mode or making some kind of incremental snapshot to a remote
 mapped disk (I do not know yet if btrfs supports it) - it means have one
 snapshot and, let's say, a daily incremental update of this snapshot.

 You mean like zfs send -i? If yes, why not just use zfs? There's
 zfsonlinux project, with easy-to-install ppa for ubuntu. Or you could
 compile it manually.

 Thank you for your suggestion. As I know, there is not everything ported yet, 
 and one of the missing important features I plan to use is to crypt fs.

Correct. But btrfs doesn't do encryption either.
And if you're thinking of using luks/dm-crypt to provide encryption
for btrfs, there's nothing preventing you from using the same thing
with zfs.

 And if I am not mistaken, current version does not yet support a mountable 
 filesystem.

You're mistaken :) With some extra work, you can even use it as root:
- http://zfsonlinux.org/example-zpl.html
- 
https://github.com/dajhorn/pkg-zfs/wiki/HOWTO-install-Ubuntu-to-a-Native-ZFS-Root-Filesystem

 
  How would you do it?

 If you DO mean zfs-send-like-functionality, then you should ask about
 btrfs send and receive, not two way mirror (which is not an
 accurate way to describe what you want). Also, send/receive ability
 does not mean it can act as two-way mirror. It CAN be an alternative
 to drbd async though.

 If I understand it correctly, the difference between send and receive and two
 way mirror is that one is synchronous and the other is not (it sends the
 signal that the file has been successfully written after all/one instance has
 been successfully written).
 Maybe you can explain it a bit more.

Two way: A replicates changes to B, and B can replicate its own changes to A
One way: A replicates changes to B, but B cannot replicate its own
changes to A

While drbd only supports synchronous mode for active-active setups,
generic two-way replication does not have to be synchronous. Also,
just because something is synchronous does not automatically mean it
supports two-way replication.

Either way, neither zfs nor the (planned) btrfs send/receive supports
a two-way/active-active setup. Both should (or will) work just fine
for one-way replication.
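For the one-way case, the existing zfs equivalent looks like this (pool, dataset, and host names are made up; the planned btrfs send/receive is expected to be broadly similar):

```shell
# initial full replication
zfs snapshot tank/data@base
zfs send tank/data@base | ssh backuphost zfs receive backup/data

# later: send only the changes between two snapshots
zfs snapshot tank/data@day2
zfs send -i tank/data@base tank/data@day2 | ssh backuphost zfs receive backup/data
```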

-- 
Fajar


Re: fstrim on BTRFS

2011-12-29 Thread Fajar A. Nugraha
On Thu, Dec 29, 2011 at 4:39 PM, Li Zefan l...@cn.fujitsu.com wrote:
 Fajar A. Nugraha wrote:
 On Wed, Dec 28, 2011 at 11:57 PM, Martin Steigerwald
 mar...@lichtvoll.de wrote:
 But BTRFS does not:

 merkaba:~ fstrim -v /
 /: 4431613952 bytes were trimmed
 merkaba:~ fstrim -v /
 /: 4341846016 bytes were trimmed

  and apparently it can't trim everything. Or maybe my kernel is
 just too old.


 $ sudo fstrim -v /
 2258165760 Bytes was trimmed

 $ df -h /
 Filesystem            Size  Used Avail Use% Mounted on
 /dev/sda6              50G   34G   12G  75% /

 $ mount | grep / 
 /dev/sda6 on / type btrfs (rw,noatime,subvolid=258,compress-force=lzo)

 so only about 2G out of 12G can be trimmed. This is on kernel 3.1.4.


 That's because only free spaces in block groups will be trimmed. Btrfs
 allocates space from block groups, and when there's no space availabe,
 it will allocate a new block group from the pool. In your case there's
 ~10G in the pool.

Thanks for your response.


 You can do a btrfs fi df /, and you'll see the total size of existing
 block groups.

$ sudo btrfs fi df /
Data: total=43.47GB, used=31.88GB
System, DUP: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=3.25GB, used=619.88MB
Metadata: total=8.00MB, used=0.00

That should mean existing block groups is at least 46GB, right? In
which case my pool (a 50G partition) should only have about 4GB of
space not allocated to block groups. The numbers don't seem to match.
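One possible reading (my own arithmetic, assuming each DUP block group occupies twice its listed size in raw disk space):

```shell
# Data + 2x Metadata(DUP) + 2x System(DUP) + single System + single Metadata, in GB
awk 'BEGIN { printf "%.2f GB\n", 43.47 + 2*3.25 + 2*0.008 + 0.004 + 0.008 }'
```

which would be essentially the whole 50G partition, i.e. the "missing" space may simply be the second copy of the DUP groups.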


 You can empty the pool by:

        # dd if=/dev/zero of=/mytmpfile bs=1M

 Then release the space (but it won't return back to the pool):

        # rm /mytmpfile
        # sync

Is there a bad side effect of doing so? For example, since all free
space in the pool would be allocated to the data block group, would
that mean my metadata block group is capped at 3.25GB? Or could some
data block groups be converted to metadata, and vice versa?

-- 
Fajar


Re: Compession, on filesystem or volume?

2011-12-29 Thread Fajar A. Nugraha
On Thu, Dec 29, 2011 at 5:51 PM, Remco Hosman re...@hosman.xs4all.nl wrote:
 Hi,

 Something i could not find in the documentation i managed to find:
 if you mount with compress=lzo and rebalance, is compression on for that
 filesystem or only a single volume?

 eg, can i have a @boot volume uncompressed and @ and @home compressed.

Last time I asked a similar question, the answer was no. It's per filesystem.

However, you can change the compression of individual files between
zlib/lzo using btrfs fi defragment -c, regardless of what the
filesystem is currently mounted with.
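For example (the file paths are hypothetical; `-c` takes the target algorithm):

```shell
# rewrite individual files with a specific compression type,
# independent of the filesystem's current compress= mount option
btrfs filesystem defragment -czlib /home/user/logs/huge.log
btrfs filesystem defragment -clzo  /home/user/vm/disk.img
```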

-- 
Fajar
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: fstrim on BTRFS

2011-12-29 Thread Fajar A. Nugraha
On Fri, Dec 30, 2011 at 1:19 PM, Li Zefan l...@cn.fujitsu.com wrote:
 Or would some data
 block group can be converted to metadata, and vice versa?


 This won't happen. Also empty block groups won't be reclaimed, but it's
 in TODO list.

Ah, OK.

6G for metadata out of 50G total seems a bit much, but I can live with
it for now.

Thanks,

Fajar


Re: fstrim on BTRFS

2011-12-28 Thread Fajar A. Nugraha
On Thu, Dec 29, 2011 at 11:37 AM, Roman Mamedov r...@romanrm.ru wrote:
 On Thu, 29 Dec 2011 11:21:14 +0700
 Fajar A. Nugraha l...@fajar.net wrote:

 I'm trying fstrim and my disk is now pegged at write IOPS. Just
 wondering if maybe a btrfs fi balance would be more useful, since:


 Modern controllers (like the SandForce you mentioned) do their own wear 
 leveling 'under the hood', i.e. the same user-visible sectors DO NOT 
 neccessarily map to the same locations on the flash at all times; and 
 introducing 'manual' wear leveling by additional rewriting is not a good 
 idea, it's just going to wear it out more.

I know that modern controllers have their own wear leveling, but AFAIK
they basically:
(1) have reserved a certain size for wear leveling purposes
(2) when a write request comes, they use fresh sectors from the pool
and return the old sectors to it (doing garbage collection like
trim/rewrite in the process)
(3) they can't re-use sectors that are currently being used and not
rewritten (e.g. sectors used by OS files)

If (3) is still valid, then the only way to reuse the sectors is by
forcing a rewrite (e.g. using btrfs fi defrag). So the question is,
is (3) still valid?

-- 
Fajar


Re: fstrim on BTRFS

2011-12-28 Thread Fajar A. Nugraha
On Thu, Dec 29, 2011 at 11:02 AM, Li Zefan l...@cn.fujitsu.com wrote:
 Martin Steigerwald wrote:

 With 3.2-rc4 (probably earlier), Ext4 seems to remember what areas it
 trimmed:

 But BTRFS does not:


 There's no such plan, but it's do-able, and I can take care of it.
 There's an issue though.

 For btrfs this issue can't be solved without disk format change that
 will break older kernels, but only 3.2-rcX kernels will be affected if
 we push the following change into mainline before 3.2 release.

Slightly off-topic: how useful would trim be for btrfs when using a
newer SSD which has its own garbage collection and wear leveling
(e.g. sandforce-based)?

I'm trying fstrim and my disk is now pegged with write IOPS. Just
wondering if maybe a btrfs fi balance would be more useful, since:
- with trim, used space will remain used. Thus future writes will only
utilize space marked as free, making those areas wear faster
- with btrfs fi balance, btrfs will move the data around so that (to
some degree) the currently-unused space becomes used, and currently-used
space becomes free, which should improve wear leveling.

-- 
Fajar


Re: fstrim on BTRFS

2011-12-28 Thread Fajar A. Nugraha
On Wed, Dec 28, 2011 at 11:57 PM, Martin Steigerwald
mar...@lichtvoll.de wrote:
 But BTRFS does not:

 merkaba:~ fstrim -v /
 /: 4431613952 bytes were trimmed
 merkaba:~ fstrim -v /
 /: 4341846016 bytes were trimmed

 and apparently it can't trim everything. Or maybe my kernel is
 just too old.


$ sudo fstrim -v /
2258165760 Bytes was trimmed

$ df -h /
FilesystemSize  Used Avail Use% Mounted on
/dev/sda6  50G   34G   12G  75% /

$ mount | grep / 
/dev/sda6 on / type btrfs (rw,noatime,subvolid=258,compress-force=lzo)

so only about 2G out of 12G can be trimmed. This is on kernel 3.1.4.
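For scale, a quick back-of-envelope with the numbers from the output
above (figures copied from the fstrim and df output; GiB vs. GB
rounding is approximate):

```shell
#!/bin/sh
# Convert the fstrim byte count to GiB and compare it against the free
# space df reported, to see how much of the free space got trimmed.
trimmed_bytes=2258165760
trimmed_gib=$(( trimmed_bytes / 1024 / 1024 / 1024 ))
avail_gib=12                      # df showed 12G available on /
echo "trimmed ${trimmed_gib} GiB of roughly ${avail_gib} GiB free"
```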

-- 
Fajar


Re: fstrim on BTRFS

2011-12-28 Thread Fajar A. Nugraha
On Thu, Dec 29, 2011 at 11:21 AM, Fajar A. Nugraha l...@fajar.net wrote:
 I'm trying fstrim and my disk is now pegged at write IOPS. Just
 wondering if maybe a btrfs fi balance would be more useful,

Sorry, I meant btrfs fi defrag

-- 
Fajar


Re: Extreme slowdown

2011-12-15 Thread Fajar A. Nugraha
On Fri, Dec 16, 2011 at 1:49 AM, Tobias tra...@robotech.de wrote:
 Hi all!

 My BTRFS-FS is getting really slow. Reading is ok, writing is slow and
 deleting is horribly slow.

 There are many files and many links on the FS.

 # btrfs filesystem df /srv/storage
 Data: total=3.09TB, used=3.07TB

this is ... what, over 99% full?
The slowdown is somewhat normal. The same thing happens on zfs, which
becomes slower once usage goes above 80-90%.
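A quick check of the figures from the report above (the 80% mark is a
rule of thumb seen in practice, not a hard limit):

```shell
#!/bin/sh
# Data: total=3.09TB, used=3.07TB from the btrfs fi df output above,
# expressed in GB so the shell's integer arithmetic works.
total_gb=3090
used_gb=3070
pct_full=$(( used_gb * 100 / total_gb ))
echo "data is ${pct_full}% full"
if [ "$pct_full" -ge 80 ]; then
    echo "above the ~80% mark where slowdowns are commonly reported"
fi
```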

 Maybe it's because there is so much Metadata and it needs so many seeks on
 the discs when deleting?

I doubt it.

 I'd like to delete some of the old files but it's so horribly slow that I
 think it's maybe faster to copy all needed data to a different disc, kill
 the FS and move the files back...

 The machine is a QuadCore with 8GB RAM. Kernel is 3.1+for-linus.

 Any hints how i could speed it up?

Try:
- mounting it with nodatacow:
https://btrfs.wiki.kernel.org/articles/f/a/q/FAQ_1fe9.html#Can_copy-on-write_be_turned_off_for_data_blocks.3F
- clobbering a big file:
https://btrfs.wiki.kernel.org/articles/f/a/q/FAQ_1fe9.html#if_your_device_is_large_.28.3E16gb.29

... until you have at least 20% free space available.

-- 
Fajar


Re: What is best practice when partitioning a device that holds one or more btr-filesystems

2011-12-14 Thread Fajar A. Nugraha
On Thu, Dec 15, 2011 at 4:42 AM, Wilfred van Velzen wvvel...@gmail.com wrote:
 On Wed, Dec 14, 2011 at 9:56 PM, Gareth Pye gar...@cerberos.id.au wrote:
 On Thu, Dec 15, 2011 at 5:51 AM, Wilfred van Velzen wvvel...@gmail.com
 wrote:

 (I'm not interested in what early adopter users do when they are using
 rc kernels...)

 Yet you're going to use a FS without a working fsck? That puts you in early
 adopter territory to me.

 Yeah maybe. But I'm still not interested in it regarding partitioning! ;)

I'd just use one big partition. That way all subvolumes can share free
space, making space usage more efficient.

If you decide to go that route, the missing features are quotas and space
accounting. At the moment you can't tell which subvol uses how much,
or limit it. There are (unmerged) patches for that, though.
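A sketch of that layout, assuming a btrfs filesystem is already mounted
at /mnt/pool (the mount point and subvolume names are examples; the
script skips itself when no such mount exists):

```shell
#!/bin/sh
# One big partition, multiple subvolumes all sharing the same free space.
TOP=/mnt/pool   # assumed mount point of the single big btrfs partition
if command -v btrfs >/dev/null 2>&1 && mountpoint -q "$TOP"; then
    btrfs subvolume create "$TOP/home"
    btrfs subvolume create "$TOP/srv"
    btrfs subvolume list "$TOP"   # note: no per-subvol space accounting yet
    status=created
else
    status=skipped
fi
echo "subvolume setup: $status"
```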


 But actually I decided not to use it for the production environment.
 The missing working fsck is one of the reasons.
 Although opensuse supports it and Suse Linux Enterprise Server 11 is
 going to support it with their next SP release in February, and Fedora
 might use it as default in their next release... Did I miss any?

Oracle linux :D

 But I'm going to use it at home and probably in some test environments rsn... 
 ;)

If you're keeping your options open, try zfsonlinux as well. It might
be better suited for certain needs.

-- 
Fajar


Re: btrfs encryption problems

2011-12-12 Thread Fajar A. Nugraha
On Thu, Dec 1, 2011 at 5:15 AM, 810d4rk 810d...@gmail.com wrote:
 I plugged it directly by sata and this is what I get from the 3.1 kernel:

 [  581.921417]  sdb: sdb1
 [  581.921642] sd 2:0:0:0: [sdb] Attached SCSI disk
 [  660.040263] EXT4-fs (dm-4): VFS: Can't find ext4 filesystem

... and then what? Did you try decrypting and mounting it?

-- 
Fajar


Re: Filesystem acting up during balance

2011-12-08 Thread Fajar A. Nugraha
2011/12/9 Ricardo Bánffy rban...@gmail.com:
 Dec  9 01:06:21 adams kernel: [  207.912535] usb 1-2.1: reset high
 speed USB device number 7 using ehci_hcd

That's usually a REALLY bad sign.

If you can remove the drive from the USB enclosure, I suggest you plug
it into an onboard SATA port. That way at least you won't have to deal
with USB resets/retries. After that, I'd try copying the data off the
disk first (with dd_rescue). You'd need a big enough disk for this.

Only after that would I try mounting the copy.
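A sketch of that rescue flow. The email mentions dd_rescue; the syntax
below is GNU ddrescue's, which is an assumption on my part, and the
device and paths are placeholders. The script does nothing unless the
tool and device actually exist:

```shell
#!/bin/sh
# Image the failing disk first; work only on the copy afterwards.
SRC_DEV=/dev/sdX                  # placeholder: the failing disk on SATA
IMG=/mnt/backup/disk.img          # must live on a big enough disk
MAP=/mnt/backup/disk.map          # mapfile lets ddrescue resume later
if command -v ddrescue >/dev/null 2>&1 && [ -b "$SRC_DEV" ]; then
    ddrescue -n "$SRC_DEV" "$IMG" "$MAP"    # first pass: skip bad areas
    ddrescue -r3 "$SRC_DEV" "$IMG" "$MAP"   # then retry bad areas 3 times
    # only then try mounting the copy: mount -o loop,ro "$IMG" /mnt/rescue
    status=imaged
else
    status=skipped
fi
echo "rescue: $status"
```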

-- 
Fajar


Re: btrfs errors

2011-12-02 Thread Fajar A. Nugraha
On Fri, Dec 2, 2011 at 7:34 PM, Mike Thomas bt...@thomii.com wrote:
 Hi, I've been using btrfs for a while now, I've been utilizing snapshotting
 nightly/weekly/monthly.  During the weekly I also do a backup of the
 filesystem to an ext4 filesystem.  My storage is a linux md raid 5 volume.
 I've recently noticed these errors in the logs during the backup of the
 files to the ext4 filesystem.  I am running RHEL 6.1
 (2.6.32-131.0.15.el6.x86_64)

that's like dinosaur age, in btrfs terms :P


 I'd like to help in any way I can so I thought I'd post here to see if there
 is anything I can do to help.

I'd try compiling the latest 3.2-rc kernel, or Chris' for-linus tree
(http://git.kernel.org/?p=linux/kernel/git/mason/linux-btrfs.git;a=shortlog;h=refs/heads/for-linus)
which is based on 3.1, and see if the problem goes away.

-- 
Fajar


Re: btrfs/git question.

2011-11-29 Thread Fajar A. Nugraha
On Tue, Nov 29, 2011 at 10:22 PM, Chris Mason chris.ma...@oracle.com wrote:
 On Tue, Nov 29, 2011 at 09:33:37AM +0700, Fajar A. Nugraha wrote:
 On Tue, Nov 29, 2011 at 8:58 AM, Phillip Susi ps...@cfl.rr.com wrote:
  On 11/28/2011 12:53 PM, Ken D'Ambrosio wrote:
  Seems I've picked up a wireless regression, and randomly drop my WiFi
  connection with more recent kernels.  While I'd love to try to track down 
  the
  issue, the sporadic nature makes it difficult.  But I don't want to 
  revert to a
  flat-out old kernel because of all the btrfs modifications.  Is it 
  possible
  using git to add *just* btrfs patches to an older kernel?
 
  Sure: use git rebase to apply the patches to the older kernel.

 ... or use 3.1.2, and get ONLY fs/btrfs from Chris' for-linus tree,
 compile it out-of-tree, and use it to replace the original btrfs.ko.

 If you're on a 3.1 kernel, you can pull my for-linus directly on top of
 it with git pull.  I always keep a btrfs tree against the previous
 kernel so that people can use the latest btrfs goodness without having
 to use an rc kernel.

Yes, thanks for that.

My suggestion is simply an alternative (instead of git pull) for people who:
- aren't quite familiar with git, but know enough to grab a directory
snapshot from gitweb (e.g.
http://git.kernel.org/?p=linux/kernel/git/mason/linux-btrfs.git;a=tree;f=fs/btrfs;h=5f51bd7e3b8b6c4825681408450e6580bdbccce1;hb=refs/heads/for-linus)
- know how to build a module out-of-tree
- are on the latest stable kernel, but don't want to re-compile the
whole kernel just to get the btrfs fixes

-- 
Fajar


Re: btrfs/git question.

2011-11-28 Thread Fajar A. Nugraha
On Tue, Nov 29, 2011 at 8:58 AM, Phillip Susi ps...@cfl.rr.com wrote:
 On 11/28/2011 12:53 PM, Ken D'Ambrosio wrote:
 Seems I've picked up a wireless regression, and randomly drop my WiFi
 connection with more recent kernels.  While I'd love to try to track down the
 issue, the sporadic nature makes it difficult.  But I don't want to revert 
 to a
 flat-out old kernel because of all the btrfs modifications.  Is it possible
 using git to add *just* btrfs patches to an older kernel?

 Sure: use git rebase to apply the patches to the older kernel.

... or use 3.1.2, and get ONLY fs/btrfs from Chris' for-linus tree,
compile it out-of-tree, and use it to replace the original btrfs.ko.

There used to be this:
https://btrfs.wiki.kernel.org/articles/b/t/r/Btrfs_source_repositories.html#Building_latest_btrfs_against_a_recent_kernel_with_DKMS

But personally I find it much easier to just compile it manually without dkms:
make -C /lib/modules/`uname -r`/build M=$(pwd) modules

-- 
Fajar


Re: btrfs and load (sys)

2011-11-23 Thread Fajar A. Nugraha
On Thu, Nov 24, 2011 at 8:00 AM, Chris Samuel ch...@csamuel.org wrote:
 Another possibility I *think* is that you could try 3.1 with
 Chris Mason's for-linus git branch pulled into it.  Hopefully
 someone who knows the procedure better than I can correct me
 on this! :-)

My method is:
- use 3.1.1 (latest stable at the time I was compiling it), compile
btrfs as a module
- use only the fs/btrfs directory from Chris' for-linus tree, compile the
module externally (make -C /lib/modules/`uname -r`/build M=$(pwd) modules)
- put the resulting btrfs.ko in /lib/modules/`uname -r`/updates/manual/
- depmod -a
- verify the new module is selected by default (modinfo btrfs)
- rebuild the initrd
- reboot
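Scripted, those steps look roughly like this. It's an untested sketch:
it assumes the current directory is a checkout of the for-linus tree,
and it bails out when the source tree or kernel headers are missing:

```shell
#!/bin/sh
# Build fs/btrfs out-of-tree against the running kernel and install it
# where depmod will prefer it over the in-tree btrfs module.
KVER=$(uname -r)
SRC=fs/btrfs                       # assumed: inside a for-linus checkout
if [ -d "$SRC" ] && [ -d "/lib/modules/$KVER/build" ]; then
    make -C "/lib/modules/$KVER/build" M="$(pwd)/$SRC" modules
    install -D "$SRC/btrfs.ko" "/lib/modules/$KVER/updates/manual/btrfs.ko"
    depmod -a
    modinfo btrfs | head -n 2      # check the new path is the one picked
    status=installed               # then rebuild the initrd and reboot
else
    status=skipped
fi
echo "module build: $status"
```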

-- 
Fajar


Re: fsck with err is 1

2011-11-22 Thread Fajar A. Nugraha
On Wed, Nov 23, 2011 at 12:33 PM, Blair Zajac bl...@orcaware.com wrote:
 Hello,

 I'm trying btrfs in a VirtualBox VM running Ubuntu 11.10 with kernel 3.0.0.  
 Running fsck I get a message with err is 1.

 Does this mean there's an error?  Is err either always 0 or 1, or does err 
 increment beyond 1?

I can't answer that, but I can tell you that fsck for btrfs right now
is almost useless. It can't fix anything.

Short summary: if you can mount the fs, can access the data, and don't
see any weird messages in syslog, then it's most likely OK.

-- 
Fajar

