Re: recommendations and contraindications of using btrfs for Oracle Database Server

2018-01-11 Thread Fajar A. Nugraha
On Thu, Jan 11, 2018 at 6:23 PM, Nikolay Borisov  wrote:
>
>
> On 11.01.2018 12:51, Ext-Strii-Houttemane Philippe wrote:
>> Hello,
>>
>>We are using btrfs filesystem on local disks (RAID 1) as underlying 
>> filesystem to host our Oracle 12c datafiles.
>> This allows us to cold-backup databases via snapshots in a few seconds and 
>> benefit from higher performance than other Linux filesystems.
>> This is the problem we meet: Oracle regularly crashes with errors of these two 
>> types; the errors occur on different physical machines running the same software:
>>
>> ORA-63999: data file suffered media failure
>> ORA-01114: IO error writing block to file 99 (block # 99968)
>> ORA-01110: data file 99: '/oradata/PS92PRD/data/pcapp.dbf'
>> ORA-27072: File I/O error
>> Linux-x86_64 Error: 17: File exists



>> Mount options: defaults,nofail,nodatacow,nobarrier,noatime
>>
>> uname -a:
>> Linux 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 
>> x86_64 x86_64 GNU/Linux
>


> You are using a vendor-specific kernel. It's best if you turn to them
> for support since it's very likely their code doesn't match what is in
> upstream, let alone the fact you are using an ancient kernel.


3.10 is the Red Hat-compatible kernel. They shipped btrfs as a tech preview,
and will deprecate it after 7.4, so it's probably useless to ask Red Hat
for support on that.

@Philippe, you might want to try Oracle's UEK R4. At least they're
still promoting btrfs improvements in their kernel, and you might be
able to ask them in case of problems (assuming you have a proper
subscription & support). However, IIRC Oracle only supports btrfs for
its application binaries, and not for the data files (again, you should
contact Oracle support to be sure).

If you simply want to use the latest kernel (with the latest btrfs fixes)
to see if your problem still occurs, and don't care about support, you
can try kernel-ml (http://elrepo.org/tiki/kernel-ml) or compile your
own.
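For completeness, installing kernel-ml on CentOS 7 looks roughly like this (a sketch; it assumes the elrepo release RPM is already installed, and you should verify the current instructions on elrepo.org first):

```shell
# Install the mainline kernel from the elrepo-kernel repository
yum --enablerepo=elrepo-kernel install kernel-ml
# Reboot into the new kernel, then confirm the running version
uname -r
```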

-- 
Fajar
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-02 Thread Fajar A. Nugraha
On Thu, Aug 3, 2017 at 1:44 AM, Chris Mason  wrote:
>
> On 08/02/2017 04:38 AM, Brendan Hide wrote:
>>
>> The title seems alarmist to me - and I suspect it is going to be 
>> misconstrued. :-/
>
>
> Supporting any filesystem is a huge amount of work.  I don't have a problem 
> with Redhat or any distro picking and choosing the projects they want to 
> support.
>

It'd help a lot of people if things like
https://btrfs.wiki.kernel.org/index.php/Status were kept up-to-date and
'promoted', so at least users would be better informed about what they're
getting into and could choose which features (stable / still in dev / likely
to destroy your data) they want to use.

For example, https://btrfs.wiki.kernel.org/index.php/Status says
compression is 'mostly OK' ('auto-repair and compression may crash'
looks pretty scary, as from a newcomer's perspective it might be
interpreted as 'potential data loss'), while
https://en.opensuse.org/SDB:BTRFS#Compressed_btrfs_filesystems says
they support compression on newer openSUSE versions.


>
> At least inside of FB, our own internal btrfs usage is continuing to grow.  
> Btrfs is becoming a big part of how we ship containers and other workloads 
> where snapshots improve performance.
>

Ubuntu also supports btrfs as part of its container implementation
(lxd), and (reading the lxd mailing list) some people use lxd+btrfs in
their production environments. IIRC the last btrfs problem posted on the
lxd list was that 'btrfs send/receive (used by lxd copy) is slower than
rsync for a full/initial copy'.

-- 
Fajar


Re: BTRFS, remarkable problem: filesystem turns to read-only caused by firefox download

2016-06-14 Thread Fajar A. Nugraha
On Wed, Jun 15, 2016 at 1:29 PM, Paul Verreth  wrote:
> Dear all.
>
> When I download a video using the Firefox DownloadHelper addon, the
> filesystem suddenly turns read-only. Not a coincidence: I tried it
> several times, and it happened every time.
>
> Info:
> Linux wolfgang 4.2.0-35-generic #40-Ubuntu SMP Tue Mar 15 22:15:45 UTC
> 2016 x86_64 x86_64 x86_64 GNU/Linux

> Segmentation fault
>
> Jun  5 15:03:15 ubuntu kernel: [ 2062.544303] BTRFS info (device
> sdb5): relocating block group 383447465984 flags 17


> What can I do to repair this problem?

The usual starting advice would be "try with the latest kernel and see if
you can still reproduce the problem". Is this Ubuntu wily? It goes end
of life in July anyway, so you might want to upgrade to xenial (or at
least just the kernel, for the purpose of troubleshooting your
problem).

Or even try http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/
(should be usable, but might report some errors/warnings due to missing
ubuntu patches).

-- 
Fajar


Re: btrfs subvolume clone or fork (btrfs-progs feature request)

2015-07-08 Thread Fajar A. Nugraha
On Thu, Jul 9, 2015 at 8:20 AM, james harvey  wrote:
> Request for new btrfs subvolume subcommand:
>
> clone or fork [-i  []
>Create a subvolume  in , which is a clone or fork of source.
>If  is not given, subvolume  will be created in the
> current directory.
>Options
>-i 
>   Add the newly created subvolume to a qgroup.  This option can be
> given multiple times.
>
> Would (I think):
> * btrfs subvolume create 
> * cp -ax --reflink=always /* /

What's wrong with "btrfs subvolume snapshot"?
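To illustrate, the existing commands already cover the proposed use case; a minimal sketch (paths are hypothetical):

```shell
# A writable snapshot is effectively a clone: it shares all extents with the source
btrfs subvolume snapshot /mnt/pool/src /mnt/pool/dst

# The request's manual equivalent: new empty subvolume + reflink copy
btrfs subvolume create /mnt/pool/dst2
cp -ax --reflink=always /mnt/pool/src/. /mnt/pool/dst2/
```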

-- 
Fajar


Re: CoW with webserver databases: innodb_file_per_table and dedicated tables for blobs?

2015-06-16 Thread Fajar A. Nugraha
On Tue, Jun 16, 2015 at 2:06 PM, Ingvar Bogdahn
 wrote:
> Hi again,
>
> Benchmarking over time seems a good idea, but what if I see that a
> particular database does indeed degrade in performance? How can I then
> selectively improve performance for that file, since disabling cow only
> works for new empty files?
>

You might be overcomplicating things.

> Is it correct that bundling small random writes into groups of writes
> reduces fragmentation? If so, some form of write-caching should help? I'm
> still investigating, but one solution might be:
> 1) identify which exact tables do have frequent writes
> 2) decrease the system-wide write caching (vm.dirty_background_ratio and
> vm.dirty_ratio) to lower levels, because this wastes lots of RAM by
> indiscriminately caching writes from the whole system, and tends to cause
> spikes where suddenly the entire cache gets written to disk and blocks the
> system. Rather, use that RAM selectively to cache only the critical files.

IIRC innodb uses O_DIRECT by default, which should bypass the fs cache, so
the above should be irrelevant.


> 4) create a software RAID-1 made up of a ramdisk and a mounted image, using
> mdadm.
> 5) Setting up mdadm using rather large value for "write-behind="
> 6) put only those tables on that disk-backed ramdisk which do have frequent
> writes.
>

RAID1 writes everything to both devices, so your write performance would
still be limited by the disk.
As for reads, instead of using a ramdisk for half of the md array, I would
just use that amount of RAM for innodb_buffer_pool.


> What do you think?


I would say "determine your priorities".

If you absolutely need btrfs + innodb, then I would:
- increase innodb_buffer_pool
- not mess with nocow; leave it as is
- not mess with autodefrag
- enable compression on btrfs
- use the latest known-good kernel (AFAIK 4.0.5 should be good)
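A sketch of what that setup might look like (device, mountpoint, and sizes below are placeholders, not recommendations):

```shell
# Mount the database volume with btrfs compression enabled
mount -o compress=lzo /dev/sdX /var/lib/mysql
# In my.cnf, grow the InnoDB buffer pool to fit your working set, e.g.:
#   [mysqld]
#   innodb_buffer_pool_size = 8G
```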

If you absolutely must have high performance with innodb, then I would
look at using a raw block device directly for innodb. You'd lose all
btrfs features of course (e.g. snapshots), but that's the tradeoff for
performance.

If you don't HAVE to use innodb but still want to use btrfs, then I
would use the tokudb engine instead (available in tokudb's mysql fork and
mariadb >= 10), with compression handled by tokudb (disable
compression in btrfs). tokudb doesn't support foreign key constraints, but
other than that it should be able to replace innodb for your purposes.
Among other things, tokudb uses a larger block size (4MB), which should
help reduce fragmentation compared to innodb.

If you don't HAVE to use either btrfs or innodb, but just want "a mysql
db that supports transactions with an fs that supports
snapshot/clone", then I would use zfs + tokudb. And read
http://blog.delphix.com/matt/2014/06/06/zfs-stripe-width/ (with the
exception that compression should be enabled in tokudb instead of zfs).

-- 
Fajar

>
> Ingvar
>
>
>
> Am 15.06.15 um 11:57 schrieb Hugo Mills:
>
>> On Mon, Jun 15, 2015 at 11:34:35AM +0200, Ingvar Bogdahn wrote:
>>>
>>> Hello there,
>>>
>>> I'm planing to use btrfs for a medium-sized webserver. It is
>>> commonly recommended to set nodatacow for database files to avoid
>>> performance degradation. However, apparently nodatacow disables some
>>> of my main motivations of using btrfs : checksumming and (probably)
>>> incremental backups with send/receive (please correct me if I'm
>>> wrong on this). Also, the databases are among the most important
>>> data on my webserver, so it is particularly there that I would like
>>> those feature working.
>>>
>>> My question is, are there strategies to avoid nodatacow of databases
>>> that are suitable and safe in a production server?
>>> I thought about the following:
>>> - in mysql/mariadb: setting "innodb_file_per_table" should avoid
>>> having few very big database files.
>>
>> It's not so much about the overall size of the files, but about the
>> write patterns, so this probably won't be useful.
>>
>>> - in mysql/mariadb: adapting database schema to store blobs into
>>> dedicated tables.
>>
>> Probably not an issue -- each BLOB is (likely) to be written in a
>> single unit, which won't cause the fragmentation problems.
>>
>>> - btrfs: set autodefrag or some cron job to regularly defrag only
>>> database fails to avoid performance degradation due to fragmentation
>>
>> Autodefrag is a good idea, and I would suggest trying that first,
>> before anything else, to see if it gives you good enough performance
>> over time.
>>
>> Running an explicit defrag will break any CoW copies you have (like
>> snapshots), causing them to take up additional space. For example,
>> start with a 10 GB subvolume. Snapshot it, and you will still only
>> have 10 GB of disk usage. Defrag one (or both) copies, and you'll
>> suddenly be using 20 GB.
>>
>>> - turn on compression on either btrfs or mariadb
>>
>> Again, won't help. The issue is not the size of the data, it's the
>> write patterns: small random writes into the middle of existing files
>> w

Re: should I use btrfs on Centos 7 for a new production server?

2014-12-30 Thread Fajar A. Nugraha
On Wed, Dec 31, 2014 at 1:04 PM, Eric Sandeen  wrote:
> On 12/30/14 10:06 PM, Wang Shilong wrote:
>>> I used CentOS7 btrfs myself, just doing some tests..it crashed easily.
>>> I don’t know how much efforts that Redhat do on btrfs for 7 series.
>>
>> Maybe use SUSE enterprise for btrfs will be a better choice, they offered
>> better support for btrfs as far as i know.
>
> I believe SuSE's most recent support statement on btrfs is here, I think.
>
> https://www.suse.com/releasenotes/x86_64/SUSE-SLES/12/#fate-317221

Wow. SUSE uses btrfs for root by default, but actively prevents users
from using compression (unless specifically overridden using a module
parameter)?

Weird, since IIRC compression has been around and stable for a long time.

-- 
Fajar


Re: Kernel panic / Ubuntu 12.04.4

2014-07-01 Thread Fajar A. Nugraha
On Tue, Jul 1, 2014 at 2:44 PM, Tomasz Torcz  wrote:
>
> On Tue, Jul 01, 2014 at 08:45:59AM +0200, Tor Houghton wrote:
> > Well, I probably should have looked at the logs first and not tried to 
> > delete
> > some old data, but as the command (rm -rf) hung, I got suspicious:
> >
> > Jun 30 23:51:06 moonshade kernel: [ 1440.454721] btrfs: relocating block 
> > group 2939934474240 flags 1
> > Jun 30 23:51:07 moonshade kernel: [ 1440.637248] btrfs: relocating block 
> > group 2999510827008 flags 36
> > Jun 30 23:51:07 moonshade kernel: [ 1440.897153] btrfs: relocating block 
> > group 2997363343360 flags 36
> > Jun 30 23:51:07 moonshade kernel: [ 1441.147110] btrfs: relocating block 
> > group 2995215859712 flags 36
>
>   Looks like balance running?
>
> > Jul  1 01:31:47 moonshade kernel: [ 7480.673992] Pid: 17884, comm: btrfs 
> > Not tainted 3.5.0-40-generic #62~precise1-Ubuntu Dell Inc. 
> > Dell DM051   /0HJ054
>
>   Kernel 3.5 is extremely old and lacks fixes from 11 kernel releases done
> afterwards.  Please contact your vendor (Canonical?) for support.
>


ubuntu precise can use linux-generic-lts-trusty, which brings kernel 3.13.
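A minimal sketch, assuming a standard precise install:

```shell
# On Ubuntu 12.04 (precise), pull in the trusty HWE kernel (3.13)
apt-get update
apt-get install linux-generic-lts-trusty
# Reboot, then confirm the running kernel
uname -r
```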

-- 
Fajar


Re: latest btrfs-progs and asciidoc dependency

2014-06-05 Thread Fajar A. Nugraha
On Thu, Jun 5, 2014 at 9:41 PM, Marc MERLIN  wrote:
> On Thu, Jun 05, 2014 at 12:52:04PM +0100, Tomasz Chmielewski wrote:
>> And it looks the dependency is ~1 GB of new packages? O_o
>
> That seems painful, but at the same time, the alternative, nroff/troff sucks.
>
> Part of your problem, however, seems to be runaway dependencies.
> You are getting x11 and stuff like libdrm which clearly you shouldn't need.
> If your disk space is more valuable than your time, I recommend you build
> asciidoc yourself and you should hopefully end up with less.
>
> Or you can also remove asciidoc from the makefile and read the raw files
> which are readable.


... or try this

# apt-get install --no-install-recommends asciidoc

If that still doesn't work, AND you have lots of free time, AND are
familiar with debian packaging, then you can take the latest available
debian source, adapt it to the latest version, and use the openSUSE Build
Service to compile it.

-- 
Fajar


Re: Very slow filesystem

2014-06-04 Thread Fajar A. Nugraha
(resending to the list as plain text, the original reply was rejected
due to HTML format)

On Thu, Jun 5, 2014 at 10:05 AM, Duncan <1i5t5.dun...@cox.net> wrote:
>
> Igor M posted on Thu, 05 Jun 2014 00:15:31 +0200 as excerpted:
>
> > Why does btrfs become EXTREMELY slow after some time (months) of usage?
> > This is now happened second time, first time I though it was hard drive
> > fault, but now drive seems ok.
> > Filesystem is mounted with compress-force=lzo and is used for MySQL
> > databases, files are mostly big 2G-8G.
>
> That's the problem right there, database access pattern on files over 1
> GiB in size, but the problem along with the fix has been repeated over
> and over and over and over... again on this list, and it's covered on the
> btrfs wiki as well

Which part of the wiki? It's not on
https://btrfs.wiki.kernel.org/index.php/FAQ or
https://btrfs.wiki.kernel.org/index.php/UseCases

> so I guess you haven't checked existing answers
> before you asked the same question yet again.
>
> Never-the-less, here's the basic answer yet again...
>
> Btrfs, like all copy-on-write (COW) filesystems, has a tough time with a
> particular file rewrite pattern, that being frequently changed and
> rewritten data internal to an existing file (as opposed to appended to
> it, like a log file).  In the normal case, such an internal-rewrite
> pattern triggers copies of the rewritten blocks every time they change,
> *HIGHLY* fragmenting this type of files after only a relatively short
> period.  While compression changes things up a bit (filefrag doesn't know
> how to deal with it yet and its report isn't reliable), it's not unusual
> to see people with several-gig files with this sort of write pattern on
> btrfs without compression find filefrag reporting literally hundreds of
> thousands of extents!
>
> For smaller files with this access pattern (think firefox/thunderbird
> sqlite database files and the like), typically up to a few hundred MiB or
> so, btrfs' autodefrag mount option works reasonably well, as when it sees
> a file fragmenting due to rewrite, it'll queue up that file for
> background defrag via sequential copy, deleting the old fragmented copy
> after the defrag is done.
>
> For larger files (say a gig plus) with this access pattern, typically
> larger database files as well as VM images, autodefrag doesn't scale so
> well, as the whole file must be rewritten each time, and at that size the
> changes can come faster than the file can be rewritten.  So a different
> solution must be used for them.


If COW and rewrites are the main issue, why doesn't zfs experience the
extreme slowdown (that is, as long as you have sufficient free space
available, like 20% or so)?

-- 
Fajar


Re: Very slow filesystem

2014-06-04 Thread Fajar A. Nugraha
On Thu, Jun 5, 2014 at 5:15 AM, Igor M  wrote:
> Hello,
>
> Why does btrfs become EXTREMELY slow after some time (months) of usage?

> # btrfs fi show
> Label: none  uuid: b367812a-b91a-4fb2-a839-a3a153312eba
> Total devices 1 FS bytes used 2.36TiB
> devid1 size 2.73TiB used 2.38TiB path /dev/sde

> # btrfs fi df /mnt/old
> Data, single: total=2.36TiB, used=2.35TiB

Is that the fs that is slow?

It's almost full. Most filesystems exhibit really bad performance
when close to full due to fragmentation issues (thresholds vary, but
80-90% full usually means you need to start adding space).
You should free up some space (e.g. add a new disk so it becomes
multi-device, or delete some files) and rebalance/defrag.
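As a sketch (the device name is hypothetical; the mountpoint is from the report above):

```shell
# Grow the filesystem by adding a second device...
btrfs device add /dev/sdf /mnt/old
# ...then rebalance so data and free space spread across both devices
btrfs balance start /mnt/old
# Optionally defragment existing files as well
btrfs filesystem defragment -r /mnt/old
```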

-- 
Fajar


Re: Convert btrfs software code to ASIC

2014-05-19 Thread Fajar A. Nugraha
On Mon, May 19, 2014 at 8:09 PM, Le Nguyen Tran  wrote:
> I now need to understand the operation of btrfs source code to
> determine. I hope that one of you can help me


Have you read the wiki link?

-- 
Fajar


Re: Convert btrfs software code to ASIC

2014-05-19 Thread Fajar A. Nugraha
On Mon, May 19, 2014 at 3:40 PM, Le Nguyen Tran  wrote:
> Hi,
>
> I am Nguyen. I am not a software development engineer but an IC (chip)
> development engineer. I have a plan to develop an IC controller for
> Network Attached Storage (NAS). The main idea is converting software
> code into hardware implementation. Because the chip is customized for
> NAS, its performance is high, and its cost is lower than using micro
> processor like Atom or Xeon (for servers).
>
> I plan to use btrfs as the file system specification for my NAS. The
> main point is that I need to understand the btrfs sofware code in
> order to convert them into hardware implementation. I am wondering if
> any of you can help me. If we can make the chip in a good shape, we
> can start up a company and have our own business.

I'm not sure that's a good idea.

AFAIK btrfs depends a lot on other linux subsystems (e.g. vfs, block,
etc). Rather than converting/reimplementing everything, if your aim is
lower cost, you might have an easier time using something like a MediaTek
SoC (the ones used in smartphones) and running a custom-built Linux with
btrfs support on it.

For documentation,
https://btrfs.wiki.kernel.org/index.php/Main_Page#Developer_documentation
is probably the best place to start

-- 
Fajar


Re: Which companies contribute to Btrfs?

2014-04-24 Thread Fajar A. Nugraha
On Thu, Apr 24, 2014 at 6:39 PM, David Sterba  wrote:
> On Wed, Apr 23, 2014 at 06:18:34PM -0700, Marc MERLIN wrote:
>> I writing slides about btrfs for an upcoming talk (at linuxcon) and I was
>> trying to gather a list of companies that contribute code to btrfs.
>
> https://btrfs.wiki.kernel.org/index.php/Main_Page
>
> "[...] Jointly developed at Oracle, Red Hat, Fujitsu, Intel, SUSE, STRATO 
> [...]"
>
>> Are there other companies I missed?


The page now says "... Jointly developed at Facebook, Oracle, Red Hat  " :D

-- 
Fajar


Re: Can anyone boot a system using btrfs root with linux 3.14 or newer?

2014-04-23 Thread Fajar A. Nugraha
On Thu, Apr 24, 2014 at 10:23 AM, Chris Murphy  wrote:
>
> It sounds like either a grub.cfg misconfiguration, or a failure to correctly 
> build the initrd/initramfs. So I'd post the grub.cfg kernel command line for 
> the boot entry that works and the entry that fails, for comparison.
>
> And then also check and see if whatever utility builds your initrd has been 
> upgraded along with your kernel, maybe there's a bug/regression.
>

I believe the OP mentioned that he's using a distro without an initrd,
and that all required modules are built in.

-- 
Fajar


>
> Chris Murphy


Re: btrfs and ECC RAM

2014-01-20 Thread Fajar A. Nugraha
On Mon, Jan 20, 2014 at 10:13 AM, Austin S Hemmelgarn
 wrote:
>
> AFAIK, ZFS does background data scrubbing without user intervention


No, it doesn't.

> BTRFS however works differently, it only scrubs data when you tell it
> to.  If it encounters a checksum or read error on a data block, it
> first tries to find another copy of that block elsewhere (usually on
> another disk), if it still sees a wrong checksum there, or gets
> another read error, or can't find another copy, then it returns a read
> error to userspace,


zfs does the same thing.

-- 
Fajar


Re: drawbacks of non-ECC RAM

2014-01-17 Thread Fajar A. Nugraha
On Sat, Jan 18, 2014 at 1:33 AM, valleysmail-l...@yahoo.de
 wrote:
>
>
>
> I'd like to know if there are drawbacks in using btrfs with non-ECC RAM 
> instead of using ext4 with non-ECC RAM.

Non-ECC RAM can cause problems no matter what fs you use.

> I know that some features of btrfs may rely on ECC RAM but is the chance of 
> data corruption or even a damaged filesystem higher than when i use ext4 
> instead of btrfs?

Not really.

In the past the occurrence of corrupted-btrfs reports on this list
(regardless of RAM) was somewhat high, but I don't see many of them in
recent versions.

> I want to know this because i would like to use the snapshot feature of btrfs 
> and ext4 does not support that. I will not use btrfs for fixing silent data 
> corruption nor for using RAID-like features or encryption. ZFS however checks 
> files in the background (even if I don't want it to)

zfs does not "check files in the background" by default. When
checksumming is enabled (the default), zfs only checks file integrity
when you access the file, and when you run the "scrub" command. It
does not run background scrubs automatically.

AFAIK btrfs behaves the same way.

> and if it thinks there is an error it will fix it and I cannot disable this 
> feature. So errors in RAM may corrupt my files or even more.

You can disable checksums on both btrfs and zfs. See
https://btrfs.wiki.kernel.org/index.php/FAQ#Can_data_checksumming_be_turned_off.3F
,  https://btrfs.wiki.kernel.org/index.php/Mount_options
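For illustration (device and dataset names are placeholders, and disabling checksums is rarely a good idea):

```shell
# btrfs: disable data checksums for newly written data on this mount
mount -o nodatasum /dev/sdX /mnt
# zfs: disable checksums per dataset
zfs set checksum=off pool/dataset
```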

-- 
Fajar


Re: Two identical copies of an image mounted result in changes to both images if only one is modified

2013-06-20 Thread Fajar A. Nugraha
On Thu, Jun 20, 2013 at 3:47 PM, Clemens Eisserer  wrote:
> Hi,
>
> I've observed a rather strange behaviour while trying to mount two
> identical copies of the same image to different mount points.
> Each modification to one image is also performed in the second one.
>
> Example:
> dd if=/dev/sda? of=image1 bs=1M
> cp image1 image2
> mount -o loop image1 m1
> mount -o loop image2 m2
>
> touch m2/hello
> ls -la m1  // will now also include a file called "hello"

What do you get if you unmount BOTH m1 and m2, and THEN mount m1
again? Is the file still there?

>
> Is this behaviour intentional and known or should I create a bug-report?
> I've deleted quite a bunch of files on my production system because of this...

I'm pretty sure this is a known behavior in btrfs.
http://markmail.org/message/i522sdkrhlxhw757#query:+page:1+mid:ksdi5d4v26eqgxpi+state:results
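The root cause is that btrfs identifies filesystems by UUID, so two byte-identical images are treated as the same filesystem. A sketch of the workaround on newer btrfs-progs (paths are hypothetical):

```shell
cp image1 image2
# Give the copy a fresh filesystem UUID *before* mounting it,
# so the kernel no longer treats the two images as one filesystem
btrfstune -u image2
mount -o loop image1 m1
mount -o loop image2 m2
```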

-- 
Fajar


Re: lvm volume like support

2013-02-26 Thread Fajar A. Nugraha
On Tue, Feb 26, 2013 at 9:30 PM, Martin Steigerwald  wrote:
> Am Dienstag, 26. Februar 2013 schrieb Fajar A. Nugraha:
>> On Tue, Feb 26, 2013 at 11:59 AM, Mike Fleetwood
>>
>>  wrote:
>> > On 25 February 2013 23:35, Suman C  wrote:
>> >> Hi,
>> >>
>> >> I think it would be great if there is a lvm volume or zfs zvol type
>> >> support in btrfs.
>> >
>> > Btrfs already has capabilities to add and remove block devices on the
>> > fly.  Data can be striped or mirrored or both.  Raid 5/6 is in
>> > testing at the moment.
>> > https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devic
>> > es https://btrfs.wiki.kernel.org/index.php/UseCases#RAID
>> >
>> > Which specific features do you think btrfs is lacking?
>>
>> I think he's talking about zvol-like feature.
>>
>> In zfs, instead of creating a
>> filesystem-that-is-accessible-as-a-directory, you can create a zvol
>> which behaves just like any other standard block device (e.g. you can
>> use it as swap, or create ext4 filesystem on top of it). But it would
>> also have most of the benefits that a normal zfs filesystem has, like:
>> - thin provisioning (sparse allocation, snapshot & clone)
>> - compression
>> - integrity check (via checksum)
>>
>> Typical use cases would be:
>> - swap in a pure-zfs system
>> - virtualization (xen, kvm, etc)
>> - NAS which exports the block device using iscsi/AoE
>>
>> AFAIK no such feature exists in btrfs yet.
>
> Sounds like the RADOS block device stuff for Ceph.

Exactly.

While using files + a loopback device mostly works, there were problems
with performance and data integrity. Not to mention the hassle of
accessing the data if it resides in a partition inside the file (e.g.
you need losetup + kpartx to access it, and you must remember to do
the reverse when you're finished with it).

In zfsonlinux it's very easy, since a zvol is treated pretty much like
a disk, and whenever there's a partition inside a zvol, a corresponding
device node is also created automatically.

--
Fajar


Re: lvm volume like support

2013-02-25 Thread Fajar A. Nugraha
On Tue, Feb 26, 2013 at 11:59 AM, Mike Fleetwood
 wrote:
> On 25 February 2013 23:35, Suman C  wrote:
>> Hi,
>>
>> I think it would be great if there is a lvm volume or zfs zvol type
>> support in btrfs.


> Btrfs already has capabilities to add and remove block devices on the
> fly.  Data can be striped or mirrored or both.  Raid 5/6 is in
> testing at the moment.
> https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
> https://btrfs.wiki.kernel.org/index.php/UseCases#RAID
>
> Which specific features do you think btrfs is lacking?


I think he's talking about zvol-like feature.

In zfs, instead of creating a
filesystem-that-is-accessible-as-a-directory, you can create a zvol
which behaves just like any other standard block device (e.g. you can
use it as swap, or create ext4 filesystem on top of it). But it would
also have most of the benefits that a normal zfs filesystem has, like:
- thin provisioning (sparse allocation, snapshot & clone)
- compression
- integrity check (via checksum)

Typical use cases would be:
- swap in a pure-zfs system
- virtualization (xen, kvm, etc)
- NAS which exports the block device using iscsi/AoE

AFAIK no such feature exists in btrfs yet.
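To make the comparison concrete, here's what the zvol workflow looks like on zfs (the pool and dataset names are illustrative):

```shell
# Create a sparse (thin-provisioned) 10G volume
zfs create -s -V 10G pool/vol1
# It appears as a block device; use it like any disk
mkfs.ext4 /dev/zvol/pool/vol1
# Snapshots and clones work as for regular datasets
zfs snapshot pool/vol1@base
zfs clone pool/vol1@base pool/vol2
```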

-- 
Fajar


Re: Production use with vanilla 3.6.6

2012-11-05 Thread Fajar A. Nugraha
On Mon, Nov 5, 2012 at 7:07 PM, Stefan Priebe - Profihost AG
 wrote:
> Hello list,
>
> is btrfs ready for production use in 3.6.6? Or should i backport fixes from
> 3.7-rc?
>
> Is it planned to have a stable kernel which will get all btrfs fixes
> backported?

I would say "no" to both, but you should check with the distros that
support btrfs (Oracle Linux and SLES). In particular, whether they
backport fixes, and what exactly "supported" status gives you when you
buy support for that distro.

-- 
Fajar


Re: [Request for review] [RFC] Add label support for snapshots and subvols

2012-11-01 Thread Fajar A. Nugraha
On Fri, Nov 2, 2012 at 5:32 AM, Hugo Mills  wrote:
> On Fri, Nov 02, 2012 at 05:28:01AM +0700, Fajar A. Nugraha wrote:
>> On Fri, Nov 2, 2012 at 5:16 AM, cwillu  wrote:
>> >>  btrfs fi label -t /btrfs/snap1-sv1
>> >> Prod-DB-sand-box-testing
>> >
>> > Why is this better than:
>> >
>> > # btrfs su snap /btrfs/Prod-DB /btrfs/Prod-DB-sand-box-testing
>> > # mv /btrfs/Prod-DB-sand-box-testing /btrfs/Prod-DB-production-test
>> > # ls /btrfs/
>> > Prod-DB  Prod-DB-production-test
>>
>>
>> ... because it would mean the possibility to decouple the subvol name from
>> whatever-data-you-need (in this case, a label).
>>
>> My request, though, is to just implement properties, and USER
>> properties, like what we have in zfs. This seems to be a cleaner,
>> saner approach. For example, this is on Ubuntu + zfsonlinux:
>>
>> # zfs create rpool/u
>> # zfs set user:label="Some test filesystem" rpool/u
>> # zfs get creation,user:label rpool/u
>> NAME PROPERTYVALUE  SOURCE
>> rpool/u  creationFri Nov  2  5:24 2012  -
>> rpool/u  user:label  Some test filesystem   local
>
>Don't we already have an equivalent to that with user xattrs?
>
>Hugo.


Anand did say one way to implement the label is by using attr, so +1
from me for that approach.
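
For illustration, the user-xattr equivalent could look something like this (a sketch; setfattr/getfattr come from the "attr" package, and the path is the snapshot from the example above):

```shell
# Store a free-form label as a user xattr on the snapshot's root dir:
setfattr -n user.label -v "Prod-DB-sand-box-testing" /btrfs/snap1-sv1

# Read it back:
getfattr -n user.label /btrfs/snap1-sv1
```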

-- 
Fajar


Re: [Request for review] [RFC] Add label support for snapshots and subvols

2012-11-01 Thread Fajar A. Nugraha
On Fri, Nov 2, 2012 at 5:16 AM, cwillu  wrote:
>>  btrfs fi label -t /btrfs/snap1-sv1
>> Prod-DB-sand-box-testing
>
> Why is this better than:
>
> # btrfs su snap /btrfs/Prod-DB /btrfs/Prod-DB-sand-box-testing
> # mv /btrfs/Prod-DB-sand-box-testing /btrfs/Prod-DB-production-test
> # ls /btrfs/
> Prod-DB  Prod-DB-production-test


... because it would mean the possibility to decouple subvol name from
whatever-data-you-need (in this case, a label).

My request, though, is to just implement properties, and USER
properties, like what we have in zfs. This seems to be a cleaner,
saner approach. For example, this is on Ubuntu + zfsonlinux:

# zfs create rpool/u
# zfs set user:label="Some test filesystem" rpool/u
# zfs get creation,user:label rpool/u
NAME PROPERTYVALUE  SOURCE
rpool/u  creationFri Nov  2  5:24 2012  -
rpool/u  user:label  Some test filesystem   local

More info about zfs user properties here:
http://docs.oracle.com/cd/E19082-01/817-2271/gdrcw/index.html

--
Fajar


Re: Naming of (bootable) subvolumes

2012-10-28 Thread Fajar A. Nugraha
On Sun, Oct 28, 2012 at 12:22 AM, Chris Murphy  wrote:
>
> On Oct 26, 2012, at 9:03 PM, Fajar A. Nugraha  wrote:
>
>>
>> So back to the original question, I'd suggest NOT to use either
>> send/receive or set-default. Instead, setup multiple boot environment
>> (e.g. old version, current version) and let user choose which one to
>> boot using a menu.
>
> Is it possible to make a functioning symbolic or hard link of a subvolume?
>

Nope, I don't think so.

> I'm fine with "current" and "previous" options. More than that seems 
> unnecessary. But then, how does the user choose?

With the up and down arrows :)

> What's the UI?

Grub boot menu.

> Is this properly the domain of GRUB2 or something else?

In my setup I use grub2's "configfile" ability. Which basically does a
"go evaluate this other menu config file".

Each boot environment (BE, the term that solaris uses) has a different
entry on the "main" grub.cfg, which loads the BE's corresponding
grub.cfg.

>
> On BIOS machines, perhaps GRUB. On UEFI, I'd say distinctly not GRUB (I think 
> it's a distinctly bad idea to have a combined boot manager and bootloader in 
> a UEFI context, but that's a separate debate).

I don't use UEFI. But the general idea is to have one bootloader which
can load additional config files. And the location of that additional
config file depends on which BE user wants to boot.

> On this system, grub-mkconfig produces a grub.cfg only for the system I'm 
> currently booted from. It does not include any entries for fedora18/boot, 
> fedora18/root, even though they are well within the normal search path. And 
> the reference used is relative,  i.e. the kernel parameter in the grub.cfg is 
> rootflags=subvol=root
>
> If it were to create entries potentially for every snapshotted system, it 
> would be a very messy grub.cfg indeed.
>
> It stands to reason that each distro will continue to have their own grub.cfg.
>

No arguments there. Even in my setup, when I run "update-grub", it
will only update its own grub.cfg and leave the "main" grub.cfg
untouched. This is what my "main" grub.cfg looks like:


#===
set timeout=2

menuentry 'Ubuntu - 20120905 boot menu' {
configfile  /ROOT/precise-5/@/boot/grub/grub.cfg
}
menuentry 'Ubuntu - 20120814 boot menu' {
configfile  /ROOT/precise-4/@/boot/grub/grub.cfg
}
#===

Each BE's grub.cfg (e.g. the one under the ROOT/precise-5 dataset) is just
your typical Ubuntu grub.cfg, with only references to the kernel/initrd
under that dataset.

> For BIOS machines, it could be useful if a single core.img containing a 
> single standardized prefix specifying a grub location could be agreed upon. 
> And then merely changing the set-default subvolume would allow different 
> distro grub.cfg's to be found, read and workable with the relative references 
> now in place, (except for home which likely needs to be mounted using 
> subvolid).

IMHO the biggest difference is that grub's zfsonlinux support, even
though zfs has a bootfs pool property, has a way to reference ALL
versions of a file (including grub.cfg/kernel/initrd) at boot
time. This way you don't even need to change bootfs whenever you want
to switch boot environments; you simply choose (or write) a
different grub stanza to boot.

If we continue to rely on current btrfs grub support, unfortunately we
can't have the same thing. The closest thing would be
"set-default", which IMHO is VERY messy.

-- 
Fajar


Re: Naming of subvolumes

2012-10-26 Thread Fajar A. Nugraha
On Sat, Oct 27, 2012 at 8:58 AM, cwillu  wrote:
>> I haven't tried btrfs send/receive for this purpose, so I can't compare. But 
>> btrfs subvolume set-default is faster than the release of my finger from the 
>> return key. And it's easy enough the user could do it themselves if they had 
>> reasons for regression to a snapshot that differ than the automagic 
>> determination of the upgrade pass/fail.
>>
>> The one needed change, however, is to get /etc/fstab to use an absolute 
>> reference for home.
>>
>>
>> Chris Murphy

> I'd argue that everything should be absolute references to subvolumes
> (/@home, /@, etc), and neither set-default nor subvolume id's should
> be touched.  There's no need, as you can simply mv those around (even
> while mounted).  More importantly, it doesn't result in a case where
> the fstab in one snapshot points its mountpoint to a different
> snapshot, with all the hilarity that would cause over time, and also
> allows multiple distros to be installed on the same filesystem without
> having them stomp on each others set-defaults: /@fedora, /@rawhide,
> /@ubuntu, /@home, etc.


What I do with zfs, which might also be applicable on btrfs:

- Have a separate dataset to install grub: poolname/boot. This can
also be a dedicated partition, if you want. The sole purpose for this
partition/dataset is to select which dataset's grub.cfg to load next
(using "configfile" directive). The grub.cfg here is edited manually.

- Have different datasets for each versioned OS (e.g. before and after
upgrades): poolname/ROOT/ubuntu-1, poolname/ROOT/ubuntu-2, etc. Each
dataset is independent of each other, contains their own /boot
(complete with grub/grub.cfg, kernel, and initrd). grub.cfg on each
dataset selects its own dataset to boot using "bootfs" kernel command
line.

- Have a common home for all environment: poolname/home

- Have zfs set the mountpoint (or have it mounted in the initramfs, in
the root's case), so I can get away with an empty fstab.

- Do upgrades/modifications in the currently-booted root environment,
but create a clone of current environment (and give it a different
name) so I can roll back to it if needed.
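
As a sketch, the snapshot/clone step above looks something like this (dataset names are examples from my layout):

```shell
# Preserve the pre-upgrade state of the current boot environment...
zfs snapshot rpool/ROOT/ubuntu-1@pre-upgrade

# ...as an independently-bootable clone under a new name:
zfs clone rpool/ROOT/ubuntu-1@pre-upgrade rpool/ROOT/ubuntu-2

# Then upgrade the currently-booted environment; if anything breaks,
# pick the clone's entry from the grub menu to roll back.
```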


It works great for me so far, since:
- each boot environment is portable enough to move around when needed,
with only about four config files needing to be changed (e.g. grub.cfg)
when moving between different computers, or when renaming a root
dataset.

- I can rename each root environment easily, or even move it to
different pool/disk when needed.

- I can move back and forth between multiple versions of the boot
environment (all are Ubuntu so far, because IMHO it currently has the
best zfs root support).


So back to the original question, I'd suggest NOT using either
send/receive or set-default. Instead, set up multiple boot environments
(e.g. old version, current version) and let the user choose which one to
boot using a menu. However for this to work, grub (the bootloader, and
the userland programs like "update-grub") needs to be able to refer to
each grub.cfg/kernel/initrd in a global manner regardless of what the
current default subvolume is (zfs' grub code uses something like
/poolname/dataset_name/@/path/to/file/in/dataset).

-- 
Fajar


Re: btrfs causing reboots and kernel oops on SL 6 (RHEL 6)

2012-10-04 Thread Fajar A. Nugraha
On Sat, Jun 4, 2011 at 11:33 AM, Joel Pearson
 wrote:
> Hi,
>
> I'm using SL 6 (RHEL 6) and I've been playing around with running
> PostgreSQL on btrfs. Snapshotting works ok, but the computer keeps
> rebooting without warning (can be 5 mins or 1.5 hours), finally I
> actually managed to get a Kernel Crash instead of just a reboot.
>
> I took a picture of the screen:
> http://imageshack.us/photo/my-images/716/img0143y.jpg/
>
> The important bits are:
>
> IP: [] btrfs_print_leaf +0x31/0x820 [btrfs]
> PGD 0
> Oops:  [#1] SMP
> last sysfs file: /sys/devices/virtual/block/dm-3/dm/name
>
> The crashes aren't predictable either. Like it doesn't always happen
> when I do a snapshot or anything like that.
>
> Is this a known problem, that is fixed in a later kernel or something like 
> that?


Which kernel is this?

If it's the default SL/RHEL 2.6.32 kernel, then you should try upgrading
first. http://elrepo.org/tiki/kernel-ml is a good choice.

It's highly unlikely that anyone would be willing to look at bugs on
that "archaic" (in btrfs world) kernel.

-- 
Fajar


Re: Tunning - cache write (database)

2012-10-01 Thread Fajar A. Nugraha
On Tue, Oct 2, 2012 at 3:16 AM, Clemens Eisserer  wrote:
>> I suggest you start by reading
>> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg18827.html
>>
>> After that, PROBABLY start your database by preloading libeatmydata to
>> disable fsync completely.
>
> Which will cure the symptoms, not the issue itself - I remember the
> same advice was given for Reiser4 back then ;)
> Usually for non-toy use-cases data is too valuable to just disable fsync.

The OP DID say he doesn't really care about security, recovery, or
integrity (or at least, it's not an obligation) :D

Other than trying latest -rc and using libeatmydata, I can't see what
else can be done to improve current db performance on btrfs. As the
list archive shows, zfs is currently MUCH more suitable for that.

-- 
Fajar


Re: Tunning - cache write (database)

2012-10-01 Thread Fajar A. Nugraha
On Mon, Oct 1, 2012 at 8:27 PM, Cesar Inacio Martins
 wrote:

> My problem:
> * Using btrfs + compression , flush of 60 MB/s take 4 minutes
> (during these 4 minutes there is a constant I/O of +-4 MB/s on the disks)
> (flush from Informix database)


> * OpenSuse 12.1 64bits, running over VmWare ESXi 5
> * Btrfs version : btrfsprogs-0.19-43.1.2.x86_64
> * Kernel : Linux jdivm06 3.1.10-1.16-desktop #1 SMP PREEMPT Wed Jun 27


> My question, what I believed will help to avoid this long flush :
> * Have some way to force this flush all in memory cache and then use the
> btrfs background process to flush to disk ...
>   Security and recover aren't a priority for now, because this is part of a
> database bulkload ...after finish , integrity will be desirable (not a
> obligation, since this is a test environment)
>
> For now, performance is the mainly requirement...


I suggest you start by reading
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg18827.html

After that, PROBABLY start your database by preloading libeatmydata to
disable fsync completely.
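
For example, with the usual "eatmydata" packaging (the Informix start command below is illustrative):

```shell
# Via the wrapper script, if the package installs one:
eatmydata /etc/init.d/informix start

# ...or by preloading the library directly (library path varies by distro):
LD_PRELOAD=libeatmydata.so /etc/init.d/informix start
```

Note that this turns fsync()/fdatasync() into no-ops for the whole process, so a crash mid-load can lose or corrupt the data being loaded.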

On a side note, zfs has "sync" property, which when set to "disabled",
have pretty much the same effect as libeatmydata.

-- 
Fajar


Re: Experiences: Why BTRFS had to yield for ZFS

2012-09-19 Thread Fajar A. Nugraha
On Wed, Sep 19, 2012 at 2:28 PM, Casper Bang  wrote:
>> Anand Jain  oracle.com> writes:
>>   archive-log-apply script - if you could, can you share the
>>   script itself ? or provide more details about the script.
>>   (It will help to understand the work-load in question).
>
> Our setup entails a whole bunch of scripts, but the apply script looks like 
> this
> (orion is the production environment, pandium is the shadow):
> http://pastebin.com/k4T7deap
>
> The script invokes rman passing rman_recover_database.rcs:

IIRC there were some patches post-3.0 which relate to sync. If the oracle
db uses sync writes (or calls sync somewhere, which it should), it
might help to re-run the test with a more recent kernel. The kernel-ml
repository might help.

> Ext4 starts out with a realtime to SCN ratio of about 3.4 and ends down 
> around a
> factor 2.2.
>
> ZFS starts out with a realtime to SCN ratio of about 7.5 and ends down around 
> a
> factor 4.4.

So zfsonlinux is actually faster than ext4 for that purpose? Cool!

>
> Btrfs starts out with a realtime to SCN ratio of about 2.2 and ends down 
> around
> a factor 0.8. This of course means we will never be able to catch up with
> production, as btrfs can't apply these as fast as they're created.
>
> It was even worse with btrfs on our 10xSSD server, where 20 min. of realtime
> work would end up taking some 5h to get applied (factor 0.06), obviously 
> useless
> to us.

Just wondering, did you use "discard" option by any chance? In my
experience it makes btrfs MUCH slower.

-- 
Fajar


Re: specify UUID for btrfs

2012-09-12 Thread Fajar A. Nugraha
On Thu, Sep 13, 2012 at 1:07 PM, ching lu  wrote:
> Is it possible to specify UUID for btrfs when creating the filesystem?

Not that I know of

> or changing it when it is offline?

This one is a definite no.

> i have several script/setting file which have hardcoded UUID and i do
> not want to update them every time when restore backup.

Using label would probably make more sense for that purpose. It can be
set and changed later.
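
A minimal sketch (device and mountpoint are examples):

```shell
# Set a label at mkfs time...
mkfs.btrfs -L mydata /dev/sdb1

# ...or set/change it later:
btrfs filesystem label /dev/sdb1 mydata

# Then scripts and fstab can refer to the label instead of the UUID:
#   LABEL=mydata  /srv/data  btrfs  defaults  0 0
mount LABEL=mydata /srv/data
```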

-- 
Fajar


Re: Workaround for hardlink count problem?

2012-09-10 Thread Fajar A. Nugraha
On Mon, Sep 10, 2012 at 4:12 PM, Martin Steigerwald  wrote:
> Am Samstag, 8. September 2012 schrieb Marc MERLIN:
>> I was migrating a backup disk to a new btrfs disk, and the backup had a
>> lot of hardlinks to collapse identical files to cut down on inode
>> count and disk space.
>>
>> Then, I started seeing:
> […]
>> Has someone come up with a cool way to work around the too many link
>> error and only when that happens, turn the hardlink into a file copy
>> instead? (that is when copying an entire tree with millions of files).
>
> What about:
>
> - copy first backup version
> - btrfs subvol create first next
> - copy next backup version
> - btrfs subvol create previous next

Wouldn't "btrfs subvolume snapshot" plus "rsync --inplace" be more
useful here? That is, if the original hardlinks are caused by multiple
backup versions of the same file.
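
A rough sketch of what I mean (paths are examples):

```shell
# Take a writable snapshot of the previous backup run...
btrfs subvolume snapshot /backup/2012-09-09 /backup/2012-09-10

# ...then update it in place, so unchanged data stays shared between
# the snapshots instead of being hardlinked:
rsync -a --inplace --delete source:/data/ /backup/2012-09-10/
```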

Personally, if I need a feature not yet implemented in btrfs,
I'd just switch to something else for now, like zfs, and revisit btrfs
later when the needed features have been merged.

-- 
Fajar


Re: enquiry about defrag

2012-09-09 Thread Fajar A. Nugraha
On Sun, Sep 9, 2012 at 2:49 PM, ching  wrote:
> On 09/09/2012 08:30 AM, Jan Steffens wrote:
>> On Sun, Sep 9, 2012 at 2:03 AM, ching  wrote:
>>> 2. Is there any command for the fragmentation status of a file/dir ? e.g. 
>>> fragment size, number of fragments.
>> Use the "filefrag" command, part of e2fsprogs.
>>
>
> my image is a 16G sparse file, after defragment, it still has 101387 extents, 
> is it normal?

Is compression enabled? If so, yes, it's normal.
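
That's because btrfs stores compressed data in extents of at most 128K of uncompressed data, so a large compressed file will legitimately show many extents. They can be inspected with filefrag (the path is an example):

```shell
# Summary count of extents:
filefrag /var/lib/images/disk.img

# -v lists each extent's offset and length, which shows whether the
# "fragments" are just per-128K compressed extents:
filefrag -v /var/lib/images/disk.img
```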

-- 
Fajar


oops with btrfs on zvol

2012-08-31 Thread Fajar A. Nugraha
Hi,

I'm experimenting with btrfs on top of zvol block device (using
zfsonlinux), and got oops on a simple mount test.

While I'm sure that zfsonlinux is somehow also at fault here (since
the same test with zram works fine), the oops only shows things
btrfs-related without any usable mention of zfs/zvol. Could anyone
help me interpret the kernel logs, which btrfs-zvol interaction is at
fault, so I can pass it on to zfs guys to work on their side as well?
Thanks.

The test is creating a sparse 100G block device (zfs create -V 100G -s
-o volblocksize=4k rpool/vbd/test1), format it (mkfs.btrfs
/dev/zvol/rpool/vbd/test1), and mount it. An oops occurred, and the mount
process stuck. Same thing happens on ubuntu precise's kernel 3.2 and
quantal's 3.5.

What's interesting is:
- if I use ext4 (instead of btrfs) on the zvol, it works just fine
- if I add a layer on top of zvol (losetup, or iscsi export-import)
then btrfs works just fine.

Syslog shows this (from Ubuntu's 3.2 kernel):

#=
Aug 31 20:30:13 DELL kernel: [34307.828311]  zd0: unknown partition table
Aug 31 20:30:34 DELL kernel: [34328.129249] device fsid
cfd88ff9-def8-4d1f-9435-65becd5fa2b7 devid 1 transid 4 /dev/zd0
Aug 31 20:30:34 DELL kernel: [34328.134001] btrfs: disk space caching is enabled
Aug 31 20:30:34 DELL kernel: [34328.135701] BUG: unable to handle
kernel NULL pointer dereference at   (null)
Aug 31 20:30:34 DELL kernel: [34328.137200] IP: []
extent_range_uptodate+0x59/0xe0 [btrfs]
Aug 31 20:30:34 DELL kernel: [34328.138759] PGD 0
Aug 31 20:30:34 DELL kernel: [34328.140248] Oops:  [#1] SMP
Aug 31 20:30:34 DELL kernel: [34328.141777] CPU 3
Aug 31 20:30:34 DELL kernel: [34328.141811] Modules linked in: ses
enclosure ppp_mppe ppp_async crc_ccitt pci_stub vboxpci(O)
vboxnetadp(O) vboxnetflt
(O) vboxdrv(O) arc4 ath9k mac80211 radeon uvcvideo snd_hda_codec_hdmi
ath9k_common snd_hda_codec_realtek ath9k_hw videodev ipt_MASQUERADE
xt_state ipt
able_nat nf_nat v4l2_compat_ioctl32 i915 nf_conntrack_ipv4
nf_conntrack iptable_filter nf_defrag_ipv4 ip_tables dm_multipath
dummy x_tables bnep ath3k
 btusb bridge rfcomm bluetooth snd_hda_intel snd_hda_codec stp joydev
ath snd_hwdep ttm snd_pcm mei(C) drm_kms_helper drm snd_seq_midi
snd_rawmidi snd
_seq_midi_event dell_wmi sparse_keymap snd_seq dell_laptop wmi
snd_timer i2c_algo_bit video psmouse snd_seq_device cfg80211 snd
mac_hid serio_raw soun
dcore dcdbas snd_page_alloc parport_pc ppdev lp parport binfmt_misc
zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl(O) ums_realtek
uas r8169 btrf
s zlib_deflate libcrc32c usb_storage
Aug 31 20:30:34 DELL kernel: [34328.155820]
Aug 31 20:30:34 DELL kernel: [34328.157974] Pid: 15887, comm:
btrfs-endio-met Tainted: P C O 3.2.0-29-generic #46-Ubuntu
Dell Inc.  De
ll System Inspiron N4110/03NKW8
Aug 31 20:30:34 DELL kernel: [34328.160283] RIP:
0010:[]  []
extent_range_uptodate+0x59/0xe0 [btrfs]
Aug 31 20:30:34 DELL kernel: [34328.162700] RSP: 0018:8800351dfde0
 EFLAGS: 00010246
Aug 31 20:30:34 DELL kernel: [34328.165099] RAX:  RBX:
01401000 RCX: 
Aug 31 20:30:34 DELL kernel: [34328.167548] RDX: 0001 RSI:
1401 RDI: 
Aug 31 20:30:34 DELL kernel: [34328.169989] RBP: 8800351dfe00 R08:
 R09: 880067021418
Aug 31 20:30:34 DELL kernel: [34328.172474] R10: 8800b680d010 R11:
1000 R12: 88011d997bf0
Aug 31 20:30:34 DELL kernel: [34328.174922] R13: 01401fff R14:
880031c45c00 R15: 88011aedc9b0
Aug 31 20:30:34 DELL kernel: [34328.177401] FS:
() GS:88013e6c()
knlGS:
Aug 31 20:30:34 DELL kernel: [34328.179904] CS:  0010 DS:  ES:
 CR0: 8005003b
Aug 31 20:30:34 DELL kernel: [34328.182426] CR2:  CR3:
0001291e CR4: 000406e0
Aug 31 20:30:34 DELL kernel: [34328.185005] DR0:  DR1:
 DR2: 
Aug 31 20:30:34 DELL kernel: [34328.187602] DR3:  DR6:
0ff0 DR7: 0400
Aug 31 20:30:34 DELL kernel: [34328.190246] Process btrfs-endio-met
(pid: 15887, threadinfo 8800351de000, task 880031c45c00)
Aug 31 20:30:34 DELL kernel: [34328.193171] Stack:
Aug 31 20:30:34 DELL kernel: [34328.196542]  8800351dfdf0
880088ff6638 8800b61953c0 88011cbbb000
Aug 31 20:30:34 DELL kernel: [34328.199469]  8800351dfe10
a004224d 8800351dfe40 a00422d6
Aug 31 20:30:34 DELL kernel: [34328.202295]  8800351dfe88
88011aedc960 8800351dfe88 8800351dfe98
Aug 31 20:30:34 DELL kernel: [34328.204685] Call Trace:
Aug 31 20:30:34 DELL kernel: [34328.206645]  []
bio_ready_for_csum.isra.107+0xbd/0xc0 [btrfs]
Aug 31 20:30:34 DELL kernel: [34328.208591]  []
end_workqueue_fn+0x86/0xa0 [btrfs]
Aug 31 20:30:34 DELL kernel: [34328.210565]  []
worker_loop+0xa0/0x2b0 [btrfs]
Aug 31 20:30:34 DELL kernel: [34328.212531]  [] ?
__schedule+0x3cc/0

Re: raw partition or LV for btrfs?

2012-08-14 Thread Fajar A. Nugraha
On Tue, Aug 14, 2012 at 9:09 PM, cwillu  wrote:
>>> If I understand correctly, if I don't use LVM, then such move and resize
>>> operations can't be done for an online filesystem and it has more risk.
>>
>> You can resize, add, and remove devices from btrfs online without the
>> need for LVM. IIRC LVM has finer granularity though, you can do
>> something like "move only the first 10GB now, I'll move the rest
>> later".
>
> You can certainly resize the filesystem itself, but without lvm I
> don't believe you can resize the underlying partition online.

I'm pretty sure you can do that with parted. At least, when your
version of parted is NOT 2.2.
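
For example, with a recent-enough parted (a sketch; the "resizepart" command is not available in every version, and the device/partition numbers are illustrative):

```shell
# Grow partition 2 to the end of the disk, online:
parted /dev/sda resizepart 2 100%

# Then grow btrfs to fill the enlarged partition:
btrfs filesystem resize max /mnt
```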

-- 
Fajar


Re: raw partition or LV for btrfs?

2012-08-14 Thread Fajar A. Nugraha
On Tue, Aug 14, 2012 at 8:28 PM, Daniel Pocock  wrote:
> Can you just elaborate on the qgroups feature?
> - Does this just mean I can make the subvolume sizes rigid, like LV sizes?

Pretty much.

> - Or is it per-user restrictions or some other more elaborate solution?

No

>
> If I create 10 LVs today, with btrfs on each, can I merge them all into
> subvolumes on a single btrfs later?

No

>
> If I just create a 1TB btrfs with subvolumes now, can I upgrade to
> qgroups later?

Yes

>  Or would I have to recreate the filesystem?

No

> If I understand correctly, if I don't use LVM, then such move and resize
> operations can't be done for an online filesystem and it has more risk.

You can resize, add, and remove devices from btrfs online without the
need for LVM. IIRC LVM has finer granularity though, you can do
something like "move only the first 10GB now, I'll move the rest
later".
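
All of these work on a mounted filesystem (devices and mountpoint are examples):

```shell
btrfs filesystem resize -10g /mnt    # shrink by 10G
btrfs filesystem resize max /mnt     # grow to fill the device
btrfs device add /dev/sdc /mnt       # add a device
btrfs device delete /dev/sdb /mnt    # remove one; data migrates off first
```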

-- 
Fajar


Re: raw partition or LV for btrfs?

2012-08-12 Thread Fajar A. Nugraha
On Mon, Aug 13, 2012 at 11:19 AM, Kyle Gates  wrote:
> Also, I think the current grub2 has lzo support.

You're right

grub2 (1.99-18) unstable; urgency=low

  [ Colin Watson ]
...
  * Backport from upstream:
- Add support for LZO compression in btrfs (LP: #727535).

so Ubuntu has it since precise, which is roughly the time I switched
to zfs for rootfs :P

Thanks for letting us know about that.

-- 
Fajar


Re: I want to try something on the BTR file system,...

2012-08-12 Thread Fajar A. Nugraha
On Mon, Aug 13, 2012 at 8:22 AM, Ben Leverett  wrote:
> could you please send me a copy of the btr driver/kernel?

I wonder if using "live.com" email has something to do with how you
ask that question :P

Anyway, depending on what you want to use it for, you might find it
easier to just download latest version of Ubuntu or
whatever-your-favorite-linux-distro. Or, if you want to modify the
source code, the link that Michael sent provides a good starting
point.

What is it that you want to try? If your question is more specific,
you can get more specific answer.

-- 
Fajar
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: raw partition or LV for btrfs?

2012-08-12 Thread Fajar A. Nugraha
On Sun, Aug 12, 2012 at 11:46 PM, Daniel Pocock  wrote:
>
>
> I notice this question on the wiki/faq:
>
>
> https://btrfs.wiki.kernel.org/index.php/UseCases#What_is_best_practice_when_partitioning_a_device_that_holds_one_or_more_btr-filesystems
>
> and as it hasn't been answered, can anyone make any comments on the subject
>
> Various things come to mind:
>
> a) partition the disk, create an LVM partition, and create lots of small
> LVs, format each as btrfs
>
> b) partition the disk, create an LVM partition, and create one big LV,
> format as btrfs, make subvolumes
>
> c) what about using btrfs RAID1?  Does either approach (a) or (b) seem
> better for someone who wants the RAID1 feature?

IMHO when the qgroup feature is "stable" (i.e. adopted by distros, or
at least in a stable kernel), then simply creating one big partition (and
letting btrfs handle RAID1, if you use it) is better. When 3.6 is out,
perhaps?

Until then I'd use LVM.

>
> d) what about booting from a btrfs system?  Is it recommended to follow
> the ages-old practice of keeping a real partition of 128-500MB,
> formatting it as btrfs, even if all other data is in subvolumes as per (b)?

You can have one single partition only and boot directly from that.
However btrfs has the same problems as zfs in this regard:
- grub can read both, but can't write to either. In other words, no
support for grubenv
- the "best" compression method (gzip for zfs, lzo for btrfs) is not
supported by grub

For the first problem, an easy workaroud is just to disable the grub
configuration that uses grubenv. Easy enough, and no major
functionality loss.

The second one is harder for btrfs. zfs allows you to have a separate
dataset (i.e. subvolume, in btrfs terms) with different compression, so
you can have a dedicated dataset for /boot with a different compression
setting from the rest of the pool. With btrfs you're currently
stuck with using the same compression setting for everything, so if
you love lzo this might be a major setback.

There's also a btrfs-specific problem: it's hard to have a system
which has /boot on a separate subvol while managing it with current
automatic tools (e.g. update-grub).

Due to the second and third problems, I'd recommend you just use a
separate partition with ext2/4 for now.

-- 
Fajar


Re: How can btrfs take 23sec to stat 23K files from an SSD?

2012-07-31 Thread Fajar A. Nugraha
On Wed, Aug 1, 2012 at 1:01 PM, Marc MERLIN  wrote:

> So, clearly, there is something wrong with the samsung 830 SSD with linux


> It it were a random crappy SSD from a random vendor, I'd blame the SSD, but
> I have a hard time believing that samsung is selling SSDs that are slower
> than hard drives at random IO and 'seeks'.

You'd be surprised on how badly some vendors can screw up :)


> First: btrfs is the slowest:

> gandalfthegreat:/mnt/ssd/var/local# grep /mnt/ssd/var /proc/mounts
> /dev/mapper/ssd /mnt/ssd/var btrfs 
> rw,noatime,compress=lzo,ssd,discard,space_cache 0 0

Just checking, did you explicitly activate "discard"? Because on my
setup (with a Corsair SSD) it made things MUCH slower. Also, try adding
"noatime" (just in case the slowdown was caused by "du" triggering many
access time updates).
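
A quick way to compare would be remounting with discard disabled (btrfs accepts a "nodiscard" option; if the remount doesn't take, umount and mount again without "discard" in the option set):

```shell
# Disable online discard, re-run the du test...
mount -o remount,nodiscard /mnt/ssd/var

# ...then re-enable it and compare:
mount -o remount,discard /mnt/ssd/var
```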

-- 
Fajar


Re: Upgrading from 2.6.38, how?

2012-07-24 Thread Fajar A. Nugraha
On Wed, Jul 25, 2012 at 11:39 AM, Gareth Pye  wrote:
> My proposed upgrade method is:
> Boot from a live CD with the latest kernel I can find so I can do a few tests:
>  A - run the fsck in read only mode to confirm things look good
>  B - mount read only, confirm that I can read files well
>  C - mount read write, confirm working
> Install latest OS, upgrade to latest kernel, then repeat above steps.
>
> Any likely hiccups with the above procedure and suggested alternatives?

I'd simply install the new OS on a new partition/subvol. This is what
I did when upgrading from natty -> oneiric -> precise.

IIRC there are some incompatibilities (e.g. the space/inode cache disk
format?) but newer kernels will just do the right thing: drop the old
cache and create a new one.

-- 
Fajar


Re: Very slow samba file transfer speed... any ideas ?

2012-07-20 Thread Fajar A. Nugraha
On Fri, Jul 20, 2012 at 5:23 PM, Shavi N  wrote:
> Hence I'm asking.. I know that I get fast copy/write speeds on the
> btrfs volume from real life situations,

How did you know that? So far none of your posted test results have
shown that the btrfs vol in your system is FAST.

-- 
Fajar


Re: Very slow samba file transfer speed... any ideas ?

2012-07-19 Thread Fajar A. Nugraha
On Thu, Jul 19, 2012 at 7:39 PM, Shavi N  wrote:
> So btrfs gives a massive difference locally, but that still doesn't
> explain the slow transfer speeds.
> Is there a way to test this?

I'd try with real data, not /dev/zero. e.g:
dd_rescue -b 1M -m 1.4G /dev/sda testfile.img

... or use whatever non-zero data source you have. dd_rescue will give
a nice progress bar and speed indicator.

Also, run "iostat -mx 3" while you're running dd, and while accessing
it from samba. In my experience, btrfs is simply slower than ext4.
Period. There's no way around it for now.

-- 
Fajar


Re: brtfs on top of dmcrypt with SSD -> Trim or no Trim

2012-07-18 Thread Fajar A. Nugraha
On Thu, Jul 19, 2012 at 1:13 AM, Marc MERLIN  wrote:
> TL;DR:
> I'm going to change the FAQ to say people should use TRIM with dmcrypt
> because not doing so definitely causes some lesser SSDs to suck, or
> possibly even fail and lose our data.
>
>
> Longer version:
> Ok, so several months later I can report back with useful info.
>
> Not using TRIM on my Crucial RealSSD C300 256GB is most likely what caused
> its garbage collection algorithm to fail (killing the drive and all its
> data), and it was also causing BRTFS to hang badly when I was getting
> within 10GB of the drive getting full.
>
> I reported some problems I had with btrfs being very slow and hanging when I
> only had 10GB free, and I'm now convinced that it was the SSD that was at
> fault.
>
> On the Crucial RealSSD C300 256GB, and from talking to their tech support
> and other folks who happened to have gotten that 'drive' at work and also
> got weird unexplained failures, I'm convinced that even its latest 007
> firmware (the firmware it shipped with would just hang the system for a few
> seconds every so often so I did upgrade to 007 early on), the drive does
> very poorly without TRIM when it's getting close to full.


If you're going to edit the wiki, I'd suggest you say "SOME SSDs
might need to use TRIM with dmcrypt". That's because some SSD
controllers (e.g. SandForce) perform just fine without TRIM, and in
my case TRIM made performance worse.

-- 
Fajar


Re: file system corruption removal / documentation quandry

2012-07-11 Thread Fajar A. Nugraha
On Thu, Jul 12, 2012 at 12:13 PM, eric gisse  wrote:

> Basically, phoronix showed there is a --repair option. After enabling
> snapshotting and playing around with the various discussed options, I
> discovered that --repair and no special mount options was sufficient
> to get the files removable.

I'm curious whether running it directly on a newer kernel (e.g. the
latest ubuntu kernel-ppa/mainline) would be able to "solve" the
problem, even without btrfsck.

Also note that if by "snapshotting" you mean "create LVM snapshots",
then you might be in for another surprise, as btrfs doesn't play nice
with block devices sharing the same fs UUID. Don't rely on that as a
backup option.

>
> Now what I'm hoping for is better documentation on btrfsck even if it
> just boils down to a brief enumeration of the options as that would be
> better than nothing which is what we have now. Do I need to file a bug
> or is this sufficient?

Edit https://btrfs.wiki.kernel.org/index.php/Btrfsck ?

-- 
Fajar


Re: BTRFS fsck apparent errors

2012-07-04 Thread Fajar A. Nugraha
On Wed, Jul 4, 2012 at 8:42 PM, David Sterba  wrote:
> On Wed, Jul 04, 2012 at 07:40:05AM +0700, Fajar A. Nugraha wrote:
>> Are there any known btrfs regression in 3.4? I'm using 3.4.0-3-generic
>> from a ppa, but a normal mount - umount cycle seems MUCH longer
>> compared to how it was on 3.2, and iostat shows the disk is
>> read-IOPS-bound
>
> Is it just mount/umount without any other activity?

Yes

> Is the fs
> fragmented

Not sure how to check that quickly

> (or aged),

Over 1 year, so yes

> almost full,

df says 83% used, so probably yes (depending on how you define "almost")

~ $ df -h /media/WD-root
Filesystem  Size  Used Avail Use% Mounted on
/dev/sdc2   922G  733G  155G  83% /media/WD-root

~ $ sudo btrfs fi df /media/WD-root/
Data: total=883.95GB, used=729.68GB
System, DUP: total=8.00MB, used=104.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=18.75GB, used=1.49GB
Metadata: total=8.00MB, used=0.00

> has lots of files?

It's a "normal" 1 TB USB disk, with docs, movies, VM images, etc.
Nothing particularly lots-of-small-files like maildir or anything
like that.


>> # time umount /media/WD-root/
>>
>> real  0m22.419s
>> user  0m0.000s
>> sys   0m0.064s
>>
>> # /proc/10142/stack  <--- the PID of umount process
>
> The process(es) actually doing the work are the btrfs workers, usual
> sucspects are btrfs-cache (free space cache) or btrfs-ino (inode cache)
> that are writing the cache states back to disk.

Not sure about that, since iostat shows it's mostly reads, not
writes. Will try iotop later.
I also tested with Chris' for-linus branch on top of 3.4, with the
same result (a really long time to umount).

Reverting back to ubuntu's 3.2.0-26-generic, umount took less than 1 s :P
So I guess I'm switching back to 3.2 for now.

-- 
Fajar


Re: BTRFS fsck apparent errors

2012-07-03 Thread Fajar A. Nugraha
On Tue, Jul 3, 2012 at 10:22 PM, Hugo Mills  wrote:
> On Tue, Jul 03, 2012 at 05:10:13PM +0200, Swâmi Petaramesh wrote:

>> After I had shifted, I tried to defragment and compress my FS using
>> commands such as :
>>
>> find /mnt/STORAGEFS/STORAGE/ -exec btrfs fi defrag -clzo -v {} \;
>>
>> During execution of such commands, my kernel oopsed, so I restarted.

>I would also suggest using a 3.4 kernel. There's at least one FS
> corruption bug known to exist in 3.2 that's been fixed in 3.4.


Are there any known btrfs regressions in 3.4? I'm using 3.4.0-3-generic
from a ppa, but a normal mount-umount cycle seems MUCH longer than
it was on 3.2, and iostat shows the disk is read-IOPS-bound.

# time mount LABEL=WD-root

real0m10.400s
user0m0.000s
sys 0m0.060s

# time umount /media/WD-root/

real0m22.419s
user0m0.000s
sys 0m0.064s

# /proc/10142/stack  <--- the PID of umount process
[] sleep_on_page+0xe/0x20
[] wait_on_page_bit+0x78/0x80
[] filemap_fdatawait_range+0x10c/0x1a0
[] btrfs_wait_marked_extents+0x6b/0xc0 [btrfs]
[] btrfs_write_and_wait_marked_extents+0x3b/0x60 [btrfs]
[] btrfs_write_and_wait_transaction+0x2b/0x50 [btrfs]
[] btrfs_commit_transaction+0x759/0x960 [btrfs]
[] btrfs_commit_super+0xbb/0x110 [btrfs]
[] close_ctree+0x2a0/0x310 [btrfs]
[] btrfs_put_super+0x19/0x20 [btrfs]
[] generic_shutdown_super+0x62/0xf0
[] kill_anon_super+0x16/0x30
[] btrfs_kill_super+0x1a/0x90 [btrfs]
[] deactivate_locked_super+0x3c/0xa0
[] deactivate_super+0x4e/0x70
[] mntput_no_expire+0xdc/0x130
[] sys_umount+0x66/0xe0
[] system_call_fastpath+0x16/0x1b
[] 0x

-- 
Fajar


Re: Kernel panic from "btrfs subvolume delete"

2012-06-29 Thread Fajar A. Nugraha
On Fri, Jun 29, 2012 at 9:23 PM, Richard Cooper
 wrote:
>>> If so, how?
>>
>> https://blogs.oracle.com/linux/entry/oracle_unbreakable_enterprise_kernel_release
>> http://elrepo.org/tiki/kernel-ml
>
> Perfect, thank you! I was looking for a mainline kernel yum repo but my 
> google-fu was failing me. That looks like just what I need.
>
> I've installed kernel v3.4.4 from http://elrepo.org/tiki/kernel-ml and that 
> seems to have fixed my kernel panic. I'm still using the default Cent OS 6 
> versions of the btrfs userspace programs (v0.19). Any reason why that might 
> be a bad idea?

At the very least, newer versions of btrfsck have --repair, which you
might need later.
There are also features like forcing a certain compression (e.g.
zlib) on a file as part of the "btrfs filesystem defrag" command.

Just grab updated btrfs-progs (or whatever it's called) from Oracle's repo.

-- 
Fajar


Re: Kernel panic from "btrfs subvolume delete"

2012-06-29 Thread Fajar A. Nugraha
On Fri, Jun 29, 2012 at 5:11 PM, Richard Cooper
 wrote:
> Hi All,
>
> I have two machines where I've been testing various btrfs based backup 
> strategies. They are both Cent OS 6 with the standard kernel and btrfs-progs 
> RPMs from the CentOS repos.
>
> - kernel-2.6.32-220.17.1.el6.x86_64
> - btrfs-progs-0.19-12.el6.x86_64

In btrfs terms, 2.6.32 is ... stone age :P

> What should I do now? Do I need to upgrade to a more recent btrfs?

Yep

> If so, how?

https://blogs.oracle.com/linux/entry/oracle_unbreakable_enterprise_kernel_release
http://elrepo.org/tiki/kernel-ml

-- 
Fajar


Re: System Policy for Filenames

2012-06-26 Thread Fajar A. Nugraha
On Wed, Jun 27, 2012 at 1:28 AM, Aaron Peterson
 wrote:
> Billy,
>
> Thank you! I will look into FUSE.
>
> Ultimately, I want my / to be mounted with these rules,  I will need a
> boot loader to be able to handle it.

Try looking at how the ubuntu live CD works. Last time I checked, it
uses unionfs-fuse as "/" to make the read-only CD media appear
writable during the live session. Something similar should be
applicable to your needs.

>  I am wondering if filesystem software has hooks for AppArmor or
> SELinux, or some other Linux Security Module would be appropriated to
> add to filesystem code?

Not that I know of.

-- 
Fajar


Re: Subvolumes and /proc/self/mountinfo

2012-06-19 Thread Fajar A. Nugraha
On Wed, Jun 20, 2012 at 10:22 AM, H. Peter Anvin  wrote:
> a. Make a snapshot of the current root;
> b. Mount said snapshot;
> c. Install the new distro on the snapshot;
> d. Change the bootloader configuration *inside* the snapshot to point
>   to the snapshot as the root;
> e. Install the bootloader on the snapshot, thereby making the boot
>   block point to it and making it "live".


IMHO a more elegant solution would be similar to what
(open)solaris/indiana does: keep the boot parts (bootloader,
configuration) in a separate area, apart from the root snapshots. In
Solaris' case IIRC this will be /rpool/grub.

A similar approach should be implementable in linux, at least in
certain configurations: if you put /boot as part of "/" (thus, also
on btrfs), AND you don't change the default subvolume, AND the roots
are on their own subvolumes, then the paths to vmlinuz and initrd in
grub.cfg will have the subvol names in them. So it's possible to have
a single grub.cfg with several entries that point to different
subvols, and you don't need to install a new bootloader to make a
particular subvol live; you only need to select it from the boot menu.

I'm doing this currently with ubuntu precise, though with a
manually-created grub.cfg. Still haven't found a way to manage this
automatically.
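For reference, a hedged sketch of what such a grub.cfg can look like,
with one entry per root subvolume. The subvolume names ("precise",
"oneiric") and the <fs-uuid> placeholder are illustrative, not taken
from an actual system:

```
# One grub.cfg, several bootable root subvolumes.
# Kernel/initrd paths start with the subvolume name, because grub reads
# the default subvolume (id 5) while each root lives in its own subvol.
menuentry 'Ubuntu (subvol precise)' {
    linux  /precise/boot/vmlinuz root=UUID=<fs-uuid> rootflags=subvol=precise ro
    initrd /precise/boot/initrd.img
}
menuentry 'Ubuntu (subvol oneiric)' {
    linux  /oneiric/boot/vmlinuz root=UUID=<fs-uuid> rootflags=subvol=oneiric ro
    initrd /oneiric/boot/initrd.img
}
```

Selecting an entry from the boot menu is then all it takes to make
that subvol the live root; no bootloader reinstall is needed.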

-- 
Fajar


Re: Subvolumes and /proc/self/mountinfo

2012-06-19 Thread Fajar A. Nugraha
On Wed, Jun 20, 2012 at 6:35 AM, H. Peter Anvin  wrote:
> On 06/19/2012 07:22 AM, Calvin Walton wrote:
>>
>> All subvolumes are accessible from the volume mounted when you use -o
>> subvolid=0. (Note that 0 is not the real ID of the root volume, it's
>> just a shortcut for mounting it.)
>>
>
> Could you clarify this bit?  Specifically, what is the real ID of the
> root volume, then?

5

-- 
Fajar


Re: Moving top level to a subvolume

2012-06-13 Thread Fajar A. Nugraha
On Wed, Jun 13, 2012 at 4:44 PM, C Anthony Risinger  wrote:
> On Wed, Jun 13, 2012 at 2:21 AM, Arne Jansen  wrote:
>> On 13.06.2012 09:04, C Anthony Risinger wrote:

>>> ... because in a), data will *copied* the slow way

>> What I don't understand is why you think data will be copied.

> at one point i tried to create a new subvol and `mv` files there, and
> it took quite some time to complete
> (cross-link-device-what-have-you?), but maybe things changed ... will
> try it out.

IIRC it hasn't. Not in upstream, anyway. Some distros (e.g. opensuse)
carry their own patch which allows cross-subvolume reflinks (cp
--reflink ...).

But it shouldn't matter anyway, since you can SNAPSHOT the old subvol
(even the root subvol) instead of creating a new subvol. Which means
nothing needs to be copied.

You'd still have to do the "rm" manually though.

-- 
Fajar


Re: Moving top level to a subvolume

2012-06-13 Thread Fajar A. Nugraha
On Wed, Jun 13, 2012 at 2:23 PM, Duncan <1i5t5.dun...@cox.net> wrote:
> Fajar A. Nugraha posted on Wed, 13 Jun 2012 08:49:47 +0700 as excerpted:
>
>> As for "lose their filesystems", are there recent ones that uses one of
>> the three distros above, and is purely btrfs "fault"? The ones I can
>> remember (from the post to this list) were broken on earlier kernels, or
>> caused by bad disks.

> My system's old and has a bit of a problem with overheating in the
> Phoenix summer, so has been suffering SATA resets

> it's exactly this sort of
> corner-case that filesystems need to be able to deal with

IIRC XFS had corruption problems when used on top of LVM (or other
block devices that don't support barriers correctly), while using
ext2/3/4 on the same block device would be "fine". Yet XFS doesn't
carry the mark of "unstable, highly experimental, do not use". People
simply use the right (for them) fs for the right job.

My point is: yes, btrfs is new. And it's being developed at a much
faster rate than any other more-mature fs out there. And there are
known cases of data loss in certain corner-case configurations, on
"buggy" hardware, and/or with old kernel versions. But when used in
the correct environment, btrfs can be a good choice, even for
critical data.

Of course IF the data were REALLY critical, and I REALLY need btrfs'
features, and it were on an enterprise environment, I would've bought
support from oracle linux (or SLES 12, when it's out, or whatever
enterprise distro supporting btrfs which sells support contract) so I
can have someone to turn to in case of problems, and (in some cases)
transfer the risk/blame :D

-- 
Fajar


Re: Moving top level to a subvolume

2012-06-12 Thread Fajar A. Nugraha
On Tue, Jun 12, 2012 at 9:52 PM, Randy Barlow
 wrote:
> I personally run Gentoo, but I've been told by some coworkers that the Ubuntu
> installer offers btrfs as an option to the users without marking it as
> experimental, unstable, or under development. I wonder if that is why we see
> so many people surprised when they lose their filesystems. Can anyone verify
> whether that is true of Ubuntu, or of any other Linux distributions?

Oracle linux (when used with UEK2) officially supports btrfs. Opensuse
also supports btrfs, and use its functionality for snapper.

I haven't found any updated (i.e. released post 12.04) official
support status statement from Ubuntu, but they do offer btrfs as
installation option.

As for "lose their filesystems", are there recent reports involving
one of the three distros above where it was purely btrfs' "fault"?
The ones I can remember (from the posts to this list) were broken on
earlier kernels, or caused by bad disks.

-- 
Fajar


Re: Preparing single-disk setup for future multi-disk usage

2012-05-23 Thread Fajar A. Nugraha
On Thu, May 24, 2012 at 1:05 PM, Björn Wüst  wrote:
>
> Unfortunately, I do not have a disk to test it right now. The disk I am 
> planning to use is with the post service still :) .

You can use sparse files, possibly with losetup if necessary.
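To experiment before the real disk arrives, something like the sketch
below. The first part (sparse file creation) runs anywhere; the
loop-device and mkfs.btrfs steps need root and btrfs-progs, so they
are left commented. All paths are made up:

```shell
# Create two sparse "disks": 10G apparent size, ~0 blocks allocated.
set -e
dir=$(mktemp -d)
truncate -s 10G "$dir/disk1.img" "$dir/disk2.img"
stat -c '%n: apparent=%s bytes, allocated=%b blocks' "$dir"/disk*.img
# The following need root and btrfs-progs:
#   losetup -f --show "$dir/disk1.img"    # prints e.g. /dev/loop0
#   losetup -f --show "$dir/disk2.img"    # prints e.g. /dev/loop1
#   mkfs.btrfs -m raid1 -d raid1 /dev/loop0 /dev/loop1
```

Since the files are sparse, the images only consume real space as the
test filesystem writes data into them.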

> Thank you for your replies to this email (bjoern.wu...@gmx.net,

That's not the email address you used to send this message.

> I am not subscribed to the mailing lists, thus please do a 'reply all').

IMHO, asking something on a list and then saying "I am not subscribed"
and "send your reply to this other email address that I'm not using
to send" is rude.

-- 
Fajar


Re: kernel 3.3.4 damages filesystem (?)

2012-05-08 Thread Fajar A. Nugraha
On Tue, May 8, 2012 at 2:39 PM, Helmut Hullen  wrote:

>> And you can use three BTRFS filesystems the same way as three Ext4
>> filesystems if you prefer such a setup if the time spent for
>> restoring the backup does not make up the cost for one additional
>> disk for you.
>
> But where's the gain? If a disk fails I have a lot of tools for
> repairing an ext2/3/4 system.

It won't work if you use it in RAID0 (e.g. with LVM spanning three
disks, then ext4 on top of the LV). Which is basically the same thing
that you did (using btrfs in raid0 mode).

As others said, if your only concern is "if a disk is dead, I want to
be able to access data on the other disks", then simply use btrfs as
three separate filesystems, mounted on three directories.
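Concretely, that "three independent filesystems" layout is just three
ordinary fstab entries. A sketch reusing the device names from the
report above (the mount points are hypothetical):

```
# /etc/fstab sketch: each disk is its own single-device btrfs,
# so a dead disk only takes out its own mount point.
/dev/sdc1  /data1  btrfs  defaults  0  0
/dev/sdf1  /data2  btrfs  defaults  0  0
/dev/sdi1  /data3  btrfs  defaults  0  0
```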

btrfs will shine when:
- you need checksum and self-healing in raid10 mode
- you have lots of small files
- you have highly compressible content
- you need snapshot/clone feature

Since you don't need any of these, IMHO it's actually better if you just use ext4.

-- 
Fajar


Re: Can btrfs silently repair read-error in raid1

2012-05-08 Thread Fajar A. Nugraha
On Tue, May 8, 2012 at 2:13 PM, Clemens Eisserer  wrote:
> Hi,
>
> I have a quite unreliable SSD here which develops some bad blocks from
> time to time which result in read-errors.
> Once the block is written to again, its remapped internally and
> everything is fine again for that block.
>
> Would it be possible to create 2 btrfs partitions on that drive and
> use it in RAID1 - with btrfs silently repairing read-errors when they
> occur?
> Would it require special settings, to not fallback to read-only mode
> when a read-error occurs?

The problem would be how the SSD (and linux) behaves when it
encounters bad blocks (not bad disks, which is easier).

If it does "oh, I can't read this block. I just return an error
immediately", then it's good.

However, in most situations it would be like "hmmm, I can't read this
block, let me retry that again. What? Still an error? Then let's
retry it again, and again.", which could take several minutes for a
single bad block. And during that time linux (the kernel) would do
something like "hey, the disk is not responding. Why don't we try
some stuff? Let's try resetting the link. If that doesn't work, try
downgrading the link speed".

In short, if you KNOW the SSD is already showing signs of bad blocks,
better just throw it away.

-- 
Fajar


Re: kernel 3.3.4 damages filesystem (?)

2012-05-07 Thread Fajar A. Nugraha
On Mon, May 7, 2012 at 5:46 PM, Helmut Hullen  wrote:

> For some months I run btrfs unter kernel 3.2.5 and 3.2.9, without
> problems.
>
> Yesterday I compiled kernel 3.3.4, and this morning I started the
> machine with this kernel. There may be some ugly problems.


> Data, RAID0: total=5.29TB, used=4.29TB

Raid0? Yaiks!

> System, RAID1: total=8.00MB, used=352.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=149.00GB, used=5.00GB
>
> Label: 'MMedia'  uuid: 9adfdc84-0fbe-431b-bcb1-cabb6a915e91
>        Total devices 3 FS bytes used 4.29TB
>        devid    3 size 2.73TB used 1.98TB path /dev/sdi1
>        devid    2 size 2.73TB used 1.94TB path /dev/sdf1
>        devid    1 size 1.82TB used 1.63TB path /dev/sdc1
>


> May  7 06:55:26 Arktur kernel: ata5: exception Emask 0x10 SAct 0x0 SErr 
> 0x1 action 0xe frozen
> May  7 06:55:26 Arktur kernel: ata5: SError: { PHYRdyChg }
> May  7 06:55:26 Arktur kernel: ata5: hard resetting link
> May  7 06:55:31 Arktur kernel: ata5: COMRESET failed (errno=-19)
> May  7 06:55:31 Arktur kernel: ata5: reset failed (errno=-19), retrying in 6 
> secs


> May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
> May  7 07:15:19 Arktur kernel: end_request: I/O error, dev sdf, sector 0
> May  7 07:15:19 Arktur kernel: sd 5:0:0:0: rejecting I/O to offline device
> May  7 07:15:19 Arktur kernel: lost page write due to I/O error on sdf1


That looks like a bad disk to me, and it shouldn't be related to the
kernel version you use.

Your best chance might be:
- unmount the fs
- get another disk to replace /dev/sdf, and copy the content over
with dd_rescue. ATA resets can be a PITA, so you might be better off
moving the failed disk to a usb external adapter, and doing some
creative combination of plug-unplug and selectively skipping bad
sectors manually (by passing "-s" to dd_rescue).
- reboot, with the bad disk unplugged
- (optional) run "btrfs filesystem scrub" (you might need to build
btrfs-progs manually from git source), or simply read the entire fs
(e.g. using tar to /dev/null, or whatever). It should check the
checksums of all files and print out which files are damaged (either
on stdout or in syslog).

I don't think there's anything you can do to recover the damaged files
(other than restore from backup), but at least you know which files
are NOT damaged.
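The "skip bad sectors" step can be illustrated with plain dd, which
is what dd_rescue automates (plus retries and a progress bar). A safe
sketch on throwaway temp files, pretending bytes 4-7 of the source
are unreadable:

```shell
set -e
src=$(mktemp); dst=$(mktemp)
printf 'AAAABADXGOOD' > "$src"          # 12-byte stand-in for a disk
# Copy the readable region before the "bad" spot ...
dd if="$src" of="$dst" bs=4 count=1 conv=notrunc status=none
# ... then skip one 4-byte "bad" block on input AND seek past it on
# output, so everything after the hole stays at the right offset.
dd if="$src" of="$dst" bs=4 skip=2 seek=2 conv=notrunc status=none
od -c "$dst"
```

The destination ends up with the salvageable data at the correct
offsets and a zero-filled hole where the bad block was, which is
exactly what lets btrfs checksums later tell you which files were hit.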

-- 
Fajar


Re: Can't mount

2012-05-03 Thread Fajar A. Nugraha
On Thu, May 3, 2012 at 10:31 PM, Hugo Mills  wrote:
> On Thu, May 03, 2012 at 03:18:01PM +, Yo'av Moshe wrote:

>> Is there anything else I can try?
>>
>> I'm using kernel 3.2 on Ubuntu 12.04.
>
>   In approximate order:
>
>  * Try a 3.3 or 3.4-rc5 kernel. I don't think those will do anything
>   to fix this particular issue, but it's worth a try.
>
>  * You have the last-but-one generation listed by find-root:
>
> Well block 216926195712 seems great, but generation doesn't match, 
> have=135713, want=135714
>
>   so you can use the restore tool[1] with that block number (and -t)
>   to copy off any data that you need that isn't backed up.

Is btrfs-zero-log still relevant? I imagine losing the last several
transactions is MUCH more convenient than having to recreate the
entire fs (even if restore managed to salvage everything).

And what about mount -o ro,recovery?

-- 
Fajar


Re: How file store when using Btrfs on multi-devices? What happen when a device fail?

2012-05-02 Thread Fajar A. Nugraha
On Thu, May 3, 2012 at 1:46 PM, Chu Duc Minh  wrote:
> Hi, i have some questions when using Btrfs on multi-devices:
> 1. a large file will always be stored wholely on a device or it may
> spread on some devices/partitions?

IIRC:
- in raid1 mode, it will be written to all disks (or was it TWO
disks, regardless of how many devices are in the mirror? can't
remember which).
- in raid10 and raid0, it will always be spread, across a minimum of
two devices

> Btrfs has option to specify it
> explicitly?

Not that I know of.

> 2. suppose i have a directory tree like that:
> Dir_1
>  |--> file_1A
>  |--> file_1B
>  |--> Dir_2
>  |--> file_2C
>  |--> file_2D
>
> If Dir_2, file_2C  on a failed device, can i still have access to file_2D?

Unless you're using raid10, my guess is you'll be screwed, as each
file will be spread across multiple devices (including the one that
fails).

> If i use GlusterFS (mirror mode) on two nodes, each nodes run Btrfs on
> multi-device. When a device on a node fail and I replace it, then
> GlusterFS resync it, can i have troubles with data consistency?

This question might be more suitable for the glusterfs list. My guess
is that glusterfs will discard all the data on the failed node. After
you recreate the storage backend (the btrfs, on a new device), you
can tell glusterfs to copy everything from the good node.

Of course, if you use raid10 mode in btrfs, and only one device fail,
it should be transparent to end users.

-- 
Fajar


Re: btrfs across a mix of SSDs & HDDs

2012-05-01 Thread Fajar A. Nugraha
On Wed, May 2, 2012 at 12:00 PM, Bardur Arantsson  wrote:
> On 05/02/2012 06:28 AM, Fajar A. Nugraha wrote:

>>>  From Kconfig:
>>>
>>>   "Btrfs filesystem (EXPERIMENTAL) Unstable disk format"
>>>                         ^^
>>>
>>> Btrfs is too immature to use in ANY kind of production-like scenario
>>> where
>>> you cannot afford to lose a certain amount of data (i.e. be forced to
>>> restore from backup) AND suffer downtime.
>>>
>>> I don't think email users are going to be thrilled about the prospect of
>>> "lossy" email.
>>
>>
>> Oracle fully supports btrfs for production environment:
>> http://oss.oracle.com/ol6/docs/RELEASE-NOTES-UEK2-en.html
>>
>> http://www.zdnet.com/blog/open-source/oracles-unbreakable-enterprise-kernel-2-arrives-with-linux-30-kernel-btrfs/10588
>> http://www.oracle.com/us/technologies/linux/index.html
>>
>
> What does "fully supports" mean? Does it mean that it's actually stable
> (considerably more stable that mainline), or does it mean that you can pay
> them to help fix a broken FS, for example? Does the included btrfsck
> actually work reliably? Is there some non-legalese official statement of
> what, exactly, "fully supported" means and whether OL's btrfs falls under
> this rubric?

That question would be best addressed to Oracle directly. Or other
distro vendors supporting btrfs (IIRC SLES also supports it).

>
> Also, AFAIUI the 3.0.x kernels (which OL claims to use in the release notes)
> are woefully outdated wrt. btrfs reliability/stability. Have all the more
> recent stability improvements been backported?

Chris or other devs from oracle might be able to comment more on that.

I know that it's quite common for an OSS vendor to have a supported
version of something, based on code that has been more thoroughly
tested, and another version (in this case the version of btrfs in
mainline) that has newer, bleeding-edge code, with more features but
possibly also more bugs.

>
> Is the OP using Oracle Linux?

He didn't say. But he didn't say he WON'T be using oracle linux (or
another distro which supports btrfs) either. Plus, the kernel can be
installed on top of RHEL/CentOS 5 and 6, so he can easily choose
either the supported version or the mainline version, each with its
own consequences.

> Given the semi-regular posts about FS corruption on this list(*) and the
> "EXPERIEMENTAL" status in the KConfig it would be unwise to use btrfs for
> anything called "production" (unless you can actually afford downtime/data
> loss).

Fair opinion.

Personally I'm quite happy with the version included in Ubuntu
Precise (kernel 3.2). It has actually helped me recover from a bad
SSD. It was a somewhat old SSD, and about 1GB (out of 50GB) of data
became unreadable (reading directly from the block device). "btrfs
scrub" was helpful enough to let me find out which files were
corrupted, something I wouldn't have been able to do with ext4.

-- 
Fajar


Re: btrfs across a mix of SSDs & HDDs

2012-05-01 Thread Fajar A. Nugraha
On Wed, May 2, 2012 at 9:22 AM, Bardur Arantsson  wrote:
> On 05/01/2012 09:35 PM, Martin wrote:
>>
>> How well does btrfs perform across a mix of:
>>
>> 1 SSD and 1 HDD for 'raid' 1 mirror for both data and metadata?

>> The idea is to gain the random access speed of the SSDs but have the
>> HDDs as backup in case the SSDs fail due to wear...

AFAIK only zfs officially supports that configuration, using L2ARC and SLOG

>>
>> The usage is to support a few hundred Maildirs + imap for users that
>> often have many thousands of emails in the one folder for their inbox...

Some mail programs use hardlinks, and btrfs has a low limit on the
maximum number of hardlinks in a directory. If you use one of those
programs, better stay away for now.

Plus, from my experience, when using the same disk, btrfs will use up
more disk I/O compared to ext4, so if you're already I/O-starved,
better stick with ext4.


>> Or is btrfs yet too premature to suffer such use?
>>
>
> From Kconfig:
>
>   "Btrfs filesystem (EXPERIMENTAL) Unstable disk format"
>                         ^^
>
> Btrfs is too immature to use in ANY kind of production-like scenario where
> you cannot afford to lose a certain amount of data (i.e. be forced to
> restore from backup) AND suffer downtime.
>
> I don't think email users are going to be thrilled about the prospect of
> "lossy" email.

Oracle fully supports btrfs for production environment:
http://oss.oracle.com/ol6/docs/RELEASE-NOTES-UEK2-en.html
http://www.zdnet.com/blog/open-source/oracles-unbreakable-enterprise-kernel-2-arrives-with-linux-30-kernel-btrfs/10588
http://www.oracle.com/us/technologies/linux/index.html

-- 
Fajar


Re: snapper for Ubuntu? (WAS: btrfs auto snapshot)

2012-04-10 Thread Fajar A. Nugraha
On Tue, Apr 10, 2012 at 9:35 PM, Matthias G. Eckermann  wrote:
> On 2012-04-10 T 20:48 +0700 Fajar A. Nugraha wrote:
>> How can I create config for /data or other directories (other than
>> manually creating the config file and .snapshots directory)?
>
> This should do it:
>
> sudo snapper -c home create-config /home
> sudo snapper -c data create-config /data
>
> The reason for the extra "-c <name>" is that you have to
> tell snapper which name to choose for the configuration
> you want to create. This name is the one you can reference
> in future actions such as create/modify/delete.

Great! That works, thanks.

Is there an opposite of create-config, i.e. a delete for just one subvolume?
delete-config seems to delete everything (configs for all subvolumes
and all snapshots).

Also, one minor detail, I noticed that the cron configuration file is
/etc/sysconfig/snapper. It should be /etc/default/snapper in
ubuntu/debian.

-- 
Fajar


Re: snapper for Ubuntu? (WAS: btrfs auto snapshot)

2012-04-10 Thread Fajar A. Nugraha
On Tue, Apr 10, 2012 at 6:46 PM, Arvin Schnell  wrote:
> On Mon, Apr 09, 2012 at 08:18:45AM +0700, Fajar A. Nugraha wrote:
>> I noticed that openSUSE buildservice now provides debs for ubuntu as
>> well. I can't seem to find a way to add it to apt source list though,
>> using the usual line
>>
>> deb uri distribution [component1] 
>
> You can use these commands:
>
> echo 'deb 
> http://download.opensuse.org/repositories/filesystems:/snapper/Debian_6.0/ /' 
> >> /etc/apt/sources.list

I didn't know you could use that format :D Just tested it, and it
works, although the command I use is

echo 'deb 
http://download.opensuse.org/repositories/filesystems:/snapper/Debian_6.0/
/' | sudo tee /etc/apt/sources.list.d/opensuse-snapper.list

>
> apt-get update

That got me the error

W: GPG error: http://download.opensuse.org  Release: The following
signatures couldn't be verified because the public key is not
available: NO_PUBKEY 2DA6FAF4175BFA4E

easily fixed though, using

$ sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 2DA6FAF4175BFA4E

... and then another apt-get update after that.


> apt-get install snapper

That results in a warning:

WARNING: The following packages cannot be authenticated!
  libsnapper snapper
Install these packages without verification [y/N]?

Did the package creation process somehow omit the signing step,
perhaps? Or is there something else I missed?

Anyway, I have snapper-0.0.10-0 installed now, but I'm having a small
problem. I use different subvolumes for multiple directories, for
example /home and /data. Creating the config for both results in an
error:

$ sudo snapper list-configs
Config | Subvolume
---+--
$ sudo snapper create-config /home
$ sudo snapper create-config /data
Creating config failed (config already exists).
$ sudo snapper list-configs
Config | Subvolume
---+--
root   | /home

How can I create config for /data or other directories (other than
manually creating the config file and .snapshots directory)?

-- 
Fajar


Re: Snapper packages for Ubuntu

2012-04-10 Thread Fajar A. Nugraha
On Tue, Apr 10, 2012 at 6:50 PM, Arvin Schnell  wrote:
> On Tue, Apr 10, 2012 at 05:37:38PM +0700, Fajar A. Nugraha wrote:
>> Hi,
>>
>> I've created snapper packages for Ubuntu, available on
>> https://launchpad.net/~snapper/+archive/stable. For those new to
>> snapper, it's a tool for managing btrfs snapshots
>> (http://en.opensuse.org/Portal:Snapper). It depends on libblocxx

> libblocxx is not required for snapper anymore since about a
> month. It's checked during configure.

You're right. I just tested it, and not having libblocxx during
compilation results in fewer dependencies (dropping libblocxx itself,
plus libssl, libcrypto, and libpcre).

What functionality, if any, is not available when not using libblocxx?
Since it's still used when present during configure, I assume it's
good for something.

Thanks.

Fajar


[PATCH] Snapper: Always create .snapshots dir unconditionally

2012-04-10 Thread Fajar A. Nugraha
The current version of snapper (commit 50dec40) bails out with this
error if the .snapshots directory doesn't exist (as is the case on a
fresh snapper install):

2012-04-10 16:15:30,241 ERROR libsnapper(17784)
Snapshot.cc(nextNumber):362 - mkdir failed errno:2 (No such file or
directory)

This patch tries to create .snapshots dir unconditionally.

Signed-off-by: Fajar A. Nugraha 
---
 snapper/Snapshot.cc |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/snapper/Snapshot.cc b/snapper/Snapshot.cc
index 8e9cc37..277fad7 100644
--- a/snapper/Snapshot.cc
+++ b/snapper/Snapshot.cc
@@ -353,6 +353,9 @@ namespace snapper
if (snapper->getFilesystem()->checkSnapshot(num))
continue;

+   // try to create .snapshots dir unconditionally
+   mkdir(snapper->infosDir().c_str(), 0711);
+
if (mkdir((snapper->infosDir() + "/" + decString(num)).c_str(),
0777) == 0)
break;

-- 
1.7.9.1


Snapper packages for Ubuntu

2012-04-10 Thread Fajar A. Nugraha
Hi,

I've created snapper packages for Ubuntu, available on
https://launchpad.net/~snapper/+archive/stable. For those new to
snapper, it's a tool for managing btrfs snapshots
(http://en.opensuse.org/Portal:Snapper). It depends on libblocxx
available from https://launchpad.net/~bjoern-esser-n/+archive/blocxx ,
and currently uses git source up to commit 50dec40. I've done some
limited testing and it seems to work correctly so far.

There's a small, distro-independent patch needed for it to work
correctly though. I'm sending it as a separate mail.

@Arvin, @MGE, I don't know the correct list for snapper development so
I'm cc-ing you both. If there's a dedicated list for snapper please
let me know and I'll post further updates there.

-- 
Fajar


snapper for Ubuntu? (WAS: btrfs auto snapshot)

2012-04-08 Thread Fajar A. Nugraha
On Thu, Mar 1, 2012 at 8:48 PM, Arvin Schnell  wrote:
> We have now created a project in the openSUSE buildservice were
> we provide snapper packages for various distributions, e.g. RHEL6
> and Fedora 16. Please find the downloads at:
>
>  http://download.opensuse.org/repositories/filesystems:/snapper/
>
> I'll also add a link from the snapper home page:
>
>  http://en.opensuse.org/Portal:Snapper.
>
> I have tested snapper on Fedora 16 and found no problems.

Hi Arvin,

I noticed that openSUSE buildservice now provides debs for ubuntu as
well. I can't seem to find a way to add it to apt source list though,
using the usual line

deb uri distribution [component1] 

Is there a howto somewhere, or is it
download-all-debs-manually-and-install-with-dpkg for now?

Thanks,

Fajar


Re: btrfsck integration with userlevel API for fsck

2012-03-30 Thread Fajar A. Nugraha
On Sat, Mar 31, 2012 at 3:35 AM, Avi Miller  wrote:
>
> On 30/03/2012, at 2:22 PM, Fajar A. Nugraha wrote:
>
>> On Fri, Mar 30, 2012 at 5:08 AM, member graysky  wrote:
>>> Are there plans to integrate btrfsck with the userlevel API for fsck?
>>
>> There isn't even a stable, working, fixing btrfsck yet :)
>
> Yes, there is. Chris merged the btrfsck changes into the btrfs-progs master 
> in git a few days ago and we shipped it with the Oracle Linux UEK2 update as 
> well.

Ah, OK. I must've missed the announcement. Thanks for the update.

Now if only UEK2 fully supports LXC as well instead of tech preview ... :D

-- 
Fajar


Re: btrfsck integration with userlevel API for fsck

2012-03-29 Thread Fajar A. Nugraha
On Fri, Mar 30, 2012 at 5:08 AM, member graysky  wrote:
> Are there plans to integrate btrfsck with the userlevel API for fsck?

There isn't even a stable, working, fixing btrfsck yet :)

> AFAIK, it currently does not work as such (i.e. `shutdown -rF now`
> does not trigger a check on the next boot).  What is the recommended
> method to check a btrfs root filesystem?  Live media?

Currently? None. Set the last field of the root line in fstab to "0"
to disable fsck.
Newer kernels should be smart enough to recover from an unclean
shutdown automatically, kinda like what zfs does, or what ext3/4 does
with its journal replay.
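For reference, an fstab root line with the sixth field (fs_passno) set to 0 might look like this; the device and options are placeholders, only the trailing 0 is the point:

```
# /etc/fstab -- example only; device, mountpoint and options are placeholders
# <device>  <mountpoint>  <type>  <options>  <dump>  <pass>
/dev/sda6   /             btrfs   defaults   0       0
```

A pass value of 0 tells the boot scripts to skip fsck for that filesystem entirely.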

-- 
Fajar


Re: Create subvolume from a directory?

2012-03-27 Thread Fajar A. Nugraha
On Wed, Mar 28, 2012 at 5:24 AM, Matthias G. Eckermann  wrote:
> While the time measurement might be flawed due to the subvol
> actions inbetween, caching etc.: I tried several times, and
> "cp --reflinks" always is multiple times faster than "mv" in
> my environment.

So this is cross-subvolume reflinks? I thought the code for that
wasn't merged yet?

-- 
Fajar


Re: btrfs and backups

2012-03-26 Thread Fajar A. Nugraha
On Mon, Mar 26, 2012 at 3:56 PM, Felix Blanke  wrote:
> On 3/26/12 10:30 AM, James Courtier-Dutton wrote:
>> Is there some tool like rsync that I could copy all the data and
>> snapshots to a backup system, but still only use the same amount of
>> space as the source filesystem.


> I'm not sure if I understand your problem right, but I would suggest:
>
> 1) Snapshot the subvolume on the source
> 2) rsync the snapshot to the destination
> 3) Snapshot the destination

James did say "only use the same amount of space as the source
filesystem." Your approach would increase the usage when one or more
subvolumes share the same space (e.g. when one subvolume starts as a
snapshot of another).

AFAIK the (planned) way to do this is using "btrfs send | receive",
which is not available yet.

-- 
Fajar


Re: Can't mount, power failure - recoverable?

2012-03-26 Thread Fajar A. Nugraha
On Mon, Mar 26, 2012 at 3:49 PM, Skylar Burtenshaw  wrote:
> Fajar A. Nugraha  fajar.net> writes:
>
>> Didn't Chris' last response basically say "use kernel 3.2 or newer,
>> mount the fs (possibly with -o ro), and copy the data elsewhere"?
>
> Why yes, yes it did actually. I appreciate your spotlighting it, just in case 
> I
> somehow managed to miss it, though.
>
>> Have you done that?
>
> I have. In fact, in my first message, I stated that in all kernels up to 
> present
> 3.2 kernels, I get several minutes of disk churning, then a stack trace. Also
> present in my messages is the fact that the filesystem will not mount, as well
> as data output from the recovery program etc which fail to recognize things in
> the filesystem that they require in order to fix it. Did you have something 
> you
> wished to suggest, in order to help me? If so, I'd gladly listen to any 
> proposed
> ideas.

Since you apparently tried "-o ro" (which I missed), my last
suggestion is kernel 3.3 with "-o ro", just in case :)

-- 
Fajar


Re: Can't mount, power failure - recoverable?

2012-03-26 Thread Fajar A. Nugraha
On Mon, Mar 26, 2012 at 3:34 PM, Skylar Burtenshaw  wrote:
> Hey - been a few days, not meaning to pester but I wanted to make sure my
> previous message didn't slip through the cracks. If I offended, I apologize - 
> I
> certainly didn't mean to, and my attempts at joviality can come across as
> abrasive. If you simply haven't had time to look into this yet, or it's 
> bizarre
> enough that it's taking time to isolate, take all the time you need. Thank 
> you.

Didn't Chris' last response basically say "use kernel 3.2 or newer,
mount the fs (possibly with -o ro), and copy the data elsewhere"? Have
you done that?

-- 
Fajar


Re: compressed btrfs "No space left on device"

2012-03-06 Thread Fajar A. Nugraha
On Thu, Nov 17, 2011 at 12:59 AM, Arnd Hannemann  wrote:
> Am 14.11.2011 19:24, schrieb Arnd Hannemann:
>> Am 14.11.2011 15:57, schrieb Arnd Hannemann:
>>
>>> I'm using btrfs for my /usr/share/ partition and keep getting the following 
>>> error
>>> while installing a debian package which should take no more than 228 MB:
>>>
>>> Unpacking texlive-fonts-extra (from 
>>> .../texlive-fonts-extra_2009-10ubuntu1_all.deb) ...
>>>  dpkg: error processing 
>>> /var/cache/apt/archives/texlive-fonts-extra_2009-10ubuntu1_all.deb 
>>> (--unpack):
>>>  unable to install new version of 
>>> `/usr/share/texmf-texlive/fonts/type1/public/allrunes/frutlt.pfb': No space 
>>> left on device


>> FYI: The problem is the same with mainline kernel v3.1.1.
>
> JFYI: the problem went away in 3.2-rc2  so someone must
> have fixed something.

I just experienced the same thing in Ubuntu precise, 3.2.0-17-generic,
so I don't think it's fixed yet.

$ sudo btrfs fi df /
Data: total=43.47GB, used=38.47GB
System, DUP: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=3.25GB, used=912.47MB
Metadata: total=8.00MB, used=0.00

$ df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda6        50G   41G  5.1G  89% /

The problem occurs when copying the precise lxc root template (322M,
13759 files/directories).

It only happens when using zlib compression though, using lzo works fine.

-- 
Fajar


Re: [PATCH] [RFC] Add btrfs autosnap feature

2012-03-04 Thread Fajar A. Nugraha
On Mon, Mar 5, 2012 at 1:51 PM, Anand Jain  wrote:
>
>> (notably the direct modification of
>> crontab files, which is considered to be an internal detail if I
>> understand correctly, and I'm fairly certain is broken as written),
>
>
>  I did came across that point of view however, using crontab cli in the
>  program wasn't convincing either, (library call would have been better).
>  any other better ways to manage cron entries ?
>

/etc/cron.{d,daily,hourly} ?
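To make the suggestion concrete, a package could ship a drop-in file instead of editing crontab files directly. In this sketch the autosnap command name and its options are made up for illustration, not the patch's real interface:

```
# /etc/cron.d/btrfs-autosnap -- hypothetical drop-in; the command
# name and --tag option are placeholders, not autosnap's real CLI
# min hour dom mon dow  user  command
*/15 *    *   *   *     root  /usr/local/sbin/btrfs-autosnap --tag @minute
0    *    *   *   *     root  /usr/local/sbin/btrfs-autosnap --tag @hourly
```

Unlike per-user crontabs, /etc/cron.d files are owned by the package manager, so installs and removals stay clean.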

-- 
Fajar


Re: filesystem full when it's not? out of inodes? huh?

2012-03-02 Thread Fajar A. Nugraha
On Fri, Mar 2, 2012 at 6:50 PM, Brian J. Murrell  wrote:
> Is  2010-06-01 really the last time the tools were considered
> stable or are Ubuntu just being conservative and/or lazy about updating?

The last one :)

Or probably no one has bugged them enough and pointed out that
they're already using a git snapshot anyway and that there are many
new features in the "current" git version of btrfs-tools.

I have been compiling my own kernel (just recently switched to
Precise's kernel though) and btrfs-progs for quite some time, so even
if Ubuntu doesn't provide updated package it wouldn't matter much to
me. If it's important for you, you could file a bug report in
launchpad asking for an update. Even debian testing has an updated
version (which you might be able to use:
http://packages.debian.org/btrfs-tools)

Or create your own ppa with an updated version (or at least rebuilt of
Debian's version).

-- 
Fajar


Re: [RFC] btrfs auto snapshot

2012-03-01 Thread Fajar A. Nugraha
On Thu, Mar 1, 2012 at 8:48 PM, Arvin Schnell  wrote:
> On Thu, Feb 23, 2012 at 04:54:06PM +0700, Fajar A. Nugraha wrote:
>> On Thu, Aug 18, 2011 at 12:38 AM, Matthias G. Eckermann  
>> wrote:
>
>> > are available in the openSUSE buildservice at:
>> >
>> >        http://download.opensuse.org/repositories/home:/mge1512:/snapper/
>> >
>>
>> Hi Matthias,
>>
>> I'm testing your packages on top of RHEL6 + kernel 3.2.7. A small
>> suggestion, you should include /etc/sysconfig/snapper in the package
>> (at least for RHEL6, haven't tested the other ones). Even if it just
>> contains
>>
>> SNAPPER_CONFIGS=""
>
> Hi Fajar,
>
> thanks for reporting that issue, I have fixed it now.

Great! Thanks.

>
> We have now created a project in the openSUSE buildservice were
> we provide snapper packages for various distributions, e.g. RHEL6
> and Fedora 16. Please find the downloads at:
>
>  http://download.opensuse.org/repositories/filesystems:/snapper/
>
> I'll also add a link from the snapper home page:
>
>  http://en.opensuse.org/Portal:Snapper.
>
> I have tested snapper on Fedora 16 and found no problems.

When I installed it back then, the first thing that came to mind was
"there's no documentation on how to get started".

http://en.opensuse.org/openSUSE:Snapper_Tutorial is good, but it
assumes root is btrfs and snapper is already configured to snapshot
root. For other distros, you need to create the config manually first,
e.g. as shown for home in http://en.opensuse.org/openSUSE:Snapper_FAQ

Could you update the tutorial, or perhaps create a new "quickstart"
page? I'm kinda reluctant to do it myself since I don't use opensuse,
and some of my edits might not reflect the "correct" way to do it in
opensuse. If that's not possible, I'll put up the documentation
somewhere else (perhaps the semi-official http://btrfs.ipv5.de/ , or
my own wiki).

Two other things that I haven't found are:
- how to add pre and post hooks, so (for example) snapper could create
the same pre-post snapshot whenever a user runs "yum", similar to when
a user runs "yast" in opensuse,
- whether a rollback REALLY rolls back everything (including binary
and new/missing files), or whether it's git-like behavior, or it only
processes text files.

... but those two aren't as important as the getting-started documentation.

-- 
Fajar


Re: Btrfs Storage Array Corrupted

2012-02-28 Thread Fajar A. Nugraha
On Wed, Feb 29, 2012 at 7:13 AM, Travis Shivers  wrote:
> # ./btrfs-zero-log /dev/sdh
> parent transid verify failed on 5568194695168 wanted 43477 found 43151
> parent transid verify failed on 5568194695168 wanted 43477 found 43151
> parent transid verify failed on 5568194695168 wanted 43477 found 43151
> parent transid verify failed on 5568194695168 wanted 43477 found 43151
> Ignoring transid failure

Did you try a read-only mount (-o ro) after you run btrfs-zero-log?

-- 
Fajar


Re: [RFC] btrfs auto snapshot

2012-02-23 Thread Fajar A. Nugraha
On Thu, Feb 23, 2012 at 7:02 PM, Anand Jain  wrote:
>
>
>  autosnap code is available either end of this week or early
>  next week

I thought you stopped working on this :D

Alternatives are good though. Will test yours when it's out.

FWIW, I also have another one, based on zfsonlinux's autosnapshot script :D

> and what you will notice is autosnap snapshots
>  are named using uuid.
>
>  Main reason to drop time-stamp based names is that,
>    - test (clicking on Take-snapshot button) which took more
>  than one snapshot per second was failing.
>    - a more descriptive creation time is available using a
>   command line option as in the example below.
>  -
>  # btrfs su list -t tag=@minute,parent=/btrfs/sv1 /btrfs
>  /btrfs/.autosnap/6c0dabfa-5ddb-11e1-a8c1-0800271feb99 Thu Feb 23 13:01:18
> 2012 /btrfs/sv1 @minute
>  /btrfs/.autosnap/5669613e-5ddd-11e1-a644-0800271feb99 Thu Feb 23 13:15:01
> 2012 /btrfs/sv1 @minute
>  -
>  As of now code for time-stamp as autosnap snapshot name is
>  commented out, if more people wanted it to be a time-stamp
>  based names, I don't mind having that way. Please do let me know.

For me the main bonus of having the timestamp in the name is the
ability to sort snapshots by creation date simply by running "ls". As
for the more-than-one-click-per-second problem, in my script I simply
let it fail and return an informative-enough error message.

A workaround would be to add a nanosecond timestamp, or to put the
UUID AFTER the timestamp, e.g.:
/btrfs/.autosnap/@minute_20120223_131501_123456_5669613e-5ddd-11e1-a644-0800271feb99
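A minimal sketch of that naming scheme (the tag and the .autosnap path are assumptions from this thread; /proc/sys/kernel/random/uuid is the kernel's standard random-UUID source). The fixed-width timestamp makes plain "ls" output sort chronologically, while the UUID suffix keeps names unique even within one timestamp tick:

```shell
# Build a sortable, unique snapshot name: tag, then nanosecond
# timestamp, then a random UUID. Paths are examples only.
tag="@minute"
ts=$(date +%Y%m%d_%H%M%S_%N)               # e.g. 20120223_131501_123456789
uuid=$(cat /proc/sys/kernel/random/uuid)   # kernel-provided random UUID
name="${tag}_${ts}_${uuid}"
echo "$name"
# The snapshot itself would then be created along the lines of:
#   btrfs subvolume snapshot /btrfs/sv1 "/btrfs/.autosnap/$name"
```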

-- 
Fajar


Re: [RFC] btrfs auto snapshot

2012-02-23 Thread Fajar A. Nugraha
On Thu, Aug 18, 2011 at 12:38 AM, Matthias G. Eckermann  wrote:
> Ah, sure. Sorry.  Packages for "blocxx" for:
>        Fedora_14       Fedora_15
>        RHEL-5          RHEL-6
>        SLE_11_SP1
>        openSUSE_11.4   openSUSE_Factory
>
> are available in the openSUSE buildservice at:
>
>        http://download.opensuse.org/repositories/home:/mge1512:/snapper/
>

Hi Matthias,

I'm testing your packages on top of RHEL6 + kernel 3.2.7. A small
suggestion, you should include /etc/sysconfig/snapper in the package
(at least for RHEL6, haven't tested the other ones). Even if it just
contains

SNAPPER_CONFIGS=""

Thanks,

Fajar


PATCH: Fix incorrect "error checking ... mount status" in mkfs.btrfs

2012-02-22 Thread Fajar A. Nugraha
Originally reported on linux-btrfs list:
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg08086.html

Fix suggested on:
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg08386.html

loop_info.lo_name is limited to LO_NAME_SIZE (currently 64) characters.
This can cause a problem if a file whose full path is longer than
LO_NAME_SIZE is currently mounted.

This patch changes resolve_loop_device() to:
* Check /sys/block/loopX/loop/backing_file first
* If that fails, fallback to original behaviour using loop_info.lo_name

Patch is both inline and attached (in case the mail client mangles it).

Signed-off-by: Fajar A. Nugraha 
---
 utils.c |   18 +-
 1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/utils.c b/utils.c
index 178d1b9..a62 100644
--- a/utils.c
+++ b/utils.c
@@ -649,6 +649,9 @@ int resolve_loop_device(const char* loop_dev,
char* loop_file, int max_len)
 {
int loop_fd;
int ret_ioctl;
+   int sysfs_fd;
+   char sysfs_path[PATH_MAX];
+   const char* sysfs_path_format = "/sys/block/loop%d/loop/backing_file";
struct loop_info loopinfo;

if ((loop_fd = open(loop_dev, O_RDONLY)) < 0)
@@ -658,7 +661,20 @@ int resolve_loop_device(const char* loop_dev,
char* loop_file, int max_len)
close(loop_fd);

if (ret_ioctl == 0)
-   strncpy(loop_file, loopinfo.lo_name, max_len);
+   {
+   snprintf(sysfs_path, PATH_MAX, sysfs_path_format, 
loopinfo.lo_number);
+   sysfs_fd = open(sysfs_path, O_RDONLY);
+   if (sysfs_fd < 0)
+   {
+   strncpy(loop_file, loopinfo.lo_name, max_len);
+   }
+   else
+   {
+   read(sysfs_fd, loop_file, max_len);
+   loop_file[strlen(loop_file)-1] = '\0';
+   close(sysfs_fd);
+   }
+   }
else
return -errno;

-- 
1.7.9
From e004166d8f3b30e0d498df995ac9de8b11cce59a Mon Sep 17 00:00:00 2001
From: "Fajar A. Nugraha" 
Date: Thu, 23 Feb 2012 13:28:33 +0700
Subject: [PATCH] Fix incorrect "error checking ... mount status" in
 mkfs.btrfs

Originally reported on linux-btrfs list:
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg08086.html

Fix suggested on:
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg08386.html

loop_info.lo_name is limited to LO_NAME_SIZE (currently 64) characters.
This can cause a problem if a file whose full path is longer than
LO_NAME_SIZE is currently mounted.

This patch changes resolve_loop_device() to:
* Check /sys/block/loopX/loop/backing_file first
* If that fails, fallback to original behaviour using loop_info.lo_name

Signed-off-by: Fajar A. Nugraha 
---
 utils.c |   18 +-
 1 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/utils.c b/utils.c
index 178d1b9..a62 100644
--- a/utils.c
+++ b/utils.c
@@ -649,6 +649,9 @@ int resolve_loop_device(const char* loop_dev, char* loop_file, int max_len)
 {
 	int loop_fd;
 	int ret_ioctl;
+	int sysfs_fd;
+	char sysfs_path[PATH_MAX];
+	const char* sysfs_path_format = "/sys/block/loop%d/loop/backing_file";
 	struct loop_info loopinfo;
 
 	if ((loop_fd = open(loop_dev, O_RDONLY)) < 0)
@@ -658,7 +661,20 @@ int resolve_loop_device(const char* loop_dev, char* loop_file, int max_len)
 	close(loop_fd);
 
 	if (ret_ioctl == 0)
-		strncpy(loop_file, loopinfo.lo_name, max_len);
+	{
+		snprintf(sysfs_path, PATH_MAX, sysfs_path_format, loopinfo.lo_number);
+		sysfs_fd = open(sysfs_path, O_RDONLY);
+		if (sysfs_fd < 0)
+		{
+			strncpy(loop_file, loopinfo.lo_name, max_len);
+		}
+		else
+		{
+			read(sysfs_fd, loop_file, max_len);
+			loop_file[strlen(loop_file)-1] = '\0';
+			close(sysfs_fd);
+		}
+	}
 	else
 		return -errno;
 
-- 
1.7.9



Re: btrfs-convert processing time

2012-02-20 Thread Fajar A. Nugraha
On Mon, Feb 20, 2012 at 9:29 PM, Olivier Bonvalet  wrote:
> On 20/02/2012 15:00, Fajar A. Nugraha wrote:
>>
>> On Mon, Feb 20, 2012 at 8:50 PM, Hubert Kario  wrote:
>>>
>>> On Monday 20 of February 2012 14:41:33 Olivier Bonvalet wrote:
>>>>
>>>> Lot of small files (like compressed email from Maildir), and lot of
>>>> hardlinks, and probably low free space (near 15% I suppose).
>>>>
>>>>
>>>> So I think I have my answer :)
>>>>
>>>
>>> Yes, this is probably the worst possible combination.
>>>
>>> Plese keep us updated. Just to have exact numbers for new users.
>>
>>
>>
>> ... although it would probably fail anyway due to btrfs hardlink limit
>> in the same directory.
>>
>
> And in that case, btrfs-convert will abort, or ignore the error, or just
> hang ?

On my simple test with ubuntu precise, loop-mounted ext4, 8k hardlinks:

$ sudo btrfs-convert /dev/loop0
creating btrfs metadata.
$ echo $?
139

so no useful error message (exit status 139 indicates a segfault),
but the system keeps running. And when mounted the device still shows
ext4.

A successful conversion would look like this:

$ sudo btrfs-convert /dev/loop1
creating btrfs metadata.
creating ext2fs image file.
cleaning up system chunk.
conversion complete.

-- 
Fajar


Re: btrfs-convert processing time

2012-02-20 Thread Fajar A. Nugraha
On Mon, Feb 20, 2012 at 8:50 PM, Hubert Kario  wrote:
> On Monday 20 of February 2012 14:41:33 Olivier Bonvalet wrote:
>> Lot of small files (like compressed email from Maildir), and lot of
>> hardlinks, and probably low free space (near 15% I suppose).
>>
>>
>> So I think I have my answer :)
>>
>
> Yes, this is probably the worst possible combination.
>
> Plese keep us updated. Just to have exact numbers for new users.


... although it would probably fail anyway due to btrfs hardlink limit
in the same directory.

-- 
Fajar


Re: btrfs open_ctree failed (after recent Ubuntu update)

2012-02-19 Thread Fajar A. Nugraha
On Mon, Feb 20, 2012 at 10:34 AM, Curtis Jones  wrote:
> Chris,
>
> Thank you for those kernel-update instructions. That was the least painful 
> kernel update I could have imagined. I rebooted and verified (via uname) that 
> I am in fact running the new kernel. After looking at dmesg I can confirm 
> that the exact same error is still occurring though. I re-read your previous 
> email and saw that you recommended the 3.3-rc release if 3.2.6 didn't 
> suffice. So I did the same thing with 3.3-rc. And I found the same error (or 
> what appears to be the same error), again:
>
>> [  186.982910] device label StoreW devid 1 transid 37077 /dev/sdb
>> [  187.015081] parent transid verify failed on 79466496 wanted 33999 found 
>> 36704
>> [  187.015088] parent transid verify failed on 79466496 wanted 33999 found 
>> 36704
>> [  187.015091] parent transid verify failed on 79466496 wanted 33999 found 
>> 36704
>> [  187.015094] parent transid verify failed on 79466496 wanted 33999 found 
>> 36704
>> [  187.015764] btrfs: open_ctree failed
>
> uname now reports:
>
>> Linux veriton 3.3.0-030300rc4-generic-pae #201202181935 SMP Sun Feb 19 
>> 00:53:06 UTC 2012 i686 i686 i386 GNU/Linux
>
> I'm not sure what to try next;

I'd try the latest tools now. IIRC there are two programs you can try:
- btrfs-zero-log, which (as the name implies) zeroes out the transaction log
- restore, which tries to read files from a broken btrfs and copy
them elsewhere

Try the first one. If you can, dump the content of the disk to a file
first (with dd or dd_rescue) and try it on that file, just in case
something goes horribly wrong :)
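A sketch of that work-on-a-copy approach, assuming GNU dd and util-linux losetup; /dev/sdb and the paths are placeholders, and everything here needs root:

```shell
# Image the broken device first, then run recovery tools on the COPY.
# conv=noerror,sync keeps dd going past read errors, padding bad blocks.
dd if=/dev/sdb of=/mnt/spare/sdb.img bs=1M conv=noerror,sync

# Attach the image as a loop device (losetup prints the device it picked),
# then experiment freely -- the original disk stays untouched.
loopdev=$(losetup --show -f /mnt/spare/sdb.img)
btrfs-zero-log "$loopdev"          # zero the tree log on the copy
mount -o ro "$loopdev" /mnt/test   # see if it mounts read-only now
```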

-- 
Fajar


Re: subvolume info in /proc/mounts

2012-02-05 Thread Fajar A. Nugraha
On Sun, Feb 5, 2012 at 5:30 PM, Nikos Voutsinas  wrote:
>>> If not, what is the formal way to find out which subvolume is mounted;
>>
>> Not right now, see detailed answer to a similar question:
>> http://permalink.gmane.org/gmane.comp.file-systems.btrfs/15385

> Assuming that pools with multiple subvolumes and/or snapshots is the way
> of doing things in btrfs, the subvolume info is required for the every day
> administration.

For simple administration, /etc/mtab (or the output of mount) as
mentioned in David's link should work.

-- 
Fajar


Re: Setting options permanently?

2012-01-28 Thread Fajar A. Nugraha
On Sat, Jan 28, 2012 at 7:49 AM, Hadmut Danisch  wrote:
> Am 28.01.2012 00:20, schrieb Chester:
>> It should be okay to mount with compress or without compress. Even if
>> you mount a volume with compressed data without '-o compress' you will
>> still be able to correctly read the data (but newly written data will
>> not be compressed)
>
> But having both compressed and uncompressed files in the filesystem is
> exactly what I want to avoid. Not because of reading problems, but to
> avoid wasting disk space. I don't have a reading problem. I have a
> writing problem.

If you've been using -o compress, then you should know that even then
not ALL data is compressed. If btrfs predicts that data won't benefit
from compression, it stores it uncompressed. The problem is that the
prediction is not always right, which is why there's -o
compress-force.

Anyway, for the removable media case, there's a workaround you can use
(at least it works with GNOME): put an entry for the USB block device
(e.g. /dev/sdb1) in fstab with the appropriate mount options, and those
options will be used when you mount it via Nautilus.
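A hypothetical fstab entry along those lines (device, mount point, and
options are examples, not from the thread):

```
# /etc/fstab -- options applied when the desktop auto-mounts /dev/sdb1
/dev/sdb1  /media/usb  btrfs  noauto,user,compress=lzo  0  0
```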

-- 
Fajar


Re: Problem with 3.3.0-rc1+: Target filesystem cannot find /sbin/init

2012-01-22 Thread Fajar A. Nugraha
On Sun, Jan 22, 2012 at 7:21 PM, Swapnil Pimpale  wrote:
> I can successfully boot into Ubuntu 11.10 (3.0.0-14-generic-pae) with
> a btrfs root filesystem and an ext2 /boot partition.
> But when I installed the latest vanilla (3.3.0-rc1+) and booted into

where did you get the kernel from? kernel.org snapshot? git? third
party package?

> it, the first time the system froze.
> Next time onwards, I get the following error every time:
>
> [   0.427443] [drm:i915_init]  *ERROR* drm/i915 cannot work without
> intel_agp module!
> mount: mounting udev on /dev failed: No such device
> W: devtmpfs not available, falling back to tmpfs for /dev
> mount: mounting /dev/disk/by-uuid/f43fdd7a-8ad7-4e96-ab1c-14ba82a4324d
> on /root failed: No such device

Do you know how to build and use your own custom kernel? That error is
common when a driver is missing (i.e. not built in, and not included in
the initrd). The easiest way to test that is to look at what's in
/proc/partitions and /dev/disk/by-id during a normal system boot (I
assume you still have the old, working Ubuntu kernel?) and during a
failed boot when you're dropped to busybox. If your root device
(sda8?) is not in /proc/partitions, then it's definitely a block
device driver problem.
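A minimal sketch of that check; the /proc/partitions snapshot below is
made-up sample data, and the sda8 device name just follows this thread:

```shell
# Pretend this is a copy of /proc/partitions captured from the failed boot.
parts='major minor  #blocks  name
   8        0  488386584 sda
   8        8   52428800 sda8'

# awk's exit status tells us whether the root device was seen.
if echo "$parts" | awk '$4 == "sda8" { found = 1 } END { exit !found }'; then
  echo "sda8 present: block driver loaded, look elsewhere"
else
  echo "sda8 missing: block device driver problem"
fi
```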

-- 
Fajar


Re: Btrfsck gives me errors

2012-01-19 Thread Fajar A. Nugraha
On Fri, Jan 20, 2012 at 11:24 AM, Jérôme Poulin  wrote:
> On Wed, Jan 18, 2012 at 11:59 PM, Fajar A. Nugraha  wrote:
>> some files, unmount, and mount it again. If second mount does not show
>> any error message then I'm pretty sure you're safe.
>
> I just upgraded from 3.0 to 3.2.1 and mounted the filesystem, tried
> find > /dev/null and only got messages about old space inode.

That's normal. You'll also get the message if you switch back to 3.0,
but it should be harmless.

> I then
> used btrfsck again for the same exact result, I'll ignore them for
> now, let's see what the shiny new btrfsck will do about them!

who knows when it will be available :)

Then again, most fsck features have been implemented in kernel space,
so a mount will automatically "fix" some types of problems (somewhat
similar to what zfs does, which has no fsck whatsoever). So just watch
syslog for any unusual error messages.

-- 
Fajar


Re: Btrfsck gives me errors

2012-01-18 Thread Fajar A. Nugraha
On Thu, Jan 19, 2012 at 9:02 AM, Jérôme Poulin  wrote:
> I did a preemptive fsck after a RAID crash and got many errors. Is
> there something I should do if everything I use works?

Probably just ignore it.

Recent kernels (e.g. 3.1 or 3.2) are smart enough to automatically fix
certain types of errors. Watch syslog when you mount the fs, access
some files, unmount, and mount it again. If the second mount does not
show any error message, then I'm pretty sure you're safe.

-- 
Fajar


Re: Encryption implementation like ZFS?

2011-12-31 Thread Fajar A. Nugraha
On Sun, Jan 1, 2012 at 12:12 AM, Niels de Carpentier
 wrote:
>>> ... and depending on which SSD you use, it shouldn't matter. Really.
>>>
>>> Last time I tried with sandforce SSD + btrfs + -o discard, forcing
>>> trim actually made things slower. Sandforce (and probably other modern
>>> SSD) controllers can work just fine even without explicit trim fs
>>> support.
>>
>> What command did you use to test this?

Normal usage, and some random I/O test tools like fio.

>>
>> I have an OCZ Agility 3 SSD, which have the latest Sandforce
>> controller, so I would really like to try reproduce your test setup.

Yours should be newer. Mine is a somewhat-old Corsair Force 60 GB with
btrfs on top. When I activated -o discard, it actually became slower.
Also, when I used fstrim, the IOPS were capped at 100, so the slowdown
is probably because of that (i.e. an IOPS limit on TRIM somewhere,
possibly in the controller).

>
> Ok, the sandforce controller makes things interesting.
>
> First of all, sandforce controllers have a very high failure rate, so make
> sure you have backups!!

Yes, but even knowing that I can't imagine going back to HDD for this
particular system. It'd be too slow to bear :P

> Sandforce controllers also use compression and deduplication to increase
> performance. Encryption will make your data incompressible and random, so
> this can have a big impact on performance, depending on the
> characteristics of your data.

In my case I use compress=lzo, so the data shouldn't be compressible
by the controller.

> Sandforce controllers also have life time throttling, which will throttle
> writes heavily if it thinks you will wear out the  flash within the
> warranty period. If you have a very heavy write workload this can be an
> issue.

That's new. Is there a link/reference for that?

>
> If you don't have a working trim it is a good idea to leave part of your
> drive unused. (Make sure you either do this after a full write erase of
> the drive, or do a manual trim of that area, otherwise it won't work).
> This will make sure the drive has enough spare sectors to do garbage
> collection and can greatly improve performance if your drive is full.

True. But in my last test I couldn't get fstrim to trim everything. It
could only trim about 2GB out of 12GB of free space.

-- 
Fajar


Re: Encryption implementation like ZFS?

2011-12-31 Thread Fajar A. Nugraha
On Sat, Dec 31, 2011 at 3:12 AM, Sandra Schlichting
 wrote:
>> How is this advantageous over dmcrypt-LUKS?
>
> TRIM pass-through for SSD's. With dmcrypt on an SSD write performance
> is very slow.

... and depending on which SSD you use, it shouldn't matter. Really.

Last time I tried with sandforce SSD + btrfs + -o discard, forcing
trim actually made things slower. Sandforce (and probably other modern
SSD) controllers can work just fine even without explicit trim fs
support.

-- 
Fajar


Re: Re: Re: Two way mirror in BRTFS

2011-12-30 Thread Fajar A. Nugraha
2011/12/30 Jaromir Zdrazil :
>> > And if I am not mistaken, current version does not yet support a mountable
>> filesystem.
>>
>> You're mistaken :) With some extra work, you can even use it as root:
>> - http://zfsonlinux.org/example-zpl.html
>> -
>> https://github.com/dajhorn/pkg-zfs/wiki/HOWTO-install-Ubuntu-to-a-Native-ZFS-Root-Filesystem
>>
> It seems I trust the web pages too much - http://zfsonlinux.org/ says
> that it does not ;O)) otherwise I would be using it already.

The web page is correct.
http://zfsonlinux.org/: "Please keep in mind the current 0.5.2 stable
release does not yet support a mountable filesystem. This
functionality is currently available only in the 0.6.0-rc6 release
candidate."
http://zfsonlinux.org/example-zpl.html: "However, all the core
functionality is in place and most of the advanced features are
working. Stability of the latest release candidates has been very good
and performance is respectible. Many people are successfully using the
ZFS on Linux release candidates."

Most zfsonlinux users use 0.6.0-rc6, and many of them use the
easy-to-install package from the Ubuntu PPA.

>> Either way, neither zfs or the (planned) btrfs send/receive supports
>> two-way/active-active setup. Both should (or will) work just fine for
>> one-way replication.
>>
> That is what I needed to know! Thank you very much!

You're welcome.

-- 
Fajar


Re: Re: Two way mirror in BRTFS

2011-12-30 Thread Fajar A. Nugraha
2011/12/30 Jaromir Zdrazil :
>> > Just to add, I would like to see a two-way mirror solution, but if it will
>> > not work now/is not implemented yet, I would probably choose between drbd in
>> asynchronous mode or making some kind of "incremental" snapshot to a remote
>> mapped disk (I do not know yet if btrfs supports it) - it means having one
>> snapshot and, let's say, a daily incremental update of this snapshot.
>>
>> You mean like "zfs send -i"? If yes, why not just use zfs? There's
>> zfsonlinux project, with easy-to-install ppa for ubuntu. Or you could
>> compile it manually.
>>
> Thank you for your suggestion. As I know, there is not everything ported yet, 
> and one of the missing important features I plan to use is to crypt fs.

Correct. But btrfs doesn't do encryption either.
And if you're thinking of using luks/dm-crypt to provide encryption
for btrfs, there's nothing preventing you from using the same thing
with zfs.

> And if I am not mistaken, current version does not yet support a mountable 
> filesystem.

You're mistaken :) With some extra work, you can even use it as root:
- http://zfsonlinux.org/example-zpl.html
- 
https://github.com/dajhorn/pkg-zfs/wiki/HOWTO-install-Ubuntu-to-a-Native-ZFS-Root-Filesystem

>> >
>> > How would you do it?
>>
>> If you DO mean zfs-send-like-functionality, then you should ask about
>> "btrfs send and receive", not "two way mirror" (which is not an
>> accurate way to describe what you want). Also, send/receive ability
>> does not mean it can act as two-way mirror. It CAN be an alternative
>> to drbd async though.
>
> If I understand it correctly, the difference between send/receive and a two-way
> mirror is that one is synchronous and the other is not (it signals that
> the file has been successfully written after all/one instance has been
> successfully written).
> Maybe you can explain it a bit more.

Two-way: A replicates changes to B, and B can replicate its own changes to A.
One-way: A replicates changes to B, but B cannot replicate its own
changes to A.

While drbd only supports synchronous mode for an active-active setup,
generic "two-way replication" does not have to be so. Also, just
because something is synchronous does not automatically mean it
supports two-way replication.

Either way, neither zfs nor the (planned) btrfs send/receive supports
a two-way/active-active setup. Both should (or will) work just fine
for one-way replication.

-- 
Fajar


Re: Two way mirror in BRTFS

2011-12-30 Thread Fajar A. Nugraha
2011/12/30 Jaromir Zdrazil :
> Sorry for the typo in the subject!
>
> Just to add, I would like to see a two-way mirror solution, but if it will
> not work now/is not implemented yet, I would probably choose between drbd in
> asynchronous mode or making some kind of "incremental" snapshot to a remote
> mapped disk (I do not know yet if btrfs supports it) - it means having one
> snapshot and, let's say, a daily incremental update of this snapshot.

You mean like "zfs send -i"? If yes, why not just use zfs? There's
zfsonlinux project, with easy-to-install ppa for ubuntu. Or you could
compile it manually.

>
> How would you do it?

If you DO mean zfs-send-like-functionality, then you should ask about
"btrfs send and receive", not "two way mirror" (which is not an
accurate way to describe what you want). Also, send/receive ability
does not mean it can act as two-way mirror. It CAN be an alternative
to drbd async though.

I don't think there's any publicly available code for it yet though.

-- 
Fajar


Re: fstrim on BTRFS

2011-12-29 Thread Fajar A. Nugraha
On Fri, Dec 30, 2011 at 1:19 PM, Li Zefan  wrote:
>> Or would some data
>> block group can be converted to metadata, and vice versa?
>>
>
> This won't happen. Also empty block groups won't be reclaimed, but it's
> in TODO list.

Ah, OK.

6G for metadata out of 50G total seems a bit much, but I can live with
it for now.

Thanks,

Fajar


Re: Compession, on filesystem or volume?

2011-12-29 Thread Fajar A. Nugraha
On Thu, Dec 29, 2011 at 5:51 PM, Remco Hosman  wrote:
> Hi,
>
> Something I could not find in the documentation I managed to find:
> if you mount with compress=lzo and rebalance, is compression on for that
> filesystem or only a single volume?
>
> e.g., can I have a @boot volume uncompressed and @ and @home compressed.

Last time I asked a similar question, the answer was no. It's per filesystem.

However, you can change the compression of individual files between
zlib/lzo using "btrfs fi defragment -c", regardless of what the
filesystem is currently mounted with.

-- 
Fajar


Re: fstrim on BTRFS

2011-12-29 Thread Fajar A. Nugraha
On Thu, Dec 29, 2011 at 4:39 PM, Li Zefan  wrote:
> Fajar A. Nugraha wrote:
>> On Wed, Dec 28, 2011 at 11:57 PM, Martin Steigerwald
>>  wrote:
>>> But BTRFS does not:
>>>
>>> merkaba:~> fstrim -v /
>>> /: 4431613952 bytes were trimmed
>>> merkaba:~> fstrim -v /
>>> /: 4341846016 bytes were trimmed
>>
>> ... and apparently it can't trim everything. Or maybe my kernel is
>> just too old.
>>
>>
>> $ sudo fstrim -v /
>> 2258165760 Bytes was trimmed
>>
>> $ df -h /
>> Filesystem            Size  Used Avail Use% Mounted on
>> /dev/sda6              50G   34G   12G  75% /
>>
>> $ mount | grep "/ "
>> /dev/sda6 on / type btrfs (rw,noatime,subvolid=258,compress-force=lzo)
>>
>> so only about 2G out of 12G can be trimmed. This is on kernel 3.1.4.
>>
>
> That's because only free space in block groups will be trimmed. Btrfs
> allocates space from block groups, and when there's no space available,
> it will allocate a new block group from the pool. In your case there's
> ~10G in the pool.

Thanks for your response.

>
> You can do a "btrfs fi df /", and you'll see the total size of existing
> block groups.

$ sudo btrfs fi df /
Data: total=43.47GB, used=31.88GB
System, DUP: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=3.25GB, used=619.88MB
Metadata: total=8.00MB, used=0.00

That should mean existing block groups total at least 46GB, right? In
that case my pool (a 50G partition) should have only about 4GB of
space not allocated to block groups. The numbers don't seem to match.

>
> You can empty the pool by:
>
>        # dd if=/dev/zero of=/mytmpfile bs=1M
>
> Then release the space (but it won't return back to the pool):
>
>        # rm /mytmpfile
>        # sync

Is there a bad side effect of doing so? For example, since all free
space in the pool would be allocated to data block groups, would that
mean my metadata block groups are capped at 3.25GB? Or could some data
block groups be converted to metadata, and vice versa?

-- 
Fajar


Re: fstrim on BTRFS

2011-12-28 Thread Fajar A. Nugraha
On Thu, Dec 29, 2011 at 11:37 AM, Roman Mamedov  wrote:
> On Thu, 29 Dec 2011 11:21:14 +0700
> "Fajar A. Nugraha"  wrote:
>
>> I'm trying fstrim and my disk is now pegged at write IOPS. Just
>> wondering if maybe a "btrfs fi balance" would be more useful, since:


> Modern controllers (like the SandForce you mentioned) do their own wear 
> leveling 'under the hood', i.e. the same user-visible sectors DO NOT 
> neccessarily map to the same locations on the flash at all times; and 
> introducing 'manual' wear leveling by additional rewriting is not a good 
> idea, it's just going to wear it out more.

I know that modern controllers have their own wear leveling, but AFAIK
they basically:
(1) reserve a certain amount of space for wear-leveling purposes
(2) when a write request comes, use new sectors from the pool and
return the "old" sectors to the pool (doing garbage collection like
trim/rewrite in the process)
(3) can't re-use sectors that are currently in use and never
rewritten (e.g. sectors used by OS files)

If (3) is still valid, then the only way to reuse the sectors is by
forcing a rewrite (e.g. using "btrfs fi defrag"). So the question is,
is (3) still valid?

-- 
Fajar


Re: fstrim on BTRFS

2011-12-28 Thread Fajar A. Nugraha
On Thu, Dec 29, 2011 at 11:21 AM, Fajar A. Nugraha  wrote:
> I'm trying fstrim and my disk is now pegged at write IOPS. Just
> wondering if maybe a "btrfs fi balance" would be more useful,

Sorry, I meant "btrfs fi defrag"

-- 
Fajar


Re: fstrim on BTRFS

2011-12-28 Thread Fajar A. Nugraha
On Wed, Dec 28, 2011 at 11:57 PM, Martin Steigerwald
 wrote:
> But BTRFS does not:
>
> merkaba:~> fstrim -v /
> /: 4431613952 bytes were trimmed
> merkaba:~> fstrim -v /
> /: 4341846016 bytes were trimmed

... and apparently it can't trim everything. Or maybe my kernel is
just too old.


$ sudo fstrim -v /
2258165760 Bytes was trimmed

$ df -h /
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda6              50G   34G   12G  75% /

$ mount | grep "/ "
/dev/sda6 on / type btrfs (rw,noatime,subvolid=258,compress-force=lzo)

so only about 2G out of 12G can be trimmed. This is on kernel 3.1.4.
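As a sanity check on the units (plain arithmetic, nothing
btrfs-specific): fstrim reports bytes, while df -h reported ~12G free.

```shell
# Convert fstrim's byte count to GiB and compare with df's free space.
awk 'BEGIN {
  trimmed_gib = 2258165760 / (1024 ^ 3)
  printf "trimmed %.1f GiB out of ~12 GiB free\n", trimmed_gib
}'
# prints: trimmed 2.1 GiB out of ~12 GiB free
```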

-- 
Fajar

