Re: zstd compression

2017-11-16 Thread Imran Geriskovan
On 11/16/17, Austin S. Hemmelgarn  wrote:
> I'm pretty sure defrag is equivalent to 'compress-force', not
> 'compress', but I may be wrong.

Are there any devs to confirm this?


Re: Read before you deploy btrfs + zstd

2017-11-15 Thread Imran Geriskovan
On 11/15/17, Martin Steigerwald  wrote:
> Somehow I am happy that I still have a plain Ext4 for /boot. :)

You may use an uncompressed btrfs for /boot.
Both Syslinux (my choice) and Grub support it.


Re: zstd compression

2017-11-15 Thread Imran Geriskovan
On 11/15/17, Lukas Pirl  wrote:
> you might be interested in the thread "Read before you deploy
> btrfs + zstd"¹.

Thanks. I've read it. The bootloader is not an issue since /boot is on
another, uncompressed fs.

Let me make my question more generic:

Can there be any issues when switching the mount-time
compression option from one value to another, in any order?
(i.e. none -> lzo -> zlib -> zstd -> none -> ...)

zstd is just the newcomer, so my question applies to all
combinations.
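
For concreteness, the kind of switching I mean, in command form (a sketch;
the device and mount point are illustrative, and each option should only
affect data written after it is set):

mount -o compress=lzo /dev/sdb1 /mnt/data     # initial choice
mount -o remount,compress=zlib /mnt/data      # later: new writes use zlib
mount -o remount,compress=zstd /mnt/data      # later still: new writes use zstd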


zstd compression

2017-11-15 Thread Imran Geriskovan
Kernel 4.14 now includes btrfs zstd compression support.

My question:
I currently have a fs mounted and used with "compress=lzo"
option. What happens if I change it to "compress=zstd"?

My guess is that existing files will be read and decompressed via lzo,
new files will be written with zstd compression, and
everything will run smoothly.

Is this optimistic guess valid? What are the possible pitfalls,
if any? Any further advice?
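
In command form, the change I have in mind (a sketch; the mount point is
illustrative, and the defragment step is only needed if existing files
should be rewritten with the new compression, assuming a btrfs-progs
recent enough to accept -czstd):

mount -o remount,compress=zstd /mnt/data
btrfs filesystem defragment -r -czstd /mnt/data   # optional: recompress old files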

Regards,
Imran


Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)

2017-08-01 Thread Imran Geriskovan
On 8/1/17, Duncan <1i5t5.dun...@cox.net> wrote:
> Imran Geriskovan posted on Mon, 31 Jul 2017 22:32:39 +0200 as excerpted:
>>>> Now the init on /boot is a "19 lines" shell script, including lines
>>>> for keymap, hdparm, crytpsetup. And let's not forget this is possible
>>>> by a custom kernel and its reliable buddy syslinux.

>>> And I'm using dracut for that, tho quite cut down from its default,
>>> with a monolithic kernel and only installing necessary dracut modules.

>> Just create a minimal bootable /boot for running the init below.
>> (Your initramfs/rd is a bloated and packaged version of this anyway.)
>> Kick the rest. Since you have your own kernel, you are not far
>> from it.

> Thanks.  You just solved my primary problem of needing to take the time
> to actually research all the steps and in what order I needed to do them,
> for a hand-rolled script. =:^)

It's just a minimal one, but it is a good start. For possible extensions,
extract your initramfs and explore it. Dracut is bloated. Try mkinitcpio.

Once you have your self-hosted boot manager, kernel, modules, /boot, init, etc.
chain, you'll be shocked to realize how much time you have been spending
on that bullshit while trying to keep it all up..

Get to this point in the shortest possible time. Save your precious
time. And reclaim your system's reliability.

For X, you'll still need udev or eudev.


Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)

2017-07-31 Thread Imran Geriskovan
>>>> Do you have any experience/advice/comment regarding dup data on ssds?

>>> Very good question. =:^)

>> Now the init on /boot is a "19 lines" shell script, including lines for
>> keymap, hdparm, cryptsetup. And let's not forget this is possible thanks to a
>> custom kernel and its reliable buddy syslinux.
>
> FWIW...
> And I'm using dracut for that, tho quite cut down from its default, with
> a monolithic kernel and only installing necessary dracut modules.

Just create a minimal bootable /boot for running the init below.
(Your initramfs/rd is a bloated and packaged version of
this anyway.) Kick the rest. Since you have your own
kernel, you are not far from it.


#!/bin/sh
# This is actually busybox ash or hush. Can't remember now.
# You may compile/customize your busybox as well. Easy.

mount proc /proc -t proc
mount sys  /sys  -t sysfs
mount run  /run  -t tmpfs
mkdir /dev/pts /dev/shm /run/lock
mount devpts /dev/pts -t devpts &
mount shm /dev/shm -t tmpfs &
mount -o remount,rw,noatime / &

# '&' is for backgrounding / parallel execution.
# Use responsibly, double-checking its side effects
# depending on your setup.

hdparm -B 254 /dev/sda &
loadkmap < /boot/trq.bkmap

cryptsetup -T 10 luksOpen /dev/sdXX sdXX
mount /dev/mapper/sdXX /mnt/new_root -t btrfs -o noatime,compress=lzo

cd /mnt/new_root
mount --move /dev  ./dev
mount --move /proc ./proc
mount --move /sys  ./sys
mount --move /run  ./run
pivot_root . boot

exec chroot . busybox init
# Jump to your real root's init. Whatever it may be.


> But particularly after the last dracut update pulled in kmod as a
> mandatory dep as it now links against its libs, despite my monolithic
> kernel built without module support, I've been considering similar initr*
> alternatives, including hand-rolling my own initr* build scripts.
>
> Because I'm still not happy having to run an initr* at all, especially
> since there's more "magic" there than I'm particularly comfortable with
> since I like to grok the boot and thus potential recovery process better
> than I do this, and dracut was just the most convenient option at the
> time.

>> Interestingly my search for reliability started with "dup data" and ended
>> up here. :)
> =:^)


Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)

2017-07-30 Thread Imran Geriskovan
On 7/30/17, Duncan <1i5t5.dun...@cox.net> wrote:
>>> Also, all my btrfs are raid1 or dup for checksummed redundancy

>> Do you have any experience/advice/comment regarding
>> dup data on ssds?

> Very good question. =:^)

> Limited.  Most of my btrfs are raid1, with dup only used on the device-
> respective /boot btrfs (of which there are four, one on each of the two
> ssds that otherwise form the btrfs raid1 pairs, for each of the working
> and backup copy pairs -- I can use BIOS to select any of the four to
> boot), and those are all sub-GiB mixed-bg mode.

Is this a military or deep space device? ;)

> So all my dup experience is sub-GiB mixed-blockgroup mode.
>
> Within that limitation, my only btrfs problem has been that at my
> initially chosen size of 256 MiB, mkfs.btrfs at least used to create an
> initial data/metadata chunk of 64 MiB.  Remember, this is dup mode, so
> there's two of them = 128 MiB.  Because there's also a system chunk, that
> means the initial chunk cannot be balanced even with an entirely empty
> filesystem, because there's not enough space to write a second 64 MiB
> chunk duped to 128 MiB.

For /boot, I've also tried dup data.

But because of the combination of constraints you've mentioned,
I totally gave up trying to have a bulletproof /boot,
as my poor laptop is not as mission-critical as your device,
and as I always have bootable backups and always carry
some bootable sdcards.

Perhaps that has something to do with me kicking
out all systemd, inits, initramfs, mkinitcpio, dracut, etc, etc.

Now the init on /boot is a "19 lines" shell script, including lines
for keymap, hdparm, cryptsetup. And let's not forget this is
possible thanks to a custom kernel and its reliable buddy syslinux.

Interestingly, my search for reliability started with
"dup data" and ended up here. :)


Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)

2017-07-29 Thread Imran Geriskovan
On 7/9/17, Duncan <1i5t5.dun...@cox.net> wrote:
> I have however just upgraded to new ssds then wiped and setup the old
> ones as another backup set, so everything is on brand new filesystems on
> fast ssds, no possibility of old undetected corruption suddenly
> triggering problems.
>
> Also, all my btrfs are raid1 or dup for checksummed redundancy

Do you have any experience/advice/comment regarding
dup data on ssds?


Re: Btrfs/SSD

2017-05-15 Thread Imran Geriskovan
On 5/15/17, Tomasz Kusmierz  wrote:
> Theoretically all sectors in over provision are erased - practically they
> are either erased or waiting to be erased or broken.
> Over provisioned area does have more uses than that. For example if you have
> a 1TB drive where you store 500GB of data that you never modify -> SSD will
> copy part of that data to over provisioned area -> free sectors that were
> unwritten for a while -> free sectors that were continuously hammered by
> writes and write a static data there. This mechanism is wear levelling - it
> means that SSD internals make sure that sectors on SSD have an equal use
> over time. Despite of some thinking that it’s pointless imagine situation
> where you’ve got a 1TB drive with 1GB free and you keep writing and
> modifying data in this 1GB free … those sectors will quickly die due to
> short flash life expectancy ( some as short as 1k erases ! ).

Thanks for the info. As I understand it, the drive
has a pool of erase blocks, some portion of which (say 90-95%)
is provided as usable. Trimmed blocks are candidates
for new allocations. If the drive is not trimmed, that allocatable
pool becomes smaller than it could be, and new allocations
under the wear-levelling logic are drawn from a smaller group.
This will probably increase data traffic on that "small group"
of blocks, eating into their erase cycles.

However, this logic only holds if the drive does NOT move
data from occupied blocks onto trimmed/available ones.

Under some advanced wear-leveling operations, the drive may
decide to swap two blocks (one occupied, one vacant) if the
cumulative erase cycles of the former are much lower than
the latter's, to provide some balancing effect.

Theoretically, swapping may even occur when the flash tends
to lose charge (and thus data), based on the age of the
data and/or block health.

But in any case, I understand that trimming will provide an
important degree of freedom and health to the drive.
Without trimming, the drive will continue to deal with worthless
blocks simply because it doesn't know they are worthless...
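
On that note, a periodic trim from userspace is the easy way to keep the
drive informed (a sketch; the mount point is illustrative):

fstrim -v /mnt/data   # trim free space on a mounted fs, report bytes trimmed

Mounting with -o discard is the continuous alternative, at some runtime cost.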


Re: Btrfs/SSD

2017-05-14 Thread Imran Geriskovan
On 5/14/17, Tomasz Kusmierz  wrote:
> In terms of over provisioning of SSD it’s a give and take relationship … on
> good drive there is enough over provisioning to allow a normal operation on
> systems without TRIM … now if you would use a 1TB drive daily without TRIM
> and have only 30GB stored on it you will have fantastic performance but if
> you will want to store 500GB at roughly 200GB you will hit a brick wall and
> you writes will slow dow to megabytes / s … this is symptom of drive running
> out of over provisioning space …

What exactly happens on a non-trimmed drive?
Does it begin to forge certain erase blocks? If so,
which are those? What happens when you never
trim and keep dumping data on it?


Re: Btrfs/SSD

2017-05-12 Thread Imran Geriskovan
On 5/12/17, Kai Krakow  wrote:
> I don't think it is important for the file system to know where the SSD
> FTL located a data block. It's just important to keep everything nicely
> aligned with erase block sizes, reduce rewrite patterns, and free up
> complete erase blocks as good as possible.

Yeah. "Tight packing" of data into erase blocks will reduce fragmentation
at flash level, but not necessarily the fragmentation at fs level. And
unless we are writing in continuous journaling style (as f2fs ?),
we still need to have some info about the erase blocks.

Of course, while all this is going on, there is also something like round-robin
mapping or some kind of journaling happening at the low flash level
for wear leveling/bad-block replacement, which is totally invisible to us.


> Maybe such a process should be called "compaction" and not
> "defragmentation". In the end, the more continuous blocks of free space
> there are, the better the chance for proper wear leveling.


Tight packing into erase blocks seems to be the dominant factor for ssd welfare.

However, fs fragmentation may still be a thing to consider because
increased fs fragmentation will probably increase the # of erase
blocks involved, affecting both read/write performance and wear.

Keeping an eye on both is a tough job. Worse, there are "two" uncoordinated
eyes, one watching the "fs" and the other watching the "flash", making the
whole process suboptimal.

I think the ultimate utopian combination would be an "absolutely dumb flash
controller" providing direct access to physical bytes, plus the ultimate
"Flash FS" making use of every possible performance and wear-leveling trick.

Clearly, we are far from it.


Re: Btrfs/SSD

2017-05-12 Thread Imran Geriskovan
On 5/12/17, Duncan <1i5t5.dun...@cox.net> wrote:
> FWIW, I'm in the market for SSDs ATM, and remembered this from a couple
> weeks ago so went back to find it.  Thanks. =:^)
>
> (I'm currently still on quarter-TB generation ssds, plus spinning rust
> for the larger media partition and backups, and want to be rid of the
> spinning rust, so am looking at half-TB to TB, which seems to be the
> pricing sweet spot these days anyway.)

Since you are taking ssds mainstream based on your experience,
I guess your perception of their data retention/reliability is better than that
of spinning rust. Right? Can you elaborate?

Or another criterion might be the physical constraints of spinning rust
in notebooks, which dictate that you should handle the device
with care while it is running.

What was your primary motivation other than performance?


Re: Btrfs/SSD

2017-04-17 Thread Imran Geriskovan
On 4/17/17, Roman Mamedov  wrote:
> "Austin S. Hemmelgarn"  wrote:

>> * Compression should help performance and device lifetime most of the
>> time, unless your CPU is fully utilized on a regular basis (in which
>> case it will hurt performance, but still improve device lifetimes).

> Days are long gone since the end user had to ever think about device lifetimes
> with SSDs. Refer to endurance studies such as
> It has been demonstrated that all SSDs on the market tend to overshoot even
> their rated TBW by several times, as a result it will take any user literally
> dozens of years to wear out the flash no matter which filesystem or what
> settings used. And most certainly it's not worth it changing anything
> significant in your workflow (such as enabling compression if it's
> otherwise inconvenient or not needed) just to save the SSD lifetime.

Going over the thread following questions come to my mind:

- What exactly does the btrfs ssd option do relative to plain mode?

- Most (all?) SSDs employ wear leveling, don't they? That is, they are
constantly remapping their blocks under the hood. So isn't it
meaningless to speak of some kind of block forging/fragmentation/etc.
effect of any writing pattern?

- If so, doesn't it mean that there is no better ssd usage strategy
than minimizing the total bytes written? That is, whatever we do,
if it contributes to this it is good, otherwise bad. Are all other things
beyond user control? Is there a recommended setting?

- How about "data retension" experiences? It is known that
new ssds can hold data safely for longer period. As they age
that margin gets shorter. As an extreme case if I write into a new
ssd and shelve it, can i get back my data back after 5 years?
How about a file written 5 years ago and never touched again although
rest of the ssd is in active use during that period?

- Yes, maybe lifetimes are getting irrelevant. However, TBW still has
a direct relation to data retention capability.
Knowing that writing more data to an ssd can reduce the
"lifetime of your data" is something strange.

- But someone can come and say: Hey, don't worry about
"data retention years", because your ssd will already be dead
before data retention becomes a problem for you... Which is
relieving.. :)) Anyway, what are your opinions?


Intel XPoint Tech / Optane SSDs

2017-04-14 Thread Imran Geriskovan
Well, this may be the follow-up to the Btrfs/SSD discussion.

Probably nobody here has had their hands on these Optane SSDs (or has someone?)

Anyway, what are your expectations/projections about
memory/storage hybrid tech?

XPoint and/or other tech will eventually make
memory and storage converge.

With regard to this, how would the meaning of "FileSystem" be
affected in general? How would Btrfs be affected in particular?

Regards,
Imran


Btrfs/SSD

2017-04-14 Thread Imran Geriskovan
Hi,
Some time ago we had some discussion about SSDs.
Within the limits of unknown/undocumented device info,
we had loosely covered data retention capability / disk age / lifetime
interrelations, the (in?)effectiveness of btrfs dup on SSDs, etc..

Now, as time has passed and some experience with SSDs has
accumulated, I think we can again have a status check/update on them, if you
can share your experiences and best practices.

So if you have something to share about SSDs (it may or may not be
directly related with btrfs) I'm sure everybody here will be happy to
hear it.

Regards,
Imran


Oops.. Should be 4.9/4.10 Experiences

2017-02-16 Thread Imran Geriskovan
Oops.. I mean 4.9/4.10 Experiences

On 2/16/17, Imran Geriskovan <imran.gerisko...@gmail.com> wrote:
> What are your experiences for btrfs regarding 4.10 and 4.11 kernels?
> I'm still on 4.8.x. I'd be happy to hear from anyone using 4.1x for
> a very typical single disk setup. Are they reasonably stable/good
> enough for this case?


Re: 4.10/4.11 Experiences

2017-02-16 Thread Imran Geriskovan
What are your experiences for btrfs regarding 4.10 and 4.11 kernels?
I'm still on 4.8.x. I'd be happy to hear from anyone using 4.1x for
a very typical single disk setup. Are they reasonably stable/good
enough for this case?


4.10/4.11 Experiences

2017-02-16 Thread Imran Geriskovan



Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,

2017-01-19 Thread Imran Geriskovan
I don't know if it is btrfs-related, but I'm getting
hard freezes on 4.8.17.

So I went back to 4.8.14 (with an identical .config file).
It is one of my kernels known to be trouble-free
over a long period.

Since they are hard lockups for real, I can't provide
anything. Does anyone experience anything like that?



Re: kernel crash after upgrading to 4.9

2017-01-06 Thread Imran Geriskovan
>> I seem to have a similar issue to a subject in December:
>> Subject: page allocation stall in kernel 4.9 when copying files from one
>> btrfs hdd to another
>> In my case, this is caused when rsync'ing large amounts of data over NFS
>> to the server with the BTRFS file system.  This was not apparent in the
>> previous kernel (4.7).

As I browse through the latest series of btrfs corruption/crash reports,
I wonder which kernel version is reasonably the safest to use:
the 4.7, 4.8 or 4.9 series?

What are your experiences and recommendations?


Re: Small fs

2016-09-12 Thread Imran Geriskovan
> Wait wait wait a second:
> This is 256 MB SINGLE created
> by GPARTED, which is the replacement of the MANUALLY
> CREATED 127MB DUP which is now non-existent..
> Which I was not aware was a DUP at the time..
> Peeww... Small btrfs is full of surprises.. ;)

What's more, I also have another 128MB SINGLE
which I've been using for some years and did not
bother with its DUP/SINGLENESS. And I compared
them all to draw some conclusions. Heh..
That's the story.

The verdict: DUP/SINGLE is serious fun stuff
when used unknowingly. Small btrfs is such a case.
And third-party tools (e.g. gparted) play with it.

Let's warn users with some documentation,
together with a "formal small fs" behaviour..


Re: Small fs

2016-09-12 Thread Imran Geriskovan
> btrfs filesystem df /mnt/back/boot
> Data, single: total=8.00MiB, used=0.00B
> System, DUP: total=8.00MiB, used=16.00KiB
> Metadata, DUP: total=32.00MiB, used=112.00KiB
> GlobalReserve, single: total=16.00MiB, used=0.00B
> IT IS DUP!!

Wait wait wait a second:
This is 256 MB SINGLE created
by GPARTED, which is the replacement of the MANUALLY
CREATED 127MB DUP which is now non-existent..
Which I was not aware was a DUP at the time..
Peeww... Small btrfs is full of surprises.. ;)


Re: Small fs

2016-09-12 Thread Imran Geriskovan
>> Just to note again:
>> Ordinary 127MB btrfs gives "Out of space" around 64MB payload. 128MB is
>> usable to the end.

> Thanks, and just to clarify for others possibly following along or
> googling it up later, that's single mode (as opposed to dup mode) for at
> least data, if in normal separate data/metadata mode, and single for the
> combined mixed-mode chunks if in mixed-bg mode, correct?
>
> Because if the data is dup mode as well, as it would be by default in
> mixed-bg mode (unless on ssd), 128 MiB should allow storing only 64 MiB
> (and that's not accounting for the system chunk or global reserve
> metadata, so it'd be less than that) data.


That's /boot on my laptop. It's a fairly old fs,
created about 3-4 years ago. Maybe 5;
I'm not so sure..

btrfs filesystem df /boot
System, single: total=4.00MiB, used=4.00KiB
Data+Metadata, single: total=124.00MiB, used=103.88MiB
GlobalReserve, single: total=4.00MiB, used=0.00B
IT IS SINGLE!!

The 128/64MB thing happened when I created a backup
usb drive for mirroring it. Then I made it 256MB.

btrfs filesystem df /mnt/back/boot
Data, single: total=8.00MiB, used=0.00B
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=32.00MiB, used=112.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B
IT IS DUP!!

So Duncan you got it:
I'm comparing the SINGLE to DUP...
I'm not sure how I got that SINGLE though...


>> I'm experimenting with an extracted (and customized) initrd on /boot.
>> That is, /boot is a "minimal root on its own" which can switch to real
>> root or do other things. Kernel and modules at /boot will not support
>> any fs other than btrfs. (Or it may)
>> It seems a minimally usable root around 10MB is possible.
>> And it is free of udev and systemd..
>
> You don't specifically mention the distro, but given that gentoo's one of
> the only general-purpose distros that hasn't switched to systemd yet (tho
> it offers the choice for those who want it, and I've taken that choice
> here), there's a fair chance that's what you're running, as both I and
> Martin Steigerwald (based on earlier threads) are.

Now the off-topic part:

- I'm on Arch. Considering to switch to Void.

- Using syslinux as boot manager. It is rock solid and compact.
Just edit syslinux.cfg and reboot to any configuration
without any hassle.

- Also using runit, sidestepping systemd. It is also rock solid
and compact.

- Know your hardware and thus the kernel.
And get rid of any middleware (all udev derivatives) playing
guessing games on your machine.

- Two custom kernel configurations:
All built-in. (Was using this until now.)
All modules. (Now switching to this.)
Load all your modules at exactly known points
when booting. To do this, customize your
runit startup scripts (see the sketch after this list).
I'm switching to the "all modules" approach
after having determined and understood the key modules.
Don't be afraid. It's actually very doable.
You need at most 8-9 modprobes
to get them all. If you know your HW and
kernel it is no big deal.

- Configuring the kernel teaches a lot.

- Compiling a kernel with only the required modules
takes about 15 minutes if you get your .config
file right. Play with "make xconfig". A lot...

- Keep all your previous .config files as records.
When something goes wrong, diff them for
troubleshooting.

- Dracut is a dirty hack. mkinitcpio is great.

- Compressed initrd is bad. It hides what is going
on. Extract it. Play with it. Then make it the root fs
of your /boot. (I'm here now.)

- Starting X without udev is simple.
Disable hotplug.
5-6 lines of manual entries for keyboard, mouse and
touchpad are sufficient in your /etc/X11/xorg.conf.d.
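
As mentioned in the modules item above, a sketch of the modprobe step in
a runit stage-1 script (the module names are just what a typical laptop
might need; yours will differ):

#!/bin/sh
# part of the runit stage-1 script: load known modules at known points
modprobe i915            # GPU
modprobe iwlwifi         # wireless NIC
modprobe evdev           # input events for X
modprobe snd_hda_intel   # sound
modprobe btrfs           # root fs, if not built in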


Re: Small fs

2016-09-12 Thread Imran Geriskovan
On 9/11/16, Chris Murphy  wrote:
> Something else that's screwy in that bug that I just realized, why is
> it not defaulting to mixed-block groups on a 100MiB fallocated file? I
> thought mixed-bg was the default below a certain size like 2GiB or
> whatever?

>> With an ordinary partition on a single disk,
>> fs created with just "mkfs.btrfs /dev/sdxx":
>> - 128MB works fine.
>> - 127MB works but as if it is 64MB.
>> Can we say size should be in multiples of 64MB?

> Why should it be in multiples?  I think what you're describing is part
> of the bug above that just needs to be fixed. Btrfs itself internally
> uses bytes, so multiples of 64MiB is OK but I wouldn't  use the word
> "should" with it.

I'm not suggesting anything. I'm just describing the behaviour
we've seen. If it is (or will be) something different, it's all OK with me.

But what is the "formal behaviour" at the low end?
That is the discussion..


Re: Small fs

2016-09-11 Thread Imran Geriskovan
On 9/11/16, Duncan <1i5t5.dun...@cox.net> wrote:
> Martin Steigerwald posted on Sun, 11 Sep 2016 17:32:44 +0200 as excerpted:
>>> What is the smallest recommended fs size for btrfs?
>>> Can we say size should be in multiples of 64MB?
>> Do you want to know the smallest *recommended* or the smallest *possible*
>> size?

In fact both.
I'm reconsidering my options for /boot

> * Metadata, and thus mixed-bg, defaults to DUP mode on a single-device
> filesystem (except on ssd where I actually still use it myself, and
> recommend it except for ssds that do firmware dedupe).  In mixed-mode
> this means two copies of data as well, which halves the usable space.

> IOW, when using mixed-mode, which is recommended under a gig, and dup
> replication which is then the single-device default, effective usable
> space is **HALVED**, so 256 MiB btrfs size becomes 128 MiB usable. (!!)

> * There's also a system chunk to consider.  This too is normally dup mode
> on single device, raid1 on multi.  While it shrinks to some extent with
> size of filesystem, my 256 MiB /boot still has a 15.5 MiB system chunk,
> doubled due to dup mode to 31 MiB.  This is overhead you won't be able to
> use for anything else.
>
> * There's also global reserve.  This is a part of metadata (and thus
> mixed-mode) that cannot be used under normal circumstances either.
> However, unlike normal metadata, the accounting here is single -- it's X
> space reserved no matter the replication type.  On my 256 MiB /boot, it's
> 4 MiB.
>
> So of my 256 MiB btrfs mixed-mode /boot, 31+4=35 MiB is overhead, leaving
> 221 MiB for actual data and metadata.  But due to dup mode that's halved,
> to 110.5 MiB usable space.

That's quite some info. Thanks a lot.

Just to note again:
Ordinary 127MB btrfs gives "Out of space" around
64MB payload. 128MB is usable to the end.

I'm experimenting with an extracted (and customized) initrd
on /boot. That is, /boot is a "minimal root on its own" which
can switch to the real root or do other things. The kernel and modules
at /boot will not support any fs other than btrfs. (Or they may.)

It seems a minimally usable root of around 10MB is possible.
And it is free of udev and systemd.


Small fs

2016-09-11 Thread Imran Geriskovan
What is the smallest recommended fs size for btrfs?

- There are mentions of 256MB around the net.
- Gparted reserves a minimum of 256MB for btrfs.

With an ordinary partition on a single disk,
fs created with just "mkfs.btrfs /dev/sdxx":
- 128MB works fine.
- 127MB works but as if it is 64MB.

Can we say size should be in multiples of 64MB?


Re: btrfs and systemd

2016-08-29 Thread Imran Geriskovan
> Why not just create a Systemd unit (or whatever the proper term is) that
> runs on boot and runs the mount command manually and doesn't wait for it to
> return? Seems easier than messing with init systems.

Exactly: Never "mess" with inits..


Re: btrfs and systemd

2016-08-29 Thread Imran Geriskovan
>>> I can't find any fstab setting for systemd to higher this timeout.
>>> There's just  the x-systemd.device-timeout but this controls how long to
>>> wait for the device and not for the mount command.
>>> Is there any solution for big btrfs volumes and systemd?
>>> Stefan

Switch to Runit.

The first time I seriously considered another init on my
notebook was when I had a problem like yours.

Even when / (root) is mounted just fine, if there is
any problem with any other fstab entry, you'll
get into such a situation on systemd.

Give it a try by appending "init=/usr/bin/runit-init"
to your kernel command line in your bootloader.
You don't need to uninstall any packages while
getting Runit to behave "exactly" as you like.


Re: compression disk space saving - what are your results?

2015-12-03 Thread Imran Geriskovan
>> On a side note, I really wish BTRFS would just add LZ4 support.  It's a
>> lot more deterministic WRT decompression time than LZO, gets a similar
>> compression ratio, and runs faster on most processors for both
>> compression and decompression.

Relative ratios according to
http://catchchallenger.first-world.info//wiki/Quick_Benchmark:_Gzip_vs_Bzip2_vs_LZMA_vs_XZ_vs_LZ4_vs_LZO

Compressed size
gzip (1) - lzo (1.4) - lz4 (1.4)

Compression Time
gzip (5) - lzo (1) - lz4 (0.8)

Decompression Time
gzip (9) - lzo (4) - lz4 (1)

Compression Memory
gzip (1) - lzo (2) - lz4 (20)

Decompression Memory
gzip (1) - lzo (2) - lz4 (130). Yes 130! not a typo.

But there is a note:
"Note: for lz4 it's the program using this size; the
internal lz4 code uses much less memory."

However, I could not find any better apples-to-apples
comparison.

If lz4's real memory consumption is on the order of lzo's,
then it looks good.


Re: compression disk space saving - what are your results?

2015-12-02 Thread Imran Geriskovan
>> What are your disk space savings when using btrfs with compression?

> * There's the compress vs. compress-force option and discussion.  A
> number of posters have reported that for mostly text, compress didn't
> give them expected compression results and they needed to use compress-
> force.

"compress-force" option compresses regardless of the "compressibility"
of the file.

"compress" option makes some inference about the "compressibility"
and decides to compress or not.

I wonder how that inference is done?
Can anyone provide some pseudo code for it?
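
My rough guess, as shell-flavored pseudocode (a sketch of the idea only,
not the actual kernel code; the helper names are made up):

# for each extent written under "compress":
out=$(try_compress "$extent")                  # hypothetical helper
if [ "$(size_of "$out")" -lt "$(size_of "$extent")" ]; then
    write_compressed "$out"
else
    write_uncompressed "$extent"
    mark_inode_nocompress    # hypothetical: future writes to this file
fi                           # skip the compression attempt
# "compress-force" would be the same, minus mark_inode_nocompress

If someone can confirm or correct this, even better.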

Regards,
Imran


Re: "disk full" on a 5 GB btrfs filesystem, FAQ outdated?

2015-11-29 Thread Imran Geriskovan
On 11/30/15, Duncan <1i5t5.dun...@cox.net> wrote:
> Of course you can also try compress-force(=lzo the default
> compression so the =spec isn't required), which should give
> you slightly better performance than zlib, but also a bit
> less efficient compression in terms of size saved.

lzo perf relative to zlib (A very very rough comparison)

Compress/Decompress times: 1/3 - 1/2 (which is significant)
Output Size: 10-20% larger


Re: BTRFS: could not find root 8

2015-11-28 Thread Imran Geriskovan
>>> After upgrading from systemd227 to 228
>>> these messages began to show up during boot:
>>> [   24.652118] BTRFS: could not find root 8
>>> [   24.664742] BTRFS: could not find root 8

> b. For the OP, is it possible quotas was ever enabled on this file system?

Quotas have never been enabled since the creation of this fs.


Re: BTRFS: could not find root 8

2015-11-28 Thread Imran Geriskovan
It's on every boot.
With the systemd.log_level=debug boot parameter appended,
I could not find any meaningful operation just before the message.
The systemd journal boot dump will be in your personal mailbox shortly.


BTRFS: could not find root 8

2015-11-27 Thread Imran Geriskovan
After upgrading from systemd227 to 228
these messages began to show up during boot:

[   24.652118] BTRFS: could not find root 8
[   24.664742] BTRFS: could not find root 8

Are they important?

Regards,


Re: Questions on incremental backups

2014-07-18 Thread Imran Geriskovan
It's not about snapshots, but here is another incremental
backup recipe for optical media like DVDs and Blu-rays:

Base Backup:
1) Create encrypted loopback devices of DVD or Blu-ray sizes.
2) Create a compressed multi-device Btrfs spanning these
loopback devices. (To save space, you may use single
metadata if this is not your only backup.)
3) Rsync your data into this fs.
4) Unmount it and make it a SEED fs (btrfstune -S 1..)
5) Burn the loopback device files to DVDs or Blu-rays.

Incremental Part:
a) Before your next backup, create additional encrypted
   loopback devices as needed.
b) Mount your base backup. (It will mount as read-only.)
c) Add the devices created at (a) to your base backup fs.
d) Rsync into your fs. (Note that incremental data
will only go into the devices at (a).)
e) Unmount all.
f) Only burn the devices at (a) to DVDs or Blu-rays. These
   are your incremental disks.
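
In command form, the base backup might look like this (a sketch; file
names, sizes, paths and the single-device btrfstune call are illustrative):

truncate -s 4300M img0 img1                     # 1) DVD-sized image files
losetup /dev/loop0 img0; losetup /dev/loop1 img1
cryptsetup luksFormat /dev/loop0; cryptsetup luksOpen /dev/loop0 back0
cryptsetup luksFormat /dev/loop1; cryptsetup luksOpen /dev/loop1 back1
mkfs.btrfs -d single -m single /dev/mapper/back0 /dev/mapper/back1   # 2)
mount -o compress /dev/mapper/back0 /mnt/back
rsync -a /data/ /mnt/back/                      # 3)
umount /mnt/back
btrfstune -S 1 /dev/mapper/back0                # 4) flip the seed flag
# 5) burn img0 and img1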

Regards,
Imran


Re: btrfs on whole disk (no partitions)

2014-06-26 Thread Imran Geriskovan
On 6/25/14, Duncan 1i5t5.dun...@cox.net wrote:
> Imran Geriskovan posted on Wed, 25 Jun 2014 15:01:49 +0200 as excerpted:
>> Note that gdisk gives a default 8-sector alignment value for AF disks.
>> That is, the 'sector' meant by gdisk is the 'Logical Sector'!
>> A sufficiently determined user may create misaligned partitions by playing
>> with the alignment value and partition start/end values.
>
> AFAIK, for gdisk it's actually 2048 sector (1 MiB) alignment by default,
> on new devices or if you clear and redo the entire partition table.

Alignment does not mean starting your partition at sector 2048.
It means aligning partitions with physical block boundaries.

By default cgdisk chooses sector 2048 as the start of the first partition.
However, you can use 32-2047 as well (e.g. for a Bios_Boot partition).

An alignment value of 8 constrains your choices of partition start
locations to sector 32, 40, 48, and so forth (increments of 4K, which
corresponds to one physical block).
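
A quick way to sanity-check an existing partition against 4K physical
blocks (a sketch; assumes 512-byte logical sectors and the first
partition of sda):

start=$(cat /sys/block/sda/sda1/start)   # start offset in logical sectors
[ $((start % 8)) -eq 0 ] && echo aligned || echo misaligned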

Regards, Imran


Re: btrfs data dup on single device?

2014-06-26 Thread Imran Geriskovan
On 6/25/14, Chris Murphy li...@colorremedies.com wrote:
> On Jun 25, 2014, at 1:47 AM, Hugo Mills h...@carfax.org.uk wrote:
>> The question is, why? If you have enough disk media errors to make
>> it worth using multiple copies, then your storage device is basically
>> broken and needs replacing, and it can't really be relied on for very
>> much longer.
>
> Yeah basically -d dup tells me the user believes "I do not trust the media
> that much." Specifically they believe the media surface variability is what
> they are suspicious of, not the read/write head, actuator, or spindle, or
> motor. And a.) I don't know how they can possibly have reliable information
> to arrive at this kind of suspicion; b.) why bother with such crap
> hardware?
> Chris Murphy


Does Dup Metadata need to tell anything to anyone?

Regards, Imran


Re: btrfs on whole disk (no partitions)

2014-06-25 Thread Imran Geriskovan
On 6/23/14, Martin K. Petersen martin.peter...@oracle.com wrote:
> Anyway. The short answer is that Linux will pretty much always do I/O in
> multiples of the system page size regardless of the logical block size
> of the underlying device. There are a few exceptions to this such as
> direct I/O, legacy filesystems using bufferheads and raw block device
> access.

Thanks for the clarification.

And some random notes:

Note that gdisk gives a default 8-sector alignment value for AF disks.
That is, the 'sector' meant by gdisk is the 'Logical Sector'!
A sufficiently determined user may create misaligned
partitions by playing with the alignment value and partition start/end
values.

There are SSDs with 4K and 8K block/page sizes and
512K, 1M, 1.5M erase block sizes.

Partitions should be aligned with erase blocks.
And the filesystem block size (leafsize for btrfs; default 16K)
should be a multiple of the device block size.
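
For the filesystem side, the knob is the mkfs block size (a sketch; 16K
is the default mentioned above, and older btrfs-progs spell this
-l/--leafsize instead of -n):

mkfs.btrfs -n 16384 /dev/sdX   # metadata node/leaf size, a 4K/8K multiple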

Regards, Imran


Re: btrfs data dup on single device?

2014-06-25 Thread Imran Geriskovan
On 6/25/14, Hugo Mills h...@carfax.org.uk wrote:
>> Storage is pretty cheap now, and to have multiple copies in btrfs is
>> something that I think could be used a lot. I know I will use multiple
>> copies of my data if made possible.
>
> The question is, why? If you have enough disk media errors to make
> it worth using multiple copies, then your storage device is basically
> broken and needs replacing, and it can't really be relied on for very
> much longer.

Because the btrfs single data profile can detect bitrot but cannot recover
from it. Hardware RAID may be the solution, but you cannot use it on
a laptop or a backup usb drive. However, you can still have 2 partitions
and mount them as Raid1.
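
In command form, that workaround is simply (partition names illustrative):

mkfs.btrfs -m raid1 -d raid1 /dev/sdb1 /dev/sdb2
mount /dev/sdb1 /mnt/back   # either device name works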

Of course we all have backups. But the loss of certain files in a big file
set may go unnoticed if you do not scan through the whole
backup log each time.

You will definitely lose some files unless you keep 5-10 years of
incremental backups. And even if you keep them, they are susceptible
to bitrot too.

Thus, there is definitely a need for ensured/enhanced data integrity.

Note that the deduplication features of modern drives make duplication
useless unless you use an encrypted disk.

Regards, Imran


Re: btrfs on whole disk (no partitions)

2014-06-22 Thread Imran Geriskovan
> The 64KB Btrfs bootloader pad is 8 sector aligned, so for 512e AF disks
> there's no problem formatting the whole drive. The alignment problem
> actually happens when partitioning it, using old partition tools that don't
> align on 8 sector boundaries. There are some such tools still floating
> around.

A 'somewhat' related question:

So-called Advanced Format drives have a 4K physical sector size;
however, they report a 512B logical sector size.

How does the linux kernel access those drives?
512B or 4K at a time?

Imran


Re: btrfs on whole disk (no partitions)

2014-06-19 Thread Imran Geriskovan
On 6/19/14, Russell Coker russ...@coker.com.au wrote:
> On Wed, 18 Jun 2014 21:29:39 Daniel Cegiełka wrote:
>> Everything works fine. Is such a solution is recommended? In my
>> opinion, the creation of the partitions seems to be completely
>> unnecessary if you can use btrfs.
>
> If you don't need to have a boot loader or swap space on the disk
> then there's no reason to have a partition table.  Note that it's often good
> to have some swap space even if everything can fit in RAM because
> Linux sometimes pages things out to make more space for cache.

Grub installs itself on, and boots from, a partitionless Btrfs disk.
It is handy for straightforward installations.

However, if you need a boot partition (i.e. an initramfs and kernel to boot
from an encrypted root), it's another story.

Swap is an issue, but you may try zram (compressed RAM swap).
I've had some crashes on 3.14, so I'm waiting for it to stabilize.
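
For reference, a zram swap setup looks roughly like this (a sketch; the
size is illustrative and the kernel must have zram enabled):

modprobe zram                        # creates /dev/zram0
echo 2G > /sys/block/zram0/disksize
mkswap /dev/zram0
swapon /dev/zram0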

Imran


Re: btrfs on whole disk (no partitions)

2014-06-18 Thread Imran Geriskovan
On 6/18/14, Daniel Cegiełka daniel.cegie...@gmail.com wrote:
> I created btrfs directly to disk using such a scheme (no partitions):
> cd /mnt
> btrfs subvolume create __active
> btrfs subvolume create __active/rootvol
> Everything works fine. Is such a solution is recommended? In my
> opinion, the creation of the partitions seems to be completely
> unnecessary if you can use btrfs.

A partitionless and subvolume-less desktop setup (everything in the
default subvolume) has been operational here since kernel 3.10 (now 3.14).
No issues.

Regards, Imran


Re: Issue with btrfs balance

2014-02-10 Thread Imran Geriskovan
I've experienced the following with balance:

Setup:
- Kernel 3.12.9
- 11 DVD-sized (4.3GB) loopback devices
(9 read-only seed devices + 2 read/write devices).
- The 9-device seed was created with -m single -d single and made
read-only with btrfstune -S 1 ...
- 2 devices were added at different dates. NO balance was performed until now.
- NOW add 1 more device to the array and perform a balance.

Result:
Balance ran for a while and exited displaying "Process killed".
Any attempt to unmount the array failed, preventing any
shutdown. Hence I had no option other than a hard reset.

After reboot, issuing the balance command gave the message
"Balance in progress".

I cancelled the balance and tried to remove the last device,
which ended in a kernel crash. So I dumped the 2 + 1 normal devices.

The former 9-device multi-device seed was OK and mountable.

Regards,
Imran


Re: Rapid memory exhaustion during normal operation

2014-01-29 Thread Imran Geriskovan
> I'm trying to track this down - this started happening without changing
> the kernel in use, so probably
> a corrupted filesystem. The symptoms are that all memory is suddenly used
> by no apparent source.  OOM
> killer is invoked on every task, still can't free up enough memory to
> continue.

I don't know if it is related or not, but my experience (Kernel
3.12.8) is as follows:

If a process traverses a directory tree of millions of subdirectories and files,
memory consumption increases by gigabytes and the memory is NEVER freed.

The traversal involves no reads or writes, just getting the directory
contents. However, a second run over the same dirs does NOT increase
memory usage.

Interestingly, the process list gives no clues about what consumed that much
memory.

Regards,
Imran


Re: Options for SSD - autodefrag etc?

2014-01-25 Thread Imran Geriskovan
Every write to an SSD block reduces its data retention capability.

There are no concrete figures, but it is assumed to be:
- 10 years for new devices
- 1 year at rated usage (there are much lower figures around).

Hence, I would not trade retention time and wear for
autodefrag, which has no/minor benefits on SSD (and means
at least 2x write amplification on fragments).

On hard disks, we've experienced temporary freezes
(about 10secs to 3mins) during background autodefrag.

Regards,
Imran


Re: Feature Req: mkfs.btrfs -d dup option on single device

2013-12-17 Thread Imran Geriskovan
On 12/12/13, Chris Mason c...@fb.com wrote:
> For me anyway, data=dup in mixed mode is definitely an accident ;)
> I personally think data dup is a false sense of security, but drives
> have gotten so huge that it may actually make sense in a few
> configurations.

Sure, it's not about any security regarding the device.

It's about the capability of recovering from any
bit-rot which can creep into your backups and may only be
detected when you need the file, after 20-30 generations
of backups, which is too late. (Who keeps that much
incremental archive and regularly reads backup logs of millions of
files?)

> Someone asks for it roughly once a year, so it probably isn't a horrible
> idea.
> -chris

Today I brought up an old 2 GB Seagate from the basement.
Literally, it has rusted, so it deserves the title of
"Spinning Rust" for real. I had no hope it would work,
but out of curiosity I plugged it into a USB-IDE box.

It spun up and, wow, it showed up among the devices.
It had two swap partitions and an ext2 partition. I remembered that it was
one of the disks used for linux installations more than
10 years ago. I mounted it. Most of the files date back to 2001-07.

They are more than 12 years old and they seem to be intact,
with just one inode size mismatch. (See the fsck output below.)

If BTRFS (and -d dup :) ) had existed at the time, I would now
perform a scrub and report the outcome here. Hence,
'Digital Archeology' can surely benefit from Btrfs. :)

PS: Regarding the SSD data retention debate, this can be an
interesting benchmark for a device which was kept in an unfavorable
environment.

Regards,
Imran


FSCK output:

fsck from util-linux 2.20.1
e2fsck 1.42.8 (20-Jun-2013)
/dev/sdb3 has gone 4209 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Special (device/socket/fifo) inode 82669 has non-zero size.  Fix? yes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/sdb3: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdb3: 41930/226688 files (1.0% non-contiguous), 200558/453096 blocks


Re: Feature Req: mkfs.btrfs -d dup option on single device

2013-12-11 Thread Imran Geriskovan
> That's actually the reason btrfs defaults to SINGLE metadata mode on
> single-device SSD-backed filesystems, as well.
>
> But as Imran points out, SSDs aren't all there is.  There's still
> spinning rust around.
>
> And defaults aside, even on SSDs it should be /possible/ to specify data-
> dup mode, because there's enough different SSD variants and enough
> different use-cases, that it's surely going to be useful some-of-the-time
> to someone. =:^)

We didn't start with SSDs, but the thread is heading there. Well, OK then.

Hard drives with more complex firmware, hybrids, and so on
are becoming available. Eventually they will share common problems with
SSDs.

To make a long story short, let's say that eventually we will all have
block-addressed devices without any sensible physically-bound addresses.

Without physically bound addresses, any duplicate written to the device MAY
end up in the same unreliable portion of the device. Note: it MAY. However,
the devices are so large that this probability is very low. The paranoid who
want to make it lower may simply increase the number of duplicates.

On the other hand, people who work with multiple physical devices
may want to decrease the number of duplicates (probably to a single copy).

Hence, there is definitely a use case for tunable duplicates, for both
data and metadata.

Now, there is one open issue:
In its current form -d dup requires -M. Is that a constraint of the design,
or an arbitrary/temporary one? What will the situation be if there
are tunable duplicates?

And more:
Is -M good for everyday usage on a large fs, for efficient packing?
What's the penalty? Is it curable? If so, why not make it the default?

Imran


Re: Feature Req: mkfs.btrfs -d dup option on single device

2013-12-11 Thread Imran Geriskovan
>> What's more (in relation to our long term data integrity aim)
>> order of magnitude for their unpowered data retention period is
>> 1 YEAR. (Read it as 6 months to 2-3 years.)
>
> Does btrfs need to date-stamp each block/chunk to ensure that data is
> rewritten before suffering flash memory bitrot?
> Is not the firmware in SSDs aware to rewrite any too-long unchanged data?

No. It is supposed to be handled by the firmware. That's why the devices
should be powered. It is not visible to the file system.
You can do a google search for the terms "ssd data retention".

There is no concrete info about it. But figures range from:
- 10 years retention for new devices to
- 3-6 months for devices at their 'rated' usage.

There seems to be a consensus around 1 year. And it seems
SSD vendors are close to the datacenters.

It's today's tech. In time we'll see whether it gets better or worse.

In the long run, we may have no choice but to put all our data
in the hands of the beloved cloud lords. Hence the NSA. :)

Note that Sony has shut down its optical disc unit.

Regards,
Imran


Feature Req: mkfs.btrfs -d dup option on single device

2013-12-10 Thread Imran Geriskovan
Currently, if you want to protect your data against bit-rot on
a single device, you must have 2 btrfs partitions and mount
them as RAID1. The requested option would save the user from
partitioning and would provide flexibility.
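
(For example, the current workaround on a single disk looks roughly like
this, with placeholder partition names; two partitions of similar size on
the same drive, combined into one RAID1 filesystem:

    mkfs.btrfs -m raid1 -d raid1 /dev/sda2 /dev/sda3
    mount /dev/sda2 /mnt/data

Every block then gets a second copy on the other partition, at the cost
of manual partitioning.)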

Yes, I know: this will not provide any safety against hardware
failure. But that is not the purpose anyway.

The main purpose is to ensure data integrity on:
a- Computers (e.g. laptops) where hardware RAID is not practical.
b- Backup sets (e.g. USB drives) where hardware RAID is overkill.

Even if you have regular backups, without having
guaranteed data integrity on all data sets, you will
lose some data some day, somewhere.

See discussion at:
http://hardware.slashdot.org/story/13/12/10/178234/ask-slashdot-practical-bitrot-detection-for-backups


Now, the futuristic and OPTIONAL part for the sufficiently paranoid:
The number of duplicates may be parametric:

mkfs.btrfs -m dup 4 -d dup 3 ... (4 duplicates for metadata, 3
duplicates for data)

I kindly request your comments. (At least for -d dup)

Regards,
Imran Geriskovan


Re: Feature Req: mkfs.btrfs -d dup option on single device

2013-12-10 Thread Imran Geriskovan
>> Currently, if you want to protect your data against bit-rot on
>> a single device, you must have 2 btrfs partitions and mount
>> them as RAID1.
>
> No, this also works:
> mkfs.btrfs -d dup -m dup -M device

Thanks a lot.

I guess docs need an update:

https://btrfs.wiki.kernel.org/index.php/Mkfs.btrfs:
-d: Data profile, values like metadata. EXCEPT DUP CANNOT BE USED

man mkfs.btrfs (btrfs-tools 0.19+20130705)
-d, --data type
  Specify  how  the data must be spanned across
  the devices specified. Valid values are raid0, raid1,
  raid5,  raid6,  raid10  or single.

Imran


Fwd: Feature Req: mkfs.btrfs -d dup option on single device

2013-12-10 Thread Imran Geriskovan
-- Forwarded message --
From: Imran Geriskovan imran.gerisko...@gmail.com
Date: Wed, 11 Dec 2013 02:14:25 +0200
Subject: Re: Feature Req: mkfs.btrfs -d dup option on single device
To: Chris Murphy li...@colorremedies.com

> Current btrfs-progs is v3.12. 0.19 is a bit old. But yes, looks like the
> wiki also needs updating.
>
> Anyway I just tried it on an 8GB stick and it works, but -M (mixed
> data+metadata) is required, which documentation also says incurs a
> performance hit, although I'm uncertain of the significance.

btrfs-tools 0.19+20130705 is the most recent one on Debian's
leading edge Sid/Unstable.

Given the state of the docs, probably very few people, if any, have ever
used '-d dup'. As the lead developer, could you provide some insight into
the reliability of this option?

Can the '-M' requirement be an indication of code which has not been
ironed out, or is it simply a constraint of the internal machinery?

How well do the main idea of guaranteed data integrity for extra
reliability and the option -d dup, in its current state, match?

Regards,
Imran


Re: Feature Req: mkfs.btrfs -d dup option on single device

2013-12-10 Thread Imran Geriskovan
> I'm not a developer, I'm just an ape who wears pants. Chris Mason is the
> lead developer. All I can say about it is that it's been working for me OK
> so far.

Great :) Now I understand that you were using -d dup, which is quite
valuable for me. And since GMail only shows first names in the inbox list,
I thought you were Chris Mason. Sorry. Now I see your full name
in the header.


>> Can the '-M' requirement be an indication of code which has not been
>> ironed out, or is it simply a constraint of the internal machinery?
>
> I think it's just how chunks are allocated; it becomes space inefficient to
> have two separate metadata and data chunks, hence the requirement to mix
> them if -d dup is used. But I'm not really sure.


Sounds like it is implemented parallel/similar to -m dup. That's why -M
is implied. Of course, we are speculating here...

Now the question is: is it good practice to use -M for large filesystems?
Pros, cons? What is the performance impact? Or any other possible impact?


 Well given that Btrfs is still flagged as experimental, most notably when
 creating any Btrfs file system, I'd say that doesn't apply here. If the case
 you're trying to mitigate is some kind of corruption that can only be
 repaired if you have at least one other copy of data, then -d dup is useful.
 But obviously this ignores the statistically greater chance of a more
 significant hardware failure, as this is still single device.


From the beginning we've put the possibility of full hardware failure aside.
The user is expected to handle that risk elsewhere.

Our scope is localized failures, which may cost you
some files. Since btrfs has checksums, you can become aware of them.
Using -d dup, we increase our chances of recovering from them.

But the probability of corruption of all duplicates is non-zero.
Hence, checking the output of "btrfs scrub start <path>" is beneficial
before making/updating any backups. And then check the output of the
scrub on the backup too...
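
(Something like this, assuming the filesystem is mounted at /mnt; scrub
runs in the background by default, so the error counters are read with
the status subcommand once it finishes:

    btrfs scrub start /mnt
    btrfs scrub status /mnt

and then the same pair of commands on the backup volume.)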


> Not only could
> the entire single device fail, but it's possible that erase blocks
> individually fail. And since the FTL decides where pages are stored, the
> duplicate data/metadata copies could be stored in the same erase block. So
> there is a failure vector other than full failure where some data can still
> be lost on a single device even with duplicate, or triplicate copies.


I guess you are talking about SSDs. Even if you write duplicates
to distinct erase blocks, they may end up in the same block after
the firmware's relocation, defragmentation, migration, remapping,
god knows what ...ation operations. So, practically, a block
address does not point to any fixed physical location on SSDs.

What's more (in relation to our long-term data integrity aim),
the order of magnitude for their unpowered data retention period is
1 YEAR. (Read it as 6 months to 2-3 years. While powered, they
refresh/shuffle the blocks.) This makes SSDs
unsuitable for mid-to-long term consumer storage. Hence they are
out of this discussion. (By the way, the only way to get reliable
duplication on SSDs is to use physically separate devices.)

Luckily we have hard drives with still-sensible block addressing,
even with bad-block relocation. So duplication, or triplication, still
makes sense... Or DOES IT? Comments?

E.g. the new Advanced Format drives may employ 4K physical sectors
but present 512B logical blocks, which may be another reincarnation
of the SSD problem above. However, I guess the Linux kernel does not
access such drives using logical addressing...
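
(As a quick check, the sector sizes a drive reports can be read from
sysfs; assuming a drive at /dev/sda:

    cat /sys/block/sda/queue/logical_block_size
    cat /sys/block/sda/queue/physical_block_size

A 512e Advanced Format drive would report 512 and 4096, respectively.)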

Imran