Re: Monitoring Btrfs

2016-10-17 Thread Kyle Manna
On Mon, Oct 17, 2016 at 9:44 AM, Stefan Malte Schumacher wrote:
> Hello
>
> I would like to monitor my btrfs-filesystem for missing drives. On
> Debian mdadm uses a script in /etc/cron.daily, which calls mdadm and
> sends an email if anything is wrong with the array. I would like to do
> the same with btrfs. In my first attempt I grepped and cut the
> information from "btrfs fi show" and let the script send an email if
> the number of devices was not equal to the preselected number.
>
> ...
>
> 1) Has anybody already written a script like this? After all, there is
> no need to reinvent the wheel a second time.

I don't have a solution to your primary question regarding message
parsing, but I do something different, which may offer another
perspective on your monitoring and reporting.
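For the device-count check described in the quoted question, here is a minimal Python sketch. It assumes `btrfs filesystem show` prints one `devid ... path ...` row per present device and a `*** Some devices missing` line when one is absent; the exact output may differ between btrfs-progs versions. The mount point and expected count in `main` are hypothetical defaults.

```python
import re
import subprocess

# Matches device rows such as:
#   "devid    1 size 1.82TiB used 1.21TiB path /dev/sda"
DEVID_LINE = re.compile(
    r"^\s*devid\s+(\d+)\s+size\s+\S+\s+used\s+\S+\s+path\s+(\S+)"
)


def check_devices(show_output, expected_count):
    """Return a list of problems found in `btrfs filesystem show` output."""
    problems = []
    matches = map(DEVID_LINE.match, show_output.splitlines())
    devices = [m.groups() for m in matches if m]
    if len(devices) != expected_count:
        problems.append(
            f"expected {expected_count} devices, found {len(devices)}"
        )
    if "devices missing" in show_output:
        problems.append("btrfs reports missing devices")
    return problems


def main(mountpoint="/mnt", expected_count=3):
    # Hypothetical defaults; adjust to the filesystem being monitored.
    result = subprocess.run(
        ["btrfs", "filesystem", "show", mountpoint],
        capture_output=True, text=True, check=True,
    )
    for problem in check_devices(result.stdout, expected_count):
        print(problem)
```

Run from cron, anything `main()` prints ends up in the usual cron mail, much like the mdadm script mentioned in the question.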

I employ systemd timers to scrub my btrfs volumes[0][1] every
week.  I used to use either an OnFailure[2] trigger or my failure
monitor (a systemd-journal log parser)[3] to send me emails if the
service failed to run.  For people who use systemd, love it or hate
it, this is a more "modern" alternative to cron.weekly + a custom
shell script.

Recently I dropped the systemd journal parser in favor of remote
logging with rsyslog + Papertrail[4], with a few alerts for things like
"systemd Failed to start", which indicates that the script returned a
non-zero exit code.  Papertrail then emails me when any of a handful of
machines trips up.
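The same kind of alert can be approximated locally. The sketch below is only an illustration, not the Papertrail rule itself: it scans log lines for systemd's "Failed to start ..." message and yields the unit description. The function name and the sample unit description in the usage note are made up for this example.

```python
import re

# systemd logs "Failed to start <unit description>." when a unit fails.
FAILED_START = re.compile(r"Failed to start (?P<unit>.+?)\.?$")


def failed_units(log_lines):
    """Yield the description of every unit reported as failed to start."""
    for line in log_lines:
        match = FAILED_START.search(line.rstrip())
        if match:
            yield match.group("unit")
```

Fed a line such as `host systemd[1]: Failed to start Btrfs scrub on /mnt.`, it yields `Btrfs scrub on /mnt`; the input could be `sys.stdin` with journal output piped in.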

It's also worth noting that logstash[5] (or similar) may be another way
to parse log files.  It could be bloated overkill for something that a
10-line shell script could accomplish, depending on whether you use it
for things beyond basic log parsing.

[0] https://github.com/kylemanna/systemd-utils/blob/master/units/btrfs-scrub.service
[1] https://github.com/kylemanna/systemd-utils/blob/master/units/btrfs-scrub.timer
[2] https://github.com/kylemanna/systemd-utils/tree/master/onfailure
[3] https://github.com/kylemanna/systemd-utils/tree/master/failure-monitor
[4] https://blog.kylemanna.com/linux/logging-all-the-things-with-rsyslog-and-papertrail/
[5] https://www.elastic.co/guide/en/logstash/current/introduction.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: How to remove missing device on RAID1?

2015-10-21 Thread Kyle Manna
Hi Henk,

This trick/hack worked great for me.  After the rebalance was
complete, the sparse file + loop device, `btrfs replace` and `btrfs
device delete` steps all worked as expected.  Thanks.
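For anyone following along, the workaround quoted below can be laid out as a dry-run plan. This is a sketch only: it returns the commands without running them, it assumes the three balance filters are meant to cover system, metadata, and data chunks, and it assumes the filesystem is already mounted with `-o degraded`. The function name, paths, and sizes are placeholders; `/dev/loopX` stands for whatever `losetup` actually prints.

```python
def missing_device_workaround(mountpoint, devid, loop_file, size_bytes,
                              loop_dev="/dev/loopX"):
    """Return the command sequence as argv lists; nothing is executed."""
    # One devid filter each for system, metadata, and data chunks.
    filters = [f"-{kind}devid={devid}" for kind in ("s", "m", "d")]
    return [
        # Move every chunk off the missing device (-f is required with -s).
        ["btrfs", "balance", "start", "-f", "-v", *filters, mountpoint],
        # Sparse backing file, same size as the missing disk.
        ["truncate", "-s", str(size_bytes), loop_file],
        # Attach it to a free loop device; losetup prints the real name.
        ["losetup", "--find", "--show", loop_file],
        # Replace the missing devid with the loop device...
        ["btrfs", "replace", "start", str(devid), loop_dev, mountpoint],
        # ...then drop the loop device from the filesystem again.
        ["btrfs", "device", "delete", loop_dev, mountpoint],
        ["losetup", "--detach", loop_dev],
    ]
```

Printing each entry with `" ".join(cmd)` gives a checklist to review before typing anything as root.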

In other news, I did hit a btrfs bug 3 times while attempting to
balance.  I've added my comments @
https://bugzilla.kernel.org/show_bug.cgi?id=105681#c14

On Tue, Oct 20, 2015 at 3:46 PM, Henk Slager <hsla...@hotmail.com> wrote:
> copy-paste error corrected
> On Wed, Oct 21, 2015 at 12:40 AM, Henk Slager <hsla...@hotmail.com> wrote:
>> I had a similar issue some time ago, around the time kernel 4.1.6 was
>> just there.
>> In case you don't want to wait for new disk or decide to just run the
>> filesystem with 1 disk less or maybe later on replace 1 of the still
>> healthy disks with a double/bigger sized one and use current/older
>> kernel+tools, you could do this (assuming the filesystem is not too
>> full of course):
>> - mount degraded
> - btrfs balance start -f -v -sdevid=1 -mdevid=1 -ddevid=1 <mountpoint>
>>   (where missing disk has devid 1)
>> After completion the (virtual/missing) device shall be fully unallocated
>> - create /dev/loopX with sparse file of same size as missing disk on
>> some other filesystem
>> - btrfs replace start 1 /dev/loopX <mountpoint>
>> - remove /dev/loopX from the filesystem
>> - remount filesystem without degraded
>> And remove /dev/loopX
>>
>>
>> On Tue, Oct 20, 2015 at 11:48 PM, Kyle Manna <2blu...@gmail.com> wrote:
>>> Thanks for the follow-up Duncan, that makes sense.  I assumed I was
>>> doing something wrong.
>>>
>>> I downloaded the devel branch of btrfs-progs and got it running
>>> before I saw the need for a kernel patch and decided to wait.
>>>
>>> For anyone following this later, I needed to use the following to get
>>> the missing device ID:
>>>
>>> btrfs device usage <mountpoint>
>>>
>>> On Tue, Oct 20, 2015 at 1:58 PM, Duncan <1i5t5.dun...@cox.net> wrote:
>>>> Kyle Manna posted on Tue, 20 Oct 2015 10:24:48 -0700 as excerpted:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a collection of three (was 4) 1-2TB devices with data and
>>>>> metadata in a RAID1 mirror.  Last night I was struck by the Click of
>>>>> Death on an old Samsung drive.
>>>>>
>>>>> I removed the device from the system, rebooted and mounted the volume
>>>>> with `-o degraded` and the file system seems fine and usable.  I'm
>>>>> waiting on a replacement drive, but want to remove the old drive and
>>>>> re-balance in the meantime.
>>>>>
>>>>> How do I remove the missing device?  I tried the `btrfs device delete
>>>>> missing /mnt` but was greeted with "ERROR: missing is not a block
>>>>> device".  A quick look at that btrfs-progs git repo shows that
>>>>> `stat("missing")` is called, which of course fails since missing isn't a
>>>>> block device.  Nothing other than `btrfs replace` seemed intuitive and
>>>>> all the docs mention the older command.  What's the move?
>>>>>
>>>>> Thanks!
>>>>> - Kyle
>>>>>
>>>>> Versions:
>>>>> Kernel: 4.2.3-1-ARCH
>>>>> btrfs-progs: 4.2.2-1
>>>>
>>>> I believe the current advice given here (that you were likely trying to
>>>> follow, wrapped link)...
>>>>
>>>> https://btrfs.wiki.kernel.org/index.php/
>>>> Using_Btrfs_with_Multiple_Devices#Replacing_failed_devices
>>>>
>>>> ... is dated and no longer works due to code change some time in the past.
>>>>
>>>> There's a set of (very) recent patches, to the kernel and userspace both.
>>>> (I just updated userspace and it's in the git devel branch,
>>>> v4.2.3-49-g4db87a1, that I just built.  On the kernel side, I don't see
>>>> it in linus-mainline yet, so I'd guess it's in the btrfs-integration
>>>> patches, to land in the v4.4 commit window if not in 4.3, as it's
>>>> getting late in the cycle for that.)
>>>>
>>>> btrfs fi show 
>>>>
>>>> That will list the btrfs component devices together with their devids.
>>>>
>>>> Then use the appropriate devid like so:
>>>>
>>>> btrfs dev del <devid> <mountpoint>
>>>>
>>>> The -progs commit is d462081f, by Anand Jain, titled:
>>>>
>>>> btrfs-progs: Introduce device delete by devid
>>>>
>>

Re: btrfs-balance causes system-freeze on full disk

2015-10-21 Thread Kyle Manna
I had a number of similar btrfs balance crashes in the past few days,
but the disk wasn't full.  You should try tailing the system logs from
a remote machine when it happens. You'll likely see some bug info
before the system dies and becomes unusable.

The issue I encountered is described @
https://bugzilla.kernel.org/show_bug.cgi?id=105681

On Wed, Oct 21, 2015 at 12:38 PM, Jakob Schürz wrote:
> Hi there!
>
> Is what I've noticed possible? My system (Debian) runs on
> btrfs, and I have a lot of snapshots on my hard disk.
> For some days now my system has been freezing totally. I noticed that it
> always happens during btrfs-balance.
>
> So i deleted some of the old snapshots and tried another balance-run.
> Nothing happened... No system-freeze.
>
> System-freeze means: No Keyboard-action. The Mouse is frozen, the screen
> is frozen, no magic-sysreq, no ssh-login.
>
> Can btrfs cause such a freeze??
>
> greez
>
> jakob
> --
> http://xundeenergie.at
> http://verkehrsloesungen.wordpress.com/
> http://cogitationum.wordpress.com/
>


How to remove missing device on RAID1?

2015-10-20 Thread Kyle Manna
Hi all,

I have a collection of three (was 4) 1-2TB devices with data and
metadata in a RAID1 mirror.  Last night I was struck by the Click of
Death on an old Samsung drive.

I removed the device from the system, rebooted and mounted the volume
with `-o degraded` and the file system seems fine and usable.  I'm
waiting on a replacement drive, but want to remove the old drive and
re-balance in the meantime.

How do I remove the missing device?  I tried the `btrfs device delete
missing /mnt` but was greeted with "ERROR: missing is not a block
device".  A quick look at that btrfs-progs git repo shows that
`stat("missing")` is called, which of course fails since missing isn't
a block device.  Nothing other than `btrfs replace` seemed intuitive
and all the docs mention the older command.  What's the move?

Thanks!
- Kyle

Versions:
Kernel: 4.2.3-1-ARCH
btrfs-progs: 4.2.2-1


Re: How to remove missing device on RAID1?

2015-10-20 Thread Kyle Manna
Thanks for the follow-up Duncan, that makes sense.  I assumed I was
doing something wrong.

I downloaded the devel branch of btrfs-progs and got it running
before I saw the need for a kernel patch and decided to wait.

For anyone following this later, I needed to use the following to get
the missing device ID:

btrfs device usage <mountpoint>
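To pull the missing devid out of that output automatically, something like the following Python sketch might do. It assumes each device section starts with a header of the form `<path>, ID: <n>` and that an absent device is shown with the path `missing`; check this against your btrfs-progs version before relying on it. The function name is made up for the example.

```python
import re

# Section header printed for each device, e.g. "/dev/sdb, ID: 2".
HEADER = re.compile(r"^(?P<path>\S+), ID: (?P<devid>\d+)$")


def missing_devids(usage_output):
    """Return the devids whose device path is reported as 'missing'."""
    ids = []
    for line in usage_output.splitlines():
        match = HEADER.match(line.strip())
        if match and match.group("path") == "missing":
            ids.append(int(match.group("devid")))
    return ids
```

The resulting number is what a devid-based delete or replace command would take as its device argument.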

On Tue, Oct 20, 2015 at 1:58 PM, Duncan <1i5t5.dun...@cox.net> wrote:
> Kyle Manna posted on Tue, 20 Oct 2015 10:24:48 -0700 as excerpted:
>
>> Hi all,
>>
>> I have a collection of three (was 4) 1-2TB devices with data and
>> metadata in a RAID1 mirror.  Last night I was struck by the Click of
>> Death on an old Samsung drive.
>>
>> I removed the device from the system, rebooted and mounted the volume
>> with `-o degraded` and the file system seems fine and usable.  I'm
>> waiting on a replacement drive, but want to remove the old drive and
>> re-balance in the meantime.
>>
>> How do I remove the missing device?  I tried the `btrfs device delete
>> missing /mnt` but was greeted with "ERROR: missing is not a block
>> device".  A quick look at that btrfs-progs git repo shows that
>> `stat("missing")` is called, which of course fails since missing isn't a
>> block device.  Nothing other than `btrfs replace` seemed intuitive and
>> all the docs mention the older command.  What's the move?
>>
>> Thanks!
>> - Kyle
>>
>> Versions:
>> Kernel: 4.2.3-1-ARCH
>> btrfs-progs: 4.2.2-1
>
> I believe the current advice given here (that you were likely trying to
> follow, wrapped link)...
>
> https://btrfs.wiki.kernel.org/index.php/
> Using_Btrfs_with_Multiple_Devices#Replacing_failed_devices
>
> ... is dated and no longer works due to code change some time in the past.
>
> There's a set of (very) recent patches, to the kernel and userspace both.
> (I just updated userspace and it's in the git devel branch,
> v4.2.3-49-g4db87a1, that I just built.  On the kernel side, I don't see
> it in linus-mainline yet, so I'd guess it's in the btrfs-integration
> patches, to land in the v4.4 commit window if not in 4.3, as it's
> getting late in the cycle for that.)
>
> btrfs fi show 
>
> That will list the btrfs component devices together with their devids.
>
> Then use the appropriate devid like so:
>
> btrfs dev del <devid> <mountpoint>
>
> The -progs commit is d462081f, by Anand Jain, titled:
>
> btrfs-progs: Introduce device delete by devid
>
> According to it, the required kernel commit (title only listed) is
> similar:
>
> Btrfs: Introduce device delete by devid
>
> You can probably find them on-list if you wish to cherry-pick them into a
> current version.
>
> --
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman
>


Re: RAID1, SSD+non-SSD

2015-02-07 Thread Kyle Manna
On Fri Feb 06 2015 at 12:06:33 PM Brian B canis8...@gmail.com wrote:

 My laptop has two disks, a SSD and a traditional magnetic disk. I plan
 to make a partition on the mag disk equal in size to the SSD and set up
 BTRFS RAID1. This I know how to do.

 The only reason I'm doing the RAID1 is for the self-healing. I realize
 writing large amounts of data will be slower than the SSD alone, but
 is it possible to set it up to only read from the magnetic drive if
 there's an error reading from the SSD?

 In other words, is there a way to tell it to only read from the faster
 disk?  Is that even necessary?  Is there a better way to accomplish
 this?


What you may want to look at is lvmcache + btrfs.  I've played with
lvmcache (using ext4 on top) and btrfs independently, but not
together.  Too many new technologies at the same time for my taste. :)

The best documentation I've found on lvm cache is the man page:
http://man7.org/linux/man-pages/man7/lvmcache.7.html

LVM cache uses dm-cache behind the scenes and makes it much more
manageable (i.e. construction, manipulation, and teardown of devices).
An lvm cache won't help with redundancy; the blocks will exist either
on the caching device or on the slower device.  To remove the cache,
you can force a flush of the blocks out of the cache to the
traditional HDD and then use it without the cache, without having to
recreate the file system.
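As a rough illustration of the attach and teardown steps, here is a Python sketch that only builds the LVM command lines as described in lvmcache(7); it runs nothing. The volume group, LV names, device, and pool size are hypothetical, and I haven't tried this layout with btrfs on top.

```python
def lvmcache_commands(vg, origin_lv, ssd_dev, pool_size, pool_lv="cachepool"):
    """Return attach/detach command lists (argv form); nothing is executed."""
    origin = f"{vg}/{origin_lv}"
    pool = f"{vg}/{pool_lv}"
    return {
        "attach": [
            # Add the SSD to the volume group that holds the slow LV.
            ["vgextend", vg, ssd_dev],
            # Create a cache pool placed on the SSD only.
            ["lvcreate", "--type", "cache-pool", "-L", pool_size,
             "-n", pool_lv, vg, ssd_dev],
            # Put the cache pool in front of the existing LV.
            ["lvconvert", "--type", "cache", "--cachepool", pool, origin],
        ],
        "detach": [
            # Flush dirty blocks back to the origin and drop the cache.
            ["lvconvert", "--uncache", origin],
        ],
    }
```

The detach step is the part that matters for the point above: the filesystem on the origin LV stays in place once the cache is gone.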


Re: Questions on using BtrFS for fileserver

2014-08-19 Thread Kyle Manna
 · Besides using bcache, are there any possibilities to boost
   performance by adding (dedicated) cache-SSDs to a BtrFS?

dm-cache is in the mainline kernel and lvm2 recently added support to
make devicemapper configuration automatic.  In my opinion, dm-cache is
a little easier to use because you can add/remove/resize the cache
without recreating the filesystem.  If you're interested, take a peek
at the man page for lvmcache.

- Kyle