Re: Announcing btrfs-dedupe
On Wed, 09 Nov 2016 12:24:51 +0100, Niccolò Belli <darkba...@linuxsystems.it> wrote:

> On Tuesday, 8 November 2016 23:36:25 CET, Saint Germain wrote:
> > Please be aware of these other similar software:
> > - jdupes: https://github.com/jbruchon/jdupes
> > - rmlint: https://github.com/sahib/rmlint
> > And of course fdupes.
> >
> > Some interesting points I have seen in them:
> > - use of xxhash to identify potential duplicates (huge speedup)
> > - ability to deduplicate read-only snapshots
> > - identification of potentially reflinked files (see also my email here:
> >   https://www.spinics.net/lists/linux-btrfs/msg60081.html)
> > - ability to filter out hardlinks
> > - the triangle problem: see the jdupes readme
> > - jdupes has started the process to be included in Debian
> >
> > I hope that will help and that you can share some code with them!
>
> Hi,
> What do you think about jdupes? I'm searching for an alternative to
> duperemove, and rmlint doesn't seem to support btrfs deduplication, so
> I would like to try jdupes. My main problem with duperemove is a
> memory leak; it also seems to lead to greater disk usage:
> https://github.com/markfasheh/duperemove/issues/163

rmlint does support btrfs deduplication:

rmlint --algorithm=xxhash --types="duplicates" --hidden --config=sh:handler=clone --no-hardlinked

I've used jdupes and rmlint to deduplicate 2 TB with 4 GB of RAM and it took
a few hours, so it is acceptable from a performance point of view. The
problems I found have been corrected by both. The jdupes author is really
kind and responsive!
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
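For context, the `sh:handler=clone` handler mentioned above ultimately goes through the kernel's FIDEDUPERANGE ioctl (the same interface the jdupes and duperemove dedupe paths use). Below is a rough Python sketch of issuing that ioctl; the function names are mine, not from any of these tools, and the call only succeeds on reflink-capable filesystems such as btrfs, with suitable permissions. Note the kernel re-compares the bytes itself before sharing extents, which is why a clone handler cannot corrupt data on a hash collision.

```python
import fcntl
import os
import struct

FIDEDUPERANGE = 0xC0189436  # _IOWR(0x94, 54, struct file_dedupe_range)
FILE_DEDUPE_RANGE_SAME = 0
FILE_DEDUPE_RANGE_DIFFERS = 1

def pack_dedupe_request(src_offset, length, dest_fds):
    """Build struct file_dedupe_range (24 bytes) followed by one
    struct file_dedupe_range_info (32 bytes) per destination fd."""
    hdr = struct.pack("=QQHHI", src_offset, length, len(dest_fds), 0, 0)
    infos = b"".join(struct.pack("=qQQiI", fd, src_offset, 0, 0, 0)
                     for fd in dest_fds)
    return hdr + infos

def dedupe_whole_file(src, dests):
    """Ask the kernel to share src's extents with each path in dests.
    Returns one status per destination: 0 (shared) or 1 (contents differ,
    as re-verified by the kernel itself)."""
    src_fd = os.open(src, os.O_RDONLY)
    dest_fds = [os.open(d, os.O_RDWR) for d in dests]
    try:
        length = os.fstat(src_fd).st_size
        req = bytearray(pack_dedupe_request(0, length, dest_fds))
        fcntl.ioctl(src_fd, FIDEDUPERANGE, req)
        # The status field sits at offset 24 (header) + 32*i + 24 in each info.
        return [struct.unpack_from("=qQQiI", bytes(req), 24 + 32 * i)[3]
                for i in range(len(dest_fds))]
    finally:
        for fd in [src_fd] + dest_fds:
            os.close(fd)
```

The struct layouts are fixed by the kernel ABI, so the packing helper can be checked without a btrfs filesystem; `dedupe_whole_file` itself needs one.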
Re: Announcing btrfs-dedupe
On Sun, 6 Nov 2016 14:30:52 +0100, James Pharaoh wrote:

> Hi all,
>
> I'm pleased to announce my btrfs deduplication utility, written in
> Rust. It operates on whole files, is fast, and I believe complements
> the existing utilities (duperemove, bedup).
>
> Please visit the homepage for more information:
>
> http://btrfs-dedupe.com

Thanks for sharing your work. Please be aware of these other similar
software:
- jdupes: https://github.com/jbruchon/jdupes
- rmlint: https://github.com/sahib/rmlint
And of course fdupes.

Some interesting points I have seen in them:
- use of xxhash to identify potential duplicates (huge speedup)
- ability to deduplicate read-only snapshots
- identification of potentially reflinked files (see also my email here:
  https://www.spinics.net/lists/linux-btrfs/msg60081.html)
- ability to filter out hardlinks
- the triangle problem: see the jdupes readme
- jdupes has started the process to be included in Debian

I hope that will help and that you can share some code with them!
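To make the xxhash point concrete, here is a small stand-alone sketch of the size-then-hash candidate pipeline these tools share. It is illustrative only: `find_duplicate_sets` is a made-up name, and `hashlib.blake2b` stands in for xxhash, which is not in the Python standard library. A real tool would still confirm candidates byte-by-byte (or let the kernel's dedupe ioctl do it) before acting on them.

```python
import hashlib
import os
from collections import defaultdict

def find_duplicate_sets(paths):
    """Return lists of paths whose contents are (very probably) identical.
    Stage 1 groups by size (free), stage 2 by a fast content hash."""
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)

    def digest(p):
        h = hashlib.blake2b()  # stand-in for xxhash
        with open(p, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.digest()

    dupes = []
    for group in by_size.values():
        if len(group) < 2:
            continue  # a unique size can never be a duplicate
        by_hash = defaultdict(list)
        for p in group:
            by_hash[digest(p)].append(p)
        dupes.extend(g for g in by_hash.values() if len(g) > 1)
    return dupes
```

The speedup the list above refers to comes from stage 1: files with unique sizes are never even read, and a fast non-cryptographic hash keeps stage 2 I/O-bound rather than CPU-bound.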
Re: Identifying reflink / CoW files
On Thu, 3 Nov 2016 01:17:07 -0400, Zygo Blaxell <ce3g8...@umail.furryterror.org> wrote:

> On Thu, Oct 27, 2016 at 01:30:11PM +0200, Saint Germain wrote:
> > Hello,
> >
> > Following the previous discussion:
> > https://www.spinics.net/lists/linux-btrfs/msg19075.html
> >
> > I would be interested in finding a way to reliably identify
> > reflink / CoW files in order to use deduplication programs (like
> > fdupes, jdupes, rmlint) efficiently.
> >
> > Using FIEMAP doesn't seem to be reliable according to this
> > discussion on rmlint:
> > https://github.com/sahib/rmlint/issues/132#issuecomment-157665154
>
> Inline extents have no physical address (FIEMAP returns 0 in that
> field). You can't dedup them, and each file can have only one, so if
> you see the FIEMAP_EXTENT_INLINE bit set, you can just skip
> processing the entire file immediately.
>
> You can create a separate non-inline extent in a temporary file, then
> use dedup to replace _both_ copies of the original inline extent.
> Or don't bother, as the savings are negligible.
>
> > Is there another way that deduplication programs can easily use?
>
> The problem is that it's not files that are reflinked--individual
> extents are. "Reflink file copy" really just means "a file whose
> extents are 100% shared with another file." It's possible for files
> on btrfs to have any percentage of shared extents from 0 to 100% in
> increments of the host page size. It's also possible for the blocks
> to be shared with different extent boundaries.
>
> The quality of the result therefore depends on the amount of effort
> put into measuring it. If you look for the first non-hole extent in
> each file and use its physical address as a physical file identifier,
> then you get a fast reflink detector function that has a high risk of
> false positives. If you map out two files and compare physical
> addresses block by block, you get a slow function with a low risk of
> false positives (but maybe a small risk of false negatives too).
>
> If your dedup program only does full-file reflink copies, then the
> first-extent physical address method is sufficient. If your program
> does block- or extent-level dedup, then it shouldn't be using files in
> its data model at all, except where necessary to provide a mechanism
> to access the physical blocks through the POSIX filesystem API.
>
> FIEMAP will tell you about all the extents (the physical address for
> extents that have one, zero for other extent types). It's also slow
> and has assorted accuracy problems, especially with compressed files.
> Any user can run FIEMAP, and it uses only standard structure arrays.
>
> SEARCH_V2 is root-only and requires parsing variable-length binary
> btrfs data encoding, but it's faster than FIEMAP and gives more
> accurate results on compressed files.

As the dedup program only does full-file reflinks, the first-extent
physical address method can be used as a fast first check to identify
potential files. But how should the second check be implemented in order to
have a 0% risk of false positives? You said that mapping out two files and
comparing the physical addresses block by block still has a low risk of
false positives.

Thank you very much for the detailed explanation!
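For reference, the first-extent heuristic discussed above can be sketched from the fixed `struct fiemap` / `struct fiemap_extent` layout in `linux/fiemap.h`. This is an assumption-laden illustration: the ioctl call itself (FS_IOC_FIEMAP, 0xC020660B) is omitted, and the function names are mine; only the buffer layout and the FIEMAP_EXTENT_DATA_INLINE flag value come from the kernel header.

```python
import struct

# struct fiemap header: fm_start, fm_length, fm_flags, fm_mapped_extents,
# fm_extent_count, fm_reserved (32 bytes total)
FIEMAP_HDR = "=QQIIII"
# struct fiemap_extent: fe_logical, fe_physical, fe_length, 2x reserved64,
# fe_flags, 3x reserved (56 bytes total)
FIEMAP_EXTENT = "=QQQQQIIII"
FIEMAP_EXTENT_DATA_INLINE = 0x200

def parse_extents(buf):
    """Yield (logical, physical, length, flags) for each mapped extent in a
    fiemap reply buffer, as filled in by the FS_IOC_FIEMAP ioctl."""
    mapped = struct.unpack_from(FIEMAP_HDR, buf, 0)[3]
    off = struct.calcsize(FIEMAP_HDR)
    step = struct.calcsize(FIEMAP_EXTENT)
    for i in range(mapped):
        e = struct.unpack_from(FIEMAP_EXTENT, buf, off + i * step)
        yield e[0], e[1], e[2], e[5]

def first_physical(extents):
    """Fast reflink heuristic from the discussion above: the physical
    address of the first extent that has one. Inline extents report
    physical == 0 and are skipped (and, per the thread, mean the whole
    file can be skipped by a dedup tool anyway)."""
    for _logical, physical, _length, flags in extents:
        if flags & FIEMAP_EXTENT_DATA_INLINE or physical == 0:
            continue
        return physical
    return None
```

Comparing the full `parse_extents` output of two files, extent by extent, is the slower second check the reply describes; equal-length extent lists with equal physical addresses make a false positive very unlikely, though as noted above not provably impossible at this layer.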
Identifying reflink / CoW files
Hello,

Following the previous discussion:
https://www.spinics.net/lists/linux-btrfs/msg19075.html

I would be interested in finding a way to reliably identify reflink / CoW
files in order to use deduplication programs (like fdupes, jdupes, rmlint)
efficiently.

Using FIEMAP doesn't seem to be reliable according to this discussion on
rmlint:
https://github.com/sahib/rmlint/issues/132#issuecomment-157665154

Is there another way that deduplication programs can easily use?

Thanks
Re: Kernel bug during RAID1 replace
On Wed, 29 Jun 2016 18:24:07 -0600, Chris Murphy <li...@colorremedies.com> wrote:

> On Wed, Jun 29, 2016 at 5:51 PM, Saint Germain <saint...@gmail.com> wrote:
> > On Wed, 29 Jun 2016 19:23:57 +, Hugo Mills <h...@carfax.org.uk> wrote:
> >> On Wed, Jun 29, 2016 at 09:16:13PM +0200, Saint Germain wrote:
> >> > On Wed, 29 Jun 2016 13:08:30 -0600, Chris Murphy
> >> > <li...@colorremedies.com> wrote:
> >> >
> >> > > >> > Ok I will follow your advice and start over with a fresh
> >> > > >> > BTRFS volume. As explained in another email, rsync doesn't
> >> > > >> > support reflinks, so do you think it is worth trying with
> >> > > >> > btrfs send instead? Is it safe to copy this way, or is
> >> > > >> > rsync more reliable in case of a faulty BTRFS volume?
> >> > > >>
> >> > > >> If you have the space, btrfs restore would probably be the
> >> > > >> best option. It's not likely, but using send has a risk of
> >> > > >> contaminating the new filesystem as well.
> >> > > >
> >> > > > I have to copy through the network (I am running out of
> >> > > > disks...), so btrfs restore is unfortunately not an option.
> >> > > > I didn't know that btrfs send could contaminate the target
> >> > > > disk as well?
> >> > > > Ok, rsync it is then.
> >> > >
> >> > > restore will let you extract files despite csum errors. I don't
> >> > > think send will, and using cp or rsync Btrfs definitely won't
> >> > > hand over the file.
> >> >
> >> > That's OK, I'd prefer to avoid copying files with csum errors
> >> > anyway (I can restore them from backups).
> >> > However, will btrfs send abort the whole operation as soon as it
> >> > finds a csum error?
> >> > And will I risk "contaminating" the target BTRFS volume by using
> >> > btrfs send?
> >>
> >> A send stream is effectively just a sequence of filesystem
> >> commands (mv, cp, cp --reflink, rm, dd). So any damage that it can
> >> do when replayed by receive is limited to what you can do with the
> >> basic shell commands (plus cloning extents). If you have metadata
> >> breakage in your source filesystem, this won't cause the same
> >> metadata breakage to show up in the target filesystem.
> >
> > Well, after 300 GB copied through "btrfs send", the process aborted
> > with the following error:
> > ERROR: send ioctl failed with -5: Input/output error
> > ERROR: unexpected EOF in stream.
> >
> > The relevant /var/log/syslog lines are appended at the end of this
> > email.
> >
> > So it seems that I will have to go with rsync then.
>
> You'll likely hit the same bad file and get EIO, is my guess. What you
> can do is mount it ro from the get-go, and do btrfs send/receive
> again; maybe then it won't hit this sequence where it's finding some
> need to clean up a transaction and free an extent. Maybe you still get
> some failure to send whatever file is using that extent, but I think
> receive will tolerate it.

Well, I tried "btrfs send" and the process stalled at 300 GB (out of a
total of 2 TB) with a never-ending stream of:
"ERROR: unexpected EOF in stream."

I gave up and launched an rsync, which is about to finish. Now I have some
work to do to make sure that all the rsynced files are consistent (I have
to compare them to the backed-up ones).

Thanks for your help, I learned a bit more about BTRFS this way.
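That final consistency pass can be automated along these lines. This is a hypothetical sketch (`tree_digests` and `compare_trees` are my own names, not existing tools): hash every file under the restored tree and the backup tree, then report paths that differ or exist on only one side.

```python
import hashlib
import os

def tree_digests(root):
    """Map each file's path (relative to root) to a SHA-256 of its contents."""
    out = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(full, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            out[os.path.relpath(full, root)] = h.hexdigest()
    return out

def compare_trees(restored, backup):
    """Compare a restored tree against its backup.
    'missing' = in backup only, 'extra' = in restored only,
    'differs' = present in both with different contents."""
    a, b = tree_digests(restored), tree_digests(backup)
    return {
        "missing": sorted(set(b) - set(a)),
        "extra": sorted(set(a) - set(b)),
        "differs": sorted(r for r in set(a) & set(b) if a[r] != b[r]),
    }
```

For a one-off check, `rsync -n -c` (dry run with whole-file checksums) against the backup gives a similar differing-files report without custom code.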
Re: Kernel bug during RAID1 replace
On Wed, 29 Jun 2016 19:23:57 +, Hugo Mills <h...@carfax.org.uk> wrote:

> On Wed, Jun 29, 2016 at 09:16:13PM +0200, Saint Germain wrote:
> > On Wed, 29 Jun 2016 13:08:30 -0600, Chris Murphy
> > <li...@colorremedies.com> wrote:
> >
> > > >> > Ok I will follow your advice and start over with a fresh
> > > >> > BTRFS volume. As explained in another email, rsync doesn't
> > > >> > support reflinks, so do you think it is worth trying with
> > > >> > btrfs send instead? Is it safe to copy this way, or is rsync
> > > >> > more reliable in case of a faulty BTRFS volume?
> > > >>
> > > >> If you have the space, btrfs restore would probably be the best
> > > >> option. It's not likely, but using send has a risk of
> > > >> contaminating the new filesystem as well.
> > > >
> > > > I have to copy through the network (I am running out of
> > > > disks...), so btrfs restore is unfortunately not an option.
> > > > I didn't know that btrfs send could contaminate the target disk
> > > > as well?
> > > > Ok, rsync it is then.
> > >
> > > restore will let you extract files despite csum errors. I don't
> > > think send will, and using cp or rsync Btrfs definitely won't
> > > hand over the file.
> >
> > That's OK, I'd prefer to avoid copying files with csum errors anyway
> > (I can restore them from backups).
> > However, will btrfs send abort the whole operation as soon as it
> > finds a csum error?
> > And will I risk "contaminating" the target BTRFS volume by using
> > btrfs send?
>
> A send stream is effectively just a sequence of filesystem commands
> (mv, cp, cp --reflink, rm, dd). So any damage that it can do when
> replayed by receive is limited to what you can do with the basic shell
> commands (plus cloning extents). If you have metadata breakage in your
> source filesystem, this won't cause the same metadata breakage to show
> up in the target filesystem.

Well, after 300 GB copied through "btrfs send", the process aborted with
the following error:

ERROR: send ioctl failed with -5: Input/output error
ERROR: unexpected EOF in stream.

The relevant /var/log/syslog lines are appended at the end of this email.
So it seems that I will have to go with rsync then.

WARNING: CPU: 3 PID: 1779 at /build/linux-9LouV5/linux-4.6.1/fs/btrfs/extent-tree.c:6608 __btrfs_free_extent.isra.67+0x152/0xdc0 [btrfs]
BTRFS: Transaction aborted (error -5)
Modules linked in: bnep(E) snd_hda_codec_hdmi(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) fscache(E) sunrpc(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) nls_utf8(E) hmac(E) nls_cp437(E) drbg(E) ansi_cprng(E) vfat(E) fat(E) wl(POE) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) cfg80211(E) pcspkr(E) snd_hda_intel(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) evdev(E) btusb(E) efi_pstore(E) joydev(E) btrtl(E) serio_raw(E) snd_pcm(E) snd_timer(E) snd(E) soundcore(E) shpchp(E) efivars(E) i2c_i801(E) hci_uart(E) btbcm(E) btqca(E) btintel(E) bluetooth(E) rfkill(E) i915(E) battery(E) crc16(E) video(E) intel_lpss_acpi(E) drm_kms_helper(E) intel_lpss(E) mfd_core(E) tpm_tis(E) acpi_pad(E) tpm(E) drm(E) acpi_als(E) kfifo_buf(E) i2c_algo_bit(E) mei_me(E) button(E) processor(E) industrialio(E) mei(E) fuse(E) autofs4(E) btrfs(E) xor(E) raid6_pq(E) sg(E) sd_mod(E) hid_logitech_hidpp(E) hid_logitech_dj(E) usbhid(E) ahci(E) libahci(E) crc32c_intel(E) e1000e(E) xhci_pci(E) xhci_hcd(E) ptp(E) psmouse(E) libata(E) pps_core(E) scsi_mod(E) usbcore(E) usb_common(E) i2c_hid(E) hid(E) fjes(E)
CPU: 3 PID: 1779 Comm: btrfs-transacti Tainted: P OE 4.6.0-0.bpo.1-amd64 #1 Debian 4.6.1-1~bpo8+1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z170 Gaming-ITX/ac, BIOS P2.10 04/13/2016
0286 3e3b5862 813123c5 880067783b68 8107af94 01b8f6065000 880067783bc0 88006b328000 880165fc6000 880105a21150
Call Trace:
[] ? dump_stack+0x5c/0x77
[] ? __warn+0xc4/0xe0
[] ? warn_slowpath_fmt+0x5f/0x80
[] ? __btrfs_free_extent.isra.67+0x152/0xdc0 [btrfs]
[] ? btrfs_merge_delayed_refs+0x6c/0x610 [btrfs]
[] ? __btrfs_run_delayed_refs+0x9ad/0x1210 [btrfs]
[] ? btrfs_run_delayed_refs+0x8e/0x2b0 [btrfs]
[] ? btrfs_commit_transaction+0x4a3/0xa30 [btrfs]
[] ? start_transaction+0x96/0x4d0 [btrfs]
[] ? transaction_kthread+0x1ce/0x1f0
Re: Kernel bug during RAID1 replace
On Wed, 29 Jun 2016 13:08:30 -0600, Chris Murphy wrote:

> >> > Ok I will follow your advice and start over with a fresh BTRFS
> >> > volume. As explained in another email, rsync doesn't support
> >> > reflinks, so do you think it is worth trying with btrfs send
> >> > instead? Is it safe to copy this way, or is rsync more reliable
> >> > in case of a faulty BTRFS volume?
> >>
> >> If you have the space, btrfs restore would probably be the best
> >> option. It's not likely, but using send has a risk of contaminating
> >> the new filesystem as well.
> >
> > I have to copy through the network (I am running out of disks...),
> > so btrfs restore is unfortunately not an option.
> > I didn't know that btrfs send could contaminate the target disk as
> > well?
> > Ok, rsync it is then.
>
> restore will let you extract files despite csum errors. I don't think
> send will, and using cp or rsync Btrfs definitely won't hand over the
> file.

That's OK, I'd prefer to avoid copying files with csum errors anyway (I can
restore them from backups).
However, will btrfs send abort the whole operation as soon as it finds a
csum error?
And will I risk "contaminating" the target BTRFS volume by using btrfs
send?

Thanks!
Re: Kernel bug during RAID1 replace
On Wed, 29 Jun 2016 14:19:23 -0400, "Austin S. Hemmelgarn" wrote:

> >>> Already got a backup. I just really want to try to repair it (in
> >>> order to test BTRFS).
> >>
> >> I don't know that this is a good test, because I think the file
> >> system has already been sufficiently corrupted that it can't be
> >> fixed. Part of the problem is that Btrfs isn't aware of faulty
> >> drives like mdadm or lvm yet, so it looks like it'll try to write
> >> to all devices, and it's possible for significant confusion to
> >> happen if they're each getting different generation writes.
> >> Significant as in, currently beyond repair.
> >
> > On the other hand it seems interesting to repair instead of just
> > giving up. It gives a good look at BTRFS resiliency/reliability.
>
> On the one hand Btrfs shouldn't become inconsistent in the first
> place; that's the design goal. On the other hand, I'm finding
> from the problems reported on the list that Btrfs increasingly
> mounts at least read-only and allows getting data off, even when
> the file system isn't fully functional or repairable.
>
> In your case, once there are metadata problems even with raid 1,
> it's difficult at best. But once you have the backup you could
> try some other things, once it's certain the hardware isn't
> adding to the problems, which I'm still not yet certain of.
>
> >>> I'm ready to try anything. Let's experiment.
> >>
> >> I kinda think it's a waste of time. Someone else maybe has a better
> >> idea?
> >>
> >> I think your time is better spent finding out when and why the
> >> device with all of these write errors happened. It must have gone
> >> missing for a while, and you need to find out why that happened
> >> and prevent it; OR you have to be really vigilant at every mount
> >> time to make sure both devices have the same transid (generation).
> >> In my case when I tried to sabotage this, being off by a generation
> >> of 1 wasn't a problem for Btrfs to automatically fix up, but I
> >> suspect it was only a generation mismatch in the superblock.
> >
> > Ok I will follow your advice and start over with a fresh BTRFS
> > volume. As explained in another email, rsync doesn't support
> > reflinks, so do you think it is worth trying with btrfs send
> > instead? Is it safe to copy this way, or is rsync more reliable in
> > case of a faulty BTRFS volume?
>
> If you have the space, btrfs restore would probably be the best
> option. It's not likely, but using send has a risk of contaminating
> the new filesystem as well.

I have to copy through the network (I am running out of disks...), so btrfs
restore is unfortunately not an option.
I didn't know that btrfs send could contaminate the target disk as well?
Ok, rsync it is then.

Thanks
Re: Kernel bug during RAID1 replace
On Wed, 29 Jun 2016 11:28:24 -0600, Chris Murphy wrote:

> > Already got a backup. I just really want to try to repair it (in
> > order to test BTRFS).
>
> I don't know that this is a good test, because I think the file system
> has already been sufficiently corrupted that it can't be fixed. Part
> of the problem is that Btrfs isn't aware of faulty drives like mdadm
> or lvm yet, so it looks like it'll try to write to all devices, and
> it's possible for significant confusion to happen if they're each
> getting different generation writes. Significant as in, currently
> beyond repair.
>
> >> > On the other hand it seems interesting to repair instead of just
> >> > giving up. It gives a good look at BTRFS resiliency/reliability.
> >>
> >> On the one hand Btrfs shouldn't become inconsistent in the first
> >> place; that's the design goal. On the other hand, I'm finding from
> >> the problems reported on the list that Btrfs increasingly mounts
> >> at least read-only and allows getting data off, even when the file
> >> system isn't fully functional or repairable.
> >>
> >> In your case, once there are metadata problems even with raid 1,
> >> it's difficult at best. But once you have the backup you could try
> >> some other things, once it's certain the hardware isn't adding to
> >> the problems, which I'm still not yet certain of.
> >
> > I'm ready to try anything. Let's experiment.
>
> I kinda think it's a waste of time. Someone else maybe has a better
> idea?
>
> I think your time is better spent finding out when and why the device
> with all of these write errors happened. It must have gone missing for
> a while, and you need to find out why that happened and prevent it; OR
> you have to be really vigilant at every mount time to make sure both
> devices have the same transid (generation). In my case when I tried to
> sabotage this, being off by a generation of 1 wasn't a problem for
> Btrfs to automatically fix up, but I suspect it was only a generation
> mismatch in the superblock.

Ok, I will follow your advice and start over with a fresh BTRFS volume. As
explained in another email, rsync doesn't support reflinks, so do you think
it is worth trying with btrfs send instead? Is it safe to copy this way, or
is rsync more reliable in case of a faulty BTRFS volume?

Many thanks!
Re: Kernel bug during RAID1 replace
On Wed, 29 Jun 2016 11:50:55 +0200, Saint Germain <saint...@gmail.com> wrote:

> So if I understand correctly, you advise to use check --repair
> --init-csum-tree and delete the files which were reported as having
> checksum errors?
> After that I can compare the important files to a backup, but there
> are always the non-important files which are not backed up.
>
> Is there any way I can be sure afterwards that the volume is indeed
> completely correct and reliable?
> If there is no way to be sure, I think it is better that I cp/rsync
> all data to a new BTRFS volume.

Oh, and I forgot to add that rsync doesn't support reflinks yet, so I am a
bit reluctant to rsync all data to a new volume instead of repairing the
existing BTRFS volume.
Re: Kernel bug during RAID1 replace
On Tue, 28 Jun 2016 22:25:32 -0600, Chris Murphy wrote:

> > Well I made a ddrescue image of both drives (only one error on sdb
> > during the ddrescue copy) and started the computer again (after
> > disconnecting the old drives).
>
> What was the error? Any kernel message at the time of this error?

ddrescue reported an error during operation ("error: 1" displayed). A dump
of /var/log/syslog during the ddrescue operation is appended at the end of
this email.

> > I don't know if I should continue trying to repair this RAID1 or if
> > I should just cp/rsync to a new BTRFS volume and be done with it.
>
> Well, for sure already you should prepare to lose this volume, so
> whatever backup you need, do that yesterday.

Already got a backup. I just really want to try to repair it (in order to
test BTRFS).

> > On the other hand it seems interesting to repair instead of just
> > giving up. It gives a good look at BTRFS resiliency/reliability.
>
> On the one hand Btrfs shouldn't become inconsistent in the first
> place; that's the design goal. On the other hand, I'm finding from the
> problems reported on the list that Btrfs increasingly mounts at least
> read-only and allows getting data off, even when the file system isn't
> fully functional or repairable.
>
> In your case, once there are metadata problems even with raid 1, it's
> difficult at best. But once you have the backup you could try some
> other things, once it's certain the hardware isn't adding to the
> problems, which I'm still not yet certain of.

I'm ready to try anything. Let's experiment.

> > Here is the log from the mount to the scrub aborting, and the result
> > from smartctl.
> >
> > Thanks for your precious help so far.
> >
> > BTRFS error (device sdb1): cleaner transaction attach returned -30
>
> Not sure what this is. The Btrfs cleaner is used to remove snapshots,
> decrement extent reference counts, and if a count reaches 0, free up
> that space. So, why is it running? I don't know what -30 means.
>
> > BTRFS info (device sdb1): disk space caching is enabled
> > BTRFS info (device sdb1): bdev /dev/sdb1 errs: wr 11881695, rd 14,
> > flush 7928, corrupt 1714507, gen 1335
> > BTRFS info (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0,
> > corrupt 21622, gen 24
>
> I missed something the first time around in these messages: the
> generation error. Both drives have generation errors. A generation
> error on a single drive means that drive was not successfully being
> written to, or was missing. For it to happen on both drives is bad. If
> it happens to just one drive, once it reappears it will be passively
> caught up to the other one as reads happen, but best practice for now
> requires the user to run scrub or balance. If that doesn't happen and
> a 2nd drive vanishes or has write errors that cause generation
> mismatches, now both drives are simultaneously behind and ahead of
> each other. Some commits went to one drive, some went to the other.
> And right now Btrfs totally flips out and will get irreparably
> corrupted.
>
> So I have to ask: was this volume ever mounted degraded? If not, you
> really need to look at logs and find out why the drives weren't being
> written to. sdb shows lots of write, flush, corruption and generation
> errors, so it seems like it was having a hardware issue. But then sda
> has only corruptions and generation problems, as if it wasn't even
> connected or powered on.
>
> OR another possibility is that one of the drives was previously cloned
> (block copied), or snapshotted via LVM, and you ran into the
> block-level copies gotcha:
> https://btrfs.wiki.kernel.org/index.php/Gotchas

I got some errors on sdb 2 months ago (I noticed because it was suddenly
mounted read-only). I ran a scrub and a check --repair, and a lot of errors
were corrected. I deleted the files which were not repairable and
everything was running smoothly since then. I ran a scrub a few weeks ago
and everything was fine. I never mounted in degraded mode or made a
snapshot via LVM (I only upgraded both drives through "replace" 6 months
ago).

> > BTRFS warning (device sdb1): checksum error at logical 93445255168
> > on dev /dev/sdb1, sector 54528696, root 5, inode 3434831, offset
> > 479232, length 4096, links 1 (path:
> > user/.local/share/zeitgeist/activity.sqlite-wal)
>
> Some extent data and its checksum don't match, on sdb. So this file is
> considered corrupt. Maybe the data is OK and the checksum is wrong?
>
> > btrfs_dev_stat_print_on_error: 164 callbacks suppressed
> > BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 11881695, rd 14,
> > flush 7928, corrupt 1714508, gen 1335
> > scrub_handle_errored_block: 164 callbacks suppressed
> > BTRFS error (device sdb1): unable to fixup (regular) error at
> > logical 93445255168 on dev /dev/sdb1
>
> And it can't be fixed, because...
>
> > BTRFS warning (device sdb1): checksum error at logical 93445255168
> > on dev /dev/sda1, sector
Re: Kernel bug during RAID1 replace
On Mon, 27 Jun 2016 20:14:58 -0600, Chris Murphy <li...@colorremedies.com> wrote:

> On Mon, Jun 27, 2016 at 6:49 PM, Saint Germain <saint...@gmail.com> wrote:
> >
> > I've tried both options and launched a replace, but I got the same
> > error (replace is cancelled, kernel bug).
> > I will leave these options on and attempt a ddrescue of /dev/sda
> > to /dev/sdd.
> > Then I will disconnect /dev/sda, reboot, and see if it works better.
>
> Sounds reasonable. Just make sure the file system is already unmounted
> when you use ddrescue, because otherwise you're block-copying it while
> it could be modified while rw mounted (the generation number tends to
> get incremented while rw mounted).

Well, I made a ddrescue image of both drives (only one error on sdb during
the ddrescue copy) and started the computer again (after disconnecting the
old drives). However the errors remain there, and I still cannot scrub
(scrub is aborted), nor delete the files which have errors (the drive is
remounted read-only if I try to delete the files).

I don't know if I should continue trying to repair this RAID1 or if I
should just cp/rsync to a new BTRFS volume and be done with it. On the
other hand it seems interesting to repair instead of just giving up. It
gives a good look at BTRFS resiliency/reliability.

Here is the log from the mount to the scrub aborting, and the result from
smartctl.

Thanks for your precious help so far.

BTRFS error (device sdb1): cleaner transaction attach returned -30
BTRFS info (device sdb1): disk space caching is enabled
BTRFS info (device sdb1): bdev /dev/sdb1 errs: wr 11881695, rd 14, flush 7928, corrupt 1714507, gen 1335
BTRFS info (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 21622, gen 24
scrub_handle_errored_block: 164 callbacks suppressed
BTRFS warning (device sdb1): checksum error at logical 93445255168 on dev /dev/sdb1, sector 54528696, root 5, inode 3434831, offset 479232, length 4096, links 1 (path: user/.local/share/zeitgeist/activity.sqlite-wal)
btrfs_dev_stat_print_on_error: 164 callbacks suppressed
BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 11881695, rd 14, flush 7928, corrupt 1714508, gen 1335
scrub_handle_errored_block: 164 callbacks suppressed
BTRFS error (device sdb1): unable to fixup (regular) error at logical 93445255168 on dev /dev/sdb1
BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 11881695, rd 14, flush 7928, corrupt 1714509, gen 1335
BTRFS error (device sdb1): unable to fixup (regular) error at logical 93445259264 on dev /dev/sdb1
BTRFS warning (device sdb1): checksum error at logical 93445255168 on dev /dev/sda1, sector 77669048, root 5, inode 3434831, offset 479232, length 4096, links 1 (path: user/.local/share/zeitgeist/activity.sqlite-wal)
BTRFS error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 21623, gen 24
BTRFS error (device sdb1): unable to fixup (regular) error at logical 93445255168 on dev /dev/sda1
BTRFS error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 21624, gen 24
BTRFS error (device sdb1): unable to fixup (regular) error at logical 93445259264 on dev /dev/sda1
BTRFS warning (device sdb1): checksum error at logical 136349810688 on dev /dev/sda1, sector 140429952, root 5, inode 4265283, offset 0, length 4096, links 1 (path: user/Pictures/Picture-42-2.jpg)
BTRFS error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 21625, gen 24
BTRFS warning (device sdb1): checksum error at logical 136349929472 on dev /dev/sda1, sector 140430184, root 5, inode 4265283, offset 118784, length 4096, links 1 (path: user/Pictures/Picture-42-2.jpg)
BTRFS warning (device sdb1): checksum error at logical 136350060544 on dev /dev/sda1, sector 140430440, root 5, inode 4265283, offset 249856, length 4096, links 1 (path: user/Pictures/Picture-42-2.jpg)
BTRFS error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 21626, gen 24
BTRFS error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 21627, gen 24
BTRFS error (device sdb1): unable to fixup (regular) error at logical 136349810688 on dev /dev/sda1
BTRFS error (device sdb1): unable to fixup (regular) error at logical 136350060544 on dev /dev/sda1
BTRFS error (device sdb1): unable to fixup (regular) error at logical 136349929472 on dev /dev/sda1
BTRFS warning (device sdb1): checksum error at logical 136349814784 on dev /dev/sda1, sector 140429960, root 5, inode 4265283, offset 4096, length 4096, links 1 (path: user/Pictures/Picture-42-2.jpg)
BTRFS error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 21628, gen 24
BTRFS warning (device sdb1): checksum error at logical 136350064640 on dev /dev/sda1, sector 140430448, root 5, inode 4265283, offset 253952, length 4096, links 1 (path: user/Pictures/Picture-42-2.jpg)
BTRFS error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush
Re: Kernel bug during RAID1 replace
On Mon, 27 Jun 2016 18:00:34 -0600, Chris Murphy <li...@colorremedies.com> wrote : > On Mon, Jun 27, 2016 at 5:06 PM, Saint Germain <saint...@gmail.com> > wrote: > > On Mon, 27 Jun 2016 16:58:37 -0600, Chris Murphy > > <li...@colorremedies.com> wrote : > > > >> On Mon, Jun 27, 2016 at 4:55 PM, Chris Murphy > >> <li...@colorremedies.com> wrote: > >> > >> >> BTRFS info (device sdb1): dev_replace from /dev/sda1 (devid 1) > >> >> to /dev/sdd1 started scrub_handle_errored_block: 166 callbacks > >> >> suppressed BTRFS warning (device sdb1): checksum error at > >> >> logical 93445255168 on dev /dev/sda1, sector 77669048, root 5, > >> >> inode 3434831, offset 479232, length 4096, links 1 (path: > >> >> user/.local/share/zeitgeist/activity.sqlite-wal) > >> >> btrfs_dev_stat_print_on_error: 166 callbacks suppressed BTRFS > >> >> error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, > >> >> corrupt 14221, gen 24 scrub_handle_errored_block: 166 callbacks > >> >> suppressed BTRFS error (device sdb1): unable to fixup (regular) > >> >> error at logical 93445255168 on dev /dev/sda1 > >> > > >> > Shoot. You have a lot of these. It looks suspiciously like you're > >> > hitting a case list regulars are only just starting to understand > >> > >> Forget this part completely. It doesn't affect raid1. I just > >> re-read that your setup is not raid1, I don't know why I thought > >> it was raid5. > >> > >> The likely issue here is that you've got legit corruptions on sda > >> (mix of slow and flat out bad sectors), as well as a failing drive. 
> >> > >> This is also safe to issue: > >> > >> smartctl -l scterc /dev/sda > >> smartctl -l scterc /dev/sdb > >> cat /sys/block/sda/device/timeout > >> cat /sys/block/sdb/device/timeout > >> > > > > My setup is indeed RAID1 (and not RAID5) > > > > root@system:/# smartctl -l scterc /dev/sda > > smartctl 6.4 2014-10-07 r4002 [x86_64-linux-4.6.0-0.bpo.1-amd64] > > (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke, > > www.smartmontools.org > > > > SCT Error Recovery Control: > > Read: Disabled > > Write: Disabled > > > > root@system:/# smartctl -l scterc /dev/sdb > > smartctl 6.4 2014-10-07 r4002 [x86_64-linux-4.6.0-0.bpo.1-amd64] > > (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke, > > www.smartmontools.org > > > > SCT Error Recovery Control: > > Read: Disabled > > Write: Disabled > > > > root@system:/# cat /sys/block/sda/device/timeout > > 30 > > root@system:/# cat /sys/block/sdb/device/timeout > > 30 > > Good news and bad news. The bad news is this is a significant > misconfiguration, it's very common, and it means that any bad sectors > that don't result in read errors before 30 seconds will mean they > don't get fixed by Btrfs (or even mdadm or LVM raid). So they can > accumulate. > > There are two options since your drives support SCT ERC. > > 1. > smartctl -l scterc,70,70 /dev/sdX ## done for both drives > > That will make sure the drive reports a read error in 7 seconds, well > under the kernel's command timer of 30 seconds. This is how your drives > should normally be configured for RAID usage. > > 2. > echo 180 > /sys/block/sda/device/timeout > echo 180 > /sys/block/sdb/device/timeout > > This *might* actually work better in your case. If you permit the > drives to have really long error recovery, it might actually allow the > data to be returned to Btrfs and then it can start fixing problems. > Maybe. It's a long shot. And there will be upwards of 3 minute hangs. > > I would give this a shot first. 
You can issue these commands safely at > any time, no umount is needed or anything like that. I would do this > even before using cp/rsync or ddrescue because it increases the chance > the drive can recover data from these bad sectors and fix the other > drive. > > These settings are not persistent across a reboot unless you set a > udev rule or equivalent. > > On one of my drives that supports SCT ERC it only accepts the smartctl > -l command to set the timeout once. I can't change it without power > cycling the drive or it just crashes (yay firmware bugs). Just FYI > it's possible to run into other weirdness. > I've tried both options and launched a replace, but I got the same error (replace is cancelled, kernel bug)
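For reference, since Chris mentions persisting these settings with "a udev rule or equivalent": a sketch of such a rule might look like the following. This exact rule is not from the thread; the 180-second value and the sd[ab] device match are assumptions matching this setup, and the SCT ERC setting itself would still need smartctl run at boot (e.g. from rc.local), since it lives in drive firmware rather than sysfs.

```
# /etc/udev/rules.d/60-disk-timeout.rules -- hypothetical example, adjust the device match
ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd[ab]", ATTR{device/timeout}="180"
```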
Re: Kernel bug during RAID1 replace
On Mon, 27 Jun 2016 16:58:37 -0600, Chris Murphy wrote : > On Mon, Jun 27, 2016 at 4:55 PM, Chris Murphy > wrote: > > >> BTRFS info (device sdb1): dev_replace from /dev/sda1 (devid 1) > >> to /dev/sdd1 started scrub_handle_errored_block: 166 callbacks > >> suppressed BTRFS warning (device sdb1): checksum error at logical > >> 93445255168 on dev /dev/sda1, sector 77669048, root 5, inode > >> 3434831, offset 479232, length 4096, links 1 (path: > >> user/.local/share/zeitgeist/activity.sqlite-wal) > >> btrfs_dev_stat_print_on_error: 166 callbacks suppressed BTRFS > >> error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, > >> corrupt 14221, gen 24 scrub_handle_errored_block: 166 callbacks > >> suppressed BTRFS error (device sdb1): unable to fixup (regular) > >> error at logical 93445255168 on dev /dev/sda1 > > > > Shoot. You have a lot of these. It looks suspiciously like you're > > hitting a case list regulars are only just starting to understand > > Forget this part completely. It doesn't affect raid1. I just re-read > that your setup is not raid1, I don't know why I thought it was raid5. > > The likely issue here is that you've got legit corruptions on sda (mix > of slow and flat out bad sectors), as well as a failing drive. 
> > This is also safe to issue:
> > smartctl -l scterc /dev/sda
> > smartctl -l scterc /dev/sdb
> > cat /sys/block/sda/device/timeout
> > cat /sys/block/sdb/device/timeout

My setup is indeed RAID1 (and not RAID5)

root@system:/# smartctl -l scterc /dev/sda
smartctl 6.4 2014-10-07 r4002 [x86_64-linux-4.6.0-0.bpo.1-amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control:
  Read: Disabled
 Write: Disabled

root@system:/# smartctl -l scterc /dev/sdb
smartctl 6.4 2014-10-07 r4002 [x86_64-linux-4.6.0-0.bpo.1-amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control:
  Read: Disabled
 Write: Disabled

root@system:/# cat /sys/block/sda/device/timeout
30
root@system:/# cat /sys/block/sdb/device/timeout
30
Re: Kernel bug during RAID1 replace
On Mon, 27 Jun 2016 16:55:07 -0600, Chris Murphy <li...@colorremedies.com> wrote : > On Mon, Jun 27, 2016 at 4:26 PM, Saint Germain <saint...@gmail.com> > wrote: > > >> > > > > Thanks for your help. > > > > Ok here is the log from the mounting, and including btrfs replace > > (btrfs replace start -f /dev/sda1 /dev/sdd1 /home): > > > > BTRFS info (device sdb1): disk space caching is enabled > > BTRFS info (device sdb1): bdev /dev/sdb1 errs: wr 11881695, rd 12, > > flush 7928, corrupt 1705631, gen 1335 BTRFS info (device sdb1): > > bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 14220, gen 24 > > Eek. So sdb has 11+ million write errors, flush errors, read errors, > and over 1 million corruptions. It's dying or dead. > > And sda has a dozen thousand+ corruptions. This isn't a good > combination, as you have two devices with problems and raid5 only > protects you from one device with problems. > > You were in the process of replacing sda, which is good, but it may > not be enough... > > > > BTRFS info (device sdb1): dev_replace from /dev/sda1 (devid 1) > > to /dev/sdd1 started scrub_handle_errored_block: 166 callbacks > > suppressed BTRFS warning (device sdb1): checksum error at logical > > 93445255168 on dev /dev/sda1, sector 77669048, root 5, inode > > 3434831, offset 479232, length 4096, links 1 (path: > > user/.local/share/zeitgeist/activity.sqlite-wal) > > btrfs_dev_stat_print_on_error: 166 callbacks suppressed BTRFS error > > (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt > > 14221, gen 24 scrub_handle_errored_block: 166 callbacks suppressed > > BTRFS error (device sdb1): unable to fixup (regular) error at > > logical 93445255168 on dev /dev/sda1 > > Shoot. You have a lot of these. 
It looks suspiciously like you're > hitting a case list regulars are only just starting to understand > (somewhat) where it's possible to have a legit corrupt sector that > Btrfs detects during scrub as wrong, fixes it from parity, but then > occasionally wrongly overwrites the parity with bad parity. This > doesn't cause an immediately recognizable problem. But if the volume > becomes degraded later, Btrfs must use parity to reconstruct > on-the-fly and if it hits one of these bad parities, the > reconstruction is bad, and ends up causing lots of these checksum > errors. We can tell it's not metadata corruption because a.) there's a > file listed as being affected and b.) the file system doesn't fail and > go read only. But still it means those files are likely toast... > > > [...snip many instances of checksum errors...] > > > BTRFS error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush > > 0, corrupt 16217, gen 24 ata2.00: exception Emask 0x0 SAct 0x4000 > > SErr 0x0 action 0x0 ata2.00: irq_stat 0x4008 > > ata2.00: failed command: READ FPDMA QUEUED > > ata2.00: cmd 60/08:70:08:d8:70/00:00:0f:00:00/40 tag 14 ncq 4096 in > > res 41/40:00:08:d8:70/00:00:0f:00:00/40 Emask 0x409 (media > > error) ata2.00: status: { DRDY ERR } > > ata2.00: error: { UNC } > > ata2.00: configured for UDMA/133 > > sd 1:0:0:0: [sdb] tag#14 FAILED Result: hostbyte=DID_OK > > driverbyte=DRIVER_SENSE sd 1:0:0:0: [sdb] tag#14 Sense Key : Medium > > Error [current] [descriptor] sd 1:0:0:0: [sdb] tag#14 Add. Sense: > > Unrecovered read error - auto reallocate failed sd 1:0:0:0: [sdb] > > tag#14 CDB: Read(10) 28 00 0f 70 d8 08 00 00 08 00 > > blk_update_request: I/O error, dev sdb, sector 259053576 > > OK yeah so bad sector on sdb. So you have two failures because sda is > already giving you trouble while being replaced and on top of it you > now get a 2nd (partial) failure via bad sectors. 
> > So rather urgently I think you need to copy things off this volume if > you don't already have a backup so you can save as much as possible. > Don't write to the drives. You might even consider 'mount -o > remount,ro' to avoid anything writing to the volume. Copy the most > important data first, triage time. > > While that happens you can safely collect some more information: > > btrfs fi us > smartctl -x ## for both drives

Ok thanks I will begin to make an image with dd. Do you recommend to use sda or sdb ?

In the meantime here is the info requested:

btrfs fi us /home
Overall:
    Device size:          3.63TiB
    Device allocated:     2.76TiB
    Device unallocated: 888.51GiB
    Device missing:         0.00B
    Used:                 2.62TiB
    Free (estimated):   517.56GiB  (min: 517.56GiB)
    Data ratio:              2.00
    Metadata ratio:
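As an aside, Chris's "copy the most important data first, triage time" advice can be scripted as a priority-ordered copy loop. Below is a minimal, self-contained sketch; the directory names and demo data are invented for illustration, and in real use SRC would be the (ideally read-only remounted) btrfs mount and DEST the rescue disk:

```shell
#!/bin/sh
# Triage-copy sketch: walk a priority-ordered list of directories,
# copying whatever exists, most important first.
# Demo data is created locally so the script runs as-is; the
# directory names (Documents, Pictures, Mail) are examples only.
set -eu
SRC=demo-src
DEST=demo-dest
mkdir -p "$SRC/Documents" "$DEST"
echo "important" > "$SRC/Documents/notes.txt"
for d in Documents Pictures Mail; do    # priority order
    if [ -d "$SRC/$d" ]; then
        cp -a "$SRC/$d" "$DEST/"
    fi
done
ls "$DEST"    # prints "Documents"
```

The point of the loop shape is that a read error on a low-priority directory cannot prevent the high-priority ones from having been copied first.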
Re: Kernel bug during RAID1 replace
On Mon, 27 Jun 2016 15:42:42 -0600, Chris Murphy <li...@colorremedies.com> wrote : > On Mon, Jun 27, 2016 at 3:36 PM, Saint Germain <saint...@gmail.com> > wrote: > > Hello, > > > > I am on Debian Jessie with a kernel from backports: > > 4.6.0-0.bpo.1-amd64 > > > > I am also using btrfs-tools 4.4.1-1.1~bpo8+1 > > > > When trying to replace a RAID1 drive (with btrfs replace start > > -f /dev/sda1 /dev/sdd1), the operation is cancelled after completing > > only 5%. > > > > I got this error in the /var/log/syslog: > > [ cut here ] > > WARNING: CPU: 2 PID: 2617 > > at /build/linux-9LouV5/linux-4.6.1/fs/btrfs/dev-replace.c:430 > > btrfs_dev_replace_start+0x2be/0x400 [btrfs] Modules linked in: > > uas(E) usb_storage(E) bnep(E) ftdi_sio(E) usbserial(E) > > snd_hda_codec_hdmi(E) nls_utf8(E) nls_cp437(E) vfat(E) fat(E) > > intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) > > coretemp(E) kvm_intel(E) kvm(E) iTCO_wdt(E) irqbypass(E) > > iTCO_vendor_support(E) crct10dif_pclmul(E) crc32_pclmul(E) > > ghash_clmulni_intel(E) hmac(E) drbg(E) ansi_cprng(E) aesni_intel(E) > > aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) > > cryptd(E) wl(POE) btusb(E) btrtl(E) btbcm(E) btintel(E) cfg80211(E) > > bluetooth(E) efi_pstore(E) snd_hda_codec_realtek(E) evdev(E) > > crc16(E) serio_raw(E) pcspkr(E) efivars(E) joydev(E) > > snd_hda_codec_generic(E) rfkill(E) snd_hda_intel(E) nuvoton_cir(E) > > rc_core(E) snd_hda_codec(E) i915(E) battery(E) snd_hda_core(E) > > snd_hwdep(E) soc_button_array(E) tpm_tis(E) drm_kms_helper(E) > > intel_smartconnect(E) snd_pcm(E) tpm(E) video(E) i2c_i801(E) > > snd_timer(E) drm(E) snd(E) lpc_ich(E) i2c_algo_bit(E) soundcore(E) > > mfd_core(E) mei_me(E) processor(E) button(E) mei(E) shpchp(E) > > fuse(E) autofs4(E) hid_logitech_hidpp(E) btrfs(E) > > hid_logitech_dj(E) usbhid(E) hid(E) xor(E) raid6_pq(E) sg(E) > > sr_mod(E) cdrom(E) sd_mod(E) crc32c_intel(E) ahci(E) libahci(E) > > libata(E) psmouse(E) scsi_mod(E) xhci_pci(E) ehci_pci(E) > > 
xhci_hcd(E) ehci_hcd(E) e1000e(E) usbcore(E) ptp(E) pps_core(E) > > usb_common(E) fjes(E) CPU: 2 PID: 2617 Comm: btrfs Tainted: > > P OE 4.6.0-0.bpo.1-amd64 #1 Debian 4.6.1-1~bpo8+1 > > Hardware name: To Be Filled By O.E.M. To Be Filled By > > O.E.M./Z87E-ITX, BIOS P2.10 10/04/2013 0286 > > f0ba7fe7 813123c5 > > 8107af94 880186caf000 fffb 8800c76b0800 > > 8800cae7 8800cae70ee0 7ffdd5397d98 Call Trace: > > [] ? dump_stack+0x5c/0x77 [] ? > > __warn+0xc4/0xe0 [] ? > > btrfs_dev_replace_start+0x2be/0x400 [btrfs] [] ? > > btrfs_ioctl+0x1d42/0x2190 [btrfs] [] ? > > handle_mm_fault+0x154d/0x1cb0 [] ? > > do_vfs_ioctl+0x99/0x5d0 [] ? SyS_ioctl+0x76/0x90 > > [] ? system_call_fast_compare_end+0xc/0x96 > > ---[ end trace 9fbfaa137cc5a72a ]--- > > > > > > > > What should I do to replace correctly my drive ? > > I don't often see handle_mm_fault with btrfs problems, maybe the > entire dmesg from mounting the fs and including btrfs replace would > reveal a related problem that instigates the failure? > > If the device being replaced is acting unreliably, then you'd want to > use -r with replace to ignore that device unless it's absolutely > necessary to read from it. > Thanks for your help. 
Ok here is the log from the mounting, and including btrfs replace (btrfs replace start -f /dev/sda1 /dev/sdd1 /home):

BTRFS info (device sdb1): disk space caching is enabled
BTRFS info (device sdb1): bdev /dev/sdb1 errs: wr 11881695, rd 12, flush 7928, corrupt 1705631, gen 1335
BTRFS info (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 14220, gen 24
BTRFS info (device sdb1): dev_replace from /dev/sda1 (devid 1) to /dev/sdd1 started
scrub_handle_errored_block: 166 callbacks suppressed
BTRFS warning (device sdb1): checksum error at logical 93445255168 on dev /dev/sda1, sector 77669048, root 5, inode 3434831, offset 479232, length 4096, links 1 (path: user/.local/share/zeitgeist/activity.sqlite-wal)
btrfs_dev_stat_print_on_error: 166 callbacks suppressed
BTRFS error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 14221, gen 24
scrub_handle_errored_block: 166 callbacks suppressed
BTRFS error (device sdb1): unable to fixup (regular) error at logical 93445255168 on dev /dev/sda1
BTRFS error (device sdb1): bdev /dev/sda1 errs: wr 0, rd 0, flush 0, corrupt 14222, gen 24
BTRFS error (device sdb1): unable to fixup (regular) error at logical 93445259264 on dev /dev/sda1
BTRFS warning (device sdb1): checksum error at l
Kernel bug during RAID1 replace
Hello,

I am on Debian Jessie with a kernel from backports: 4.6.0-0.bpo.1-amd64
I am also using btrfs-tools 4.4.1-1.1~bpo8+1

When trying to replace a RAID1 drive (with btrfs replace start -f /dev/sda1 /dev/sdd1), the operation is cancelled after completing only 5%.

I got this error in the /var/log/syslog:

[ cut here ]
WARNING: CPU: 2 PID: 2617 at /build/linux-9LouV5/linux-4.6.1/fs/btrfs/dev-replace.c:430 btrfs_dev_replace_start+0x2be/0x400 [btrfs]
Modules linked in: uas(E) usb_storage(E) bnep(E) ftdi_sio(E) usbserial(E) snd_hda_codec_hdmi(E) nls_utf8(E) nls_cp437(E) vfat(E) fat(E) intel_rapl(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) iTCO_wdt(E) irqbypass(E) iTCO_vendor_support(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) hmac(E) drbg(E) ansi_cprng(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) wl(POE) btusb(E) btrtl(E) btbcm(E) btintel(E) cfg80211(E) bluetooth(E) efi_pstore(E) snd_hda_codec_realtek(E) evdev(E) crc16(E) serio_raw(E) pcspkr(E) efivars(E) joydev(E) snd_hda_codec_generic(E) rfkill(E) snd_hda_intel(E) nuvoton_cir(E) rc_core(E) snd_hda_codec(E) i915(E) battery(E) snd_hda_core(E) snd_hwdep(E) soc_button_array(E) tpm_tis(E) drm_kms_helper(E) intel_smartconnect(E) snd_pcm(E) tpm(E) video(E) i2c_i801(E) snd_timer(E) drm(E) snd(E) lpc_ich(E) i2c_algo_bit(E) soundcore(E) mfd_core(E) mei_me(E) processor(E) button(E) mei(E) shpchp(E) fuse(E) autofs4(E) hid_logitech_hidpp(E) btrfs(E) hid_logitech_dj(E) usbhid(E) hid(E) xor(E) raid6_pq(E) sg(E) sr_mod(E) cdrom(E) sd_mod(E) crc32c_intel(E) ahci(E) libahci(E) libata(E) psmouse(E) scsi_mod(E) xhci_pci(E) ehci_pci(E) xhci_hcd(E) ehci_hcd(E) e1000e(E) usbcore(E) ptp(E) pps_core(E) usb_common(E) fjes(E)
CPU: 2 PID: 2617 Comm: btrfs Tainted: P OE 4.6.0-0.bpo.1-amd64 #1 Debian 4.6.1-1~bpo8+1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z87E-ITX, BIOS P2.10 10/04/2013
0286 f0ba7fe7 813123c5 8107af94 880186caf000 fffb 8800c76b0800 8800cae7 8800cae70ee0 7ffdd5397d98
Call Trace:
[] ? dump_stack+0x5c/0x77
[] ? __warn+0xc4/0xe0
[] ? btrfs_dev_replace_start+0x2be/0x400 [btrfs]
[] ? btrfs_ioctl+0x1d42/0x2190 [btrfs]
[] ? handle_mm_fault+0x154d/0x1cb0
[] ? do_vfs_ioctl+0x99/0x5d0
[] ? SyS_ioctl+0x76/0x90
[] ? system_call_fast_compare_end+0xc/0x96
---[ end trace 9fbfaa137cc5a72a ]---

What should I do to replace my drive correctly ?

Thanks in advance,
Re: BTRFS with RAID1 cannot boot when removing drive
On Fri, 14 Feb 2014 15:33:10 +0100, Saint Germain saint...@gmail.com wrote : On 11 February 2014 03:30, Saint Germain saint...@gmail.com wrote: I am experimenting with BTRFS and RAID1 on my Debian Wheezy (with backported kernel 3.12-0.bpo.1-amd64) using a motherboard with UEFI. I have installed Debian with the following partition on the first hard drive (no BTRFS subsystem): /dev/sda1: for / (BTRFS) /dev/sda2: for /home (BTRFS) /dev/sda3: for swap Then I added another drive for a RAID1 configuration (with btrfs balance) and I installed grub on the second hard drive with grub-install /dev/sdb. You should be able to mount a two-device btrfs raid1 filesystem with only a single device with the degraded mount option, tho I believe current kernels refuse a read-write mount in that case, so you'll have read-only access until you btrfs device add a second device, so it can do normal raid1 mode once again. Meanwhile, I don't believe it's on the wiki, but it's worth noting my experience with btrfs raid1 mode in my pre-deployment tests. Actually, with the (I believe) mandatory read-only mount if raid1 is degraded below two devices, this problem's going to be harder to run into than it was in my testing several kernels ago, but here's what I found: But as I said, if btrfs only allows read-only mounts of filesystems without enough devices to properly complete the raidlevel, that shouldn't be as big an issue these days, since it should be more difficult or impossible to get the two devices separately mounted writable in the first place, with the consequence that the differing copies issue will be difficult or impossible to trigger in the first place. =:^) Hello, With your advice and Chris's, I have now a (clean ?) 
partition to start experimenting with RAID1 (and which boots correctly in UEFI mode):

sda1 = BIOS Boot partition
sda2 = EFI System Partition
sda3 = BTRFS partition
sda4 = swap partition

For the moment I haven't created subvolumes (for / and for /home for instance) to keep things simple. The idea is then to create a RAID1 with a sdb drive (duplicate sda partitioning, add/balance/convert sdb3 + grub-install on sdb, add sdb swap UUID in /etc/fstab), shutdown and remove sda to check the procedure to replace it. I read the last thread on the subject ('lost with degraded RAID1'), but would like to really confirm what the current approved procedure is and whether it will remain valid for future BTRFS versions (especially regarding the read-only mount). So what should I do from there ? Here are a few questions:

1) Boot in degraded mode: currently with my kernel (3.12-0.bpo.1-amd64, from Debian wheezy-backports) it seems that I can mount in read-write mode. However with future kernels, it seems that I will only be able to mount read-only ? See here:
http://www.spinics.net/lists/linux-btrfs/msg20164.html
https://bugzilla.kernel.org/show_bug.cgi?id=60594

2) If I am able to mount read-write, is this the correct procedure:
a) place a new drive in another physical location sdc (I don't think I can use the same sda physical location ?)
b) boot in degraded mode on sdb
c) use the 'replace' command to replace sda by sdc
d) perhaps a 'balance' is necessary ?

3) Can I also use the above procedure if I am only allowed to mount read-only ?

4) If I want to use my system without RAID1 support (dangerous I know), after booting in degraded mode with read-write, can I convert sdb back from RAID1 to RAID0 in a safe way ? (btrfs balance start -dconvert=raid0 -mconvert=raid0 /)

To continue with this RAID1 recovery procedure (Debian stable with kernel 3.12-0.bpo.1-amd64), I tried to reproduce Duncan's setup and the result is not good. 
Starting with a clean setup of 2 hard drives in RAID1 (sda and sdb) and a clean snapshot of the rootfs:

1) poweroff, disconnect sda and boot on sdb with rootflags=ro,degraded
2) sdb is mounted ro but automatically remounted read-write by the initramfs
3) create a file witness1 and modify a file test.txt with 'alpha' inside
4) poweroff, connect sda, disconnect sdb and boot on sda
5) create a file witness2 and modify a file test.txt with 'beta' inside
6) poweroff, connect sdb and boot on sda
7) the modifications from step 3 are there (but not from step 5)
8) launch scrub: a lot of errors are detected but no unrepairable errors
9) poweroff, disconnect sdb, boot on sda
10) the modifications from step 3 are there (but not from step 5)
11) poweroff, boot on sda: kernel panic on startup
12) reboot, boot is possible
13) launch scrub: a lot of errors and kernel error
14) reboot, error on boot, and same error as step 13 with scrub
15) boot on previous snapshot of step 1, same error on boot and same error as step 13 with scrub

I hope that it will be useful for someone. It seems that mounting read-write is really not a good idea (have to find how to force ro with Debian). The RAID1
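On the "force ro with Debian" point: one possible (untested, hedged) approach is to pin the root filesystem read-only from both the kernel command line and fstab, so the initramfs and init scripts have no reason to remount it read-write. Something along these lines, with the UUID deliberately elided:

```
# /etc/default/grub (sketch -- run update-grub afterwards):
GRUB_CMDLINE_LINUX_DEFAULT="rootflags=degraded ro"

# /etc/fstab (sketch -- the ro option here is what stops the rw remount):
# UUID=...  /  btrfs  ro,degraded  0  0
```

Whether the Debian initramfs honours this in all cases is an assumption worth verifying before relying on it in a degraded-boot test.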
Re: BTRFS with RAID1 cannot boot when removing drive
On 11 February 2014 03:30, Saint Germain saint...@gmail.com wrote: I am experimenting with BTRFS and RAID1 on my Debian Wheezy (with backported kernel 3.12-0.bpo.1-amd64) using a motherboard with UEFI. I have installed Debian with the following partition on the first hard drive (no BTRFS subsystem): /dev/sda1: for / (BTRFS) /dev/sda2: for /home (BTRFS) /dev/sda3: for swap Then I added another drive for a RAID1 configuration (with btrfs balance) and I installed grub on the second hard drive with grub-install /dev/sdb.

You should be able to mount a two-device btrfs raid1 filesystem with only a single device with the degraded mount option, tho I believe current kernels refuse a read-write mount in that case, so you'll have read-only access until you btrfs device add a second device, so it can do normal raid1 mode once again. Meanwhile, I don't believe it's on the wiki, but it's worth noting my experience with btrfs raid1 mode in my pre-deployment tests. Actually, with the (I believe) mandatory read-only mount if raid1 is degraded below two devices, this problem's going to be harder to run into than it was in my testing several kernels ago, but here's what I found: But as I said, if btrfs only allows read-only mounts of filesystems without enough devices to properly complete the raidlevel, that shouldn't be as big an issue these days, since it should be more difficult or impossible to get the two devices separately mounted writable in the first place, with the consequence that the differing copies issue will be difficult or impossible to trigger in the first place. =:^)

Hello,

With your advice and Chris's, I have now a (clean ?) partition to start experimenting with RAID1 (and which boots correctly in UEFI mode):

sda1 = BIOS Boot partition
sda2 = EFI System Partition
sda3 = BTRFS partition
sda4 = swap partition

For the moment I haven't created subvolumes (for / and for /home for instance) to keep things simple. 
The idea is then to create a RAID1 with a sdb drive (duplicate sda partitioning, add/balance/convert sdb3 + grub-install on sdb, add sdb swap UUID in /etc/fstab), shutdown and remove sda to check the procedure to replace it. I read the last thread on the subject ('lost with degraded RAID1'), but would like to really confirm what the current approved procedure is and whether it will remain valid for future BTRFS versions (especially regarding the read-only mount). So what should I do from there ? Here are a few questions:

1) Boot in degraded mode: currently with my kernel (3.12-0.bpo.1-amd64, from Debian wheezy-backports) it seems that I can mount in read-write mode. However with future kernels, it seems that I will only be able to mount read-only ? See here:
http://www.spinics.net/lists/linux-btrfs/msg20164.html
https://bugzilla.kernel.org/show_bug.cgi?id=60594

2) If I am able to mount read-write, is this the correct procedure:
a) place a new drive in another physical location sdc (I don't think I can use the same sda physical location ?)
b) boot in degraded mode on sdb
c) use the 'replace' command to replace sda by sdc
d) perhaps a 'balance' is necessary ?

3) Can I also use the above procedure if I am only allowed to mount read-only ?

4) If I want to use my system without RAID1 support (dangerous I know), after booting in degraded mode with read-write, can I convert sdb back from RAID1 to RAID0 in a safe way ? (btrfs balance start -dconvert=raid0 -mconvert=raid0 /)

5) Perhaps a recovery procedure which includes booting on a different rescue disk would be more appropriate ?

Thanks again,
Re: BTRFS partitioning scheme (was BTRFS with RAID1 cannot boot when removing drive)
On Thu, 13 Feb 2014 10:43:08 -0700, Chris Murphy li...@colorremedies.com wrote : sda3 = 1 TiB root partition (BTRFS), mounted on / sda4 = 6 GiB swap partition (that way I should be able to be compatible with both CSM or UEFI) B) normal Debian installation on sdas, activate the CSM on the motherboard and reboot. C) apt-get install grub-efi-amd64 and grub-install /dev/sda And the problems begin: 1) grub-install doesn't give any error but using the --debug I can see that it is not using EFI. 2) Ok I force with grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=grub --recheck --debug /dev/sda 3) This time something is generated in /boot/efi: /boot/efi/EFI/grub/grubx64.efi 4) Copy the file /boot/efi/EFI/grub/grubx64.efi to /boot/efi/EFI/boot/bootx64.efi is EFI/boot/ correct here? If you're lucky then your BIOS will tell what path it will try to read for the boot code. For me that is /EFI/debian/grubx64.efi. I followed the advices here (first result on Google with grub uefi debian): http://tanguy.ortolo.eu/blog/article51/debian-efi 5) Reboot and disable the CSM on the motherboard 6) No boot possible, I always go directly to the UEFI-BIOS I am currently stuck there. I read a lot of conflicting advises which doesn't work: - use modprobe efivars and efibootmgr: not possible because I have not booted in EFI (chicken-egg problem) Not exactly. Boot in EFI mode into your favourite installer rescue mode, then chroot into the target filesystem and run efibootmgr there. 
In the end I managed to do it like this:

1) Make a USB stick with a FAT32 partition
2) Install grub on it with: grub-install --target=x86_64-efi --efi-directory=/media/usb0 --removable
3) Note on paper the grub commands to start the kernel in /boot/grub/grub.cfg
4) Reboot, disable CSM in the motherboard boot utility (BIOS?), reboot with the USB stick connected
5) Normally it should have started on the USB stick grub command-line
6) Enter the necessary commands to start the kernel (if you have some problem with video mode, use insmod efi_gop)
7) Normally your operating system should start normally
8) Check that efibootmgr is installed and working (normally efivars should be loaded in the modules already)
9) grub-install --efi-directory=/boot/efi --recheck --debug (with the debug info you should see that it is using grub-efi and not grub-pc)
10) efibootmgr -c -d /dev/sda -p 2 -w -L Debian (GRUB) -l '\EFI\Debian\grubx64.efi' (replace -p 2 with your correct ESP partition number)
11) Reboot and enjoy !

OK at least with GRUB 2.00 I never have to use any options with grub-install when installing to a chrooted system. It also even writes the proper entry into NVRAM for me, I don't have to use efibootmgr.

Yes you are right, this is probably unnecessary (see below).

Also I've never had single \ work with efibootmgr from shell. I have to use \\. Try typing efibootmgr -v to see the actual entry you created and whether it has the \ in the path.

Here is the output:

BootCurrent: 0001
Timeout: 1 seconds
BootOrder: 0001,
Boot* debian HD(2,7d8,106430,5d012c09-b70d-4225-ae18-9831f997c493)File(\EFI\debian\grubx64.efi)
Boot0001* Debian (GRUB) HD(2,7d8,106430,5d012c09-b70d-4225-ae18-9831f997c493)File(\EFI\Debian\grubx64.efi)

Ah the joy of FAT32 and the case sensitivity ! So it seems that grub-install automatically installs the correct entry and using efibootmgr was unnecessary. However it seems that single '\' works. 
But one thing that explains why the UEFI bootloading stuff is confusing for you is that every distro keeps their own grub patches. So there is very possibly a lot of difference between the downstream grub behaviors, and upstream.

Understood. That is why I took the step to describe what I did. Perhaps it will be useful for others (most info on the topic was not for Debian...).

Thanks again for your insights !
Re: BTRFS partitioning scheme (was BTRFS with RAID1 cannot boot when removing drive)
On 13 February 2014 09:50, Frank Kingswood fr...@kingswood-consulting.co.uk wrote: On 12/02/14 17:13, Saint Germain wrote: Ok based on your advice, here is what I have done so far to use UEFI (remember that the objective is to have a clean and simple BTRFS RAID1 install). A) I start first with only one drive, I have gone with the following partition scheme (Debian wheezy, kernel 3.12, grub 2.00, GPT partition with parted): sda1 = 1MiB BIOS Boot partition (no FS, set 1 bios_grub on with parted to set the type) sda2 = 550 MiB EFI System Partition (FAT32, toggle 2 boot with parted to set the type), mounted on /boot/efi

I'm curious, why so big? There's only one file of about 100kb there, and I was considering shrinking mine to the minimum possible (which seems to be about 33 MB).

It is quite difficult to find reliable information on this whole UEFI boot with linux (info you can find for sure, but which ones to follow ? there is so much different info out there). So I don't know if this 550 MiB is an urban legend or not, but you can find several people recommending it and the reason why:
http://askubuntu.com/questions/336439/any-problems-with-this-partition-scheme
http://askubuntu.com/questions/287441/different-uses-of-term-efi-partition
https://bbs.archlinux.org/viewtopic.php?pid=1306753
http://forums.gentoo.org/viewtopic-p-7352214.html
Other people recommend around 200-300 MiB, so I basically took the upper limit to see what happens. If you have more reliable info on the topic I would be interested !

sda3 = 1 TiB root partition (BTRFS), mounted on / sda4 = 6 GiB swap partition (that way I should be able to be compatible with both CSM or UEFI) B) normal Debian installation on sdas, activate the CSM on the motherboard and reboot. C) apt-get install grub-efi-amd64 and grub-install /dev/sda And the problems begin: 1) grub-install doesn't give any error but using the --debug I can see that it is not using EFI. 
2) Ok, I force it with grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=grub --recheck --debug /dev/sda. 3) This time something is generated in /boot/efi: /boot/efi/EFI/grub/grubx64.efi. 4) Copy the file /boot/efi/EFI/grub/grubx64.efi to /boot/efi/EFI/boot/bootx64.efi.

Is EFI/boot/ correct here? If you're lucky then your BIOS will tell you what path it will try to read for the boot code. For me that is /EFI/debian/grubx64.efi.

I followed the advice here (first result on Google for "grub uefi debian"): http://tanguy.ortolo.eu/blog/article51/debian-efi

5) Reboot and disable the CSM on the motherboard. 6) No boot possible; I always go directly to the UEFI BIOS. I am currently stuck there. I read a lot of conflicting advice, none of which worked: - use modprobe efivars and efibootmgr: not possible because I have not booted in EFI (chicken-and-egg problem).

Not exactly. Boot in EFI mode into your favourite installer's rescue mode, then chroot into the target filesystem and run efibootmgr there.
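The layout and the forced EFI install described above can be condensed into a command transcript. This is a sketch, not a verbatim reproduction of the poster's session: /dev/sda and the partition boundaries are example values taken from the thread, and everything here is destructive, so it should only be run against a disk you intend to wipe.

```shell
# Sketch of the GPT layout from the thread (parted, GRUB 2, Debian wheezy).
# /dev/sda is an example device name -- double-check before running.
parted -s /dev/sda mklabel gpt
parted -s /dev/sda mkpart primary 1MiB 2MiB           # sda1: BIOS Boot partition
parted -s /dev/sda set 1 bios_grub on
parted -s /dev/sda mkpart primary fat32 2MiB 552MiB   # sda2: 550 MiB ESP
parted -s /dev/sda set 2 boot on
mkfs.vfat -F 32 /dev/sda2                             # ESP must be FAT
# sda3 (btrfs root) and sda4 (swap) follow the same mkpart pattern.

# Step 2 of the thread: force the EFI target explicitly, since a plain
# grub-install may silently fall back to the i386-pc (BIOS) target.
grub-install --target=x86_64-efi --efi-directory=/boot/efi \
             --bootloader-id=grub --recheck /dev/sda
```

Forcing --target=x86_64-efi is the key point: it makes grub-install fail loudly if the EFI prerequisites (mounted ESP, grub-efi-amd64 modules) are missing, instead of quietly installing a BIOS boot sector.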
In the end I managed to do it like this: 1) Make a USB stick with a FAT32 partition. 2) Install grub on it with: grub-install --target=x86_64-efi --efi-directory=/media/usb0 --removable 3) Note down on paper the grub commands needed to start the kernel, from /boot/grub/grub.cfg. 4) Reboot, disable the CSM in the motherboard setup utility, and reboot with the USB stick connected. 5) Normally it should have started on the USB stick's grub command line. 6) Enter the necessary commands to start the kernel (if you have some problem with the video mode, use insmod efi_gop). 7) Normally your operating system should start. 8) Check that efibootmgr is installed and working (normally efivars should already be loaded). 9) grub-install --efi-directory=/boot/efi --recheck --debug (with the debug info you should see that it is using grub-efi and not grub-pc). 10) efibootmgr -c -d /dev/sda -p 2 -w -L "Debian (GRUB)" -l '\EFI\Debian\grubx64.efi' (replace "-p 2" with your correct ESP partition number). 11) Reboot and enjoy!

I made a lot of mistakes during these steps. The good thing is that the errors are quite verbose, so you can easily see what is going wrong. I hope that it will be easier for the next Debian user. So now I can continue on this BTRFS RAID1 adventure... Let's see if my setup is resilient to a hard drive failure. Thanks for the help. Most comments here are quite on the spot and reliable, which is very helpful! -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS partitioning scheme (was BTRFS with RAID1 cannot boot when removing drive)
On 11 February 2014 19:15, Chris Murphy li...@colorremedies.com wrote:

To summarize, I think I have 3 options for partitioning (I am not considering UEFI secure boot or swap): 1) grub, BTRFS partition (i.e. full disk in BTRFS), /boot inside a BTRFS subvolume.

This doesn't seem like a good idea; a boot drive shouldn't be without partitions.

2) grub, GPT partitioning, with (A) on sda1 and a BTRFS partition on sda2, /boot inside a BTRFS subvolume. 3) grub, GPT partitioning, with (A) on sda1, /boot (ext4) on sda2, and BTRFS on sda3. (A) = BIOS Boot partition (1 MiB) or EFI System Partition (FAT32, 550 MiB). I don't really see the point of having UEFI/ESP if I don't use any other proprietary operating system, so I think I will go with (A) = BIOS Boot partition, unless there is something I have missed.

You need to boot your system in UEFI and CSM-BIOS modes, and compare the dmesg for each. I'm finding it common that the CSM limits power management, and relegates drives to IDE speeds rather than full SATA link speeds. Sometimes it's unavoidable to use the CSM if it has better overall behavior for your use case. I've found it to be lacking and have abandoned it. It's basically intended for booting Windows XP, right?

Ok, based on your advice, here is what I have done so far to use UEFI (remember that the objective is to have a clean and simple BTRFS RAID1 install). A) I start first with only one drive, with the following partition scheme (Debian wheezy, kernel 3.12, grub 2.00, GPT partition table made with parted): sda1 = 1 MiB BIOS Boot partition (no FS, "set 1 bios_grub on" in parted to set the type); sda2 = 550 MiB EFI System Partition (FAT32, "toggle 2 boot" in parted to set the type), mounted on /boot/efi; sda3 = 1 TiB root partition (BTRFS), mounted on /; sda4 = 6 GiB swap partition (that way I should be compatible with both CSM and UEFI). B) Normal Debian installation on sda, activate the CSM on the motherboard and reboot.
C) apt-get install grub-efi-amd64 and grub-install /dev/sda. And the problems begin: 1) grub-install doesn't give any error, but using --debug I can see that it is not using EFI. 2) Ok, I force it with grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=grub --recheck --debug /dev/sda. 3) This time something is generated in /boot/efi: /boot/efi/EFI/grub/grubx64.efi. 4) Copy the file /boot/efi/EFI/grub/grubx64.efi to /boot/efi/EFI/boot/bootx64.efi. 5) Reboot and disable the CSM on the motherboard. 6) No boot possible; I always go directly to the UEFI BIOS.

I am currently stuck there. I read a lot of conflicting advice, none of which worked: - use modprobe efivars and efibootmgr: not possible because I have not booted in EFI (chicken-and-egg problem); - use update-grub or grub-mkconfig (to generate /boot/efi/grub/grub.cfg): no results; - other exotic commands... So I will try to upgrade to grub 2.02beta (as recommended by Chris Murphy), but I am not sure that it will help. If someone has some Debian experience with this UEFI install, please don't hesitate to propose solutions! I will continue to document this experience (hoping that it will be useful for others), and hope to get to the point where I can have a good system in BTRFS RAID1 mode. You have to be very motivated to get into this; it is really a challenge! ;-)
Re: BTRFS with RAID1 cannot boot when removing drive
On 11 February 2014 21:35, Duncan 1i5t5.dun...@cox.net wrote: Saint Germain posted on Tue, 11 Feb 2014 11:04:57 +0100 as excerpted:

The big problem I currently have is that based on your input, I hesitate a lot on my partitioning scheme: should I use a dedicated /boot partition or should I have one global BTRFS partition? It is not very clear in the doc (a lot of people used a dedicated /boot because at that time grub couldn't natively boot BTRFS, it seems, but that has changed). Could you recommend a partitioning scheme for a simple RAID1 with 2 identical hard drives (just for home computing, not business)?

FWIW... I'm planning to, and have your previous message covering that still marked unread to reply to later. But real life has temporarily been monopolizing my time, so the last day or two I've only done relatively short and quick replies. That one will require a bit more time to answer to my satisfaction. So I'm punting for the moment. But FWIW I tend to be a reasonably heavy partitioner (tho nowhere near what I used to be), so a lot of folks will consider my setup somewhat extreme. That's OK. It's my computer, setup for my purposes, not their computer for theirs, and it works very well for me, so it's all good. =:^) But hopefully I'll get back to that with a longer reply by the end of the week. If I don't, you can probably consider that monopoly lasting longer than I thought, and it could be that I'll never get back to properly reply. But it's an interesting enough topic to me that I'll /probably/ get back, just not right ATM.

No problem, I have started another thread where we discuss partitioning. It may be slightly off-topic, but the intention is really to have a BTRFS-friendly partitioning scheme. For instance it seems that a dedicated /boot partition instead of a subvolume is not necessary (better to have /boot in the RAID1). However I'm no expert. Thanks for your help.
Re: BTRFS with RAID1 cannot boot when removing drive
On 11 February 2014 07:59, Duncan 1i5t5.dun...@cox.net wrote: Saint Germain posted on Tue, 11 Feb 2014 04:15:27 +0100 as excerpted:

Ok, I need to really understand how my motherboard works (new Z87E-ITX). It says "64Mb AMI UEFI Legal BIOS", so I thought it was really UEFI.

I expect it's truly UEFI. But from what I've read, most UEFI-based firmware (possibly all in theory, with the caveat that there's bugs and some might not actually work as intended due to bugs) on x86/amd64 (as opposed to arm) has a legacy-BIOS mode fallback. Provided it's not in secure-boot mode, if the storage devices it is presented don't have a valid UEFI config, it'll fall back to legacy-BIOS mode and try to detect and boot that. Which may or may not be what your system is actually doing. As I said, since I've not actually experimented with UEFI here, my practical knowledge on it is virtually nil, and I don't claim to have studied the theory well enough to deduce in that level of detail what your system is doing. But I know that's how it's /supposed/ to be able to work. =:^)

Hello Duncan, yes, I also suspect something like that. Unfortunately it is not really clear on their website how it works. You can find a lot of marketing stuff, but not what is really needed to boot properly!

(FWIW, what I /have/ done, deliberately, is read enough about UEFI to have a general feel for it, and to have been previously exposed to the ideas for some time, so that once I /do/ have it available and decide it's time, I'll be able to come up to speed relatively quickly, as I've had the general ideas turning over in my head for quite some time already; in effect I'll simply be reviewing the theory and doing the lab work, while concurrently making logical connections about how it all fits together that only happen once one actually does that lab work.
I've discovered over the years that this is perhaps my most effective way to learn: read about the general principles while not really understanding it the first time thru, then come back to it some months or years later, and I pick it up real fast, because my subconscious has been working on the problem the whole time! Come to think of it, that's actually how I handled btrfs, too, trying it at one point and deciding it didn't fit my needs at the time, leaving it for awhile, then coming back to it later when my needs had changed, but I already had an idea what I was doing from the previous try, with the result being I really took to it fast, the second time! =:^)

I'll try to keep that in mind!

I understand. Normally the swap will only be used for hibernating. I don't expect to use it except perhaps in some extreme case.

If hibernate is your main swap usage, you might consider the noauto fstab option as well, then specifically swapon the appropriate one in your hibernate script, since you may well need logic in there to figure out which one to use in any case. I was doing that for awhile. (I've run my own suspend/hibernate scripts based on the documentation in $KERNDIR/Documentation/power/*, for years. The kernel's docs dir really is a great resource for a lot of sysadmin-level stuff as well as the expected kernel-developer stuff. I think few are aware of just how much real useful admin-level information it actually contains. =:^)

I am not so used to working without swap. I've worked for years with a computer with low RAM and a swap, and I didn't have any problem (even when doing some RAM-intensive tasks). So I haven't tried to remove it ;-) If there is sufficient RAM, I suppose that the swap doesn't get used, so it is not a problem to leave it? I hesitated a long time between ZFS and BTRFS, and one of the cases for ZFS was that it can natively handle swap on a volume (and so in theory swap can be part of the RAID1 as well).
However, the folks at ZFS also seem to consider swap a relic of the past. I guess I will keep it just in case. ;-)

The big problem I currently have is that based on your input, I hesitate a lot on my partitioning scheme: should I use a dedicated /boot partition or should I have one global BTRFS partition? It is not very clear in the doc (a lot of people used a dedicated /boot because at that time grub couldn't natively boot BTRFS, it seems, but that has changed). Could you recommend a partitioning scheme for a simple RAID1 with 2 identical hard drives (just for home computing, not business)? Many thanks!
BTRFS partitioning scheme (was BTRFS with RAID1 cannot boot when removing drive)
Hello and thanks for your feedback! Cc back to the mailing list, as it may be of interest here as well. On 11 February 2014 16:11, Kyle Gates kylega...@hotmail.com wrote:

The big problem I currently have is that based on your input, I hesitate a lot on my partitioning scheme: should I use a dedicated /boot partition or should I have one global BTRFS partition? It is not very clear in the doc (a lot of people used a dedicated /boot because at that time grub couldn't natively boot BTRFS, it seems, but that has changed). Could you recommend a partitioning scheme for a simple RAID1 with 2 identical hard drives (just for home computing, not business)?

I run a 1 GiB RAID1 btrfs /boot in mixed mode with grub2 and GPT partitions. IIRC grub2 doesn't understand lzo compression or subvolumes.

Well, I did try to read about this and ended up being confused, because development is so fast that documentation quickly becomes outdated. It seems that grub can boot from a BTRFS /boot subvolume: https://bbs.archlinux.org/viewtopic.php?pid=1222358 However, Chris Murphy had some problems a few months ago: http://comments.gmane.org/gmane.comp.file-systems.btrfs/29140 So I still don't know if it is a good idea or not to have a BTRFS /boot? Of course the idea is that I would like to snapshot /boot and have it on RAID1.

To summarize, I think I have 3 options for partitioning (I am not considering UEFI secure boot or swap): 1) grub, BTRFS partition (i.e. full disk in BTRFS), /boot inside a BTRFS subvolume. 2) grub, GPT partitioning, with (A) on sda1 and a BTRFS partition on sda2, /boot inside a BTRFS subvolume. 3) grub, GPT partitioning, with (A) on sda1, /boot (ext4) on sda2, and BTRFS on sda3. (A) = BIOS Boot partition (1 MiB) or EFI System Partition (FAT32, 550 MiB). I don't really see the point of having UEFI/ESP if I don't use any other proprietary operating system, so I think I will go with (A) = BIOS Boot partition, unless there is something I have missed.
Can someone recommend which one would be the most stable and easiest to manage? Thanks in advance,
Re: BTRFS with RAID1 cannot boot when removing drive
On 11 February 2014 18:21, Chris Murphy li...@colorremedies.com wrote: On Feb 10, 2014, at 8:15 PM, Saint Germain saint...@gmail.com wrote:

Ok, I need to really understand how my motherboard works (new Z87E-ITX). It says "64Mb AMI UEFI Legal BIOS", so I thought it was really UEFI.

Manufacturers have done us a disservice by equating UEFI and BIOS. Some UEFI also have a compatibility support module (CSM) which presents a BIOS to the operating system. It's intended for legacy operating systems that won't ever directly support UEFI. A way to tell in Linux whether you're booting with or without the CSM is to issue the efibootmgr command. If it returns something that looks like an error message, the CSM is enabled and the OS thinks it's running on a BIOS computer. If it returns a numbered list, then the CSM isn't enabled and the OS thinks it's running on a UEFI computer.

Nice trick! Thanks.

/dev/sdb has the same partitioning as /dev/sda.

grub-install <device> shouldn't work on UEFI, because the only place grub-install installs to is the volume mounted at /boot/efi. And grub-install /dev/sdb implies installing grub to a disk boot sector, which also isn't applicable to UEFI.

I am still not up to date on UEFI partitioning and so on. But I have read these pages: http://tanguy.ortolo.eu/blog/article51/debian-efi http://forums.debian.net/viewtopic.php?f=16t=81120 And apparently grub-install <device> is the correct command (but you have to copy files in addition). It is maybe because they use a special package, grub-efi-amd64, which replaces grub-install. It is quite difficult to find reliable info on the topic...

I understand. Normally the swap will only be used for hibernating. I don't expect to use it except perhaps in some extreme case.

I consider hibernate fundamentally broken right now, because whether it'll work depends on too many things. It works for some people and not others, and for those for whom it does work, it largely didn't work out of the box.
It never worked for me and did induce Btrfs corruptions, so I've just given up on hibernate entirely. There's a long old Fedora thread that discusses myriad issues about it: https://bugzilla.redhat.com/show_bug.cgi?id=781749 and shows that even if it's working, it can stop working without warning after X number of hibernate-resume cycles.

I am among the fortunate who have a working hibernate out of the box (Debian stable) and it works reliably (even if ultimately it WILL fail after 20-30 iterations). So I use the feature to save on electricity cost ;-) But yes, maybe I will get rid of the swap... Thanks!
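Chris's efibootmgr trick above can also be done without efibootmgr at all: the kernel only creates /sys/firmware/efi when it was booted via EFI. A minimal sketch (the probed path is a parameter so the function can be exercised against any directory; the function name is mine, not from the thread):

```shell
# boot_mode: print "UEFI" if the efivars directory exists, else "BIOS/CSM".
# With no argument it probes the real /sys/firmware/efi of the running
# system; passing a path lets you test the logic against any directory.
boot_mode() {
    if [ -d "${1:-/sys/firmware/efi}" ]; then
        echo "UEFI"
    else
        echo "BIOS/CSM"
    fi
}

boot_mode   # prints the mode of the running system
```

This answers the same question as the efibootmgr test (is the OS running under EFI, or behind the CSM's emulated BIOS?) but needs no extra package installed, which is handy inside a rescue shell.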
Re: BTRFS with RAID1 cannot boot when removing drive
Hello Duncan, What an amazing, extensive answer you gave me! Thank you so much for it. See my comments below. On Mon, 10 Feb 2014 03:34:49 +0000 (UTC), Duncan 1i5t5.dun...@cox.net wrote:

I am experimenting with BTRFS and RAID1 on my Debian Wheezy (with backported kernel 3.12-0.bpo.1-amd64) using a motherboard with UEFI.

My systems don't do UEFI, but I do run GPT partitions and use grub2 for booting, with grub2-core installed to a BIOS/reserved type partition (instead of as an EFI service as it would be with UEFI). And I have a root filesystem in btrfs two-device raid1 mode working fine here, tested bootable with only one device of the two available. So while I can't help you directly with UEFI, I know the rest of it can/does work. One more thing: I do have a (small) separate btrfs /boot, actually two of them, as I setup a separate /boot on each of the two devices in order to have a backup /boot, since grub can only point to one /boot by default, and while pointing to another in grub's rescue mode is possible, I didn't want to have to deal with that if the first /boot was corrupted, as it's easier to simply point the BIOS at a different drive entirely and load its (independently installed and configured) grub and /boot.

Can you explain why you chose to have a dedicated /boot partition? I also read on this thread that it may be better to have a dedicated /boot partition: https://bbs.archlinux.org/viewtopic.php?pid=1342893#p1342893

However, I haven't managed to make the system boot when removing the first hard drive. I have installed Debian with the following partitions on the first hard drive (no BTRFS subvolumes): /dev/sda1: for / (BTRFS); /dev/sda2: for /home (BTRFS); /dev/sda3: for swap. Then I added another drive for a RAID1 configuration (with btrfs balance) and I installed grub on the second hard drive with grub-install /dev/sdb.
Just for clarification, as you don't mention it specifically, altho your btrfs filesystem show information suggests you did it this way: are your partition layouts identical on both drives? That's what I've done here, and I definitely find that easiest to manage and even just to think about, tho it's definitely not a requirement. But using different partition layouts does significantly increase management complexity, so it's useful to avoid if possible. =:^)

Yes, the partition layout is exactly the same on both drives (copied with sfdisk). I also try to keep things simple ;-)

If I boot on sdb, it takes sda1 as the root filesystem. If I switch the cables, it always takes the first hard drive as the root filesystem (now sdb).

That's normal /appearance/, but that /appearance/ doesn't fully reflect reality. The problem is that mount output (and /proc/self/mounts), fstab, etc, were designed with single-device filesystems in mind, and multi-device btrfs has to be made to fit the existing rules as best it can. So what's actually happening is that for a btrfs composed of multiple devices, since there's only one device slot for the kernel to list devices, it only displays the first one it happens to come across, even tho the filesystem will normally (unless degraded) require that all component devices be available and logically assembled into the filesystem before it can be mounted. When you boot on sdb, naturally the sdb component of the multi-device filesystem is the first one the kernel finds, so it's the one listed, even tho the filesystem is actually composed of more devices, not just that one.

I am not following you: it seems to be the opposite of what you describe. If I boot on sdb, I expect sdb1 and sdb2 to be the first components that the kernel finds. However, I can see that sda1 and sda2 are used (using the 'mount' command).
When you switch the cables, the first one is, at least on your system, always the first device component of the filesystem detected, so it's always the one occupying the single device slot available for display, even tho the filesystem has actually assembled all devices into the complete filesystem before mounting.

Normally the two hard drives should be exactly the same (or I didn't understand something) except for the UUID_SUB. That's why I don't understand: if I switch the cables, I should get exactly the same results with 'mount'. But that is not the case; the 'mount' command always points to the same partitions: - without cable switch: sda1 and sda2 - with cable switch: sdb1 and sdb2 Everything happens as if the system is using the UUID_SUB to pick its 'favorite' partition.

If I disconnect /dev/sda, the system doesn't boot, with a message saying that it hasn't found the UUID: Scanning for BTRFS filesystems... mount: mounting /dev/disk/by-uuid/c64fca2a-5700-4cca-abac-3a61f2f7486c on /root failed: Invalid argument

Can you tell me what I have done incorrectly? Is it because of UEFI? If yes, I haven't understood how I can correct it in a simple way.
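For reference, the "added another drive for a RAID1 configuration (with btrfs balance)" step described in this thread corresponds to something like the following (device names and the mountpoint are the thread's example values; this needs root and a mounted btrfs filesystem, so treat it as a sketch rather than a drop-in script):

```shell
# Add the second device to the existing single-device btrfs mounted at /,
# then rebalance, converting both data and metadata to the raid1 profile
# so every extent ends up mirrored on both devices:
btrfs device add /dev/sdb1 /
btrfs balance start -dconvert=raid1 -mconvert=raid1 /

# Afterwards, both devices should appear under the same filesystem UUID:
btrfs filesystem show /
```

The -dconvert/-mconvert filters are what actually change the replication profile; without them, a plain balance just rewrites extents under the existing (single) profile and nothing is mirrored.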
Re: BTRFS with RAID1 cannot boot when removing drive
Hello! On Mon, 10 Feb 2014 19:18:22 -0700, Chris Murphy li...@colorremedies.com wrote: On Feb 9, 2014, at 2:40 PM, Saint Germain saint...@gmail.com wrote:

Then I added another drive for a RAID1 configuration (with btrfs balance) and I installed grub on the second hard drive with grub-install /dev/sdb.

That can't work on UEFI. UEFI firmware effectively requires a GPT partition map and something to serve as an EFI System partition on all bootable drives. Second, there's a difference between UEFI with and without secure boot. With secure boot, you need to copy the files your distro installer puts on the target drive's EFI System partition to each additional drive's ESP if you want multibooting to work in case of disk failure. The grub on each ESP likely looks only on its own ESP for a grub.cfg. So that then means having to sync grub.cfg's among each disk used for booting. A way around this is to create a single grub.cfg that merely forwards to the true grub.cfg, and you can copy this forward-only grub.cfg to each ESP. That way the ESPs never need updating or syncing again. Without secure boot, you must umount /boot/efi and mount the ESP for each bootable disk in turn, and then merely run: grub-install. That will cause a core.img to be created for that particular ESP, and it will point to the usual grub.cfg location at /boot/grub.

Ok, I need to really understand how my motherboard works (new Z87E-ITX). It says "64Mb AMI UEFI Legal BIOS", so I thought it was really UEFI.

If I boot on sdb, it takes sda1 as the root filesystem. If I switch the cables, it always takes the first hard drive as the root filesystem (now sdb). If I disconnect /dev/sda, the system doesn't boot, with a message saying that it hasn't found the UUID: Scanning for BTRFS filesystems...
mount: mounting /dev/disk/by-uuid/c64fca2a-5700-4cca-abac-3a61f2f7486c on /root failed: Invalid argument

Well, if /dev/sda is missing and you have an unpartitioned /dev/sdb, I don't even know how you're getting this far, and it seems like the UEFI computer might actually be booting in CSM-BIOS mode, which presents a conventional BIOS to the operating system. Distinguishing such things gets messy quickly.

/dev/sdb has the same partitioning as /dev/sda. Duncan gave me the hint with degraded mode and I managed to boot (however, I had some problem with mounting sda2).

Can you tell me what I have done incorrectly? Is it because of UEFI? If yes, I haven't understood how I can correct it in a simple way. As an extra question, I also don't see how I can configure the system to get the correct swap in case of disk failure. Should I force both swap partitions to have the same UUID?

If you're really expecting to create a system that can accept a disk failure and continue to work, I don't see how it can depend on swap partitions. It's fine to create them, but just realize that if they're actually being used and the underlying physical device dies, the kernel isn't going to like it. A possible workaround is using an md raid1 partition as swap.

I understand. Normally the swap will only be used for hibernating. I don't expect to use it except perhaps in some extreme case. Thanks for your help!
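The "Invalid argument" failure quoted above is what a multi-device btrfs typically produces when mounted with a member device missing. The "degraded mode" hint mentioned in this message boils down to the following sketch (device names and the /root mountpoint are the thread's example values; run from an initramfs or rescue shell):

```shell
# Mount the surviving half of the btrfs raid1 explicitly in degraded mode;
# without this option the kernel refuses to assemble an incomplete fs:
mount -o degraded /dev/sdb1 /root

# To boot straight into the degraded filesystem instead, edit the grub
# menu entry's "linux" line and pass the option via rootflags:
#   linux /boot/vmlinuz-... root=UUID=c64fca2a-... ro rootflags=degraded
```

Note that degraded mode is meant as a temporary state for replacing the failed device (btrfs device add / btrfs device delete missing), not as a way to keep running on one disk indefinitely.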
BTRFS with RAID1 cannot boot when removing drive
Hello, I am experimenting with BTRFS and RAID1 on my Debian Wheezy (with backported kernel 3.12-0.bpo.1-amd64) using a motherboard with UEFI. However, I haven't managed to make the system boot when removing the first hard drive.

I have installed Debian with the following partitions on the first hard drive (no BTRFS subvolumes): /dev/sda1: for / (BTRFS); /dev/sda2: for /home (BTRFS); /dev/sda3: for swap. Then I added another drive for a RAID1 configuration (with btrfs balance) and I installed grub on the second hard drive with grub-install /dev/sdb.

If I boot on sdb, it takes sda1 as the root filesystem. If I switch the cables, it always takes the first hard drive as the root filesystem (now sdb). If I disconnect /dev/sda, the system doesn't boot, with a message saying that it hasn't found the UUID: Scanning for BTRFS filesystems... mount: mounting /dev/disk/by-uuid/c64fca2a-5700-4cca-abac-3a61f2f7486c on /root failed: Invalid argument

Can you tell me what I have done incorrectly? Is it because of UEFI? If yes, I haven't understood how I can correct it in a simple way. As an extra question, I also don't see how I can configure the system to get the correct swap in case of disk failure. Should I force both swap partitions to have the same UUID? Many thanks in advance!
Here are some outputs for info:

btrfs filesystem show
Label: none  uuid: 743d6b3b-71a7-4869-a0af-83549555284b
        Total devices 2 FS bytes used 27.96MB
        devid 1 size 897.98GB used 3.03GB path /dev/sda2
        devid 2 size 897.98GB used 3.03GB path /dev/sdb2
Label: none  uuid: c64fca2a-5700-4cca-abac-3a61f2f7486c
        Total devices 2 FS bytes used 3.85GB
        devid 1 size 27.94GB used 7.03GB path /dev/sda1
        devid 2 size 27.94GB used 7.03GB path /dev/sdb1

blkid
/dev/sda1: UUID=c64fca2a-5700-4cca-abac-3a61f2f7486c UUID_SUB=77ffad34-681c-4c43-9143-9b73da7d1ae3 TYPE=btrfs
/dev/sda3: UUID=469715b2-2fa3-4462-b6f5-62c04a60a4a2 TYPE=swap
/dev/sda2: UUID=743d6b3b-71a7-4869-a0af-83549555284b UUID_SUB=744510f5-5bd5-4df4-b8c4-0fc1a853199a TYPE=btrfs
/dev/sdb1: UUID=c64fca2a-5700-4cca-abac-3a61f2f7486c UUID_SUB=2615fd98-f2ad-4e7b-84bc-0ee7f9770ca0 TYPE=btrfs
/dev/sdb2: UUID=743d6b3b-71a7-4869-a0af-83549555284b UUID_SUB=8783a7b1-57ef-4bcc-ae7f-be20761e9a19 TYPE=btrfs
/dev/sdb3: UUID=56fbbe2f-7048-488f-b263-ab2eb000d1e1 TYPE=swap

cat /etc/fstab
# file system                              mount point  type   options   dump  pass
UUID=c64fca2a-5700-4cca-abac-3a61f2f7486c  /            btrfs  defaults  0     1
UUID=743d6b3b-71a7-4869-a0af-83549555284b  /home        btrfs  defaults  0     2
UUID=469715b2-2fa3-4462-b6f5-62c04a60a4a2  none         swap   sw        0     0

cat /boot/grub/grub.cfg
#
# DO NOT EDIT THIS FILE
#
# It is automatically generated by grub-mkconfig using templates
# from /etc/grub.d and settings from /etc/default/grub
#

### BEGIN /etc/grub.d/00_header ###
if [ -s $prefix/grubenv ]; then
  load_env
fi
set default=0
if [ ${prev_saved_entry} ]; then
  set saved_entry=${prev_saved_entry}
  save_env saved_entry
  set prev_saved_entry=
  save_env prev_saved_entry
  set boot_once=true
fi
function savedefault {
  if [ -z ${boot_once} ]; then
    saved_entry=${chosen}
    save_env saved_entry
  fi
}
function load_video {
  insmod vbe
  insmod vga
  insmod video_bochs
  insmod video_cirrus
}
insmod part_msdos
insmod btrfs
set root='(hd1,msdos1)'
search --no-floppy --fs-uuid --set=root c64fca2a-5700-4cca-abac-3a61f2f7486c
if
loadfont /usr/share/grub/unicode.pf2 ; then
  set gfxmode=640x480
  load_video
  insmod gfxterm
  insmod part_msdos
  insmod btrfs
  set root='(hd1,msdos1)'
  search --no-floppy --fs-uuid --set=root c64fca2a-5700-4cca-abac-3a61f2f7486c
  set locale_dir=($root)/boot/grub/locale
  set lang=fr_FR
  insmod gettext
fi
terminal_output gfxterm
set timeout=5
### END /etc/grub.d/00_header ###

### BEGIN /etc/grub.d/05_debian_theme ###
insmod part_msdos
insmod btrfs
set root='(hd1,msdos1)'
search --no-floppy --fs-uuid --set=root c64fca2a-5700-4cca-abac-3a61f2f7486c
insmod png
if background_image /usr/share/images/desktop-base/joy-grub.png; then
  set color_normal=white/black
  set color_highlight=black/white
else
  set menu_color_normal=cyan/blue
  set menu_color_highlight=white/blue
fi
### END /etc/grub.d/05_debian_theme ###

### BEGIN /etc/grub.d/10_linux ###
menuentry 'Debian GNU/Linux, with Linux 3.12-0.bpo.1-amd64' --class debian --class gnu-linux --class gnu --class os {
  load_video
  insmod gzio
  insmod part_msdos
  insmod btrfs
  set root='(hd1,msdos1)'
  search --no-floppy --fs-uuid --set=root c64fca2a-5700-4cca-abac-3a61f2f7486c
  echo 'Chargement de Linux 3.12-0.bpo.1-amd64 ...'
  linux /boot/vmlinuz-3.12-0.bpo.1-amd64 root=UUID=c64fca2a-5700-4cca-abac-3a61f2f7486c ro quiet
  echo 'Chargement du disque mémoire initial ...'
  initrd