ENOSPC on file deletion with 3.1.6

2012-01-03 Thread Arie Peterson
Hi,


After upgrading my kernel from 2.6.38 (which has worked fine for months) to 
3.1.6, I got ENOSPC on recompiling gcc (even though df says there is 16G free 
of 50G; this is a raid1 setup, so in fact it's 8 of 25).

After this error, I tried to remove the compilation directory (with rm -r): 
this also gives ENOSPC. I am trying to work around this by first truncating 
files using echo > $file, but this fails for some files, again with ENOSPC.
Also, removal of files is very slow even if it succeeds.
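
The truncate-before-delete workaround described above can be sketched in Python. This is only an illustration: truncate_then_unlink is a hypothetical helper name, not part of any btrfs tool, and on a copy-on-write file system even a truncate may need to allocate metadata, so it can fail with ENOSPC just as reported here.

```python
import errno
import os


def truncate_then_unlink(path):
    """Try to free a file's data before removing its directory entry.

    Returns None on success, or "ENOSPC" if either step ran out of space.
    Any other OS error is re-raised unchanged.
    """
    try:
        os.truncate(path, 0)   # drop the file's contents first
        os.unlink(path)        # then remove the name
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return "ENOSPC"
        raise
    return None
```

A caller could collect the paths that return "ENOSPC" and retry them later, for instance after a remount or reboot.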

Moreover, any write operation on the file system now fails with ENOSPC.

Reverting to my old kernel does not help: it now shows the same problem.


Is this a known issue? Is there a way to make this file system unstuck? (I have 
backups, but I'd like to preserve snapshot information if possible.) Should I 
try upgrading to an even newer kernel?


Kind regards,

Arie

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: ENOSPC on file deletion with 3.1.6

2012-01-03 Thread Sander
Arie Peterson wrote (ao):
> After upgrading my kernel from 2.6.38 (which has worked fine for months) to
> 3.1.6, I got ENOSPC on recompiling gcc (even though df says there is 16G free
> of 50G; this is a raid1 setup, so in fact it's 8 of 25).
>
> After this error, I tried to remove the compilation directory (with rm -r):
> this also gives ENOSPC. I am trying to work around this by first truncating
> files using echo > $file, but this fails for some files, again with ENOSPC.
> Also, removal of files is very slow even if it succeeds.
>
> Moreover, any write operation on the file system now fails with ENOSPC.
>
> Reverting to my old kernel does not help: it now shows the same problem.
>
> Is this a known issue? Is there a way to make this file system unstuck? (I
> have backups, but I'd like to preserve snapshot information if possible.)
> Should I try upgrading to an even newer kernel?

Maybe your snapshots take up space. Can you show 'btrfs filesystem df /' ?

FWIW, I also had a disk full just a few days ago. Removed all snapshots
and some big files, but to no avail. Likely the background cleanup took
too much time. A reboot fixed this.

Sander

-- 
Humilis IT Services and Solutions
http://www.humilis.net


Re: ENOSPC on file deletion with 3.1.6

2012-01-03 Thread Arie Peterson
On Tuesday 03 January 2012 15:06:43 Sander wrote:

> Maybe your snapshots take up space. Can you show 'btrfs filesystem df /' ?

Data, RAID1: total=22.72GB, used=14.73GB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=2.25GB, used=1.88GB
Metadata: total=8.00MB, used=0.00
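
To read the output above as "how full is each allocated chunk type", a small Python sketch can parse it. The function names are hypothetical, and treating the printed units as powers of 1024 is an assumption; the sketch only computes used/total per line and does not diagnose anything.

```python
import re

# Assumption: the printed units are binary multiples.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

LINE = re.compile(
    r"^(?P<kind>[^:]+):\s*total=(?P<total>[\d.]+)(?P<tu>[KMGT]?B)?,"
    r"\s*used=(?P<used>[\d.]+)(?P<uu>[KMGT]?B)?\s*$"
)


def to_bytes(number, unit):
    """Convert a printed number and optional unit suffix to bytes."""
    return float(number) * UNITS[unit or "B"]


def chunk_usage(df_output):
    """Return [(kind, used_fraction), ...] for `btrfs filesystem df` text."""
    result = []
    for line in df_output.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a total=/used= line
        total = to_bytes(m["total"], m["tu"])
        used = to_bytes(m["used"], m["uu"])
        result.append((m["kind"], used / total if total else 0.0))
    return result
```

Applied to the figures above, this gives roughly 65% of the allocated data chunks and roughly 84% of the allocated metadata chunks in use.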

> FWIW, I also had a disk full just a few days ago. Removed all snapshots
> and some big files, but to no avail. Likely the background cleanup took
> too much time. A reboot fixed this.

OK, I'll keep this in mind. I'm a bit anxious to reboot, because I'm afraid 
booting will fail if the root file system cannot be written to.

> Sander

Thanks,

Arie



Re: ENOSPC on file deletion with 3.1.6

2012-01-03 Thread Sander
Arie Peterson wrote (ao):
> On Tuesday 03 January 2012 15:06:43 Sander wrote:
> > Maybe your snapshots take up space. Can you show 'btrfs filesystem df /' ?
>
> Data, RAID1: total=22.72GB, used=14.73GB
> Data: total=8.00MB, used=0.00
> System, RAID1: total=8.00MB, used=12.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=2.25GB, used=1.88GB
> Metadata: total=8.00MB, used=0.00

Hm, not full.

> > FWIW, I also had a disk full just a few days ago. Removed all snapshots
> > and some big files, but to no avail. Likely the background cleanup took
> > too much time. A reboot fixed this.
>
> OK, I'll keep this in mind. I'm a bit anxious to reboot, because I'm afraid
> booting will fail if the root file system cannot be written to.

But you did already reboot as you said the old kernel exposed the same
behavior?

Sander

-- 
Humilis IT Services and Solutions
http://www.humilis.net


Re: ENOSPC on file deletion with 3.1.6

2012-01-03 Thread cwillu
On Tue, Jan 3, 2012 at 8:12 AM, Arie Peterson ar...@xs4all.nl wrote:
> On Tuesday 03 January 2012 15:06:43 Sander wrote:
>
>> Maybe your snapshots take up space. Can you show 'btrfs filesystem df /' ?
>
> Data, RAID1: total=22.72GB, used=14.73GB
> Data: total=8.00MB, used=0.00
> System, RAID1: total=8.00MB, used=12.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=2.25GB, used=1.88GB
> Metadata: total=8.00MB, used=0.00
>
>> FWIW, I also had a disk full just a few days ago. Removed all snapshots
>> and some big files, but to no avail. Likely the background cleanup took
>> too much time. A reboot fixed this.
>
> OK, I'll keep this in mind. I'm a bit anxious to reboot, because I'm afraid
> booting will fail if the root file system cannot be written to.

I'd probably run a btrfs fi balance /, it should be able to recover
the space.  I'd typically be a little anxious if it was a large
filesystem as it's not interruptible except via power-button (in
principle it shouldn't matter, but...), but given that your filesystem
is quite small, it shouldn't take more than an hour or so.


Btrfs partition lost after RAID1 mirror disk failure?

2012-01-03 Thread Dan Garton
Hi,

I'm running Ubuntu with kernel 2.6.38 on a fileserver system.
One of the disks in a RAID1 configuration failed (/dev/sdc), and since then
I haven't been able to access the btrfs filesystem on the remaining disk
(/dev/sdb).

root@midnite:~/src/btrfs-progs-unstable# ./btrfsck  /dev/sdb
No valid Btrfs found on /dev/sdb

root@midnite:~/src/btrfs-progs-unstable# ./btrfsck  -s 1 /dev/sdb
using SB copy 1, bytenr 67108864
No valid Btrfs found on /dev/sdb

root@midnite:~/src/btrfs-progs-unstable# ./btrfsck  -s 2 /dev/sdb
using SB copy 2, bytenr 274877906944
No valid Btrfs found on /dev/sdb
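
The bytenr values printed by -s 1 and -s 2 above follow a fixed pattern, which the Python sketch below reproduces. The constants (primary superblock at 64 KiB, magic string at byte 64 inside the superblock) are assumptions about the on-disk layout that happen to match the output shown and should be checked against the btrfs-progs source. The function names are hypothetical, and the check only looks for the magic string; it validates nothing else.

```python
BTRFS_SUPER_INFO_OFFSET = 64 * 1024   # assumed: primary superblock at 64 KiB
BTRFS_MAGIC = b"_BHRfS_M"             # assumed: magic string in the superblock
MAGIC_OFFSET_IN_SB = 64               # assumed: after csum, fsid, bytenr, flags


def sb_offset(copy):
    """Byte offset of superblock copy 0, 1 or 2."""
    if copy == 0:
        return BTRFS_SUPER_INFO_OFFSET
    return (16 * 1024) << (12 * copy)


def has_btrfs_magic(image, copy=0):
    """True if the magic string is present in the given superblock copy.

    `image` is any seekable binary file object, e.g. open(path, "rb").
    """
    image.seek(sb_offset(copy) + MAGIC_OFFSET_IN_SB)
    return image.read(len(BTRFS_MAGIC)) == BTRFS_MAGIC
```

sb_offset(1) and sb_offset(2) evaluate to 67108864 and 274877906944, the same offsets btrfsck reports above. Opening the device read-only keeps such a check non-destructive.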

(This was using a btrfsck compiled from the only git repo I could find
which was responding:
http://git.darksatanic.net/repo/btrfs-progs-unstable.git  version
v0.19-102-g2482539)

I include below a list of commands which were executed around the time of
the disk failure, attempting to mount the single remaining device (which is
on /dev/sdb, and the failed disk was on /dev/sdc). I'm pretty sure I didn't
destroy anything in the process, but who knows - hence why I include the
list.

Any help appreciated in recovering the partition on /dev/sdb and accessing
the data.

Thanks,
Dan G




  527  btrfs device scan
  610  btrfs device scan
  611  btrfs device show
  612  btrfs fi df
  613  btrfs fi df -h
  614  btrfs fi df nuvat
  615  btrfs fi show
  648  btrfs device scan
  649  btrfs
  650  btrfs  fi df
  651  btrfs  fi df nuvat
  652  btrfs fi show
  653  btrfs  fi df /nuvat/
  654  btrfs fi show
  665  vi /usr/share/initramfs-tools/modules.d/btrfs
 1136  btrfs
 1137  btrfs device scan
 1139  btrfs device scan
 1140  man btrfs
 1141  btrfs device scan /dev/sdc
 1143  btrfs filesystem df /nuvat/
 1144  btrfsck
 1145  btrfsck /dev/sdc
 1146  btrfsck /nuvat
 1147  btrfsctl --help
 1148  btrfsctl -a
 1149  btrfsctl -A /dev/sdc
 1150  btrfs-show
 1197  btrfs-show
 1198  btrfsck
 1199  btrfsck /dev/sdb
 1200  btrfsck /dev/sdc
 1201  btrfsck /dev/sdv
 1202  btrfsck /dev/sdb
 1203  btrfstune
 1204  btrfsctl
 1205  btrfsctl  -a
 1286  btrfs-vol
 1287  btrfs filesystem show
 1290  btrfs device scan
 1292  btrfsck
 1293  btrfsck /dev/sdb
 1299  btrfsctl
 1300  btrfsctl -a
 1304  btrfsck  -h
 1305  btrfsck  --help
 1306  btrfsck
 1307  btrfsck /dev/sdc
 1308  btrfsck /dev/sdb
 1309  btrfs-show
 1310  btrfs-show nuvat
 1311  btrfs-vol
 1312  dpkg -l | grep btrfs
 1313  apt-get install btrfs-tools
 1314  btrfsctl
 1315  btrfsctl -c
 1316  btrfsctl -A
 1317  btrfsctl -A /dev/sdb
 1318  btrfsctl -d
 1319  btrfsctl -d /nuvat/
 1320  btrfsctl -d /dev/sdb
 1321  btrfs-show
 1322  btrfs-show  --help
 1323  btrfs-show  /dev/sdb
 1326  btrfs-vol
 1327  btrfs-vol -a
 1328  btrfs-vol -a /nuvat
 1329  btrfs-vol -a asdasd /nuvat
 1330  btrfs-vol -a missing /nuvat
 1331  btrfs-vol -a /dev/sdc /nuvat
 1332  btrfs-vol -a /dev/sdb /nuvat
 1334  btrfs-vol -a missing /nuvat
 1335  btrfs
 1336  btrfs device /dev/sdc /nuvat
 1337  btrfs device add /dev/sdc /nuvat
 1338  btrfs device delete /dev/sdc /nuvat
 1339  btrfs fi show
 1340  btrfs fi show /nuvat
 1341  btrfs fi show nuvat
 1342  btrfs filesystem  show nuvat
 1343  btrfs filesystem  show
 1344  btrfsctl -a
 1345  btrfs device scan
 1346  btrfs filesystem  show all
 1348  btrfs-show  /dev/sdb
 1352  btrfsck /dev/sdb
 1355  btrfsck
 1356  btrfsck  -s
 1357  btrfsck  -s 1
 1358  btrfsck  -s 1 /dev/sdb
 1360  apt-cache search btrfs
 1361  btrfs filesystem show
 1374  btrfs
 1375  btrfs device scan
 1376  btrfs fi show
 1377  history | grep btrfs
 1387  btrfs-vol
 1388  btrfsck  /dev/sdb
 1389  btrfs subvolume
 1390  btrfs fi show
 1391  dpkg -l | grep btrfs
 1392  apt-cache search btrfs
 1393  git clone
http://git.darksatanic.net/repo/btrfs-progs-unstable.git/btrfs-progs__git
 1394  cd btrfs-progs__git/
 1398  ./btrfs device scan
 1399  ./btrfsck
 1400  ./btrfsck  /dev/sdb
 1401  ./btrfsctl -a


Re: ENOSPC on file deletion with 3.1.6

2012-01-03 Thread Arie Peterson
On Tuesday 03 January 2012 15:22:58 Sander wrote:

> Hm, not full.
>
> > OK, I'll keep this in mind. I'm a bit anxious to reboot, because I'm
> > afraid booting will fail if the root file system cannot be written to.
>
> But you did already reboot as you said the old kernel exposed the same
> behavior?

You are right; the full history of events was:

(- compile new kernel (3.1.6))
- boot new kernel;
- recompile gcc: problem occurs;
- solve problem by removing compilation directory;
- boot old kernel;
- recompile gcc: problem occurs for this kernel as well.

After trying to remove the problematic files for some time, I took the chance 
and rebooted. After the reboot, the file system still gave ENOSPC on any write 
operation. However, it was able to boot anyway, and now the removal of the 
problematic files went much faster and without new ENOSPC. The compilation 
directory was completely removed, and immediately afterwards, the file system 
became writeable again.

Sander, thanks for your help.


I am still curious if this is a known problem, and if upgrading to a newer kernel 
might prevent it from reoccurring. (I was planning to wait for 3.2 to be 
released and included in Gentoo's repository; maybe I shouldn't wait for 
this...)


Kind regards,

Arie



NULL Pointer Dereference While Scrubbing

2012-01-03 Thread Mitch Harder
I've recently run into a kernel NULL pointer dereference while
scrubbing a partition that had picked up an error.

I'm running kernel 3.2.0-rc7.  I'd had a power outage, and noticed an
error in a partition when running btrfsck after reboot:

# ./btrfsck /dev/sdb5
root 5 inode 19772 errors 400
found 3123032064 bytes used err is 1
total csum bytes: 2476808
total tree bytes: 586649600
total fs tree bytes: 554622976
btree space waste bytes: 145500448
file data blocks allocated: 2536382464
 referenced 5143969792
Btrfs Btrfs v0.19-dirty

I ran scrub (even though this partition is formatted with single data
and metadata) to attempt to clear the error.  My system froze for
about 30 seconds (no HD activity, no mouse or keyboard movement), then
the scrub proceeded to run, but with the following errors in dmesg:

[ 3683.056829] [ cut here ]
[ 3683.056848] WARNING: at lib/kref.c:34 kref_get+0x20/0x30()
[ 3683.056851] Hardware name:
[ 3683.056853] Modules linked in: nvidia(P) nvidia_agp i2c_nforce2
[ 3683.056861] Pid: 4349, comm: btrfs-readahead Tainted: P   O
3.2.0-rc7-git-local+ #1
[ 3683.056865] Call Trace:
[ 3683.056873]  [c102ca5d] warn_slowpath_common+0x6d/0xa0
[ 3683.056878]  [c13acbc0] ? kref_get+0x20/0x30
[ 3683.056882]  [c13acbc0] ? kref_get+0x20/0x30
[ 3683.056886]  [c102caad] warn_slowpath_null+0x1d/0x20
[ 3683.056889]  [c13acbc0] kref_get+0x20/0x30
[ 3683.056898]  [c135b89d] reada_pick_zone+0x11d/0x160
[ 3683.056903]  [c135c491] reada_start_machine_worker+0x201/0x2f0
[ 3683.056910]  [c133a4f9] worker_loop+0x89/0x370
[ 3683.056914]  [c133a470] ? btrfs_queue_worker+0x250/0x250
[ 3683.056919]  [c10468f4] kthread+0x74/0x80
[ 3683.056922]  [c1046880] ? kthread_worker_fn+0x110/0x110
[ 3683.056929]  [c1766036] kernel_thread_helper+0x6/0xd
[ 3683.056932] ---[ end trace f9c0e14dc17013ed ]---
[ 3683.056941] BUG: unable to handle kernel NULL pointer dereference at 00e4
[ 3683.056947] IP: [c13aeeb4] radix_tree_delete+0x14/0x250
[ 3683.056953] *pde = 
[ 3683.056956] Oops:  [#1]
[ 3683.056960] Modules linked in: nvidia(P) nvidia_agp i2c_nforce2
[ 3683.056965]
[ 3683.056968] Pid: 4349, comm: btrfs-readahead Tainted: PW  O
3.2.0-rc7-git-local+ #1/MS-6570
[ 3683.056973] EIP: 0060:[c13aeeb4] EFLAGS: 00010282 CPU: 0
[ 3683.056977] EIP is at radix_tree_delete+0x14/0x250
[ 3683.056980] EAX: 00e4 EBX: f4cdf300 ECX: 0002c000 EDX: 00e4
[ 3683.056983] ESI: c135b650 EDI: f4e0c400 EBP: f0851ec4 ESP: f0851e70
[ 3683.056986]  DS: 007b ES: 007b FS:  GS:  SS: 0068
[ 3683.056990] Process btrfs-readahead (pid: 4349, ti=f085
task=f4f0a9a0 task.ti=f085)
[ 3683.056993] Stack:
[ 3683.056994]  0022 0002c000 00e4 c175dfe7 c18aaeb0 f0851e94
f0851e9c c102c9ea
[ 3683.057001]  c18aaeb0 c17013ed c18f8971 f0851ec4 c102ca6d c18a740e
c1aeb034 0022
[ 3683.057007]  c13acbc0 c13acbc0 f4cdf300 c135b650 f4e0c400 f0851ed0
c135b66e f4cdf334
[ 3683.057008] Call Trace:
[ 3683.057008]  [c175dfe7] ? printk+0x18/0x1a
[ 3683.057008]  [c102c9ea] ? print_oops_end_marker+0x2a/0x30
[ 3683.057008]  [c17013ed] ? xdr_process_buf+0x1d/0x1e0
[ 3683.057008]  [c102ca6d] ? warn_slowpath_common+0x7d/0xa0
[ 3683.057008]  [c13acbc0] ? kref_get+0x20/0x30
[ 3683.057008]  [c13acbc0] ? kref_get+0x20/0x30
[ 3683.057008]  [c135b650] ? reada_peer_zones_set_lock+0x60/0x60
[ 3683.057008]  [c135b66e] reada_zone_release+0x1e/0x30
[ 3683.057008]  [c13acb6c] kref_put+0x2c/0x60
[ 3683.057008]  [c135b7b3] reada_pick_zone+0x33/0x160
[ 3683.057008]  [c135c441] reada_start_machine_worker+0x1b1/0x2f0
[ 3683.057008]  [c133a4f9] worker_loop+0x89/0x370
[ 3683.057008]  [c133a470] ? btrfs_queue_worker+0x250/0x250
[ 3683.057008]  [c10468f4] kthread+0x74/0x80
[ 3683.057008]  [c1046880] ? kthread_worker_fn+0x110/0x110
[ 3683.057008]  [c1766036] kernel_thread_helper+0x6/0xd
[ 3683.057008] Code: 45 e0 8b 5d d8 83 c0 01 89 03 8b 45 e4 8d 65 f4
5b 5e 5f 5d c3 66 90 55 89 e5 57 56 53 83 ec 48 89 55 b0 89 c2 8b 4d
b0 89 45 b4 8b 00 3b 0c 85 48 c1 9e c1 0f 87 9d 01 00 00 8b 5a 08 85
c0 89
[ 3683.057008] EIP: [c13aeeb4] radix_tree_delete+0x14/0x250 SS:ESP
0068:f0851e70
[ 3683.057008] CR2: 00e4
[ 3683.057149] ---[ end trace f9c0e14dc17013ee ]---
[ 3684.019436] checksum error at logical 24911872 on dev /dev/sdb5,
sector 48656: metadata leaf (level 0) in tree 5
[ 3684.019445] checksum error at logical 24911872 on dev /dev/sdb5,
sector 48656: metadata leaf (level 0) in tree 5
[ 3684.019451] btrfs: unable to fixup (regular) error at logical 24911872
[ 3684.675210] checksum error at logical 36139008 on dev /dev/sdb5,
sector 70584: metadata leaf (level 0) in tree 5
[ 3684.675219] checksum error at logical 36139008 on dev /dev/sdb5,
sector 70584: metadata leaf (level 0) in tree 5
[ 3684.675226] btrfs: unable to fixup (regular) error at logical 36139008
[ 3684.990958] checksum error at logical 43667456 on dev /dev/sdb5,
sector 85288: metadata leaf (level 0) in tree 5
[ 3684.990967] checksum error at logical 43667456 on 

Re: btrfsprogs source code

2012-01-03 Thread Calvin Walton
On Tue, 2012-01-03 at 23:26 +0530, debit2...@gmail.com wrote:
> Hi Everyone,
>
> I am very new to this mailing list and very much interested in getting
> into the internals of BTRFS file system
> I was looking for mkfs.btrfs source code so that I can start getting
> how the disk is formatted with btrfs system.
>
> Can anyone of you redirect me to that place to download the btrfsprogs
> source code.

The best way to get the btrfs-progs source is probably via git; Chris
Mason's repository for it can be found at
http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git

-- 
Calvin Walton calvin.wal...@kepstin.ca



Re: fstrim on BTRFS

2012-01-03 Thread Chris Mason
On Thu, Dec 29, 2011 at 12:02:48PM +0800, Li Zefan wrote:
> Martin Steigerwald wrote:
> > Hi!
> >
> > With 3.2-rc4 (probably earlier), Ext4 seems to remember what areas it
> > trimmed:
> >
> > merkaba:~ fstrim -v /boot
> > /boot: 224657408 bytes were trimmed
> > merkaba:~ fstrim -v /boot
> > /boot: 0 bytes were trimmed
> >
> >
> > But BTRFS does not:
> >
> > merkaba:~ fstrim -v /
> > /: 4431613952 bytes were trimmed
> > merkaba:~ fstrim -v /
> > /: 4341846016 bytes were trimmed
> >
> >
> > Is it planned to add this feature to BTRFS as well?
> >
>
> There's no such plan, but it's do-able, and I can take care of it.
> There's an issue though.
>
> Whether we want to store TRIMMED information on disk? ext4 doesn't
> do this, so the first fstrim will be slow though you've done fstrim
> in previous mount.

I'd rather not store the trim status on disk.  The extra trims
don't have a huge cost, and since some devices have a large granularity
for trims, they may ignore the trim until it tosses a larger contiguous
area of the disk.

I'd be fine with a flag to the in-memory free extent struct that
indicates if it has been trimmed down to the device.
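
The in-memory flag idea could look roughly like the Python model below. It is not kernel code: FreeExtent, fstrim and the discard callback are hypothetical names used only to show how a per-extent flag lets a second trim pass skip work, and how the information disappears when the in-memory structures are rebuilt.

```python
from dataclasses import dataclass


@dataclass
class FreeExtent:
    start: int
    length: int
    trimmed: bool = False   # kept in memory only; not written to disk


def fstrim(free_extents, discard):
    """Discard every free extent not yet trimmed; return bytes trimmed."""
    total = 0
    for ext in free_extents:
        if ext.trimmed:
            continue                     # already sent down to the device
        discard(ext.start, ext.length)   # hypothetical device callback
        ext.trimmed = True
        total += ext.length
    return total
```

With two free extents of 4096 and 8192 bytes, the first call returns 12288 and an immediate second call returns 0. Extents freed later start with trimmed=False and are picked up by the next pass.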

-chris


Re: Btrfs partition lost after RAID1 mirror disk failure?

2012-01-03 Thread C Anthony Risinger
On Tue, Jan 3, 2012 at 8:44 AM, Dan Garton dan.gar...@gmail.com wrote:

> [...]
>  1327  btrfs-vol -a
>  1328  btrfs-vol -a /nuvat
>  1329  btrfs-vol -a asdasd /nuvat
>  1330  btrfs-vol -a missing /nuvat
>  1331  btrfs-vol -a /dev/sdc /nuvat
>  1332  btrfs-vol -a /dev/sdb /nuvat
>  1334  btrfs-vol -a missing /nuvat
> [...]

these look destructive to me ... adding the wrong devices and the
existing devices back to the current array?  IIRC you should have `-r
missing`, but in general, do not use the btrfsctl utility at all -- it
won't have as much visibility/exception-handling/recovery as the
`btrfs` utility.

at what point did your FS become inaccessible?  your command history
suggests it was working until shortly after these commands ... :-(

-- 

C Anthony