[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-08-07 Thread Peter Waller
That's my understanding too, except in one of the scenarios I observed
100% SYS CPU for long stretches even when there was a significant amount
(~50GB) of the device unused.

However, if it was a soft lockup, it lasted 8 hours, during which the
machine was totally unresponsive to HTTP requests, which amounts to a
very dead production machine.

In the end I'm not sure what to do about all of this. I'm surprised to
find out that BTRFS can back itself into a corner from which it cannot
retrieve itself.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1349711

Title:
  Machine lockup in btrfs-transaction

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349711/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-08-05 Thread Peter Waller
Repost of what I sent to the mailing list just now:

My current interpretation is that this is a pathological condition
caused by not rebalancing while being nearly out of space for allocating
more metadata. That would explain why it is rarely seen by anyone else:
most users rebalance regularly.

For details about rebalancing and running out of space, see the thread
"ENOSPC with mkdir and rename" from 2014-08-02:

http://thread.gmane.org/gmane.comp.file-systems.btrfs/37415

I haven't had the lockups in production since July and I'm now trialling a
nightly rebalance:

$ btrfs filesystem balance start -dusage=50 -musage=10 $mount
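
A minimal sketch of how such a nightly balance could be wrapped in a script. The function name `run_balance`, the `BTRFS` override variable and the defaults are illustrative assumptions, not taken from the setup described here; only the `-dusage`/`-musage` filters come from the command above.

```shell
# Hypothetical wrapper around the balance command shown above.
# BTRFS may be overridden (e.g. BTRFS=echo) to preview the arguments
# without touching the filesystem.
run_balance() {
    mount_point=$1
    data_usage=${2:-50}   # only rewrite data chunks below this % usage
    meta_usage=${3:-10}   # same threshold for metadata chunks
    ${BTRFS:-btrfs} balance start \
        "-dusage=${data_usage}" "-musage=${meta_usage}" "$mount_point"
}
```

Calling `BTRFS=echo run_balance /path/to/volume` prints the arguments that would be passed to btrfs instead of running the balance; a cron entry or similar scheduler could call the function for real once a night.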

I'll report back if I encounter further problems.


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-08-02 Thread Peter Waller
The production machine hasn't had a lockup since moving to
3.15.7-031507-generic (it's been up for 4 days) even though we could
reproduce the lockup on a new machine with that kernel using a snapshot
of the old volume.

Another twist is that on the production machine I'm now reliably seeing
"No space left on device", even though there appears in principle to be
63GB remaining:

$ btrfs fi df /path/to/volume
Data, single: total=489.97GiB, used=427.75GiB
System, DUP: total=8.00MiB, used=60.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=5.00GiB, used=4.50GiB
Metadata, single: total=8.00MiB, used=0.00
unknown, single: total=512.00MiB, used=0.00

$ sudo btrfs fi show /path/to/volume
Label: none  uuid: 3ffd71ab-6c3d-4486-a6b0-5c1eeb9be6b3
Total devices 1 FS bytes used 432.25GiB
devid    1 size 500.00GiB used 500.00GiB path /dev/dm-0

The ENOSPC is happening for mkdir and rename syscalls in particular.
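
One way to read the output above, offered as a sketch rather than a diagnosis: the `devid` line reports how much of the device has already been handed out to chunks, so `size` minus `used` is the space still available for allocating new chunks. The helper name `unallocated_gib` is hypothetical, and it assumes both figures are printed in GiB, as they are here.

```shell
# Hypothetical helper: read `btrfs fi show` output on stdin and print,
# for each devid line, size minus used in GiB.
# Lines whose figures are not both in GiB are skipped.
unallocated_gib() {
    awk '$1 ~ /^devid/ {
        size = ""; used = ""
        for (i = 1; i < NF; i++) {
            if ($i == "size") size = $(i + 1)
            if ($i == "used") used = $(i + 1)
        }
        if (size !~ /GiB$/ || used !~ /GiB$/) next
        sub(/GiB$/, "", size)
        sub(/GiB$/, "", used)
        printf "%.2f\n", size - used
    }'
}
```

Piping `sudo btrfs fi show /path/to/volume` into `unallocated_gib` would print 0.00 for the figures above: the whole device is already allocated to chunks, so there is no room for a new metadata chunk even though the existing data chunks are not full.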

I've posted a mail to the BTRFS list about this:
http://thread.gmane.org/gmane.comp.file-systems.btrfs/37415

I did a rebalance with `btrfs balance start -dusage=10` (then with
increasing usage values) to try to gain more space for metadata, but this
didn't fix the situation. I did, however, get the stack trace below in
dmesg.

In the end, I had to enlarge the volume before it became usable again.


[375794.106653] ------------[ cut here ]------------
[375794.106676] WARNING: CPU: 1 PID: 24706 at /home/apw/COD/linux/fs/btrfs/extent-tree.c:6946 use_block_rsv+0xfd/0x1a0 [btrfs]()
[375794.106678] BTRFS: block rsv returned -28
[375794.106679] Modules linked in: softdog tcp_diag inet_diag dm_crypt ppdev xen_fbfront fb_sys_fops syscopyarea sysfillrect sysimgblt i2c_piix4 serio_raw parport_pc parport mac_hid isofs xt_tcpudp iptable_filter xt_owner ip_tables x_tables btrfs xor raid6_pq crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd floppy psmouse
[375794.106702] CPU: 1 PID: 24706 Comm: twsearch.py Not tainted 3.15.7-031507-generic #201407281235
[375794.106703] Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/23/2014
[375794.106705]  1b22 88016db437c8 8176f115 0007
[375794.106707]  88016db43818 88016db43808 8106ceac 8801e489
[375794.106709]  8800a71ab9c0 8801aedcd800 1000 88001c987000
[375794.106711] Call Trace:
[375794.106718]  [8176f115] dump_stack+0x46/0x58
[375794.106721]  [8106ceac] warn_slowpath_common+0x8c/0xc0
[375794.106723]  [8106cf96] warn_slowpath_fmt+0x46/0x50
[375794.106731]  [a00d9d1d] use_block_rsv+0xfd/0x1a0 [btrfs]
[375794.106739]  [a00de687] btrfs_alloc_free_block+0x57/0x220 [btrfs]
[375794.106746]  [a00c8a3c] btrfs_copy_root+0xfc/0x2b0 [btrfs]
[375794.106757]  [a013a583] ? create_reloc_root+0x33/0x2c0 [btrfs]
[375794.106767]  [a013a743] create_reloc_root+0x1f3/0x2c0 [btrfs]
[375794.106776]  [a0140eb8] btrfs_init_reloc_root+0xb8/0xd0 [btrfs]
[375794.106784]  [a00ee967] record_root_in_trans.part.30+0x97/0x100 [btrfs]
[375794.106792]  [a00ee9f4] record_root_in_trans+0x24/0x30 [btrfs]
[375794.106800]  [a00efeb1] btrfs_record_root_in_trans+0x51/0x80 [btrfs]
[375794.106808]  [a00f13d6] start_transaction.part.35+0x86/0x560 [btrfs]
[375794.106815]  [a00d1ee0] ? btrfs_reduce_alloc_profile.isra.48+0x80/0x160 [btrfs]
[375794.106818]  [8109be78] ? finish_task_switch+0x128/0x180
[375794.106826]  [a00f18d9] start_transaction+0x29/0x30 [btrfs]
[375794.106834]  [a00f19a7] btrfs_join_transaction+0x17/0x20 [btrfs]
[375794.106841]  [a00d9764] flush_space+0xf4/0x160 [btrfs]
[375794.106848]  [a00d998a] reserve_metadata_bytes+0x1ba/0x450 [btrfs]
[375794.106851]  [811dd073] ? generic_permission+0xf3/0x120
[375794.106854]  [812f010c] ? security_inode_permission+0x1c/0x30
[375794.106857]  [810b5450] ? __wake_up_sync+0x20/0x20
[375794.106864]  [a00daf3a] btrfs_delalloc_reserve_metadata+0x16a/0x4a0 [btrfs]
[375794.106873]  [a0102b3d] __btrfs_buffered_write+0x15d/0x5c0 [btrfs]
[375794.106877]  [8118bd9c] ? handle_pte_fault+0x18c/0x1b0
[375794.106886]  [a010319f] btrfs_file_aio_write+0x1ff/0x3b0 [btrfs]
[375794.106889]  [811d268a] do_sync_write+0x5a/0x90
[375794.106892]  [811d32db] vfs_write+0xcb/0x1f0
[375794.106894]  [811d37df] SyS_write+0x4f/0xb0
[375794.106897]  [817858bf] tracesys+0xe1/0xe6
[375794.106898] ---[ end trace 1853311c87a5cd93 ]---


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-08-01 Thread Peter Waller
btrfs was created with `mkfs.btrfs /dev/mapper/vg-lv`.

It isn't a hard requirement, except that it's a pain to migrate since
that requires downtime to move the files, something I'd rather not do
unless absolutely necessary. The machine freezes are inconvenient but
amount to only a few minutes of downtime if the machine is rebooted when
it hangs.


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-08-01 Thread Peter Waller
The filesystem may have been originally created on an older version of
BTRFS from Ubuntu Saucy, which I suppose may not have detected the SSD?


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-08-01 Thread Peter Waller
smb: Yeah, the system the filesystem was created on was PV, the device
name was xvd*. Now it's on HVM with xvd*.


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-07-31 Thread Peter Waller
I've got a way to rapidly reproduce the error now. I can do it reliably
with a turnaround time of 5-10 minutes.

I've reproduced the crash on the new kernel, so it has now been observed
on both 3.13.0-32-generic and 3.15.7-031507-generic. I'll try 3.16
next.

I've also discovered this new stack trace at the point of the crash
(setup_cluster_bitmap) which happens every time:

[97056.916053] NMI backtrace for cpu 1
[97056.916053] CPU: 1 PID: 1107 Comm: kworker/u30:1 Not tainted 3.13.0-32-generic #57-Ubuntu
[97056.916053] Hardware name: Xen HVM domU, BIOS 4.2.amazon 05/23/2014
[97056.916053] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
[97056.916053] task: 8800e8018000 ti: 8800da1ac000 task.ti: 8800da1ac000
[97056.916053] RIP: 0010:[81044a8c] [81044a8c] default_send_IPI_mask_sequence_phys+0xbc/0x100
[97056.916053] RSP: 0018:8801efc23c60 EFLAGS: 0046
[97056.916053] RAX: 0400 RBX: b06a RCX: 0001
[97056.916053] RDX: 0001 RSI: 0001 RDI: 0300
[97056.916053] RBP: 8801efc23c98 R08: 81d13780 R09: 0d2e
[97056.916053] R10:  R11: 8801efc239e6 R12: 81d13780
[97056.916053] R13: 0400 R14: 0086 R15: 0002
[97056.916053] FS: () GS:8801efc2() knlGS:
[97056.916053] CS: 0010 DS:  ES:  CR0: 80050033
[97056.916053] CR2: 7f803f12e000 CR3: 000101cc2000 CR4: 001406e0
[97056.916053] Stack:
[97056.916053] 00010008 0001 2710 81c4e1c0
[97056.916053] 81d137a0 81c4e1c0 0001 8801efc23ca8
[97056.916053] 81049217 8801efc23cc0 81044c60 8801efc2e800
[97056.916053] Call Trace:
[97056.916053] <IRQ>
[97056.916053] [81049217] physflat_send_IPI_all+0x17/0x20
[97056.916053] [81044c60] arch_trigger_all_cpu_backtrace+0x70/0xb0
[97056.916053] [810ca9ee] rcu_check_callbacks+0x3fe/0x650
[97056.916053] [81076227] update_process_times+0x47/0x70
[97056.916053] [810d5cf5] tick_sched_handle.isra.17+0x25/0x60
[97056.916053] [810d5d71] tick_sched_timer+0x41/0x60
[97056.916053] [8108e5e7] __run_hrtimer+0x77/0x1d0
[97056.916053] [810d5d30] ? tick_sched_handle.isra.17+0x60/0x60
[97056.916053] [8108edaf] hrtimer_interrupt+0xef/0x230
[97056.916053] [81009fef] xen_timer_interrupt+0x2f/0x150
[97056.916053] [81009ef0] ? xen_clocksource_read+0x20/0x30
[97056.916053] [8101b7e9] ? sched_clock+0x9/0x10
[97056.916053] [8109d1ad] ? sched_clock_local+0x1d/0x80
[97056.916053] [810bf78e] handle_irq_event_percpu+0x3e/0x1d0
[97056.916053] [810c2bbe] handle_percpu_irq+0x3e/0x60
[97056.916053] [8142dba7] __xen_evtchn_do_upcall+0x317/0x320
[97056.916053] [8109d98e] ? __vtime_account_system+0x2e/0x40
[97056.916053] [8109dd2c] ? vtime_account_system+0x3c/0x50
[97056.916053] [8142fc8b] xen_evtchn_do_upcall+0x2b/0x50
[97056.916053] [8172e22d] xen_hvm_callback_vector+0x6d/0x80
[97056.916053] <EOI>
[97056.916053] [813732df] ? find_next_zero_bit+0x8f/0xf0
[97056.916053] [a0164f6a] setup_cluster_bitmap+0x15a/0x280 [btrfs]
[97056.916053] [a0168174] btrfs_find_space_cluster+0x244/0x290 [btrfs]
[97056.916053] [a0115a32] find_free_extent+0x4d2/0xc30 [btrfs]
[97056.916053] [a01162b8] btrfs_reserve_extent+0xa8/0x1a0 [btrfs]
[97056.916053] [a012f165] cow_file_range+0x135/0x430 [btrfs]
[97056.916053] [a0130142] run_delalloc_range+0x312/0x350 [btrfs]
[97056.916053] [a0143f69] ? find_lock_delalloc_range.constprop.43+0x1b9/0x1f0 [btrfs]
[97056.916053] [a0145134] __extent_writepage+0x2f4/0x750 [btrfs]
[97056.916053] [a0144010] ? end_extent_writepage+0x70/0x70 [btrfs]
[97056.916053] [a0145802] extent_write_cache_pages.isra.30.constprop.50+0x272/0x3d0 [btrfs]
[97056.916053] [a0146acd] extent_writepages+0x4d/0x70 [btrfs]
[97056.916053] [a012caf0] ? btrfs_real_readdir+0x5b0/0x5b0 [btrfs]
[97056.916053] [a012ac88] btrfs_writepages+0x28/0x30 [btrfs]
[97056.916053] [8115a5ee] do_writepages+0x1e/0x40
[97056.916053] [811e5f40] __writeback_single_inode+0x40/0x210
[97056.916053] [811e6cf7] writeback_sb_inodes+0x247/0x3e0
[97056.916053] [811e6f2f] __writeback_inodes_wb+0x9f/0xd0
[97056.916053] [811e71a3] wb_writeback+0x243/0x2c0
[97056.916053] [811e8a79] bdi_writeback_workfn+0x1b9/0x430
[97056.916053] [810838f2] process_one_work+0x182/0x450
[97056.916053] [810846e1] worker_thread+0x121/0x410
[97056.916053] [810845c0] ? rescuer_thread+0x430/0x430
[97056.916053] [8108b3d2] kthread+0xd2/0xf0
[97056.916053] [8108b300] ? kthread_create_on_node+0x1d0/0x1d0
[97056.916053] 

[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-07-31 Thread Peter Waller
Now reproduced on 3.16. I'm out of things to try for now.


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-07-31 Thread Peter Waller
This gist contains a stack trace every 10 seconds taken with
`echo l > /proc/sysrq-trigger` whilst the machine was spinning in the
kernel but still responsive.

https://gist.github.com/pwaller/c7dd0f4807459acedcdf

The machine remained responsive for 5-10 minutes before becoming totally
unresponsive to network activity.
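
A sketch of the kind of sampling loop behind a capture like this, assuming a root shell. The function name, its arguments and the overridable trigger path are illustrative; writing `l` to `/proc/sysrq-trigger` asks the kernel to log a backtrace of all active CPUs, which then shows up in dmesg.

```shell
# Hypothetical helper: request COUNT backtraces, INTERVAL seconds apart.
# The third argument exists only so the loop can be tried against a
# scratch file instead of the real trigger.
sample_backtraces() {
    count=$1
    interval=$2
    trigger=${3:-/proc/sysrq-trigger}
    i=0
    while [ "$i" -lt "$count" ]; do
        echo l > "$trigger"    # 'l': backtrace of all active CPUs
        i=$((i + 1))
        # Do not sleep after the last sample.
        [ "$i" -lt "$count" ] && sleep "$interval"
    done
    return 0
}
```

For example, `sample_backtraces 30 10` would request a backtrace every 10 seconds for roughly five minutes.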

The test case just involves copying every sqlite file in a directory
tree containing ~400GB of files of various sizes to new files in the
same directory with `.new` on the end of the filename.


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-07-31 Thread Peter Waller
** Tags added: kernel-bug-exists-upstream

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-07-31 Thread Peter Waller
Hm, I'm not sure I can give a thorough description since I don't
understand enough about the exact workload myself. It is a fairly
arbitrary workload generated by our users.

In the end, it boils down to creating, reading and writing many
(~20,000) sqlite files ranging in size from 16KB to 12GB across many
folders and doing random read/write IO to them. Each of the 20,000 files
lives in its own subdirectory of a single root directory, like so:

/path/1/file.sqlite
/path/2/file.sqlite

etc.

The 500GB volume has approximately 275GB of such files. When copying
/path/1/file.sqlite to /path/1/file.sqlite.new (and so on) with two
`cat` processes in parallel (via xargs -P2 as in #11), the volume
eventually (after multiple hours) hangs. If the copying is resumed from
the last file successfully copied before the hang, the hang sets in very
rapidly.
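
The copy test can be sketched roughly as follows. The exact command from #11 is not reproduced in this message, so the `find` pattern and the function name `copy_sqlite_tree` are assumptions; only `cat`, the `.new` suffix and `xargs -P2` come from the description above.

```shell
# Hypothetical reconstruction of the copy test: copy every *.sqlite file
# under ROOT to <name>.new, running two cat processes at a time.
copy_sqlite_tree() {
    root=$1
    find "$root" -type f -name '*.sqlite' -print0 |
        xargs -0 -n1 -P2 sh -c 'cat "$1" > "$1.new"' _
}
```

Run as `copy_sqlite_tree /path`, this writes each copy next to its source file, so the volume needs enough free space to hold a second copy of the data.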


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-07-31 Thread Peter Waller
(otherwise unloaded test machines)

On a dual core machine, 100% system CPU usage with zero writes is seen
on one core for 5-10 minutes, spending time in BTRFS threads.

On a single thread machine 100% system CPU is used and I haven't yet
been able to cause it to hang entirely. I do observe almost constant
100% system CPU usage and very low IO rates: 0 IOPS for up to minutes at
a time before it then goes to 1000 IOPS.

The system goes to IOWAIT when there is significant IO traffic, and
otherwise kernel threads are consuming CPU.

However, the latter is on a non-prewarmed EBS volume, so I am not sure
whether the behaviour we're seeing is entirely due to cold block storage
where we have no latency/rate guarantees.

Generally, I can't tell if we have hit a bug in the BTRFS free extent
code, or if it is all due to bad EBS performance and a spinlock
somewhere.


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-07-30 Thread Peter Waller
I found an additional stack trace from a previous machine lockup.

[1093202.136107] INFO: task kworker/u30:1:31455 blocked for more than 120 seconds.
[1093202.141596]   Tainted: GF 3.13.0-30-generic #54-Ubuntu
[1093202.146201] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1093202.152408] kworker/u30:1   D 8801efc34440 0 31455  2 0x
[1093202.152416] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
[1093202.152419]  880040d3b768 0002 8800880f 880040d3bfd8
[1093202.152422]  00014440 00014440 8800880f 8801efc34cd8
[1093202.152426]  8801effcefe8 880040d3b7f0 0002 8114e010
[1093202.152429] Call Trace:
[1093202.152435]  [8114e010] ? wait_on_page_read+0x60/0x60
[1093202.152439]  [8171ea6d] io_schedule+0x9d/0x140
[1093202.152442]  [8114e01e] sleep_on_page+0xe/0x20
[1093202.152445]  [8171eff8] __wait_on_bit_lock+0x48/0xb0
[1093202.152449]  [8109df64] ? vtime_common_task_switch+0x24/0x40
[1093202.152452]  [8114e12a] __lock_page+0x6a/0x70
[1093202.152456]  [810aaee0] ? autoremove_wake_function+0x40/0x40
[1093202.152474]  [a00e8a0d] lock_delalloc_pages+0x13d/0x1d0 [btrfs]
[1093202.152487]  [a00eaf2b] find_lock_delalloc_range.constprop.43+0x14b/0x1f0 [btrfs]
[1093202.152499]  [a00ebfa1] __extent_writepage+0x131/0x750 [btrfs]
[1093202.152509]  [a00eb040] ? end_extent_writepage+0x70/0x70 [btrfs]
[1093202.152521]  [a00ec832] extent_write_cache_pages.isra.30.constprop.50+0x272/0x3d0 [btrfs]
[1093202.152532]  [a00edafd] extent_writepages+0x4d/0x70 [btrfs]
[1093202.152544]  [a00d3b20] ? btrfs_real_readdir+0x5b0/0x5b0 [btrfs]
[1093202.152555]  [a00d1cb8] btrfs_writepages+0x28/0x30 [btrfs]
[1093202.152559]  [8115a46e] do_writepages+0x1e/0x40
[1093202.152562]  [811e5c50] __writeback_single_inode+0x40/0x210
[1093202.152565]  [811e6a07] writeback_sb_inodes+0x247/0x3e0
[1093202.152568]  [811e6c3f] __writeback_inodes_wb+0x9f/0xd0
[1093202.152571]  [811e6eb3] wb_writeback+0x243/0x2c0
[1093202.152574]  [811e86d8] bdi_writeback_workfn+0x108/0x430
[1093202.152577]  [810974a8] ? finish_task_switch+0x128/0x170
[1093202.152581]  [810838a2] process_one_work+0x182/0x450
[1093202.152585]  [81084641] worker_thread+0x121/0x410
[1093202.152588]  [81084520] ? rescuer_thread+0x3e0/0x3e0
[1093202.152591]  [8108b322] kthread+0xd2/0xf0
[1093202.152594]  [8108b250] ? kthread_create_on_node+0x1d0/0x1d0
[1093202.152598]  [8172ac3c] ret_from_fork+0x7c/0xb0
[1093202.152601]  [8108b250] ? kthread_create_on_node+0x1d0/0x1d0


[Bug 1349711] [NEW] Machine lockup in btrfs-transaction

2014-07-29 Thread Peter Waller
Public bug reported:

This has happened twice now.

I'm on an AWS EC2 m3.large instance with the official Ubuntu AMI
ami-776d9700.

# cat /proc/version_signature
Ubuntu 3.13.0-32.57-generic 3.13.11.4

After running for many days, the machine locked up with the messages
below appearing on the console. The machine would respond to ping but
not SSH or HTTP requests. The machine has one BTRFS volume which is 87%
full and lives on a Logical Volume Manager (LVM) block device on top of
one Amazon Elastic Block Store (EBS) device.

Error messages after first reboot:

[   77.609490] BTRFS error (device dm-0): block group 10766778368 has wrong amount of free space
[   77.613678] BTRFS error (device dm-0): failed to load free space cache for block group 10766778368
[   77.643801] BTRFS error (device dm-0): block group 19356712960 has wrong amount of free space
[   77.648952] BTRFS error (device dm-0): failed to load free space cache for block group 19356712960
[   77.926325] BTRFS error (device dm-0): block group 20430454784 has wrong amount of free space
[   77.931078] BTRFS error (device dm-0): failed to load free space cache for block group 20430454784
[   78.111437] BTRFS error (device dm-0): block group 21504196608 has wrong amount of free space
[   78.116165] BTRFS error (device dm-0): failed to load free space cache for block group 21504196608

Error messages after second reboot:

[   45.390221] BTRFS error (device dm-0): free space inode generation (0) did not match free space cache generation (70012)
[   45.413472] BTRFS error (device dm-0): free space inode generation (0) did not match free space cache generation (70012)
[  467.423961] BTRFS error (device dm-0): block group 518646661120 has wrong amount of free space
[  467.429251] BTRFS error (device dm-0): failed to load free space cache for block group 518646661120

Error messages on the console after second lock-up follow:

[246736.752053] INFO: rcu_sched self-detected stall on CPU { 0}  (t=2220246 jiffies g=35399662 c=35399661 q=0)
[246736.756059] INFO: rcu_sched detected stalls on CPUs/tasks: { 0} (detected by 1, t=2220247 jiffies, g=35399662, c=35399661, q=0)
[246764.192014] BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u30:2:1828]
[246764.212058] BUG: soft lockup - CPU#1 stuck for 23s! [btrfs-transacti:492]
[246792.192022] BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u30:2:1828]
[246792.212057] BUG: soft lockup - CPU#1 stuck for 23s! [btrfs-transacti:492]
[246820.192052] BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u30:2:1828]
[246820.212018] BUG: soft lockup - CPU#1 stuck for 23s! [btrfs-transacti:492]
[246848.192052] BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u30:2:1828]
[246848.212058] BUG: soft lockup - CPU#1 stuck for 23s! [btrfs-transacti:492]
[246876.192053] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u30:2:1828]
[246876.212057] BUG: soft lockup - CPU#1 stuck for 22s! [btrfs-transacti:492]
[246904.192053] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u30:2:1828]
[246904.212058] BUG: soft lockup - CPU#1 stuck for 22s! [btrfs-transacti:492]
[246916.772052] INFO: rcu_sched self-detected stall on CPU[246916.776058] INFO: rcu_sched detected stalls on CPUs/tasks:
[246944.192053] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u30:2:1828]
[246944.212058] BUG: soft lockup - CPU#1 stuck for 22s! [btrfs-transacti:492]
[246972.192053] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u30:2:1828]
[246972.212018] BUG: soft lockup - CPU#1 stuck for 22s! [btrfs-transacti:492]
[247000.192053] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u30:2:1828]
[247000.212058] BUG: soft lockup - CPU#1 stuck for 22s! [btrfs-transacti:492]
[247028.192054] BUG: soft lockup - CPU#0 stuck for 22s! [kworker/u30:2:1828]
[247028.212058] BUG: soft lockup - CPU#1 stuck for 22s! [btrfs-transacti:492]
[247056.192053] BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u30:2:1828]
[247056.212061] BUG: soft lockup - CPU#1 stuck for 23s! [btrfs-transacti:492]

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: Incomplete


** Tags: btrfs btrfs-transaction linux soft-lockup

** Tags added: soft-lockup

** Tags added: btrfs-transaction


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-07-29 Thread Peter Waller
I've also started a thread on linux-btrfs:

http://thread.gmane.org/gmane.comp.file-systems.btrfs/37224


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-07-29 Thread Peter Waller
@brad-figg, apologies, I missed your response. Is there a way to generate
the output without automatically uploading it? I would like to review it
first. I tried `apport-cli --save`, but as far as I can tell that doesn't
do anything unless there are crash files.


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-07-29 Thread Peter Waller
One thing I am unsure of: the bug originally did not manifest until the
machine had been running for at least 12 days, so I'm not sure it is
going to be possible to reliably decide that it is fixed by moving to a
particular kernel. What is the standard here?


[Bug 1349711] Re: Machine lockup in btrfs-transaction

2014-07-29 Thread Peter Waller
The crashes became more frequent. The approximate uptime before each
crash was 12 days, then ~2 days, then 6 hours, then 1 hour.

I have since moved to 3.15.7-031507-generic.

One thing I have observed is that /var/log/nginx/access.log (on an EXT4
filesystem) contained ~2KB of NUL characters in place of any entries at
the point when the machine froze.


[Bug 1074564] Re: Upstart logfiles should be readable by adm group

2014-07-08 Thread Peter Waller
Some admins are typing `sudo bash` because it's inconvenient to have to
sudo to look at each log file. This is pretty annoying. What does it
take to get this fixed?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1074564

Title:
  Upstart logfiles should be readable by adm group

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/upstart/+bug/1074564/+subscriptions


[Bug 1301015] Re: Networking does not restart

2014-07-08 Thread Peter Waller
I've added the following file:

/etc/network/interfaces.d$ cat lo1.cfg 
auto lo1
iface lo1 inet loopback
address 127.0.1.1
netmask 255.0.0.0


How do I get it to take effect if not `restart networking`? I've tried
`ifup lo1` but it just says "cannot find device lo1". I have a feeling
that I did `restart networking` before on a pre-trusty box and it worked.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1301015

Title:
  Networking does not restart

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1301015/+subscriptions


[Bug 296122] Re: XNEST crashes in X_PolyFillRectangle with error config/hal: NewInputDeviceRequest failed

2014-03-24 Thread Peter Waller
I'm also seeing the same with Xnest on 13.10.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/296122

Title:
  XNEST crashes in X_PolyFillRectangle with error config/hal:
  NewInputDeviceRequest failed

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/xorg-server/+bug/296122/+subscriptions


[Bug 886427] Re: oidentd doesn't have a status option in the init script

2014-02-19 Thread Peter Waller
Any chance of this fix finding its way into Ubuntu? It's causing some
configuration managers to think that oidentd always needs to be
restarted because `status` always returns a non-zero exit status.
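
For context, a configuration manager typically decides whether a service
is up from the exit status of its `status` action. Below is a minimal
sketch of that check in Python. The helper names `service_is_running` and
`fake_status` are hypothetical, and the commands run here are stand-ins
that simply exit with a chosen code; they are not the real oidentd init
script. By LSB convention 0 means "running", so an init script with no
`status` action that exits non-zero will always look stopped.

```python
import subprocess
import sys


def service_is_running(status_cmd):
    """Return True only if the status command exits with 0 (LSB: running)."""
    return subprocess.call(status_cmd) == 0


def fake_status(code):
    # Hypothetical stand-in for `service <name> status`: exits with `code`.
    return [sys.executable, "-c", "import sys; sys.exit(%d)" % code]


if __name__ == "__main__":
    # 0 = running; any other code makes the caller believe a restart is due.
    for code in (0, 1, 3):
        print(code, service_is_running(fake_status(code)))
```

With a check like this, any non-zero exit from `status` is treated as
"needs starting", which matches the behaviour described above.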

-- 
https://bugs.launchpad.net/bugs/886427

Title:
  oidentd doesn't have a status option in the init script

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/oidentd/+bug/886427/+subscriptions



[Bug 1264674] Re: nginx segfault when adding add_header in configuration

2014-02-12 Thread Peter Waller
This bug is marked Fix Released, but the fixed package isn't actually
being installed on my machine. How can that be? nginx currently won't
start. When is this going to be fixed?!

-- 
https://bugs.launchpad.net/bugs/1264674

Title:
  nginx segfault when adding add_header in configuration

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nginx/+bug/1264674/+subscriptions



[Bug 1094319] Re: p11-kit: couldn't load module: /usr/lib/i386-linux-gnu/pkcs11/gnome-keyring-pkcs11.so: /usr/lib/i386-linux-gnu/pkcs11/gnome-keyring-pkcs11.so: cannot open shared object file: No such

2013-12-11 Thread Peter Waller
Any chance of an update for precise?

-- 
https://bugs.launchpad.net/bugs/1094319

Title:
  p11-kit: couldn't load module: /usr/lib/i386-linux-gnu/pkcs11/gnome-
  keyring-pkcs11.so: /usr/lib/i386-linux-gnu/pkcs11/gnome-keyring-
  pkcs11.so: cannot open shared object file: No such file or directory

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/gnome-keyring/+bug/1094319/+subscriptions



[Bug 1250208] [NEW] `stop salt-master` causes upstart to hang

2013-11-11 Thread Peter Waller
Public bug reported:

On the Saucy (13.10) 64-bit image provided by Amazon web services, `stop
salt-master` causes upstart to hang.

This makes it difficult even to uninstall salt-master. I've mentioned
this problem upstream* in an issue that shows related behaviour.
However, that upstream issue is marked closed and I'm unsure whose
side the problem is on.

* https://github.com/saltstack/salt/issues/2166#issuecomment-28231668
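
As a stopgap for scripts that have to call `stop salt-master`, one option
is to bound the call with a timeout so the caller is not blocked
indefinitely. The sketch below is an assumption about how one might do
that in Python, not a fix: `run_with_timeout` and `hanging_cmd` are
hypothetical names, and the hanging command is a stand-in that sleeps
rather than the real `stop` job. It only stops the caller from blocking
and does not address whatever causes the hang.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd; return its exit status, or None if it ran past the timeout."""
    try:
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising the timeout.
        return None


def hanging_cmd(duration):
    # Hypothetical stand-in for a command that hangs, e.g. `stop salt-master`.
    return [sys.executable, "-c", "import time; time.sleep(%d)" % duration]


if __name__ == "__main__":
    print(run_with_timeout(hanging_cmd(30), 1))
```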

** Affects: salt (Ubuntu)
 Importance: Undecided
 Status: New

-- 
https://bugs.launchpad.net/bugs/1250208

Title:
  `stop salt-master` causes upstart to hang

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/salt/+bug/1250208/+subscriptions



[Bug 1228686] Re: X crashes on logout or user switching

2013-09-22 Thread Peter Waller
After X crashes, if I try to restart it, frequently the screen locks up
entirely. I've discovered more information in the Xorg.1.log (attached).

Here is an extract:

[ 40345.063] (II) NVIDIA(0): NVIDIA GPU GeForce GTX 560 Ti (GF114) at PCI:1:0:0 (GPU-0)
[ 40345.063] (--) NVIDIA(0): Memory: 1048576 kBytes
[ 40345.063] (--) NVIDIA(0): VideoBIOS: 70.24.21.00.02
[ 40345.063] (II) NVIDIA(0): Detected PCI Express Link width: 16X
[ 40345.067] (EE) NVIDIA(GPU-0): EVO Push buffer channel allocation failed
[ 40345.067] (EE)  *** Aborting ***
[ 40345.067] (EE) NVIDIA(GPU-0): Failed to allocate EVO core DMA push buffer
[ 40345.067] (EE)  *** Aborting ***
[ 40345.067] (EE) NVIDIA(0): Failing initialization of X screen 0
[ 40345.070] (II) UnloadModule: nvidia
[ 40345.071] (II) UnloadSubModule: shadow
[ 40345.071] (II) UnloadSubModule: wfb
[ 40345.071] (II) UnloadSubModule: fb
[ 40345.071] (EE) Screen(s) found, but none have a usable configuration.
[ 40345.071] (EE) 
Fatal server error:
[ 40345.071] (EE) no screens found(EE) 
[ 40345.071] (EE) 
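
To pull only the error lines out of a log like the extract above, a small
filter is enough. This is a rough sketch under the assumption that each
entry looks like `[ <seconds>] (<level>) <message>`, as in the lines
quoted here; the function name `error_lines` is hypothetical and the
pattern is not an official description of the Xorg log format.

```python
import re

# "[ 40345.067] (EE) message": timestamp, level marker, then the message.
ENTRY = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\((EE|WW)\)\s*(.*)$")


def error_lines(text):
    """Return (timestamp, level, message) for non-empty (EE)/(WW) entries."""
    found = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match and match.group(3).strip():
            found.append(
                (float(match.group(1)), match.group(2), match.group(3).strip())
            )
    return found


if __name__ == "__main__":
    sample = (
        "[ 40345.063] (--) NVIDIA(0): Memory: 1048576 kBytes\n"
        "[ 40345.067] (EE) NVIDIA(GPU-0): EVO Push buffer channel allocation failed\n"
        "[ 40345.071] (EE) \n"
    )
    for entry in error_lines(sample):
        print(entry)
```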


** Attachment added: Xorg.1.log after trying to restart lightdm
   https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-319/+bug/1228686/+attachment/3834552/+files/Xorg.1.log

-- 
https://bugs.launchpad.net/bugs/1228686

Title:
  X crashes on logout or user switching

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-319/+bug/1228686/+subscriptions



[Bug 1228686] [NEW] X crashes on logout or user switching

2013-09-21 Thread Peter Waller
Public bug reported:

If I try to log out or switch to another user, I am frequently presented
with a VT with a blinking cursor. There seems to be about a 50% chance
that it works and does the right thing, and a 50% chance that X dies.

I'm on 13.10 saucy daily, 64-bit.

dmesg says the following, too, but I haven't found anything else of
relevance.

[822.442437] HDMI: invalid ELD data byte 68

** Affects: nvidia-graphics-drivers-319 (Ubuntu)
 Importance: Undecided
 Status: New

** Attachment added: Xorg.0.log immediately after the crash
   https://bugs.launchpad.net/bugs/1228686/+attachment/3833916/+files/Xorg.0.log

-- 
https://bugs.launchpad.net/bugs/1228686

Title:
  X crashes on logout or user switching

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-319/+bug/1228686/+subscriptions



[Bug 312462] Re: document_new_from_data() arg1 must be without null bytes

2013-09-15 Thread Peter Waller
Oh, duh, it's `gir1.2-poppler-0.18` on Ubuntu. I was blind to it.

-- 
https://bugs.launchpad.net/bugs/312462

Title:
  document_new_from_data() arg1 must be without null bytes

To manage notifications about this bug go to:
https://bugs.launchpad.net/poppler-python/+bug/312462/+subscriptions



[Bug 312462] Re: document_new_from_data() arg1 must be without null bytes

2013-09-15 Thread Peter Waller
BenjaminBerg, this wasn't obvious. I'm glad to hear there is something
replacing the bindings and that it isn't just totally dead.
However, if I try to use it:

 from gi.repository import Poppler
ERROR:root:Could not find any typelib for Poppler

I can't see any obvious packages that I might be missing (`apt-cache
search poppler-glib`). Suggestions?
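
A guarded import makes the failure easier to report. This is only a
sketch: the helper name `load_poppler` is hypothetical, and the default
namespace version "0.18" is an assumption taken from the
`gir1.2-poppler-0.18` package name mentioned elsewhere in this thread.
It returns the module, or None plus the error text when PyGObject or the
Poppler typelib is missing.

```python
def load_poppler(version="0.18"):
    """Return (Poppler, None) on success or (None, error text) on failure."""
    try:
        import gi

        # Raises ValueError when no typelib for this namespace is available.
        gi.require_version("Poppler", version)
        from gi.repository import Poppler
    except (ImportError, ValueError) as exc:
        return None, str(exc)
    return Poppler, None


if __name__ == "__main__":
    module, error = load_poppler()
    if module is None:
        print("Poppler bindings unavailable:", error)
    else:
        print("Poppler bindings loaded")
```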

-- 
https://bugs.launchpad.net/bugs/312462

Title:
  document_new_from_data() arg1 must be without null bytes

To manage notifications about this bug go to:
https://bugs.launchpad.net/poppler-python/+bug/312462/+subscriptions



[Bug 312462] Re: document_new_from_data() arg1 must be without null bytes

2013-09-12 Thread Peter Waller
Ping. I really want to use poppler-python and stuff like this makes me
cringe.

-- 
https://bugs.launchpad.net/bugs/312462

Title:
  document_new_from_data() arg1 must be without null bytes

To manage notifications about this bug go to:
https://bugs.launchpad.net/poppler-python/+bug/312462/+subscriptions



[Bug 1062118] [NEW] Missing some geometry header files (TGeo*)

2012-10-05 Thread Peter Waller
Public bug reported:

The header files in the `geombuilder` directory seem to be missing from
the `libroot-geom-dev` package. They are found here, and should be part
of the standard ROOT installation, so far as I can tell.

http://root.cern.ch/viewvc/trunk/geom/geombuilder/inc/

** Affects: root-system (Ubuntu)
 Importance: Undecided
 Status: New

-- 
https://bugs.launchpad.net/bugs/1062118

Title:
  Missing some geometry header files (TGeo*)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/root-system/+bug/1062118/+subscriptions
