Re: Ceph Hackathon: More Memory Allocator Testing

2015-08-19 Thread Stefan Priebe
Am 19.08.2015 um 22:16 schrieb Somnath Roy: Alexandre, I am not able to build librados/librbd by using the following config option. ./configure --without-tcmalloc --with-jemalloc Same issue for me. You have to remove libtcmalloc from your build environment to get this done. Stefan It

Re: Ceph Hackathon: More Memory Allocator Testing

2015-08-19 Thread Stefan Priebe - Profihost AG
Thanks for sharing. Do those tests use jemalloc for fio too? Otherwise librbd on client side is running with tcmalloc again. Stefan Am 19.08.2015 um 06:45 schrieb Mark Nelson: Hi Everyone, One of the goals at the Ceph Hackathon last week was to examine how to improve Ceph Small IO
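For reference, one way to make sure the client side also runs jemalloc is to preload it for the fio process. A minimal sketch, assuming a Debian-style library path and a hypothetical job file name:

  # preload jemalloc so fio's librbd engine does not pull in tcmalloc
  # (library path and job file name are assumptions, adjust for your setup)
  LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 fio rbd-test.fio
  # verify which allocator the running fio process actually has mapped
  grep -E 'jemalloc|tcmalloc' /proc/$(pidof -s fio)/maps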

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-19 Thread Stefan Priebe - Profihost AG
Am 18.08.2015 um 15:43 schrieb Campbell, Bill: Hey Stefan, Are you using your Ceph cluster for virtualization storage? Yes Is dm-writeboost configured on the OSD nodes themselves? Yes Stefan *From: *Stefan Priebe

Re: [E1000-devel] dropped rx with i40e

2015-08-19 Thread Stefan Priebe - Profihost AG
released two days ago and to the latest 1.3.38 driver - it works on 10 out of my 18 testing hosts. Currently i've no idea why it does not work on those 8. Stefan Am 19.08.2015 um 00:24 schrieb Rose, Gregory V: -Original Message- From: Stefan Priebe - Profihost AG [mailto:s.pri

Re: [E1000-devel] dropped rx with i40e

2015-08-18 Thread Stefan Priebe - Profihost AG
Hi Greg, could you tell me the output of ethtool -i and ethtool -a and ethtool -c and ethtool -k? Another difference to the ixgbe is that large-receive-offload is fixed to off in ethtool -k. Stefan Am 17.08.2015 um 23:46 schrieb Rose, Gregory V: -Original Message- From: Stefan

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Stefan Priebe - Profihost AG
We've been using an extra caching layer for ceph since the beginning for our older ceph deployments. All new deployments go with full SSDs. I've tested so far: - EnhanceIO - Flashcache - Bcache - dm-cache - dm-writeboost The best working solution was and is bcache, except for its buggy code. The

4.1.5: WARNING: CPU: 3 PID: 4756 at fs/btrfs/extent-tree.c:7539 btrfs_alloc_tree_block

2015-08-16 Thread Stefan Priebe
Hello, this is from vanilla 4.1.5: [29948.701309] [ cut here ] [29948.756302] WARNING: CPU: 3 PID: 4756 at fs/btrfs/extent-tree.c:7539 btrfs_alloc_tree_block+0x431/0x4c0 [btrfs]() [29948.814644] BTRFS: block rsv returned -28 [29948.814647] Modules linked in: dm_mod

[E1000-devel] i40e 1.3.38 new nvme?

2015-08-15 Thread Stefan Priebe
Hello, today there is a new 1.3.38 driver for the xl710 cards. After installing the driver it tells me i need a newer NVMe. Where can i find it? On the intel download page there is no newer one to download. # ethtool -i eth2 driver: i40e version: 1.3.38 firmware-version: 4.42 0x8000191b 0.0.0 Stefan

Re: [Qemu-devel] guest: NO_HZ FULL will not work with unstable sched clock

2015-08-15 Thread Stefan Priebe
] kernel_init+0xe/0xf0 [0.195715] [816347a2] ret_from_fork+0x42/0x70 [0.195719] [8161f6a0] ? rest_init+0x80/0x80 [0.195729] ---[ end trace cf665146248feec1 ]--- Stefan Am 15.08.2015 um 20:44 schrieb Stefan Priebe: Hi, while switching to a FULL tickless kernel i

[Qemu-devel] guest: NO_HZ FULL will not work with unstable sched clock

2015-08-15 Thread Stefan Priebe
Hi, while switching to a FULL tickless kernel i detected that all our VMs produce the following stack trace while running under qemu 2.3.0. [0.195160] HPET: 3 timers in total, 0 timers will be used for per-cpu timer [0.195181] hpet0: at MMIO 0xfed0, IRQs 2, 8, 0 [0.195188]

4.2-rc6: kernel BUG at fs/btrfs/inode.c:3230

2015-08-13 Thread Stefan Priebe - Profihost AG
Seen today: [150110.712196] [ cut here ] [150110.776995] kernel BUG at fs/btrfs/inode.c:3230! [150110.841067] invalid opcode: [#1] SMP [150110.904472] Modules linked in: dm_mod netconsole ipt_REJECT nf_reject_ipv4 xt_multiport iptable_filter ip_tables x_tables

4.2-rc6: cp reflink call trace

2015-08-13 Thread Stefan Priebe - Profihost AG
I'm getting the following trace on a daily basis when stacking a lot of cp --reflink commands. Something like: File a 80GB cp --reflink=always a b modify b cp --reflink=always b c modify c cp --reflink=always c d modify d ... [57623.099897] INFO: task cp:1319 blocked for more than 120
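The pattern described above can be reproduced with a short loop; a minimal sketch, where the file names and sizes are only examples:

  # create an 80GB sparse source file (name and size chosen for illustration)
  truncate -s 80G a
  prev=a
  for f in b c d e; do
      cp --reflink=always "$prev" "$f"                          # share extents with the previous copy
      dd if=/dev/urandom of="$f" bs=1M count=64 conv=notrunc    # modify the new copy in place
      prev="$f"
  done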

Re: [E1000-devel] dropped rx with i40e

2015-08-13 Thread Stefan Priebe - Profihost AG
1.3.12-k from net-next devel does not help either ;-( Should we open an intel support ticket? We really need a solution. Stefan Am 12.08.2015 um 10:29 schrieb Stefan Priebe - Profihost AG: Might this be a memory allocation problem? It happens only after some hours running and when the whole

Re: [E1000-devel] dropped rx with i40e

2015-08-13 Thread Stefan Priebe
curious. May it be related to jumbo frames? Stefan - Greg -Original Message- From: Stefan Priebe [mailto:s.pri...@profihost.ag] Sent: Thursday, August 13, 2015 11:53 AM To: Rose, Gregory V; e1000-devel@lists.sourceforge.net Subject: Re: [E1000-devel] dropped rx with i40e Hi

[ceph-users] rbd rename snaps?

2015-08-12 Thread Stefan Priebe - Profihost AG
Hi, for mds there is the ability to rename snapshots. But for rbd i can't see one. Is there a way to rename a snapshot? Greets, Stefan ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [E1000-devel] dropped rx with i40e

2015-08-12 Thread Stefan Priebe - Profihost AG
Might this be a memory allocation problem? It happens only after some hours running and when the whole memory is filled with linux fs cache. Is the i40e driver using kmalloc or vmalloc? Stefan Am 11.08.2015 um 06:03 schrieb Stefan Priebe: One more thing to note. It mostly happens after around 8

Re: bcache deadlock

2015-08-12 Thread Stefan Priebe - Profihost AG
could also explain why there is no stack trace. Greets, Stefan 2015-08-10 16:51 GMT+02:00 Stefan Priebe s.pri...@profihost.ag: Am 03.08.2015 um 08:25 schrieb Stefan Priebe - Profihost AG: Am 03.08.2015 um 08:21 schrieb Ming Lin: On Fri, Jul 31, 2015 at 11:08 PM, Stefan Priebe s.pri

Re: bcache deadlock

2015-08-10 Thread Stefan Priebe
Am 03.08.2015 um 08:25 schrieb Stefan Priebe - Profihost AG: Am 03.08.2015 um 08:21 schrieb Ming Lin: On Fri, Jul 31, 2015 at 11:08 PM, Stefan Priebe s.pri...@profihost.ag wrote: Hi, any ideas about this deadlock: 2015-08-01 00:05:05 echo 0 /proc/sys/kernel/hung_task_timeout_secs

Re: [E1000-devel] dropped rx with i40e

2015-08-10 Thread Stefan Priebe
to do and then I'll get back to you. - Greg -Original Message- From: Stefan Priebe - Profihost AG [mailto:s.pri...@profihost.ag] Sent: Wednesday, August 05, 2015 11:32 PM To: Rose, Gregory V; e1000-devel@lists.sourceforge.net Subject: Re: [E1000-devel] dropped rx with i40e Am

Re: [E1000-devel] dropped rx with i40e

2015-08-07 Thread Stefan Priebe - Profihost AG
. Just write me. Stefan - Greg -Original Message- From: Stefan Priebe - Profihost AG [mailto:s.pri...@profihost.ag] Sent: Wednesday, August 05, 2015 11:32 PM To: Rose, Gregory V; e1000-devel@lists.sourceforge.net Subject: Re: [E1000-devel] dropped rx with i40e Am 06.08.2015 um 00:22

Re: [E1000-devel] dropped rx with i40e

2015-08-06 Thread Stefan Priebe - Profihost AG
. While a 2nd system receiving the same load using ixgbe has no dropped packets. That might be an easy test to run. Thanks! Greets, Stefan Thanks, - Greg -Original Message- From: Stefan Priebe [mailto:s.pri...@profihost.ag] Sent: Wednesday, August 05, 2015 11:14 AM To: e1000-devel

[E1000-devel] dropped rx with i40e

2015-08-05 Thread Stefan Priebe - Profihost AG
Hello list, we've been using the intel X520 cards with the ixgbe driver for a long time in our cloud infrastructure. We never had a problem with dropped packets and everything was always fine. About a year ago we started switching to the X710 cards as they're better regarding their specs (lower power

Re: [E1000-devel] dropped rx with i40e

2015-08-05 Thread Stefan Priebe
Something i've noticed: ixgbe: Adaptive RX: off TX: off rx-usecs: 1 tx-usecs: 0 i40e: Adaptive RX: on TX: on rx-usecs: 62 tx-usecs: 122 Stefan Am 05.08.2015 um 09:02 schrieb Stefan Priebe - Profihost AG: Hello list, we're using the intel X520 cards with the ixgbe driver since a long time
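To rule the coalescing difference out, the i40e port can be switched to the same static settings the ixgbe port uses; a minimal sketch, assuming the interface name eth2 from the earlier mail:

  # show the current interrupt coalescing settings
  ethtool -c eth2
  # mimic the ixgbe values: adaptive off, rx-usecs 1, tx-usecs 0
  ethtool -C eth2 adaptive-rx off adaptive-tx off rx-usecs 1 tx-usecs 0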

[pve-devel] Tickless kernel or not

2015-08-05 Thread Stefan Priebe - Profihost AG
Hi, while digging through and comparing several kernel parameters for optimizations i was wondering about the following changes from the RHEL7 kernel to the PVE 4.X kernel and the reasons behind them. Redhat has used a fully tickless kernel since RHEL7. Options: CONFIG_NO_HZ=y CONFIG_NO_HZ_FULL=y

Re: [pve-devel] Tickless kernel or not

2015-08-05 Thread Stefan Priebe - Profihost AG
Am 05.08.2015 um 16:48 schrieb Dietmar Maurer: Redhat uses since RHEL7 a fully tickless kernel. Options: CONFIG_NO_HZ=y CONFIG_NO_HZ_FULL=y CONFIG_RCU_FAST_NO_HZ=n CONFIG_HZ_1000=y CONFIG_HZ=1000 PVE now uses: CONFIG_NO_HZ=y CONFIG_NO_HZ_IDLE=y CONFIG_RCU_FAST_NO_HZ=y CONFIG_HZ_250=y
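The difference is easy to confirm against the booted kernel; a minimal sketch, assuming the distro installs the config under /boot:

  # compare the tickless/HZ options of the running kernel
  grep -E 'CONFIG_NO_HZ|CONFIG_RCU_FAST_NO_HZ|CONFIG_HZ' /boot/config-$(uname -r)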

Re: [ceph-users] Is it safe to increase pg number in a production environment

2015-08-04 Thread Stefan Priebe
We've done the splitting several times. The most important thing is to run a ceph version which does not have the linger ops bug. That is the latest dumpling release, giant and hammer. The latest firefly release still has this bug, which results in wrong watchers and no working snapshots. Stefan Am

Re: [ceph-users] Is it safe to increase pg number in a production environment

2015-08-04 Thread Stefan Priebe
at 12:51 AM, Stefan Priebe s.pri...@profihost.ag wrote: We've done the splitting several times. The most important thing is to run a ceph version which does not have the linger ops bug. This is dumpling latest release, giant and hammer. Latest firefly release still has this bug. Which results

btrfs trace / deadlock with 4.2-rc3

2015-07-31 Thread Stefan Priebe
Hi, i got the following error today. 2015-07-31 21:00:19 ---[ end Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 12 2015-07-31 21:00:19 Kernel Offset: 0x0 from 0x8100 (relocation range: 0x8000-0x9fff) 2015-07-31 21:00:18

Re: [pve-devel] javascript error since upgrade to novnc0.5-2

2015-07-29 Thread Stefan Priebe - Profihost AG
;h=c03d35f0247603cbb60ff39536f0801a03f11699 Strange, have to check why it's working on some hosts and not on others. Thanks! Stefan - Mail original - De: Stefan Priebe s.pri...@profihost.ag À: pve-devel pve-devel@pve.proxmox.com Envoyé: Mercredi 29 Juillet 2015 09:28:06 Objet

[pve-devel] javascript error since upgrade to novnc0.5-2

2015-07-29 Thread Stefan Priebe - Profihost AG
Hi, still on stable-3. Since upgrading to novnc 0.5-2 i get on some hosts the following error in novnc. pveui.js:284 Uncaught TypeError: Cannot read property 'type' of null pveui.js:284UI.updateSetting @ pveui.js:304UI.pve_start.start_vnc_viewer @

[pve-devel] CVE-2015-5154

2015-07-27 Thread Stefan Priebe - Profihost AG
Hi, does this affect proxmox? I don't understand why xen is explicitly mentioned in the advisory. http://seclists.org/oss-sec/2015/q3/212 -- Stefan ___ pve-devel mailing list pve-devel@pve.proxmox.com

Re: [Qemu-devel] [Qemu-stable] [PULL 0/3] Cve 2015 5154 patches

2015-07-27 Thread Stefan Priebe - Profihost AG
Am 27.07.2015 um 14:01 schrieb John Snow: The following changes since commit f793d97e454a56d17e404004867985622ca1a63b: Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging (2015-07-24 13:07:10 +0100) are available in the git repository at:

Re: [Qemu-devel] [Qemu-stable] [PULL 0/3] Cve 2015 5154 patches

2015-07-27 Thread Stefan Priebe - Profihost AG
Am 27.07.2015 um 14:28 schrieb John Snow: On 07/27/2015 08:10 AM, Stefan Priebe - Profihost AG wrote: Am 27.07.2015 um 14:01 schrieb John Snow: The following changes since commit f793d97e454a56d17e404004867985622ca1a63b: Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream

Re: [pve-devel] CVE-2015-5154

2015-07-27 Thread Stefan Priebe - Profihost AG
to answer myself. PVE is affected. Every VM with a CDROM can get root. Stefan Am 27.07.2015 um 14:36 schrieb Stefan Priebe - Profihost AG: Hi, does this affect proxmox? I don't understand why xen is explicit content of the advisory. http://seclists.org/oss-sec/2015/q3/212 -- Stefan

Re: do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-23 Thread Stefan Priebe
Am 22.07.2015 um 09:23 schrieb Stefan Priebe - Profihost AG: Am 21.07.2015 um 23:15 schrieb Thomas Gleixner: On Tue, 21 Jul 2015, Stefan Priebe wrote: Am 20.07.2015 um 12:53 schrieb Thomas Gleixner: On Mon, 20 Jul 2015, Stefan Priebe - Profihost AG wrote: Hello list, i've 36 servers all

Re: do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-22 Thread Stefan Priebe - Profihost AG
Am 21.07.2015 um 23:15 schrieb Thomas Gleixner: > On Tue, 21 Jul 2015, Stefan Priebe wrote: >> Am 20.07.2015 um 12:53 schrieb Thomas Gleixner: >>> On Mon, 20 Jul 2015, Stefan Priebe - Profihost AG wrote: >>>> Hello list, >>>> >>>> i've 36

Re: upstream/firefly exporting the same snap 2 times results in different exports

2015-07-22 Thread Stefan Priebe - Profihost AG
in upstream/firefly-backports. What's the purpose of that branch? Greets, Stefan Josh On 07/21/2015 12:52 PM, Stefan Priebe wrote: So this is really this old bug? http://tracker.ceph.com/issues/9806 Stefan Am 21.07.2015 um 21:46 schrieb Josh Durgin: On 07/21/2015 12:22 PM, Stefan Priebe

Re: do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-22 Thread Stefan Priebe - Profihost AG
Am 21.07.2015 um 23:15 schrieb Thomas Gleixner: On Tue, 21 Jul 2015, Stefan Priebe wrote: Am 20.07.2015 um 12:53 schrieb Thomas Gleixner: On Mon, 20 Jul 2015, Stefan Priebe - Profihost AG wrote: Hello list, i've 36 servers all running vanilla 3.18.18 kernel which have a very high disk

Re: do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-21 Thread Stefan Priebe
Am 20.07.2015 um 12:53 schrieb Thomas Gleixner: On Mon, 20 Jul 2015, Stefan Priebe - Profihost AG wrote: Hello list, i've 36 servers all running vanilla 3.18.18 kernel which have a very high disk and network load. Since a few days i encounter regular the following error messages and pretty

Re: upstream/firefly exporting the same snap 2 times results in different exports

2015-07-21 Thread Stefan Priebe
Am 21.07.2015 um 16:32 schrieb Jason Dillaman: Any chance that the snapshot was just created prior to the first export and you have a process actively writing to the image? Sadly not. I executed those commands exactly as i've posted, manually in bash. I can reproduce this on 5 different

Re: do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-21 Thread Stefan Priebe
Am 20.07.2015 um 12:53 schrieb Thomas Gleixner: On Mon, 20 Jul 2015, Stefan Priebe - Profihost AG wrote: Hello list, i've 36 servers all running vanilla 3.18.18 kernel which have a very high disk and network load. Since a few days i encounter regular the following error messages and pretty

Re: upstream/firefly exporting the same snap 2 times results in different exports

2015-07-21 Thread Stefan Priebe
Am 21.07.2015 um 21:46 schrieb Josh Durgin: On 07/21/2015 12:22 PM, Stefan Priebe wrote: Am 21.07.2015 um 19:19 schrieb Jason Dillaman: Does this still occur if you export the images to the console (i.e. rbd export cephstor/disk-116@snap - dump_file)? Would it be possible for you

Re: upstream/firefly exporting the same snap 2 times results in different exports

2015-07-21 Thread Stefan Priebe
So this is really this old bug? http://tracker.ceph.com/issues/9806 Stefan Am 21.07.2015 um 21:46 schrieb Josh Durgin: On 07/21/2015 12:22 PM, Stefan Priebe wrote: Am 21.07.2015 um 19:19 schrieb Jason Dillaman: Does this still occur if you export the images to the console (i.e. rbd export

Re: upstream/firefly exporting the same snap 2 times results in different exports

2015-07-21 Thread Stefan Priebe
Am 21.07.2015 um 19:19 schrieb Jason Dillaman: Does this still occur if you export the images to the console (i.e. rbd export cephstor/disk-116@snap - dump_file)? Would it be possible for you to provide logs from the two rbd export runs on your smallest VM image? If so, please add the

upstream/firefly exporting the same snap 2 times results in different exports

2015-07-21 Thread Stefan Priebe - Profihost AG
Hi, i remember there was a bug in ceph before (not sure in which release) where exporting the same rbd snap multiple times resulted in different raw images. Currently running upstream/firefly and i'm seeing the same again. # rbd export cephstor/disk-116@snap dump1 # sleep 10 # rbd export
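A quick way to show the mismatch is to export the snapshot twice and compare checksums; a minimal sketch reusing the pool/image names from the report:

  rbd export cephstor/disk-116@snap dump1
  sleep 10
  rbd export cephstor/disk-116@snap dump2
  md5sum dump1 dump2    # exports of the same snapshot should produce identical checksums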

Re: do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-20 Thread Stefan Priebe - Profihost AG
Am 20.07.2015 um 12:53 schrieb Thomas Gleixner: > On Mon, 20 Jul 2015, Stefan Priebe - Profihost AG wrote: >> Hello list, >> >> i've 36 servers all running vanilla 3.18.18 kernel which have a very >> high disk and network load. >> >> Since a few days

do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-20 Thread Stefan Priebe - Profihost AG
Hello list, i've 36 servers all running vanilla 3.18.18 kernel which have a very high disk and network load. Since a few days i regularly encounter the following error messages and pretty often completely hanging disk i/o: [535040.439859] do_IRQ: 0.126 No irq handler for vector (irq -1)

Re: do_IRQ: 0.126 No irq handler for vector (irq -1)

2015-07-20 Thread Stefan Priebe - Profihost AG
Am 20.07.2015 um 12:53 schrieb Thomas Gleixner: On Mon, 20 Jul 2015, Stefan Priebe - Profihost AG wrote: Hello list, i've 36 servers all running vanilla 3.18.18 kernel which have a very high disk and network load. Since a few days i encounter regular the following error messages and pretty

out of space on big devices 30tb

2015-07-20 Thread Stefan Priebe - Profihost AG
Hello list, I constantly get no space messages from btrfs on big volumes. Btrfs balance always fixes it for 2-3 days. Now I'm in the process of recreating the fs. Are there any options I could pass to mkfs.btrfs which help to prevent this? Special use case: heavy usage of cp reflink and

Re: kernel 4.1: no space left on device

2015-07-18 Thread Stefan Priebe
Still nobody? Upgraded to 4.2-rc2 and i still see the out of space situation on two 30TB and 40TB arrays every week. Stefan Am 27.06.2015 um 17:32 schrieb Stefan Priebe: Hi, while having some big btrfs volumes (44TB + 37TB). I see on a regular basis the no space left on device message. I'm

bcache_writeback task blocked for more ...

2015-07-17 Thread Stefan Priebe
Hi List, while running bcache on 3.18.18 and having added the known patches from the list i still get blocked bcache_writeback tasks. The kernel does not show any stack trace on its own. The relevant tasks look like this: [: ~]# cat /proc/769/stack [a01e9685] closure_sync+0x25/0x90

[Qemu-devel] Query CPU model / type

2015-07-15 Thread Stefan Priebe - Profihost AG
Hi, is there a way to query the current cpu model / type of a running qemu machine? I mean host, kvm64, qemu64, ... Stefan
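Absent a dedicated monitor command, one host-side check is to look at the -cpu argument the running process was started with; a minimal sketch, assuming the guest process is named kvm as on PVE hosts:

  # print the -cpu setting of a running guest (process name "kvm" is an assumption)
  tr '\0' '\n' < /proc/$(pidof -s kvm)/cmdline | grep -A1 '^-cpu$'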

Re: [Qemu-devel] Query CPU model / type

2015-07-15 Thread Stefan Priebe
Am 15.07.2015 um 13:32 schrieb Andrey Korolyov: On Wed, Jul 15, 2015 at 2:20 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: Hi, is there a way to query the current cpu model / type of a running qemu machine? I mean host, kvm64, qemu64, ... Stefan I believe that the most

Re: [Qemu-devel] Query CPU model / type

2015-07-15 Thread Stefan Priebe
Am 15.07.2015 um 22:15 schrieb Andrey Korolyov: On Wed, Jul 15, 2015 at 11:07 PM, Stefan Priebe s.pri...@profihost.ag wrote: Am 15.07.2015 um 13:32 schrieb Andrey Korolyov: On Wed, Jul 15, 2015 at 2:20 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: Hi, is there a way

Re: [pve-devel] qemu 2.3

2015-07-14 Thread Stefan Priebe - Profihost AG
OK found it - sorry a regression in the kernel. Patch is here https://lkml.org/lkml/fancy/2015/7/3/288 Stefan Am 14.07.2015 um 14:25 schrieb Stefan Priebe - Profihost AG: Hi, while testing qemu 2.3 i've now seen several times the following stack trace i've never seen before with qemu 2.2.1

[pve-devel] qemu 2.3

2015-07-14 Thread Stefan Priebe - Profihost AG
Hi, while testing qemu 2.3 i've now seen the following stack trace several times, which i've never seen before with qemu 2.2.1. Does anybody know something about it? [17732.613577] vmwrite error: reg 6000 value fff7 (err -9) [17732.614979] CPU: 9 PID: 19571 Comm: kvm Tainted: G O

Re: slowdown after one week

2015-07-13 Thread Stefan Priebe - Profihost AG
Am 13.07.2015 um 13:20 schrieb Austin S Hemmelgarn: On 2015-07-11 02:46, Stefan Priebe wrote: Hi, while using a 40TB btrfs partition for VM backups. I see a massive slowdown after around one week. The backup task takes usally 2-3 hours. After one week it takes 20 hours. If i umount

slowdown after one week

2015-07-11 Thread Stefan Priebe
Hi, while using a 40TB btrfs partition for VM backups, I see a massive slowdown after around one week. The backup task usually takes 2-3 hours. After one week it takes 20 hours. If i umount and remount the btrfs volume it takes 2-3 hours again. Kernel 4.1.1 Greets, Stefan -- To unsubscribe

Re: [ceph-users] replace OSD disk without removing the osd from crush

2015-07-09 Thread Stefan Priebe
Am 09.07.2015 um 19:35 schrieb Wido den Hollander: On 07/09/2015 09:15 AM, Stefan Priebe - Profihost AG wrote: Am 08.07.2015 um 23:33 schrieb Somnath Roy: Yes, I am able to reproduce that too..Not sure if this is a bug or change. That's odd. Can someone from inktank comment? Not from

Re: [ceph-users] replace OSD disk without removing the osd from crush

2015-07-09 Thread Stefan Priebe - Profihost AG
Am 08.07.2015 um 23:33 schrieb Somnath Roy: Yes, I am able to reproduce that too..Not sure if this is a bug or change. That's odd. Can someone from inktank comment? Thanks Regards Somnath -Original Message- From: Stefan Priebe [mailto:s.pri...@profihost.ag] Sent: Wednesday

Re: [ceph-users] replace OSD disk without removing the osd from crush

2015-07-08 Thread Stefan Priebe
-boun...@lists.ceph.com] On Behalf Of Stefan Priebe Sent: Wednesday, July 08, 2015 12:58 PM To: ceph-users Subject: [ceph-users] replace OSD disk without removing the osd from crush Hi, is there any way to replace an osd disk without removing the osd from crush, auth, ... Just recreate the same

Re: trying to compile with-jemalloc but ceph-osd is still linked to libtcmalloc

2015-07-07 Thread Stefan Priebe - Profihost AG
it always binds to tcmalloc. I've now removed googleperftools and it works fine. Stefan With regards, Shishir -Original Message- From: Stefan Priebe - Profihost AG [mailto:s.pri...@profihost.ag] Sent: Tuesday, July 07, 2015 2:48 PM To: Shishir Gowda; ceph-devel@vger.kernel.org

Re: trying to compile with-jemalloc but ceph-osd is still linked to libtcmalloc

2015-07-07 Thread Stefan Priebe - Profihost AG
(0x7f99cee7f000) I tried it with upstream master, what branch are you using. With regards, Shishir -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel- ow...@vger.kernel.org] On Behalf Of Stefan Priebe Sent: Friday, July 03, 2015 2:45 PM To: ceph

Re: [pve-devel] unknown setting 'maxcpus'

2015-07-04 Thread Stefan Priebe - Profihost AG
- De: Stefan Priebe s.pri...@profihost.ag À: dietmar diet...@proxmox.com, aderumier aderum...@odiso.com Cc: pve-devel pve-devel@pve.proxmox.com Envoyé: Vendredi 3 Juillet 2015 22:04:11 Objet: Re: [pve-devel] unknown setting 'maxcpus' Am 02.07.2015 um 16:01 schrieb Dietmar Maurer: I don't

Re: [pve-devel] unknown setting 'maxcpus'

2015-07-03 Thread Stefan Priebe
Am 02.07.2015 um 16:01 schrieb Dietmar Maurer: I don't get it. I've now replaced maxcpus: 128 with vcpus: 128 but this results in: maxcpus must be equal to or greater than smp on migration. Command is than: -smp 128,sockets=1,cores=2,maxcpus=2 Please can you post the whole VM config? Old

trying to compile with-jemalloc but ceph-osd is still linked to libtcmalloc

2015-07-03 Thread Stefan Priebe
Hi, i'm trying to compile current hammer with-jemalloc. configure .. --without-tcmalloc --with-jemalloc but the resulting ceph-osd is still linked against tcmalloc: ldd /usr/bin/ceph-osd linux-vdso.so.1 => (0x7fffbf3b9000) libjemalloc.so.1 =>
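For reference, the build/verify cycle discussed in this thread looks roughly like this; a minimal sketch, assuming a Debian build host where removing the tcmalloc development package keeps configure from picking it up:

  # drop the tcmalloc development files so autoconf cannot find them (package name is an assumption)
  apt-get remove libgoogle-perftools-dev
  ./configure --without-tcmalloc --with-jemalloc
  make -j$(nproc)
  # the result should list libjemalloc and no libtcmalloc
  ldd src/ceph-osd | grep -E 'jemalloc|tcmalloc'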

Re: [ceph-users] Redhat Storage Ceph Storage 1.3 released

2015-07-02 Thread Stefan Priebe - Profihost AG
Hi, Am 01.07.2015 um 23:35 schrieb Loic Dachary: Hi, The details of the differences between the Hammer point releases and the RedHat Ceph Storage 1.3 can be listed as described at http://www.spinics.net/lists/ceph-devel/msg24489.html reconciliation between hammer and v0.94.1.2 The

Re: [pve-devel] ceph hammer officially release by redhat

2015-07-02 Thread Stefan Priebe - Profihost AG
instead of tcmalloc Stefan - Mail original - De: Stefan Priebe s.pri...@profihost.ag À: pve-devel pve-devel@pve.proxmox.com Envoyé: Jeudi 2 Juillet 2015 06:33:56 Objet: [pve-devel] ceph hammer officially release by redhat Hi, Redhat has officially released hammer. (Attention

Re: [pve-devel] unknown setting 'maxcpus'

2015-07-02 Thread Stefan Priebe - Profihost AG
I don't get it. I've now replaced maxcpus: 128 with vcpus: 128 but this results in: maxcpus must be equal to or greater than smp on migration. Command is then: -smp 128,sockets=1,cores=2,maxcpus=2 Greets, Stefan Am 02.07.2015 um 15:32 schrieb Stefan Priebe - Profihost AG: qemu-server has
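With the new hotplug scheme the -smp line has to keep the initially started vcpu count at or below maxcpus, and maxcpus has to match the topology; a minimal sketch of a consistent command line (the topology values are only examples):

  # start with 2 online vcpus, allow hotplug up to 128 (2 sockets x 64 cores)
  kvm -smp 2,sockets=2,cores=64,maxcpus=128 ...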

[pve-devel] unknown setting 'maxcpus'

2015-07-02 Thread Stefan Priebe - Profihost AG
Hi, i've today upgraded to the latest stable-3 git. At nearly all commands i get this error: vm 203 - unable to parse value of 'maxcpus' - unknown setting 'maxcpus' vm 204 - unable to parse value of 'maxcpus' - unknown setting 'maxcpus' vm 202 - unable to parse value of 'maxcpus' - unknown setting

Re: [pve-devel] unknown setting 'maxcpus'

2015-07-02 Thread Stefan Priebe - Profihost AG
. Stefan - Mail original - De: Stefan Priebe s.pri...@profihost.ag À: pve-devel pve-devel@pve.proxmox.com Envoyé: Jeudi 2 Juillet 2015 14:03:49 Objet: [pve-devel] unknown setting 'maxcpus' Hi, i've today upgrade to latest stable-3 git. At nearly all commands i get

Re: [pve-devel] unknown setting 'maxcpus'

2015-07-02 Thread Stefan Priebe - Profihost AG
qemu-server has a qmupdate script. It seems this one misses this change. Stefan Am 02.07.2015 um 15:27 schrieb Stefan Priebe - Profihost AG: Am 02.07.2015 um 14:32 schrieb Alexandre DERUMIER: yes. cpu hotplug as changed. Now, the maxcpus = cores * sockets and vcpus: define

[pve-devel] ceph hammer officially release by redhat

2015-07-01 Thread Stefan Priebe
Hi, Redhat has officially released hammer. (Attention: this is not the last tagged point release v0.94.1.2; instead it is current ceph/hammer (no tag)) http://redhatstorage.redhat.com/2015/06/25/announcing-red-hat-ceph-storage-1-3/ -- Stefan ___

Re: kernel 4.1: no space left on device

2015-06-28 Thread Stefan Priebe
Am 27.06.2015 um 23:50 schrieb Ruslanas Gžibovskis: Hi, df -i ? Maybe inode? df -i for btrfs? # df -i |grep vmbackup /dev/md50 0 0 0 - /vmbackup Have a nice $day_time. On Sat, 27 Jun 2015 19:11 Stefan Priebe s.pri...@profihost.ag mailto:s.pri...@profihost.ag

Re: [ceph-users] kernel 3.18 io bottlenecks?

2015-06-27 Thread Stefan Priebe
Dear Ilya, Am 25.06.2015 um 14:07 schrieb Ilya Dryomov: On Wed, Jun 24, 2015 at 10:29 PM, Stefan Priebe s.pri...@profihost.ag wrote: Am 24.06.2015 um 19:53 schrieb Ilya Dryomov: On Wed, Jun 24, 2015 at 8:38 PM, Stefan Priebe s.pri...@profihost.ag wrote: Am 24.06.2015 um 16:55 schrieb

Re: kernel 4.1: no space left on device

2015-06-27 Thread Stefan Priebe
Am 27.06.2015 um 17:51 schrieb Roman Mamedov: On Sat, 27 Jun 2015 17:32:04 +0200 Stefan Priebe s.pri...@profihost.ag wrote: Hi, while having some big btrfs volumes (44TB + 37TB). I see on a regular basis the no space left on device message. I'm only able to fix this. By running btrfs balance

kernel 4.1: no space left on device

2015-06-27 Thread Stefan Priebe
Hi, while having some big btrfs volumes (44TB + 37TB), I see on a regular basis the no space left on device message. I'm only able to fix this by running btrfs balance AND unmounting and remounting the btrfs volume. Is there any way to debug / workaround this one? Greets, Stefan -- To
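The usual way to check whether this is chunk exhaustion rather than a real lack of space, and to reclaim mostly-empty chunks without a full rebalance, is a filtered balance; a minimal sketch, with the mount point /vmbackup taken from the later df output as an assumption:

  # show how the allocated data/metadata chunks are actually used
  btrfs filesystem df /vmbackup
  # rewrite only chunks that are less than half full
  btrfs balance start -dusage=50 -musage=50 /vmbackup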

Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-22 Thread Stefan Priebe - Profihost AG
Am 22.06.2015 um 09:08 schrieb Alexandre DERUMIER aderum...@odiso.com: Just an update, there seems to be no proper way to pass iothread parameter from openstack-nova (not at least in Juno release). So a default single iothread per VM is what all we have. So in conclusion a nova instance

Re: rbd_cache, limiting read on high iops around 40k

2015-06-22 Thread Stefan Priebe - Profihost AG
Am 22.06.2015 um 09:08 schrieb Alexandre DERUMIER aderum...@odiso.com: Just an update, there seems to be no proper way to pass iothread parameter from openstack-nova (not at least in Juno release). So a default single iothread per VM is what all we have. So in conclusion a nova instance

[ceph-users] SSD LifeTime for Monitors

2015-06-17 Thread Stefan Priebe - Profihost AG
Hi, Does anybody know how much data gets written by the monitors? I was using some cheaper ssds for monitors and was wondering why they had already written 80 TB after 8 months. Stefan ___ ceph-users mailing list ceph-users@lists.ceph.com

slow TCP speed through linux ip forwarding

2015-06-16 Thread Stefan Priebe - Profihost AG
Hello, i hope somebody has an idea for my problem and may point me in the right direction. All servers run kernel 3.18.14. My problem is that i can't achieve more than 20Mbit/s using a single TCP stream and i've no further ideas how to solve it. reduced Network Map Server A | Linux GW
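A few things worth checking on the forwarding box when a single TCP stream stalls like this; a minimal diagnostic sketch, where the interface name eth0 and the far-end address are assumptions:

  # look for a path-MTU problem between the two ends
  tracepath 192.0.2.10
  # check whether offloads on the gateway NIC interfere with forwarded traffic
  ethtool -k eth0 | grep -E 'generic-receive-offload|large-receive-offload'
  # watch for drops/overruns on the gateway while the transfer runs
  ip -s link show eth0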

Re: [pve-devel] Qemu / virtio-rng-pci

2015-06-04 Thread Stefan Priebe - Profihost AG
://www.theregister.co.uk/2013/09/10/torvalds_on_rrrand_nsa_gchq/ Stefan - Mail original - De: Stefan Priebe s.pri...@profihost.ag À: dietmar diet...@proxmox.com, aderumier aderum...@odiso.com Cc: pve-devel pve-devel@pve.proxmox.com Envoyé: Mercredi 3 Juin 2015 20:41:48 Objet: Re: [pve-devel] Qemu / virtio

Re: [pve-devel] Qemu / virtio-rng-pci

2015-06-03 Thread Stefan Priebe
Am 03.06.2015 um 17:29 schrieb Dietmar Maurer: Well, the patch checks the version of qemu or the machine option or forcemachine from qemu live migration. Ah ok, sorry, didn't see this. But I still think it's bad to rely on qemu versions. What about a pve compatibility flag in the conf file which

Re: [pve-devel] Qemu / virtio-rng-pci

2015-06-01 Thread Stefan Priebe - Profihost AG
Stefan - Mail original - De: Stefan Priebe s.pri...@profihost.ag À: pve-devel pve-devel@pve.proxmox.com Envoyé: Lundi 1 Juin 2015 12:41:56 Objet: [pve-devel] Qemu / virtio-rng-pci Hi, while i had not enough entropy in some virtual machines i was wondering why we don't use virtio

[pve-devel] Qemu / virtio-rng-pci

2015-06-01 Thread Stefan Priebe - Profihost AG
Hi, while i didn't have enough entropy in some virtual machines i was wondering why we don't use virtio-rng-pci as default, which has been available since qemu 1.3? Here is a red hat article on it:
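For context, wiring the device up on the qemu command line only needs two extra arguments; a minimal sketch feeding the guest from the host's /dev/urandom with a rate limit (the limit values are just examples):

  kvm ... \
    -object rng-random,filename=/dev/urandom,id=rng0 \
    -device virtio-rng-pci,rng=rng0,max-bytes=1024,period=1000    # at most 1 KB per second to the guest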

Re: [pve-devel] Qemu / virtio-rng-pci

2015-06-01 Thread Stefan Priebe - Profihost AG
Am 01.06.2015 um 17:39 schrieb Dietmar Maurer diet...@proxmox.com: Sure. I'm just thinking about the check regarding Qemu 2.3. I would also like to use it for older qemu versions / installations. Is there no other way to support it and not to break live migration? I don't see how to do

Re: [pve-devel] Qemu / virtio-rng-pci

2015-06-01 Thread Stefan Priebe - Profihost AG
: Stefan Priebe s.pri...@profihost.ag À: aderumier aderum...@odiso.com Cc: pve-devel pve-devel@pve.proxmox.com Envoyé: Lundi 1 Juin 2015 14:58:16 Objet: Re: [pve-devel] Qemu / virtio-rng-pci Am 01.06.2015 um 14:51 schrieb Alexandre DERUMIER: I just send a patch. Oh great. can you test

Re: [pve-devel] pve-qemu-kvm : remove openvz fairsched patch

2015-05-25 Thread Stefan Priebe
can we keep the -id $vmid option? it makes life in top or ps aux pretty easy ;-) Am 25.05.2015 um 14:02 schrieb Alexandre Derumier: we don't use openvz kernel anymore, we can remove this patch ___ pve-devel mailing list pve-devel@pve.proxmox.com

Re: [pve-devel] [PATCH] pve-qemu-kvm: fix VENOM qemu security flaw (CVE-2015-3456)

2015-05-14 Thread Stefan Priebe
:14 PM Stefan Priebe s.pri...@profihost.ag wrote: Signed-off-by: Stefan Priebe s.pri...@profihost.ag --- ...he-fifo-access-to-be-in-bounds-of-the-all.patch | 87 debian/patches/series |1 + 2 files changed, 88 insertions(+) create mode

Re: [Qemu-devel] [Qemu-stable] [PATCH] fdc: force the fifo access to be in bounds of the allocated buffer

2015-05-13 Thread Stefan Priebe
Am 13.05.2015 um 21:05 schrieb Stefan Weil: Am 13.05.2015 um 20:59 schrieb Stefan Priebe: Am 13.05.2015 um 20:51 schrieb Stefan Weil: Hi, I just noticed this patch because my provider told me that my KVM based server needs a reboot because of a CVE (see this German news: http://www.heise.de

Re: [Qemu-devel] [Qemu-stable] [PATCH] fdc: force the fifo access to be in bounds of the allocated buffer

2015-05-13 Thread Stefan Priebe
Am 13.05.2015 um 20:51 schrieb Stefan Weil: Hi, I just noticed this patch because my provider told me that my KVM based server needs a reboot because of a CVE (see this German news:

Re: [Qemu-devel] [Qemu-stable] [PATCH] fdc: force the fifo access to be in bounds of the allocated buffer

2015-05-13 Thread Stefan Priebe
Am 13.05.2015 um 21:04 schrieb John Snow: On 05/13/2015 02:59 PM, Stefan Priebe wrote: Am 13.05.2015 um 20:51 schrieb Stefan Weil: Hi, I just noticed this patch because my provider told me that my KVM based server needs a reboot because of a CVE (see this German news: http://www.heise.de

[pve-devel] [PATCH] pve-qemu-kvm: fix VENOM qemu security flaw (CVE-2015-3456)

2015-05-13 Thread Stefan Priebe
Signed-off-by: Stefan Priebe s.pri...@profihost.ag --- ...he-fifo-access-to-be-in-bounds-of-the-all.patch | 87 debian/patches/series |1 + 2 files changed, 88 insertions(+) create mode 100644 debian/patches/0001-fdc-force-the-fifo-access

[ceph-users] Change osd nearfull and full ratio of a running cluster

2015-04-29 Thread Stefan Priebe - Profihost AG
Hi, how can i change the osd full and osd nearfull ratio of a running cluster? Just setting: mon osd full ratio = .97 mon osd nearfull ratio = .92 has no effect. Stefan ___ ceph-users mailing list ceph-users@lists.ceph.com
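The ratios are stored in the pgmap of the running cluster, so editing ceph.conf alone only takes effect after a restart; a minimal sketch of changing them at runtime, assuming the hammer-era CLI (verify the exact subcommands against your version):

  # change the ratios in the pgmap of the running cluster
  ceph pg set_full_ratio 0.97
  ceph pg set_nearfull_ratio 0.92
  # additionally push the options into the running monitors
  ceph tell 'mon.*' injectargs '--mon_osd_full_ratio 0.97 --mon_osd_nearfull_ratio 0.92'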

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-24 Thread Stefan Priebe - Profihost AG
Is jemalloc recommended in general? Does it also work for firefly? Stefan Excuse my typo sent from my mobile phone. Am 24.04.2015 um 18:38 schrieb Alexandre DERUMIER aderum...@odiso.com: Hi, I have finished to rebuild ceph with jemalloc, all seem to working fine. I got a constant

Re: [pve-devel] crashes of pmxcfs

2015-04-23 Thread Stefan Priebe - Profihost AG
Hi, Am 23.04.2015 um 06:39 schrieb Dietmar Maurer: (gdb) bt full #0 clog_dump_json (clog=0x2cf94d0, str=0x1fd0700, ident=<optimized out>, max_entries=50) at logger.c:167 node = <optimized out> msg = <optimized out> cur = 0x64350140 ident = <optimized out>
