On Thu, Jul 23, 2015 at 9:34 PM, Vedran Furač vedran.fu...@gmail.com wrote:
On 07/23/2015 06:47 PM, Ilya Dryomov wrote:
To me this looks like a writev() interrupted by a SIGALRM. I think the
nginx guys read your original email the same way I did, which is that the
write syscall *returned* ERESTARTSYS,
You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It should be
possible to do something like
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …
The last time we tried it, it segfaulted after a few minutes, so YMMV; be
careful.
Jan
On 23 Jul 2015, at 18:18, Luis
Turns out that when we started the 3 OSDs it did “out” the rest on the same
host, so their reweight was 0.
Thus when I started the singular OSD on that host, it tried to put all the PGs
on the other OSDs onto this one (which failed for lack of disk space) and
because of that it also consumed
Hi,
I have a problem with ceph-deploy on Ubuntu 15.04
in the file
/usr/local/lib/python2.7/dist-packages/ceph_deploy/hosts/debian/__init__.py
def choose_init():
    """
    Select an init system.

    Returns the name of an init system (upstart, sysvinit, ...).
    """
    if distro.lower() ==
Sorry, autocorrect. Decompiled crush map.
Robert LeBlanc
Sent from a mobile device, please excuse any typos.
On Jul 24, 2015 9:44 AM, Robert LeBlanc rob...@leblancnet.us wrote:
Please provide the recompiled crush map.
Robert LeBlanc
Sent from a mobile device, please excuse any typos.
On Jul
Hi all,
I have been looking for a way to alleviate the overhead of RBD snapshots/clones
for some time.
In our scenario there are a few “master” volumes that contain production
data, and are frequently snapshotted and cloned for dev/qa use. Those
snapshots/clones live for a few days to a few weeks
Hi Somnath,
Do you have a link with the definitions of all the perf counters?
Thanks,
Steve
On Sun, Jul 5, 2015 at 11:23 AM, Somnath Roy somnath@sandisk.com wrote:
Hi Ray,
Here is the description of the different latencies under filestore perf
counters.
Journal_latency :
On 07/24/2015 02:31 PM, Luis Periquito wrote:
Now it's official, I have a weird one!
Restarted one of the ceph-mons with jemalloc and it didn't make any
difference. It's still using a lot of cpu and still not freeing up memory...
The issue is that the cluster almost stops responding to
The leveldb is smallish: around 70 MB.
I ran with debug mon = 10 for a while, but couldn't find any interesting
information. I would run out of space quite quickly though, as the log
partition only has 10 GB.
On 24 Jul 2015 21:13, Mark Nelson mnel...@redhat.com wrote:
On 07/24/2015 02:31 PM, Luis
On Fri, Jul 24, 2015 at 11:55 PM, Jason Dillaman dilla...@redhat.com wrote:
Hi all,
I have been looking for a way to alleviate the overhead of RBD snapshots/clones
for some time.
In our scenario there are a few “master” volumes that contain production
data, and are frequently snapshotted and cloned
Hello,
If I understand correctly you want to look at how many “guest filesystem block
size” blocks there are that are empty?
This might not be that precise because we do not discard blocks inside the
guests, but if you tell me how to gather this, I can certainly try that. I’m
not sure if my
On 07/24/2015 03:29 PM, Ilya Dryomov wrote:
ngx_write_fd() is just a write(), which, when interrupted by SIGALRM,
fails with EINTR because SA_RESTART is not set. We can try digging
further, but I think nginx should retry in this case.
Hello,
The culprit was the “timer_resolution 50ms;” setting
Hi,
sorry for the late response, your message landed in the spam folder and I found
it just now.
# ceph mds dump
dumped mdsmap epoch 32
epoch 32
flags 0
created 2015-07-11 23:46:04.963071
modified 2015-07-23 17:43:27.198951
tableserver 0
root 0
session_timeout 60
Hi Noah,
It does look like the two things are unrelated. But you are right,
ceph-deploy stopped accepting that trailing hostname with the
ceph-deploy mon create-initial command with 1.5.26. It was never a
needed argument, and accepting it led to confusion. I tightened up
the argument parsing
On Fri, Jul 24, 2015 at 4:29 PM, Ilya Dryomov idryo...@gmail.com wrote:
On Fri, Jul 24, 2015 at 3:54 PM, Vedran Furač vedran.fu...@gmail.com wrote:
On 07/24/2015 09:54 AM, Ilya Dryomov wrote:
I don't know - looks like nginx isn't setting SA_RESTART, so it should
be repeating the
Hi,
Thanks.
I did not know about atop, nice tool... and I don't seem to be IRQ overloaded:
I can reach 100% CPU for IRQs, but that's shared across all 8 physical cores.
I also discovered turbostat, which showed me the R510s were not configured for
performance in the BIOS (but dbpm - demand
No thanks at all.
I think about ZFS deduplication in a slightly different aspect of using
snapshots. We determined, that platter HDD work better with big object size.
But it cause big performance overhead with snapshots. For example, you have
32Mb block size. And you have image snapshot. If
Now it's official, I have a weird one!
Restarted one of the ceph-mons with jemalloc and it didn't make any
difference. It's still using a lot of cpu and still not freeing up memory...
The issue is that the cluster almost stops responding to requests, and if I
restart the primary mon (that had
Hi! Did you try ZFS and its deduplication mechanism? It could radically
decrease writes during COW.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
We use ZFS for other purposes and deduplication is overrated - it is quite
useful with big block sizes (and assuming your data don’t “shift” in the
blocks), but you can usually achieve much higher space savings with compression
- and it usually is faster, too :-) You need lots and lots of RAM
It sounds slightly similar to what I just experienced.
I had one monitor out of three which seemed to essentially run one core at
full tilt continuously, and had its virtual address space grown to the
point where top started reporting it in TB. Requests hitting this monitor did
not get very
Hi Bernhard,
Thanks for your email. systemd support for Ceph in general is still a
work in progress. It is actively being worked on, but the packages
hosted on ceph.com are still using sysvinit (for RPM systems), and
Upstart on Ubuntu. It is definitely a known issue.
Along those lines,
Hello,
I am trying to launch a test cluster with 1 monitor and 1 osd on a node.
I created a cluster with the name msl-lab-dsg02 and tried to deploy an initial
monitor and I get this error:
root@msl-lab-dsg02:~/Downloads/cluster# ceph-deploy --overwrite-conf mon create-initial
- Original Message -
From: Jan Schermer j...@schermer.cz
To: Samuel Taylor Liston sam.lis...@utah.edu
Cc: ceph-users@lists.ceph.com, Wayne Betts wbe...@bnl.gov
Sent: Thursday, July 23, 2015 9:43:30 AM
Subject: Re: [ceph-users] el6 repo problem?
The packages were probably rebuilt
On Fri, Jul 24, 2015 at 3:54 PM, Vedran Furač vedran.fu...@gmail.com wrote:
On 07/24/2015 09:54 AM, Ilya Dryomov wrote:
I don't know - looks like nginx isn't setting SA_RESTART, so it should
be repeating the write()/writev() itself. That said, if it happens
only on cephfs, we need to track
“Friday fun”… not!
We set mon_osd_down_out_subtree_limit=host some time ago. Now we needed to take
down all OSDs on one host and as expected nothing happened (noout was _not_
set). All the PGs showed as stuck degraded.
Then we took 3 OSDs on the host up and then down again because of slow
26 matches