Nik,
If you increase num_jobs beyond 4, does it help further? Try 8 or so.
Yeah, libsoft* is definitely consuming some CPU cycles, but I don't know how
to resolve that.
Also, acpi_processor_ffh_cstate_enter popped up and is consuming a lot of CPU.
Try disabling C-states and running the CPU at its maximum frequency.
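For reference, C-states can usually be disabled at boot time. An illustrative kernel command line for an Intel box using GRUB2 (the exact parameters depend on your platform; run update-grub afterwards and reboot):

```
# /etc/default/grub -- keep the CPU out of deep C-states (illustrative values)
GRUB_CMDLINE_LINUX="intel_idle.max_cstate=0 processor.max_cstate=1"
```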
On Mon, May 11, 2015 at 06:07:21AM +, Somnath Roy wrote:
Yes, you need to run the fio clients on a separate box; they will take quite a bit
of CPU.
If you stop OSDs on other nodes, rebalancing will start. Have you waited for the
cluster to reach the active+clean state? If you run the test while rebalancing is
going on, performance will be impacted.
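A tiny sketch of the wait-for-active+clean check (the state strings are the ones `ceph -s` / `ceph pg dump` report; fetching and parsing the command output itself is left out):

```python
def all_active_clean(pg_states):
    """Return True only when every PG is exactly in the active+clean state."""
    return all(s == "active+clean" for s in pg_states)

# Example: one PG still backfilling, one degraded -> not safe to benchmark yet.
states = ["active+clean", "active+remapped+backfilling", "active+clean+degraded"]
print(all_active_clean(states))                # False
print(all_active_clean(["active+clean"] * 3))  # True
```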
~110% cpu
Hi Christian
In my experience, inconsistent PGs are almost always related back to a bad
drive somewhere. They are going to keep happening, and with that many drives
you still need to be diligent/aggressive in dropping bad drives and replacing
them.
If a drive returns an incorrect read, it
On 09/05/2015 00:55, Joao Eduardo Luis wrote:
A command being DEPRECATED must be:
- clearly marked as DEPRECATED in usage;
- kept around for at least 2 major releases;
- kept compatible for the duration of the deprecation period.
Once two major releases go by, the command will then
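As an illustration of that policy (not Ceph's actual implementation), a command handler could mark itself as DEPRECATED in its usage string while staying callable for the whole deprecation period:

```python
import warnings

def deprecated(replacement):
    """Mark a command handler as DEPRECATED; warn when invoked, keep it working."""
    def wrap(fn):
        def inner(*args, **kwargs):
            warnings.warn(
                f"{fn.__name__} is DEPRECATED, use {replacement} instead",
                DeprecationWarning, stacklevel=2)
            return fn(*args, **kwargs)
        # Flag the deprecation in the usage text, as the policy requires.
        inner.usage = f"{fn.__name__} (DEPRECATED, use {replacement})"
        return inner
    return wrap

@deprecated("new-command")          # 'new-command' is a made-up replacement name
def old_command():
    return "ok"

print(old_command.usage)
```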
Hi all!
We are experiencing approximately 1 scrub error / inconsistent pg every
two days. As far as I know, to fix this you can issue a ceph pg
repair, which works fine for us. I have a few questions regarding the
behavior of the ceph cluster in such a case:
1. After ceph detects the scrub error,
Oops... too fast to answer...
G.
On Mon, 11 May 2015 12:13:48 +0300, Timofey Titovets wrote:
Hey! I caught it again. It's a kernel bug. The kernel crashed when I tried to
map an rbd device with a map like the one above!
Hooray!
Hi all,
I have a few questions about ceph-fuse options:
- Is the fuse writeback cache being used? How can we see this? Can it be
turned on with allow_wbcache somehow?
- What is the default of the big_writes option? (as seen in
/usr/bin/ceph-fuse --help) . Where can we see this?
If we run
Timofey,
glad that you 've managed to get it working :-)
Best,
George
FYI and history
Rule:
# rules
rule replicated_ruleset {
    ruleset 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step choose firstn 0 type room
    step choose firstn 0 type rack
    step choose firstn 0 type host
    step chooseleaf firstn 0 type osd
    step emit
}
And after reset
I tried searching the internet and could not find an el7 package with the
liburcu-bp.la file; let me know which rpm package contains this libtool archive.
Hi, maybe you can try
./install-deps.sh
to install the needed dependencies.
- Original Message -
From: Srikanth Madugundi
Hi Robert,
just to make sure I got it correctly:
Do you mean that the /etc/mtab entries are completely ignored and that, no
matter what the order of the /dev/sdX devices is, Ceph will just mount the
correct osd/ceph-X by default?
In addition, assuming that an OSD node fails for a reason other
I had the same problem when doing benchmarks with small block sizes (8k) to
RBDs. These settings seemed to fix the problem for me.
sudo ceph tell osd.* injectargs '--filestore_merge_threshold 40'
sudo ceph tell osd.* injectargs '--filestore_split_multiple 8'
After you apply the settings give it
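For context, my understanding is that with these settings a filestore collection splits once it holds more than roughly filestore_split_multiple * abs(filestore_merge_threshold) * 16 objects. A quick sanity check of the numbers above:

```python
def split_threshold(merge_threshold, split_multiple):
    """Approximate object count at which a filestore collection splits:
    split_multiple * abs(merge_threshold) * 16 (my reading of the docs)."""
    return split_multiple * abs(merge_threshold) * 16

# With the injected values above: 8 * 40 * 16
print(split_threshold(merge_threshold=40, split_multiple=8))  # 5120
```

So with merge_threshold 40 and split_multiple 8, directories split only after about 5120 objects, which pushes the splitting work past small-block benchmark runs.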
If you use ceph-disk (and I believe ceph-deploy) to create your OSDs, or
you go through the manual steps to set up the partition UUIDs, then yes
udev and the init script will do all the magic. Your disks can be moved to
another box without problems. I've moved disks to different ports on
On 05/05/2015 04:13 AM, Yujian Peng wrote:
Emmanuel Florac eflorac@... writes:
On Mon, 4 May 2015 07:00:32 + (UTC)
Yujian Peng pengyujian5201314 at 126.com wrote:
I'm encountering a data disaster. I have a ceph cluster with 145 osd.
The data center had a power problem yesterday, and
Under the OSD directory, you can look at where the symlink points. It is
generally called 'journal' and should point to a device.
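A quick way to script that check (a sketch; the real OSD directories live under /var/lib/ceph/osd/, and here a throwaway temporary directory stands in for one):

```python
import os
import tempfile

def journal_target(osd_dir):
    """Resolve where an OSD directory's 'journal' symlink points (e.g. a /dev device)."""
    return os.readlink(os.path.join(osd_dir, "journal"))

# Demo: fake an OSD dir whose journal symlink points at a device node.
with tempfile.TemporaryDirectory() as d:
    os.symlink("/dev/sdb1", os.path.join(d, "journal"))
    print(journal_target(d))  # /dev/sdb1
```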
On 06 May 2015, at 06:54, Patrik Plank p.pl...@st-georgen-gusen.at wrote:
Hi,
I can't remember on which drive I installed which OSD journal :-||
Is there any
- Original Message -
From: Daniel Hoffman daniel.hoff...@13andrew.com
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: Ben b@benjackson.email, ceph-users ceph-us...@ceph.com
Sent: Sunday, May 10, 2015 5:03:22 PM
Subject: Re: [ceph-users] Shadow Files
Any updates on when this is
Did not work.
$ ls -l /usr/lib64/ | grep liburcu-bp
lrwxrwxrwx 1 root root    19 May 10 05:27 liburcu-bp.so -> liburcu-bp.so.2.0.0
lrwxrwxrwx 1 root root    19 May 10 05:26 liburcu-bp.so.2 -> liburcu-bp.so.2.0.0
-rwxr-xr-x 1 root root 32112 Feb 25 20:27 liburcu-bp.so.2.0.0
Can you point
Thanks for the help! We've lowered the number of PGs per pool to 64, so
with 12 pools and a replica count of 3, all 3 OSDs have a full 768 PGs.
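The arithmetic checks out: 12 pools at 64 placement groups each, with 3 replicas, gives 2304 PG instances spread over the 3 OSDs:

```python
def pgs_per_osd(pools, pgs_per_pool, replicas, osds):
    """Rough count of PG instances each OSD carries, assuming an even spread."""
    return pools * pgs_per_pool * replicas // osds

print(pgs_per_osd(pools=12, pgs_per_pool=64, replicas=3, osds=3))  # 768
```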
If anyone has any concerns or objections (particularly folks from the
Ceph/Redhat team), please let me know.
Thanks again!
On Fri, May 8, 2015 at 1:21
We are still laying the foundations for eventual VMware integration and
indeed the Red Hat acquisition has made this more real now.
The first step is iSCSI support and work is ongoing in the kernel to get HA
iSCSI working with LIO and kRBD. See the blueprint and CDS sessions with
Mike Christie
I had an issue with my calamari server, so I built a new one from scratch.
I've been struggling trying to get the new server to start up and see my
ceph cluster. I went so far as to remove salt and diamond from my ceph
nodes and reinstalled again. On my calamari server, it sees the hosts
Hi,
[Sorry I missed the body of your questions, here is my answer ;-]
On 11/05/2015 23:13, Somnath Roy wrote:
Summary:
-
1. It is doing pretty well on reads, and 4 rados bench clients are saturating
the 40 Gb network. With more physical servers, it scales almost linearly
Thanks Loic..
inline
Regards
Somnath
-Original Message-
From: Loic Dachary [mailto:l...@dachary.org]
Sent: Monday, May 11, 2015 3:02 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com; Ceph Development
Subject: Re: EC backend benchmark
Hi,
Thanks for sharing :-) Have you published the tools that you used to gather
these results? It would be great to have a way to reproduce the same
measurements in different contexts.
Cheers
On 11/05/2015 23:13, Somnath Roy wrote:
Hi Loic and community,
I have gathered the following data on EC backend (all flash). I have decided to
use Jerasure since space saving is the utmost priority.
Setup:
41 OSDs (each on 8 TB flash), 5 node Ceph cluster. 48 core HT enabled cpu/64 GB
RAM. Tested with Rados Bench clients.
Loic,
I thought this one didn't go through !
I have sent another mail with attached doc.
This is the data with rados bench.
In case you missed it, could you please share your thoughts on the questions I
posted below (way down in the mail; not sure how so many spaces came along!!)?
Thanks
Thanks.
Can you please let me know the suitable/best git version/tree to pull
to compile and use this feature/patch?
Thanks
On Tue, May 12, 2015 at 4:38 AM, Yehuda Sadeh-Weinraub yeh...@redhat.com
wrote:
--
From: Daniel Hoffman
It's the wip-rgw-orphans branch.
- Original Message -
From: Daniel Hoffman daniel.hoff...@13andrew.com
To: Yehuda Sadeh-Weinraub yeh...@redhat.com
Cc: Ben b@benjackson.email, David Zafman dzaf...@redhat.com,
ceph-users ceph-us...@ceph.com
Sent: Monday, May 11, 2015 4:30:11 PM
Greetings,
We have been testing a full-SSD Ceph cluster for a few weeks now and are still
testing. One of the outcomes (we will post a full report on our tests soon, but
for now this email is only about replicas) is that as soon as you keep more
than 1 copy in the cluster, it kills the performance
On Mon, May 11, 2015 at 1:57 AM, Kenneth Waegeman
kenneth.waege...@ugent.be wrote:
Hi all,
I have a few questions about ceph-fuse options:
- Is the fuse writeback cache being used? How can we see this? Can it be
turned on with allow_wbcache somehow?
I'm not quite sure what you mean here.
On Fri, May 8, 2015 at 1:34 AM, Yan, Zheng uker...@gmail.com wrote:
On Fri, May 8, 2015 at 11:15 AM, Dexter Xiong dxtxi...@gmail.com wrote:
I tried echo 3 > /proc/sys/vm/drop_caches and dentry_pinned_count dropped.
Thanks for your help.
could you please try the attached patch
I haven't
Agree that 99+% of the inconsistent PGs I see correlate directly with disk errors.
Check /var/log/kern.log*, /var/log/messages*, etc., and I'll bet you find
correlated errors.
-- Anthony
___
ceph-users mailing list
ceph-users@lists.ceph.com
Fellow Cephers,
I'm scratching my head on this one. Somehow a bunch of objects were lost in
my cluster, which is currently ceph version 0.87.1
(283c2e7cfa2457799f534744d7d549f83ea1335e).
The symptoms are that ceph -s reports a bunch of inconsistent PGs:
cluster
Hi, Patrik.
You must configure the priority of the I/O for scrubbing.
http://dachary.org/?p=3268
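As I recall, the linked post boils down to lowering the OSD disk thread's I/O priority so scrubbing yields to client I/O. A ceph.conf sketch (the osd_disk_thread_ioprio_* options exist in Hammer-era releases and only take effect with the CFQ disk scheduler; verify against your version before relying on it):

```ini
[osd]
# Run the disk thread (scrubbing) at idle I/O priority (CFQ scheduler only)
osd_disk_thread_ioprio_class = idle
osd_disk_thread_ioprio_priority = 7
```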
Scrubbing greatly affects I/O and can slow requests on the OSDs. For more
information, look at 'ceph health detail' and 'ceph pg dump | grep
scrub'
2015-05-12 8:42 GMT+03:00 Patrik Plank pat...@plank.me:
Hi,
is that the reason for the Health Warn or the scrubbing notification?
Hi,
the ceph cluster always shows the scrubbing notifications, although it does not
scrub.
And what does the Health Warn mean?
Does anybody have an idea why the warning is displayed.
How can I solve this?
cluster 78227661-3a1b-4e56-addc-c2a272933ac2
health HEALTH_WARN 6 requests are
OK, understood.
But what can I do if the scrubbing process has been hanging on one PG since last night:
root@ceph01:~# ceph health detail
HEALTH_OK
root@ceph01:~# ceph pg dump | grep scrub
pg_stat objects mip degr misp unf bytes log disklog
state state_stamp v
Personally, I would not just run this command automatically because, as you
stated, it only copies the primary PGs to the replicas, and if the primary
is corrupt, you will corrupt your secondaries. I think the monitor log shows
which OSD has the problem, so if it is not your primary, then just issue
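To see which OSDs hold an inconsistent PG before deciding whether to repair, you can parse `ceph health detail` output, whose lines look like `pg 0.6 is active+clean+inconsistent, acting [0,1,2]` (a sketch; double-check the line format against your release):

```python
import re

def inconsistent_pgs(health_detail):
    """Map inconsistent PG ids to their acting OSD sets from health-detail text."""
    pat = re.compile(r"pg (\S+) is [^,]*inconsistent[^,]*, acting \[([\d, ]+)\]")
    out = {}
    for line in health_detail.splitlines():
        m = pat.search(line)
        if m:
            out[m.group(1)] = [int(x) for x in m.group(2).split(",")]
    return out

sample = "pg 0.6 is active+clean+inconsistent, acting [0,1,2]"
print(inconsistent_pgs(sample))  # {'0.6': [0, 1, 2]}
```

The first OSD in the acting set is the primary, which is the one whose copy `ceph pg repair` propagates.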
- Original Message -
From: Daniel Hoffman daniel.hoff...@13andrew.com
To: ceph-users ceph-us...@ceph.com
Sent: Sunday, May 10, 2015 10:54:21 PM
Subject: [ceph-users] civetweb lockups
Hi All.
We have a weird issue where civetweb just locks up: it simply fails to respond
to HTTP and
Hi,
I'm currently doing benchmarks too, and I don't see this behavior.
I get very nice performance of up to 200k IOPS. However, once the volume has
been written to (i.e. when I map it using rbd map and dd the whole volume with
some random data)
and repeat the benchmark, random performance drops to ~23k IOPS.