On 3/7/19 9:32 AM, Herbert Alexander Faleiros wrote:
On Thu, Mar 07, 2019 at 01:37:55PM -0300, Herbert Alexander Faleiros wrote:
Should I do something like this? (below, after stopping osd.36)
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-36/ --journal-path /dev/sdc1
Strange, I can't reproduce this with v13.2.4. I tried the following
scenarios:
pg acting 1, 0, 2 -> up 1, 0, 4 (osd.2 marked out). The df on osd.2
shows 0 space, but only osd.4 (backfill target) checks full space.
pg acting 1, 0, 2 -> up 4, 3, 5 (osd.1, 0, 2 all marked out). The df for
sd/PGBackend.cc be_compare_scrubmaps in luminous,
I don't see the changes in the commit here:
https://github.com/ceph/ceph/pull/15368/files
Of course a lot of other things have changed, but is it possible this
fix never made it into luminous?
Graham
On 02/17/2018 12:48 PM, David Zafman wrote:
hung before due to a bug or if recovery stopped (as designed)
because of the unfound object. The new recovery_unfound and
backfill_unfound states indicate that recovery has stopped due to
unfound objects.
commit 64047e1bac2e775a06423a03cfab69b88462538c
Author: David Zafman <d
Yes, the pending backport for what we have so far is in
https://github.com/ceph/ceph/pull/20055
With these changes, a backfill caused by marking an OSD out has the
results shown below:
health: HEALTH_WARN
115/600 objects misplaced (19.167%)
...
data:
pools: 1 pools, 1
Jon,
If you are able please test my tentative fix for this issue which
is in https://github.com/ceph/ceph/pull/18673
Thanks
David
On 10/30/17 1:13 AM, Jon Light wrote:
Hello,
I have three OSDs that are crashing on start with a FAILED
assert(p.same_interval_since) error. I ran
I don't see that same_interval_since being cleared by split.
PG::split_into() copies the history from the parent PG to child. The
only code in Luminous that I see that clears it is in
ceph_objectstore_tool.cc
David
On 10/16/17 3:59 PM, Gregory Farnum wrote:
On Mon, Oct 16, 2017 at 3:49
I improved the code to compute degraded objects during
backfill/recovery. During my testing it wouldn't result in percentage
above 100%. I'll have to look at the code and verify that some
subsequent changes didn't break things.
David
On 10/13/17 9:55 AM, Florian Haas wrote:
Okay, in
" ], "errors": [ ],
"object": { "version": 3, "snap": "head", "locator": "", "nspace": "",
"name": "mytestobject" } } ], "epoch": 103443 }
David
On 9/26/17 10:55 AM, Grego
et-omaphdr
obj_header
$ for i in $(ceph-objectstore-tool --data-path ... --pgid 5.3d40 .dir.default.64449186.344176 list-omap)
do
  echo -n "${i}: "
  ceph-objectstore-tool --data-path ... .dir.default.292886573.13181.12 get-omap $i
done
key1: val1
key2: val2
key3: val3
David
On 9/8/17 12
Robin,
The only two changesets I can spot in Jewel that I think might be
related are these:
1.
http://tracker.ceph.com/issues/20089
https://github.com/ceph/ceph/pull/15416
This should improve the repair functionality.
2.
http://tracker.ceph.com/issues/19404
Please file a bug in tracker: http://tracker.ceph.com/projects/ceph
When an OSD is marked down, is there a crash (e.g. assert, heartbeat
timeout, declared down by another daemon)? Please include relevant log
snippets. If there is no obvious information, then bump the osd debug log levels.
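For example, debug levels can be bumped on a running OSD without a restart
(the osd id and levels here are only illustrative):
$ ceph tell osd.0 injectargs '--debug-osd 20 --debug-ms 1'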
Luminous
James,
You have an omap corruption. It is likely caused by a bug which
has already been identified. A fix for that problem is available but it
is still pending backport for the next Jewel point release. All 4 of
your replicas have different "omap_digest" values.
Instead of the
Farnum
Sent: 20 February 2017 22:13
To: Nick Fisk <n...@fisk.me.uk>; David Zafman <dzaf...@redhat.com>
Cc: ceph-users <ceph-us...@ceph.com>
Subject: Re: [ceph-users] How safe is ceph pg repair these days?
On Sat, Feb 18, 2017 at 12:39 AM, Nick Fisk <n...@fisk.me.uk> wrote:
Hi Janmejay,
Sorry I just found you e-mail in my inbox.
There is no command to list namespaces, but you can list all objects in all
namespaces using the --all option and filter the results.
I created 10 namespaces (ns1 - ns10) in addition to the default one.
rados -p testpool --all ls
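For example, a minimal filter sketch, assuming --all prints the namespace and
object name whitespace-separated (ns1 is a placeholder):
$ rados -p testpool --all ls | awk '$1 == "ns1" {print $2}'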
The ceph-objectstore-tool set-osdmap operation updates existing
osdmaps. If a map doesn't already exist the --force option can be used
to create it. It appears safe in your case to use that option.
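A sketch of that workflow, assuming the good map is first extracted from a
healthy OSD (osd ids, epoch and file name are placeholders; each daemon must
be stopped while the tool runs against its data path):
$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --op get-osdmap --epoch 1234 --file /tmp/osdmap.1234
$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op set-osdmap --epoch 1234 --file /tmp/osdmap.1234 --force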
David
On 4/15/16 9:47 AM, Markus Blank-Burian wrote:
Hi,
we had a problem on our
On 3/23/16 7:45 AM, Gregory Farnum wrote:
On Tue, Mar 22, 2016 at 11:59 AM, Max A. Krasilnikov
wrote:
Hello!
On Tue, Mar 22, 2016 at 11:40:39AM -0700, gfarnum wrote:
On Tue, Mar 22, 2016 at 1:19 AM, Max A. Krasilnikov wrote:
-1> 2016-03-21
Ben,
I haven't looked at everything in your message, but pg 12.7a1 has lost
data because of writes that went only to osd.73. The way to recover
this is to force recovery to ignore this fact and go with whatever data
you have on the remaining OSDs.
I assume that having min_size 1, having
dout() is used for an OSD to log information about what it is doing
locally and might become very chatty. It is saved on the local node's
disk only.
clog is the cluster log and is used for major events that should be
known by the administrator (see ceph -w). Clog should be used sparingly
I was focused on fixing the OSD, but you need to determine if some
misconfiguration or hardware issue caused a filesystem corruption.
David
On 10/22/15 3:08 PM, David Zafman wrote:
There is a corruption of the osdmaps on this particular OSD. You need
to determine which maps are bad, probably
There is a corruption of the osdmaps on this particular OSD. You need
to determine which maps are bad, probably by bumping the osd debug level to
20. Then transfer them from a working OSD. The newest
ceph-objectstore-tool has features to write the maps, but you'll need to
build a version
See below
On 10/21/15 2:44 PM, Gregory Farnum wrote:
On Wed, Oct 14, 2015 at 7:20 PM, Francois Lafont wrote:
Hi,
On 14/10/2015 06:45, Gregory Farnum wrote:
Ok, however during my tests I had been careful to replace the correct
file with a bad file with *exactly* the same
There would be a benefit to doing fadvise POSIX_FADV_DONTNEED after
deep-scrub reads for objects not recently accessed by clients.
I see the NewStore objectstore sometimes using the O_DIRECT flag for
writes. This concerns me because the open(2) man pages says:
"Applications should avoid
pe SnapSet import /tmp/snap.out decode dump_json
{
"snap_context": {
"seq": 9197,
"snaps": [
9197
]
},
"head_exists": 1,
"clones": []
}
On 09/03/2015 04:48 PM, David Zafman wrote:
If you have ceph-
) and create a new one
5. Restore RBD images from backup using new pool (make sure you have
disk space as the pool delete removes objects asynchronously)
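A minimal sketch of step 5, assuming the backups are plain rbd exports (pool
and image names are placeholders):
$ ceph osd pool create newpool 128
$ rbd import /backup/myimage.img newpool/myimage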
David
On 9/3/15 8:15 PM, Chris Taylor wrote:
On 09/03/2015 02:44 PM, David Zafman wrote:
Chris,
WARNING: Do this at your own risk. You
This crash is what happens if a clone is missing from SnapSet (internal
data) for an object in the ObjectStore. If you had out of space issues,
this could possibly have been caused by being able to rename or create
files in a directory, but not being able to update SnapSet.
I've completely
":0,"pool":3,"namespace":"","max":0}]
To remove it, cut and paste your output line with snapid 9197 inside
single quotes like this:
$ ceph-objectstore-tool --data-path xx --journal-path xx
'["3.f9",{"oid":"rb.0.8c2990.238e
"snap": 2,
"size": 452,
"overlap": "[]"
},
{
"snap": 3,
"size": 452,
"overlap": "[]"
},
{
"snap": 4,
"siz
Without my latest branch which hasn't merged yet, you can't repair an EC
pg in the situation that the shard with a bad checksum is in the first k
chunks.
A way to fix it would be to take that osd down/out and let recovery
regenerate the chunk. Remove the pg from the osd
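A sketch of that sequence, assuming osd.21 holds the bad shard and the pg
shard id is 2.1s0 (both placeholders):
$ ceph osd out 21
# after recovery has regenerated the chunk elsewhere, stop the osd.21
# daemon and remove its stale copy of the pg:
$ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-21 --pgid 2.1s0 --op remove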
don't do something silly and shoot myself in the
foot.
Thanks!
-Aaron
On Fri, Aug 28, 2015 at 12:16 PM, David Zafman dzaf...@redhat.com wrote:
I don't know about removing the OSD from the CRUSH map. That seems like
overkill to me.
I just realized a possible better way. It would have been to take
from the CRUSH
map, ceph osd rm 21, then recreating it from scratch as though I'd lost a
disk?
-Aaron
On Fri, Aug 28, 2015 at 11:17 AM, David Zafman dzaf...@redhat.com wrote:
Without my latest branch which hasn't merged yet, you can't repair an EC
pg in the situation that the shard with a bad
Can you upload the entire log file?
David
On Nov 4, 2014, at 1:03 AM, Ta Ba Tuan tua...@vccloud.vn wrote:
Hi Sam,
I resend logs with debug options: http://123.30.41.138/ceph-osd.21.log
(Sorry about my spam :D)
I saw many missing objects :|
By default the vstart.sh setup would put all data below a directory called
“dev” in the source tree. In that case you’re using a single spindle. The
vstart script isn’t intended for performance testing.
David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com
On Jul 2
Create a 3rd OSD. The default pool size is 3 replicas, including the initial
system-created pools.
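If a third OSD is not possible, the replica count of a pool can be lowered
instead, e.g. for the pool named data:
$ ceph osd pool set data size 2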
David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com
On Jun 25, 2014, at 3:04 AM, Iban Cabrillo cabri...@ifca.unican.es wrote:
Dear,
I am trying to deploy a new test
because it is more than 7 days
since the last deep scrub on Jan 1.
See also http://tracker.ceph.com/issues/6735
There may be a need for more documentation clarification in this area or a
change to the behavior.
David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com
On Jun
that
osd_scrub_min_interval = osd_scrub_max_interval = osd_deep_scrub_interval.
I’d like to know how you have those 3 values set, so I can confirm that this
explains the issue.
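One way to read those values from a running OSD, assuming the default admin
socket path:
$ ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep scrub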
David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com
On Jun 23, 2014, at 7:01 PM, Christian Balzer ch
the
point of view of the host kernel, this won’t happen.
David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com
On Jun 12, 2014, at 6:33 PM, lists+c...@deksai.com wrote:
I remember reading somewhere that the kernel ceph clients (rbd/fs) could
not run on the same host
The code checks the pg with the oldest scrub_stamp/deep_scrub_stamp to see
whether the osd_scrub_min_interval/osd_deep_scrub_interval time has elapsed.
So the output you are showing with the very old scrub stamps shouldn’t happen
under default settings. As soon as deep-scrub is re-enabled,
It isn't clear to me what could cause a loop there. Just to be sure you don't
have filesystem corruption, please run a "find" or "ls -R" on the
filestore root directory and check that it completes.
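For example, assuming the default filestore location for osd.0:
$ find /var/lib/ceph/osd/ceph-0 > /dev/null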
Can you send the log you generated? Also, what version of Ceph are you running?
David
to spread operations
across more or less PGs at any given time.
David Zafman
Senior Developer
http://www.inktank.com
On Apr 24, 2014, at 8:09 AM, Chad Seys cws...@physics.wisc.edu wrote:
Hi All,
What does osd_recovery_max_single_start do? I could not find a description
of it.
Thanks!
Chad
and it was detected after 2013-12-13
15:38:13.283741 which was the last clean scrub.
David Zafman
Senior Developer
http://www.inktank.com
On Jan 9, 2014, at 6:36 PM, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote:
I've noticed this on 2 (development) clusters that I have with pools having
Did the inconsistent flag eventually get cleared? It might be that you
didn't wait long enough for the repair to get through the pg.
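A quick check for any PGs still flagged inconsistent:
$ ceph health detail | grep -i inconsistent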
David Zafman
Senior Developer
http://www.inktank.com
On Dec 28, 2013, at 12:29 PM, Corin Langosch corin.lango...@netskin.com wrote:
Hi Sage,
Am
of the replicas.
David
On Nov 17, 2013, at 10:46 PM, Chris Dunlop ch...@onthe.net.au wrote:
Hi David,
On Fri, Nov 15, 2013 at 10:00:37AM -0800, David Zafman wrote:
Replication does not occur until the OSD is “out.” This creates a new
mapping in the cluster of where the PGs should
if the administrator can determine which copy(s)
are bad.
David Zafman
Senior Developer
http://www.inktank.com
On Nov 18, 2013, at 1:11 PM, Chris Dunlop ch...@onthe.net.au wrote:
OK, that's good (as far is it goes, being a manual process).
So then, back to what I think was Mihály's original
for
unattended operation. Unless you are monitoring the cluster 24/7 you should
have enough disk space available to handle failures.
Related info in:
http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
David Zafman
Senior Developer
http://www.inktank.com
On Nov 15, 2013, at 1:58 AM
David Zafman
Senior Developer
http://www.inktank.com
On Nov 12, 2013, at 3:16 AM, Mihály Árva-Tóth
mihaly.arva-t...@virtual-call-center.eu wrote:
Hello,
I have 3 nodes, with 3 OSDs in each node. I'm using the .rgw.buckets pool with 3
replicas. One of my HDDs (osd.0) has bad sectors; when I try
of drives with higher total throughput).
David Zafman
Senior Developer
http://www.inktank.com
On Oct 16, 2013, at 8:15 PM, Mark Kirkwood mark.kirkw...@catalyst.net.nz
wrote:
I stumbled across this today:
4 osds on 4 hosts (names ceph1 - ceph4). They are KVM guests (this is a play
setup
I believe that the nature of the monitor network traffic should be fine with
10/100 network ports.
David Zafman
Senior Developer
http://www.inktank.com
On Sep 18, 2013, at 1:24 PM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
Hi to all.
Actually I'm building a test cluster
Transferring this back to ceph-users. Sorry, I can't help with rbd issues.
One thing I will say is that if you are mounting an rbd device with a
filesystem on a machine to export ftp, you can't also export the same device
via iSCSI.
David Zafman
Senior Developer
http://www.inktank.com
will
NOT corrupt ext4, but as Josh said, modifying the same file at once via ftp and
nfs isn't going to produce good results. With file locking, 2 nfs clients could
coordinate using advisory locking.
David Zafman
Senior Developer
http://www.inktank.com
replicas.
David Zafman
Senior Developer
http://www.inktank.com
On May 25, 2013, at 12:33 PM, Mike Lowe j.michael.l...@gmail.com wrote:
Does anybody know exactly what ceph repair does? Could you list out briefly
the steps it takes? I unfortunately need to use it for an inconsistent pg
I can't reproduce this on v0.61-2. Could the disks for osd.13 and osd.22 be
unwritable?
In your case it looks like the 3rd replica is probably the bad one, since
osd.13 and osd.22 are the same. You probably want to manually repair the 3rd
replica.
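Assuming the primary copy is one of the two good ones, a repair (which in this
era copies from the primary) should be enough; the pg id here is a placeholder:
$ ceph pg repair 0.6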
David Zafman
Senior Developer
http
According to "osdmap e504: 4 osds: 2 up, 2 in" you have 2 of 4 osds that are
down and out. That may be the issue.
David Zafman
Senior Developer
http://www.inktank.com
On May 8, 2013, at 12:05 AM, James Harper james.har...@bendigoit.com.au wrote:
I've just upgraded my ceph install
I don't believe that there would be a perceptible increase in data usage. The
next release called Cuttlefish is less than a week from release, so you might
wait for that.
Product questions should go to one of our mailing lists, not directly to
developers.
David Zafman
Senior Developer
http
|nodeep-scrub
You might want to turn off both kinds of scrubbing.
ceph osd set noscrub
ceph osd set nodeep-scrub
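Scrubbing can be resumed later by clearing both flags:
ceph osd unset noscrub
ceph osd unset nodeep-scrub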
David Zafman
Senior Developer
http://www.inktank.com
On Apr 16, 2013, at 12:30 AM, kakito tientienminh080...@gmail.com wrote:
Hi Martin B Nielsen,
Thank you for your quick
Try doing something like this first.
rados bench -p data 300 write --no-cleanup
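The --no-cleanup flag leaves the written objects in place, so a subsequent
read bench has something to read, e.g.:
rados bench -p data 300 seq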
David Zafman
Senior Developer
http://www.inktank.com
On Mar 12, 2013, at 1:46 PM, Scott Kinder skin...@yieldex.com wrote:
When I try and do a rados bench, I see the following error:
# rados bench -p
the names of the preceding _object#
rados -p data cleanup benchmark_data_ubuntu_#
rados -p data rm benchmark_last_metadata
David Zafman
Senior Developer
http://www.inktank.com
On Mar 12, 2013, at 2:11 PM, Scott Kinder skin...@yieldex.com wrote:
A follow-up question. How do I cleanup the written
When deep-scrub finds problems it doesn't fix them. Try:
ceph osd repair osd-id
David Zafman
Senior Developer
david.zaf...@inktank.com
On Feb 27, 2013, at 12:43 AM, Jun Jun8 Liu liuj...@lenovo.com wrote:
Hi all
I did a test of deep scrub. The version is ceph version 0.56.2