Re: [ceph-users] Failed to repair pg

2019-03-07 Thread David Zafman
On 3/7/19 9:32 AM, Herbert Alexander Faleiros wrote: On Thu, Mar 07, 2019 at 01:37:55PM -0300, Herbert Alexander Faleiros wrote: Should I do something like this? (below, after stopping osd.36) # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-36/ --journal-path /dev/sdc1
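For anything destructive with ceph-objectstore-tool it is usually safest to export the PG first so its state can be restored; a rough sketch along those lines, reusing the paths from the question, with the PG id and backup file as placeholders:
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-36/ --journal-path /dev/sdc1 \
      --pgid <pgid> --op export --file /root/<pgid>.export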

Re: [ceph-users] backfill_toofull while OSDs are not full

2019-01-30 Thread David Zafman
Strange, I can't reproduce this with v13.2.4.  I tried the following scenarios: pg acting 1, 0, 2 -> up 1, 0, 4 (osd.2 marked out).  The df on osd.2 shows 0 space, but only osd.4 (backfill target) checks full space. pg acting 1, 0, 2 -> up 4,3,5 (osd.1,0,2 all marked out).  The df for
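When checking a backfill_toofull report it can help to compare per-OSD utilisation against the configured ratios; for example, on a Luminous/Mimic-era cluster:
$ ceph osd df tree
$ ceph osd dump | grep ratio     # full_ratio, backfillfull_ratio, nearfull_ratio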

Re: [ceph-users] Understanding/correcting sudden onslaught of unfound objects

2018-03-14 Thread David Zafman
sd/PGBackend.cc be_compare_scrubmaps in luminous, I don't see the changes in the commit here (https://github.com/ceph/ceph/pull/15368/files); of course a lot of other things have changed, but is it possible this fix never made it into luminous? Graham On 02/17/2018 12:48 PM, David Zafman wrote:

Re: [ceph-users] Understanding/correcting sudden onslaught of unfound objects

2018-02-17 Thread David Zafman
hung before due to a bug or if recovery stopped (as designed) because of the unfound object.  The new recovery_unfound and backfill_unfound states indicate that recovery has stopped due to unfound objects. commit 64047e1bac2e775a06423a03cfab69b88462538c Author: David Zafman <d
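To see what a PG considers unfound on a Luminous-era cluster, something like the following (pg id is a placeholder; mark_unfound_lost is destructive and only a last resort):
$ ceph health detail | grep unfound
$ ceph pg <pgid> list_missing
$ ceph pg <pgid> mark_unfound_lost revert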

Re: [ceph-users] ghost degraded objects

2018-01-22 Thread David Zafman
Yes, the pending backport for what we have so far is in https://github.com/ceph/ceph/pull/20055 With these changes, a backfill caused by marking an osd out has the results shown:     health: HEALTH_WARN     115/600 objects misplaced (19.167%) ...   data:     pools:   1 pools, 1

Re: [ceph-users] FAILED assert(p.same_interval_since) and unusable cluster

2017-11-01 Thread David Zafman
Jon,     If you are able please test my tentative fix for this issue which is in https://github.com/ceph/ceph/pull/18673 Thanks David On 10/30/17 1:13 AM, Jon Light wrote: Hello, I have three OSDs that are crashing on start with a FAILED assert(p.same_interval_since) error. I ran

Re: [ceph-users] Osd FAILED assert(p.same_interval_since)

2017-10-16 Thread David Zafman
I don't see same_interval_since being cleared by split. PG::split_into() copies the history from the parent PG to the child. The only code in Luminous that I see that clears it is in ceph_objectstore_tool.cc David On 10/16/17 3:59 PM, Gregory Farnum wrote: On Mon, Oct 16, 2017 at 3:49

Re: [ceph-users] objects degraded higher than 100%

2017-10-13 Thread David Zafman
I improved the code to compute degraded objects during backfill/recovery.  During my testing it wouldn't result in a percentage above 100%.  I'll have to look at the code and verify that some subsequent changes didn't break things. David On 10/13/17 9:55 AM, Florian Haas wrote: Okay, in

Re: [ceph-users] inconsistent pg will not repair

2017-09-26 Thread David Zafman
" ], "errors": [ ], "object": { "version": 3, "snap": "head", "locator": "", "nspace": "", "name": "mytestobject" } } ], "epoch": 103443 } David On 9/26/17 10:55 AM, Grego
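The JSON above is the kind of record returned by the inconsistency listing added in Jewel; it can be fetched with, for example:
$ rados list-inconsistent-pg <pool>
$ rados list-inconsistent-obj <pgid> --format=json-pretty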

Re: [ceph-users] Significant uptick in inconsistent pgs in Jewel 10.2.9

2017-09-08 Thread David Zafman
et-omaphdr obj_header
$ for i in $(ceph-objectstore-tool --data-path ... --pgid 5.3d40 .dir.default.64449186.344176 list-omap)
  do
    echo -n "${i}: "
    ceph-objectstore-tool --data-path ... .dir.default.292886573.13181.12 get-omap $i
  done
key1: val1
key2: val2
key3: val3
David On 9/8/17 12

Re: [ceph-users] Significant uptick in inconsistent pgs in Jewel 10.2.9

2017-09-08 Thread David Zafman
Robin, The only two changesets I can spot in Jewel that I think might be related are these: 1. http://tracker.ceph.com/issues/20089 https://github.com/ceph/ceph/pull/15416 This should improve the repair functionality. 2. http://tracker.ceph.com/issues/19404

Re: [ceph-users] OSD's flapping on ordinary scrub with cluster being static (after upgrade to 12.1.1

2017-08-29 Thread David Zafman
Please file a bug in tracker: http://tracker.ceph.com/projects/ceph When an OSD is marked down, what caused it (e.g. a crash/assert, a heartbeat timeout, or being declared down by another daemon)?  Please include relevant log snippets.  If there is no obvious information, then bump osd debug log levels. Luminous
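A typical way to bump the levels at runtime before reproducing the flap (values here are just common choices, not a recommendation from the thread):
$ ceph tell osd.* injectargs '--debug-osd 20 --debug-ms 1'
$ ceph tell osd.* injectargs '--debug-osd 1/5 --debug-ms 0'   # restore defaults afterwards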

Re: [ceph-users] cephfs metadata damage and scrub error

2017-05-02 Thread David Zafman
James, You have an omap corruption. It is likely caused by a bug which has already been identified. A fix for that problem is available but it is still pending backport for the next Jewel point release. All 4 of your replicas have different "omap_digest" values. Instead of the

Re: [ceph-users] How safe is ceph pg repair these days?

2017-02-21 Thread David Zafman
Farnum Sent: 20 February 2017 22:13 To: Nick Fisk <n...@fisk.me.uk>; David Zafman <dzaf...@redhat.com> Cc: ceph-users <ceph-us...@ceph.com> Subject: Re: [ceph-users] How safe is ceph pg repair these days? On Sat, Feb 18, 2017 at 12:39 AM, Nick Fisk <n...@fisk.me.uk> wrote:

Re: [ceph-users] Listing out the available namespace in the Ceph Cluster

2016-11-23 Thread David Zafman
Hi Janmejay, Sorry, I just found your e-mail in my inbox. There is no command to list namespaces directly, but you can list all objects in all namespaces using the --all option and filter the results. I created 10 namespaces (ns1 - ns10) in addition to the default one. rados -p testpool --all ls
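A sketch of that filtering, assuming the plain-text output of --all ls prefixes each object with its namespace (the exact layout may vary by version):
$ rados -p testpool --all ls | awk -F'\t' '{print $1}' | sort -u     # distinct namespaces
$ rados -p testpool --namespace ns3 ls                               # objects in one namespace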

Re: [ceph-users] OSDs refuse to start, latest osdmap missing

2016-04-15 Thread David Zafman
The ceph-objectstore-tool set-osdmap operation updates existing osdmaps. If a map doesn't already exist the --force option can be used to create it. It appears safe in your case to use that option. David On 4/15/16 9:47 AM, Markus Blank-Burian wrote: Hi, we had a problem on our
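A rough sketch of that repair, with the OSD stopped and the epoch, id and paths as placeholders; it assumes the missing map can be fetched from the monitors:
$ ceph osd getmap <epoch> -o /tmp/osdmap.<epoch>
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
      --op set-osdmap --file /tmp/osdmap.<epoch> --force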

Re: [ceph-users] recorded data digest != on disk

2016-03-23 Thread David Zafman
On 3/23/16 7:45 AM, Gregory Farnum wrote: On Tue, Mar 22, 2016 at 11:59 AM, Max A. Krasilnikov wrote: Hello! On Tue, Mar 22, 2016 at 11:40:39AM -0700, gfarnum wrote: On Tue, Mar 22, 2016 at 1:19 AM, Max A. Krasilnikov wrote: -1> 2016-03-21

Re: [ceph-users] Ceph Recovery Assistance, pgs stuck peering

2016-03-08 Thread David Zafman
Tue, Mar 8, 2016 at 10:39 AM, David Zafman <dzaf...@redhat.com> wrote: Ben, I haven't looked at everything in your message, but pg 12.7a1 has lost data because of writes that went only to osd.73. The way to recover this is to force recovery to ignore this fact and go with whatev

Re: [ceph-users] Ceph Recovery Assistance, pgs stuck peering

2016-03-08 Thread David Zafman
Ben, I haven't looked at everything in your message, but pg 12.7a1 has lost data because of writes that went only to osd.73. The way to recover this is to force recovery to ignore this fact and go with whatever data you have on the remaining OSDs. I assume that having min_size 1, having

Re: [ceph-users] 答复: How long will the logs be kept?

2015-12-07 Thread David Zafman
dout() is used for an OSD to log information about what it is doing locally and might become very chatty. It is saved on the local node's disk only. clog is the cluster log and is used for major events that should be known by the administrator (see ceph -w). Clog should be used sparingly
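In practice the two are read in different places; assuming a default install:
$ ceph -w                                # follow the cluster log (clog)
$ tail -f /var/log/ceph/ceph-osd.3.log   # dout() output, on the node hosting osd.3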

Re: [ceph-users] Core dump when running OSD service

2015-10-22 Thread David Zafman
I was focused on fixing the OSD, but you need to determine if some misconfiguration or hardware issue caused a filesystem corruption. David On 10/22/15 3:08 PM, David Zafman wrote: There is a corruption of the osdmaps on this particular OSD. You need to determine which maps are bad, probably

Re: [ceph-users] Core dump when running OSD service

2015-10-22 Thread David Zafman
There is a corruption of the osdmaps on this particular OSD. You need to determine which maps are bad, probably by bumping the osd debug level to 20. Then transfer them from a working OSD. The newest ceph-objectstore-tool has features to write the maps, but you'll need to build a version
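With a ceph-objectstore-tool new enough to have the get-/set-osdmap operations, copying a map from a healthy OSD would look roughly like this (both OSDs stopped; epoch and ids are placeholders):
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<good> --op get-osdmap --epoch <E> --file /tmp/osdmap.<E>
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<bad> --op set-osdmap --file /tmp/osdmap.<E>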

Re: [ceph-users] CephFS file to rados object mapping

2015-10-21 Thread David Zafman
See below On 10/21/15 2:44 PM, Gregory Farnum wrote: On Wed, Oct 14, 2015 at 7:20 PM, Francois Lafont wrote: Hi, On 14/10/2015 06:45, Gregory Farnum wrote: Ok, however during my tests I had been careful to replace the correct file by a bad file with *exactly* the same

Re: [ceph-users] O_DIRECT on deep-scrub read

2015-10-07 Thread David Zafman
There would be a benefit to doing fadvise POSIX_FADV_DONTNEED after deep-scrub reads for objects not recently accessed by clients. I see the NewStore objectstore sometimes using the O_DIRECT flag for writes. This concerns me because the open(2) man page says: "Applications should avoid

Re: [ceph-users] OSD respawning -- FAILED assert(clone_size.count(clone))

2015-09-08 Thread David Zafman
pe SnapSet import /tmp/snap.out decode dump_json { "snap_context": { "seq": 9197, "snaps": [ 9197 ] }, "head_exists": 1, "clones": [] } On 09/03/2015 04:48 PM, David Zafman wrote: If you have ceph-

Re: [ceph-users] OSD respawning -- FAILED assert(clone_size.count(clone))

2015-09-04 Thread David Zafman
) and create a new one 5. Restore RBD images from backup using new pool (make sure you have disk space as the pool delete removes objects asynchronously) David On 9/3/15 8:15 PM, Chris Taylor wrote: On 09/03/2015 02:44 PM, David Zafman wrote: Chris, WARNING: Do this at your own risk. You

Re: [ceph-users] OSD respawning -- FAILED assert(clone_size.count(clone))

2015-09-03 Thread David Zafman
This crash is what happens if a clone is missing from SnapSet (internal data) for an object in the ObjectStore. If you had out of space issues, this could possibly have been caused by being able to rename or create files in a directory, but not being able to update SnapSet. I've completely

Re: [ceph-users] OSD respawning -- FAILED assert(clone_size.count(clone))

2015-09-03 Thread David Zafman
":0,"pool":3,"namespace":"","max":0}] To remove it, cut and paste your output line with snapid 9197 inside single quotes like this: $ ceph-objectstore-tool --data-path xx --journal-path xx '["3.f9",{"oid":"rb.0.8c2990.238e

Re: [ceph-users] OSD respawning -- FAILED assert(clone_size.count(clone))

2015-09-03 Thread David Zafman
;: 2, "size": 452, "overlap": "[]" }, { "snap": 3, "size": 452, "overlap": "[]" }, { "snap": 4, "siz

Re: [ceph-users] Help with inconsistent pg on EC pool, v9.0.2

2015-08-28 Thread David Zafman
Without my latest branch, which hasn't merged yet, you can't repair an EC pg when the shard with a bad checksum is in the first k chunks. A way to fix it would be to take that osd down/out and let recovery regenerate the chunk. Remove the pg from the osd
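If removing only the affected PG copy from that OSD (rather than taking the whole OSD out), the per-PG variant would look roughly like this with the OSD stopped; always export before removing, and newer releases may also want --force on the remove:
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op export --file /root/<pgid>.export
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op remove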

Re: [ceph-users] Help with inconsistent pg on EC pool, v9.0.2

2015-08-28 Thread David Zafman
don't do something silly and shoot myself in the foot. Thanks! -Aaron On Fri, Aug 28, 2015 at 12:16 PM, David Zafman dzaf...@redhat.com wrote: I don't know about removing the OSD from the CRUSH map. That seems like overkill to me. I just realized a possible better way. It would have been to take

Re: [ceph-users] Help with inconsistent pg on EC pool, v9.0.2

2015-08-28 Thread David Zafman
from the CRUSH map, ceph osd rm 21, then recreating it from scratch as though I'd lost a disk? -Aaron On Fri, Aug 28, 2015 at 11:17 AM, David Zafman dzaf...@redhat.com wrote: Without my latest branch which hasn't merged yet, you can't repair an EC pg in the situation that the shard with a bad

Re: [ceph-users] Ceph Giant not fixed RepllicatedPG:NotStrimming?

2014-11-04 Thread David Zafman
Can you upload the entire log file? David On Nov 4, 2014, at 1:03 AM, Ta Ba Tuan tua...@vccloud.vn wrote: Hi Sam, I resend logs with debug options http://123.30.41.138/ceph-osd.21.log http://123.30.41.138/ceph-osd.21.log (Sorry about my spam :D) I saw many missing objects :|

Re: [ceph-users] Performance is really bad when I run from vstart.sh

2014-07-02 Thread David Zafman
By default the vstart.sh setup would put all data below a directory called “dev” in the source tree. In that case you’re using a single spindle. The vstart script isn’t intended for performance testing. David Zafman Senior Developer http://www.inktank.com http://www.redhat.com On Jul 2

Re: [ceph-users] Troubles with a fireflay test installation

2014-06-25 Thread David Zafman
Create a 3rd OSD. The default pool size is 3 replicas, including the initial system-created pools. David Zafman Senior Developer http://www.inktank.com http://www.redhat.com On Jun 25, 2014, at 3:04 AM, Iban Cabrillo cabri...@ifca.unican.es wrote: Dear, I am trying to deploy a new test
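The replication factor can be checked or changed per pool, for example:
$ ceph osd pool get <pool> size
$ ceph osd pool set <pool> size 2    # alternative to adding a 3rd OSD on a throwaway test cluster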

Re: [ceph-users] Deep scrub versus osd scrub load threshold

2014-06-24 Thread David Zafman
because it is more than 7 days since the last deep scrub on Jan 1. See also http://tracker.ceph.com/issues/6735 There may be a need for more documentation clarification in this area or a change to the behavior. David Zafman Senior Developer http://www.inktank.com http://www.redhat.com On Jun

Re: [ceph-users] Deep scrub versus osd scrub load threshold

2014-06-23 Thread David Zafman
that osd_scrub_min_interval = osd_scrub_max_interval = osd_deep_scrub_interval. I’d like to know how you have those 3 values set, so I can confirm that this explains the issue. David Zafman Senior Developer http://www.inktank.com http://www.redhat.com On Jun 23, 2014, at 7:01 PM, Christian Balzer ch
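Those three values can be read from a running OSD's admin socket, e.g.:
$ ceph daemon osd.0 config show | egrep 'osd_scrub_(min|max)_interval|osd_deep_scrub_interval'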

Re: [ceph-users] What exactly is the kernel rbd on osd issue?

2014-06-12 Thread David Zafman
the point of view of the host kernel, this won’t happen. David Zafman Senior Developer http://www.inktank.com http://www.redhat.com On Jun 12, 2014, at 6:33 PM, lists+c...@deksai.com wrote: I remember reading somewhere that the kernel ceph clients (rbd/fs) could not run on the same host

Re: [ceph-users] PG Selection Criteria for Deep-Scrub

2014-06-11 Thread David Zafman
The code checks the pg with the oldest scrub_stamp/deep_scrub_stamp to see whether the osd_scrub_min_interval/osd_deep_scrub_interval time has elapsed. So the output you are showing with the very old scrub stamps shouldn’t happen under default settings. As soon as deep-scrub is re-enabled,
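One way to list PGs by deep-scrub age is to pull the stamps from the JSON pg dump (field names and nesting vary somewhat across versions):
$ ceph pg dump --format=json 2>/dev/null | jq -r '.pg_stats[] | "\(.pgid) \(.last_deep_scrub_stamp)"' | sort -k2 | head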

Re: [ceph-users] Problem with ceph_filestore_dump, possibly stuck in a loop

2014-05-20 Thread David Zafman
It isn’t clear to me what could cause a loop there. To rule out filesystem corruption, please run a “find” or “ls -R” on the filestore root directory and check that it completes. Can you send the log you generated? Also, what version of Ceph are you running? David

Re: [ceph-users] osd_recovery_max_single_start

2014-04-28 Thread David Zafman
to spread operations across more or less PGs at any given time. David Zafman Senior Developer http://www.inktank.com On Apr 24, 2014, at 8:09 AM, Chad Seys cws...@physics.wisc.edu wrote: Hi All, What does osd_recovery_max_single_start do? I could not find a description

Re: [ceph-users] osd_recovery_max_single_start

2014-04-24 Thread David Zafman
more or less PGs at any given time. David Zafman Senior Developer http://www.inktank.com On Apr 24, 2014, at 8:09 AM, Chad Seys cws...@physics.wisc.edu wrote: Hi All, What does osd_recovery_max_single_start do? I could not find a description of it. Thanks! Chad

Re: [ceph-users] Inconsistent pgs after update to 0.73 - 0.74

2014-01-09 Thread David Zafman
and it was detected after 2013-12-13 15:38:13.283741 which was the last clean scrub. David Zafman Senior Developer http://www.inktank.com On Jan 9, 2014, at 6:36 PM, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: I've noticed this on 2 (development) clusters that I have with pools having

Re: [ceph-users] repair incosistent pg using emperor

2014-01-06 Thread David Zafman
Did the inconsistent flag eventually get cleared? It might be that you didn’t wait long enough for the repair to get through the pg. David Zafman Senior Developer http://www.inktank.com On Dec 28, 2013, at 12:29 PM, Corin Langosch corin.lango...@netskin.com wrote: Hi Sage, Am
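Repair is queued and can take a while on a big PG; it can be watched rather than assumed, e.g.:
$ ceph pg repair <pgid>
$ ceph -w | grep -i repair          # wait for the "repair ok" line
$ ceph health detail | grep -i inconsistent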

Re: [ceph-users] HDD bad sector, pg inconsistent, no object remapping

2013-11-18 Thread David Zafman
of the replicas. David On Nov 17, 2013, at 10:46 PM, Chris Dunlop ch...@onthe.net.au wrote: Hi David, On Fri, Nov 15, 2013 at 10:00:37AM -0800, David Zafman wrote: Replication does not occur until the OSD is “out.” This creates a new mapping in the cluster of where the PGs should
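The down-to-out transition is what actually triggers re-replication; it can be forced by hand, or its delay inspected (admin-socket query shown as on more recent releases):
$ ceph osd out 0
$ ceph daemon mon.<id> config get mon_osd_down_out_interval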

Re: [ceph-users] HDD bad sector, pg inconsistent, no object remapping

2013-11-18 Thread David Zafman
if the administrator can determine which copy(s) are bad. David Zafman Senior Developer http://www.inktank.com On Nov 18, 2013, at 1:11 PM, Chris Dunlop ch...@onthe.net.au wrote: OK, that's good (as far is it goes, being a manual process). So then, back to what I think was Mihály's original

Re: [ceph-users] HDD bad sector, pg inconsistent, no object remapping

2013-11-15 Thread David Zafman
for unattended operation. Unless you are monitoring the cluster 24/7 you should have enough disk space available to handle failures. Related info in: http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/ David Zafman Senior Developer http://www.inktank.com On Nov 15, 2013, at 1:58 AM

Re: [ceph-users] HDD bad sector, pg inconsistent, no object remapping

2013-11-12 Thread David Zafman
David Zafman Senior Developer http://www.inktank.com On Nov 12, 2013, at 3:16 AM, Mihály Árva-Tóth mihaly.arva-t...@virtual-call-center.eu wrote: Hello, I have 3 node, with 3 OSD in each node. I'm using .rgw.buckets pool with 3 replica. One of my HDD (osd.0) has just bad sectors, when I try

Re: [ceph-users] Very unbalanced osd data placement with differing sized devices

2013-10-16 Thread David Zafman
of drives with higher total throughput). David Zafman Senior Developer http://www.inktank.com On Oct 16, 2013, at 8:15 PM, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: I stumbled across this today: 4 osds on 4 hosts (names ceph1 - ceph4). They are KVM guests (this is a play setup

Re: [ceph-users] 10/100 network for Mons?

2013-09-19 Thread David Zafman
I believe that the nature of the monitor network traffic should be fine with 10/100 network ports. David Zafman Senior Developer http://www.inktank.com On Sep 18, 2013, at 1:24 PM, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: Hi to all. Actually I'm building a test cluster

Re: [ceph-users] rbd map issues: no such file or directory (ENOENT) AND map wrong image

2013-08-19 Thread David Zafman
Transferring this back to ceph-users. Sorry, I can't help with rbd issues. One thing I will say is that if you are mounting an rbd device with a filesystem on a machine to export ftp, you can't also export the same device via iSCSI. David Zafman Senior Developer http://www.inktank.com

Re: [ceph-users] rbd map issues: no such file or directory (ENOENT) AND map wrong image

2013-08-13 Thread David Zafman
will NOT corrupt ext4, but as Josh said, modifying the same file via ftp and nfs at the same time isn't going to produce good results. With file locking, 2 nfs clients could coordinate using advisory locking. David Zafman Senior Developer http://www.inktank.com ___ ceph

Re: [ceph-users] ceph repair details

2013-06-06 Thread David Zafman
replicas. David Zafman Senior Developer http://www.inktank.com On May 25, 2013, at 12:33 PM, Mike Lowe j.michael.l...@gmail.com wrote: Does anybody know exactly what ceph repair does? Could you list out briefly the steps it takes? I unfortunately need to use it for an inconsistent pg

Re: [ceph-users] Inconsistent PG's, repair ineffective

2013-05-21 Thread David Zafman
I can't reproduce this on v0.61-2. Could the disks for osd.13 or osd.22 be unwritable? In your case it looks like the 3rd replica is probably the bad one, since osd.13 and osd.22 are the same. You probably want to manually repair the 3rd replica. David Zafman Senior Developer http

Re: [ceph-users] HEALTH_WARN after upgrade to cuttlefish

2013-05-08 Thread David Zafman
According to osdmap e504: 4 osds: 2 up, 2 in you have 2 of 4 osds that are down and out. That may be the issue. David Zafman Senior Developer http://www.inktank.com On May 8, 2013, at 12:05 AM, James Harper james.har...@bendigoit.com.au wrote: I've just upgraded my ceph install

Re: [ceph-users] Help: Ceph upgrade.

2013-04-25 Thread David Zafman
I don't believe that there would be a perceptible increase in data usage. The next release, called Cuttlefish, is less than a week away, so you might wait for that. Product questions should go to one of our mailing lists, not directly to developers. David Zafman Senior Developer http

Re: [ceph-users] Ceph error: active+clean+scrubbing+deep

2013-04-24 Thread David Zafman
|nodeep-scrub You might want to turn off both kinds of scrubbing. ceph osd set noscrub ceph osd set nodeep-scrub David Zafman Senior Developer http://www.inktank.com On Apr 16, 2013, at 12:30 AM, kakito tientienminh080...@gmail.com wrote: Hi Martin B Nielsen, Thank you for your quick
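And to re-enable scrubbing once things have settled:
$ ceph osd unset noscrub
$ ceph osd unset nodeep-scrub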

Re: [ceph-users] Ceph Read Benchmark

2013-03-12 Thread David Zafman
Try doing something like this first. rados bench -p data 300 write --no-cleanup David Zafman Senior Developer http://www.inktank.com On Mar 12, 2013, at 1:46 PM, Scott Kinder skin...@yieldex.com wrote: When I try and do a rados bench, I see the following error: # rados bench -p
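The usual sequence is a write pass kept on disk, then the read benchmark, then cleanup; roughly (current releases accept a bare cleanup, older ones take the benchmark prefix as shown in the follow-up below):
$ rados bench -p data 300 write --no-cleanup
$ rados bench -p data 300 seq
$ rados -p data cleanup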

Re: [ceph-users] Ceph Read Benchmark

2013-03-12 Thread David Zafman
the names of the preceding _object# rados -p data cleanup benchmark_data_ubuntu_# rados -p data rm benchmark_last_metadata David Zafman Senior Developer http://www.inktank.com On Mar 12, 2013, at 2:11 PM, Scott Kinder skin...@yieldex.com wrote: A follow-up question. How do I cleanup the written

Re: [ceph-users] deep scrub

2013-02-27 Thread David Zafman
Deep-scrub finds problems; it doesn't fix them. Try: ceph osd repair osd-id David Zafman Senior Developer david.zaf...@inktank.com On Feb 27, 2013, at 12:43 AM, Jun Jun8 Liu liuj...@lenovo.com wrote: Hi all I did a test about deep scrub. Version is ceph version 0.56.2