Re: [ceph-users] monitor ghosted

2020-01-08 Thread Brad Hubbard
On Thu, Jan 9, 2020 at 5:48 AM Peter Eisch 
wrote:

> Hi,
>
> This morning one of my three monitor hosts got booted from the Nautilus
> 14.2.4 cluster and it won’t rejoin. There haven’t been any changes or
> events at this site at all. The conf file is unchanged and the same
> as on the other two monitors. The host is also running the MDS and MGR daemons
> without any issue. The ceph-mon log shows this repeating:
>
> 2020-01-08 13:33:29.403 7fec1a736700 1 mon.cephmon02@1(probing) e7
> handle_auth_request failed to assign global_id
> 2020-01-08 13:33:29.433 7fec1a736700 1 mon.cephmon02@1(probing) e7
> handle_auth_request failed to assign global_id
> 2020-01-08 13:33:29.541 7fec1a736700 1 mon.cephmon02@1(probing) e7
> handle_auth_request failed to assign global_id
> ...
>

Try gathering a log with debug_mon 20. That should provide more detail
about why  AuthMonitor::_assign_global_id() didn't return an ID.
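For example, on the affected host (a sketch; this assumes the default admin
socket and log locations, so adjust if yours differ):

$ ceph daemon mon.cephmon02 config set debug_mon 20/20   # raise the mon debug level
  ... let the handle_auth_request messages repeat for a minute or two ...
$ ceph daemon mon.cephmon02 config set debug_mon 1/5     # back to the usual default

The extra detail will land in /var/log/ceph/ceph-mon.cephmon02.log on that host.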


> There is nothing in the logs of the two remaining/healthy monitors. What
> is my best practice to get this host back in the cluster?
>
> peter
>
> Peter Eisch
> Senior Site Reliability Engineer
>


-- 
Cheers,
Brad


Re: [ceph-users] ceph-objectstore-tool crash when trying to recover pg from OSD

2019-11-07 Thread Brad Hubbard
I'd suggest you open a tracker under the Bluestore component so
someone can take a look. I'd also suggest you include a log with
'debug_bluestore=20' added to the COT command line.
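Something along these lines, based on the command you posted (a sketch; it
assumes COT accepts the usual ceph debug overrides on its command line):

$ ceph-objectstore-tool --debug --data-path /var/lib/ceph/osd/ceph-22 \
      --skip-journal-replay --skip-mount-omap --op info --pgid 2.9f \
      --debug_bluestore=20 2> cot-bluestore-debug.log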

On Thu, Nov 7, 2019 at 6:56 PM Eugene de Beste  wrote:
>
> Hi, does anyone have any feedback for me regarding this?
>
> Here's the log I get when trying to restart the OSD via systemctl: 
> https://pastebin.com/tshuqsLP
> On Mon, 4 Nov 2019 at 12:42, Eugene de Beste  wrote:
>
> Hi everyone
>
> I have a cluster that was initially set up with bad defaults in Luminous. 
> After upgrading to Nautilus I've had a few OSDs crash on me, due to errors 
> seemingly related to https://tracker.ceph.com/issues/42223 and 
> https://tracker.ceph.com/issues/22678.
>
> One of my pools has been running with min_size 1 (yes, I know) and I am now
> stuck with incomplete pgs due to the aforementioned OSD crashes.
>
> When trying to use ceph-objectstore-tool to get the pgs out of the OSD,
> I'm running into the same issue as when trying to start the OSD: it
> crashes. ceph-objectstore-tool core dumps and I can't retrieve the pg.
>
> Does anyone have any input on this? I would like to be able to retrieve that 
> data if possible.
>
> Here's the log for ceph-objectstore-tool --debug --data-path 
> /var/lib/ceph/osd/ceph-22 --skip-journal-replay --skip-mount-omap --op info 
> --pgid 2.9f  -- https://pastebin.com/9aGtAfSv
>
> Regards and thanks,
> Eugene
>



-- 
Cheers,
Brad



Re: [ceph-users] Inconsistents + FAILED assert(recovery_info.oi.legacy_snaps.size())

2019-10-30 Thread Brad Hubbard
ta_size: 0, omap_header_size: 0, 
> omap_entries_size: 0, attrset_size: 1, recovery_info: 
> ObjectRecoveryInfo(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926@127481'7241006,
>  size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:[]), 
> after_progress: ObjectRecoveryProgress(!first, data_recovered_to:0, 
> data_complete:true, omap_recovered_to:, omap_complete:true, error:false), 
> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, 
> data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) 
> v3
> -1> 2019-10-30 12:52:31.998899 7fce6189b700  1 -- 
> 129.20.177.4:6810/2125155 <== osd.25 129.20.177.3:6808/810999 24804  
> MOSDPGPush(2.1d9 194334/194298 
> [PushOp(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926, version: 
> 127481'7241006, data_included: [], data_size: 0, omap_header_size: 0, 
> omap_entries_size: 0, attrset_size: 1, recovery_info: 
> ObjectRecoveryInfo(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926@127481'7241006,
>  size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:[]), 
> after_progress: ObjectRecoveryProgress(!first, data_recovered_to:0, 
> data_complete:true, omap_recovered_to:, omap_complete:true, error:false), 
> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, 
> data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) 
> v3  925+0+0 (542499734 0 0) 0x5648d708eac0 con 0x5648d74a0800
>  0> 2019-10-30 12:52:32.003339 7fce4803e700 -1 
> /build/ceph-12.2.12/src/osd/PrimaryLogPG.cc: In function 'virtual void 
> PrimaryLogPG::on_local_recover(const hobject_t&, const ObjectRecoveryInfo&, 
> ObjectContextRef, bool, ObjectStore::Transaction*)' thread 7fce4803e700 time 
> 2019-10-30 12:52:31.999086
> /build/ceph-12.2.12/src/osd/PrimaryLogPG.cc: 354: FAILED 
> assert(recovery_info.oi.legacy_snaps.size())
>
>
> More informations about this object (0c16b76b8b4567.0001426e)
> from pg 2.1d9 :
> ceph osd map rbd rbd_data.0c16b76b8b4567.0001426e
> osdmap e194356 pool 'rbd' (2) object 
> 'rbd_data.0c16b76b8b4567.0001426e' -> pg 2.181de9d9 (2.1d9) -> up 
> ([27,30,38], p27) acting ([30,25], p30)
>
> I also checked the logs of all the OSDs that have already gone down and
> found the same logs about this object:
> * osd.4, last time : 2019-10-10 16:15:20
> * osd.32, last time : 2019-10-14 01:54:56
> * osd.33, last time : 2019-10-11 06:24:01
> * osd.34, last time : 2019-10-18 06:24:26
> * osd.20, last time : 2019-10-27 18:12:31
> * osd.28, last time : 2019-10-28 12:57:47
>
> No matter whether the data comes from osd.25 or osd.30, I get the same
> error. It seems this PG/object tries to recover to a healthy state but
> shuts down my OSDs one by one…
>
> Thus spake Jérémy Gardais (jeremy.gard...@univ-rennes1.fr) on mercredi 30 
> octobre 2019 à 11:09:36:
> > Thus spake Brad Hubbard (bhubb...@redhat.com) on mercredi 30 octobre 2019 à 
> > 12:50:50:
> > > You should probably try and work out what caused the issue and take
> > > steps to minimise the likelihood of a recurrence. This is not expected
> > > behaviour in a correctly configured and stable environment.
>
> This PG 2.1d9 is "only" marked as : 
> "active+undersized+degraded+remapped+backfill_wait", not inconsistent…
>
> Everything i got from PG 2.1d9 (query, list_missing,
> ceph-objectstore-tool list,…) is available here :
> https://cloud.ipr.univ-rennes1.fr/index.php/s/BYtuAURnC7YOAQG?path=%2Fpg.2.1d9
> But nothing looks suspicious to me…
>
> I also separated the logs from the last error on osd.27 and its
> reboot ("only" ~22k lines ^^):
> https://cloud.ipr.univ-rennes1.fr/index.php/s/BYtuAURnC7YOAQG/download?path=%2F=ceph-osd.27.log.last.error.txt
>
> Does anybody understand these logs or do I have to live with this damned
> object? ^^
>
> --
> Gardais Jérémy
> Institut de Physique de Rennes
> Université Rennes 1
> Téléphone: 02-23-23-68-60
> Mail & bonnes pratiques: http://fr.wikipedia.org/wiki/Nétiquette
> ---



-- 
Cheers,
Brad



Re: [ceph-users] Inconsistents + FAILED assert(recovery_info.oi.legacy_snaps.size())

2019-10-29 Thread Brad Hubbard
On Tue, Oct 29, 2019 at 9:09 PM Jérémy Gardais
 wrote:
>
> Thus spake Brad Hubbard (bhubb...@redhat.com) on mardi 29 octobre 2019 à 
> 08:20:31:
> > Yes, try and get the pgs healthy, then you can just re-provision the down 
> > OSDs.
> >
> > Run a scrub on each of these pgs and then use the commands on the
> > following page to find out more information for each case.
> >
> > https://docs.ceph.com/docs/luminous/rados/troubleshooting/troubleshooting-pg/
> >
> > Focus on the commands 'list-missing', 'list-inconsistent-obj', and
> > 'list-inconsistent-snapset'.
> >
> > Let us know if you get stuck.
> >
> > P.S. There are several threads about these sorts of issues in this
> > mailing list that should turn up when doing a web search.
>
> I found this thread :
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg53116.html

That looks like the same issue.

>
> And I have started to gather additional information to solve PG 2.2ba:
> 1. rados list-inconsistent-snapset 2.2ba --format=json-pretty
> {
> "epoch": 192223,
> "inconsistents": [
> {
> "name": "rbd_data.b4537a2ae8944a.425f",
> "nspace": "",
> "locator": "",
> "snap": 22772,
> "errors": [
> "headless"
> ]
> },
> {
> "name": "rbd_data.b4537a2ae8944a.425f",
> "nspace": "",
> "locator": "",
> "snap": "head",
> "snapset": {
> "snap_context": {
> "seq": 22806,
> "snaps": [
> 22805,
> 22804,
> 22674,
> 22619,
> 20536,
> 17248,
> 14270
> ]
> },
> "head_exists": 1,
> "clones": [
> {
> "snap": 17248,
> "size": 4194304,
> "overlap": "[0~2269184,2277376~1916928]",
> "snaps": [
> 17248
> ]
> },
> {
> "snap": 20536,
> "size": 4194304,
> "overlap": "[0~2269184,2277376~1916928]",
> "snaps": [
> 20536
> ]
> },
> {
> "snap": 22625,
> "size": 4194304,
> "overlap": "[0~2269184,2277376~1916928]",
> "snaps": [
> 22619
> ]
> },
> {
> "snap": 22674,
> "size": 4194304,
> "overlap": "[266240~4096]",
> "snaps": [
> 22674
> ]
> },
> {
> "snap": 22805,
> "size": 4194304,
> "overlap": 
> "[0~942080,958464~901120,1875968~16384,1908736~360448,2285568~1908736]",
> "snaps": [
> 22805,
> 22804
> ]
> }
> ]
> },
> "errors": [
> "extra_clones"
> ],
> "extra clones": [
> 22772
> ]
> }
> ]
> }
>
> 2.a ceph-objectstore-tool from osd.29 and osd.42 :
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-29/ --pgid 2.2ba 
> --op list rbd_data.b4537a2ae8944a.425f
> ["2.2ba",{"oid":"rbd_data.b4537a2ae8944a.425f","key":"","snapid":17248,"hash":71960

Re: [ceph-users] Inconsistents + FAILED assert(recovery_info.oi.legacy_snaps.size())

2019-10-28 Thread Brad Hubbard
Yes, try and get the pgs healthy, then you can just re-provision the down OSDs.

Run a scrub on each of these pgs and then use the commands on the
following page to find out more information for each case.

https://docs.ceph.com/docs/luminous/rados/troubleshooting/troubleshooting-pg/

Focus on the commands 'list-missing', 'list-inconsistent-obj', and
'list-inconsistent-snapset'.
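For one pg that would look something like this (a sketch; substitute the real
pgid, command names as per the page linked above):

$ ceph pg deep-scrub <pgid>
# once the deep-scrub has completed:
$ rados list-inconsistent-obj <pgid> --format=json-pretty
$ rados list-inconsistent-snapset <pgid> --format=json-pretty
$ ceph pg <pgid> list_missing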

Let us know if you get stuck.

P.S. There are several threads about these sorts of issues in this
mailing list that should turn up when doing a web search.

On Tue, Oct 29, 2019 at 5:06 AM Jérémy Gardais
 wrote:
>
> Hello,
>
> For several weeks, I have had some OSDs flapping before being kicked out of the
> cluster by Ceph…
> I was hoping for some Ceph magic and just gave it some time to auto-heal
> (so I could do all the side work…) but it was a bad idea (what a
> surprise :D). I also got some inconsistent PGs, but I was waiting for a quiet,
> healthy cluster before trying to fix them.
>
> Now that I have more time, I also have 6 OSDs down+out on my 5-node
> cluster and 1~2 OSDs still flapping from time to time, and I am asking myself
> whether these PGs might be the (one?) source of my problem.
>
> The last OSD error on osd.28 gave these logs :
> -2> 2019-10-28 12:57:47.346460 7fefbdc4d700  5 -- 129.20.177.2:6811/47803 
> >> 129.20.177.3:6808/4141402 conn(0x55de8211a000 :-1 
> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2058 cs=1 l=0). rx osd.25 
> seq 169 0x55dea57b3600 MOSDPGPush(2.1d9 191810/191810 
> [PushOp(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926, version: 
> 127481'7241006, data_included: [], data_size: 0, omap_header_size: 0, 
> omap_entries_size: 0, attrset_size: 1, recovery_info: 
> ObjectRecoveryInfo(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926@127481'7241006,
>  size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:[]), 
> after_progress: ObjectRecoveryProgress(!first, data_recovered_to:0, 
> data_complete:true, omap_recovered_to:, omap_complete:true, error:false), 
> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, 
> data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) 
> v3
> -1> 2019-10-28 12:57:47.346517 7fefbdc4d700  1 -- 129.20.177.2:6811/47803 
> <== osd.25 129.20.177.3:6808/4141402 169  MOSDPGPush(2.1d9 191810/191810 
> [PushOp(2:9b97b818:::rbd_data.0c16b76b8b4567.0001426e:5926, version: 
> 127481'7241006, data_included: [], data_size: 0, omap_header_size: 0, 
> omap_entries_size: 0, attrset_size: 1, recovery_info: 
> ObjectRecoveryInfo(2:9b97b818:::rbd_data.c16b76b8b4567.0001426e:5926@127481'7241006,
>  size: 4194304, copy_subset: [], clone_subset: {}, snapset: 0=[]:[]), 
> after_progress: ObjectRecoveryProgress(!first, data_recovered_to:0, 
> data_complete:true, omap_recovered_to:, omap_complete:true, error:false), 
> before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, 
> data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) 
> v3  909+0+0 (1239474936 0 0) 0x55dea57b3600 con 0x55de8211a000
>  0> 2019-10-28 12:57:47.353680 7fef99441700 -1 
> /build/ceph-12.2.12/src/osd/PrimaryLogPG.cc: In function 'virtual void 
> PrimaryLogPG::on_local_recover(const hobject_t&, const ObjectRecoveryInfo&, 
> ObjectContextRef, bool, ObjectStore::Transaction*)' thread 7fef99441700 time 
> 2019-10-28 12:57:47.347132
> /build/ceph-12.2.12/src/osd/PrimaryLogPG.cc: 354: FAILED 
> assert(recovery_info.oi.legacy_snaps.size())
>
>  ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous 
> (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
> const*)+0x102) [0x55de72039f32]
>  2: (PrimaryLogPG::on_local_recover(hobject_t const&, ObjectRecoveryInfo 
> const&, std::shared_ptr, bool, 
> ObjectStore::Transaction*)+0x135b) [0x55de71be330b]
>  3: (ReplicatedBackend::handle_push(pg_shard_t, PushOp const&, PushReplyOp*, 
> ObjectStore::Transaction*)+0x31d) [0x55de71d4fadd]
>  4: (ReplicatedBackend::_do_push(boost::intrusive_ptr)+0x18f) 
> [0x55de71d4fd7f]
>  5: 
> (ReplicatedBackend::_handle_message(boost::intrusive_ptr)+0x2d1) 
> [0x55de71d5ff11]
>  6: (PGBackend::handle_message(boost::intrusive_ptr)+0x50) 
> [0x55de71c7d030]
>  7: (PrimaryLogPG::do_request(boost::intrusive_ptr&, 
> ThreadPool::TPHandle&)+0x5f1) [0x55de71be87b1]
>  8: (OSD::dequeue_op(boost::intrusive_ptr, 
> boost::intrusive_ptr, ThreadPool::TPHandle&)+0x3f7) 
> [0x55de71a63e97]
>  9: (PGQueueable::RunVis::operator()(boost::intrusive_ptr 
> const&)+0x57) [0x55de71cf5077]
>  10: (OSD::ShardedOpWQ::_process(unsigned int, 
> ceph::heartbeat_handle_d*)+0x108c) [0x55de71a94e1c]
>  11: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x88d) 
> [0x55de7203fbbd]
>  12: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55de72041b80]
>  13: (()+0x8064) [0x7fefc12b5064]
>  14: (clone()+0x6d) [0x7fefc03a962d]
>  NOTE: a copy of the executable, or `objdump -rdS ` is needed to 
> 

Re: [ceph-users] lot of inconsistent+failed_repair - failed to pick suitable auth object (14.2.3)

2019-10-10 Thread Brad Hubbard
On Fri, Oct 11, 2019 at 12:27 AM Kenneth Waegeman
 wrote:
>
> Hi Brad, all,
>
> Pool 6 has min_size 2:
>
> pool 6 'metadata' replicated size 3 min_size 2 crush_rule 1 object_hash
> rjenkins pg_num 1024 pgp_num 1024 autoscale_mode warn last_change 172476
> flags hashpspool stripe_width 0 application cephfs

This looked like something min_size 1 could cause, but I guess that's
not the cause here.

> so the inconsistents output is empty, which is weird, no?

Try scrubbing the pg just before running the command.
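For example, using one of the pgs from your log:

$ ceph pg deep-scrub 6.327
# wait for the deep-scrub to finish, then:
$ rados list-inconsistent-obj 6.327 --format=json-pretty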

>
> Thanks again!
>
> K
>
>
> On 10/10/2019 12:52, Brad Hubbard wrote:
> > Does pool 6 have min_size = 1 set?
> >
> > https://tracker.ceph.com/issues/24994#note-5 would possibly be helpful
> > here, depending on what the output of the following command looks
> > like.
> >
> > # rados list-inconsistent-obj [pgid] --format=json-pretty
> >
> > On Thu, Oct 10, 2019 at 8:16 PM Kenneth Waegeman
> >  wrote:
> >> Hi all,
> >>
> >> After some node failure and rebalancing, we have a lot of pg's in
> >> inconsistent state. I tried to repair, but it didn't work. This is also
> >> in the logs:
> >>
> >>> 2019-10-10 11:23:27.221 7ff54c9b0700  0 log_channel(cluster) log [DBG]
> >>> : 6.327 repair starts
> >>> 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 19 soid 6:e4c130fd:::20005f3b582.:head :
> >>> omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> >>> 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> >>> a8c4ce76 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 72 soid 6:e4c130fd:::20005f3b582.:head :
> >>> omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> >>> 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> >>> a8c4ce76 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 91 soid 6:e4c130fd:::20005f3b582.:head :
> >>> omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> >>> 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> >>> a8c4ce76 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 soid 6:e4c130fd:::20005f3b582.:head : failed to pick
> >>> suitable auth object
> >>> 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 19 soid 6:e4c2e57b:::20005f11daa.:head :
> >>> omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> >>> 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> >>> 56dd55a2 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 72 soid 6:e4c2e57b:::20005f11daa.:head :
> >>> omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> >>> 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> >>> 56dd55a2 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 91 soid 6:e4c2e57b:::20005f11daa.:head :
> >>> omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> >>> 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> >>> dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> >>> 56dd55a2 alloc_hint [0 0 0])
> >>> 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 soid 6:e4c2e57b:::20005f11daa.:head : failed to pick
> >>> suitable auth object
> >>> 2019-10-10 11:23:27.971 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> >>> : 6.327 shard 19 soid 6:e4c40009:::20005f45f1b.:head :
> >>> omap_digest 0x7ccf5cc9 != omap_digest 0xe048d29 from auth oi
> >>> 6:e4c40009:::20005f45f1b.:head(203789'1033837 osd.3.0:3666949
> >>> dirty|omap|data

Re: [ceph-users] lot of inconsistent+failed_repair - failed to pick suitable auth object (14.2.3)

2019-10-10 Thread Brad Hubbard
Does pool 6 have min_size = 1 set?
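You can check with something like this (pool id 6 taken from your log):

# ceph osd pool ls detail | grep "^pool 6 "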

https://tracker.ceph.com/issues/24994#note-5 would possibly be helpful
here, depending on what the output of the following command looks
like.

# rados list-inconsistent-obj [pgid] --format=json-pretty

On Thu, Oct 10, 2019 at 8:16 PM Kenneth Waegeman
 wrote:
>
> Hi all,
>
> After some node failure and rebalancing, we have a lot of pg's in
> inconsistent state. I tried to repair, but it didn't work. This is also
> in the logs:
>
> > 2019-10-10 11:23:27.221 7ff54c9b0700  0 log_channel(cluster) log [DBG]
> > : 6.327 repair starts
> > 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 19 soid 6:e4c130fd:::20005f3b582.:head :
> > omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> > 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> > dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> > a8c4ce76 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 72 soid 6:e4c130fd:::20005f3b582.:head :
> > omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> > 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> > dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> > a8c4ce76 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 91 soid 6:e4c130fd:::20005f3b582.:head :
> > omap_digest 0x334f57be != omap_digest 0xa8c4ce76 from auth oi
> > 6:e4c130fd:::20005f3b582.:head(203789'1033530 osd.3.0:342
> > dirty|omap|data_digest|omap_digest s 0 uv 1032164 dd  od
> > a8c4ce76 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.431 7ff5509b8700 -1 log_channel(cluster) log [ERR]
> > : 6.327 soid 6:e4c130fd:::20005f3b582.:head : failed to pick
> > suitable auth object
> > 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 19 soid 6:e4c2e57b:::20005f11daa.:head :
> > omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> > 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> > dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> > 56dd55a2 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 72 soid 6:e4c2e57b:::20005f11daa.:head :
> > omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> > 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> > dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> > 56dd55a2 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 91 soid 6:e4c2e57b:::20005f11daa.:head :
> > omap_digest 0x6aafaf97 != omap_digest 0x56dd55a2 from auth oi
> > 6:e4c2e57b:::20005f11daa.:head(203789'1033711 osd.3.0:3666823
> > dirty|omap|data_digest|omap_digest s 0 uv 1032158 dd  od
> > 56dd55a2 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.731 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 soid 6:e4c2e57b:::20005f11daa.:head : failed to pick
> > suitable auth object
> > 2019-10-10 11:23:27.971 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 19 soid 6:e4c40009:::20005f45f1b.:head :
> > omap_digest 0x7ccf5cc9 != omap_digest 0xe048d29 from auth oi
> > 6:e4c40009:::20005f45f1b.:head(203789'1033837 osd.3.0:3666949
> > dirty|omap|data_digest|omap_digest s 0 uv 1032168 dd  od
> > e048d29 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.971 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 72 soid 6:e4c40009:::20005f45f1b.:head :
> > omap_digest 0x7ccf5cc9 != omap_digest 0xe048d29 from auth oi
> > 6:e4c40009:::20005f45f1b.:head(203789'1033837 osd.3.0:3666949
> > dirty|omap|data_digest|omap_digest s 0 uv 1032168 dd  od
> > e048d29 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.971 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 91 soid 6:e4c40009:::20005f45f1b.:head :
> > omap_digest 0x7ccf5cc9 != omap_digest 0xe048d29 from auth oi
> > 6:e4c40009:::20005f45f1b.:head(203789'1033837 osd.3.0:3666949
> > dirty|omap|data_digest|omap_digest s 0 uv 1032168 dd  od
> > e048d29 alloc_hint [0 0 0])
> > 2019-10-10 11:23:27.971 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 soid 6:e4c40009:::20005f45f1b.:head : failed to pick
> > suitable auth object
> > 2019-10-10 11:23:28.041 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> > : 6.327 shard 19 soid 6:e4c4a042:::20005f389fb.:head :
> > omap_digest 0xdd1558b8 != omap_digest 0xcf9af548 from auth oi
> > 6:e4c4a042:::20005f389fb.:head(203789'1033899 osd.3.0:3667011
> > dirty|omap|data_digest|omap_digest s 0 uv 1031358 dd  od
> > cf9af548 alloc_hint [0 0 0])
> > 2019-10-10 11:23:28.041 7ff54c9b0700 -1 log_channel(cluster) log [ERR]
> 

Re: [ceph-users] Ceph pg repair clone_missing?

2019-10-09 Thread Brad Hubbard
Awesome! Sorry it took so long.

On Thu, Oct 10, 2019 at 12:44 AM Marc Roos  wrote:
>
>
> Brad, many thanks!!! My cluster has finally HEALTH_OK af 1,5 year or so!
> :)
>
>
> -Original Message-
> Subject: Re: Ceph pg repair clone_missing?
>
> On Fri, Oct 4, 2019 at 6:09 PM Marc Roos 
> wrote:
> >
> >  >
> >  >Try something like the following on each OSD that holds a copy of
> >  >rbd_data.1f114174b0dc51.0974 and see what output you
> get.
> >  >Note that you can drop the bluestore flag if they are not bluestore
>
> > >osds and you will need the osd stopped at the time (set noout). Also
>
> > >note, snapids are displayed in hexadecimal in the output (but then
> '4'
> >  >is '4' so not a big issues here).
> >  >
> >  >$ ceph-objectstore-tool --type bluestore --data-path
> > >/var/lib/ceph/osd/ceph-XX/ --pgid 17.36 --op list
> >  >rbd_data.1f114174b0dc51.0974
> >
> > I got these results
> >
> > osd.7
> > Error getting attr on : 17.36_head,#-19:6c00:::scrub_17.36:head#,
> > (61) No data available
> > ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> > na
> > pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> > ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> > na
> > pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
>
> Ah, so of course the problem is the snapshot is missing. You may need to
> try something like the following on each of those osds.
>
> $ ceph-objectstore-tool --type bluestore --data-path
> /var/lib/ceph/osd/ceph-XX/ --pgid 17.36
> '{"oid":"rbd_data.1f114174b0dc51.0974","key":"","snapid":-2,
> "hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}'
> remove-clone-metadata 4
>
> >
> > osd.12
> > ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> > na
> > pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> > ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> > na
> > pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> >
> > osd.29
> > ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> > na
> > pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> > ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> > na
> > pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> >
> >
> >  >
> >  >The likely issue here is the primary believes snapshot 4 is gone but
>
> > >there is still data and/or metadata on one of the replicas which is
> > >confusing the issue. If that is the case you can use the the
> > >ceph-objectstore-tool to delete the relevant snapshot(s)  >
>
>
>
> --
> Cheers,
> Brad
>
>
>


-- 
Cheers,
Brad



Re: [ceph-users] Ceph pg repair clone_missing?

2019-10-08 Thread Brad Hubbard
On Fri, Oct 4, 2019 at 6:09 PM Marc Roos  wrote:
>
>  >
>  >Try something like the following on each OSD that holds a copy of
>  >rbd_data.1f114174b0dc51.0974 and see what output you get.
>  >Note that you can drop the bluestore flag if they are not bluestore
>  >osds and you will need the osd stopped at the time (set noout). Also
>  >note, snapids are displayed in hexadecimal in the output (but then '4'
>  >is '4' so not a big issues here).
>  >
>  >$ ceph-objectstore-tool --type bluestore --data-path
>  >/var/lib/ceph/osd/ceph-XX/ --pgid 17.36 --op list
>  >rbd_data.1f114174b0dc51.0974
>
> I got these results
>
> osd.7
> Error getting attr on : 17.36_head,#-19:6c00:::scrub_17.36:head#,
> (61) No data available
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]

Ah, so of course the problem is the snapshot is missing. You may need
to try something like the following on each of those osds.

$ ceph-objectstore-tool --type bluestore --data-path
/var/lib/ceph/osd/ceph-XX/ --pgid 17.36
'{"oid":"rbd_data.1f114174b0dc51.0974","key":"","snapid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}'
remove-clone-metadata 4

>
> osd.12
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
>
> osd.29
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
> pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
>
>
>  >
>  >The likely issue here is the primary believes snapshot 4 is gone but
>  >there is still data and/or metadata on one of the replicas which is
>  >confusing the issue. If that is the case you can use the the
>  >ceph-objectstore-tool to delete the relevant snapshot(s)
>  >



-- 
Cheers,
Brad



Re: [ceph-users] Ceph pg repair clone_missing?

2019-10-03 Thread Brad Hubbard
On Thu, Oct 3, 2019 at 6:46 PM Marc Roos  wrote:
>
>  >
>  >>
>  >> I was following the thread where you adviced on this pg repair
>  >>
>  >> I ran these rados 'list-inconsistent-obj'/'rados
>  >> list-inconsistent-snapset' and have output on the snapset. I tried
> to
>  >> extrapolate your comment on the data/omap_digest_mismatch_info onto
> my
>  >> situation. But I don't know how to proceed. I got on this mailing
> list
>  >> the advice to delete snapshot 4, but if I see this output, that
> might
>  >> not have been the smartest thing to do.
>  >
>  >That remains to be seen. Can you post the actual scrub error you are
> getting?
>
> 2019-10-03 09:27:07.831046 7fc448bf6700 -1 log_channel(cluster) log
> [ERR] : deep-scrub 17.36
> 17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:head : expected
> clone 17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4 1 missing

Try something like the following on each OSD that holds a copy of
rbd_data.1f114174b0dc51.0974 and see what output you get.
Note that you can drop the bluestore flag if they are not bluestore
osds and you will need the osd stopped at the time (set noout). Also
note, snapids are displayed in hexadecimal in the output (but then '4'
is '4' so not a big issues here).

$ ceph-objectstore-tool --type bluestore --data-path
/var/lib/ceph/osd/ceph-XX/ --pgid 17.36 --op list
rbd_data.1f114174b0dc51.0974

The likely issue here is the primary believes snapshot 4 is gone but
there is still data and/or metadata on one of the replicas which is
confusing the issue. If that is the case you can use the
ceph-objectstore-tool to delete the relevant snapshot(s)

>  >>
>  >>
>  >>
>  >>
>  >> [0]
>  >> http://tracker.ceph.com/issues/24994
>  >
>  >At first glance this appears to be a different issue to yours.
>  >
>  >>
>  >> [1]
>  >> {
>  >>   "epoch": 66082,
>  >>   "inconsistents": [
>  >> {
>  >>   "name": "rbd_data.1f114174b0dc51.0974",
>  >
>  >rbd_data.1f114174b0dc51 is the block_name_prefix for this image. You
>  >can run 'rbd info' on the images in this pool to see which image is
>  >actually affected and how important the data is.
>
> Yes I know what image it is. Deleting data is easy, I like to know/learn

I wasn't suggesting you just delete it. I merely suggested you be
informed about what data you are manipulating so you can proceed
appropriately.

>
> how to fix this.
>
>  >
>  >>   "nspace": "",
>  >>   "locator": "",
>  >>   "snap": "head",
>  >>   "snapset": {
>  >> "snap_context": {
>  >>   "seq": 63,
>  >>   "snaps": [
>  >> 63,
>  >> 35,
>  >> 13,
>  >> 4
>  >>   ]
>  >> },
>  >> "head_exists": 1,
>  >> "clones": [
>  >>   {
>  >> "snap": 4,
>  >> "size": 4194304,
>  >> "overlap": "[]",
>  >> "snaps": [
>  >>   4
>  >> ]
>  >>   },
>  >>   {
>  >> "snap": 63,
>  >> "size": 4194304,
>  >> "overlap": "[0~4194304]",
>  >> "snaps": [
>  >>   63,
>  >>   35,
>  >>   13
>  >> ]
>  >>   }
>  >> ]
>  >>   },
>  >>   "errors": [
>  >> "clone_missing"
>  >>   ],
>  >>   "missing": [
>  >> 4
>  >>   ]
>  >> }
>  >>   ]
>  >> }
>  >
>  >



-- 
Cheers,
Brad



Re: [ceph-users] Ceph pg repair clone_missing?

2019-10-02 Thread Brad Hubbard
On Wed, Oct 2, 2019 at 9:00 PM Marc Roos  wrote:
>
>
>
> Hi Brad,
>
> I was following the thread where you adviced on this pg repair
>
> I ran these rados 'list-inconsistent-obj'/'rados
> list-inconsistent-snapset' and have output on the snapset. I tried to
> extrapolate your comment on the data/omap_digest_mismatch_info onto my
> situation. But I don't know how to proceed. I got on this mailing list
> the advice to delete snapshot 4, but if I see this output, that might
> not have been the smartest thing to do.

That remains to be seen. Can you post the actual scrub error you are getting?

>
>
>
>
> [0]
> http://tracker.ceph.com/issues/24994

At first glance this appears to be a different issue to yours.

>
> [1]
> {
>   "epoch": 66082,
>   "inconsistents": [
> {
>   "name": "rbd_data.1f114174b0dc51.0974",

rbd_data.1f114174b0dc51 is the block_name_prefix for this image. You
can run 'rbd info' on the images in this pool to see which image is
actually affected and how important the data is.
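If you want to script the search, something like this would do (a sketch; the
pool name is a placeholder for whichever pool holds pg 17.36):

POOL=<pool holding pg 17.36>   # hypothetical placeholder
for img in $(rbd ls "$POOL"); do
    # the block_name_prefix shows up in 'rbd info' output
    rbd info "$POOL/$img" | grep -q 'rbd_data.1f114174b0dc51' && echo "$img"
done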

>   "nspace": "",
>   "locator": "",
>   "snap": "head",
>   "snapset": {
> "snap_context": {
>   "seq": 63,
>   "snaps": [
> 63,
> 35,
> 13,
> 4
>   ]
> },
> "head_exists": 1,
> "clones": [
>   {
> "snap": 4,
> "size": 4194304,
> "overlap": "[]",
> "snaps": [
>   4
> ]
>   },
>   {
> "snap": 63,
> "size": 4194304,
> "overlap": "[0~4194304]",
> "snaps": [
>   63,
>   35,
>   13
> ]
>   }
> ]
>   },
>   "errors": [
> "clone_missing"
>   ],
>   "missing": [
> 4
>   ]
> }
>   ]
> }



--
Cheers,
Brad



Re: [ceph-users] OSD crashed during the fio test

2019-10-01 Thread Brad Hubbard
If it is only this one osd I'd be inclined to be taking a hard look at
the underlying hardware and how it behaves/performs compared to the hw
backing identical osds. The less likely possibility is that you have
some sort of "hot spot" causing resource contention for that osd. To
investigate that further you could look at whether the pattern of cpu
and ram usage of that daemon varies significantly compared to the
other osd daemons in the cluster. You could also compare perf dumps
between daemons.
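For example, something like this on the OSD hosts (a sketch; it assumes the
admin sockets are reachable, jq is installed, osd.34 is the suspect as in your
log, and osd.35 stands in for any otherwise identical peer):

$ ceph daemon osd.34 perf dump > osd34.json   # run on the host carrying osd.34
$ ceph daemon osd.35 perf dump > osd35.json   # run on the host carrying the peer
$ diff <(jq -S . osd34.json) <(jq -S . osd35.json) | less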

On Wed, Oct 2, 2019 at 1:46 PM Sasha Litvak
 wrote:
>
> I updated firmware and kernel, running torture tests.  So far no assert, but 
> I still noticed this on the same osd as yesterday
>
> Oct 01 19:35:13 storage2n2-la ceph-osd-34[11188]: 2019-10-01 19:35:13.721 
> 7f8d03150700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 
> 0x7f8cd05d7700' had timed out after 60
> Oct 01 19:35:13 storage2n2-la ceph-osd-34[11188]: 2019-10-01 19:35:13.721 
> 7f8d03150700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 
> 0x7f8cd0dd8700' had timed out after 60
> Oct 01 19:35:13 storage2n2-la ceph-osd-34[11188]: 2019-10-01 19:35:13.721 
> 7f8d03150700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 
> 0x7f8cd2ddc700' had timed out after 60
> Oct 01 19:35:13 storage2n2-la ceph-osd-34[11188]: 2019-10-01 19:35:13.721 
> 7f8d03150700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 
> 0x7f8cd35dd700' had timed out after 60
> Oct 01 19:35:13 storage2n2-la ceph-osd-34[11188]: 2019-10-01 19:35:13.721 
> 7f8d03150700  1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 
> 0x7f8cd3dde700' had timed out after 60
>
> The spike of latency on this OSD is 6 seconds at that time.  Any ideas?
>
> On Tue, Oct 1, 2019 at 8:03 AM Sasha Litvak  
> wrote:
>>
>> It was hardware indeed.  Dell server reported a disk being reset with power 
>> on.  Checking the usual suspects i.e. controller firmware, controller event 
>> log (if I can get one), drive firmware.
>> I will report more when I get a better idea
>>
>> Thank you!
>>
>> On Tue, Oct 1, 2019 at 2:33 AM Brad Hubbard  wrote:
>>>
>>> Removed ceph-de...@vger.kernel.org and added d...@ceph.io
>>>
>>> On Tue, Oct 1, 2019 at 4:26 PM Alex Litvak  
>>> wrote:
>>> >
> >>> Hello everyone,
> >>>
> >>> Can you shed some light on the cause of the crash? Could a client
> >>> request actually trigger it?
>>> >
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]: 2019-09-30 22:52:58.867 
>>> > 7f093d71e700 -1 bdev(0x55b72c156000 /var/lib/ceph/osd/ceph-17/block) 
>>> > aio_submit retries 16
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]: 2019-09-30 22:52:58.867 
>>> > 7f093d71e700 -1 bdev(0x55b72c156000 /var/lib/ceph/osd/ceph-17/block)  aio 
>>> > submit got (11) Resource temporarily unavailable
>>>
>>> The KernelDevice::aio_submit function has tried to submit Io 16 times
>>> (a hard coded limit) and received an error each time causing it to
>>> assert. Can you check the status of the underlying device(s)?
>>>
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:
>>> > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/src/os/bluestore/KernelDevice.cc:
>>> > In fun
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:
>>> > /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/src/os/bluestore/KernelDevice.cc:
>>> > 757: F
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  ceph version 14.2.2 
>>> > (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  1: 
>>> > (ceph::__ceph_assert_fail(char const*, char const*, int, char 
>>> > const*)+0x14a) [0x55b71f668cf4]
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  2: 
>>> > (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, 
>>> > char const*, ...)+0) [0x55b71f668ec2]
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  3: 
>>> > (KernelDevice::aio_submit(IOContext*)+0x701) [0x55b71fd61ca1]
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  4: 
>>> > (BlueStore::_txc_aio_submit(BlueStore::TransContext*)+0x42) 
>>> > [0x55b71fc29892]
>>> > Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  5: 

Re: [ceph-users] ceph pg repair fails...?

2019-10-01 Thread Brad Hubbard
On Wed, Oct 2, 2019 at 1:15 AM Mattia Belluco  wrote:
>
> Hi Jake,
>
> I am curious to see if your problem is similar to ours (despite the fact
> we are still on Luminous).
>
> Could you post the output of:
>
> rados list-inconsistent-obj 
>
> and
>
> rados list-inconsistent-snapset 

Make sure you scrub the pg before running these commands.
Take a look at the information in http://tracker.ceph.com/issues/24994
for hints on how to proceed.
>
> Thanks,
>
> Mattia
>
> On 10/1/19 1:08 PM, Jake Grimmett wrote:
> > Dear All,
> >
> > I've just found two inconsistent pg that fail to repair.
> >
> > This might be the same bug as shown here:
> >
> > 
> >
> > Cluster is running Nautilus 14.2.2
> > OS is Scientific Linux 7.6
> > DB/WAL on NVMe, Data on 12TB HDD
> >
> > Logs below can also be seen here: 
> >
> > [root@ceph-s1 ~]# ceph health detail
> > HEALTH_ERR 22 scrub errors; Possible data damage: 2 pgs inconsistent
> > OSD_SCRUB_ERRORS 22 scrub errors
> > PG_DAMAGED Possible data damage: 2 pgs inconsistent
> > pg 2.2a7 is active+clean+inconsistent+failed_repair, acting
> > [83,60,133,326,281,162,180,172,144,219]
> > pg 2.36b is active+clean+inconsistent+failed_repair, acting
> > [254,268,10,262,32,280,211,114,169,53]
> >
> > Issued "pg repair" commands, osd log shows:
> > [root@ceph-n10 ~]# grep "2.2a7" /var/log/ceph/ceph-osd.83.log
> > 2019-10-01 07:05:02.459 7f9adab4b700  0 log_channel(cluster) log [DBG] :
> > 2.2a7 repair starts
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 83(0) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 60(1) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 133(2) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 144(8) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 162(5) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 172(7) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 180(6) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 219(9) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 281(4) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 shard 326(3) soid 2:e5472cab:::1000702081f.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 soid 2:e5472cab:::1000702081f.:head : failed to pick
> > suitable object info
> > 2019-10-01 07:11:41.589 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > repair 2.2a7s0 2:e5472cab:::1000702081f.:head : on disk size
> > (4096) does not match object info size (0) adjusted for ondisk to (0)
> > 2019-10-01 07:19:47.060 7f9adab4b700 -1 log_channel(cluster) log [ERR] :
> > 2.2a7 repair 11 errors, 0 fixed
> > [root@ceph-n10 ~]#
> >
> > [root@ceph-s1 ~]#  ceph pg repair 2.36b
> > instructing pg 2.36bs0 on osd.254 to repair
> >
> > [root@ceph-n29 ~]# grep "2.36b" /var/log/ceph/ceph-osd.254.log
> > 2019-10-01 11:15:12.215 7fa01f589700  0 log_channel(cluster) log [DBG] :
> > 2.36b repair starts
> > 2019-10-01 11:25:12.241 7fa01f589700 -1 log_channel(cluster) log [ERR] :
> > 2.36b shard 254(0) soid 2:d6cac754:::100070209f6.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 11:25:12.241 7fa01f589700 -1 log_channel(cluster) log [ERR] :
> > 2.36b shard 10(2) soid 2:d6cac754:::100070209f6.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 11:25:12.241 7fa01f589700 -1 log_channel(cluster) log [ERR] :
> > 2.36b shard 32(4) soid 2:d6cac754:::100070209f6.:head :
> > candidate size 4096 info size 0 mismatch
> > 2019-10-01 11:25:12.241 7fa01f589700 -1 log_channel(cluster) log [ERR] :
> > 2.36b shard 53(9) soid 2:d6cac754:::100070209f6.:head :
> > candidate size 4096 info 

Re: [ceph-users] ceph-osd@n crash dumps

2019-10-01 Thread Brad Hubbard
On Tue, Oct 1, 2019 at 10:43 PM Del Monaco, Andrea <
andrea.delmon...@atos.net> wrote:

> Hi list,
>
> After the nodes ran OOM and after reboot, we are not able to restart the
> ceph-osd@x services anymore. (Details about the setup at the end).
>
> I am trying to do this manually, so we can see the error but all i see is
> several crash dumps - this is just one of the OSDs which is not starting.
> Any idea how to get past this??
> [root@ceph001 ~]# /usr/bin/ceph-osd --debug_osd 10 -f --cluster ceph --id
> 83 --setuser ceph --setgroup ceph  > /tmp/dump 2>&1
> starting osd.83 at - osd_data /var/lib/ceph/osd/ceph-83
> /var/lib/ceph/osd/ceph-83/journal
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h:
> In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)'
> thread 2aaf5540 time 2019-10-01 14:19:49.494368
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h:
> 34: FAILED assert(stripe_width % stripe_size == 0)
>  ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x14b) [0x2af3d36b]
>  2: (()+0x26e4f7) [0x2af3d4f7]
>  3: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*, std::shared_ptr, unsigned
> long)+0x46d) [0x55c0bd3d]
>  4: (PGBackend::build_pg_backend(pg_pool_t const&, std::map std::string, std::less, std::allocator const, std::string> > > const&, PGBackend::Listener*, coll_t,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*)+0x30a) [0x55b0ba8a]
>  5: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr const>, PGPool const&, std::map std::less, std::allocator std::string> > > const&, spg_t)+0x140) [0x55abd100]
>  6: (OSD::_make_pg(std::shared_ptr, spg_t)+0x10cb)
> [0x55914ecb]
>  7: (OSD::load_pgs()+0x4a9) [0x55917e39]
>  8: (OSD::init()+0xc99) [0x559238e9]
>  9: (main()+0x23a3) [0x558017a3]
>  10: (__libc_start_main()+0xf5) [0x2aaab77de495]
>  11: (()+0x385900) [0x558d9900]
> 2019-10-01 14:19:49.500 2aaf5540 -1
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h:
> In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)'
> thread 2aaf5540 time 2019-10-01 14:19:49.494368
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h:
> 34: FAILED assert(stripe_width % stripe_size == 0)
>

 https://tracker.ceph.com/issues/41336 may be relevant here.

Can you post details of the pool involved as well as the erasure code
profile in use for that pool?
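For example:

$ ceph osd pool ls detail
$ ceph osd erasure-code-profile ls
$ ceph osd erasure-code-profile get <profile-name>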


>  ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x14b) [0x2af3d36b]
>  2: (()+0x26e4f7) [0x2af3d4f7]
>  3: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*, std::shared_ptr, unsigned
> long)+0x46d) [0x55c0bd3d]
>  4: (PGBackend::build_pg_backend(pg_pool_t const&, std::map std::string, std::less, std::allocator const, std::string> > > const&, PGBackend::Listener*, coll_t,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*)+0x30a) [0x55b0ba8a]
>  5: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr const>, PGPool const&, std::map std::less, std::allocator std::string> > > const&, spg_t)+0x140) [0x55abd100]
>  6: (OSD::_make_pg(std::shared_ptr, spg_t)+0x10cb)
> [0x55914ecb]
>  7: (OSD::load_pgs()+0x4a9) [0x55917e39]
>  8: (OSD::init()+0xc99) [0x559238e9]
>  9: (main()+0x23a3) [0x558017a3]
>  10: (__libc_start_main()+0xf5) [0x2aaab77de495]
>  11: (()+0x385900) [0x558d9900]
>
> *** Caught signal (Aborted) **
>  in thread 2aaf5540 thread_name:ceph-osd
>  ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>  1: (()+0xf5d0) [0x2aaab69765d0]
>  2: (gsignal()+0x37) [0x2aaab77f22c7]
>  3: (abort()+0x148) [0x2aaab77f39b8]
>  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x248) [0x2af3d468]
>  5: (()+0x26e4f7) [0x2af3d4f7]
>  6: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&,
> boost::intrusive_ptr&, ObjectStore*,
> CephContext*, std::shared_ptr, unsigned
> long)+0x46d) [0x55c0bd3d]
>  7: (PGBackend::build_pg_backend(pg_pool_t const&, std::map std::string, std::less, std::allocator const, std::string> > > const&, PGBackend::Listener*, coll_t,
> 

Re: [ceph-users] OSD crashed during the fio test

2019-10-01 Thread Brad Hubbard
Removed ceph-de...@vger.kernel.org and added d...@ceph.io

On Tue, Oct 1, 2019 at 4:26 PM Alex Litvak  wrote:
>
> Hello everyone,
>
> Can you shed some light on the cause of the crash? Could a client
> request actually trigger it?
>
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]: 2019-09-30 22:52:58.867 
> 7f093d71e700 -1 bdev(0x55b72c156000 /var/lib/ceph/osd/ceph-17/block) 
> aio_submit retries 16
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]: 2019-09-30 22:52:58.867 
> 7f093d71e700 -1 bdev(0x55b72c156000 /var/lib/ceph/osd/ceph-17/block)  aio 
> submit got (11) Resource temporarily unavailable

The KernelDevice::aio_submit function has tried to submit IO 16 times
(a hard-coded limit) and received an error each time, causing it to
assert. Can you check the status of the underlying device(s)?
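For example (a sketch; assumes the device behind
/var/lib/ceph/osd/ceph-17/block is /dev/sdX):

$ smartctl -a /dev/sdX                          # SMART health and error counters
$ dmesg -T | grep -iE 'sdX|i/o error|reset'     # kernel-level resets / IO errors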

> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/src/os/bluestore/KernelDevice.cc:
> In fun
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/src/os/bluestore/KernelDevice.cc:
> 757: F
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  ceph version 14.2.2 
> (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  1: 
> (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) 
> [0x55b71f668cf4]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  2: 
> (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char 
> const*, ...)+0) [0x55b71f668ec2]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  3: 
> (KernelDevice::aio_submit(IOContext*)+0x701) [0x55b71fd61ca1]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  4: 
> (BlueStore::_txc_aio_submit(BlueStore::TransContext*)+0x42) [0x55b71fc29892]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  5: 
> (BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x42b) [0x55b71fc496ab]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  6: 
> (BlueStore::queue_transactions(boost::intrusive_ptr&,
>  std::vector std::allocator >&, boost::intrusive_ptr, 
> ThreadPool::T
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  7: (non-virtual thunk to 
> PrimaryLogPG::queue_transactions(std::vector std::allocator >&,
> boost::intrusive_ptr)+0x54) [0x55b71f9b1b84]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  8: 
> (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t 
> const&, eversion_t const&, std::unique_ptr std::default_delete >&&, eversion_t const&, eversion_t const&, 
> s
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  9: 
> (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, 
> PrimaryLogPG::OpContext*)+0xf12) [0x55b71f90e322]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  10: 
> (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0xfae) [0x55b71f969b7e]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  11: 
> (PrimaryLogPG::do_op(boost::intrusive_ptr&)+0x3965) 
> [0x55b71f96de15]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  12: 
> (PrimaryLogPG::do_request(boost::intrusive_ptr&, 
> ThreadPool::TPHandle&)+0xbd4) [0x55b71f96f8a4]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  13: 
> (OSD::dequeue_op(boost::intrusive_ptr, boost::intrusive_ptr, 
> ThreadPool::TPHandle&)+0x1a9) [0x55b71f7a9ea9]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  14: (PGOpItem::run(OSD*, 
> OSDShard*, boost::intrusive_ptr&, ThreadPool::TPHandle&)+0x62) 
> [0x55b71fa475d2]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  15: 
> (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x9f4) 
> [0x55b71f7c6ef4]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  16: 
> (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x433) 
> [0x55b71fdc5ce3]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  17: 
> (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55b71fdc8d80]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  18: (()+0x7dd5) 
> [0x7f0971da9dd5]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  19: (clone()+0x6d) 
> [0x7f0970c7002d]
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]: 2019-09-30 22:52:58.879 
> 7f093d71e700 -1
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.2/rpm/el7/BUILD/ceph-14.2.2/src/os/bluestore/KernelDevice.cc:
> 757: F
> Sep 30 22:52:58 storage2n2-la ceph-osd-17[10770]:  ceph version 

Re: [ceph-users] ceph; pg scrub errors

2019-09-24 Thread Brad Hubbard
On Tue, Sep 24, 2019 at 10:51 PM M Ranga Swami Reddy
 wrote:
>
> Interestingly - "rados list-inconsistent-obj ${PG} --format=json" is not
> showing any inconsistent objects.
> And "rados list-missing-obj ${PG} --format=json" is also not showing any
> missing or unfound objects.

Complete a scrub of ${PG} just before you run these commands.
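Something like this (a sketch; the query/grep is only there to confirm the
deep-scrub actually finished before you re-run the list commands):

$ ceph pg deep-scrub ${PG}
$ ceph pg ${PG} query | grep last_deep_scrub_stamp   # stamp should update when done
$ rados list-inconsistent-obj ${PG} --format=json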

>
> Thanks
> Swami
>
> On Mon, Sep 23, 2019 at 8:18 PM Robert LeBlanc  wrote:
>>
>> On Thu, Sep 19, 2019 at 4:34 AM M Ranga Swami Reddy
>>  wrote:
>> >
>> > Hi - I am using ceph 12.2.11. Here I am getting a few scrub errors. To fix
>> > these scrub errors I ran "ceph pg repair ".
>> > But the scrub errors are not going away and the repair is taking a long
>> > time, like 8-12 hours.
>>
>> Depending on the size of the PGs and how active the cluster is, it
>> could take a long time as it takes another deep scrub to happen to
>> clear the error status after a repair. Since it is not going away,
>> either the problem is too complicated to automatically repair and
>> needs to be done by hand, or the problem is repaired and when it
>> deep-scrubs to check it, the problem has reappeared or another problem
>> was found and the disk needs to be replaced.
>>
>> Try running:
>> rados list-inconsistent-obj ${PG} --format=json
>>
>> and see what the exact problems are.
>> 
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>



-- 
Cheers,
Brad



Re: [ceph-users] ZeroDivisionError when running ceph osd status

2019-09-11 Thread Brad Hubbard
On Thu, Sep 12, 2019 at 1:52 AM Benjamin Tayehanpour
 wrote:
>
> Greetings!
>
> I had an OSD down, so I ran ceph osd status and got this:
>
> [root@ceph1 ~]# ceph osd status
> Error EINVAL: Traceback (most recent call last):
>   File "/usr/lib64/ceph/mgr/status/module.py", line 313, in handle_command
> return self.handle_osd_status(cmd)
>   File "/usr/lib64/ceph/mgr/status/module.py", line 297, in
> handle_osd_status
> self.format_dimless(self.get_rate("osd", osd_id.__str__(), "osd.op_w") +
>   File "/usr/lib64/ceph/mgr/status/module.py", line 113, in get_rate
> return (data[-1][1] - data[-2][1]) / float(data[-1][0] - data[-2][0])
> ZeroDivisionError: float division by zero
> [root@ceph1 ~]#
>
> I could still figure out which OSD it was with systemctl, put I had to
> purge the OSD before ceph osd status would run again. Is this normal
> behaviour?

No. Looks like this was fixed recently in master by
https://tracker.ceph.com/projects/ceph/repository/revisions/0164c399f3c22edce6488cd28e5b172b68ca1239/diff/src/pybind/mgr/status/module.py

>
> Cordially yours,
> Benjamin
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-fuse segfaults in 14.2.2

2019-09-06 Thread Brad Hubbard
On Wed, Sep 4, 2019 at 9:42 PM Andras Pataki
 wrote:
>
> Dear ceph users,
>
> After upgrading our ceph-fuse clients to 14.2.2, we've been seeing sporadic 
> segfaults with not super revealing stack traces:
>
> in thread 7fff5a7fc700 thread_name:ceph-fuse
>
>  ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus 
> (stable)
>  1: (()+0xf5d0) [0x760b85d0]
>  2: (()+0x255a0c) [0x557a9a0c]
>  3: (()+0x16b6b) [0x77bb3b6b]
>  4: (()+0x13401) [0x77bb0401]
>  5: (()+0x7dd5) [0x760b0dd5]
>  6: (clone()+0x6d) [0x74b5cead]
>  NOTE: a copy of the executable, or `objdump -rdS ` is needed to 
> interpret this.

If you install the appropriate debuginfo package (which one depends on
your OS) you may get a more enlightening stack.
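On CentOS/RHEL that is roughly (assuming the debuginfo repos are enabled):

  debuginfo-install ceph-fuse

and on Ubuntu/Debian the matching dbg package, something like:

  apt-get install ceph-fuse-dbg

Those package names are from memory, so double-check them against your repos, but
once the debug symbols are installed the frames in that trace should resolve.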
>
>
> Prior to 14.2.2, we've run 12.2.11 and 13.2.5 and have not seen this issue.  
> Has anyone encountered this?  If it isn't known - I can file a bug tracker 
> for it.

Please do and maybe try to capture a core dump if you can't get a
better backtrace?
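To capture one, roughly (assuming the host isn't already using systemd-coredump):

  ulimit -c unlimited                  # in the shell/unit that starts ceph-fuse
  cat /proc/sys/kernel/core_pattern    # check where cores will land
  # after the next crash
  gdb /usr/bin/ceph-fuse /path/to/core
  (gdb) thread apply all bt

On systemd-coredump hosts "coredumpctl list ceph-fuse" followed by
"coredumpctl gdb ceph-fuse" gets you to the same place.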

>
> Andras
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore.cc: 11208: ceph_abort_msg("unexpected error")

2019-08-25 Thread Brad Hubbard
https://tracker.ceph.com/issues/38724

On Fri, Aug 23, 2019 at 10:18 PM Paul Emmerich  wrote:
>
> I've seen that before (but never on Nautilus), there's already an
> issue at tracker.ceph.com but I don't recall the id or title.
>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Fri, Aug 23, 2019 at 1:47 PM Lars Täuber  wrote:
> >
> > Hi Paul,
> >
> > a result of fgrep is attached.
> > Can you do something with it?
> >
> > I can't read it. Maybe this is the relevant part:
> > " bluestore(/var/lib/ceph/osd/first-16) _txc_add_transaction error (39) 
> > Directory not empty not handled on operation 21 (op 1, counting from 0)"
> >
> > Later I tried it again and the osd is working again.
> >
> > It feels like I hit a bug!?
> >
> > Huge thanks for your help.
> >
> > Cheers,
> > Lars
> >
> > Fri, 23 Aug 2019 13:36:00 +0200
> > Paul Emmerich  ==> Lars Täuber  :
> > > Filter the log for "7f266bdc9700" which is the id of the crashed
> > > thread, it should contain more information on the transaction that
> > > caused the crash.
> > >
> > >
> > > Paul
> > >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph status: pg backfill_toofull, but all OSDs have enough space

2019-08-22 Thread Brad Hubbard
https://tracker.ceph.com/issues/41255 is probably reporting the same issue.
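Until that's sorted out, the workaround Reed describes further down the thread
(reweighting the reported target OSD) looks like the practical option. Roughly (a
sketch; pick the OSD that is actually the backfill target and a weight that makes
sense for your cluster):

  ceph pg dump | grep backfill_toofull    # find the PG and its up/acting sets
  ceph osd dump | grep ratio              # confirm the full/backfillfull ratios
  ceph osd reweight <osd-id> 0.95         # nudge the target OSD so the backfill is re-planned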

On Thu, Aug 22, 2019 at 6:31 PM Lars Täuber  wrote:
>
> Hi there!
>
> We also experience this behaviour of our cluster while it is moving pgs.
>
> # ceph health detail
> HEALTH_ERR 1 MDSs report slow metadata IOs; Reduced data availability: 2 pgs 
> inactive; Degraded data redundancy (low space): 1 pg backfill_toofull
> MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
> mdsmds1(mds.0): 1 slow metadata IOs are blocked > 30 secs, oldest blocked 
> for 359 secs
> PG_AVAILABILITY Reduced data availability: 2 pgs inactive
> pg 21.231 is stuck inactive for 878.224182, current state remapped, last 
> acting [20,2147483647,13,2147483647,15,10]
> pg 21.240 is stuck inactive for 878.123932, current state remapped, last 
> acting [26,17,21,20,2147483647,2147483647]
> PG_DEGRADED_FULL Degraded data redundancy (low space): 1 pg backfill_toofull
> pg 21.376 is active+remapped+backfill_wait+backfill_toofull, acting 
> [6,11,29,2,10,15]
> # ceph pg map 21.376
> osdmap e68016 pg 21.376 (21.376) -> up [6,5,23,21,10,11] acting 
> [6,11,29,2,10,15]
>
> # ceph osd dump | fgrep ratio
> full_ratio 0.95
> backfillfull_ratio 0.9
> nearfull_ratio 0.85
>
> This happens while the cluster is rebalancing the pgs after I manually mark a 
> single osd out.
> see here:
>  Subject: [ceph-users] pg 21.1f9 is stuck inactive for 53316.902820,  current 
> state remapped
>  http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-August/036634.html
>
>
> Mostly the cluster heals itself at least into state HEALTH_WARN:
>
>
> # ceph health detail
> HEALTH_WARN 1 MDSs report slow metadata IOs; Reduced data availability: 2 pgs 
> inactive
> MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
> mdsmds1(mds.0): 1 slow metadata IOs are blocked > 30 secs, oldest blocked 
> for 1155 secs
> PG_AVAILABILITY Reduced data availability: 2 pgs inactive
> pg 21.231 is stuck inactive for 1677.312219, current state remapped, last 
> acting [20,2147483647,13,2147483647,15,10]
> pg 21.240 is stuck inactive for 1677.211969, current state remapped, last 
> acting [26,17,21,20,2147483647,2147483647]
>
>
>
> Cheers,
> Lars
>
>
> Wed, 21 Aug 2019 17:28:05 -0500
> Reed Dier  ==> Vladimir Brik 
>  :
> > Just chiming in to say that I too had some issues with backfill_toofull 
> > PGs, despite no OSD's being in a backfill_full state, albeit, there were 
> > some nearfull OSDs.
> >
> > I was able to get through it by reweighting down the OSD that was the 
> > target reported by ceph pg dump | grep 'backfill_toofull'.
> >
> > This was on 14.2.2.
> >
> > Reed
> >
> > > On Aug 21, 2019, at 2:50 PM, Vladimir Brik 
> > >  wrote:
> > >
> > > Hello
> > >
> > > After increasing number of PGs in a pool, ceph status is reporting 
> > > "Degraded data redundancy (low space): 1 pg backfill_toofull", but I 
> > > don't understand why, because all OSDs seem to have enough space.
> > >
> > > ceph health detail says:
> > > pg 40.155 is active+remapped+backfill_toofull, acting [20,57,79,85]
> > >
> > > $ ceph pg map 40.155
> > > osdmap e3952 pg 40.155 (40.155) -> up [20,57,66,85] acting [20,57,79,85]
> > >
> > > So I guess Ceph wants to move 40.155 from 66 to 79 (or other way 
> > > around?). According to "osd df", OSD 66's utilization is 71.90%, OSD 79's 
> > > utilization is 58.45%. The OSD with least free space in the cluster is 
> > > 81.23% full, and it's not any of the ones above.
> > >
> > > OSD backfillfull_ratio is 90% (is there a better way to determine this?):
> > > $ ceph osd dump | grep ratio
> > > full_ratio 0.95
> > > backfillfull_ratio 0.9
> > > nearfull_ratio 0.7
> > >
> > > Does anybody know why a PG could be in the backfill_toofull state if no 
> > > OSD is in the backfillfull state?
> > >
> > >
> > > Vlad
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
> --
> Informationstechnologie
> Berlin-Brandenburgische Akademie der Wissenschaften
> Jägerstraße 22-23  10117 Berlin
> Tel.: +49 30 20370-352   http://www.bbaw.de
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Sudden loss of all SSD OSDs in a cluster, immedaite abort on restart [Mimic 13.2.6]

2019-08-18 Thread Brad Hubbard
On Thu, Aug 15, 2019 at 2:09 AM Troy Ablan  wrote:
>
> Paul,
>
> Thanks for the reply.  All of these seemed to fail except for pulling
> the osdmap from the live cluster.
>
> -Troy
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap45
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1
> *** Caught signal (Aborted) **
>   in thread 7f945ee04f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f94531935d0]
>   2: (gsignal()+0x37) [0x7f9451d80207]
>   3: (abort()+0x148) [0x7f9451d818f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f945268f7d5]
>   5: (()+0x5e746) [0x7f945268d746]
>   6: (()+0x5e773) [0x7f945268d773]
>   7: (__cxa_rethrow()+0x49) [0x7f945268d9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f94553218d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f94550ff4ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9455101db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55de1f9a6e60]
>   12: (main()+0x5340) [0x55de1f8c8870]
>   13: (__libc_start_main()+0xf5) [0x7f9451d6c3d5]
>   14: (()+0x3adc10) [0x55de1f9a1c10]
> Aborted
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap46
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1
> *** Caught signal (Aborted) **
>   in thread 7f9ce4135f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f9cd84c45d0]
>   2: (gsignal()+0x37) [0x7f9cd70b1207]
>   3: (abort()+0x148) [0x7f9cd70b28f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f9cd79c07d5]
>   5: (()+0x5e746) [0x7f9cd79be746]
>   6: (()+0x5e773) [0x7f9cd79be773]
>   7: (__cxa_rethrow()+0x49) [0x7f9cd79be9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f9cda6528d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f9cda4304ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9cda432db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55cea26c8e60]
>   12: (main()+0x5340) [0x55cea25ea870]
>   13: (__libc_start_main()+0xf5) [0x7f9cd709d3d5]
>   14: (()+0x3adc10) [0x55cea26c3c10]
> Aborted
>
> -[~:#]- ceph osd getmap -o osdmap
> got osdmap epoch 81298
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.

 819   auto ch = store->open_collection(coll_t::meta());
 820   const ghobject_t full_oid = OSD::get_osdmap_pobject_name(e);
 821   if (!store->exists(ch, full_oid)) {
 822     cerr << "osdmap (" << full_oid << ") does not exist." << std::endl;
 823     if (!force) {
 824       return -ENOENT;
 825     }
 826     cout << "Creating a new epoch." << std::endl;
 827   }

Adding "--force"should get you past that error.

>
>
>
> On 8/14/19 2:54 AM, Paul Emmerich wrote:
> > Starting point to debug/fix this would be to extract the osdmap from
> > one of the dead OSDs:
> >
> > ceph-objectstore-tool --op get-osdmap --data-path /var/lib/ceph/osd/...
> >
> > Then try to run osdmaptool on that osdmap to see if it also crashes,
> > set some --debug options (don't know which one off the top of my
> > head).
> > Does it also crash? How does it differ from the map retrieved with
> > "ceph osd getmap"?
> >
> > You can also set the osdmap with "--op set-osdmap", does it help to
> > set the osdmap retrieved by "ceph osd getmap"?
> >
> > Paul
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Sudden loss of all SSD OSDs in a cluster, immedaite abort on restart [Mimic 13.2.6]

2019-08-18 Thread Brad Hubbard
On Thu, Aug 15, 2019 at 2:09 AM Troy Ablan  wrote:
>
> Paul,
>
> Thanks for the reply.  All of these seemed to fail except for pulling
> the osdmap from the live cluster.
>
> -Troy
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap45
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1

That's this code.

3114   switch (alg) {
3115   case CRUSH_BUCKET_UNIFORM:
3116     size = sizeof(crush_bucket_uniform);
3117     break;
3118   case CRUSH_BUCKET_LIST:
3119     size = sizeof(crush_bucket_list);
3120     break;
3121   case CRUSH_BUCKET_TREE:
3122     size = sizeof(crush_bucket_tree);
3123     break;
3124   case CRUSH_BUCKET_STRAW:
3125     size = sizeof(crush_bucket_straw);
3126     break;
3127   case CRUSH_BUCKET_STRAW2:
3128     size = sizeof(crush_bucket_straw2);
3129     break;
3130   default:
3131     {
3132       char str[128];
3133       snprintf(str, sizeof(str), "unsupported bucket algorithm: %d", alg);
3134       throw buffer::malformed_input(str);
3135     }
3136   }

CRUSH_BUCKET_UNIFORM = 1
CRUSH_BUCKET_LIST = 2
CRUSH_BUCKET_TREE = 3
CRUSH_BUCKET_STRAW = 4
CRUSH_BUCKET_STRAW2 = 5

So valid values for bucket algorithms are 1 through 5 but, for
whatever reason, at least one of yours is being interpreted as "-1"

this doesn't seem like something that would just happen spontaneously
with no changes to the cluster.

What recent changes have you made to the osdmap? What recent changes
have you made to the crushmap? Have you recently upgraded?
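It would also be worth pulling the crushmap the monitors have and checking what the
bucket algorithms look like there, e.g. (rough sketch):

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  grep 'alg ' crushmap.txt    # every bucket should show straw, straw2, etc.

If that decompiles cleanly, the corruption is probably limited to the copies of the
map those OSDs have stored locally.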

> *** Caught signal (Aborted) **
>   in thread 7f945ee04f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f94531935d0]
>   2: (gsignal()+0x37) [0x7f9451d80207]
>   3: (abort()+0x148) [0x7f9451d818f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f945268f7d5]
>   5: (()+0x5e746) [0x7f945268d746]
>   6: (()+0x5e773) [0x7f945268d773]
>   7: (__cxa_rethrow()+0x49) [0x7f945268d9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f94553218d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f94550ff4ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9455101db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55de1f9a6e60]
>   12: (main()+0x5340) [0x55de1f8c8870]
>   13: (__libc_start_main()+0xf5) [0x7f9451d6c3d5]
>   14: (()+0x3adc10) [0x55de1f9a1c10]
> Aborted
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap46
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>what():  buffer::malformed_input: unsupported bucket algorithm: -1
> *** Caught signal (Aborted) **
>   in thread 7f9ce4135f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f9cd84c45d0]
>   2: (gsignal()+0x37) [0x7f9cd70b1207]
>   3: (abort()+0x148) [0x7f9cd70b28f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f9cd79c07d5]
>   5: (()+0x5e746) [0x7f9cd79be746]
>   6: (()+0x5e773) [0x7f9cd79be773]
>   7: (__cxa_rethrow()+0x49) [0x7f9cd79be9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f9cda6528d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f9cda4304ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9cda432db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55cea26c8e60]
>   12: (main()+0x5340) [0x55cea25ea870]
>   13: (__libc_start_main()+0xf5) [0x7f9cd709d3d5]
>   14: (()+0x3adc10) [0x55cea26c3c10]
> Aborted
>
> -[~:#]- ceph osd getmap -o osdmap
> got osdmap epoch 81298
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
>
>
>
> On 8/14/19 2:54 AM, Paul Emmerich wrote:
> > Starting point to debug/fix this would be to extract the osdmap from
> > one of the dead OSDs:
> >
> > ceph-objectstore-tool --op get-osdmap --data-path /var/lib/ceph/osd/...
> >
> > Then try to run osdmaptool on that osdmap to see if it also crashes,
> > set some --debug options (don't know which one off the top of my
> > head).
> > Does it also crash? How does it differ from the map retrieved with
> > "ceph osd getmap"?
> >
> > You can also set the osdmap with "--op set-osdmap", does it help to
> > set the osdmap retrieved by "ceph osd getmap"?
> >
> > Paul
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 

Re: [ceph-users] Possibly a bug on rocksdb

2019-08-11 Thread Brad Hubbard
Could you create a tracker for this?

Also, if you can reproduce this, could you gather a log with
debug_osd=20? That should show us the superblock it was trying to
decode as well as additional details.
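If the osd won't stay up long enough for injectargs, the simplest approach is
probably to set it in the conf and restart it, roughly:

  # /etc/ceph/ceph.conf on that host
  [osd.4]
      debug osd = 20

  systemctl restart ceph-osd@4

or run it in the foreground with '--debug-osd 20' added to the usual ceph-osd
command line. (A sketch only; adjust the id and paths to your setup.)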

On Mon, Aug 12, 2019 at 6:29 AM huxia...@horebdata.cn
 wrote:
>
> Dear folks,
>
> I had an OSD down, not because of a bad disk, but most likely a bug hit on 
> Rockdb. Any one had similar issue?
>
> I am using Luminous 12.2.12 version. Log attached below
>
> thanks,
> Samuel
>
> **
> [root@horeb72 ceph]# head -400 ceph-osd.4.log
> 2019-08-11 07:30:02.186519 7f69bd020700  0 -- 192.168.10.72:6805/5915 >> 
> 192.168.10.73:6801/4096 conn(0x56549cfc0800 :6805 
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg 
> accept connect_seq 15 vs existing csq=15 existing_state=STATE_STANDBY
> 2019-08-11 07:30:02.186871 7f69bd020700  0 -- 192.168.10.72:6805/5915 >> 
> 192.168.10.73:6801/4096 conn(0x56549cfc0800 :6805 
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg 
> accept connect_seq 16 vs existing csq=15 existing_state=STATE_STANDBY
> 2019-08-11 07:30:02.242291 7f69bc81f700  0 -- 192.168.10.72:6805/5915 >> 
> 192.168.10.71:6805/5046 conn(0x5654b93ed000 :6805 
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg 
> accept connect_seq 15 vs existing csq=15 existing_state=STATE_STANDBY
> 2019-08-11 07:30:02.242554 7f69bc81f700  0 -- 192.168.10.72:6805/5915 >> 
> 192.168.10.71:6805/5046 conn(0x5654b93ed000 :6805 
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg 
> accept connect_seq 16 vs existing csq=15 existing_state=STATE_STANDBY
> 2019-08-11 07:30:02.260295 7f69bc81f700  0 -- 192.168.10.72:6805/5915 >> 
> 192.168.10.73:6806/4864 conn(0x56544de16800 :6805 
> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg 
> accept connect_seq 15 vs existing csq=15 
> existing_state=STATE_CONNECTING_WAIT_CONNECT_REPLY
> 2019-08-11 17:11:01.968247 7ff4822f1d80 -1 WARNING: the following dangerous 
> and experimental features are enabled: bluestore,rocksdb
> 2019-08-11 17:11:01.968333 7ff4822f1d80  0 ceph version 12.2.12 
> (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable), process 
> ceph-osd, pid 1048682
> 2019-08-11 17:11:01.970611 7ff4822f1d80  0 pidfile_write: ignore empty 
> --pid-file
> 2019-08-11 17:11:01.991542 7ff4822f1d80 -1 WARNING: the following dangerous 
> and experimental features are enabled: bluestore,rocksdb
> 2019-08-11 17:11:01.997597 7ff4822f1d80  0 load: jerasure load: lrc load: isa
> 2019-08-11 17:11:01.997710 7ff4822f1d80  1 bdev create path 
> /var/lib/ceph/osd/ceph-4/block type kernel
> 2019-08-11 17:11:01.997723 7ff4822f1d80  1 bdev(0x564774656c00 
> /var/lib/ceph/osd/ceph-4/block) open path /var/lib/ceph/osd/ceph-4/block
> 2019-08-11 17:11:01.998127 7ff4822f1d80  1 bdev(0x564774656c00 
> /var/lib/ceph/osd/ceph-4/block) open size 858887553024 (0xc7f9b0, 800GiB) 
> block_size 4096 (4KiB) non-rotational
> 2019-08-11 17:11:01.998231 7ff4822f1d80  1 bdev(0x564774656c00 
> /var/lib/ceph/osd/ceph-4/block) close
> 2019-08-11 17:11:02.265144 7ff4822f1d80  1 bdev create path 
> /var/lib/ceph/osd/ceph-4/block type kernel
> 2019-08-11 17:11:02.265177 7ff4822f1d80  1 bdev(0x564774658a00 
> /var/lib/ceph/osd/ceph-4/block) open path /var/lib/ceph/osd/ceph-4/block
> 2019-08-11 17:11:02.265695 7ff4822f1d80  1 bdev(0x564774658a00 
> /var/lib/ceph/osd/ceph-4/block) open size 858887553024 (0xc7f9b0, 800GiB) 
> block_size 4096 (4KiB) non-rotational
> 2019-08-11 17:11:02.266233 7ff4822f1d80  1 bdev create path 
> /var/lib/ceph/osd/ceph-4/block.db type kernel
> 2019-08-11 17:11:02.266256 7ff4822f1d80  1 bdev(0x564774589a00 
> /var/lib/ceph/osd/ceph-4/block.db) open path /var/lib/ceph/osd/ceph-4/block.db
> 2019-08-11 17:11:02.266812 7ff4822f1d80  1 bdev(0x564774589a00 
> /var/lib/ceph/osd/ceph-4/block.db) open size 2759360 (0x6fc20, 
> 27.9GiB) block_size 4096 (4KiB) non-rotational
> 2019-08-11 17:11:02.266998 7ff4822f1d80  1 bdev create path 
> /var/lib/ceph/osd/ceph-4/block type kernel
> 2019-08-11 17:11:02.267015 7ff4822f1d80  1 bdev(0x564774659a00 
> /var/lib/ceph/osd/ceph-4/block) open path /var/lib/ceph/osd/ceph-4/block
> 2019-08-11 17:11:02.267412 7ff4822f1d80  1 bdev(0x564774659a00 
> /var/lib/ceph/osd/ceph-4/block) open size 858887553024 (0xc7f9b0, 800GiB) 
> block_size 4096 (4KiB) non-rotational
> 2019-08-11 17:11:02.298355 7ff4822f1d80  0  set rocksdb option 
> compaction_readahead_size = 2MB
> 2019-08-11 17:11:02.298368 7ff4822f1d80  0  set rocksdb option 
> compaction_style = kCompactionStyleLevel
> 2019-08-11 17:11:02.299628 7ff4822f1d80  0  set rocksdb option 
> compaction_threads = 32
> 2019-08-11 17:11:02.299648 7ff4822f1d80  0  set rocksdb option compression = 
> kNoCompression
> 2019-08-11 17:11:02.23 7ff4822f1d80  0  set rocksdb option 
> flusher_threads = 8
> 2019-08-11 

Re: [ceph-users] 14.2.2 - OSD Crash

2019-08-06 Thread Brad Hubbard
-63> 2019-08-07 00:51:52.861 7fe987e49700  1 heartbeat_map
clear_timeout 'OSD::osd_op_tp thread 0x7fe987e49700' had suicide timed
out after 150

You hit a suicide timeout, that's fatal. On line 80 the process kills
the thread based on the assumption it's hung.


src/common/HeartbeatMap.cc:

 66 bool HeartbeatMap::_check(const heartbeat_handle_d *h, const char *who,
 67                           ceph::coarse_mono_clock::rep now)
 68 {
 69   bool healthy = true;
 70   auto was = h->timeout.load();
 71   if (was && was < now) {
 72     ldout(m_cct, 1) << who << " '" << h->name << "'"
 73                     << " had timed out after " << h->grace << dendl;
 74     healthy = false;
 75   }
 76   was = h->suicide_timeout;
 77   if (was && was < now) {
 78     ldout(m_cct, 1) << who << " '" << h->name << "'"
 79                     << " had suicide timed out after " << h->suicide_grace << dendl;
 80     pthread_kill(h->thread_id, SIGABRT);
 81     sleep(1);
 82     ceph_abort_msg("hit suicide timeout");
 83   }
 84   return healthy;
 85 }

You can try increasing the relevant timeouts but you would be better
off looking for the underlying cause of the poor performance. There's
a lot of information out there if you search for "ceph suicide
timeout".

On Wed, Aug 7, 2019 at 9:16 AM EDH - Manuel Rios Fernandez
 wrote:
>
> Hi
>
>
>
> We got a pair of OSD located in  node that crash randomly since 14.2.2
>
>
>
> OS Version : Centos 7.6
>
>
>
> There’re a ton of lines before crash , I will unespected:
>
>
>
> --
>
> 3045> 2019-08-07 00:39:32.013 7fe9a4996700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15
>
> -3044> 2019-08-07 00:39:32.013 7fe9a3994700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15
>
> -3043> 2019-08-07 00:39:32.033 7fe9a4195700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15
>
> -3042> 2019-08-07 00:39:32.033 7fe9a4996700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15
>
> --
>
> -
>
>
>
> Some hundred lines of:
>
> -164> 2019-08-07 00:47:36.628 7fe9a3994700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60
>
>   -163> 2019-08-07 00:47:36.632 7fe9a3994700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60
>
>   -162> 2019-08-07 00:47:36.632 7fe9a3994700  1 heartbeat_map is_healthy 
> 'OSD::osd_op_tp thread 0x7fe98964c700' had timed out after 60
>
> -
>
>
>
>-78> 2019-08-07 00:50:51.755 7fe995bfa700 10 monclient: tick
>
>-77> 2019-08-07 00:50:51.755 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:50:21.756453)
>
>-76> 2019-08-07 00:51:01.755 7fe995bfa700 10 monclient: tick
>
>-75> 2019-08-07 00:51:01.755 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:50:31.756604)
>
>-74> 2019-08-07 00:51:11.755 7fe995bfa700 10 monclient: tick
>
>-73> 2019-08-07 00:51:11.755 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:50:41.756788)
>
>-72> 2019-08-07 00:51:21.756 7fe995bfa700 10 monclient: tick
>
>-71> 2019-08-07 00:51:21.756 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:50:51.756982)
>
>-70> 2019-08-07 00:51:31.755 7fe995bfa700 10 monclient: tick
>
>-69> 2019-08-07 00:51:31.755 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:51:01.757206)
>
>-68> 2019-08-07 00:51:41.756 7fe995bfa700 10 monclient: tick
>
>-67> 2019-08-07 00:51:41.756 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:51:11.757364)
>
>-66> 2019-08-07 00:51:51.756 7fe995bfa700 10 monclient: tick
>
>-65> 2019-08-07 00:51:51.756 7fe995bfa700 10 monclient: 
> _check_auth_rotating have uptodate secrets (they expire after 2019-08-07 
> 00:51:21.757535)
>
>-64> 2019-08-07 00:51:52.861 7fe987e49700  1 heartbeat_map clear_timeout 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had timed out after 15
>
>-63> 2019-08-07 00:51:52.861 7fe987e49700  1 heartbeat_map clear_timeout 
> 'OSD::osd_op_tp thread 0x7fe987e49700' had suicide timed out after 150
>
>-62> 2019-08-07 00:51:52.948 7fe99966c700  5 
> bluestore.MempoolThread(0x55ff04ad6a88) _tune_cache_size target: 4294967296 
> heap: 6018998272 unmapped: 1721180160 mapped: 4297818112 old cache_size: 
> 1994018210 new cache size: 1992784572
>
>-61> 2019-08-07 00:51:52.948 7fe99966c700  5 
> bluestore.MempoolThread(0x55ff04ad6a88) _trim_shards cache_size: 1992784572 
> kv_alloc: 763363328 kv_used: 749381098 meta_alloc: 763363328 meta_used: 
> 654593191 data_alloc: 452984832 data_used: 455929856
>
>-60> 2019-08-07 

Re: [ceph-users] set_mon_vals failed to set cluster_network Configuration option 'cluster_network' may not be modified at runtime

2019-07-02 Thread Brad Hubbard
I'd suggest creating a tracker similar to
http://tracker.ceph.com/issues/40554 which was created for the issue
in the thread you mentioned.

On Wed, Jul 3, 2019 at 12:29 AM Vandeir Eduardo
 wrote:
>
> Hi,
>
> on client machines, when I use the command rbd, for example, rbd ls
> poolname, this message is always displayed:
>
> 2019-07-02 11:18:10.613 7fb2eaffd700 -1 set_mon_vals failed to set
> cluster_network = 10.1.2.0/24: Configuration option 'cluster_network'
> may not be modified at runtime
> 2019-07-02 11:18:10.613 7fb2eaffd700 -1 set_mon_vals failed to set
> public_network = 10.1.1.0/24: Configuration option 'public_network'
> may not be modified at runtime
> 2019-07-02 11:18:10.621 7fb2ea7fc700 -1 set_mon_vals failed to set
> cluster_network = 10.1.2.0/24: Configuration option 'cluster_network'
> may not be modified at runtime
> 2019-07-02 11:18:10.621 7fb2ea7fc700 -1 set_mon_vals failed to set
> public_network = 10.1.1.0/24: Configuration option 'public_network'
> may not be modified at runtime
>
> After this, rbd image names are displayed normally.
>
> If I run this command on a ceph node, this "warning/information???"
> messages are not displayed. Is there a way to get ride of this? Its
> really annoying.
>
> The only thread I found about something similar was this:
> https://www.spinics.net/lists/ceph-devel/msg42657.html
>
> I already tryied the commands "ceph config rm global cluster_network"
> and "ceph config rm global public_network", but the messages still
> persist.
>
> Any ideas?
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] details about cloning objects using librados

2019-07-02 Thread Brad Hubbard
On Wed, Jul 3, 2019 at 4:25 AM Gregory Farnum  wrote:
>
> I'm not sure how or why you'd get an object class involved in doing
> this in the normal course of affairs.
>
> There's a copy_from op that a client can send and which copies an
> object from another OSD into the target object. That's probably the
> primitive you want to build on. Note that the OSD doesn't do much

Argh! yes, good idea. We really should document that!

> consistency checking (it validates that the object version matches an
> input, but if they don't it just returns an error) so the client
> application is responsible for any locking needed.
> -Greg
>
> On Tue, Jul 2, 2019 at 3:49 AM Brad Hubbard  wrote:
> >
> > Yes, this should be possible using an object class which is also a
> > RADOS client (via the RADOS API). You'll still have some client
> > traffic as the machine running the object class will still need to
> > connect to the relevant primary osd and send the write (presumably in
> > some situations though this will be the same machine).
> >
> > On Tue, Jul 2, 2019 at 4:08 PM nokia ceph  wrote:
> > >
> > > Hi Brett,
> > >
> > > I think I was wrong here in the requirement description. It is not about 
> > > data replication , we need same content stored in different object/name.
> > > We store video contents inside the ceph cluster. And our new requirement 
> > > is we need to store same content for different users , hence need same 
> > > content in different object name . if client sends write request for 
> > > object x and sets number of copies as 100, then cluster has to clone 100 
> > > copies of object x and store it as object x1, objectx2,etc. Currently 
> > > this is done in the client side where objectx1, object x2...objectx100 
> > > are cloned inside the client and write request sent for all 100 objects 
> > > which we want to avoid to reduce network consumption.
> > >
> > > Similar usecases are rbd snapshot , radosgw copy .
> > >
> > > Is this possible in object class ?
> > >
> > > thanks,
> > > Muthu
> > >
> > >
> > > On Mon, Jul 1, 2019 at 7:58 PM Brett Chancellor 
> > >  wrote:
> > >>
> > >> Ceph already does this by default. For each replicated pool, you can set 
> > >> the 'size' which is the number of copies you want Ceph to maintain. The 
> > >> accepted norm for replicas is 3, but you can set it higher if you want 
> > >> to incur the performance penalty.
> > >>
> > >> On Mon, Jul 1, 2019, 6:01 AM nokia ceph  wrote:
> > >>>
> > >>> Hi Brad,
> > >>>
> > >>> Thank you for your response , and we will check this video as well.
> > >>> Our requirement is while writing an object into the cluster , if we can 
> > >>> provide number of copies to be made , the network consumption between 
> > >>> client and cluster will be only for one object write. However , the 
> > >>> cluster will clone/copy multiple objects and stores inside the cluster.
> > >>>
> > >>> Thanks,
> > >>> Muthu
> > >>>
> > >>> On Fri, Jun 28, 2019 at 9:23 AM Brad Hubbard  
> > >>> wrote:
> > >>>>
> > >>>> On Thu, Jun 27, 2019 at 8:58 PM nokia ceph  
> > >>>> wrote:
> > >>>> >
> > >>>> > Hi Team,
> > >>>> >
> > >>>> > We have a requirement to create multiple copies of an object and 
> > >>>> > currently we are handling it in client side to write as separate 
> > >>>> > objects and this causes huge network traffic between client and 
> > >>>> > cluster.
> > >>>> > Is there possibility of cloning an object to multiple copies using 
> > >>>> > librados api?
> > >>>> > Please share the document details if it is feasible.
> > >>>>
> > >>>> It may be possible to use an object class to accomplish what you want
> > >>>> to achieve but the more we understand what you are trying to do, the
> > >>>> better the advice we can offer (at the moment your description sounds
> > >>>> like replication which is already part of RADOS as you know).
> > >>>>
> > >>>> More on object classes from Cephalocon Barcelona in May this year:
> > >>>> https://www.youtube.com/watch?v=EVrP9MXiiuU
> > >>>>
> > >>>> >
> > >>>> > Thanks,
> > >>>> > Muthu
> > >>>> > ___
> > >>>> > ceph-users mailing list
> > >>>> > ceph-users@lists.ceph.com
> > >>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Cheers,
> > >>>> Brad
> > >>>
> > >>> ___
> > >>> ceph-users mailing list
> > >>> ceph-users@lists.ceph.com
> > >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> >
> > --
> > Cheers,
> > Brad
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] details about cloning objects using librados

2019-07-02 Thread Brad Hubbard
Yes, this should be possible using an object class which is also a
RADOS client (via the RADOS API). You'll still have some client
traffic as the machine running the object class will still need to
connect to the relevant primary osd and send the write (presumably in
some situations though this will be the same machine).

On Tue, Jul 2, 2019 at 4:08 PM nokia ceph  wrote:
>
> Hi Brett,
>
> I think I was wrong here in the requirement description. It is not about data 
> replication , we need same content stored in different object/name.
> We store video contents inside the ceph cluster. And our new requirement is 
> we need to store same content for different users , hence need same content 
> in different object name . if client sends write request for object x and 
> sets number of copies as 100, then cluster has to clone 100 copies of object 
> x and store it as object x1, objectx2,etc. Currently this is done in the 
> client side where objectx1, object x2...objectx100 are cloned inside the 
> client and write request sent for all 100 objects which we want to avoid to 
> reduce network consumption.
>
> Similar usecases are rbd snapshot , radosgw copy .
>
> Is this possible in object class ?
>
> thanks,
> Muthu
>
>
> On Mon, Jul 1, 2019 at 7:58 PM Brett Chancellor  
> wrote:
>>
>> Ceph already does this by default. For each replicated pool, you can set the 
>> 'size' which is the number of copies you want Ceph to maintain. The accepted 
>> norm for replicas is 3, but you can set it higher if you want to incur the 
>> performance penalty.
>>
>> On Mon, Jul 1, 2019, 6:01 AM nokia ceph  wrote:
>>>
>>> Hi Brad,
>>>
>>> Thank you for your response , and we will check this video as well.
>>> Our requirement is while writing an object into the cluster , if we can 
>>> provide number of copies to be made , the network consumption between 
>>> client and cluster will be only for one object write. However , the cluster 
>>> will clone/copy multiple objects and stores inside the cluster.
>>>
>>> Thanks,
>>> Muthu
>>>
>>> On Fri, Jun 28, 2019 at 9:23 AM Brad Hubbard  wrote:
>>>>
>>>> On Thu, Jun 27, 2019 at 8:58 PM nokia ceph  
>>>> wrote:
>>>> >
>>>> > Hi Team,
>>>> >
>>>> > We have a requirement to create multiple copies of an object and 
>>>> > currently we are handling it in client side to write as separate objects 
>>>> > and this causes huge network traffic between client and cluster.
>>>> > Is there possibility of cloning an object to multiple copies using 
>>>> > librados api?
>>>> > Please share the document details if it is feasible.
>>>>
>>>> It may be possible to use an object class to accomplish what you want
>>>> to achieve but the more we understand what you are trying to do, the
>>>> better the advice we can offer (at the moment your description sounds
>>>> like replication which is already part of RADOS as you know).
>>>>
>>>> More on object classes from Cephalocon Barcelona in May this year:
>>>> https://www.youtube.com/watch?v=EVrP9MXiiuU
>>>>
>>>> >
>>>> > Thanks,
>>>> > Muthu
>>>> > ___
>>>> > ceph-users mailing list
>>>> > ceph-users@lists.ceph.com
>>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>>
>>>>
>>>> --
>>>> Cheers,
>>>> Brad
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] details about cloning objects using librados

2019-06-27 Thread Brad Hubbard
On Thu, Jun 27, 2019 at 8:58 PM nokia ceph  wrote:
>
> Hi Team,
>
> We have a requirement to create multiple copies of an object and currently we 
> are handling it in client side to write as separate objects and this causes 
> huge network traffic between client and cluster.
> Is there possibility of cloning an object to multiple copies using librados 
> api?
> Please share the document details if it is feasible.

It may be possible to use an object class to accomplish what you want
to achieve but the more we understand what you are trying to do, the
better the advice we can offer (at the moment your description sounds
like replication which is already part of RADOS as you know).

More on object classes from Cephalocon Barcelona in May this year:
https://www.youtube.com/watch?v=EVrP9MXiiuU

>
> Thanks,
> Muthu
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] obj_size_info_mismatch error handling

2019-06-17 Thread Brad Hubbard
Can you open a tracker for this, Dan, and provide scrub logs with
debug_osd=20 and the rados list-inconsistent-obj output?
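Roughly something like this (substitute your PG id and the OSDs in its acting set):

  ceph tell osd.N injectargs '--debug_osd 20'    # for each OSD in the acting set
  ceph pg deep-scrub <pgid>
  # once the deep scrub has completed
  rados list-inconsistent-obj <pgid> --format=json-pretty
  ceph tell osd.N injectargs '--debug_osd 1/5'   # put the log level back

and attach the OSD logs covering the scrub to the tracker.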

On Mon, Jun 3, 2019 at 10:44 PM Dan van der Ster  wrote:
>
> Hi Reed and Brad,
>
> Did you ever learn more about this problem?
> We currently have a few inconsistencies arriving with the same env
> (cephfs, v13.2.5) and symptoms.
>
> PG Repair doesn't fix the inconsistency, nor does Brad's omap
> workaround earlier in the thread.
> In our case, we can fix by cp'ing the file to a new inode, deleting
> the inconsistent file, then scrubbing the PG.
>
> -- Dan
>
>
> On Fri, May 3, 2019 at 3:18 PM Reed Dier  wrote:
> >
> > Just to follow up for the sake of the mailing list,
> >
> > I had not had a chance to attempt your steps yet, but things appear to have 
> > worked themselves out on their own.
> >
> > Both scrub errors cleared without intervention, and I'm not sure if it is 
> > the results of that object getting touched in CephFS that triggered the 
> > update of the size info, or if something else was able to clear it.
> >
> > Didn't see anything relating to the clearing in mon, mgr, or osd logs.
> >
> > So, not entirely sure what fixed it, but it is resolved on its own.
> >
> > Thanks,
> >
> > Reed
> >
> > On Apr 30, 2019, at 8:01 PM, Brad Hubbard  wrote:
> >
> > On Wed, May 1, 2019 at 10:54 AM Brad Hubbard  wrote:
> >
> >
> > Which size is correct?
> >
> >
> > Sorry, accidental discharge =D
> >
> > If the object info size is *incorrect* try forcing a write to the OI
> > with something like the following.
> >
> > 1. rados -p [name_of_pool_17] setomapval 10008536718.
> > temporary-key anything
> > 2. ceph pg deep-scrub 17.2b9
> > 3. Wait for the scrub to finish
> > 4. rados -p [name_of_pool_17] rmomapkey 10008536718. temporary-key
> >
> > If the object info size is *correct* you could try just doing a rados
> > get followed by a rados put of the object to see if the size is
> > updated correctly.
> >
> > It's more likely the object info size is wrong IMHO.
> >
> >
> > On Tue, Apr 30, 2019 at 1:06 AM Reed Dier  wrote:
> >
> >
> > Hi list,
> >
> > Woke up this morning to two PG's reporting scrub errors, in a way that I 
> > haven't seen before.
> >
> > $ ceph versions
> > {
> >"mon": {
> >"ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 3
> >},
> >"mgr": {
> >"ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 3
> >},
> >"osd": {
> >"ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) 
> > mimic (stable)": 156
> >},
> >"mds": {
> >"ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 2
> >},
> >"overall": {
> >"ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) 
> > mimic (stable)": 156,
> >"ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 8
> >}
> > }
> >
> >
> > OSD_SCRUB_ERRORS 8 scrub errors
> > PG_DAMAGED Possible data damage: 2 pgs inconsistent
> >pg 17.72 is active+clean+inconsistent, acting [3,7,153]
> >pg 17.2b9 is active+clean+inconsistent, acting [19,7,16]
> >
> >
> > Here is what $rados list-inconsistent-obj 17.2b9 --format=json-pretty 
> > yields:
> >
> > {
> >"epoch": 134582,
> >"inconsistents": [
> >{
> >"object": {
> >"name": "10008536718.",
> >"nspace": "",
> >"locator": "",
> >"snap": "head",
> >"version": 0
> >},
> >"errors": [],
> >"union_shard_errors": [
> >"obj_size_info_mismatch"
> >],
> >"shards": [
> >{
> >"osd": 7,
> >"primary": false,
> >"errors": [
> >"obj_size_info_mismatch"
> > 

Re: [ceph-users] obj_size_info_mismatch error handling

2019-04-30 Thread Brad Hubbard
On Wed, May 1, 2019 at 10:54 AM Brad Hubbard  wrote:
>
> Which size is correct?

Sorry, accidental discharge =D

If the object info size is *incorrect* try forcing a write to the OI
with something like the following.

1. rados -p [name_of_pool_17] setomapval 10008536718.
temporary-key anything
2. ceph pg deep-scrub 17.2b9
3. Wait for the scrub to finish
4. rados -p [name_of_pool_17] rmomapkey 10008536718. temporary-key

If the object info size is *correct* you could try just doing a rados
get followed by a rados put of the object to see if the size is
updated correctly.

It's more likely the object info size is wrong IMHO.

>
> On Tue, Apr 30, 2019 at 1:06 AM Reed Dier  wrote:
> >
> > Hi list,
> >
> > Woke up this morning to two PG's reporting scrub errors, in a way that I 
> > haven't seen before.
> >
> > $ ceph versions
> > {
> > "mon": {
> > "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 3
> > },
> > "mgr": {
> > "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 3
> > },
> > "osd": {
> > "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) 
> > mimic (stable)": 156
> > },
> > "mds": {
> > "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 2
> > },
> > "overall": {
> > "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) 
> > mimic (stable)": 156,
> > "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) 
> > mimic (stable)": 8
> > }
> > }
> >
> >
> > OSD_SCRUB_ERRORS 8 scrub errors
> > PG_DAMAGED Possible data damage: 2 pgs inconsistent
> > pg 17.72 is active+clean+inconsistent, acting [3,7,153]
> > pg 17.2b9 is active+clean+inconsistent, acting [19,7,16]
> >
> >
> > Here is what $rados list-inconsistent-obj 17.2b9 --format=json-pretty 
> > yields:
> >
> > {
> > "epoch": 134582,
> > "inconsistents": [
> > {
> > "object": {
> > "name": "10008536718.",
> > "nspace": "",
> > "locator": "",
> > "snap": "head",
> > "version": 0
> > },
> > "errors": [],
> > "union_shard_errors": [
> > "obj_size_info_mismatch"
> > ],
> > "shards": [
> > {
> > "osd": 7,
> > "primary": false,
> > "errors": [
> > "obj_size_info_mismatch"
> > ],
> > "size": 5883,
> > "object_info": {
> > "oid": {
> > "oid": "10008536718.",
> > "key": "",
> > "snapid": -2,
> > "hash": 1752643257,
> > "max": 0,
> > "pool": 17,
> > "namespace": ""
> > },
> > "version": "134599'448331",
> > "prior_version": "134599'448330",
> > "last_reqid": "client.1580931080.0:671854",
> > "user_version": 448331,
> > "size": 3505,
> > "mtime": "2019-04-28 15:32:20.003519",
> > "local_mtime": "2019-04-28 15:32:25.991015",
> > "lost": 0,
> > "flags": [
> > "dirty",
> > "data_digest",
> > "omap_digest"
> > ],
> > "truncat

Re: [ceph-users] obj_size_info_mismatch error handling

2019-04-30 Thread Brad Hubbard
Which size is correct?

On Tue, Apr 30, 2019 at 1:06 AM Reed Dier  wrote:
>
> Hi list,
>
> Woke up this morning to two PG's reporting scrub errors, in a way that I 
> haven't seen before.
>
> $ ceph versions
> {
> "mon": {
> "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic 
> (stable)": 3
> },
> "mgr": {
> "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic 
> (stable)": 3
> },
> "osd": {
> "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic 
> (stable)": 156
> },
> "mds": {
> "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic 
> (stable)": 2
> },
> "overall": {
> "ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic 
> (stable)": 156,
> "ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic 
> (stable)": 8
> }
> }
>
>
> OSD_SCRUB_ERRORS 8 scrub errors
> PG_DAMAGED Possible data damage: 2 pgs inconsistent
> pg 17.72 is active+clean+inconsistent, acting [3,7,153]
> pg 17.2b9 is active+clean+inconsistent, acting [19,7,16]
>
>
> Here is what $rados list-inconsistent-obj 17.2b9 --format=json-pretty yields:
>
> {
> "epoch": 134582,
> "inconsistents": [
> {
> "object": {
> "name": "10008536718.",
> "nspace": "",
> "locator": "",
> "snap": "head",
> "version": 0
> },
> "errors": [],
> "union_shard_errors": [
> "obj_size_info_mismatch"
> ],
> "shards": [
> {
> "osd": 7,
> "primary": false,
> "errors": [
> "obj_size_info_mismatch"
> ],
> "size": 5883,
> "object_info": {
> "oid": {
> "oid": "10008536718.",
> "key": "",
> "snapid": -2,
> "hash": 1752643257,
> "max": 0,
> "pool": 17,
> "namespace": ""
> },
> "version": "134599'448331",
> "prior_version": "134599'448330",
> "last_reqid": "client.1580931080.0:671854",
> "user_version": 448331,
> "size": 3505,
> "mtime": "2019-04-28 15:32:20.003519",
> "local_mtime": "2019-04-28 15:32:25.991015",
> "lost": 0,
> "flags": [
> "dirty",
> "data_digest",
> "omap_digest"
> ],
> "truncate_seq": 899,
> "truncate_size": 0,
> "data_digest": "0xf99a3bd3",
> "omap_digest": "0x",
> "expected_object_size": 0,
> "expected_write_size": 0,
> "alloc_hint_flags": 0,
> "manifest": {
> "type": 0
> },
> "watchers": {}
> }
> },
> {
> "osd": 16,
> "primary": false,
> "errors": [
> "obj_size_info_mismatch"
> ],
> "size": 5883,
> "object_info": {
> "oid": {
> "oid": "10008536718.",
> "key": "",
> "snapid": -2,
> "hash": 1752643257,
> "max": 0,
> "pool": 17,
> "namespace": ""
> },
> "version": "134599'448331",
> "prior_version": "134599'448330",
> "last_reqid": "client.1580931080.0:671854",
> "user_version": 448331,
> "size": 3505,
> "mtime": "2019-04-28 15:32:20.003519",
> "local_mtime": "2019-04-28 15:32:25.991015",
> "lost": 0,
> "flags": [
> "dirty",
> "data_digest",
> "omap_digest"
> ],
> "truncate_seq": 899,
> "truncate_size": 0,
> "data_digest": "0xf99a3bd3",
>   

Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-21 Thread Brad Hubbard
Glad it worked.

On Mon, Apr 22, 2019 at 11:01 AM Can Zhang  wrote:
>
> Thanks for your detailed response.
>
> I freshly installed a CentOS 7.6 and run install-deps.sh and
> do_cmake.sh this time, and it works this time. Maybe the problem was
> caused by dirty environment.
>
>
> Best,
> Can Zhang
>
>
> On Fri, Apr 19, 2019 at 6:28 PM Brad Hubbard  wrote:
> >
> > OK. So this works for me with master commit
> > bdaac2d619d603f53a16c07f9d7bd47751137c4c on Centos 7.5.1804.
> >
> > I cloned the repo and ran './install-deps.sh' and './do_cmake.sh
> > -DWITH_FIO=ON' then 'make all'.
> >
> > # find ./lib  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > _ZTIN13PriorityCache8PriCacheE
> > ./lib/libfio_ceph_objectstore.so:018f72d0 V
> > _ZTIN13PriorityCache8PriCacheE
> >
> > # LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
> > conf: Path to a ceph configuration file
> > oi_attr_len : Set OI(aka '_') attribute to specified length
> > snapset_attr_len: Set 'snapset' attribute to specified length
> > _fastinfo_omap_len  : Set '_fastinfo' OMAP attribute to specified length
> > pglog_simulation: Enables PG Log simulation behavior
> > pglog_omap_len  : Set pglog omap entry to specified length
> > pglog_dup_omap_len  : Set duplicate pglog omap entry to specified length
> > single_pool_mode: Enables the mode when all jobs run against
> > the same pool
> > preallocate_files   : Enables/disables file preallocation (touch
> > and resize) on init
> >
> > So my result above matches your result on Ubuntu but not on CentOS. It
> > looks to me like the symbol used to be defined in libceph-common but is
> > now defined in libfio_ceph_objectstore.so. For reasons that are
> > unclear you are seeing the old behaviour. Why this is and why it isn't
> > working as designed is not clear to me but I suspect if you clone the
> > repo again and build from scratch (maybe in a different directory if
> > you wish to keep debugging, see below) you should get a working
> > result. Could you try that as a test?
> >
> > If, on the other hand, you wish to keep debugging your current
> > environment I'd suggest looking at the output of the following command
> > as it may shed further light on the issue.
> >
> > # LD_DEBUG=all LD_LIBRARY_PATH=./lib ./bin/fio
> > --enghelp=libfio_ceph_objectstore.so
> >
> > 'LD_DEBUG=libs' may suffice but that's difficult to judge without
> > knowing what the problem is. I still suspect somehow you have
> > mis-matched libraries and, if that's the case, it's probably not worth
> > pursuing. If you can give me specific steps so I can reproduce this
> > from a freshly cloned tree I'd be happy to look further into it.
> >
> > Good luck.
> >
> > On Thu, Apr 18, 2019 at 7:00 PM Brad Hubbard  wrote:
> > >
> > > Let me try to reproduce this on centos 7.5 with master and I'll let
> > > you know how I go.
> > >
> > > On Thu, Apr 18, 2019 at 3:59 PM Can Zhang  wrote:
> > > >
> > > > Using the commands you provided, I actually find some differences:
> > > >
> > > > On my CentOS VM:
> > > > ```
> > > > # sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > ./libceph-common.so:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> > > > ./libceph-common.so.0:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> > > > ./libfio_ceph_objectstore.so: U 
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > ```
> > > > ```
> > > > # ldd libfio_ceph_objectstore.so |grep common
> > > > libceph-common.so.0 => /root/ceph/build/lib/libceph-common.so.0
> > > > (0x7fd13f3e7000)
> > > > ```
> > > > On my Ubuntu VM:
> > > > ```
> > > > $ sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > ./libfio_ceph_objectstore.so:019d13e0 V 
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > ```
> > > > ```
> > > > $ ldd libfio_ceph_objectstore.so |grep common
> > > > libceph-common.so.0 =>
> > > > /home/can/work/ceph/build/lib/libceph-common.so.0 (0x7f024a89e000)
> > > > ```
> > > >
> > > > Notice the "U" and "V" from nm results.
>

Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-19 Thread Brad Hubbard
OK. So this works for me with master commit
bdaac2d619d603f53a16c07f9d7bd47751137c4c on Centos 7.5.1804.

I cloned the repo and ran './install-deps.sh' and './do_cmake.sh
-DWITH_FIO=ON' then 'make all'.

# find ./lib  -iname '*.so*' | xargs nm -AD 2>&1 | grep
_ZTIN13PriorityCache8PriCacheE
./lib/libfio_ceph_objectstore.so:018f72d0 V
_ZTIN13PriorityCache8PriCacheE

# LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
conf: Path to a ceph configuration file
oi_attr_len : Set OI(aka '_') attribute to specified length
snapset_attr_len: Set 'snapset' attribute to specified length
_fastinfo_omap_len  : Set '_fastinfo' OMAP attribute to specified length
pglog_simulation: Enables PG Log simulation behavior
pglog_omap_len  : Set pglog omap entry to specified length
pglog_dup_omap_len  : Set duplicate pglog omap entry to specified length
single_pool_mode: Enables the mode when all jobs run against
the same pool
preallocate_files   : Enables/disables file preallocation (touch
and resize) on init

So my result above matches your result on Ubuntu but not on CentOS. It
looks to me like the symbol used to be defined in libceph-common but is
now defined in libfio_ceph_objectstore.so. For reasons that are
unclear you are seeing the old behaviour. Why this is and why it isn't
working as designed is not clear to me but I suspect if you clone the
repo again and build from scratch (maybe in a different directory if
you wish to keep debugging, see below) you should get a working
result. Could you try that as a test?

If, on the other hand, you wish to keep debugging your current
environment I'd suggest looking at the output of the following command
as it may shed further light on the issue.

# LD_DEBUG=all LD_LIBRARY_PATH=./lib ./bin/fio
--enghelp=libfio_ceph_objectstore.so
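(That will be extremely verbose. If it helps, something along the lines of

# LD_DEBUG=libs LD_DEBUG_OUTPUT=/tmp/fio-ld LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so

writes the loader trace to /tmp/fio-ld.<pid> instead of stderr.)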

'LD_DEBUG=libs' may suffice but that's difficult to judge without
knowing what the problem is. I still suspect somehow you have
mis-matched libraries and, if that's the case, it's probably not worth
pursuing. If you can give me specific steps so I can reproduce this
from a freshly cloned tree I'd be happy to look further into it.

Good luck.

On Thu, Apr 18, 2019 at 7:00 PM Brad Hubbard  wrote:
>
> Let me try to reproduce this on centos 7.5 with master and I'll let
> you know how I go.
>
> On Thu, Apr 18, 2019 at 3:59 PM Can Zhang  wrote:
> >
> > Using the commands you provided, I actually find some differences:
> >
> > On my CentOS VM:
> > ```
> > # sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > _ZTIN13PriorityCache8PriCacheE
> > ./libceph-common.so:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> > ./libceph-common.so.0:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> > ./libfio_ceph_objectstore.so: U 
> > _ZTIN13PriorityCache8PriCacheE
> > ```
> > ```
> > # ldd libfio_ceph_objectstore.so |grep common
> > libceph-common.so.0 => /root/ceph/build/lib/libceph-common.so.0
> > (0x7fd13f3e7000)
> > ```
> > On my Ubuntu VM:
> > ```
> > $ sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > _ZTIN13PriorityCache8PriCacheE
> > ./libfio_ceph_objectstore.so:019d13e0 V 
> > _ZTIN13PriorityCache8PriCacheE
> > ```
> > ```
> > $ ldd libfio_ceph_objectstore.so |grep common
> > libceph-common.so.0 =>
> > /home/can/work/ceph/build/lib/libceph-common.so.0 (0x7f024a89e000)
> > ```
> >
> > Notice the "U" and "V" from nm results.
> >
> >
> >
> >
> > Best,
> > Can Zhang
> >
> > On Thu, Apr 18, 2019 at 9:36 AM Brad Hubbard  wrote:
> > >
> > > Does it define _ZTIN13PriorityCache8PriCacheE ? If it does, and all is
> > > as you say, then it should not say that _ZTIN13PriorityCache8PriCacheE
> > > is undefined. Does ldd show that it is finding the libraries you think
> > > it is? Either it is finding a different version of that library
> > > somewhere else or the version you have may not define that symbol.
> > >
> > > On Thu, Apr 18, 2019 at 11:12 AM Can Zhang  wrote:
> > > >
> > > > It's already in LD_LIBRARY_PATH, under the same directory of
> > > > libfio_ceph_objectstore.so
> > > >
> > > >
> > > > $ ll lib/|grep libceph-common
> > > > lrwxrwxrwx. 1 root root19 Apr 17 11:15 libceph-common.so ->
> > > > libceph-common.so.0
> > > > -rwxr-xr-x. 1 root root 211853400 Apr 17 11:15 libceph-common.so.0
> > > >
> > > >
> > > >
> > > >
> > > > Best,
> > > > Can Zhang

Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-18 Thread Brad Hubbard
Let me try to reproduce this on centos 7.5 with master and I'll let
you know how I go.

On Thu, Apr 18, 2019 at 3:59 PM Can Zhang  wrote:
>
> Using the commands you provided, I actually find some differences:
>
> On my CentOS VM:
> ```
> # sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> _ZTIN13PriorityCache8PriCacheE
> ./libceph-common.so:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> ./libceph-common.so.0:0221cc08 V _ZTIN13PriorityCache8PriCacheE
> ./libfio_ceph_objectstore.so: U _ZTIN13PriorityCache8PriCacheE
> ```
> ```
> # ldd libfio_ceph_objectstore.so |grep common
> libceph-common.so.0 => /root/ceph/build/lib/libceph-common.so.0
> (0x7fd13f3e7000)
> ```
> On my Ubuntu VM:
> ```
> $ sudo find ./lib*  -iname '*.so*' | xargs nm -AD 2>&1 | grep
> _ZTIN13PriorityCache8PriCacheE
> ./libfio_ceph_objectstore.so:019d13e0 V _ZTIN13PriorityCache8PriCacheE
> ```
> ```
> $ ldd libfio_ceph_objectstore.so |grep common
> libceph-common.so.0 =>
> /home/can/work/ceph/build/lib/libceph-common.so.0 (0x7f024a89e000)
> ```
>
> Notice the "U" and "V" from nm results.
>
>
>
>
> Best,
> Can Zhang
>
> On Thu, Apr 18, 2019 at 9:36 AM Brad Hubbard  wrote:
> >
> > Does it define _ZTIN13PriorityCache8PriCacheE ? If it does, and all is
> > as you say, then it should not say that _ZTIN13PriorityCache8PriCacheE
> > is undefined. Does ldd show that it is finding the libraries you think
> > it is? Either it is finding a different version of that library
> > somewhere else or the version you have may not define that symbol.
> >
> > On Thu, Apr 18, 2019 at 11:12 AM Can Zhang  wrote:
> > >
> > > It's already in LD_LIBRARY_PATH, under the same directory of
> > > libfio_ceph_objectstore.so
> > >
> > >
> > > $ ll lib/|grep libceph-common
> > > lrwxrwxrwx. 1 root root19 Apr 17 11:15 libceph-common.so ->
> > > libceph-common.so.0
> > > -rwxr-xr-x. 1 root root 211853400 Apr 17 11:15 libceph-common.so.0
> > >
> > >
> > >
> > >
> > > Best,
> > > Can Zhang
> > >
> > > On Thu, Apr 18, 2019 at 7:00 AM Brad Hubbard  wrote:
> > > >
> > > > On Wed, Apr 17, 2019 at 1:37 PM Can Zhang  wrote:
> > > > >
> > > > > Thanks for your suggestions.
> > > > >
> > > > > I tried to build libfio_ceph_objectstore.so, but it fails to load:
> > > > >
> > > > > ```
> > > > > $ LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
> > > > >
> > > > > fio: engine libfio_ceph_objectstore.so not loadable
> > > > > IO engine libfio_ceph_objectstore.so not found
> > > > > ```
> > > > >
> > > > > I managed to print the dlopen error, it said:
> > > > >
> > > > > ```
> > > > > dlopen error: ./lib/libfio_ceph_objectstore.so: undefined symbol:
> > > > > _ZTIN13PriorityCache8PriCacheE
> > > >
> > > > $ c++filt _ZTIN13PriorityCache8PriCacheE
> > > > typeinfo for PriorityCache::PriCache
> > > >
> > > > $ sudo find /lib* /usr/lib* -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > /usr/lib64/ceph/libceph-common.so:008edab0 V
> > > > _ZTIN13PriorityCache8PriCacheE
> > > > /usr/lib64/ceph/libceph-common.so.0:008edab0 V
> > > > _ZTIN13PriorityCache8PriCacheE
> > > >
> > > > It needs to be able to find libceph-common, put it in your path or 
> > > > preload it.
> > > >
> > > > > ```
> > > > >
> > > > > I found a not-so-relevant
> > > > > issue(https://tracker.ceph.com/issues/38360), the error seems to be
> > > > > caused by mixed versions. My build environment is CentOS 7.5.1804 with
> > > > > SCL devtoolset-7, and ceph is latest master branch. Does someone know
> > > > > about the symbol?
> > > > >
> > > > >
> > > > > Best,
> > > > > Can Zhang
> > > > >
> > > > > Best,
> > > > > Can Zhang
> > > > >
> > > > >
> > > > > On Tue, Apr 16, 2019 at 8:37 PM Igor Fedotov  wrote:
> > > > > >
> > > > > > Besides already mentioned store_test.cc one can also use ce

Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-17 Thread Brad Hubbard
Does it define _ZTIN13PriorityCache8PriCacheE ? If it does, and all is
as you say, then it should not say that _ZTIN13PriorityCache8PriCacheE
is undefined. Does ldd show that it is finding the libraries you think
it is? Either it is finding a different version of that library
somewhere else or the version you have may not define that symbol.

On Thu, Apr 18, 2019 at 11:12 AM Can Zhang  wrote:
>
> It's already in LD_LIBRARY_PATH, under the same directory of
> libfio_ceph_objectstore.so
>
>
> $ ll lib/|grep libceph-common
> lrwxrwxrwx. 1 root root19 Apr 17 11:15 libceph-common.so ->
> libceph-common.so.0
> -rwxr-xr-x. 1 root root 211853400 Apr 17 11:15 libceph-common.so.0
>
>
>
>
> Best,
> Can Zhang
>
> On Thu, Apr 18, 2019 at 7:00 AM Brad Hubbard  wrote:
> >
> > On Wed, Apr 17, 2019 at 1:37 PM Can Zhang  wrote:
> > >
> > > Thanks for your suggestions.
> > >
> > > I tried to build libfio_ceph_objectstore.so, but it fails to load:
> > >
> > > ```
> > > $ LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
> > >
> > > fio: engine libfio_ceph_objectstore.so not loadable
> > > IO engine libfio_ceph_objectstore.so not found
> > > ```
> > >
> > > I managed to print the dlopen error, it said:
> > >
> > > ```
> > > dlopen error: ./lib/libfio_ceph_objectstore.so: undefined symbol:
> > > _ZTIN13PriorityCache8PriCacheE
> >
> > $ c++filt _ZTIN13PriorityCache8PriCacheE
> > typeinfo for PriorityCache::PriCache
> >
> > $ sudo find /lib* /usr/lib* -iname '*.so*' | xargs nm -AD 2>&1 | grep
> > _ZTIN13PriorityCache8PriCacheE
> > /usr/lib64/ceph/libceph-common.so:008edab0 V
> > _ZTIN13PriorityCache8PriCacheE
> > /usr/lib64/ceph/libceph-common.so.0:008edab0 V
> > _ZTIN13PriorityCache8PriCacheE
> >
> > It needs to be able to find libceph-common, put it in your path or preload 
> > it.
> >
> > > ```
> > >
> > > I found a not-so-relevant
> > > issue(https://tracker.ceph.com/issues/38360), the error seems to be
> > > caused by mixed versions. My build environment is CentOS 7.5.1804 with
> > > SCL devtoolset-7, and ceph is latest master branch. Does someone know
> > > about the symbol?
> > >
> > >
> > > Best,
> > > Can Zhang
> > >
> > > Best,
> > > Can Zhang
> > >
> > >
> > > On Tue, Apr 16, 2019 at 8:37 PM Igor Fedotov  wrote:
> > > >
> > > > Besides already mentioned store_test.cc one can also use ceph
> > > > objectstore fio plugin
> > > > (https://github.com/ceph/ceph/tree/master/src/test/fio) to access
> > > > standalone BlueStore instance from FIO benchmarking tool.
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Igor
> > > >
> > > > On 4/16/2019 7:58 AM, Can ZHANG wrote:
> > > > > Hi,
> > > > >
> > > > > I'd like to run a standalone Bluestore instance so as to test and tune
> > > > > its performance. Are there any tools about it, or any suggestions?
> > > > >
> > > > >
> > > > >
> > > > > Best,
> > > > > Can Zhang
> > > > >
> > > > > ___
> > > > > ceph-users mailing list
> > > > > ceph-users@lists.ceph.com
> > > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> >
> > --
> > Cheers,
> > Brad



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-17 Thread Brad Hubbard
On Wed, Apr 17, 2019 at 1:37 PM Can Zhang  wrote:
>
> Thanks for your suggestions.
>
> I tried to build libfio_ceph_objectstore.so, but it fails to load:
>
> ```
> $ LD_LIBRARY_PATH=./lib ./bin/fio --enghelp=libfio_ceph_objectstore.so
>
> fio: engine libfio_ceph_objectstore.so not loadable
> IO engine libfio_ceph_objectstore.so not found
> ```
>
> I managed to print the dlopen error, it said:
>
> ```
> dlopen error: ./lib/libfio_ceph_objectstore.so: undefined symbol:
> _ZTIN13PriorityCache8PriCacheE

$ c++filt _ZTIN13PriorityCache8PriCacheE
typeinfo for PriorityCache::PriCache

$ sudo find /lib* /usr/lib* -iname '*.so*' | xargs nm -AD 2>&1 | grep
_ZTIN13PriorityCache8PriCacheE
/usr/lib64/ceph/libceph-common.so:008edab0 V
_ZTIN13PriorityCache8PriCacheE
/usr/lib64/ceph/libceph-common.so.0:008edab0 V
_ZTIN13PriorityCache8PriCacheE

It needs to be able to find libceph-common, put it in your path or preload it.
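
Concretely, that means something like one of these (paths are
illustrative -- point them at a lib directory that actually contains a
libceph-common.so defining the symbol):

$ LD_LIBRARY_PATH=/path/to/ceph/build/lib ./bin/fio \
    --enghelp=libfio_ceph_objectstore.so

$ LD_PRELOAD=/path/to/ceph/build/lib/libceph-common.so ./bin/fio \
    --enghelp=libfio_ceph_objectstore.so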

> ```
>
> I found a not-so-relevant
> issue(https://tracker.ceph.com/issues/38360), the error seems to be
> caused by mixed versions. My build environment is CentOS 7.5.1804 with
> SCL devtoolset-7, and ceph is latest master branch. Does someone know
> about the symbol?
>
>
> Best,
> Can Zhang
>
> Best,
> Can Zhang
>
>
> On Tue, Apr 16, 2019 at 8:37 PM Igor Fedotov  wrote:
> >
> > Besides already mentioned store_test.cc one can also use ceph
> > objectstore fio plugin
> > (https://github.com/ceph/ceph/tree/master/src/test/fio) to access
> > standalone BlueStore instance from FIO benchmarking tool.
> >
> >
> > Thanks,
> >
> > Igor
> >
> > On 4/16/2019 7:58 AM, Can ZHANG wrote:
> > > Hi,
> > >
> > > I'd like to run a standalone Bluestore instance so as to test and tune
> > > its performance. Are there any tools about it, or any suggestions?
> > >
> > >
> > >
> > > Best,
> > > Can Zhang
> > >
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] showing active config settings

2019-04-16 Thread Brad Hubbard
$ ceph config set osd osd_recovery_max_active 4
$ ceph daemon osd.0 config diff|grep -A5 osd_recovery_max_active
    "osd_recovery_max_active": {
        "default": 3,
        "mon": 4,
        "override": 4,
        "final": 4
    },

On Wed, Apr 17, 2019 at 5:29 AM solarflow99  wrote:
>
> I wish there was a way to query the running settings from one of the MGR 
> hosts, and it doesn't help that ansible doesn't even copy the keyring to the 
> OSD nodes so commands there wouldn't work anyway.
> I'm still puzzled why it doesn't show any change when I run this no matter 
> what I set it to:
>
> # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> osd_recovery_max_active = 3
>
> in fact it doesn't matter if I use an OSD number that doesn't exist, same 
> thing if I use ceph get
>
>
>
> On Tue, Apr 16, 2019 at 1:18 AM Brad Hubbard  wrote:
>>
>> On Tue, Apr 16, 2019 at 6:03 PM Paul Emmerich  wrote:
>> >
>> > This works, it just says that it *might* require a restart, but this
>> > particular option takes effect without a restart.
>>
>> We've already looked at changing the wording once to make it more palatable.
>>
>> http://tracker.ceph.com/issues/18424
>>
>> >
>> > Implementation detail: this message shows up if there's no internal
>> > function to be called when this option changes, so it can't be sure if
>> > the change is actually doing anything because the option might be
>> > cached or only read on startup. But in this case this option is read
>> > in the relevant path every time and no notification is required. But
>> > the injectargs command can't know that.
>>
>> Right on all counts. The functions are referred to as observers and
>> register to be notified if the value changes, hence "not observed."
>>
>> >
>> > Paul
>> >
>> > On Mon, Apr 15, 2019 at 11:38 PM solarflow99  wrote:
>> > >
>> > > Then why doesn't this work?
>> > >
>> > > # ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
>> > > osd.0: osd_recovery_max_active = '4' (not observed, change may require 
>> > > restart)
>> > > osd.1: osd_recovery_max_active = '4' (not observed, change may require 
>> > > restart)
>> > > osd.2: osd_recovery_max_active = '4' (not observed, change may require 
>> > > restart)
>> > > osd.3: osd_recovery_max_active = '4' (not observed, change may require 
>> > > restart)
>> > > osd.4: osd_recovery_max_active = '4' (not observed, change may require 
>> > > restart)
>> > >
>> > > # ceph -n osd.1 --show-config | grep osd_recovery_max_active
>> > > osd_recovery_max_active = 3
>> > >
>> > >
>> > >
>> > > On Wed, Apr 10, 2019 at 7:21 AM Eugen Block  wrote:
>> > >>
>> > >> > I always end up using "ceph --admin-daemon
>> > >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get 
>> > >> > what
>> > >> > is in effect now for a certain daemon.
>> > >> > Needs you to be on the host of the daemon of course.
>> > >>
>> > >> Me too, I just wanted to try what OP reported. And after trying that,
>> > >> I'll keep it that way. ;-)
>> > >>
>> > >>
>> > >> Zitat von Janne Johansson :
>> > >>
>> > >> > Den ons 10 apr. 2019 kl 13:37 skrev Eugen Block :
>> > >> >
>> > >> >> > If you don't specify which daemon to talk to, it tells you what the
>> > >> >> > defaults would be for a random daemon started just now using the 
>> > >> >> > same
>> > >> >> > config as you have in /etc/ceph/ceph.conf.
>> > >> >>
>> > >> >> I tried that, too, but the result is not correct:
>> > >> >>
>> > >> >> host1:~ # ceph -n osd.1 --show-config | grep osd_recovery_max_active
>> > >> >> osd_recovery_max_active = 3
>> > >> >>
>> > >> >
>> > >> > I always end up using "ceph --admin-daemon
>> > >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get 
>> > >> > what
>> > >> > is in effect now for a certain daemon.
>> > >> > Needs you to be on the host of the daemon of course.
>> > >> >
>> > >> > --
>> > >> > May the most significant bit of your life be positive.
>> > >>
>> > >>
>> > >>
>> > >> ___
>> > >> ceph-users mailing list
>> > >> ceph-users@lists.ceph.com
>> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> > >
>> > > ___
>> > > ceph-users mailing list
>> > > ceph-users@lists.ceph.com
>> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>> --
>> Cheers,
>> Brad



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] showing active config settings

2019-04-16 Thread Brad Hubbard
On Tue, Apr 16, 2019 at 6:03 PM Paul Emmerich  wrote:
>
> This works, it just says that it *might* require a restart, but this
> particular option takes effect without a restart.

We've already looked at changing the wording once to make it more palatable.

http://tracker.ceph.com/issues/18424

>
> Implementation detail: this message shows up if there's no internal
> function to be called when this option changes, so it can't be sure if
> the change is actually doing anything because the option might be
> cached or only read on startup. But in this case this option is read
> in the relevant path every time and no notification is required. But
> the injectargs command can't know that.

Right on all counts. The functions are referred to as observers and
register to be notified if the value changes, hence "not observed."

>
> Paul
>
> On Mon, Apr 15, 2019 at 11:38 PM solarflow99  wrote:
> >
> > Then why doesn't this work?
> >
> > # ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
> > osd.0: osd_recovery_max_active = '4' (not observed, change may require 
> > restart)
> > osd.1: osd_recovery_max_active = '4' (not observed, change may require 
> > restart)
> > osd.2: osd_recovery_max_active = '4' (not observed, change may require 
> > restart)
> > osd.3: osd_recovery_max_active = '4' (not observed, change may require 
> > restart)
> > osd.4: osd_recovery_max_active = '4' (not observed, change may require 
> > restart)
> >
> > # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> > osd_recovery_max_active = 3
> >
> >
> >
> > On Wed, Apr 10, 2019 at 7:21 AM Eugen Block  wrote:
> >>
> >> > I always end up using "ceph --admin-daemon
> >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get 
> >> > what
> >> > is in effect now for a certain daemon.
> >> > Needs you to be on the host of the daemon of course.
> >>
> >> Me too, I just wanted to try what OP reported. And after trying that,
> >> I'll keep it that way. ;-)
> >>
> >>
> >> Zitat von Janne Johansson :
> >>
> >> > Den ons 10 apr. 2019 kl 13:37 skrev Eugen Block :
> >> >
> >> >> > If you don't specify which daemon to talk to, it tells you what the
> >> >> > defaults would be for a random daemon started just now using the same
> >> >> > config as you have in /etc/ceph/ceph.conf.
> >> >>
> >> >> I tried that, too, but the result is not correct:
> >> >>
> >> >> host1:~ # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> >> >> osd_recovery_max_active = 3
> >> >>
> >> >
> >> > I always end up using "ceph --admin-daemon
> >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get 
> >> > what
> >> > is in effect now for a certain daemon.
> >> > Needs you to be on the host of the daemon of course.
> >> >
> >> > --
> >> > May the most significant bit of your life be positive.
> >>
> >>
> >>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] showing active config settings

2019-04-15 Thread Brad Hubbard
On Tue, Apr 16, 2019 at 7:38 AM solarflow99  wrote:
>
> Then why doesn't this work?
>
> # ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
> osd.0: osd_recovery_max_active = '4' (not observed, change may require 
> restart)
> osd.1: osd_recovery_max_active = '4' (not observed, change may require 
> restart)
> osd.2: osd_recovery_max_active = '4' (not observed, change may require 
> restart)
> osd.3: osd_recovery_max_active = '4' (not observed, change may require 
> restart)
> osd.4: osd_recovery_max_active = '4' (not observed, change may require 
> restart)
>
> # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> osd_recovery_max_active = 3

Did you try "config diff" as Paul suggested?

>
>
>
> On Wed, Apr 10, 2019 at 7:21 AM Eugen Block  wrote:
>>
>> > I always end up using "ceph --admin-daemon
>> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get what
>> > is in effect now for a certain daemon.
>> > Needs you to be on the host of the daemon of course.
>>
>> Me too, I just wanted to try what OP reported. And after trying that,
>> I'll keep it that way. ;-)
>>
>>
>> Zitat von Janne Johansson :
>>
>> > Den ons 10 apr. 2019 kl 13:37 skrev Eugen Block :
>> >
>> >> > If you don't specify which daemon to talk to, it tells you what the
>> >> > defaults would be for a random daemon started just now using the same
>> >> > config as you have in /etc/ceph/ceph.conf.
>> >>
>> >> I tried that, too, but the result is not correct:
>> >>
>> >> host1:~ # ceph -n osd.1 --show-config | grep osd_recovery_max_active
>> >> osd_recovery_max_active = 3
>> >>
>> >
>> > I always end up using "ceph --admin-daemon
>> > /var/run/ceph/name-of-socket-here.asok config show | grep ..." to get what
>> > is in effect now for a certain daemon.
>> > Needs you to be on the host of the daemon of course.
>> >
>> > --
>> > May the most significant bit of your life be positive.
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] VM management setup

2019-04-05 Thread Brad Hubbard
If you want to do containers at the same time, or transition some/all
to containers at some point in the future, maybe something based on
kubevirt [1] would be more future-proof?

[1] http://kubevirt.io/

CNV is an example,
https://www.redhat.com/en/resources/container-native-virtualization

On Sat, Apr 6, 2019 at 7:37 AM Ronny Aasen  wrote:
>
>
> Proxmox VE is a simple solution.
> https://www.proxmox.com/en/proxmox-ve
>
> based on debian. can administer an internal ceph cluster or connect to
> an externally connected one. easy and almost self explanatory web interface.
>
> good luck in your search !
>
> Ronny
>
>
>
> On 05.04.2019 21:34, jes...@krogh.cc wrote:
> > Hi. Knowing this is a bit off-topic but seeking recommendations
> > and advise anyway.
> >
> > We're seeking a "management" solution for VM's - currently in the 40-50
> > VM - but would like to have better access in managing them and potentially
> > migrate them across multiple hosts, setup block devices, etc, etc.
> >
> > This is only to be used internally in a department where a bunch of
> > engineering people will manage it, no customers and that kind of thing.
> >
> > Up until now we have been using virt-manager with kvm - and have been
> > quite satisfied when we were in the "few vms", but it seems like the
> > time to move on.
> >
> > Thus we're looking for something "simple" that can help manage a ceph+kvm
> > based setup -  the simpler and more to the point the better.
> >
> > Any recommendations?
> >
> > .. found a lot of names already ..
> > OpenStack
> > CloudStack
> > Proxmox
> > ..
> >
> > But recommendations are truely welcome.
> >
> > Thanks.
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] scrub errors

2019-03-28 Thread Brad Hubbard
On Fri, Mar 29, 2019 at 7:54 AM solarflow99  wrote:
>
> ok, I tried doing ceph osd out on each of the 4 OSDs 1 by 1.  I got it out of 
> backfill mode but still not sure if it'll fix anything.  pg 10.2a still shows 
> state active+clean+inconsistent.  Peer 8  is now 
> remapped+inconsistent+peering, and the other peer is active+clean+inconsistent

Per the document I linked previously, if a pg remains remapped you
likely have a problem with your configuration. Take a good look at
your crushmap, pg distribution, pool configuration, etc.
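
A few read-only commands that make that easier to eyeball (these
should exist on hammer, but check, and nothing here changes state):

# ceph osd tree
# ceph osd dump | grep pool
# ceph osd df tree
# ceph pg dump pgs_brief | grep -E 'remapped|degraded'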

>
>
> On Wed, Mar 27, 2019 at 4:13 PM Brad Hubbard  wrote:
>>
>> On Thu, Mar 28, 2019 at 8:33 AM solarflow99  wrote:
>> >
>> > yes, but nothing seems to happen.  I don't understand why it lists OSDs 7 
>> > in the  "recovery_state": when i'm only using 3 replicas and it seems to 
>> > use 41,38,8
>>
>> Well, osd 8s state is listed as
>> "active+undersized+degraded+remapped+wait_backfill" so it seems to be
>> stuck waiting for backfill for some reason. One thing you could try is
>> restarting all of the osds including 7 and 17 to see if forcing them
>> to peer again has any positive effect. Don't restart them all at once,
>> just one at a time waiting until each has peered before moving on.
>>
>> >
>> > # ceph health detail
>> > HEALTH_ERR 1 pgs inconsistent; 47 scrub errors
>> > pg 10.2a is active+clean+inconsistent, acting [41,38,8]
>> > 47 scrub errors
>> >
>> >
>> >
>> > As you can see all OSDs are up and in:
>> >
>> > # ceph osd stat
>> >  osdmap e23265: 49 osds: 49 up, 49 in
>> >
>> >
>> >
>> >
>> > And this just stays the same:
>> >
>> > "up": [
>> > 41,
>> > 38,
>> > 8
>> > ],
>> > "acting": [
>> > 41,
>> > 38,
>> > 8
>> >
>> >  "recovery_state": [
>> > {
>> > "name": "Started\/Primary\/Active",
>> > "enter_time": "2018-09-22 07:07:48.637248",
>> > "might_have_unfound": [
>> > {
>> > "osd": "7",
>> > "status": "not queried"
>> > },
>> > {
>> > "osd": "8",
>> > "status": "already probed"
>> > },
>> > {
>> > "osd": "17",
>> > "status": "not queried"
>> > },
>> > {
>> > "osd": "38",
>> > "status": "already probed"
>> > }
>> > ],
>> >
>> >
>> > On Tue, Mar 26, 2019 at 4:53 PM Brad Hubbard  wrote:
>> >>
>> >> http://docs.ceph.com/docs/hammer/rados/troubleshooting/troubleshooting-pg/
>> >>
>> >> Did you try repairing the pg?
>> >>
>> >>
>> >> On Tue, Mar 26, 2019 at 9:08 AM solarflow99  wrote:
>> >> >
>> >> > yes, I know its old.  I intend to have it replaced but thats a few 
>> >> > months away and was hoping to get past this.  the other OSDs appear to 
>> >> > be ok, I see them up and in, why do you see something wrong?
>> >> >
>> >> > On Mon, Mar 25, 2019 at 4:00 PM Brad Hubbard  
>> >> > wrote:
>> >> >>
>> >> >> Hammer is no longer supported.
>> >> >>
>> >> >> What's the status of osds 7 and 17?
>> >> >>
>> >> >> On Tue, Mar 26, 2019 at 8:56 AM solarflow99  
>> >> >> wrote:
>> >> >> >
>> >> >> > hi, thanks.  Its still using Hammer.  Here's the output from the pg 
>> >> >> > query, the last command you gave doesn't work at all but be too old.
>> >> >> >
>> >> >> >
>> >> >> > # ceph pg 10.2a query
>> >> >> > {
>> >> >> > "state": "active+clean+inconsistent",
>> >> >> > "

Re: [ceph-users] scrub errors

2019-03-27 Thread Brad Hubbard
On Thu, Mar 28, 2019 at 8:33 AM solarflow99  wrote:
>
> yes, but nothing seems to happen.  I don't understand why it lists OSDs 7 in 
> the  "recovery_state": when i'm only using 3 replicas and it seems to use 
> 41,38,8

Well, osd 8's state is listed as
"active+undersized+degraded+remapped+wait_backfill", so it seems to be
stuck waiting for backfill for some reason. One thing you could try is
restarting all of the osds, including 7 and 17, to see if forcing them
to peer again has any positive effect. Don't restart them all at once;
do just one at a time, waiting until each has peered before moving on.
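
Something like this is what I mean (just a sketch -- adjust for
however your osds are started, sysvinit vs systemd, and the grep is
only a rough check, so keep an eye on "ceph -s" as well):

for id in 41 38 8 7 17; do
    systemctl restart ceph-osd@${id}   # or: /etc/init.d/ceph restart osd.${id}
    # wait for the pg to finish peering before touching the next osd
    until ceph pg 10.2a query | grep -q '"state": "active'; do sleep 5; done
done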

>
> # ceph health detail
> HEALTH_ERR 1 pgs inconsistent; 47 scrub errors
> pg 10.2a is active+clean+inconsistent, acting [41,38,8]
> 47 scrub errors
>
>
>
> As you can see all OSDs are up and in:
>
> # ceph osd stat
>  osdmap e23265: 49 osds: 49 up, 49 in
>
>
>
>
> And this just stays the same:
>
> "up": [
> 41,
> 38,
> 8
> ],
> "acting": [
> 41,
> 38,
> 8
>
>  "recovery_state": [
> {
> "name": "Started\/Primary\/Active",
> "enter_time": "2018-09-22 07:07:48.637248",
> "might_have_unfound": [
> {
> "osd": "7",
> "status": "not queried"
> },
> {
> "osd": "8",
> "status": "already probed"
>     },
> {
> "osd": "17",
> "status": "not queried"
> },
> {
> "osd": "38",
> "status": "already probed"
> }
> ],
>
>
> On Tue, Mar 26, 2019 at 4:53 PM Brad Hubbard  wrote:
>>
>> http://docs.ceph.com/docs/hammer/rados/troubleshooting/troubleshooting-pg/
>>
>> Did you try repairing the pg?
>>
>>
>> On Tue, Mar 26, 2019 at 9:08 AM solarflow99  wrote:
>> >
>> > yes, I know its old.  I intend to have it replaced but thats a few months 
>> > away and was hoping to get past this.  the other OSDs appear to be ok, I 
>> > see them up and in, why do you see something wrong?
>> >
>> > On Mon, Mar 25, 2019 at 4:00 PM Brad Hubbard  wrote:
>> >>
>> >> Hammer is no longer supported.
>> >>
>> >> What's the status of osds 7 and 17?
>> >>
>> >> On Tue, Mar 26, 2019 at 8:56 AM solarflow99  wrote:
>> >> >
>> >> > hi, thanks.  Its still using Hammer.  Here's the output from the pg 
>> >> > query, the last command you gave doesn't work at all but be too old.
>> >> >
>> >> >
>> >> > # ceph pg 10.2a query
>> >> > {
>> >> > "state": "active+clean+inconsistent",
>> >> > "snap_trimq": "[]",
>> >> > "epoch": 23265,
>> >> > "up": [
>> >> > 41,
>> >> > 38,
>> >> > 8
>> >> > ],
>> >> > "acting": [
>> >> > 41,
>> >> > 38,
>> >> > 8
>> >> > ],
>> >> > "actingbackfill": [
>> >> > "8",
>> >> > "38",
>> >> > "41"
>> >> > ],
>> >> > "info": {
>> >> > "pgid": "10.2a",
>> >> > "last_update": "23265'20886859",
>> >> > "last_complete": "23265'20886859",
>> >> > "log_tail": "23265'20883809",
>> >> > "last_user_version": 20886859,
>> >> > "last_backfill": "MAX",
>> >> > "purged_snaps": "[]",
>> >> > "history": {
>> >> > "epoch_created": 8200,
>> >> > "last_epoch_started": 21481,
>> >> > "last_epoch_clean": 21487,
>> >> >

Re: [ceph-users] Fedora 29 Issues.

2019-03-26 Thread Brad Hubbard
https://bugzilla.redhat.com/show_bug.cgi?id=1662496

On Wed, Mar 27, 2019 at 5:00 AM Andrew J. Hutton
 wrote:
>
> More or less followed the install instructions with modifications as
> needed; but I'm suspecting that either a dependency was missed in the
> F29 package or something else is up. I don't see anything obvious; any
> ideas?
>
> When I try to start setting up my first node I get the following:
>
> [root@odin ceph-cluster]# ceph-deploy new thor
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /root/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (1.5.32): /usr/bin/ceph-deploy new thor
> [ceph_deploy.cli][INFO  ] ceph-deploy options:
> [ceph_deploy.cli][INFO  ]  username  : None
> [ceph_deploy.cli][INFO  ]  verbose   : False
> [ceph_deploy.cli][INFO  ]  overwrite_conf: False
> [ceph_deploy.cli][INFO  ]  quiet : False
> [ceph_deploy.cli][INFO  ]  cd_conf   :
> 
> [ceph_deploy.cli][INFO  ]  cluster   : ceph
> [ceph_deploy.cli][INFO  ]  ssh_copykey   : True
> [ceph_deploy.cli][INFO  ]  mon   : ['thor']
> [ceph_deploy.cli][INFO  ]  func  : <function new at 0x7f9fb9ee8ed8>
> [ceph_deploy.cli][INFO  ]  public_network: None
> [ceph_deploy.cli][INFO  ]  ceph_conf : None
> [ceph_deploy.cli][INFO  ]  cluster_network   : None
> [ceph_deploy.cli][INFO  ]  default_release   : False
> [ceph_deploy.cli][INFO  ]  fsid  : None
> [ceph_deploy.new][DEBUG ] Creating new cluster named ceph
> [ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
> [thor][DEBUG ] connected to host: odin
> [ceph_deploy][ERROR ] Traceback (most recent call last):
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/ceph_deploy/util/decorators.py", line
> 69, in newfunc
> [ceph_deploy][ERROR ] return f(*a, **kw)
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/ceph_deploy/cli.py", line 169, in _main
> [ceph_deploy][ERROR ] return args.func(args)
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/ceph_deploy/new.py", line 141, in new
> [ceph_deploy][ERROR ] ssh_copy_keys(host, args.username)
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/ceph_deploy/new.py", line 35, in
> ssh_copy_keys
> [ceph_deploy][ERROR ] if ssh.can_connect_passwordless(hostname):
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/ceph_deploy/util/ssh.py", line 22, in
> can_connect_passwordless
> [ceph_deploy][ERROR ] out, err, retval = remoto.process.check(conn,
> command, stop_on_error=False)
> [ceph_deploy][ERROR ]   File
> "/usr/lib/python2.7/site-packages/remoto/process.py", line 163, in check
> [ceph_deploy][ERROR ] kw = extend_path(conn, kw)
> [ceph_deploy][ERROR ] NameError: global name 'extend_path' is not defined
> [ceph_deploy][ERROR ]
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] scrub errors

2019-03-26 Thread Brad Hubbard
http://docs.ceph.com/docs/hammer/rados/troubleshooting/troubleshooting-pg/

Did you try repairing the pg?


On Tue, Mar 26, 2019 at 9:08 AM solarflow99  wrote:
>
> yes, I know its old.  I intend to have it replaced but thats a few months 
> away and was hoping to get past this.  the other OSDs appear to be ok, I see 
> them up and in, why do you see something wrong?
>
> On Mon, Mar 25, 2019 at 4:00 PM Brad Hubbard  wrote:
>>
>> Hammer is no longer supported.
>>
>> What's the status of osds 7 and 17?
>>
>> On Tue, Mar 26, 2019 at 8:56 AM solarflow99  wrote:
>> >
>> > hi, thanks.  Its still using Hammer.  Here's the output from the pg query, 
>> > the last command you gave doesn't work at all but be too old.
>> >
>> >
>> > # ceph pg 10.2a query
>> > {
>> > "state": "active+clean+inconsistent",
>> > "snap_trimq": "[]",
>> > "epoch": 23265,
>> > "up": [
>> > 41,
>> > 38,
>> > 8
>> > ],
>> > "acting": [
>> > 41,
>> > 38,
>> > 8
>> > ],
>> > "actingbackfill": [
>> > "8",
>> > "38",
>> > "41"
>> > ],
>> > "info": {
>> > "pgid": "10.2a",
>> > "last_update": "23265'20886859",
>> > "last_complete": "23265'20886859",
>> > "log_tail": "23265'20883809",
>> > "last_user_version": 20886859,
>> > "last_backfill": "MAX",
>> > "purged_snaps": "[]",
>> > "history": {
>> > "epoch_created": 8200,
>> > "last_epoch_started": 21481,
>> > "last_epoch_clean": 21487,
>> > "last_epoch_split": 0,
>> > "same_up_since": 21472,
>> > "same_interval_since": 21474,
>> > "same_primary_since": 8244,
>> > "last_scrub": "23265'20864209",
>> > "last_scrub_stamp": "2019-03-22 22:39:13.930673",
>> > "last_deep_scrub": "23265'20864209",
>> > "last_deep_scrub_stamp": "2019-03-22 22:39:13.930673",
>> > "last_clean_scrub_stamp": "2019-03-15 01:33:21.447438"
>> > },
>> > "stats": {
>> > "version": "23265'20886859",
>> > "reported_seq": "10109937",
>> > "reported_epoch": "23265",
>> > "state": "active+clean+inconsistent",
>> > "last_fresh": "2019-03-25 15:52:53.720768",
>> > "last_change": "2019-03-22 22:39:13.931038",
>> > "last_active": "2019-03-25 15:52:53.720768",
>> > "last_peered": "2019-03-25 15:52:53.720768",
>> > "last_clean": "2019-03-25 15:52:53.720768",
>> > "last_became_active": "0.00",
>> > "last_became_peered": "0.00",
>> > "last_unstale": "2019-03-25 15:52:53.720768",
>> > "last_undegraded": "2019-03-25 15:52:53.720768",
>> > "last_fullsized": "2019-03-25 15:52:53.720768",
>> > "mapping_epoch": 21472,
>> > "log_start": "23265'20883809",
>> > "ondisk_log_start": "23265'20883809",
>> > "created": 8200,
>> > "last_epoch_clean": 21487,
>> > "parent": "0.0",
>> > "parent_split_bits": 0,
>> > "last_scrub": "23265'20864209",
>> > "last_scrub_stamp": "2019-03-22 22:39:13.930673",
>> > "last_deep_scrub": "23265'20864209",
>> > "last

Re: [ceph-users] scrub errors

2019-03-25 Thread Brad Hubbard
; "last_active": "2018-09-22 06:33:14.791334",
> "last_peered": "2018-09-22 06:33:14.791334",
> "last_clean": "2018-09-22 06:33:14.791334",
> "last_became_active": "0.00",
> "last_became_peered": "0.00",
> "last_unstale": "2018-09-22 06:33:14.791334",
> "last_undegraded": "2018-09-22 06:33:14.791334",
> "last_fullsized": "2018-09-22 06:33:14.791334",
> "mapping_epoch": 21472,
> "log_start": "21395'11840466",
> "ondisk_log_start": "21395'11840466",
> "created": 8200,
> "last_epoch_clean": 20840,
> "parent": "0.0",
>     "parent_split_bits": 0,
> "last_scrub": "21395'11835365",
> "last_scrub_stamp": "2018-09-21 12:11:47.230141",
> "last_deep_scrub": "21395'11835365",
> "last_deep_scrub_stamp": "2018-09-21 12:11:47.230141",
> "last_clean_scrub_stamp": "2018-09-21 12:11:47.230141",
> "log_size": 3050,
> "ondisk_log_size": 3050,
> "stats_invalid": "0",
> "stat_sum": {
> "num_bytes": 6405126628,
> "num_objects": 241711,
> "num_object_clones": 0,
> "num_object_copies": 725130,
> "num_objects_missing_on_primary": 0,
> "num_objects_degraded": 0,
> "num_objects_misplaced": 0,
> "num_objects_unfound": 0,
> "num_objects_dirty": 241711,
> "num_whiteouts": 0,
> "num_read": 5637862,
> "num_read_kb": 48735376,
> "num_write": 6789687,
> "num_write_kb": 67678402,
> "num_scrub_errors": 0,
> "num_shallow_scrub_errors": 0,
> "num_deep_scrub_errors": 0,
> "num_objects_recovered": 167079,
> "num_bytes_recovered": 5191625476,
> "num_keys_recovered": 0,
> "num_objects_omap": 0,
> "num_objects_hit_set_archive": 0,
> "num_bytes_hit_set_archive": 0
> },
> "up": [
> 41,
> 38,
> 8
> ],
> "acting": [
> 41,
> 38,
> 8
> ],
> "blocked_by": [],
> "up_primary": 41,
> "acting_primary": 41
> },
> "empty": 0,
> "dne": 0,
> "incomplete": 0,
> "last_epoch_started": 21481,
> "hit_set_history": {
> "current_last_update": "0'0",
> "current_last_stamp": "0.00",
> "current_info": {
> "begin": "0.00",
> "end": "0.00",
> "version": "0'0",
> "using_gmt": "0"
> },
> "history": []
> }
> }
> ],
> "recovery_state": [
> {
> "name": "Started\/Primary\/Active",
> "enter_time": "2018-09-22 07:07:48.637248",
> "might_have_unfound": [
> {
> "osd": "7",
> "status": "not queried"
> },
> {
> "osd": "8",
> "status"

Re: [ceph-users] scrub errors

2019-03-25 Thread Brad Hubbard
It would help to know what version you are running but, to begin with,
could you post the output of the following?

$ sudo ceph pg 10.2a query
$ sudo rados list-inconsistent-obj 10.2a --format=json-pretty

Also, have a read of
http://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-pg/
(adjust the URL for your release).

On Tue, Mar 26, 2019 at 8:19 AM solarflow99  wrote:
>
> I noticed my cluster has scrub errors but the deep-scrub command doesn't show 
> any errors.  Is there any way to know what it takes to fix it?
>
>
>
> # ceph health detail
> HEALTH_ERR 1 pgs inconsistent; 47 scrub errors
> pg 10.2a is active+clean+inconsistent, acting [41,38,8]
> 47 scrub errors
>
> # zgrep 10.2a /var/log/ceph/ceph.log*
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 16:20:18.148299 osd.41 
> 192.168.4.19:6809/30077 54885 : cluster [INF] 10.2a deep-scrub starts
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024040 osd.41 
> 192.168.4.19:6809/30077 54886 : cluster [ERR] 10.2a shard 38 missing 
> 10/24083d2a/ec50777d-cc99-46a8-8610-4492213f412f/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024049 osd.41 
> 192.168.4.19:6809/30077 54887 : cluster [ERR] 10.2a shard 38 missing 
> 10/ff183d2a/fce859b9-61a9-46cb-82f1-4b4af31c10db/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024074 osd.41 
> 192.168.4.19:6809/30077 54888 : cluster [ERR] 10.2a shard 38 missing 
> 10/34283d2a/4b7c96cb-c494-4637-8669-e42049bd0e1c/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024076 osd.41 
> 192.168.4.19:6809/30077 54889 : cluster [ERR] 10.2a shard 38 missing 
> 10/df283d2a/bbe61149-99f8-4b83-a42b-b208d18094a8/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024077 osd.41 
> 192.168.4.19:6809/30077 54890 : cluster [ERR] 10.2a shard 38 missing 
> 10/35383d2a/60e8ed9b-bd04-5a43-8917-6f29eba28a66:0014/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024078 osd.41 
> 192.168.4.19:6809/30077 54891 : cluster [ERR] 10.2a shard 38 missing 
> 10/d5383d2a/2bdeb186-561b-4151-b87e-fe7c2e217d41/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024080 osd.41 
> 192.168.4.19:6809/30077 54892 : cluster [ERR] 10.2a shard 38 missing 
> 10/a7383d2a/b6b9d21d-2f4f-4550-8928-52552349db7d/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024081 osd.41 
> 192.168.4.19:6809/30077 54893 : cluster [ERR] 10.2a shard 38 missing 
> 10/9c383d2a/5b552687-c709-4e87-b773-1cce5b262754/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024082 osd.41 
> 192.168.4.19:6809/30077 54894 : cluster [ERR] 10.2a shard 38 missing 
> 10/5d383d2a/cb1a2ea8-0872-4de9-8b93-5ea8d9d8e613/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024083 osd.41 
> 192.168.4.19:6809/30077 54895 : cluster [ERR] 10.2a shard 38 missing 
> 10/8f483d2a/74c7a2b9-f00a-4c89-afbd-c1b8439234ac/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024085 osd.41 
> 192.168.4.19:6809/30077 54896 : cluster [ERR] 10.2a shard 38 missing 
> 10/b1583d2a/b3f00768-82a2-4637-91d1-164f3a51312a/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024086 osd.41 
> 192.168.4.19:6809/30077 54897 : cluster [ERR] 10.2a shard 38 missing 
> 10/35583d2a/e347aff4-7b71-476e-863a-310e767e4160/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024088 osd.41 
> 192.168.4.19:6809/30077 54898 : cluster [ERR] 10.2a shard 38 missing 
> 10/69583d2a/0805d07a-49d1-44cb-87c7-3bd73a0ce692/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024122 osd.41 
> 192.168.4.19:6809/30077 54899 : cluster [ERR] 10.2a shard 38 missing 
> 10/1a583d2a/d65bcf6a-9457-46c3-8fbc-432ebbaad89a/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024123 osd.41 
> 192.168.4.19:6809/30077 54900 : cluster [ERR] 10.2a shard 38 missing 
> 10/6d583d2a/5592f7d6-a131-4eb2-a3dd-b2d96691dd7e/head
> /var/log/ceph/ceph.log-20190323.gz:2019-03-22 18:29:02.024124 osd.41 
> 192.168.4.19:6809/30077 54901 : cluster [ERR] 10.2a shard 38 missing 
> 10/f0683d2a/81897399-4cb0-59b3-b9ae-bf043a272137:0003/head
>
>
>
> # ceph pg deep-scrub 10.2a
> instructing pg 10.2a on osd.41 to deep-scrub
>
>
> # ceph -w | grep 10.2a
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OS Upgrade now monitor wont start

2019-03-24 Thread Brad Hubbard
Do a "ps auwwx" to see how a running monitor was started and use the
equivalent command to try to start the MON that won't start. "ceph-mon
--help" will show you what you need. Most important is to get the ID
portion right and to add "-d" to get it to run in the foreground and
log to stdout. HTH and good luck!
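
For example (illustrative -- take the id and the --setuser/--setgroup
flags from whatever ps shows on one of the working mons):

# on a working mon, see how it was started
ps auwwx | grep '[c]eph-mon'

# then on the broken host, run the same thing in the foreground
ceph-mon -d --cluster ceph --id $(hostname -s) --setuser ceph --setgroup ceph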

On Mon, Mar 25, 2019 at 11:10 AM Brent Kennedy  wrote:
>
> Upgraded all the OS’s in the cluster to Ubuntu 14.04 LTS from Ubuntu 12.02 
> LTS then finished the upgrade from Firefly to Luminous.
>
>
>
> I then tried to upgrade the first monitor to Ubuntu 16.04 LTS, the OS upgrade 
> went fine, but then the monitor and manager wouldn’t start.  I then used 
> ceph-deploy to install over the existing install to ensure the new packages 
> were installed.  Monitor and manager still won't start.  Oddly enough, it
> seems that logging won't populate either.  I was trying to find the command to
> run the monitor manually to see if I could read the output since the logging in
> /var/log/ceph isn’t populating.  I did a file system search to see if a log 
> file was created in another directory, but it appears that’s not the case.  
> Monitor and cluster were healthy before I started the OS upgrade.  Nothing in 
> “Journalctl –xe” other than the services starting up without any errors.  
> Cluster shows 1/3 monitors down in health status though.
>
>
>
> I hope to upgrade all the remaining monitors to 16.04.  I already upgraded 
> the gateways to 16.04 without issue.  All the OSDs are being replaced with 
> newer hardware and going to CentOS 7.6.
>
>
>
>
>
> Regards,
>
> -Brent
>
>
>
> Existing Clusters:
>
> Test: Luminous 12.2.11 with 3 osd servers, 1 mon/man, 1 gateway ( all virtual 
> on SSD )
>
> US Production(HDD): Jewel 10.2.11 with 5 osd servers, 3 mons, 3 gateways 
> behind haproxy LB
>
> UK Production(HDD): Luminous 12.2.11 with 15 osd servers, 3 mons/man, 3 
> gateways behind haproxy LB
>
> US Production(SSD): Luminous 12.2.11 with 6 osd servers, 3 mons/man, 3 
> gateways behind haproxy LB
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Slow OPS

2019-03-21 Thread Brad Hubbard
A repop is a sub-operation between the primary and its replicas, mostly.

That op only shows a duration of 1.3 seconds and the delay you
mentioned previously was under a second. Do you see larger delays? Are
they always between "sub_op_committed" and "commit_sent"?

What is your workload and how heavily utilised is your
cluster/network? How hard are the underlying disks working?
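
A couple of quick ways to get a feel for that (osd.NN below is
hypothetical -- pick one that shows up in the slow ops, and the
counter names may vary a little between releases):

# device utilisation on the osd host
iostat -x 1 5

# per-osd write/replication latency counters via the admin socket
ceph daemon osd.NN perf dump | grep -A3 -E '"op_w_latency"|"subop_w_latency"'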

On Thu, Mar 21, 2019 at 4:11 PM Glen Baars  wrote:
>
> Hello Brad,
>
> It doesn't seem to be a set of OSDs, the cluster has 160ish OSDs over 9 hosts.
>
> I seem to get a lot of these ops also that don't show a client.
>
> "description": "osd_repop(client.14349712.0:4866968 15.36 
> e30675/22264 15:6dd17247:::rbd_data.2359ef6b8b4567.0042766
> a:head v 30675'5522366)",
> "initiated_at": "2019-03-21 16:51:56.862447",
> "age": 376.527241,
>     "duration": 1.331278,
>
> Kind regards,
> Glen Baars
>
> -Original Message-
> From: Brad Hubbard 
> Sent: Thursday, 21 March 2019 1:43 PM
> To: Glen Baars 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Slow OPS
>
> Actually, the lag is between "sub_op_committed" and "commit_sent". Is there 
> any pattern to these slow requests? Do they involve the same osd, or set of 
> osds?
>
> On Thu, Mar 21, 2019 at 3:37 PM Brad Hubbard  wrote:
> >
> > On Thu, Mar 21, 2019 at 3:20 PM Glen Baars  
> > wrote:
> > >
> > > Thanks for that - we seem to be experiencing the wait in this section of 
> > > the ops.
> > >
> > > {
> > > "time": "2019-03-21 14:12:42.830191",
> > > "event": "sub_op_committed"
> > > },
> > > {
> > > "time": "2019-03-21 14:12:43.699872",
> > > "event": "commit_sent"
> > > },
> > >
> > > Does anyone know what that section is waiting for?
> >
> > Hi Glen,
> >
> > These are documented, to some extent, here.
> >
> > http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting
> > -osd/
> >
> > It looks like it may be taking a long time to communicate the commit
> > message back to the client? Are these slow ops always the same client?
> >
> > >
> > > Kind regards,
> > > Glen Baars
> > >
> > > -Original Message-
> > > From: Brad Hubbard 
> > > Sent: Thursday, 21 March 2019 8:23 AM
> > > To: Glen Baars 
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] Slow OPS
> > >
> > > On Thu, Mar 21, 2019 at 12:11 AM Glen Baars  
> > > wrote:
> > > >
> > > > Hello Ceph Users,
> > > >
> > > >
> > > >
> > > > Does anyone know what the flag point ‘Started’ is? Is that ceph osd 
> > > > daemon waiting on the disk subsystem?
> > >
> > > This is set by "mark_started()" and is roughly set when the pg starts 
> > > processing the op. Might want to capture dump_historic_ops output after 
> > > the op completes.
> > >
> > > >
> > > >
> > > >
> > > > Ceph 13.2.4 on centos 7.5
> > > >
> > > >
> > > >
> > > > "description": "osd_op(client.1411875.0:422573570
> > > > 5.18ds0
> > > > 5:b1ed18e5:::rbd_data.6.cf7f46b8b4567.0046e41a:head [read
> > > >
> > > > 1703936~16384] snapc 0=[] ondisk+read+known_if_redirected
> > > > e30622)",
> > > >
> > > > "initiated_at": "2019-03-21 01:04:40.598438",
> > > >
> > > > "age": 11.340626,
> > > >
> > > > "duration": 11.342846,
> > > >
> > > > "type_data": {
> > > >
> > > > "flag_point": "started",
> > > >
> > > > "client_info": {
> > > >
> > > > "client": "client.1411875",
> > > >
> > > > "client_

Re: [ceph-users] Slow OPS

2019-03-20 Thread Brad Hubbard
Actually, the lag is between "sub_op_committed" and "commit_sent". Is
there any pattern to these slow requests? Do they involve the same
osd, or set of osds?

On Thu, Mar 21, 2019 at 3:37 PM Brad Hubbard  wrote:
>
> On Thu, Mar 21, 2019 at 3:20 PM Glen Baars  
> wrote:
> >
> > Thanks for that - we seem to be experiencing the wait in this section of 
> > the ops.
> >
> > {
> > "time": "2019-03-21 14:12:42.830191",
> > "event": "sub_op_committed"
> > },
> > {
> > "time": "2019-03-21 14:12:43.699872",
> > "event": "commit_sent"
> > },
> >
> > Does anyone know what that section is waiting for?
>
> Hi Glen,
>
> These are documented, to some extent, here.
>
> http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
>
> It looks like it may be taking a long time to communicate the commit
> message back to the client? Are these slow ops always the same client?
>
> >
> > Kind regards,
> > Glen Baars
> >
> > -Original Message-
> > From: Brad Hubbard 
> > Sent: Thursday, 21 March 2019 8:23 AM
> > To: Glen Baars 
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] Slow OPS
> >
> > On Thu, Mar 21, 2019 at 12:11 AM Glen Baars  
> > wrote:
> > >
> > > Hello Ceph Users,
> > >
> > >
> > >
> > > Does anyone know what the flag point ‘Started’ is? Is that ceph osd 
> > > daemon waiting on the disk subsystem?
> >
> > This is set by "mark_started()" and is roughly set when the pg starts 
> > processing the op. Might want to capture dump_historic_ops output after the 
> > op completes.
> >
> > >
> > >
> > >
> > > Ceph 13.2.4 on centos 7.5
> > >
> > >
> > >
> > > "description": "osd_op(client.1411875.0:422573570 5.18ds0
> > > 5:b1ed18e5:::rbd_data.6.cf7f46b8b4567.0046e41a:head [read
> > >
> > > 1703936~16384] snapc 0=[] ondisk+read+known_if_redirected e30622)",
> > >
> > > "initiated_at": "2019-03-21 01:04:40.598438",
> > >
> > > "age": 11.340626,
> > >
> > > "duration": 11.342846,
> > >
> > > "type_data": {
> > >
> > > "flag_point": "started",
> > >
> > > "client_info": {
> > >
> > > "client": "client.1411875",
> > >
> > > "client_addr": "10.4.37.45:0/627562602",
> > >
> > > "tid": 422573570
> > >
> > > },
> > >
> > > "events": [
> > >
> > > {
> > >
> > > "time": "2019-03-21 01:04:40.598438",
> > >
> > > "event": "initiated"
> > >
> > > },
> > >
> > > {
> > >
> > > "time": "2019-03-21 01:04:40.598438",
> > >
> > > "event": "header_read"
> > >
> > > },
> > >
> > > {
> > >
> > > "time": "2019-03-21 01:04:40.598439",
> > >
> > > "event": "throttled"
> > >
> > > },
> > >
> > > {
> > >
> > > "time": "2019-03-21 01:04:40.598450",
> > >
> > > "event": "all_read"
> > >
> > > },
> > >
> > > {
> > >
> > > "time": "2019-03-21 01:04:40.598499",
> > >
> > > "event": "dispatched"
> > >
> > > },
> > >
> &g

Re: [ceph-users] Slow OPS

2019-03-20 Thread Brad Hubbard
On Thu, Mar 21, 2019 at 3:20 PM Glen Baars  wrote:
>
> Thanks for that - we seem to be experiencing the wait in this section of the 
> ops.
>
> {
> "time": "2019-03-21 14:12:42.830191",
> "event": "sub_op_committed"
> },
> {
> "time": "2019-03-21 14:12:43.699872",
> "event": "commit_sent"
> },
>
> Does anyone know what that section is waiting for?

Hi Glen,

These are documented, to some extent, here.

http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/

It looks like it may be taking a long time to communicate the commit
message back to the client? Are these slow ops always the same client?

>
> Kind regards,
> Glen Baars
>
> -Original Message-
> From: Brad Hubbard 
> Sent: Thursday, 21 March 2019 8:23 AM
> To: Glen Baars 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Slow OPS
>
> On Thu, Mar 21, 2019 at 12:11 AM Glen Baars  
> wrote:
> >
> > Hello Ceph Users,
> >
> >
> >
> > Does anyone know what the flag point ‘Started’ is? Is that ceph osd daemon 
> > waiting on the disk subsystem?
>
> This is set by "mark_started()" and is roughly set when the pg starts 
> processing the op. Might want to capture dump_historic_ops output after the 
> op completes.
>
> >
> >
> >
> > Ceph 13.2.4 on centos 7.5
> >
> >
> >
> > "description": "osd_op(client.1411875.0:422573570 5.18ds0
> > 5:b1ed18e5:::rbd_data.6.cf7f46b8b4567.0046e41a:head [read
> >
> > 1703936~16384] snapc 0=[] ondisk+read+known_if_redirected e30622)",
> >
> > "initiated_at": "2019-03-21 01:04:40.598438",
> >
> > "age": 11.340626,
> >
> > "duration": 11.342846,
> >
> > "type_data": {
> >
> > "flag_point": "started",
> >
> > "client_info": {
> >
> > "client": "client.1411875",
> >
> > "client_addr": "10.4.37.45:0/627562602",
> >
> > "tid": 422573570
> >
> > },
> >
> > "events": [
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598438",
> >
> > "event": "initiated"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598438",
> >
> > "event": "header_read"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598439",
> >
> > "event": "throttled"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598450",
> >
> > "event": "all_read"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598499",
> >
> > "event": "dispatched"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598504",
> >
> > "event": "queued_for_pg"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598883",
> >
> > "event": "reached_pg"
> >
> > },
> >
> > {
> >
> > "time": "2019-03-21 01:04:40.598905",
> >
> > "event": "started"
> >
> > }
> >
> > ]
> >
> >

Re: [ceph-users] Slow OPS

2019-03-20 Thread Brad Hubbard
On Thu, Mar 21, 2019 at 12:11 AM Glen Baars  wrote:
>
> Hello Ceph Users,
>
>
>
> Does anyone know what the flag point ‘Started’ is? Is that ceph osd daemon 
> waiting on the disk subsystem?

This is set by "mark_started()" and is roughly set when the pg starts
processing the op. Might want to capture dump_historic_ops output
after the op completes.
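
e.g. something like this on the osd host, run shortly after a slow op
finishes (osd.NN is hypothetical, and verify the option name before
relying on it):

# keep a bit more op history around so the slow op is still there
ceph daemon osd.NN config set osd_op_history_size 50

# then grab it for inspection
ceph daemon osd.NN dump_historic_ops > /tmp/osd.NN.historic_ops.json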

>
>
>
> Ceph 13.2.4 on centos 7.5
>
>
>
> "description": "osd_op(client.1411875.0:422573570 5.18ds0 
> 5:b1ed18e5:::rbd_data.6.cf7f46b8b4567.0046e41a:head [read
>
> 1703936~16384] snapc 0=[] ondisk+read+known_if_redirected e30622)",
>
> "initiated_at": "2019-03-21 01:04:40.598438",
>
> "age": 11.340626,
>
> "duration": 11.342846,
>
> "type_data": {
>
> "flag_point": "started",
>
> "client_info": {
>
> "client": "client.1411875",
>
> "client_addr": "10.4.37.45:0/627562602",
>
> "tid": 422573570
>
> },
>
> "events": [
>
> {
>
> "time": "2019-03-21 01:04:40.598438",
>
> "event": "initiated"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598438",
>
> "event": "header_read"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598439",
>
> "event": "throttled"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598450",
>
> "event": "all_read"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598499",
>
> "event": "dispatched"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598504",
>
> "event": "queued_for_pg"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598883",
>
> "event": "reached_pg"
>
> },
>
> {
>
> "time": "2019-03-21 01:04:40.598905",
>
> "event": "started"
>
> }
>
> ]
>
> }
>
> }
>
> ],
>
>
>
> Glen
>
> This e-mail is intended solely for the benefit of the addressee(s) and any 
> other named recipient. It is confidential and may contain legally privileged 
> or confidential information. If you are not the recipient, any use, 
> distribution, disclosure or copying of this e-mail is prohibited. The 
> confidentiality and legal privilege attached to this communication is not 
> waived or lost by reason of the mistaken transmission or delivery to you. If 
> you have received this e-mail in error, please notify us immediately.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] leak memory when mount cephfs

2019-03-19 Thread Brad Hubbard
On Tue, Mar 19, 2019 at 7:54 PM Zhenshi Zhou  wrote:
>
> Hi,
>
> I mount cephfs on my client servers. Some of the servers mount without any
> error whereas others don't.
>
> The error:
> # ceph-fuse -n client.kvm -m ceph.somedomain.com:6789 /mnt/kvm -r /kvm -d
> 2019-03-19 17:03:29.136 7f8c80eddc80 -1 deliberately leaking some memory
> 2019-03-19 17:03:29.137 7f8c80eddc80  0 ceph version 13.2.4 
> (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable), process ceph-fuse, 
> pid 2951226
> ceph-fuse: symbol lookup error: ceph-fuse: undefined symbol: 
> _Z12pipe_cloexecPi

$ c++filt  _Z12pipe_cloexecPi
pipe_cloexec(int*)

$ sudo find /lib* /usr/lib* -iname '*.so*' | xargs nm -AD 2>&1 | grep
_Z12pipe_cloexecPi
/usr/lib64/ceph/libceph-common.so:0063bb00 T _Z12pipe_cloexecPi
/usr/lib64/ceph/libceph-common.so.0:0063bb00 T _Z12pipe_cloexecPi

This appears to be an incompatibility between ceph-fuse and the
version of libceph-common it is finding. The version of ceph-fuse you
are using expects libceph-common to define the function
"pipe_cloexec(int*)" but it does not. I'd say the version of
libceph-common.so you have installed is too old. Compare it to the
version on a system that works.
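
A quick way to compare the two (paths and package names are from an
rpm-based install, adjust as needed):

# on both a working and a failing client
rpm -qa | grep -E 'ceph|rados'
nm -D /usr/lib64/ceph/libceph-common.so.0 | grep _Z12pipe_cloexecPi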

>
> I'm not sure why some servers cannot mount cephfs. Could it be that the servers don't have
> enough memory?
>
> Both client and server use version 13.2.4.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Large OMAP Objects in default.rgw.log pool

2019-03-07 Thread Brad Hubbard
On Fri, Mar 8, 2019 at 4:46 AM Samuel Taylor Liston  wrote:
>
> Hello All,
> I have recently had 32 large map objects appear in my default.rgw.log 
> pool.  Running luminous 12.2.8.
>
> Not sure what to think about these.I’ve done a lot of reading 
> about how when these normally occur it is related to a bucket needing 
> resharding, but it doesn’t look like my default.rgw.log pool  has anything in 
> it, let alone buckets.  Here’s some info on the system:
>
> [root@elm-rgw01 ~]# ceph versions
> {
> "mon": {
> "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) 
> luminous (stable)": 5
> },
> "mgr": {
> "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) 
> luminous (stable)": 1
> },
> "osd": {
> "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) 
> luminous (stable)": 192
> },
> "mds": {},
> "rgw": {
> "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) 
> luminous (stable)": 1
> },
> "overall": {
> "ceph version 12.2.8 (ae699615bac534ea496ee965ac6192cb7e0e07c0) 
> luminous (stable)": 199
> }
> }
> [root@elm-rgw01 ~]# ceph osd pool ls
> .rgw.root
> default.rgw.control
> default.rgw.meta
> default.rgw.log
> default.rgw.buckets.index
> default.rgw.buckets.non-ec
> default.rgw.buckets.data
> [root@elm-rgw01 ~]# ceph health detail
> HEALTH_WARN 32 large omap objects
> LARGE_OMAP_OBJECTS 32 large omap objects
> 32 large objects found in pool 'default.rgw.log'
> Search the cluster log for 'Large omap object found' for more details.—
>
> Looking closer at these objects, they are all of size 0.  Also that pool shows 
> a capacity usage of 0:

The size here relates to data size. Object map (omap) data is metadata
so an object of size 0 can have considerable omap data associated with
it (the omap data is stored separately from the object in a key/value
database). The large omap warning in the health detail output should tell
you to "Search the cluster log for 'Large omap object found' for more
details." If you do that you should get the names of the specific
objects involved. You can then use the rados commands listomapkeys and
listomapvals to see the specifics of the omap data. Someone more
familiar with rgw can then probably help you out on what purpose they
serve.
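
For example, something along these lines (the log path and the object
name are just examples taken from the listing further down; use whatever
your cluster log actually reports):

# grep 'Large omap object found' /var/log/ceph/ceph.log
# rados -p default.rgw.log listomapkeys obj_delete_at_hint.78 | wc -l
# rados -p default.rgw.log listomapvals obj_delete_at_hint.78 | head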

HTH.

>
> (just a sampling of the 236 objects at size 0)
>
> [root@elm-mon01 ceph]# for i in `rados ls -p default.rgw.log`; do echo ${i}; 
> rados stat -p default.rgw.log ${i};done
> obj_delete_at_hint.78
> default.rgw.log/obj_delete_at_hint.78 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.70
> default.rgw.log/obj_delete_at_hint.70 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.000104
> default.rgw.log/obj_delete_at_hint.000104 mtime 2019-03-07 
> 11:39:20.00, size 0
> obj_delete_at_hint.26
> default.rgw.log/obj_delete_at_hint.26 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.28
> default.rgw.log/obj_delete_at_hint.28 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.40
> default.rgw.log/obj_delete_at_hint.40 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.15
> default.rgw.log/obj_delete_at_hint.15 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.69
> default.rgw.log/obj_delete_at_hint.69 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.95
> default.rgw.log/obj_delete_at_hint.95 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.03
> default.rgw.log/obj_delete_at_hint.03 mtime 2019-03-07 
> 11:39:19.00, size 0
> obj_delete_at_hint.47
> default.rgw.log/obj_delete_at_hint.47 mtime 2019-03-07 
> 11:39:19.00, size 0
>
>
> [root@elm-mon01 ceph]# rados df
> POOL_NAME  USEDOBJECTS   CLONES COPIES 
> MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPSRD  WR_OPSWR
> .rgw.root  1.09KiB 4  0 12
>   0   00 14853 9.67MiB 0 0B
> default.rgw.buckets.data444TiB 166829939  0 1000979634
>   0   00 362357590  859TiB 909188749 703TiB
> default.rgw.buckets.index   0B   358  0   1074
>   0   00 729694496 1.04TiB 522654976 0B
> default.rgw.buckets.non-ec  0B   182  0546
>   0   00 194204616  148GiB  97962607 0B
> default.rgw.control 0B 8  0 24
>   0   00 0  0B 0 0B
> default.rgw.log 0B   236  0708
>   0   00  33268863 3.01TiB  18415356 0B
> default.rgw.meta   16.2KiB67  0201
>   0   0 

Re: [ceph-users] Failed to repair pg

2019-03-07 Thread Brad Hubbard
You could try reading the data from this object and writing it back
using rados get and then rados put.
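
Something like this, as a rough sketch (the pool name is a placeholder
for whichever pool pg 2.2bb belongs to, and note this only rewrites the
head object):

# rados -p <pool> get rbd_data.dfd5e2235befd0.0001c299 /tmp/obj
# rados -p <pool> put rbd_data.dfd5e2235befd0.0001c299 /tmp/obj
# ceph pg repair 2.2bb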

On Fri, Mar 8, 2019 at 3:32 AM Herbert Alexander Faleiros
 wrote:
>
> On Thu, Mar 07, 2019 at 01:37:55PM -0300, Herbert Alexander Faleiros wrote:
> > Hi,
> >
> > # ceph health detail
> > HEALTH_ERR 3 scrub errors; Possible data damage: 1 pg inconsistent
> > OSD_SCRUB_ERRORS 3 scrub errors
> > PG_DAMAGED Possible data damage: 1 pg inconsistent
> > pg 2.2bb is active+clean+inconsistent, acting [36,12,80]
> >
> > # ceph pg repair 2.2bb
> > instructing pg 2.2bb on osd.36 to repair
> >
> > But:
> >
> > 2019-03-07 13:23:38.636881 [ERR]  Health check update: Possible data 
> > damage: 1 pg inconsistent, 1 pg repair (PG_DAMAGED)
> > 2019-03-07 13:20:38.373431 [ERR]  2.2bb deep-scrub 3 errors
> > 2019-03-07 13:20:38.373426 [ERR]  2.2bb deep-scrub 0 missing, 1 
> > inconsistent objects
> > 2019-03-07 13:20:43.486860 [ERR]  Health check update: 3 scrub errors 
> > (OSD_SCRUB_ERRORS)
> > 2019-03-07 13:19:17.741350 [ERR]  deep-scrub 2.2bb 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986 : is an 
> > unexpected clone
> > 2019-03-07 13:19:17.523042 [ERR]  2.2bb shard 36 soid 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986 : data_digest 
> > 0x != data_digest 0xfc6b9538 from shard 12, size 0 != size 4194304 
> > from auth oi 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986(482757'14986708 
> > client.112595650.0:344888465 dirty|omap_digest s 4194304 uv 14974021 od 
> >  alloc_hint [0 0 0]), size 0 != size 4194304 from shard 12
> > 2019-03-07 13:19:17.523038 [ERR]  2.2bb shard 36 soid 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986 : candidate 
> > size 0 info size 4194304 mismatch
> > 2019-03-07 13:16:48.542673 [ERR]  2.2bb repair 2 errors, 1 fixed
> > 2019-03-07 13:16:48.542656 [ERR]  2.2bb repair 1 missing, 0 inconsistent 
> > objects
> > 2019-03-07 13:16:53.774956 [ERR]  Health check update: Possible data 
> > damage: 1 pg inconsistent (PG_DAMAGED)
> > 2019-03-07 13:16:53.774916 [ERR]  Health check update: 2 scrub errors 
> > (OSD_SCRUB_ERRORS)
> > 2019-03-07 13:15:16.986872 [ERR]  repair 2.2bb 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986 : is an 
> > unexpected clone
> > 2019-03-07 13:15:16.986817 [ERR]  2.2bb shard 36 
> > 2:dd4a7bd3:::rbd_data.dfd5e2235befd0.0001c299:4f986 : missing
> > 2019-03-07 13:12:18.517442 [ERR]  Health check update: Possible data 
> > damage: 1 pg inconsistent, 1 pg repair (PG_DAMAGED)
> >
> > Also tried deep-scrub and scrub, same results.
> >
> > Also set noscrub,nodeep-scrub, kicked currently active scrubs one at
> > a time using 'ceph osd down '. After the last scrub was kicked,
> > forced scrub ran immediately then 'ceph pg repair', no luck.
> >
> > Finally tryed the manual aproach:
> >
> >  - stop osd.36
> >  - flush-journal
> >  - rm rbd\udata.dfd5e2235befd0.0001c299__4f986_CBDE52BB__2
> >  - start osd.36
> >  - ceph pg repair 2.2bb
> >
> > Also no luck...
> >
> > rbd\udata.dfd5e2235befd0.0001c299__4f986_CBDE52BB__2 at osd.36
> > is empty (0 size). At osd.80 4.0M, osd.2 is bluestore (can't find it).
> >
> > Ceph is 12.2.10, I'm currently migrating all my OSDs to bluestore.
> >
> > Is there anything else I can do?
>
> Should I do something like this? (below, after stop osd.36)
>
> # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-36/ --journal-path 
> /dev/sdc1 rbd_data.dfd5e2235befd0.0001c299 remove-clone-metadata 
> 326022
>
> I'm not sure about rbd_data.$RBD and $CLONEID (taken from rados
> list-inconsistent-obj, also below).
>
> > # rados list-inconsistent-obj 2.2bb | jq
> > {
> >   "epoch": 484655,
> >   "inconsistents": [
> > {
> >   "object": {
> > "name": "rbd_data.dfd5e2235befd0.0001c299",
> > "nspace": "",
> > "locator": "",
> > "snap": 326022,
> > "version": 14974021
> >   },
> >   "errors": [
> > "data_digest_mismatch",
> > "size_mismatch"
> >   ],
> >   "union_shard_errors": [
> > "size_mismatch_info",
> > "obj_size_info_mismatch"
> >   ],
> >   "selected_object_info": {
> > "oid": {
> >   "oid": "rbd_data.dfd5e2235befd0.0001c299",
> >   "key": "",
> >   "snapid": 326022,
> >   "hash": 3420345019,
> >   "max": 0,
> >   "pool": 2,
> >   "namespace": ""
> > },
> > "version": "482757'14986708",
> > "prior_version": "482697'14980304",
> > "last_reqid": "client.112595650.0:344888465",
> > "user_version": 14974021,
> > "size": 4194304,
> > "mtime": "2019-03-02 22:30:23.812849",
> > "local_mtime": "2019-03-02 22:30:23.813281",
> > "lost": 0,
> > "flags": [
> >   "dirty",
> >   "omap_digest"
> > ],
> > "legacy_snaps": [],
> > "truncate_seq": 

Re: [ceph-users] http://tracker.ceph.com/issues/38122

2019-03-06 Thread Brad Hubbard
+Jos Collin 

On Thu, Mar 7, 2019 at 9:41 AM Milanov, Radoslav Nikiforov 
wrote:

> Can someone elaborate on
>
>
>
> From http://tracker.ceph.com/issues/38122
>
>
>
> Exactly which package is missing?
>
> And why is this happening ? In Mimic all dependencies are resolved by yum?
>
> - Rado
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD fails to start (fsck error, unable to read osd superblock)

2019-02-13 Thread Brad Hubbard
A single OSD should be expendable and you should be able to just "zap"
it and recreate it. Was this not true in your case?
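
For reference, the usual sequence is roughly the following (the OSD id
and device are placeholders; double-check the ceph-volume flags against
the docs for your 12.2.x release first):

# ceph osd out 3
# systemctl stop ceph-osd@3
# ceph osd purge 3 --yes-i-really-mean-it
# ceph-volume lvm zap /dev/sdX --destroy
# ceph-volume lvm create --bluestore --data /dev/sdX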

On Wed, Feb 13, 2019 at 1:27 AM Ruben Rodriguez  wrote:
>
>
>
> On 2/9/19 5:40 PM, Brad Hubbard wrote:
> > On Sun, Feb 10, 2019 at 1:56 AM Ruben Rodriguez  wrote:
> >>
> >> Hi there,
> >>
> >> Running 12.2.11-1xenial on a machine with 6 SSD OSD with bluestore.
> >>
> >> Today we had two disks fail out of the controller, and after a reboot
> >> they both seemed to come back fine but ceph-osd was only able to start
> >> in one of them. The other one gets this:
> >>
> >> 2019-02-08 18:53:00.703376 7f64f948ce00 -1
> >> bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
> >> checksum at blob offset 0x0, got 0x95104dfc, expected 0xb9e3e26d, device
> >> location [0x4000~1000], logical extent 0x0~1000, object
> >> #-1:7b3f43c4:::osd_superblock:0#
> >> 2019-02-08 18:53:00.703406 7f64f948ce00 -1 osd.3 0 OSD::init() : unable
> >> to read osd superblock
> >>
> >> Note that there are no actual IO errors being shown by the controller in
> >> dmesg, and that the disk is readable. The metadata FS is mounted and
> >> looks normal.
> >>
> >> I tried running "ceph-bluestore-tool repair --path
> >> /var/lib/ceph/osd/ceph-3 --deep 1" and that gets many instances of:
> >
> > Running this with debug_bluestore=30 might give more information on
> > the nature of the IO error.
>
> I had collected the logs with debug info already, and nothing
> significant was listed there. I applied this patch
> https://github.com/ceph/ceph/pull/26247 and it allowed me to move
> forward. There was an osd map corruption issue that I had to handle by
> hand, but after that the osd started fine. After it started and
> backfills finished, the bluestore_ignore_data_csum flag is no longer
> needed, so I reverted to standard packages.
>
> --
> Ruben Rodriguez | Chief Technology Officer, Free Software Foundation
> GPG Key: 05EF 1D2F FE61 747D 1FC8  27C3 7FAC 7D26 472F 4409
> https://fsf.org | https://gnu.org
>


-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Debugging 'slow requests' ...

2019-02-11 Thread Brad Hubbard
Glad to help!

On Tue, Feb 12, 2019 at 4:55 PM Massimo Sgaravatto
 wrote:
>
> Thanks a lot Brad !
>
> The problem is indeed in the network: we moved the OSD nodes back to the 
> "old" switches and the problem disappeared.
>
> Now we have to figure out what is wrong/misconfigured with the new switch: we 
> would try to replicate the problem, possibly without a ceph deployment ...
>
> Thanks again for your help !
>
> Cheers, Massimo
>
> On Sun, Feb 10, 2019 at 12:07 AM Brad Hubbard  wrote:
>>
>> The log ends at
>>
>> $ zcat ceph-osd.5.log.gz |tail -2
>> 2019-02-09 07:37:00.022534 7f5fce60d700  1 --
>> 192.168.61.202:6816/157436 >> - conn(0x56308edcf000 :6816
>> s=STATE_ACCEPTING pgs=0 cs=0 l=0)._process_connection sd=296 -
>>
>> The last two messages are outbound to 192.168.222.204 and there are no
>> further messages between these two hosts (other than osd_pings) in the
>> log.
>>
>> $ zcat ceph-osd.5.log.gz |gawk
>> '!/osd_ping/&&/192.168.222.202/&&/192.168.222.204/&&/07:29:3/'|tail -5
>> 2019-02-09 07:29:34.267744 7f5fcee0e700  1 --
>> 192.168.222.202:6816/157436 <== osd.29 192.168.222.204:6804/4159520
>> 1946  rep_scrubmap(8.2bc e1205735 from shard 29) v2  40+0+1492
>> (3695125937 0 2050362985) 0x563090674d80 con 0x56308bf61000
>> 2019-02-09 07:29:34.375223 7f5faf4b4700  1 --
>> 192.168.222.202:6816/157436 --> 192.168.222.204:6804/4159520 --
>> replica_scrub(pg:
>> 8.2bc,from:0'0,to:0'0,epoch:1205833/1205735,start:8:3d4e6145:::rbd_data.35f46d19abe4ed.77a4:0,end:8:3d4e6916:::rbd_data.a6dc2425de9600.0006249c:0,chunky:1,deep:0,version:9,allow_preemption:1,priority=5)
>> v9 -- 0x56308bdf2000 con 0
>> 2019-02-09 07:29:34.378535 7f5fcee0e700  1 --
>> 192.168.222.202:6816/157436 <== osd.29 192.168.222.204:6804/4159520
>> 1947  rep_scrubmap(8.2bc e1205735 from shard 29) v2  40+0+1494
>> (3695125937 0 865217733) 0x563092d90900 con 0x56308bf61000
>> 2019-02-09 07:29:34.415868 7f5faf4b4700  1 --
>> 192.168.222.202:6816/157436 --> 192.168.222.204:6804/4159520 --
>> osd_repop(client.171725953.0:404377591 8.9b e1205833/1205735
>> 8:d90adab6:::rbd_data.c47f3c390c8495.0001934a:head v
>> 1205833'4767322) v2 -- 0x56308ca42400 con 0
>> 2019-02-09 07:29:34.486296 7f5faf4b4700  1 --
>> 192.168.222.202:6816/157436 --> 192.168.222.204:6804/4159520 --
>> replica_scrub(pg:
>> 8.2bc,from:0'0,to:0'0,epoch:1205833/1205735,start:8:3d4e6916:::rbd_data.a6dc2425de9600.0006249c:0,end:8:3d4e7434:::rbd_data.47c1b437840214.0003c594:0,chunky:1,deep:0,version:9,allow_preemption:1,priority=5)
>> v9 -- 0x56308e565340 con 0
>>
>> I'd be taking a good, hard look at the network, yes.
>>
>> On Sat, Feb 9, 2019 at 6:33 PM Massimo Sgaravatto
>>  wrote:
>> >
>> > Thanks for your feedback !
>> >
>> > I increased debug_ms to 1/5.
>> >
>> > This is another slow request (full output from 'ceph daemon osd.5 
>> > dump_historic_ops' for this event is attached):
>> >
>> >
>> > {
>> > "description": "osd_op(client.171725953.0:404377591 8.9b 
>> > 8:d90adab6:
>> > ::rbd_data.c47f3c390c8495.0001934a:head [set-alloc-hint 
>> > object_size 4194
>> > 304 write_size 4194304,write 1413120~122880] snapc 0=[] 
>> > ondisk+write+known_if_re
>> > directed e1205833)",
>> > "initiated_at": "2019-02-09 07:29:34.404655",
>> > "age": 387.914193,
>> > "duration": 340.224154,
>> > "type_data": {
>> > "flag_point": "commit sent; apply or cleanup",
>> > "client_info": {
>> > "client": "client.171725953",
>> > "client_addr": "192.168.61.66:0/4056439540",
>> > "tid": 404377591
>> > },
>> > "events": [
>> > {
>> > "time": "2019-02-09 07:29:34.404655",
>> > "event": "initiated"
>> > },
>> > 
>> > 
>> >{
>> > "time": "2019-02-09 07:29:34.416752",
>> > "event": "op

Re: [ceph-users] Debugging 'slow requests' ...

2019-02-09 Thread Brad Hubbard
_repop(client.171725953.0:404377591 8.9b 
> e1205833/1205735 8:d90ad\
> ab6:::rbd_data.c47f3c390c8495.0001934a:head v 1205833'4767322) v2 -- 
> 0x56308ca42400 con 0
> 2019-02-09 07:29:34.417132 7f5fcf60f700  1 -- 192.168.222.202:6816/157436 <== 
> osd.14 192.168.222.203:6811/158495 11242  
> osd_repop_reply(client.171725953.0:404377591 8.9b e120\
> 5833/1205735) v2  111+0+0 (634943494 0 0) 0x563090642780 con 
> 0x56308bbd
>
> The answer from 14 arrives immediately:
>
> 2019-02-09 07:29:34.417132 7f5fcf60f700  1 -- 192.168.222.202:6816/157436 <== 
> osd.14 192.168.222.203:6811/158495 11242  
> osd_repop_reply(client.171725953.0:404377591 8.9b e120\
> 5833/1205735) v2  111+0+0 (634943494 0 0) 0x563090642780 con 
> 0x56308bbd
>
> while the one from 29 arrives only at 7.35:
>
> 2019-02-09 07:35:14.628614 7f5fcee0e700  1 -- 192.168.222.202:6816/157436 <== 
> osd.29 192.168.222.204:6804/4159520 1952  
> osd_repop_reply(client.171725953.0:404377591 8.9b e120\
> 5833/1205735) v2  111+0+0 (3804866849 0 0) 0x56308f3f2a00 con 
> 0x56308bf61000
>
>
> In osd.29 log it looks like the request only arrives at 07.35 (and it 
> promptly replies):
>
> 2019-02-09 07:35:14.627462 7f99972cc700  1 -- 192.168.222.204:6804/4159520 
> <== osd.5 192.168.222.202:6816/157436 2527  
> osd_repop(client.171725953.0:404377591 8.9b e1205833/1205735) v2  
> 1050+0+123635 (1225076790 0 171428115) 0x5610f5128a00 con 0x5610fc5bf000
> 2019-02-09 07:35:14.628343 7f998d6d4700  1 -- 192.168.222.204:6804/4159520 
> --> 192.168.222.202:6816/157436 -- 
> osd_repop_reply(client.171725953.0:404377591 8.9b e1205833/1205735 ondisk, 
> result = 0) v2 -- 0x5610f4a51180 con 0
>
>
> Network problems ?
>
>
> Full logs for the 3 relevant OSDs (just for that time period) is at: 
> https://drive.google.com/drive/folders/1TG5MomMJsqVbsuFosvYokNptLufxOnPY?usp=sharing
>
>
>
> Thanks again !
> Cheers, Massimo
>
>
>
> On Fri, Feb 8, 2019 at 11:50 PM Brad Hubbard  wrote:
>>
>> Try capturing another log with debug_ms turned up. 1 or 5 should be Ok
>> to start with.
>>
>> On Fri, Feb 8, 2019 at 8:37 PM Massimo Sgaravatto
>>  wrote:
>> >
>> > Our Luminous ceph cluster has been working without problems for a while, 
>> > but in the last days we have been suffering from continuous slow requests.
>> >
>> > We have indeed done some changes in the infrastructure recently:
>> >
>> > - Moved OSD nodes to a new switch
>> > - Increased pg nums for a pool, to have about ~ 100 PGs/OSD (also because  
>> > we have to install new OSDs in the cluster). The output of 'ceph osd df' 
>> > is attached.
>> >
>> > The problem could also be due to some 'bad' client, but in the log I 
>> > don't see a clear "correlation" with specific clients or images for such 
>> > blocked requests.
>> >
>> > I also tried to update to latest luminous release and latest CentOS7, but 
>> > this didn't help.
>> >
>> >
>> >
>> > Attached you can find the detail of one of such slow operations which took 
>> > about 266 secs (output from 'ceph daemon osd.11 dump_historic_ops').
>> > So as far as I can understand from these events:
>> > {
>> > "time": "2019-02-08 10:26:25.651728",
>> > "event": "op_commit"
>> > },
>> > {
>> > "time": "2019-02-08 10:26:25.651965",
>> > "event": "op_applied"
>> > },
>> >
>> >   {
>> > "time": "2019-02-08 10:26:25.653236",
>> > "event": "sub_op_commit_rec from 33"
>> > },
>> > {
>> > "time": "2019-02-08 10:30:51.890404",
>> > "event": "sub_op_commit_rec from 23"
>> > },
>> >
>> > the problem seems to be with the "sub_op_commit_rec from 23" event which took 
>> > too long.
>> > So the problem is that the answer from OSD 23 took too long?
>> >
>> >
>> > In the logs of the 2 OSDs (11 and 23) in that time frame (attached) I can't 
>> > find anything useful.
>> > When the problem happened the load and the usage of memory was not high in 
>> > the relevant nodes.
>> >
>> >
>> > Any help to debug the issue is really appreciated ! :-)
>> >
>> > Thanks, Massimo
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>> --
>> Cheers,
>> Brad



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD fails to start (fsck error, unable to read osd superblock)

2019-02-09 Thread Brad Hubbard
On Sun, Feb 10, 2019 at 1:56 AM Ruben Rodriguez  wrote:
>
> Hi there,
>
> Running 12.2.11-1xenial on a machine with 6 SSD OSD with bluestore.
>
> Today we had two disks fail out of the controller, and after a reboot
> they both seemed to come back fine but ceph-osd was only able to start
> in one of them. The other one gets this:
>
> 2019-02-08 18:53:00.703376 7f64f948ce00 -1
> bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
> checksum at blob offset 0x0, got 0x95104dfc, expected 0xb9e3e26d, device
> location [0x4000~1000], logical extent 0x0~1000, object
> #-1:7b3f43c4:::osd_superblock:0#
> 2019-02-08 18:53:00.703406 7f64f948ce00 -1 osd.3 0 OSD::init() : unable
> to read osd superblock
>
> Note that there are no actual IO errors being shown by the controller in
> dmesg, and that the disk is readable. The metadata FS is mounted and
> looks normal.
>
> I tried running "ceph-bluestore-tool repair --path
> /var/lib/ceph/osd/ceph-3 --deep 1" and that gets many instances of:

Running this with debug_bluestore=30 might give more information on
the nature of the IO error.
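
Something like this might capture it (a sketch only; most ceph tools
honour CEPH_ARGS, but verify that on your build, and the log path is just
an example):

# CEPH_ARGS="--debug-bluestore 30 --log-file /var/log/ceph/fsck-osd3.log" \
    ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-3 --deep 1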

>
> 2019-02-08 19:00:31.783815 7fa35bd0df80 -1
> bluestore(/var/lib/ceph/osd/ceph-3) _verify_csum bad crc32c/0x1000
> checksum at blob offset 0x0, got 0x95104dfc, expected 0xb9e3e26d, device
> location [0x4000~1000], logical extent 0x0~1000, object
> #-1:7b3f43c4:::osd_superblock:0#
> 2019-02-08 19:00:31.783866 7fa35bd0df80 -1
> bluestore(/var/lib/ceph/osd/ceph-3) fsck error:
> #-1:7b3f43c4:::osd_superblock:0# error during read: (5) Input/output error
>
> ...which is the same error. Due to a host being down for unrelated
> reasons, this is preventing some PG's from activating, keeping one pool
> inaccessible. There is no critical data in it, but I'm more interested
> in solving the issue for reliability.
>
> Any advice? What does bad crc indicate in this context? Should I send
> this to the bug tracker instead?
> --
> Ruben Rodriguez | Chief Technology Officer, Free Software Foundation
> GPG Key: 05EF 1D2F FE61 747D 1FC8 27C3 7FAC 7D26 472F 4409
> https://fsf.org | https://gnu.org
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Debugging 'slow requests' ...

2019-02-08 Thread Brad Hubbard
Try capturing another log with debug_ms turned up. 1 or 5 should be Ok
to start with.
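
For example, you could bump it at runtime on the OSDs involved and revert
it once the event has been captured (the osd ids here are just the ones
from your output):

# ceph tell osd.11 injectargs '--debug_ms 1'
# ceph tell osd.23 injectargs '--debug_ms 1'
  ... reproduce or wait for a slow request ...
# ceph tell osd.11 injectargs '--debug_ms 0'
# ceph tell osd.23 injectargs '--debug_ms 0'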

On Fri, Feb 8, 2019 at 8:37 PM Massimo Sgaravatto
 wrote:
>
> Our Luminous ceph cluster has been working without problems for a while, but 
> in the last days we have been suffering from continuous slow requests.
>
> We have indeed done some changes in the infrastructure recently:
>
> - Moved OSD nodes to a new switch
> - Increased pg nums for a pool, to have about ~ 100 PGs/OSD (also because  we 
> have to install new OSDs in the cluster). The output of 'ceph osd df' is 
> attached.
>
> The problem could also be due to some 'bad' client, but in the log I don't 
> see a clear "correlation" with specific clients or images for such blocked 
> requests.
>
> I also tried to update to latest luminous release and latest CentOS7, but 
> this didn't help.
>
>
>
> Attached you can find the detail of one of such slow operations which took 
> about 266 secs (output from 'ceph daemon osd.11 dump_historic_ops').
> So as far as I can understand from these events:
> {
> "time": "2019-02-08 10:26:25.651728",
> "event": "op_commit"
> },
> {
> "time": "2019-02-08 10:26:25.651965",
> "event": "op_applied"
> },
>
>   {
> "time": "2019-02-08 10:26:25.653236",
> "event": "sub_op_commit_rec from 33"
> },
> {
> "time": "2019-02-08 10:30:51.890404",
> "event": "sub_op_commit_rec from 23"
> },
>
> the problem seems to be with the "sub_op_commit_rec from 23" event which took too 
> long.
> So the problem is that the answer from OSD 23 took too long?
>
>
> In the logs of the 2 OSDs (11 and 23) in that time frame (attached) I can't 
> find anything useful.
> When the problem happened the load and the usage of memory was not high in 
> the relevant nodes.
>
>
> Any help to debug the issue is really appreciated ! :-)
>
> Thanks, Massimo
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] backfill_toofull after adding new OSDs

2019-02-06 Thread Brad Hubbard
Let's try to restrict discussion to the original thread
"backfill_toofull while OSDs are not full" and get a tracker opened up
for this issue.

On Sat, Feb 2, 2019 at 11:52 AM Fyodor Ustinov  wrote:
>
> Hi!
>
> Right now, after adding OSD:
>
> # ceph health detail
> HEALTH_ERR 74197563/199392333 objects misplaced (37.212%); Degraded data 
> redundancy (low space): 1 pg backfill_toofull
> OBJECT_MISPLACED 74197563/199392333 objects misplaced (37.212%)
> PG_DEGRADED_FULL Degraded data redundancy (low space): 1 pg backfill_toofull
> pg 6.eb is active+remapped+backfill_wait+backfill_toofull, acting 
> [21,0,47]
>
> # ceph pg ls-by-pool iscsi backfill_toofull
> PG   OBJECTS DEGRADED MISPLACED UNFOUND BYTES  LOG  STATE 
>  STATE_STAMPVERSION   REPORTED   UP   
>   ACTING   SCRUB_STAMPDEEP_SCRUB_STAMP
> 6.eb 6450  1290   0 1645654016 3067 
> active+remapped+backfill_wait+backfill_toofull 2019-02-02 00:20:32.975300 
> 7208'6567 9790:16214 [5,1,21]p5 [21,0,47]p21 2019-01-18 04:13:54.280495 
> 2019-01-18 04:13:54.280495
>
> All OSD have less 40% USE.
>
> ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
>  0   hdd 9.56149  1.0 9.6 TiB 3.2 TiB 6.3 TiB 33.64 1.31 313
>  1   hdd 9.56149  1.0 9.6 TiB 3.3 TiB 6.3 TiB 34.13 1.33 295
>  5   hdd 9.56149  1.0 9.6 TiB 756 GiB 8.8 TiB  7.72 0.30 103
> 47   hdd 9.32390  1.0 9.3 TiB 3.1 TiB 6.2 TiB 33.75 1.31 306
>
> (all other OSD also have less 40%)
>
> ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
>
> Maybe the developers will pay attention to the letter and say something?
>
> - Original Message -
> From: "Fyodor Ustinov" 
> To: "Caspar Smit" 
> Cc: "Jan Kasprzak" , "ceph-users" 
> Sent: Thursday, 31 January, 2019 16:50:24
> Subject: Re: [ceph-users] backfill_toofull after adding new OSDs
>
> Hi!
>
> I saw the same several times when I added a new osd to the cluster. One-two 
> pg in "backfill_toofull" state.
>
> In all versions of mimic.
>
> - Original Message -
> From: "Caspar Smit" 
> To: "Jan Kasprzak" 
> Cc: "ceph-users" 
> Sent: Thursday, 31 January, 2019 15:43:07
> Subject: Re: [ceph-users] backfill_toofull after adding new OSDs
>
> Hi Jan,
>
> You might be hitting the same issue as Wido here:
>
> https://www.spinics.net/lists/ceph-users/msg50603.html
>
> Kind regards,
> Caspar
>
> On Thu, 31 Jan 2019 at 14:36, Jan Kasprzak  wrote:
>
>
> Hello, ceph users,
>
> I see the following HEALTH_ERR during cluster rebalance:
>
> Degraded data redundancy (low space): 8 pgs backfill_toofull
>
> Detailed description:
> I have upgraded my cluster to mimic and added 16 new bluestore OSDs
> on 4 hosts. The hosts are in a separate region in my crush map, and crush
> rules prevented data to be moved on the new OSDs. Now I want to move
> all data to the new OSDs (and possibly decomission the old filestore OSDs).
> I have created the following rule:
>
> # ceph osd crush rule create-replicated on-newhosts newhostsroot host
>
> after this, I am slowly moving the pools one-by-one to this new rule:
>
> # ceph osd pool set test-hdd-pool crush_rule on-newhosts
>
> When I do this, I get the above error. This is misleading, because
> ceph osd df does not suggest the OSDs are getting full (the most full
> OSD is about 41 % full). After rebalancing is done, the HEALTH_ERR
> disappears. Why am I getting this error?
>
> # ceph -s
> cluster:
> id: ...my UUID...
> health: HEALTH_ERR
> 1271/3803223 objects misplaced (0.033%)
> Degraded data redundancy: 40124/3803223 objects degraded (1.055%), 65 pgs 
> degraded, 67 pgs undersized
> Degraded data redundancy (low space): 8 pgs backfill_toofull
>
> services:
> mon: 3 daemons, quorum mon1,mon2,mon3
> mgr: mon2(active), standbys: mon1, mon3
> osd: 80 osds: 80 up, 80 in; 90 remapped pgs
> rgw: 1 daemon active
>
> data:
> pools: 13 pools, 5056 pgs
> objects: 1.27 M objects, 4.8 TiB
> usage: 15 TiB used, 208 TiB / 224 TiB avail
> pgs: 40124/3803223 objects degraded (1.055%)
> 1271/3803223 objects misplaced (0.033%)
> 4963 active+clean
> 41 active+recovery_wait+undersized+degraded+remapped
> 21 active+recovery_wait+undersized+degraded
> 17 active+remapped+backfill_wait
> 5 active+remapped+backfill_wait+backfill_toofull
> 3 active+remapped+backfill_toofull
> 2 active+recovering+undersized+remapped
> 2 active+recovering+undersized+degraded+remapped
> 1 active+clean+remapped
> 1 active+recovering+undersized+degraded
>
> io:
> client: 6.6 MiB/s rd, 2.7 MiB/s wr, 75 op/s rd, 89 op/s wr
> recovery: 2.0 MiB/s, 92 objects/s
>
> Thanks for any hint,
>
> -Yenya
>
> --
> | Jan "Yenya" Kasprzak http://fi.muni.cz/ | fi.muni.cz ] - work | 
> [ http://yenya.net/ | yenya.net ] - private}> |
> | [ http://www.fi.muni.cz/~kas/ | http://www.fi.muni.cz/~kas/ ] GPG: 
> 4096R/A45477D5 |

Re: [ceph-users] process stuck in D state on cephfs kernel mount

2019-01-21 Thread Brad Hubbard
http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html
should still be current enough and makes good reading on the subject.
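
To see what the D state tasks are actually blocked on, something like
this is usually enough (the pid is a placeholder; the sysrq trigger needs
root and writes the traces to the kernel log):

# ps -eo state,pid,wchan:32,cmd | awk '$1 == "D"'
# cat /proc/<pid>/stack
# echo w > /proc/sysrq-trigger && dmesg | tail -n 100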

On Mon, Jan 21, 2019 at 8:46 PM Stijn De Weirdt  wrote:
>
> hi marc,
>
> > - how to prevent the D state process to accumulate so much load?
> you can't. in linux, uninterruptible tasks themselves count as "load";
> this does not mean you e.g. ran out of cpu resources.
>
> stijn
>
> >
> > Thanks,
> >
> >
> >
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-11 Thread Brad Hubbard
On Fri, Jan 11, 2019 at 8:58 PM Rom Freiman  wrote:
>
> Same kernel :)

Not exactly the point I had in mind, but sure ;)

>
>
> On Fri, Jan 11, 2019, 12:49 Brad Hubbard  wrote:
>>
>> Haha, in the email thread he says CentOS but the bug is opened against RHEL 
>> :P
>>
>> Is it worth recommending a fix in skb_can_coalesce() upstream so other
>> modules don't hit this?
>>
>> On Fri, Jan 11, 2019 at 7:39 PM Ilya Dryomov  wrote:
>> >
>> > On Fri, Jan 11, 2019 at 1:38 AM Brad Hubbard  wrote:
>> > >
>> > > On Fri, Jan 11, 2019 at 9:57 AM Jason Dillaman  
>> > > wrote:
>> > > >
>> > > > I think Ilya recently looked into a bug that can occur when
>> > > > CONFIG_HARDENED_USERCOPY is enabled and the IO's TCP message goes
>> > > > through the loopback interface (i.e. co-located OSDs and krbd).
>> > > > Assuming that you have the same setup, you might be hitting the same
>> > > > bug.
>> > >
>> > > Thanks for that Jason, I wasn't aware of that bug. I'm interested to
>> > > see the details.
>> >
>> > Here is Rom's BZ, it has some details:
>> >
>> > https://bugzilla.redhat.com/show_bug.cgi?id=1665248
>> >
>> > Thanks,
>> >
>> > Ilya
>>
>>
>>
>> --
>> Cheers,
>> Brad
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-11 Thread Brad Hubbard
Haha, in the email thread he says CentOS but the bug is opened against RHEL :P

Is it worth recommending a fix in skb_can_coalesce() upstream so other
modules don't hit this?

On Fri, Jan 11, 2019 at 7:39 PM Ilya Dryomov  wrote:
>
> On Fri, Jan 11, 2019 at 1:38 AM Brad Hubbard  wrote:
> >
> > On Fri, Jan 11, 2019 at 9:57 AM Jason Dillaman  wrote:
> > >
> > > I think Ilya recently looked into a bug that can occur when
> > > CONFIG_HARDENED_USERCOPY is enabled and the IO's TCP message goes
> > > through the loopback interface (i.e. co-located OSDs and krbd).
> > > Assuming that you have the same setup, you might be hitting the same
> > > bug.
> >
> > Thanks for that Jason, I wasn't aware of that bug. I'm interested to
> > see the details.
>
> Here is Rom's BZ, it has some details:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1665248
>
> Thanks,
>
> Ilya



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-10 Thread Brad Hubbard
On Fri, Jan 11, 2019 at 9:57 AM Jason Dillaman  wrote:
>
> I think Ilya recently looked into a bug that can occur when
> CONFIG_HARDENED_USERCOPY is enabled and the IO's TCP message goes
> through the loopback interface (i.e. co-located OSDs and krbd).
> Assuming that you have the same setup, you might be hitting the same
> bug.

Thanks for that Jason, I wasn't aware of that bug. I'm interested to
see the details.

>
> On Thu, Jan 10, 2019 at 6:46 PM Brad Hubbard  wrote:
> >
> > On Fri, Jan 11, 2019 at 12:20 AM Rom Freiman  wrote:
> > >
> > > Hey,
> > > After upgrading to centos7.6, I started encountering the following kernel 
> > > panic
> > >
> > > [17845.147263] XFS (rbd4): Unmounting Filesystem
> > > [17846.860221] rbd: rbd4: capacity 3221225472 features 0x1
> > > [17847.109887] XFS (rbd4): Mounting V5 Filesystem
> > > [17847.191646] XFS (rbd4): Ending clean mount
> > > [17861.663757] rbd: rbd5: capacity 3221225472 features 0x1
> > > [17862.930418] usercopy: kernel memory exposure attempt detected from 
> > > 9d54d26d8800 (kmalloc-512) (1024 bytes)
> > > [17862.941698] [ cut here ]
> > > [17862.946854] kernel BUG at mm/usercopy.c:72!
> > > [17862.951524] invalid opcode:  [#1] SMP
> > > [17862.956123] Modules linked in: vhost_net vhost macvtap macvlan tun 
> > > xt_REDIRECT nf_nat_redirect ip6table_mangle xt_nat xt_mark xt_connmark 
> > > xt_CHECKSUM ip6table_raw xt_physdev iptable_mangle veth iptable_raw rbd 
> > > libceph dns_resolver ebtable_filter ebtables ip6table_filter ip6_tables 
> > > xt_comment mlx4_en(OE) mlx4_core(OE) xt_multiport ipt_REJECT 
> > > nf_reject_ipv4 nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype 
> > > iptable_filter xt_conntrack br_netfilter bridge stp llc xfs openvswitch 
> > > nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 
> > > nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack mlx5_core(OE) mlxfw(OE) 
> > > iTCO_wdt iTCO_vendor_support sb_edac intel_powerclamp coretemp intel_rapl 
> > > iosf_mbi kvm_intel kvm irqbypass pcspkr joydev sg mei_me lpc_ich i2c_i801 
> > > mei ioatdma ipmi_si ipmi_devintf ipmi_msghandler
> > > [17863.036328]  dm_multipath ip_tables ext4 mbcache jbd2 dm_thin_pool 
> > > dm_persistent_data dm_bio_prison dm_bufio libcrc32c sd_mod crc_t10dif 
> > > crct10dif_generic crct10dif_pclmul crct10dif_common crc32_pclmul 
> > > crc32c_intel ghash_clmulni_intel mgag200 igb aesni_intel isci lrw 
> > > gf128mul glue_helper ablk_helper ahci drm_kms_helper cryptd libsas dca 
> > > syscopyarea sysfillrect sysimgblt fb_sys_fops ttm libahci 
> > > scsi_transport_sas ptp drm libata pps_core mlx_compat(OE) 
> > > drm_panel_orientation_quirks i2c_algo_bit devlink wmi 
> > > scsi_transport_iscsi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last 
> > > unloaded: mlx4_core]
> > > [17863.094372] CPU: 3 PID: 71755 Comm: msgr-worker-1 Kdump: loaded 
> > > Tainted: G   OE     3.10.0-957.1.3.el7.x86_64 #1
> > > [17863.107673] Hardware name: Intel Corporation S2600JF/S2600JF, BIOS 
> > > SE5C600.86B.02.06.0006.032420170950 03/24/2017
> > > [17863.119134] task: 9d4e8e33e180 ti: 9d53dbaf8000 task.ti: 
> > > 9d53dbaf8000
> > > [17863.127489] RIP: 0010:[]  [] 
> > > __check_object_size+0x87/0x250
> > > [17863.137217] RSP: 0018:9d53dbafbb98  EFLAGS: 00010246
> > > [17863.143140] RAX: 0062 RBX: 9d54d26d8800 RCX: 
> > > 
> > > [17863.151106] RDX:  RSI: 9d557bad3898 RDI: 
> > > 9d557bad3898
> > > [17863.159072] RBP: 9d53dbafbbb8 R08:  R09: 
> > > 
> > > [17863.167038] R10: 0d0f R11: 9d53dbafb896 R12: 
> > > 0400
> > > [17863.175001] R13: 0001 R14: 9d54d26d8c00 R15: 
> > > 0400
> > > [17863.182968] FS:  7f531fa98700() GS:9d557bac() 
> > > knlGS:
> > > [17863.192001] CS:  0010 DS:  ES:  CR0: 80050033
> > > [17863.198414] CR2: 7f4438516930 CR3: 000f19236000 CR4: 
> > > 001627e0
> > > [17863.206379] Call Trace:
> > > [17863.209114]  [] memcpy_toiovec+0x4d/0xb0
> > > [17863.215240]  [] skb_copy_datagram_iovec+0x128/0x280
> > > [17863.222434]  [] tcp_recvmsg+0x22a/0xb30
> > > [17863.228463]  [] inet_recvmsg+0x80/0xb0
> > > [17863.234395]  [] sock

Re: [ceph-users] centos 7.6 kernel panic caused by osd

2019-01-10 Thread Brad Hubbard
On Fri, Jan 11, 2019 at 12:20 AM Rom Freiman  wrote:
>
> Hey,
> After upgrading to centos7.6, I started encountering the following kernel 
> panic
>
> [17845.147263] XFS (rbd4): Unmounting Filesystem
> [17846.860221] rbd: rbd4: capacity 3221225472 features 0x1
> [17847.109887] XFS (rbd4): Mounting V5 Filesystem
> [17847.191646] XFS (rbd4): Ending clean mount
> [17861.663757] rbd: rbd5: capacity 3221225472 features 0x1
> [17862.930418] usercopy: kernel memory exposure attempt detected from 
> 9d54d26d8800 (kmalloc-512) (1024 bytes)
> [17862.941698] [ cut here ]
> [17862.946854] kernel BUG at mm/usercopy.c:72!
> [17862.951524] invalid opcode:  [#1] SMP
> [17862.956123] Modules linked in: vhost_net vhost macvtap macvlan tun 
> xt_REDIRECT nf_nat_redirect ip6table_mangle xt_nat xt_mark xt_connmark 
> xt_CHECKSUM ip6table_raw xt_physdev iptable_mangle veth iptable_raw rbd 
> libceph dns_resolver ebtable_filter ebtables ip6table_filter ip6_tables 
> xt_comment mlx4_en(OE) mlx4_core(OE) xt_multiport ipt_REJECT nf_reject_ipv4 
> nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype iptable_filter 
> xt_conntrack br_netfilter bridge stp llc xfs openvswitch nf_conntrack_ipv6 
> nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 
> nf_nat nf_conntrack mlx5_core(OE) mlxfw(OE) iTCO_wdt iTCO_vendor_support 
> sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass 
> pcspkr joydev sg mei_me lpc_ich i2c_i801 mei ioatdma ipmi_si ipmi_devintf 
> ipmi_msghandler
> [17863.036328]  dm_multipath ip_tables ext4 mbcache jbd2 dm_thin_pool 
> dm_persistent_data dm_bio_prison dm_bufio libcrc32c sd_mod crc_t10dif 
> crct10dif_generic crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel 
> ghash_clmulni_intel mgag200 igb aesni_intel isci lrw gf128mul glue_helper 
> ablk_helper ahci drm_kms_helper cryptd libsas dca syscopyarea sysfillrect 
> sysimgblt fb_sys_fops ttm libahci scsi_transport_sas ptp drm libata pps_core 
> mlx_compat(OE) drm_panel_orientation_quirks i2c_algo_bit devlink wmi 
> scsi_transport_iscsi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last 
> unloaded: mlx4_core]
> [17863.094372] CPU: 3 PID: 71755 Comm: msgr-worker-1 Kdump: loaded Tainted: G 
>   OE     3.10.0-957.1.3.el7.x86_64 #1
> [17863.107673] Hardware name: Intel Corporation S2600JF/S2600JF, BIOS 
> SE5C600.86B.02.06.0006.032420170950 03/24/2017
> [17863.119134] task: 9d4e8e33e180 ti: 9d53dbaf8000 task.ti: 
> 9d53dbaf8000
> [17863.127489] RIP: 0010:[]  [] 
> __check_object_size+0x87/0x250
> [17863.137217] RSP: 0018:9d53dbafbb98  EFLAGS: 00010246
> [17863.143140] RAX: 0062 RBX: 9d54d26d8800 RCX: 
> 
> [17863.151106] RDX:  RSI: 9d557bad3898 RDI: 
> 9d557bad3898
> [17863.159072] RBP: 9d53dbafbbb8 R08:  R09: 
> 
> [17863.167038] R10: 0d0f R11: 9d53dbafb896 R12: 
> 0400
> [17863.175001] R13: 0001 R14: 9d54d26d8c00 R15: 
> 0400
> [17863.182968] FS:  7f531fa98700() GS:9d557bac() 
> knlGS:
> [17863.192001] CS:  0010 DS:  ES:  CR0: 80050033
> [17863.198414] CR2: 7f4438516930 CR3: 000f19236000 CR4: 
> 001627e0
> [17863.206379] Call Trace:
> [17863.209114]  [] memcpy_toiovec+0x4d/0xb0
> [17863.215240]  [] skb_copy_datagram_iovec+0x128/0x280
> [17863.222434]  [] tcp_recvmsg+0x22a/0xb30
> [17863.228463]  [] inet_recvmsg+0x80/0xb0
> [17863.234395]  [] sock_aio_read.part.9+0x14c/0x170
> [17863.241297]  [] ? wake_up_q+0x5b/0x80
> [17863.247129]  [] sock_aio_read+0x21/0x30
> [17863.253157]  [] do_sync_read+0x93/0xe0
> [17863.259087]  [] vfs_read+0x145/0x170
> [17863.264823]  [] SyS_read+0x7f/0xf0
> [17863.270366]  [] system_call_fastpath+0x22/0x27
> [17863.277061] Code: 45 d1 48 c7 c6 d4 b6 67 a6 48 c7 c1 e0 4b 68 a6 48 0f 45 
> f1 49 89 c0 4d 89 e1 48 89 d9 48 c7 c7 d0 1a 68 a6 31 c0 e8 20 d5 51 00 <0f> 
> 0b 0f 1f 80 00 00 00 00 48 c7 c0 00 00 c0 a5 4c 39 f0 73 0d
> [17863.298802] RIP  [] __check_object_size+0x87/0x250
> [17863.305912]  RSP 
>
> It seems to be related to rbd operations but I cannot pinpoint directly the 
> reason.

To me this seems to be an issue in the networking subsystem and there
is nothing, at this stage, that implicates the ceph modules.

If the Mellanox modules are involved in any way I would start looking
there (not because I am biased against them, but because experience
tells me that is the place to start) and then move on to the other
networking modules and the kernel more generally. This looks like some
sort of memory accounting error in the networking subsystem. I could
be wrong, of course, but there would need to be further data to tell
either way. I'd suggest that capturing a vmcore and getting someone to
analyse it for you would be a good next step.
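
For what it's worth, a minimal kdump setup on CentOS 7 looks roughly like
this (a sketch; size crashkernel= per the RHEL/CentOS kdump docs, and the
vmcore lands under /var/crash by default):

# yum install -y kexec-tools
# grubby --update-kernel=ALL --args="crashkernel=auto"
# reboot
# systemctl enable --now kdump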

>
> Versions:
> CentOS Linux release 7.6.1810 (Core)
> Linux 

Re: [ceph-users] Compacting omap data

2019-01-03 Thread Brad Hubbard
Nautilus will make this easier.

https://github.com/ceph/ceph/pull/18096
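
In the meantime, for the offline-compaction question: on a BlueStore OSD
something along these lines may work, but verify that the
ceph-kvstore-tool shipped with your 12.2.x build supports the
bluestore-kv backend before relying on it:

# systemctl stop ceph-osd@510
# ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-510 compact
# systemctl start ceph-osd@510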

On Thu, Jan 3, 2019 at 5:22 AM Bryan Stillwell  wrote:
>
> Recently on one of our bigger clusters (~1,900 OSDs) running Luminous 
> (12.2.8), we had a problem where OSDs would frequently get restarted while 
> deep-scrubbing.
>
> After digging into it I found that a number of the OSDs had very large omap 
> directories (50GiB+).  I believe these were OSDs that had previously held PGs 
> that were part of the .rgw.buckets.index pool which I have recently moved to 
> all SSDs, however, it seems like the data remained on the HDDs.
>
> I was able to reduce the data usage on most of the OSDs (from ~50 GiB to < 
> 200 MiB!) by compacting the omap dbs offline by setting 
> 'leveldb_compact_on_mount = true' in the [osd] section of ceph.conf, but that 
> didn't work on the newer OSDs which use rocksdb.  On those I had to do an 
> online compaction using a command like:
>
> $ ceph tell osd.510 compact
>
> That worked, but today when I tried doing that on some of the SSD-based OSDs 
> which are backing .rgw.buckets.index I started getting slow requests and the 
> compaction ultimately failed with this error:
>
> $ ceph tell osd.1720 compact
> osd.1720: Error ENXIO: osd down
>
> When I tried it again it succeeded:
>
> $ ceph tell osd.1720 compact
> osd.1720: compacted omap in 420.999 seconds
>
> The data usage on that OSD dropped from 57.8 GiB to 43.4 GiB which was nice, 
> but I don't believe that'll get any smaller until I start splitting the PGs 
> in the .rgw.buckets.index pool to better distribute that pool across the 
> SSD-based OSDs.
>
> The first question I have is what is the option to do an offline compaction 
> of rocksdb so I don't impact our customers while compacting the rest of the 
> SSD-based OSDs?
>
> The next question is if there's a way to configure Ceph to automatically 
> compact the omap dbs in the background in a way that doesn't affect user 
> experience?
>
> Finally, I was able to figure out that the omap directories were getting 
> large because we're using filestore on this cluster, but how could someone 
> determine this when using BlueStore?
>
> Thanks,
> Bryan
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph OOM Killer Luminous

2018-12-21 Thread Brad Hubbard
Can you provide the complete OOM message from the dmesg log?
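
In case it helps, something like this should pull the relevant messages
out of the kernel log (how far back they go depends on your log
retention):

# dmesg -T | egrep -i 'out of memory|oom-killer|killed process'
# journalctl -k | egrep -i -B 5 -A 20 'killed process'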

On Sat, Dec 22, 2018 at 7:53 AM Pardhiv Karri  wrote:
>
>
> Thank You for the quick response Dyweni!
>
> We are using FileStore as this cluster is upgraded from 
> Hammer-->Jewel-->Luminous 12.2.8. 16x2TB HDD per node for all nodes. R730xd 
> has 128GB and R740xd has 96GB of RAM. Everything else is the same.
>
> Thanks,
> Pardhiv Karri
>
> On Fri, Dec 21, 2018 at 1:43 PM Dyweni - Ceph-Users <6exbab4fy...@dyweni.com> 
> wrote:
>>
>> Hi,
>>
>>
>> You could be running out of memory due to the default Bluestore cache sizes.
>>
>>
>> How many disks/OSDs in the R730xd versus the R740xd?  How much memory in 
>> each server type?  How many are HDD versus SSD?  Are you running Bluestore?
>>
>>
>> OSD's in Luminous, which run Bluestore, allocate memory to use as a "cache", 
>> since the kernel-provided page-cache is not available to Bluestore.  
>> Bluestore, by default, will use 1GB of memory for each HDD, and 3GB of 
>> memory for each SSD.  OSD's do not allocate all that memory up front, but 
>> grow into it as it is used.  This cache is in addition to any other memory 
>> the OSD uses.
>>
>>
>> Check out the bluestore_cache_* values (these are specified in bytes) in the 
>> manual cache sizing section of the docs 
>> (http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/).
>>Note that the automatic cache sizing feature wasn't added until 12.2.9.
>>
>>
>>
>> As an example, I have OSD's running on 32bit/armhf nodes.  These nodes have 
>> 2GB of memory.  I run 1 Bluestore OSD on each node.  In my ceph.conf file, I 
>> have 'bluestore cache size = 536870912' and 'bluestore cache kv max = 
>> 268435456'.  I see approx 1.35-1.4 GB used by each OSD.
>>
>>
>>
>>
>> On 2018-12-21 15:19, Pardhiv Karri wrote:
>>
>> Hi,
>>
>> We have a luminous cluster which was upgraded from Hammer --> Jewel --> 
>> Luminous 12.2.8 recently. Post upgrade we are seeing issue with a few nodes 
>> where they are running out of memory and dying. In the logs we are seeing 
>> the OOM killer. We didn't have this issue before the upgrade. The only difference is 
>> the nodes without any issue are R730xd and the ones with the memory leak are 
>> R740xd. The hardware vendor doesn't see anything wrong with the hardware. From 
>> Ceph end we are not seeing any issue when it comes to running the cluster, 
>> only issue is with memory leak. Right now we are actively rebooting the 
>> nodes in a timely manner to avoid crashes. On one R740xd node we set all the OSDs 
>> to 0.0 and there is no memory leak there. Any pointers to fix the issue 
>> would be helpful.
>>
>> Thanks,
>> Pardhiv Karri
>>
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> --
> Pardhiv Karri
> "Rise and Rise again until LAMBS become LIONS"
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph 10.2.11 - Status not working

2018-12-17 Thread Brad Hubbard
On Tue, Dec 18, 2018 at 10:23 AM Mike O'Connor  wrote:
>
> Hi All
>
> I have a ceph cluster which has been working without issues for about 2
> years now; it was upgraded about 6 months ago to 10.2.11
>
> root@blade3:/var/lib/ceph/mon# ceph status
> 2018-12-18 10:42:39.242217 7ff770471700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.207:6789/0 pipe(0x7ff768000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768001f90).fault
> 2018-12-18 10:42:45.242745 7ff770471700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.207:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768002410).fault
> 2018-12-18 10:42:51.243230 7ff770471700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.207:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768002f40).fault
> 2018-12-18 10:42:54.243452 7ff770572700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.205:6789/0 pipe(0x7ff768000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768008060).fault
> 2018-12-18 10:42:57.243715 7ff770471700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.207:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768003580).fault
> 2018-12-18 10:43:03.244280 7ff7781b9700  0 -- 10.1.5.203:0/1608630285 >>
> 10.1.5.205:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7ff768003670).fault
>
> All systems can ping each other. I simply cannot see why it's failing.
>
>
> ceph.conf
>
> [global]
>  auth client required = cephx
>  auth cluster required = cephx
>  auth service required = cephx
>  cluster network = 10.1.5.0/24
>  filestore xattr use omap = true
>  fsid = 42a0f015-76da-4f47-b506-da5cdacd030f
>  keyring = /etc/pve/priv/$cluster.$name.keyring
>  osd journal size = 5120
>  osd pool default min size = 1
>  public network = 10.1.5.0/24
>  mon_pg_warn_max_per_osd = 0
>
> [client]
>  rbd cache = true
> [osd]
>  keyring = /var/lib/ceph/osd/ceph-$id/keyring
>  osd max backfills = 1
>  osd recovery max active = 1
>  osd_disk_threads = 1
>  osd_disk_thread_ioprio_class = idle
>  osd_disk_thread_ioprio_priority = 7
> [mon.2]
>  host = blade5
>  mon addr = 10.1.5.205:6789
> [mon.1]
>  host = blade3
>  mon addr = 10.1.5.203:6789
> [mon.3]
>  host = blade7
>  mon addr = 10.1.5.207:6789
> [mon.0]
>  host = blade1
>  mon addr = 10.1.5.201:6789
> [mds]
>  mds data = /var/lib/ceph/mds/mds.$id
>  keyring = /var/lib/ceph/mds/mds.$id/mds.$id.keyring
> [mds.0]
>  host = blade1
> [mds.1]
>  host = blade3
> [mds.2]
>  host = blade5
> [mds.3]
>  host = blade7
>
>
> Any ideas ? more information ?

The system on which you are running the "ceph" client, blade3
(10.1.5.203) is trying to contact monitors on 10.1.5.207 (blade7) port
6789 and 10.1.5.205 (blade5) port 6789. You need to check that the ceph-mon
binary is running on blade7 and blade5, that they are listening on
port 6789, and that that port is accessible from blade3. The simplest
explanation is that the MONs are not running. The next simplest is that there is
a firewall interfering with blade3's ability to connect to port 6789
on those machines. Check the above and see what you find.
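
For example, on the monitor hosts and then from blade3 (the systemd unit
name is a guess; it depends on how the mons were deployed):

blade7# systemctl status ceph-mon@3
blade7# ss -tlnp | grep 6789
blade3# nc -zv 10.1.5.207 6789
blade3# nc -zv 10.1.5.205 6789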

-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Crush, data placement and randomness

2018-12-06 Thread Brad Hubbard
https://ceph.com/wp-content/uploads/2016/08/weil-crush-sc06.pdf
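
You can see that determinism from the CLI: mapping the same object name
(pool and object here are arbitrary) always returns the same placement,
and it only changes when the cluster map itself changes:

$ ceph osd map rbd this-object-does-not-exist
$ ceph osd map rbd this-object-does-not-exist   # identical output
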
On Thu, Dec 6, 2018 at 8:11 PM Leon Robinson  wrote:
>
> The most important thing to remember about CRUSH is that the H stands for 
> hashing.
>
> If you hash the same object you're going to get the same result.
>
> e.g. cat /etc/fstab | md5sum is always the same output, unless you change the 
> file contents.
>
> CRUSH uses the number of osds and the object and the pool and a bunch of 
> other things to create a hash which determines placement. If any of that 
> changes then the hash will change, and the placement with change, if it 
> restores to exactly how it was, then the placement returns to how it was.
>
> On Thu, 2018-12-06 at 09:44 +0100, Marc Roos wrote:
>
>
>
>
> Afaik it is not random, it is calculated where your objects are stored.
>
> Some algorithm that probably takes into account how many osd's you have
>
> and their sizes.
>
> How can it be random placed? You would not be able to ever find it
>
> again. Because there is not such a thing as a 'file allocation table'
>
>
> But better search for this, I am not that deep into ceph ;)
>
>
>
>
>
> -Original Message-
>
> From: Franck Desjeunes [mailto:
>
> fdesjeu...@gmail.com
>
> ]
>
> Sent: 06 December 2018 08:01
>
> To:
>
> ceph-users@lists.ceph.com
>
>
> Subject: [ceph-users] Crush, data placement and randomness
>
>
> Hi all cephers.
>
>
> I don't know if this is the right place to ask this kind of questions,
>
> but I'll give it a try.
>
>
>
> I'm getting interested in ceph and deep dived into the technical details
>
> of it but I'm struggling to understand few things.
>
>
> When I execute a ceph osd map on an hypothetic object that does not
>
> exist, the command always give me the same OSDs set to store the object.
>
> So, what is the randomness of the CRUSH algorithm if  an object A will
>
> always be stored in the same OSDs set ?
>
>
> In the same way, why when I use librados to read an object, the stack
>
> trace shows that the code goes through the exact same functions calls as
>
> to create an object to get the OSDs set ?
>
>
> As far as I see, for me, CRUSH is fully deterministic and I don't
>
> understand why it is qualified as a pseudo-random algorithm.
>
>
> Thank you for your help.
>
>
> Best regards.
>
>
>
> ___
>
> ceph-users mailing list
>
> ceph-users@lists.ceph.com
>
>
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> --
>
> Leon L. Robinson 
>
> 
>
> NOTICE AND DISCLAIMER
> This e-mail (including any attachments) is intended for the above-named 
> person(s). If you are not the intended recipient, notify the sender 
> immediately, delete this email from your system and do not disclose or use 
> for any purpose. We may monitor all incoming and outgoing emails in line with 
> current legislation. We have taken steps to ensure that this email and 
> attachments are free from any virus, but it remains your responsibility to 
> ensure that viruses do not adversely affect you
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to repair active+clean+inconsistent?

2018-11-14 Thread Brad Hubbard
You could try a 'rados get' and then a 'rados put' on the object to start with.
On Thu, Nov 15, 2018 at 4:07 AM K.C. Wong  wrote:
>
> So, I’ve issued the deep-scrub command (and the repair command)
> and nothing seems to happen.
> Unrelated to this issue, I have to take down some OSD to prepare
> a host for RMA. One of them happens to be in the replication
> group for this PG. So, a scrub happened indirectly. I now have
> this from “ceph -s”:
>
> cluster 374aed9e-5fc1-47e1-8d29-4416f7425e76
>  health HEALTH_ERR
> 1 pgs inconsistent
> 18446 scrub errors
>  monmap e1: 3 mons at 
> {mgmt01=10.0.1.1:6789/0,mgmt02=10.1.1.1:6789/0,mgmt03=10.2.1.1:6789/0}
> election epoch 252, quorum 0,1,2 mgmt01,mgmt02,mgmt03
>   fsmap e346: 1/1/1 up {0=mgmt01=up:active}, 2 up:standby
>  osdmap e40248: 120 osds: 119 up, 119 in
> flags sortbitwise,require_jewel_osds
>   pgmap v22025963: 3136 pgs, 18 pools, 18975 GB data, 214 Mobjects
> 59473 GB used, 287 TB / 345 TB avail
> 3120 active+clean
>   15 active+clean+scrubbing+deep
>1 active+clean+inconsistent
>
> That’s a lot of scrub errors:
>
> HEALTH_ERR 1 pgs inconsistent; 18446 scrub errors
> pg 1.65 is active+clean+inconsistent, acting [62,67,33]
> 18446 scrub errors
>
> Now, “rados list-inconsistent-obj 1.65” returns a *very* long JSON
> output. Here’s a very small snippet, the errors look the same across:
>
> {
>   “object”:{
> "name":”10ea8bb.0045”,
> "nspace":”",
> "locator":”",
> "snap":"head”,
> "version”:59538
>   },
>   "errors":["attr_name_mismatch”],
>   "union_shard_errors":["oi_attr_missing”],
>   "selected_object_info":"1:a70dc1cc:::10ea8bb.0045:head(2897'59538 
> client.4895965.0:462007 dirty|data_digest|omap_digest s 4194304 uv 59538 dd 
> f437a612 od  alloc_hint [0 0])”,
>   "shards”:[
> {
>   "osd":33,
>   "errors":[],
>   "size":4194304,
>   "omap_digest”:"0x”,
>   "data_digest”:"0xf437a612”,
>   "attrs":[
> {"name":"_”,
>  "value":”EAgNAQAABAM1AA...“,
>  "Base64":true},
> {"name":"snapset”,
>  "value":”AgIZAQ...“,
>  "Base64":true}
>   ]
> },
> {
>   "osd":62,
>   "errors":[],
>   "size":4194304,
>   "omap_digest":"0x”,
>   "data_digest":"0xf437a612”,
>   "attrs”:[
> {"name":"_”,
>  "value":”EAgNAQAABAM1AA...",
>  "Base64":true},
> {"name":"snapset”,
>  "value":”AgIZAQ…",
>  "Base64":true}
>   ]
> },
> {
>   "osd":67,
>   "errors":["oi_attr_missing”],
>   "size":4194304,
>   "omap_digest":"0x”,
>   "data_digest":"0xf437a612”,
>   "attrs":[]
> }
>   ]
> }
>
> Clearly, on osd.67, the “attrs” array is empty. The question is,
> how do I fix this?
>
> Many thanks in advance,
>
> -kc
>
> K.C. Wong
> kcw...@verseon.com
> M: +1 (408) 769-8235
>
> -
> Confidentiality Notice:
> This message contains confidential information. If you are not the
> intended recipient and received this message in error, any use or
> distribution is strictly prohibited. Please also notify us
> immediately by return e-mail, and delete this message from your
> computer system. Thank you.
> -
>
> 4096R/B8995EDE  E527 CBE8 023E 79EA 8BBB  5C77 23A6 92E9 B899 5EDE
>
> hkps://hkps.pool.sks-keyservers.net
>
> On Nov 11, 2018, at 10:58 PM, Brad Hubbard  wrote:
>
> On Mon, Nov 12, 2018 at 4:21 PM Ashley Merrick  
> wrote:
>
>
> Your need to run "ceph pg deep-scrub 1.65" first
>
>
> Right, thanks Ashley. That's what the "Note that you may have to do a
> deep scrub to populate the output." part of my answer meant but
> perhaps I needed to go further?
>
> The system has a record of a scrub error on a previous scan but
> subsequent activity in the cluster has invalidated the specifics. You
> need to run another scrub to get the specific information for this pg
> at this point in time (the information does not remain valid
> indefinitely and therefore may need to be renewed depending on
> circumstances).

Re: [ceph-users] How to repair active+clean+inconsistent?

2018-11-11 Thread Brad Hubbard
On Mon, Nov 12, 2018 at 4:21 PM Ashley Merrick  wrote:
>
> You need to run "ceph pg deep-scrub 1.65" first

Right, thanks Ashley. That's what the "Note that you may have to do a
deep scrub to populate the output." part of my answer meant but
perhaps I needed to go further?

The system has a record of a scrub error on a previous scan but
subsequent activity in the cluster has invalidated the specifics. You
need to run another scrub to get the specific information for this pg
at this point in time (the information does not remain valid
indefinitely and therefore may need to be renewed depending on
circumstances).
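For example, using the pgid from your output (the scrub can take a while on a
big pg, so re-run the list command once it has completed):

  ceph pg deep-scrub 1.65
  # watch the cluster log for the "deep-scrub ... ok/errors" line
  ceph -w | grep 1.65
  rados list-inconsistent-obj 1.65 --format=json-pretty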

>
> On Mon, Nov 12, 2018 at 2:20 PM K.C. Wong  wrote:
>>
>> Hi Brad,
>>
>> I got the following:
>>
>> [root@mgmt01 ~]# ceph health detail
>> HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>> pg 1.65 is active+clean+inconsistent, acting [62,67,47]
>> 1 scrub errors
>> [root@mgmt01 ~]# rados list-inconsistent-obj 1.65
>> No scrub information available for pg 1.65
>> error 2: (2) No such file or directory
>> [root@mgmt01 ~]# rados list-inconsistent-snapset 1.65
>> No scrub information available for pg 1.65
>> error 2: (2) No such file or directory
>>
>> Rather odd output, I’d say; not that I understand what
>> that means. I also tried ceph list-inconsistent-pg:
>>
>> [root@mgmt01 ~]# rados lspools
>> rbd
>> cephfs_data
>> cephfs_metadata
>> .rgw.root
>> default.rgw.control
>> default.rgw.data.root
>> default.rgw.gc
>> default.rgw.log
>> ctrl-p
>> prod
>> corp
>> camp
>> dev
>> default.rgw.users.uid
>> default.rgw.users.keys
>> default.rgw.buckets.index
>> default.rgw.buckets.data
>> default.rgw.buckets.non-ec
>> [root@mgmt01 ~]# for i in $(rados lspools); do rados list-inconsistent-pg 
>> $i; done
>> []
>> ["1.65"]
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>> []
>>
>> So, that’d put the inconsistency in the cephfs_data pool.
>>
>> Thank you for your help,
>>
>> -kc
>>
>> K.C. Wong
>> kcw...@verseon.com
>> M: +1 (408) 769-8235
>>
>> -
>> Confidentiality Notice:
>> This message contains confidential information. If you are not the
>> intended recipient and received this message in error, any use or
>> distribution is strictly prohibited. Please also notify us
>> immediately by return e-mail, and delete this message from your
>> computer system. Thank you.
>> -
>>
>> 4096R/B8995EDE  E527 CBE8 023E 79EA 8BBB  5C77 23A6 92E9 B899 5EDE
>>
>> hkps://hkps.pool.sks-keyservers.net
>>
>> On Nov 11, 2018, at 5:43 PM, Brad Hubbard  wrote:
>>
>> What does "rados list-inconsistent-obj " say?
>>
>> Note that you may have to do a deep scrub to populate the output.
>> On Mon, Nov 12, 2018 at 5:10 AM K.C. Wong  wrote:
>>
>>
>> Hi folks,
>>
>> I would appreciate any pointer as to how I can resolve a
>> PG stuck in “active+clean+inconsistent” state. This has
>> resulted in HEALTH_ERR status for the last 5 days with no
>> end in sight. The state got triggered when one of the drives
>> in the PG returned I/O error. I’ve since replaced the failed
>> drive.
>>
>> I’m running Jewel (out of centos-release-ceph-jewel) on
>> CentOS 7. I’ve tried “ceph pg repair ” and it didn’t seem
>> to do anything. I’ve tried even more drastic measures such as
>> comparing all the files (using filestore) under that PG_head
>> on all 3 copies and then nuking the outlier. Nothing worked.
>>
>> Many thanks,
>>
>> -kc
>>
>> K.C. Wong
>> kcw...@verseon.com
>> M: +1 (408) 769-8235
>>
>> -
>> Confidentiality Notice:
>> This message contains confidential information. If you are not the
>> intended recipient and received this message in error, any use or
>> distribution is strictly prohibited. Please also notify us
>> immediately by return e-mail, and delete this message from your
>> computer system. Thank you.
>> -
>> 4096R/B8995EDE  E527 CBE8 023E 79EA 8BBB  5C77 23A6 92E9 B899 5EDE
>> hkps://hkps.pool.sks-keyservers.net
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>>
>> --
>> Cheers,
>> Brad
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to repair active+clean+inconsistent?

2018-11-11 Thread Brad Hubbard
What does "rados list-inconsistent-obj " say?

Note that you may have to do a deep scrub to populate the output.
On Mon, Nov 12, 2018 at 5:10 AM K.C. Wong  wrote:
>
> Hi folks,
>
> I would appreciate any pointer as to how I can resolve a
> PG stuck in “active+clean+inconsistent” state. This has
> resulted in HEALTH_ERR status for the last 5 days with no
> end in sight. The state got triggered when one of the drives
> in the PG returned I/O error. I’ve since replaced the failed
> drive.
>
> I’m running Jewel (out of centos-release-ceph-jewel) on
> CentOS 7. I’ve tried “ceph pg repair ” and it didn’t seem
> to do anything. I’ve tried even more drastic measures such as
> comparing all the files (using filestore) under that PG_head
> on all 3 copies and then nuking the outlier. Nothing worked.
>
> Many thanks,
>
> -kc
>
> K.C. Wong
> kcw...@verseon.com
> M: +1 (408) 769-8235
>
> -
> Confidentiality Notice:
> This message contains confidential information. If you are not the
> intended recipient and received this message in error, any use or
> distribution is strictly prohibited. Please also notify us
> immediately by return e-mail, and delete this message from your
> computer system. Thank you.
> -
> 4096R/B8995EDE  E527 CBE8 023E 79EA 8BBB  5C77 23A6 92E9 B899 5EDE
> hkps://hkps.pool.sks-keyservers.net
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to subscribe to developers list

2018-11-11 Thread Brad Hubbard
What do you get if you send "help" (without quotes) to m
ajord...@vger.kernel.org ?

On Sun, Nov 11, 2018 at 10:15 AM Cranage, Steve <
scran...@deepspacestorage.com> wrote:

> Can anyone tell me the secret? A colleague tried and failed many times so
> I tried and got this:
>
>
>
>
>
> Steve Cranage
> --_000_SN4PR0701MB3792CB55C8AA7468ADE7FC4DB2C00SN4PR0701MB3792_
>  Command
> '--_000_sn4pr0701mb3792cb55c8aa7468ade7fc4db2c00sn4pr0701mb3792_' not
> recognized.
>  Content-Type: text/plain; charset="us-ascii"
>  Command 'content-type:' not recognized.
>  Content-Transfer-Encoding: quoted-printable
>  Command 'content-transfer-encoding:' not recognized.
> 
>  subscribe+ceph-devel
>  Command 'subscribe+ceph-devel' not recognized.
>
>
>
> According to the server help, the 'subscribe+ceph-devel’ should be correct
> syntax, but apparently not so.
>
>
>
> TIA!
>
> Principal Architect, Co-Founder
>
> DeepSpace Storage
>
> 719-930-6960
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSDs crashing

2018-09-25 Thread Brad Hubbard
On Tue, Sep 25, 2018 at 11:31 PM Josh Haft  wrote:
>
> Hi cephers,
>
> I have a cluster of 7 storage nodes with 12 drives each and the OSD
> processes are regularly crashing. All 84 have crashed at least once in
> the past two days. Cluster is Luminous 12.2.2 on CentOS 7.4.1708,
> kernel version 3.10.0-693.el7.x86_64. I rebooted one of the OSD nodes
> to see if that cleared up the issue, but it did not. This problem has
> been going on for about a month now, but it was much less frequent
> initially - I'd see a crash once every few days or so. I took a look
> through the mailing list and bug reports, but wasn't able to find
> anything resembling this problem.
>
> I am running a second cluster - also 12.2.2, CentOS 7.4.1708, and
> kernel version 3.10.0-693.el7.x86_64 - but I do not see the issue
> there.
>
> Log messages always look similar to the following, and I've pulled out
> the back trace from a core dump as well. The aborting thread always
> looks to be msgr-worker.
>



> #7  0x7f9e731a3a36 in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38
> #8  0x7f9e731a3a63 in std::terminate () at
> ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
> #9  0x7f9e731fa345 in std::(anonymous
> namespace)::execute_native_thread_routine (__p=<optimized out>) at
> ../../../../../libstdc++-v3/src/c++11/thread.cc:92

That is this code executing.

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/src/c%2B%2B11/thread.cc;h=0351f19e042b0701ba3c2597ecec87144fd631d5;hb=cf82a597b0d189857acb34a08725762c4f5afb50#l76

So the problem is we are generating an exception when our thread gets
run, we should probably catch that before it gets to here but that's
another story...

The exception is "buffer::malformed_input: entity_addr_t marker != 1"
and there is some precedent for this
(https://tracker.ceph.com/issues/21660,
https://tracker.ceph.com/issues/24819) but I don't think they are your
issue.

We generated that exception because we encountered an ill-formed
entity_addr_t whilst decoding a message.

Could you open a tracker for this issue and upload the entire log from
a crash, preferably with "debug ms >= 5" but be careful as this will
create very large log files. You can use ceph-post-file to upload
large compressed files.
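Roughly like this, assuming the crashing daemon is osd.N and the default log
path (adjust both to your setup):

  # either add "debug ms = 5" under [osd] in ceph.conf and restart, or
  # inject it into the running daemon and wait for the next crash
  ceph tell osd.N injectargs '--debug_ms 5'
  # after the crash, drop it back to the default and upload the log;
  # note the id that ceph-post-file prints and add it to the tracker
  ceph tell osd.N injectargs '--debug_ms 0/5'
  gzip -c /var/log/ceph/ceph-osd.N.log > /tmp/ceph-osd.N.log.gz
  ceph-post-file /tmp/ceph-osd.N.log.gz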

Let me know the tracker ID here once you've created it.

P.S. This is likely fixed in a later version of Luminous since you
seem to be the only one hitting it. Either that or there is something
unusual about your environment.

>
> Has anyone else seen this? Any suggestions on how to proceed? I do
> intend to upgrade to Mimic but would prefer to do it when the cluster
> is stable.
>
> Thanks for your help.
> Josh
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG inconsistent, "pg repair" not working

2018-09-25 Thread Brad Hubbard
On Tue, Sep 25, 2018 at 7:50 PM Sergey Malinin  wrote:
>
> # rados list-inconsistent-obj 1.92
> {"epoch":519,"inconsistents":[]}

It's likely the epoch has changed since the last scrub and you'll need
to run another scrub to repopulate this data.
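A quick way to confirm that before re-running the list command (pgid taken from
your output; untested sketch):

  # compare the scrub stamps/epochs with the epoch shown above, then kick
  # off a fresh deep scrub and re-run list-inconsistent-obj afterwards
  ceph pg 1.92 query | grep -E 'last_(deep_)?scrub'
  ceph pg deep-scrub 1.92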

>
> September 25, 2018 4:58 AM, "Brad Hubbard"  wrote:
>
> > What does the output of the following command look like?
> >
> > $ rados list-inconsistent-obj 1.92
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [RGWRados]librados: Objecter returned from getxattrs r=-36

2018-09-19 Thread Brad Hubbard
Are you using filestore or bluestore on the OSDs? If filestore what is
the underlying filesystem?

You could try setting debug_osd and debug_filestore to 20 and see if
that gives some more info?
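Something like this against the OSD serving that object (the osd id is a
placeholder; 1/5 and 1/3 are the usual defaults to restore afterwards):

  ceph tell osd.N injectargs '--debug_osd 20 --debug_filestore 20'
  # ... reproduce the failing GET through the gateway ...
  ceph tell osd.N injectargs '--debug_osd 1/5 --debug_filestore 1/3'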
On Wed, Sep 19, 2018 at 12:36 PM fatkun chan  wrote:
>
>
> ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous 
> (stable)
>
> I have a file with long name , when I cat the file through minio client, the 
> error show.
> librados: Objecter returned from getxattrs r=-36
>
>
> the log is come from radosgw
>
> 2018-09-15 03:38:24.763109 7f833c0ed700  2 req 20:0.000272:s3:GET 
> /hand-gesture/:list_bucket:verifying op params
> 2018-09-15 03:38:24.763111 7f833c0ed700  2 req 20:0.000273:s3:GET 
> /hand-gesture/:list_bucket:pre-executing
> 2018-09-15 03:38:24.763112 7f833c0ed700  2 req 20:0.000274:s3:GET 
> /hand-gesture/:list_bucket:executing
> 2018-09-15 03:38:24.763115 7f833c0ed700 10 cls_bucket_list 
> hand-gesture[7f3000c9-66f8-4598-9811-df3800e4469a.804194.12]) start [] 
> num_entries 1001
> 2018-09-15 03:38:24.763822 7f833c0ed700 20 get_obj_state: rctx=0x7f833c0e5790 
> obj=hand-gesture:train_result/mobilenetv2_160_0.35_feature16_pyramid3_minside160_lr0.01_batchsize32_steps2000_limitratio0.5625_slot_blankdata201809041612_bluedata201808302300composite_background_201809111827/201809111827/logs/events.out.tfevents.1536672273.tf-hand-gesture-58-worker-s7uc-0-jsuf7
>  state=0x7f837553c0a0 s->prefetch_data=0
> 2018-09-15 03:38:24.763841 7f833c0ed700 10 librados: getxattrs 
> oid=7f3000c9-66f8-4598-9811-df3800e4469a.804194.12_train_result/mobilenetv2_160_0.35_feature16_pyramid3_minside160_lr0.01_batchsize32_steps2000_limitratio0.5625_slot_blankdata201809041612_bluedata201808302300composite_background_201809111827/201809111827/logs/events.out.tfevents.1536672273.tf-hand-gesture-58-worker-s7uc-0-jsuf7
>  nspace=
> 2018-09-15 03:38:24.764283 7f833c0ed700 10 librados: Objecter returned from 
> getxattrs r=-36
> 2018-09-15 03:38:24.764304 7f833c0ed700  2 req 20:0.001466:s3:GET 
> /hand-gesture/:list_bucket:completing
> 2018-09-15 03:38:24.764308 7f833c0ed700  0 WARNING: set_req_state_err 
> err_no=36 resorting to 500
> 2018-09-15 03:38:24.764355 7f833c0ed700  2 req 20:0.001517:s3:GET 
> /hand-gesture/:list_bucket:op status=-36
> 2018-09-15 03:38:24.764362 7f833c0ed700  2 req 20:0.001524:s3:GET 
> /hand-gesture/:list_bucket:http status=500
> 2018-09-15 03:38:24.764364 7f833c0ed700  1 == req done req=0x7f833c0e7110 
> op status=-36 http_status=500 ==
> 2018-09-15 03:38:24.764371 7f833c0ed700 20 process_request() returned -36
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] what is Implicated osds

2018-08-20 Thread Brad Hubbard
On Tue, Aug 21, 2018 at 2:37 AM, Satish Patel  wrote:
> Folks,
>
> Today i found ceph -s is really slow and just hanging for minute or 2
> minute to give me output also same with "ceph osd tree" output,
> command just hanging long time to give me output..
>
> This is what i am seeing output, one OSD down not sure why its down
> and what is the relation with command running slow?
>
> I am also seeing what does that means? " 369 slow requests are blocked
>> 32 sec. Implicated osds 0,2,3,4,5,6,7,8,9,11"

This is just a hint that these are the osds you should look at in
regard to the slow requests.

What's common about the stale pgs, what pool do they belong too and
what are the configuration details of that pool?

Can you do a pg query on one of the stale pgs?
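For example (the pgid below is just a placeholder, pick one from the
dump_stuck output):

  ceph pg dump_stuck stale
  ceph pg 2.1a query          # look at "up"/"acting" and the peering section
  ceph osd pool ls detail     # size / min_size / crush_rule for each pool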

>
>
> [root@ostack-infra-01-ceph-mon-container-692bea95 ~]# ceph -s
>   cluster:
> id: c369cdc9-35a2-467a-981d-46e3e1af8570
> health: HEALTH_WARN
> Reduced data availability: 53 pgs stale
> 369 slow requests are blocked > 32 sec. Implicated osds
> 0,2,3,4,5,6,7,8,9,11
>
>   services:
> mon: 3 daemons, quorum
> ostack-infra-02-ceph-mon-container-87f0ee0e,ostack-infra-01-ceph-mon-container-692bea95,ostack-infra-03-ceph-mon-container-a92c1c2a
> mgr: ostack-infra-01-ceph-mon-container-692bea95(active),
> standbys: ostack-infra-03-ceph-mon-container-a92c1c2a,
> ostack-infra-02-ceph-mon-container-87f0ee0e
> osd: 12 osds: 11 up, 11 in
>
>   data:
> pools:   5 pools, 656 pgs
> objects: 1461 objects, 11509 MB
> usage:   43402 MB used, 5080 GB / 5122 GB avail
> pgs: 603 active+clean
>  53  stale+active+clean
>
>
>
> [root@ostack-infra-01-ceph-mon-container-692bea95 ~]# ceph osd tree
> ID CLASS WEIGHT  TYPE NAMESTATUS REWEIGHT PRI-AFF
> -1   5.45746 root default
> -3   1.81915 host ceph-osd-01
>  0   ssd 0.45479 osd.0up  1.0 1.0
>  2   ssd 0.45479 osd.2up  1.0 1.0
>  5   ssd 0.45479 osd.5up  1.0 1.0
>  6   ssd 0.45479 osd.6up  1.0 1.0
> -5   1.81915 host ceph-osd-02
>  1   ssd 0.45479 osd.1  down0 1.0
>  3   ssd 0.45479 osd.3up  1.0 1.0
>  4   ssd 0.45479 osd.4up  1.0 1.0
>  7   ssd 0.45479 osd.7up  1.0 1.0
> -7   1.81915 host ceph-osd-03
>  8   ssd 0.45479 osd.8up  1.0 1.0
>  9   ssd 0.45479 osd.9up  1.0 1.0
> 10   ssd 0.45479 osd.10   up  1.0 1.0
> 11   ssd 0.45479 osd.11   up  1.0 1.0
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Jewel 10.2.11] OSD Segmentation fault

2018-08-13 Thread Brad Hubbard
Jewel is almost EOL.

It looks similar to several related issues, one of which is
http://tracker.ceph.com/issues/21826

On Mon, Aug 13, 2018 at 9:19 PM, Alexandru Cucu  wrote:
> Hi,
>
> Already tried zapping the disk. Unfortunately the same segfaults keep
> me from adding the OSD back to the cluster.
>
> I wanted to open an issue on tracker.ceph.com but I can't find the
> "new issue" button.
>
> ---
> Alex Cucu
>
> On Mon, Aug 13, 2018 at 8:24 AM  wrote:
>>
>>
>>
>> Am 3. August 2018 12:03:17 MESZ schrieb Alexandru Cucu :
>> >Hello,
>> >
>>
>> Hello Alex,
>>
>> >Another OSD started randomly crashing with segmentation fault. Haven't
>> >managed to add the last 3 OSDs back to the cluster as the daemons keep
>> >crashing.
>> >
>>
>> An idea could be to remove the OSDs completely from the cluster and add them
>> again after zapping the disks.
>>
>> Hth
>> - Mehmet
>>
>> >---
>> >
>> >-2> 2018-08-03 12:12:52.670076 7f12b6b15700  4 rocksdb:
>> >EVENT_LOG_v1 {"time_micros": 1533287572670073, "job": 3, "event":
>> >"table_file_deletion", "file_number": 4350}
>> >  -1> 2018-08-03 12:12:53.146753 7f12c38d0a80  0 osd.154 89917 load_pgs
>> > 0> 2018-08-03 12:12:57.526910 7f12c38d0a80 -1 *** Caught signal
>> >(Segmentation fault) **
>> > in thread 7f12c38d0a80 thread_name:ceph-osd
>> > ceph version 10.2.11 (e4b061b47f07f583c92a050d9e84b1813a35671e)
>> > 1: (()+0x9f1c2a) [0x7f12c42ddc2a]
>> > 2: (()+0xf5e0) [0x7f12c1dc85e0]
>> > 3: (()+0x34484) [0x7f12c34a6484]
>> > 4: (rocksdb::BlockBasedTable::NewIndexIterator(rocksdb::ReadOptions
>> >const&, rocksdb::BlockIter*,
>> >rocksdb::BlockBasedTable::CachableEntry*)+0x466)
>> >[0x7f12c41e40d6]
>> > 5: (rocksdb::BlockBasedTable::Get(rocksdb::ReadOptions const&,
>> >rocksdb::Slice const&, rocksdb::GetContext*, bool)+0x297)
>> >[0x7f12c41e4b27]
>> > 6: (rocksdb::TableCache::Get(rocksdb::ReadOptions const&,
>> >rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&,
>> >rocksdb::Slice const&, rocksdb::GetContext*, rocksdb::HistogramImpl*,
>> >bool
>> >, int)+0x2a4) [0x7f12c429ff94]
>> > 7: (rocksdb::Version::Get(rocksdb::ReadOptions const&,
>> >rocksdb::LookupKey const&, rocksdb::PinnableSlice*, rocksdb::Status*,
>> >rocksdb::MergeContext*, rocksdb::RangeDelAggregator*, bool*, bool*,
>> >unsigned l
>> >ong*)+0x810) [0x7f12c419bb80]
>> > 8: (rocksdb::DBImpl::GetImpl(rocksdb::ReadOptions const&,
>> >rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&,
>> >rocksdb::PinnableSlice*, bool*)+0x5a4) [0x7f12c424e494]
>> > 9: (rocksdb::DBImpl::Get(rocksdb::ReadOptions const&,
>> >rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&,
>> >rocksdb::PinnableSlice*)+0x19) [0x7f12c424ea19]
>> > 10: (rocksdb::DB::Get(rocksdb::ReadOptions const&,
>> >rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&,
>> >std::string*)+0x95) [0x7f12c4252a45]
>> > 11: (rocksdb::DB::Get(rocksdb::ReadOptions const&, rocksdb::Slice
>> >const&, std::string*)+0x4a) [0x7f12c4251eea]
>> > 12: (RocksDBStore::get(std::string const&, std::string const&,
>> >ceph::buffer::list*)+0xff) [0x7f12c415c31f]
>> > 13: (DBObjectMap::_lookup_map_header(DBObjectMap::MapHeaderLock
>> >const&, ghobject_t const&)+0x5e4) [0x7f12c4110814]
>> > 14: (DBObjectMap::get_values(ghobject_t const&, std::set> >std::less, std::allocator > const&,
>> >std::map,
>> >std::
>> >allocator > >*)+0x5f)
>> >[0x7f12c41f]
>> > 15: (FileStore::omap_get_values(coll_t const&, ghobject_t const&,
>> >std::set,
>> >std::allocator > const&, std::map> >ceph::buffer::list, std::less> >td::string>, std::allocator> >ceph::buffer::list> > >*)+0x197) [0x7f12c4031f77]
>> >16: (PG::_has_removal_flag(ObjectStore*, spg_t)+0x151) [0x7f12c3d8f7c1]
>> > 17: (OSD::load_pgs()+0x5d5) [0x7f12c3cf43e5]
>> > 18: (OSD::init()+0x2086) [0x7f12c3d07096]
>> > 19: (main()+0x2c18) [0x7f12c3c1e088]
>> > 20: (__libc_start_main()+0xf5) [0x7f12c0374c05]
>> > 21: (()+0x3c8847) [0x7f12c3cb4847]
>> > NOTE: a copy of the executable, or `objdump -rdS ` is
>> >needed to interpret this.
>> >---
>> >
>> >Any help would be appreciated.
>> >
>> >Thanks,
>> >Alex Cucu
>> >
>> >On Mon, Jul 30, 2018 at 4:55 PM Alexandru Cucu  wrote:
>> >>
>> >> Hello Ceph users,
>> >>
>> >> We have updated our cluster from 10.2.7 to 10.2.11. A few hours after
>> >> the update, 1 OSD crashed.
>> >> When trying to add the OSD back to the cluster, other 2 OSDs started
>> >> crashing with segmentation fault. Had to mark all 3 OSDs as down as
>> >we
>> >> had stuck PGs and blocked operations and the cluster status was
>> >> HEALTH_ERR.
>> >>
>> >> We have tried various ways to re-add the OSDs back to the cluster but
>> >> after a while they start crashing and won't start anymore. After a
>> >> while they can be started again and marked as in but after some
>> >> rebalancing they will start the crashing imediately after starting.
>> >>
>> >> Here are some logs:
>> >> https://pastebin.com/nCRamgRU
>> >>
>> >> Do you know of any existing bug report that might be related? (I
>> >> couldn't find anything).
>> 

Re: [ceph-users] OSD had suicide timed out

2018-08-08 Thread Brad Hubbard
If, in the above case, osd 13 was not too busy to respond (resource
shortage) then you need to find out why else osd 5, etc. could not
contact it.
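A rough way to check that from the host running osd.5 (addresses taken from the
ping lines above):

  ceph osd find 13              # confirms osd.13's advertised addresses/host
  ping -c3 10.12.3.17           # public/front network
  ping -c3 10.12.125.3          # cluster/back network
  ip -s link                    # any drops/errors on the relevant interfaces?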

On Wed, Aug 8, 2018 at 6:47 PM, Josef Zelenka
 wrote:
> Checked the system load on the host with the OSD that is suiciding currently
> and it's fine, however i can see a noticeably higher IO (around 700), though
> that seems rather like a symptom of the constant flapping/attempting to come
> up to me(it's an SSD based Ceph so this shouldn't cause much harm to it).
> Had a look at one of the osds sending the you_died messages and it seems
> it's attempting to contact osd.13, but ultimately fails.
>
> 8/0 13574/13574/13574) [5,11] r=0 lpr=13574 crt=13592'3654839 lcod
> 13592'3654838 mlcod 13592'3654838 active+clean] publish_stats_to_osd
> 13593:9552151
> 2018-08-08 10:45:16.112344 7effa1d8c700 15 osd.5 pg_epoch: 13593 pg[14.6( v
> 13592'3654839 (13294'3653334,13592'3654839] local-lis/les=13574/13575 n=945
> ec=126/126 lis/c 13574/13574 les/c/f 13575/13578/0 13574/13574/13574) [5,11]
> r=0 lpr=13574 crt=13592'3654839 lcod 13592'3654838 mlcod 13592'3654838
> active+clean] publish_stats_to_osd 13593:9552152
> 2018-08-08 10:45:16.679484 7eff9a57d700 15 osd.5 pg_epoch: 13593 pg[11.15( v
> 13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
> ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
> r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
> active+clean] publish_stats_to_osd 13593:2966967
> 2018-08-08 10:45:17.818135 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
> osd.13 10.12.125.3:0/735946 18  osd_ping(ping e13589 stamp 2018-08-08
> 10:45:17.817238) v4  2004+0+0 (4218069135 0 0) 0x55bb638ba800 con
> 0x55bb65e79800
> 2018-08-08 10:45:17.818176 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
> osd.13 10.12.3.17:0/735946 18  osd_ping(ping e13589 stamp 2018-08-08
> 10:45:17.817238) v4  2004+0+0 (4218069135 0 0) 0x55bb63cd8c00 con
> 0x55bb65e7b000
> 2018-08-08 10:45:18.919053 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
> osd.13 10.12.125.3:0/735946 19  osd_ping(ping e13589 stamp 2018-08-08
> 10:45:18.918149) v4  2004+0+0 (1428835292 0 0) 0x55bb638bb200 con
> 0x55bb65e79800
> 2018-08-08 10:45:18.919598 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
> osd.13 10.12.3.17:0/735946 19  osd_ping(ping e13589 stamp 2018-08-08
> 10:45:18.918149) v4  2004+0+0 (1428835292 0 0) 0x55bb63cd8a00 con
> 0x55bb65e7b000
> 2018-08-08 10:45:21.679563 7eff9a57d700 15 osd.5 pg_epoch: 13593 pg[11.15( v
> 13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
> ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
> r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
> active+clean] publish_stats_to_osd 13593:2966968
> 2018-08-08 10:45:23.020715 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
> osd.13 10.12.125.3:0/735946 20  osd_ping(ping e13589 stamp 2018-08-08
> 10:45:23.018994) v4  2004+0+0 (1018071233 0 0) 0x55bb63bb7200 con
> 0x55bb65e79800
> 2018-08-08 10:45:23.020837 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
> osd.13 10.12.3.17:0/735946 20  osd_ping(ping e13589 stamp 2018-08-08
> 10:45:23.018994) v4  2004+0+0 (1018071233 0 0) 0x55bb63cd8c00 con
> 0x55bb65e7b000
> 2018-08-08 10:45:26.679513 7eff8e565700 15 osd.5 pg_epoch: 13593 pg[11.15( v
> 13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
> ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
> r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
> active+clean] publish_stats_to_osd 13593:2966969
> 2018-08-08 10:45:28.921091 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
> osd.13 10.12.125.3:0/735946 21  osd_ping(ping e13589 stamp 2018-08-08
> 10:45:28.920140) v4  2004+0+0 (2459835898 0 0) 0x55bb638ba800 con
> 0x55bb65e79800
> 2018-08-08 10:45:28.922026 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
> osd.13 10.12.3.17:0/735946 21  osd_ping(ping e13589 stamp 2018-08-08
> 10:45:28.920140) v4  2004+0+0 (2459835898 0 0) 0x55bb63cd8c00 con
> 0x55bb65e7b000
> 2018-08-08 10:45:31.679828 7eff9a57d700 15 osd.5 pg_epoch: 13593 pg[11.15( v
> 13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
> ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
> r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
> active+clean] publish_stats_to_osd 13593:2966970
> 2018-08-08 10:45:33.022697 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
> osd.13 10.12.125.3:0/735946 22  osd_ping(ping e13589 stamp 2018-08-08
> 10:45:33.021217) v4  2004+0+0 (3639738084 0 0) 0x55bb63bb7200 con
> 0x55bb65e79800
>
> Regarding heartbeat messages, 

Re: [ceph-users] OSD had suicide timed out

2018-08-08 Thread Brad Hubbard
Do you see "internal heartbeat not healthy" messages in the log of the
osd that suicides?
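For example, on the node hosting the suiciding osd (the osd id is a
placeholder):

  grep -E 'internal heartbeat not healthy|had (suicide )?timed out' \
      /var/log/ceph/ceph-osd.N.log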

On Wed, Aug 8, 2018 at 5:45 PM, Brad Hubbard  wrote:
> What is the load like on the osd host at the time and what does the
> disk utilization look like?
>
> Also, what does the transaction look like from one of the osds that
> sends the "you died" message with debugging osd 20 and ms 1 enabled?
>
> On Wed, Aug 8, 2018 at 5:34 PM, Josef Zelenka
>  wrote:
>> Thank you for your suggestion, tried it,  really seems like the other osds
>> think the osd is dead(if I understand this right), however the networking
>> seems absolutely fine between the nodes(no issues in graphs etc).
>>
>>-13> 2018-08-08 09:13:58.466119 7fe053d41700  1 -- 10.12.3.17:0/706864
>> <== osd.12 10.12.3.17:6807/4624236 81  osd_ping(ping_reply e13452 stamp
>> 2018-08-08 09:13:58.464608) v4  2004+0+0 (687351303 0 0) 0x55731eb73e00
>> con 0x55731e7d4800
>>-12> 2018-08-08 09:13:58.466140 7fe054542700  1 -- 10.12.3.17:0/706864
>> <== osd.11 10.12.3.16:6812/19232 81  osd_ping(ping_reply e13452 stamp
>> 2018-08-08 09:13:58.464608) v4  2004+0+0 (687351303 0 0) 0x55733c391200
>> con 0x55731e7a5800
>>-11> 2018-08-08 09:13:58.466147 7fe053540700  1 -- 10.12.125.3:0/706864
>> <== osd.11 10.12.125.2:6811/19232 82  osd_ping(you_died e13452 stamp
>> 2018-08-08 09:13:58.464608) v4  2004+0+0 (3111562112 0 0) 0x55731eb66800
>> con 0x55731e7a4000
>>-10> 2018-08-08 09:13:58.466164 7fe054542700  1 -- 10.12.3.17:0/706864
>> <== osd.11 10.12.3.16:6812/19232 82  osd_ping(you_died e13452 stamp
>> 2018-08-08 09:13:58.464608) v4  2004+0+0 (3111562112 0 0) 0x55733c391200
>> con 0x55731e7a5800
>> -9> 2018-08-08 09:13:58.466164 7fe053d41700  1 -- 10.12.3.17:0/706864
>> <== osd.12 10.12.3.17:6807/4624236 82  osd_ping(you_died e13452 stamp
>> 2018-08-08 09:13:58.464608) v4  2004+0+0 (3111562112 0 0) 0x55731eb73e00
>> con 0x55731e7d4800
>> -8> 2018-08-08 09:13:58.466176 7fe053540700  1 -- 10.12.3.17:0/706864
>> <== osd.9 10.12.3.16:6813/10016600 81  osd_ping(ping_reply e13452 stamp
>> 2018-08-08 09:13:58.464608) v4  2004+0+0 (687351303 0 0) 0x55731eb66800
>> con 0x55731e732000
>> -7> 2018-08-08 09:13:58.466200 7fe053d41700  1 -- 10.12.3.17:0/706864
>> <== osd.10 10.12.3.16:6810/2017908 81  osd_ping(ping_reply e13452 stamp
>> 2018-08-08 09:13:58.464608) v4  2004+0+0 (687351303 0 0) 0x55731eb73e00
>> con 0x55731e796800
>> -6> 2018-08-08 09:13:58.466208 7fe053540700  1 -- 10.12.3.17:0/706864
>> <== osd.9 10.12.3.16:6813/10016600 82  osd_ping(you_died e13452 stamp
>> 2018-08-08 09:13:58.464608) v4  2004+0+0 (3111562112 0 0) 0x55731eb66800
>> con 0x55731e732000
>> -5> 2018-08-08 09:13:58.466222 7fe053d41700  1 -- 10.12.3.17:0/706864
>> <== osd.10 10.12.3.16:6810/2017908 82  osd_ping(you_died e13452 stamp
>> 2018-08-08 09:13:58.464608) v4  2004+0+0 (3111562112 0 0) 0x55731eb73e00
>> con 0x55731e796800
>> -4> 2018-08-08 09:13:59.748336 7fe040531700  1 -- 10.12.3.17:6802/706864
>> --> 10.12.3.16:6800/1677830 -- mgrreport(unknown.13 +0-0 packed 742
>> osd_metrics=1) v5 -- 0x55731fa4af00 con 0
>> -3> 2018-08-08 09:13:59.748538 7fe040531700  1 -- 10.12.3.17:6802/706864
>> --> 10.12.3.16:6800/1677830 -- pg_stats(64 pgs tid 0 v 0) v1 --
>> 0x55733cbf4c00 con 0
>> -2> 2018-08-08 09:14:00.953804 7fe0525a1700  1 heartbeat_map is_healthy
>> 'OSD::peering_tp thread 0x7fe03f52f700' had timed out after 15
>> -1> 2018-08-08 09:14:00.953857 7fe0525a1700  1 heartbeat_map is_healthy
>> 'OSD::peering_tp thread 0x7fe03f52f700' had suicide timed out after 150
>>  0> 2018-08-08 09:14:00.970742 7fe03f52f700 -1 *** Caught signal
>> (Aborted) **
>>
>>
>> Could it be that the suiciding OSDs are rejecting the ping somehow? I'm
>> quite confused as on what's really going on here, it seems completely random
>> to me.
>>
>>
>> On 08/08/18 01:51, Brad Hubbard wrote:
>>>
>>> Try to work out why the other osds are saying this one is down. Is it
>>> because this osd is too busy to respond or something else.
>>>
>>> debug_ms = 1 will show you some message debugging which may help.
>>>
>>> On Tue, Aug 7, 2018 at 10:34 PM, Josef Zelenka
>>>  wrote:
>>>>
>>>> To follow up, I did some further digging with debug_osd=20/20 and it
>>>> appears
>>>> as if there's no traffic to the OSD, even though it comes UP for the

Re: [ceph-users] OSD had suicide timed out

2018-08-08 Thread Brad Hubbard
What is the load like on the osd host at the time and what does the
disk utilization look like?

Also, what does the transaction look like from one of the osds that
sends the "you died" message with debugging osd 20 and ms 1 enabled?

On Wed, Aug 8, 2018 at 5:34 PM, Josef Zelenka
 wrote:
> Thank you for your suggestion, tried it,  really seems like the other osds
> think the osd is dead(if I understand this right), however the networking
> seems absolutely fine between the nodes(no issues in graphs etc).
>
>-13> 2018-08-08 09:13:58.466119 7fe053d41700  1 -- 10.12.3.17:0/706864
> <== osd.12 10.12.3.17:6807/4624236 81  osd_ping(ping_reply e13452 stamp
> 2018-08-08 09:13:58.464608) v4  2004+0+0 (687351303 0 0) 0x55731eb73e00
> con 0x55731e7d4800
>-12> 2018-08-08 09:13:58.466140 7fe054542700  1 -- 10.12.3.17:0/706864
> <== osd.11 10.12.3.16:6812/19232 81  osd_ping(ping_reply e13452 stamp
> 2018-08-08 09:13:58.464608) v4  2004+0+0 (687351303 0 0) 0x55733c391200
> con 0x55731e7a5800
>-11> 2018-08-08 09:13:58.466147 7fe053540700  1 -- 10.12.125.3:0/706864
> <== osd.11 10.12.125.2:6811/19232 82  osd_ping(you_died e13452 stamp
> 2018-08-08 09:13:58.464608) v4  2004+0+0 (3111562112 0 0) 0x55731eb66800
> con 0x55731e7a4000
>-10> 2018-08-08 09:13:58.466164 7fe054542700  1 -- 10.12.3.17:0/706864
> <== osd.11 10.12.3.16:6812/19232 82  osd_ping(you_died e13452 stamp
> 2018-08-08 09:13:58.464608) v4  2004+0+0 (3111562112 0 0) 0x55733c391200
> con 0x55731e7a5800
> -9> 2018-08-08 09:13:58.466164 7fe053d41700  1 -- 10.12.3.17:0/706864
> <== osd.12 10.12.3.17:6807/4624236 82  osd_ping(you_died e13452 stamp
> 2018-08-08 09:13:58.464608) v4  2004+0+0 (3111562112 0 0) 0x55731eb73e00
> con 0x55731e7d4800
> -8> 2018-08-08 09:13:58.466176 7fe053540700  1 -- 10.12.3.17:0/706864
> <== osd.9 10.12.3.16:6813/10016600 81  osd_ping(ping_reply e13452 stamp
> 2018-08-08 09:13:58.464608) v4  2004+0+0 (687351303 0 0) 0x55731eb66800
> con 0x55731e732000
> -7> 2018-08-08 09:13:58.466200 7fe053d41700  1 -- 10.12.3.17:0/706864
> <== osd.10 10.12.3.16:6810/2017908 81  osd_ping(ping_reply e13452 stamp
> 2018-08-08 09:13:58.464608) v4  2004+0+0 (687351303 0 0) 0x55731eb73e00
> con 0x55731e796800
> -6> 2018-08-08 09:13:58.466208 7fe053540700  1 -- 10.12.3.17:0/706864
> <== osd.9 10.12.3.16:6813/10016600 82  osd_ping(you_died e13452 stamp
> 2018-08-08 09:13:58.464608) v4  2004+0+0 (3111562112 0 0) 0x55731eb66800
> con 0x55731e732000
> -5> 2018-08-08 09:13:58.466222 7fe053d41700  1 -- 10.12.3.17:0/706864
> <== osd.10 10.12.3.16:6810/2017908 82  osd_ping(you_died e13452 stamp
> 2018-08-08 09:13:58.464608) v4  2004+0+0 (3111562112 0 0) 0x55731eb73e00
> con 0x55731e796800
> -4> 2018-08-08 09:13:59.748336 7fe040531700  1 -- 10.12.3.17:6802/706864
> --> 10.12.3.16:6800/1677830 -- mgrreport(unknown.13 +0-0 packed 742
> osd_metrics=1) v5 -- 0x55731fa4af00 con 0
> -3> 2018-08-08 09:13:59.748538 7fe040531700  1 -- 10.12.3.17:6802/706864
> --> 10.12.3.16:6800/1677830 -- pg_stats(64 pgs tid 0 v 0) v1 --
> 0x55733cbf4c00 con 0
> -2> 2018-08-08 09:14:00.953804 7fe0525a1700  1 heartbeat_map is_healthy
> 'OSD::peering_tp thread 0x7fe03f52f700' had timed out after 15
> -1> 2018-08-08 09:14:00.953857 7fe0525a1700  1 heartbeat_map is_healthy
> 'OSD::peering_tp thread 0x7fe03f52f700' had suicide timed out after 150
>  0> 2018-08-08 09:14:00.970742 7fe03f52f700 -1 *** Caught signal
> (Aborted) **
>
>
> Could it be that the suiciding OSDs are rejecting the ping somehow? I'm
> quite confused as on what's really going on here, it seems completely random
> to me.
>
>
> On 08/08/18 01:51, Brad Hubbard wrote:
>>
>> Try to work out why the other osds are saying this one is down. Is it
>> because this osd is too busy to respond or something else.
>>
>> debug_ms = 1 will show you some message debugging which may help.
>>
>> On Tue, Aug 7, 2018 at 10:34 PM, Josef Zelenka
>>  wrote:
>>>
>>> To follow up, I did some further digging with debug_osd=20/20 and it
>>> appears
>>> as if there's no traffic to the OSD, even though it comes UP for the
>>> cluster
>>> (this started happening on another OSD in the cluster today, same stuff):
>>>
>>> -27> 2018-08-07 14:10:55.146531 7f9fce3cd700 10 osd.0 12560
>>> handle_osd_ping osd.17 10.12.3.17:6811/19661 says i am down in 12566
>>> -26> 2018-08-07 14:10:55.146542 7f9fcebce700 10 osd.0 12560
>>> handle_osd_ping osd.12 10.12.125.3:6807/4624236 says i am down in 12566
>>> -25> 2018-0

Re: [ceph-users] OSD had suicide timed out

2018-08-07 Thread Brad Hubbard
Try to work out why the other osds are saying this one is down. Is it
because this osd is too busy to respond or something else.

debug_ms = 1 will show you some message debugging which may help.
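Since the osd keeps crashing and restarting, putting it in ceph.conf is the
easiest way to make sure it survives the restarts (injected values are lost
when the daemon dies), e.g.:

  [osd]
      debug ms = 1
      debug osd = 20

then restart the affected osd and look for the osd_ping / you_died exchanges
in its log.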

On Tue, Aug 7, 2018 at 10:34 PM, Josef Zelenka
 wrote:
> To follow up, I did some further digging with debug_osd=20/20 and it appears
> as if there's no traffic to the OSD, even though it comes UP for the cluster
> (this started happening on another OSD in the cluster today, same stuff):
>
>-27> 2018-08-07 14:10:55.146531 7f9fce3cd700 10 osd.0 12560
> handle_osd_ping osd.17 10.12.3.17:6811/19661 says i am down in 12566
>-26> 2018-08-07 14:10:55.146542 7f9fcebce700 10 osd.0 12560
> handle_osd_ping osd.12 10.12.125.3:6807/4624236 says i am down in 12566
>-25> 2018-08-07 14:10:55.146551 7f9fcf3cf700 10 osd.0 12560
> handle_osd_ping osd.13 10.12.3.17:6805/186262 says i am down in 12566
>-24> 2018-08-07 14:10:55.146564 7f9fce3cd700 20 osd.0 12559
> share_map_peer 0x56308a9d already has epoch 12566
>-23> 2018-08-07 14:10:55.146576 7f9fcebce700 20 osd.0 12559
> share_map_peer 0x56308abb9800 already has epoch 12566
>-22> 2018-08-07 14:10:55.146590 7f9fcf3cf700 20 osd.0 12559
> share_map_peer 0x56308abb1000 already has epoch 12566
>-21> 2018-08-07 14:10:55.146600 7f9fce3cd700 10 osd.0 12560
> handle_osd_ping osd.15 10.12.125.3:6813/49064793 says i am down in 12566
>-20> 2018-08-07 14:10:55.146609 7f9fcebce700 10 osd.0 12560
> handle_osd_ping osd.16 10.12.3.17:6801/1018363 says i am down in 12566
>-19> 2018-08-07 14:10:55.146619 7f9fcf3cf700 10 osd.0 12560
> handle_osd_ping osd.11 10.12.3.16:6812/19232 says i am down in 12566
>-18> 2018-08-07 14:10:55.146643 7f9fcf3cf700 20 osd.0 12559
> share_map_peer 0x56308a9d already has epoch 12566
>-17> 2018-08-07 14:10:55.146653 7f9fcf3cf700 10 osd.0 12560
> handle_osd_ping osd.15 10.12.3.17:6812/49064793 says i am down in 12566
>-16> 2018-08-07 14:10:55.448468 7f9fcabdd700 10 osd.0 12560
> tick_without_osd_lock
>-15> 2018-08-07 14:10:55.448491 7f9fcabdd700 20 osd.0 12559
> can_inc_scrubs_pending 0 -> 1 (max 1, active 0)
>-14> 2018-08-07 14:10:55.448497 7f9fcabdd700 20 osd.0 12560
> scrub_time_permit should run between 0 - 24 now 14 = yes
>-13> 2018-08-07 14:10:55.448525 7f9fcabdd700 20 osd.0 12560
> scrub_load_below_threshold loadavg 2.31 < daily_loadavg 2.68855 and < 15m
> avg 2.63 = yes
>-12> 2018-08-07 14:10:55.448535 7f9fcabdd700 20 osd.0 12560 sched_scrub
> load_is_low=1
>-11> 2018-08-07 14:10:55.448555 7f9fcabdd700 10 osd.0 12560 sched_scrub
> 15.112 scheduled at 2018-08-07 15:03:15.052952 > 2018-08-07 14:10:55.448494
>-10> 2018-08-07 14:10:55.448563 7f9fcabdd700 20 osd.0 12560 sched_scrub
> done
> -9> 2018-08-07 14:10:55.448565 7f9fcabdd700 10 osd.0 12559
> promote_throttle_recalibrate 0 attempts, promoted 0 objects and 0 bytes;
> target 25 obj/sec or 5120 k bytes/sec
> -8> 2018-08-07 14:10:55.448568 7f9fcabdd700 20 osd.0 12559
> promote_throttle_recalibrate  new_prob 1000
> -7> 2018-08-07 14:10:55.448569 7f9fcabdd700 10 osd.0 12559
> promote_throttle_recalibrate  actual 0, actual/prob ratio 1, adjusted
> new_prob 1000, prob 1000 -> 1000
> -6> 2018-08-07 14:10:55.507159 7f9faab9d700 20 osd.0 op_wq(5) _process
> empty q, waiting
> -5> 2018-08-07 14:10:55.812434 7f9fb5bb3700 20 osd.0 op_wq(7) _process
> empty q, waiting
> -4> 2018-08-07 14:10:56.236584 7f9fcd42e700  1 heartbeat_map is_healthy
> 'OSD::osd_op_tp thread 0x7f9fa7396700' had timed out after 60
> -3> 2018-08-07 14:10:56.236618 7f9fcd42e700  1 heartbeat_map is_healthy
> 'OSD::osd_op_tp thread 0x7f9fb33ae700' had timed out after 60
> -2> 2018-08-07 14:10:56.236621 7f9fcd42e700  1 heartbeat_map is_healthy
> 'OSD::peering_tp thread 0x7f9fba3bc700' had timed out after 15
> -1> 2018-08-07 14:10:56.236640 7f9fcd42e700  1 heartbeat_map is_healthy
> 'OSD::peering_tp thread 0x7f9fba3bc700' had suicide timed out after 150
>  0> 2018-08-07 14:10:56.245420 7f9fba3bc700 -1 *** Caught signal
> (Aborted) **
>  in thread 7f9fba3bc700 thread_name:tp_peering
>
> THe osd cyclically crashes and comes back up. I tried modifying the recovery
> etc timeouts, but no luck - the situation is still the same. Regarding the
> radosgw, across all nodes, after starting the rgw process, i only get this:
>
> 2018-08-07 14:32:17.852785 7f482dcaf700  2
> RGWDataChangesLog::ChangesRenewThread: start
>
> I found this thread in the ceph mailing list
> (http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-June/018956.html)
> but I'm not sure if this is the same thing(albeit, it's the same error), as
> I don't use s3 acls/expiration in my cluster(if it's set to a default, I'm
> not aware of it)
>
>
>
>
> On 06/08/18 16:30, Josef Zelenka wrote:
>>
>> Hi,
>>
>> i'm running a cluster on Luminous(12.2.5), Ubuntu 16.04 - configuration is
>> 3 nodes, 6 drives each(though i have encountered this on a different
>> cluster, similar hardware, 

Re: [ceph-users] Bluestore OSD Segfaults (12.2.5/12.2.7)

2018-08-07 Thread Brad Hubbard
Looks like https://tracker.ceph.com/issues/21826 which is a dup of
https://tracker.ceph.com/issues/20557

On Wed, Aug 8, 2018 at 1:49 AM, Thomas White  wrote:
> Hi all,
>
> We have recently begun switching over to Bluestore on our Ceph cluster, 
> currently on 12.2.7. We first began encountering segfaults on Bluestore 
> during 12.2.5, but strangely these segfaults apply exclusively to our SSD 
> pools and not the PCIE/HDD disks. We upgraded to 12.2.7 last week to get 
> clear of the issues known within 12.2.6 and hoping it may address our 
> bluestore issues, but to no avail, and upgrading to mimic is not feasible for 
> us right away as this is a production environment.
>
> I have attached one of the OSD logs which are experiencing the segfault, as 
> well as the recommended command to interpret the debug information. 
> Unfortunately at present due to the 403 I am unable to open a bug tracker for 
> this.
>
> OSD Log: https://transfer.sh/AYQ8Y/ceph-osd.123.log
> OSD Binary debug: https://transfer.sh/FOiLv/ceph-osd-123-binary.txt.tar.gz
>
> The disks in use are Intel DC S3710s 800G.
>
> These OSDs were previously filestore and fully operational, and the procedure 
> for migrating these was to as usual mark as out, await recovery, zap and 
> redeploy. We further used DD to ensure the disk was fully wiped and performed 
> smartctl tests to rule out errors with the disk performance, but were unable 
> to find any faults.
>
> What may be unusual is only some of the SSDs are encountering this segfault 
> so far. On one host where we have 8 OSDs, only 2 of these are hitting the 
> segfaults so far. However, we have noticed the new OSDs are considerably more 
> temperamental to be marked as down despite minimal load.
>
> Any advice anyone could offer on this would be great.
>
> Kind Regards,
>
> Tom
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous OSD crashes every few seconds: FAILED assert(0 == "past_interval end mismatch")

2018-08-01 Thread Brad Hubbard
If you don't already know why, you should investigate why your cluster
could not recover after the loss of a single osd.

Your solution seems valid given your description.
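If it happens again, some starting points for that investigation (2.621 is just
the pg from your log, substitute any pg that gets stuck down):

  ceph osd pool ls detail       # size / min_size / crush_rule per pool
  ceph osd tree                 # enough hosts per failure domain?
  ceph pg 2.621 query           # "blocked_by" and the peering section
  ceph pg dump_stuck unclean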


On Thu, Aug 2, 2018 at 12:15 PM, J David  wrote:
> On Wed, Aug 1, 2018 at 9:53 PM, Brad Hubbard  wrote:
>> What is the status of the cluster with this osd down and out?
>
> Briefly, miserable.
>
> All client IO was blocked.
>
> 36 pgs were stuck “down.”  pg query reported that they were blocked by
> that OSD, despite that OSD not holding any replicas for them, with
> diagnostics (now gone off of scrollback, sorry) about how bringing
> that OSD online or marking it lost might resolve the issue.
>
> With blocked IO and pgs stuck “down” I was not at all comfortable
> marking the OSD lost.
>
> Both conditions resolved after taking the steps outlined in the post I
> just made to ceph-users.
>
> Thanks!
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous OSD crashes every few seconds: FAILED assert(0 == "past_interval end mismatch")

2018-08-01 Thread Brad Hubbard
What is the status of the cluster with this osd down and out?

On Thu, Aug 2, 2018 at 5:42 AM, J David  wrote:
> Hello all,
>
> On Luminous 12.2.7, during the course of recovering from a failed OSD,
> one of the other OSDs started repeatedly crashing every few seconds
> with an assertion failure:
>
> 2018-08-01 12:17:20.584350 7fb50eded700 -1 log_channel(cluster) log
> [ERR] : 2.621 past_interal bound [19300,21449) end does not match
> required [21374,21447) end
> /build/ceph-12.2.7/src/osd/PG.cc: In function 'void
> PG::check_past_interval_bounds() const' thread 7fb50eded700 time
> 2018-08-01 12:17:20.584367
> /build/ceph-12.2.7/src/osd/PG.cc: 847: FAILED assert(0 ==
> “past_interval end mismatch")
>
> The console output of a run of this OSD is here:
>
> https://pastebin.com/WSjsVwVu
>
> The last 512k worth of the log file for this OSD is here:
>
> https://pastebin.com/rYQkMatA
>
> Currently I have “debug osd = 5/5” in ceph.conf, but if other values
> would shed useful light, this problem  is easy to reproduce.
>
> There are no disk errors or problems that I can see with the OSD that
> won’t stay running.
>
> Does anyone know what happened here, and whether it's recoverable?
>
> Thanks for any advice!
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] fyi: Luminous 12.2.7 pulled wrong osd disk, resulted in node down

2018-08-01 Thread Brad Hubbard
On Wed, Aug 1, 2018 at 10:38 PM, Marc Roos  wrote:
>
>
> Today we pulled the wrong disk from a ceph node. And that made the whole
> node go down/be unresponsive. Even to a simple ping. I cannot find to
> much about this in the log files. But I expect that the
> /usr/bin/ceph-osd process caused a kernel panic.

That would most likely be a kernel bug. Someone would probably need to
look at a vmcore to work out what happened.
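If you want to capture one for next time, a minimal kdump setup on CentOS 7
looks roughly like this (a reboot is needed for the crashkernel reservation to
take effect):

  yum install -y kexec-tools
  grubby --update-kernel=ALL --args="crashkernel=auto"
  systemctl enable kdump
  reboot
  # afterwards "systemctl status kdump" should show it active, and a panic
  # will leave a vmcore under /var/crash/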

>
> Linux c01 3.10.0-693.11.1.el7.x86_64
> CentOS Linux release 7.4.1708 (Core)
> libcephfs2-12.2.7-0.el7.x86_64
> ceph-mon-12.2.7-0.el7.x86_64
> nfs-ganesha-ceph-2.6.1-0.1.el7.x86_64
> ceph-selinux-12.2.7-0.el7.x86_64
> ceph-osd-12.2.7-0.el7.x86_64
> ceph-mgr-12.2.7-0.el7.x86_64
> ceph-12.2.7-0.el7.x86_64
> python-cephfs-12.2.7-0.el7.x86_64
> ceph-common-12.2.7-0.el7.x86_64
> ceph-mds-12.2.7-0.el7.x86_64
> ceph-radosgw-12.2.7-0.el7.x86_64
> ceph-base-12.2.7-0.el7.x86_64
>
> Aug  1 11:01:01 c02 systemd: Started Session 8331 of user root.
> Aug  1 11:01:01 c02 systemd: Starting Session 8331 of user root.
> Aug  1 11:01:01 c02 systemd: Starting Session 8331 of user root.
> Aug  1 11:03:08 c03 kernel: XFS (sdb1): xfs_do_force_shutdown(0x2)
> called from line 1200 of file fs/xfs/xfs_log.c.  Return address =
> 0xc0232e60
> Aug  1 11:03:08 c03 kernel: XFS (sdb1): xfs_do_force_shutdown(0x2)
> called from line 1200 of file fs/xfs/xfs_log.c.  Return address =
> 0xc0232e60
> Aug  1 11:03:33 c03 kernel: XFS (sdf1): xfs_do_force_shutdown(0x2)
> called from line 1200 of file fs/xfs/xfs_log.c.  Return address =
> 0xc0232e60
> Aug  1 11:03:33 c03 kernel: XFS (sdf1): xfs_do_force_shutdown(0x2)
> called from line 1200 of file fs/xfs/xfs_log.c.  Return address =
> 0xc0232e60
> Aug  1 11:03:34 c02 kernel: libceph: osd5 down
> Aug  1 11:03:34 c02 kernel: libceph: osd5 down
> Aug  1 11:05:04 c02 ceph-osd: 2018-08-01 11:05:04.656719 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6816 osd.12
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:44.656717)
> Aug  1 11:05:04 c02 ceph-osd: 2018-08-01 11:05:04.656719 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6816 osd.12
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:44.656717)
> Aug  1 11:05:04 c02 ceph-osd: 2018-08-01 11:05:04.656746 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6812 osd.14
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:44.656717)
> Aug  1 11:05:04 c02 ceph-osd: 2018-08-01 11:05:04.656761 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6804 osd.15
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:44.656717)
> Aug  1 11:05:04 c02 ceph-osd: 2018-08-01 11:05:04.656773 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6814 osd.16
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:44.656717)
> Aug  1 11:05:04 c02 ceph-osd: 2018-08-01 11:05:04.656746 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6812 osd.14
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:44.656717)
> Aug  1 11:05:04 c02 ceph-osd: 2018-08-01 11:05:04.656761 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6804 osd.15
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:44.656717)
> Aug  1 11:05:04 c02 ceph-osd: 2018-08-01 11:05:04.656773 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6814 osd.16
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:44.656717)
> Aug  1 11:05:05 c02 ceph-osd: 2018-08-01 11:05:05.657034 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6816 osd.12
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:45.657031)
> Aug  1 11:05:05 c02 ceph-osd: 2018-08-01 11:05:05.657034 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6816 osd.12
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:45.657031)
> Aug  1 11:05:05 c02 ceph-osd: 2018-08-01 11:05:05.657067 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6812 osd.14
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:45.657031)
> Aug  1 11:05:05 c02 ceph-osd: 2018-08-01 11:05:05.657079 7f1f1e764700 -1
> osd.9 22452 heartbeat_check: no reply from 192.168.10.113:6804 osd.15
> since back 2018-08-01 11:04:44.365869 front 2018-08-01 11:04:44.365869
> (cutoff 2018-08-01 11:04:45.657031)
> Aug  1 11:05:05 c02 ceph-osd: 2018-08-01 11:05:05.657089 

Re: [ceph-users] OMAP warning ( again )

2018-08-01 Thread Brad Hubbard
rgw is not really my area but I'd suggest before you do *anything* you
establish which object it is talking about.
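Something along these lines should narrow it down (the ids below are
placeholders):

  # the warning names the object (a bucket index shard) in the cluster log
  grep 'Large omap object found' /var/log/ceph/ceph.log
  # it may also show up in the log of the osd that did the deep scrub
  grep 'Large omap object' /var/log/ceph/ceph-osd.*.log
  # the object name looks like ".dir.<bucket_id>[.<shard>]"; map it back
  radosgw-admin metadata list bucket.instance | grep <bucket_id>
  radosgw-admin bucket stats --bucket=<bucket-name> | grep num_objects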

On Thu, Aug 2, 2018 at 8:08 AM, Brent Kennedy  wrote:
> Ceph health detail gives this:
> HEALTH_WARN 1 large omap objects
> LARGE_OMAP_OBJECTS 1 large omap objects
> 1 large objects found in pool '.rgw.buckets.index'
> Search the cluster log for 'Large omap object found' for more details.
>
> The ceph.log file on the monitor server only shows the 1 large omap objects 
> message.
>
> I looked further into the issue again and remembered it was related to bucket 
> sharding.  I then remembered that in Luminous it was supposed to dynamic. I 
> went through the process this time of checking to see what the shards were 
> set to for one of the buckets we have and the max shards is still set to 0.  
> The blog posting about it says that there isn’t anything we have to do, but I 
> am wondering if the same is true for clusters that were upgraded to luminous 
> from older versions.
>
> Do I need to run this: radosgw-admin reshard add --bucket=<bucket-name>
> --num-shards=<num-shards> for every bucket to get that going?
>
> When I look at a bucket ( BKTEST ), it shows num_shards as 0:
> root@ukpixmon1:/var/log/ceph# radosgw-admin metadata get 
> bucket.instance:BKTEST:default.7320.3
> {
> "key": "bucket.instance:BKTEST:default.7320.3",
> "ver": {
> "tag": "_JFn84AijvH8aWXWXyvSeKpZ",
> "ver": 1
> },
> "mtime": "2018-01-10 18:50:07.994194Z",
> "data": {
> "bucket_info": {
> "bucket": {
> "name": "BKTEST",
> "marker": "default.7320.3",
> "bucket_id": "default.7320.3",
> "tenant": "",
> "explicit_placement": {
> "data_pool": ".rgw.buckets",
> "data_extra_pool": ".rgw.buckets.extra",
> "index_pool": ".rgw.buckets.index"
> }
> },
> "creation_time": "2016-03-09 17:23:50.00Z",
> "owner": "zz",
> "flags": 0,
> "zonegroup": "default",
> "placement_rule": "default-placement",
> "has_instance_obj": "true",
> "quota": {
> "enabled": false,
> "check_on_raw": false,
> "max_size": -1024,
> "max_size_kb": 0,
> "max_objects": -1
> },
> "num_shards": 0,
> "bi_shard_hash_type": 0,
> "requester_pays": "false",
> "has_website": "false",
> "swift_versioning": "false",
> "swift_ver_location": "",
>     "index_type": 0,
> "mdsearch_config": [],
> "reshard_status": 0,
> "new_bucket_instance_id": ""
>
> When I run that shard setting to change the number of shards:
> "radosgw-admin reshard add --bucket=BKTEST --num-shards=2"
>
> Then run to get the status:
> "radosgw-admin reshard list"
>
> [
> {
> "time": "2018-08-01 21:58:13.306381Z",
> "tenant": "",
> "bucket_name": "BKTEST",
> "bucket_id": "default.7320.3",
> "new_instance_id": "",
> "old_num_shards": 1,
> "new_num_shards": 2
> }
> ]
>
> If it was 0, why does it say old_num_shards was 1?
>
> -Brent
>
> -Original Message-
> From: Brad Hubbard [mailto:bhubb...@redhat.com]
> Sent: Tuesday, July 31, 2018 9:07 PM
> To: Brent Kennedy 
> Cc: ceph-users 
> Subject: Re: [ceph-users] OMAP warning ( again )
>
> Search the cluster log for 'Large omap object found' for more details.
>
> On Wed, Aug 1, 2018 at 3:50 AM, Brent Kennedy  wrote:
>> Upgraded from 12.2.5 to 12.2.6, got a “1 large omap objects” warning
>> message, then upgraded to 12.2.7 and the message went away.  I just
>> added four OSDs to balance out the cluster ( we had some servers with
>> fewer drives in them; jbod config ) 

Re: [ceph-users] OMAP warning ( again )

2018-07-31 Thread Brad Hubbard
Search the cluster log for 'Large omap object found' for more details.

On Wed, Aug 1, 2018 at 3:50 AM, Brent Kennedy  wrote:
> Upgraded from 12.2.5 to 12.2.6, got a “1 large omap objects” warning
> message, then upgraded to 12.2.7 and the message went away.  I just added
> four OSDs to balance out the cluster ( we had some servers with fewer drives
> in them; jbod config ) and now the “1 large omap objects” warning message is
> back.  I did some googlefoo to try to figure out what it means and then how
> to correct it, but the how to correct it is a bit vague.
>
>
>
> We use rados gateways for all storage, so everything is in the .rgw.buckets
> pool, which I gather from research is why we are getting the warning message
> ( there are millions of objects in there ).
>
>
>
> Is there an if/then process to clearing this error message?
>
>
>
> Regards,
>
> -Brent
>
>
>
> Existing Clusters:
>
> Test: Luminous 12.2.7 with 3 osd servers, 1 mon/man, 1 gateway ( all virtual
> )
>
> US Production: Firefly with 4 osd servers, 3 mons, 3 gateways behind haproxy
> LB
>
> UK Production: Luminous 12.2.7 with 8 osd servers, 3 mons/man, 3 gateways
> behind haproxy LB
>
>
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Self shutdown of 1 whole system (Derbian stretch/Ceph 12.2.7/bluestore)

2018-07-23 Thread Brad Hubbard
Ceph doesn't shut down systems as in kill or reboot the box if that's
what you're saying?

On Mon, Jul 23, 2018 at 5:04 PM, Nicolas Huillard  wrote:
> Le lundi 23 juillet 2018 à 11:07 +0700, Konstantin Shalygin a écrit :
>> > I even have no fancy kernel or device, just real standard Debian.
>> > The
>> > uptime was 6 days since the upgrade from 12.2.6...
>>
>> Nicolas, you should upgrade your 12.2.6 to 12.2.7 due bugs in this
>> release.
>
> That was done (cf. subject).
> This is happening with 12.2.7, fresh and 6 days old.
>
> --
> Nicolas Huillard
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] active+clean+inconsistent PGs after upgrade to 12.2.7

2018-07-19 Thread Brad Hubbard
I've updated the tracker.

On Thu, Jul 19, 2018 at 7:51 PM, Robert Sander
 wrote:
> On 19.07.2018 11:15, Ronny Aasen wrote:
>
>> Did you upgrade from 12.2.5 or 12.2.6 ?
>
> Yes.
>
>> sounds like you hit the reason for the 12.2.7 release
>>
>> read : https://ceph.com/releases/12-2-7-luminous-released/
>>
>> features should arrive in 12.2.8 that can deal with the "objects are
>> in sync but checksums are wrong" scenario.
>
> I already read that before the upgrade but did not expect to be
> affected by the bug.
>
> The pools with the inconsistent PGs only have RBDs stored and not CephFS
> nor RGW data.
>
> I have restarted the OSDs with "osd skip data digest = true" as a "ceph
> tell" is not able to inject this argument into the running processes.
>
> Let's see if this works out.
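
For anyone following along: this option indeed cannot be injected at runtime, so the usual approach is to add "osd skip data digest = true" under [osd] in /etc/ceph/ceph.conf on every OSD host and restart the OSDs, roughly (pg id is a placeholder):

# systemctl restart ceph-osd.target

If a PG still shows up as inconsistent afterwards, a repair of that PG is the usual next step:

# ceph pg repair <pg_id>
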
>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> https://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Amtsgericht Berlin-Charlottenburg - HRB 93818 B
> Geschäftsführer: Peer Heinlein - Sitz: Berlin
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Omap warning in 12.2.6

2018-07-19 Thread Brad Hubbard
Search the cluster log for 'Large omap object found' for more details.
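
The log entry names the pool, PG and object; for RGW that object is normally a bucket index ( .dir.<bucket_instance_id>[.<shard>] ), which can be mapped back to a bucket roughly like this (log path and ids are placeholders, assuming the default cluster log location on a mon host):

# grep -i 'large omap object' /var/log/ceph/ceph.log
# radosgw-admin metadata list bucket.instance | grep <bucket_instance_id>
# radosgw-admin bucket stats --bucket=<bucket_name>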

On Fri, Jul 20, 2018 at 5:13 AM, Brent Kennedy  wrote:
> I just upgraded our cluster to 12.2.6 and now I see this warning about 1
> large omap object.  I looked and it seems this warning was just added in
> 12.2.6.  I found a few discussions on what is was but not much information
> on addressing it properly.  Our cluster uses rgw exclusively with just a few
> buckets in the .rgw.buckets pool.  Our largest bucket has millions of
> objects in it.
>
>
>
> Any thoughts or links on this?
>
>
>
>
>
> Regards,
>
> -Brent
>
>
>
> Existing Clusters:
>
> Test: Luminous 12.2.6 with 3 osd servers, 1 mon/man, 1 gateway ( all virtual
> )
>
> US Production: Firefly with 4 osd servers, 3 mons, 3 gateways behind haproxy
> LB
>
> UK Production: Luminous 12.2.6 with 8 osd servers, 3 mons/man, 3 gateways
> behind haproxy LB
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Recovery from 12.2.5 (corruption) -> 12.2.6 (hair on fire) -> 13.2.0 (some objects inaccessible and CephFS damaged)

2018-07-18 Thread Brad Hubbard
On Thu, Jul 19, 2018 at 12:47 PM, Troy Ablan  wrote:
>
>
> On 07/18/2018 06:37 PM, Brad Hubbard wrote:
>> On Thu, Jul 19, 2018 at 2:48 AM, Troy Ablan  wrote:
>>>
>>>
>>> On 07/17/2018 11:14 PM, Brad Hubbard wrote:
>>>>
>>>> On Wed, Jul 18, 2018 at 2:57 AM, Troy Ablan  wrote:
>>>>>
>>>>> I was on 12.2.5 for a couple weeks and started randomly seeing
>>>>> corruption, moved to 12.2.6 via yum update on Sunday, and all hell broke
>>>>> loose.  I panicked and moved to Mimic, and when that didn't solve the
>>>>> problem, only then did I start to root around in mailing list archives.
>>>>>
>>>>> It appears I can't downgrade OSDs back to Luminous now that 12.2.7 is
>>>>> out, but I'm unsure how to proceed now that the damaged cluster is
>>>>> running under Mimic.  Is there anything I can do to get the cluster back
>>>>> online and objects readable?
>>>>
>>>> That depends on what the specific problem is. Can you provide some
>>>> data that fills in the blanks around "randomly seeing corruption"?
>>>>
>>> Thanks for the reply, Brad.  I have a feeling that almost all of this stems
>>> from the time the cluster spent running 12.2.6.  When booting VMs that use
>>> rbd as a backing store, they typically get I/O errors during boot and cannot
>>> read critical parts of the image.  I also get similar errors if I try to rbd
>>> export most of the images. Also, CephFS will not start, as ceph -s indicates
>>> damage.
>>>
>>> Many of the OSDs have been crashing and restarting as I've tried to rbd
>>> export good versions of images (from older snapshots).  Here's one
>>> particular crash:
>>>
>>> 2018-07-18 15:52:15.809 7fcbaab77700 -1
>>> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/h
>>> uge/release/13.2.0/rpm/el7/BUILD/ceph-13.2.0/src/os/bluestore/BlueStore.h:
>>> In function 'void
>>> BlueStore::SharedBlobSet::remove_last(BlueStore::SharedBlob*)' thread
>>> 7fcbaab7
>>> 7700 time 2018-07-18 15:52:15.750916
>>> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.0/rpm/el7/BUILD/ceph-13
>>> .2.0/src/os/bluestore/BlueStore.h: 455: FAILED assert(sb->nref == 0)
>>>
>>>  ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic
>>> (stable)
>>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>> const*)+0xff) [0x7fcbc197a53f]
>>>  2: (()+0x286727) [0x7fcbc197a727]
>>>  3: (BlueStore::SharedBlob::put()+0x1da) [0x5641f39181ca]
>>>  4: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>,
>>> boost::intrusive_ptr<BlueStore::SharedBlob>,
>>> std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >,
>>> std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >,
>>> std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> >
>>> >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x2d) [0x5641f3977cfd]
>>>  5: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>,
>>> boost::intrusive_ptr<BlueStore::SharedBlob>,
>>> std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >,
>>> std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >,
>>> std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> >
>>> >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x1b) [0x5641f3977ceb]
>>>  6: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>,
>>> boost::intrusive_ptr<BlueStore::SharedBlob>,
>>> std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >,
>>> std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >,
>>> std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> >
>>> >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x1b) [0x5641f3977ceb]
>>>  7: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>,
>>> boost::intrusive_ptr<BlueStore::SharedBlob>,
>>> std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >,
>>> std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >,
>>> std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> >
>>> >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x1b) [0x5641f3977ceb]
>>>  8: (BlueStore::TransContext::~TransContext()+0xf7) [0x5641f3979297]
>>>  9: (BlueStore::_txc_finish(BlueStore::TransContext*)+0x610)
>>> [0x5641f391c9b0]
>>>  10: (BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x9a)
>>> [0x5641f392a38a]
>>>  11: (BlueStore::_kv_finalize_thread()+0x41e) [0x5641f392b3be]
>>>  12: (BlueStore::KVFinalizeThread::entry()+0xd) [0x5641f397d85d]
>>>  13: (()+0x7e25) [0x7fcbbe4d2e25]
>>>  14: (clone()+0x6d) [0x7fcbbd5c3bad]
>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
>>> interpret this.
>>>
>>>
>>> Here's the output of ceph -s that might fill in some configuration
>>> questions.  Since osds are continually restarting if I try to put load on
>>> it, the cluster seems to be churning a bit.  That's why I set nodown for
>>> now.

Re: [ceph-users] Recovery from 12.2.5 (corruption) -> 12.2.6 (hair on fire) -> 13.2.0 (some objects inaccessible and CephFS damaged)

2018-07-18 Thread Brad Hubbard
On Thu, Jul 19, 2018 at 2:48 AM, Troy Ablan  wrote:
>
>
> On 07/17/2018 11:14 PM, Brad Hubbard wrote:
>>
>> On Wed, Jul 18, 2018 at 2:57 AM, Troy Ablan  wrote:
>>>
>>> I was on 12.2.5 for a couple weeks and started randomly seeing
>>> corruption, moved to 12.2.6 via yum update on Sunday, and all hell broke
>>> loose.  I panicked and moved to Mimic, and when that didn't solve the
>>> problem, only then did I start to root around in mailing list archives.
>>>
>>> It appears I can't downgrade OSDs back to Luminous now that 12.2.7 is
>>> out, but I'm unsure how to proceed now that the damaged cluster is
>>> running under Mimic.  Is there anything I can do to get the cluster back
>>> online and objects readable?
>>
>> That depends on what the specific problem is. Can you provide some
>> data that fills in the blanks around "randomly seeing corruption"?
>>
> Thanks for the reply, Brad.  I have a feeling that almost all of this stems
> from the time the cluster spent running 12.2.6.  When booting VMs that use
> rbd as a backing store, they typically get I/O errors during boot and cannot
> read critical parts of the image.  I also get similar errors if I try to rbd
> export most of the images. Also, CephFS will not start, as ceph -s indicates
> damage.
>
> Many of the OSDs have been crashing and restarting as I've tried to rbd
> export good versions of images (from older snapshots).  Here's one
> particular crash:
>
> 2018-07-18 15:52:15.809 7fcbaab77700 -1
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/h
> uge/release/13.2.0/rpm/el7/BUILD/ceph-13.2.0/src/os/bluestore/BlueStore.h:
> In function 'void
> BlueStore::SharedBlobSet::remove_last(BlueStore::SharedBlob*)' thread
> 7fcbaab7
> 7700 time 2018-07-18 15:52:15.750916
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.0/rpm/el7/BUILD/ceph-13
> .2.0/src/os/bluestore/BlueStore.h: 455: FAILED assert(sb->nref == 0)
>
>  ceph version 13.2.0 (79a10589f1f80dfe21e8f9794365ed98143071c4) mimic
> (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0xff) [0x7fcbc197a53f]
>  2: (()+0x286727) [0x7fcbc197a727]
>  3: (BlueStore::SharedBlob::put()+0x1da) [0x5641f39181ca]
>  4: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>,
> boost::intrusive_ptr<BlueStore::SharedBlob>,
> std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >,
> std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >,
> std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> >
> >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x2d) [0x5641f3977cfd]
>  5: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>,
> boost::intrusive_ptr<BlueStore::SharedBlob>,
> std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >,
> std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >,
> std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> >
> >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x1b) [0x5641f3977ceb]
>  6: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>,
> boost::intrusive_ptr<BlueStore::SharedBlob>,
> std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >,
> std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >,
> std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> >
> >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x1b) [0x5641f3977ceb]
>  7: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::SharedBlob>,
> boost::intrusive_ptr<BlueStore::SharedBlob>,
> std::_Identity<boost::intrusive_ptr<BlueStore::SharedBlob> >,
> std::less<boost::intrusive_ptr<BlueStore::SharedBlob> >,
> std::allocator<boost::intrusive_ptr<BlueStore::SharedBlob> >
> >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::SharedBlob> >*)+0x1b) [0x5641f3977ceb]
>  8: (BlueStore::TransContext::~TransContext()+0xf7) [0x5641f3979297]
>  9: (BlueStore::_txc_finish(BlueStore::TransContext*)+0x610)
> [0x5641f391c9b0]
>  10: (BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x9a)
> [0x5641f392a38a]
>  11: (BlueStore::_kv_finalize_thread()+0x41e) [0x5641f392b3be]
>  12: (BlueStore::KVFinalizeThread::entry()+0xd) [0x5641f397d85d]
>  13: (()+0x7e25) [0x7fcbbe4d2e25]
>  14: (clone()+0x6d) [0x7fcbbd5c3bad]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
> interpret this.
>
>
> Here's the output of ceph -s that might fill in some configuration
> questions.  Since osds are continually restarting if I try to put load on
> it, the cluster seems to be churning a bit.  That's why I set nodown for
> now.
>
>   cluster:
> id: b2873c9a-5539-4c76-ac4a-a6c9829bfed2
> health: HEALTH_ERR
> 1 filesystem is degraded
> 1 filesystem is offline
> 1 mds daemon damaged
> nodown,noscrub,nodeep-scrub flag(s) set
> 9 scrub errors
> Reduced data availability: 61 pgs inactive, 56 pgs peering, 4
> pgs stale
> Possible data damage: 3 pgs inconsistent
> 16 slow requests are blocked > 32 sec
> 26 stuck requests are blocked > 4096 sec
>
>   services:
> mon: 5 daemons, quorum a,b,c,d,e
> mgr: a(active), standbys: b, d, e, c
> mds: lcs-0/1/1 up

Re: [ceph-users] Recovery from 12.2.5 (corruption) -> 12.2.6 (hair on fire) -> 13.2.0 (some objects inaccessible and CephFS damaged)

2018-07-18 Thread Brad Hubbard
On Wed, Jul 18, 2018 at 2:57 AM, Troy Ablan  wrote:
> I was on 12.2.5 for a couple weeks and started randomly seeing
> corruption, moved to 12.2.6 via yum update on Sunday, and all hell broke
> loose.  I panicked and moved to Mimic, and when that didn't solve the
> problem, only then did I start to root around in mailing list archives.
>
> It appears I can't downgrade OSDs back to Luminous now that 12.2.7 is
> out, but I'm unsure how to proceed now that the damaged cluster is
> running under Mimic.  Is there anything I can do to get the cluster back
> online and objects readable?

That depends on what the specific problem is. Can you provide some
data that fills in the blanks around "randomly seeing corruption"?
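
The usual starting points for that are the overall cluster state, the exact version mix, and the OSD log around one of the crashes, e.g. (the log path assumes the default location):

# ceph -s
# ceph health detail
# ceph versions
# grep -B5 -A30 'FAILED assert' /var/log/ceph/ceph-osd.*.log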

>
> Everything is BlueStore and most of it is EC.
>
> Thanks.
>
> -Troy
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Jewel PG stuck inconsistent with 3 0-size objects

2018-07-16 Thread Brad Hubbard
Your issue is different since not only do the omap digests of all
replicas not match the omap digest from the auth object info but they
are all different to each other.

What is min_size of pool 67 and what can you tell us about the events
leading up to this?
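
Those settings can be read back with, for example (the pool name is a placeholder):

# ceph osd dump | grep 'pool 67'
# ceph osd pool get <pool_name> size
# ceph osd pool get <pool_name> min_size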

On Mon, Jul 16, 2018 at 7:06 PM, Matthew Vernon  wrote:
> Hi,
>
> Our cluster is running 10.2.9 (from Ubuntu; on 16.04 LTS), and we have a
> pg that's stuck inconsistent; if I repair it, it logs "failed to pick
> suitable auth object" (repair log attached, to try and stop my MUA
> mangling it).
>
> We then deep-scrubbed that pg, at which point
> rados list-inconsistent-obj 67.2e --format=json-pretty produces a bit of
> output (also attached), which includes that all 3 osds have a zero-sized
> object e.g.
>
> "osd": 1937,
> "errors": [
> "omap_digest_mismatch_oi"
> ],
> "size": 0,
> "omap_digest": "0x45773901",
> "data_digest": "0x"
>
> All 3 osds have different omap_digest, but all have 0 size. Indeed,
> looking on the OSD disks directly, each object is 0 size (i.e. they are
> identical).
>
> This looks similar to one of the failure modes in
> http://tracker.ceph.com/issues/21388 where the is a suggestion (comment
> 19 from David Zafman) to do:
>
> rados -p default.rgw.buckets.index setomapval
> .dir.861ae926-7ff0-48c5-86d6-a6ba8d0a7a14.7130858.6 temporary-key anything
> [deep-scrub]
> rados -p default.rgw.buckets.index rmomapkey
> .dir.861ae926-7ff0-48c5-86d6-a6ba8d0a7a14.7130858.6 temporary-key
>
> Is this likely to be the correct approach here, too? And is there an
> underlying bug in ceph that still needs fixing? :)
>
> Thanks,
>
> Matthew
>
>
>
> --
>  The Wellcome Sanger Institute is operated by Genome Research
>  Limited, a charity registered in England with number 1021457 and a
>  company registered in England with number 2742969, whose registered
>  office is 215 Euston Road, London, NW1 2BE.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Cheers,
Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Slow requests

2018-07-09 Thread Brad Hubbard
On Mon, Jul 9, 2018 at 5:28 PM, Benjamin Naber
 wrote:
> Hi @all,
>
> The problem seems to be solved after downgrading the kernel from 4.17.2 to
> 3.10.0-862.
> Has anyone else had issues with newer kernels and OSD nodes?

I'd suggest you pursue that with whoever supports the kernel
exhibiting the problem.
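
Independent of the kernel question, it can help to narrow down where the ops actually stall; the dump below shows them waiting for sub-ops from peer OSDs (5 and 26), so looking at those peers' latencies and the network between the hosts is a reasonable next step, e.g. (the osd id is a placeholder):

# ceph daemon osd.<id> dump_ops_in_flight
# ceph daemon osd.<id> dump_historic_ops
# ceph osd perf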

>
> kind regards
>
> Ben
>
>> Brad Hubbard  wrote on 5 July 2018 at 01:16:
>>
>>
>> On Wed, Jul 4, 2018 at 6:26 PM, Benjamin Naber  
>> wrote:
>> > Hi @all,
>> >
>> > I'm currently testing a setup for a production environment based on the
>> > following OSD nodes:
>> >
>> > CEPH Version: luminous 12.2.5
>> >
>> > 5x OSD Nodes with following specs:
>> >
>> > - 8 Core Intel Xeon 2,0 GHZ
>> >
>> > - 96GB Ram
>> >
>> > - 10x 1,92 TB Intel DC S4500 connected via SATA
>> >
>> > - 4x 10 Gbit NIC: 2 bonded via LACP for the backend (cluster) network,
>> > 2 bonded via LACP for the frontend (public) network.
>> >
>> > If I run a fio benchmark in a VM that is running on an RBD device on a
>> > KVM test host, the cluster always runs into slow request warnings and the
>> > performance drops heavily.
>> >
>> > If I dump the ops on the OSD that is stuck, I get the following output:
>> >
>> > {
>> > "ops": [
>> > {
>> > "description": "osd_op(client.141944.0:359346834 13.1da 
>> > 13:5b8b7fd3:::rbd_data.170a3238e1f29.00be:head [write 
>> > 2097152~1048576] snapc 0=[] ondisk+write+known_if_redirected e2755)",
>> > "initiated_at": "2018-07-04 10:00:49.475879",
>> > "age": 287.180328,
>> > "duration": 287.180355,
>> > "type_data": {
>> > "flag_point": "waiting for sub ops",
>> > "client_info": {
>> > "client": "client.141944",
>> > "client_addr": "10.111.90.1:0/3532639465",
>> > "tid": 359346834
>> > },
>> > "events": [
>> > {
>> > "time": "2018-07-04 10:00:49.475879",
>> > "event": "initiated"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.476935",
>> > "event": "queued_for_pg"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.477547",
>> > "event": "reached_pg"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.477578",
>> > "event": "started"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.477614",
>> > "event": "waiting for subops from 5,26"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.484679",
>> > "event": "op_commit"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.484681",
>> > "event": "op_applied"
>> > },
>> > {
>> > "time": "2018-07-04 10:00:49.485588",
>> > "event": "sub_op_commit_rec from 5"
>> > }
>> > ]
>> > }
>> > },
>> > {
>> > "description": "osd_op(client.141944.0:359346835 13.1da 
>> > 13:5b8b7fd3:::rbd_data.170a3238e1f29.00be:head [write 
>> > 3145728~1048576] snapc 0=[] ondisk+write+known_if_redirected e2755)",
>> > "initiated_at"

Re: [ceph-users] Slow requests

2018-07-04 Thread Brad Hubbard
On Wed, Jul 4, 2018 at 6:26 PM, Benjamin Naber  wrote:
> Hi @all,
>
> I'm currently testing a setup for a production environment based on the
> following OSD nodes:
>
> CEPH Version: luminous 12.2.5
>
> 5x OSD Nodes with following specs:
>
> - 8 Core Intel Xeon 2,0 GHZ
>
> - 96GB Ram
>
> - 10x 1,92 TB Intel DC S4500 connected via SATA
>
> - 4x 10 Gbit NIC: 2 bonded via LACP for the backend (cluster) network, 2 bonded
> via LACP for the frontend (public) network.
>
> If I run a fio benchmark in a VM that is running on an RBD device on a
> KVM test host, the cluster always runs into slow request warnings and the
> performance drops heavily.
>
> If I dump the ops on the OSD that is stuck, I get the following output:
>
> {
> "ops": [
> {
> "description": "osd_op(client.141944.0:359346834 13.1da 
> 13:5b8b7fd3:::rbd_data.170a3238e1f29.00be:head [write 
> 2097152~1048576] snapc 0=[] ondisk+write+known_if_redirected e2755)",
> "initiated_at": "2018-07-04 10:00:49.475879",
> "age": 287.180328,
> "duration": 287.180355,
> "type_data": {
> "flag_point": "waiting for sub ops",
> "client_info": {
> "client": "client.141944",
> "client_addr": "10.111.90.1:0/3532639465",
> "tid": 359346834
> },
> "events": [
> {
> "time": "2018-07-04 10:00:49.475879",
> "event": "initiated"
> },
> {
> "time": "2018-07-04 10:00:49.476935",
> "event": "queued_for_pg"
> },
> {
> "time": "2018-07-04 10:00:49.477547",
> "event": "reached_pg"
> },
> {
> "time": "2018-07-04 10:00:49.477578",
> "event": "started"
> },
> {
> "time": "2018-07-04 10:00:49.477614",
> "event": "waiting for subops from 5,26"
> },
> {
> "time": "2018-07-04 10:00:49.484679",
> "event": "op_commit"
> },
> {
> "time": "2018-07-04 10:00:49.484681",
> "event": "op_applied"
> },
> {
> "time": "2018-07-04 10:00:49.485588",
> "event": "sub_op_commit_rec from 5"
> }
> ]
> }
> },
> {
> "description": "osd_op(client.141944.0:359346835 13.1da 
> 13:5b8b7fd3:::rbd_data.170a3238e1f29.00be:head [write 
> 3145728~1048576] snapc 0=[] ondisk+write+known_if_redirected e2755)",
> "initiated_at": "2018-07-04 10:00:49.477065",
> "age": 287.179143,
> "duration": 287.179221,
> "type_data": {
> "flag_point": "waiting for sub ops",
> "client_info": {
> "client": "client.141944",
> "client_addr": "10.111.90.1:0/3532639465",
> "tid": 359346835
> },
> "events": [
> {
> "time": "2018-07-04 10:00:49.477065",
> "event": "initiated"
> },
> {
> "time": "2018-07-04 10:00:49.478116",
> "event": "queued_for_pg"
> },
> {
> "time": "2018-07-04 10:00:49.478178",
> "event": "reached_pg"
> },
> {
> "time": "2018-07-04 10:00:49.478201",
> "event": "started"
> },
> {
> "time": "2018-07-04 10:00:49.478232",
> "event": "waiting for subops from 5,26"
> },
> {
> "time": "2018-07-04 10:00:49.484695",
> "event": "op_commit"
> },
> {
> "time": "2018-07-04 10:00:49.484696",
> "event": "op_applied"
> },
> {
> "time": "2018-07-04 10:00:49.485621",
> "event": "sub_op_commit_rec from 5"
> }
> ]
> }
> },
> {
> "description": "osd_op(client.141944.0:359346440 13.11d 
> 13:b8afbe4a:::rbd_data.170a3238e1f29.005c:head [write 0~1048576] 
> snapc 0=[] 

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-28 Thread Brad Hubbard
On Fri, Jun 29, 2018 at 2:38 AM, Andrei Mikhailovsky  wrote:
> Hi Brad,
>
> This has helped to repair the issue. Many thanks for your help on this!!!

No problem.

>
> I had so many objects with broken omap checksums that I spent at least a few
> hours identifying them and repairing them with the commands you've listed. They
> were all related to one pool called .rgw.buckets.index. All other pools look
> okay so far.
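
For bulk identification, listing the inconsistent PGs of the pool and then the objects inside each one is usually quicker than hunting by hand, e.g. (the pg id is whatever the first command returns):

# rados list-inconsistent-pg .rgw.buckets.index
# rados list-inconsistent-obj <pg_id> --format=json-pretty
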

So originally you said you were having trouble with "one inconsistent
and stubborn PG" When did that become "so many objects"?

>
> I am wondering what could have gone horribly wrong with the above pool?

Is that pool 18? I notice it seems to be size 2, what is min_size on that pool?

As for working out what went wrong: what event(s) coincided with or
preceded the problem? What history can you provide? What data can you
provide from the time leading up to when the issue was first seen?

>
> Cheers
>
> Andrei
> - Original Message -
>> From: "Brad Hubbard" 
>> To: "Andrei Mikhailovsky" 
>> Cc: "ceph-users" 
>> Sent: Thursday, 28 June, 2018 01:08:34
>> Subject: Re: [ceph-users] fixing unrepairable inconsistent PG
>
>> Try the following. You can do this with all osds up and running.
>>
>> # rados -p [name_of_pool_18] setomapval .dir.default.80018061.2
>> temporary-key anything
>> # ceph pg deep-scrub 18.2
>>
>> Once you are sure the scrub has completed and the pg is no longer
>> inconsistent you can remove the temporary key.
>>
>> # rados -p [name_of_pool_18] rmomapkey .dir.default.80018061.2 temporary-key
>>
>>
>> On Wed, Jun 27, 2018 at 9:42 PM, Andrei Mikhailovsky  
>> wrote:
>>> Here is one more thing:
>>>
>>> rados list-inconsistent-obj 18.2
>>> {
>>>"inconsistents" : [
>>>   {
>>>  "object" : {
>>> "locator" : "",
>>> "version" : 632942,
>>> "nspace" : "",
>>> "name" : ".dir.default.80018061.2",
>>> "snap" : "head"
>>>  },
>>>  "union_shard_errors" : [
>>> "omap_digest_mismatch_info"
>>>  ],
>>>  "shards" : [
>>> {
>>>"osd" : 21,
>>>"primary" : true,
>>>"data_digest" : "0x",
>>>"omap_digest" : "0x25e8a1da",
>>>"errors" : [
>>>   "omap_digest_mismatch_info"
>>>],
>>>"size" : 0
>>> },
>>> {
>>>"data_digest" : "0x",
>>>"primary" : false,
>>>"osd" : 28,
>>>"errors" : [
>>>   "omap_digest_mismatch_info"
>>>],
>>>"omap_digest" : "0x25e8a1da",
>>>"size" : 0
>>> }
>>>  ],
>>>  "errors" : [],
>>>  "selected_object_info" : {
>>> "mtime" : "2018-06-19 16:31:44.759717",
>>> "alloc_hint_flags" : 0,
>>> "size" : 0,
>>> "last_reqid" : "client.410876514.0:1",
>>> "local_mtime" : "2018-06-19 16:31:44.760139",
>>> "data_digest" : "0x",
>>> "truncate_seq" : 0,
>>> "legacy_snaps" : [],
>>> "expected_write_size" : 0,
>>> "watchers" : {},
>>> "flags" : [
>>>"dirty",
>>>"data_digest",
>>>    "omap_digest"
>>> ],
>>> "oid" : {
>>>"pool" : 18,
>>>"hash" : 1156456354,
>>>"key" : "",
>>>"oid" : ".dir.default.80018061.2",
>>>"namespace&qu

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-27 Thread Brad Hubbard
Try the following. You can do this with all osds up and running.

# rados -p [name_of_pool_18] setomapval .dir.default.80018061.2
temporary-key anything
# ceph pg deep-scrub 18.2

Once you are sure the scrub has completed and the pg is no longer
inconsistent you can remove the temporary key.

# rados -p [name_of_pool_18] rmomapkey .dir.default.80018061.2 temporary-key
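
To confirm the deep-scrub has finished and the inconsistency has actually cleared before removing the temporary key, something like this does the job:

# ceph health detail | grep 18.2
# rados list-inconsistent-obj 18.2 --format=json-pretty

Once 18.2 no longer shows up as inconsistent, it is safe to run the rmomapkey above.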


On Wed, Jun 27, 2018 at 9:42 PM, Andrei Mikhailovsky  wrote:
> Here is one more thing:
>
> rados list-inconsistent-obj 18.2
> {
>"inconsistents" : [
>   {
>  "object" : {
> "locator" : "",
> "version" : 632942,
> "nspace" : "",
> "name" : ".dir.default.80018061.2",
> "snap" : "head"
>  },
>  "union_shard_errors" : [
> "omap_digest_mismatch_info"
>  ],
>  "shards" : [
> {
>"osd" : 21,
>"primary" : true,
>"data_digest" : "0x",
>"omap_digest" : "0x25e8a1da",
>"errors" : [
>   "omap_digest_mismatch_info"
>],
>"size" : 0
> },
> {
>"data_digest" : "0x",
>"primary" : false,
>"osd" : 28,
>"errors" : [
>   "omap_digest_mismatch_info"
>],
>"omap_digest" : "0x25e8a1da",
>"size" : 0
> }
>  ],
>  "errors" : [],
>  "selected_object_info" : {
> "mtime" : "2018-06-19 16:31:44.759717",
> "alloc_hint_flags" : 0,
> "size" : 0,
> "last_reqid" : "client.410876514.0:1",
> "local_mtime" : "2018-06-19 16:31:44.760139",
> "data_digest" : "0x",
> "truncate_seq" : 0,
> "legacy_snaps" : [],
> "expected_write_size" : 0,
> "watchers" : {},
> "flags" : [
>"dirty",
>"data_digest",
>"omap_digest"
> ],
> "oid" : {
>"pool" : 18,
>"hash" : 1156456354,
>"key" : "",
>"oid" : ".dir.default.80018061.2",
>"namespace" : "",
>"snapid" : -2,
>"max" : 0
> },
> "truncate_size" : 0,
> "version" : "120985'632942",
> "expected_object_size" : 0,
> "omap_digest" : "0x",
> "lost" : 0,
> "manifest" : {
>"redirect_target" : {
>   "namespace" : "",
>   "snapid" : 0,
>   "max" : 0,
>   "pool" : -9223372036854775808,
>   "hash" : 0,
>   "oid" : "",
>   "key" : ""
>},
>"type" : 0
> },
> "prior_version" : "0'0",
> "user_version" : 632942
>  }
>   }
>],
>"epoch" : 121151
> }
>
> Cheers
>
> - Original Message -
>> From: "Andrei Mikhailovsky" 
>> To: "Brad Hubbard" 
>> Cc: "ceph-users" 
>> Sent: Wednesday, 27 June, 2018 09:10:07
>> Subject: Re: [ceph-users] fixing unrepairable inconsistent PG
>
>> Hi Brad,
>>
>> Thanks, that helped to get the query info on the inconsistent PG 18.2:
>>
>> {
>>"state": "active+clean+inconsistent",
>>"snap_trimq": "[]",
>>"snap_trimq_len": 0,
>>"epoch": 121293,
>>"up": [
>>21,
>>28
>>],
>>"a
