Re: [ceph-users] Cluster health_warn 1 active+undersized+degraded/1 active+remapped
OSD tree: http://pastebin.com/3z333DP4
Crushmap: http://pastebin.com/DBd9k56m

I realize these nodes are quite large; I have plans to break them out into 12 OSDs/node.

On Thu, Aug 13, 2015 at 9:02 AM, GuangYang wrote:
> Could you share the 'ceph osd tree' dump and CRUSH map dump?
>
> Thanks,
> Guang
>
>> Date: Thu, 13 Aug 2015 08:16:09 -0700
>> From: sdain...@spd1.com
>> To: yangyongp...@bwstor.com.cn; ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Cluster health_warn 1 active+undersized+degraded/1 active+remapped
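(For anyone following along, these dumps are typically produced with the stock tools; the output file names below are arbitrary, not taken from the post:)

# ceph osd tree
# ceph osd getcrushmap -o crushmap.bin
# crushtool -d crushmap.bin -o crushmap.txt

The first command prints the plain-text OSD tree; the other two fetch the compiled CRUSH map and decompile it to readable text.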
Re: [ceph-users] Cluster health_warn 1 active+undersized+degraded/1 active+remapped
I decided to set OSD 76 out, let the cluster shuffle the data off that disk, and then brought the OSD back in. For the most part this seemed to be working, but then I apparently had 1 object degraded and 88xxx objects misplaced:

# ceph health detail
HEALTH_WARN 11 pgs stuck unclean; recovery 1/66089446 objects degraded (0.000%); recovery 88844/66089446 objects misplaced (0.134%)
pg 2.e7f is stuck unclean for 88398.251351, current state active+remapped, last acting [58,5]
pg 2.143 is stuck unclean for 13892.364101, current state active+remapped, last acting [16,76]
pg 2.968 is stuck unclean for 13892.363521, current state active+remapped, last acting [44,76]
pg 2.5f8 is stuck unclean for 13892.377245, current state active+remapped, last acting [17,76]
pg 2.81c is stuck unclean for 13892.363443, current state active+remapped, last acting [25,76]
pg 2.1a3 is stuck unclean for 13892.364400, current state active+remapped, last acting [16,76]
pg 2.2cb is stuck unclean for 13892.374390, current state active+remapped, last acting [14,76]
pg 2.d41 is stuck unclean for 13892.373636, current state active+remapped, last acting [27,76]
pg 2.3f9 is stuck unclean for 13892.373147, current state active+remapped, last acting [35,76]
pg 2.a62 is stuck unclean for 86283.741920, current state active+remapped, last acting [2,38]
pg 2.1b0 is stuck unclean for 13892.363268, current state active+remapped, last acting [3,76]
recovery 1/66089446 objects degraded (0.000%)
recovery 88844/66089446 objects misplaced (0.134%)

I say "apparently" because, with one object degraded, none of the PGs are showing as degraded:

# ceph pg dump_stuck degraded
ok

# ceph pg dump_stuck unclean
ok
pg_stat state up up_primary acting acting_primary
2.e7f active+remapped [58] 58 [58,5] 58
2.143 active+remapped [16] 16 [16,76] 16
2.968 active+remapped [44] 44 [44,76] 44
2.5f8 active+remapped [17] 17 [17,76] 17
2.81c active+remapped [25] 25 [25,76] 25
2.1a3 active+remapped [16] 16 [16,76] 16
2.2cb active+remapped [14] 14 [14,76] 14
2.d41 active+remapped [27] 27 [27,76] 27
2.3f9 active+remapped [35] 35 [35,76] 35
2.a62 active+remapped [2] 2 [2,38] 2
2.1b0 active+remapped [3] 3 [3,76] 3

All of the OSD filesystems are below 85% full.

I then compared against a new 0.94.2 cluster that had not been updated (the current cluster is 0.94.2 but has been updated a couple of times) and noticed its crush map had 'tunable straw_calc_version 1', so I added it to the current cluster.

After the data moved around for about 8 hours or so I'm left with this state:

# ceph health detail
HEALTH_WARN 2 pgs stuck unclean; recovery 16357/66089446 objects misplaced (0.025%)
pg 2.e7f is stuck unclean for 149422.331848, current state active+remapped, last acting [58,5]
pg 2.782 is stuck unclean for 64878.002464, current state active+remapped, last acting [76,31]
recovery 16357/66089446 objects misplaced (0.025%)

I attempted a pg repair on both of the PGs listed above, but it doesn't look like anything is happening. The docs reference an inconsistent state as the use case for the repair command, so that's likely why.

These 2 PGs have been the issue throughout this process, so how can I dig deeper to figure out what the problem is?

# ceph pg 2.e7f query: http://pastebin.com/jMMsbsjS
# ceph pg 2.782 query: http://pastebin.com/0ntBfFK5

On Wed, Aug 12, 2015 at 6:52 PM, yangyongp...@bwstor.com.cn wrote:
> You can try "ceph pg repair pg_id" to repair the unhealthy pg. The "ceph health
> detail" command is very useful to detect unhealthy pgs.
>
> ________
> yangyongp...@bwstor.com.cn
>
>
> From: Steve Dainard
> Date: 2015-08-12 23:48
> To: ceph-users
> Subject: [ceph-users] Cluster health_warn 1 active+undersized+degraded/1 active+remapped
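(For reference, a sketch of how that tunable is usually applied on hammer without hand-editing the map; older point releases may not expose the subcommand, in which case decompiling the map with crushtool, adding the "tunable straw_calc_version 1" line, and re-injecting it achieves the same thing. Expect data to move once the straw weights are recalculated:)

# ceph osd crush set-tunable straw_calc_version 1
# ceph osd crush show-tunables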
[ceph-users] Cluster health_warn 1 active+undersized+degraded/1 active+remapped
I ran a ceph osd reweight-by-utilization yesterday and partway through had a network interruption. After the network was restored the cluster continued to rebalance, but this morning the cluster has stopped rebalancing and status will not change from:

# ceph status
    cluster af859ff1-c394-4c9a-95e2-0e0e4c87445c
     health HEALTH_WARN
            1 pgs degraded
            1 pgs stuck degraded
            2 pgs stuck unclean
            1 pgs stuck undersized
            1 pgs undersized
            recovery 8163/66089054 objects degraded (0.012%)
            recovery 8194/66089054 objects misplaced (0.012%)
     monmap e24: 3 mons at {mon1=10.0.231.53:6789/0,mon2=10.0.231.54:6789/0,mon3=10.0.231.55:6789/0}
            election epoch 250, quorum 0,1,2 mon1,mon2,mon3
     osdmap e184486: 100 osds: 100 up, 100 in; 1 remapped pgs
      pgmap v3010985: 4144 pgs, 7 pools, 125 TB data, 32270 kobjects
            251 TB used, 111 TB / 363 TB avail
            8163/66089054 objects degraded (0.012%)
            8194/66089054 objects misplaced (0.012%)
                4142 active+clean
                   1 active+undersized+degraded
                   1 active+remapped

# ceph health detail
HEALTH_WARN 1 pgs degraded; 1 pgs stuck degraded; 2 pgs stuck unclean; 1 pgs stuck undersized; 1 pgs undersized; recovery 8163/66089054 objects degraded (0.012%); recovery 8194/66089054 objects misplaced (0.012%)
pg 2.e7f is stuck unclean for 65125.554509, current state active+remapped, last acting [58,5]
pg 2.782 is stuck unclean for 65140.681540, current state active+undersized+degraded, last acting [76]
pg 2.782 is stuck undersized for 60568.221461, current state active+undersized+degraded, last acting [76]
pg 2.782 is stuck degraded for 60568.221549, current state active+undersized+degraded, last acting [76]
pg 2.782 is active+undersized+degraded, acting [76]
recovery 8163/66089054 objects degraded (0.012%)
recovery 8194/66089054 objects misplaced (0.012%)

# ceph pg 2.e7f query
"recovery_state": [
    { "name": "Started\/Primary\/Active",
      "enter_time": "2015-08-11 15:43:09.190269",
      "might_have_unfound": [],
      "recovery_progress": { "backfill_targets": [],
          "waiting_on_backfill": [],
          "last_backfill_started": "0\/\/0\/\/-1",
          "backfill_info": { "begin": "0\/\/0\/\/-1",
              "end": "0\/\/0\/\/-1",
              "objects": []},
          "peer_backfill_info": [],
          "backfills_in_flight": [],
          "recovering": [],
          "pg_backend": { "pull_from_peer": [],
              "pushing": []}},
      "scrub": { "scrubber.epoch_start": "0",
          "scrubber.active": 0,
          "scrubber.waiting_on": 0,
          "scrubber.waiting_on_whom": []}},
    { "name": "Started",
      "enter_time": "2015-08-11 15:43:04.955796"}],

# ceph pg 2.782 query
"recovery_state": [
    { "name": "Started\/Primary\/Active",
      "enter_time": "2015-08-11 15:42:42.178042",
      "might_have_unfound": [
            { "osd": "5",
              "status": "not queried"}],
      "recovery_progress": { "backfill_targets": [],
          "waiting_on_backfill": [],
          "last_backfill_started": "0\/\/0\/\/-1",
          "backfill_info": { "begin": "0\/\/0\/\/-1",
              "end": "0\/\/0\/\/-1",
              "objects": []},
          "peer_backfill_info": [],
          "backfills_in_flight": [],
          "recovering": [],
          "pg_backend": { "pull_from_peer": [],
              "pushing": []}},
      "scrub": { "scrubber.epoch_start": "0",
          "scrubber.active": 0,
          "scrubber.waiting_on": 0,
          "scrubber.waiting_on_whom": []}},
    { "name": "Started",
      "enter_time": "2015-08-11 15:42:41.139709"}],
"agent_state": {}

I tried restarting osd.5/58/76 but no change.

Any suggestions?
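(When a PG sits in active+remapped or undersized like this, a useful first check is to compare where CRUSH wants the PG against where it currently is; a sketch using the PG IDs from the post above. If the "up" set comes back with only one OSD, CRUSH itself is failing to choose a second host for that PG, which points at the map or tunables rather than at unfinished backfill:)

# ceph pg map 2.e7f
# ceph pg map 2.782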
Re: [ceph-users] ceph tell not persistent through reboots?
That would make sense. Thanks!

On Thu, Aug 6, 2015 at 6:29 PM, Wang, Warren wrote:
> Injecting args into the running procs is not meant to be persistent. You'll
> need to modify /etc/ceph/ceph.conf for that.
>
> Warren
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Steve Dainard
> Sent: Thursday, August 06, 2015 9:16 PM
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] ceph tell not persistent through reboots?
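(A minimal sketch of the persistent equivalent, using the values from this thread; this goes in /etc/ceph/ceph.conf on each OSD host, and the daemons pick it up on restart or via a matching injectargs. Note the running config reports the hours under osd_scrub_begin_hour/osd_scrub_end_hour, so those are the names to persist:)

[osd]
osd_scrub_begin_hour = 20
osd_scrub_end_hour = 4
osd_deep_scrub_interval = 1209600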
[ceph-users] Direct IO tests on RBD device vary significantly
Trying to get an understanding of why direct IO would be so slow on my cluster.

Ceph 0.94.1
1 Gig public network
10 Gig public network
10 Gig cluster network
100 OSDs, 4T disk sizes, 5G SSD journal.

As of this morning I had no SSD journals and was finding direct IO was sub 10MB/s, so I decided to add journals today. Afterwards I started running tests again and wasn't very impressed. Then for no apparent reason the write speeds increased significantly, but I'm finding they vary wildly.

Currently there is a bit of background ceph activity, but only my testing client has an rbd mapped/mounted:

     election epoch 144, quorum 0,1,2 mon1,mon3,mon2
     osdmap e181963: 100 osds: 100 up, 100 in
            flags noout
      pgmap v2852566: 4144 pgs, 7 pools, 113 TB data, 29179 kobjects
            227 TB used, 135 TB / 363 TB avail
                4103 active+clean
                  40 active+clean+scrubbing
                   1 active+clean+scrubbing+deep

Tests:
1M block size: http://pastebin.com/LKtsaHrd (throughput has no consistency)
4k block size: http://pastebin.com/ib6VW9eB (throughput is amazingly consistent)

Thoughts?
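(For anyone trying to reproduce this, a sketch of the kind of direct-IO runs being compared; the mount point and sizes are placeholders, not taken from the pastebins. oflag=direct bypasses the page cache, so each run shows raw RBD write behaviour at that block size:)

# dd if=/dev/zero of=/mnt/rbd-test/ddfile bs=1M count=4096 oflag=direct
# dd if=/dev/zero of=/mnt/rbd-test/ddfile bs=4k count=100000 oflag=direct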
[ceph-users] ceph tell not persistent through reboots?
Hello,

Version 0.94.1

I'm passing settings to the admin socket, i.e.:
ceph tell osd.* injectargs '--osd_deep_scrub_begin_hour 20'
ceph tell osd.* injectargs '--osd_deep_scrub_end_hour 4'
ceph tell osd.* injectargs '--osd_deep_scrub_interval 1209600'

Then I check to see if they're in the configs now:
# ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | egrep -i 'scrub_interval|hour'
"osd_scrub_begin_hour": "4",
"osd_scrub_end_hour": "20",
"osd_deep_scrub_interval": "1.2096e+06",

Then I restart that host, check again, and the values have returned to default:
# ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | egrep -i 'scrub_interval|hour'
"osd_scrub_begin_hour": "0",
"osd_scrub_end_hour": "24",
"osd_deep_scrub_interval": "604800",

If I check on another host the values are correct:
# ceph --admin-daemon /var/run/ceph/ceph-osd.90.asok config show | egrep -i 'scrub_interval|hour'
"osd_scrub_begin_hour": "20",
"osd_scrub_end_hour": "4",
"osd_deep_scrub_interval": "1.2096e+06",

If I check on a mon the values are default:
# ceph --admin-daemon /var/run/ceph/ceph-mon.mon1.asok config show | egrep -i 'scrub_interval|hour'
"osd_scrub_begin_hour": "0",
"osd_scrub_end_hour": "24",
"osd_deep_scrub_interval": "604800",

If I try to pass a config to mon1 via an OSD host it appears to do something:
# ceph tell mon.1 injectargs "--osd_deep_scrub_interval 1209600"
injectargs:osd_deep_scrub_interval = '1.2096e+06'

And then I check on mon1 and it's still the default value:
# ceph --admin-daemon /var/run/ceph/ceph-mon.mon1.asok config show | egrep -i scrub_interval
"osd_deep_scrub_interval": "604800",

And if I pass a config on mon1 it looks like it's being updated, but the default remains:
# ceph tell mon.1 injectargs "--osd_deep_scrub_interval 1209600"
injectargs:osd_deep_scrub_interval = '1.2096e+06'
# ceph --admin-daemon /var/run/ceph/ceph-mon.mon1.asok config show | egrep -i scrub_interval
"osd_deep_scrub_interval": "604800",

I don't know if this is a bug, or if I'm doing something wrong here...
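(Side note: rather than grepping the full config show output, a single value can usually be read back directly from the admin socket; a sketch, assuming the same daemon names as above and a hammer build that supports 'config get':)

# ceph daemon osd.0 config get osd_deep_scrub_interval
# ceph daemon mon.mon1 config get osd_deep_scrub_interval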
Re: [ceph-users] Meanning of ceph perf dump
Hi Somnath,

Do you have a link with the definitions of all the perf counters?

Thanks,
Steve

On Sun, Jul 5, 2015 at 11:23 AM, Somnath Roy wrote:
> Hi Ray,
>
> Here is the description of the different latencies under the filestore perf counters.
>
> journal_latency:
> ----------------
> This is the latency of putting the ops in the journal. A write is acknowledged after that (well, a bit after that; there is one context switch after this).
>
> commitcycle_latency:
> --------------------
> The filestore backend, while carrying out a transaction, does a buffered write. In a separate thread it calls syncfs() to persist the data to disk and updates the persistent commit number in a separate file. This thread runs at a 5 second interval by default.
> This latency measures the time taken to carry out this job after the timer expires, i.e. the actual persisting cycle.
>
> apply_latency:
> --------------
> This is the entire latency until the transaction finishes, i.e. journal write + transaction time. It does a buffered write here.
>
> queue_transaction_latency_avg:
> ------------------------------
> This is the latency of putting the op in the journal queue. This will give you an idea of how much throttling is going on in the first place. It depends on the following two parameters if you are using XFS:
>
> filestore_queue_max_ops
> filestore_queue_max_bytes
>
> All the latency numbers are represented by avgcount (the number of ops within this range) and sum (the total latency in seconds). sum/avgcount will give you an idea of the latency per op.
>
> Hope this is helpful,
>
> Thanks & Regards
> Somnath
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Ray Sun
> Sent: Sunday, July 05, 2015 7:28 AM
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] Meanning of ceph perf dump
>
> Cephers,
> Is there any document or code definition to explain ceph perf dump? I am a little confused about the output. For example, under filestore there's journal_latency and apply_latency, and each of them has avgcount and sum. I am not quite sure what the unit and meaning of the numbers are. How can I use these numbers to tune my ceph cluster? Thanks a lot.
>
> "filestore": {
>     "journal_queue_max_ops": 300,
>     "journal_queue_ops": 0,
>     "journal_ops": 35893,
>     "journal_queue_max_bytes": 33554432,
>     "journal_queue_bytes": 0,
>     "journal_bytes": 20579009432,
>     "journal_latency": {
>         "avgcount": 35893,
>         "sum": 1213.560761279
>     },
>     "journal_wr": 34228,
>     "journal_wr_bytes": {
>         "avgcount": 34228,
>         "sum": 20657713152
>     },
>     "journal_full": 0,
>     "committing": 0,
>     "commitcycle": 3207,
>     "commitcycle_interval": {
>         "avgcount": 3207,
>         "sum": 16157.379852152
>     },
>     "commitcycle_latency": {
>         "avgcount": 3207,
>         "sum": 121.892109010
>     },
>     "op_queue_max_ops": 50,
>     "op_queue_ops": 0,
>     "ops": 35893,
>     "op_queue_max_bytes": 104857600,
>     "op_queue_bytes": 0,
>     "bytes": 20578506930,
>     "apply_latency": {
>         "avgcount": 35893,
>         "sum": 1327.974596287
>     },
>     "queue_transaction_latency_avg": {
>         "avgcount": 35893,
>         "sum": 0.025993727
>     }
> },
>
> Best Regards
> -- Ray
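(For reference, applying Somnath's sum/avgcount rule to the numbers above: journal_latency is 1213.560761279 s over 35893 ops, roughly 34 ms per journal write, and apply_latency is 1327.974596287 s over 35893 ops, roughly 37 ms per op. A one-liner sketch that does the same division straight from the admin socket, assuming jq is installed and osd.0 is just an example daemon:)

# ceph daemon osd.0 perf dump | jq '.filestore.journal_latency | .sum / .avgcount'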
Re: [ceph-users] Workaround for RHEL/CentOS 7.1 rbdmap service start warnings?
Other than those errors, do you find RBDs will not be unmapped on system restart/shutdown on a machine using systemd, leaving the system hanging without network connections while trying to unmap RBDs? That's been my experience thus far, so I wrote an (overly simple) systemd unit file to handle this on a per-RBD basis.

On Tue, Jul 14, 2015 at 1:15 PM, Bruce McFarland wrote:
> When starting the rbdmap.service to provide map/unmap of rbd devices across
> boot/shutdown cycles, the /etc/init.d/rbdmap script includes
> /lib/lsb/init-functions. This is not a problem except that the rbdmap script
> is making calls to the log_daemon_*, log_progress_*, and log_action_* functions
> that are included in Ubuntu 14.04 distros, but are not in the RHEL 7.1/RHCS
> 1.3 distro. Are there any recommended workarounds for boot-time startup on
> RHEL/CentOS 7.1 clients?
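(In case it helps anyone else, a minimal sketch of that kind of per-RBD unit; the pool/image/mount point are made up, and it assumes the stock ceph udev rule that creates /dev/rbd/<pool>/<image>. Because the unit is ordered after network-online.target, systemd stops it, and therefore unmaps the image, before taking the network down at shutdown:)

# /etc/systemd/system/rbd-myimage.service
[Unit]
Description=Map and mount rbd/myimage
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/rbd map rbd/myimage
ExecStart=/usr/bin/mount /dev/rbd/rbd/myimage /mnt/myimage
ExecStop=/usr/bin/umount /mnt/myimage
ExecStop=/usr/bin/rbd unmap /dev/rbd/rbd/myimage

[Install]
WantedBy=multi-user.target

Enable it with 'systemctl enable rbd-myimage.service'.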
Re: [ceph-users] Deadly slow Ceph cluster revisited
Disclaimer: I'm relatively new to ceph and haven't moved into production with it.

Did you run your bench for 30 seconds? For reference, my bench from a VM bridged to a 10Gig card, against 90x4TB OSDs, at 30 seconds is:

Total time run:         30.766596
Total writes made:      1979
Write size:             4194304
Bandwidth (MB/sec):     257.292
Stddev Bandwidth:       106.78
Max bandwidth (MB/sec): 420
Min bandwidth (MB/sec): 0
Average Latency:        0.248238
Stddev Latency:         0.723444
Max latency:            10.5275
Min latency:            0.0346015

Seems like latency is a huge factor if your 30 second test took 52 seconds.

What kind of 10Gig NICs are you using? I have Mellanox ConnectX-3 and one node was using an older driver version. I started to experience the osd in..out..in.. and "incorrectly marked out from..." behaviour mentioned by Quentin, as well as poor performance. Installing the newest version of the Mellanox driver got everything running well again.

On Fri, Jul 17, 2015 at 7:55 AM, J David wrote:
> On Fri, Jul 17, 2015 at 10:21 AM, Mark Nelson wrote:
>> rados -p 30 bench write
>>
>> just to see how it handles 4MB object writes.
>
> Here's that, from the VM host:
>
> Total time run:         52.062639
> Total writes made:      66
> Write size:             4194304
> Bandwidth (MB/sec):     5.071
>
> Stddev Bandwidth:       11.6312
> Max bandwidth (MB/sec): 80
> Min bandwidth (MB/sec): 0
> Average Latency:        12.436
> Stddev Latency:         13.6272
> Max latency:            51.6924
> Min latency:            0.073353
>
> Unfortunately I don't know much about how to parse this (other than
> 5MB/sec writes does match up with our best-case performance in the VM
> guest).
>
>> If rados bench is also terribly slow, then you might want to start
>> looking for evidence of IO getting hung up on a specific disk or node.
>
> Thus far, no evidence of that has presented itself. iostat looks good
> on every drive and the nodes are all equally loaded.
>
> Thanks!
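(For reference, the usual form of the bench being suggested; the pool name is a placeholder, and --no-cleanup keeps the benchmark objects around so a read pass can follow:)

# rados -p <pool> bench 30 write --no-cleanup
# rados -p <pool> bench 30 seq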
[ceph-users] Unsetting osd_crush_chooseleaf_type = 0
I originally built a single-node cluster and added 'osd_crush_chooseleaf_type = 0 #0 is for one node cluster' to ceph.conf (which is now commented out). I've now added a 2nd node; where can I set this value to 1?

I see in the crush map that the OSDs are under 'host' buckets and don't see any reference to leaf.

Would the cluster automatically rebalance when the 2nd host was added? How can I verify this?

The issue right now is that with two hosts, copies = 2, min copies = 1, I cannot access data from client machines when one of the two hosts goes down.

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host ceph1 {
        id -2           # do not change unnecessarily
        # weight 163.350
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 3.630
        item osd.1 weight 3.630
}
host ceph2 {
        id -3           # do not change unnecessarily
        # weight 163.350
        alg straw
        hash 0  # rjenkins1
        item osd.2 weight 3.630
        item osd.3 weight 3.630
}
root default {
        id -1           # do not change unnecessarily
        # weight 326.699
        alg straw
        hash 0  # rjenkins1
        item ceph1 weight 163.350
        item ceph2 weight 163.350
}

# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 0 type osd    <-- should this line be "step chooseleaf firstn 0 type host"?
        step emit
}

# end crush map
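(The flagged rule line is indeed what controls OSD-level vs host-level placement. A sketch of the usual way to change it, assuming size = 2 across the two hosts; the file names are arbitrary, and a fair amount of data movement should be expected when the new map is injected:)

# ceph osd getcrushmap -o cm.bin
# crushtool -d cm.bin -o cm.txt
  (edit cm.txt: replace "step choose firstn 0 type osd" with "step chooseleaf firstn 0 type host")
# crushtool -c cm.txt -o cm-new.bin
# crushtool -i cm-new.bin --test --rule 0 --num-rep 2 --show-statistics
# ceph osd setcrushmap -i cm-new.bin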
Re: [ceph-users] Health WARN, ceph errors looping
The error keeps coming back, with status eventually changing to OK, then back into errors. I thought it looked like a connectivity issue as well because of the "wrongly marked me down" messages, but firewall rules are allowing all traffic on the cluster network.

Syslog is being flooded with messages like:

Jul  7 10:52:17 ceph1 bash: 2015-07-07 10:52:17.609870 7f2055192700 -1 osd.21 129936 heartbeat_check: no reply from osd.89 ever on either front or back, first ping sent 2015-07-07 10:51:50.995374 (cutoff 2015-07-07 10:51:57.609817)
Jul  7 10:52:17 ceph1 bash: 2015-07-07 10:52:17.611302 7f203ba5b700 -1 osd.21 129936 heartbeat_check: no reply from osd.50 ever on either front or back, first ping sent 2015-07-07 10:51:44.691270 (cutoff 2015-07-07 10:51:57.611297)
Jul  7 10:52:17 ceph1 bash: 2015-07-07 10:52:17.611309 7f203ba5b700 -1 osd.21 129936 heartbeat_check: no reply from osd.61 ever on either front or back, first ping sent 2015-07-07 10:51:50.995374 (cutoff 2015-07-07 10:51:57.611297)
Jul  7 10:52:17 ceph1 bash: 2015-07-07 10:52:17.611315 7f203ba5b700 -1 osd.21 129936 heartbeat_check: no reply from osd.69 ever on either front or back, first ping sent 2015-07-07 10:51:54.998259 (cutoff 2015-07-07 10:51:57.611297)

That's just a small section, but multiple OSDs are listed. Eventually the logs are rate limited because they're coming in so fast.

On Tue, Jul 7, 2015 at 10:13 AM, Abhishek L wrote:
>
> Steve Dainard writes:
>
>> Hello,
>>
>> Ceph 0.94.1
>> 2 hosts, CentOS 7
>>
>> I have two hosts, one of which ran out of / disk space, which crashed all
>> the osd daemons. After cleaning up the OS disk storage and restarting
>> ceph on that node, I'm seeing multiple errors, then health OK, then
>> back into the errors:
>>
>> # ceph -w
>> http://pastebin.com/mSKwNzYp
>
> Is the error still consistently happening? (The last lines show
> active+clean.) Wild guess, but is it possible some sort of
> iptables/firewall rules are preventing communication between the osds?
>
>> Any help is appreciated.
>>
>> Thanks,
>> Steve
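(A sketch of how to confirm whether specific OSDs can actually reach each other; osd.89 is taken from the log above, everything else is generic. OSD heartbeats travel over both the public "front" and cluster "back" networks, and the daemons listen on ports in the 6800-7300 range by default, so firewall rules on both interfaces matter:)

# ceph osd find 89
# ceph osd dump | grep '^osd\.89 '
# ping -c 3 <cluster-network IP reported for osd.89>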
[ceph-users] Health WARN, ceph errors looping
Hello,

Ceph 0.94.1
2 hosts, CentOS 7

I have two hosts, one of which ran out of / disk space, which crashed all the osd daemons. After cleaning up the OS disk storage and restarting ceph on that node, I'm seeing multiple errors, then health OK, then back into the errors:

# ceph -w
http://pastebin.com/mSKwNzYp

Any help is appreciated.

Thanks,
Steve
[ceph-users] Can't mount btrfs volume on rbd
Hello,

I'm getting an error when attempting to mount a volume on a host that was forcibly powered off:

# mount /dev/rbd4 climate-downscale-CMIP5/
mount: mount /dev/rbd4 on /mnt/climate-downscale-CMIP5 failed: Stale file handle

/var/log/messages:
Jun 10 15:31:07 node1 kernel: rbd4: unknown partition table

# parted /dev/rbd4 print
Model: Unknown (unknown)
Disk /dev/rbd4: 36.5TB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start   End     Size    File system  Flags
 1      0.00B   36.5TB  36.5TB  btrfs

# btrfs check --repair /dev/rbd4
enabling repair mode
Checking filesystem on /dev/rbd4
UUID: dfe6b0c8-2866-4318-abc2-e1e75c891a5e
checking extents
cmds-check.c:2274: check_owner_ref: Assertion `rec->is_root` failed.
btrfs[0x4175cc]
btrfs[0x41b873]
btrfs[0x41c3fe]
btrfs[0x41dc1d]
btrfs[0x406922]

OS: CentOS 7.1
btrfs-progs: 3.16.2
Ceph: version 0.94.1 / CentOS 7.1

I haven't found any references to 'stale file handle' on btrfs. The underlying block device is a ceph rbd, so I've posted to both lists for any feedback.

Also, once I reformatted btrfs I didn't get a mount error. The btrfs volume has been reformatted, so I won't be able to do much post mortem, but I'm wondering if anyone has some insight.

Thanks,
Steve
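(Not a diagnosis, but on a client that went down uncleanly the cheap first step is usually to rebuild the mapping and retry a read-only mount before reaching for btrfs check --repair; the pool/image name below is a placeholder:)

# rbd showmapped
# rbd unmap /dev/rbd4
# rbd map <pool>/<image>
# mount -o ro /dev/rbd4 /mnt/climate-downscale-CMIP5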