Not sure if it is related, but I see the same issue on very different 
hardware/configuration. In particular, on large data transfers the OSDs become 
slow and block. iostat await on the spinners can go up to 6(!) seconds (the 
journal is on the SSD). Looking closer at those spinners with blktrace suggests 
that the IO requests spend most of those 6 seconds in the queue before being 
dispatched to the driver and eventually written to disk. I tried different IO 
schedulers and played with their parameters, but nothing helped. Unfortunately, 
blktrace is rather fragile and at some point fails to start until the machine 
is rebooted, so I am still waiting for an appropriate time slot to reboot the 
OSD nodes and record IO with blktrace again. 
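
For reference, the rough sequence I use to get that breakdown is something like 
the following (device name and duration are just examples). btt splits the 
latency into Q2C (queue-to-completion, the whole round trip) and D2C (time 
spent on the device itself); when D2C is small but Q2C is huge, the requests 
are sitting in the queue above the device rather than waiting on the disk:

  blktrace -d /dev/sdc -w 60              # trace the spinner for 60 seconds
  blkparse -i sdc -d sdc.bin > /dev/null  # convert to binary dump for btt
  btt -i sdc.bin | less                   # per-phase latency breakdown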

-Mykola

From: John Spray
Sent: Wednesday, 19 October 2016 19:17
To: Jim Kilborn
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] New cephfs cluster performance issues- Jewel - 
cache pressure, capability release, poor iostat await avg queue size

On Wed, Oct 19, 2016 at 5:17 PM, Jim Kilborn <j...@kilborns.com> wrote:
> John,
>
>
>
> Thanks for the tips….
>
> Unfortunately, I was looking at this page 
> http://docs.ceph.com/docs/jewel/start/os-recommendations/

OK, thanks - I've pushed an update to clarify that
(https://github.com/ceph/ceph/pull/11564).

> I’ll consider either upgrading the kernels or using the fuse client, but will 
> likely go the kernel 4.4 route
>
>
>
> As for moving to just a replicated pool, I take it that a replication size of 
> 3 is the minimum recommended.
>
> If I move away from EC, I will have to have 9 4TB spinners in each of the 4 
> servers. Can I put the 9 journals on the one 128GB ssd with 10GB per journal, 
> or is that too many OSDs per journal SSD, creating a hot spot for writes?

That sounds like a lot of journals on one SSD, but people other than
me have more empirical experience in hardware selection.
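
If you do go that way, it might be worth benchmarking that journal SSD with 
small synchronous direct writes first, since that is roughly the pattern nine 
filestore journals would put on it. A rough sketch with fio (destructive, so 
only run it against an empty device or partition; the device name and 
parameters are just examples, with numjobs=9 standing in for nine journals):

  fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --numjobs=9 --iodepth=1 --runtime=60 \
      --time_based --group_reporting

If the aggregate sync-write throughput collapses with nine concurrent writers, 
that SSD is going to be your hot spot.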

John

>
>
>
> Thanks!!
>
> From: John Spray<mailto:jsp...@redhat.com>
> Sent: Wednesday, October 19, 2016 9:10 AM
> To: Jim Kilborn<mailto:j...@kilborns.com>
> Cc: ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
> Subject: Re: [ceph-users] New cephfs cluster performance issues- Jewel - 
> cache pressure, capability release, poor iostat await avg queue size
>
>
>
> On Wed, Oct 19, 2016 at 1:28 PM, Jim Kilborn <j...@kilborns.com> wrote:
>> I have set up a new linux cluster to allow migration from our old SAN based 
>> cluster to a new cluster with ceph.
>> All systems are running CentOS 7.2 with the 3.10.0-327.36.1 kernel.
>> I am basically running stock ceph settings, with just the write cache turned 
>> off via hdparm on the drives, and scrubbing temporarily turned off.
>>
>> The 4 ceph servers are all Dell 730XD with 128GB memory and dual Xeons, so 
>> server performance should be good.  Since I am running cephfs, I have 
>> tiering set up.
>> Each server has 4 x 4TB drives for the erasure code pool, with K=3 and M=1, 
>> so the idea is to tolerate a single host failure.
>> Each server also has a 1TB Samsung 850 Pro SSD for the cache drive, in a 
>> replicated set with size=2.
>> The cache tier also has a 128GB SM863 SSD that is being used as a journal 
>> for the cache SSD; it has power loss protection.
>> My crush map is set up to ensure the cache pool uses only the 4 850 Pros and 
>> the erasure code pool uses only the 16 spinning 4TB drives.
>>
>> The problem I am seeing is that I start copying data from our old san 
>> to the ceph volume, and once the cache tier gets to my target_max_bytes of 
>> 1.4 TB, I start seeing:
>>
>> HEALTH_WARN 63 requests are blocked > 32 sec; 1 osds have slow requests; 
>> noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
>> 26 ops are blocked > 65.536 sec on osd.0
>> 37 ops are blocked > 32.768 sec on osd.0
>> 1 osds have slow requests
>> noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
>>
>> osd.0 is the cache ssd
>>
>> If I watch iostat on the cache ssd, I see that the queue lengths are high and 
>> the awaits are high.
>> Below is the iostat on the cache drive (osd.0) on the first host. The 
>> avgqu-sz is between 87 and 182 and the await is between 88ms and 1193ms.
>>
>> Device:   rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>> sdb
>>              0.00     0.33    9.00   84.33     0.96    20.11   462.40    75.92  397.56  125.67  426.58  10.70  99.90
>>              0.00     0.67   30.00   87.33     5.96    21.03   471.20    67.86  910.95   87.00 1193.99   8.27  97.07
>>              0.00    16.67   33.00  289.33     4.21    18.80   146.20    29.83   88.99   93.91   88.43   3.10  99.83
>>              0.00     7.33    7.67  261.67     1.92    19.63   163.81   117.42  331.97  182.04  336.36   3.71 100.00
>>
>>
>> If I look at the iostat for all the drives, only the cache ssd drive is 
>> backed up
>>
>> Device:   rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>> Sdg (journal for cache drive)
>>              0.00     6.33    0.00    8.00     0.00     0.07    19.04     0.00    0.33    0.00    0.33   0.33   0.27
>> Sdb (cache drive)
>>              0.00     0.33    3.33   82.00     0.83    20.07   501.68   106.75 1057.81  269.40 1089.86  11.72 100.00
>> Sda (4TB EC)
>>              0.00     0.00    0.00    4.00     0.00     0.02     9.33     0.00    0.00    0.00    0.00   0.00   0.00
>> Sdd (4TB EC)
>>              0.00     0.00    0.00    2.33     0.00     0.45   392.00     0.08   34.00    0.00   34.00   6.86   1.60
>> Sdf (4TB EC)
>>              0.00    14.00    0.00   26.00     0.00     0.22    17.71     1.00   38.55    0.00   38.55   0.68   1.77
>> Sdc (4TB EC)
>>              0.00     0.00    0.00    1.33     0.00     0.01     8.75     0.02   12.25    0.00   12.25  12.25   1.63
>>
>> While at this time it is just complaining about slow osd.0, the other cache 
>> tier ssds sometimes show some slow responses as well, but not as frequently.
>>
>>
>> I occasionally see complaints about a client not responding to cache 
>> pressure, and yesterday, while copying several terabytes, the client doing 
>> the copy was flagged for failing to respond to capability release, and I 
>> ended up rebooting it.
>>
>> It just seems the cluster isn't handling large data copies the way an nfs or 
>> san based volume would, and I am worried about moving our users to a cluster 
>> that is already showing signs of performance issues, even when I am just 
>> doing a copy with no other users. I am running only one rsync at a time.
>>
>> Is the problem that I need to use a later kernel for the clients mounting 
>> the volume? I have read some posts about that, but the docs say centos 7 
>> with 3.10 is ok.
>
> Which docs say to use the stock centos kernel?  >4.4 is recommended
> here: http://docs.ceph.com/docs/master/cephfs/best-practices/
>
>> Do I need more drives in my cache pool? I only have 4 ssd drives in the cache 
>> pool (one on each host), with each having a separate journal drive.
>> But is that too much of a hot spot, since all i/o has to go to the cache 
>> layer?
>> It seems like my ssds should be able to keep up with a single rsync copy.
>> Is there something set wrong on my ssds such that they can't keep up?
>
> Cache tiering is pretty sensitive to tuning of various parameters, and
> does not necessarily work well out of the box.  We do not test CephFS
> with cache tiering.  CephFS also has some behaviours that don't mesh
> very nicely with cache tiering, like periodically doing small writes
> to xattrs that would cause an entire data object to be promoted.
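>
> (If you do stay on the cache tier for now, the first knobs I would look at
> are the flush/evict thresholds on the cache pool -- a rough sketch, assuming
> your cache pool is the "cephfs-cache" pool from your output, with values that
> are only illustrative:
>
>   ceph osd pool set cephfs-cache cache_target_dirty_ratio 0.3
>   ceph osd pool set cephfs-cache cache_target_full_ratio 0.7
>
> The idea is to start flushing/evicting to the EC pool well before the cache
> hits target_max_bytes, instead of the cache OSDs blocking client writes while
> they flush at the limit.)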
>
>> I put the metadata pool on the ssd cache tier drives as well.
>
> Hmm, seems like a dubious choice: all data operations are going
> through those SSDs, *plus* the operations to promote/flush data to the
> spinners -- we do not do any prioritisation internally so your
> metadata ops are potentially going to be quite impacted by that.
>
>> Any ideas where the problem is or what I need to change to make this stable?
>
> Mainly I would suggest that you try changing some stuff and see what
> effect it has:
>  * Try just running a replicated pool on the spinners and use your SSDs
> as journals (or use them for metadata only) -- obviously this does not
> get you erasure coding, but it would narrow down where your issues are
> coming from (rough sketch below).
>  * Try a more recent kernel (>4.4)
>
> If you want to take a longer view of this, it might make sense for you
> to use replicated storage (and no cache tier) for now, and by the time
> your data has grown the Luminous release could be out with support for
> using EC without a cache tier.
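>
> (Roughly, for that first experiment -- the pool name and PG count are just
> placeholders, and the "spinning"/"disktype" names are guessed from your crush
> tree, so adjust as needed:
>
>   ceph osd crush rule create-simple replicated_spinning spinning disktype
>   ceph osd pool create cephfs-data-rep 512 512 replicated replicated_spinning
>   ceph osd pool set cephfs-data-rep size 3
>   ceph fs add_data_pool <fsname> cephfs-data-rep
>   setfattr -n ceph.dir.layout.pool -v cephfs-data-rep /mnt/cephfs/testdir
>
> New files written under that directory then go straight to the replicated
> pool on the spinners, so you can compare rsync behaviour against the
> cache-tiered EC pool without touching existing data.)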
>
> John
>
>>
>>
>> Thanks. Additional details below
>>
>> The ceph osd tier drives are osd 0,  5,  10, 15
>>
>> ceph df
>> GLOBAL:
>>     SIZE       AVAIL      RAW USED     %RAW USED
>>     63155G     50960G       12195G         19.31
>> POOLS:
>>     NAME                ID     USED       %USED     MAX AVAIL     OBJECTS
>>     cephfs-metadata     15     58943k         0          523G       28436
>>     cephfs-data         16      7382G     15.59        36499G     5013128
>>     cephfs-cache        17      1124G      3.56          523G      298348
>>
>> ceph osd df
>> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
>>  0 1.00000  1.00000   931G   669G   261G 71.88 3.72 280
>>  5 1.00000  1.00000   931G   526G   404G 56.51 2.93 259
>> 10 1.00000  1.00000   931G   589G   341G 63.35 3.28 266
>> 15 1.00000  1.00000   931G   468G   462G 50.33 2.61 219
>>  1 1.00000  1.00000  3714G   672G  3041G 18.11 0.94 139
>>  2 1.00000  1.00000  3714G   593G  3120G 15.98 0.83 122
>>  3 1.00000  1.00000  3714G   585G  3128G 15.77 0.82 121
>>  4 1.00000  1.00000  3714G   632G  3081G 17.04 0.88 130
>>  6 1.00000  1.00000  3714G   589G  3124G 15.87 0.82 120
>>  7 1.00000  1.00000  3714G   627G  3087G 16.89 0.87 130
>>  8 1.00000  1.00000  3714G   638G  3076G 17.18 0.89 132
>>  9 1.00000  1.00000  3714G   630G  3083G 16.98 0.88 130
>> 11 1.00000  1.00000  3714G   646G  3067G 17.41 0.90 132
>> 12 1.00000  1.00000  3714G   580G  3133G 15.63 0.81 120
>> 13 1.00000  1.00000  3714G   624G  3089G 16.82 0.87 129
>> 14 1.00000  1.00000  3714G   633G  3081G 17.05 0.88 131
>> 16 1.00000  1.00000  3714G   589G  3124G 15.88 0.82 122
>> 17 1.00000  1.00000  3714G   624G  3090G 16.80 0.87 129
>> 18 1.00000  1.00000  3714G   614G  3099G 16.55 0.86 127
>> 19 1.00000  1.00000  3714G   656G  3057G 17.68 0.92 134
>>               TOTAL 63155G 12195G 50959G 19.31
>>
>> ceph osd tree
>> ID  WEIGHT   TYPE NAME                             UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -14  4.00000 root ssd
>>  -1  1.00000     disktype darkjedi-ceph01_ssd
>>   0  1.00000         osd.0                              up  1.00000          1.00000
>>  -4  1.00000     disktype darkjedi-ceph02_ssd
>>   5  1.00000         osd.5                              up  1.00000          1.00000
>>  -7  1.00000     disktype darkjedi-ceph03_ssd
>>  10  1.00000         osd.10                             up  1.00000          1.00000
>> -10  1.00000     disktype darkjedi-ceph04_ssd
>>  15  1.00000         osd.15                             up  1.00000          1.00000
>> -13 16.00000 root spinning
>>  -2  4.00000     disktype darkjedi-ceph01_spinning
>>   1  1.00000         osd.1                              up  1.00000          1.00000
>>   2  1.00000         osd.2                              up  1.00000          1.00000
>>   3  1.00000         osd.3                              up  1.00000          1.00000
>>   4  1.00000         osd.4                              up  1.00000          1.00000
>>  -5  4.00000     disktype darkjedi-ceph02_spinning
>>   6  1.00000         osd.6                              up  1.00000          1.00000
>>   7  1.00000         osd.7                              up  1.00000          1.00000
>>   8  1.00000         osd.8                              up  1.00000          1.00000
>>   9  1.00000         osd.9                              up  1.00000          1.00000
>>  -8  4.00000     disktype darkjedi-ceph03_spinning
>>  11  1.00000         osd.11                             up  1.00000          1.00000
>>  12  1.00000         osd.12                             up  1.00000          1.00000
>>  13  1.00000         osd.13                             up  1.00000          1.00000
>>  14  1.00000         osd.14                             up  1.00000          1.00000
>> -11  4.00000     disktype darkjedi-ceph04_spinning
>>  16  1.00000         osd.16                             up  1.00000          1.00000
>>  17  1.00000         osd.17                             up  1.00000          1.00000
>>  18  1.00000         osd.18                             up  1.00000          1.00000
>>  19  1.00000         osd.19                             up  1.00000          1.00000
>> -12  5.00000 host darkjedi-ceph04
>> -10  1.00000     disktype darkjedi-ceph04_ssd
>>  15  1.00000         osd.15                             up  1.00000          1.00000
>> -11  4.00000     disktype darkjedi-ceph04_spinning
>>  16  1.00000         osd.16                             up  1.00000          1.00000
>>  17  1.00000         osd.17                             up  1.00000          1.00000
>>  18  1.00000         osd.18                             up  1.00000          1.00000
>>  19  1.00000         osd.19                             up  1.00000          1.00000
>>  -9  5.00000 host darkjedi-ceph03
>>  -7  1.00000     disktype darkjedi-ceph03_ssd
>>  10  1.00000         osd.10                             up  1.00000          1.00000
>>  -8  4.00000     disktype darkjedi-ceph03_spinning
>>  11  1.00000         osd.11                             up  1.00000          1.00000
>>  12  1.00000         osd.12                             up  1.00000          1.00000
>>  13  1.00000         osd.13                             up  1.00000          1.00000
>>  14  1.00000         osd.14                             up  1.00000          1.00000
>>  -6  6.00000 host darkjedi-ceph02
>>  -4  2.00000     disktype darkjedi-ceph02_ssd
>>   5  1.00000         osd.5                              up  1.00000          1.00000
>>  -5  4.00000     disktype darkjedi-ceph02_spinning
>>   6  1.00000         osd.6                              up  1.00000          1.00000
>>   7  1.00000         osd.7                              up  1.00000          1.00000
>>   8  1.00000         osd.8                              up  1.00000          1.00000
>>   9  1.00000         osd.9                              up  1.00000          1.00000
>>  -3  5.00000 host darkjedi-ceph01
>>  -1  1.00000     disktype darkjedi-ceph01_ssd
>>   0  1.00000         osd.0                              up  1.00000          1.00000
>>  -2  4.00000     disktype darkjedi-ceph01_spinning
>>   1  1.00000         osd.1                              up  1.00000          1.00000
>>   2  1.00000         osd.2                              up  1.00000          1.00000
>>   3  1.00000         osd.3                              up  1.00000          1.00000
>>   4  1.00000         osd.4                              up  1.00000          1.00000
>>
>>
>> ceph -w
>>     cluster 62ed97d6-adf4-12e4-8fd5-3d9701b22b86
>>      health HEALTH_WARN
>>             72 requests are blocked > 32 sec
>>             noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
>>      monmap e2: 3 mons at 
>> {darkjedi-ceph01=192.168.19.241:6789/0,darkjedi-ceph02=192.168.19.242:6789/0,darkjedi-ceph03=192.168.19.243:6789/0}
>>             election epoch 120, quorum 0,1,2 
>> darkjedi-ceph01,darkjedi-ceph02,darkjedi-ceph03
>>       fsmap e1156: 1/1/1 up {0=darkjedi-ceph04=up:active}, 1 up:standby
>>      osdmap e4294: 20 osds: 20 up, 20 in
>>             flags noout,noscrub,nodeep-scrub,sortbitwise
>>       pgmap v674643: 1024 pgs, 3 pools, 8513 GB data, 5216 kobjects
>>             12204 GB used, 50950 GB / 63155 GB avail
>>                 1024 active+clean
>>   client io 4039 kB/s wr, 0 op/s rd, 4 op/s wr
>>   cache io 10080 kB/s flush, 8064 kB/s evict, 0 op/s promote, 1 PG(s) 
>> evicting
>>
>> 2016-10-19 07:27:42.063881 mon.0 [INF] pgmap v674642: 1024 pgs: 1024 
>> active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 10093 
>> kB/s wr, 4 op/s
>> 2016-10-19 07:27:38.539687 osd.0 [WRN] 47 slow requests, 1 included below; 
>> oldest blocked for > 87.453888 secs
>> 2016-10-19 07:27:38.539695 osd.0 [WRN] slow request 30.771983 seconds old, 
>> received at 2016-10-19 07:27:07.767636: osd_op(client.1234553.1:412446 
>> 17.6f7367c 10000323d95.00000041 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:39.539883 osd.0 [WRN] 48 slow requests, 2 included below; 
>> oldest blocked for > 88.454076 secs
>> 2016-10-19 07:27:39.539890 osd.0 [WRN] slow request 60.016825 seconds old, 
>> received at 2016-10-19 07:26:39.522982: osd_op(client.1234553.1:412252 
>> 17.1220dbd9 10000323d93.0000001b [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:39.539897 osd.0 [WRN] slow request 30.063037 seconds old, 
>> received at 2016-10-19 07:27:09.476770: osd_op(client.1234553.1:412458 
>> 17.dac4885c 10000323d95.0000004d [write 0~2260992 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:40.540136 osd.0 [WRN] 53 slow requests, 5 included below; 
>> oldest blocked for > 89.454282 secs
>> 2016-10-19 07:27:40.540148 osd.0 [WRN] slow request 30.426319 seconds old, 
>> received at 2016-10-19 07:27:10.113694: osd_op(client.1234553.1:412483 
>> 17.dd3448b3 10000323d96.00000018 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:40.540153 osd.0 [WRN] slow request 30.557129 seconds old, 
>> received at 2016-10-19 07:27:09.982884: osd_op(client.1234553.1:412471 
>> 17.6dae7fe6 10000323d96.0000000c [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:40.540160 osd.0 [WRN] slow request 60.947034 seconds old, 
>> received at 2016-10-19 07:26:39.592979: osd_op(client.1234553.1:412259 
>> 17.5fecfad9 10000323d93.00000022 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:40.540165 osd.0 [WRN] slow request 30.218226 seconds old, 
>> received at 2016-10-19 07:27:10.321787: osd_op(client.1234553.1:412487 
>> 17.7e783a0b 10000323d96.0000001c [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:40.540173 osd.0 [WRN] slow request 30.669424 seconds old, 
>> received at 2016-10-19 07:27:09.870589: osd_op(client.1234553.1:412460 
>> 17.242c68e6 10000323d96.00000001 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:41.540415 osd.0 [WRN] 56 slow requests, 4 included below; 
>> oldest blocked for > 90.454559 secs
>> 2016-10-19 07:27:41.540426 osd.0 [WRN] slow request 30.591025 seconds old, 
>> received at 2016-10-19 07:27:10.949264: osd_op(client.1234553.1:412490 
>> 17.f00571c7 10000323d96.0000001f [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:41.540431 osd.0 [WRN] slow request 30.584468 seconds old, 
>> received at 2016-10-19 07:27:10.955822: osd_op(client.1234553.1:412493 
>> 17.e5c93438 10000323d96.00000022 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:41.540438 osd.0 [WRN] slow request 30.570790 seconds old, 
>> received at 2016-10-19 07:27:10.969499: osd_op(client.1234553.1:412499 
>> 17.75f41c27 10000323d96.00000028 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:41.540445 osd.0 [WRN] slow request 31.588275 seconds old, 
>> received at 2016-10-19 07:27:09.952014: osd_op(client.1234553.1:412463 
>> 17.e66e96b6 10000323d96.00000004 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:42.540685 osd.0 [WRN] 56 slow requests, 4 included below; 
>> oldest blocked for > 91.454840 secs
>> 2016-10-19 07:27:42.540694 osd.0 [WRN] slow request 60.595040 seconds old, 
>> received at 2016-10-19 07:26:41.945531: osd_op(client.1234553.1:412276 
>> 17.cae2b8f3 10000323d93.00000033 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:42.540702 osd.0 [WRN] slow request 30.032041 seconds old, 
>> received at 2016-10-19 07:27:12.508530: osd_op(client.1234553.1:412521 
>> 17.ec0d7938 10000323d96.0000003e [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:42.540716 osd.0 [WRN] slow request 60.589088 seconds old, 
>> received at 2016-10-19 07:26:41.951483: osd_op(client.1234553.1:412277 
>> 17.10107d9a 10000323d93.00000034 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently waiting for blocked object
>> 2016-10-19 07:27:42.540726 osd.0 [WRN] slow request 60.573651 seconds old, 
>> received at 2016-10-19 07:26:41.966920: osd_op(client.1234553.1:412287 
>> 17.cac5d907 10000323d93.0000003e [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:43.073014 mon.0 [INF] pgmap v674643: 1024 pgs: 1024 
>> active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 4039 
>> kB/s wr, 4 op/s
>> 2016-10-19 07:27:43.147883 mon.0 [INF] HEALTH_WARN; 72 requests are blocked 
>> > 32 sec; noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
>> 2016-10-19 07:27:43.540977 osd.0 [WRN] 60 slow requests, 5 included below; 
>> oldest blocked for > 92.455114 secs
>> 2016-10-19 07:27:43.540987 osd.0 [WRN] slow request 30.946896 seconds old, 
>> received at 2016-10-19 07:27:12.593949: osd_op(client.1234553.1:412553 
>> 17.780b5c32 10000323d97.00000010 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:43.540992 osd.0 [WRN] slow request 30.940430 seconds old, 
>> received at 2016-10-19 07:27:12.600414: osd_op(client.1234553.1:412554 
>> 17.62f5a288 10000323d97.00000011 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:43.541002 osd.0 [WRN] slow request 30.930237 seconds old, 
>> received at 2016-10-19 07:27:12.610608: osd_op(client.1234553.1:412557 
>> 17.e6fe4157 10000323d97.00000014 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:43.541009 osd.0 [WRN] slow request 30.669228 seconds old, 
>> received at 2016-10-19 07:27:12.871617: osd_op(client.1234553.1:412568 
>> 17.dd1ccfe0 10000323d97.0000001f [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:43.541018 osd.0 [WRN] slow request 60.284696 seconds old, 
>> received at 2016-10-19 07:26:43.256148: osd_op(client.1234553.1:412302 
>> 17.dcfdcaad 10000323d93.0000004d [write 0~2260992 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:44.083654 mon.0 [INF] pgmap v674644: 1024 pgs: 1024 
>> active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 4059 
>> kB/s wr, 2 op/s
>> 2016-10-19 07:27:45.105055 mon.0 [INF] pgmap v674645: 1024 pgs: 1024 
>> active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 26360 
>> kB/s wr, 6 op/s
>> 2016-10-19 07:27:46.116200 mon.0 [INF] pgmap v674646: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 90603 
>> kB/s wr, 23 op/s
>> 2016-10-19 07:27:47.126401 mon.0 [INF] pgmap v674647: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 114 
>> MB/s wr, 38 op/s
>> 2016-10-19 07:27:48.145239 mon.0 [INF] pgmap v674648: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 100 
>> MB/s wr, 49 op/s
>> 2016-10-19 07:27:49.153554 mon.0 [INF] pgmap v674649: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 64589 
>> kB/s wr, 30 op/s
>> 2016-10-19 07:27:50.158547 mon.0 [INF] pgmap v674650: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12205 GB used, 50950 GB / 63155 GB avail; 34499 
>> kB/s wr, 8 op/s
>> 2016-10-19 07:27:51.175699 mon.0 [INF] pgmap v674651: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12205 GB used, 50949 GB / 63155 GB avail; 79100 
>> kB/s wr, 19 op/s
>> 2016-10-19 07:27:52.180504 mon.0 [INF] pgmap v674652: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12205 GB used, 50950 GB / 63155 GB avail; 95241 
>> kB/s wr, 71 op/s
>> 2016-10-19 07:27:44.541269 osd.0 [WRN] 59 slow requests, 3 included below; 
>> oldest blocked for > 93.455423 secs
>> 2016-10-19 07:27:44.541276 osd.0 [WRN] slow request 31.985270 seconds old, 
>> received at 2016-10-19 07:27:12.555884: osd_op(client.1234553.1:412536 
>> 17.30639c12 10000323d96.0000004d [write 0~2260992 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:44.541282 osd.0 [WRN] slow request 31.980479 seconds old, 
>> received at 2016-10-19 07:27:12.560674: osd_op(client.1234553.1:412537 
>> 17.5c9321a8 10000323d97.00000000 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:44.541287 osd.0 [WRN] slow request 31.954483 seconds old, 
>> received at 2016-10-19 07:27:12.586670: osd_op(client.1234553.1:412545 
>> 17.49eb1b07 10000323d97.00000008 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:50.542250 osd.0 [WRN] 53 slow requests, 1 included below; 
>> oldest blocked for > 99.456434 secs
>> 2016-10-19 07:27:50.542261 osd.0 [WRN] slow request 30.322487 seconds old, 
>> received at 2016-10-19 07:27:20.219678: osd_op(client.1234553.1:412574 
>> 17.6c703de 10000323d97.00000025 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:51.542498 osd.0 [WRN] 50 slow requests, 3 included below; 
>> oldest blocked for > 100.456658 secs
>> 2016-10-19 07:27:51.542505 osd.0 [WRN] slow request 60.740689 seconds old, 
>> received at 2016-10-19 07:26:50.801700: osd_op(client.1234553.1:412320 
>> 17.802362e0 10000323d94.00000011 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:51.542513 osd.0 [WRN] slow request 60.574101 seconds old, 
>> received at 2016-10-19 07:26:50.968288: osd_op(client.1234553.1:412321 
>> 17.c529e2e6 10000323d94.00000012 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:51.542518 osd.0 [WRN] slow request 30.452691 seconds old, 
>> received at 2016-10-19 07:27:21.089698: osd_op(client.1234553.1:412576 
>> 17.65178e57 10000323d97.00000027 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:52.542740 osd.0 [WRN] 46 slow requests, 4 included below; 
>> oldest blocked for > 101.456908 secs
>> 2016-10-19 07:27:52.542746 osd.0 [WRN] slow request 30.674005 seconds old, 
>> received at 2016-10-19 07:27:21.868634: osd_op(client.1234553.1:412582 
>> 17.722c7ade 10000323d97.0000002d [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:52.542757 osd.0 [WRN] slow request 60.439945 seconds old, 
>> received at 2016-10-19 07:26:52.102693: osd_op(client.1234553.1:412333 
>> 17.f8df72c7 10000323d94.0000001e [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:52.542763 osd.0 [WRN] slow request 30.875550 seconds old, 
>> received at 2016-10-19 07:27:21.667089: osd_op(client.1234553.1:412581 
>> 17.13ef9fd2 10000323d97.0000002c [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:52.542770 osd.0 [WRN] slow request 60.317892 seconds old, 
>> received at 2016-10-19 07:26:52.224747: osd_op(client.1234553.1:412338 
>> 17.7416d4bb 10000323d94.00000023 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently waiting for blocked object
>> 2016-10-19 07:27:53.209272 mon.0 [INF] pgmap v674653: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12205 GB used, 50949 GB / 63155 GB avail; 73115 
>> kB/s wr, 75 op/s
>> 2016-10-19 07:27:53.542922 osd.0 [WRN] 45 slow requests, 1 included below; 
>> oldest blocked for > 98.523012 secs
>> 2016-10-19 07:27:53.542934 osd.0 [WRN] slow request 30.691735 seconds old, 
>> received at 2016-10-19 07:27:22.851117: osd_op(client.1234553.1:412586 
>> 17.905a3e43 10000323d97.00000031 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:54.219455 mon.0 [INF] pgmap v674654: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12205 GB used, 50949 GB / 63155 GB avail; 50465 
>> kB/s wr, 21 op/s
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
