On Wed, Oct 19, 2016 at 5:17 PM, Jim Kilborn <j...@kilborns.com> wrote:
> John,
>
>
>
> Thanks for the tips….
>
> Unfortunately, I was looking at this page 
> http://docs.ceph.com/docs/jewel/start/os-recommendations/

OK, thanks - I've pushed an update to clarify that
(https://github.com/ceph/ceph/pull/11564).

> I’ll consider either upgrading the kernels or using the fuse client, but will 
> likely go the kernel 4.4 route
>
>
>
> As for moving to just a replicated pool, I take it that a replication size of
> 3 is the minimum recommended.
>
> If I move to no EC, I will have to have 9 4TB spinners in each of the 4
> servers. Can I put the 9 journals on the one 128GB SSD with 10GB per journal,
> or is that too many OSD journals per SSD, creating a hot spot for writes?

That sounds like a lot of journals on one SSD, but people other than
me have more empirical experience in hardware selection.
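
If you do try it, the mechanics are straightforward: give each OSD its own
journal partition on the shared SSD and set the journal size in ceph.conf.
A rough sketch with ceph-disk and purely illustrative device names (the
spinners as /dev/sdb.../dev/sdj, the shared journal SSD as /dev/sdk):

    # ceph.conf on the OSD hosts: 10 GB journal partitions
    [osd]
    osd journal size = 10240

    # ceph-disk carves a fresh journal partition out of the shared SSD
    # for every data disk it prepares
    ceph-disk prepare /dev/sdb /dev/sdk
    ceph-disk prepare /dev/sdc /dev/sdk
    # ...repeat for the remaining spinners

Whether one SSD can absorb the combined journal write stream of nine spinners
is the real question, so keep an eye on its utilisation (with iostat, as you
have been doing) before committing to that layout.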

John

>
>
>
> Thanks!!
>
> From: John Spray <jsp...@redhat.com>
> Sent: Wednesday, October 19, 2016 9:10 AM
> To: Jim Kilborn <j...@kilborns.com>
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] New cephfs cluster performance issues- Jewel - 
> cache pressure, capability release, poor iostat await avg queue size
>
>
>
> On Wed, Oct 19, 2016 at 1:28 PM, Jim Kilborn <j...@kilborns.com> wrote:
>> I have set up a new Linux cluster to allow migration from our old SAN based
>> cluster to a new cluster with ceph.
>> All systems are running CentOS 7.2 with the 3.10.0-327.36.1 kernel.
>> I am basically running stock ceph settings, apart from turning the write
>> cache off via hdparm on the drives and temporarily turning off scrubbing.
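>> The write cache toggle is just the following on each drive (device name
>> illustrative):
>>
>>     hdparm -W 0 /dev/sdX    # disable the drive's volatile write cache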
>>
>> The 4 ceph servers are all Dell R730xd with 128GB memory and dual Xeons, so
>> server performance should be good. Since I am running cephfs, I have cache
>> tiering set up.
>> Each server has 4 x 4TB drives for the erasure coded pool, with k=3 and m=1,
>> so the idea is to survive a single host failure.
>> Each server also has a 1TB Samsung 850 Pro SSD for the cache drive, in a
>> replicated pool with size=2.
>> The cache tier also has a 128GB SM863 SSD per server, used as the journal
>> for the cache SSD; it has power loss protection.
>> My crush map is set up to ensure the cache pool uses only the 4 850 Pros and
>> the erasure coded pool uses only the 16 spinning 4TB drives.
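>> Roughly, the split is done with per-root crush rules along these lines (a
>> sketch rather than the exact rules, using the "ssd" and "spinning" roots and
>> the "disktype" buckets shown in the osd tree below; jewel-era option names
>> assumed):
>>
>>     # replicated rule confined to the SSD root, used by the cache pool
>>     ceph osd crush rule create-simple ssd_rule ssd disktype
>>     ceph osd pool set cephfs-cache crush_ruleset <ssd rule id>
>>
>>     # EC profile rooted at the spinners; the EC pool's rule is generated
>>     # from this profile when the pool is created
>>     ceph osd erasure-code-profile set ec-3-1 k=3 m=1 \
>>         ruleset-root=spinning ruleset-failure-domain=disktype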
>>
>> The problem I am seeing is that I start copying data from our old SAN to the
>> ceph volume, and once the cache tier gets to my target_max_bytes of 1.4 TB, I
>> start seeing:
>>
>> HEALTH_WARN 63 requests are blocked > 32 sec; 1 osds have slow requests; 
>> noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
>> 26 ops are blocked > 65.536 sec on osd.0
>> 37 ops are blocked > 32.768 sec on osd.0
>> 1 osds have slow requests
>> noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
>>
>> osd.0 is the cache ssd
>>
>> If I watch iostat on the cache SSD, I see the queue lengths are high and the
>> await values are high.
>> Below is the iostat on the cache drive (osd.0) on the first host. The 
>> avgqu-sz is between 87 and 182 and the await is between 88ms and 1193ms
>>
>> Device:  rrqm/s  wrqm/s    r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz    await  r_await  w_await  svctm   %util
>> sdb
>>            0.00    0.33   9.00   84.33    0.96   20.11    462.40     75.92   397.56   125.67   426.58  10.70   99.90
>>            0.00    0.67  30.00   87.33    5.96   21.03    471.20     67.86   910.95    87.00  1193.99   8.27   97.07
>>            0.00   16.67  33.00  289.33    4.21   18.80    146.20     29.83    88.99    93.91    88.43   3.10   99.83
>>            0.00    7.33   7.67  261.67    1.92   19.63    163.81    117.42   331.97   182.04   336.36   3.71  100.00
>>
>>
>> If I look at the iostat for all the drives, only the cache SSD is backed up.
>>
>> Device:  rrqm/s  wrqm/s    r/s     w/s   rMB/s   wMB/s  avgrq-sz  avgqu-sz    await  r_await  w_await  svctm   %util
>> sdg (journal for cache drive)
>>            0.00    6.33   0.00    8.00    0.00    0.07     19.04      0.00     0.33     0.00     0.33   0.33    0.27
>> sdb (cache drive)
>>            0.00    0.33   3.33   82.00    0.83   20.07    501.68    106.75  1057.81   269.40  1089.86  11.72  100.00
>> sda (4TB EC)
>>            0.00    0.00   0.00    4.00    0.00    0.02      9.33      0.00     0.00     0.00     0.00   0.00    0.00
>> sdd (4TB EC)
>>            0.00    0.00   0.00    2.33    0.00    0.45    392.00      0.08    34.00     0.00    34.00   6.86    1.60
>> sdf (4TB EC)
>>            0.00   14.00   0.00   26.00    0.00    0.22     17.71      1.00    38.55     0.00    38.55   0.68    1.77
>> sdc (4TB EC)
>>            0.00    0.00   0.00    1.33    0.00    0.01      8.75      0.02    12.25     0.00    12.25  12.25    1.63
>>
>> While at this time it is just complaining about slow requests on osd.0,
>> sometimes the other cache tier SSDs show some slow responses as well, but not
>> as frequently.
>>
>>
>> I occasionally see complaints about a client not responding to cache
>> pressure, and yesterday, while copying several terabytes, the client doing
>> the copy was flagged for failing to respond to capability release, and I
>> ended up rebooting it.
>>
>> It just seems the cluster isn't handling large data copies the way an NFS or
>> SAN based volume would, and I am worried about moving our users to a cluster
>> that is already showing signs of performance issues even when I am just doing
>> a copy with no other users. I am running only one rsync at a time.
>>
>> Is the problem that I need to use a later kernel on the clients mounting the
>> volume? I have read some posts about that, but the docs say CentOS 7 with
>> 3.10 is OK.
>
> Which docs say to use the stock centos kernel?  >4.4 is recommended
> here: http://docs.ceph.com/docs/master/cephfs/best-practices/
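>
> For reference, the two client options look roughly like this (a sketch, using
> one of your monitor addresses and the default admin credentials -- adjust to
> suit your cluster):
>
>     # kernel client (this is where the >4.4 recommendation matters)
>     mount -t ceph 192.168.19.241:6789:/ /mnt/cephfs \
>         -o name=admin,secretfile=/etc/ceph/admin.secret
>
>     # FUSE client: tracks the userspace code, independent of kernel version
>     ceph-fuse -m 192.168.19.241:6789 /mnt/cephfs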
>
>> Do I need more drives in my cache pool? I only have 4 SSDs in the cache pool
>> (one on each host), each with a separate journal drive.
>> But is that too much of a hot spot, since all I/O has to go through the cache
>> layer?
>> It seems like my SSDs should be able to keep up with a single rsync copy.
>> Is there something set wrong on my SSDs such that they can't keep up?
>
> Cache tiering is pretty sensitive to tuning of various parameters, and
> does not necessarily work well out of the box.  We do not test CephFS
> with cache tiering.  CephFS also has some behaviours that don't mesh
> very nicely with cache tiering, like periodically doing small writes
> to xattrs that would cause an entire data object to be promoted.
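>
> If you do stick with the tier for now, the usual knobs are the cache pool's
> flush/promotion settings; purely as an illustration (the values here are made
> up, not recommendations):
>
>     ceph osd pool set cephfs-cache cache_target_dirty_ratio 0.4
>     ceph osd pool set cephfs-cache cache_target_full_ratio 0.8
>     ceph osd pool set cephfs-cache target_max_bytes 1400000000000
>     ceph osd pool set cephfs-cache hit_set_type bloom
>     ceph osd pool set cephfs-cache hit_set_count 4
>     ceph osd pool set cephfs-cache hit_set_period 1200
>     ceph osd pool set cephfs-cache min_read_recency_for_promote 1
>     ceph osd pool set cephfs-cache min_write_recency_for_promote 1
>
> Lowering the dirty ratio makes flushing start earlier, so you are less likely
> to hit a wall of synchronous flushes when target_max_bytes is reached.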
>
>> I put the metadata pool on the ssd cache tier drives as well.
>
> Hmm, seems like a dubious choice: all data operations are going
> through those SSDs, *plus* the operations to promote/flush data to the
> spinners -- we do not do any prioritisation internally so your
> metadata ops are potentially going to be quite impacted by that.
>
>> Any ideas where the problem is or what I need to change to make this stable?
>
> Mainly I would suggest that you try changing some things and see what
> effect it has:
>  * Try just running a replicated pool on the spinners and use your SSDs
> as journals (or use them for metadata only) -- obviously this does not
> get you erasure coding, but it would narrow down where your issues are
> coming from (see the sketch after this list).
>  * Try a more recent kernel (>4.4)
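>
> A minimal sketch of that first experiment, assuming the "spinning" root and
> "disktype" buckets from your osd tree, with placeholder pool/rule names and
> illustrative pg counts:
>
>     ceph osd crush rule create-simple spinning_rule spinning disktype
>     ceph osd pool create cephfs-data-rep 512 512 replicated spinning_rule
>     ceph osd pool set cephfs-data-rep size 3
>     # or "ceph mds add_data_pool" on older releases
>     ceph fs add_data_pool <your fs name> cephfs-data-rep
>     # point a test directory's layout at the new pool; new files land there
>     setfattr -n ceph.dir.layout.pool -v cephfs-data-rep /mnt/cephfs/<test dir>
>
> Rerunning the rsync into that test directory would bypass the cache tier
> entirely and show whether the spinners alone can keep up.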
>
> If you want to take a longer view of this, it might make sense for you
> to use replicated storage (and no cache tier) for now, and by the time
> your data has grown the Luminous release could be out with support for
> using EC without a cache tier.
>
> John
>
>>
>>
>> Thanks. Additional details below
>>
>> The cache tier OSDs are osd 0, 5, 10, 15
>>
>> ceph df
>> GLOBAL:
>>     SIZE       AVAIL      RAW USED     %RAW USED
>>     63155G     50960G       12195G         19.31
>> POOLS:
>>     NAME                ID     USED       %USED     MAX AVAIL     OBJECTS
>>     cephfs-metadata     15     58943k         0          523G       28436
>>     cephfs-data         16      7382G     15.59        36499G     5013128
>>     cephfs-cache        17      1124G      3.56          523G      298348
>>
>> ceph osd df
>> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR  PGS
>>  0 1.00000  1.00000   931G   669G   261G 71.88 3.72 280
>>  5 1.00000  1.00000   931G   526G   404G 56.51 2.93 259
>> 10 1.00000  1.00000   931G   589G   341G 63.35 3.28 266
>> 15 1.00000  1.00000   931G   468G   462G 50.33 2.61 219
>>  1 1.00000  1.00000  3714G   672G  3041G 18.11 0.94 139
>>  2 1.00000  1.00000  3714G   593G  3120G 15.98 0.83 122
>>  3 1.00000  1.00000  3714G   585G  3128G 15.77 0.82 121
>>  4 1.00000  1.00000  3714G   632G  3081G 17.04 0.88 130
>>  6 1.00000  1.00000  3714G   589G  3124G 15.87 0.82 120
>>  7 1.00000  1.00000  3714G   627G  3087G 16.89 0.87 130
>>  8 1.00000  1.00000  3714G   638G  3076G 17.18 0.89 132
>>  9 1.00000  1.00000  3714G   630G  3083G 16.98 0.88 130
>> 11 1.00000  1.00000  3714G   646G  3067G 17.41 0.90 132
>> 12 1.00000  1.00000  3714G   580G  3133G 15.63 0.81 120
>> 13 1.00000  1.00000  3714G   624G  3089G 16.82 0.87 129
>> 14 1.00000  1.00000  3714G   633G  3081G 17.05 0.88 131
>> 16 1.00000  1.00000  3714G   589G  3124G 15.88 0.82 122
>> 17 1.00000  1.00000  3714G   624G  3090G 16.80 0.87 129
>> 18 1.00000  1.00000  3714G   614G  3099G 16.55 0.86 127
>> 19 1.00000  1.00000  3714G   656G  3057G 17.68 0.92 134
>>               TOTAL 63155G 12195G 50959G 19.31
>>
>> ceph osd tree
>> ID  WEIGHT   TYPE NAME                          UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -14  4.00000 root ssd
>>  -1  1.00000     disktype darkjedi-ceph01_ssd
>>   0  1.00000         osd.0                           up  1.00000          1.00000
>>  -4  1.00000     disktype darkjedi-ceph02_ssd
>>   5  1.00000         osd.5                           up  1.00000          1.00000
>>  -7  1.00000     disktype darkjedi-ceph03_ssd
>>  10  1.00000         osd.10                          up  1.00000          1.00000
>> -10  1.00000     disktype darkjedi-ceph04_ssd
>>  15  1.00000         osd.15                          up  1.00000          1.00000
>> -13 16.00000 root spinning
>>  -2  4.00000     disktype darkjedi-ceph01_spinning
>>   1  1.00000         osd.1                           up  1.00000          1.00000
>>   2  1.00000         osd.2                           up  1.00000          1.00000
>>   3  1.00000         osd.3                           up  1.00000          1.00000
>>   4  1.00000         osd.4                           up  1.00000          1.00000
>>  -5  4.00000     disktype darkjedi-ceph02_spinning
>>   6  1.00000         osd.6                           up  1.00000          1.00000
>>   7  1.00000         osd.7                           up  1.00000          1.00000
>>   8  1.00000         osd.8                           up  1.00000          1.00000
>>   9  1.00000         osd.9                           up  1.00000          1.00000
>>  -8  4.00000     disktype darkjedi-ceph03_spinning
>>  11  1.00000         osd.11                          up  1.00000          1.00000
>>  12  1.00000         osd.12                          up  1.00000          1.00000
>>  13  1.00000         osd.13                          up  1.00000          1.00000
>>  14  1.00000         osd.14                          up  1.00000          1.00000
>> -11  4.00000     disktype darkjedi-ceph04_spinning
>>  16  1.00000         osd.16                          up  1.00000          1.00000
>>  17  1.00000         osd.17                          up  1.00000          1.00000
>>  18  1.00000         osd.18                          up  1.00000          1.00000
>>  19  1.00000         osd.19                          up  1.00000          1.00000
>> -12  5.00000 host darkjedi-ceph04
>> -10  1.00000     disktype darkjedi-ceph04_ssd
>>  15  1.00000         osd.15                          up  1.00000          1.00000
>> -11  4.00000     disktype darkjedi-ceph04_spinning
>>  16  1.00000         osd.16                          up  1.00000          1.00000
>>  17  1.00000         osd.17                          up  1.00000          1.00000
>>  18  1.00000         osd.18                          up  1.00000          1.00000
>>  19  1.00000         osd.19                          up  1.00000          1.00000
>>  -9  5.00000 host darkjedi-ceph03
>>  -7  1.00000     disktype darkjedi-ceph03_ssd
>>  10  1.00000         osd.10                          up  1.00000          1.00000
>>  -8  4.00000     disktype darkjedi-ceph03_spinning
>>  11  1.00000         osd.11                          up  1.00000          1.00000
>>  12  1.00000         osd.12                          up  1.00000          1.00000
>>  13  1.00000         osd.13                          up  1.00000          1.00000
>>  14  1.00000         osd.14                          up  1.00000          1.00000
>>  -6  6.00000 host darkjedi-ceph02
>>  -4  2.00000     disktype darkjedi-ceph02_ssd
>>   5  1.00000         osd.5                           up  1.00000          1.00000
>>  -5  4.00000     disktype darkjedi-ceph02_spinning
>>   6  1.00000         osd.6                           up  1.00000          1.00000
>>   7  1.00000         osd.7                           up  1.00000          1.00000
>>   8  1.00000         osd.8                           up  1.00000          1.00000
>>   9  1.00000         osd.9                           up  1.00000          1.00000
>>  -3  5.00000 host darkjedi-ceph01
>>  -1  1.00000     disktype darkjedi-ceph01_ssd
>>   0  1.00000         osd.0                           up  1.00000          1.00000
>>  -2  4.00000     disktype darkjedi-ceph01_spinning
>>   1  1.00000         osd.1                           up  1.00000          1.00000
>>   2  1.00000         osd.2                           up  1.00000          1.00000
>>   3  1.00000         osd.3                           up  1.00000          1.00000
>>   4  1.00000         osd.4                           up  1.00000          1.00000
>>
>>
>> ceph -w
>>     cluster 62ed97d6-adf4-12e4-8fd5-3d9701b22b86
>>      health HEALTH_WARN
>>             72 requests are blocked > 32 sec
>>             noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
>>      monmap e2: 3 mons at 
>> {darkjedi-ceph01=192.168.19.241:6789/0,darkjedi-ceph02=192.168.19.242:6789/0,darkjedi-ceph03=192.168.19.243:6789/0}
>>             election epoch 120, quorum 0,1,2 
>> darkjedi-ceph01,darkjedi-ceph02,darkjedi-ceph03
>>       fsmap e1156: 1/1/1 up {0=darkjedi-ceph04=up:active}, 1 up:standby
>>      osdmap e4294: 20 osds: 20 up, 20 in
>>             flags noout,noscrub,nodeep-scrub,sortbitwise
>>       pgmap v674643: 1024 pgs, 3 pools, 8513 GB data, 5216 kobjects
>>             12204 GB used, 50950 GB / 63155 GB avail
>>                 1024 active+clean
>>   client io 4039 kB/s wr, 0 op/s rd, 4 op/s wr
>>   cache io 10080 kB/s flush, 8064 kB/s evict, 0 op/s promote, 1 PG(s) 
>> evicting
>>
>> 2016-10-19 07:27:42.063881 mon.0 [INF] pgmap v674642: 1024 pgs: 1024 
>> active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 10093 
>> kB/s wr, 4 op/s
>> 2016-10-19 07:27:38.539687 osd.0 [WRN] 47 slow requests, 1 included below; 
>> oldest blocked for > 87.453888 secs
>> 2016-10-19 07:27:38.539695 osd.0 [WRN] slow request 30.771983 seconds old, 
>> received at 2016-10-19 07:27:07.767636: osd_op(client.1234553.1:412446 
>> 17.6f7367c 10000323d95.00000041 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:39.539883 osd.0 [WRN] 48 slow requests, 2 included below; 
>> oldest blocked for > 88.454076 secs
>> 2016-10-19 07:27:39.539890 osd.0 [WRN] slow request 60.016825 seconds old, 
>> received at 2016-10-19 07:26:39.522982: osd_op(client.1234553.1:412252 
>> 17.1220dbd9 10000323d93.0000001b [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:39.539897 osd.0 [WRN] slow request 30.063037 seconds old, 
>> received at 2016-10-19 07:27:09.476770: osd_op(client.1234553.1:412458 
>> 17.dac4885c 10000323d95.0000004d [write 0~2260992 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:40.540136 osd.0 [WRN] 53 slow requests, 5 included below; 
>> oldest blocked for > 89.454282 secs
>> 2016-10-19 07:27:40.540148 osd.0 [WRN] slow request 30.426319 seconds old, 
>> received at 2016-10-19 07:27:10.113694: osd_op(client.1234553.1:412483 
>> 17.dd3448b3 10000323d96.00000018 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:40.540153 osd.0 [WRN] slow request 30.557129 seconds old, 
>> received at 2016-10-19 07:27:09.982884: osd_op(client.1234553.1:412471 
>> 17.6dae7fe6 10000323d96.0000000c [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:40.540160 osd.0 [WRN] slow request 60.947034 seconds old, 
>> received at 2016-10-19 07:26:39.592979: osd_op(client.1234553.1:412259 
>> 17.5fecfad9 10000323d93.00000022 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:40.540165 osd.0 [WRN] slow request 30.218226 seconds old, 
>> received at 2016-10-19 07:27:10.321787: osd_op(client.1234553.1:412487 
>> 17.7e783a0b 10000323d96.0000001c [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:40.540173 osd.0 [WRN] slow request 30.669424 seconds old, 
>> received at 2016-10-19 07:27:09.870589: osd_op(client.1234553.1:412460 
>> 17.242c68e6 10000323d96.00000001 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:41.540415 osd.0 [WRN] 56 slow requests, 4 included below; 
>> oldest blocked for > 90.454559 secs
>> 2016-10-19 07:27:41.540426 osd.0 [WRN] slow request 30.591025 seconds old, 
>> received at 2016-10-19 07:27:10.949264: osd_op(client.1234553.1:412490 
>> 17.f00571c7 10000323d96.0000001f [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:41.540431 osd.0 [WRN] slow request 30.584468 seconds old, 
>> received at 2016-10-19 07:27:10.955822: osd_op(client.1234553.1:412493 
>> 17.e5c93438 10000323d96.00000022 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:41.540438 osd.0 [WRN] slow request 30.570790 seconds old, 
>> received at 2016-10-19 07:27:10.969499: osd_op(client.1234553.1:412499 
>> 17.75f41c27 10000323d96.00000028 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:41.540445 osd.0 [WRN] slow request 31.588275 seconds old, 
>> received at 2016-10-19 07:27:09.952014: osd_op(client.1234553.1:412463 
>> 17.e66e96b6 10000323d96.00000004 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:42.540685 osd.0 [WRN] 56 slow requests, 4 included below; 
>> oldest blocked for > 91.454840 secs
>> 2016-10-19 07:27:42.540694 osd.0 [WRN] slow request 60.595040 seconds old, 
>> received at 2016-10-19 07:26:41.945531: osd_op(client.1234553.1:412276 
>> 17.cae2b8f3 10000323d93.00000033 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:42.540702 osd.0 [WRN] slow request 30.032041 seconds old, 
>> received at 2016-10-19 07:27:12.508530: osd_op(client.1234553.1:412521 
>> 17.ec0d7938 10000323d96.0000003e [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:42.540716 osd.0 [WRN] slow request 60.589088 seconds old, 
>> received at 2016-10-19 07:26:41.951483: osd_op(client.1234553.1:412277 
>> 17.10107d9a 10000323d93.00000034 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently waiting for blocked object
>> 2016-10-19 07:27:42.540726 osd.0 [WRN] slow request 60.573651 seconds old, 
>> received at 2016-10-19 07:26:41.966920: osd_op(client.1234553.1:412287 
>> 17.cac5d907 10000323d93.0000003e [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:43.073014 mon.0 [INF] pgmap v674643: 1024 pgs: 1024 
>> active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 4039 
>> kB/s wr, 4 op/s
>> 2016-10-19 07:27:43.147883 mon.0 [INF] HEALTH_WARN; 72 requests are blocked 
>> > 32 sec; noout,noscrub,nodeep-scrub,sortbitwise flag(s) set
>> 2016-10-19 07:27:43.540977 osd.0 [WRN] 60 slow requests, 5 included below; 
>> oldest blocked for > 92.455114 secs
>> 2016-10-19 07:27:43.540987 osd.0 [WRN] slow request 30.946896 seconds old, 
>> received at 2016-10-19 07:27:12.593949: osd_op(client.1234553.1:412553 
>> 17.780b5c32 10000323d97.00000010 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:43.540992 osd.0 [WRN] slow request 30.940430 seconds old, 
>> received at 2016-10-19 07:27:12.600414: osd_op(client.1234553.1:412554 
>> 17.62f5a288 10000323d97.00000011 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:43.541002 osd.0 [WRN] slow request 30.930237 seconds old, 
>> received at 2016-10-19 07:27:12.610608: osd_op(client.1234553.1:412557 
>> 17.e6fe4157 10000323d97.00000014 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:43.541009 osd.0 [WRN] slow request 30.669228 seconds old, 
>> received at 2016-10-19 07:27:12.871617: osd_op(client.1234553.1:412568 
>> 17.dd1ccfe0 10000323d97.0000001f [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:43.541018 osd.0 [WRN] slow request 60.284696 seconds old, 
>> received at 2016-10-19 07:26:43.256148: osd_op(client.1234553.1:412302 
>> 17.dcfdcaad 10000323d93.0000004d [write 0~2260992 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:44.083654 mon.0 [INF] pgmap v674644: 1024 pgs: 1024 
>> active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 4059 
>> kB/s wr, 2 op/s
>> 2016-10-19 07:27:45.105055 mon.0 [INF] pgmap v674645: 1024 pgs: 1024 
>> active+clean; 8513 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 26360 
>> kB/s wr, 6 op/s
>> 2016-10-19 07:27:46.116200 mon.0 [INF] pgmap v674646: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 90603 
>> kB/s wr, 23 op/s
>> 2016-10-19 07:27:47.126401 mon.0 [INF] pgmap v674647: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 114 
>> MB/s wr, 38 op/s
>> 2016-10-19 07:27:48.145239 mon.0 [INF] pgmap v674648: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 100 
>> MB/s wr, 49 op/s
>> 2016-10-19 07:27:49.153554 mon.0 [INF] pgmap v674649: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12204 GB used, 50950 GB / 63155 GB avail; 64589 
>> kB/s wr, 30 op/s
>> 2016-10-19 07:27:50.158547 mon.0 [INF] pgmap v674650: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12205 GB used, 50950 GB / 63155 GB avail; 34499 
>> kB/s wr, 8 op/s
>> 2016-10-19 07:27:51.175699 mon.0 [INF] pgmap v674651: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12205 GB used, 50949 GB / 63155 GB avail; 79100 
>> kB/s wr, 19 op/s
>> 2016-10-19 07:27:52.180504 mon.0 [INF] pgmap v674652: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12205 GB used, 50950 GB / 63155 GB avail; 95241 
>> kB/s wr, 71 op/s
>> 2016-10-19 07:27:44.541269 osd.0 [WRN] 59 slow requests, 3 included below; 
>> oldest blocked for > 93.455423 secs
>> 2016-10-19 07:27:44.541276 osd.0 [WRN] slow request 31.985270 seconds old, 
>> received at 2016-10-19 07:27:12.555884: osd_op(client.1234553.1:412536 
>> 17.30639c12 10000323d96.0000004d [write 0~2260992 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:44.541282 osd.0 [WRN] slow request 31.980479 seconds old, 
>> received at 2016-10-19 07:27:12.560674: osd_op(client.1234553.1:412537 
>> 17.5c9321a8 10000323d97.00000000 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:44.541287 osd.0 [WRN] slow request 31.954483 seconds old, 
>> received at 2016-10-19 07:27:12.586670: osd_op(client.1234553.1:412545 
>> 17.49eb1b07 10000323d97.00000008 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:50.542250 osd.0 [WRN] 53 slow requests, 1 included below; 
>> oldest blocked for > 99.456434 secs
>> 2016-10-19 07:27:50.542261 osd.0 [WRN] slow request 30.322487 seconds old, 
>> received at 2016-10-19 07:27:20.219678: osd_op(client.1234553.1:412574 
>> 17.6c703de 10000323d97.00000025 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:51.542498 osd.0 [WRN] 50 slow requests, 3 included below; 
>> oldest blocked for > 100.456658 secs
>> 2016-10-19 07:27:51.542505 osd.0 [WRN] slow request 60.740689 seconds old, 
>> received at 2016-10-19 07:26:50.801700: osd_op(client.1234553.1:412320 
>> 17.802362e0 10000323d94.00000011 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:51.542513 osd.0 [WRN] slow request 60.574101 seconds old, 
>> received at 2016-10-19 07:26:50.968288: osd_op(client.1234553.1:412321 
>> 17.c529e2e6 10000323d94.00000012 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:51.542518 osd.0 [WRN] slow request 30.452691 seconds old, 
>> received at 2016-10-19 07:27:21.089698: osd_op(client.1234553.1:412576 
>> 17.65178e57 10000323d97.00000027 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:52.542740 osd.0 [WRN] 46 slow requests, 4 included below; 
>> oldest blocked for > 101.456908 secs
>> 2016-10-19 07:27:52.542746 osd.0 [WRN] slow request 30.674005 seconds old, 
>> received at 2016-10-19 07:27:21.868634: osd_op(client.1234553.1:412582 
>> 17.722c7ade 10000323d97.0000002d [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:52.542757 osd.0 [WRN] slow request 60.439945 seconds old, 
>> received at 2016-10-19 07:26:52.102693: osd_op(client.1234553.1:412333 
>> 17.f8df72c7 10000323d94.0000001e [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:52.542763 osd.0 [WRN] slow request 30.875550 seconds old, 
>> received at 2016-10-19 07:27:21.667089: osd_op(client.1234553.1:412581 
>> 17.13ef9fd2 10000323d97.0000002c [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:52.542770 osd.0 [WRN] slow request 60.317892 seconds old, 
>> received at 2016-10-19 07:26:52.224747: osd_op(client.1234553.1:412338 
>> 17.7416d4bb 10000323d94.00000023 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently waiting for blocked object
>> 2016-10-19 07:27:53.209272 mon.0 [INF] pgmap v674653: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12205 GB used, 50949 GB / 63155 GB avail; 73115 
>> kB/s wr, 75 op/s
>> 2016-10-19 07:27:53.542922 osd.0 [WRN] 45 slow requests, 1 included below; 
>> oldest blocked for > 98.523012 secs
>> 2016-10-19 07:27:53.542934 osd.0 [WRN] slow request 30.691735 seconds old, 
>> received at 2016-10-19 07:27:22.851117: osd_op(client.1234553.1:412586 
>> 17.905a3e43 10000323d97.00000031 [write 0~4194304 [1@-1]] snapc 1=[] 
>> ondisk+write e4294) currently reached_pg
>> 2016-10-19 07:27:54.219455 mon.0 [INF] pgmap v674654: 1024 pgs: 1024 
>> active+clean; 8514 GB data, 12205 GB used, 50949 GB / 63155 GB avail; 50465 
>> kB/s wr, 21 op/s
>>
>>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
