Re: [ceph-users] rocksdb: Corruption: missing start of fragmented record

2017-11-12 Thread Konstantin Shalygin

Fair point. I just tried with 12.2.1 (on pre-release Ubuntu bionic now).

Doesn't change anything - fsck doesn't fix rocksdb, the bluestore won't
mount, the OSD won't activate and the error is the same.

Is there any fix in .2 that might address this, or do you just mean that
in general there will be bug fixes?


I think Christian is talking about the upcoming 12.2.2 release specifically, not 12.2.* in general.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance, and how much wiggle room there is with tunables

2017-11-12 Thread Rudi Ahlers
Would you mind telling me what rados command set you use, and sharing the
output? I would like to compare it to our server as well.
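
For reference, the sequence I normally run looks roughly like this (a sketch
only; pool name, runtime and thread count are placeholders to adjust):

# dedicated benchmark pool; the PG count here is only an example
ceph osd pool create bench 128 128
# 60s of 4MB sequential writes with 16 threads, keep the objects for the read tests
rados bench -p bench 60 write --no-cleanup -t 16
# sequential and random reads against the objects written above
rados bench -p bench 60 seq -t 16
rados bench -p bench 60 rand -t 16
# remove the benchmark objects afterwards
rados -p bench cleanup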

On Fri, Nov 10, 2017 at 6:29 AM, Robert Stanford 
wrote:

>
>  In my cluster, rados bench shows about 1GB/s bandwidth.  I've done some
> tuning:
>
> [osd]
> osd op threads = 8
> osd disk threads = 4
> osd recovery max active = 7
>
>
> I was hoping to get much better bandwidth.  My network can handle it, and
> my disks are pretty fast as well.  Are there any major tunables I can play
> with to increase what will be reported by "rados bench"?  Am I pretty much
> stuck around the bandwidth it reported?
>
>  Thank you
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Kind Regards
Rudi Ahlers
Website: http://www.rudiahlers.co.za
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Where can I find the fix commit of #3370 ?

2017-11-12 Thread ? ?
I met the same issue as http://tracker.ceph.com/issues/3370 ,
but I can't find commit 2978257c56935878f8a756c6cb169b569e99bb91 in the repository.
Can someone help me?
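
In case it helps, these are generic git commands for hunting down such a commit
in a local clone of ceph.git (just a sketch; the hash may simply not exist in
the public repository if it came from a rebased or private branch):

git clone https://github.com/ceph/ceph.git && cd ceph
# does any branch or tag contain this commit?
git branch -a --contains 2978257c56935878f8a756c6cb169b569e99bb91
git tag --contains 2978257c56935878f8a756c6cb169b569e99bb91
# otherwise, search commit messages for the tracker number
git log --all --oneline --grep='3370'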
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Erasure Coding Pools and PG calculation - documentation

2017-11-12 Thread Christian Wuerdig
Well, as stated in the other email I think in the EC scenario you can
set size=k+m for the pgcalc tool. If you want 10+2 then in theory you
should be able to get away with 6 nodes to survive a single node
failure if you can guarantee that every node will always receive 2 out
of the 12 chunks - looks like this might be achievable:
http://ceph.com/planet/erasure-code-on-small-clusters/
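
The trick described there boils down to a CRUSH rule that picks 6 hosts first
and then 2 OSDs on each, roughly like this (an untested sketch; rule name and
id are placeholders, and older releases use "ruleset" instead of "id"):

rule ec_k10m2 {
        id 2
        type erasure
        min_size 12
        max_size 12
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step choose indep 6 type host
        step chooseleaf indep 2 type osd
        step emit
}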

On Mon, Nov 13, 2017 at 1:32 PM, Tim Gipson  wrote:
> I guess my questions are more centered around k+m and PG calculations.
>
> As we were starting to build and test our EC pools with our infrastructure we 
> were trying to figure out what our calculations needed to be starting with 3 
> OSD hosts with 12 x 10 TB OSDs a piece.  The nodes have the ability to expand 
> to 24 drives a piece and we hope to eventually get to around a 1PB cluster 
> after we add some more hosts.  Initially we hoped to be able to do a k=10 m=2 
> on the pool but I am not sure that is going to be feasible.  We’d like to set 
> up the failure domain so that we would be able to lose an entire host without 
> losing the cluster.  At this point I’m not sure that’s possible without 
> bringing in more hosts.
>
> Thanks for the help!
>
> Tim Gipson
>
>
> On 11/12/17, 5:14 PM, "Christian Wuerdig"  wrote:
>
> I might be wrong, but from memory I think you can use
> http://ceph.com/pgcalc/ and use k+m for the size
>
> On Sun, Nov 12, 2017 at 5:41 AM, Ashley Merrick  
> wrote:
> > Hello,
> >
> > Are you having any issues with getting the pool working or just around the
> > PG num you should use?
> >
> > ,Ashley
> >
> > Get Outlook for Android
> >
> > 
> > From: ceph-users  on behalf of Tim Gipson
> > 
> > Sent: Saturday, November 11, 2017 5:38:02 AM
> > To: ceph-users@lists.ceph.com
> > Subject: [ceph-users] Erasure Coding Pools and PG calculation -
> > documentation
> >
> > Hey all,
> >
> > I’m having some trouble setting up a Pool for Erasure Coding.  I haven’t
> > found much documentation around the PG calculation for an Erasure Coding
> > pool.  It seems from what I've tried so far that the math needed to set one
> > up is different than the math you use to calculate PGs for a regular
> > replicated pool.
> >
> > Does anyone have any experience setting up a pool this way and can you give
> > me some help or direction, or point me toward some documentation that goes
> > over the math behind this sort of pool setup?
> >
> > Any help would be greatly appreciated!
> >
> > Thanks,
> >
> >
> > Tim Gipson
> > Systems Engineer
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Erasure Coding Pools and PG calculation - documentation

2017-11-12 Thread Tim Gipson
I guess my questions are more centered around k+m and PG calculations.

As we were starting to build and test our EC pools with our infrastructure we 
were trying to figure out what our calculations needed to be starting with 3 
OSD hosts with 12 x 10 TB OSDs a piece.  The nodes have the ability to expand 
to 24 drives a piece and we hope to eventually get to around a 1PB cluster 
after we add some more hosts.  Initially we hoped to be able to do a k=10 m=2 
on the pool but I am not sure that is going to be feasible.  We’d like to set 
up the failure domain so that we would be able to lose an entire host without 
losing the cluster.  At this point I’m not sure that’s possible without 
bringing in more hosts.

Thanks for the help!

Tim Gipson


On 11/12/17, 5:14 PM, "Christian Wuerdig"  wrote:

I might be wrong, but from memory I think you can use
http://ceph.com/pgcalc/ and use k+m for the size

On Sun, Nov 12, 2017 at 5:41 AM, Ashley Merrick  
wrote:
> Hello,
>
> Are you having any issues with getting the pool working or just around the
> PG num you should use?
>
> ,Ashley
>
> Get Outlook for Android
>
> 
> From: ceph-users  on behalf of Tim Gipson
> 
> Sent: Saturday, November 11, 2017 5:38:02 AM
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] Erasure Coding Pools and PG calculation -
> documentation
>
> Hey all,
>
> I’m having some trouble setting up a Pool for Erasure Coding.  I haven’t
> found much documentation around the PG calculation for an Erasure Coding
> pool.  It seems from what I've tried so far that the math needed to set one
> up is different than the math you use to calculate PGs for a regular
> replicated pool.
>
> Does anyone have any experience setting up a pool this way and can you give
> me some help or direction, or point me toward some documentation that goes
> over the math behind this sort of pool setup?
>
> Any help would be greatly appreciated!
>
> Thanks,
>
>
> Tim Gipson
> Systems Engineer
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread Brad Hubbard


On Mon, Nov 13, 2017 at 4:57 AM, David Turner  wrote:
> You cannot reduce the PG count for a pool.  So there isn't anything you can
> really do for this unless you create a new FS with better PG counts and
> migrate your data into it.
>
> The problem with having more PGs than you need is in the memory footprint
> for the osd daemon. There are warning thresholds for having too many PGs per
> osd.  Also in future expansions, if you need to add pools, you might not be
> able to create the pools with the proper amount of PGs due to older pools
> that have way too many PGs.
>
> It would still be nice to see the output from those commands I asked about.
>
> The built-in reweighting scripts might help your data distribution.
> reweight-by-utilization

Please also carefully consider your use of "min_size 1" and understand the risks
associated with it (there are several threads on this list, as well as
ceph-devel, that talk about this setting).
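
For reference, checking and changing it is one command per pool (pool names
as used elsewhere in this thread); keep in mind that min_size 2 on a
2-replica pool trades availability for safety, which is why size 3 with
min_size 2 is the usual recommendation:

ceph osd pool get downloads_data min_size
ceph osd pool set downloads_data min_size 2
ceph osd pool set downloads_metadata min_size 2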

>
>
> On Sun, Nov 12, 2017, 11:41 AM gjprabu  wrote:
>>
>> Hi David,
>>
>> Thanks for your valuable reply , once complete the backfilling for new osd
>> and will consider by increasing replica value asap. Is it possible to
>> decrease the metadata pg count ?  if the pg count for metadata for value
>> same as data count what kind of issue may occur ?
>>
>> Regards
>> PrabuGJ
>>
>>
>>
>>  On Sun, 12 Nov 2017 21:25:05 +0530 David
>> Turner wrote 
>>
>> What's the output of `ceph df` to see if your PG counts are good or not?
>> Like everyone else has said, the space on the original osds can't be
>> expected to free up until the backfill from adding the new osd has finished.
>>
>> You don't have anything in your cluster health to indicate that your
>> cluster will not be able to finish this backfilling operation on its own.
>>
>> You might find this URL helpful in calculating your PG counts.
>> http://ceph.com/pgcalc/  As a side note. It is generally better to keep your
>> PG counts as base 2 numbers (16, 64, 256, etc). When you do not have a base
>> 2 number then some of your PGs will take up twice as much space as others.
>> In your case with 250, you have 244 PGs that are the same size and 6 PGs
>> that are twice the size of those 244 PGs.  Bumping that up to 256 will even
>> things out.
>>
>> Assuming that the metadata pool is for a CephFS volume, you do not need
>> nearly so many PGs for that pool. Also, I would recommend changing at least
>> the metadata pool to 3 replica_size. If we can talk you into 3 replica for
>> everything else, great! But if not, at least do the metadata pool. If you
>> lose an object in the data pool, you just lose that file. If you lose an
>> object in the metadata pool, you might lose access to the entire CephFS
>> volume.
>>
>>
>> On Sun, Nov 12, 2017, 9:39 AM gjprabu  wrote:
>>
>> Hi Cassiano,
>>
>>Thanks for your valuable feedback and will wait for some time till
>> new osd sync get complete. Also for by increasing pg count it is the issue
>> will solve? our setup pool size for data and metadata pg number is 250. Is
>> this correct for 7 OSD with 2 replica. Also currently stored data size is
>> 17TB.
>>
>> ceph osd df
>> ID WEIGHT  REWEIGHT SIZE   USEAVAIL %USE  VAR  PGS
>> 0 3.29749  1.0  3376G  2814G  562G 83.35 1.23 165
>> 1 3.26869  1.0  3347G  1923G 1423G 57.48 0.85 152
>> 2 3.27339  1.0  3351G  1980G 1371G 59.10 0.88 161
>> 3 3.24089  1.0  3318G  2131G 1187G 64.23 0.95 168
>> 4 3.24089  1.0  3318G  2998G  319G 90.36 1.34 176
>> 5 3.32669  1.0  3406G  2476G  930G 72.68 1.08 165
>> 6 3.27800  1.0  3356G  1518G 1838G 45.24 0.67 166
>>   TOTAL 23476G 15843G 7632G 67.49
>> MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53
>>
>> ceph osd tree
>> ID WEIGHT   TYPE NAMEUP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -1 22.92604 root default  
>> -2  3.29749 host intcfs-osd1  
>> 0  3.29749 osd.0 up  1.0  1.0
>> -3  3.26869 host intcfs-osd2  
>> 1  3.26869 osd.1 up  1.0  1.0
>> -4  3.27339 host intcfs-osd3  
>> 2  3.27339 osd.2 up  1.0  1.0
>> -5  3.24089 host intcfs-osd4  
>> 3  3.24089 osd.3 up  1.0  1.0
>> -6  3.24089 host intcfs-osd5  
>> 4  3.24089 osd.4 up  1.0  1.0
>> -7  3.32669 host intcfs-osd6  
>> 5  3.32669 osd.5 up  1.0  1.0
>> -8  3.27800 host intcfs-osd7  
>> 6  3.27800 osd.6 up  1.0  1.0
>>
>> ceph osd pool ls detail
>>
>> pool 0 'rbd' replicated size 

Re: [ceph-users] Erasure Coding Pools and PG calculation - documentation

2017-11-12 Thread Christian Wuerdig
I might be wrong, but from memory I think you can use
http://ceph.com/pgcalc/ and use k+m for the size
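
As a concrete sketch of what that looks like on the command line (k, m and the
PG count are placeholders to be taken from pgcalc; option names are the
Luminous ones, older releases spell the failure-domain option differently):

ceph osd erasure-code-profile set ec-profile-example k=4 m=2 crush-failure-domain=host
ceph osd pool create ecpool-example 256 256 erasure ec-profile-example
ceph osd pool get ecpool-example erasure_code_profile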

On Sun, Nov 12, 2017 at 5:41 AM, Ashley Merrick  wrote:
> Hello,
>
> Are you having any issues with getting the pool working or just around the
> PG num you should use?
>
> ,Ashley
>
> Get Outlook for Android
>
> 
> From: ceph-users  on behalf of Tim Gipson
> 
> Sent: Saturday, November 11, 2017 5:38:02 AM
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] Erasure Coding Pools and PG calculation -
> documentation
>
> Hey all,
>
> I’m having some trouble setting up a Pool for Erasure Coding.  I haven’t
> found much documentation around the PG calculation for an Erasure Coding
> pool.  It seems from what I’ve tried so far that the math needed to set one
> up is different than the math you use to calculate PGs for a regular
> replicated pool.
>
> Does anyone have any experience setting up a pool this way and can you give
> me some help or direction, or point me toward some documentation that goes
> over the math behind this sort of pool setup?
>
> Any help would be greatly appreciated!
>
> Thanks,
>
>
> Tim Gipson
> Systems Engineer
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-12 Thread Christian Wuerdig
As per: https://www.spinics.net/lists/ceph-devel/msg38686.html
Bluestore has a hard 4GB object size limit.
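
If you really need to push a file that large straight into a RADOS pool
(rather than going through RBD, CephFS or RGW, which stripe for you), one
workaround is to split it into sub-4GB chunks first; a rough shell sketch,
reusing the pool and file names from the report below:

# split the source file into 1GB pieces (.part.aa, .part.ab, ...)
split -b 1G blablablalbalblablalablalb.txt blablablalbalblablalablalb.txt.part.
# upload each piece as its own object
for p in blablablalbalblablalablalb.txt.part.*; do
    rados -p ec21 put "$(basename "$p")" "$p"
done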


On Sat, Nov 11, 2017 at 9:27 AM, Marc Roos  wrote:
>
> OSDs are crashing when putting an (8GB) file into an erasure-coded pool,
> just before finishing. The same OSDs are used for replicated pools
> rbd/cephfs, and seem to do fine. Did I make some error, or is this a bug?
> Looks similar to
> https://www.spinics.net/lists/ceph-devel/msg38685.html
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/021045.html
>
>
> [@c01 ~]# date ; rados -p ec21 put  $(basename
> "/mnt/disk/blablablalbalblablalablalb.txt")
> blablablalbalblablalablalb.txt
> Fri Nov 10 20:27:26 CET 2017
>
> [Fri Nov 10 20:33:51 2017] libceph: osd9 down
> [Fri Nov 10 20:33:51 2017] libceph: osd9 down
> [Fri Nov 10 20:33:51 2017] libceph: osd0 192.168.10.111:6802 socket
> closed (con state OPEN)
> [Fri Nov 10 20:33:51 2017] libceph: osd0 192.168.10.111:6802 socket
> error on write
> [Fri Nov 10 20:33:52 2017] libceph: osd0 down
> [Fri Nov 10 20:33:52 2017] libceph: osd7 down
> [Fri Nov 10 20:33:55 2017] libceph: osd0 down
> [Fri Nov 10 20:33:55 2017] libceph: osd7 down
> [Fri Nov 10 20:34:41 2017] libceph: osd7 up
> [Fri Nov 10 20:34:41 2017] libceph: osd7 up
> [Fri Nov 10 20:35:03 2017] libceph: osd9 up
> [Fri Nov 10 20:35:03 2017] libceph: osd9 up
> [Fri Nov 10 20:35:47 2017] libceph: osd0 up
> [Fri Nov 10 20:35:47 2017] libceph: osd0 up
>
> [@c02 ~]# rados -p ec21 stat blablablalbalblablalablalb.txt
> 2017-11-10 20:39:31.296101 7f840ad45e40 -1 WARNING: the following
> dangerous and experimental features are enabled: bluestore
> 2017-11-10 20:39:31.296290 7f840ad45e40 -1 WARNING: the following
> dangerous and experimental features are enabled: bluestore
> 2017-11-10 20:39:31.331588 7f840ad45e40 -1 WARNING: the following
> dangerous and experimental features are enabled: bluestore
> ec21/blablablalbalblablalablalb.txt mtime 2017-11-10 20:32:52.00,
> size 8585740288
>
>
>
> 2017-11-10 20:32:52.287503 7f933028d700  4 rocksdb: EVENT_LOG_v1
> {"time_micros": 1510342372287484, "job": 32, "event": "flush_started",
> "num_memtables": 1, "num_entries": 728747, "num_deletes": 363960,
> "memory_usage": 263854696}
> 2017-11-10 20:32:52.287509 7f933028d700  4 rocksdb:
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
> CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
> 12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/flush_job.cc:293]
> [default] [JOB 32] Level-0 flush table #25279: started
> 2017-11-10 20:32:52.503311 7f933028d700  4 rocksdb: EVENT_LOG_v1
> {"time_micros": 1510342372503293, "cf_name": "default", "job": 32,
> "event": "table_file_creation", "file_number": 25279, "file_size":
> 4811948, "table_properties": {"data_size": 4675796, "index_size":
> 102865, "filter_size": 32302, "raw_key_size": 646440,
> "raw_average_key_size": 75, "raw_value_size": 4446103,
> "raw_average_value_size": 519, "num_data_blocks": 1180, "num_entries":
> 8560, "filter_policy_name": "rocksdb.BuiltinBloomFilter",
> "kDeletedKeys": "0", "kMergeOperands": "330"}}
> 2017-11-10 20:32:52.503327 7f933028d700  4 rocksdb:
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
> CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
> 12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/flush_job.cc:319]
> [default] [JOB 32] Level-0 flush table #25279: 4811948 bytes OK
> 2017-11-10 20:32:52.572413 7f933028d700  4 rocksdb:
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
> CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
> 12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/db_impl_files.cc:242]
> adding log 25276 to recycle list
>
> 2017-11-10 20:32:52.572422 7f933028d700  4 rocksdb: (Original Log Time
> 2017/11/10-20:32:52.503339)
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
> CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
> 12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/memtable_list.cc:360]
> [default] Level-0 commit table #25279 started
> 2017-11-10 20:32:52.572425 7f933028d700  4 rocksdb: (Original Log Time
> 2017/11/10-20:32:52.572312)
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
> CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
> 12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/memtable_list.cc:383]
> [default] Level-0 commit table #25279: memtable #1 done
> 2017-11-10 20:32:52.572428 7f933028d700  4 rocksdb: (Original Log Time
> 2017/11/10-20:32:52.572328) EVENT_LOG_v1 {"time_micros":
> 1510342372572321, "job": 32, "event": "flush_finished", "lsm_state": [4,
> 4, 36, 140, 0, 0, 0], "immutable_memtables": 0}
> 2017-11-10 20:32:52.572430 7f933028d700  4 rocksdb: (Original Log Time
> 2017/11/10-20:32:52.572397)
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
> 

Re: [ceph-users] Undersized fix for small cluster, other than adding a 4th node?

2017-11-12 Thread Christian Wuerdig
The default failure domain is host and you will need 5 (=k+m) nodes
for this config. If you have 4 nodes you can run k=3,m=1 or k=2,m=2
otherwise you'd have to change failure domain to OSD
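
Raw-space efficiency is k/(k+m), so k=2,m=2 stores 50% data while k=3,m=2
stores 60%. As an untested sketch of the two options (profile and pool names
are placeholders; on pre-Luminous the option is ruleset-failure-domain):

# 4 nodes, keep host as the failure domain
ceph osd erasure-code-profile set ec22 k=2 m=2 crush-failure-domain=host
# 3 nodes, accept that chunks of one PG may share a host
ceph osd erasure-code-profile set ec32osd k=3 m=2 crush-failure-domain=osd
ceph osd pool create ecpool-test 64 64 erasure ec22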

On Fri, Nov 10, 2017 at 10:52 AM, Marc Roos  wrote:
>
> I added an erasure k=3,m=2 coded pool on a 3 node test cluster and am
> getting these errors.
>
>pg 48.0 is stuck undersized for 23867.00, current state
> active+undersized+degraded, last acting [9,13,2147483647,7,2147483647]
> pg 48.1 is stuck undersized for 27479.944212, current state
> active+undersized+degraded, last acting [12,1,2147483647,8,2147483647]
> pg 48.2 is stuck undersized for 27479.944514, current state
> active+undersized+degraded, last acting [12,1,2147483647,3,2147483647]
> pg 48.3 is stuck undersized for 27479.943845, current state
> active+undersized+degraded, last acting [11,0,2147483647,2147483647,5]
> pg 48.4 is stuck undersized for 27479.947473, current state
> active+undersized+degraded, last acting [8,4,2147483647,2147483647,5]
> pg 48.5 is stuck undersized for 27479.940289, current state
> active+undersized+degraded, last acting [6,5,11,2147483647,2147483647]
> pg 48.6 is stuck undersized for 27479.947125, current state
> active+undersized+degraded, last acting [5,8,2147483647,1,2147483647]
> pg 48.7 is stuck undersized for 23866.977708, current state
> active+undersized+degraded, last acting [13,11,2147483647,0,2147483647]
>
> Mentioned here
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-May/009572.html
> is that the problem was resolved by adding an extra node. I already
> changed the min_size to 3. Or should I change to k=2,m=2, but do I still
> get a good saving on storage then? How do you calculate the storage
> saving of an erasure pool?
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Log entrys from RGW.

2017-11-12 Thread Jaroslaw Owsiewski
http://tracker.ceph.com/issues/22015 - does anyone else have this issue?

Regards
-- 
Jarek
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph zstd not for bluestor due to performance reasons

2017-11-12 Thread Stefan Priebe - Profihost AG

On 12.11.2017 at 17:55, Sage Weil wrote:
> On Wed, 25 Oct 2017, Sage Weil wrote:
>> On Wed, 25 Oct 2017, Stefan Priebe - Profihost AG wrote:
>>> Hello,
>>>
>>> in the lumious release notes is stated that zstd is not supported by
>>> bluestor due to performance reason. I'm wondering why btrfs instead
>>> states that zstd is as fast as lz4 but compresses as good as zlib.
>>>
>>> Why is zlib than supported by bluestor? And why does btrfs / facebook
>>> behave different?
>>>
>>> "BlueStore supports inline compression using zlib, snappy, or LZ4. (Ceph
>>> also supports zstd for RGW compression but zstd is not recommended for
>>> BlueStore for performance reasons.)"
>>
>> zstd will work but in our testing the performance wasn't great for 
>> bluestore in particular.  The problem was that for each compression run 
>> there is a relatively high start-up cost initializing the zstd 
>> context/state (IIRC a memset of a huge memory buffer) that dominated the 
>> execution time... primarily because bluestore is generally compressing 
>> pretty small chunks of data at a time, not big buffers or streams.
>>
>> Take a look at unittest_compression timings on compressing 16KB buffers 
>> (smaller than bluestore usually needs, but illustrative of the problem):
>>
>> [ RUN  ] Compressor/CompressorTest.compress_16384/0
>> [plugin zlib (zlib/isal)]
>> [   OK ] Compressor/CompressorTest.compress_16384/0 (294 ms)
>> [ RUN  ] Compressor/CompressorTest.compress_16384/1
>> [plugin zlib (zlib/noisal)]
>> [   OK ] Compressor/CompressorTest.compress_16384/1 (1755 ms)
>> [ RUN  ] Compressor/CompressorTest.compress_16384/2
>> [plugin snappy (snappy)]
>> [   OK ] Compressor/CompressorTest.compress_16384/2 (169 ms)
>> [ RUN  ] Compressor/CompressorTest.compress_16384/3
>> [plugin zstd (zstd)]
>> [   OK ] Compressor/CompressorTest.compress_16384/3 (4528 ms)
>>
>> It's an order of magnitude slower than zlib or snappy, which probably 
>> isn't acceptable--even if it is a bit smaller.
> 
> Update!  Zstd developer Yann Collet debugged this and it turns out it was 
> a build issue, fixed by https://github.com/ceph/ceph/pull/18879/files 
> (missing quotes!  yeesh).  The results now look quite good!
> 
> [ RUN  ] Compressor/CompressorTest.compress_16384/0
> [plugin zlib (zlib/isal)]
> [   OK ] Compressor/CompressorTest.compress_16384/0 (370 ms)
> [ RUN  ] Compressor/CompressorTest.compress_16384/1
> [plugin zlib (zlib/noisal)]
> [   OK ] Compressor/CompressorTest.compress_16384/1 (1926 ms)
> [ RUN  ] Compressor/CompressorTest.compress_16384/2
> [plugin snappy (snappy)]
> [   OK ] Compressor/CompressorTest.compress_16384/2 (163 ms)
> [ RUN  ] Compressor/CompressorTest.compress_16384/3
> [plugin zstd (zstd)]
> [   OK ] Compressor/CompressorTest.compress_16384/3 (723 ms)
> 
> Not as fast as snappy, but somewhere between Intel-accelerated zlib and 
> non-accelerated zlib, with better compression ratios.
> 
> Also, the zstd compression level is currently hard-coded to level 5.  
> That should be fixed at some point.
> 
> We can backport this to luminous so it's available in 12.2.3.

Thanks a lot - I already ported your improvements and the fix to my
local branch, but will also change the compression level to 3 or maybe 2.

Level 5 is still far too slow, and also higher than what most apps using zstd choose.

I'm happy that my post to the zstd github project had so much success.
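
For anyone who wants to experiment themselves, the relevant knobs can be set
per OSD or per pool; a sketch with Luminous option names (the pool name and
the chosen algorithm/mode are only examples):

# ceph.conf, [osd] section - cluster-wide default
bluestore compression algorithm = zstd
bluestore compression mode = aggressive

# or per pool at runtime
ceph osd pool set mypool compression_algorithm zstd
ceph osd pool set mypool compression_mode aggressive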

Greets,
Stefan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread David Turner
You cannot reduce the PG count for a pool.  So there isn't anything you can
really do for this unless you create a new FS with better PG counts and
migrate your data into it.

The problem with having more PGs than you need is in the memory footprint
for the osd daemon. There are warning thresholds for having too many PGs
per osd.  Also in future expansions, if you need to add pools, you might
not be able to create the pools with the proper amount of PGs due to older
pools that have way too many PGs.

It would still be nice to see the output from those commands I asked about.

The built-in reweighting scripts might help your data distribution.
reweight-by-utilization
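
A dry run first shows what would change without touching anything; the
threshold below is just the usual default:

ceph osd test-reweight-by-utilization 120
ceph osd reweight-by-utilization 120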

On Sun, Nov 12, 2017, 11:41 AM gjprabu  wrote:

> Hi David,
>
> Thanks for your valuable reply , once complete the backfilling for new osd
> and will consider by increasing replica value asap. Is it possible to
> decrease the metadata pg count ?  if the pg count for metadata for value
> same as data count what kind of issue may occur ?
>
> Regards
> PrabuGJ
>
>
>
>  On Sun, 12 Nov 2017 21:25:05 +0530 David Turner
> wrote 
>
> What's the output of `ceph df` to see if your PG counts are good or not?
> Like everyone else has said, the space on the original osds can't be
> expected to free up until the backfill from adding the new osd has finished.
>
> You don't have anything in your cluster health to indicate that your
> cluster will not be able to finish this backfilling operation on its own.
>
> You might find this URL helpful in calculating your PG counts.
> http://ceph.com/pgcalc/  As a side note. It is generally better to keep
> your PG counts as base 2 numbers (16, 64, 256, etc). When you do not have a
> base 2 number then some of your PGs will take up twice as much space as
> others. In your case with 250, you have 244 PGs that are the same size and
> 6 PGs that are twice the size of those 244 PGs.  Bumping that up to 256
> will even things out.
>
> Assuming that the metadata pool is for a CephFS volume, you do not need
> nearly so many PGs for that pool. Also, I would recommend changing at least
> the metadata pool to 3 replica_size. If we can talk you into 3 replica for
> everything else, great! But if not, at least do the metadata pool. If you
> lose an object in the data pool, you just lose that file. If you lose an
> object in the metadata pool, you might lose access to the entire CephFS
> volume.
>
> On Sun, Nov 12, 2017, 9:39 AM gjprabu  wrote:
>
> Hi Cassiano,
>
>Thanks for your valuable feedback and will wait for some time till
> new osd sync get complete. Also for by increasing pg count it is the issue
> will solve? our setup pool size for data and metadata pg number is 250. Is
> this correct for 7 OSD with 2 replica. Also currently stored data size is
> 17TB.
>
> ceph osd df
> ID WEIGHT  REWEIGHT SIZE   USEAVAIL %USE  VAR  PGS
> 0 3.29749  1.0  3376G  2814G  562G 83.35 1.23 165
> 1 3.26869  1.0  3347G  1923G 1423G 57.48 0.85 152
> 2 3.27339  1.0  3351G  1980G 1371G 59.10 0.88 161
> 3 3.24089  1.0  3318G  2131G 1187G 64.23 0.95 168
> 4 3.24089  1.0  3318G  2998G  319G 90.36 1.34 176
> 5 3.32669  1.0  3406G  2476G  930G 72.68 1.08 165
> 6 3.27800  1.0  3356G  1518G 1838G 45.24 0.67 166
>   TOTAL 23476G 15843G 7632G 67.49
> MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53
>
> ceph osd tree
> ID WEIGHT   TYPE NAMEUP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 22.92604 root default
> -2  3.29749 host intcfs-osd1
> 0  3.29749 osd.0 up  1.0  1.0
> -3  3.26869 host intcfs-osd2
> 1  3.26869 osd.1 up  1.0  1.0
> -4  3.27339 host intcfs-osd3
> 2  3.27339 osd.2 up  1.0  1.0
> -5  3.24089 host intcfs-osd4
> 3  3.24089 osd.3 up  1.0  1.0
> -6  3.24089 host intcfs-osd5
> 4  3.24089 osd.4 up  1.0  1.0
> -7  3.32669 host intcfs-osd6
> 5  3.32669 osd.5 up  1.0  1.0
> -8  3.27800 host intcfs-osd7
> 6  3.27800 osd.6 up  1.0  1.0
>
> ceph osd pool ls detail
>
> pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
> pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0
> object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags
> hashpspool crash_replay_interval 45 stripe_width 0
> pool 4 'downloads_metadata' replicated size 2 min_size 1 crush_ruleset
> 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 36 flags
> hashpspool stripe_width 0
>
> Regards
> Prabu GJ
>
>  On Sun, 12 Nov 2017 19:20:34 +0530 Cassiano Pilipavicius wrote 
>
> I am also not an expert, but it 

Re: [ceph-users] ceph zstd not for bluestor due to performance reasons

2017-11-12 Thread Sage Weil
On Wed, 25 Oct 2017, Sage Weil wrote:
> On Wed, 25 Oct 2017, Stefan Priebe - Profihost AG wrote:
> > Hello,
> > 
> > in the lumious release notes is stated that zstd is not supported by
> > bluestor due to performance reason. I'm wondering why btrfs instead
> > states that zstd is as fast as lz4 but compresses as good as zlib.
> > 
> > Why is zlib than supported by bluestor? And why does btrfs / facebook
> > behave different?
> > 
> > "BlueStore supports inline compression using zlib, snappy, or LZ4. (Ceph
> > also supports zstd for RGW compression but zstd is not recommended for
> > BlueStore for performance reasons.)"
> 
> zstd will work but in our testing the performance wasn't great for 
> bluestore in particular.  The problem was that for each compression run 
> there is a relatively high start-up cost initializing the zstd 
> context/state (IIRC a memset of a huge memory buffer) that dominated the 
> execution time... primarily because bluestore is generally compressing 
> pretty small chunks of data at a time, not big buffers or streams.
> 
> Take a look at unittest_compression timings on compressing 16KB buffers 
> (smaller than bluestore usually needs, but illustrative of the problem):
> 
> [ RUN  ] Compressor/CompressorTest.compress_16384/0
> [plugin zlib (zlib/isal)]
> [   OK ] Compressor/CompressorTest.compress_16384/0 (294 ms)
> [ RUN  ] Compressor/CompressorTest.compress_16384/1
> [plugin zlib (zlib/noisal)]
> [   OK ] Compressor/CompressorTest.compress_16384/1 (1755 ms)
> [ RUN  ] Compressor/CompressorTest.compress_16384/2
> [plugin snappy (snappy)]
> [   OK ] Compressor/CompressorTest.compress_16384/2 (169 ms)
> [ RUN  ] Compressor/CompressorTest.compress_16384/3
> [plugin zstd (zstd)]
> [   OK ] Compressor/CompressorTest.compress_16384/3 (4528 ms)
> 
> It's an order of magnitude slower than zlib or snappy, which probably 
> isn't acceptable--even if it is a bit smaller.

Update!  Zstd developer Yann Collet debugged this and it turns out it was 
a build issue, fixed by https://github.com/ceph/ceph/pull/18879/files 
(missing quotes!  yeesh).  The results now look quite good!

[ RUN  ] Compressor/CompressorTest.compress_16384/0
[plugin zlib (zlib/isal)]
[   OK ] Compressor/CompressorTest.compress_16384/0 (370 ms)
[ RUN  ] Compressor/CompressorTest.compress_16384/1
[plugin zlib (zlib/noisal)]
[   OK ] Compressor/CompressorTest.compress_16384/1 (1926 ms)
[ RUN  ] Compressor/CompressorTest.compress_16384/2
[plugin snappy (snappy)]
[   OK ] Compressor/CompressorTest.compress_16384/2 (163 ms)
[ RUN  ] Compressor/CompressorTest.compress_16384/3
[plugin zstd (zstd)]
[   OK ] Compressor/CompressorTest.compress_16384/3 (723 ms)

Not as fast as snappy, but somewhere between Intel-accelerated zlib and 
non-accelerated zlib, with better compression ratios.

Also, the zstd compression level is currently hard-coded to level 5.  
That should be fixed at some point.

We can backport this to luminous so it's available in 12.2.3.
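
For anyone wanting to reproduce the numbers above: they come from the gtest
binary built in the ceph tree, run along the lines of the following (the exact
binary name and path depend on your build):

cd build
./bin/unittest_compression --gtest_filter='Compressor/CompressorTest.compress_16384/*'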

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Moving bluestore WAL and DB after bluestore creation

2017-11-12 Thread Shawn Edwards
I've created some Bluestore OSD with all data (wal, db, and data) all on
the same rotating disk.  I would like to now move the wal and db onto an
nvme disk.  Is that possible without re-creating the OSD?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread gjprabu




Hi David,

Thanks for your valuable reply , once complete the backfilling for new osd
and will consider by increasing replica value asap. Is it possible to
decrease the metadata pg count ?  if the pg count for metadata for value
same as data count what kind of issue may occur ?

Regards
PrabuGJ


 On Sun, 12 Nov 2017 21:25:05 +0530  David Turner wrote 

What's the output of `ceph df` to see if your PG counts are good or not?
Like everyone else has said, the space on the original osds can't be expected
to free up until the backfill from adding the new osd has finished.

You don't have anything in your cluster health to indicate that your cluster
will not be able to finish this backfilling operation on its own.

You might find this URL helpful in calculating your PG counts.
http://ceph.com/pgcalc/  As a side note. It is generally better to keep your
PG counts as base 2 numbers (16, 64, 256, etc). When you do not have a base 2
number then some of your PGs will take up twice as much space as others. In
your case with 250, you have 244 PGs that are the same size and 6 PGs that are
twice the size of those 244 PGs.  Bumping that up to 256 will even things out.

Assuming that the metadata pool is for a CephFS volume, you do not need nearly
so many PGs for that pool. Also, I would recommend changing at least the
metadata pool to 3 replica_size. If we can talk you into 3 replica for
everything else, great! But if not, at least do the metadata pool. If you lose
an object in the data pool, you just lose that file. If you lose an object in
the metadata pool, you might lose access to the entire CephFS volume.

On Sun, Nov 12, 2017, 9:39 AM gjprabu  wrote:

Hi Cassiano,

   Thanks for your valuable feedback and will wait for some time till new
osd sync get complete. Also for by increasing pg count it is the issue will
solve? our setup pool size for data and metadata pg number is 250. Is this
correct for 7 OSD with 2 replica. Also currently stored data size is 17TB.

ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL %USE  VAR  PGS
0 3.29749  1.0  3376G  2814G  562G 83.35 1.23 165
1 3.26869  1.0  3347G  1923G 1423G 57.48 0.85 152
2 3.27339  1.0  3351G  1980G 1371G 59.10 0.88 161
3 3.24089  1.0  3318G  2131G 1187G 64.23 0.95 168
4 3.24089  1.0  3318G  2998G  319G 90.36 1.34 176
5 3.32669  1.0  3406G  2476G  930G 72.68 1.08 165
6 3.27800  1.0  3356G  1518G 1838G 45.24 0.67 166
  TOTAL 23476G 15843G 7632G 67.49
MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53

ceph osd tree
ID WEIGHT   TYPE NAME    UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 22.92604 root default
-2  3.29749 host intcfs-osd1
0  3.29749 osd.0 up  1.0  1.0
-3  3.26869 host intcfs-osd2
1  3.26869 osd.1 up  1.0  1.0
-4  3.27339 host intcfs-osd3
2  3.27339 osd.2 up  1.0  1.0
-5  3.24089 host intcfs-osd4
3  3.24089 osd.3 up  1.0  1.0
-6  3.24089 host intcfs-osd5
4  3.24089 osd.4 up  1.0  1.0
-7  3.32669 host intcfs-osd6
5  3.32669 osd.5 up  1.0  1.0
-8  3.27800 host intcfs-osd7
6  3.27800 osd.6 up  1.0  1.0

ceph osd pool ls detail
pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags hashpspool
crash_replay_interval 45 stripe_width 0
pool 4 'downloads_metadata' replicated size 2 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 250 pgp_num 250 last_change 36 flags hashpspool
stripe_width 0

Regards
Prabu GJ

 On Sun, 12 Nov 2017 19:20:34 +0530 Cassiano Pilipavicius wrote 

I am also not an expert, but it looks like you have big data volumes on few
PGs, from what I've seen, the pg data is only deleted from the old OSD when is
completed copied to the new osd. So, if 1 pg have 100G por example, only when
it is fully copied to the new OSD, the space will be released on the old OSD.
If you have a busy cluster/network, it may take a good while. Maybe just wait
a litle and check from time to time and the space will eventually be released.

On 11/12/2017 11:44 AM, Sébastien VIGNERON wrote:
___
ceph-users mailing list
ceph-users@lists.ceph.com

Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread David Turner
What's the output of `ceph df` to see if your PG counts are good or not?
Like everyone else has said, the space on the original osds can't be
expected to free up until the backfill from adding the new osd has finished.

You don't have anything in your cluster health to indicate that your
cluster will not be able to finish this backfilling operation on its own.

You might find this URL helpful in calculating your PG counts.
http://ceph.com/pgcalc/  As a side note. It is generally better to keep
your PG counts as base 2 numbers (16, 64, 256, etc). When you do not have a
base 2 number then some of your PGs will take up twice as much space as
others. In your case with 250, you have 244 PGs that are the same size and
6 PGs that are twice the size of those 244 PGs.  Bumping that up to 256
will even things out.

Assuming that the metadata pool is for a CephFS volume, you do not need
nearly so many PGs for that pool. Also, I would recommend changing at least
the metadata pool to 3 replica_size. If we can talk you into 3 replica for
everything else, great! But if not, at least do the metadata pool. If you
lose an object in the data pool, you just lose that file. If you lose an
object in the metadata pool, you might lose access to the entire CephFS
volume.
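
Concretely that would be something like the following, using the pool names
from your `ceph osd pool ls detail` output (remember pg_num can only ever be
increased, never decreased):

ceph osd pool set downloads_data pg_num 256
ceph osd pool set downloads_data pgp_num 256
ceph osd pool set downloads_metadata size 3
ceph osd pool set downloads_metadata min_size 2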

On Sun, Nov 12, 2017, 9:39 AM gjprabu  wrote:

> Hi Cassiano,
>
>Thanks for your valuable feedback and will wait for some time till
> new osd sync get complete. Also for by increasing pg count it is the issue
> will solve? our setup pool size for data and metadata pg number is 250. Is
> this correct for 7 OSD with 2 replica. Also currently stored data size is
> 17TB.
>
> ceph osd df
> ID WEIGHT  REWEIGHT SIZE   USEAVAIL %USE  VAR  PGS
> 0 3.29749  1.0  3376G  2814G  562G 83.35 1.23 165
> 1 3.26869  1.0  3347G  1923G 1423G 57.48 0.85 152
> 2 3.27339  1.0  3351G  1980G 1371G 59.10 0.88 161
> 3 3.24089  1.0  3318G  2131G 1187G 64.23 0.95 168
> 4 3.24089  1.0  3318G  2998G  319G 90.36 1.34 176
> 5 3.32669  1.0  3406G  2476G  930G 72.68 1.08 165
> 6 3.27800  1.0  3356G  1518G 1838G 45.24 0.67 166
>   TOTAL 23476G 15843G 7632G 67.49
> MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53
>
> ceph osd tree
> ID WEIGHT   TYPE NAMEUP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 22.92604 root default
> -2  3.29749 host intcfs-osd1
> 0  3.29749 osd.0 up  1.0  1.0
> -3  3.26869 host intcfs-osd2
> 1  3.26869 osd.1 up  1.0  1.0
> -4  3.27339 host intcfs-osd3
> 2  3.27339 osd.2 up  1.0  1.0
> -5  3.24089 host intcfs-osd4
> 3  3.24089 osd.3 up  1.0  1.0
> -6  3.24089 host intcfs-osd5
> 4  3.24089 osd.4 up  1.0  1.0
> -7  3.32669 host intcfs-osd6
> 5  3.32669 osd.5 up  1.0  1.0
> -8  3.27800 host intcfs-osd7
> 6  3.27800 osd.6 up  1.0  1.0
>
> ceph osd pool ls detail
>
> pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
> pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0
> object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags
> hashpspool crash_replay_interval 45 stripe_width 0
> pool 4 'downloads_metadata' replicated size 2 min_size 1 crush_ruleset
> 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 36 flags
> hashpspool stripe_width 0
>
> Regards
> Prabu GJ
>
>  On Sun, 12 Nov 2017 19:20:34 +0530 Cassiano Pilipavicius wrote 
>
> I am also not an expert, but it looks like you have big data volumes on
> few PGs, from what I've seen, the pg data is only deleted from the old OSD
> when is completed copied to the new osd.
>
> So, if 1 pg have 100G por example, only when it is fully copied to the new
> OSD, the space will be released on the old OSD.
>
> If you have a busy cluster/network, it may take a good while. Maybe just
> wait a litle and check from time to time and the space will eventually be
> released.
>
> Em 11/12/2017 11:44 AM, Sébastien VIGNERON escreveu:
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> I’m not an expert either so if someone in the list have some ideas on this
> problem, don’t be shy, share them with us.
>
> For now, I only have hypothese that the OSD space will be recovered as
> soon as the recovery process is complete.
> Hope everything will get back in order soon (before reaching 95% or above).
>
> I saw some messages on the list about the fstrim tool which can help
> reclaim unused free space, but i don’t know if it’s apply to your case.
>
> Cordialement / Best regards,
>
> Sébastien VIGNERON
> 

Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread Cassiano Pilipavicius
I think that more PGs help to distribute the data more evenly, but I 
don't know if it's recommended with a low OSD count. I remember reading 
somewhere in the docs a guideline for the maximum number of PGs per OSD, 
but it was from a really old ceph version, so maybe things have changed.



On 11/12/2017 12:39 PM, gjprabu wrote:

Hi Cassiano,

       Thanks for your valuable feedback and will wait for some time 
till new osd sync get complete. Also for by increasing pg count it is 
the issue will solve? our setup pool size for data and metadata pg 
number is 250. Is this correct for 7 OSD with 2 replica. Also 
currently stored data size is 17TB.


ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL %USE  VAR  PGS
0 3.29749  1.0  3376G  2814G  562G 83.35 1.23 165
1 3.26869  1.0  3347G  1923G 1423G 57.48 0.85 152
2 3.27339  1.0  3351G  1980G 1371G 59.10 0.88 161
3 3.24089  1.0  3318G  2131G 1187G 64.23 0.95 168
4 3.24089  1.0  3318G  2998G  319G 90.36 1.34 176
5 3.32669  1.0  3406G  2476G  930G 72.68 1.08 165
6 3.27800  1.0  3356G  1518G 1838G 45.24 0.67 166
  TOTAL 23476G 15843G 7632G 67.49
MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53

ceph osd tree
ID WEIGHT   TYPE NAME    UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 22.92604 root default
-2  3.29749 host intcfs-osd1
0  3.29749 osd.0 up  1.0 1.0
-3  3.26869 host intcfs-osd2
1  3.26869 osd.1 up  1.0 1.0
-4  3.27339 host intcfs-osd3
2  3.27339 osd.2 up  1.0 1.0
-5  3.24089 host intcfs-osd4
3  3.24089 osd.3 up  1.0 1.0
-6  3.24089 host intcfs-osd5
4  3.24089 osd.4 up  1.0 1.0
-7  3.32669 host intcfs-osd6
5  3.32669 osd.5 up  1.0 1.0
-8  3.27800 host intcfs-osd7
6  3.27800 osd.6 up  1.0 1.0

ceph osd pool ls detail

pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash 
rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool 
stripe_width 0
pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0 
object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags 
hashpspool crash_replay_interval 45 stripe_width 0
pool 4 'downloads_metadata' replicated size 2 min_size 1 
crush_ruleset 0 object_hash rjenkins pg_num 250 pgp_num 250 
last_change 36 flags hashpspool stripe_width 0


Regards
Prabu GJ

 On Sun, 12 Nov 2017 19:20:34 +0530 Cassiano Pilipavicius wrote 


I am also not an expert, but it looks like you have big data
volumes on few PGs, from what I've seen, the pg data is only
deleted from the old OSD when is completed copied to the new osd.

So, if 1 pg have 100G por example, only when it is fully copied to
the new OSD, the space will be released on the old OSD.

If you have a busy cluster/network, it may take a good while.
Maybe just wait a litle and check from time to time and the space
will eventually be released.


On 11/12/2017 11:44 AM, Sébastien VIGNERON wrote:


___
ceph-users mailing list
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

I’m not an expert either so if someone in the list have some
ideas on this problem, don’t be shy, share them with us.

For now, I only have hypothese that the OSD space will be
recovered as soon as the recovery process is complete.
Hope everything will get back in order soon (before reaching
95% or above).

I saw some messages on the list about the fstrim tool which
can help reclaim unused free space, but i don’t know if it’s
apply to your case.

Cordialement / Best regards,

Sébastien VIGNERON
CRIANN,
Ingénieur / Engineer
Technopôle du Madrillet
745, avenue de l'Université
76800 Saint-Etienne du Rouvray - France
tél. +33 2 32 91 42 91
fax. +33 2 32 91 42 92
http://www.criann.fr
mailto:sebastien.vigne...@criann.fr
support: supp...@criann.fr 

On 12 Nov 2017 at 13:29, gjprabu wrote:

Hi Sebastien,

    Below is the query details. I am not that much expert
and still learning . pg's are not stuck stat before adding
osd and pg are slowly clearing stat to active-clean. Today
morning there was around
53 active+undersized+degraded+remapped+wait_backfill and
now it is 21 only, hope its going on and i am seeing the
space keep increasing in newly added OSD (osd.6)


ID WEIGHT  REWEIGHT SIZE   USE    AVAIL %USE  VAR  PGS
*0 3.29749  1.0  3376G  2814G  562G 

Re: [ceph-users] No ops on some OSD

2017-11-12 Thread Marc Roos

[@c03 ~]# ceph osd status
2017-11-12 15:54:13.164823 7f478a6ad700 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-11-12 15:54:13.211219 7f478a6ad700 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
no valid command found; 10 closest matches:
osd map   {}
osd lspools {}
osd count-metadata 
osd versions
osd find 
osd metadata {}
osd getmaxosd
osd ls-tree {} {}
osd getmap {}
osd getcrushmap {}
Error EINVAL: invalid command
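
If I remember right, on Luminous `ceph osd status` is implemented in ceph-mgr
rather than the mon, so it only works with a running mgr that has the status
module loaded; worth checking something like this (module name from memory):

ceph mgr module ls
ceph mgr module enable status
ceph osd status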



-Original Message-
From: I Gede Iswara Darmawan [mailto:iswaradr...@gmail.com] 
Sent: zondag 12 november 2017 2:17
Cc: ceph-users
Subject: Re: [ceph-users] No ops on some OSD

Still the same syntax (ceph osd status)

Thanks

Regards,

I Gede Iswara Darmawan

Information System - School of Industrial and System Engineering

Telkom University

P / SMS / WA : 081 322 070719 

E : iswaradr...@gmail.com / iswaradr...@live.com


On Sat, Nov 4, 2017 at 6:11 PM, Marc Roos  
wrote:




What is the new syntax for "ceph osd status" for luminous?





-Original Message-
From: I Gede Iswara Darmawan [mailto:iswaradr...@gmail.com]
Sent: donderdag 2 november 2017 6:19
To: ceph-users@lists.ceph.com
Subject: [ceph-users] No ops on some OSD

Hello,

I want to ask about my problem. There are some OSDs that don't have any
load (indicated with "No ops" on that OSD).

Hereby I attached the ceph osd status result:
https://pastebin.com/fFLcCbpk . Look at OSDs 17, 61 and 72. There is no
load or operation happening on those OSDs. How can this be fixed?

Thank you
Regards,

I Gede Iswara Darmawan

Information System - School of Industrial and System Engineering

Telkom University

P / SMS / WA : 081 322 070719

E : iswaradr...@gmail.com / iswaradr...@live.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
 




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread gjprabu
Hi Cassiano,



   Thanks for your valuable feedback; we will wait for some time until the new 
osd sync is complete. Also, will increasing the pg count solve the issue? 
Our setup's pg number for the data and metadata pools is 250. Is this 
correct for 7 OSDs with 2 replicas? Also, the currently stored data size is 17TB.



ceph osd df

ID WEIGHT  REWEIGHT SIZE   USEAVAIL %USE  VAR  PGS

0 3.29749  1.0  3376G  2814G  562G 83.35 1.23 165

1 3.26869  1.0  3347G  1923G 1423G 57.48 0.85 152

2 3.27339  1.0  3351G  1980G 1371G 59.10 0.88 161

3 3.24089  1.0  3318G  2131G 1187G 64.23 0.95 168

4 3.24089  1.0  3318G  2998G  319G 90.36 1.34 176

5 3.32669  1.0  3406G  2476G  930G 72.68 1.08 165

6 3.27800  1.0  3356G  1518G 1838G 45.24 0.67 166

  TOTAL 23476G 15843G 7632G 67.49 

MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53



ceph osd tree

ID WEIGHT   TYPE NAMEUP/DOWN REWEIGHT PRIMARY-AFFINITY

-1 22.92604 root default  

-2  3.29749 host intcfs-osd1  

0  3.29749 osd.0 up  1.0  1.0

-3  3.26869 host intcfs-osd2  

1  3.26869 osd.1 up  1.0  1.0

-4  3.27339 host intcfs-osd3  

2  3.27339 osd.2 up  1.0  1.0

-5  3.24089 host intcfs-osd4  

3  3.24089 osd.3 up  1.0  1.0

-6  3.24089 host intcfs-osd5  

4  3.24089 osd.4 up  1.0  1.0

-7  3.32669 host intcfs-osd6  

5  3.32669 osd.5 up  1.0  1.0

-8  3.27800 host intcfs-osd7  

6  3.27800 osd.6 up  1.0  1.0



ceph osd pool ls detail



pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins 
pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0

pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0 
object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags hashpspool 
crash_replay_interval 45 stripe_width 0

pool 4 'downloads_metadata' replicated size 2 min_size 1 crush_ruleset 0 
object_hash rjenkins pg_num 250 pgp_num 250 last_change 36 flags hashpspool 
stripe_width 0



Regards

Prabu GJ


 On Sun, 12 Nov 2017 19:20:34 +0530 Cassiano Pilipavicius 
cassi...@tips.com.br wrote 




I am also not an expert, but it looks like you have big data volumes on few 
PGs, from what I've seen, the pg data is only deleted from the old OSD when is 
completed copied to the new osd.

So, if 1 pg have 100G por example, only when it is fully copied to the new OSD, 
the space will be released on the old OSD.

If you have a busy cluster/network, it may take a good while. Maybe just wait a 
litle and check from time to time and the space will eventually be released.



On 11/12/2017 11:44 AM, Sébastien VIGNERON wrote:





___

ceph-users mailing list 

ceph-users@lists.ceph.com 

http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 


I’m not an expert either so if someone in the list have some ideas on this 
problem, don’t be shy, share them with us. 



For now, I only have hypothese that the OSD space will be recovered as soon as 
the recovery process is complete. 

Hope everything will get back in order soon (before reaching 95% or above).



I saw some messages on the list about the fstrim tool which can help reclaim 
unused free space, but i don’t know if it’s apply to your case.



Cordialement / Best regards,

 

 Sébastien VIGNERON 

 CRIANN, 

 Ingénieur / Engineer

 Technopôle du Madrillet 

 745, avenue de l'Université 

 76800 Saint-Etienne du Rouvray - France 

 tél. +33 2 32 91 42 91 

 fax. +33 2 32 91 42 92 

 http://www.criann.fr 

 mailto:sebastien.vigne...@criann.fr

 support: supp...@criann.fr




On 12 Nov 2017 at 13:29, gjprabu gjpr...@zohocorp.com wrote:



Hi Sebastien,



Below is the query details. I am not that much expert and still learning . 
pg's are not stuck stat before adding osd and pg are slowly clearing stat to 
active-clean. Today morning there was around 53 
active+undersized+degraded+remapped+wait_backfill and now it is 21 only, hope 
its going on and i am seeing the space keep increasing in newly added OSD 
(osd.6) 





ID WEIGHT  REWEIGHT SIZE   USEAVAIL %USE  VAR  PGS 

0 3.29749  1.0  3376G  2814G  562G 83.35 1.23 165  ( Available Spaces not 
reduced after adding new OSD)

1 3.26869  1.0  3347G  1923G 1423G 57.48 0.85 152

2 3.27339  1.0  3351G  1980G 1371G 59.10 0.88 161

3 3.24089  1.0  3318G  2131G 1187G 64.23 0.95 168

4 3.24089  1.0  3318G  2998G  319G 90.36 1.34 176  ( Available Spaces not 
reduced 

Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread Cassiano Pilipavicius
I am also not an expert, but it looks like you have big data volumes on 
few PGs. From what I've seen, the pg data is only deleted from the old 
OSD when it has been completely copied to the new osd.


So, if 1 pg has 100G for example, the space will be released on the old 
OSD only when it is fully copied to the new OSD.


If you have a busy cluster/network, it may take a good while. Maybe just 
wait a little and check from time to time, and the space will eventually 
be released.
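
A few read-only commands make it easy to watch the progress from time to time
(just a sketch):

ceph -s                                  # overall health and recovery summary
ceph pg stat                             # one-line count of pg states
ceph osd df                              # per-OSD utilisation, to watch space being freed
ceph pg dump pgs_brief | grep backfill   # which pgs are still backfilling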



On 11/12/2017 11:44 AM, Sébastien VIGNERON wrote:
I’m not an expert either so if someone in the list have some ideas on 
this problem, don’t be shy, share them with us.


For now, I only have hypothese that the OSD space will be recovered as 
soon as the recovery process is complete.
Hope everything will get back in order soon (before reaching 95% or 
above).


I saw some messages on the list about the fstrim tool which can help 
reclaim unused free space, but i don’t know if it’s apply to your case.


Cordialement / Best regards,

Sébastien VIGNERON
CRIANN,
Ingénieur / Engineer
Technopôle du Madrillet
745, avenue de l'Université
76800 Saint-Etienne du Rouvray - France
tél. +33 2 32 91 42 91
fax. +33 2 32 91 42 92
http://www.criann.fr
mailto:sebastien.vigne...@criann.fr
support: supp...@criann.fr

On 12 Nov 2017 at 13:29, gjprabu wrote:


Hi Sebastien,

    Below is the query details. I am not that much expert and still 
learning . pg's are not stuck stat before adding osd and pg are 
slowly clearing stat to active-clean. Today morning there was around 
53 active+undersized+degraded+remapped+wait_backfill and now it is 21 
only, hope its going on and i am seeing the space keep increasing in 
newly added OSD (osd.6)



ID WEIGHT  REWEIGHT SIZE   USE    AVAIL %USE  VAR  PGS
0 3.29749  1.0  3376G  2814G  562G 83.35 1.23 165  ( Available Spaces not reduced after adding new OSD)
1 3.26869  1.0  3347G  1923G 1423G 57.48 0.85 152
2 3.27339  1.0  3351G  1980G 1371G 59.10 0.88 161
3 3.24089  1.0  3318G  2131G 1187G 64.23 0.95 168
4 3.24089  1.0  3318G  2998G  319G 90.36 1.34 176  ( Available Spaces not reduced after adding new OSD)
5 3.32669  1.0  3406G  2476G  930G 72.68 1.08 165  ( Available Spaces not reduced after adding new OSD)
6 3.27800  1.0  3356G  1518G 1838G 45.24 0.67 166
  TOTAL 23476G 15843G 7632G 67.49
MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53

...



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread gjprabu
Hi,



Thanks Sébastien. If anybody can help with this issue it will be highly
appreciated.



Regards

Prabu GJ







 On Sun, 12 Nov 2017 19:14:02 +0530 Sébastien VIGNERON 
sebastien.vigne...@criann.fr wrote 




I’m not an expert either so if someone in the list have some ideas on this 
problem, don’t be shy, share them with us.



For now, I only have hypothese that the OSD space will be recovered as soon as 
the recovery process is complete. 

Hope everything will get back in order soon (before reaching 95% or above).



I saw some messages on the list about the fstrim tool which can help reclaim 
unused free space, but i don’t know if it’s apply to your case.



Cordialement / Best regards,



Sébastien VIGNERON 

CRIANN, 

Ingénieur / Engineer

Technopôle du Madrillet 

745, avenue de l'Université 

76800 Saint-Etienne du Rouvray - France 

tél. +33 2 32 91 42 91 

fax. +33 2 32 91 42 92 

http://www.criann.fr 

mailto:sebastien.vigne...@criann.fr

support: supp...@criann.fr




On 12 Nov 2017 at 13:29, gjprabu gjpr...@zohocorp.com wrote:



Hi Sebastien,



Below is the query details. I am not that much expert and still learning . 
pg's are not stuck stat before adding osd and pg are slowly clearing stat to 
active-clean. Today morning there was around 53 
active+undersized+degraded+remapped+wait_backfill and now it is 21 only, hope 
its going on and i am seeing the space keep increasing in newly added OSD 
(osd.6) 





ID WEIGHT  REWEIGHT SIZE   USE    AVAIL %USE  VAR  PGS
0  3.29749  1.0  3376G  2814G  562G 83.35 1.23 165  (available space not reduced after adding new OSD)
1  3.26869  1.0  3347G  1923G 1423G 57.48 0.85 152
2  3.27339  1.0  3351G  1980G 1371G 59.10 0.88 161
3  3.24089  1.0  3318G  2131G 1187G 64.23 0.95 168
4  3.24089  1.0  3318G  2998G  319G 90.36 1.34 176  (available space not reduced after adding new OSD)
5  3.32669  1.0  3406G  2476G  930G 72.68 1.08 165  (available space not reduced after adding new OSD)
6  3.27800  1.0  3356G  1518G 1838G 45.24 0.67 166
   TOTAL        23476G 15843G  7632G 67.49
MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53




...







___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread Sébastien VIGNERON
I’m not an expert either, so if someone on the list has some ideas about this 
problem, don’t be shy, share them with us.

For now, I only have the hypothesis that the OSD space will be recovered as 
soon as the recovery process is complete. 
Hope everything will get back in order soon (before reaching 95% or above).
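
To keep an eye on the thresholds themselves (on Jewel they are part of the PG 
map), something like this should show them:

ceph health detail | grep -i full
ceph pg dump | head | grep -i ratio

There are also the ceph pg set_nearfull_ratio / set_full_ratio commands on 
Jewel, but I would only touch those as a short-lived last resort, not as a fix.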

I saw some messages on the list about the fstrim tool, which can help reclaim 
unused free space, but I don’t know if it applies to your case.
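
Another option, if osd.0 and osd.4 keep creeping toward the full ratio while 
backfill runs, is to lower their reweight a little so CRUSH moves some PGs off 
them. This causes extra data movement, so it is only a suggestion; the dry run 
first (if your release has the test- command):

ceph osd test-reweight-by-utilization   # dry run, shows what would change
ceph osd reweight-by-utilization        # apply it
ceph osd reweight 4 0.90                # or adjust a single OSD by hand (0.0-1.0)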

Cordialement / Best regards,

Sébastien VIGNERON 
CRIANN, 
Ingénieur / Engineer
Technopôle du Madrillet 
745, avenue de l'Université 
76800 Saint-Etienne du Rouvray - France 
tél. +33 2 32 91 42 91 
fax. +33 2 32 91 42 92 
http://www.criann.fr 
mailto:sebastien.vigne...@criann.fr
support: supp...@criann.fr

> Le 12 nov. 2017 à 13:29, gjprabu  a écrit :
> 
> Hi Sebastien,
> 
> Below are the query details. I am not much of an expert and am still 
> learning. The PGs were not stuck before adding the OSD, and they are slowly 
> moving to the active+clean state. This morning there were around 53 PGs in 
> active+undersized+degraded+remapped+wait_backfill and now there are only 21, 
> so it seems to be progressing, and I can see the used space keep increasing 
> on the newly added OSD (osd.6) 
> 
> 
> ID WEIGHT  REWEIGHT SIZE   USEAVAIL %USE  VAR  PGS 
> 0 3.29749  1.0  3376G  2814G  562G 83.35 1.23 165  ( Available Spaces not 
> reduced after adding new OSD)
> 1 3.26869  1.0  3347G  1923G 1423G 57.48 0.85 152
> 2 3.27339  1.0  3351G  1980G 1371G 59.10 0.88 161
> 3 3.24089  1.0  3318G  2131G 1187G 64.23 0.95 168
> 4 3.24089  1.0  3318G  2998G  319G 90.36 1.34 176  ( Available Spaces not 
> reduced after adding new OSD)
> 5 3.32669  1.0  3406G  2476G  930G 72.68 1.08 165  ( Available Spaces not 
> reduced after adding new OSD)
> 6 3.27800  1.0  3356G  1518G 1838G 45.24 0.67 166
>   TOTAL 23476G 15843G 7632G 67.49 
> MIN/MAX VAR: 0.67/1.34  STDDEV: 14.53
...

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread Sébastien VIGNERON
Hi,

Have you tried querying the pg state for some of the stuck or undersized pgs? 
Maybe some OSD daemons are not healthy and are blocking the reconstruction.

ceph pg 3.be query
ceph pg 4.d4 query
ceph pg 4.8c query

http://docs.ceph.com/docs/jewel/rados/troubleshooting/troubleshooting-pg/
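
To go through all of the stuck pgs at once, a small loop like this (a rough 
sketch, adjust the output path) can collect each query result for inspection:

ceph pg dump_stuck unclean
for pg in $(ceph pg dump_stuck unclean 2>/dev/null | awk '$1 ~ /^[0-9]+\./ {print $1}'); do
    ceph pg "$pg" query > "/tmp/pg-query-$pg.json"
done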

Cordialement / Best regards,

Sébastien VIGNERON 
CRIANN, 
Ingénieur / Engineer
Technopôle du Madrillet 
745, avenue de l'Université 
76800 Saint-Etienne du Rouvray - France 
tél. +33 2 32 91 42 91 
fax. +33 2 32 91 42 92 
http://www.criann.fr 
mailto:sebastien.vigne...@criann.fr
support: supp...@criann.fr

> Le 12 nov. 2017 à 10:59, gjprabu  a écrit :
> 
> Hi Sebastien
> 
>  Thanks for your reply. Yes, there are undersized pgs and recovery is in 
> process because we added a new osd after getting the "2 OSDs near full" 
> warning. Yes, the newly added osd is rebalancing the data.
> 
> 
> [root@intcfs-osd6 ~]# ceph osd df
> ID WEIGHT  REWEIGHT SIZE   USEAVAIL %USE  VAR  PGS
> 0 3.29749  1.0  3376G  2875G  501G 85.15 1.26 165
> 1 3.26869  1.0  3347G  1923G 1423G 57.46 0.85 152
> 2 3.27339  1.0  3351G  1980G 1371G 59.08 0.88 161
> 3 3.24089  1.0  3318G  2130G 1187G 64.21 0.95 168
> 4 3.24089  1.0  3318G  2997G  320G 90.34 1.34 176
> 5 3.32669  1.0  3406G  2466G  939G 72.42 1.07 165
> 6 3.27800  1.0  3356G  1463G 1893G 43.60 0.65 166  
> 
> ceph osd crush rule dump
> 
> [
> {
> "rule_id": 0,
> "rule_name": "replicated_ruleset",
> "ruleset": 0,
> "type": 1,
> "min_size": 1,
> "max_size": 10,
> "steps": [
> {
> "op": "take",
> "item": -1,
> "item_name": "default"
> },
> {
> "op": "chooseleaf_firstn",
> "num": 0,
> "type": "host"
> },
> {
> "op": "emit"
> }
> ]
> }
> ]
> 
> 
> ceph version 10.2.2 and ceph version 10.2.9
> 
> 
> ceph osd pool ls detail
> 
> pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash 
> rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
> pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0 
> object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags hashpspool 
> crash_replay_interval 45 stripe_width 0
> pool 4 'downloads_metadata' replicated size 2 min_size 1 crush_ruleset 0 
> object_hash rjenkins pg_num 250 pgp_num 250 last_change 36 flags hashpspool 
> stripe_width 0
> 
> 
>  On Sun, 12 Nov 2017 15:04:02 +0530 Sébastien VIGNERON 
> sebastien.vigne...@criann.fr wrote 
> 
> 
> Hi,
> 
> Can you share:
>  - your placement rules: ceph osd crush rule dump
>  - your CEPH version: ceph versions
>  - your pools definitions: ceph osd pool ls detail
> 
> With these we can determine whether your pgs are stuck because of a 
> misconfiguration or something else.
> 
> You seem to have some undersized pgs and a recovery in progress. Do your 
> OSDs show some rebalancing of your data? Does your OSDs' usage percentage 
> change over time? (changes in "ceph osd df")
> 
> Cordialement / Best regards,
> 
> Sébastien VIGNERON 
> CRIANN, 
> Ingénieur / Engineer
> Technopôle du Madrillet 
> 745, avenue de l'Université 
> 76800 Saint-Etienne du Rouvray - France 
> tél. +33 2 32 91 42 91 
> fax. +33 2 32 91 42 92 
> http://www.criann.fr  
> mailto:sebastien.vigne...@criann.fr 
> support: supp...@criann.fr 
> 
> Le 12 nov. 2017 à 10:04, gjprabu gjpr...@zohocorp.com a écrit :
> 
> Hi Team,
> 
>  We have a ceph setup with 6 OSDs and we got an alert that 2 OSDs are near 
> full. We are also facing slow access to ceph from the clients. So I have 
> added a 7th OSD, but 2 OSDs are still showing near full (OSD.0 and OSD.4), 
> and I have restarted the ceph service on osd.0 and osd.4. Kindly check the 
> ceph osd status below and please provide us with solutions. 
> 
> 
> # ceph health detail
> HEALTH_WARN 46 pgs backfill_wait; 1 pgs backfilling; 32 pgs degraded; 50 pgs 
> stuck unclean; 32 pgs undersized; recovery 1098780/40253637 objects degraded 
> (2.730%); recovery 3401433/40253637 objects misplaced (8.450%); 2 near full 
> osd(s); mds0: Client integ-hm3 failing to respond to cache pressure; mds0: 
> Client integ-hm8 failing to respond to cache pressure; mds0: Client integ-hm2 
> failing to respond to cache pressure; mds0: Client integ-hm9 failing to 
> respond to cache pressure; mds0: Client integ-hm5 failing to respond to cache 
> pressure; mds0: Client integ-hm9-bkp failing to respond to cache pressure; 
> mds0: Client me-build1-bkp failing to respond to cache pressure
> 
> pg 3.f6 is stuck unclean for 511223.069161, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [2]
> pg 4.f6 is stuck unclean for 

Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread gjprabu
Hi Sebastien



 Thanks for your reply. Yes, there are undersized pgs and recovery is in 
process because we added a new osd after getting the "2 OSDs near full" 
warning. Yes, the newly added osd is rebalancing the data.





[root@intcfs-osd6 ~]# ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL %USE  VAR  PGS
0  3.29749  1.0  3376G  2875G  501G 85.15 1.26 165
1  3.26869  1.0  3347G  1923G 1423G 57.46 0.85 152
2  3.27339  1.0  3351G  1980G 1371G 59.08 0.88 161
3  3.24089  1.0  3318G  2130G 1187G 64.21 0.95 168
4  3.24089  1.0  3318G  2997G  320G 90.34 1.34 176
5  3.32669  1.0  3406G  2466G  939G 72.42 1.07 165
6  3.27800  1.0  3356G  1463G 1893G 43.60 0.65 166



ceph osd crush rule dump

[
    {
        "rule_id": 0,
        "rule_name": "replicated_ruleset",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]





ceph version 10.2.2 and ceph version 10.2.9





ceph osd pool ls detail

pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 3 'downloads_data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 39 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 4 'downloads_metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 250 pgp_num 250 last_change 36 flags hashpspool stripe_width 0
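
If I am reading these numbers right, (64 + 250 + 250) pgs x 2 replicas / 7 OSDs 
is about 161 pg copies per OSD on average, which is close to the PGS column in 
the ceph osd df output above, so the new OSD seems to be taking its share.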





 On Sun, 12 Nov 2017 15:04:02 +0530 Sébastien VIGNERON 
sebastien.vigne...@criann.fr wrote 




Hi,



Can you share:

 - your placement rules: ceph osd crush rule dump

 - your CEPH version: ceph versions

 - your pools definitions: ceph osd pool ls detail



With these we can determine whether your pgs are stuck because of a 
misconfiguration or something else.



You seem to have some undersized pgs and a recovery in progress. Do your OSDs 
show some rebalancing of your data? Does your OSDs' usage percentage change 
over time? (changes in "ceph osd df")



Cordialement / Best regards,



Sébastien VIGNERON 

CRIANN, 

Ingénieur / Engineer

Technopôle du Madrillet 

745, avenue de l'Université 

76800 Saint-Etienne du Rouvray - France 

tél. +33 2 32 91 42 91 

fax. +33 2 32 91 42 92 

http://www.criann.fr 

mailto:sebastien.vigne...@criann.fr

support: supp...@criann.fr




Le 12 nov. 2017 à 10:04, gjprabu gjpr...@zohocorp.com a écrit :



Hi Team,



 We have a ceph setup with 6 OSDs and we got an alert that 2 OSDs are near 
full. We are also facing slow access to ceph from the clients. So I have added 
a 7th OSD, but 2 OSDs are still showing near full (OSD.0 and OSD.4), and I 
have restarted the ceph service on osd.0 and osd.4. Kindly check the ceph osd 
status below and please provide us with solutions. 





# ceph health detail

HEALTH_WARN 46 pgs backfill_wait; 1 pgs backfilling; 32 pgs degraded; 50 pgs 
stuck unclean; 32 pgs undersized; recovery 1098780/40253637 objects degraded 
(2.730%); recovery 3401433/40253637 objects misplaced (8.450%); 2 near full 
osd(s); mds0: Client integ-hm3 failing to respond to cache pressure; mds0: 
Client integ-hm8 failing to respond to cache pressure; mds0: Client integ-hm2 
failing to respond to cache pressure; mds0: Client integ-hm9 failing to respond 
to cache pressure; mds0: Client integ-hm5 failing to respond to cache pressure; 
mds0: Client integ-hm9-bkp failing to respond to cache pressure; mds0: Client 
me-build1-bkp failing to respond to cache pressure



pg 3.f6 is stuck unclean for 511223.069161, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 4.f6 is stuck unclean for 511232.770419, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 3.ec is stuck unclean for 510902.815668, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 3.eb is stuck unclean for 511285.576487, current state 
active+remapped+wait_backfill, last acting [3,0]

pg 4.17 is stuck unclean for 511235.326709, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [1]

pg 4.2f is stuck unclean for 511232.356371, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 4.3d is stuck unclean for 511300.446982, current state active+remapped, last 
acting [3,0]

pg 4.93 is stuck unclean for 511295.539229, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [3]

pg 3.47 is stuck unclean for 511288.104965, current state 
active+remapped+wait_backfill, last acting [3,0]

pg 4.d5 is stuck unclean for 510916.509825, current 

Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread Sébastien VIGNERON
Hi,

Can you share:
 - your placement rules: ceph osd crush rule dump
 - your CEPH version: ceph versions
 - your pools definitions: ceph osd pool ls detail

With these we can determine whether your pgs are stuck because of a 
misconfiguration or something else.
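
Once you have the CRUSH map, you can also exercise the rule offline with 
crushtool (a suggestion only, I have not run it against your map) to confirm 
it can always pick 2 distinct hosts:

ceph osd getcrushmap -o /tmp/crushmap.bin
crushtool -i /tmp/crushmap.bin --test --rule 0 --num-rep 2 --show-statistics

A "result size == 1" line in the statistics would mean CRUSH could not find 2 
hosts for some pgs.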

You seem to have some undersized pgs and a recovery in progress. Do your OSDs 
show some rebalancing of your data? Does your OSDs' usage percentage change 
over time? (changes in "ceph osd df")
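
To watch that over time, something simple like:

watch -n 60 ceph osd df

(or re-running "ceph osd df" every few minutes and comparing the USE / %USE 
columns) will show whether the new OSD fills up and the full ones drain.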

Cordialement / Best regards,

Sébastien VIGNERON 
CRIANN, 
Ingénieur / Engineer
Technopôle du Madrillet 
745, avenue de l'Université 
76800 Saint-Etienne du Rouvray - France 
tél. +33 2 32 91 42 91 
fax. +33 2 32 91 42 92 
http://www.criann.fr 
mailto:sebastien.vigne...@criann.fr
support: supp...@criann.fr

> Le 12 nov. 2017 à 10:04, gjprabu  a écrit :
> 
> Hi Team,
> 
>  We have a ceph setup with 6 OSDs and we got an alert that 2 OSDs are near 
> full. We are also facing slow access to ceph from the clients. So I have 
> added a 7th OSD, but 2 OSDs are still showing near full (OSD.0 and OSD.4), 
> and I have restarted the ceph service on osd.0 and osd.4. Kindly check the 
> ceph osd status below and please provide us with solutions. 
> 
> 
> # ceph health detail
> HEALTH_WARN 46 pgs backfill_wait; 1 pgs backfilling; 32 pgs degraded; 50 pgs 
> stuck unclean; 32 pgs undersized; recovery 1098780/40253637 objects degraded 
> (2.730%); recovery 3401433/40253637 objects misplaced (8.450%); 2 near full 
> osd(s); mds0: Client integ-hm3 failing to respond to cache pressure; mds0: 
> Client integ-hm8 failing to respond to cache pressure; mds0: Client integ-hm2 
> failing to respond to cache pressure; mds0: Client integ-hm9 failing to 
> respond to cache pressure; mds0: Client integ-hm5 failing to respond to cache 
> pressure; mds0: Client integ-hm9-bkp failing to respond to cache pressure; 
> mds0: Client me-build1-bkp failing to respond to cache pressure
> 
> pg 3.f6 is stuck unclean for 511223.069161, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [2]
> pg 4.f6 is stuck unclean for 511232.770419, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [2]
> pg 3.ec is stuck unclean for 510902.815668, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [2]
> pg 3.eb is stuck unclean for 511285.576487, current state 
> active+remapped+wait_backfill, last acting [3,0]
> pg 4.17 is stuck unclean for 511235.326709, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [1]
> pg 4.2f is stuck unclean for 511232.356371, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [2]
> pg 4.3d is stuck unclean for 511300.446982, current state active+remapped, 
> last acting [3,0]
> pg 4.93 is stuck unclean for 511295.539229, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [3]
> pg 3.47 is stuck unclean for 511288.104965, current state 
> active+remapped+wait_backfill, last acting [3,0]
> pg 4.d5 is stuck unclean for 510916.509825, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [2]
> pg 3.31 is stuck unclean for 511221.542878, current state 
> active+remapped+wait_backfill, last acting [0,3]
> pg 3.62 is stuck unclean for 511221.551662, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [4]
> pg 4.4d is stuck unclean for 511232.279602, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [2]
> pg 4.48 is stuck unclean for 510911.095367, current state 
> active+remapped+wait_backfill, last acting [5,4]
> pg 3.4f is stuck unclean for 511226.712285, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [1]
> pg 3.78 is stuck unclean for 511221.531199, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [2]
> pg 3.24 is stuck unclean for 510903.483324, current state 
> active+remapped+backfilling, last acting [1,2]
> pg 4.8c is stuck unclean for 511231.668693, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [1]
> pg 3.b4 is stuck unclean for 511222.612012, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [0]
> pg 4.41 is stuck unclean for 511287.031264, current state 
> active+remapped+wait_backfill, last acting [3,2]
> pg 3.d1 is stuck unclean for 510903.797329, current state 
> active+remapped+wait_backfill, last acting [0,3]
> pg 3.7f is stuck unclean for 511222.929722, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [1]
> pg 4.af is stuck unclean for 511262.494659, current state 
> active+undersized+degraded+remapped, last acting [0]
> pg 3.66 is stuck unclean for 510903.296711, current state 
> active+remapped+wait_backfill, last acting [3,0]
> pg 3.76 is stuck unclean for 511224.615144, current state 
> active+undersized+degraded+remapped+wait_backfill, last acting [3]

[ceph-users] OSD is near full and slow in accessing storage from client

2017-11-12 Thread gjprabu
Hi Team,



 We have a ceph setup with 6 OSDs and we got an alert that 2 OSDs are near 
full. We are also facing slow access to ceph from the clients. So I have added 
a 7th OSD, but 2 OSDs are still showing near full (OSD.0 and OSD.4), and I 
have restarted the ceph service on osd.0 and osd.4. Kindly check the ceph osd 
status below and please provide us with solutions. 





# ceph health detail

HEALTH_WARN 46 pgs backfill_wait; 1 pgs backfilling; 32 pgs degraded; 50 pgs 
stuck unclean; 32 pgs undersized; recovery 1098780/40253637 objects degraded 
(2.730%); recovery 3401433/40253637 objects misplaced (8.450%); 2 near full 
osd(s); mds0: Client integ-hm3 failing to respond to cache pressure; mds0: 
Client integ-hm8 failing to respond to cache pressure; mds0: Client integ-hm2 
failing to respond to cache pressure; mds0: Client integ-hm9 failing to respond 
to cache pressure; mds0: Client integ-hm5 failing to respond to cache pressure; 
mds0: Client integ-hm9-bkp failing to respond to cache pressure; mds0: Client 
me-build1-bkp failing to respond to cache pressure



pg 3.f6 is stuck unclean for 511223.069161, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 4.f6 is stuck unclean for 511232.770419, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 3.ec is stuck unclean for 510902.815668, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 3.eb is stuck unclean for 511285.576487, current state 
active+remapped+wait_backfill, last acting [3,0]

pg 4.17 is stuck unclean for 511235.326709, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [1]

pg 4.2f is stuck unclean for 511232.356371, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 4.3d is stuck unclean for 511300.446982, current state active+remapped, last 
acting [3,0]

pg 4.93 is stuck unclean for 511295.539229, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [3]

pg 3.47 is stuck unclean for 511288.104965, current state 
active+remapped+wait_backfill, last acting [3,0]

pg 4.d5 is stuck unclean for 510916.509825, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 3.31 is stuck unclean for 511221.542878, current state 
active+remapped+wait_backfill, last acting [0,3]

pg 3.62 is stuck unclean for 511221.551662, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [4]

pg 4.4d is stuck unclean for 511232.279602, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 4.48 is stuck unclean for 510911.095367, current state 
active+remapped+wait_backfill, last acting [5,4]

pg 3.4f is stuck unclean for 511226.712285, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [1]

pg 3.78 is stuck unclean for 511221.531199, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 3.24 is stuck unclean for 510903.483324, current state 
active+remapped+backfilling, last acting [1,2]

pg 4.8c is stuck unclean for 511231.668693, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [1]

pg 3.b4 is stuck unclean for 511222.612012, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [0]

pg 4.41 is stuck unclean for 511287.031264, current state 
active+remapped+wait_backfill, last acting [3,2]

pg 3.d1 is stuck unclean for 510903.797329, current state 
active+remapped+wait_backfill, last acting [0,3]

pg 3.7f is stuck unclean for 511222.929722, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [1]

pg 4.af is stuck unclean for 511262.494659, current state 
active+undersized+degraded+remapped, last acting [0]

pg 3.66 is stuck unclean for 510903.296711, current state 
active+remapped+wait_backfill, last acting [3,0]

pg 3.76 is stuck unclean for 511224.615144, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [3]

pg 4.57 is stuck unclean for 511234.514343, current state active+remapped, last 
acting [0,4]

pg 3.69 is stuck unclean for 511224.672085, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [4]

pg 3.9a is stuck unclean for 510967.30, current state 
active+remapped+wait_backfill, last acting [3,2]

pg 4.50 is stuck unclean for 510903.825565, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [1]

pg 4.53 is stuck unclean for 510921.975268, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 3.e7 is stuck unclean for 511221.530592, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [2]

pg 4.6a is stuck unclean for 510911.284877, current state 
active+undersized+degraded+remapped+wait_backfill, last acting [0]

pg 4.16 is stuck unclean for 511232.702762, current state