Re: [ceph-users] Intel SSD D3-S4510 and Intel SSD D3-S4610 firmware advisory notice

2019-04-19 Thread Irek Fasikhov
Wow!!!
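
As a quick check against the advisory quoted below, the installed firmware
revision can be read per drive (the device name is a placeholder; assumes
smartmontools is installed):

# print model and firmware revision of one drive; repeat for each OSD device
smartctl -i /dev/sda | grep -Ei 'model|firmware'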

Fri, 19 Apr 2019 at 10:16, Stefan Kooman :

> Hi List,
>
> TL;DR:
>
> For those of you who are running a Ceph cluster with Intel SSD D3-S4510
> and or Intel SSD D3-S4610 with firmware version XCV10100 please upgrade
> to firmware XCV10110 ASAP. At least before ~ 1700 power up hours.
>
> More information here:
>
>
> https://support.microsoft.com/en-us/help/4499612/intel-ssd-drives-unresponsive-after-1700-idle-hours
>
>
> https://downloadcenter.intel.com/download/28673/SSD-S4510-S4610-2-5-non-searchable-firmware-links/
>
> Gr. Stefan
>
> P.s. Thanks to Frank Dennis (@jedisct1) for retweeting @NerdPyle:
> https://twitter.com/jedisct1/status/1118623635072258049
>
>
> --
> | BIT BV  http://www.bit.nl/   Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Urgent: Reduced data availability / All pgs inactive

2019-02-20 Thread Irek Fasikhov
Hi,

You have a problem with the MGR (ceph-mgr).
http://docs.ceph.com/docs/master/rados/operations/pg-states/
*The ceph-mgr hasn’t yet received any information about the PG’s state from
an OSD since mgr started up.*
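
A minimal way to verify the mgr and, if needed, bounce it (a sketch only;
assumes a systemd deployment, and the mgr instance name is a placeholder):

ceph -s | grep mgr              # shows the active mgr and the standbys
systemctl restart ceph-mgr@yak0 # restart the active mgr daemon on its host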

Thu, 21 Feb 2019 at 09:04, Irek Fasikhov :

> Hi,
>
> You have a problem with the MGR (ceph-mgr).
> http://docs.ceph.com/docs/master/rados/operations/pg-states/
> *The ceph-mgr hasn’t yet received any information about the PG’s state
> from an OSD since mgr started up.*
>
>
> Wed, 20 Feb 2019 at 23:10, Ranjan Ghosh :
>
>> Hi all,
>>
>> hope someone can help me. After restarting a node of my 2-node-cluster
>> suddenly I get this:
>>
>> root@yak2 /var/www/projects # ceph -s
>>   cluster:
>> id: 749b2473-9300-4535-97a6-ee6d55008a1b
>> health: HEALTH_WARN
>> Reduced data availability: 200 pgs inactive
>>
>>   services:
>> mon: 3 daemons, quorum yak1,yak2,yak0
>> mgr: yak0.planwerk6.de(active), standbys: yak1.planwerk6.de,
>> yak2.planwerk6.de
>> mds: cephfs-1/1/1 up  {0=yak1.planwerk6.de=up:active}, 1 up:standby
>> osd: 2 osds: 2 up, 2 in
>>
>>   data:
>> pools:   2 pools, 200 pgs
>> objects: 0  objects, 0 B
>> usage:   0 B used, 0 B / 0 B avail
>> pgs: 100.000% pgs unknown
>>  200 unknown
>>
>> And this:
>>
>>
>> root@yak2 /var/www/projects # ceph health detail
>> HEALTH_WARN Reduced data availability: 200 pgs inactive
>> PG_AVAILABILITY Reduced data availability: 200 pgs inactive
>> pg 1.34 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.35 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.36 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.37 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.38 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.39 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3a is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3b is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3c is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3d is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3e is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.3f is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.40 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.41 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.42 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.43 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.44 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.45 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.46 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.47 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.48 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.49 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.4a is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.4b is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.4c is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 1.4d is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.34 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.35 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.36 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.38 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.39 is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.3a is stuck inactive for 3506.815664, current state unknown,
>> last acting []
>> pg 2.3b is st

Re: [ceph-users] How to speed up backfill

2018-01-10 Thread Irek Fasikhov
ceph tell osd.* injectargs '--osd_recovery_delay_start 30'
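
If the backfill itself needs more concurrency, the knobs discussed below can
also be raised at runtime; the values here are only illustrative, and higher
settings put more load on client I/O:

ceph tell 'osd.*' injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'
ceph -s   # the recovery line shows the resulting backfill throughput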

2018-01-11 10:31 GMT+03:00 shadow_lin :

> Hi,
>  Mine is purely backfilling (removing an OSD from the cluster), and it
> started at 600MB/s and ended at about 3MB/s.
> How is your recovery made up? Is it backfill, log-replay PG recovery,
> or both?
>
> 2018-01-11
> --
> shadow_lin
> --
>
> *From:* Josef Zelenka 
> *Sent:* 2018-01-11 15:26
> *Subject:* Re: [ceph-users] How to speed up backfill
> *To:* "shadow_lin"
> *Cc:* "ceph-users"
>
>
> Hi, our recovery slowed down significantly towards the end; however, it was
> still about five times faster than the original speed. We suspected that
> this is somehow caused by threading (more objects transferred - more
> threads used), but this is only an assumption.
>
> On 11/01/18 05:02, shadow_lin wrote:
>
> Hi,
> I have tried these two methods, and for backfilling it seems only
> osd-max-backfills works.
> How was your recovery speed when it came to the last few PGs or objects?
>
> 2018-01-11
> --
> shadow_lin
> --
>
> *From:* Josef Zelenka 
> 
> *Sent:* 2018-01-11 04:53
> *Subject:* Re: [ceph-users] How to speed up backfill
> *To:* "shadow_lin" 
> *Cc:*
>
>
> Hi, I had the same issue a few days back. I tried playing around with
> these two:
>
> ceph tell 'osd.*' injectargs '--osd-max-backfills '
> ceph tell 'osd.*' injectargs '--osd-recovery-max-active  '
> and it helped greatly (increased our recovery speed 20x), but be careful
> not to overload your systems.
>
>
> On 10/01/18 17:50, shadow_lin wrote:
>
> Hi all,
> I am playing with the backfill settings to try to find out how to control
> the speed of backfill.
>
> So far I have only found that "osd max backfills" has an effect on the
> backfill speed. But once all PGs that need backfilling have begun
> backfilling, I can't find any way to speed it up further.
>
> Especially when it comes to the last PG to recover, the speed is only a
> few MB/s (when multiple PGs are being backfilled the speed can be more
> than 600MB/s in my test).
>
> I am a little confused about the backfill and recovery settings. Though
> backfilling is a kind of recovery, it seems the recovery settings only
> apply to replaying PG logs to recover a PG.
>
> Would changing "osd recovery max active" or other recovery settings have
> any effect on backfilling?
>
> I did try "osd recovery op priority" and "osd recovery max active" with
> no luck.
>
> Any advice would be greatly appreciated. Thanks
>
> 2018-01-11
> --
> lin.yunfan
>
>
> ___
> ceph-users mailing 
> listceph-us...@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Luminous release_type "rc"

2017-09-26 Thread Irek Fasikhov
Hi
No cause for concern:
https://github.com/ceph/ceph/pull/17348/commits/2b5f84586ec4d20ebb5aacd6f3c71776c621bf3b
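
For the monitoring check Stefan describes below, a minimal sketch (monitor
name as in his example; the jq dependency is an assumption):

rt=$(ceph daemon mon.mon5 version | jq -r .release_type)
[ "$rt" = "stable" ] || echo "WARNING: release_type is $rt"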

2017-09-26 11:23 GMT+03:00 Stefan Kooman :

> Hi,
>
> I noticed the ceph version still gives "rc" although we are using the
> latest Ceph packages: 12.2.0-1xenial
> (https://download.ceph.com/debian-luminous xenial/main amd64 Packages):
>
> ceph daemon mon.mon5 version
> {"version":"12.2.0","release":"luminous","release_type":"rc"}
>
> Why is this important (to me)? I want to make a monitoring check that
> ensures we
> are running identical, "stable" packages, instead of "beta" / "rc" in
> production.
>
> Gr. Stefan
>
>
>
> --
> | BIT BV  http://www.bit.nl/   Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Long OSD restart after upgrade to 10.2.9

2017-07-16 Thread Irek Fasikhov
Hi, Anton.
You need to run the OSD with debug_ms = 1/1 and debug_osd = 20/20 for
detailed information.
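
For example, a sketch of the corresponding ceph.conf snippet on the affected
host (set it, restart one OSD to capture the slow startup, then revert):

[osd]
    debug_ms = 1/1
    debug_osd = 20/20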

2017-07-17 8:26 GMT+03:00 Anton Dmitriev :

> Hi, all!
>
> After upgrading from 10.2.7 to 10.2.9 I see that restarting OSDs with
> 'restart ceph-osd id=N' or 'restart ceph-osd-all' takes about 10 minutes
> to get an OSD from DOWN to UP. The situation is the same on all 208 OSDs
> on 7 servers.
>
> OSD startup is also very long after rebooting the servers.
>
> Before the upgrade it took no more than 2 minutes.
>
> Does anyone have the same situation as mine?
>
>
> 2017-07-17 08:07:26.895600 7fac2d656840  0 set uid:gid to 4402:4402
> (ceph:ceph)
> 2017-07-17 08:07:26.895615 7fac2d656840  0 ceph version 10.2.9
> (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 197542
> 2017-07-17 08:07:26.897018 7fac2d656840  0 pidfile_write: ignore empty
> --pid-file
> 2017-07-17 08:07:26.906489 7fac2d656840  0 filestore(/var/lib/ceph/osd/ceph-0)
> backend xfs (magic 0x58465342)
> 2017-07-17 08:07:26.917074 7fac2d656840  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-0)
> detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config
> option
> 2017-07-17 08:07:26.917092 7fac2d656840  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-0)
> detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data
> hole' config option
> 2017-07-17 08:07:26.917112 7fac2d656840  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-0)
> detect_features: splice is supported
> 2017-07-17 08:07:27.037031 7fac2d656840  0 
> genericfilestorebackend(/var/lib/ceph/osd/ceph-0)
> detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
> 2017-07-17 08:07:27.037154 7fac2d656840  0 
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-0)
> detect_feature: extsize is disabled by conf
> 2017-07-17 08:15:17.839072 7fac2d656840  0 filestore(/var/lib/ceph/osd/ceph-0)
> mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
> 2017-07-17 08:15:20.150446 7fac2d656840  0 
> cls/hello/cls_hello.cc:305: loading cls_hello
> 2017-07-17 08:15:20.152483 7fac2d656840  0 
> cls/cephfs/cls_cephfs.cc:202: loading cephfs_size_scan
> 2017-07-17 08:15:20.210428 7fac2d656840  0 osd.0 224167 crush map has
> features 2200130813952, adjusting msgr requires for clients
> 2017-07-17 08:15:20.210443 7fac2d656840  0 osd.0 224167 crush map has
> features 2200130813952 was 8705, adjusting msgr requires for mons
> 2017-07-17 08:15:20.210448 7fac2d656840  0 osd.0 224167 crush map has
> features 2200130813952, adjusting msgr requires for osds
> 2017-07-17 08:15:58.902173 7fac2d656840  0 osd.0 224167 load_pgs
> 2017-07-17 08:16:19.083406 7fac2d656840  0 osd.0 224167 load_pgs opened
> 242 pgs
> 2017-07-17 08:16:19.083969 7fac2d656840  0 osd.0 224167 using 0 op queue
> with priority op cut off at 64.
> 2017-07-17 08:16:19.109547 7fac2d656840 -1 osd.0 224167 log_to_monitors
> {default=true}
> 2017-07-17 08:16:19.522448 7fac2d656840  0 osd.0 224167 done with init,
> starting boot process
>
> --
> Dmitriev Anton
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] qemu-img convert vs rbd import performance

2017-07-13 Thread Irek Fasikhov
Hi.

You need to add to the ceph.conf
[client]
 rbd cache = true
 rbd readahead trigger requests = 5
 rbd readahead max bytes = 419430400
 *rbd readahead disable after bytes = 0*
 rbd_concurrent_management_ops = 50
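
Note that qemu-img only honours the rbd cache settings above when it is not
run with "-t none"; a sketch of the convert invocation (source path and
volume name are placeholders):

qemu-img convert -p -t writeback -O raw /path/to/source-image rbd:volumes/<volume-uuid>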

2017-07-13 15:29 GMT+03:00 Mahesh Jambhulkar :

> Seeing some performance issues on my ceph cluster with *qemu-img convert*
> writing directly to ceph, compared with the normal rbd import command.
>
> *Direct data copy (without qemu-img convert) took 5 hours 43 minutes for
> 465GB data.*
>
>
> [root@cephlarge vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb]# time
> rbd import 66582225-6539-4e5e-9b7a-59aa16739df1 -p volumes
> 66582225-6539-4e5e-9b7a-59aa16739df1_directCopy --image-format 2
> rbd: --pool is deprecated for import, use --dest-pool
> Importing image: 100% complete...done.
>
> real*343m38.028s*
> user4m40.779s
> sys 7m18.916s
> [root@cephlarge vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb]# rbd
> info volumes/66582225-6539-4e5e-9b7a-59aa16739df1_directCopy
> rbd image '66582225-6539-4e5e-9b7a-59aa16739df1_directCopy':
> size 465 GB in 119081 objects
> order 22 (4096 kB objects)
> block_name_prefix: rbd_data.373174b0dc51
> format: 2
> features: layering, exclusive-lock, object-map, fast-diff,
> deep-flatten
> flags:
> [root@cephlarge vm_res_id_24291e4b-93d2-47ad-80a8-bf3c395319b9_vdb]#
>
>
> *Qemu-img convert is still in progress and has completed merely 10% in
> more than 40 hours (for 465GB of data).*
>
> [root@cephlarge mnt]# time qemu-img convert -p -t none -O raw
> /mnt/data/workload_326e8a43-a90a-4fe9-8aab-6d33bcdf5a05/snap
> shot_9f0cee13-8200-4562-82ec-1fb9f234bcd8/vm_id_05e9534e-
> 5c84-4487-9613-1e0e227e4c1a/vm_res_id_24291e4b-93d2-47ad-
> 80a8-bf3c395319b9_vdb/66582225-6539-4e5e-9b7a-59aa16739df1
> rbd:volumes/24291e4b-93d2-47ad-80a8-bf3c395319b9
> (0.00/100%)
>
>
> (10.00/100%)
>
>
> *Rbd bench-write shows speed of ~21MB/s.*
>
> [root@cephlarge ~]# rbd bench-write image01 --pool=rbdbench
> bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
>   SEC   OPS   OPS/SEC   BYTES/SEC
> 2  6780   3133.53  12834946.35
> 3  6831   1920.65  7866998.17
> 4  8896   2040.50  8357871.83
> 5 13058   2562.61  10496432.34
> 6 17225   2836.78  11619432.99
> 7 20345   2736.84  11210076.25
> 8 23534   3761.57  15407392.94
> 9 25689   3601.35  14751109.98
>10 29670   3391.53  13891695.57
>11 33169   3218.29  13182107.64
>12 36356   3135.34  12842344.21
>13 38431   2972.62  12175863.99
>14 47780   4389.77  17980497.11
>15 55452   5156.40  21120627.26
>16 59298   4772.32  19547440.33
>17 61437   5151.20  21099315.94
>18 67702   5861.64  24009295.97
>19 77086   5895.03  24146032.34
>20 85474   5936.09  24314243.88
>21 93848   7499.73  30718898.25
>22100115   7783.39  31880760.34
>23105405   7524.76  30821410.70
>24111677   6797.12  27841003.78
>25116971   6274.51  25700386.48
>26121156   5468.77  22400087.81
>27126484   5345.83  21896515.02
>28137937   6412.41  26265239.30
>29143229   6347.28  25998461.13
>30149505   6548.76  26823729.97
>31159978   7815.37  32011752.09
>32171431   8821.65  36133479.15
>33181084   8795.28  36025472.27
>35182856   6322.41  25896605.75
>36186891   5592.25  22905872.73
>37190906   4876.30  19973339.07
>38190943   3076.87  12602853.89
>39190974   1536.79  6294701.64
>40195323   2344.75  9604081.07
>41198479   2703.00  11071492.89
>42208893   3918.55  16050365.70
>43214172   4702.42  19261091.89
>44215263   5167.53  21166212.98
>45219435   5392.57  22087961.94
>46225731   5242.85  21474728.85
>47234101   5009.43  20518607.70
>48243529   6326.00  25911280.08
>49254058   7944.90  32542315.10
> elapsed:50  ops:   262144  ops/sec:  5215.19  bytes/sec: 21361431.86
> [root@cephlarge ~]#
>
> This CEPH deployment has 2 OSDs.
>
> It would be of great help if anyone can give me pointers.
>
> --
> Regards,
> mahesh j
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] To backup or not to backup the classic way - How to backup hundreds of TB?

2017-02-14 Thread Irek Fasikhov
Hi.

We use Ceph RADOS GW (S3), and we are very happy :).
Each administrator is responsible for their own service.

We use the following S3 clients:
Linux - s3cmd, duply;
Windows - CloudBerry.

P.S. 500 TB of data, 3x replication, 3 datacenters.
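
For example, a minimal s3cmd workflow against a radosgw endpoint (endpoint,
keys and bucket names are placeholders):

s3cmd --configure                        # point it at the radosgw host and enter the user's keys
s3cmd mb s3://backups                    # create a bucket
s3cmd put backup-2017-02-14.tar.gz s3://backups/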

Best regards, Irek Fasikhov
Mob.: +79229045757

2017-02-14 12:15 GMT+03:00 Götz Reinicke :

> Hi,
>
> I guess that's a question that pops up in different places, but I could
> not find anything that fits my thoughts.
>
> Currently we are starting to use ceph for file shares of the films produced
> by our students and for some Xen/VMware VMs. The VM data is already backed
> up; the films' original footage is stored in other places.
>
> We are starting with some 100TB of RBD and mount SMB/NFS shares from the
> clients. Maybe we will look into CephFS soon.
>
> The question is: how would someone handle a backup of 100 TB of data?
> Rsyncing that to another system or buying a commercial backup solution
> does not look that good, e.g. regarding the price.
>
> One thought: is there some sort of best practice in the ceph world, e.g.
> replicating to another physically independent cluster? Or using more
> replicas, OSDs and nodes and doing snapshots in one cluster?
>
> Having production data and backups on the same hardware currently doesn't
> make me feel that good either… but the world changes :)
>
> Long story short: how do you back up hundreds of TB?
>
> Curious for suggestions and thoughts. Thanks and regards, Götz
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrating data from a Ceph clusters to another

2017-02-09 Thread Irek Fasikhov
Hi.
I recommend using rbd import/export.
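
For example, a single image can be streamed from cluster A to cluster B
without an intermediate file (pool/image names and the remote host are
placeholders; the destination pool must already exist on cluster B):

rbd export volumes/myimage - | ssh cluster-b-host rbd import - volumes/myimage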

Best regards, Irek Fasikhov
Mob.: +79229045757

2017-02-09 11:13 GMT+03:00 林自均 :

> Hi,
>
> I have 2 Ceph clusters, cluster A and cluster B. I want to move all the
> pools on A to B. The pool names don't conflict between clusters. I guess
> it's like RBD mirroring, except that it's pool mirroring. Is there any
> proper ways to do it?
>
> Thanks for any suggestions.
>
> Best,
> John Lin
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Experience with 5k RPM/archive HDDs

2017-02-02 Thread Irek Fasikhov
Hi, Maxime.

Linux SMR support only starts with kernel version 4.9.


Best regards, Irek Fasikhov
Mob.: +79229045757

2017-02-03 10:26 GMT+03:00 Maxime Guyot :

> Hi everyone,
>
>
>
> I’m wondering if anyone on the ML is running a cluster with archive-type
> HDDs, like the HGST Ultrastar Archive (10TB@7.2k RPM) or the Seagate
> Enterprise Archive (8TB@5.9k RPM)?
>
> As far as I have read, they both fall into the enterprise-class HDD
> category, so they **might** be suitable for a low-performance, low-cost
> cluster?
>
>
>
> Cheers,
>
> Maxime
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] XFS no space left on device

2016-10-25 Thread Irek Fasikhov
Hi, Vasiliy.

You have run out of inodes; see "df -i".
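
If df -i looks fine, a read-only look at the XFS free space per allocation
group can also help (device name taken from the output below):

xfs_db -r -c "freesp -s" /dev/mapper/disk23p1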

Best regards, Irek Fasikhov
Mob.: +79229045757

2016-10-25 15:52 GMT+03:00 Василий Ангапов :

> This is a a bit more information about that XFS:
>
> root@ed-ds-c178:[~]:$ xfs_info /dev/mapper/disk23p1
> meta-data=/dev/mapper/disk23p1   isize=2048   agcount=6, agsize=268435455
> blks
>  =   sectsz=4096  attr=2, projid32bit=1
>  =   crc=0finobt=0
> data =   bsize=4096   blocks=1465130385, imaxpct=5
>  =   sunit=0  swidth=0 blks
> naming   =version 2  bsize=4096   ascii-ci=0 ftype=0
> log  =internal   bsize=4096   blocks=521728, version=2
>  =   sectsz=4096  sunit=1 blks, lazy-count=1
> realtime =none   extsz=4096   blocks=0, rtextents=0
>
> root@ed-ds-c178:[~]:$ xfs_db /dev/mapper/disk23p1
> xfs_db> frag
> actual 25205642, ideal 22794438, fragmentation factor 9.57%
>
> 2016-10-25 14:59 GMT+03:00 Василий Ангапов :
> > Actually all OSDs are already mounted with inode64 option. Otherwise I
> > could not write beyond 1TB.
> >
> > 2016-10-25 14:53 GMT+03:00 Ashley Merrick :
> >> Sounds like the 32-bit inode limit; if you mount with -o inode64 (not 100%
> sure how you would do that in ceph), it would allow data to continue to be written.
> >>
> >> ,Ashley
> >>
> >> -Original Message-
> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of Vasiliy Angapov
> >> Sent: 25 October 2016 12:38
> >> To: ceph-users 
> >> Subject: [ceph-users] XFS no space left on device
> >>
> >> Hello,
> >>
> >> I got Ceph 10.2.1 cluster with 10 nodes, each having 29 * 6TB OSDs.
> >> Yesterday I found that 3 OSDs were down and out with 89% space
> utilization.
> >> In logs there is:
> >> 2016-10-24 22:36:37.599253 7f8309c5e800  0 ceph version 10.2.1 (
> 3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid
> >> 2602081
> >> 2016-10-24 22:36:37.600129 7f8309c5e800  0 pidfile_write: ignore empty
> --pid-file
> >> 2016-10-24 22:36:37.635769 7f8309c5e800  0
> >> filestore(/var/lib/ceph/osd/ceph-123) backend xfs (magic 0x58465342)
> >> 2016-10-24 22:36:37.635805 7f8309c5e800 -1
> >> genericfilestorebackend(/var/lib/ceph/osd/ceph-123) detect_features:
> >> unable to create /var/lib/ceph/osd/ceph-123/fiemap_test: (28) No space
> left on device
> >> 2016-10-24 22:36:37.635814 7f8309c5e800 -1
> >> filestore(/var/lib/ceph/osd/ceph-123) _detect_fs: detect_features
> >> error: (28) No space left on device
> >> 2016-10-24 22:36:37.635818 7f8309c5e800 -1
> >> filestore(/var/lib/ceph/osd/ceph-123) FileStore::mount: error in
> >> _detect_fs: (28) No space left on device
> >> 2016-10-24 22:36:37.635824 7f8309c5e800 -1 osd.123 0 OSD:init: unable
> to mount object store
> >> 2016-10-24 22:36:37.635827 7f8309c5e800 -1 ESC[0;31m ** ERROR: osd init
> failed: (28) No space left on deviceESC[0m
> >>
> >> root@ed-ds-c178:[/var/lib/ceph/osd/ceph-123]:$ df -h
> /var/lib/ceph/osd/ceph-123
> >> FilesystemSize  Used Avail Use% Mounted on
> >> /dev/mapper/disk23p1  5.5T  4.9T  651G  89% /var/lib/ceph/osd/ceph-123
> >>
> >> root@ed-ds-c178:[/var/lib/ceph/osd/ceph-123]:$ df -i
> /var/lib/ceph/osd/ceph-123
> >> Filesystem  InodesIUsed IFree IUse% Mounted on
> >> /dev/mapper/disk23p1 146513024 22074752 124438272   16%
> >> /var/lib/ceph/osd/ceph-123
> >>
> >> root@ed-ds-c178:[/var/lib/ceph/osd/ceph-123]:$ touch 123
> >> touch: cannot touch ‘123’: No space left on device
> >>
> >> root@ed-ds-c178:[/var/lib/ceph/osd/ceph-123]:$ grep ceph-123
> /proc/mounts
> >> /dev/mapper/disk23p1 /var/lib/ceph/osd/ceph-123 xfs
> rw,noatime,attr2,inode64,noquota 0 0
> >>
> >> The same situation is for all three down OSDs. OSD can be unmounted and
> mounted without problem:
> >> root@ed-ds-c178:[~]:$ umount /var/lib/ceph/osd/ceph-123 
> >> root@ed-ds-c178:[~]:$
> root@ed-ds-c178:[~]:$ mount /var/lib/ceph/osd/ceph-123 root@ed-ds-c178:[~]:$
> touch /var/lib/ceph/osd/ceph-123/123
> >> touch: cannot touch ‘/var/lib/ceph/osd/ceph-123/123’: No space left on
> device
> >>
> >> xfs_repair gives no error for FS.
> >>
> >> Kernel is
> >> root@ed-ds-c178:[~]:$ uname -r
> >> 4.7.0-1.el7.wg.x86_64
> >>
> >> What else can I do to rectify that situation?
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] data corruption with hammer

2016-03-19 Thread Irek Fasikhov
Hi, Nick

I switched between forward and writeback. (forward -> writeback)

Best regards, Irek Fasikhov
Mob.: +79229045757

2016-03-17 16:10 GMT+03:00 Nick Fisk <n...@fisk.me.uk>:

> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> > Irek Fasikhov
> > Sent: 17 March 2016 13:00
> > To: Sage Weil <sw...@redhat.com>
> > Cc: Robert LeBlanc <robert.lebl...@endurance.com>; ceph-users  > us...@lists.ceph.com>; Nick Fisk <n...@fisk.me.uk>; William Perkins
> > <william.perk...@endurance.com>
> > Subject: Re: [ceph-users] data corruption with hammer
> >
> > Hi,All.
> >
> > I confirm the problem. When min_read_recency_for_promote > 1, data
> > corruption occurs.
>
> But what scenario is this? Are you switching between forward and
> writeback, or just running in writeback?
>
> >
> >
> > Best regards, Irek Fasikhov
> > Mob.: +79229045757
> >
> > 2016-03-17 15:26 GMT+03:00 Sage Weil <sw...@redhat.com>:
> > On Thu, 17 Mar 2016, Nick Fisk wrote:
> > There has got to be something else going on here. All that PR does is to
> > > potentially delay the promotion to hit_set_period*recency instead of
> > > just doing it on the 2nd read regardless, it's got to be uncovering
> > > another bug.
> > >
> > > Do you see the same problem if the cache is in writeback mode before
> you
> > > start the unpacking. Ie is it the switching mid operation which causes
> > > the problem? If it only happens mid operation, does it still occur if
> > > you pause IO when you make the switch?
> > >
> > > Do you also see this if you perform on a RBD mount, to rule out any
> > > librbd/qemu weirdness?
> > >
> > > Do you know if it’s the actual data that is getting corrupted or if
> it's
> > > the FS metadata? I'm only wondering as unpacking should really only be
> > > writing to each object a couple of times, whereas FS metadata could
> > > potentially be being updated+read back lots of times for the same group
> > > of objects and ordering is very important.
> > >
> > > Thinking through it logically the only difference is that with
> recency=1
> > > the object will be copied up to the cache tier, where recency=6 it will
> > > be proxy read for a long time. If I had to guess I would say the issue
> > > would lie somewhere in the proxy read + writeback<->forward logic.
> >
> > That seems reasonable.  Was switching from writeback -> forward always
> > part of the sequence that resulted in corruption?  Note that there is a
> > known ordering issue when switching to forward mode.  I wouldn't really
> > expect it to bite real users but it's possible..
> >
> > http://tracker.ceph.com/issues/12814
> >
> > I've opened a ticket to track this:
> >
> > http://tracker.ceph.com/issues/15171
> >
> > What would be *really* great is if you could reproduce this with a
> > ceph_test_rados workload (from ceph-tests).  I.e., get ceph_test_rados
> > running, and then find the sequence of operations that are sufficient to
> > trigger a failure.
> >
> > sage
> >
> >
> >
> >  >
> > >
> > >
> > > > -Original Message-
> > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > Behalf Of
> > > > Mike Lovell
> > > > Sent: 16 March 2016 23:23
> > > > To: ceph-users <ceph-users@lists.ceph.com>; sw...@redhat.com
> > > > Cc: Robert LeBlanc <robert.lebl...@endurance.com>; William Perkins
> > > > <william.perk...@endurance.com>
> > > > Subject: Re: [ceph-users] data corruption with hammer
> > > >
> > > > just got done with a test against a build of 0.94.6 minus the two
> commits
> > that
> > > > were backported in PR 7207. everything worked as it should with the
> > cache-
> > > > mode set to writeback and the min_read_recency_for_promote set to 2.
> > > > assuming it works properly on master, there must be a commit that
> we're
> > > > missing on the backport to support this properly.
> > > >
> > > > sage,
> > > > i'm adding you to the recipients on this so hopefully you see it.
> the tl;dr
> > > > version is that the backport of the cache recency fix to hammer
> doesn't
> > work
> > > > right and potentially corrupts data when
> > > > the mi

Re: [ceph-users] data corruption with hammer

2016-03-19 Thread Irek Fasikhov
Hi, all.

I confirm the problem. When min_read_recency_for_promote > 1, data corruption occurs.
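
The mitigation that worked in the tests reported below is to keep the
promotion recency at 1 until a fixed build is available (the cache pool name
is a placeholder):

ceph osd pool set ssd-cache min_read_recency_for_promote 1
ceph osd pool get ssd-cache min_read_recency_for_promote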

Best regards, Irek Fasikhov
Mob.: +79229045757

2016-03-17 15:26 GMT+03:00 Sage Weil :

> On Thu, 17 Mar 2016, Nick Fisk wrote:
> > There has got to be something else going on here. All that PR does is to
> > potentially delay the promotion to hit_set_period*recency instead of
> > just doing it on the 2nd read regardless, it's got to be uncovering
> > another bug.
> >
> > Do you see the same problem if the cache is in writeback mode before you
> > start the unpacking. Ie is it the switching mid operation which causes
> > the problem? If it only happens mid operation, does it still occur if
> > you pause IO when you make the switch?
> >
> > Do you also see this if you perform on a RBD mount, to rule out any
> > librbd/qemu weirdness?
> >
> > Do you know if it’s the actual data that is getting corrupted or if it's
> > the FS metadata? I'm only wondering as unpacking should really only be
> > writing to each object a couple of times, whereas FS metadata could
> > potentially be being updated+read back lots of times for the same group
> > of objects and ordering is very important.
> >
> > Thinking through it logically the only difference is that with recency=1
> > the object will be copied up to the cache tier, where recency=6 it will
> > be proxy read for a long time. If I had to guess I would say the issue
> > would lie somewhere in the proxy read + writeback<->forward logic.
>
> That seems reasonable.  Was switching from writeback -> forward always
> part of the sequence that resulted in corruption?  Note that there is a
> known ordering issue when switching to forward mode.  I wouldn't really
> expect it to bite real users but it's possible..
>
> http://tracker.ceph.com/issues/12814
>
> I've opened a ticket to track this:
>
> http://tracker.ceph.com/issues/15171
>
> What would be *really* great is if you could reproduce this with a
> ceph_test_rados workload (from ceph-tests).  I.e., get ceph_test_rados
> running, and then find the sequence of operations that are sufficient to
> trigger a failure.
>
> sage
>
>
>
>  >
> >
> >
> > > -Original Message-
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of
> > > Mike Lovell
> > > Sent: 16 March 2016 23:23
> > > To: ceph-users ; sw...@redhat.com
> > > Cc: Robert LeBlanc ; William Perkins
> > > 
> > > Subject: Re: [ceph-users] data corruption with hammer
> > >
> > > just got done with a test against a build of 0.94.6 minus the two
> commits that
> > > were backported in PR 7207. everything worked as it should with the
> cache-
> > > mode set to writeback and the min_read_recency_for_promote set to 2.
> > > assuming it works properly on master, there must be a commit that we're
> > > missing on the backport to support this properly.
> > >
> > > sage,
> > > i'm adding you to the recipients on this so hopefully you see it. the
> tl;dr
> > > version is that the backport of the cache recency fix to hammer
> doesn't work
> > > right and potentially corrupts data when
> > > the min_read_recency_for_promote is set to greater than 1.
> > >
> > > mike
> > >
> > > On Wed, Mar 16, 2016 at 4:41 PM, Mike Lovell
> > >  wrote:
> > > robert and i have done some further investigation the past couple days
> on
> > > this. we have a test environment with a hard drive tier and an ssd
> tier as a
> > > cache. several vms were created with volumes from the ceph cluster. i
> did a
> > > test in each guest where i un-tarred the linux kernel source multiple
> times
> > > and then did a md5sum check against all of the files in the resulting
> source
> > > tree. i started off with the monitors and osds running 0.94.5 and
> never saw
> > > any problems.
> > >
> > > a single node was then upgraded to 0.94.6 which has osds in both the
> ssd and
> > > hard drive tier. i then proceeded to run the same test and, while the
> untar
> > > and md5sum operations were running, i changed the ssd tier cache-mode
> > > from forward to writeback. almost immediately the vms started
> reporting io
> > > errors and odd data corruption. the remainder of the cluster was
> updated to
> > > 0.94.6, including the monitors, and the same thing happened.
> > >
> > > things were cleaned up and reset and then a test was run
> > > where min_read_recency_for_promote for the ssd cache pool was set to 1.
> > > we previously had it set to 6. there was never an error with the
> recency
> > > setting set to 1. i then tested with it set to 2 and it immediately
> caused
> > > failures. we are currently thinking that it is related to the backport
> of the fix
> > > for the recency promotion and are in progress of making a .6 build
> without
> > > that backport to see if we can cause corruption. is anyone using a
> version
> > > from after the original 

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-11 Thread Irek Fasikhov
Hi.
You need to read:
https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
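
The test from that article, for reference (it writes to the raw device and
destroys data on it; the device name is a placeholder):

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 \
    --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test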

Best regards, Irek Fasikhov
Mob.: +79229045757

2016-02-12 10:41 GMT+03:00 Huan Zhang :

> Hi,
>
> ceph is VERY SLOW with 24 OSDs (SAMSUNG SSDs).
> fio /dev/rbd0 iodepth=1 direct=1   IOPS only ~200
> fio /dev/rbd0 iodepth=32 direct=1 IOPS only ~3000
>
> But testing a single SSD device with fio:
> fio iodepth=1 direct=1   IOPS  ~15000
> fio iodepth=32 direct=1 IOPS  ~3
>
> Why is ceph SO SLOW? Could you give me some help?
> Appreciated!
>
>
> My Enviroment:
> [root@szcrh-controller ~]# ceph -s
> cluster eb26a8b9-e937-4e56-a273-7166ffaa832e
>  health HEALTH_WARN
> 1 mons down, quorum 0,1,2,3,4 ceph01,ceph02,ceph03,ceph04,
> ceph05
>  monmap e1: 6 mons at {ceph01=
>
> 10.10.204.144:6789/0,ceph02=10.10.204.145:6789/0,ceph03=10.10.204.146:6789/0,ceph04=10.10.204.147:6789/0,ceph05=10.10.204.148:6789/0,ceph06=0.0.0.0:0/5
> }
> election epoch 6, quorum 0,1,2,3,4
> ceph01,ceph02,ceph03,ceph04,ceph05
>  osdmap e114: 24 osds: 24 up, 24 in
> flags sortbitwise
>   pgmap v2213: 1864 pgs, 3 pools, 49181 MB data, 4485 objects
> 144 GB used, 42638 GB / 42782 GB avail
> 1864 active+clean
>
> [root@ceph03 ~]# lsscsi
> [0:0:6:0]diskATA  SAMSUNG MZ7KM1T9 003Q  /dev/sda
> [0:0:7:0]diskATA  SAMSUNG MZ7KM1T9 003Q  /dev/sdb
> [0:0:8:0]diskATA  SAMSUNG MZ7KM1T9 003Q  /dev/sdc
> [0:0:9:0]diskATA  SAMSUNG MZ7KM1T9 003Q  /dev/sdd
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Undersized pgs problem

2015-11-27 Thread Irek Fasikhov
Is your time synchronized (NTP)?
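
A couple of quick checks on every node (assumes ntpd or chrony/timesyncd is
in use):

ntpq -p              # peer list and offsets when ntpd is used
timedatectl status   # shows whether the clock is NTP-synchronized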

Best regards, Irek Fasikhov
Mob.: +79229045757

2015-11-27 15:57 GMT+03:00 Vasiliy Angapov <anga...@gmail.com>:

> > It seems that you played around with the crushmap and did something wrong.
> > Compare the output of 'ceph osd tree' and the crushmap. There are some 'osd'
> > devices renamed to 'device'; I think there is your problem.
> Is this a mistake actually? What I did is removed a bunch of OSDs from
> my cluster that's why the numeration is sparse. But is it an issue to
> a have a sparse numeration of OSDs?
>
> > Hi.
> > Vasiliy, yes, it is a problem with the crushmap. Look at the weights:
> > -3 14.56000 host slpeah001
> > -2 14.56000 host slpeah002
> What exactly is wrong here?
>
> I also found out that my OSD logs are full of such records:
> 2015-11-26 08:31:19.273268 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:19.273276 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fd1000
> sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a520).accept: got bad
> authorizer
> 2015-11-26 08:31:24.273207 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:24.273225 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:24.273231 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x3f90b000
> sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a3c0).accept: got bad
> authorizer
> 2015-11-26 08:31:29.273199 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:29.273215 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:29.273222 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fd1000
> sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a260).accept: got bad
> authorizer
> 2015-11-26 08:31:34.273469 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:34.273482 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:34.273486 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x3f90b000
> sd=79 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee1a100).accept: got bad
> authorizer
> 2015-11-26 08:31:39.273310 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:39.273331 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:39.273342 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fcc000
> sd=98 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee19fa0).accept: got bad
> authorizer
> 2015-11-26 08:31:44.273753 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:44.273769 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:44.273776 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fcc000
> sd=98 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee189a0).accept: got bad
> authorizer
> 2015-11-26 08:31:49.273412 7fe4f49b1700  0 auth: could not find
> secret_id=2924
> 2015-11-26 08:31:49.273431 7fe4f49b1700  0 cephx: verify_authorizer
> could not get service secret for service osd secret_id=2924
> 2015-11-26 08:31:49.273455 7fe4f49b1700  0 --
> 192.168.254.18:6816/110740 >> 192.168.254.12:0/1011754 pipe(0x41fd1000
> sd=98 :6816 s=0 pgs=0 cs=0 l=1 c=0x3ee19080).accept: got bad
> authorizer
> 2015-11-26 08:31:54.273293 7fe4f49b1700  0 auth: could not find
> secret_id=2924
>
> What does it mean? Google says it might be a time sync issue, but my
> clocks are perfectly synchronized...
>
> 2015-11-26 21:05 GMT+08:00 Irek Fasikhov <malm...@gmail.com>:
> > Hi.
> > Vasiliy, Yes it is a problem with crusmap. Look at height:
> > " -3 14.56000 host slpeah001
> >  -2 14.56000 host slpeah002
> >  "
> >
> > Best regards, Irek Fasikhov
> > Mob.: +79229045757
> >
> > 2015-11-26 13:16 GMT+03:00 ЦИТ РТ-Курамшин Камиль Фидаилевич
> > <kamil.kurams...@tatar.ru>:
> >>
> >> It seems that you played around with the crushmap and did something wrong.
> >> Compare the output of 'ceph osd tree' and the crushmap. There are some 'osd'
> >> devices renamed to 'device'; I think there is your problem.
> >>
> >> Sent from a mobile device.
> >>
> >>
> >> -Original Message-
> >> From: Vasiliy Angapov <anga...

Re: [ceph-users] Undersized pgs problem

2015-11-26 Thread Irek Fasikhov
Hi.
Vasiliy, yes, it is a problem with the crushmap. Look at the weights:
" -3 14.56000 host slpeah001
 -2 14.56000 host slpeah002
 "

Best regards, Irek Fasikhov
Mob.: +79229045757

2015-11-26 13:16 GMT+03:00 ЦИТ РТ-Курамшин Камиль Фидаилевич <
kamil.kurams...@tatar.ru>:

> It seems that you played around with the crushmap and did something wrong.
> Compare the output of 'ceph osd tree' and the crushmap. There are some 'osd'
> devices renamed to 'device'; I think there is your problem.
>
> Sent from a mobile device.
>
>
> -Original Message-
> From: Vasiliy Angapov 
> To: ceph-users 
> Sent: Thu, 26 Nov 2015 7:53
> Subject: [ceph-users] Undersized pgs problem
>
> Hi, colleagues!
>
> I have a small 4-node Ceph cluster (0.94.2); all pools have size 3,
> min_size 1.
> Last night one host failed and the cluster was unable to rebalance, saying
> there are a lot of undersized pgs.
>
> root@slpeah002:[~]:# ceph -s
> cluster 78eef61a-3e9c-447c-a3ec-ce84c617d728
>  health HEALTH_WARN
> 1486 pgs degraded
> 1486 pgs stuck degraded
> 2257 pgs stuck unclean
> 1486 pgs stuck undersized
> 1486 pgs undersized
> recovery 80429/555185 objects degraded
> (14.487%)
> recovery 40079/555185 objects misplaced (7.219%)
> 4/20 in osds are down
> 1 mons down, quorum 1,2 slpeah002,slpeah007
>  monmap e7: 3 mons at
> {slpeah001=
> 192.168.254.11:6780/0,slpeah002=192.168.254.12:6780/0,slpeah007=172.31.252.46:6789/0}
>
> election epoch 710, quorum 1,2 slpeah002,slpeah007
>  osdmap e14062: 20 osds: 16 up, 20 in; 771 remapped pgs
>   pgmap v7021316: 4160 pgs, 5 pools, 1045 GB data, 180 kobjects
> 3366 GB used, 93471 GB / 96838 GB avail
> 80429/555185 objects degraded (14.487%)
> 40079/555185 objects misplaced (7.219%)
> 1903 active+clean
> 1486 active+undersized+degraded
>  771 active+remapped
>   client io 0 B/s rd, 246 kB/s wr, 67 op/s
>
>   root@slpeah002:[~]:# ceph osd tree
> ID  WEIGHT   TYPE NAME  UP/DOWN REWEIGHT PRIMARY-AFFINITY
>  -1 94.63998 root default
>  -9 32.75999 host slpeah007
>  72  5.45999 osd.72  up  1.0  1.0
>  73  5.45999 osd.73  up  1.0  1.0
>  74  5.45999 osd.74  up  1.0  1.0
>  75  5.45999 osd.75  up  1.0  1.0
>  76  5.45999 osd.76  up  1.0  1.0
>  77  5.45999 osd.77  up  1.0  1.0
> -10 32.75999 host slpeah008
>  78  5.45999 osd.78  up  1.0  1.0
>  79  5.45999 osd.79  up  1.0  1.0
>  80  5.45999 osd.80  up  1.0  1.0
>  81  5.45999 osd.81  up  1.0  1.0
>  82  5.45999 osd.82  up  1.0  1.0
>  83  5.45999 osd.83  up  1.0  1.0
>  -3 14.56000 host slpeah001
>   1  3.64000  osd.1 down  1.0  1.0
>  33  3.64000 osd.33down  1.0  1.0
>  34  3.64000 osd.34down  1.0  1.0
>  35  3.64000 osd.35down  1.0  1.0
>  -2 14.56000 host slpeah002
>   0  3.64000 osd.0   up  1.0  1.0
>  36  3.64000 osd.36  up  1.0  1.0
>  37  3.64000 osd.37  up  1.0  1.0
>  38  3.64000 osd.38  up  1.0  1.0
>
> Crushmap:
>
>  # begin crush map
> tunable choose_local_tries 0
> tunable choose_local_fallback_tries 0
> tunable choose_total_tries 50
> tunable chooseleaf_descend_once 1
> tunable chooseleaf_vary_r 1
> tunable straw_calc_version 1
> tunable allowed_bucket_algs 54
>
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 device2
> device 3 device3
> device 4 device4
> device 5 device5
> device 6 device6
> device 7 device7
> device 8 device8
> device 9 device9
> device 10 device10
> device 11 device11
> device 12 device12
> device 13 device13
> device 14 device14
> device 15 device15
> device 16 device16
> device 17 device17
> device 18 device18
> device 19 device19
> device 20 device20
> device 21 device21
> device 22 device22
> device 23 device23
> device 24 device24
> device 25 device25
> device 26 device26
> device 27 device27
> device 28 device28
> device 29 device29
> device 30 device30
> device 31 device31
> device 32 device32
> device 33 osd.33
> device 34 osd.34
> device 35 osd.35
> device 36 osd.36
> device 37 osd.37
> device 38 osd.38
> device 39 device39
> device 40 device40
> device 41 device41
> device 42 device42
> device 43 device43
> device 44 device44
> device 45 device45
> 

Re: [ceph-users] proxmox 4.0 release : lxc with krbd support and qemu librbd improvements

2015-10-08 Thread Irek Fasikhov
Hi, Alexandre.

Very Very Good!
Thank you for your work! :)

Best regards, Irek Fasikhov
Mob.: +79229045757

2015-10-07 7:25 GMT+03:00 Alexandre DERUMIER :

> Hi,
>
> proxmox 4.0 has been released:
>
> http://forum.proxmox.com/threads/23780-Proxmox-VE-4-0-released!
>
>
> Some ceph improvements :
>
> - lxc containers with krbd support (multiple disks + snapshots)
> - qemu with jemalloc support (improves librbd performance)
> - qemu iothread option per disk (improves rbd scaling with multiple disks)
> - librbd hammer version
>
> Regards,
>
> Alexandre
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Repair inconsistent pgs..

2015-08-17 Thread Irek Fasikhov
Hi, Igor.

You need to repair the PG.

for i in `ceph pg dump | grep inconsistent | grep -v 'inconsistent+repair' |
awk '{print $1}'`; do ceph pg repair $i; done
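
Applied to the two PGs reported below, that loop amounts to:

ceph pg repair 2.490
ceph pg repair 2.c4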

Best regards, Irek Fasikhov
Mob.: +79229045757

2015-08-18 8:27 GMT+03:00 Voloshanenko Igor igor.voloshane...@gmail.com:

 Hi all, on our production cluster, due to heavy rebalancing ((( we have 2 pgs
 in an inconsistent state...

 root@temp:~# ceph health detail | grep inc
 HEALTH_ERR 2 pgs inconsistent; 18 scrub errors
 pg 2.490 is active+clean+inconsistent, acting [56,15,29]
 pg 2.c4 is active+clean+inconsistent, acting [56,10,42]

 From OSD logs, after recovery attempt:

 root@test:~# ceph pg dump | grep -i incons | cut -f 1 | while read i; do
 ceph pg repair ${i} ; done
 dumped all in format plain
 instructing pg 2.490 on osd.56 to repair
 instructing pg 2.c4 on osd.56 to repair

 /var/log/ceph/ceph-osd.56.log:51:2015-08-18 07:26:37.035910 7f94663b3700
 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
 f5759490/rbd_data.1631755377d7e.04da/head//2 expected clone
 90c59490/rbd_data.eb486436f2beb.7a65/141//2
 /var/log/ceph/ceph-osd.56.log:52:2015-08-18 07:26:37.035960 7f94663b3700
 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
 fee49490/rbd_data.12483d3ba0794b.522f/head//2 expected clone
 f5759490/rbd_data.1631755377d7e.04da/141//2
 /var/log/ceph/ceph-osd.56.log:53:2015-08-18 07:26:37.036133 7f94663b3700
 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
 a9b39490/rbd_data.12483d3ba0794b.37b3/head//2 expected clone
 fee49490/rbd_data.12483d3ba0794b.522f/141//2
 /var/log/ceph/ceph-osd.56.log:54:2015-08-18 07:26:37.036243 7f94663b3700
 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
 bac19490/rbd_data.1238e82ae8944a.032e/head//2 expected clone
 a9b39490/rbd_data.12483d3ba0794b.37b3/141//2
 /var/log/ceph/ceph-osd.56.log:55:2015-08-18 07:26:37.036289 7f94663b3700
 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
 98519490/rbd_data.123e9c2ae8944a.0807/head//2 expected clone
 bac19490/rbd_data.1238e82ae8944a.032e/141//2
 /var/log/ceph/ceph-osd.56.log:56:2015-08-18 07:26:37.036314 7f94663b3700
 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
 c3c09490/rbd_data.1238e82ae8944a.0c2b/head//2 expected clone
 98519490/rbd_data.123e9c2ae8944a.0807/141//2
 /var/log/ceph/ceph-osd.56.log:57:2015-08-18 07:26:37.036363 7f94663b3700
 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
 28809490/rbd_data.edea7460fe42b.01d9/head//2 expected clone
 c3c09490/rbd_data.1238e82ae8944a.0c2b/141//2
 /var/log/ceph/ceph-osd.56.log:58:2015-08-18 07:26:37.036432 7f94663b3700
 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490
 e1509490/rbd_data.1423897545e146.09a6/head//2 expected clone
 28809490/rbd_data.edea7460fe42b.01d9/141//2
 /var/log/ceph/ceph-osd.56.log:59:2015-08-18 07:26:38.548765 7f94663b3700
 -1 log_channel(cluster) log [ERR] : 2.490 deep-scrub 17 errors

 So, how can I solve the "expected clone" situation by hand?
 Thanks in advance!



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Geographical Replication and Disaster Recovery Support

2015-08-13 Thread Irek Fasikhov
Hi.
That document applies only to RadosGW.

You need to read this document instead:
https://wiki.ceph.com/Planning/Blueprints/Hammer/RBD%3A_Mirroring
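
Until RBD mirroring lands, one common interim approach (not from this thread,
shown only as a sketch; image, snapshot and host names are placeholders) is
periodic incremental replication with export-diff:

# assumes snapshot t1 already exists on both sides from the previous cycle
rbd snap create rbd/vm1@t2
rbd export-diff --from-snap t1 rbd/vm1@t2 - | ssh dr-site rbd import-diff - rbd/vm1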


Best regards, Irek Fasikhov
Mob.: +79229045757

2015-08-13 11:40 GMT+03:00 Özhan Rüzgar Karaman oruzgarkara...@gmail.com:

 Hi;
 I would like to learn about Ceph's geographical replication and disaster
 recovery options. I know that we currently do not have built-in, official
 geo-replication or disaster recovery; there are some third-party tools like
 DRBD, but they are not the kind of solution a business needs.

 I also read the RGW document at Ceph Wiki Site.


 https://wiki.ceph.com/Planning/Blueprints/Dumpling/RGW_Geo-Replication_and_Disaster_Recovery


 The document is from the Dumpling release, around 2013. Is there any
 active work or effort to bring disaster recovery or geographical
 replication features to Ceph, and is it on our current road map?

 Thanks
 Özhan KARAMAN

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CEPH cache layer. Very slow

2015-08-13 Thread Irek Fasikhov
Hi, Igor.
Try applying the patch from here:
http://www.theirek.com/blog/2014/02/16/patch-dlia-raboty-s-enierghoniezavisimym-keshiem-ssd-diskov

P.S. I no longer track changes in this area (the kernel), because we
already use the recommended SSDs.

Best regards, Irek Fasikhov
Mob.: +79229045757

2015-08-13 11:56 GMT+03:00 Voloshanenko Igor igor.voloshane...@gmail.com:

 So, after testing an SSD (I wiped one SSD and used it for tests):

 root@ix-s2:~# sudo fio --filename=/dev/sda --direct=1 --sync=1 --rw=write
 --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting
 --name=journal-test
 journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync,
 iodepth=1
 fio-2.1.3
 Starting 1 process
 Jobs: 1 (f=1): [W] [100.0% done] [0KB/1152KB/0KB /s] [0/288/0 iops] [eta
 00m:00s]
 journal-test: (groupid=0, jobs=1): err= 0: pid=2849460: Thu Aug 13
 10:46:42 2015
   write: io=68972KB, bw=1149.6KB/s, iops=287, runt= 60001msec
 clat (msec): min=2, max=15, avg= 3.48, stdev= 1.08
  lat (msec): min=2, max=15, avg= 3.48, stdev= 1.08
 clat percentiles (usec):
  |  1.00th=[ 2704],  5.00th=[ 2800], 10.00th=[ 2864], 20.00th=[ 2928],
  | 30.00th=[ 3024], 40.00th=[ 3088], 50.00th=[ 3280], 60.00th=[ 3408],
  | 70.00th=[ 3504], 80.00th=[ 3728], 90.00th=[ 3856], 95.00th=[ 4016],
  | 99.00th=[ 9024], 99.50th=[ 9280], 99.90th=[ 9792], 99.95th=[10048],
  | 99.99th=[14912]
 bw (KB  /s): min= 1064, max= 1213, per=100.00%, avg=1150.07,
 stdev=34.31
 lat (msec) : 4=94.99%, 10=4.96%, 20=0.05%
   cpu  : usr=0.13%, sys=0.57%, ctx=17248, majf=0, minf=7
   IO depths: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
 =64=0.0%
  submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
 =64=0.0%
  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
 =64=0.0%
  issued: total=r=0/w=17243/d=0, short=r=0/w=0/d=0

 Run status group 0 (all jobs):
   WRITE: io=68972KB, aggrb=1149KB/s, minb=1149KB/s, maxb=1149KB/s,
 mint=60001msec, maxt=60001msec

 Disk stats (read/write):
   sda: ios=0/17224, merge=0/0, ticks=0/59584, in_queue=59576, util=99.30%

 So, it's painful... the SSD does only 287 iops at 4K... 1.1 MB/s

 I tried to change the cache mode:
 echo "temporary write through" > /sys/class/scsi_disk/2:0:0:0/cache_type
 echo "temporary write through" > /sys/class/scsi_disk/3:0:0:0/cache_type

 No luck, still the same poor results. I also found this article:
 https://lkml.org/lkml/2013/11/20/264 which points to an old, very simple
 patch that disables CMD_FLUSH:
 https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba

 Does anybody have better ideas on how to improve this? (Or how to disable
 CMD_FLUSH without recompiling the kernel? I use Ubuntu and kernel 4.0.4 for
 now - the 4.x branch, because the SSD 850 Pro has an issue with NCQ TRIM and
 before 4.0.4 this exception was not included in libata.)

 2015-08-12 19:17 GMT+03:00 Pieter Koorts pieter.koo...@me.com:

 Hi Igor

 I suspect you have very much the same problem as me.

 https://www.mail-archive.com/ceph-users@lists.ceph.com/msg22260.html

 Basically Samsung drives (like many SATA SSD's) are very much hit and
 miss so you will need to test them like described here to see if they are
 any good.
 http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/

 To give you an idea my average performance went from 11MB/s (with Samsung
 SSD) to 30MB/s (without any SSD) on write performance. This is a very small
 cluster.

 Pieter

 On Aug 12, 2015, at 04:33 PM, Voloshanenko Igor 
 igor.voloshane...@gmail.com wrote:

 Hi all, we have set up a CEPH cluster with 60 OSDs of 2 different types
 (5 nodes, 12 disks on each: 10 HDD, 2 SSD).

 We also cover this with a custom crushmap with 2 root buckets:

 ID   WEIGHT  TYPE NAME  UP/DOWN REWEIGHT PRIMARY-AFFINITY
 -100 5.0 root ssd
 -102 1.0 host ix-s2-ssd
2 1.0 osd.2   up  1.0  1.0
9 1.0 osd.9   up  1.0  1.0
 -103 1.0 host ix-s3-ssd
3 1.0 osd.3   up  1.0  1.0
7 1.0 osd.7   up  1.0  1.0
 -104 1.0 host ix-s5-ssd
1 1.0 osd.1   up  1.0  1.0
6 1.0 osd.6   up  1.0  1.0
 -105 1.0 host ix-s6-ssd
4 1.0 osd.4   up  1.0  1.0
8 1.0 osd.8   up  1.0  1.0
 -106 1.0 host ix-s7-ssd
0 1.0 osd.0   up  1.0  1.0
5 1.0 osd.5   up  1.0  1.0
   -1 5.0 root platter
   -2 1.0 host ix-s2-platter
   13 1.0 osd.13  up  1.0  1.0
   17 1.0 osd.17  up  1.0  1.0
   21 1.0 osd.21  up  1.0  1.0
   

Re: [ceph-users] RBD performance slowly degrades :-(

2015-08-12 Thread Irek Fasikhov
Hi.
Read this thread here:
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg17360.html

Best regards, Irek Fasikhov
Mob.: +79229045757

2015-08-12 14:52 GMT+03:00 Pieter Koorts pieter.koo...@me.com:

 Hi

 Something that's been bugging me for a while is I am trying to diagnose
 iowait time within KVM guests. Guests doing reads or writes tend to do about
 50% to 90% iowait but the host itself is only doing about 1% to 2% iowait.
 So the result is the guests are extremely slow.

 I currently run 3x hosts each with a single SSD and single HDD OSD in
 cache-tier writeback mode. Although the SSD (Samsung 850 EVO 120GB) is not
 a great one it should at least perform reasonably compared to a hard disk
 and doing some direct SSD tests I get approximately 100MB/s write and
 200MB/s read on each SSD.

 When I run rados bench though, the benchmark starts with a not great but
 okay speed and as the benchmark progresses it just gets slower and slower
 till it's worse than a USB hard drive. The SSD cache pool is 120GB in size
 (360GB RAW) and in use at about 90GB. I have tried tuning the XFS mount
 options as well but it has had little effect.

 Understandably the server spec is not great but I don't expect performance
 to be that bad.

 *OSD config:*
 [osd]
 osd crush update on start = false
 osd mount options xfs =
 rw,noatime,inode64,logbsize=256k,delaylog,allocsize=4M

 *Servers spec:*
 Dual Quad Core XEON E5410 and 32GB RAM in each server
 10GBE @ 10G speed with 8000byte Jumbo Frames.

 *Rados bench result:* (starts at 50MB/s average and plummets down to
 11MB/s)
 sudo rados bench -p rbd 50 write --no-cleanup -t 1
  Maintaining 1 concurrent writes of 4194304 bytes for up to 50 seconds or
 0 objects
  Object prefix: benchmark_data_osc-mgmt-1_10007
sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
  0   0 0 0 0 0 - 0
  1   11413   51.990652 0.0671911  0.074661
  2   12726   51.990852 0.0631836 0.0751152
  3   13736   47.992140 0.0691167 0.0802425
  4   15150   49.992256 0.0816432 0.0795869
  5   15655   43.993420  0.208393  0.088523
  6   1616039.99420  0.241164 0.0999179
  7   16463   35.993412  0.239001  0.106577
  8   16665   32.4942 8  0.214354  0.122767
  9   17271 31.5524  0.132588  0.125438
 10   17776   30.394820  0.256474  0.128548
 11   17978   28.3589 8  0.183564  0.138354
 12   18281   26.995612  0.345809  0.145523
 13   1858425.84212  0.373247  0.151291
 14   18685   24.2819 4  0.950586  0.160694
 15   18685   22.6632 0 -  0.160694
 16   19089   22.2466 8  0.204714  0.178352
 17   19493   21.879116  0.282236  0.180571
 18   19897   21.552416  0.262566  0.183742
 19   1   101   100   21.049512  0.357659  0.187477
 20   1   104   10320.59712  0.369327  0.192479
 21   1   105   104   19.8066 4  0.373233  0.194217
 22   1   105   104   18.9064 0 -  0.194217
 23   1   106   105   18.2582 2   2.35078  0.214756
 24   1   107   106   17.6642 4  0.680246  0.219147
 25   1   109   108   17.2776 8  0.677688  0.229222
 26   1   113   112   17.228316   0.29171  0.230487
 27   1   117   116   17.182816  0.255915  0.231101
 28   1   120   119   16.997612  0.412411  0.235122
 29   1   120   119   16.4115 0 -  0.235122
 30   1   120   119   15.8645 0 -  0.235122
 31   1   120   119   15.3527 0 -  0.235122
 32   1   122   121   15.1229 2  0.319309  0.262822
 33   1   124   123   14.9071 8  0.344094  0.266201
 34   1   127   126   14.821512   0.33534  0.267913
 35   1   129   128   14.6266 8  0.355403  0.269241
 36   1   132   131   14.553612  0.581528  0.274327
 37   1   132   131   14.1603 0 -  0.274327
 38   1   133   132   13.8929 2   1.43621   0.28313
 39   1   134   133   13.6392 4  0.894817  0.287729
 40   1   134   133   13.2982 0 -  0.287729
 41   1   

Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-22 Thread Irek Fasikhov
This is already possible in Proxmox 3.4 (with the latest updates, qemu-kvm 2.2.x),
but you need to set iothread: 1 in the VM's conf file. For single drives the
performance gain is ambiguous.
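A minimal sketch of what I mean (VM ID 100 is just an example; double-check the
exact option name against your qemu-server version before relying on it):

    # /etc/pve/qemu-server/100.conf
    iothread: 1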

2015-06-22 10:12 GMT+03:00 Stefan Priebe - Profihost AG 
s.pri...@profihost.ag:


 Am 22.06.2015 um 09:08 schrieb Alexandre DERUMIER aderum...@odiso.com:

  Just an update, there seems to be no proper way to pass iothread
  parameter from openstack-nova (not at least in Juno release). So a
  default single iothread per VM is what all we have. So in conclusion a
  nova instance max iops on ceph rbd will be limited to 30-40K.
 
  Thanks for the update.
 
  For proxmox users,
 
  I have added iothread option to gui for proxmox 4.0

 Can we make iothread the default? Does it also help for single disks or
 only multiple disks?

  and added jemalloc as default memory allocator
 
 
  I have also send a jemmaloc patch to qemu dev mailing
  https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg05265.html
 
  (Help is welcome to push it in qemu upstream ! )
 
 
 
  - Mail original -
  De: pushpesh sharma pushpesh@gmail.com
  À: aderumier aderum...@odiso.com
  Cc: Somnath Roy somnath@sandisk.com, Irek Fasikhov 
 malm...@gmail.com, ceph-devel ceph-de...@vger.kernel.org,
 ceph-users ceph-users@lists.ceph.com
  Envoyé: Lundi 22 Juin 2015 07:58:47
  Objet: Re: rbd_cache, limiting read on high iops around 40k
 
  Just an update, there seems to be no proper way to pass iothread
  parameter from openstack-nova (not at least in Juno release). So a
  default single iothread per VM is what all we have. So in conclusion a
  nova instance max iops on ceph rbd will be limited to 30-40K.
 
  On Tue, Jun 16, 2015 at 10:08 PM, Alexandre DERUMIER
  aderum...@odiso.com wrote:
  Hi,
 
  some news about qemu with tcmalloc vs jemalloc.
 
  I'm testing with multiple disks (with iothreads) in 1 qemu guest.
 
  And while tcmalloc is a little faster than jemalloc,
 
  I have hit the tcmalloc::ThreadCache::ReleaseToCentralCache bug a lot of
 times.
 
  Increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES doesn't help.
 
 
  With multiple disks, I'm around 200k iops with tcmalloc (before hitting
 the bug) and 350k iops with jemalloc.
 
  The problem is that when I hit the malloc bug, I'm around 4000-1 iops,
 and the only way to fix it is to restart qemu ...
 
 
 
  - Mail original -
  De: pushpesh sharma pushpesh@gmail.com
  À: aderumier aderum...@odiso.com
  Cc: Somnath Roy somnath@sandisk.com, Irek Fasikhov 
 malm...@gmail.com, ceph-devel ceph-de...@vger.kernel.org,
 ceph-users ceph-users@lists.ceph.com
  Envoyé: Vendredi 12 Juin 2015 08:58:21
  Objet: Re: rbd_cache, limiting read on high iops around 40k
 
  Thanks, posted the question in openstack list. Hopefully will get some
  expert opinion.
 
  On Fri, Jun 12, 2015 at 11:33 AM, Alexandre DERUMIER
  aderum...@odiso.com wrote:
  Hi,
 
  here a libvirt xml sample from libvirt src
 
  (you need to define iothreads number, then assign then in disks).
 
  I don't use openstack, so I really don't known how it's working with
 it.
 
 
  <domain type='qemu'>
    <name>QEMUGuest1</name>
    <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
    <memory unit='KiB'>219136</memory>
    <currentMemory unit='KiB'>219136</currentMemory>
    <vcpu placement='static'>2</vcpu>
    <iothreads>2</iothreads>
    <os>
      <type arch='i686' machine='pc'>hvm</type>
      <boot dev='hd'/>
    </os>
    <clock offset='utc'/>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>destroy</on_crash>
    <devices>
      <emulator>/usr/bin/qemu</emulator>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' iothread='1'/>
        <source file='/var/lib/libvirt/images/iothrtest1.img'/>
        <target dev='vdb' bus='virtio'/>
        <address type='pci' domain='0x' bus='0x00' slot='0x04' function='0x0'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='raw' iothread='2'/>
        <source file='/var/lib/libvirt/images/iothrtest2.img'/>
        <target dev='vdc' bus='virtio'/>
      </disk>
      <controller type='usb' index='0'/>
      <controller type='ide' index='0'/>
      <controller type='pci' index='0' model='pci-root'/>
      <memballoon model='none'/>
    </devices>
  </domain>
 
 
  - Mail original -
  De: pushpesh sharma pushpesh@gmail.com
  À: aderumier aderum...@odiso.com
  Cc: Somnath Roy somnath@sandisk.com, Irek Fasikhov 
 malm...@gmail.com, ceph-devel ceph-de...@vger.kernel.org,
 ceph-users ceph-users@lists.ceph.com
  Envoyé: Vendredi 12 Juin 2015 07:52:41
  Objet: Re: rbd_cache, limiting read on high iops around 40k
 
  Hi Alexandre,
 
  I agree with your rational, of one iothread per disk. CPU consumed in
  IOwait is pretty high in each VM. But I am not finding a way to set
  the same on a nova instance. I am using openstack Juno with QEMU+KVM.
  As per libvirt documentation for setting iothreads, I can edit
  domain.xml directly and achieve the same effect. However in as in
  openstack env domain xml is created by nova with some additional
  metadata, so editing

Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-22 Thread Irek Fasikhov
| Proxmox 4.0 will allow to enable|disable 1 iothread by disk.

Alexandre, useful option!
Will it be possible to add this in Proxmox 3.4, at least via the configuration
file? Or does it entail a change to the KVM source code?
Thanks.

2015-06-22 11:54 GMT+03:00 Alexandre DERUMIER aderum...@odiso.com:

 It is already possible to do in proxmox 3.4 (with the latest updates
 qemu-kvm 2.2.x). But it is necessary to register in the conf file
 iothread:1. For single drives the ambiguous behavior of productivity.

 Yes and no ;)

 Currently in Proxmox 3.4, iothread: 1 generates only one iothread for all
 disks.

 So, you'll have a small extra boost, but it'll not scale with multiple
 disks.

 Proxmox 4.0 will allow to enable|disable 1 iothread by disk.


 Does it also help for single disks or only multiple disks?

 An iothread can also help for a single disk, because by default qemu uses one main
 thread for the disk but also for other things (I don't remember what exactly)




 - Mail original -
 De: Irek Fasikhov malm...@gmail.com
 À: Stefan Priebe s.pri...@profihost.ag
 Cc: aderumier aderum...@odiso.com, pushpesh sharma 
 pushpesh@gmail.com, Somnath Roy somnath@sandisk.com,
 ceph-devel ceph-de...@vger.kernel.org, ceph-users 
 ceph-users@lists.ceph.com
 Envoyé: Lundi 22 Juin 2015 09:22:13
 Objet: Re: rbd_cache, limiting read on high iops around 40k

 It is already possible to do in proxmox 3.4 (with the latest updates
 qemu-kvm 2.2.x). But it is necessary to register in the conf file
 iothread:1. For single drives the ambiguous behavior of productivity.

 2015-06-22 10:12 GMT+03:00 Stefan Priebe - Profihost AG 
 s.pri...@profihost.ag  :



 Am 22.06.2015 um 09:08 schrieb Alexandre DERUMIER  aderum...@odiso.com :

  Just an update, there seems to be no proper way to pass iothread
  parameter from openstack-nova (not at least in Juno release). So a
  default single iothread per VM is what all we have. So in conclusion a
  nova instance max iops on ceph rbd will be limited to 30-40K.
 
  Thanks for the update.
 
  For proxmox users,
 
  I have added iothread option to gui for proxmox 4.0

 Can we make iothread the default? Does it also help for single disks or
 only multiple disks?

  and added jemalloc as default memory allocator
 
 
  I have also send a jemmaloc patch to qemu dev mailing
  https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg05265.html
 
  (Help is welcome to push it in qemu upstream ! )
 
 
 
  - Mail original -
  De: pushpesh sharma  pushpesh@gmail.com 
  À: aderumier  aderum...@odiso.com 
  Cc: Somnath Roy  somnath@sandisk.com , Irek Fasikhov 
 malm...@gmail.com , ceph-devel  ceph-de...@vger.kernel.org ,
 ceph-users  ceph-users@lists.ceph.com 
  Envoyé: Lundi 22 Juin 2015 07:58:47
  Objet: Re: rbd_cache, limiting read on high iops around 40k
 
  Just an update, there seems to be no proper way to pass iothread
  parameter from openstack-nova (not at least in Juno release). So a
  default single iothread per VM is what all we have. So in conclusion a
  nova instance max iops on ceph rbd will be limited to 30-40K.
 
  On Tue, Jun 16, 2015 at 10:08 PM, Alexandre DERUMIER
   aderum...@odiso.com  wrote:
  Hi,
 
  some news about qemu with tcmalloc vs jemmaloc.
 
  I'm testing with multiple disks (with iothreads) in 1 qemu guest.
 
  And if tcmalloc is a little faster than jemmaloc,
 
  I have hit a lot of time the
 tcmalloc::ThreadCache::ReleaseToCentralCache bug.
 
  increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, don't help.
 
 
  with multiple disk, I'm around 200k iops with tcmalloc (before hitting
 the bug) and 350kiops with jemmaloc.
 
  The problem is that when I hit malloc bug, I'm around 4000-1 iops,
 and only way to fix is is to restart qemu ...
 
 
 
  - Mail original -
  De: pushpesh sharma  pushpesh@gmail.com 
  À: aderumier  aderum...@odiso.com 
  Cc: Somnath Roy  somnath@sandisk.com , Irek Fasikhov 
 malm...@gmail.com , ceph-devel  ceph-de...@vger.kernel.org ,
 ceph-users  ceph-users@lists.ceph.com 
  Envoyé: Vendredi 12 Juin 2015 08:58:21
  Objet: Re: rbd_cache, limiting read on high iops around 40k
 
  Thanks, posted the question in openstack list. Hopefully will get some
  expert opinion.
 
  On Fri, Jun 12, 2015 at 11:33 AM, Alexandre DERUMIER
   aderum...@odiso.com  wrote:
  Hi,
 
  here a libvirt xml sample from libvirt src
 
  (you need to define iothreads number, then assign then in disks).
 
  I don't use openstack, so I really don't known how it's working with
 it.
 
 
  domain type='qemu'
  nameQEMUGuest1/name
  uuidc7a5fdbd-edaf-9455-926a-d65c16db1809/uuid
  memory unit='KiB'219136/memory
  currentMemory unit='KiB'219136/currentMemory
  vcpu placement='static'2/vcpu
  iothreads2/iothreads
  os
  type arch='i686' machine='pc'hvm/type
  boot dev='hd'/
  /os
  clock offset='utc'/
  on_poweroffdestroy/on_poweroff
  on_rebootrestart/on_reboot
  on_crashdestroy/on_crash
  devices
  emulator/usr/bin/qemu/emulator
  disk type='file' device='disk'
  driver

Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-17 Thread Irek Fasikhov
If necessary, there are RPM files for centos 7:

 gperftools.spec
 https://drive.google.com/file/d/0BxoNLVWxzOJWaVVmWTA3Z18zbUE/edit?usp=drive_web
 pprof-2.4-1.el7.centos.noarch.rpm
 https://drive.google.com/file/d/0BxoNLVWxzOJWRmQ2ZEt6a1pnSVk/edit?usp=drive_web
 gperftools-libs-2.4-1.el7.centos.x86_64.rpm
 https://drive.google.com/file/d/0BxoNLVWxzOJWcVByNUZHWWJqRXc/edit?usp=drive_web
 gperftools-devel-2.4-1.el7.centos.x86_64.rpm
 https://drive.google.com/file/d/0BxoNLVWxzOJWYTUzQTNha3J3NEU/edit?usp=drive_web
 gperftools-debuginfo-2.4-1.el7.centos.x86_64.rpm
 https://drive.google.com/file/d/0BxoNLVWxzOJWVzBic043YUk2LWM/edit?usp=drive_web
 gperftools-2.4-1.el7.centos.x86_64.rpm
 https://drive.google.com/file/d/0BxoNLVWxzOJWNm81QWdQYU9ZaG8/edit?usp=drive_web

2015-06-17 8:01 GMT+03:00 Alexandre DERUMIER aderum...@odiso.com:

 Hi,
 I finally fixed it with tcmalloc by launching qemu with

 TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=268435456 LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so.4 qemu

 I got almost the same result as jemalloc in this case, maybe a little bit
 faster


 Here the iops results for 1qemu vm with iothread by disk (iodepth=32,
 4krandread, nocache)


 qemu randread 4k nocache libc6  iops


 1 disk  29052
 2 disks 55878
 4 disks 127899
 8 disks 240566
 15 disks269976

 qemu randread 4k nocache jemmaloc   iops

 1 disk   41278
 2 disks  75781
 4 disks  195351
 8 disks  294241
 15 disks 298199



 qemu randread 4k nocache tcmalloc 16M cache iops


 1 disk   37911
 2 disks  67698
 4 disks  41076
 8 disks  43312
 15 disks 37569


 qemu randread 4k nocache tcmalloc patched 256M  iops

 1 disk no-iothread
 1 disk   42160
 2 disks  83135
 4 disks  194591
 8 disks  306038
 15 disks 302278


 - Mail original -
 De: aderumier aderum...@odiso.com
 À: Mark Nelson mnel...@redhat.com
 Cc: ceph-users ceph-users@lists.ceph.com
 Envoyé: Mardi 16 Juin 2015 20:27:54
 Objet: Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

 I forgot to ask, is this with the patched version of tcmalloc that
 theoretically fixes the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES issue?

 Yes, the patched version of tcmalloc, but also the last version from
 gperftools git.
 (I'm talking about qemu here, not osds).

 I have tried increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, but it
 doesn't help.



 For osd, increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES is helping.
 (Benchs are still running, I try to overload them as much as possible)
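 For reference, a rough sketch of how the variable can be raised for the OSDs on a
 systemd-managed box (the unit name and the value are assumptions -- adjust to your
 own init setup and RAM):

     mkdir -p /etc/systemd/system/ceph-osd@.service.d
     printf '[Service]\nEnvironment=TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728\n' \
       > /etc/systemd/system/ceph-osd@.service.d/tcmalloc.conf
     systemctl daemon-reload    # then restart the OSDs one at a time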



 - Mail original -
 De: Mark Nelson mnel...@redhat.com
 À: ceph-users ceph-users@lists.ceph.com
 Envoyé: Mardi 16 Juin 2015 19:04:27
 Objet: Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

 I forgot to ask, is this with the patched version of tcmalloc that
 theoretically fixes the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES issue?

 Mark

 On 06/16/2015 11:46 AM, Mark Nelson wrote:
  Hi Alexandre,
 
  Excellent find! Have you also informed the QEMU developers of your
  discovery?
 
  Mark
 
  On 06/16/2015 11:38 AM, Alexandre DERUMIER wrote:
  Hi,
 
  some news about qemu with tcmalloc vs jemmaloc.
 
  I'm testing with multiple disks (with iothreads) in 1 qemu guest.
 
  And if tcmalloc is a little faster than jemmaloc,
 
  I have hit a lot of time the
  tcmalloc::ThreadCache::ReleaseToCentralCache bug.
 
  increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES, don't help.
 
 
  with multiple disk, I'm around 200k iops with tcmalloc (before hitting
  the bug) and 350kiops with jemmaloc.
 
  The problem is that when I hit malloc bug, I'm around 4000-1 iops,
  and only way to fix is is to restart qemu ...
 
 
 
  - Mail original -
  De: pushpesh sharma pushpesh@gmail.com
  À: aderumier aderum...@odiso.com
  Cc: Somnath Roy somnath@sandisk.com, Irek Fasikhov
  malm...@gmail.com, ceph-devel ceph-de...@vger.kernel.org,
  ceph-users ceph-users@lists.ceph.com
  Envoyé: Vendredi 12 Juin 2015 08:58:21
  Objet: Re: rbd_cache, limiting read on high iops around 40k
 
  Thanks, posted the question in openstack list. Hopefully will get some
  expert opinion.
 
  On Fri, Jun 12, 2015 at 11:33 AM, Alexandre DERUMIER
  aderum...@odiso.com wrote:
  Hi,
 
  here a libvirt xml sample from libvirt src
 
  (you need to define iothreads number, then assign then in disks).
 
  I don't use openstack, so I really don't known how it's working with
 it.
 
 
  domain type='qemu'
  nameQEMUGuest1/name
  uuidc7a5fdbd-edaf-9455-926a-d65c16db1809/uuid
  memory unit='KiB'219136/memory
  currentMemory unit='KiB'219136/currentMemory
  vcpu placement='static'2/vcpu
  iothreads2/iothreads
  os
  type arch='i686' machine='pc'hvm/type
  boot dev='hd'/
  /os
  clock offset='utc'/
  on_poweroffdestroy/on_poweroff
  on_rebootrestart/on_reboot
  on_crashdestroy/on_crash
  devices
  emulator/usr/bin/qemu/emulator
  disk type='file' device='disk'
  driver name='qemu' type='raw' iothread='1'/
  source file='/var/lib/libvirt/images/iothrtest1

Re: [ceph-users] [Fwd: adding a a monitor wil result in cephx: verify_reply couldn't decrypt with error: error decoding block for decryption]

2015-06-11 Thread Irek Fasikhov
It is necessary to synchronize time

2015-06-11 11:09 GMT+03:00 Makkelie, R (ITCDCC) - KLM 
ramon.makke...@klm.com:

 i'm trying to add an extra monitor to my already existing cluster
 i do this with ceph-deploy with the following command

 ceph-deploy mon add mynewhost

 the ceph-deploy says its all finished
 but when i take a look at my new monitor host in the logs i see the
 following error

 cephx: verify_reply couldn't decrypt with error: error decoding block for
 decryption

 and when i take a look in my existing monitor logs i see this error
 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES
 final round failed: -8190

 i tried gathering keys
 copying keys
 reinstalling/purging the new monitor node

 greetz
 Ramon 
 For information, services and offers, please visit our web site:
 http://www.klm.com. This e-mail and any attachment may contain
 confidential and privileged material intended for the addressee only. If
 you are not the addressee, you are notified that no part of the e-mail or
 any attachment may be disclosed, copied or distributed, and that any other
 action related to this e-mail or attachment is strictly prohibited, and may
 be unlawful. If you have received this e-mail by error, please notify the
 sender immediately by return e-mail, and delete this message.

 Koninklijke Luchtvaart Maatschappij NV (KLM), its subsidiaries and/or its
 employees shall not be liable for the incorrect or incomplete transmission
 of this e-mail or any attachments, nor responsible for any delay in receipt.
 Koninklijke Luchtvaart Maatschappij N.V. (also known as KLM Royal Dutch
 Airlines) is registered in Amstelveen, The Netherlands, with registered
 number 33014286
 

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Fwd: adding a a monitor wil result in cephx: verify_reply couldn't decrypt with error: error decoding block for decryption]

2015-06-11 Thread Irek Fasikhov
Run the command by hand: ntpdate NTPADDRESS
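For example (ntpd has to be stopped while ntpdate runs; the pool address below is
just a placeholder for your own NTP server):

    service ntpd stop
    ntpdate pool.ntp.org      # one-off correction of the clock
    service ntpd start
    ntpq -p                   # check peers and offsets afterwards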

2015-06-11 12:36 GMT+03:00 Makkelie, R (ITCDCC) - KLM 
ramon.makke...@klm.com:

  all ceph releated servers have the same NTP server
 and double checked the time and timezones
 the are all correct


 -Original Message-
 *From*: Irek Fasikhov malm...@gmail.com
 *To*: Makkelie, R (ITCDCC) - KLM ramon.makke...@klm.com
 *Cc*: ceph-users@lists.ceph.com
 *Subject*: Re: [ceph-users] [Fwd: adding a a monitor wil result in cephx:
 verify_reply couldn't decrypt with error: error decoding block for
 decryption]
 *Date*: Thu, 11 Jun 2015 12:16:53 +0300

 It is necessary to synchronize time


 2015-06-11 11:09 GMT+03:00 Makkelie, R (ITCDCC) - KLM 
 ramon.makke...@klm.com:

 i'm trying to add a extra monitor to my already existing cluster
 i do this with the ceph-deploy with the following command

 ceph-deploy mon add mynewhost

 the ceph-deploy says its all finished
 but when i take a look at my new monitor host in the logs i see the
 following error

 cephx: verify_reply couldn't decrypt with error: error decoding block for
 decryption

 and when i take a look in my existing monitor logs i see this error
 cephx: verify_authorizer could not decrypt ticket info: error: NSS AES
 final round failed: -8190

 i tried gatherking key's
 copy keys
 reinstall/purge the new monitor node

 greetz
 Ramon


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





 -- С уважением, Фасихов Ирек Нургаязович Моб.: +79229045757
 




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-09 Thread Irek Fasikhov
Hi, Alexandre.

Very good work!
Do you have an RPM file?
Thanks.

2015-06-10 7:10 GMT+03:00 Alexandre DERUMIER aderum...@odiso.com:

 Hi,

 I have tested qemu with last tcmalloc 2.4, and the improvement is huge
 with iothread: 50k iops (+45%) !



 qemu : no iothread : glibc : iops=33395
 qemu : no-iothread : tcmalloc (2.2.1) : iops=34516 (+3%)
 qemu : no-iothread : jemmaloc : iops=42226 (+26%)
 qemu : no-iothread : tcmalloc (2.4) : iops=35974 (+7%)


 qemu : iothread : glibc : iops=34516
 qemu : iothread : tcmalloc : iops=38676 (+12%)
 qemu : iothread : jemmaloc : iops=28023 (-19%)
 qemu : iothread : tcmalloc (2.4) : iops=50276 (+45%)





 qemu : iothread : tcmalloc (2.4) : iops=50276 (+45%)
 --
 rbd_iodepth32-test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K,
 ioengine=libaio, iodepth=32
 fio-2.1.11
 Starting 1 process
 Jobs: 1 (f=1): [r(1)] [100.0% done] [214.7MB/0KB/0KB /s] [54.1K/0/0 iops]
 [eta 00m:00s]
 rbd_iodepth32-test: (groupid=0, jobs=1): err= 0: pid=894: Wed Jun 10
 05:54:24 2015
   read : io=5120.0MB, bw=201108KB/s, iops=50276, runt= 26070msec
 slat (usec): min=1, max=1136, avg= 3.54, stdev= 3.58
 clat (usec): min=128, max=6262, avg=631.41, stdev=197.71
  lat (usec): min=149, max=6265, avg=635.27, stdev=197.40
 clat percentiles (usec):
  |  1.00th=[  318],  5.00th=[  378], 10.00th=[  418], 20.00th=[  474],
  | 30.00th=[  516], 40.00th=[  564], 50.00th=[  612], 60.00th=[  652],
  | 70.00th=[  700], 80.00th=[  756], 90.00th=[  860], 95.00th=[  980],
  | 99.00th=[ 1272], 99.50th=[ 1384], 99.90th=[ 1688], 99.95th=[ 1896],
  | 99.99th=[ 3760]
 bw (KB  /s): min=145608, max=249688, per=100.00%, avg=201108.00,
 stdev=21718.87
 lat (usec) : 250=0.04%, 500=25.84%, 750=53.00%, 1000=16.63%
 lat (msec) : 2=4.46%, 4=0.03%, 10=0.01%
   cpu  : usr=9.73%, sys=24.93%, ctx=66417, majf=0, minf=38
   IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%,
 =64=0.0%
  submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
 =64=0.0%
  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%,
 =64=0.0%
  issued: total=r=1310720/w=0/d=0, short=r=0/w=0/d=0
  latency   : target=0, window=0, percentile=100.00%, depth=32

 Run status group 0 (all jobs):
READ: io=5120.0MB, aggrb=201107KB/s, minb=201107KB/s, maxb=201107KB/s,
 mint=26070msec, maxt=26070msec

 Disk stats (read/write):
   vdb: ios=1302555/0, merge=0/0, ticks=715176/0, in_queue=714840,
 util=99.73%






 rbd_iodepth32-test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K,
 ioengine=libaio, iodepth=32
 fio-2.1.11
 Starting 1 process
 Jobs: 1 (f=1): [r(1)] [100.0% done] [158.7MB/0KB/0KB /s] [40.6K/0/0 iops]
 [eta 00m:00s]
 rbd_iodepth32-test: (groupid=0, jobs=1): err= 0: pid=889: Wed Jun 10
 06:05:06 2015
   read : io=5120.0MB, bw=143897KB/s, iops=35974, runt= 36435msec
 slat (usec): min=1, max=710, avg= 3.31, stdev= 3.35
 clat (usec): min=191, max=4740, avg=884.66, stdev=315.65
  lat (usec): min=289, max=4743, avg=888.31, stdev=315.51
 clat percentiles (usec):
  |  1.00th=[  462],  5.00th=[  516], 10.00th=[  548], 20.00th=[  596],
  | 30.00th=[  652], 40.00th=[  764], 50.00th=[  868], 60.00th=[  940],
  | 70.00th=[ 1004], 80.00th=[ 1096], 90.00th=[ 1256], 95.00th=[ 1416],
  | 99.00th=[ 2024], 99.50th=[ 2224], 99.90th=[ 2544], 99.95th=[ 2640],
  | 99.99th=[ 3632]
 bw (KB  /s): min=98352, max=177328, per=99.91%, avg=143772.11,
 stdev=21782.39
 lat (usec) : 250=0.01%, 500=3.48%, 750=35.69%, 1000=30.01%
 lat (msec) : 2=29.74%, 4=1.07%, 10=0.01%
   cpu  : usr=7.10%, sys=16.90%, ctx=54855, majf=0, minf=38
   IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%,
 =64=0.0%
  submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
 =64=0.0%
  complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%,
 =64=0.0%
  issued: total=r=1310720/w=0/d=0, short=r=0/w=0/d=0
  latency   : target=0, window=0, percentile=100.00%, depth=32

 Run status group 0 (all jobs):
READ: io=5120.0MB, aggrb=143896KB/s, minb=143896KB/s, maxb=143896KB/s,
 mint=36435msec, maxt=36435msec

 Disk stats (read/write):
   vdb: ios=1301357/0, merge=0/0, ticks=1033036/0, in_queue=1032716,
 util=99.85%


 - Mail original -
 De: aderumier aderum...@odiso.com
 À: Robert LeBlanc rob...@leblancnet.us
 Cc: Mark Nelson mnel...@redhat.com, ceph-devel 
 ceph-de...@vger.kernel.org, pushpesh sharma pushpesh@gmail.com,
 ceph-users ceph-users@lists.ceph.com
 Envoyé: Mardi 9 Juin 2015 18:47:27
 Objet: Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

 Hi Robert,

 What I found was that Ceph OSDs performed well with either
 tcmalloc or jemalloc (except when RocksDB was built with jemalloc
 instead of tcmalloc, I'm still working to dig into why that might be
 the case).
 yes,from my test, for osd tcmalloc is a little faster (but very 

Re: [ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-12 Thread Irek Fasikhov
Patrik,
At the moment you do not have any problems related to slow requests.

2015-05-12 8:56 GMT+03:00 Patrik Plank pat...@plank.me:

  So ok, understand.

 But what can I do if the scrubbing process has been stuck on one PG since last
 night:


 root@ceph01:~# ceph health detail
 HEALTH_OK

 root@ceph01:~# ceph pg dump | grep scrub
 pg_stat  objects  mip  degr  misp  unf  bytes      log  disklog  state
 2.5cb    101      0    0     0     0    423620608  324  324      active+clean+scrubbing+deep
 state_stamp: 2015-05-11 23:01:37.056747    v: 4749'324    reported: 4749:6524
 up: [14,10]  up_primary: 14  acting: [14,10]  acting_primary: 14
 last_scrub: 4749'318       scrub_stamp: 2015-05-10 22:05:29.252876
 last_deep_scrub: 3423'309  deep_scrub_stamp: 2015-05-04 21:44:46.609791

 Perhaps an idea?


 best regards


  -Original message-
 *From:* Irek Fasikhov malm...@gmail.com
 *Sent:* Tuesday 12th May 2015 7:49
 *To:* Patrik Plank pat...@plank.me; ceph-users@lists.ceph.com
 *Subject:* Re: [ceph-users] HEALTH_WARN 6 requests are blocked

 Scrubbing greatly affects the I / O and can slow queries on OSD. For more
 information, look in the 'ceph health detail' and 'ceph pg dump | grep
 scrub'

 2015-05-12 8:42 GMT+03:00 Patrik Plank pat...@plank.me:

  Hi,


 is that the reason for the Health Warn or the scrubbing notification?



 thanks

 regards


  -Original message-
 *From:* Irek Fasikhov malm...@gmail.com
 *Sent:* Tuesday 12th May 2015 7:33
 *To:* Patrik Plank pat...@plank.me
 *Cc:* ceph-users@lists.ceph.com
 *Subject:* Re: [ceph-users] HEALTH_WARN 6 requests are blocked

 Hi, Patrik.

 You must configure the priority of the I / O for scrubbing.

 http://dachary.org/?p=3268



 2015-05-12 8:03 GMT+03:00 Patrik Plank pat...@plank.me:

  Hi,


 the ceph cluster always shows the scrubbing notification, although it is
 not scrubbing.

 And what does the Health Warn mean.

 Does anybody have an idea why the warning is displayed.

 How can I solve this?


  cluster 78227661-3a1b-4e56-addc-c2a272933ac2
  health HEALTH_WARN 6 requests are blocked  32 sec
  monmap e3: 3 mons at {ceph01=
 10.0.0.20:6789/0,ceph02=10.0.0.21:6789/0,ceph03=10.0.0.22:6789/0},
 election epoch 92, quorum 0,1,2 ceph01,ceph02,ceph03
  osdmap e4749: 30 osds: 30 up, 30 in
   pgmap v2321129: 4608 pgs, 2 pools, 1712 GB data, 440 kobjects
 3425 GB used, 6708 GB / 10134 GB avail
1 active+clean+scrubbing+deep
 4607 active+clean
   client io 3282 kB/s rd, 10742 kB/s wr, 182 op/s


 thanks

 best regards

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
  С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757




 --
  С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-11 Thread Irek Fasikhov
Hi, Patrik.

You must configure the I/O priority for scrubbing.

http://dachary.org/?p=3268
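As far as I remember, the settings from that post boil down to lowering the
disk-thread I/O priority, roughly like this (only effective when the OSD data
disks use the CFQ scheduler; put the same values in ceph.conf to persist them):

    ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
    ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'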



2015-05-12 8:03 GMT+03:00 Patrik Plank pat...@plank.me:

  Hi,


 the ceph cluster shows always the scrubbing notifications, although he do
 not scrub.

 And what does the Health Warn mean.

 Does anybody have an idea why the warning is displayed.

 How can I solve this?


  cluster 78227661-3a1b-4e56-addc-c2a272933ac2
  health HEALTH_WARN 6 requests are blocked  32 sec
  monmap e3: 3 mons at {ceph01=
 10.0.0.20:6789/0,ceph02=10.0.0.21:6789/0,ceph03=10.0.0.22:6789/0},
 election epoch 92, quorum 0,1,2 ceph01,ceph02,ceph03
  osdmap e4749: 30 osds: 30 up, 30 in
   pgmap v2321129: 4608 pgs, 2 pools, 1712 GB data, 440 kobjects
 3425 GB used, 6708 GB / 10134 GB avail
1 active+clean+scrubbing+deep
 4607 active+clean
   client io 3282 kB/s rd, 10742 kB/s wr, 182 op/s


 thanks

 best regards

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-11 Thread Irek Fasikhov
Scrubbing greatly affects I/O and can cause slow requests on the OSDs. For more
information, look at 'ceph health detail' and 'ceph pg dump | grep scrub'

2015-05-12 8:42 GMT+03:00 Patrik Plank pat...@plank.me:

  Hi,


 is that the reason for the Health Warn or the scrubbing notification?



 thanks

 regards


  -Original message-
 *From:* Irek Fasikhov malm...@gmail.com
 *Sent:* Tuesday 12th May 2015 7:33
 *To:* Patrik Plank pat...@plank.me
 *Cc:* ceph-users@lists.ceph.com
 *Subject:* Re: [ceph-users] HEALTH_WARN 6 requests are blocked

 Hi, Patrik.

 You must configure the priority of the I / O for scrubbing.

 http://dachary.org/?p=3268



 2015-05-12 8:03 GMT+03:00 Patrik Plank pat...@plank.me:

  Hi,


 the ceph cluster shows always the scrubbing notifications, although he do
 not scrub.

 And what does the Health Warn mean.

 Does anybody have an idea why the warning is displayed.

 How can I solve this?


  cluster 78227661-3a1b-4e56-addc-c2a272933ac2
  health HEALTH_WARN 6 requests are blocked  32 sec
  monmap e3: 3 mons at {ceph01=
 10.0.0.20:6789/0,ceph02=10.0.0.21:6789/0,ceph03=10.0.0.22:6789/0},
 election epoch 92, quorum 0,1,2 ceph01,ceph02,ceph03
  osdmap e4749: 30 osds: 30 up, 30 in
   pgmap v2321129: 4608 pgs, 2 pools, 1712 GB data, 440 kobjects
 3425 GB used, 6708 GB / 10134 GB avail
1 active+clean+scrubbing+deep
 4607 active+clean
   client io 3282 kB/s rd, 10742 kB/s wr, 182 op/s


 thanks

 best regards

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
  С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] very different performance on two volumes in the same pool

2015-04-27 Thread Irek Fasikhov
Hi, Nikola.

https://www.mail-archive.com/ceph-users@lists.ceph.com/msg19152.html

2015-04-27 14:17 GMT+03:00 Nikola Ciprich nikola.cipr...@linuxbox.cz:

 Hello Somnath,
  Thanks for the perf data..It seems innocuous..I am not seeing single
 tcmalloc trace, are you running with tcmalloc by the way ?

 according to ldd, it seems I have it compiled in, yes:
 [root@vfnphav1a ~]# ldd /usr/bin/ceph-osd
 .
 .
 libtcmalloc.so.4 = /usr/lib64/libtcmalloc.so.4 (0x7f7a3756e000)
 .
 .


  What about my other question, is the performance of slow volume
 increasing if you stop IO on the other volume ?
 I don't have any other ceph users; actually the whole cluster is idle..

  Are you using default ceph.conf ? Probably, you want to try with
 different osd_op_num_shards (may be = 10 , based on your osd server config)
 and osd_op_num_threads_per_shard (may be = 1). Also, you may want to see
 the effect by doing osd_enable_op_tracker = false

 I guess I'm using pretty default settings, few changes probably not much
 related:

 [osd]
 osd crush update on start = false

 [client]
 rbd cache = true
 rbd cache writethrough until flush = true

 [mon]
 debug paxos = 0



 I now tried setting
 throttler perf counter = false
 osd enable op tracker = false
 osd_op_num_threads_per_shard = 1
 osd_op_num_shards = 10

 and restarting all ceph servers.. but it seems to make no big difference..


 
  Are you seeing similar resource consumption on both the servers while IO
 is going on ?
 yes, on all three nodes, ceph-osd seems to be consuming lots of CPU during
 benchmark.

 
  Need some information about your client, are the volumes exposed with
 krbd or running with librbd environment ? If krbd and with same physical
 box, hope you mapped the images with 'noshare' enabled.
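  Something along these lines is what I mean (image and pool names are examples):

      rbd map rbd/vol1 --id admin -o noshare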

 I'm using fio with the ceph (rbd) engine, so I guess no krbd-related stuff is in
 use here?
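 For reference, the kind of job I run looks roughly like this (pool/image names
 are mine, nothing special):

     fio --name=rbdtest --ioengine=rbd --clientname=admin --pool=rbd \
         --rbdname=testimg --rw=randread --bs=4k --iodepth=32 --direct=1 \
         --runtime=60 --time_based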


 
  Too many questions :-)  But, this may give some indication what is going
 on there.
 :-) hopefully my answers are not too confused, I'm still pretty new to
 ceph..

 BR

 nik


 
  Thanks  Regards
  Somnath
 
  -Original Message-
  From: Nikola Ciprich [mailto:nikola.cipr...@linuxbox.cz]
  Sent: Sunday, April 26, 2015 7:32 AM
  To: Somnath Roy
  Cc: ceph-users@lists.ceph.com; n...@linuxbox.cz
  Subject: Re: [ceph-users] very different performance on two volumes in
 the same pool
 
  Hello Somnath,
 
  On Fri, Apr 24, 2015 at 04:23:19PM +, Somnath Roy wrote:
   This could be again because of tcmalloc issue I reported earlier.
  
   Two things to observe.
  
   1. Is the performance improving if you stop IO on other volume ? If
 so, it could be different issue.
  there is no other IO.. only cephfs mounted, but no users of it.
 
  
   2. Run perf top in the OSD node and see if tcmalloc traces are popping
 up.
 
  don't see anything special:
 
3.34%  libc-2.12.so  [.] _int_malloc
2.87%  libc-2.12.so  [.] _int_free
2.79%  [vdso][.] __vdso_gettimeofday
2.67%  libsoftokn3.so[.] 0x0001fad9
2.34%  libfreeblpriv3.so [.] 0x000355e6
2.33%  libpthread-2.12.so[.] pthread_mutex_unlock
2.19%  libpthread-2.12.so[.] pthread_mutex_lock
1.80%  libc-2.12.so  [.] malloc
1.43%  [kernel]  [k] do_raw_spin_lock
1.42%  libc-2.12.so  [.] memcpy
1.23%  [kernel]  [k] __switch_to
1.19%  [kernel]  [k]
 acpi_processor_ffh_cstate_enter
1.09%  libc-2.12.so  [.] malloc_consolidate
1.08%  [kernel]  [k] __schedule
1.05%  libtcmalloc.so.4.1.0  [.] 0x00017e6f
0.98%  libc-2.12.so  [.] vfprintf
0.83%  libstdc++.so.6.0.13   [.] std::basic_ostreamchar,
 std::char_traitschar  std::__ostream_insertchar,
 std::char_traitschar (std::basic_ostreamchar,
0.76%  libstdc++.so.6.0.13   [.] 0x0008092a
0.73%  libc-2.12.so  [.] __memset_sse2
0.72%  libc-2.12.so  [.] __strlen_sse42
0.70%  libstdc++.so.6.0.13   [.] std::basic_streambufchar,
 std::char_traitschar ::xsputn(char const*, long)
0.68%  libpthread-2.12.so[.] pthread_mutex_trylock
0.67%  librados.so.2.0.0 [.] ceph_crc32c_sctp
0.63%  libpython2.6.so.1.0   [.] 0x0007d823
0.55%  libnss3.so[.] 0x00056d2a
0.52%  libc-2.12.so  [.] free
0.50%  libstdc++.so.6.0.13   [.] std::basic_stringchar,
 std::char_traitschar, std::allocatorchar ::basic_string(std::string
 const)
 
  should I check anything else?
  BR
  nik
 
 
  
   Thanks  Regards
   Somnath
  
   -Original Message-
   From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
 Of Nikola Ciprich
   Sent: Friday, April 24, 2015 7:10 AM
   To: 

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-24 Thread Irek Fasikhov
Hi, Alexandre!
Have you tried changing the vm.min_free_kbytes parameter?
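For example (the value is only an illustration; pick it based on the amount of RAM):

    sysctl vm.min_free_kbytes                  # check the current value first
    sysctl -w vm.min_free_kbytes=524288        # reserve ~512 MB
    echo 'vm.min_free_kbytes = 524288' >> /etc/sysctl.conf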

2015-04-23 19:24 GMT+03:00 Somnath Roy somnath@sandisk.com:

 Alexandre,
 You can configure with --with-jemalloc or ./do_autogen -J to build ceph
 with jemalloc.
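 Roughly, from a source checkout (package name for a RHEL/CentOS-style box; the
 autogen script name may differ slightly between branches):

     yum install -y jemalloc-devel
     ./autogen.sh && ./configure --with-jemalloc    # or: ./do_autogen.sh -J
     make -j"$(nproc)"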

 Thanks  Regards
 Somnath

 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
 Alexandre DERUMIER
 Sent: Thursday, April 23, 2015 4:56 AM
 To: Mark Nelson
 Cc: ceph-users; ceph-devel; Milosz Tanski
 Subject: Re: [ceph-users] strange benchmark problem : restarting osd
 daemon improve performance from 100k iops to 300k iops

 If you have the means to compile the same version of ceph with
 jemalloc, I would be very interested to see how it does.

 Yes, sure. (I have around 3-4 weeks to do all the benchs)

 But I don't know how to do it ?
 I'm running the cluster on centos7.1, maybe it can be easy to patch the
 srpms to rebuild the package with jemalloc.



 - Mail original -
 De: Mark Nelson mnel...@redhat.com
 À: aderumier aderum...@odiso.com, Srinivasula Maram 
 srinivasula.ma...@sandisk.com
 Cc: ceph-users ceph-users@lists.ceph.com, ceph-devel 
 ceph-de...@vger.kernel.org, Milosz Tanski mil...@adfin.com
 Envoyé: Jeudi 23 Avril 2015 13:33:00
 Objet: Re: [ceph-users] strange benchmark problem : restarting osd daemon
 improve performance from 100k iops to 300k iops

 Thanks for the testing Alexandre!

 If you have the means to compile the same version of ceph with jemalloc, I
 would be very interested to see how it does.

 In some ways I'm glad it turned out not to be NUMA. I still suspect we
 will have to deal with it at some point, but perhaps not today. ;)

 Mark

 On 04/23/2015 05:58 AM, Alexandre DERUMIER wrote:
  Maybe it's tcmalloc related
  I thought I had patched it correctly, but perf shows a lot of
  tcmalloc::ThreadCache::ReleaseToCentralCache
 
  before osd restart (100k)
  --
  11.66% ceph-osd libtcmalloc.so.4.1.2 [.] tcmalloc::ThreadCache::ReleaseToCentralCache
   8.51% ceph-osd libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::FetchFromSpans
   3.04% ceph-osd libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::ReleaseToSpans
   2.04% ceph-osd libtcmalloc.so.4.1.2 [.] operator new
   1.63% swapper  [kernel.kallsyms]    [k] intel_idle
   1.35% ceph-osd libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::ReleaseListToSpans
   1.33% ceph-osd libtcmalloc.so.4.1.2 [.] operator delete
   1.07% ceph-osd libstdc++.so.6.0.19  [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string
   0.91% ceph-osd libpthread-2.17.so   [.] pthread_mutex_trylock
   0.88% ceph-osd libc-2.17.so         [.] __memcpy_ssse3_back
   0.81% ceph-osd ceph-osd             [.] Mutex::Lock
   0.79% ceph-osd [kernel.kallsyms]    [k] copy_user_enhanced_fast_string
   0.74% ceph-osd libpthread-2.17.so   [.] pthread_mutex_unlock
   0.67% ceph-osd [kernel.kallsyms]    [k] _raw_spin_lock
   0.63% swapper  [kernel.kallsyms]    [k] native_write_msr_safe
   0.62% ceph-osd [kernel.kallsyms]    [k] avc_has_perm_noaudit
   0.58% ceph-osd ceph-osd             [.] operator
   0.57% ceph-osd [kernel.kallsyms]    [k] __schedule
   0.57% ceph-osd [kernel.kallsyms]    [k] __d_lookup_rcu
   0.54% swapper  [kernel.kallsyms]    [k] __schedule
 
 
  after osd restart (300k iops)
  --
   3.47% ceph-osd libtcmalloc.so.4.1.2 [.] operator new
   1.92% ceph-osd libtcmalloc.so.4.1.2 [.] operator delete
   1.86% swapper  [kernel.kallsyms]    [k] intel_idle
   1.52% ceph-osd libstdc++.so.6.0.19  [.] std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string
   1.34% ceph-osd libtcmalloc.so.4.1.2 [.] tcmalloc::ThreadCache::ReleaseToCentralCache
   1.24% ceph-osd libc-2.17.so         [.] __memcpy_ssse3_back
   1.23% ceph-osd ceph-osd             [.] Mutex::Lock
   1.21% ceph-osd libpthread-2.17.so   [.] pthread_mutex_trylock
   1.11% ceph-osd [kernel.kallsyms]    [k] copy_user_enhanced_fast_string
   0.95% ceph-osd libpthread-2.17.so   [.] pthread_mutex_unlock
   0.94% ceph-osd [kernel.kallsyms]    [k] _raw_spin_lock
   0.78% ceph-osd [kernel.kallsyms]    [k] __d_lookup_rcu
   0.70% ceph-osd [kernel.kallsyms]    [k] tcp_sendmsg
   0.70% ceph-osd ceph-osd             [.] Message::Message
   0.68% ceph-osd [kernel.kallsyms]    [k] __schedule
   0.66% ceph-osd [kernel.kallsyms]    [k] idle_cpu
   0.65% ceph-osd libtcmalloc.so.4.1.2 [.] tcmalloc::CentralFreeList::FetchFromSpans
   0.64% swapper  [kernel.kallsyms]    [k] native_write_msr_safe
   0.61% ceph-osd ceph-osd             [.] std::tr1::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release
   0.60% swapper  [kernel.kallsyms]    [k] __schedule
   0.60% ceph-osd libstdc++.so.6.0.19  [.] 0x000bdd2b
   0.57% ceph-osd ceph-osd             [.] operator
   0.57% ceph-osd ceph-osd             [.] crc32_iscsi_00
   0.56% ceph-osd libstdc++.so.6.0.19  [.] std::string::_Rep::_M_dispose
   0.55% ceph-osd [kernel.kallsyms]    [k] __switch_to
   0.54% ceph-osd libc-2.17.so         [.] vfprintf
   0.52% ceph-osd [kernel.kallsyms]    [k] fget_light
 
  - Mail original -
  De: aderumier aderum...@odiso.com
  À: Srinivasula Maram 

Re: [ceph-users] Firefly - Giant : CentOS 7 : install failed ceph-deploy

2015-04-08 Thread Irek Fasikhov
I use CentOS 7.1. The problem is that the base package repository also has
ceph-common.

[root@ceph01p24 cluster]# yum --showduplicates list ceph-common
Loaded plugins: dellsysid, etckeeper, fastestmirror, priorities
Loading mirror speeds from cached hostfile
 * base: centos-mirror.rbc.ru
 * epel: be.mirror.eurid.eu
 * extras: ftp.funet.fi
 * updates: centos-mirror.rbc.ru
Installed Packages
ceph-common.x86_64          0.80.7-0.el7.centos      @Ceph
Available Packages
ceph-common.x86_64          0.80.6-0.el7.centos      Ceph
ceph-common.x86_64          0.80.7-0.el7.centos      Ceph
ceph-common.x86_64          0.80.8-0.el7.centos      Ceph
ceph-common.x86_64          0.80.9-0.el7.centos      Ceph
ceph-common.x86_64          1:0.80.7-0.4.el7         epel
ceph-common.x86_64          1:0.80.7-2.el7           base

I make the installation as follows:

rpm -ivh
http://ceph.com/rpm-firefly/el7/noarch/ceph-release-1-0.el7.noarch.rpm
yum install redhat-lsb-core-4.1-27.el7.centos.1.x86_64
gperftools-libs.x86_64 yum-plugin-priorities.noarch ntp -y
yum install librbd1-0.80.7-0.el7.centos
librados2-0.80.7-0.el7.centos.x86_64.rpm -y
yum install gdisk cryptsetup leveldb python-jinja2 hdparm -y

yum install --disablerepo=base --disablerepo=epel
ceph-common-0.80.7-0.el7.centos.x86_64 -y
yum install --disablerepo=base --disablerepo=epel ceph-0.80.7-0.el7.centos
-y

2015-04-08 12:40 GMT+03:00 Vickey Singh vickey.singh22...@gmail.com:

 Hello Everyone


 I also tried setting higher priority as suggested by SAM but no luck


 Please see the Full logs here http://paste.ubuntu.com/10771358/


 While installing, yum searches the correct Ceph repository but it finds 3
 versions of python-ceph under http://ceph.com/rpm-giant/el7/x86_64/


 How can i instruct yum to install latest version of ceph from giant
 repository ?? FYI i have this setting already


 [root@rgw-node1 yum.repos.d]# cat /etc/yum/pluginconf.d/priorities.conf

 [main]

 enabled = 1

 check_obsoletes = 1

 [root@rgw-node1 yum.repos.d]#




 This issue can be easily reproduced, just now i tried on a fresh server
 centos 7.0.1406 but it still fails.

 Please help.

 Please help.

 Please help.


 # cat /etc/redhat-release

 CentOS Linux release 7.0.1406 (Core)

 #

 # uname -r

 3.10.0-123.20.1.el7.x86_64

 #


 Regards

 VS


 On Wed, Apr 8, 2015 at 11:10 AM, Sam Wouters s...@ericom.be wrote:

  Hi Vickey,

 we had a similar issue and we resolved it by giving the centos base and
 update repos a higher priority (e.g. 10) than the epel repo.
 The ceph-deploy tool only sets a priority of 1 for the ceph repos, but the
 centos and epel repos stay on the default of 99.
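 For example, something like this on each node (assuming yum-plugin-priorities is
 installed and the stock CentOS-Base.repo layout):

     sed -i '/^\[base\]/a priority=10'    /etc/yum.repos.d/CentOS-Base.repo
     sed -i '/^\[updates\]/a priority=10' /etc/yum.repos.d/CentOS-Base.repo
     # epel stays at the default (99); the ceph repos keep priority=1 from ceph-deploy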

 regards,
 Sam

 On 08-04-15 09:32, Vickey Singh wrote:

  Hi Ken


  As per your suggestion , i tried enabling epel-testing repository but
 still no luck.


  Please check the below output. I would really appreciate  any help
 here.



  # yum install ceph --enablerepo=epel-testing


  --- Package python-rbd.x86_64 1:0.80.7-0.5.el7 will be installed

 -- Processing Dependency: librbd1 = 1:0.80.7 for package:
 1:python-rbd-0.80.7-0.5.el7.x86_64

 -- Finished Dependency Resolution

 Error: Package: 1:python-cephfs-0.80.7-0.4.el7.x86_64 (epel)

Requires: libcephfs1 = 1:0.80.7

Available: 1:libcephfs1-0.86-0.el7.centos.x86_64 (Ceph)

libcephfs1 = 1:0.86-0.el7.centos

Available: 1:libcephfs1-0.87-0.el7.centos.x86_64 (Ceph)

libcephfs1 = 1:0.87-0.el7.centos

Installing: 1:libcephfs1-0.87.1-0.el7.centos.x86_64 (Ceph)

libcephfs1 = 1:0.87.1-0.el7.centos

 *Error: Package: 1:python-rbd-0.80.7-0.5.el7.x86_64 (epel-testing)*

Requires: librbd1 = 1:0.80.7

Removing: librbd1-0.80.9-0.el7.centos.x86_64 (@Ceph)

librbd1 = 0.80.9-0.el7.centos

Updated By: 1:librbd1-0.87.1-0.el7.centos.x86_64 (Ceph)

librbd1 = 1:0.87.1-0.el7.centos

Available: 1:librbd1-0.86-0.el7.centos.x86_64 (Ceph)

librbd1 = 1:0.86-0.el7.centos

Available: 1:librbd1-0.87-0.el7.centos.x86_64 (Ceph)

librbd1 = 1:0.87-0.el7.centos

 *Error: Package: 1:python-rados-0.80.7-0.5.el7.x86_64 (epel-testing)*

Requires: librados2 = 1:0.80.7

Removing: librados2-0.80.9-0.el7.centos.x86_64 (@Ceph)

librados2 = 0.80.9-0.el7.centos

Updated By: 1:librados2-0.87.1-0.el7.centos.x86_64 (Ceph)

librados2 = 1:0.87.1-0.el7.centos

Available: 1:librados2-0.86-0.el7.centos.x86_64 (Ceph)

 

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
What replication factor do you have?

2015-03-03 15:14 GMT+03:00 Andrija Panic andrija.pa...@gmail.com:

 Hi Irek,

 yes, stopping the OSD (or setting it OUT) resulted in only 3% of data
 degraded and moved/recovered.
 When I after that removed it from the CRUSH map (ceph osd crush rm id),
 that's when the stuff with 37% happened.

 And thanks Irek for the help - could you kindly just let me know the
 preferred steps when removing a whole node?
 Do you mean I first stop all OSDs again, or just remove each OSD from the
 crush map, or perhaps just decompile the crush map, delete the node
 completely, compile it back in, and let it heal/recover?

 Do you think this would result in fewer objects misplaced and less data moved around?

 Sorry for bugging you, I really appreciate your help.

 Thanks

 On 3 March 2015 at 12:58, Irek Fasikhov malm...@gmail.com wrote:

 A large percentage of the rebuild of the cluster map (But low percentage
 degradation). If you had not made ceph osd crush rm id, the percentage
 would be low.
 In your case, the correct option is to remove the entire node, rather
 than each disk individually

 2015-03-03 14:27 GMT+03:00 Andrija Panic andrija.pa...@gmail.com:

 Another question - I mentioned here 37% of objects being moved around -
 these are MISPLACED objects (degraded objects were 0.001%), after I removed 1
 OSD from the CRUSH map (out of 44 OSDs or so).

 Can anybody confirm this is normal behaviour - and are there any
 workarounds?

 I understand this is because of the object placement algorithm of CEPH,
 but still, 37% of objects misplaced just by removing 1 OSD out of 44 from the
 crush map makes me wonder why the percentage is so large?

 Seems not good to me, and I have to remove another 7 OSDs (we are
 demoting some old hardware nodes). This means I could potentially end up with 7 x
 the same number of misplaced objects...?

 Any thoughts ?

 Thanks

 On 3 March 2015 at 12:14, Andrija Panic andrija.pa...@gmail.com wrote:

 Thanks Irek.

 Does this mean that after peering for each PG there will be a delay of
 10 sec, meaning that every once in a while I will have 10 sec of the cluster
 NOT being stressed/overloaded, then the recovery takes place for that
 PG, then the cluster is fine for another 10 sec, and then stressed again?

 I'm trying to understand process before actually doing stuff (config
 reference is there on ceph.com but I don't fully understand the
 process)

 Thanks,
 Andrija

 On 3 March 2015 at 11:32, Irek Fasikhov malm...@gmail.com wrote:

 Hi.

 Use value osd_recovery_delay_start
 example:
 [root@ceph08 ceph]# ceph --admin-daemon
 /var/run/ceph/ceph-osd.94.asok config show  | grep 
 osd_recovery_delay_start
   osd_recovery_delay_start: 10

 2015-03-03 13:13 GMT+03:00 Andrija Panic andrija.pa...@gmail.com:

 HI Guys,

 I yesterday removed 1 OSD from cluster (out of 42 OSDs), and it
 caused over 37% od the data to rebalance - let's say this is fine (this 
 is
 when I removed it frm Crush Map).

 I'm wondering - I have previously set some throtling mechanism, but
 during first 1h of rebalancing, my rate of recovery was going up to 1500
 MB/s - and VMs were unusable completely, and then last 4h of the duration
 of recover this recovery rate went down to, say, 100-200 MB.s and during
 this VM performance was still pretty impacted, but at least I could work
 more or a less

 So my question, is this behaviour expected, is throtling here working
 as expected, since first 1h was almoust no throtling applied if I check 
 the
 recovery rate 1500MB/s and the impact on Vms.
 And last 4h seemed pretty fine (although still lot of impact in
 general)

 I changed these throtling on the fly with:

 ceph tell osd.* injectargs '--osd_recovery_max_active 1'
 ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
 ceph tell osd.* injectargs '--osd_max_backfills 1'

 My Jorunals are on SSDs (12 OSD per server, of which 6 journals on
 one SSD, 6 journals on another SSD)  - I have 3 of these hosts.

 Any thought are welcome.
 --

 Andrija Panić

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757




 --

 Andrija Panić




 --

 Andrija Panić




 --
 С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757




 --

 Andrija Panić




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
Since you have only three nodes in the cluster,
I recommend you add the new nodes to the cluster first, and then remove the old ones.

2015-03-03 15:28 GMT+03:00 Irek Fasikhov malm...@gmail.com:

 You have a number of replication?

 2015-03-03 15:14 GMT+03:00 Andrija Panic andrija.pa...@gmail.com:

 Hi Irek,

 yes, stoping OSD (or seting it to OUT) resulted in only 3% of data
 degraded and moved/recovered.
 When I after that removed it from Crush map ceph osd crush rm id,
 that's when the stuff with 37% happened.

 And thanks Irek for help - could you kindly just let me know of the
 prefered steps when removing whole node?
 Do you mean I first stop all OSDs again, or just remove each OSD from
 crush map, or perhaps, just decompile cursh map, delete the node
 completely, compile back in, and let it heal/recover ?

 Do you think this would result in less data missplaces and moved arround ?

 Sorry for bugging you, I really appreaciate your help.

 Thanks

 On 3 March 2015 at 12:58, Irek Fasikhov malm...@gmail.com wrote:

 A large percentage of the rebuild of the cluster map (But low percentage
 degradation). If you had not made ceph osd crush rm id, the percentage
 would be low.
 In your case, the correct option is to remove the entire node, rather
 than each disk individually

 2015-03-03 14:27 GMT+03:00 Andrija Panic andrija.pa...@gmail.com:

 Another question - I mentioned here 37% of objects being moved arround
 - this is MISPLACED object (degraded objects were 0.001%, after I removed 1
 OSD from cursh map (out of 44 OSD or so).

 Can anybody confirm this is normal behaviour - and are there any
 workarrounds ?

 I understand this is because of the object placement algorithm of CEPH,
 but still 37% of object missplaces just by removing 1 OSD from crush maps
 out of 44 make me wonder why this large percentage ?

 Seems not good to me, and I have to remove another 7 OSDs (we are
 demoting some old hardware nodes). This means I can potentialy go with 7 x
 the same number of missplaced objects...?

 Any thoughts ?

 Thanks

 On 3 March 2015 at 12:14, Andrija Panic andrija.pa...@gmail.com
 wrote:

 Thanks Irek.

 Does this mean, that after peering for each PG, there will be delay of
 10sec, meaning that every once in a while, I will have 10sec od the 
 cluster
 NOT being stressed/overloaded, and then the recovery takes place for that
 PG, and then another 10sec cluster is fine, and then stressed again ?

 I'm trying to understand process before actually doing stuff (config
 reference is there on ceph.com but I don't fully understand the
 process)

 Thanks,
 Andrija

 On 3 March 2015 at 11:32, Irek Fasikhov malm...@gmail.com wrote:

 Hi.

 Use value osd_recovery_delay_start
 example:
 [root@ceph08 ceph]# ceph --admin-daemon
 /var/run/ceph/ceph-osd.94.asok config show  | grep 
 osd_recovery_delay_start
   osd_recovery_delay_start: 10

 2015-03-03 13:13 GMT+03:00 Andrija Panic andrija.pa...@gmail.com:

 HI Guys,

 I yesterday removed 1 OSD from cluster (out of 42 OSDs), and it
 caused over 37% od the data to rebalance - let's say this is fine (this 
 is
 when I removed it frm Crush Map).

 I'm wondering - I have previously set some throtling mechanism, but
 during first 1h of rebalancing, my rate of recovery was going up to 1500
 MB/s - and VMs were unusable completely, and then last 4h of the 
 duration
 of recover this recovery rate went down to, say, 100-200 MB.s and during
 this VM performance was still pretty impacted, but at least I could work
 more or a less

 So my question, is this behaviour expected, is throtling here
 working as expected, since first 1h was almoust no throtling applied if 
 I
 check the recovery rate 1500MB/s and the impact on Vms.
 And last 4h seemed pretty fine (although still lot of impact in
 general)

 I changed these throtling on the fly with:

 ceph tell osd.* injectargs '--osd_recovery_max_active 1'
 ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
 ceph tell osd.* injectargs '--osd_max_backfills 1'

 My Jorunals are on SSDs (12 OSD per server, of which 6 journals on
 one SSD, 6 journals on another SSD)  - I have 3 of these hosts.

 Any thought are welcome.
 --

 Andrija Panić

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757




 --

 Andrija Panić




 --

 Andrija Panić




 --
 С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757




 --

 Andrija Panić




 --
 С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
Hi.

Use value osd_recovery_delay_start
example:
[root@ceph08 ceph]# ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok
config show  | grep osd_recovery_delay_start
  osd_recovery_delay_start: 10

2015-03-03 13:13 GMT+03:00 Andrija Panic andrija.pa...@gmail.com:

 Hi Guys,

 Yesterday I removed 1 OSD from the cluster (out of 42 OSDs), and it caused
 over 37% of the data to rebalance - let's say this is fine (this happened when
 I removed it from the Crush Map).

 I'm wondering - I had previously set some throttling mechanism, but during
 the first 1h of rebalancing my recovery rate was going up to 1500 MB/s -
 and the VMs were completely unusable - and then for the last 4h of the
 recovery the rate went down to, say, 100-200 MB/s, and during this time
 VM performance was still pretty impacted, but at least I could work more or
 less.

 So my question: is this behaviour expected, and is throttling here working as
 expected, since in the first 1h almost no throttling seemed to be applied,
 judging by the 1500 MB/s recovery rate and the impact on the VMs,
 while the last 4h seemed pretty fine (although still a lot of impact in general)?

 I changed these throttling settings on the fly with:

 ceph tell osd.* injectargs '--osd_recovery_max_active 1'
 ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
 ceph tell osd.* injectargs '--osd_max_backfills 1'

 My journals are on SSDs (12 OSDs per server, of which 6 journals are on one
 SSD and 6 journals on another SSD) - I have 3 of these hosts.

 Any thoughts are welcome.
 --

 Andrija Panić

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
A large percentage of the cluster map gets rebuilt, so many objects are
remapped (but the percentage of degraded objects stays low). If you had not run
ceph osd crush rm <id>, the percentage would have been low.
In your case, the better option is to remove the entire node at once, rather
than removing each disk individually.
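
As a rough sketch only (not a command from this thread - the host name and OSD
ids are made up): removing everything for a host in one batch means the CRUSH
map settles once instead of triggering a fresh rebalance per disk, e.g.

  ceph osd set nobackfill            # hold off data movement while editing
  for id in 36 37 38 39 40 41; do
      ceph osd out $id
      ceph osd crush remove osd.$id
  done
  ceph osd crush remove oldnode1     # drop the now-empty host bucket
  ceph osd unset nobackfill          # let recovery start against the final map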

2015-03-03 14:27 GMT+03:00 Andrija Panic andrija.pa...@gmail.com:

 Another question - I mentioned here 37% of objects being moved around -
 these are MISPLACED objects (degraded objects were 0.001%), after I removed 1
 OSD from the crush map (out of 44 OSDs or so).

 Can anybody confirm this is normal behaviour - and are there any
 workarounds ?

 I understand this is because of CEPH's object placement algorithm,
 but still, 37% of objects misplaced just by removing 1 OSD out of 44 from the
 crush map makes me wonder why the percentage is so large ?

 Seems not good to me, and I have to remove another 7 OSDs (we are demoting
 some old hardware nodes). This means I could potentially end up with 7 x the
 same number of misplaced objects...?

 Any thoughts ?

 Thanks

 On 3 March 2015 at 12:14, Andrija Panic andrija.pa...@gmail.com wrote:

 Thanks Irek.

 Does this mean that after peering for each PG there will be a delay of
 10 sec, meaning that every once in a while I will have 10 sec of the cluster
 NOT being stressed/overloaded, then the recovery takes place for that
 PG, then for another 10 sec the cluster is fine, and then it is stressed again ?

 I'm trying to understand the process before actually doing stuff (the config
 reference is there on ceph.com but I don't fully understand the process)

 Thanks,
 Andrija

 On 3 March 2015 at 11:32, Irek Fasikhov malm...@gmail.com wrote:

 Hi.

 Use value osd_recovery_delay_start
 example:
 [root@ceph08 ceph]# ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok
 config show  | grep osd_recovery_delay_start
   osd_recovery_delay_start: 10

 2015-03-03 13:13 GMT+03:00 Andrija Panic andrija.pa...@gmail.com:

 Hi Guys,

 Yesterday I removed 1 OSD from the cluster (out of 42 OSDs), and it caused
 over 37% of the data to rebalance - let's say this is fine (this happened when
 I removed it from the Crush Map).

 I'm wondering - I had previously set some throttling mechanism, but during
 the first 1h of rebalancing my recovery rate was going up to 1500 MB/s -
 and the VMs were completely unusable - and then for the last 4h of the
 recovery the rate went down to, say, 100-200 MB/s, and during this time
 VM performance was still pretty impacted, but at least I could work more or
 less.

 So my question: is this behaviour expected, and is throttling here working as
 expected, since in the first 1h almost no throttling seemed to be applied,
 judging by the 1500 MB/s recovery rate and the impact on the VMs,
 while the last 4h seemed pretty fine (although still a lot of impact in general)?

 I changed these throttling settings on the fly with:

 ceph tell osd.* injectargs '--osd_recovery_max_active 1'
 ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
 ceph tell osd.* injectargs '--osd_max_backfills 1'

 My journals are on SSDs (12 OSDs per server, of which 6 journals are on one
 SSD and 6 journals on another SSD) - I have 3 of these hosts.

 Any thoughts are welcome.
 --

 Andrija Panić

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757




 --

 Andrija Panić




 --

 Andrija Panić




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Irek Fasikhov
osd_recovery_delay_start is the delay in seconds between recovery iterations
(see osd_recovery_max_active).

It is described here:
https://github.com/ceph/ceph/search?utf8=%E2%9C%93q=osd_recovery_delay_start
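
(Purely as an illustration, reusing the injectargs pattern already shown in
this thread - the value 10 is just the example quoted above:

  ceph tell osd.* injectargs '--osd_recovery_delay_start 10'

This changes the setting on running OSDs without a restart.)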


2015-03-03 14:27 GMT+03:00 Andrija Panic andrija.pa...@gmail.com:

 Another question - I mentioned here 37% of objects being moved around -
 these are MISPLACED objects (degraded objects were 0.001%), after I removed 1
 OSD from the crush map (out of 44 OSDs or so).

 Can anybody confirm this is normal behaviour - and are there any
 workarounds ?

 I understand this is because of CEPH's object placement algorithm,
 but still, 37% of objects misplaced just by removing 1 OSD out of 44 from the
 crush map makes me wonder why the percentage is so large ?

 Seems not good to me, and I have to remove another 7 OSDs (we are demoting
 some old hardware nodes). This means I could potentially end up with 7 x the
 same number of misplaced objects...?

 Any thoughts ?

 Thanks

 On 3 March 2015 at 12:14, Andrija Panic andrija.pa...@gmail.com wrote:

 Thanks Irek.

 Does this mean that after peering for each PG there will be a delay of
 10 sec, meaning that every once in a while I will have 10 sec of the cluster
 NOT being stressed/overloaded, then the recovery takes place for that
 PG, then for another 10 sec the cluster is fine, and then it is stressed again ?

 I'm trying to understand the process before actually doing stuff (the config
 reference is there on ceph.com but I don't fully understand the process)

 Thanks,
 Andrija

 On 3 March 2015 at 11:32, Irek Fasikhov malm...@gmail.com wrote:

 Hi.

 Use value osd_recovery_delay_start
 example:
 [root@ceph08 ceph]# ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok
 config show  | grep osd_recovery_delay_start
   osd_recovery_delay_start: 10

 2015-03-03 13:13 GMT+03:00 Andrija Panic andrija.pa...@gmail.com:

 Hi Guys,

 Yesterday I removed 1 OSD from the cluster (out of 42 OSDs), and it caused
 over 37% of the data to rebalance - let's say this is fine (this happened when
 I removed it from the Crush Map).

 I'm wondering - I had previously set some throttling mechanism, but during
 the first 1h of rebalancing my recovery rate was going up to 1500 MB/s -
 and the VMs were completely unusable - and then for the last 4h of the
 recovery the rate went down to, say, 100-200 MB/s, and during this time
 VM performance was still pretty impacted, but at least I could work more or
 less.

 So my question: is this behaviour expected, and is throttling here working as
 expected, since in the first 1h almost no throttling seemed to be applied,
 judging by the 1500 MB/s recovery rate and the impact on the VMs,
 while the last 4h seemed pretty fine (although still a lot of impact in general)?

 I changed these throttling settings on the fly with:

 ceph tell osd.* injectargs '--osd_recovery_max_active 1'
 ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
 ceph tell osd.* injectargs '--osd_max_backfills 1'

 My journals are on SSDs (12 OSDs per server, of which 6 journals are on one
 SSD and 6 journals on another SSD) - I have 3 of these hosts.

 Any thoughts are welcome.
 --

 Andrija Panić

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757




 --

 Andrija Panić




 --

 Andrija Panić




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] who is using radosgw with civetweb?

2015-02-26 Thread Irek Fasikhov
I fully support Wido. We also have no problems.

OS: CentOS7
[root@s3backup etc]# ceph -v
ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)


2015-02-26 13:22 GMT+03:00 Dan van der Ster d...@vanderster.com:

 Hi Sage,

 We switched from apache+fastcgi to civetweb (+haproxy) around one
 month ago and so far it is working quite well. Just like GuangYang, we
 had seen many error 500's with fastcgi, but we never investigated it
 deeply. After moving to civetweb we don't get any errors at all no
 matter what load we send to the gateways.

 Here are some details:
   - the whole cluster, radosgw included, is firefly 0.80.8 and
 Scientific Linux 6.6
   - we have 6 gateways, each running on a 2-core VM
   - civetweb is listening on 8080
   - haproxy is listening on _each_ gateway VM on 80 and 443 and
 proxying to the radosgw's
   - so far we've written ~20 million objects (mostly very small)
 through civetweb.

 Our feedback is that the civetweb configuration is _much_ easier, much
 cleaner, and more reliable than what we had with apache+fastcgi.
 Before, we needed the non-standard apache (with 100-continue support)
 and the fastcgi config was always error-prone.

 The main goals we had for adding haproxy were for load balancing and
 to add SSL. Currently haproxy is configured to balance the http
 sessions evenly over all of our gateways -- one civetweb feature which
 would be nice to have would be a /health report (which returns e.g.
 some load metric for that gateway) that we could feed into haproxy
 so it would be able to better balance the load.
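
 [A minimal sketch only, not Dan's actual configuration - host names, backend
 name and the check request are assumptions:

   frontend rgw_http
       bind *:80
       default_backend rgw_gateways

   backend rgw_gateways
       balance roundrobin
       option httpchk GET /
       server rgw1 gw1.example.com:8080 check
       server rgw2 gw2.example.com:8080 check

 "option httpchk" only gives an up/down health check; load-aware balancing
 would indeed need something like the /health report described above.]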

 In conclusion, +1 from us... AFAWCT civetweb is the way to go for Red
 Hat's future supported configuration.

 Best Regards, Dan (+Herve who did the work!)




 On Wed, Feb 25, 2015 at 8:31 PM, Sage Weil sw...@redhat.com wrote:
  Hey,
 
  We are considering switching to civetweb (the embedded/standalone rgw web
  server) as the primary supported RGW frontend instead of the current
  apache + mod-fastcgi or mod-proxy-fcgi approach.  Supported here means
  both the primary platform the upstream development focuses on and what
 the
  downstream Red Hat product will officially support.
 
  How many people are using RGW standalone using the embedded civetweb
  server instead of apache?  In production?  At what scale?  What
  version(s) (civetweb first appeared in firefly and we've backported most
  fixes).
 
  Have you seen any problems?  Any other feedback?  The hope is to (vastly)
  simplify deployment.
 
  Thanks!
  sage
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Dumpling/Firefly/Hammer SSD/Memstore performance comparison

2015-02-17 Thread Irek Fasikhov
Mark, very very good!

2015-02-17 20:37 GMT+03:00 Mark Nelson mnel...@redhat.com:

 Hi All,

 I wrote up a short document describing some tests I ran recently to look
 at how SSD backed OSD performance has changed across our LTS releases. This
 is just looking at RADOS performance and not RBD or RGW.  It also doesn't
 offer any real explanations regarding the results.  It's just a first high
 level step toward understanding some of the behaviors folks on the mailing
 list have reported over the last couple of releases.  I hope you find it
 useful.

 Mark

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph

2015-02-13 Thread Irek Fasikhov
Karan

Will the book also be available in Russian?

Thanks.

2015-02-13 11:43 GMT+03:00 Karan Singh karan.si...@csc.fi:

 Here is the new link for sample book :
 https://www.dropbox.com/s/2zcxawtv4q29fm9/Learning_Ceph_Sample.pdf?dl=0


 
 Karan Singh
 Systems Specialist , Storage Platforms
 CSC - IT Center for Science,
 Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
 mobile: +358 503 812758
 tel. +358 9 4572001
 fax +358 9 4572302
 http://www.csc.fi/
 

 On 13 Feb 2015, at 05:25, Frank Yu flyxia...@gmail.com wrote:

 Wow, congrats!
 BTW, I found that the link to the sample copy is a 404.



 2015-02-06 6:53 GMT+08:00 Karan Singh karan.si...@csc.fi:

 Hello Community Members

 I am happy to introduce the first book on Ceph with the title “*Learning
 Ceph*”.

 Many folks from the publishing house and I, together with technical
 reviewers, spent several months getting this book compiled and published.

 Finally the book is up for sale on , I hope you will like it and surely
 will learn a lot from it.

 Amazon :
 http://www.amazon.com/Learning-Ceph-Karan-Singh/dp/1783985623/ref=sr_1_1?s=booksie=UTF8qid=1423174441sr=1-1keywords=ceph
 Packtpub : https://www.packtpub.com/application-development/learning-ceph

 You can grab the sample copy from here :
 https://www.dropbox.com/s/ek76r01r9prs6pb/Learning_Ceph_Packt.pdf?dl=0

 *Finally , I would like to express my sincere thanks to *

 *Sage Weil* - For developing Ceph and everything around it as well as
 writing foreword for “Learning Ceph”.
 *Patrick McGarry *- For his usual off the track support that too always.

 Last but not the least , to our great community members , who are also
 reviewers of the book *Don Talton , Julien Recurt , Sebastien Han *and 
 *Zihong
 Chen *, Thank you guys for your efforts.


 
 Karan Singh
 Systems Specialist , Storage Platforms
 CSC - IT Center for Science,
 Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
 mobile: +358 503 812758
 tel. +358 9 4572001
 fax +358 9 4572302
 http://www.csc.fi/
 


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 Regards
 Frank Yu



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph Performance with SSD journal

2015-02-13 Thread Irek Fasikhov
Hi.
What version?

2015-02-13 6:04 GMT+03:00 Sumit Gaur sumitkg...@gmail.com:

 Hi Chris,
 Please find my answers below in blue

 On Thu, Feb 12, 2015 at 12:42 PM, Chris Hoy Poy ch...@gopc.net wrote:

 Hi Sumit,

 A couple questions:

 What brand/model SSD?

 samsung 480G SSD(PM853T) having random write 90K IOPS (4K, 368MBps)


 What brand/model HDD?

 64GB memory, 300GB SAS HDD (seagate), 10Gb nic


 Also how they are connected to controller/motherboard? Are they sharing a
 bus (ie SATA expander)?

 no , They are connected with local Bus not the SATA expander.



 RAM?

 *64GB *


 Also look at the output of  iostat -x or similiar, are the SSDs hitting
 100% utilisation?

 *No, SSD was hitting 2000 iops only.  *


 I suspect that the 5:1 ratio of HDDs to SDDs is not ideal, you now have
 5x the write IO trying to fit into a single SSD.

 * I have not seen any documented reference for calculating the ratio. Could
 you suggest one? Here I want to mention that the results for 1024K writes
 improve a lot. The problem is with 1024K reads and 4K writes.*

 *SSD journal 810 IOPS and 810MBps*
 *HDD journal 620 IOPS and 620 MBps*




 I'll take a punt on it being a SATA connected SSD (most common), 5x ~130
 megabytes/second gets very close to most SATA bus limits. If its a shared
 BUS, you possibly hit that limit even earlier (since all that data is now
 being written twice out over the bus).
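
 [Spelling the arithmetic out, with the rough assumptions of ~130 MB/s
 sustained per HDD and ~500-550 MB/s usable on a SATA 6 Gb/s link:

   5 HDDs x 130 MB/s = ~650 MB/s of journal writes aimed at one SSD

 which already exceeds a single SATA link, before counting the second copy of
 the data written out to the data disks.]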

 cheers;
 \Chris


 --
 *From: *Sumit Gaur sumitkg...@gmail.com
 *To: *ceph-users@lists.ceph.com
 *Sent: *Thursday, 12 February, 2015 9:23:35 AM
 *Subject: *[ceph-users] ceph Performance with SSD journal


 Hi Ceph-Experts,

 I have a small ceph architecture related question.

 Blogs and documents suggest that ceph performs much better if we put the
 journal on SSD.

 I have built the ceph cluster with 30 HDDs + 6 SSDs across 6 OSD nodes: 5 HDDs
 + 1 SSD on each node, and each SSD has 5 partitions journaling the 5 OSDs
 on that node.

 Now I ran the same tests as I ran for the all-HDD setup.

 What I saw is that the two readings below go in the wrong direction from what
 I expected:

 1) 4K write IOPS are lower for the SSD setup - not a major difference, but
 lower.
 2) 1024K read IOPS are lower for the SSD setup than for the HDD setup.

 On the other hand 4K read and 1024K write both have much better numbers
 for SSD setup.

 Let me know if I am missing some obvious concept.

 Thanks
 sumit

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] re: Upgrade 0.80.5 to 0.80.8 --the VM's read request become too slow

2015-02-12 Thread Irek Fasikhov
Hi.
hmm ... I had been wondering why I have such low read speed on another cluster


P.S. ceph 0.80.8

2015-02-12 14:33 GMT+03:00 Alexandre DERUMIER aderum...@odiso.com:

 Hi,
 Can you test with rbd_cache disabled ?

 I remember a bug detected in giant, not sure whether it's also the case for
 firefly

 This was this tracker:

 http://tracker.ceph.com/issues/9513

 But It has been solved and backported to firefly.

 Also, can you test 0.80.6 and 0.80.7 ?







 - Original Mail -
 From: killingwolf killingw...@qq.com
 To: ceph-users ceph-users@lists.ceph.com
 Sent: Thursday, 12 February 2015 12:16:32
 Subject: [ceph-users] re: Upgrade 0.80.5 to 0.80.8 --the VM's read request
 become too slow

 I have this problems too , Help!

 -- Original Message --
 From: 杨万元 yangwanyuan8...@gmail.com
 Sent: Thursday, 12 February 2015, 11:14 AM
 To: ceph-users@lists.ceph.com
 Subject: [ceph-users] Upgrade 0.80.5 to 0.80.8 --the VM's read request become
 too slow

 Hello!
 We use Ceph+Openstack in our private cloud. Recently we upgraded our
 CentOS 6.5 based cluster from Ceph Emperor to Ceph Firefly.
 At first we used the Red Hat EPEL yum repo to upgrade; that Ceph version is
 0.80.5. We upgraded the monitors first, then the OSDs, and the clients last.
 When we completed this upgrade, we booted a VM on the cluster and used fio to
 test the IO performance. The IO performance was as good as before. Everything
 was ok!
 Then we upgraded the cluster from 0.80.5 to 0.80.8. When that completed, we
 rebooted the VM to load the newest librbd. After that we again used fio to
 test the IO performance, and we found that randwrite and write are as good as
 before, but randread and read have become worse: randread's iops dropped from
 4000-5000 to 300-400, and the latency is worse; the read bandwidth dropped
 from ~400MB/s to 115MB/s. When I downgraded the ceph client version from
 0.80.8 to 0.80.5, the results went back to normal.
 So I think it may be something in librbd. I compared the 0.80.8
 release notes with 0.80.5 (
 http://ceph.com/docs/master/release-notes/#v0-80-8-firefly ), and the only
 read-related change I found in 0.80.8 is: librbd: cap memory
 utilization for read requests (Jason Dillaman). Who can explain this?


 My ceph cluster is 400osd,5mons :
 ceph -s
 health HEALTH_OK
 monmap e11: 5 mons at {BJ-M1-Cloud71=
 172.28.2.71:6789/0,BJ-M1-Cloud73=172.28.2.73:6789/0,BJ-M2-Cloud80=172.28.2.80:6789/0,BJ-M2-Cloud81=172.28.2.81:6789/0,BJ-M3-Cloud85=172.28.2.85:6789/0
 }, election epoch 198, quorum 0,1,2,3,4
 BJ-M1-Cloud71,BJ-M1-Cloud73,BJ-M2-Cloud80,BJ-M2-Cloud81,BJ-M3-Cloud85
 osdmap e120157: 400 osds: 400 up, 400 in
 pgmap v26161895: 29288 pgs, 6 pools, 20862 GB data, 3014 kobjects
 41084 GB used, 323 TB / 363 TB avail
 29288 active+clean
 client io 52640 kB/s rd, 32419 kB/s wr, 5193 op/s


 The follwing is my ceph client conf :
 [global]
 auth_service_required = cephx
 filestore_xattr_use_omap = true
 auth_client_required = cephx
 auth_cluster_required = cephx
 mon_host =
 172.29.204.24,172.29.204.48,172.29.204.55,172.29.204.58,172.29.204.73
 mon_initial_members = ZR-F5-Cloud24, ZR-F6-Cloud48, ZR-F7-Cloud55,
 ZR-F8-Cloud58, ZR-F9-Cloud73
 fsid = c01c8e28-304e-47a4-b876-cb93acc2e980
 mon osd full ratio = .85
 mon osd nearfull ratio = .75
 public network = 172.29.204.0/24
 mon warn on legacy crush tunables = false

 [osd]
 osd op threads = 12
 filestore journal writeahead = true
 filestore merge threshold = 40
 filestore split multiple = 8

 [client]
 rbd cache = true
 rbd cache writethrough until flush = false
 rbd cache size = 67108864
 rbd cache max dirty = 50331648
 rbd cache target dirty = 33554432

 [client.cinder]
 admin socket = /var/run/ceph/rbd-$pid.asok



 My VM has 8 cores and 16G RAM; the fio scripts we use are:
 fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randread -size=60G
 -filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200
 fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=60G
 -filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200
 fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=read -size=60G
 -filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200
 fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=write -size=60G
 -filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200

 The following is the io test result
 ceph client verison :0.80.5
 read: bw= 430MB
 write: bw=420MB
 randread: iops= 4875 latency=65ms
 randwrite: iops=6844 latency=46ms

 ceph client verison :0.80.8
 read: bw= 115MB
 write: bw=480MB
 randread: iops= 381 latency=83ms
 randwrite: iops=4843 latency=68ms

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list

[ceph-users] 0.80.8 ReplicationPG Fail

2015-02-06 Thread Irek Fasikhov
This morning I found that some OSDs had dropped out of the cache tier pool.
Maybe it's a coincidence, but a rollback was in progress at that point.

2015-02-05 23:23:18.231723 7fd747ff1700 -1 *** Caught signal
(Segmentation fault) **
 in thread 7fd747ff1700

 ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
 1: /usr/bin/ceph-osd() [0x9bde51]
 2: (()+0xf710) [0x7fd766f97710]
 3: (std::_Rb_tree_decrement(std::_Rb_tree_node_base*)+0xa) [0x7fd7666c1eca]
 4: (ReplicatedPG::make_writeable(ReplicatedPG::OpContext*)+0x14c) [0x87cd5c]
 5: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x1db)
[0x89d29b]
 6: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0xcd4) [0x89e0f4]
 7: (ReplicatedPG::do_op(std::tr1::shared_ptrOpRequest)+0x2ca5) [0x8a2a55]
 8: (ReplicatedPG::do_request(std::tr1::shared_ptrOpRequest,
ThreadPool::TPHandle)+0x5b1) [0x832251]
 9: (OSD::dequeue_op(boost::intrusive_ptrPG,
std::tr1::shared_ptrOpRequest, ThreadPool::TPHandle)+0x37c)
[0x61344c]
 10: (OSD::OpWQ::_process(boost::intrusive_ptrPG,
ThreadPool::TPHandle)+0x63d) [0x6472ad]
 11: (ThreadPool::WorkQueueValstd::pairboost::intrusive_ptrPG,
std::tr1::shared_ptrOpRequest , boost::intrusive_ptrPG
::_void_process(void*, ThreadPool::TPHandle)+0xae) [0x67dcde]
 12: (ThreadPool::worker(ThreadPool::WorkThread*)+0x551) [0xa2a181]
 13: (ThreadPool::WorkThread::entry()+0x10) [0xa2d260]
 14: (()+0x79d1) [0x7fd766f8f9d1]
 15: (clone()+0x6d) [0x7fd765f088fd]
 NOTE: a copy of the executable, or `objdump -rdS executable` is
needed to interpret this.

Are there any ideas? Thanks.

http://tracker.ceph.com/issues/10778
-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD over cache tier over EC pool: rbd rm doesn't remove objects

2015-01-28 Thread Irek Fasikhov
Hi,Sage.

Yes, Firefly.
[root@ceph05 ~]# ceph --version
ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)

Yes, I have seen this behavior.

[root@ceph08 ceph]# rbd info vm-160-disk-1
rbd image 'vm-160-disk-1':
size 32768 MB in 8192 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.179faf52eb141f2
format: 2
features: layering
parent: rbd/base-145-disk-1@__base__
overlap: 32768 MB
[root@ceph08 ceph]# rbd rm vm-160-disk-1
Removing image: 100% complete...done.
[root@ceph08 ceph]# rbd info vm-160-disk-1
2015-01-28 10:39:01.595785 7f1fbea9e760 -1 librbd::ImageCtx: error finding
header: (2) No such file or directoryrbd: error opening image
vm-160-disk-1: (2) No such file or directory

[root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
    5944    5944  249633
[root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
    5857    5857  245979
[root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
    4377    4377  183819
[root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
    5017    5017  210699
[root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
    5015    5015  210615
[root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
[root@ceph08 ceph]# rados -p rcachehe ls | grep 179faf52eb141f2 | wc
    1986    1986   83412
[root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
981 981   41202
[root@ceph08 ceph]# rados -p rbd ls | grep 179faf52eb141f2 | wc
802 802   33684
[root@ceph08 ceph]# rados -p rbdcache ls | grep 179faf52eb141f2 | wc
    1611    1611   67662

Thank, Sage!


Tue Jan 27 2015 at 7:01:43 PM, Sage Weil s...@newdream.net:

 On Tue, 27 Jan 2015, Irek Fasikhov wrote:
  Hi, All.
  Indeed, there is a problem. I removed 1 TB of data but the space on the
  cluster is not reclaimed. Is this expected behavior or a bug? And how long
  will it take to be cleaned up?

 Your subject says cache tier but I don't see it in the 'ceph df' output
 below.  The cache tiers will store 'whiteout' objects that cache object
 non-existence that could be delaying some deletion.  You can wrangle the
 cluster into flushing those with

  ceph osd pool set cachepool cache_target_dirty_ratio .05

 (though you'll probably want to change it back to the default .4 later).
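
 [Presumably the same command with the original value puts it back afterwards,
 keeping the example pool name used above:

   ceph osd pool set cachepool cache_target_dirty_ratio .4 ]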

 If there's no cache tier involved, there may be another problem.  What
 version is this?  Firefly?

 sage

 
  Sat Sep 20 2014 at 8:19:24 AM, Mika?l Cluseau mclus...@isi.nc:
Hi all,
 
I have weird behaviour on my firefly test + convenience
storage cluster. It consists of 2 nodes with a light imbalance
in available space:
 
# id    weight  type name       up/down reweight
-1      14.58   root default
-2      8.19            host store-1
1       2.73                    osd.1   up      1
0       2.73                    osd.0   up      1
5       2.73                    osd.5   up      1
-3      6.39            host store-2
2       2.73                    osd.2   up      1
3       2.73                    osd.3   up      1
4       0.93                    osd.4   up      1
 
I used to store ~8TB of rbd volumes, coming to a near-full
state. There was some annoying stuck misplaced PGs so I began
to remove 4.5TB of data; the weird thing is: the space hasn't
been reclaimed on the OSDs, they keeped stuck around 84% usage.
I tried to move PGs around and it happens that the space is
correctly reclaimed if I take an OSD out, let him empty it XFS
volume and then take it in again.
 
I'm currently applying this to and OSD in turn, but I though it
could be worth telling about this. The current ceph df output
is:
 
GLOBAL:
SIZE   AVAIL RAW USED %RAW USED
12103G 5311G 6792G56.12
POOLS:
NAME ID USED   %USED OBJECTS
data 0  0  0 0
metadata 1  0  0 0
rbd  2  444G   3.67  117333
[...]
archives-ec  14 3628G  29.98 928902
archives 15 37518M 0.30  273167
 
Before just moving data, AVAIL was around 3TB.
 
I finished the process with the OSDs on store-1, who show the
following space usage now:
 
/dev/sdb1 2.8T  1.4T  1.4T  50% /var/lib/ceph/osd/ceph-0
/dev/sdc1 2.8T  1.3T  1.5T  46% /var/lib/ceph/osd/ceph-1
/dev/sdd1 2.8T  1.3T  1.5T  48% /var/lib/ceph/osd/ceph-5
 
I'm currently fixing OSD 2, 3 will be the last one to be fixed.
The df on store-2 shows the following:
 
/dev/sdb1   2.8T  1.9T  855G  70%
/var/lib/ceph/osd/ceph-2
/dev/sdc1   2.8T  2.4T  417G

Re: [ceph-users] RBD over cache tier over EC pool: rbd rm doesn't remove objects

2015-01-26 Thread Irek Fasikhov
Hi, All.

Indeed, there is a problem. I removed 1 TB of data but the space on the cluster
is not reclaimed. Is this expected behavior or a bug? And how long will it take
to be cleaned up?

Sat Sep 20 2014 at 8:19:24 AM, Mikaël Cluseau mclus...@isi.nc:

  Hi all,

 I have weird behaviour on my firefly test + convenience storage cluster.
 It consists of 2 nodes with a light imbalance in available space:

 # id    weight  type name       up/down reweight
 -1      14.58   root default
 -2      8.19            host store-1
 1       2.73                    osd.1   up      1
 0       2.73                    osd.0   up      1
 5       2.73                    osd.5   up      1
 -3      6.39            host store-2
 2       2.73                    osd.2   up      1
 3       2.73                    osd.3   up      1
 4       0.93                    osd.4   up      1

 I used to store ~8TB of rbd volumes, coming to a near-full state. There
 were some annoying stuck misplaced PGs so I began to remove 4.5TB of data;
 the weird thing is: the space hasn't been reclaimed on the OSDs, they
 stayed stuck around 84% usage. I tried to move PGs around and it happens
 that the space is correctly reclaimed if I take an OSD out, let it empty
 its XFS volume and then take it in again.

 I'm currently applying this to each OSD in turn, but I thought it could be
 worth telling about this. The current ceph df output is:

 GLOBAL:
 SIZE   AVAIL RAW USED %RAW USED
 12103G 5311G 6792G56.12
 POOLS:
 NAME ID USED   %USED OBJECTS
 data 0  0  0 0
 metadata 1  0  0 0
 rbd  2  444G   3.67  117333
 [...]
 archives-ec  14 3628G  29.98 928902
 archives 15 37518M 0.30  273167

 Before just moving data, AVAIL was around 3TB.

 I finished the process with the OSDs on store-1, who show the following
 space usage now:

 /dev/sdb1 2.8T  1.4T  1.4T  50% /var/lib/ceph/osd/ceph-0
 /dev/sdc1 2.8T  1.3T  1.5T  46% /var/lib/ceph/osd/ceph-1
 /dev/sdd1 2.8T  1.3T  1.5T  48% /var/lib/ceph/osd/ceph-5

 I'm currently fixing OSD 2, 3 will be the last one to be fixed. The df on
 store-2 shows the following:

 /dev/sdb1   2.8T  1.9T  855G  *70%* /var/lib/ceph/osd/ceph-2
 /dev/sdc1   2.8T  2.4T  417G  *86%* /var/lib/ceph/osd/ceph-3
 /dev/sdd1   932G  481G  451G  52% /var/lib/ceph/osd/ceph-4

 OSD 2 was at 84% 3h ago, and OSD 3 was ~75%.

 During rbd rm (that took a bit more that 3 days), ceph log was showing
 things like that:

 2014-09-03 16:17:38.831640 mon.0 192.168.1.71:6789/0 417194 : [INF] pgmap
 v14953987: 3196 pgs: 2882 active+clean, 314 active+remapped; 7647 GB data,
 11067 GB used, 3828 GB / 14896 GB avail; 0 B/s rd, 6778 kB/s wr, 18 op/s;
 -5/5757286 objects degraded (-0.000%)
 [...]
 2014-09-05 03:09:59.895507 mon.0 192.168.1.71:6789/0 513976 : [INF] pgmap
 v15050766: 3196 pgs: 2882 active+clean, 314 active+remapped; 6010 GB data,
 11156 GB used, 3740 GB / 14896 GB avail; 0 B/s rd, 0 B/s wr, 8 op/s;
 -388631/5247320 objects degraded (-7.406%)
 [...]
 2014-09-06 03:56:50.008109 mon.0 192.168.1.71:6789/0 580816 : [INF] pgmap
 v15117604: 3196 pgs: 2882 active+clean, 314 active+remapped; 4865 GB data,
 11207 GB used, 3689 GB / 14896 GB avail; 0 B/s rd, 6117 kB/s wr, 22 op/s;
 -706519/3699415 objects degraded (-19.098%)
 2014-09-06 03:56:44.476903 osd.0 192.168.1.71:6805/11793 729 : [WRN] 1
 slow requests, 1 included below; oldest blocked for > 30.058434 secs
 2014-09-06 03:56:44.476909 osd.0 192.168.1.71:6805/11793 730 : [WRN] slow
 request 30.058434 seconds old, received at 2014-09-06 03:56:14.418429:
 osd_op(client.19843278.0:46081 rb.0.c7fd7f.238e1f29.b3fa [delete]
 15.b8fb7551 ack+ondisk+write e38950) v4 currently waiting for blocked object
 2014-09-06 03:56:49.477785 osd.0 192.168.1.71:6805/11793 731 : [WRN] 2
 slow requests, 1 included below; oldest blocked for > 35.059315 secs
 [... stabilizes here:]
 2014-09-06 22:13:48.771531 mon.0 192.168.1.71:6789/0 632527 : [INF] pgmap
 v15169313: 3196 pgs: 2882 active+clean, 314 active+remapped; 4139 GB data,
 11215 GB used, 3681 GB / 14896 GB avail; 64 B/s rd, 64 B/s wr, 0 op/s;
 -883219/3420796 objects degraded (-25.819%)
 [...]
 2014-09-07 03:09:48.491325 mon.0 192.168.1.71:6789/0 633880 : [INF] pgmap
 v15170666: 3196 pgs: 2882 active+clean, 314 active+remapped; 4139 GB data,
 11215 GB used, 3681 GB / 14896 GB avail; 18727 B/s wr, 2 op/s;
 -883219/3420796 objects degraded (-25.819%)

 And now, during data movement I described before:

 2014-09-20 15:16:13.394694 mon.0 [INF] pgmap v15344707: 3196 pgs: 2132
 active+clean, 432 active+remapped+wait_backfill, 621 active+remapped, 11
 active+remapped+backfilling; 4139 GB data, 6831 GB used, 5271 GB / 12103 GB
 avail; 379097/3792969 objects degraded (9.995%)

 If some ceph developer wants me to do something or to provide some data,
 please say so quickly, I will probably 

Re: [ceph-users] Part 2: ssd osd fails often with FAILED assert(soid < scrubber.start || soid >= scrubber.end)

2015-01-26 Thread Irek Fasikhov
Hi All, Loic,
I have exactly the same error. Do I understand correctly that the fix will be
in 0.80.9? Thank you.



Sat Jan 17 2015 at 2:21:09 AM, Loic Dachary l...@dachary.org:



 On 14/01/2015 18:33, Udo Lembke wrote:
  Hi Loic,
  thanks for the answer. I hope it's not like in
  http://tracker.ceph.com/issues/8747 where the issue happens with an
  patched version if understand right.

 http://tracker.ceph.com/issues/8747 is a duplicate of
 http://tracker.ceph.com/issues/8011 indeed :-)
 
  So I must only wait few month ;-) for an backport...
 
  Udo
 
  Am 14.01.2015 09:40, schrieb Loic Dachary:
  Hi,
 
  This is http://tracker.ceph.com/issues/8011 which is being
  backported.
 
  Cheers
 
 

 --
 Loïc Dachary, Artisan Logiciel Libre

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG num calculator live on Ceph.com

2015-01-09 Thread Irek Fasikhov
Very very good :)

пт, 9 янв. 2015, 2:17, William Bloom (wibloom) wibl...@cisco.com:

  Awesome, thanks Michael.



 Regards

 William



 *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
 Of *Michael J. Kidd
 *Sent:* Wednesday, January 07, 2015 2:09 PM
 *To:* ceph-us...@ceph.com
 *Subject:* [ceph-users] PG num calculator live on Ceph.com



 Hello all,

   Just a quick heads up that we now have a PG calculator to help determine
 the proper PG per pool numbers to achieve a target PG per OSD ratio.

 http://ceph.com/pgcalc
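
 [Roughly the rule of thumb the calculator is built around - the ~100 PGs per
 OSD target is an assumption, adjust it to your own goal:

   total PGs ~= (number of OSDs x 100) / replica size, rounded up to a power of two

 e.g. 60 OSDs with size 3: 60 x 100 / 3 = 2000 -> 2048 PGs across all pools.]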

 Please check it out!  Happy to answer any questions, and always welcome
 any feedback on the tool / verbiage, etc...

 As an aside, we're also working to update the documentation to reflect the
 best practices.  See Ceph.com tracker for this at:
 http://tracker.ceph.com/issues/9867

 Thanks!

 Michael J. Kidd
 Sr. Storage Consultant
 Inktank Professional Services

  - by Red Hat
   ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] system metrics monitoring

2014-12-11 Thread Irek Fasikhov
Hi.

We use Zabbix.

2014-12-12 8:33 GMT+03:00 pragya jain prag_2...@yahoo.co.in:

 hello sir!

 I need some open source monitoring tool for examining these metrics.

 Please suggest some open source monitoring software.

 Thanks
 Regards
 Pragya Jain


   On Thursday, 11 December 2014 9:16 PM, Denish Patel den...@omniti.com
 wrote:



 Try http://www.circonus.com

 On Thu, Dec 11, 2014 at 1:22 AM, pragya jain prag_2...@yahoo.co.in
 wrote:

 please somebody reply my query.

 Regards
 Pragya Jain


   On Tuesday, 9 December 2014 11:53 AM, pragya jain prag_2...@yahoo.co.in
 wrote:



 hello all!

 As mentioned at statistics and monitoring page of Riak
 Systems Metrics To Graph
 http://docs.basho.com/riak/latest/ops/running/stats-and-monitoring/#Systems-Metrics-To-Graph
 MetricAvailable Disk SpaceIOWaitRead OperationsWrite OperationsNetwork
 ThroughputLoad Average
 Can somebody suggest me some monitoring tools that monitor these metrics?

 Regards
 Pragya Jain



 ___
 riak-users mailing list
 riak-us...@lists.basho.com
 http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com




 --
 Denish Patel,
 OmniTI Computer Consulting Inc.
 Database Architect,
 http://omniti.com/does/data-management
 http://www.pateldenish.com



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] VM restore on Ceph *very* slow

2014-12-11 Thread Irek Fasikhov
Hi.

For faster operation, use rbd export/export-diff and import/import-diff

2014-12-11 17:17 GMT+03:00 Lindsay Mathieson lindsay.mathie...@gmail.com:


 Anyone know why a VM live restore would be excessively slow on Ceph?
 restoring
 a  small VM with 12GB disk/2GB Ram is taking 18 *minutes*. Larger VM's can
 be
 over half an hour.

 The same VM's on the same disks, but native, or glusterfs take less than 30
 seconds.

 VM's are KVM on Proxmox.


 thanks,
 --
 Lindsay
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] VM restore on Ceph *very* slow

2014-12-11 Thread Irek Fasikhov
Examples
Backups:
/usr/bin/nice -n +20 /usr/bin/rbd -n client.backup export
test/vm-105-disk-1@rbd_data.505392ae8944a - | /usr/bin/pv -s 40G -n -i 1 |
/usr/bin/nice -n +20 /usr/bin/pbzip2 -c  /backup/vm-105-disk-1
Restore:
pbzip2 -dk /nfs/RBD/big-vm-268-disk-1-LyncV2-20140830-011308.pbzip2 -c |
rbd -n client.rbdbackup -k /etc/ceph/big.keyring -c /etc/ceph/big.conf
import --image-format 2 - rbd/Lyncolddisk1

2014-12-12 8:38 GMT+03:00 Irek Fasikhov malm...@gmail.com:

 Hi.

 For faster operation, use rbd export/export-diff and import/import-diff

 2014-12-11 17:17 GMT+03:00 Lindsay Mathieson lindsay.mathie...@gmail.com
 :


 Anyone know why a VM live restore would be excessively slow on Ceph?
 restoring
 a  small VM with 12GB disk/2GB Ram is taking 18 *minutes*. Larger VM's
 can be
 over half an hour.

 The same VM's on the same disks, but native, or glusterfs take less than
 30
 seconds.

 VM's are KVM on Proxmox.


 thanks,
 --
 Lindsay
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] What's the difference between ceph-0.87-0.el6.x86_64.rpm and ceph-0.80.7-0.el6.x86_64.rpm

2014-12-10 Thread Irek Fasikhov
Hi, Cao.

https://github.com/ceph/ceph/commits/firefly


2014-12-11 5:00 GMT+03:00 Cao, Buddy buddy@intel.com:

  Hi, I tried to download firefly rpm package, but found two rpms existing
 in different folders, what is the difference of 0.87.0 and  0.80.7?



 http://ceph.com/rpm/el6/x86_64/ceph-0.87-0.el6.x86_64.rpm

 http://ceph.com/rpm-firefly/el6/x86_64/ceph-0.80.7-0.el6.x86_64.rpm





 Wei Cao (Buddy)



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] active+degraded on an empty new cluster

2014-12-09 Thread Irek Fasikhov
Hi.

http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/

ceph pg force_create_pg pgid


2014-12-09 14:50 GMT+03:00 Giuseppe Civitella giuseppe.civite...@gmail.com
:

 Hi all,

 last week I installed a new Ceph cluster on 3 VMs running Ubuntu 14.04 with
 the default kernel.
 There is a Ceph monitor and two OSD hosts. Here are some details:
 ceph -s
 cluster c46d5b02-dab1-40bf-8a3d-f8e4a77b79da
  health HEALTH_WARN 192 pgs degraded; 192 pgs stuck unclean
  monmap e1: 1 mons at {ceph-mon1=10.1.1.83:6789/0}, election epoch 1,
 quorum 0 ceph-mon1
  osdmap e83: 6 osds: 6 up, 6 in
   pgmap v231: 192 pgs, 3 pools, 0 bytes data, 0 objects
 207 MB used, 30446 MB / 30653 MB avail
  192 active+degraded

 root@ceph-mon1:/home/ceph# ceph osd dump
 epoch 99
 fsid c46d5b02-dab1-40bf-8a3d-f8e4a77b79da
 created 2014-12-06 13:15:06.418843
 modified 2014-12-09 11:38:04.353279
 flags
 pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 64 pgp_num 64 last_change 18 flags hashpspool
 crash_replay_interval 45 stripe_width 0
 pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 64 pgp_num 64 last_change 19 flags hashpspool stripe_width 0
 pool 2 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 64 pgp_num 64 last_change 20 flags hashpspool stripe_width 0
 max_osd 6
 osd.0 up   in  weight 1 up_from 90 up_thru 90 down_at 89
 last_clean_interval [58,89) 10.1.1.84:6805/995 10.1.1.84:6806/4000995
 10.1.1.84:6807/4000995 10.1.1.84:6808/4000995 exists,up
 e3895075-614d-48e2-b956-96e13dbd87fe
 osd.1 up   in  weight 1 up_from 88 up_thru 0 down_at 87
 last_clean_interval [8,87) 10.1.1.85:6800/23146 10.1.1.85:6815/7023146
 10.1.1.85:6816/7023146 10.1.1.85:6817/7023146 exists,up
 144bc6ee-2e3d-4118-a460-8cc2bb3ec3e8
 osd.2 up   in  weight 1 up_from 61 up_thru 0 down_at 60
 last_clean_interval [11,60) 10.1.1.85:6805/26784 10.1.1.85:6802/5026784
 10.1.1.85:6811/5026784 10.1.1.85:6812/5026784 exists,up
 8d5c7108-ef11-4947-b28c-8e20371d6d78
 osd.3 up   in  weight 1 up_from 95 up_thru 0 down_at 94
 last_clean_interval [57,94) 10.1.1.84:6800/810 10.1.1.84:6810/3000810
 10.1.1.84:6811/3000810 10.1.1.84:6812/3000810 exists,up
 bd762b2d-f94c-4879-8865-cecd63895557
 osd.4 up   in  weight 1 up_from 97 up_thru 0 down_at 96
 last_clean_interval [74,96) 10.1.1.84:6801/9304 10.1.1.84:6802/2009304
 10.1.1.84:6803/2009304 10.1.1.84:6813/2009304 exists,up
 7d28a54b-b474-4369-b958-9e6bf6c856aa
 osd.5 up   in  weight 1 up_from 99 up_thru 0 down_at 98
 last_clean_interval [79,98) 10.1.1.85:6801/19513 10.1.1.85:6808/2019513
 10.1.1.85:6810/2019513 10.1.1.85:6813/2019513 exists,up
 f4d76875-0e40-487c-a26d-320f8b8d60c5

 root@ceph-mon1:/home/ceph# ceph osd tree
 # idweight  type name   up/down reweight
 -1  0   root default
 -2  0   host ceph-osd1
 0   0   osd.0   up  1
 3   0   osd.3   up  1
 4   0   osd.4   up  1
 -3  0   host ceph-osd2
 1   0   osd.1   up  1
 2   0   osd.2   up  1
 5   0   osd.5   up  1

 Current HEALTH_WARN state says 192 active+degraded since I rebooted an
 osd host. Previously it was incomplete. It never reached a HEALTH_OK
 state.
 Any hint about what to do next to have an healthy cluster?


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Issue in renaming rbd

2014-12-03 Thread Irek Fasikhov
Hi.
You can only rename in the same pool.
For transfer to another pool: rbd cp and rbd export/import.
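
A quick illustration (pool and image names here are placeholders, not taken
from the thread):

  rbd cp testPool2/rbdPool1 otherPool/rbdPool1
  # or stream it:
  rbd export testPool2/rbdPool1 - | rbd import --image-format 2 - otherPool/rbdPool1

Either way you get a copy; the source image still has to be removed separately.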

2014-12-03 16:15 GMT+03:00 Mallikarjun Biradar 
mallikarjuna.bira...@gmail.com:

 Hi all,

 Whether renaming rbd is allowed?

 I am getting this error,

 ems@rack6-ramp-4:~$ sudo rbd rename rbdPool1 -p testPool2 rbdPoolTest1 -p
 testPool2
 rbd: mv/rename across pools not supported
 source pool: testPool2 dest pool: rbd
 ems@rack6-ramp-4:~$

 ems@rack6-ramp-4:~$ sudo rbd rename rbdPool1 rbdPoolTest1
 rbd: rename error: (2) No such file or directory
 2014-12-03 18:41:50.786397 7f73b4f75840 -1 librbd: error finding source
 object: (2) No such file or directory
 ems@rack6-ramp-4:~$

 ems@rack6-ramp-4:~$ sudo rbd ls -p testPool2
 rbdPool1
 ems@rack6-ramp-4:~$

 Why is it taking rbd as the destination pool, though I have provided another
 pool as per the syntax?

 Syntax in rbd help:
 rbd   (mv | rename) src dest  rename src image to dest

 The rbd which I am trying to rename is mounted and IO is running on it.

 -Thanks  regards,
 Mallikarjun Biradar

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Issue in renaming rbd

2014-12-03 Thread Irek Fasikhov
root@backhb2:~# ceph osd pool -h | grep rename
osd pool rename poolname poolnamerename srcpool to destpool


2014-12-03 16:23 GMT+03:00 Mallikarjun Biradar 
mallikarjuna.bira...@gmail.com:

 Hi,

 I am trying to rename in the same pool.

 sudo rbd rename rbdPool1 -p testPool2 rbdPoolTest1 -p testPool2

 -Thanks  Regards,
 Mallikarjun Biradar

 On Wed, Dec 3, 2014 at 6:50 PM, Irek Fasikhov malm...@gmail.com wrote:

 Hi.
 You can only rename in the same pool.
 For transfer to another pool: rbd cp and rbd export/import.

 2014-12-03 16:15 GMT+03:00 Mallikarjun Biradar 
 mallikarjuna.bira...@gmail.com:

 Hi all,

 Whether renaming rbd is allowed?

 I am getting this error,

 ems@rack6-ramp-4:~$ sudo rbd rename rbdPool1 -p testPool2 rbdPoolTest1
 -p testPool2
 rbd: mv/rename across pools not supported
 source pool: testPool2 dest pool: rbd
 ems@rack6-ramp-4:~$

 ems@rack6-ramp-4:~$ sudo rbd rename rbdPool1 rbdPoolTest1
 rbd: rename error: (2) No such file or directory
 2014-12-03 18:41:50.786397 7f73b4f75840 -1 librbd: error finding source
 object: (2) No such file or directory
 ems@rack6-ramp-4:~$

 ems@rack6-ramp-4:~$ sudo rbd ls -p testPool2
 rbdPool1
 ems@rack6-ramp-4:~$

 Why its taking rbd as destination pool, though I have provided another
 pool as per syntax.

 Syntax in rbd help:
 rbd   (mv | rename) src dest  rename src image to
 dest

 The rbd which I am trying to rename is mounted and IO is running on it.

 -Thanks  regards,
 Mallikarjun Biradar

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757





-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] trouble starting second monitor

2014-12-01 Thread Irek Fasikhov
[celtic][DEBUG ] create the mon path if it does not exist

mkdir /var/lib/ceph/mon/

2014-12-01 4:32 GMT+03:00 K Richard Pixley r...@noir.com:

 What does this mean, please?

 --rich

 ceph@adriatic:~/my-cluster$ ceph status
 cluster 1023db58-982f-4b78-b507-481233747b13
  health HEALTH_OK
  monmap e1: 1 mons at {black=192.168.1.77:6789/0}, election epoch 2,
 quorum 0 black
  mdsmap e7: 1/1/1 up {0=adriatic=up:active}, 3 up:standby
  osdmap e17: 4 osds: 4 up, 4 in
   pgmap v48: 192 pgs, 3 pools, 1884 bytes data, 20 objects
 29134 MB used, 113 GB / 149 GB avail
  192 active+clean
 ceph@adriatic:~/my-cluster$ ceph-deploy mon create celtic
 [ceph_deploy.conf][DEBUG ] found configuration file at:
 /home/ceph/.cephdeploy.conf
 [ceph_deploy.cli][INFO  ] Invoked (1.5.20): /usr/bin/ceph-deploy mon
 create celtic
 [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts celtic
 [ceph_deploy.mon][DEBUG ] detecting platform for host celtic ...
 [celtic][DEBUG ] connection detected need for sudo
 [celtic][DEBUG ] connected to host: celtic
 [celtic][DEBUG ] detect platform information from remote host
 [celtic][DEBUG ] detect machine type
 [ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty
 [celtic][DEBUG ] determining if provided host has same hostname in remote
 [celtic][DEBUG ] get remote short hostname
 [celtic][DEBUG ] deploying mon to celtic
 [celtic][DEBUG ] get remote short hostname
 [celtic][DEBUG ] remote hostname: celtic
 [celtic][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
 [celtic][DEBUG ] create the mon path if it does not exist
 [celtic][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-celtic/
 done
 [celtic][DEBUG ] create a done file to avoid re-doing the mon deployment
 [celtic][DEBUG ] create the init path if it does not exist
 [celtic][DEBUG ] locating the `service` executable...
 [celtic][INFO  ] Running command: sudo initctl emit ceph-mon cluster=ceph
 id=celtic
 [celtic][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon
 /var/run/ceph/ceph-mon.celtic.asok mon_status
 [celtic][ERROR ] admin_socket: exception getting command descriptions:
 [Errno 2] No such file or directory
 [celtic][WARNIN] monitor: mon.celtic, might not be running yet
 [celtic][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon
 /var/run/ceph/ceph-mon.celtic.asok mon_status
 [celtic][ERROR ] admin_socket: exception getting command descriptions:
 [Errno 2] No such file or directory
 [celtic][WARNIN] celtic is not defined in `mon initial members`
 [celtic][WARNIN] monitor celtic does not exist in monmap
 [celtic][WARNIN] neither `public_addr` nor `public_network` keys are
 defined for monitors
 [celtic][WARNIN] monitors may not be able to form quorum

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] S3CMD and Ceph

2014-11-27 Thread Irek Fasikhov
Hi, Ben!

Do you perhaps have a permissions problem? The configuration itself is fully
operational.

2014-11-27 11:39 GMT+03:00 Ben b@benjackson.email:

  Even with those settings it doesn't work.

 I still get ERROR: Access to bucket 'BUCKET' was denied

 Radosgw-admin shows me as the owner of the bucket, and when I do 's3cmd
 ls' by itself, it lists all buckets. But when I do 's3cmd ls s3://BUCKET'
 it gives me an access denied error.



 On 27/11/14 19:32, Irek Fasikhov wrote:

   This works for me:
  [rbd@rbdbackup ~]$ cat .s3cfg
 [default]
 access_key = 2M4PRTYOGI3AXBZFAXFR
 secret_key = LQYFttxRn+7bBJ5rD1Y7ckZCN8XjEInOFY3s9RUR
 host_base = s3.X.ru
 host_bucket = %(bucket)s.s3.X.ru
 enable_multipart = True
 multipart_chunk_size_mb = 30
 use_https = True


 2014-11-27 7:43 GMT+03:00 b b@benjackson.email:

 I'm having some issues with a user in ceph using S3 Browser and S3cmd

 It was previously working.

 I can no longer use s3cmd to list the contents of a bucket; I am getting
 403 and 405 errors.
 When using S3 Browser, I can see the contents of the bucket and I can upload
 files, but I cannot create additional folders within the bucket (I get a 403
 error).

 The bucket is owned by the user, I am using the correct keys, I have
 checked the keys for escape characters, but there are no slashes in the key.

 I'm not sure what else I can do to get this to work.
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




  --
  С уважением, Фасихов Ирек Нургаязович
 Моб.: +79229045757





-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] S3CMD and Ceph

2014-11-27 Thread Irek Fasikhov
This works for me:
[rbd@rbdbackup ~]$ cat .s3cfg
[default]
access_key = 2M4PRTYOGI3AXBZFAXFR
secret_key = LQYFttxRn+7bBJ5rD1Y7ckZCN8XjEInOFY3s9RUR
host_base = s3.X.ru
host_bucket = %(bucket)s.s3.X.ru
enable_multipart = True
multipart_chunk_size_mb = 30
use_https = True


2014-11-27 7:43 GMT+03:00 b b@benjackson.email:

 I'm having some issues with a user in ceph using S3 Browser and S3cmd

 It was previously working.

 I can no longer use s3cmd to list the contents of a bucket; I am getting
 403 and 405 errors.
 When using S3 Browser, I can see the contents of the bucket and I can upload
 files, but I cannot create additional folders within the bucket (I get a 403
 error).

 The bucket is owned by the user, I am using the correct keys, I have
 checked the keys for escape characters, but there are no slashes in the key.

 I'm not sure what else I can do to get this to work.
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
С уважением, Фасихов Ирек Нургаязович
Моб.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osds fails to start with mismatch in id

2014-11-10 Thread Irek Fasikhov
Hi, Ramakrishna.
I think you understand what the problem is:
[ceph@ceph05 ~]$ cat /var/lib/ceph/osd/ceph-56/whoami
56
[ceph@ceph05 ~]$ cat /var/lib/ceph/osd/ceph-57/whoami
57


Tue Nov 11 2014 at 6:01:40, Ramakrishna Nishtala (rnishtal) 
rnish...@cisco.com:

  Hi Greg,

 Thanks for the pointer. I think you are right. The full story is like this.



 After installation, everything works fine until I reboot. I do observe
 udevadm getting triggered in logs, but the devices do not come up after
 reboot. Exact issue as http://tracker.ceph.com/issues/5194. But this has
 been fixed a while back per the case details.

 As a workaround, I copied the contents from /proc/mounts to fstab and
 that’s where I landed into the issue.



 After your suggestion, I defined the mounts by UUID in fstab, but hit a
 similar problem.

 blkid.tab has now moved to tmpfs and also isn't consistent even after issuing
 blkid explicitly to get the UUIDs. That is in line with the ceph-disk comments.



 I decided to reinstall, dd the partitions, zap the disks, etc. It did not
 help. It is very weird that the links below change in /dev/disk/by-uuid and
 /dev/disk/by-partuuid etc.



 *Before reboot*

 lrwxrwxrwx 1 root root 10 Nov 10 06:31
 11aca3e2-a9d5-4bcc-a5b0-441c53d473b6 - ../../sdd2

 lrwxrwxrwx 1 root root 10 Nov 10 06:31
 89594989-90cb-4144-ac99-0ffd6a04146e - ../../sde2

 lrwxrwxrwx 1 root root 10 Nov 10 06:31
 c17fe791-5525-4b09-92c4-f90eaaf80dc6 - ../../sda2

 lrwxrwxrwx 1 root root 10 Nov 10 06:31
 c57541a1-6820-44a8-943f-94d68b4b03d4 - ../../sdc2

 lrwxrwxrwx 1 root root 10 Nov 10 06:31
 da7030dd-712e-45e4-8d89-6e795d9f8011 - ../../sdb2



 *After reboot*

 lrwxrwxrwx 1 root root 10 Nov 10 09:50
 11aca3e2-a9d5-4bcc-a5b0-441c53d473b6 - ../../sdd2

 lrwxrwxrwx 1 root root 10 Nov 10 09:50
 89594989-90cb-4144-ac99-0ffd6a04146e - ../../sde2

 lrwxrwxrwx 1 root root 10 Nov 10 09:50
 c17fe791-5525-4b09-92c4-f90eaaf80dc6 - ../../sda2

 lrwxrwxrwx 1 root root 10 Nov 10 09:50
 c57541a1-6820-44a8-943f-94d68b4b03d4 - ../../sdb2

 lrwxrwxrwx 1 root root 10 Nov 10 09:50
 da7030dd-712e-45e4-8d89-6e795d9f8011 - ../../sdh2



 Essentially, the transformation here is sdb2 -> sdh2 and sdc2 -> sdb2. In
 fact I had not partitioned my sdh at all before the test. The only
 difference probably from the standard procedure is that I pre-created the
 partitions for the journal and data with parted.



 /lib/udev/rules.d  osd rules has four different partition GUID codes,

 45b0969e-9b03-4f30-b4c6-5ec00ceff106,

 45b0969e-9b03-4f30-b4c6-b4b80ceff106,

 4fbd7e29-9d25-41b8-afd0-062c0ceff05d,

 4fbd7e29-9d25-41b8-afd0-5ec00ceff05d,



 But all my partitions journal/data are having
 ebd0a0a2-b9e5-4433-87c0-68b6b72699c7 as partition guid code.
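
 [Illustration only, not advice given in the thread - device and partition
 numbers are placeholders: since those udev rules match on the partition type
 GUID, pre-created partitions can be retagged with sgdisk so the rules fire,
 e.g.

   sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d /dev/sdb   # data
   sgdisk --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb   # journal

 followed by partprobe /dev/sdb or a reboot.]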



 Appreciate any help.



 Regards,



 Rama

 =

 -Original Message-
 From: Gregory Farnum [mailto:g...@gregs42.com]
 Sent: Sunday, November 09, 2014 3:36 PM
 To: Ramakrishna Nishtala (rnishtal)
 Cc: ceph-us...@ceph.com
 Subject: Re: [ceph-users] osds fails to start with mismatch in id



 On Sun, Nov 9, 2014 at 3:21 PM, Ramakrishna Nishtala (rnishtal) 
 rnish...@cisco.com wrote:

  Hi

 

  I am on ceph 0.87, RHEL 7

 

  Out of 60, only a few OSDs start and the rest complain about a mismatch in
  IDs, as below.

 

 

 

  2014-11-09 07:09:55.501177 7f4633e01880 -1 OSD id 56 != my id 53

 

  2014-11-09 07:09:55.810048 7f636edf4880 -1 OSD id 57 != my id 54

 

  2014-11-09 07:09:56.122957 7f459a766880 -1 OSD id 58 != my id 55

 

  2014-11-09 07:09:56.429771 7f87f8e0c880 -1 OSD id 0 != my id 56

 

  2014-11-09 07:09:56.741329 7fadd9b91880 -1 OSD id 2 != my id 57

 

 

 

  I found one OSD ID in /var/lib/ceph/cluster-id/keyring. To check this
  out I manually corrected it and turned authentication to none too, but
  it did not help.

 

 

 

  Any clues, how it can be corrected?



 It sounds like maybe the symlinks to data and journal aren't matching up
 with where they're supposed to be. This is usually a result of using
 unstable /dev links that don't always match to the same physical disks.
 Have you checked that?

 -Greg
  ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG inconsistency

2014-11-06 Thread Irek Fasikhov
What is your version of Ceph?
0.80.0 - 0.80.3
https://github.com/ceph/ceph/commit/7557a8139425d1705b481d7f010683169fd5e49b

Thu Nov 06 2014 at 16:24:21, GuangYang yguan...@outlook.com:

 Hello Cephers,
 Recently we observed a couple of inconsistencies in our Ceph cluster;
 there were two major patterns leading to inconsistency, as I observed: 1)
 EIO when reading the file, 2) the digest is inconsistent (for EC) even though there is
 no read error.

 While Ceph has built-in tool sets to repair the inconsistencies, I also
 would like to check with the community in terms of what the best ways are to
 handle such issues (e.g. should we run fsck / xfs_repair when such an issue
 happens).

 In more detail, I have the following questions:
 1. When an inconsistency is detected, what is the chance there is a
 hardware issue which needs to be repaired physically, or should I run some
 disk/filesystem tools to check further?
 2. Should we use fsck / xfs_repair to fix the inconsistencies, or should
 we rely solely on Ceph's repair tool sets?

 It would be great to hear your experience and suggestions.

 BTW, we are using XFS in the cluster.

 Thanks,
 Guang
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PG inconsistency

2014-11-06 Thread Irek Fasikhov
Thu Nov 06 2014 at 16:44:09, GuangYang yguan...@outlook.com:

 Thanks Dan. By killed/formatted/replaced the OSD, did you replace the
 disk? Not a filesystem expert here, but I would like to understand
 what happened underneath the EIO and whether that reveals something
 (e.g. a hardware issue).

 In our case, we are using 6TB drives, so there is a lot of data to
 migrate, and as backfilling/recovering increases latency, we hope to
 avoid that as much as we can.


For example, use the following parameters:
osd_recovery_delay_start = 10
osd recovery op priority = 2
osd max backfills = 1
osd recovery max active =1
osd recovery threads = 1
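These can be put in the [osd] section of ceph.conf, or injected on a running cluster without restarts, roughly like this (option names normalized to underscores):

ceph tell osd.* injectargs '--osd_recovery_delay_start 10 --osd_recovery_op_priority 2 --osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_threads 1'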




 Thanks,
 Guang

 
  From: daniel.vanders...@cern.ch
  Date: Thu, 6 Nov 2014 13:36:46 +
  Subject: Re: PG inconsistency
  To: yguan...@outlook.com; ceph-users@lists.ceph.com
 
  Hi,
  I've only ever seen (1), EIO to read a file. In this case I've always
  just killed / formatted / replaced that OSD completely -- that moves
  the PG to a new master and the new replication fixes the
  inconsistency. This way, I've never had to pg repair. I don't know if
  this is a best or even good practise, but it works for us.
  Cheers, Dan
 
  On Thu Nov 06 2014 at 2:24:32 PM GuangYang
  yguan...@outlook.commailto:yguan...@outlook.com wrote:
  Hello Cephers,
  Recently we observed a couple of inconsistencies in our Ceph cluster,
  there were two major patterns leading to inconsistency as I observed:
  1) EIO to read the file, 2) the digest is inconsistent (for EC) even
  there is no read error).
 
  While ceph has built-in tool sets to repair the inconsistencies, I also
  would like to check with the community in terms of what is the best
  ways to handle such issues (e.g. should we run fsck / xfs_repair when
  such issue happens).
 
  In more details, I have the following questions:
  1. When there is inconsistency detected, what is the chance there is
  some hardware issues which need to be repaired physically, or should I
  run some disk/filesystem tools to further check?
  2. Should we use fsck / xfs_repair to fix the inconsistencies, or
  should we solely relay on Ceph's repair tool sets?
 
  It would be great to hear you experience and suggestions.
 
  BTW, we are using XFS in the cluster.
 
  Thanks,
  Guang

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Typical 10GbE latency

2014-11-06 Thread Irek Fasikhov
Hi,Udo.
Good value :)

Did you apply any additional optimization on the hosts?
Thanks.

Thu Nov 06 2014 at 16:57:36, Udo Lembke ulem...@polarzone.de:

 Hi,
 from one host to five OSD-hosts.

 NIC Intel 82599EB; jumbo-frames; single Switch IBM G8124 (blade network).

 rtt min/avg/max/mdev = 0.075/0.114/0.231/0.037 ms
 rtt min/avg/max/mdev = 0.088/0.164/0.739/0.072 ms
 rtt min/avg/max/mdev = 0.081/0.141/0.229/0.030 ms
 rtt min/avg/max/mdev = 0.083/0.115/0.183/0.030 ms
 rtt min/avg/max/mdev = 0.087/0.144/0.190/0.028 ms


 Udo

 On 06.11.2014 14:18, Wido den Hollander wrote:
  Hello,
 
  While working at a customer I've run into 10GbE latency which seems
  high to me.
 
  I have access to a couple of Ceph cluster and I ran a simple ping test:
 
  $ ping -s 8192 -c 100 -n ip
 
  Two results I got:
 
  rtt min/avg/max/mdev = 0.080/0.131/0.235/0.039 ms
  rtt min/avg/max/mdev = 0.128/0.168/0.226/0.023 ms
 
  Both these environment are running with Intel 82599ES 10Gbit cards in
  LACP. One with Extreme Networks switches, the other with Arista.
 
  Now, on an environment with Cisco Nexus 3000 and Nexus 7000 switches I'm
  seeing:
 
  rtt min/avg/max/mdev = 0.160/0.244/0.298/0.029 ms
 
  As you can see, the Cisco Nexus network has high latency compared to the
  other setup.
 
  You would say the switches are to blame, but we also tried with a direct
  TwinAx connection, but that didn't help.
 
  This setup also uses the Intel 82599ES cards, so the cards don't seem to
  be the problem.
 
  The MTU is set to 9000 on all these networks and cards.
 
  I was wondering, others with a Ceph cluster running on 10GbE, could you
  perform a simple network latency test like this? I'd like to compare the
  results.
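A throwaway sketch for collecting comparable numbers from one host to all OSD hosts, assuming a hosts.txt with one IP per line:

for ip in $(cat hosts.txt); do
    echo -n "$ip  "
    ping -s 8192 -c 100 -n -q $ip | tail -1
done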
 

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Full backup/restore of Ceph cluster?

2014-11-05 Thread Irek Fasikhov
Hi.

I changed the script and added a multithreaded archiver to it.
See: http://www.theirek.com/blog/2014/10/26/primier-biekapa-rbd-ustroistva

2014-11-05 14:03 GMT+03:00 Alexandre DERUMIER aderum...@odiso.com:

 What if I just wanted to back up a running cluster without having
 another cluster to replicate to

 Yes, import is optional,

 you can simply export and pipe to tar


 rbd export-diff --from-snap snap1 pool/image@snap2 - | tar 
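Fleshed out a little, a per-image incremental backup can look roughly like this (placeholder pool/image/snapshot names; snap1 must exist from the previous run):

rbd snap create pool/image@snap2
rbd export-diff --from-snap snap1 pool/image@snap2 - | gzip > image_snap1_to_snap2.diff.gz
# restore later with:
#   gunzip -c image_snap1_to_snap2.diff.gz | rbd import-diff - pool/image
rbd snap rm pool/image@snap1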


 - Original Message -

 From: Christopher Armstrong ch...@opdemand.com
 To: Alexandre DERUMIER aderum...@odiso.com
 Cc: ceph-users@lists.ceph.com
 Sent: Wednesday, 5 November 2014 10:08:49
 Subject: Re: [ceph-users] Full backup/restore of Ceph cluster?


 Hi Alexandre,


 Thanks for the link! Unless I'm misunderstanding, this is to replicate an
 RBD volume from one cluster to another, no?
 I.e. I'd ideally like a tarball of raw files that I could extract on a
 new host, start the Ceph daemons, and get up and running.





 Chris Armstrong
 Head of Services
 OpDemand / Deis.io
 GitHub: https://github.com/deis/deis -- Docs: http://docs.deis.io/


 On Wed, Nov 5, 2014 at 1:04 AM, Alexandre DERUMIER  aderum...@odiso.com
  wrote:


 Is RBD snapshotting what I'm looking for? Is this even possible?

 Yes, you can use rbd snapshoting, export / import

 http://ceph.com/dev-notes/incremental-snapshots-with-rbd/

 But you need to do it for each rbd volume.

 Here a script to do it:

 http://www.rapide.nl/blog/item/ceph_-_rbd_replication



 (AFAIK it's not possible to do it at pool level)


 - Original Message -

 From: Christopher Armstrong ch...@opdemand.com
 To: ceph-users@lists.ceph.com
 Sent: Wednesday, 5 November 2014 08:52:31
 Subject: [ceph-users] Full backup/restore of Ceph cluster?




 Hi folks,


 I was wondering if anyone has a solution for performing a complete backup
 and restore of a Ceph cluster. A Google search came up with some
 articles/blog posts, some of which are old, and I don't really have a great
 idea of the feasibility of this.


 Here's what I've found:


 http://ceph.com/community/blog/tag/backup/

 http://ceph.com/docs/giant/rbd/rbd-snapshot/

 http://t3491.file-systems-ceph-user.file-systemstalk.us/backups-t3491.html



 Is RBD snapshotting what I'm looking for? Is this even possible? Any info
 is much appreciated!


 Thanks,


 Chris




 Chris Armstrong
 Head of Services
 OpDemand / Deis.io
 GitHub: https://github.com/deis/deis -- Docs: http://docs.deis.io/

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] where to download 0.87 debs?

2014-10-30 Thread Irek Fasikhov
http://ceph.com/debian-giant/ :)

2014-10-30 12:45 GMT+03:00 Jon Kåre Hellan jon.kare.hel...@uninett.no:

  Will there be debs?

 On 30/10/14 10:37, Irek Fasikhov wrote:

 Hi.

  Use http://ceph.com/rpm-giant/

 2014-10-30 12:34 GMT+03:00 Kenneth Waegeman kenneth.waege...@ugent.be:

 Hi,

 Will http://ceph.com/rpm/ also be updated to have the giant packages?

 Thanks

 Kenneth




 - Message from Patrick McGarry patr...@inktank.com -
Date: Wed, 29 Oct 2014 22:13:50 -0400
From: Patrick McGarry patr...@inktank.com
 Subject: Re: [ceph-users] where to download 0.87 RPMS?
  To: 廖建锋 de...@f-club.cn
  Cc: ceph-users ceph-users@lists.ceph.com



  I have updated the http://ceph.com/get page to reflect a more generic
 approach to linking.  It's also worth noting that the new
 http://download.ceph.com/ infrastructure is available now.

 To get to the rpms specifically you can either crawl the
 download.ceph.com tree or use the symlink at
 http://ceph.com/rpm-giant/

 Hope that (and the updated linkage on ceph.com/get) helps.  Thanks!


 Best Regards,

 Patrick McGarry
 Director Ceph Community || Red Hat
 http://ceph.com  ||  http://community.redhat.com
 @scuttlemonkey || @ceph


 On Wed, Oct 29, 2014 at 9:15 PM, 廖建锋 de...@f-club.cn wrote:




 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

  ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



  - End message from Patrick McGarry patr...@inktank.com -

 --

 With kind regards,
 Kenneth Waegeman


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




  --
 Best regards, Irek Fasikhov Nurgayazovich
 Mob.: +79229045757


 ___
 ceph-users mailing 
 listceph-us...@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] When will Ceph 0.72.3?

2014-10-29 Thread Irek Fasikhov
Dear developers.

We very much want IO priorities ;)
During the execution of a snap rollback, slow requests appear.

Thanks
-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Use 2 osds to create cluster but health check display active+degraded

2014-10-29 Thread Irek Fasikhov
Hi.

Because the data has to be placed on three different hosts: the default number of
replicas is 3.
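If you do want a two-OSD test cluster on a single host to go active+clean, a rough sketch is to lower the replica count and let CRUSH choose leaves at the OSD level instead of the host level, e.g. in ceph.conf before deploying:

osd pool default size = 2
osd pool default min size = 1
osd crush chooseleaf type = 0    # 0 = osd, 1 = host (the default)

On an already-deployed cluster the pool size can also be changed per pool with ceph osd pool set <pool> size 2.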

2014-10-29 10:56 GMT+03:00 Vickie CH mika.leaf...@gmail.com:

 Hi all,
   I tried to use two OSDs to create a cluster. After the deploy finished, I
 found the health status is 88 active+degraded, 104 active+remapped.
 Previously, using 2 OSDs to create a cluster, the result was ok. I'm confused why this
 situation happened. Do I need to set the crush map to fix this problem?


 --ceph.conf-
 [global]
 fsid = c404ded6-4086-4f0b-b479-89bc018af954
 mon_initial_members = storage0
 mon_host = 192.168.1.10
 auth_cluster_required = cephx
 auth_service_required = cephx
 auth_client_required = cephx
 filestore_xattr_use_omap = true
 osd_pool_default_size = 2
 osd_pool_default_min_size = 1
 osd_pool_default_pg_num = 128
 osd_journal_size = 2048
 osd_pool_default_pgp_num = 128
 osd_mkfs_type = xfs
 -

 ---ceph -s---
 cluster c404ded6-4086-4f0b-b479-89bc018af954
  health HEALTH_WARN 88 pgs degraded; 192 pgs stuck unclean
  monmap e1: 1 mons at {storage0=192.168.10.10:6789/0}, election epoch
 2, quorum 0 storage0
  osdmap e20: 2 osds: 2 up, 2 in
   pgmap v45: 192 pgs, 3 pools, 0 bytes data, 0 objects
 79752 kB used, 1858 GB / 1858 GB avail
   88 active+degraded
  104 active+remapped
 


 Best wishes,
 Mika

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Use 2 osds to create cluster but health check display active+degraded

2014-10-29 Thread Irek Fasikhov
Hi.
This parameter is not applied to the pools that already exist (the ones created by default).
Run ceph osd dump | grep pool and check the size= value.


2014-10-29 11:40 GMT+03:00 Vickie CH mika.leaf...@gmail.com:

 Der Irek:

 Thanks for your reply.
 Even with osd_pool_default_size = 2 already set, the cluster still needs 3
 different hosts, right?
 Can this default number be changed by the user and written into ceph.conf
 before deploying?


 Best wishes,
 Mika

 2014-10-29 16:29 GMT+08:00 Irek Fasikhov malm...@gmail.com:

 Hi.

 Because the disc requires three different hosts, the default number of
 replications 3.

 2014-10-29 10:56 GMT+03:00 Vickie CH mika.leaf...@gmail.com:

 Hi all,
   Try to use two OSDs to create a cluster. After the deply finished,
 I found the health status is 88 active+degraded 104 active+remapped.
 Before use 2 osds to create cluster the result is ok. I'm confuse why this
 situation happened. Do I need to set crush map to fix this problem?


 --ceph.conf-
 [global]
 fsid = c404ded6-4086-4f0b-b479-89bc018af954
 mon_initial_members = storage0
 mon_host = 192.168.1.10
 auth_cluster_required = cephx
 auth_service_required = cephx
 auth_client_required = cephx
 filestore_xattr_use_omap = true
 osd_pool_default_size = 2
 osd_pool_default_min_size = 1
 osd_pool_default_pg_num = 128
 osd_journal_size = 2048
 osd_pool_default_pgp_num = 128
 osd_mkfs_type = xfs
 -

 ---ceph -s---
 cluster c404ded6-4086-4f0b-b479-89bc018af954
  health HEALTH_WARN 88 pgs degraded; 192 pgs stuck unclean
  monmap e1: 1 mons at {storage0=192.168.10.10:6789/0}, election
 epoch 2, quorum 0 storage0
  osdmap e20: 2 osds: 2 up, 2 in
   pgmap v45: 192 pgs, 3 pools, 0 bytes data, 0 objects
 79752 kB used, 1858 GB / 1858 GB avail
   88 active+degraded
  104 active+remapped
 


 Best wishes,
 Mika

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 Best regards, Irek Fasikhov Nurgayazovich
 Mob.: +79229045757





-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Use 2 osds to create cluster but health check display active+degraded

2014-10-29 Thread Irek Fasikhov
Mark.
I meant that for the existing pools this parameter is not used.
I'm sure the pools DATA, METADATA, RBD (they are created by default) have
size = 3.

2014-10-29 11:56 GMT+03:00 Mark Kirkwood mark.kirkw...@catalyst.net.nz:

 That is not my experience:

 $ ceph -v
 ceph version 0.86-579-g06a73c3 (06a73c39169f2f332dec760f56d3ec20455b1646)

 $ cat /etc/ceph/ceph.conf
 [global]
 ...
 osd pool default size = 2

 $ ceph osd dump|grep size
 pool 2 'hot' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 128 pgp_num 128 last_change 47 flags
 hashpspool,incomplete_clones tier_of 1 cache_mode writeback target_bytes
 20 hit_set bloom{false_positive_probability: 0.05, target_size:
 0, seed: 0} 3600s x1 stripe_width 0
 pool 10 '.rgw.root' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 102 owner
 18446744073709551615 flags hashpspool stripe_width 0
 pool 11 '.rgw.control' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 104 owner
 18446744073709551615 flags hashpspool stripe_width 0
 pool 12 '.rgw' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 8 pgp_num 8 last_change 106 owner 18446744073709551615
 flags hashpspool stripe_width 0
 pool 13 '.rgw.gc' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 8 pgp_num 8 last_change 107 owner 18446744073709551615
 flags hashpspool stripe_width 0
 pool 14 '.users.uid' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 108 owner
 18446744073709551615 flags hashpspool stripe_width 0
 pool 15 '.rgw.buckets.index' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 110 owner
 18446744073709551615 flags hashpspool stripe_width 0
 pool 16 '.rgw.buckets' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 112 owner
 18446744073709551615 flags hashpspool stripe_width 0
 pool 17 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 1024 pgp_num 1024 last_change 186 flags hashpspool
 stripe_width 0






 On 29/10/14 21:46, Irek Fasikhov wrote:

 Hi.
 This parameter does not apply to pools by default.
 ceph osd dump | grep pool. see size=?


 2014-10-29 11:40 GMT+03:00 Vickie CH mika.leaf...@gmail.com
 mailto:mika.leaf...@gmail.com:

 Der Irek:

 Thanks for your reply.
 Even already set osd_pool_default_size = 2 the cluster still need
 3 different hosts right?
 Is this default number can be changed by user and write into
 ceph.conf before deploy?


 Best wishes,
 Mika

 2014-10-29 16:29 GMT+08:00 Irek Fasikhov malm...@gmail.com
 mailto:malm...@gmail.com:

 Hi.

 Because the disc requires three different hosts, the default
 number of replications 3.

 2014-10-29 10:56 GMT+03:00 Vickie CH mika.leaf...@gmail.com
 mailto:mika.leaf...@gmail.com:


 Hi all,
Try to use two OSDs to create a cluster. After the
 deply finished, I found the health status is 88
 active+degraded 104 active+remapped. Before use 2 osds to
 create cluster the result is ok. I'm confuse why this
 situation happened. Do I need to set crush map to fix this
 problem?


 --ceph.conf-
 [global]
 fsid = c404ded6-4086-4f0b-b479-89bc018af954
 mon_initial_members = storage0
 mon_host = 192.168.1.10
 auth_cluster_required = cephx
 auth_service_required = cephx
 auth_client_required = cephx
 filestore_xattr_use_omap = true
 osd_pool_default_size = 2
 osd_pool_default_min_size = 1
 osd_pool_default_pg_num = 128
 osd_journal_size = 2048
 osd_pool_default_pgp_num = 128
 osd_mkfs_type = xfs
 -

 ---ceph -s---
 cluster c404ded6-4086-4f0b-b479-89bc018af954
   health HEALTH_WARN 88 pgs degraded; 192 pgs stuck
 unclean
   monmap e1: 1 mons at {storage0=192.168.10.10:6789/0
 http://192.168.10.10:6789/0}, election epoch 2, quorum 0
 storage0
   osdmap e20: 2 osds: 2 up, 2 in
pgmap v45: 192 pgs, 3 pools, 0 bytes data, 0 objects
  79752 kB used, 1858 GB / 1858 GB avail
88 active+degraded
   104 active+remapped
 


 Best wishes,
 Mika

 ___
 ceph-users mailing list

Re: [ceph-users] Use 2 osds to create cluster but health check display active+degraded

2014-10-29 Thread Irek Fasikhov
ceph osd tree please :)

2014-10-29 12:03 GMT+03:00 Vickie CH mika.leaf...@gmail.com:

 Dear all,
 Thanks for the reply.
 Pool replicated size is 2, because the replicated size parameter was already
 written into ceph.conf before deploying.
 Since I'm not familiar with the crush map, I will follow Mark's information and do
 a test that changes the crush map to see the result.

 ---ceph.conf--
 [global]
 fsid = c404ded6-4086-4f0b-b479-89bc018af954
 mon_initial_members = storage0
 mon_host = 192.168.1.10
 auth_cluster_required = cephx
 auth_service_required = cephx
 auth_client_required = cephx
 filestore_xattr_use_omap = true

 osd_pool_default_size = 2
 osd_pool_default_min_size = 1
 osd_pool_default_pg_num = 128
 osd_journal_size = 2048
 osd_pool_default_pgp_num = 128
 osd_mkfs_type = xfs
 ---

 --ceph osd dump result -
 pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 64 pgp_num 64 last_change 14 flags hashpspool
 crash_replay_interval 45 stripe_width 0
 pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 64 pgp_num 64 last_change 15 flags hashpspool stripe_width 0
 pool 2 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 64 pgp_num 64 last_change 16 flags hashpspool stripe_width 0
 max_osd 2

 --
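Since the pools are already size 2, what usually remains is the CRUSH rule choosing leaves of type host while both OSDs sit on one host. A sketch of editing that on a live cluster (the rule name and step may differ in your map):

ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# in crush.txt change, in the relevant rule(s):
#   step chooseleaf firstn 0 type host
# to:
#   step chooseleaf firstn 0 type osd
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new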

 Best wishes,
 Mika

 Best wishes,
 Mika

 2014-10-29 16:56 GMT+08:00 Mark Kirkwood mark.kirkw...@catalyst.net.nz:

 That is not my experience:

 $ ceph -v
 ceph version 0.86-579-g06a73c3 (06a73c39169f2f332dec760f56d3ec20455b1646)

 $ cat /etc/ceph/ceph.conf
 [global]
 ...
 osd pool default size = 2

 $ ceph osd dump|grep size
 pool 2 'hot' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 128 pgp_num 128 last_change 47 flags
 hashpspool,incomplete_clones tier_of 1 cache_mode writeback target_bytes
 20 hit_set bloom{false_positive_probability: 0.05, target_size:
 0, seed: 0} 3600s x1 stripe_width 0
 pool 10 '.rgw.root' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 102 owner
 18446744073709551615 flags hashpspool stripe_width 0
 pool 11 '.rgw.control' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 104 owner
 18446744073709551615 flags hashpspool stripe_width 0
 pool 12 '.rgw' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 8 pgp_num 8 last_change 106 owner 18446744073709551615
 flags hashpspool stripe_width 0
 pool 13 '.rgw.gc' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 107 owner
 18446744073709551615 flags hashpspool stripe_width 0
 pool 14 '.users.uid' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 108 owner
 18446744073709551615 flags hashpspool stripe_width 0
 pool 15 '.rgw.buckets.index' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 110 owner
 18446744073709551615 flags hashpspool stripe_width 0
 pool 16 '.rgw.buckets' replicated size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 112 owner
 18446744073709551615 flags hashpspool stripe_width 0
 pool 17 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 1024 pgp_num 1024 last_change 186 flags hashpspool
 stripe_width 0






 On 29/10/14 21:46, Irek Fasikhov wrote:

 Hi.
 This parameter does not apply to pools by default.
 ceph osd dump | grep pool. see size=?


 2014-10-29 11:40 GMT+03:00 Vickie CH mika.leaf...@gmail.com
 mailto:mika.leaf...@gmail.com:

 Der Irek:

 Thanks for your reply.
 Even already set osd_pool_default_size = 2 the cluster still need
 3 different hosts right?
 Is this default number can be changed by user and write into
 ceph.conf before deploy?


 Best wishes,
 Mika

 2014-10-29 16:29 GMT+08:00 Irek Fasikhov malm...@gmail.com
 mailto:malm...@gmail.com:

 Hi.

 Because the disc requires three different hosts, the default
 number of replications 3.

 2014-10-29 10:56 GMT+03:00 Vickie CH mika.leaf...@gmail.com
 mailto:mika.leaf...@gmail.com:


 Hi all,
Try to use two OSDs to create a cluster. After the
 deply finished, I found the health status is 88
 active+degraded 104 active+remapped. Before use 2 osds to
 create cluster the result is ok. I'm confuse why this
 situation happened. Do I need to set crush map to fix this
 problem?


 --ceph.conf-
 [global]
 fsid = c404ded6-4086-4f0b-b479-89bc018af954
 mon_initial_members

Re: [ceph-users] Scrub proces, IO performance

2014-10-28 Thread Irek Fasikhov
No. They appeared in 0.80.6, but there is a bug which is corrected in 0.80.8.
See: http://tracker.ceph.com/issues/9677
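Until the upgrade, the impact can still be blunted with the scrub flags, and once on 0.80.6+ the disk-thread priority can be injected at runtime (a sketch; the idle class only has an effect with the cfq scheduler on the OSD disks):

# stop new scrubs during business hours (and unset at night, e.g. from cron)
ceph osd set noscrub
ceph osd set nodeep-scrub
ceph osd unset noscrub
ceph osd unset nodeep-scrub

# on 0.80.6+ (bug fixed in 0.80.8, see the tracker above):
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 7'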

2014-10-28 14:50 GMT+03:00 Mateusz Skała mateusz.sk...@budikom.net:

 Thanks for the reply. We are now using Ceph 0.80.1 firefly; are these options
 available?



 *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
 Of *Mateusz Skała
 *Sent:* Tuesday, October 28, 2014 9:27 AM
 *To:* ceph-us...@ceph.com
 *Subject:* [ceph-users] Scrub proces, IO performance



 Hello,

 We are using Ceph as a storage backend for KVM, used for hosting MS
 Windows RDP, Linux for web applications with a MySQL database, and file
 sharing from Linux. When the scrub or deep-scrub process is active, RDP sessions
 freeze for a few seconds and web applications have high reply
 latency.

 Now we have disabled the scrubbing and deep-scrubbing process between 6 AM and
 10 PM, so it only runs when the majority of users aren't working, but user experience
 is still poor, as I wrote above. We are considering disabling the scrubbing process
 altogether. Is the new version 0.87, which addresses scrubbing priority, going to
 solve our problem (according to http://tracker.ceph.com/issues/6278)? Can
 we switch off scrubbing entirely? How can we change our configuration to
 lower the scrubbing performance impact? Does changing the block size lower
 scrubbing impact or increase performance?



 Our Ceph cluster configuration :



 * we are using ~216 RBD disks for KVM VM's

 * ~11TB used, 3.593TB data, replica count 3

 * we have 5 mons, 32 OSD

 * 3 pools/ 4096pgs (only one - RBD in use)

 * 6 nodes (5osd+mon, 1 osd only) in two racks

 * 1 SATA disk for system, 1 SSD disk for journal and 4 or 6 SATA disk for
 OSD

 * 2 networks on 2 NIC 1Gbps (cluster + public)  on all nodes.

 * 2x 10GBps links between racks

 * without scrub max 45 iops

 * when scrub running 120 - 180 iops





 ceph.conf



 mon initial members = ceph35, ceph30, ceph20, ceph15, ceph10

 mon host = 10.20.8.35, 10.20.8.30, 10.20.8.20, 10.20.8.15, 10.20.8.10



 public network = 10.20.8.0/22

 cluster network = 10.20.4.0/22



 filestore xattr use omap = true

 filestore max sync interval = 15



 osd journal size = 10240

 osd pool default size = 3

 osd pool default min size = 1

 osd pool default pg num = 2048

 osd pool default pgp num = 2048

 osd crush chooseleaf type = 1

 osd recovery max active = 1

 osd recovery op priority = 1

 osd max backfills = 1



 auth cluster required = cephx

 auth service required = cephx

 auth client required = cephx



 rbd default format = 2



 Regards,

 Mateusz

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd_disk_thread_ioprio_class/_priorioty ignored?

2014-10-23 Thread Irek Fasikhov
Hi.
Already have the necessary changes in git.
https://github.com/ceph/ceph/commit/86926c6089d63014dd770b4bb61fc7aca3998542

2014-10-23 16:42 GMT+04:00 Paweł Sadowski c...@sadziu.pl:

 On 10/23/2014 09:10 AM, Paweł Sadowski wrote:
  Hi,
 
  I was trying to determine the performance impact of deep-scrubbing with the
  osd_disk_thread_ioprio_class option set, but it looks like it's ignored.
  Performance (during deep-scrub) is the same with these options set or
  left at the defaults (1/3 of normal performance).
 
 
  # ceph --admin-daemon /var/run/ceph/ceph-osd.26.asok config show  | grep
  osd_disk_thread_ioprio
osd_disk_thread_ioprio_class: idle,
osd_disk_thread_ioprio_priority: 7,
 
  # ps -efL | grep 'ce[p]h-osd --cluster=ceph -i 26' | awk '{ print $4; }'
  | xargs --no-run-if-empty ionice -p | sort | uniq -c
   18 unknown: prio 0
  186 unknown: prio 4
 
  # cat /sys/class/block/sdf/queue/scheduler
  noop deadline [cfq]
 
  And finally GDB:
 
  Breakpoint 1, ceph_ioprio_string_to_class (s=...) at
  common/io_priority.cc:48
  warning: Source file is more recent than executable.
  48return IOPRIO_CLASS_IDLE;
  (gdb) cont
  Continuing.
 
  Breakpoint 2, OSD::set_disk_tp_priority (this=0x3398000) at
 osd/OSD.cc:8548
  warning: Source file is more recent than executable.
  8548  disk_tp.set_ioprio(cls,
  cct->_conf->osd_disk_thread_ioprio_priority);
  (gdb) print cls
  $1 = -22
 
  So the IO priorities are *NOT* set (cls < 0). I'm not sure where this
  -22 came from. Any ideas?
  In the meantime I'll compile Ceph from source and check again.
 
 
 
  Ceph installed from Ceph repositories:
 
  # ceph-osd -v
  ceph version 0.86 (97dcc0539dfa7dac3de74852305d51580b7b1f82)
 
  # apt-cache policy ceph
  ceph:
Installed: 0.86-1precise
Candidate: 0.86-1precise
Version table:
   *** 0.86-1precise 0
  500 http://eu.ceph.com/debian-giant/ precise/main amd64 Packages
  100 /var/lib/dpkg/status

 Following patch corrects problem:

 diff --git a/src/common/io_priority.cc b/src/common/io_priority.cc
 index b9eeae8..4cd299a 100644
 --- a/src/common/io_priority.cc
 +++ b/src/common/io_priority.cc
 @@ -41,7 +41,7 @@ int ceph_ioprio_set(int whence, int who, int

  int ceph_ioprio_string_to_class(const std::string &s)
  {
 -  std::string l;
 +  std::string l(s);
std::transform(s.begin(), s.end(), l.begin(), ::tolower);

    if (l == "idle")
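The root cause is that std::transform writes through l.begin() into an empty string, which is undefined behaviour, so the comparison never matches and the caller ends up with -22 (-EINVAL). A sketch of the fixed function (the exact set of accepted strings may differ from the real common/io_priority.cc):

int ceph_ioprio_string_to_class(const std::string &s)
{
  std::string l(s);   // copy first so the destination has the right size
  std::transform(s.begin(), s.end(), l.begin(), ::tolower);

  if (l == "idle")
    return IOPRIO_CLASS_IDLE;
  if (l == "be" || l == "besteffort" || l == "best effort")
    return IOPRIO_CLASS_BE;
  if (l == "rt" || l == "realtime" || l == "real time")
    return IOPRIO_CLASS_RT;
  return -EINVAL;
}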


 # ps -efL | grep 'ce[p]h-osd --cluster=ceph -i 26' | awk '{ print $4; }'
 | xargs --no-run-if-empty ionice -p | sort | uniq -c
   1 idle
   4 unknown: prio 0
 183 unknown: prio 4

 Change to *best effort* (ceph tell osd.26 injectargs
 '--osd_disk_thread_ioprio_class be')

 # ps -efL | grep 'ce[p]h-osd --cluster=ceph -i 26' | awk '{ print $4; }'
 | xargs --no-run-if-empty ionice -p | sort | uniq -c
   1 best-effort: prio 7
   4 unknown: prio 0
 183 unknown: prio 4


 --
 PS
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Why performance of benchmarks with small blocks is extremely small?

2014-10-01 Thread Irek Fasikhov
Timur, read this thread:
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg12486.html
Timur, read this thread.


2014-10-01 16:24 GMT+04:00 Andrei Mikhailovsky and...@arhont.com:

 Timur,

 As far as I know, the latest master has a number of improvements for ssd
 disks. If you check the mailing list discussion from a couple of weeks
 back, you can see that the latest stable firefly is not that well optimised
 for ssd drives and IO is limited. However changes are being made to address
 that.

 I am quite surprised that you can get 10K IOPS, as in my tests I was not
 getting over 3K IOPS on SSD disks which are capable of doing 90K IOPS.

 P.S. Does anyone know if the SSD optimisation code will be added to the
 next maintenance release of firefly?

 Andrei
 --

 *From: *Timur Nurlygayanov tnurlygaya...@mirantis.com
 *To: *Christian Balzer ch...@gol.com
 *Cc: *ceph-us...@ceph.com
 *Sent: *Wednesday, 1 October, 2014 1:11:25 PM
 *Subject: *Re: [ceph-users] Why performance of benchmarks with small
 blocks is extremely small?


 Hello Christian,

 Thank you for your detailed answer!

 I have another pre-production environment with 4 Ceph servers and 4 SSD disks
 per Ceph server (each Ceph OSD daemon on a separate SSD disk).
 Should I move the journals to other disks, or is that not required in my
 case?

 [root@ceph-node ~]# mount | grep ceph
 /dev/sdb4 on /var/lib/ceph/osd/ceph-0 type xfs
 (rw,noexec,nodev,noatime,nodiratime,inode64,logbsize=256k,delaylog,user_xattr,data=writeback)
 /dev/sde4 on /var/lib/ceph/osd/ceph-5 type xfs
 (rw,noexec,nodev,noatime,nodiratime,inode64,logbsize=256k,delaylog,user_xattr,data=writeback)
 /dev/sdd4 on /var/lib/ceph/osd/ceph-2 type xfs
 (rw,noexec,nodev,noatime,nodiratime,inode64,logbsize=256k,delaylog,user_xattr,data=writeback)
 /dev/sdc4 on /var/lib/ceph/osd/ceph-1 type xfs
 (rw,noexec,nodev,noatime,nodiratime,inode64,logbsize=256k,delaylog,user_xattr,data=writeback)

 [root@ceph-node ~]# find /var/lib/ceph/osd/ | grep journal
 /var/lib/ceph/osd/ceph-0/journal
 /var/lib/ceph/osd/ceph-5/journal
 /var/lib/ceph/osd/ceph-1/journal
 /var/lib/ceph/osd/ceph-2/journal

 My SSD disks have ~40k IOPS per disk, but on the VM I can see only ~10k
 - 14k IOPS for disk operations.
 To check this I execute the following command on a VM whose root partition
 is mounted on a disk in Ceph storage:

 root@test-io:/home/ubuntu# rm -rf /tmp/test  spew -d --write -r -b 4096
 10M /tmp/test
 WTR:56506.22 KiB/s   Transfer time: 00:00:00IOPS:14126.55

 Is this the expected result, or can I improve the performance and get at least
 30k-40k IOPS on the VM disks? (I have 2x 10Gb/s network interfaces in LACP
 bonding for the storage network, so it looks like the network can't be the bottleneck.)

 Thank you!


 On Wed, Oct 1, 2014 at 6:50 AM, Christian Balzer ch...@gol.com wrote:


 Hello,

 [reduced to ceph-users]

 On Sat, 27 Sep 2014 19:17:22 +0400 Timur Nurlygayanov wrote:

  Hello all,
 
  I installed OpenStack with Glance + Ceph OSD with replication factor 2
  and now I can see that the write operations are extremely slow.
  For example, I can see only 0.04 MB/s write speed when I run rados bench
  with 512b blocks:
 
  rados bench -p test 60 write --no-cleanup -t 1 -b 512
 
 There are 2 things wrong with this test:

 1. You're using rados bench, when in fact you should be testing from
 within VMs. For starters a VM could make use of the rbd cache you enabled,
 rados bench won't.

 2. Given the parameters of this test you're testing network latency more
 than anything else. If you monitor the Ceph nodes (atop is a good tool for
 that), you will probably see that neither CPU nor disks resources are
 being exhausted. With a single thread rados puts that tiny block of 512
 bytes on the wire, the primary OSD for the PG has to write this to the
 journal (on your slow, non-SSD disks) and send it to the secondary OSD,
 which has to ACK the write to its journal back to the primary one, which
 in turn then ACKs it to the client (rados bench) and then rados bench can
 send the next packet.
 You get the drift.
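Back-of-the-envelope: with a single outstanding write the client can never exceed 1/latency operations per second, so 0.04 MB/s at 512-byte blocks simply means each serialized round trip (client to primary journal to replica journal and back) costs on the order of 12 ms. A rough sketch of the arithmetic:

block = 512                       # bytes per write
throughput = 0.04 * 1024 * 1024   # measured bytes/s
iops = throughput / block         # ~82 ops/s with one thread
latency = 1.0 / iops              # ~0.012 s, i.e. ~12 ms per op
print iops, latency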

 Using your parameters I can get 0.17MB/s on a pre-production cluster
 that uses 4xQDR Infiniband (IPoIB) connections, on my shitty test cluster
 with 1GB/s links I get similar results to you, unsurprisingly.

 Ceph excels only with lots of parallelism, so an individual thread might
 be slow (and in your case HAS to be slow, which has nothing to do with
 Ceph per se) but many parallel ones will utilize the resources available.

 Having data blocks that are adequately sized (4MB, the default rados size)
 will help for bandwidth and the rbd cache inside a properly configured VM
 should make that happen.
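For reference, a minimal sketch of what a properly configured client usually looks like (ceph.conf on the hypervisor; the disk also needs cache=writeback in the QEMU/libvirt definition, and the exact wiring depends on your stack):

[client]
rbd cache = true
rbd cache writethrough until flush = true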

 Of course in most real life scenarios you will run out of IOPS long before
 you run out of bandwidth.


   Maintaining 1 concurrent writes of 512 bytes for up to 60 seconds or 0
  objects
   Object prefix: benchmark_data_node-17.domain.tld_15862
 sec Cur ops   started  finished

[ceph-users] rbd export -> nc -> rbd import = memory leak

2014-09-26 Thread Irek Fasikhov
Hi, All.

I see a memory leak when importing a raw device.

Export Scheme:
[rbd@rbdbackup ~]$ rbd --no-progress -n client.rbdbackup -k
/etc/ceph/big.keyring -c /etc/ceph/big.conf export rbdtest/vm-111-disk-1 -
| nc 10.43.255.252 12345

[root@ct2 ~]# nc -l 12345 | rbd import --no-progress --image-format 2 -
rbd/vm-111-disk-1

This is the same problem with ssh

Memory usage, see the screenshots:
https://drive.google.com/folderview?id=0BxoNLVWxzOJWSHlTSEZvM3lkQXMusp=sharing

-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd export -> nc -> rbd import = memory leak

2014-09-26 Thread Irek Fasikhov
OS: CentOS 6.5
Kernel: 2.6.32-431.el6.x86_64
Ceph --version: ceph version 0.72.2
(a913ded2ff138aefb8cb84d347d72164099cfd60)



2014-09-26 15:44 GMT+04:00 Irek Fasikhov malm...@gmail.com:

 Hi, All.

 I see a memory leak when importing a raw device.

 Export Scheme:
 [rbd@rbdbackup ~]$ rbd --no-progress -n client.rbdbackup -k
 /etc/ceph/big.keyring -c /etc/ceph/big.conf export rbdtest/vm-111-disk-1 -
 | nc 10.43.255.252 12345

 [root@ct2 ~]# nc -l 12345 | rbd import --no-progress --image-format 2 -
 rbd/vm-111-disk-1

 This is the same problem with ssh

 Memory usage, see the screenshots:

 https://drive.google.com/folderview?id=0BxoNLVWxzOJWSHlTSEZvM3lkQXMusp=sharing

 --
 Best regards, Irek Fasikhov Nurgayazovich
 Mob.: +79229045757




-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-25 Thread Irek Fasikhov
osd_op(client.4625.1:9005787)
.


This is due to external factors. For example, the network settings.

2014-09-25 10:05 GMT+04:00 Udo Lembke ulem...@polarzone.de:

 Hi again,
 sorry - forgot my post... see

 osdmap e421: 9 osds: 9 up, 9 in

 shows that all your 9 osds are up!

 Do you have trouble with your journal/filesystem?

 Udo

  On 25.09.2014 08:01, Udo Lembke wrote:
  Hi,
  looks that some osds are down?!
 
  What is the output of ceph osd tree
 
  Udo
 
  On 25.09.2014 04:29, Aegeaner wrote:
  The cluster healthy state is WARN:
 
   health HEALTH_WARN 118 pgs degraded; 8 pgs down; 59 pgs
  incomplete; 28 pgs peering; 292 pgs stale; 87 pgs stuck inactive;
  292 pgs stuck stale; 205 pgs stuck unclean; 22 requests are blocked
   32 sec; recovery 12474/46357 objects degraded (26.909%)
   monmap e3: 3 mons at
  {CVM-0-mon01=
 172.18.117.146:6789/0,CVM-0-mon02=172.18.117.152:6789/0,CVM-0-mon03=172.18.117.153:6789/0
 },
  election epoch 24, quorum 0,1,2 CVM-0-mon01,CVM-0-mon02,CVM-0-mon03
   osdmap e421: 9 osds: 9 up, 9 in
pgmap v2261: 292 pgs, 4 pools, 91532 MB data, 23178 objects
  330 MB used, 3363 GB / 3363 GB avail
  12474/46357 objects degraded (26.909%)
20 stale+peering
87 stale+active+clean
 8 stale+down+peering
59 stale+incomplete
   118 stale+active+degraded
 
 
  What do these errors mean? Can these PGs be recovered?
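A few standard commands for digging into PGs in that state (a sketch; substitute one of the PG ids reported by ceph health detail):

ceph health detail
ceph pg dump_stuck stale
ceph pg dump_stuck inactive
ceph pg <pgid> query
ceph osd tree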
 
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Rebalancing slow I/O.

2014-09-11 Thread Irek Fasikhov
Hi, all.

DELL R720 x 8, 96 OSDs, network 2x10Gbit LACP.

When one of the nodes crashes, I get very slow I / O operations on virtual
machines.
A cluster map by default.
[ceph@ceph08 ~]$ ceph osd tree
# idweight  type name   up/down reweight
-1  262.1   root defaults
-2  32.76   host ceph01
0   2.73osd.0   up  1
...
11  2.73osd.11  up  1
-3  32.76   host ceph02
13  2.73osd.13  up  1
..
12  2.73osd.12  up  1
-4  32.76   host ceph03
24  2.73osd.24  up  1

35  2.73osd.35  up  1
-5  32.76   host ceph04
37  2.73osd.37  up  1
.
47  2.73osd.47  up  1
-6  32.76   host ceph05
48  2.73osd.48  up  1
...
59  2.73osd.59  up  1
-7  32.76   host ceph06
60  2.73osd.60  down0
...
71  2.73osd.71  down0
-8  32.76   host ceph07
72  2.73osd.72  up  1

83  2.73osd.83  up  1
-9  32.76   host ceph08
84  2.73osd.84  up  1

95  2.73osd.95  up  1


If I change the cluster map to the following:
root---|
  |
  |-rack1
  ||
  |host ceph01
  |host ceph02
  |host ceph03
  |host ceph04
  |
  |---rack2
   |
  host ceph05
  host ceph06
  host ceph07
  host ceph08
What will the cluster's behaviour be on failover of one node? And how much will it
affect the performance?
Thank you
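For what it's worth, a sketch of building that rack layer with the CLI (bucket and host names as in the tree above; the root is named defaults there):

ceph osd crush add-bucket rack1 rack
ceph osd crush add-bucket rack2 rack
ceph osd crush move rack1 root=defaults
ceph osd crush move rack2 root=defaults
for h in ceph01 ceph02 ceph03 ceph04; do ceph osd crush move $h rack=rack1; done
for h in ceph05 ceph06 ceph07 ceph08; do ceph osd crush move $h rack=rack2; done

Note that this only changes placement behaviour if the CRUSH rule is also switched to chooseleaf type rack, and with only 2 racks a 3-replica rule cannot place all copies, so test the edited map with crushtool before injecting it.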

-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] enrich ceph test methods, what is your concern about ceph. thanks

2014-08-26 Thread Irek Fasikhov
Hi.
I and many people use fio.
For Ceph RBD, fio has a special engine:
https://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html
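A minimal job file for that engine looks roughly like this (placeholder pool/image names; the image must already exist and fio must be built with rbd support):

[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio_test
direct=1
runtime=60
time_based

[4k-randwrite]
rw=randwrite
bs=4k
iodepth=32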


2014-08-26 12:15 GMT+04:00 yuelongguang fasts...@163.com:

 Hi all,

 I am planning to do a test on Ceph, covering performance, throughput,
 scalability and availability.
 In order to get a full test result, I hope you all can give me some
 advice; meanwhile I can send the results to you, if you like.
 For each test category (performance, throughput,
 scalability, availability), do you have some test ideas and test
 tools?
 Basically I know some tools to test throughput and IOPS, but you can
 tell me the tools you prefer and the results you expect.

 Thanks very much




 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph monitor load, low performance

2014-08-26 Thread Irek Fasikhov
Move the logs to the SSD and you will immediately increase performance: you lose about
50% of the performance on the logs. And for three replicas,
more than 5 hosts are recommended.
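The usual per-OSD procedure for moving a filestore journal onto an SSD partition is roughly (a sketch; N and the partition are placeholders):

ceph osd set noout
service ceph stop osd.N
ceph-osd -i N --flush-journal
rm /var/lib/ceph/osd/ceph-N/journal
ln -s /dev/disk/by-partuuid/<ssd-partition-uuid> /var/lib/ceph/osd/ceph-N/journal
ceph-osd -i N --mkjournal
service ceph start osd.N
ceph osd unset noout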


2014-08-26 12:17 GMT+04:00 Mateusz Skała mateusz.sk...@budikom.net:


 Hi thanks for reply.



  From the top of my head, it is recommended to use 3 mons in
 production. Also, for the 22 OSDs your number of PGs looks a bit low,
 you should look at that.

 I get it from http://ceph.com/docs/master/rados/operations/placement-
 groups/

 (22 OSDs * 100) / 3 replicas = 733, so ~1024 PGs
 Please correct me if I'm wrong.

 It will be 5 mons (on 6 hosts) but now we must migrate some data from used
 servers.




 The performance of the cluster is poor - this is too vague. What is
 your current performance, what benchmarks have you tried, what is your
 data workload and most importantly, how is your cluster setup. what
 disks, ssds, network, ram, etc.

 Please provide more information so that people could help you.

 Andrei


 Hardware informations:
 ceph15:
 RAM: 4GB
 Network: 4x 1GB NIC
 OSD disk's:
 2x SATA Seagate ST31000524NS
 2x SATA WDC WD1003FBYX-18Y7B0

 ceph25:
 RAM: 16GB
 Network: 4x 1GB NIC
 OSD disk's:
 2x SATA WDC WD7500BPKX-7
 2x SATA WDC WD7500BPKX-2
 2x SATA SSHD ST1000LM014-1EJ164

 ceph30
 RAM: 16GB
 Network: 4x 1GB NIC
 OSD disks:
 6x SATA SSHD ST1000LM014-1EJ164

 ceph35:
 RAM: 16GB
 Network: 4x 1GB NIC
 OSD disks:
 6x SATA SSHD ST1000LM014-1EJ164


 All journals are on OSD's. 2 NIC are for backend network (10.20.4.0/22)
 and 2 NIC are for frontend (10.20.8.0/22).

 We use this cluster as a storage backend for 100 VMs on KVM. I haven't run
 benchmarks, but all VMs were migrated from Xen+GlusterFS (NFS); before the
 migration every VM was running fine, now each VM hangs for a few seconds from
 time to time and apps installed on the VMs take much longer to load. GlusterFS
 was running on 2 servers with 1x 1GB NIC and 2x8 WDC WD7500BPKX-7 disks.

 I ran one test with recovery: if a disk is marked out, the recovery I/O is
 150-200MB/s but all VMs hang until the recovery ends.

 The biggest load is on ceph35: IOPS on each disk are near 150, CPU load ~4-5.
 On the other hosts CPU load is 2, 120~130 IOPS.

 Our ceph.conf

 ===
 [global]

 fsid=a9d17295-62f2-46f6-8325-1cad7724e97f
 mon initial members = ceph35, ceph30, ceph25, ceph15
 mon host = 10.20.8.35, 10.20.8.30, 10.20.8.25, 10.20.8.15
 public network = 10.20.8.0/22
 cluster network = 10.20.4.0/22
 osd journal size = 1024
 filestore xattr use omap = true
 osd pool default size = 3
 osd pool default min size = 1
 osd pool default pg num = 1024
 osd pool default pgp num = 1024
 osd crush chooseleaf type = 1
 auth cluster required = cephx
 auth service required = cephx
 auth client required = cephx
 rbd default format = 2

 ##ceph35 osds
 [osd.0]
 cluster addr = 10.20.4.35
 [osd.1]
 cluster addr = 10.20.4.35
 [osd.2]
 cluster addr = 10.20.4.35
 [osd.3]
 cluster addr = 10.20.4.36
 [osd.4]
 cluster addr = 10.20.4.36
 [osd.5]
 cluster addr = 10.20.4.36

 ##ceph25 osds
 [osd.6]
 cluster addr = 10.20.4.25
 public addr = 10.20.8.25
 [osd.7]
 cluster addr = 10.20.4.25
 public addr = 10.20.8.25
 [osd.8]
 cluster addr = 10.20.4.25
 public addr = 10.20.8.25
 [osd.9]
 cluster addr = 10.20.4.26
 public addr = 10.20.8.26
 [osd.10]
 cluster addr = 10.20.4.26
 public addr = 10.20.8.26
 [osd.11]
 cluster addr = 10.20.4.26
 public addr = 10.20.8.26

 ##ceph15 osds
 [osd.12]
 cluster addr = 10.20.4.15
 public addr = 10.20.8.15
 [osd.13]
 cluster addr = 10.20.4.15
 public addr = 10.20.8.15
 [osd.14]
 cluster addr = 10.20.4.15
 public addr = 10.20.8.15
 [osd.15]
 cluster addr = 10.20.4.16
 public addr = 10.20.8.16

 ##ceph30 osds
 [osd.16]
 cluster addr = 10.20.4.30
 public addr = 10.20.8.30
 [osd.17]
 cluster addr = 10.20.4.30
 public addr = 10.20.8.30
 [osd.18]
 cluster addr = 10.20.4.30
 public addr = 10.20.8.30
 [osd.19]
 cluster addr = 10.20.4.31
 public addr = 10.20.8.31
 [osd.20]
 cluster addr = 10.20.4.31
 public addr = 10.20.8.31
 [osd.21]
 cluster addr = 10.20.4.31
 public addr = 10.20.8.31

 [mon.ceph35]
 host = ceph35
 mon addr = 10.20.8.35:6789
 [mon.ceph30]
 host = ceph30
 mon addr = 10.20.8.30:6789
 [mon.ceph25]
 host = ceph25
 mon addr = 10.20.8.25:6789
 [mon.ceph15]
 host = ceph15
 mon addr = 10.20.8.15:6789
 

 Regards,

 Mateusz


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph monitor load, low performance

2014-08-26 Thread Irek Fasikhov
I'm sorry, of course I meant the journals :)


2014-08-26 13:16 GMT+04:00 Mateusz Skała mateusz.sk...@budikom.net:

 You mean to move /var/log/ceph/* to SSD disk?


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] enrich ceph test methods, what is your concern about ceph. thanks

2014-08-26 Thread Irek Fasikhov
For me, the bottleneck is single-threaded operation. Writes are more or less
solved by enabling the rbd cache, but there are
problems with reads. I think those problems can be solved with a cache
pool, but I have not tested it.

It follows that the more threads, the greater the read and
write speed. But in reality it varies.

The speed and number of operations depend on many factors, such as
network latency.

Examples testing, special attention to the charts:

https://software.intel.com/en-us/blogs/2013/10/25/measure-ceph-rbd-performance-in-a-quantitative-way-part-i
and
https://software.intel.com/en-us/blogs/2013/11/20/measure-ceph-rbd-performance-in-a-quantitative-way-part-ii


2014-08-26 15:11 GMT+04:00 yuelongguang fasts...@163.com:


 thanks Irek Fasikhov.
 is it the only way to test ceph-rbd?  and an important aim of the test is
 to find where  the bottleneck is.   qemu/librbd/ceph.
 could you share your test result with me?



 thanks






 On 2014-08-26 04:22:22, Irek Fasikhov malm...@gmail.com wrote:

 Hi.
 I and many people use fio.
 For ceph rbd has a special engine:
 https://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html


 2014-08-26 12:15 GMT+04:00 yuelongguang fasts...@163.com:

 hi,all

 i am planning to do a test on ceph, include performance, throughput,
 scalability,availability.
 in order to get a full test result, i  hope you all can give me some
 advice. meanwhile i can send the result to you,if you like.
 as for each category test( performance, throughput,
 scalability,availability)  ,  do you have some some test idea and test
 tools?
 basicly i have know some tools to test throughtput,iops .  but you can
 tell the tools you prefer and the result you expect.

 thanks very much




 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 Best regards, Irek Fasikhov Nurgayazovich
 Mob.: +79229045757






-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] enrich ceph test methods, what is your concern about ceph. thanks

2014-08-26 Thread Irek Fasikhov
Sorry, Enter pressed :)

Continued...
No, it's not the only way to test, but it depends what you want to use Ceph for.


2014-08-26 15:22 GMT+04:00 Irek Fasikhov malm...@gmail.com:

 For me, the bottleneck is single-threaded operation. The recording will
 have more or less solved with the inclusion of rbd cache, but there are
 problems with reading. But I think that these problems can be solved cache
 pool, but have not tested.

 It follows that the more threads, the greater the speed of reading and
 writing. But in reality it is different.

 The speed and number of operations, depending on many factors, such as
 network latency.

 Examples testing, special attention to the charts:


 https://software.intel.com/en-us/blogs/2013/10/25/measure-ceph-rbd-performance-in-a-quantitative-way-part-i
 and

 https://software.intel.com/en-us/blogs/2013/11/20/measure-ceph-rbd-performance-in-a-quantitative-way-part-ii


 2014-08-26 15:11 GMT+04:00 yuelongguang fasts...@163.com:


 thanks Irek Fasikhov.
 is it the only way to test ceph-rbd?  and an important aim of the test is
 to find where  the bottleneck is.   qemu/librbd/ceph.
 could you share your test result with me?



 thanks






 On 2014-08-26 04:22:22, Irek Fasikhov malm...@gmail.com wrote:

 Hi.
 I and many people use fio.
 For ceph rbd has a special engine:
 https://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html


 2014-08-26 12:15 GMT+04:00 yuelongguang fasts...@163.com:

 hi,all

 i am planning to do a test on ceph, include performance, throughput,
 scalability,availability.
 in order to get a full test result, i  hope you all can give me some
 advice. meanwhile i can send the result to you,if you like.
 as for each category test( performance, throughput,
 scalability,availability)  ,  do you have some some test idea and test
 tools?
 basicly i have know some tools to test throughtput,iops .  but you can
 tell the tools you prefer and the result you expect.

 thanks very much




 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
 Best regards, Irek Fasikhov Nurgayazovich
 Mob.: +79229045757






 --
 Best regards, Irek Fasikhov Nurgayazovich
 Mob.: +79229045757




-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to calculate necessary disk amount

2014-08-22 Thread Irek Fasikhov
Hi.

10 TB * 2 / 0.85 ~= 24 TB with two replicas: the total volume needed for the raw data.
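Spelled out (0.85 being the nearfull ratio and 2 the replica count):

usable = 10.0                      # TB of data to store
replicas = 2
nearfull = 0.85
raw = usable * replicas / nearfull
print raw                          # ~23.5 TB of raw capacity for the whole cluster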
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to calculate necessary disk amount

2014-08-22 Thread Irek Fasikhov
I recommend you use replication, because radosgw uses asynchronous
replication.

Yes, divided by the nearfull ratio.
No, it's for the entire cluster.


2014-08-22 11:51 GMT+04:00 idzzy idez...@gmail.com:

 Hi,

 If I don't use replication, do I just divide by the nearfull_ratio?
 (Does only radosgw support replication?)

 10 TB / 0.85 = 11.8 TB for each node?

 # ceph pg dump | egrep "full_ratio|nearfull_ratio"
 full_ratio 0.95
 nearfull_ratio 0.85

 Sorry I’m not familiar with ceph architecture.
 Thanks for the reply.

 —
 idzzy

 On August 22, 2014 at 3:53:21 PM, Irek Fasikhov (malm...@gmail.com) wrote:

 Hi.

 10 TB * 2 / 0.85 ~= 24 TB with two replicas: the total volume needed for the raw data.






-- 
Best regards, Irek Fasikhov Nurgayazovich
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to calculate necessary disk amount

2014-08-22 Thread Irek Fasikhov
node1: 4[TB], node2: 4[TB], node3: 4[TB] :)
On 22 Aug 2014 at 12:53, idzzy idez...@gmail.com wrote:

 Hi Irek,

 Understood.

 Let me ask about only this.

  No, it's for the entire cluster.

 Does this mean that the total disk amount of all nodes together is over 11.8
 TB?
 e.g  node1: 4[TB], node2: 4[TB], node3: 4[TB]

 not each node.
 e.g  node1: 11.8[TB], node2: 11.8[TB], node3:11.8 [TB]

 Thank you.


 On August 22, 2014 at 5:06:02 PM, Irek Fasikhov (malm...@gmail.com) wrote:

 I recommend you use replication, because radosgw uses asynchronous
 replication.

 Yes divided by nearfull ratio.
 No, it's for the entire cluster.


 2014-08-22 11:51 GMT+04:00 idzzy idez...@gmail.com:

  Hi,

  If not use replication, Is it only to divide by nearfull_ratio?
  (does only radosgw support replication?)

 10T/0.85 = 11.8 TB of each node?

  # ceph pg dump | egrep full_ratio|nearfulll_ratio
  full_ratio 0.95
 nearfull_ratio 0.85

  Sorry I’m not familiar with ceph architecture.
  Thanks for the reply.

  —
  idzzy

 On August 22, 2014 at 3:53:21 PM, Irek Fasikhov (malm...@gmail.com)
 wrote:

  Hi.

 10 TB * 2 / 0.85 ~= 24 TB with two replicas: the total volume needed for the raw data.






 --
 Best regards, Irek Fasikhov Nurgayazovich
 Mob.: +79229045757


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fw: external monitoring tools for processes

2014-08-11 Thread Irek Fasikhov
Hi.

I use ZABBIX with the following script:
[ceph@ceph08 ~]$ cat /etc/zabbix/external/ceph
#!/usr/bin/python
# Zabbix external check: Ceph OSD discovery and per-OSD perf counters.
# (Quotes and ">" redirections restored; they were stripped by the list archive.)

import sys
import os
import commands
import json
import datetime
import time

# Check arguments. If the argument count equals 1, then fail.
if len(sys.argv) == 1:
    print "You will need arguments!"
    sys.exit(1)

def generate(data, type):
    # Build Zabbix low-level-discovery JSON from the list of OSD ids.
    JSON = "{\"data\":["
    for js in range(len(splits)):
        JSON += "{\"{#" + type + "}\":\"" + splits[js] + "\"},"
    return JSON[:-1] + "]}"

if sys.argv[1] == "osd":
    if len(sys.argv) == 2:
        # Discovery mode: derive local OSD ids from the mounted data dirs.
        splits = commands.getoutput('df | grep osd | awk {\'print $6\'} | sed \'s/[^0-9]//g\' | sed \':a;N;$!ba;s/\\n/,/g\'').split(",")
        print generate(splits, "OSD")
    else:
        ID = sys.argv[2]
        LEVEL = sys.argv[3]
        PERF = sys.argv[4]
        CACHEFILE = "/tmp/zabbix.ceph.osd" + ID + ".cache"
        CACHETTL = 5

        TIME = int(round(float(datetime.datetime.now().strftime("%s"))))

        ## CACHE FOR PERFORMANCE OPTIMIZATION ##
        if os.path.isfile(CACHEFILE):
            CACHETIME = int(round(os.stat(CACHEFILE).st_mtime))
        else:
            CACHETIME = 0
        if TIME - CACHETIME > CACHETTL:
            if os.system('sudo ceph --admin-daemon /var/run/ceph/ceph-osd.' + ID + '.asok perfcounters_dump > ' + CACHEFILE) > 0:
                sys.exit(1)

        json_data = open(CACHEFILE)
        data = json.load(json_data)
        json_data.close()
        ## PARSING ##
        if LEVEL in data:
            if PERF in data[LEVEL]:
                try:
                    # Latency counters are {sum, avgcount}; plain counters are numbers.
                    key = data[LEVEL][PERF].has_key("sum")
                    print (data[LEVEL][PERF]["sum"]) / (data[LEVEL][PERF]["avgcount"])
                except AttributeError:
                    print data[LEVEL][PERF]
and zabbix templates:
https://dl.dropboxusercontent.com/u/575018/zbx_export_templates.xml
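
For context, a script like this is normally wired into the Zabbix agent through UserParameter entries; a minimal sketch, assuming the script lives at /etc/zabbix/external/ceph and with item keys that are purely my own invention (they are not taken from the template above):

# low-level discovery of the OSD ids present on this host
UserParameter=ceph.osd.discovery,/etc/zabbix/external/ceph osd
# read a single perf counter, e.g. ceph.osd.perf[3,osd,op_w_latency]
UserParameter=ceph.osd.perf[*],/etc/zabbix/external/ceph osd $1 $2 $3

The counter names differ between Ceph versions, so check the perfcounters_dump output for what your OSDs actually expose, and remember that the sudo call inside the script needs a matching sudoers entry for the zabbix user.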



2014-08-11 7:42 GMT+04:00 pragya jain prag_2...@yahoo.co.in:

 could somebody please reply to my question.


On Saturday, 9 August 2014 3:34 PM, pragya jain prag_2...@yahoo.co.in
 wrote:



 hi all,

  can somebody suggest some external monitoring tools which can check
  whether the Ceph processes (heartbeating, data scrubbing, authentication,
  backfilling, recovery, etc.) are working properly or not.

 Regards
 Pragya Jain

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mounting RBD in linux containers

2014-08-11 Thread Irek Fasikhov
dmesg output please.


2014-08-11 2:16 GMT+04:00 Lorieri lori...@gmail.com:

 same here, did you manage to fix it ?

 On Mon, Oct 28, 2013 at 3:13 PM, Kevin Weiler
 kevin.wei...@imc-chicago.com wrote:
  Hi Josh,
 
  We did map it directly to the host, and it seems to work just fine. I
  think this is a problem with how the container is accessing the rbd
 module.
 
  --
 
  Kevin Weiler
 
  IT
 
 
  IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL
  60606 | http://imc-chicago.com/
 
  Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
  kevin.wei...@imc-chicago.com
 
 
 
 
 
 
 
  On 10/18/13 7:50 PM, Josh Durgin josh.dur...@inktank.com wrote:
 
 On 10/18/2013 10:04 AM, Kevin Weiler wrote:
  The kernel is 3.11.4-201.fc19.x86_64, and the image format is 1. I did,
  however, try a map with an RBD that was format 2. I got the same error.
 
 To rule out any capability drops as the culprit, can you map an rbd
 image on the same host outside of a container?
 
 Josh
 
  --
 
  *Kevin Weiler*
 
  IT
 
  IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL
  60606 | http://imc-chicago.com/
 
  Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
  _kevin.wei...@imc-chicago.com mailto:kevin.wei...@imc-chicago.com_
 
 
  From: Gregory Farnum g...@inktank.com mailto:g...@inktank.com
  Date: Friday, October 18, 2013 10:26 AM
  To: Omar Marquez omar.marq...@imc-chicago.com
  mailto:omar.marq...@imc-chicago.com
  Cc: Kyle Bader kyle.ba...@gmail.com mailto:kyle.ba...@gmail.com,
  Kevin Weiler kevin.wei...@imc-chicago.com
  mailto:kevin.wei...@imc-chicago.com, ceph-users@lists.ceph.com
  mailto:ceph-users@lists.ceph.com ceph-users@lists.ceph.com
  mailto:ceph-users@lists.ceph.com, Khalid Goudeaux
  khalid.goude...@imc-chicago.com
 mailto:khalid.goude...@imc-chicago.com
  Subject: Re: [ceph-users] mounting RBD in linux containers
 
  What kernel are you running, and which format is the RBD image? I
  thought we had a special return code for when the kernel doesn't
 support
  the features used by that image, but that could be the problem.
  -Greg
 
  On Thursday, October 17, 2013, Omar Marquez wrote:
 
 
  Strace produces below:
 
   …
 
  futex(0xb5637c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0xb56378,
  {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
  futex(0xb562f8, FUTEX_WAKE_PRIVATE, 1)  = 1
  add_key(0x424408, 0x7fff82c4e210, 0x7fff82c4e140, 0x22,
  0xfffe) = 607085216
  stat(/sys/bus/rbd, {st_mode=S_IFDIR|0755, st_size=0, ...}) =
 0
  *open(/sys/bus/rbd/add, O_WRONLY)  = 3*
  *write(3, 10.198.41.6:6789
  http://10.198.41.6:6789,10.198.41.8:678
  http://10.198.41.8:678..., 96) = -1 EINVAL (Invalid
 argument)*
  close(3)= 0
  rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTORER,
 0x7fbf8a7efa90},
  {SIG_DFL, [], 0}, 8) = 0
  rt_sigaction(SIGQUIT, {SIG_IGN, [], SA_RESTORER,
  0x7fbf8a7efa90}, {SIG_DFL, [], 0}, 8) = 0
  rt_sigprocmask(SIG_BLOCK, [CHLD], [PIPE], 8) = 0
  clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD,
  parent_tidptr=0x7fff82c4e040) = 22
  wait4(22, [{WIFEXITED(s)  WEXITSTATUS(s) == 0}], 0, NULL) =
 22
  rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER,
 0x7fbf8a7efa90},
  NULL, 8) = 0
  rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER,
  0x7fbf8a7efa90}, NULL, 8) = 0
  rt_sigprocmask(SIG_SETMASK, [PIPE], NULL, 8) = 0
  write(2, rbd: add failed: , 17rbd: add failed: )   = 17
  write(2, (22) Invalid argument, 21(22) Invalid argument)   =
 21
  write(2, \n, 1
  )   = 1
  exit_group(1)   = ?
  +++ exited with 1 +++
 
 
  The app is run inside the container with setuid = 0 and the
   container is able to mount all required filesystems … could this
   still be a capability problem? Also I do not see any call to
   capset() in the strace log …
 
  --
  Om
 
 
  From: Kyle Bader kyle.ba...@gmail.com
  Date: Thursday, October 17, 2013 5:08 PM
  To: Kevin Weiler kevin.wei...@imc-chicago.com
  Cc: ceph-users@lists.ceph.com ceph-users@lists.ceph.com, Omar
  Marquez omar.marq...@imc-chicago.com, Khalid Goudeaux
  khalid.goude...@imc-chicago.com
  Subject: Re: [ceph-users] mounting RBD in linux containers
 
  My first guess would be that it's due to LXC dropping capabilities,
  I'd investigate whether CAP_SYS_ADMIN is being dropped. You need
  CAP_SYS_ADMIN for mount and block ioctls, if the container doesn't
  have those privs a map will likely fail. Maybe try tracing the
  command with strace?
 
  On Thu, Oct 17, 2013 at 2:45 PM, Kevin Weiler
  kevin.wei...@imc-chicago.com wrote:
 
  Hi all,
 
  We're trying to mount an rbd image inside of a linux container
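
Following up on the capability angle discussed above, a quick way to check from inside the container whether CAP_SYS_ADMIN survived; a sketch that assumes the libcap tools are installed, with pool/image names as placeholders:

# inside the container: effective capability set of the current shell
grep CapEff /proc/self/status
capsh --print | grep Current

# on the host itself, to rule the container out entirely (as Josh suggested)
rbd map <pool>/<image>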
 

Re: [ceph-users] flashcache from fb and dm-cache??

2014-07-30 Thread Irek Fasikhov
Ceph has a cache pool (cache tiering), which can be created from SSDs.
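
A minimal sketch of how such a tier is attached, assuming a fast pool named "cache" already exists on the SSD OSDs and the slow pool is named "data" (both names are placeholders; cache mode and sizing need tuning per workload):

ceph osd tier add data cache
ceph osd tier cache-mode cache writeback
ceph osd tier set-overlay data cache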
On 30 July 2014 at 18:41, German Anders gand...@despegar.com wrote:

  Also, does someone try flashcache from facebook on ceph? cons? pros? any
 perf improvement? and dm-cache?



 *German Anders*
















 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rbd rm. Error: trim_objectcould not find coid

2014-07-23 Thread Irek Fasikhov
Hi, All.

I encountered such a problem.

One PG went into the 'inconsistent' state. I found the RBD image it belonged to
and deleted it, and now the OSD reports the following error:


cod 0'0 active+inconsistent snaptrimq=[15~1,89~1]] exit
Started/Primary/Active/Recovering 0.025609 1 0.53
-8 2014-07-23 12:03:13.386747 7f3617b02700  5 osd.94 pg_epoch: 35725
pg[80.3d6( v 35718'170614 (34929'167412,35718'170614] local-les=35724
n=2242 ec=9510 les/c 35724/35718 35723/35723/35723) [94,36] r=0 lpr=35723
pi=35713-35722/2 ml
cod 0'0 active+inconsistent snaptrimq=[15~1,89~1]] enter
Started/Primary/Active/Recovered
-7 2014-07-23 12:03:13.386783 7f3617b02700  5 osd.94 pg_epoch: 35725
pg[80.3d6( v 35718'170614 (34929'167412,35718'170614] local-les=35724
n=2242 ec=9510 les/c 35724/35718 35723/35723/35723) [94,36] r=0 lpr=35723
pi=35713-35722/2 ml
cod 0'0 active+inconsistent snaptrimq=[15~1,89~1]] exit
Started/Primary/Active/Recovered 0.35 0 0.00
-6 2014-07-23 12:03:13.386795 7f3617b02700  5 osd.94 pg_epoch: 35725
pg[80.3d6( v 35718'170614 (34929'167412,35718'170614] local-les=35724
n=2242 ec=9510 les/c 35724/35718 35723/35723/35723) [94,36] r=0 lpr=35723
pi=35713-35722/2 ml
cod 0'0 active+inconsistent snaptrimq=[15~1,89~1]] enter
Started/Primary/Active/Clean
-5 2014-07-23 12:03:13.386932 7f3617b02700  5 osd.94 pg_epoch: 35725
pg[2.772( v 35722'486163 lc 35716'486156 (35141'483132,35722'486163]
local-les=35724 n=1328 ec=1 les/c 35724/35718 35723/35723/35723) [94,38,59]
r=0 lpr=35723 pi=3
5713-35722/2 lcod 0'0 mlcod 0'0 active+recovery_wait m=4] exit
Started/Primary/Active/WaitLocalRecoveryReserved 4.377808 7 0.96
-4 2014-07-23 12:03:13.386956 7f3617b02700  5 osd.94 pg_epoch: 35725
pg[2.772( v 35722'486163 lc 35716'486156 (35141'483132,35722'486163]
local-les=35724 n=1328 ec=1 les/c 35724/35718 35723/35723/35723) [94,38,59]
r=0 lpr=35723 pi=3
5713-35722/2 lcod 0'0 mlcod 0'0 active+recovery_wait m=4] enter
Started/Primary/Active/WaitRemoteRecoveryReserved
-3 2014-07-23 12:03:13.387282 7f36148fd700 -1 osd.94 pg_epoch: 35725
pg[80.3d6( v 35718'170614 (34929'167412,35718'170614] local-les=35724
n=2242 ec=9510 les/c 35724/35725 35723/35723/35723) [94,36] r=0 lpr=35723
mlcod 0'0 active+cl
ean+inconsistent snaptrimq=[15~1,89~1]] *trim_objectcould not find coid *
f022c7d6/rbd_data.3ed9c72ae8944a.0717/15//80
-2 2014-07-23 12:03:13.388628 7f3617101700  5 osd.94 pg_epoch: 35725
pg[2.772( v 35722'486163 lc 35716'486156 (35141'483132,35722'486163]
local-les=35724 n=1328 ec=1 les/c 35724/35718 35723/35723/35723) [94,38,59]
r=0 lpr=35723 pi=3
5713-35722/2 lcod 0'0 mlcod 0'0 active+recovery_wait m=4] exit
Started/Primary/Active/WaitRemoteRecoveryReserved 0.001672 2 0.79
-1 2014-07-23 12:03:13.388670 7f3617101700  5 osd.94 pg_epoch: 35725
pg[2.772( v 35722'486163 lc 35716'486156 (35141'483132,35722'486163]
local-les=35724 n=1328 ec=1 les/c 35724/35718 35723/35723/35723) [94,38,59]
r=0 lpr=35723 pi=3
5713-35722/2 lcod 0'0 mlcod 0'0 active+recovery_wait m=4] enter
Started/Primary/Active/Recovering
 0 2014-07-23 12:03:13.389138 7f36148fd700 -1 osd/ReplicatedPG.cc: In
function 'ReplicatedPG::RepGather* ReplicatedPG::trim_object(const
hobject_t)' thread 7f36148fd700 time 2014-07-23 12:03:13.387304
osd/ReplicatedPG.cc: 1824: FAILED assert(0)

[root@ceph08 DIR_7]# find /var/lib/ceph/osd/ceph-94/ -name
'*3ed9c72ae8944a.0717*' -ls
10745283770 -rw-r--r--   1 root root1 Июл 23 11:30
/var/lib/ceph/osd/ceph-94/current/80.3d6_head/DIR_6/DIR_D/DIR_7/rbd\\udata.3ed9c72ae8944a.0717__15_F022C7D6__50



How can I make Ceph forget about the existence of this file?

Ceph version 0.72.2


Thanks.

-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] feature set mismatch after upgrade from Emperor to Firefly

2014-07-20 Thread Irek Fasikhov
Hi, Andrei.

ceph osd getcrushmap -o /tmp/crush
crushtool -i /tmp/crush --set-chooseleaf_vary_r 0 -o /tmp/crush.new
ceph osd setcrushmap -i /tmp/crush.new

Or

upgrade the kernel to 3.15.
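
If you want to see which tunables the cluster is currently advertising before changing anything, you can print them directly (read-only, and only if your version already has the command), or decompile the map you just grabbed and look at its leading "tunable" lines:

ceph osd crush show-tunables
crushtool -d /tmp/crush -o /tmp/crush.txt && head /tmp/crush.txt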


2014-07-20 20:19 GMT+04:00 Andrei Mikhailovsky and...@arhont.com:

 Hello guys,


 I have noticed the following message/error after upgrading to firefly.
 Does anyone know what needs doing to correct it?


 Thanks

 Andrei



 [   25.911055] libceph: mon1 192.168.168.201:6789 feature set mismatch,
 my 40002  server's 20002040002, missing 2000200

 [   25.911698] libceph: mon1 192.168.168.201:6789 socket error on read

 [   35.913049] libceph: mon2 192.168.168.13:6789 feature set mismatch, my
 40002  server's 20002040002, missing 2000200

 [   35.913694] libceph: mon2 192.168.168.13:6789 socket error on read

 [   45.909466] libceph: mon0 192.168.168.200:6789 feature set mismatch,
 my 40002  server's 20002040002, missing 2000200

 [   45.910104] libceph: mon0 192.168.168.200:6789 socket error on read






 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph RBD and Backup.

2014-07-02 Thread Irek Fasikhov
Hi, all.

Dear community, how do you back up Ceph RBD?

Thanks
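
(Not an answer from this thread, just an illustration of one common pattern: snapshot the image and export it, either in full or incrementally. The pool/image names, snapshot names and paths below are placeholders.)

rbd snap create rbd/vm-disk@backup-20140702
rbd export rbd/vm-disk@backup-20140702 /backup/vm-disk-20140702.img

# incremental, relative to the previous day's snapshot
rbd export-diff --from-snap backup-20140701 rbd/vm-disk@backup-20140702 /backup/vm-disk-20140702.diff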

-- 
Fasihov Irek (aka Kataklysm).
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Calamari Goes Open Source

2014-05-31 Thread Irek Fasikhov
Very Very Good! Thanks Inktank/RedHat.


2014-05-31 2:43 GMT+04:00 John Kinsella j...@stratosec.co:

  Cool! Looking forward to kicking the tires on that...
  On May 30, 2014, at 3:04 PM, Patrick McGarry patr...@inktank.com wrote:

 Hey cephers,

 Sorry to push this announcement so late on a Friday but...

 Calamari has arrived!

 The source code bits have been flipped, the ticket tracker has been
 moved, and we have even given you a little bit of background from both
 a technical and vision point of view:

 Technical (ceph.com):
 http://ceph.com/community/ceph-calamari-goes-open-source/

 Vision (inktank.com):
 http://www.inktank.com/software/future-of-calamari/

 The ceph.com link should give you everything you need to know about
 what tech comprises Calamari, where the source lives, and where the
 discussions will take place.  If you have any questions feel free to
 hit the new ceph-calamari list or stop by IRC and we'll get you
 started.  Hope you all enjoy the GUI!



 Best Regards,

 Patrick McGarry
 Director, Community || Inktank
 http://ceph.com  ||  http://inktank.com
 @scuttlemonkey || @ceph || @inktank
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


  Stratosec http://stratosec.co/ - Compliance as a Service
 o: 415.315.9385
 @johnlkinsella http://twitter.com/johnlkinsella


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RAID implementation.

2014-04-29 Thread Irek Fasikhov
There is no sense in doing RAID for Ceph!
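
The redundancy that RAID 1 / RAID 1+0 would give is expressed at the pool level instead; an illustrative sketch only (pool name and PG count are placeholders, and erasure-coded pools, new in Firefly, are the closer analogue to RAID 5):

ceph osd pool create volumes 128
ceph osd pool set volumes size 3       # three copies of every object
ceph osd pool set volumes min_size 2   # keep serving I/O with one copy missing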


2014-04-29 15:58 GMT+04:00 yalla.gnan.ku...@accenture.com:

  Hi All,



  I have set up a three-node Ceph storage cluster on Ubuntu. I want to
  implement RAID 5, RAID 1 and RAID 1+0 volumes using Ceph.

 Any information or link providing information on this will help a lot.





 Thanks

 Kumar

 --


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD journal overload?

2014-04-28 Thread Irek Fasikhov
What model of SSD do you have?
Which version of the kernel?



2014-04-28 12:35 GMT+04:00 Udo Lembke ulem...@polarzone.de:

 Hi,
 perhaps due IOs from the journal?
 You can test with iostat (like iostat -dm 5 sdg).

 on debian iostat is in the package sysstat.

 Udo

 Am 28.04.2014 07:38, schrieb Indra Pramana:
  Hi Craig,
 
  Good day to you, and thank you for your enquiry.
 
  As per your suggestion, I have created a 3rd partition on the SSDs and
 did
  the dd test directly into the device, and the result is very slow.
 
  
  root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdg3
  conv=fdatasync oflag=direct
  128+0 records in
  128+0 records out
  134217728 bytes (134 MB) copied, 19.5223 s, 6.9 MB/s
 
  root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdf3
  conv=fdatasync oflag=direct
  128+0 records in
  128+0 records out
  134217728 bytes (134 MB) copied, 5.34405 s, 25.1 MB/s
  
 
  I did a test onto another server with exactly similar specification and
  similar SSD drive (Seagate SSD 100 GB) but not added into the cluster yet
  (thus no load), and the result is fast:
 
  
  root@ceph-osd-09:/home/indra# dd bs=1M count=128 if=/dev/zero
 of=/dev/sdf1
  conv=fdatasync oflag=direct
  128+0 records in
  128+0 records out
  134217728 bytes (134 MB) copied, 0.742077 s, 181 MB/s
  
 
   Does the Ceph journal load really take up that much of the SSD's resources? I
   don't understand how the performance can drop so significantly,
   especially since the two Ceph journals only take up the first 20 GB
   of the SSD's 100 GB total capacity.
 
  Any advice is greatly appreciated.
 
  Looking forward to your reply, thank you.
 
  Cheers.
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD journal overload?

2014-04-28 Thread Irek Fasikhov
Most likely you need to apply a patch to the kernel.

http://www.theirek.com/blog/2014/02/16/patch-dlia-raboty-s-enierghoniezavisimym-keshiem-ssd-diskov


2014-04-28 15:20 GMT+04:00 Indra Pramana in...@sg.or.id:

 Hi Udo and Irek,

 Good day to you, and thank you for your emails.


 perhaps due IOs from the journal?
 You can test with iostat (like iostat -dm 5 sdg).

 Yes, I have shared the iostat result earlier on this same thread. At times
 the utilisation of the 2 journal drives will hit 100%, especially when I
 simulate writing data using rados bench command. Any suggestions what could
 be the cause of the I/O issue?


 
 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
1.850.001.653.140.00   93.36


 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz
 avgqu-sz   await r_await w_await  svctm  %util
 sdg   0.00 0.000.00   55.00 0.00 25365.33
 922.3834.22  568.900.00  568.90  17.82  98.00
 sdf   0.00 0.000.00   55.67 0.00 25022.67
 899.0229.76  500.570.00  500.57  17.60  98.00


 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
2.100.001.372.070.00   94.46


 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz
 avgqu-sz   await r_await w_await  svctm  %util
 sdg   0.00 0.000.00   56.67 0.00 25220.00
 890.1223.60  412.140.00  412.14  17.62  99.87
 sdf   0.00 0.000.00   52.00 0.00 24637.33
 947.5933.65  587.410.00  587.41  19.23 100.00


 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
2.210.001.776.750.00   89.27


 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz
 avgqu-sz   await r_await w_await  svctm  %util
 sdg   0.00 0.000.00   54.33 0.00 24802.67
 912.9825.75  486.360.00  486.36  18.40 100.00
 sdf   0.00 0.000.00   53.00 0.00 24716.00
 932.6835.26  669.890.00  669.89  18.87 100.00


 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
1.870.001.675.250.00   91.21


 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz
 avgqu-sz   await r_await w_await  svctm  %util
 sdg   0.00 0.000.00   94.33 0.00 26257.33
 556.6918.29  208.440.00  208.44  10.50  99.07
 sdf   0.00 0.000.00   51.33 0.00 24470.67
 953.4032.75  684.620.00  684.62  19.51 100.13


 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
1.510.001.347.250.00   89.89


 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz
 avgqu-sz   await r_await w_await  svctm  %util
 sdg   0.00 0.000.00   52.00 0.00 22565.33
 867.9024.73  446.510.00  446.51  19.10  99.33
 sdf   0.00 0.000.00   64.67 0.00 24892.00
 769.8619.50  330.020.00  330.02  15.32  99.07
 

 You what model SSD?

 For this one, I am using Seagate 100GB SSD, model: HDS-2TM-ST100FM0012

 Which version of the kernel?

 Ubuntu 13.04, Linux kernel version: 3.8.0-19-generic #30-Ubuntu SMP Wed
 May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

 Looking forward to your reply, thank you.

 Cheers.



 On Mon, Apr 28, 2014 at 4:45 PM, Irek Fasikhov malm...@gmail.com wrote:

 You what model SSD?
 Which version of the kernel?



 2014-04-28 12:35 GMT+04:00 Udo Lembke ulem...@polarzone.de:

 Hi,
 perhaps due IOs from the journal?
 You can test with iostat (like iostat -dm 5 sdg).

 on debian iostat is in the package sysstat.

 Udo

 Am 28.04.2014 07:38, schrieb Indra Pramana:
  Hi Craig,
 
  Good day to you, and thank you for your enquiry.
 
  As per your suggestion, I have created a 3rd partition on the SSDs and
 did
  the dd test directly into the device, and the result is very slow.
 
  
  root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdg3
  conv=fdatasync oflag=direct
  128+0 records in
  128+0 records out
  134217728 bytes (134 MB) copied, 19.5223 s, 6.9 MB/s
 
  root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdf3
  conv=fdatasync oflag=direct
  128+0 records in
  128+0 records out
  134217728 bytes (134 MB) copied, 5.34405 s, 25.1 MB/s
  
 
  I did a test onto another server with exactly similar specification and
  similar SSD drive (Seagate SSD 100 GB) but not added into the cluster
 yet
  (thus no load), and the result is fast:
 
  
  root@ceph-osd-09:/home/indra# dd bs=1M count=128 if=/dev/zero
 of=/dev/sdf1
  conv=fdatasync oflag=direct
  128+0 records in
  128+0 records out
  134217728 bytes (134 MB) copied, 0.742077 s, 181 MB/s
  
 
  Is the Ceph journal load really takes up a lot of the SSD resources? I
  don't understand how come the performance can drop significantly.
  Especially since the two Ceph journals are only taking

Re: [ceph-users] SSD journal overload?

2014-04-28 Thread Irek Fasikhov
This is my article :).
Apply the patch to the kernel (http://www.theirek.com/downloads/code/CMD_FLUSH.diff).
After rebooting, run the following command:
echo "temporary write through" > /sys/class/scsi_disk/<disk>/cache_type
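
Afterwards it is worth re-running a synchronous write test, since O_DSYNC writes are the pattern the journal actually generates (the device name below is just the partition used earlier in this thread; double-check it before writing to it):

dd if=/dev/zero of=/dev/sdg3 bs=4k count=1000 oflag=direct,dsync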


2014-04-28 15:44 GMT+04:00 Indra Pramana in...@sg.or.id:

 Hi Irek,

 Thanks for the article. Do you have any other web sources pertaining to
 the same issue, which is in English?

 Looking forward to your reply, thank you.

 Cheers.


 On Mon, Apr 28, 2014 at 7:40 PM, Irek Fasikhov malm...@gmail.com wrote:

 Most likely you need to apply a patch to the kernel.


 http://www.theirek.com/blog/2014/02/16/patch-dlia-raboty-s-enierghoniezavisimym-keshiem-ssd-diskov


 2014-04-28 15:20 GMT+04:00 Indra Pramana in...@sg.or.id:

 Hi Udo and Irek,

 Good day to you, and thank you for your emails.


 perhaps due IOs from the journal?
 You can test with iostat (like iostat -dm 5 sdg).

 Yes, I have shared the iostat result earlier on this same thread. At
 times the utilisation of the 2 journal drives will hit 100%, especially
 when I simulate writing data using rados bench command. Any suggestions
 what could be the cause of the I/O issue?


 
 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
1.850.001.653.140.00   93.36


 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s
 avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
  sdg   0.00 0.000.00   55.00 0.00 25365.33
 922.3834.22  568.900.00  568.90  17.82  98.00
 sdf   0.00 0.000.00   55.67 0.00 25022.67
 899.0229.76  500.570.00  500.57  17.60  98.00


 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
2.100.001.372.070.00   94.46


 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s
 avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
  sdg   0.00 0.000.00   56.67 0.00 25220.00
 890.1223.60  412.140.00  412.14  17.62  99.87
 sdf   0.00 0.000.00   52.00 0.00 24637.33
 947.5933.65  587.410.00  587.41  19.23 100.00


 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
2.210.001.776.750.00   89.27


 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s
 avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
  sdg   0.00 0.000.00   54.33 0.00 24802.67
 912.9825.75  486.360.00  486.36  18.40 100.00
 sdf   0.00 0.000.00   53.00 0.00 24716.00
 932.6835.26  669.890.00  669.89  18.87 100.00


 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
1.870.001.675.250.00   91.21


 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s
 avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
  sdg   0.00 0.000.00   94.33 0.00 26257.33
 556.6918.29  208.440.00  208.44  10.50  99.07
 sdf   0.00 0.000.00   51.33 0.00 24470.67
 953.4032.75  684.620.00  684.62  19.51 100.13


 avg-cpu:  %user   %nice %system %iowait  %steal   %idle
1.510.001.347.250.00   89.89


 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s
 avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
  sdg   0.00 0.000.00   52.00 0.00 22565.33
 867.9024.73  446.510.00  446.51  19.10  99.33
 sdf   0.00 0.000.00   64.67 0.00 24892.00
 769.8619.50  330.020.00  330.02  15.32  99.07
 

 You what model SSD?

 For this one, I am using Seagate 100GB SSD, model: HDS-2TM-ST100FM0012

 Which version of the kernel?

 Ubuntu 13.04, Linux kernel version: 3.8.0-19-generic #30-Ubuntu SMP Wed
 May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

 Looking forward to your reply, thank you.

 Cheers.



 On Mon, Apr 28, 2014 at 4:45 PM, Irek Fasikhov malm...@gmail.comwrote:

 You what model SSD?
 Which version of the kernel?



 2014-04-28 12:35 GMT+04:00 Udo Lembke ulem...@polarzone.de:

 Hi,
 perhaps due IOs from the journal?
 You can test with iostat (like iostat -dm 5 sdg).

 on debian iostat is in the package sysstat.

 Udo

 Am 28.04.2014 07:38, schrieb Indra Pramana:
  Hi Craig,
 
  Good day to you, and thank you for your enquiry.
 
  As per your suggestion, I have created a 3rd partition on the SSDs
 and did
  the dd test directly into the device, and the result is very slow.
 
  
  root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdg3
  conv=fdatasync oflag=direct
  128+0 records in
  128+0 records out
  134217728 bytes (134 MB) copied, 19.5223 s, 6.9 MB/s
 
  root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdf3
  conv=fdatasync oflag=direct
  128+0 records in
  128+0 records out
  134217728 bytes (134 MB) copied, 5.34405 s, 25.1 MB/s
  
 
  I did a test onto another server with exactly similar specification

Re: [ceph-users] Pool with empty name recreated

2014-04-25 Thread Irek Fasikhov
Hi.

radosgw-admin bucket list



2014-04-25 15:32 GMT+04:00 myk...@gmail.com:

 Hi, All.
 Yesterday I managed to reproduce the bug in my test environment
 with a fresh installation of the Dumpling release. I've attached the
 link to an archive with the debug logs.
 http://lamcdn.net/pool_with_empty_name_bug_logs.tar.gz
 The test cluster contains only one bucket named "test" and one file in
 that bucket named "README" with acl public-read.
 Pool with empty name is created when RGW processes request with
 non-existent bucket name. For example:
 $ curl -kIL http://rgw.test.lo/test/README
 HTTP/1.1 200 OK - bucket exists, file exists
 $ curl -kIL http://test.rgw.test.lo/README
 HTTP/1.1 200 OK - bucket exists, file exists
 $ curl -kIL http://rgw.test.lo/test/README2
 HTTP/1.1 403 OK - bucket exists, file does not exist
 $ curl -kIL http://test.rgw.test.lo/README2
 HTTP/1.1 403 Forbidden - bucket exists, file does not exist
 $ curl -kIL http://rgw.test.lo/test2/README
 HTTP/1.1 404 Not Found - bucket does not exist, pool with empty name
 is created
 $ curl -kIL http://test2.rgw.test.lo/README
 HTTP/1.1 404 Not Found - bucket does not exist, pool with empty name
 is created

 If someone confirms this behaviour, we can file a bug and request a
 backport.

 --
 Regards,
 Mikhail


 On Thu, 24 Apr 2014 10:33:00 -0700
 Gregory Farnum g...@inktank.com wrote:

  Yehuda says he's fixed several of these bugs in recent code, but if
  you're seeing it from a recent dev release, please file a bug!
  Likewise if you're on a named release and would like to see a
  backport. :) -Greg
  Software Engineer #42 @ http://inktank.com | http://ceph.com
 
 
  On Thu, Apr 24, 2014 at 4:10 AM, Dan van der Ster
  daniel.vanders...@cern.ch wrote:
   Hi,
   We also get the '' pool from rgw, which is clearly a bug somewhere.
   But we recently learned that you can prevent it from being
   recreated by removing the 'x' capability on the mon from your
   client.radosgw.* users, for example:
  
   client.radosgw.cephrgw1
   key: xxx
   caps: [mon] allow r
   caps: [osd] allow rwx
  
  
   Cheers, Dan
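
For reference, the capability change Dan describes can be applied to an existing key with ceph auth caps; the client name below is simply the one from his example:

ceph auth caps client.radosgw.cephrgw1 mon 'allow r' osd 'allow rwx'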
  
  
   myk...@gmail.com wrote:
  
   Hi,
  
   I cant delete pool with empty name:
  
   $ sudo rados rmpool   --yes-i-really-really-mean-it
   successfully deleted pool
  
   but after a few seconds it is recreated automatically.
  
   $ sudo ceph osd dump | grep '^pool'
   pool 3 '.rgw' rep size 2 min_size 1 crush_ruleset 0 object_hash
   rjenkins pg_num 8 pgp_num 8 last_change 9 owner
   18446744073709551615 pool 4 '.rgw.gc' rep size 2 min_size 1
   crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8
   last_change 10 owner 18446744073709551615 pool 5 '.rgw.control'
   rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num
   8 pgp_num 8 last_change 11 owner 18446744073709551615 pool 6
   '.users.uid' rep size 2 min_size 1 crush_ruleset 0 object_hash
   rjenkins pg_num 8 pgp_num 8 last_change 13 owner 0 pool 7
   '.users.email' rep size 2 min_size 1 crush_ruleset 0 object_hash
   rjenkins pg_num 8 pgp_num 8 last_change 15 owner 0 pool 8 '.users'
   rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num
   8 pgp_num 8 last_change 17 owner 0 pool 9 '.rgw.buckets' rep size
   2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1024
   pgp_num 1024 last_change 38 owner 18446744073709551615 pool 10
   '.rgw.root' rep size 2 min_size 1 crush_ruleset 0 object_hash
   rjenkins pg_num 8 pgp_num 8 last_change 100 owner 0 pool 17 '' rep
   size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8
   pgp_num 8 last_change 3347 owner 0
  
   ceph version 0.67.7 (d7ab4244396b57aac8b7e80812115bbd079e6b73)
  
   How can i delete it forever?
  
  
   -- Dan van der Ster || Data  Storage Services || CERN IT
   Department --
  
   ___
   ceph-users mailing list
   ceph-users@lists.ceph.com
   http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Pool with empty name recreated

2014-04-24 Thread Irek Fasikhov
You need to create a pool named .rgw.buckets.index



2014-04-24 14:05 GMT+04:00 myk...@gmail.com:

 Hi,

 I can't delete the pool with the empty name:

 $ sudo rados rmpool   --yes-i-really-really-mean-it
 successfully deleted pool

 but after a few seconds it is recreated automatically.

 $ sudo ceph osd dump | grep '^pool'
 pool 3 '.rgw' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
 pg_num 8 pgp_num 8 last_change 9 owner 18446744073709551615
 pool 4 '.rgw.gc' rep size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 8 pgp_num 8 last_change 10 owner 18446744073709551615
 pool 5 '.rgw.control' rep size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 8 pgp_num 8 last_change 11 owner 18446744073709551615
 pool 6 '.users.uid' rep size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 8 pgp_num 8 last_change 13 owner 0
 pool 7 '.users.email' rep size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 8 pgp_num 8 last_change 15 owner 0
 pool 8 '.users' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
 pg_num 8 pgp_num 8 last_change 17 owner 0
 pool 9 '.rgw.buckets' rep size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 1024 pgp_num 1024 last_change 38 owner 18446744073709551615
 pool 10 '.rgw.root' rep size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 8 pgp_num 8 last_change 100 owner 0
 pool 17 '' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
 pg_num 8 pgp_num 8 last_change 3347 owner 0

 ceph version 0.67.7 (d7ab4244396b57aac8b7e80812115bbd079e6b73)

 How can i delete it forever?

 --
 Regards,
 Mikhail


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Pool with empty name recreated

2014-04-24 Thread Irek Fasikhov
These pools have different purposes.


[root@ceph01 ~]# radosgw-admin zone list
{ zones: [
default]}
[root@ceph01 ~]# radosgw-admin zone get default
{ domain_root: .rgw,
  control_pool: .rgw.control,
  gc_pool: .rgw.gc,
  log_pool: .log,
  intent_log_pool: .intent-log,
  usage_log_pool: .usage,
  user_keys_pool: .users,
  user_email_pool: .users.email,
  user_swift_pool: .users.swift,
  user_uid_pool: .users.uid,
  system_key: { access_key: ,
  secret_key: },
  placement_pools: [
{ key: default-placement,
  val: { *index_pool: .rgw.buckets.index,*
  *data_pool: .rgw.buckets*}}]}


[root@ceph01 ~]# ceph osd dump | grep pool
pool 0 'data' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins
pg_num 64 pgp_num 64 last_change 5156 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 3 min_size 1 crush_ruleset 1 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 5158 owner 0
pool 2 'rbd' rep size 3 min_size 2 crush_ruleset 2 object_hash rjenkins
pg_num 3200 pgp_num 3200 last_change 11642 owner 0
pool 80 'rbdtest' rep size 2 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 1024 pgp_num 1024 last_change 11550 owner 0
pool 101 '.rgw.root' rep size 3 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 8 pgp_num 8 last_change 11476 owner 0
pool 102 '.rgw.control' rep size 3 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 8 pgp_num 8 last_change 11478 owner 0
pool 103 '.users.uid' rep size 3 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 8 pgp_num 8 last_change 11480 owner 18446744073709551615
pool 104 '.rgw' rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins
pg_num 8 pgp_num 8 last_change 11482 owner 18446744073709551615
pool 105 '.rgw.gc' rep size 3 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 8 pgp_num 8 last_change 11484 owner 18446744073709551615
pool 106 '.users.email' rep size 3 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 8 pgp_num 8 last_change 11486 owner 0
pool 107 '.users' rep size 3 min_size 1 crush_ruleset 0 object_hash
rjenkins pg_num 8 pgp_num 8 last_change 11488 owner 0
pool 108 '.rgw.buckets.index' rep size 3 min_size 1 crush_ruleset 0
object_hash rjenkins pg_num 8 pgp_num 8 last_change 11490 owner 0
pool 109 '.rgw.buckets' rep size 3 min_size 2 crush_ruleset 0 object_hash
rjenkins pg_num 512 pgp_num 512 last_change 11631 owner 0





2014-04-24 14:30 GMT+04:00 myk...@gmail.com:

  You need to create a pool named .rgw.buckets.index
 I tried it before i sent a letter to the list.
 All of my buckets have index_pool: .rgw.buckets.

 --
 Regards,
 Mikhail


 On Thu, 24 Apr 2014 14:21:57 +0400
 Irek Fasikhov malm...@gmail.com wrote:

  You need to create a pool named .rgw.buckets.index
 
 
 
  2014-04-24 14:05 GMT+04:00 myk...@gmail.com:
 
   Hi,
  
   I cant delete pool with empty name:
  
   $ sudo rados rmpool   --yes-i-really-really-mean-it
   successfully deleted pool
  
   but after a few seconds it is recreated automatically.
  
   $ sudo ceph osd dump | grep '^pool'
   pool 3 '.rgw' rep size 2 min_size 1 crush_ruleset 0 object_hash
   rjenkins pg_num 8 pgp_num 8 last_change 9 owner 18446744073709551615
   pool 4 '.rgw.gc' rep size 2 min_size 1 crush_ruleset 0 object_hash
   rjenkins pg_num 8 pgp_num 8 last_change 10 owner
   18446744073709551615 pool 5 '.rgw.control' rep size 2 min_size 1
   crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change
   11 owner 18446744073709551615 pool 6 '.users.uid' rep size 2
   min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8
   last_change 13 owner 0 pool 7 '.users.email' rep size 2 min_size 1
   crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change
   15 owner 0 pool 8 '.users' rep size 2 min_size 1 crush_ruleset 0
   object_hash rjenkins pg_num 8 pgp_num 8 last_change 17 owner 0
   pool 9 '.rgw.buckets' rep size 2 min_size 1 crush_ruleset 0
   object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 38 owner
   18446744073709551615 pool 10 '.rgw.root' rep size 2 min_size 1
   crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change
   100 owner 0 pool 17 '' rep size 2 min_size 1 crush_ruleset 0
   object_hash rjenkins pg_num 8 pgp_num 8 last_change 3347 owner 0
  
   ceph version 0.67.7 (d7ab4244396b57aac8b7e80812115bbd079e6b73)
  
   How can i delete it forever?
  
   --
   Regards,
   Mikhail
  
  
   ___
   ceph-users mailing list
   ceph-users@lists.ceph.com
   http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
  
 
 
 




-- 
Best regards, Irek Fasikhov
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Pool with empty name recreated

2014-04-24 Thread Irek Fasikhov
I do not use distributed replication across zones. :)


2014-04-24 15:00 GMT+04:00 myk...@gmail.com:

 I dont use distributed replication across zones.
 $ sudo radosgw-admin zone list
 { zones: [
 default]}

 --
 Regards,
 Mikhail


 On Thu, 24 Apr 2014 14:52:09 +0400
 Irek Fasikhov malm...@gmail.com wrote:

  These pools of different purposes.
 
 
  [root@ceph01 ~]# radosgw-admin zone list
  { zones: [
  default]}
  [root@ceph01 ~]# radosgw-admin zone get default
  { domain_root: .rgw,
control_pool: .rgw.control,
gc_pool: .rgw.gc,
log_pool: .log,
intent_log_pool: .intent-log,
usage_log_pool: .usage,
user_keys_pool: .users,
user_email_pool: .users.email,
user_swift_pool: .users.swift,
user_uid_pool: .users.uid,
system_key: { access_key: ,
secret_key: },
placement_pools: [
  { key: default-placement,
val: { *index_pool: .rgw.buckets.index,*
*data_pool: .rgw.buckets*}}]}
 
 
  [root@ceph01 ~]# ceph osd dump | grep pool
  pool 0 'data' rep size 3 min_size 1 crush_ruleset 0 object_hash
  rjenkins pg_num 64 pgp_num 64 last_change 5156 owner 0
  crash_replay_interval 45 pool 1 'metadata' rep size 3 min_size 1
  crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change
  5158 owner 0 pool 2 'rbd' rep size 3 min_size 2 crush_ruleset 2
  object_hash rjenkins pg_num 3200 pgp_num 3200 last_change 11642 owner
  0 pool 80 'rbdtest' rep size 2 min_size 1 crush_ruleset 0 object_hash
  rjenkins pg_num 1024 pgp_num 1024 last_change 11550 owner 0
  pool 101 '.rgw.root' rep size 3 min_size 1 crush_ruleset 0 object_hash
  rjenkins pg_num 8 pgp_num 8 last_change 11476 owner 0
  pool 102 '.rgw.control' rep size 3 min_size 1 crush_ruleset 0
  object_hash rjenkins pg_num 8 pgp_num 8 last_change 11478 owner 0
  pool 103 '.users.uid' rep size 3 min_size 1 crush_ruleset 0
  object_hash rjenkins pg_num 8 pgp_num 8 last_change 11480 owner
  18446744073709551615 pool 104 '.rgw' rep size 3 min_size 1
  crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change
  11482 owner 18446744073709551615 pool 105 '.rgw.gc' rep size 3
  min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8
  last_change 11484 owner 18446744073709551615 pool 106 '.users.email'
  rep size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8
  pgp_num 8 last_change 11486 owner 0 pool 107 '.users' rep size 3
  min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8
  last_change 11488 owner 0 pool 108 '.rgw.buckets.index' rep size 3
  min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8
  last_change 11490 owner 0 pool 109 '.rgw.buckets' rep size 3 min_size
  2 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512
  last_change 11631 owner 0
 
 
 
 
 
  2014-04-24 14:30 GMT+04:00 myk...@gmail.com:
 
You need to create a pool named .rgw.buckets.index
   I tried it before i sent a letter to the list.
   All of my buckets have index_pool: .rgw.buckets.
  
   --
   Regards,
   Mikhail
  
  
   On Thu, 24 Apr 2014 14:21:57 +0400
   Irek Fasikhov malm...@gmail.com wrote:
  
You need to create a pool named .rgw.buckets.index
   
   
   
2014-04-24 14:05 GMT+04:00 myk...@gmail.com:
   
 Hi,

 I cant delete pool with empty name:

 $ sudo rados rmpool   --yes-i-really-really-mean-it
 successfully deleted pool

 but after a few seconds it is recreated automatically.

 $ sudo ceph osd dump | grep '^pool'
 pool 3 '.rgw' rep size 2 min_size 1 crush_ruleset 0 object_hash
 rjenkins pg_num 8 pgp_num 8 last_change 9 owner
 18446744073709551615 pool 4 '.rgw.gc' rep size 2 min_size 1
 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8
 last_change 10 owner 18446744073709551615 pool 5 '.rgw.control'
 rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
 pg_num 8 pgp_num 8 last_change 11 owner 18446744073709551615
 pool 6 '.users.uid' rep size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 13 owner 0
 pool 7 '.users.email' rep size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 15 owner 0
 pool 8 '.users' rep size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 8 pgp_num 8 last_change 17 owner 0
 pool 9 '.rgw.buckets' rep size 2 min_size 1 crush_ruleset 0
 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 38
 owner 18446744073709551615 pool 10 '.rgw.root' rep size 2
 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8
 pgp_num 8 last_change 100 owner 0 pool 17 '' rep size 2
 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8
 pgp_num 8 last_change 3347 owner 0

 ceph version 0.67.7 (d7ab4244396b57aac8b7e80812115bbd079e6b73)

 How can i delete it forever?

 --
 Regards,
 Mikhail