Re: [ceph-users] S3 Bucket usage up 150% difference between rgw-admin and external metering tools.

2020-01-21 Thread EDH - Manuel Rios
d 2~r0BMPPs8CewVZ6Qheu1s9WzaBn7bBvU Return: An error occurred (NoSuchUpload) when calling the AbortMultipartUpload operation: Unknown The same error is reported by S3CMD. Maybe there is something wrong parsing the "1$" inside the key. Best Regards, Manuel -Original Message- From: c
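For reference, a hedged sketch of how such an abort is typically attempted against RGW; the bucket, key, upload id and endpoint below are placeholders, and the aws CLI / s3cmd invocations are illustrative assumptions rather than commands taken from the thread:

# list in-progress multipart uploads on the bucket (placeholder names throughout)
aws s3api list-multipart-uploads --bucket BUCKET --endpoint-url https://rgw.example.com
# abort one of them; a NoSuchUpload error means RGW no longer recognizes that upload id
aws s3api abort-multipart-upload --bucket BUCKET --key "KEY" --upload-id "UPLOAD_ID" \
    --endpoint-url https://rgw.example.com
# roughly equivalent with s3cmd
s3cmd multipart s3://BUCKET
s3cmd abortmp s3://BUCKET/KEY UPLOAD_ID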

Re: [ceph-users] S3 Bucket usage up 150% difference between rgw-admin and external metering tools.

2020-01-21 Thread EDH - Manuel Rios
"num_objects": 143 } }, "bucket_quota": { "enabled": false, "check_on_raw": false, "max_size": -1024, "max_size_kb": 0, "max_objects": -1 } } -Mensaje or

[ceph-users] S3 Bucket usage up 150% difference between rgw-admin and external metering tools.

2020-01-20 Thread EDH - Manuel Rios
Hi Cephs, Several nodes of our Ceph 14.2.5 cluster are fully dedicated to hosting cold storage / backup data. Today, checking data usage with a customer, we found that rgw-admin is reporting: { "bucket": "XX", "tenant": "", "zonegroup": "4d8c7c5f-ca40-4ee3-b5bb-b2cad90bd007",

[ceph-users] [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting

2020-01-02 Thread EDH - Manuel Rios
Hi, Today while checking our monitor logs we see that a RocksDB compaction triggers every minute. Is that normal? 2020-01-02 14:08:33.091 7f2b8acbe700 4 rocksdb: [db/db_impl_compaction_flush.cc:1403] [default] Manual compaction starting 2020-01-02 14:08:33.091 7f2b8acbe700 4 rocksdb:

Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-27 Thread EDH - Manuel Rios
Micron 9300 Get Outlook for Android From: ceph-users on behalf of Sinan Polat Sent: Friday, December 27, 2019 12:35:44 PM To: Eneko Lacunza Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Consumer-grade SSD in Ceph Thanks

Re: [ceph-users] Global power failure, OpenStack Nova/libvirt/KVM, and Ceph RBD locks

2019-11-15 Thread EDH - Manuel Rios Fernandez
Hi, To solve the issue, map the volume with rbd map pool/disk_id and mount the / volume on a Linux machine (a Ceph node will be fine). This will flush the journal and close and discard the pending changes in the OpenStack nodes' cache; then unmount and rbd unmap. Boot the instance from OpenStack again, and
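A minimal sketch of that recovery sequence, assuming the image is pool/disk_id as in the post and that the kernel maps it to /dev/rbd0 (device and mount point are placeholders):

rbd map pool/disk_id           # maps the image; typically appears as /dev/rbd0
mount /dev/rbd0 /mnt/recover   # mounting replays the filesystem journal and flushes pending changes
umount /mnt/recover            # cleanly close the filesystem again
rbd unmap /dev/rbd0            # release the image before booting the instance from OpenStack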

Re: [ceph-users] cephfs full, 2/3 Raw capacity used

2019-08-26 Thread EDH - Manuel Rios Fernandez
The balancer only balances when the cluster is in a healthy state. The problem is that data is not balanced on its first write, which causes improperly balanced data across OSDs. This problem only happens in Ceph; we see the same with 14.2.2, having to change the weights manually. Because the balancer is a

[ceph-users] Balancer doesn't work with pgs in state backfill_toofull

2019-08-23 Thread EDH - Manuel Rios Fernandez
The affected root has more than 70TB free. The only solution is to manually reweight the OSDs. But in this situation the balancer in upmap mode should move data to get everything HEALTHY. Hope some fix comes in the next 14.2.X to fix that issue. Ceph 14.2.2, CentOS 7.6 cluster: id:

Re: [ceph-users] Applications slow in VMs running RBD disks

2019-08-21 Thread EDH - Manuel Rios Fernandez
Use a 100% flash setup and avoid rotational disks to get some performance with Windows; it is hard to get it from HDDs. Windows is very sensitive to disk latency, and a laggy interface sometimes gives the customer a bad impression. You can check in your Grafana for Ceph your avg read/write when in Windows it goes up

Re: [ceph-users] fixing a bad PG per OSD decision with pg-autoscaling?

2019-08-21 Thread EDH - Manuel Rios Fernandez
Hi Nigel, In Nautilus you can decrease PGs, but it takes weeks; for example, going from 4096 to 2048 took us more than 2 weeks. First of all, pg-autoscaling can be enabled per pool. And you're going to get a lot of warnings, but it works. Normally it is recommended to upgrade a cluster with

[ceph-users] Ceph Balancer code

2019-08-17 Thread EDH - Manuel Rios Fernandez
Hi, What's the reason for not allowing the balancer to move PGs if objects are inactive/misplaced, at least in Nautilus 14.2.2? https://github.com/ceph/ceph/blob/master/src/pybind/mgr/balancer/module.py#L874 if

Re: [ceph-users] 14.2.2 - OSD Crash

2019-08-07 Thread EDH - Manuel Rios Fernandez
to an NVMe of 480GB per node helps in this situation, but not sure. Manuel From: Igor Fedotov Sent: Wednesday, August 7, 2019 13:10 To: EDH - Manuel Rios Fernandez ; 'Ceph Users' Subject: Re: [ceph-users] 14.2.2 - OSD Crash Hi Manuel, as Brad pointed out timeouts

[ceph-users] Nautilus - Balancer is always on

2019-08-07 Thread EDH - Manuel Rios Fernandez
Hi All, ceph mgr module disable balancer Error EINVAL: module 'balancer' cannot be disabled (always-on) What's the way to restart the balancer? Restart the MGR service? I want to suggest to the balancer developers to set up a ceph-balancer.log for this module to get more information about what's

Re: [ceph-users] RadosGW (Ceph Object Gateway) Pools

2019-08-06 Thread EDH - Manuel Rios Fernandez
Hi, I think -> default.rgw.buckets.index; for us it reaches 2k-6K IOPS for an index size of 23GB. Regards, Manuel -Original Message- From: ceph-users On behalf of dhils...@performair.com Sent: Wednesday, August 7, 2019 1:41 To: ceph-users@lists.ceph.com Subject: [ceph-users]

[ceph-users] 14.2.2 - OSD Crash

2019-08-06 Thread EDH - Manuel Rios Fernandez
Hi, We got a pair of OSDs located in a node that crash randomly since 14.2.2. OS Version: CentOS 7.6. There are a ton of lines before the crash, which I found unexpected: -- 3045> 2019-08-07 00:39:32.013 7fe9a4996700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fe987e49700' had timed

Re: [ceph-users] radosgw (beast): how to enable verbose log? request, user-agent, etc.

2019-08-06 Thread EDH - Manuel Rios Fernandez
Hi Felix, You can increase the debug level with debug rgw on your RGW nodes. We set it to 10. But at least in our case we switched back to civetweb, because beast doesn't provide a clear log without a lot of verbosity. Regards, Manuel From: ceph-users On behalf of Félix Barbeira

Re: [ceph-users] even number of monitors

2019-08-05 Thread EDH - Manuel Rios Fernandez
With 4 monitors, if you lose 2 you lose quorum, because quorum needs a majority (floor(N/2)+1): with N=4 that is 3, and only 2 remain. Recommended monitor counts: 1 - 3 - 5 - 7. Regards, Manuel -Original Message- From: ceph-users On behalf of Alfredo Daniel Rezinovsky Sent: Monday, August 5, 2019 12:28 To: ceph-users Subject:

Re: [ceph-users] Adventures with large RGW buckets

2019-08-01 Thread EDH - Manuel Rios Fernandez
Hi Greg / Eric, What about allowing bucket objects to be deleted with a lifecycle policy? You can actually set a 1-day object lifetime; that task is done at cluster level. Then delete objects younger than 1 day, and remove the bucket. That sometimes speeds up deletes, as the task is done by the RGWs. It should be like a

Re: [ceph-users] Balancer in HEALTH_ERR

2019-08-01 Thread EDH - Manuel Rios Fernandez
Hi Eric, CEPH006 is the node that we're evacuating; for that task we added CEPH005. Thanks From: Smith, Eric Sent: Thursday, August 1, 2019 20:12 To: EDH - Manuel Rios Fernandez ; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Balancer in HEALTH_ERR From y

[ceph-users] Balancer in HEALTH_ERR

2019-08-01 Thread EDH - Manuel Rios Fernandez
Hi, Two weeks ago we started a data migration from an old Ceph node to a new one. For that task we added a 120TB host to the cluster and evacuated the old one with ceph osd crush reweight osd.X 0.0, which moves nearly 15 TB per day. After 1 week and a few days we found that the balancer module

Re: [ceph-users] Ceph Nautilus - can't balance due to degraded state

2019-07-29 Thread EDH - Manuel Rios Fernandez
Same here, Nautilus 14.2.2. Evacuate one host and join another one at the same time and everything is unbalanced. Best From: ceph-users On behalf of David Herselman Sent: Monday, July 29, 2019 11:31 To: ceph-users@lists.ceph.com Subject: [ceph-users] Ceph Nautilus - can't

Re: [ceph-users] Nautilus dashboard: crushmap viewer shows only first root

2019-07-24 Thread EDH - Manuel Rios Fernandez
Hi Eugen, Yes, it's solved; we reported it in 14.2.1 and the team fixed it in 14.2.2. Regards, Manuel -Original Message- From: ceph-users On behalf of Eugen Block Sent: Wednesday, July 24, 2019 15:10 To: ceph-users@lists.ceph.com Subject: [ceph-users] Nautilus dashboard: crushmap viewer

[ceph-users] Repair statfs fails on some OSDs, 14.2.1 to 14.2.2

2019-07-23 Thread EDH - Manuel Rios Fernandez
Hi Ceph, We upgraded last night from 14.2.1 to 14.2.2; 36 OSDs with old stats. We're still repairing stats one by one, but one failed. Hope this helps. CentOS Version: Linux CEPH006 3.10.0-957.10.1.el7.x86_64 #1 SMP Mon Mar 18 15:06:45 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

[ceph-users] RGW Beast crash 14.2.1

2019-07-11 Thread EDH - Manuel Rios Fernandez
Hi Folks, Tonight RGW crashed for no apparent reason using beast as the frontend. We resolved it by switching back to civetweb. Should this be reported to the tracker? Regards, Manuel CentOS 7.6 Linux ceph-rgw03 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Re: [ceph-users] Even more objects in a single bucket?

2019-06-17 Thread EDH - Manuel Rios Fernandez
Hi Harald, We saw in our internal Veeam repo that only 4TB used created more than 10M objects. I don't know if Veeam needs to list the content inside the bucket; that would make a 500-million-object bucket not a good solution, at least in our experience with sharding. I read someone on IRC saying that

Re: [ceph-users] Can I limit OSD memory usage?

2019-06-07 Thread EDH - Manuel Rios Fernandez
Hi Sergei, Please add this to your host: for 64GB RAM, reserve 1GB (vm.min_free_kbytes = 1048576); for 128GB RAM, reserve 2GB (vm.min_free_kbytes = 2097152); for 256GB RAM, reserve 3GB (vm.min_free_kbytes = 3145728). This will prevent your OSDs from using ALL the memory on the host and triggering the OOM killer. Regards
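A sketch of how such a reservation is usually made persistent; the file path is a hypothetical example, and the value should match the host's RAM as listed above (the 2GB / 128GB case is shown):

# /etc/sysctl.d/99-ceph-min-free.conf  (hypothetical file name)
vm.min_free_kbytes = 2097152

# apply without rebooting
sysctl -p /etc/sysctl.d/99-ceph-min-free.conf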

Re: [ceph-users] OSD RAM recommendations

2019-06-07 Thread EDH - Manuel Rios Fernandez
In Nautilus, a minimum of 4GB per disk; in the case of SSD/NVMe, 6-12GB per disk. 8GB per disk is a good way to get performance, plus 2-4GB for the OS. Regards, Manuel -Original Message- From: ceph-users On behalf of jes...@krogh.cc Sent: Friday, June 7, 2019 19:36 To: Jorge Garcia CC:

[ceph-users] cls_rgw.cc:3461: couldn't find tag in name index tag

2019-06-05 Thread EDH - Manuel Rios Fernandez
Hi, Checking our cluster logs we found tons of these lines on the OSDs. One OSD: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.1/rpm/el7/BUILD/ceph-14.2.1/src/cls/rgw/cls_rgw.cc:3461:

[ceph-users] Lifecycle policy completed but not done

2019-05-30 Thread EDH - Manuel Rios Fernandez
Hi Cephs! Yesterday we set up a lifecycle policy to remove all incomplete multipart uploads in the buckets, because they cause mismatches between the used space shown in tools and the bucket stats from Ceph. We set up this policy (s3cmd setlifecycle rule.xml s3://GIB --no-ssl), sketched below.
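A sketch of what such a rule.xml could look like, assuming the standard S3 AbortIncompleteMultipartUpload lifecycle action with a 1-day threshold; the rule id is a placeholder, and only the s3cmd command itself comes from the post:

cat > rule.xml <<'EOF'
<LifecycleConfiguration>
  <Rule>
    <ID>abort-incomplete-mpu</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>1</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
EOF
# apply the rule to the bucket, as in the original post
s3cmd setlifecycle rule.xml s3://GIB --no-ssl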

Re: [ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm

2019-05-21 Thread EDH - Manuel Rios Fernandez
Hi Marc, Is there any scrub / deep-scrub running on the affected OSDs? Best Regards, Manuel -Original Message- From: ceph-users On behalf of Marc Roos Sent: Tuesday, May 21, 2019 10:01 To: ceph-users ; Marc Roos Subject: Re: [ceph-users] cephfs causing high load on vm, taking

Re: [ceph-users] Large OMAP Objects in default.rgw.log pool

2019-05-20 Thread EDH - Manuel Rios Fernandez
Hi Arnondh, What's your Ceph version? Regards From: ceph-users On behalf of mr. non non Sent: Monday, May 20, 2019 12:39 To: ceph-users@lists.ceph.com Subject: [ceph-users] Large OMAP Objects in default.rgw.log pool Hi, I found the same issue as above. Does

Re: [ceph-users] RGW Bucket unable to list buckets 100TB bucket

2019-05-18 Thread EDH - Manuel Rios Fernandez
May 2019 10:14 To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] RGW Bucket unable to list buckets 100TB bucket On 10/05/2019 08:42, EDH - Manuel Rios Fernandez wrote: > Hi > > Last night we added 2 Intel Optane NVMe > > Generated 4 partitions to get the max pe

[ceph-users] Scrub Crash OSD 14.2.1

2019-05-17 Thread EDH - Manuel Rios Fernandez
Hi, Today we got some OSDs that crash after scrub. Version 14.2.1 2019-05-17 12:49:40.955 7fd980d8fd80 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1558090180955778, "job": 1, "event": "recovery_finished"} 2019-05-17 12:49:40.967 7fd980d8fd80 4 rocksdb:

Re: [ceph-users] openstack with ceph rbd vms IO/erros

2019-05-17 Thread EDH - Manuel Rios Fernandez
Did you check your KVM host RAM usage? We saw this on hosts that were very heavily loaded; overcommitting RAM causes random VM crashes. As you said, to solve it the volume must be remounted externally and fsck'd. You can prevent it by disabling the Ceph cache on the OpenStack Nova host. But your VMs are going to get less

Re: [ceph-users] Ceph Bucket strange issues rgw.none + id and marker different.

2019-05-15 Thread EDH - Manuel Rios Fernandez
84.03 MIN/MAX VAR: 0.00/1.05 STDDEV: 25.52 -Original Message- From: J. Eric Ivancich Sent: Wednesday, May 15, 2019 18:12 To: EDH - Manuel Rios Fernandez ; 'Casey Bodley' ; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Ceph Bucket strange issues rgw.none + id and marker

Re: [ceph-users] How do you deal with "clock skew detected"?

2019-05-15 Thread EDH - Manuel Rios Fernandez
We set up 2 monitors as NTP servers, and the other nodes sync from the monitors. -Original Message- From: ceph-users On behalf of Richard Hesketh Sent: Wednesday, May 15, 2019 14:04 To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] How do you deal with "clock skew

[ceph-users] Using centralized management configuration drops some unrecognized config options

2019-05-14 Thread EDH - Manuel Rios Fernandez
Hi, We're moving our config to the centralized management configuration with "ceph config set" and a minimal ceph.conf on all nodes. Several Ceph options are not allowed. Why? ceph version 14.2.1 (d555a9489eb35f84f2e1ef49b77e19da9d113972) nautilus (stable) ceph config set osd
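For context, a minimal sketch of the centralized-configuration workflow being described; the option name is only an illustrative placeholder, not one taken from the truncated post:

# store an option in the cluster configuration database for all OSDs
ceph config set osd osd_max_backfills 4
# read it back from the monitors
ceph config get osd osd_max_backfills
# show everything currently stored centrally
ceph config dump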

Re: [ceph-users] ceph nautilus deep-scrub health error

2019-05-14 Thread EDH - Manuel Rios Fernandez
Hi Muthu, We found the same issue, with near 2000 pgs not deep-scrubbed in time. We're manually forcing scrubs with: ceph health detail | grep -i not | awk '{print $2}' | while read i; do ceph pg deep-scrub ${i}; done It launches near 20-30 pgs to be deep-scrubbed. I think you can
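The same loop spread over several lines for readability; this is only the one-liner above reformatted, relying on ceph health detail printing the pg id in the second column of each "not deep-scrubbed in time" line:

ceph health detail \
  | grep -i not \
  | awk '{print $2}' \
  | while read pg; do
      ceph pg deep-scrub "${pg}"
    done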

Re: [ceph-users] Ceph MGR CRASH : balancer module

2019-05-14 Thread EDH - Manuel Rios Fernandez
We can confirm that the balancer module works smoothly in 14.2.1. We're balancing by bytes and PGs. Now all OSDs are 100% balanced. From: ceph-users On behalf of xie.xing...@zte.com.cn Sent: Tuesday, May 14, 2019 9:53 To: tze...@us.ibm.com CC: ceph-users@lists.ceph.com Subject:

[ceph-users] Ceph Health 14.2.1 doesn't report slow OPS

2019-05-13 Thread EDH - Manuel Rios Fernandez
Hi, The latest version of Ceph no longer reports slow ops in the dashboard and CLI? Bug, or expected? ceph version 14.2.1 (d555a9489eb35f84f2e1ef49b77e19da9d113972) nautilus (stable) Linux 3.10.0-957.12.1.el7.x86_64 #1 SMP Mon Apr 29 14:59:59 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Re: [ceph-users] Slow requests from bluestore osds

2019-05-12 Thread EDH - Manuel Rios Fernandez
Hi Marc, Try to compact the OSDs with slow requests: ceph tell osd.[ID] compact This will take the OSD offline for some seconds (SSD) to minutes (HDD) and perform a compaction of the OMAP database. Regards, -Original Message- From: ceph-users On behalf of Marc Schöchlin Sent: Monday, 13

[ceph-users] Daemon configuration preference

2019-05-10 Thread EDH - Manuel Rios Fernandez
Hi Cephs, We migrated ceph.conf into the cluster's configuration database. Which takes precedence when a daemon starts up: ceph.conf or the configuration database? Is the cluster configuration database read live, or do we still need to restart daemons? Regards

Re: [ceph-users] RGW Bucket unable to list buckets 100TB bucket

2019-05-10 Thread EDH - Manuel Rios Fernandez
---Original Message- From: ceph-users On behalf of EDH - Manuel Rios Fernandez Sent: Saturday, May 4, 2019 15:53 To: 'Matt Benjamin' CC: 'ceph-users' Subject: Re: [ceph-users] RGW Bucket unable to list buckets 100TB bucket Hi Folks, The user is telling us that their software drops

Re: [ceph-users] Getting "No space left on device" when reading from cephfs

2019-05-09 Thread EDH - Manuel Rios Fernandez
I'm not sure that setting max backfills to 128 is a good idea. I'm sharing our config for recovery and backfilling: osd recovery threads = 4 osd recovery op priority = 1 osd recovery max active = 2 osd recovery max single start = 1 osd max backfills = 4 osd backfill scan max = 16 osd backfill

Re: [ceph-users] Ceph Bucket strange issues rgw.none + id and marker different.

2019-05-08 Thread EDH - Manuel Rios Fernandez
-Original Message- From: J. Eric Ivancich Sent: Wednesday, May 8, 2019 21:00 To: EDH - Manuel Rios Fernandez ; 'Casey Bodley' ; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Ceph Bucket strange issues rgw.none + id and marker different. Hi Manuel, My response is interleaved. On 5

Re: [ceph-users] Ceph Bucket strange issues rgw.none + id and marker different.

2019-05-07 Thread EDH - Manuel Rios Fernandez
more ideas? Regards -Original Message- From: J. Eric Ivancich Sent: Tuesday, May 7, 2019 23:53 To: EDH - Manuel Rios Fernandez ; 'Casey Bodley' ; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Ceph Bucket strange issues rgw.none + id and marker different. On 5/7/19 11

Re: [ceph-users] Ceph Bucket strange issues rgw.none + id and marker different.

2019-05-07 Thread EDH - Manuel Rios Fernandez
-Original Message- From: Casey Bodley Sent: Tuesday, May 7, 2019 19:13 To: EDH - Manuel Rios Fernandez ; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Ceph Bucket strange issues rgw.none + id and marker different. On 5/7/19 11:24 AM, EDH - Manuel Rios Fernandez wrote: > Hi Casey >

Re: [ceph-users] Ceph Bucket strange issues rgw.none + id and marker different.

2019-05-07 Thread EDH - Manuel Rios Fernandez
in bucket stats is nothing to worry about. What ceph version is this? This reminds me of a fix in https://github.com/ceph/ceph/pull/23940, which I now see never got its backports to mimic or luminous. :( On 5/7/19 10:20 AM, EDH - Manuel Rios Fernandez wrote: > > Hi Ceph’s > > We got an i

[ceph-users] Ceph Bucket strange issues rgw.none + id and marker different.

2019-05-07 Thread EDH - Manuel Rios Fernandez
Hi Ceph's, We have an issue for which we're still looking for the cause after more than 60 hours of searching for a misconfiguration. After checking a lot of documentation and questions we find that the bucket id and bucket marker are not the same. We compared all our other buckets and they all have the same id and

[ceph-users] cls_rgw.cc:3420: couldn't find tag in name index

2019-05-04 Thread EDH - Manuel Rios Fernandez
Hi Ceph's! On some OSDs with excessive CPU usage we're seeing the following log lines: 2019-05-05 01:40:57.733 7efeb10bc700 0 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.5/rp

Re: [ceph-users] RGW Bucket unable to list buckets 100TB bucket

2019-05-04 Thread EDH - Manuel Rios Fernandez
:600 Of course listing a bucket with 7M objects is painful; maybe this will help the software complete the listing? Best Regards, Manuel -Original Message- From: Matt Benjamin Sent: Friday, May 3, 2019 15:47 To: EDH - Manuel Rios Fernandez CC: ceph-users Subject: Re: [ceph

[ceph-users] RGW BEAST mimic backport doesn't show customer IP

2019-05-03 Thread EDH - Manuel Rios Fernandez
Hi Folks, We migrated our RGW from civetweb to the beast frontend backported to Mimic; the performance is impressive compared with the old one. But the Ceph logs don't show the client peer IP, checked with debug rgw = 1 and 2. The Ceph documentation we checked doesn't tell us much more. How

Re: [ceph-users] Ceph cluster available to clients with 2 different VLANs ?

2019-05-03 Thread EDH - Manuel Rios Fernandez
You can put multiple networks in ceph.conf separated by commas: public network = 172.16.2.0/24, 192.168.0.0/22 But remember your servers must be able to reach them; L3 routing / firewall rules may be needed. Regards, Manuel From: ceph-users On behalf of Martin Verges Sent: Friday, May 3, 2019 11:36 To:
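As a sketch, the same setting laid out as a ceph.conf fragment; the second CIDR is the one from the post reconstructed to a full network address, so adjust both ranges to your own environment:

[global]
public network = 172.16.2.0/24, 192.168.0.0/22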

Re: [ceph-users] RGW Bucket unable to list buckets 100TB bucket

2019-05-03 Thread EDH - Manuel Rios Fernandez
us=206 == -Original Message- From: EDH - Manuel Rios Fernandez Sent: Friday, May 3, 2019 15:12 To: 'Matt Benjamin' CC: 'ceph-users' Subject: RE: [ceph-users] RGW Bucket unable to list buckets 100TB bucket Hi Matt, Thanks for your help. We have made the changes plus a

Re: [ceph-users] RGW Bucket unable to list buckets 100TB bucket

2019-05-03 Thread EDH - Manuel Rios Fernandez
Sent: Friday, May 3, 2019 14:00 To: EDH - Manuel Rios Fernandez CC: ceph-users Subject: Re: [ceph-users] RGW Bucket unable to list buckets 100TB bucket Hi Folks, Thanks for sharing your ceph.conf along with the behavior. There are some odd things there. 1. rgw_num_rados_h

[ceph-users] RGW Bucket unable to list buckets 100TB bucket

2019-05-03 Thread EDH - Manuel Rios Fernandez
Hi, We have a Ceph deployment on version 13.2.5, with several buckets holding millions of files. services: mon: 3 daemons, quorum CEPH001,CEPH002,CEPH003 mgr: CEPH001(active) osd: 106 osds: 106 up, 106 in rgw: 2 daemons active data: pools: 17 pools, 7120 pgs