Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2018-12-21 Thread Michael Green
I was informed today that the CEPH environment I’ve been working on is no longer available. Unfortunately this happened before I could try any of your suggestions, Roman. Thank you for all the attention and advice. -- Michael Green > On Dec 20, 2018, at 08:21, Roman Penyaev wrote: > >>

Re: [ceph-users] cephfs file block size: must it be so big?

2018-12-21 Thread Gregory Farnum
On Fri, Dec 14, 2018 at 6:44 PM Bryan Henderson wrote: > > Going back through the logs though it looks like the main reason we do a > > 4MiB block size is so that we have a chance of reporting actual cluster > > sizes to 32-bit systems, > > I believe you're talking about a different block size
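
For anyone trying to reproduce the distinction being drawn here, both "block sizes" can be inspected directly on a CephFS mount; a minimal sketch, assuming a mount at /mnt/cephfs and an arbitrary test file (both are placeholders, not paths from this thread):

  # Block size the client mount reports via statfs(2), i.e. what df and 32-bit callers see
  $ stat -f /mnt/cephfs
  # RADOS object size used by a particular file's layout (4 MiB by default)
  $ getfattr -n ceph.file.layout /mnt/cephfs/somefile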

Re: [ceph-users] Ceph OOM Killer Luminous

2018-12-21 Thread Brad Hubbard
Can you provide the complete OOM message from the dmesg log? On Sat, Dec 22, 2018 at 7:53 AM Pardhiv Karri wrote: > > > Thank You for the quick response Dyweni! > > We are using FileStore as this cluster is upgraded from > Hammer-->Jewel-->Luminous 12.2.8. 16x2TB HDD per node for all nodes.
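
The full report Brad is asking for normally sits in the kernel ring buffer; a minimal way to pull it out (plain Linux tooling, nothing Ceph-specific):

  # OOM killer messages with human-readable timestamps
  $ dmesg -T | grep -iE 'out of memory|oom-killer|killed process'
  # If the ring buffer has already wrapped, the persisted kernel log helps
  $ journalctl -k | grep -iE 'oom-killer|killed process'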

Re: [ceph-users] Ceph OOM Killer Luminous

2018-12-21 Thread Pardhiv Karri
Thank you for the quick response Dyweni! We are using FileStore, as this cluster was upgraded from Hammer --> Jewel --> Luminous 12.2.8. 16x 2TB HDDs per node for all nodes. The R730xd has 128GB and the R740xd has 96GB of RAM. Everything else is the same. Thanks, Pardhiv Karri On Fri, Dec 21, 2018 at 1:43 PM

Re: [ceph-users] Bluestore nvme DB/WAL size

2018-12-21 Thread Anthony D'Atri
> It'll cause problems if your only NVMe drive dies - you'll lose > all the DB partitions and all of those OSDs will fail The severity of this depends a lot on the size of the cluster. If there are only, say, 4 nodes total, for sure the loss of a quarter of the OSDs will

Re: [ceph-users] Ceph OOM Killer Luminous

2018-12-21 Thread Dyweni - Ceph-Users
Hi, You could be running out of memory due to the default Bluestore cache sizes. How many disks/OSDs in the R730xd versus the R740xd? How much memory in each server type? How many are HDD versus SSD? Are you running Bluestore? OSDs in Luminous that run Bluestore allocate memory to use
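
The cache behaviour referred to here is governed by the bluestore_cache_size* options; a small sketch for checking and capping them (osd.0 and the 512 MiB value are only examples, and this applies to Bluestore OSDs, not the FileStore OSDs mentioned elsewhere in the thread):

  # See what the running OSD has in effect
  $ ceph daemon osd.0 config show | grep bluestore_cache_size
  # A cap is usually set in the [osd] section of ceph.conf and picked up on
  # restart, e.g.:  bluestore_cache_size_hdd = 536870912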

[ceph-users] Ceph OOM Killer Luminous

2018-12-21 Thread Pardhiv Karri
Hi, We have a Luminous cluster which was upgraded from Hammer --> Jewel --> Luminous 12.2.8 recently. Post upgrade we are seeing an issue with a few nodes where they are running out of memory and dying. In the logs we are seeing the OOM killer. We didn't have this issue before the upgrade. The only

Re: [ceph-users] Ceph Cluster to OSD Utilization not in Sync

2018-12-21 Thread Pardhiv Karri
Thank you, Dyweni, for the quick response. We have 2 Hammer clusters which are due for an upgrade to Luminous next month, and 1 Luminous 12.2.8 cluster. Will try this on Luminous and, if it works, will apply the same once the Hammer clusters are upgraded, rather than adjusting the weights. Thanks, Pardhiv Karri On

Re: [ceph-users] Ceph Cluster to OSD Utilization not in Sync

2018-12-21 Thread Dyweni - Ceph-Users
Hi, If you are running Ceph Luminous or later, use the Ceph Manager Daemon's Balancer module. (http://docs.ceph.com/docs/luminous/mgr/balancer/). Otherwise, tweak the OSD weights (not the OSD CRUSH weights) until you achieve uniformity. (You should be able to get under 1 STDDEV). I would
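
For reference, both suggestions map to a handful of commands on a Luminous cluster; a sketch, assuming the mgr is running (the 110 overload threshold is just an example value):

  # Balancer module: crush-compat works with older clients, upmap requires
  # luminous-or-newer clients cluster-wide
  $ ceph mgr module enable balancer
  $ ceph balancer mode crush-compat
  $ ceph balancer on
  # Manual alternative: adjust OSD reweights (not CRUSH weights); preview first
  $ ceph osd test-reweight-by-utilization 110
  $ ceph osd reweight-by-utilization 110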

[ceph-users] Ceph Cluster to OSD Utilization not in Sync

2018-12-21 Thread Pardhiv Karri
Hi, We have Ceph clusters which are greater than 1PB. We are using the tree algorithm. The issue is with data placement: when the cluster utilization percentage is at 65%, some of the OSDs are already above 87%. We had to change the near_full ratio to 0.90 to circumvent warnings and to get back
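
For context, the spread described above is visible per OSD, and the nearfull threshold change is a single command on Luminous (0.90 mirrors the value mentioned; the default is 0.85):

  # Per-OSD utilization plus the standard deviation across the cluster
  $ ceph osd df tree
  # Raise the nearfull warning threshold (Luminous and later)
  $ ceph osd set-nearfull-ratio 0.90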

[ceph-users] Your email to ceph-users mailing list: Signature check failures.

2018-12-21 Thread Dyweni - Ceph-Users
Hi Cary, I ran across your email on the ceph-users mailing list, 'Signature check failures.'. I've just run into the same issue on my end. Also a Gentoo user here. Running Ceph 12.2.5... 32bit/armhf and 64bit/x86_64. Was your environment mixed or strictly just x86_64? What is

Re: [ceph-users] Possible data damage: 1 pg inconsistent

2018-12-21 Thread Frank Ritchie
Christoph, do you have any links to the bug? On Fri, Dec 21, 2018 at 11:07 AM Christoph Adomeit < christoph.adom...@gatworks.de> wrote: > Hi, > > same here but also for pgs in cephfs pools. > > As far as I know there is a known bug that under memory pressure some > reads return zero > and this

Re: [ceph-users] Possible data damage: 1 pg inconsistent

2018-12-21 Thread Christoph Adomeit
Hi, same here but also for pgs in cephfs pools. As far as I know there is a known bug that under memory pressure some reads return zero and this will lead to the error message. I have set nodeep-scrub and I am waiting for 12.2.11. Thanks Christoph On Fri, Dec 21, 2018 at 03:23:21PM +0100,

[ceph-users] CephFS MDS optimal setup on Google Cloud

2018-12-21 Thread Mahmoud Ismail
Hello, I'm doing benchmarks for metadata operations on CephFS, HDFS, and HopsFS on Google Cloud. In my current setup, I'm using 32 vCPU machines with 29 GB memory, and I have 1 MDS, 1 MON and 3 OSDs. The MDS and the MON nodes are co-located on one VM, while each of the OSDs is on a separate VM
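
For a metadata-heavy benchmark like this, the MDS cache limit is usually the first knob worth checking; a sketch, where the MDS name and the 8 GiB value are only examples for a 29 GB VM, not recommendations:

  # Inspect the current limit (Luminous uses a memory-based limit)
  $ ceph daemon mds.<name> config get mds_cache_memory_limit
  # Raise it at runtime; make it persistent via ceph.conf if it helps
  $ ceph tell mds.* injectargs '--mds_cache_memory_limit=8589934592'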

Re: [ceph-users] Possible data damage: 1 pg inconsistent

2018-12-21 Thread Hervé Ballans
Hi Frank, I encounter exactly the same issue with the same disks as yours. Every day, after a batch of deep scrubbing operations, there are generally between 1 and 3 inconsistent pgs, each time on different OSDs. This could point to a problem with these disks, but: - it concerns only the pgs of
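
When a scrub flags a pg like this, the usual first step is to see exactly what mismatched before repairing; a sketch, where 2.5 stands in for the real pgid:

  # Which pgs are inconsistent, and what exactly differs between replicas
  $ ceph health detail
  $ rados list-inconsistent-obj 2.5 --format=json-pretty
  # Repair once the cause is understood
  $ ceph pg repair 2.5
  # Temporarily stopping deep scrubs, as mentioned in this thread
  $ ceph osd set nodeep-scrub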

Re: [ceph-users] Bluestore nvme DB/WAL size

2018-12-21 Thread David C
I'm in a similar situation, currently running filestore with spinners and journals on NVMe partitions which are about 1% of the size of the OSD. If I migrate to bluestore, I'll still only have that 1% available. Per the docs, if my block.db device fills up, the metadata is going to spill back onto
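
Whether a block.db has actually spilled onto the slow device shows up in the OSD's bluefs counters; a minimal check (osd.0 is a placeholder):

  # Non-zero slow_used_bytes means RocksDB has spilled past the fast partition
  $ ceph daemon osd.0 perf dump | grep -E '"(db|slow)_used_bytes"'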

Re: [ceph-users] Bluestore nvme DB/WAL size

2018-12-21 Thread Konstantin Shalygin
I am considering using logical volumes of an NVMe drive as DB or WAL devices for OSDs on spinning disks. The documentation recommends against DB devices smaller than 4% of slow disk size. Our servers have 16x 10TB HDDs and a single 1.5TB NVMe, so dividing it equally will result in each OSD
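
The arithmetic behind the question, plus the usual way a per-OSD DB volume is attached (the device, volume group and LV names are made up for illustration):

  # 4% of a 10 TB HDD  ~= 400 GB of DB per OSD
  # 16 OSDs x 400 GB   ~= 6.4 TB of NVMe, versus the 1.5 TB actually available
  # 1.5 TB / 16 OSDs   ~= roughly 90 GB of DB per OSD in practice
  $ ceph-volume lvm create --bluestore --data /dev/sdb --block.db nvme-vg/db-sdb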

Re: [ceph-users] Bluestore nvme DB/WAL size

2018-12-21 Thread Janne Johansson
On Thu, Dec 20, 2018 at 22:45, Vladimir Brik wrote: > Hello > I am considering using logical volumes of an NVMe drive as DB or WAL > devices for OSDs on spinning disks. > The documentation recommends against DB devices smaller than 4% of slow > disk size. Our servers have 16x 10TB HDDs and a