Re: [ceph-users] Hardware selection for ceph backup on ceph

2020-01-12 Thread Martin Verges
Hello Stefan,

> AMD EPYC

great choice!

> Has anybody experience with the drives?


Some of our customers run various Toshiba MG06SCA drives and, according to
them, they work great. I can't speak for the MG07ACA, but to be honest, I
don't expect a huge difference.

--
Martin Verges
Managing director

Hint: Secure one of the last slots in the upcoming 4-day Ceph Intensive
Training at https://croit.io/training/4-days-ceph-in-depth-training.

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 10. Jan. 2020 um 17:32 Uhr schrieb Stefan Priebe - Profihost AG <
s.pri...@profihost.ag>:

> Hi,
>
> we‘re currently in the process of building a new ceph cluster to backup
> rbd images from multiple ceph clusters.
>
> We would like to start with just a single ceph cluster to backup which is
> about 50tb. Compression ratio of the data is around 30% while using zlib.
> We need to scale the backup cluster up to 1pb.
>
> The workload on the original rbd images is mostly 4K writes so I expect
> rbd export-diff to do a lot of small writes.
>
> The current idea is to use the following hw as a start:
> 6 Servers with:
>  1 AMD EPYC 7302P 3GHz, 16C/32T
> 128g Memory
> 14x 12tb Toshiba Enterprise MG07ACA HDD drives 4K native
> Dual 25gb network
>
> Does it fit? Has anybody experience with the drives? Can we use EC or do
> we need to use normal replication?
>
> Greets,
> Stefan
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Failed to encode map errors

2019-12-03 Thread Martin Verges
Hello,

what versions of Ceph are you running?
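
This warning often shows up when mons, OSDs and clients are running mixed
versions, for example during or after an upgrade, so checking that first is a
good idea. A quick way to check, assuming Luminous or newer:

  ceph versions              # summary of the versions all daemons report
  ceph tell osd.* version    # per OSD, in case single daemons were not restarted

If everything is already on the same release, please share the full output.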

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Di., 3. Dez. 2019 um 19:05 Uhr schrieb John Hearns :

> And me again for the second time in one day.
>
> ceph -w is now showing messages like this:
>
> 2019-12-03 15:17:22.426988 osd.6 [WRN] failed to encode map e28961 with
> expected crc
>
> Any advice please?
>
> *Kheiron Medical Technologies*
>
> kheironmed.com | supporting radiologists with deep learning
>
> Kheiron Medical Technologies Ltd. is a registered company in England and
> Wales. This e-mail and its attachment(s) are intended for the above named
> only and are confidential. If they have come to you in error then you must
> take no action based upon them but contact us immediately. Any disclosure,
> copying, distribution or any action taken or omitted to be taken in
> reliance on it is prohibited and may be unlawful. Although this e-mail and
> its attachments are believed to be free of any virus, it is the
> responsibility of the recipient to ensure that they are virus free. If you
> contact us by e-mail then we will store your name and address to facilitate
> communications. Any statements contained herein are those of the individual
> and not the organisation.
>
> Registered number: 10184103. Registered office: 2nd Floor Stylus
> Building, 116 Old Street, London, England, EC1V 9BG
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrating from block to lvm

2019-11-15 Thread Martin Verges
I would consider doing it host by host, as you should always be able to
handle the complete loss of a node. This is much faster in the end because
you save a lot of time by not migrating data back and forth. However, it can
lead to problems if your cluster is not configured according to the
performance your hardware can actually deliver.
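
A minimal sketch of the per-host flow, assuming ceph-volume based BlueStore
OSDs and with device names and OSD ids as placeholders:

  ceph osd set noout                              # keep CRUSH as-is while the node is rebuilt
  systemctl stop ceph-osd.target                  # on the node being converted
  ceph osd destroy <id> --yes-i-really-mean-it    # per OSD: keep the id, mark it destroyed
  ceph-volume lvm zap /dev/sdX --destroy          # wipe the old block layout
  ceph-volume lvm create --osd-id <id> --data /dev/sdX
  ceph osd unset noout                            # once all OSDs on the node are up again

Reusing the OSD ids keeps the CRUSH map unchanged, so the data only has to
backfill back onto the rebuilt node instead of moving around twice.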

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 15. Nov. 2019 um 20:46 Uhr schrieb Janne Johansson <
icepic...@gmail.com>:

> Den fre 15 nov. 2019 kl 19:40 skrev Mike Cave :
>
>> So would you recommend doing an entire node at the same time or per-osd?
>>
>
> You should be able to do it per-OSD (or per-disk in case you run more than
> one OSD per disk), to minimize data movement over the network, letting
> other OSDs on the same host take a bit of the load while re-making the
> disks one by one. You can use "ceph osd reweight  0.0" to make the
> particular OSD release its data but still claim it supplies $crush-weight
> to the host, meaning the other disks will have to take its data more or
> less.
> Moving data between disks in the same host usually goes lots faster than
> over the network to other hosts.
>
> --
> May the most significant bit of your life be positive.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Updating crush location on all nodes of a cluster

2019-10-22 Thread Martin Verges
Hello Alexandre,

maybe take a look at https://www.youtube.com/watch?v=V33f7ipw9d4, where you
can see how easily Ceph CRUSH can be managed.

> 1. Changing the locations of all hosts at once
> We are worried that this will generate too much IO and network activity
> (and there is no way to pause / throttle this AFAIK). Maybe this is not
> actually an issue?


Just configure the cluster to allow slow recovery before changing the crush
map. Typical options that might help you are "osd recovery sleep
hdd|hybrid|ssd" and "osd max backfills".
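
For example, on a Nautilus cluster like yours this can be set at runtime (the
values are only a conservative starting point, revert them once the rebalance
has finished):

  ceph config set osd osd_max_backfills 1
  ceph config set osd osd_recovery_max_active 1
  ceph config set osd osd_recovery_sleep_hdd 0.1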

> 2. Changing the locations of a couple hosts to reduce data movement
> We are afraid that if we set 2 hosts to DC1, 2 hosts to DC2 and leave the
> rest as-is; Ceph will behave as if there are 3 DCs and will try and fill
> those 4 hosts with as many replicas as possible until they are full.
>

If you leave any data unsorted, you will never know which copies become
unavailable. In fact, such a setup will cause a service impact if one data
center fails.
Do you use an EC configuration suitable for a 2-DC setup, or do you use
replication and want to tolerate losing 2 copies at the same time?

> 3. Try and move PGs ahead of the change?
> Maybe we could move PGs so that each PG has a replica on an OSD of each DC
> *before* updating the crush map so that the update does not have to
> actually move any data? (which would allow us to do this at the desired
> pace)
>

Maybe PG upmap is something you can use for this, but your cluster hardware
and configuration should always be able to handle a rebalance like this
without impacting your clients. See 1.

> 4. Something else?
> Thank you for your time and your help. :)
>

You are welcome, as is every Ceph user! ;)

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Di., 22. Okt. 2019 um 11:37 Uhr schrieb Alexandre Berthaud <
alexandre.berth...@clever-cloud.com>:

> Hello everyone,
>
> We have a Ceph cluster (running 14.2.2) which has already dozens of TB of
> data and... we did not set the location of the OSD hosts. The hosts are
> located in 2 datacenters. We would like to update the locations of all the
> hosts so not all replicas end up in a single DC.
>
> We are wondering how we should go about this.
>
> 1. Changing the locations of all hosts at once
>
> We are worried that this will generate too much IO and network activity
> (and there is no way to pause / throttle this AFAIK). Maybe this is not
> actually an issue?
>
> 2. Changing the locations of a couple hosts to reduce data movement
>
> We are afraid that if we set 2 hosts to DC1, 2 hosts to DC2 and leave the
> rest as-is; Ceph will behave as if there are 3 DCs and will try and fill
> those 4 hosts with as many replicas as possible until they are full.
>
> 3. Try and move PGs ahead of the change?
>
> Maybe we could move PGs so that each PG has a replica on an OSD of each DC
> *before* updating the crush map so that the update does not have to
> actually move any data? (which would allow us to do this at the desired
> pace)
>
> 4. Something else?
>
> Thank you for your time and your help. :)
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Can't create erasure coded pools with k+m greater than hosts?

2019-10-20 Thread Martin Verges
Just don't do such setups in production. They will bring a lot of pain and
trouble and cause you problems.

Just take a cheap additional system, put some of the disks into it, and you
get a far better deployment than something like 4+2 on 3 hosts. Anything that
happens to such a minimal cluster (a kernel update, a reboot, a PSU failure,
...) will force all attached clients to stop their IO or even crash
completely, which is especially bad with VMs running on that Ceph cluster.
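
If you decide to experiment with it anyway, the usual way to get a custom
rule in is to decompile, edit and recompile the whole CRUSH map rather than
dumping a single rule (a rough sketch, file names are placeholders):

  ceph osd getcrushmap -o crush.bin
  crushtool -d crush.bin -o crush.txt     # decompile into an editable text file
  # edit crush.txt, e.g. add the 4+2-on-3-hosts rule from the article linked below
  crushtool -c crush.txt -o crush.new     # recompile
  ceph osd setcrushmap -i crush.new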

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Sa., 19. Okt. 2019 um 01:51 Uhr schrieb Chris Taylor :

> Full disclosure - I have not created an erasure code pool yet!
>
> I have been wanting to do the same thing that you are attempting and
> have these links saved. I believe this is what you are looking for.
>
> This link is for decompiling the CRUSH rules and recompiling:
>
> https://docs.ceph.com/docs/luminous/rados/operations/crush-map-edits/
>
>
> This link is for creating the EC rules for 4+2 with only 3 hosts:
>
> https://ceph.io/planet/erasure-code-on-small-clusters/
>
>
> I hope that helps!
>
>
>
> Chris
>
>
> On 2019-10-18 2:55 pm, Salsa wrote:
> > Ok, I'm lost here.
> >
> > How am I supposed to write a crush rule?
> >
> > So far I managed to run:
> >
> > #ceph osd crush rule dump test -o test.txt
> >
> > So I can edit the rule. Now I have two problems:
> >
> > 1. Whats the functions and operations to use here? Is there
> > documentation anywhere abuot this?
> > 2. How may I create a crush rule using this file? 'ceph osd crush rule
> > create ... -i test.txt' does not work.
> >
> > Am I taking the wrong approach here?
> >
> >
> > --
> > Salsa
> >
> > Sent with ProtonMail Secure Email.
> >
> > ‐‐‐ Original Message ‐‐‐
> > On Friday, October 18, 2019 3:56 PM, Paul Emmerich
> >  wrote:
> >
> >> Default failure domain in Ceph is "host" (see ec profile), i.e., you
> >> need at least k+m hosts (but at least k+m+1 is better for production
> >> setups).
> >> You can change that to OSD, but that's not a good idea for a
> >> production setup for obvious reasons. It's slightly better to write a
> >> crush rule that explicitly picks two disks on 3 different hosts
> >>
> >> Paul
> >>
> >>
> 
> >>
> >> Paul Emmerich
> >>
> >> Looking for help with your Ceph cluster? Contact us at
> >> https://croit.io
> >>
> >> croit GmbH
> >> Freseniusstr. 31h
> >> 81247 München
> >> www.croit.io
> >> Tel: +49 89 1896585 90
> >>
> >> On Fri, Oct 18, 2019 at 8:45 PM Salsa sa...@protonmail.com wrote:
> >>
> >> > I have probably misunterstood how to create erasure coded pools so I
> may be in need of some theory and appreciate if you can point me to
> documentation that may clarify my doubts.
> >> > I have so far 1 cluster with 3 hosts and 30 OSDs (10 each host).
> >> > I tried to create an erasure code profile like so:
> >> > "
> >> >
> >> > ceph osd erasure-code-profile get ec4x2rs
> >> >
> >> > ==
> >> >
> >> > crush-device-class=
> >> > crush-failure-domain=host
> >> > crush-root=default
> >> > jerasure-per-chunk-alignment=false
> >> > k=4
> >> > m=2
> >> > plugin=jerasure
> >> > technique=reed_sol_van
> >> > w=8
> >> > "
> >> > If I create a pool using this profile or any profile where K+M >
> hosts , then the pool gets stuck.
> >> > "
> >> >
> >> > ceph -s
> >> >
> >> > 
> >> >
> >> > cluster:
> >> > id: eb4aea44-0c63-4202-b826-e16ea60ed54d
> >> > health: HEALTH_WARN
> >> > Reduced data availability: 16 pgs inactive, 16 pgs incomplete
> >> > 2 pools have too many placement groups
> >

Re: [ceph-users] rgw S3 lifecycle cannot keep up

2019-10-02 Thread Martin Verges
Hello Christian,

the problem is that HDDs cannot deliver the number of IOPS required for
"~4 million small files" per day.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mi., 2. Okt. 2019 um 11:56 Uhr schrieb Christian Pedersen <
chrip...@gmail.com>:

> Hi,
>
> Using the S3 gateway I store ~4 million small files in my cluster every
> day. I have a lifecycle setup to move these files to cold storage after a
> day and delete them after two days.
>
> The default storage is SSD based and the cold storage is HDD.
>
> However the rgw lifecycle process cannot keep up with this. In a 24 hour
> period. A little less than a million files are moved per day (
> https://imgur.com/a/H52hD2h ). I have tried only enabling the delete part
> of the lifecycle, but even though it deleted from SSD storage, the result
> is the same. The screenshots are taken while there is no incoming files to
> the cluster.
>
> I'm running 5 rgw servers, but that doesn't really change anything from
> when I was running less. I've tried adjusting rgw lc max objs, but again no
> change in performance.
>
> Any suggestions on how I can tune the lifecycle process?
>
> Cheers,
> Christian
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Need advice with setup planning

2019-09-22 Thread Martin Verges
Hello Salsa,

> Amazing! Where were you 3 months ago? Only problem is that I think we have
> no moer budget for this so I can't get approval for software license.
>

We were here, and at some Ceph Days as well. We do provide a completely free
version with limited features (no HA, LDAP, ...) but with the benefit of not
having to worry about installation, upgrades and a lot of day-to-day work.
Therefore you don't need to pay us a single cent to benefit from a tested
Ceph software stack (OS, libs, userspace, ...). You can install and manage
everything needed for Ceph to provide RBD, for example to consume with
Proxmox, and every extended feature can still be used from the command line,
just as you would without our software.

Please keep in mind that our license always includes support that you would
otherwise have to pay for (with your own time, or by paying someone else).

> The service is critical and we are afraid that the network might be
> congested and QoS for the end user degrades
>

If the service is critical, a 10G network would be the choice. However, one
of our test clusters with 11 systems has dual 1GbE and works perfectly fine.
It just needs to be configured correctly with a good bonding hash policy (we
use layer3+4 in our software), and of course it will never reach the
performance of a faster network.
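
A rough sketch of such a bond on CentOS with NetworkManager (interface and
connection names are placeholders, and 802.3ad/LACP must also be configured
on the switch ports):

  nmcli con add type bond ifname bond0 con-name bond0 \
        bond.options "mode=802.3ad,miimon=100,xmit_hash_policy=layer3+4"
  nmcli con add type bond-slave ifname eno1 master bond0
  nmcli con add type bond-slave ifname eno2 master bond0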

Btw. a brand-new single-port 10G card costs only ~40€ (used from 6€), and
dual-port cards start at ~80€ (used from 20€). For a critical storage system,
that little money should always be possible to spend.

> Great! Thanks for the help and congratulations on that demo. It is the best
> I've used and the easiest ceph setup I've found. As feedback, the last part
> of the demo tutorial is not 100% compatible with the master branch from
> github.
>

Thanks for the feedback, we do need to update the videos to the newest
version but our time is unfortunately limited :/

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 20. Sept. 2019 um 18:34 Uhr schrieb Martin Verges <
martin.ver...@croit.io>:

> Hello Salsa,
>
> I have tested Ceph using VMs but never got to put it to use and had a lot
>> of trouble to get it to install.
>>
> if you want to get rid of all the troubles from installing to day2day
> operations, you could consider using https://croit.io/croit-virtual-demo
>
> - Use 2 HDDs for SO using RAID 1 (I've left 3.5TB unallocated in case I
>> can use it later for storage)
>> - Install CentOS 7.7
>>
> This is ok, but won't be necessary if you choose croit, as we boot from the
> network and don't install an operating system.
>
> - Use 2 vLANs, one for ceph internal usage and another for external
>> access. Since they've 4 network adapters, I'll try to bond them in pairs to
>> speed up network (1Gb).
>>
> If there is no internal policy that forces you to use separate networks,
> you can use a simple single-VLAN setup and bond 4*1GbE. Otherwise it's ok.
>
>
>> - I'll try to use ceph-ansible for installation. I failed to use it on
>> lab, but it seems more recommended.
>> - Install Ceph Nautilus
>>
> Ultra easy with croit, maybe look at our videos on youtube -
> https://www.youtube.com/playlist?list=PL1g9zo59diHDSJgkZcMRUq6xROzt_YKox
>
>
>> - Each server will host OSD, MON, MGR and MDS.
>>
> ok, but you should use ssd for metadata.
>
>
>> - One VM for ceph-admin: This wil be used to run ceph-ansible and maybe
>> to host some ceph services later
>>
> perfect for croit ;)
>
>
>> - I'll have to serve samba, iscsi and probably NFS too. Not sure how or
>> on which servers.
>>
> Just put it on the servers as well, with croit it is just a click away and
> everything is included in our interface.
> If not using croit, you can still install it on the same systems and
> configure it by hand/script.
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
> Am Fr., 20. Sept. 2019 um 18:14 Uhr schrieb Salsa :
>
>> I have tested Ceph using VMs but never got to put it to use and had a lot
>> of trouble to get it to install.
>>
>> Now I've been asked to do a production setup using 3 servers (Dell R740)
>> with 12 4TB each.
>>
>> My plan is this:
>> - Use 2 HDDs for SO usin

Re: [ceph-users] Need advice with setup planning

2019-09-20 Thread Martin Verges
Hello Salsa,

> I have tested Ceph using VMs but never got to put it to use and had a lot
> of trouble to get it to install.
>
if you want to get rid of all the trouble from installation to day-to-day
operations, you could consider using https://croit.io/croit-virtual-demo

> - Use 2 HDDs for SO using RAID 1 (I've left 3.5TB unallocated in case I can
> use it later for storage)
> - Install CentOS 7.7
>
This is ok, but won't be necessary if you choose croit, as we boot from the
network and don't install an operating system.

> - Use 2 vLANs, one for ceph internal usage and another for external access.
> Since they've 4 network adapters, I'll try to bond them in pairs to speed
> up network (1Gb).
>
If there is no internal policy that forces you to use separate networks, you
can use a simple single-VLAN setup and bond 4*1GbE. Otherwise it's ok.


> - I'll try to use ceph-ansible for installation. I failed to use it on
> lab, but it seems more recommended.
> - Install Ceph Nautilus
>
Ultra easy with croit, maybe look at our videos on youtube -
https://www.youtube.com/playlist?list=PL1g9zo59diHDSJgkZcMRUq6xROzt_YKox


> - Each server will host OSD, MON, MGR and MDS.
>
ok, but you should use ssd for metadata.


> - One VM for ceph-admin: This wil be used to run ceph-ansible and maybe to
> host some ceph services later
>
perfect for croit ;)


> - I'll have to serve samba, iscsi and probably NFS too. Not sure how or on
> which servers.
>
Just put it on the servers as well, with croit it is just a click away and
everything is included in our interface.
If not using croit, you can still install it on the same systems and
configure it by hand/script.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 20. Sept. 2019 um 18:14 Uhr schrieb Salsa :

> I have tested Ceph using VMs but never got to put it to use and had a lot
> of trouble to get it to install.
>
> Now I've been asked to do a production setup using 3 servers (Dell R740)
> with 12 4TB each.
>
> My plan is this:
> - Use 2 HDDs for SO using RAID 1 (I've left 3.5TB unallocated in case I
> can use it later for storage)
> - Install CentOS 7.7
> - Use 2 vLANs, one for ceph internal usage and another for external
> access. Since they've 4 network adapters, I'll try to bond them in pairs to
> speed up network (1Gb).
> - I'll try to use ceph-ansible for installation. I failed to use it on
> lab, but it seems more recommended.
> - Install Ceph Nautilus
> - Each server will host OSD, MON, MGR and MDS.
> - One VM for ceph-admin: This wil be used to run ceph-ansible and maybe to
> host some ceph services later
> - I'll have to serve samba, iscsi and probably NFS too. Not sure how or on
> which servers.
>
> Am I missing anything? Am I doing anything "wrong"?
>
> I searched for some actual guidance on setup but I couldn't find anything
> complete, like a good tutorial or reference based on possible use-cases.
>
> So, is there any suggestions you could share or links and references I
> should take a look?
>
> Thanks;
>
> --
> Salsa
>
> Sent with ProtonMail <https://protonmail.com> Secure Email.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Debian Buster builds

2019-07-07 Thread Martin Verges
Hello,

you still need to use other mirrors, as Debian Buster itself only
provides 12.2.11 packages (https://packages.debian.org/buster/ceph).

We at croit.io maintain (unofficial) Nautilus builds for Buster
here: https://mirror.croit.io/debian-nautilus/ (signed with
https://mirror.croit.io/keys/release.asc).
These versions are used in our free and commercial management software
and are all tested. They are 100% pure Ceph, so you could compile from
the Ceph sources yourself or simply use our packages.

You find more about how to install and use our mirror at
https://croit.io/2019/07/07/2019-07-07-debian-mirror.
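
If you just want to try it, the setup boils down to importing the key and
adding the repository. Treat the following as a sketch and check the blog
post above for the exact suite/component line:

  curl -fsSL https://mirror.croit.io/keys/release.asc | apt-key add -
  echo 'deb https://mirror.croit.io/debian-nautilus/ buster main' \
      > /etc/apt/sources.list.d/croit-ceph.list
  apt update && apt install ceph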

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am So., 7. Juli 2019 um 11:45 Uhr schrieb Thore Krüss :
>
> Good evening,
> since Buster is now officially stable, what's the way to proceed to get 
> packages
> for mimic and nautilus?
>
> Best regards
> Thore
>
>
> On Tue, Jun 18, 2019 at 05:11:25PM +0200, Tobias Gall wrote:
> > Hello,
> >
> > I would like to switch to debian buster and test the upgrade from luminous
> > but there are currently no ceph releases/builds for buster.
> >
> > Debian Buster will be released next month[1].
> >
> > Please resume the debian builds as stated in the release notes for mimic[2].
> >
> > Will there be a luminous build for buster?
> >
> > [1] https://lists.debian.org/debian-devel-announce/2019/06/msg3.html
> > [2] https://github.com/ceph/ceph/pull/22602/files
> >
> >
> > Thank you and best regards
> > Tobias
> >
> >
> >
>
>
>
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Two questions about ceph update/upgrade strategies

2019-06-05 Thread Martin Verges
Hello Rainer,

most of the time you just install the newer packages and restart the
daemons without having to worry much about the sequence.

Otherwise, just use a management solution that helps you with any
day-to-day operation, including the complete software update part.
Something like what you can see in this video:
https://www.youtube.com/watch?v=Jrnzlylidjs
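
A minimal sketch of such a minor upgrade on your Ubuntu hosts (package and
unit names as on a standard install; do the mon hosts first, one host at a
time, then the OSD hosts):

  ceph versions                        # see what is running before you start
  apt update && apt install --only-upgrade ceph ceph-mon ceph-osd ceph-mgr
  systemctl restart ceph-mon.target    # on the mon hosts, one after another
  ceph osd set noout                   # avoid rebalancing during the OSD restarts
  systemctl restart ceph-osd.target    # per host, wait until all PGs are active+clean
  ceph osd unset noout
  ceph versions                        # verify everything reports the new version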

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

Am Di., 4. Juni 2019 um 11:19 Uhr schrieb Rainer Krienke
:
>
> I have a fresh ceph 14.2.1 cluster up and running based on Ubuntu 18.04.
> It consists of 9 hosts (+1 admin host). The nine hosts have each 16
> ceph-osd daemons running, three in these nine hosts also have
> a ceph-mon and a ceph-mgr daemon running. So three hosts are running
> osd, mon and also mgr daemons.
>
> Now I am unsure about the right way to go for ceph upgrades and linux
> system host updates.
>
> Ceph-Upgrade:
> Reading the ceph upgrade docs I ask myself how a future upgrade say to
> 14.2.2 should be performed correctly? The recommendation says to upgrade
> first monitors, then osds etc...
>
> So what is the correct way to go in a mixed setup like mine? Following
> the rules strictly would mean not to use ceph-deploy install, but
> instead to log into the mon(/osd) hosts and then upgrade only the
> ceph-mon package and restart this mon, and then do the same with the
> other monitors/osd hosts. After all mons have been successfully upgraded
> I should then continue with upgrading OSDs (ceph-osd package) on one
> host and restart all osds on this host one after another or reboot the
> whole host. Then proceed to the next osd-host.
>
> Is this the correct and best way to go?
>
> Linux system updates:
> The second point I would like to hear your opinions about is how you
> handle linux system updates? Since even a non ceph linux system package
> update might break ceph or even stop the whole linux host from booting,
> care has to be taken. So how do you handle this problem? Do you run host
> upgrades only manually in a fixed sequence eg first on a osd/mon host
> and if the update is successful, then run the linux system package
> updates on the other hosts?   Do you use another strategy?
>
> Thanks
> Rainer
> --
> Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse 1
> 56070 Koblenz, Tel: +49261287 1312 Fax +49261287 100 1312
> Web: http://userpages.uni-koblenz.de/~krienke
> PGP: http://userpages.uni-koblenz.de/~krienke/mypgp.html
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] bluestore block.db on SSD, where block.wal?

2019-06-03 Thread Martin Verges
Hello,

please see
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg54607.html and
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030740.html
.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am So., 2. Juni 2019 um 18:43 Uhr schrieb M Ranga Swami Reddy <
swamire...@gmail.com>:

> Hello - I planned to use the bluestore's block.db on SSD (and data is on
> HDD) with 4% of HDD size. Here I have not mentioned the block.wal..in this
> case where block.wal place?
> is it in HDD (ie data) or in block.db of SSD?
>
> Thanks
> Swami
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Using Ceph Ansible to Add Nodes to Cluster at Weight 0

2019-05-30 Thread Martin Verges
Hello Mike,

there is no problem adding 100 OSDs at the same time if your cluster is
configured correctly.
Just add the OSDs and let the cluster rebalance slowly (as fast as your
hardware supports without service interruption).
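
If you still prefer the weight-0 approach, a rough sketch (option names and
weights are only examples):

  # in the [osd] section of ceph.conf on the new nodes, before deployment:
  #   osd_crush_initial_weight = 0
  ceph config set osd osd_max_backfills 1         # throttle the data movement
  ceph config set osd osd_recovery_sleep_hdd 0.1
  ceph osd crush reweight osd.380 2.0             # then raise each new OSD in
                                                  # steps up to its real weight

Personally, I would keep the full weights and only throttle the recovery.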

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 30. Mai 2019 um 02:00 Uhr schrieb Mike Cave :

> Good afternoon,
>
>
>
> I’m about to expand my cluster from 380 to 480 OSDs (5 nodes with 20 disks
> per node) and am trying to determine the best way to go about this task.
>
>
>
> I deployed the cluster with ceph ansible and everything worked well. So
> I’d like to add the new nodes with ceph ansible as well.
>
>
>
> The issue I have is adding that many OSDs at once will likely cause a huge
> issue with the cluster if they come in fully weighted.
>
>
>
> I was hoping to use ceph ansible and set the initial weight to zero and
> then gently bring them up to the correct weight for each OSD.
>
>
>
> I will be doing this with a total of 380 OSDs over the next while. My plan
> is to bring in groups of 6 nodes (I have six racks and the map is
> rack-redundant) until I’m completed on the additions.
>
>
>
> In dev I tried bringing in a node while the cluster was in ‘no rebalance’
> mode and there was still significant movement with some stuck pgs and other
> oddities until I reweighted and then unset ‘no rebalance’.
>
>
>
> I’d like a s little friction for the cluster as possible as it is in heavy
> use right now.
>
>
>
> I’m running mimic (13.2.5) on CentOS.
>
>
>
> Any suggestions on best practices for this?
>
>
>
> Thank you for reading and any help you might be able provide. I’m happy to
> provide any details you might want.
>
>
>
> Cheers,
>
> Mike
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] performance in a small cluster

2019-05-29 Thread Martin Verges
Hello Robert,

> We have identified the performance settings in the BIOS as a major factor

could you share which options you changed to increase performance, and could
you provide some numbers for it?

Many thanks in advance

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mi., 29. Mai 2019 um 09:36 Uhr schrieb Robert Sander <
r.san...@heinlein-support.de>:

> Am 24.05.19 um 14:43 schrieb Paul Emmerich:
> > * SSD model? Lots of cheap SSDs simply can't handle more than that
>
> The customer currently has 12 Micron 5100 1,92TB (Micron_5100_MTFDDAK1)
> SSDs and will get a batch of Micron 5200 in the next days
>
> We have identified the performance settings in the BIOS as a major
> factor. Ramping that up we got a remarkable performance increase.
>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Linux: Akademie - Support - Hosting
> http://www.heinlein-support.de
>
> Tel: 030-405051-43
> Fax: 030-405051-19
>
> Zwangsangaben lt. §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD Sizing for DB/WAL: 4% for large drives?

2019-05-28 Thread Martin Verges
Hello Jake,

you can use 2.2% as well, and performance will most of the time be better
than without a DB/WAL device. However, once the DB/WAL fills up, a spillover
to the regular drive happens and performance drops as if you had no DB/WAL
drive at all.

I believe you can use "ceph daemon osd.X perf dump" and look at
"db_used_bytes" and "wal_used_bytes" to monitor this, but without guarantee
from my side. As far as I know, it is ok to choose values from 2-4% depending
on your usage and configuration.
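
For example, something like this should show the DB usage and any spillover
to the slow device (the OSD id is a placeholder; the counters sit in the
bluefs section of the dump):

  ceph daemon osd.0 perf dump | grep -E 'db_(total|used)_bytes|wal_(total|used)_bytes|slow_used_bytes'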

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Di., 28. Mai 2019 um 18:28 Uhr schrieb Jake Grimmett <
j...@mrc-lmb.cam.ac.uk>:

> Hi Martin,
>
> thanks for your reply :)
>
> We already have a separate NVMe SSD pool for cephfs metadata.
>
> I agree it's much simpler & more robust not using a separate DB/WAL, but
> as we have enough money for a 1.6TB SSD for every 6 HDD, so it's
> tempting to go down that route. If people think a 2.2% DB/WAL is a bad
> idea, we will definitely have a re-think.
>
> Perhaps I'm being greedy for more performance; we have a 250 node HPC
> cluster, and it would be nice to see how cephfs compares to our beegfs
> scratch.
>
> best regards,
>
> Jake
>
>
> On 5/28/19 3:14 PM, Martin Verges wrote:
> > Hello Jake,
> >
> > do you have any latency requirements that you do require the DB/WAL at
> all?
> > If not, CephFS with EC on SATA HDD works quite well as long as you have
> > the metadata on a separate ssd pool.
> >
> > --
> > Martin Verges
> > Managing director
> >
> > Mobile: +49 174 9335695
> > E-Mail: martin.ver...@croit.io <mailto:martin.ver...@croit.io>
> > Chat: https://t.me/MartinVerges
> >
> > croit GmbH, Freseniusstr. 31h, 81247 Munich
> > CEO: Martin Verges - VAT-ID: DE310638492
> > Com. register: Amtsgericht Munich HRB 231263
> >
> > Web: https://croit.io
> > YouTube: https://goo.gl/PGE1Bx
> >
> >
> > Am Di., 28. Mai 2019 um 15:13 Uhr schrieb Jake Grimmett
> > mailto:j...@mrc-lmb.cam.ac.uk>>:
> >
> > Dear All,
> >
> > Quick question regarding SSD sizing for a DB/WAL...
> >
> > I understand 4% is generally recommended for a DB/WAL.
> >
> > Does this 4% continue for "large" 12TB drives, or can we  economise
> and
> > use a smaller DB/WAL?
> >
> > Ideally I'd fit a smaller drive providing a 266GB DB/WAL per 12TB
> OSD,
> > rather than 480GB. i.e. 2.2% rather than 4%.
> >
> > Will "bad things" happen as the OSD fills with a smaller DB/WAL?
> >
> > By the way the cluster will mainly be providing CephFS, fairly large
> > files, and will use erasure encoding.
> >
> > many thanks for any advice,
> >
> > Jake
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com <mailto:ceph-users@lists.ceph.com>
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD Sizing for DB/WAL: 4% for large drives?

2019-05-28 Thread Martin Verges
Hello Jake,

do you have any latency requirements that make the DB/WAL necessary at all?
If not, CephFS with EC on SATA HDDs works quite well as long as you have the
metadata on a separate SSD pool.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Di., 28. Mai 2019 um 15:13 Uhr schrieb Jake Grimmett <
j...@mrc-lmb.cam.ac.uk>:

> Dear All,
>
> Quick question regarding SSD sizing for a DB/WAL...
>
> I understand 4% is generally recommended for a DB/WAL.
>
> Does this 4% continue for "large" 12TB drives, or can we  economise and
> use a smaller DB/WAL?
>
> Ideally I'd fit a smaller drive providing a 266GB DB/WAL per 12TB OSD,
> rather than 480GB. i.e. 2.2% rather than 4%.
>
> Will "bad things" happen as the OSD fills with a smaller DB/WAL?
>
> By the way the cluster will mainly be providing CephFS, fairly large
> files, and will use erasure encoding.
>
> many thanks for any advice,
>
> Jake
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Post-mortem analisys?

2019-05-13 Thread Martin Verges
Hello Marco,

first of all, hyperconverged setups with publicly accessible VMs can be
affected by DDoS attacks or other harmful events that cause cascading
errors in your infrastructure.

Are you sure your network worked correctly at the time?

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mo., 13. Mai 2019 um 11:43 Uhr schrieb Marco Gaiarin :

>
> [I't is not really a 'mortem', but...]
>
>
> Saturday afternoon, my 3-nodes proxmox ceph cluster have a big
> 'slowdown', that started at 12:35:24 with some OOM condition in one of
> the 3 storage nodes, followed with also OOM on another node, at
> 12:43:31.
>
> After that, all bad things happens: stuck requests, SCSI timeout on
> VMs, MONs flip-flop on RBD clients.
>
> I make a 'ceph -s' every hour, so at 14:17:01 i had at two nodes:
>
> cluster 8794c124-c2ec-4e81-8631-742992159bd6
>  health HEALTH_WARN
> 26 requests are blocked > 32 sec
>  monmap e9: 5 mons at {2=
> 10.27.251.11:6789/0,3=10.27.251.12:6789/0,4=10.27.251.9:6789/0,blackpanther=10.27.251.2:6789/0,capitanmarvel=10.27.251.8:6789/0
> }
> election epoch 3930, quorum 0,1,2,3,4
> blackpanther,capitanmarvel,4,2,3
>  osdmap e15713: 12 osds: 12 up, 12 in
>   pgmap v67358590: 768 pgs, 3 pools,  GB data, 560 kobjects
> 6639 GB used, 11050 GB / 17689 GB avail
>  768 active+clean
>   client io 266 kB/s wr, 25 op/s
>
> and on the third:
> cluster 8794c124-c2ec-4e81-8631-742992159bd6
>  health HEALTH_WARN
> 5 mons down, quorum
>  monmap e9: 5 mons at {2=
> 10.27.251.11:6789/0,3=10.27.251.12:6789/0,4=10.27.251.9:6789/0,blackpanther=10.27.251.2:6789/0,capitanmarvel=10.27.251.8:6789/0
> }
> election epoch 3931, quorum
>  osdmap e15713: 12 osds: 12 up, 12 in
>   pgmap v67358598: 768 pgs, 3 pools,  GB data, 560 kobjects
> 6639 GB used, 11050 GB / 17689 GB avail
>  767 active+clean
>1 active+clean+scrubbing
>   client io 617 kB/s wr, 70 op/s
>
>
> At that hour, the site served by the cluster was just closed (eg, no
> users). The only task running, looking at logs, seems a backup
> (bacula), but was just saving catalog, eg database workload on a
> container, and ended at 14.27.
>
>
> All that continue, more or less, till sunday morning, then all goes
> back as normal.
> Seems there was no hardware failures on nodes.
>
> Backup tasks (all VM/LXC backups) on saturday night competed with no
> errors.
>
>
> Someone can provide some hint on how to 'correlate' various logs, and
> so (try to) understand what happens?
>
>
> Thanks.
>
> --
> dott. Marco Gaiarin GNUPG Key ID:
> 240A3D66
>   Associazione ``La Nostra Famiglia''
> http://www.lanostrafamiglia.it/
>   Polo FVG   -   Via della Bontà, 7 - 33078   -   San Vito al Tagliamento
> (PN)
>   marco.gaiarin(at)lanostrafamiglia.it   t +39-0434-842711   f
> +39-0434-842797
>
> Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA!
>   http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
> (cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA)
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Poor performance for 512b aligned "partial" writes from Windows guests in OpenStack + potential fix

2019-05-10 Thread Martin Verges
yes, we recommend this as a precaution to get the best possible IO
performance for all workloads and usage scenarios. 512e brings no advantage
and in some cases can even be a performance disadvantage. By the way, 4kN and
512e drives cost exactly the same at our dealers.

Whether the underlying physical disks really make a difference in the
individual case of virtual disks, I can't say.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 10. Mai 2019 um 10:54 Uhr schrieb Trent Lloyd <
trent.ll...@canonical.com>:

> Note that the issue I am talking about here is how a "Virtual" Ceph RBD
> disk is presented to a virtual guest, and specifically for Windows guests
> (Linux guests are not affected). I am not at all talking about how the
> physical disks are presented to Ceph itself (although Martin was, he wasn't
> clear whether changing these underlying physical disks to 4kn was for Ceph
> or other environments).
>
> I would not expect that having your underlying physical disk presented to
> Ceph itself as 512b/512e or 4kn to have a significant impact on performance
> for the reason that Linux systems generally send 4k-aligned I/O anyway
> (regardless of what the underlying disk is reporting for
> physical_block_size). There may be some exceptions to that, such as
> applications performing Direct I/O to the disk. If anyone knows otherwise,
> it would be great to hear specific details.
>
> Regards,
> Trent
>
> On Fri, May 10, 2019 at 4:40 PM Marc Roos 
> wrote:
>
>>
>> Hmmm, so if I have (wd) drives that list this in smartctl output, I
>> should try and reformat them to 4k, which will give me better
>> performance?
>>
>> Sector Sizes: 512 bytes logical, 4096 bytes physical
>>
>> Do you have a link to this download? Can only find some .cz site with
>> the rpms.
>>
>>
>> -Original Message-
>> From: Martin Verges [mailto:martin.ver...@croit.io]
>> Sent: vrijdag 10 mei 2019 10:21
>> To: Trent Lloyd
>> Cc: ceph-users
>> Subject: Re: [ceph-users] Poor performance for 512b aligned "partial"
>> writes from Windows guests in OpenStack + potential fix
>>
>> Hello Trent,
>>
>> many thanks for the insights. We always suggest to use 4kN over 512e
>> HDDs to our users.
>>
>> As we recently found out, is that WD Support offers a tool called HUGO
>> to reformat 512e to 4kN drives with "hugo format -m  -n
>> max --fastformat -b 4096" in seconds.
>> Maybe that helps someone that has bought the wrong disk.
>>
>> --
>> Martin Verges
>> Managing director
>>
>> Mobile: +49 174 9335695
>> E-Mail: martin.ver...@croit.io
>> Chat: https://t.me/MartinVerges
>>
>> croit GmbH, Freseniusstr. 31h, 81247 Munich
>> CEO: Martin Verges - VAT-ID: DE310638492 Com. register: Amtsgericht
>> Munich HRB 231263
>>
>> Web: https://croit.io
>> YouTube: https://goo.gl/PGE1Bx
>>
>>
>>
>> Am Fr., 10. Mai 2019 um 10:00 Uhr schrieb Trent Lloyd
>> :
>>
>>
>> I recently was investigating a performance problem for a
>> reasonably
>> sized OpenStack deployment having around 220 OSDs (3.5" 7200 RPM SAS
>> HDD) with NVMe Journals. The primary workload is Windows guests backed
>> by Cinder RBD volumes.
>> This specific deployment is Ceph Jewel (FileStore +
>> SimpleMessenger) which while it is EOL, the issue is reproducible on
>> current versions and also on BlueStore however for different reasons
>> than FileStore.
>>
>>
>> Generally the Ceph cluster was suffering from very poor outlier
>> performance, the numbers change a little bit depending on the exact
>> situation but roughly 80% of I/O was happening in a "reasonable" time of
>> 0-200ms but 5-20% of I/O operations were taking excessively long
>> anywhere from 500ms through to 10-20+ seconds. However the normal
>> metrics for commit and apply latency were normal, and in fact, this
>> latency was hard to spot in the performance metrics available in jewel.
>>
>> Previously I more simply considered FileStore to have the
>> "commit"
>> (to journal) stage where it was written to the journal and it is OK to
>> return to the client and then the "apply" (to disk) stage where it was
>> flushed to disk and confirmed so that the data could

Re: [ceph-users] Poor performance for 512b aligned "partial" writes from Windows guests in OpenStack + potential fix

2019-05-10 Thread Martin Verges
Hello,

I'm not yet sure if I'm allowed to share the files, but if you find one of
those, you can verify the md5sum.

27d2223d66027d8e989fc07efb2df514  hugo-6.8.0.i386.deb.zip
b7db78c3927ef3d53eb2113a4e369906  hugo-6.8.0.i386.rpm.zip
9a53ed8e201298de6da7ac6a7fd9dba0  hugo-6.8.0.i386.tar.gz.zip
2deaa31186adb36b92016a252b996e70  HUGO-6.8.0.win32.zip
cd031ca8bf47b8976035d08125a2c591  HUGO-6.8.0.win64.zip
b9d90bb70415c4c5ec29dc04180c65a8  HUGO-6.8.0.winArm64.zip
6d4fc696de0b0f95b54fccdb096e634f  hugo-6.8.0.x86_64.deb.zip
12f8e39dc3cdd6c03e4eb3809a37ce65  hugo-6.8.0.x86_64.rpm.zip
545527fbb28af0c0ff4611fa20be0460  hugo-6.8.0.x86_64.tar.gz.zip

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 10. Mai 2019 um 10:40 Uhr schrieb Marc Roos <
m.r...@f1-outsourcing.eu>:

>
> Hmmm, so if I have (wd) drives that list this in smartctl output, I
> should try and reformat them to 4k, which will give me better
> performance?
>
> Sector Sizes: 512 bytes logical, 4096 bytes physical
>
> Do you have a link to this download? Can only find some .cz site with
> the rpms.
>
>
> -Original Message-
> From: Martin Verges [mailto:martin.ver...@croit.io]
> Sent: vrijdag 10 mei 2019 10:21
> To: Trent Lloyd
> Cc: ceph-users
> Subject: Re: [ceph-users] Poor performance for 512b aligned "partial"
> writes from Windows guests in OpenStack + potential fix
>
> Hello Trent,
>
> many thanks for the insights. We always suggest to use 4kN over 512e
> HDDs to our users.
>
> As we recently found out, is that WD Support offers a tool called HUGO
> to reformat 512e to 4kN drives with "hugo format -m  -n
> max --fastformat -b 4096" in seconds.
> Maybe that helps someone that has bought the wrong disk.
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492 Com. register: Amtsgericht
> Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
>
> Am Fr., 10. Mai 2019 um 10:00 Uhr schrieb Trent Lloyd
> :
>
>
> I recently was investigating a performance problem for a
> reasonably
> sized OpenStack deployment having around 220 OSDs (3.5" 7200 RPM SAS
> HDD) with NVMe Journals. The primary workload is Windows guests backed
> by Cinder RBD volumes.
> This specific deployment is Ceph Jewel (FileStore +
> SimpleMessenger) which while it is EOL, the issue is reproducible on
> current versions and also on BlueStore however for different reasons
> than FileStore.
>
>
> Generally the Ceph cluster was suffering from very poor outlier
> performance, the numbers change a little bit depending on the exact
> situation but roughly 80% of I/O was happening in a "reasonable" time of
> 0-200ms but 5-20% of I/O operations were taking excessively long
> anywhere from 500ms through to 10-20+ seconds. However the normal
> metrics for commit and apply latency were normal, and in fact, this
> latency was hard to spot in the performance metrics available in jewel.
>
> Previously I more simply considered FileStore to have the "commit"
> (to journal) stage where it was written to the journal and it is OK to
> return to the client and then the "apply" (to disk) stage where it was
> flushed to disk and confirmed so that the data could be purged from the
> journal. However there is really a third stage in the middle where
> FileStore submits the I/O to the operating system and this is done
> before the lock on the object is released. Until that succeeds another
> operation cannot write to the same object (generally being a 4MB area of
> the disk).
>
> I found that the fstore_op threads would get stuck for hundreds of
> MS or more inside of pwritev() which was blocking inside of the kernel.
> Normally we expect pwritev() to be buffered I/O into the page cache and
> return quite fast however in this case the kernel was in a few percent
> of cases blocking with the stack trace included at the end of the e-mail
> [1]. My finding from that stack is that inside __block_write_begin_int
> we see a call to out_of_line_wait_on_bit call which is really an inlined
> call for wait_on_buffer which occurs in linux/fs/buffer.c in the section
> around line 2000-2024 with the comment "If we issued read requests - let
> them complete."
>

Re: [ceph-users] Poor performance for 512b aligned "partial" writes from Windows guests in OpenStack + potential fix

2019-05-10 Thread Martin Verges
Hello Trent,

many thanks for the insights. We always suggest 4kN over 512e HDDs to our
users.

As we recently found out, WD Support offers a tool called HUGO that reformats
512e drives to 4kN with "hugo format -m  -n max --fastformat -b 4096" in a
matter of seconds.
Maybe that helps someone who has bought the wrong disks.
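
To check what a drive currently reports, and to verify the result afterwards
(the device name is a placeholder; note that changing the sector size wipes
the drive):

  lsblk -o NAME,LOG-SEC,PHY-SEC /dev/sdX
  smartctl -a /dev/sdX | grep -i 'sector size'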

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 10. Mai 2019 um 10:00 Uhr schrieb Trent Lloyd <
trent.ll...@canonical.com>:

> I recently was investigating a performance problem for a reasonably sized
> OpenStack deployment having around 220 OSDs (3.5" 7200 RPM SAS HDD) with
> NVMe Journals. The primary workload is Windows guests backed by Cinder RBD
> volumes.
> This specific deployment is Ceph Jewel (FileStore + SimpleMessenger) which
> while it is EOL, the issue is reproducible on current versions and also on
> BlueStore however for different reasons than FileStore.
>
> Generally the Ceph cluster was suffering from very poor outlier
> performance, the numbers change a little bit depending on the exact
> situation but roughly 80% of I/O was happening in a "reasonable" time of
> 0-200ms but 5-20% of I/O operations were taking excessively long anywhere
> from 500ms through to 10-20+ seconds. However the normal metrics for commit
> and apply latency were normal, and in fact, this latency was hard to spot
> in the performance metrics available in jewel.
>
> Previously I more simply considered FileStore to have the "commit" (to
> journal) stage where it was written to the journal and it is OK to return
> to the client and then the "apply" (to disk) stage where it was flushed to
> disk and confirmed so that the data could be purged from the journal.
> However there is really a third stage in the middle where FileStore submits
> the I/O to the operating system and this is done before the lock on the
> object is released. Until that succeeds another operation cannot write to
> the same object (generally being a 4MB area of the disk).
>
> I found that the fstore_op threads would get stuck for hundreds of MS or
> more inside of pwritev() which was blocking inside of the kernel. Normally
> we expect pwritev() to be buffered I/O into the page cache and return quite
> fast however in this case the kernel was in a few percent of cases blocking
> with the stack trace included at the end of the e-mail [1]. My finding from
> that stack is that inside __block_write_begin_int we see a call to
> out_of_line_wait_on_bit call which is really an inlined call for
> wait_on_buffer which occurs in linux/fs/buffer.c in the section around line
> 2000-2024 with the comment "If we issued read requests - let them
> complete." (
> https://github.com/torvalds/linux/blob/a2d635decbfa9c1e4ae15cb05b68b2559f7f827c/fs/buffer.c#L2002
> )
>
> My interpretation of that code is that for Linux to store a write in the
> page cache, it has to have the entire 4K page as that is the granularity of
> which it tracks the dirty state and it needs the entire 4K page to later
> submit back to the disk. Since we wrote a part of the page, and the page
> wasn't already in the cache, it has to fetch the remainder of the page from
> the disk. When this happens, it blocks waiting for this read to complete
> before returning from the pwritev() call - hence our normally buffered
> write blocks. This holds up the tp_fstore_op thread, of which there are (by
> default) only 2-4 such threads trying to process several hundred operations
> per second. Additionally the size of the osd_op_queue is bounded, and
> operations do not clear out of this queue until the tp_fstore_op thread is
> done. Which ultimately means that not only are these partial writes delayed
> but it knocks on to delay other writes behind them because of the
> constrained thread pools.
>
> What was further confusing to this, is that I could easily reproduce this
> in a test deployment using an rbd benchmark that was only writing to a
> total disk size of 256MB which I would easily have expected to fit in the
> page cache:
> rbd create -p rbd --size=256M bench2
> rbd bench-write -p rbd bench2 --io-size 512 --io-threads 256 --io-total
> 256M --io-pattern rand
>
> This is explained by the fact that on secondary OSDs (at least, there was
> some refactoring of fadvise which I have not fully understood as of yet),
> FileStore is using fadvise FADVISE_DONTNEED on the objects after write
> which causes the kernel to immediately discard them from the page cache
> wi

Re: [ceph-users] Ceph cluster available to clients with 2 different VLANs ?

2019-05-03 Thread Martin Verges
Hello,

configure a gateway on your router, or use a good rack switch that provides
such features, and use layer-3 routing to connect the different VLANs / IP
zones.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 3. Mai 2019 um 10:21 Uhr schrieb Hervé Ballans <
herve.ball...@ias.u-psud.fr>:

> Hi all,
>
> I have a Ceph cluster on Luminous 12.2.10 with 3 mon and 6 osd servers.
> My current network settings is a separated public and cluster (private
> IP) network.
>
> I would like my cluster available to clients on another VLAN than the
> default one (which is the public network on ceph.conf)
>
> Is it possible ? How can I achieve that ?
> For information, each node still has two unused network cards.
>
> Thanks for any suggestions,
>
> Hervé
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] hardware requirements for metadata server

2019-05-01 Thread Martin Verges
Hello,

you should put your metadata on a fast (SSD/NVMe) pool. The size depends on
your data, but you can scale it anytime, as you know it from Ceph itself.
Maybe just start with 3 SSDs in 3 servers and see how it goes.
For CPU / RAM it's a bit the same: you need a few gigs for smaller and
maybe more for bigger deployments. Maybe you can provide some insights
about your typical data (size, count, ..) and don't forget, you can scale by
adding additional active MDS daemons as well.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mi., 1. Mai 2019 um 02:08 Uhr schrieb Manuel Sopena Ballesteros <
manuel...@garvan.org.au>:

> Dear ceph users,
>
>
>
> I would like to ask, does the metadata server needs much block devices for
> storage? Or does it only needs RAM? How could I calculate the amount of
> disks and/or memory needed?
>
>
>
> Thank you very much
>
>
>
>
>
> Manuel Sopena Ballesteros
>
> Big Data Engineer | Kinghorn Centre for Clinical Genomics
>
>  <https://www.garvan.org.au/>
>
>
> *a:* 384 Victoria Street, Darlinghurst NSW 2010
> *p:* +61 2 9355 5760  |  +61 4 12 123 123
> *e:* manuel...@garvan.org.au
>
> Like us on Facebook <http://www.facebook.com/garvaninstitute> | Follow us
> on Twitter <http://twitter.com/GarvanInstitute> and LinkedIn
> <http://www.linkedin.com/company/garvan-institute-of-medical-research>
>
>
> NOTICE
> Please consider the environment before printing this email. This message
> and any attachments are intended for the addressee named and may contain
> legally privileged/confidential/copyright information. If you are not the
> intended recipient, you should not read, use, disclose, copy or distribute
> this communication. If you have received this message in error please
> notify us at once by return email and then delete both messages. We accept
> no liability for the distribution of viruses or similar in electronic
> communications. This notice should not be removed.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Access cephfs from second public network

2019-03-22 Thread Martin Verges
Hello,

if you want to use multiple IP networks, just use IP routing. Today's
switches support such a feature at line rate with great performance.
Please keep your firewall/security layer in mind when you use routed networks.

I don't believe it would be possible to use any other proxy /
modification / ... technology to connect as a Ceph client to a Ceph
storage within a different IP network.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 22. März 2019 um 11:32 Uhr schrieb Andres Rojas Guerrero
:
>
> Hi, thank's for the answer, we have seen that the client only have OSD
> map for first public network ...
>
> # cat
> /sys/kernel/debug/ceph/88f62260-b8de-499f-b6fe-5eb66a967083.client360894/osdmap
>
>
> you say that the cepfs clients have the same view of the cluster that
> have de MON's, that's mean that MON's doesn't have the right vision of
> the OSD from the second public network?
>
>
>
>
> On 22/3/19 9:54, Burkhard Linke wrote:
> > Hi,
> >
> >
> > just my 2 ct:
> >
> >
> > Clients do not access the MDS directly, e.g. you do not use the IP
> > address of a MDS in the mount command.
> >
> >
> > Clients contact the MONs and retrieve the MDS map, which contains all
> > filesystems and their corresponding MDS server(s). The same is also true
> > for the OSDs. The clients thus use the MONs' view of the cluster. I'm
> > not familar with multiple IP address setups, but I would bet that a
> > client is not able to select the correct public network.
> >
> >
> > You can check your setup on the OSD/MDS hosts (are the OSDs/MDSs bound
> > to the second network at all?), and in /sys/kernel/debug/ceph/<cluster fsid>.<client id>/. The latter contains the current versions of the MON map, MDS
> > map and OSD map. These are the settings used by the client to contact
> > the corresponding daemon (assuming kernel cephfs client, ceph-fuse is
> > different).
> >
> >
> > Regards,
> >
> > Burkhard
> >
>
> --
> ***
> Andrés Rojas Guerrero
> Unidad Sistemas Linux
> Area Arquitectura Tecnológica
> Secretaría General Adjunta de Informática
> Consejo Superior de Investigaciones Científicas (CSIC)
> Pinar 19
> 28006 - Madrid
> Tel: +34 915680059 -- Ext. 990059
> email: a.ro...@csic.es
> ID comunicate.csic.es: @50852720l:matrix.csic.es
> ***
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v14.2.0 Nautilus released

2019-03-21 Thread Martin Verges
Hello,

we strongly believe it would be good for Ceph to have the packages
directly on the official Debian mirrors, but for everyone out there
having trouble with Ceph on Debian we are glad to help.
If Ceph is not available on Debian, it might affect a lot of other
software, for example Proxmox.

You can find Ceph Nautilus 14.2.0 for Debian 10 Buster on our public mirror.

$ curl https://mirror.croit.io/keys/release.asc | apt-key add -
$ echo 'deb https://mirror.croit.io/debian-nautilus/ buster main' \
    >> /etc/apt/sources.list.d/croit-ceph.list

If we can help to get the packages on the official mirrors, please
feel free to contact us!

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mi., 20. März 2019 um 20:49 Uhr schrieb Ronny Aasen
:
>
>
> with Debian buster frozen, If there are issues with ceph on debian that
> would best be fixed in debian, now is the last chance to get anything
> into buster before the next release.
>
> it is also important to get mimic and luminous packages built for
> Buster. Since you want to avoid a situation where you have to upgrade
> both the OS and ceph at the same time.
>
> kind regards
> Ronny Aasen
>
>
>
> On 20.03.2019 07:09, Alfredo Deza wrote:
> > There aren't any Debian packages built for this release because we
> > haven't updated the infrastructure to build (and test) Debian packages
> > yet.
> >
> > On Tue, Mar 19, 2019 at 10:24 AM Sean Purdy  
> > wrote:
> >> Hi,
> >>
> >>
> >> Will debian packages be released?  I don't see them in the nautilus repo.  
> >> I thought that Nautilus was going to be debian-friendly, unlike Mimic.
> >>
> >>
> >> Sean
> >>
> >> On Tue, 19 Mar 2019 14:58:41 +0100
> >> Abhishek Lekshmanan  wrote:
> >>
> >>> We're glad to announce the first release of Nautilus v14.2.0 stable
> >>> series. There have been a lot of changes across components from the
> >>> previous Ceph releases, and we advise everyone to go through the release
> >>> and upgrade notes carefully.
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Intel D3-S4610 performance

2019-03-13 Thread Martin Verges
Hello Kai,

there are tons of bad SSDs on the market. You cannot buy any brand without
having some bad and maybe some good models.

Here as an example some performance values from Intel:

Intel SSD DC S4600 960GB, 2.5", SATA (SSDSC2KG960G701)
jobs=1 - iops=23k
jobs=5 - iops=51k

Intel SSD D3-S4510 960GB, 2.5", SATA (SSDSC2KB960G801)
jobs=1 - iops=10k
jobs=5 - iops=22k

The Samsung PM863a you mentioned is quite a good SSD for general purpose
and cost efficiency; unfortunately Samsung seems to no longer produce this
drive and tries to sell the PM883 instead. Performance seems to be nearly the
same, and maybe it will be a good choice too.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Di., 12. März 2019 um 09:14 Uhr schrieb Kai Wembacher <
kai.wembac...@infranaut.com>:

> Hi everyone,
>
>
>
> I have an Intel D3-S4610 SSD with 1.92 TB here for testing and get some
> pretty bad numbers, when running the fio benchmark suggested by Sébastien
> Han (
> http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
> ):
>
>
>
> Intel D3-S4610 1.92 TB
>
> --numjobs=1 write: IOPS=3860, BW=15.1MiB/s (15.8MB/s)(905MiB/60001msec)
>
> --numjobs=2 write: IOPS=7138, BW=27.9MiB/s (29.2MB/s)(1673MiB/60001msec)
>
> --numjobs=4 write: IOPS=12.5k, BW=48.7MiB/s (51.0MB/s)(2919MiB/60002msec)
>
>
>
> Compared to our current Samsung SM863 SSDs the Intel one is about 6x
> slower.
>
>
>
> Has someone here tested this SSD and can give me some values for
> comparison?
>
>
>
> Many thanks in advance,
>
>
>
> Kai
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] 3-node cluster with 3 x Intel Optane 900P - very low benchmarked performance (200 IOPS)?

2019-03-09 Thread Martin Verges
Hello,

did you test the performance of your individual drives?

Here is a small snippet:
-
DRIVE=/dev/XXX
smartctl -a $DRIVE
for i in 1 2 4 8 16; do echo "Test $i"; fio --filename=$DRIVE --direct=1 \
  --sync=1 --rw=write --bs=4k --numjobs=$i --iodepth=1 --runtime=60 \
  --time_based --group_reporting --name=journal-test; done
-

Please share the results that we know what's possible with your hardware.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

Vitaliy Filippov  schrieb am Sa., 9. März 2019, 21:09:

> There are 2:
>
> fio -ioengine=rbd -direct=1 -name=test -bs=4k -iodepth=1 -rw=randwrite
> -pool=bench -rbdname=testimg
>
> fio -ioengine=rbd -direct=1 -name=test -bs=4k -iodepth=128 -rw=randwrite
> -pool=bench -rbdname=testimg
>
> The first measures your min possible latency - it does not scale with the
> number of OSDs at all, but it's usually what real applications like
> DBMSes
> need.
>
> The second measures your max possible random write throughput which you
> probably won't be able to utilize if you don't have enough VMs all
> writing
> in parallel.
>
> --
> With best regards,
>Vitaliy Filippov
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] PGs stuck in created state

2019-03-07 Thread Martin Verges
Hello,

try restarting every osd if possible.
Upgrade to a recent ceph version.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Do., 7. März 2019 um 08:39 Uhr schrieb simon falicon <
simonfali...@gmail.com>:

> Hello Ceph Users,
>
> I have an issue with my ceph cluster, after one serious fail in four SSD
> (electricaly dead) I have lost PGs (and replicats) and I have 14 Pgs stuck.
>
> So for correct it I have try to force create this PGs (with same IDs) but
> now the Pgs stuck in creating state -_-" :
>
> ~# ceph -s
>  health HEALTH_ERR
> 14 pgs are stuck inactive for more than 300 seconds
> 
>
> ceph pg dump | grep creating
>
> dumped all in format plain
> 9.300000000creating2019-02-25 
> 09:32:12.3339790'00:0[20,26]20[20,11]200'0
> 2019-02-25 09:32:12.3339790'02019-02-25 09:32:12.333979
> 3.900000000creating2019-02-25 
> 09:32:11.2954510'00:0[16,39]16[17,6]170'0
> 2019-02-25 09:32:11.2954510'02019-02-25 09:32:11.295451
> ...
>
> I have try to create new PG dosent existe before and it work, but for this
> PG stuck in creating state.
>
> In my monitor logs I have this message:
>
> 2019-02-25 11:02:46.904897 7f5a371ed700  0 mon.controller1@1(peon) e7 
> handle_command mon_command({"prefix": "pg force_create_pg", "pgid": "4.20e"} 
> v 0) v1
> 2019-02-25 11:02:46.904938 7f5a371ed700  0 log_channel(audit) log [INF] : 
> from='client.? 172.31.101.107:0/3101034432' entity='client.admin' 
> cmd=[{"prefix": "pg force_create_pg", "pgid": "4.20e"}]: dispatch
>
> When I check map I have:
>
> ~# ceph pg map 4.20e
> osdmap e428069 pg 4.20e (4.20e) -> up [27,37,36] acting [13,17]
>
> I have restart OSD 27,37,36,13 and 17 but no effect. (one by one)
>
> I have see this issue http://tracker.ceph.com/issues/18298 but I run on
> ceph 10.2.11.
>
> So could you help me please ?
>
> Many thanks by advance,
> Sfalicon.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph REST API

2019-03-06 Thread Martin Verges
Hello,

you could use croit to manage your cluster. We provide an extensive RESTful
API that can be used to automate nearly everything in your cluster.
Take a look at https://croit.io/docs/v1901 and try it yourself with our
vagrant demo or using the production guide.

If you miss something, please feel free to contact us and we will gladly
provide an update within a few days.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mi., 6. März 2019 um 08:16 Uhr schrieb :

> Hi,
>
>
>
> As I referring the below link for *Ceph* *REST* *API*
>
>
>
> http://docs.ceph.com/docs/mimic/mgr/restful/
> <https://clicktime.symantec.com/3MWitG6uWtzCjfRLkewAyRM7Vc?u=http%3A%2F%2Fdocs.ceph.com%2Fdocs%2Fmimic%2Fmgr%2Frestful%2F>
>
>
>
> Some of the other endpoints implemented in the restful module include:
>
> /config/cluster: GET
>
> /config/osd: GET, PATCH
>
> /crush/rule: GET
>
> /mon: GET
>
> /osd: GET
>
> /pool: GET, POST
>
> /pool/: DELETE, GET, PATCH
>
> /request: DELETE, GET, POST
>
> /request/: DELETE, GET
>
> /server: GET
>
>
>
> As I see only few REST API mentioned above link.
>
>
>
> Is it possible to do operations like create/delete/update operations like
> rbd image, snapshots, mds file system, authentication, users etc. *through
> REST API?*
>
> If yes, what is the procedure to do? please help me with suggestions.
>
>
>
> Thanks
>
> Parkiti babu
>
>
> The information contained in this electronic message and any attachments
> to this message are intended for the exclusive use of the addressee(s) and
> may contain proprietary, confidential or privileged information. If you are
> not the intended recipient, you should not disseminate, distribute or copy
> this e-mail. Please notify the sender immediately and destroy all copies of
> this message and any attachments. WARNING: Computer viruses can be
> transmitted via email. The recipient should check this email and any
> attachments for the presence of viruses. The company accepts no liability
> for any damage caused by any virus transmitted by this email.
> www.wipro.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph bluestore performance on 4kn vs. 512e?

2019-02-26 Thread Martin Verges
Hello Oliver,

as a 512e drive has to read the 4k block, change the 512 bytes and then
write the 4k block back to the disk, it can have a significant
performance impact. However, costs are the same, so always choose 4Kn drives.
By the way, this might not affect you as long as you always write 4k at once,
but I'm unsure whether that is guaranteed in any use case or in a Ceph-specific
scenario, therefore be safe and choose 4Kn drives.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mo., 25. Feb. 2019 um 12:43 Uhr schrieb Oliver Schulz <
oliver.sch...@tu-dortmund.de>:

> Dear all,
>
> in real-world use, is there a significant performance
> benefit in using 4kn instead of 512e HDDs (using
> Ceph bluestore with block-db on NVMe-SSD)?
>
>
> Cheers and thanks for any advice,
>
> Oliver
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore HDD Cluster Advice

2019-02-02 Thread Martin Verges
Hello John,

you don't need such a big CPU; save yourself some money with a 12c/24t and
invest it in better / more disks. The same goes for memory, 128G would be
enough. Why do you install 4x 25G NICs? Hard disks won't be able to use that
capacity.

In addition, you can use the 2 OS disks for OSDs if you choose
croit for system management, meaning 10 more OSDs in your small cluster for
better performance, and it is a lot easier to manage. The best part of it:
this feature comes with our completely free version, so it is just a gain on
your side! Try it out.

Please make sure to buy the right disks, there is a huge performance gap
between 512e and 4Kn drives but nearly no price difference. Bluestore does
perform better than Filestore in most environments, but as always it depends
on your specific workload. I would not recommend even considering a
Filestore OSD anymore; instead buy the correct hardware for your use case
and configure the cluster accordingly.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Fr., 1. Feb. 2019 um 18:26 Uhr schrieb John Petrini <
jpetr...@coredial.com>:

> Hello,
>
> We'll soon be building out four new luminous clusters with Bluestore.
> Our current clusters are running filestore so we're not very familiar
> with Bluestore yet and I'd like to have an idea of what to expect.
>
> Here are the OSD hardware specs (5x per cluster):
> 2x 3.0GHz 18c/36t
> 22x 1.8TB 10K SAS (RAID1 OS + 20 OSD's)
> 5x 480GB Intel S4610 SSD's (WAL and DB)
> 192 GB RAM
> 4X Mellanox 25GB NIC
> PERC H730p
>
> With filestore we've found that we can achieve sub-millisecond write
> latency by running very fast journals (currently Intel S4610's). My
> main concern is that Bluestore doesn't use journals and instead writes
> directly to the higher latency HDD; in theory resulting in slower acks
> and higher write latency. How does Bluestore handle this? Can we
> expect similar or better performance then our current filestore
> clusters?
>
> I've heard it repeated that Bluestore performs better than Filestore
> but I've also heard some people claiming this is not always the case
> with HDD's. Is there any truth to that and if so is there a
> configuration we can use to achieve this same type of performance with
> Bluestore?
>
> Thanks all.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Simple API to have cluster healthcheck ?

2019-01-30 Thread Martin Verges
Hello Vincent,

when you install or migrate to croit, you can get a large number of REST
API's (see https://croit.io/docs/v1809/cluster#get-cluster-status) and we
support read-only users that you can create in our GUI.
If you want to use our API's from the cli, you can use our httpie-auth
plugin from https://github.com/croit/httpie-auth-croit to simplify the auth.

You can try it out our Ceph management solution with our demo from
https://croit.io/croit-virtual-demo on your computer or by just importing
your existing cluster using the https://croit.io/croit-production-guide.

Everything you see in our GUI can be reached through API's. To get a
glimpse of the possibilities, look at https://croit.io/screenshots.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mi., 30. Jan. 2019 um 14:04 Uhr schrieb PHARABOT Vincent <
vincent.phara...@3ds.com>:

> Hello,
>
>
>
> I have my cluster set up correctly now (thank you again for the help)
>
>
>
> I am seeking now a way to get cluster health thru API (REST) with curl
> command.
>
> I had a look at manager / RESTful and Dashboard but none seems to provide
> simple way to get cluster health
>
> RESTful module do a lot of things but I didn’t find the simple health
> check result – moreover I don’t want monitoring user to be able to do all
> the command in this module.
>
> Dashboard is a dashboard so could not get health thru curl
>
>
>
> It seems it was possible with “ceph-rest-api” but it looks like this tools
> is no more available in ceph-common…
>
>
>
> Is there a simple way to have this ? (without writing python mgr module
> which will take a lot of time for this)
>
>
>
> Thank you
>
> Vincent
>
>
>
> This email and any attachments are intended solely for the use of the
> individual or entity to whom it is addressed and may be confidential and/or
> privileged.
>
> If you are not one of the named recipients or have received this email in
> error,
>
> (i) you should not read, disclose, or copy it,
>
> (ii) please notify sender of your receipt by reply email and delete this
> email and all attachments,
>
> (iii) Dassault Systèmes does not accept or assume any liability or
> responsibility for any use of or reliance on this email.
>
> Please be informed that your personal data are processed according to our
> data privacy policy as described on our website. Should you have any
> questions related to personal data protection, please contact 3DS Data
> Protection Officer at 3ds.compliance-priv...@3ds.com
>
>
> For other languages, go to https://www.3ds.com/terms/email-disclaimer
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Commercial support

2019-01-24 Thread Martin Verges
Hello Ketil,

we as croit offer commercial support for Ceph as well as our own Ceph based
unified storage solution.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Mi., 23. Jan. 2019 um 23:29 Uhr schrieb Ketil Froyn :

> Hi,
>
> How is the commercial support for Ceph? More specifically, I was  recently
> pointed in the direction of the very interesting combination of CephFS,
> Samba and ctdb. Is anyone familiar with companies that provide commercial
> support for in-house solutions like this?
>
> Regards, Ketil
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MON dedicated hosts

2018-12-17 Thread Martin Verges
Hello,

we do not see a problem in a small cluster having 3 MONs on OSD hosts.
However, we do suggest using 5 MONs.
Nearly every one of our customers does this without a problem! Please just
make sure to have enough CPU/RAM/disk available.

So:
1. No, not necessary, only if you want to spend more money than required.
2. Maybe think about it when your cluster grows beyond 500, maybe 1k OSDs, or
simply if the cluster design would be easier with dedicated servers.

Hint: Our trainings cover a Ceph cluster planning session that covers
exactly such topics. See https://croit.io/training.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx
Am Mo., 17. Dez. 2018 um 10:10 Uhr schrieb Sam Huracan
:
>
> Hi everybody,
>
> We've runned a 50TB Cluster with 3 MON services on the same nodes with OSDs.
> We are planning to upgrade to 200TB, I have 2 questions:
>  1.  Should we separate MON services to dedicated hosts?
>  2.  From your experiences, how size of cluster we shoud consider to put MON 
> on dedicated hosts?
>
>
> Thanks in advance.
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Journal drive recommendation

2018-11-26 Thread Martin Verges
Hello,

what type of SSD data drives do you plan to use?

In general, I would not recommend using an external journal on SSD OSDs, but
it is possible to squeeze out a bit more performance depending on your data
disks.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Di., 27. Nov. 2018, 02:50 hat Amit Ghadge 
geschrieben:

> Hi all,
>
> We have planning to use SSD data drive, so for journal drive, is there any
> recommendation to use same drive or separate drive?
>
> Thanks,
> Amit
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Low traffic Ceph cluster with consumer SSD.

2018-11-25 Thread Martin Verges
Hello Anton,

we have had some bad experience with consumer disks. They tend to fail quite
early and sometimes have extremely poor performance under Ceph workloads.
If possible, spend some money on reliable Samsung PM/SM863a SSDs. However, a
customer of ours uses the WD Blue 1TB SSDs and seems to be quite happy with them.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Am Sa., 24. Nov. 2018 um 18:10 Uhr schrieb Anton Aleksandrov <
an...@aleksandrov.eu>:

> Hello community,
>
> We are building CEPH cluster on pretty old (but free) hardware. We will
> have 12 nodes with 1 OSD per node and migrate data from single RAID5
> setup, so our traffic is not very intense, we basically need more space
> and possibility to expand it.
>
> We plan to have data on dedicate disk in each node and my question is
> about WAL/DB for Bluestore. How bad would it be to place it on
> system-consumer-SSD? How big risk is it, that everything will get
> "slower than using spinning HDD for the same purpose"? And how big risk
> is it, that our nodes will die, because of SSD lifespan?
>
> I am sorry, for such untechnical question.
>
> Regards,
> Anton.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Benchmark performance when using SSD as the journal

2018-11-13 Thread Martin Verges
Please never use the datasheet values to select your SSD. We never had a
single one that delivers the advertised performance in a Ceph journal use
case.

However, do not use Filestore anymore, especially with newer kernel
versions. Use Bluestore instead.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

Am Mi., 14. Nov. 2018, 05:46 hat  geschrieben:

> Thanks Merrick!
>
>
>
> I checked with Intel spec [1], the performance Intel said is,
>
>
>
> ·  Sequential Read (up to) 500 MB/s
>
> ·  Sequential Write (up to) 330 MB/s
>
> ·  Random Read (100% Span) 72000 IOPS
>
> ·  Random Write (100% Span) 2 IOPS
>
>
>
> I think these indicator should be must better than general HDD, and I have
> run read/write commands with “rados bench” respectively,   there should be
> some difference.
>
>
>
> And is there any kinds of configuration that could give us any performance
> gain with this SSD (Intel S4500)?
>
>
>
> [1]
> https://ark.intel.com/products/120521/Intel-SSD-DC-S4500-Series-480GB-2-5in-SATA-6Gb-s-3D1-TLC-
>
>
>
> Best Regards,
>
> Dave Chen
>
>
>
> *From:* Ashley Merrick 
> *Sent:* Wednesday, November 14, 2018 12:30 PM
> *To:* Chen2, Dave
> *Cc:* ceph-users
> *Subject:* Re: [ceph-users] Benchmark performance when using SSD as the
> journal
>
>
>
> [EXTERNAL EMAIL]
> Please report any suspicious attachments, links, or requests for sensitive
> information.
>
> Only certain SSD's are good for CEPH Journals as can be seen @
> https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
>
>
>
> The SSD your using isn't listed but doing a quick search online it appears
> to be a SSD designed for read workloads as a "upgrade" from a HD so
> probably is not designed for the high write requirements a journal demands.
>
> Therefore when it's been hit by 3 OSD's of workloads your not going to get
> much more performance out of it than you would just using the disk as your
> seeing.
>
>
>
> On Wed, Nov 14, 2018 at 12:21 PM  wrote:
>
> Hi all,
>
>
>
> We want to compare the performance between HDD partition as the journal
> (inline from OSD disk) and SSD partition as the journal, here is what we
> have done, we have 3 nodes used as Ceph OSD,  each has 3 OSD on it.
> Firstly, we created the OSD with journal from OSD partition, and run “rados
> bench” utility to test the performance, and then migrate the journal from
> HDD to SSD (Intel S4500) and run “rados bench” again, the expected result
> is SSD partition should be much better than HDD, but the result shows us
> there is nearly no change,
>
>
>
> The configuration of Ceph is as below,
>
> pool size: 3
>
> osd size: 3*3
>
> pg (pgp) num: 300
>
> osd nodes are separated across three different nodes
>
> rbd image size: 10G (10240M)
>
>
>
> The utility I used is,
>
> rados bench -p rbd $duration write
>
> rados bench -p rbd $duration seq
>
> rados bench -p rbd $duration rand
>
>
>
> Is there anything wrong from what I did?  Could anyone give me some
> suggestion?
>
>
>
>
>
> Best Regards,
>
> Dave Chen
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous or Mimic client on Debian Testing (Buster)

2018-11-13 Thread Martin Verges
Hello again,

maybe another hint: if you want to mount CephFS without modifying your
system (no need to copy the mount.ceph helper), you could also use this "trick":

=
echo "XXX" | base64 --decode | keyctl padd ceph client.admin @u
mount -t ceph X.X.X.X:/ /mnt/ -o name=admin,key=client.admin
=====

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


2018-11-13 18:07 GMT+01:00 Martin Verges :
> Hello,
>
> unfortunately there is no such deb package at the moment.
>
> However you could extract the sbin/mount.ceph command from the desired
> version and copy the file into your debian buster installation. After
> that you should be able to use the CephFS Kernel client from debian
> buster.
>
> I tried it on a test system and it worked without a problem.
>
> =
> wget 
> https://download.ceph.com/debian-luminous/pool/main/c/ceph/ceph-common_12.2.9-1~bpo90%2B1_amd64.deb
> dpkg -x ceph-common_12.2.9-1~bpo90+1_amd64.deb ceph-common
> cp ceph-common/sbin/mount.ceph /sbin/mount.ceph
> rm -r ceph-common
> mount.ceph X.X.X.X:6789:/ /mnt/ -o name=admin,secret=XXX
> =
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
> 2018-11-13 16:42 GMT+01:00 Hervé Ballans :
>> Hi,
>>
>> On my CephFS production cluster (Luminous 12.2.8), I would like to add a
>> CephFS client from a server installed with Debian Buster (Testing release).
>>
>> But, the default proposed Ceph packages in this release are still Jewel :
>>
>> # cat /etc/debian_version
>> buster/sid
>>
>> # apt search ceph-common
>> Sorting... Done
>> Full Text Search... Done
>> ceph-common/testing 10.2.5-7.2 amd64
>>   common utilities to mount and interact with a ceph storage cluster
>>
>> So, I tried to add another Ceph repository :
>>
>> # echo deb https://download.ceph.com/debian-luminous/ $(lsb_release -sc)
>> main | tee /etc/apt/sources.list.d/ceph.list
>> # apt update
>> ...
>> Err:75 https://download.ceph.com/debian-luminous buster Release
>>   404  Not Found [IP: 158.69.68.124 443]
>> Reading package lists... Done
>> E: The repository 'https://download.ceph.com/debian-luminous buster Release'
>> does not have a Release file.
>> N: Updating from such a repository can't be done securely, and is therefore
>> disabled by default.
>> N: See apt-secure(8) manpage for repository creation and user configuration
>> details.
>>
>> The same for Mimic repository...
>>
>> Has anyone succesfully installed a recent version of Ceph client in a Debian
>> Testing ?
>>
>> Regards,
>> rv
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous or Mimic client on Debian Testing (Buster)

2018-11-13 Thread Martin Verges
Hello,

unfortunately there is no such deb package at the moment.

However you could extract the sbin/mount.ceph command from the desired
version and copy the file into your debian buster installation. After
that you should be able to use the CephFS Kernel client from debian
buster.

I tried it on a test system and it worked without a problem.

=
wget https://download.ceph.com/debian-luminous/pool/main/c/ceph/ceph-common_12.2.9-1~bpo90%2B1_amd64.deb
dpkg -x ceph-common_12.2.9-1~bpo90+1_amd64.deb ceph-common
cp ceph-common/sbin/mount.ceph /sbin/mount.ceph
rm -r ceph-common
mount.ceph X.X.X.X:6789:/ /mnt/ -o name=admin,secret=XXX
=

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


2018-11-13 16:42 GMT+01:00 Hervé Ballans :
> Hi,
>
> On my CephFS production cluster (Luminous 12.2.8), I would like to add a
> CephFS client from a server installed with Debian Buster (Testing release).
>
> But, the default proposed Ceph packages in this release are still Jewel :
>
> # cat /etc/debian_version
> buster/sid
>
> # apt search ceph-common
> Sorting... Done
> Full Text Search... Done
> ceph-common/testing 10.2.5-7.2 amd64
>   common utilities to mount and interact with a ceph storage cluster
>
> So, I tried to add another Ceph repository :
>
> # echo deb https://download.ceph.com/debian-luminous/ $(lsb_release -sc)
> main | tee /etc/apt/sources.list.d/ceph.list
> # apt update
> ...
> Err:75 https://download.ceph.com/debian-luminous buster Release
>   404  Not Found [IP: 158.69.68.124 443]
> Reading package lists... Done
> E: The repository 'https://download.ceph.com/debian-luminous buster Release'
> does not have a Release file.
> N: Updating from such a repository can't be done securely, and is therefore
> disabled by default.
> N: See apt-secure(8) manpage for repository creation and user configuration
> details.
>
> The same for Mimic repository...
>
> Has anyone succesfully installed a recent version of Ceph client in a Debian
> Testing ?
>
> Regards,
> rv
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Supermicro server 5019D8-TR12P for new Ceph cluster

2018-11-13 Thread Martin Verges
Hello,

we believe 1 core or thread per OSD + 2-4 for the OS and other services
is enough for most use cases, so yes. The same goes for the 64 GB of RAM: we
suggest ~4 GB per OSD (12*4 = 48 GB), so 16 GB left for Linux is more
than enough. Buy good drives (SSD & HDD) to prevent performance
issues.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


2018-11-13 14:42 GMT+01:00 Michal Zacek :
> Hello,
>
> what do you think about this Supermicro server:
> http://www.supermicro.com/products/system/1U/5019/SSG-5019D8-TR12P.cfm ? We
> are considering about eight or ten server each with twelve 10TB SATA drives,
> one m.2 SSD and 64GB RAM. Public and cluster network will be 10Gbit/s. The
> question is if one Intel XEON D-2146NT wit eight cores (16 with HT) will be
> enough for 12 SAT disks. Cluster will be used for storing pictures. File
> size from 1MB to 2TB ;-).
>
> Thanks,
> Michal
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] read performance, separate client CRUSH maps or limit osd read access from each client

2018-11-09 Thread Martin Verges
Hello Vlad,

you can generate something like this:

rule dc1_primary_dc2_secondary {
id 1
type replicated
min_size 1
max_size 10
step take dc1
step chooseleaf firstn 1 type host
step emit
step take dc2
step chooseleaf firstn 1 type host
step emit
step take dc3
step chooseleaf firstn -2 type host
step emit
}

rule dc2_primary_dc1_secondary {
id 2
type replicated
min_size 1
max_size 10
step take dc2
step chooseleaf firstn 1 type host
step emit
step take dc1
step chooseleaf firstn 1 type host
step emit
step take dc3
step chooseleaf firstn -2 type host
step emit
}

After you added such crush rules, you can configure the pools:

~ $ ceph osd pool set <dc1-pool> crush_rule dc1_primary_dc2_secondary
~ $ ceph osd pool set <dc2-pool> crush_rule dc2_primary_dc1_secondary

Now you place the workload from dc1 on the dc1 pool, and the workload
from dc2 on the dc2 pool. You could also use HDDs with SSD journal (if
your workload isn't that write intensive) and save some money in dc3,
as your clients would always read from an SSD and write to the hybrid OSDs.

Btw. all this could be done with a few simple clicks through our web
frontend. Even if you want to export it via CephFS / NFS / .. it is
possible to set it on a per folder level. Feel free to take a look at
https://www.youtube.com/watch?v=V33f7ipw9d4 to see how easy it could
be.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


2018-11-09 17:35 GMT+01:00 Vlad Kopylov :
> Please disregard pg status, one of test vms was down for some time it is
> healing.
> Question only how to make it read from proper datacenter
>
> If you have an example.
>
> Thanks
>
>
> On Fri, Nov 9, 2018 at 11:28 AM Vlad Kopylov  wrote:
>>
>> Martin, thank you for the tip.
>> googling ceph crush rule examples doesn't give much on rules, just static
>> placement of buckets.
>> this all seems to be for placing data, not to giving client in specific
>> datacenter proper read osd
>>
>> maybe something wrong with placement groups?
>>
>> I added datacenter dc1 dc2 dc3
>> Current replicated_rule is
>>
>> rule replicated_rule {
>> id 0
>> type replicated
>> min_size 1
>> max_size 10
>> step take default
>> step chooseleaf firstn 0 type host
>> step emit
>> }
>>
>> # buckets
>> host ceph1 {
>> id -3 # do not change unnecessarily
>> id -2 class ssd # do not change unnecessarily
>> # weight 1.000
>> alg straw2
>> hash 0 # rjenkins1
>> item osd.0 weight 1.000
>> }
>> datacenter dc1 {
>> id -9 # do not change unnecessarily
>> id -4 class ssd # do not change unnecessarily
>> # weight 1.000
>> alg straw2
>> hash 0 # rjenkins1
>> item ceph1 weight 1.000
>> }
>> host ceph2 {
>> id -5 # do not change unnecessarily
>> id -6 class ssd # do not change unnecessarily
>> # weight 1.000
>> alg straw2
>> hash 0 # rjenkins1
>> item osd.1 weight 1.000
>> }
>> datacenter dc2 {
>> id -10 # do not change unnecessarily
>> id -8 class ssd # do not change unnecessarily
>> # weight 1.000
>> alg straw2
>> hash 0 # rjenkins1
>> item ceph2 weight 1.000
>> }
>> host ceph3 {
>> id -7 # do not change unnecessarily
>> id -12 class ssd # do not change unnecessarily
>> # weight 1.000
>> alg straw2
>> hash 0 # rjenkins1
>> item osd.2 weight 1.000
>> }
>> datacenter dc3 {
>> id -11 # do not change unnecessarily
>> id -13 class ssd # do not change unnecessarily
>> # weight 1.000
>> alg straw2
>> hash 0 # rjenkins1
>> item ceph3 weight 1.000
>> }
>> root default {
>> id -1 # do not change unnecessarily
>> id -14 class ssd # do not change unnecessarily
>> # weight 3.000
>> alg straw2
>> hash 0 # rjenkins1
>> item dc1 weight 1.000
>> item dc2 weight 1.000
>> item dc3 weight 1.000
>> }
>>
>>
>> #ceph pg dump
>> dumped all
>> version 29433
>> stamp 2018-11-09 11:23:44.510872
>> last_osdmap_epoch 0
>> last_pg_scan 0
>> PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTESLOG
>> DISK_LOG STATE  STATE_STAMPVERSION
>> REPORTED UP  UP_PRIMARY ACTING  ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP
>

Re: [ceph-users] read performance, separate client CRUSH maps or limit osd read access from each client

2018-11-08 Thread Martin Verges
Hello Vlad,

Ceph clients connect to the primary OSD of each PG. If you create a
crush rule for building1 and one for building2 that each take the first
(primary) OSD from the local building, your reads from the pool will always
stay in the same building (if the cluster is healthy) and only write
requests get replicated to the other building.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


2018-11-09 4:54 GMT+01:00 Vlad Kopylov :
> I am trying to test replicated ceph with servers in different buildings, and
> I have a read problem.
> Reads from one building go to osd in another building and vice versa, making
> reads slower then writes! Making read as slow as slowest node.
>
> Is there a way to
> - disable parallel read (so it reads only from the same osd node where mon
> is);
> - or give each client read restriction per osd?
> - or maybe strictly specify read osd on mount;
> - or have node read delay cap (for example if node time out is larger then 2
> ms then do not use such node for read as other replicas are available).
> - or ability to place Clients on the Crush map - so it understands that osd
> in - for example osd in the same data-center as client has preference, and
> pull data from it/them.
>
> Mounting with kernel client latest mimic.
>
> Thank you!
>
> Vlad
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph 12.2.9 release

2018-11-08 Thread Martin Verges
Hello Marc,

> - You can use this separate from the commandline?
yes, we don't take away any feature or possible way of working, but we don't recommend it

> - And if I modify something from the commandline, these changes are visible 
> in the webinterface?
yes, we just ask Ceph/Linux for its current state and show it through
the frontend (100% RESTful API). For example, if you add a pool via the CLI
you will see it after <5 seconds.

> - I can easily remove/add this webinterface? I mean sometimes you have these 
> tools that just customize the whole environment, that it is  difficult to 
> revert it.
unfortunately that's more the "hard" part. We are not only an interface,
we provide a complete Ceph storage solution.

We solve the problem with the different versions and the deployment
through PXE boot of the systems. You can also test this by simply
booting the normally installed system with a running Ceph from our
system. However, you should make sure that the versions of Ceph are
compatible.
Our images include all necessary programs, libraries and kernels. This
allows us to control exactly which version is running on the host.
Even months after the cluster is started, the exact same image will be
executed via PXE boot, if the admin so desires.

You can try it and use it for free in a vagrant demo
https://croit.io/croit-virtual-demo or as a real installation
https://croit.io/croit-production-guide.

As an example, just look in
https://www.youtube.com/watch?v=Jrnzlylidjs to see how we upgrade from
Ceph Luminous to Ceph Mimic.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


2018-11-08 11:23 GMT+01:00 Marc Roos :
>
> H interesting maybe,
>
> - You can use this separate from the commandline?
> - And if I modify something from the commandline, these changes are
> visible in the webinterface?
> - I can easily remove/add this webinterface? I mean sometimes you have
> these tools that just customize the whole environment, that it is
> difficult to revert it.
>
>
>
> -Original Message-
> From: Martin Verges [mailto:martin.ver...@croit.io]
> Sent: donderdag 8 november 2018 11:07
> To: Matthew Vernon
> Cc: ceph-users
> Subject: Re: [ceph-users] ceph 12.2.9 release
>
> Sorry to say this, but that's why there's the croit management interface
> (free community edition feature).
> You don't have to worry about problems that are absolutely critical for
> reliable and stable operation. It doesn't matter if you run a cluster
> with 10 or 1000 hard disks, it just has to run!
>
> On the 12.11. on the Ceph Day in Berlin I can give you information
> directly about it.
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492 Com. register: Amtsgericht
> Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
> 2018-11-08 10:35 GMT+01:00 Matthew Vernon :
>> On 08/11/2018 09:17, Marc Roos wrote:
>>>
>>>   And that is why I don't like ceph-deploy. Unless you have maybe
>>> hundreds of disks, I don’t see why you cannot install it "manually".
>>
>>
>> ...as the recent ceph survey showed, plenty of people have hundreds of
>
>> disks! Ceph is meant to be operated at scale, which is why many admins
>
>> will have automation (ceph-ansible, etc.) in place.
>>
>> [our test clusters are 180 OSDs...]
>>
>> Regards,
>>
>> Matthew
>>
>>
>> --
>> The Wellcome Sanger Institute is operated by Genome Research Limited,
>> a charity registered in England with number 1021457 and a company
>> registered in England with number 2742969, whose registered office is
>> 215 Euston Road, London, NW1 2BE.
>> ___
>>
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph 12.2.9 release

2018-11-08 Thread Martin Verges
Sorry to say this, but that's why there's the croit management
interface (free community edition feature).
You don't have to worry about problems that are absolutely critical
for reliable and stable operation. It doesn't matter if you run a
cluster with 10 or 1000 hard disks, it just has to run!

On 12 November at the Ceph Day in Berlin I can give you more information
about it in person.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


2018-11-08 10:35 GMT+01:00 Matthew Vernon :
> On 08/11/2018 09:17, Marc Roos wrote:
>>
>>   And that is why I don't like ceph-deploy. Unless you have maybe hundreds
>> of disks, I don’t see why you cannot install it "manually".
>
>
> ...as the recent ceph survey showed, plenty of people have hundreds of
> disks! Ceph is meant to be operated at scale, which is why many admins will
> have automation (ceph-ansible, etc.) in place.
>
> [our test clusters are 180 OSDs...]
>
> Regards,
>
> Matthew
>
>
> --
> The Wellcome Sanger Institute is operated by Genome Research Limited, a
> charity registered in England with number 1021457 and a company registered
> in England with number 2742969, whose registered office is 215 Euston Road,
> London, NW1 2BE. ___
>
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Packages for debian in Ceph repo

2018-10-30 Thread Martin Verges
Hello,

we provide a public mirror documented on
https://croit.io/2018/09/23/2018-09-23-debian-mirror for Ceph Mimic on
Debian Stretch.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


2018-10-30 17:07 GMT+01:00 Kevin Olbrich :
> Is it possible to use qemu-img with rbd support on Debian Stretch?
> I am on Luminous and try to connect my image-buildserver to load images into
> a ceph pool.
>
>> root@buildserver:~# qemu-img convert -p -O raw /target/test-vm.qcow2
>> rbd:rbd_vms_ssd_01/test_vm
>> qemu-img: Unknown protocol 'rbd'
>
>
> Kevin
>
> Am Mo., 3. Sep. 2018 um 12:07 Uhr schrieb Abhishek Lekshmanan
> :
>>
>> arad...@tma-0.net writes:
>>
>> > Can anyone confirm if the Ceph repos for Debian/Ubuntu contain packages
>> > for
>> > Debian? I'm not seeing any, but maybe I'm missing something...
>> >
>> > I'm seeing ceph-deploy install an older version of ceph on the nodes
>> > (from the
>> > Debian repo) and then failing when I run "ceph-deploy osd ..." because
>> > ceph-
>> > volume doesn't exist on the nodes.
>> >
>> The newer versions of Ceph (from mimic onwards) requires compiler
>> toolchains supporting c++17 which we unfortunately do not have for
>> stretch/jessie yet.
>>
>> -
>> Abhishek
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Lost machine with MON and MDS

2018-10-26 Thread Martin Verges
Hello,

did you lose the only mon as well? If yes, restoring it is not easy but
possible. The MDS is no problem, just install it again.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Maiko de Andrade  schrieb am Fr., 26. Okt. 2018,
19:40:

> Hi,
>
> I have 3 machine with ceph config with cephfs. But I lost one machine,
> just with mon and mds. It's possible recovey cephfs? If yes how?
>
> ceph: Ubuntu 16.05.5 (lost this machine)
> - mon
> - mds
> - osd
>
> ceph-osd-1: Ubuntu 16.05.5
> - osd
>
> ceph-osd-2: Ubuntu 16.05.5
> - osd
>
>
>
> []´s
> Maiko de Andrade
> MAX Brasil
> Desenvolvedor de Sistemas
> +55 51 91251756
> http://about.me/maiko
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Client new version than server?

2018-10-26 Thread Martin Verges
Hello,

In my opinion it is not a problem. It could be an issue across major
releases (read the release notes to check for incompatibilities), but minor
version differences shouldn't be a problem.

In most environments I know of, clients with different versions connect to
the same cluster.
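
On Luminous and newer you can check what is actually connected before and
after an upgrade; a quick sketch:

  ceph versions    # running daemon versions, grouped by component
  ceph features    # feature bits / release names of connected clients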

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


Andre Goree wrote on Sat., 27 Oct. 2018, 00:02:

> I wanted to ask for thoughts/guidance on the case of running a newer
> version of Ceph on a client than the version of Ceph that is running on
> the server.
>
> E.g., I have a client machine running Ceph 12.2.8, while the server is
> running 12.2.4.  Is this a terrible idea?  My thoughts are to more
> thoroughly test 12.2.8 on the server side before upgrading my production
> server to 12.2.8.  However, I have a client that's been recently
> installed and thus pulled down the latest Luminous version (12.2.8).
>
> Thanks in advance.
>
> --
> Andre Goree
> -=-=-=-=-=-
> Email - andre at drenet.net
> Website   - http://blog.drenet.net
> PGP key   - http://www.drenet.net/pubkey.html
> -=-=-=-=-=-
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] NVME Intel Optane - same servers different performance

2018-10-25 Thread Martin Verges
Hello Steven,

You could swap the SSDs between the hosts to see whether the problem
migrates with the drive.

If it migrates, I would assume that the affected SSD simply offers less
performance than the others, possibly an RMA case depending on what the
manufacturer guarantees.
If not, the system must be searched further for other sources of error.
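
Besides swapping the drives, it can help to compare the negotiated PCIe
link on both hosts; a small sketch, using the 04:00.0 address from the
quoted lspci output (nvme-cli is assumed to be installed):

  # compare advertised vs. negotiated link speed and width on both servers
  lspci -s 04:00.0 -vv | grep -E 'LnkCap|LnkSta'
  # check drive health, temperature and throttling counters
  nvme smart-log /dev/nvme0n1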

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


2018-10-25 18:06 GMT+02:00 Steven Vacaroaia :
> Hi Martin,
>
> Yes, they are in the same slot. I also checked the BIOS; the PCIe speed
> and link type are properly negotiated, and the system profile is set to
> performance.
>
> Note: this happened after I upgraded the firmware on the servers;
> however, they all have the same firmware version.
>
> BAD server
> lspci | grep -i Optane
> 04:00.0 Non-Volatile memory controller: Intel Corporation Optane DC P4800X
> Series SSD
>
> GOOD server
> lspci | grep -i Optane
> 04:00.0 Non-Volatile memory controller: Intel Corporation Optane DC P4800X
> Series SSD
>
>
>
> On Thu, 25 Oct 2018 at 11:59, Martin Verges  wrote:
>>
>> Hello Steven,
>>
>> are you sure that the systems are exactly the same? Sometimes vendors
>> place extension cards into different PCIe slots.
>>
>> --
>> Martin Verges
>> Managing director
>>
>> Mobile: +49 174 9335695
>> E-Mail: martin.ver...@croit.io
>> Chat: https://t.me/MartinVerges
>>
>> croit GmbH, Freseniusstr. 31h, 81247 Munich
>> CEO: Martin Verges - VAT-ID: DE310638492
>> Com. register: Amtsgericht Munich HRB 231263
>>
>> Web: https://croit.io
>> YouTube: https://goo.gl/PGE1Bx
>>
>>
>> 2018-10-25 17:46 GMT+02:00 Steven Vacaroaia :
>> > Hi,
>> > I have 4 x DELL R630 servers with exact same specs
>> > I installed Intel Optane SSDPED1K375GA in all
>> >
>> > When comparing fio performance (both read and write), one is lower
>> > than the other 3 (see below, read results only).
>> >
>> > Any suggestions as to what to check/fix ?
>> >
>> > BAD server
>> > [root@osd04 ~]# fio --filename=/dev/nvme0n1 --direct=1 --sync=1
>> > --rw=read
>> > --bs=4k --numjobs=100 --iodepth=1 --runtime=60 --time_based
>> > --group_reporting --name=journal-test
>> > journal-test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
>> > 4096B-4096B, ioengine=psync, iodepth=1
>> > ...
>> > fio-3.1
>> > Starting 100 processes
>> > Jobs: 100 (f=100): [R(100)][100.0%][r=2166MiB/s,w=0KiB/s][r=554k,w=0
>> > IOPS][eta 00m:00s]
>> >
>> >
>> > GOOD server
>> > [root@osd02 ~]# fio --filename=/dev/nvme0n1 --direct=1 --sync=1
>> > --rw=read
>> > --bs=4k --numjobs=100 --iodepth=1 --runtime=60 --time_based
>> > --group_reporting --name=journal-test
>> > journal-test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
>> > 4096B-4096B, ioengine=psync, iodepth=1
>> > ...
>> > fio-3.1
>> > Starting 100 processes
>> > Jobs: 100 (f=100): [R(100)][100.0%][r=2278MiB/s,w=0KiB/s][r=583k,w=0
>> > IOPS][eta 00m:00s]
>> >
>> >
>> > many thanks
>> > steven
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] NVME Intel Optane - same servers different performance

2018-10-25 Thread Martin Verges
Hello Steven,

are you sure that the systems are exactly the same? Sometimes vendors
place extension cards into different PCIe slots.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


2018-10-25 17:46 GMT+02:00 Steven Vacaroaia :
> Hi,
> I have 4 x DELL R630 servers with exact same specs
> I installed Intel Optane SSDPED1K375GA in all
>
> When comparing fio performance (both read and write), one is lower than
> the other 3 (see below, read results only).
>
> Any suggestions as to what to check/fix ?
>
> BAD server
> [root@osd04 ~]# fio --filename=/dev/nvme0n1 --direct=1 --sync=1 --rw=read
> --bs=4k --numjobs=100 --iodepth=1 --runtime=60 --time_based
> --group_reporting --name=journal-test
> journal-test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=psync, iodepth=1
> ...
> fio-3.1
> Starting 100 processes
> Jobs: 100 (f=100): [R(100)][100.0%][r=2166MiB/s,w=0KiB/s][r=554k,w=0
> IOPS][eta 00m:00s]
>
>
> GOOD server
> [root@osd02 ~]# fio --filename=/dev/nvme0n1 --direct=1 --sync=1 --rw=read
> --bs=4k --numjobs=100 --iodepth=1 --runtime=60 --time_based
> --group_reporting --name=journal-test
> journal-test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=psync, iodepth=1
> ...
> fio-3.1
> Starting 100 processes
> Jobs: 100 (f=100): [R(100)][100.0%][r=2278MiB/s,w=0KiB/s][r=583k,w=0
> IOPS][eta 00m:00s]
>
>
> many thanks
> steven
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Monitor Recovery

2018-10-24 Thread Martin Verges
Hello John,

did you try
http://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-mon/#preparing-your-logs
?
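
In short, that page suggests raising the monitor debug level before
reproducing the crash, e.g. in ceph.conf on the affected node (revert it
afterwards, the logs grow quickly), then restarting the MON and checking
/var/log/ceph/ceph-mon.<id>.log:

  [mon]
      debug mon = 20
      debug ms = 1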

> At this point I'd prefer to just give up on it and assume it's in a bad
> state and recover it from the working monitors. What's the best way to go
> about this?


As long as you have remaining MONs in quorum, you can remove it and add it
again. However, I would suggest finding out what's wrong with it first.
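
A rough outline of that route, with "mon3" as a placeholder ID
(double-check the add/remove-monitor steps in the docs for your release):

  # from a node with a healthy quorum: drop the broken monitor
  ceph mon remove mon3
  # on the affected host: move the old store aside and re-create it
  systemctl stop ceph-mon@mon3
  mv /var/lib/ceph/mon/ceph-mon3 /var/lib/ceph/mon/ceph-mon3.bad
  ceph mon getmap -o /tmp/monmap
  ceph auth get mon. -o /tmp/mon.keyring
  ceph-mon -i mon3 --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
  chown -R ceph:ceph /var/lib/ceph/mon/ceph-mon3
  systemctl start ceph-mon@mon3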

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

2018-10-24 2:22 GMT+02:00 John Petrini :

> Hi List,
>
> I've got a monitor that won't stay up. It comes up and joins the
> cluster but crashes within a couple of minutes with no info in the
> logs. At this point I'd prefer to just give up on it and assume it's
> in a bad state and recover it from the working monitors. What's the
> best way to go about this?
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com