Re: [ceph-users] Ceph and NVMe

2018-09-06 Thread Jeff Bailey
I haven't had any problems using 375GB P4800X's in R730 and R740xd 
machines for DB+WAL.  The iDRAC whines a bit on the R740 but everything 
works fine.
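For reference, creating a BlueStore OSD with its DB (and, implicitly, its WAL) on an Optane partition looks roughly like this; the device paths are illustrative, not taken from this thread:

```shell
# BlueStore OSD: data on an HDD, DB+WAL on an NVMe/Optane partition.
# The WAL is placed inside the DB device unless --block.wal is given separately.
ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
```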


On 9/6/2018 3:09 PM, Steven Vacaroaia wrote:

Hi,
Just to add to this question: is anyone using the Intel Optane DC P4800X in a
Dell R630, or any other server?

Any gotchas, feedback, or knowledge sharing would be greatly appreciated.
Steven

On Thu, 6 Sep 2018 at 14:59, Stefan Priebe - Profihost AG
<s.pri...@profihost.ag> wrote:


Hello list,

has anybody tested current NVMe performance with Luminous and BlueStore?
Is NVMe something that makes sense here, or just a waste of money?

Greets,
Stefan
___
ceph-users mailing list
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







Re: [ceph-users] Changing Replication count

2016-09-06 Thread Jeff Bailey



On 9/6/2016 8:41 PM, Vlad Blando wrote:

Hi,

My replication count is currently this:

[root@controller-node ~]# ceph osd lspools
4 images,5 volumes,


Those aren't replica counts; they're pool IDs.


[root@controller-node ~]#

and I made adjustments, setting the size to 2 for images and 3 for volumes.
It's been 30 minutes now and the values haven't changed; how do I know
whether the change really took effect?


These are the commands I executed:

 ceph osd pool set images size 2
 ceph osd pool set volumes size 3

ceph osd pool set images min_size 2
ceph osd pool set images min_size 2


Another question: since the previous replication count for images was 4 and
for volumes was 5, it will delete the excess replicas, right?
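To verify whether the new sizes actually took effect, something like the following should work (illustrative commands, using the pool names from this thread):

```shell
# Show the replica count for each pool individually
ceph osd pool get images size
ceph osd pool get volumes size

# Or list size/min_size for all pools at once
ceph osd dump | grep 'replicated size'
```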


Thanks for the help


/vlad






Re: [ceph-users] Fast Ceph a Cluster with PB storage

2016-08-09 Thread Jeff Bailey



On 8/9/2016 10:43 AM, Wido den Hollander wrote:



On 9 August 2016 at 16:36, Александр Пивушков wrote:


Hello dear community!

I'm new to Ceph and only recently started building clusters, so your opinions
are very important to me.
We need to build a cluster with 1.2 PB of storage and very fast access to the
data. Previously we used "Intel SSD DC P3608 Series 1.6TB NVMe PCIe 3.0 x4
Solid State Drive" disks; their speed is entirely satisfactory, but as the
storage volume grows, the price of such a cluster rises very steeply, which is
what led to the idea of using Ceph.


You may want to tell us more about your environment, your use case, and in
particular what your clients are.
Large amounts of data usually mean graphical or scientific data; extremely
high-speed (IOPS) requirements usually mean database-like applications.
Which one is it, or is it a mix?


This is a mixed project combining graphics and science: a project linking a
vast array of image data, like Google Maps :)
Previously, the clients were Windows machines connected directly to powerful
servers.
The current plan is a Ceph cluster connected over FC to the virtual machine
servers. Virtualization: oVirt.


Stop right there. oVirt, despite being from RedHat, doesn't really support
Ceph directly all that well, last I checked.
That is probably where you get the idea/need for FC from.

If anyhow possible, you do NOT want another layer and protocol conversion
between Ceph and the VMs, like a FC gateway or iSCSI or NFS.

So if you're free to choose your virtualization platform, use KVM/qemu at
the bottom and something like OpenStack, OpenNebula, Ganeti, or Pacemaker
with KVM resource agents on top.

Oh, that's too bad ...
I must be misunderstanding something ...

oVirt is built on KVM:
https://www.ovirt.org/documentation/introduction/about-ovirt/

and Ceph supports KVM:
http://docs.ceph.com/docs/master/architecture/



KVM is just the hypervisor. oVirt is a tool which controls KVM and it doesn't 
have support for Ceph. That means that it can't pass down the proper arguments 
to KVM to talk to RBD.


What overhead would this add, and how large would it be?


I do not understand why oVirt is bad while qemu under OpenStack is good.
What should I read?



Like I said above, oVirt and OpenStack both control KVM. OpenStack also knows
how to 'configure' KVM to use RBD; oVirt doesn't.

Maybe Proxmox is a better solution in your case.



oVirt can use Ceph through Cinder. It doesn't currently provide all the
functionality of other oVirt storage domains, but it does work.


Wido



--
Александр Пивушков





Re: [ceph-users] Tips for faster openstack instance boot

2016-02-08 Thread Jeff Bailey
Your Glance images need to be raw, too. A QCOW2 image will be
copied/converted.
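Converting an existing QCOW2 image to raw before uploading it to Glance can be sketched like this (filenames and the image name are illustrative):

```shell
# Convert QCOW2 to raw so Nova can clone the image instead of copying it
qemu-img convert -f qcow2 -O raw ubuntu.qcow2 ubuntu.raw

# Upload the raw image to Glance
openstack image create --disk-format raw --container-format bare \
    --file ubuntu.raw ubuntu-raw
```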


On 2/8/2016 3:33 PM, Jason Dillaman wrote:

If Nova and Glance are properly configured, it should only require a quick 
clone of the Glance image to create your Nova ephemeral image.  Have you 
double-checked your configuration against the documentation [1]?  What version 
of OpenStack are you using?

To answer your questions:


- From Ceph point of view. does COW works cross pool i.e. image from glance
pool ---> (cow) --> instance disk on nova pool

Yes, cloning copy-on-write images works across pools
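At the RBD level, such a cross-pool clone can be reproduced by hand roughly as follows (pool, image, and snapshot names are illustrative):

```shell
# Snapshot the Glance image, protect the snapshot, then clone into the Nova pool
rbd snap create glance/myimage@snap
rbd snap protect glance/myimage@snap
rbd clone glance/myimage@snap nova/instance-disk
```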


- Will a single pool for glance and nova instead of separate pool . will help
here ?

Should be no change -- the creation of the clone is extremely lightweight (add
the image to a directory, create a couple of metadata objects)


- Is there any tunable parameter from Ceph or OpenStack side that should be
set ?

I'd double-check your OpenStack configuration.  Perhaps Glance isn't configured with 
"show_image_direct_url = True", or Glance is configured to cache your RBD 
images, or you have an older OpenStack release that requires patches to fully support 
Nova+RBD.

[1] http://docs.ceph.com/docs/master/rbd/rbd-openstack/




Re: [ceph-users] Intel P3700 PCI-e as journal drives?

2016-01-12 Thread Jeff Bailey

On 1/12/2016 4:51 AM, Burkhard Linke wrote:

Hi,

On 01/08/2016 03:02 PM, Paweł Sadowski wrote:

Hi,

Quick results for 1/5/10 jobs:


*snipsnap*

Run status group 0 (all jobs):
   WRITE: io=21116MB, aggrb=360372KB/s, minb=360372KB/s, maxb=360372KB/s,
mint=60001msec, maxt=60001msec


*snipsnap*

Run status group 0 (all jobs):
   WRITE: io=57723MB, aggrb=985119KB/s, minb=985119KB/s, maxb=985119KB/s,
mint=60001msec, maxt=60001msec

Disk stats (read/write):
   nvme0n1: ios=0/14754265, merge=0/0, ticks=0/253092, in_queue=254880,
util=100.00%

*snipsnap*


Run status group 0 (all jobs):
   WRITE: io=65679MB, aggrb=1094.7MB/s, minb=1094.7MB/s, maxb=1094.7MB/s,
mint=60001msec, maxt=60001msec


*snipsnap*


=== START OF INFORMATION SECTION ===
Vendor:   NVMe
Product:  INTEL SSDPEDMD01
Revision: 8DV1
User Capacity:1,600,321,314,816 bytes [1.60 TB]
Logical block size:   512 bytes
Rotation Rate:    Solid State Device

Thank you for the fast answer. The numbers really look promising! Do
you have experience with the speed of these drives with respect to 
their size? Are the smaller models (e.g. the 400GB one) as fast as the 
larger ones, or does the speed scale with the overall size, e.g. due 
to a larger number of flash chips / memory channels?


Attached are similar runs on a 400GB P3700.  The 400GB is a little 
slower than the 1.6TB but not bad.



Regards,
Burkhard



Script started on Tue 12 Jan 2016 03:04:55 AM EST
[root@hv01 ~]# fio --filename=/dev/nvme0n1p4 --direct=1 --sync=1 --rw=write 
--bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting 
--name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.2.8
Starting 1 process
Jobs: 1 (f=1)
journal-test: (groupid=0, jobs=1): err= 0: pid=87175: Tue Jan 12 03:05:59 2016
  write: io=23805MB, bw=406279KB/s, iops=101569, runt= 60001msec
clat (usec): min=8, max=6156, avg= 9.52, stdev=17.85
 lat (usec): min=8, max=6156, avg= 9.59, stdev=17.85
clat percentiles (usec):
 |  1.00th=[8],  5.00th=[8], 10.00th=[8], 20.00th=[8],
 | 30.00th=[8], 40.00th=[9], 50.00th=[9], 60.00th=[9],
 | 70.00th=[9], 80.00th=[9], 90.00th=[   11], 95.00th=[   18],
 | 99.00th=[   20], 99.50th=[   23], 99.90th=[   29], 99.95th=[   35],
 | 99.99th=[   51]
bw (KB  /s): min=368336, max=419216, per=99.98%, avg=406197.88, 
stdev=11905.08
lat (usec) : 10=86.21%, 20=12.49%, 50=1.29%, 100=0.01%, 250=0.01%
lat (usec) : 500=0.01%, 750=0.01%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%
  cpu  : usr=18.81%, sys=11.95%, ctx=6094194, majf=0, minf=116
  IO depths: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
 submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 issued: total=r=0/w=6094190/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
 latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: io=23805MB, aggrb=406279KB/s, minb=406279KB/s, maxb=406279KB/s,
mint=60001msec, maxt=60001msec

Disk stats (read/write):
  nvme0n1: ios=74/6087837, merge=0/0, ticks=5/43645, in_queue=43423, util=71.60%



[root@hv01 ~]# fio --filename=/dev/nvme0n1p4 --direct=1 --sync=1 --rw=write 
--bs=4k --numjobs=5 --iodepth=1 --runtime=60 --time_based --group_reporting 
--name=journal-test

journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
fio-2.2.8
Starting 5 processes
Jobs: 5 (f=5)
journal-test: (groupid=0, jobs=5): err= 0: pid=87229: Tue Jan 12 03:07:31 2016
  write: io=54260MB, bw=926023KB/s, iops=231505, runt= 60001msec
clat (usec): min=8, max=12011, avg=20.95, stdev=64.79
 lat (usec): min=8, max=12012, avg=21.06, stdev=64.79
clat percentiles (usec):
 |  1.00th=[9],  5.00th=[   10], 10.00th=[   11], 20.00th=[   12],
 | 30.00th=[   13], 40.00th=[   14], 50.00th=[   16], 60.00th=[   17],
 | 70.00th=[   19], 80.00th=[   23], 90.00th=[   28], 95.00th=[   33],
 | 99.00th=[  108], 99.50th=[  203], 99.90th=[  540], 99.95th=[  684],
 | 99.99th=[ 1272]
bw (KB  /s): min=132048, max=236048, per=20.01%, avg=185337.04, 
stdev=17916.73
lat (usec) : 10=2.84%, 20=67.27%, 50=27.54%, 100=1.28%, 250=0.69%
lat (usec) : 500=0.27%, 750=0.08%, 1000=0.02%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu  : usr=5.62%, sys=21.65%, ctx=13890559, majf=0, minf=576
  IO depths: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
 submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 issued: 

Re: [ceph-users] the state of cephfs in giant

2014-10-13 Thread Jeff Bailey

On 10/13/2014 4:56 PM, Sage Weil wrote:

On Mon, 13 Oct 2014, Eric Eastman wrote:

I would be interested in testing the Samba VFS and Ganesha NFS integration
with CephFS.  Are there any notes on how to configure these two interfaces
with CephFS?


For ganesha I'm doing something like:

FSAL
{
  CEPH
  {
FSAL_Shared_Library = libfsalceph.so;
  }
}

EXPORT
{
  Export_Id = 1;
  Path = 131.123.35.53:/;
  Pseudo = /ceph;
  Tag = ceph;
  FSAL
  {
Name = CEPH;
  }
}



For samba, based on
https://github.com/ceph/ceph-qa-suite/blob/master/tasks/samba.py#L106
I think you need something like

[myshare]
path = /
writeable = yes
vfs objects = ceph
ceph:config_file = /etc/ceph/ceph.conf
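With those configs in place, both exports can be sanity-checked from a client, roughly like this (hostnames, share name, and credentials are illustrative):

```shell
# NFS: confirm the ganesha export is advertised
showmount -e nfs-server.example.com

# SMB: list the share contents through the Ceph VFS module
smbclient //smb-server.example.com/myshare -U testuser -c 'ls'
```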

Not sure what the ganesha config looks like.  Matt and the other folks at
cohortfs would know more.

sage




Re: [ceph-users] ulimit max user processes (-u) and non-root ceph clients

2013-12-16 Thread Jeff Bailey

On 12/16/2013 2:36 PM, Dan Van Der Ster wrote:

On Dec 16, 2013 8:26 PM, Gregory Farnum g...@inktank.com wrote:

On Mon, Dec 16, 2013 at 11:08 AM, Dan van der Ster
daniel.vanders...@cern.ch wrote:

Hi,

Sorry to revive this old thread, but I wanted to update you on the current
pains we're going through related to clients' nproc (and now nofile)
ulimits. When I started this thread we were using RBD for Glance images
only, but now we're trying to enable RBD-backed Cinder volumes and are not
really succeeding at the moment :(

As we had guessed from our earlier experience, librbd and therefore qemu-kvm
need increased nproc/nofile limits otherwise VMs will freeze. In fact we
just observed a lockup of a test VM due to the RBD device blocking
completely (this appears as blocked flush processes in the VM); we're
actually not sure which of the nproc/nofile limits caused the freeze, but it
was surely one of those.

And the main problem we face now is that it isn't trivial to increase the
limits of qemu-kvm on a running OpenStack hypervisor -- the values are set
by libvirtd and seem to require a restart of all guest VMs on a host to
reload a qemu config file. I'll update this thread when we find the solution
to that...

Is there some reason you can't just set it ridiculously high to start with?


As I mentioned, we haven't yet found a way to change the limits without 
affecting (stopping) the existing running (important) VMs. We thought that 
/etc/security/limits.conf would do the trick, but alas limits there have no 
effect on qemu.


I don't know whether qemu (perhaps librbd to be more precise?) is aware 
of the limits and avoids them or simply gets errors when it exceeds 
them.  If it's the latter then couldn't you just use prlimit to change 
them?  If that's not possible then maybe just change the limit settings, 
migrate the VM and then migrate it back?



Cheers, Dan


Moving forward, IMHO it would be much better if Ceph clients could
gracefully work with large clusters without _requiring_ changes to the
ulimits. I understand that such poorly configured clients would necessarily
have decreased performance (since librados would need to use a thread pool
and also lose some of the persistent client-OSD connections). But client
lockups are IMHO worse than slightly lower performance.

Have you guys discussed the client ulimit issues recently and is there a
plan in the works?

I'm afraid not. It's a plannable but non-trivial amount of work and
the Inktank dev team is pretty well booked for a while. Anybody
running into this as a serious bottleneck should
1) try and start a community effort
2) try and promote it as a priority with any Inktank business contacts
they have.
(You are only the second group to report it as an ongoing concern
rather than a one-off hiccup, and honestly it sounds like you're just
having issues with hitting the arbitrary limits, not with real
resource exhaustion issues.)
:)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


