Re: [ceph-users] MON dedicated hosts

2018-12-17 Thread Sam Huracan
Thanks, Konstantin and Martin,

So with a 200 TB cluster, the most affordable choice is adding MONs to the OSD
hosts, while preparing enough CPU and RAM for the MON services and enough
storage for the LevelDB store.
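
A minimal sketch of the checks we plan to run on a colocated MON/OSD host
(paths assume a default deployment; the option name below is from the docs,
not something verified on this cluster):

# size of the monitor's key/value store
du -sh /var/lib/ceph/mon/ceph-$(hostname -s)/store.db

# threshold at which Ceph warns about a large mon store (default is about 15 GB)
ceph daemon mon.$(hostname -s) config get mon_data_size_warn

# keep an eye on memory headroom while the OSDs and the MON share the host
free -m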



On Mon, 17 Dec 2018 at 16:55, Martin Verges <
martin.ver...@croit.io> wrote:

> Hello,
>
> we do not see a problem with a small cluster having 3 MONs on OSD hosts.
> However, we do suggest using 5 MONs.
> Nearly every one of our customers does this without a problem! Please just
> make sure to have enough CPU/RAM/disk available.
>
> So:
> 1. No, not necessary; only if you want to spend more money than required.
> 2. Maybe think about it when your cluster grows beyond 500, maybe 1k OSDs, or
> simply if the cluster design would be easier with dedicated servers.
>
> Hint: Our trainings cover a Ceph cluster planning session that covers
> exactly such topics. See https://croit.io/training.
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
> On Mon, 17 Dec 2018 at 10:10, Sam Huracan
> wrote:
> >
> > Hi everybody,
> >
> > We've runned a 50TB Cluster with 3 MON services on the same nodes with
> OSDs.
> > We are planning to upgrade to 200TB, I have 2 questions:
> >  1.  Should we separate MON services to dedicated hosts?
> >  2.  From your experiences, how size of cluster we shoud consider to put
> MON on dedicated hosts?
> >
> >
> > Thanks in advance.
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] MON dedicated hosts

2018-12-17 Thread Sam Huracan
Hi everybody,

We've been running a 50 TB cluster with 3 MON services on the same nodes as the OSDs.
We are planning to upgrade to 200 TB, and I have 2 questions:
 1.  Should we move the MON services to dedicated hosts?
 2.  From your experience, at what cluster size should we consider putting
MONs on dedicated hosts?


Thanks in advance.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] FileStore SSD (journal) vs BlueStore SSD (DB/Wal)

2018-08-05 Thread Sam Huracan
Hi,

If anyone has real-world experience with this case, could you give me more
information and an estimate?

Thanks.

2018-08-05 15:00 GMT+07:00 Sam Huracan :

> Thanks Saludos!
>
> As far as I know, we should keep the FileStore SSD Journal after
> upgrading, because BlueStore will affect the write performance??
> I think I'll choose Luminous which is recently the most stable version.
>
>
>
>
>
>
>
> On Sat, Aug 4, 2018, 03:31 Xavier Trilla 
> wrote:
>
>> Hi Sam,
>>
>>
>>
>> Not having done any benchmarks myself (as we only use SSDs or NVMes), it is my
>> understanding that on Luminous (I would not recommend upgrading production to Mimic
>> yet, but I'm quite conservative) BlueStore is going to be slower for writes
>> than FileStore with SSD journals.
>>
>>
>>
>> You could try dm-cache, bcache, etc. and add some SSD caching to each HDD
>> (meaning it can affect the write endurance of the SSDs).
>>
>>
>>
>> dm-cache plus BlueStore seems to be quite an interesting option IMO, as
>> you'll get faster reads and writes, and you'll avoid the double-write
>> penalty of FileStore.
>>
>>
>>
>> Cheers!
>>
>>
>>
>> Saludos Cordiales,
>>
>> Xavier Trilla P.
>>
>> Clouding.io <https://clouding.io/>
>>
>>
>>
>> A Cloud Server with SSDs, redundant
>>
>> and available in less than 30 seconds?
>>
>>
>>
>> Try it now at Clouding.io <https://clouding.io/>!
>>
>>
>>
>> *From:* ceph-users  *On behalf of *Sam
>> Huracan
>> *Sent:* Friday, 3 August 2018 16:36
>> *To:* ceph-users@lists.ceph.com
>> *Subject:* Re: [ceph-users] FileStore SSD (journal) vs BlueStore SSD
>> (DB/Wal)
>>
>>
>>
>> Hi,
>>
>>
>>
>> Anyone can help us answer these questions?
>>
>>
>>
>>
>>
>>
>>
>> 2018-08-03 8:36 GMT+07:00 Sam Huracan :
>>
>> Hi Cephers,
>>
>>
>>
>> We intend to upgrade our Cluster from Jewel to Luminous (or Mimic?)
>>
>>
>>
>> Our model is currently using OSD File Store with SSD Journal (1 SSD for 7
>> SATA 7.2K)
>>
>>
>>
>> My question are:
>>
>>
>>
>>
>>
>> 1.Should we change to BlueStore with DB/WAL put in SSD and data in HDD?
>> (we want to keep the model using journal SSD for caching). Is there any
>> improvement in overall performance? We think with model SSD cache,
>> FileStore will write faster because data written in SSD before flushing to
>> SAS, whereas with BlueStore, data will be written directly to SAS.
>>
>>
>>
>>
>>
>> 2. Do you guys ever benchmark and compare 2 cluster: FileStore SSD
>> (journal)  and BlueStore SSD (DB/WAL) like that?
>>
>>
>>
>>
>>
>> Thanks in advance.
>>
>>
>>
>>
>>
>>
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
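
A rough sketch of the bcache variant of the SSD-caching idea quoted above
(assuming bcache-tools is installed; the device names are purely illustrative
and this is untested here):

# create a caching set from one SSD partition and attach one HDD as backing device
make-bcache -C /dev/nvme0n1p1 -B /dev/sdb

# the resulting /dev/bcache0 is the device the OSD would then be deployed on;
# writeback mode gives the write gain, at the cost of SSD endurance as noted above
echo writeback > /sys/block/bcache0/bcache/cache_mode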


Re: [ceph-users] FileStore SSD (journal) vs BlueStore SSD (DB/Wal)

2018-08-05 Thread Sam Huracan
Thanks Saludos!

As far as I know, we should keep the FileStore SSD journal after upgrading,
because BlueStore would hurt write performance, right?
I think I'll choose Luminous, which is currently the most stable version.







On Sat, Aug 4, 2018, 03:31 Xavier Trilla  wrote:

> Hi Sam,
>
>
>
> Not having done any benchmarks myself (as we only use SSDs or NVMes), it is my
> understanding that on Luminous (I would not recommend upgrading production to Mimic
> yet, but I'm quite conservative) BlueStore is going to be slower for writes
> than FileStore with SSD journals.
>
>
>
> You could try dm-cache, bcache, etc. and add some SSD caching to each HDD
> (meaning it can affect the write endurance of the SSDs).
>
>
>
> dm-cache plus BlueStore seems to be quite an interesting option IMO, as
> you'll get faster reads and writes, and you'll avoid the double-write
> penalty of FileStore.
>
>
>
> Cheers!
>
>
>
> Saludos Cordiales,
>
> Xavier Trilla P.
>
> Clouding.io <https://clouding.io/>
>
>
>
> A Cloud Server with SSDs, redundant
>
> and available in less than 30 seconds?
>
>
>
> Try it now at Clouding.io <https://clouding.io/>!
>
>
>
> *From:* ceph-users  *On behalf of *Sam
> Huracan
> *Sent:* Friday, 3 August 2018 16:36
> *To:* ceph-users@lists.ceph.com
> *Subject:* Re: [ceph-users] FileStore SSD (journal) vs BlueStore SSD
> (DB/Wal)
>
>
>
> Hi,
>
>
>
> Anyone can help us answer these questions?
>
>
>
>
>
>
>
> 2018-08-03 8:36 GMT+07:00 Sam Huracan :
>
> Hi Cephers,
>
>
>
> We intend to upgrade our Cluster from Jewel to Luminous (or Mimic?)
>
>
>
> Our model is currently using OSD File Store with SSD Journal (1 SSD for 7
> SATA 7.2K)
>
>
>
> My question are:
>
>
>
>
>
> 1.Should we change to BlueStore with DB/WAL put in SSD and data in HDD?
> (we want to keep the model using journal SSD for caching). Is there any
> improvement in overall performance? We think with model SSD cache,
> FileStore will write faster because data written in SSD before flushing to
> SAS, whereas with BlueStore, data will be written directly to SAS.
>
>
>
>
>
> 2. Do you guys ever benchmark and compare 2 cluster: FileStore SSD
> (journal)  and BlueStore SSD (DB/WAL) like that?
>
>
>
>
>
> Thanks in advance.
>
>
>
>
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] FileStore SSD (journal) vs BlueStore SSD (DB/Wal)

2018-08-03 Thread Sam Huracan
Hi,

Can anyone help us answer these questions?



2018-08-03 8:36 GMT+07:00 Sam Huracan :

> Hi Cephers,
>
> We intend to upgrade our Cluster from Jewel to Luminous (or Mimic?)
>
> Our model is currently using OSD File Store with SSD Journal (1 SSD for 7
> SATA 7.2K)
>
> My question are:
>
>
> 1.Should we change to BlueStore with DB/WAL put in SSD and data in HDD?
> (we want to keep the model using journal SSD for caching). Is there any
> improvement in overall performance? We think with model SSD cache,
> FileStore will write faster because data written in SSD before flushing to
> SAS, whereas with BlueStore, data will be written directly to SAS.
>
>
> 2. Do you guys ever benchmark and compare 2 cluster: FileStore SSD
> (journal)  and BlueStore SSD (DB/WAL) like that?
>
>
> Thanks in advance.
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] FileStore SSD (journal) vs BlueStore SSD (DB/Wal)

2018-08-02 Thread Sam Huracan
Hi Cephers,

We intend to upgrade our Cluster from Jewel to Luminous (or Mimic?)

Our current model uses FileStore OSDs with SSD journals (1 SSD for 7
SATA 7.2K HDDs).

My questions are:


1. Should we change to BlueStore with the DB/WAL on SSD and the data on HDD?
(We want to keep the model of using an SSD journal for caching.) Is there any
improvement in overall performance? We think that with the SSD-cache model,
FileStore will write faster because data is written to the SSD before being
flushed to SAS, whereas with BlueStore data will be written directly to SAS.


2. Have you guys ever benchmarked and compared two such clusters: FileStore with
SSD journal vs. BlueStore with SSD DB/WAL?


Thanks in advance.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
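
For reference, a minimal sketch of the DB/WAL-on-SSD layout asked about in
question 1 (assuming a Luminous-era ceph-volume; device names are illustrative,
and the WAL stays with the DB unless placed separately):

# BlueStore data on the HDD, RocksDB (and its WAL) on an SSD partition
ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdk1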


Re: [ceph-users] Fwd: High IOWait Issue

2018-03-26 Thread Sam Huracan
Hi,

We are using the RAID cache in writeback mode for the SSD journal; I suspect this is
the reason the utilization of the SSD journal is so low.
Is that true? If anybody has experience with this, please confirm.

Thanks

2018-03-26 23:00 GMT+07:00 Sam Huracan <nowitzki.sa...@gmail.com>:

> Thanks for your information.
> Here is result when I run atop on 1 Ceph HDD host:
> http://prntscr.com/iwmc86
>
> There is some disk busy with over 100%, but the SSD journal (SSD) use only
> 3%, is it normal? Is there any way to optimize using of SSD journal? Could
> you give me some keyword?
>
> Here is configuration of Ceph HDD Host:
> Dell PowerEdge R730xd Server Quantity
> PE R730/xd Motherboard 1
> Intel Xeon E5-2620 v4 2.1GHz,20M Cache,8.0GT/s QPI,Turbo,HT,8C/16T (85W)
> Max Mem 2133MHz 1
> 16GB RDIMM, 2400MT/s, Dual Rank, x8 Data Width 2
> 300GB 15K RPM SAS 12Gbps 2.5in Flex Bay Hard Drive - OS Drive (RAID 1) 2
> 4TB 7.2K RPM NLSAS 12Gbps 512n 3.5in Hot-plug Hard Drive - OSD Drive 7
> 200GB Solid State Drive SATA Mix Use MLC 6Gbps 2.5in Hot-plug Drive -
> Journal Drive (RAID 1) 2
> PERC H730 Integrated RAID Controller, 1GB Cache *(we are using Writeback
> mode)* 1
> Dual, Hot-plug, Redundant Power Supply (1+1), 750W 1
> Broadcom 5720 QP 1Gb Network Daughter Card 1
> QLogic 57810 Dual Port 10Gb Direct Attach/SFP+ Network Adapter 1
>
> For some reasons, we can't configure Jumbo Frame in this cluster. We'll
> refer your suggest about scrub.
>
>
> 2018-03-26 7:41 GMT+07:00 Christian Balzer <ch...@gol.com>:
>
>>
>> Hello,
>>
>> in general and as reminder for others, the more information you supply,
>> the more likely are people to answer and answer with actually pertinent
>> information.
>> Since you haven't mentioned the hardware (actual HDD/SSD models, CPU/RAM,
>> controllers, etc) we're still missing a piece of the puzzle that could be
>> relevant.
>>
>> But given what we have some things are more likely than others.
>> Also, an inline 90KB screenshot of a TEXT iostat output is a bit of a
>> no-no, never mind that atop instead of top from the start would have given
>> you and us much more insight.
>>
>> On Sun, 25 Mar 2018 14:35:57 +0700 Sam Huracan wrote:
>>
>> > Thank you all.
>> >
>> > 1. Here is my ceph.conf file:
>> > https://pastebin.com/xpF2LUHs
>> >
>> As Lazlo noted (and it matches your iostat output beautifully), tuning
>> down scrubs is likely going to have an immediate beneficial impact, as
>> deep-scrubs in particular are VERY disruptive and I/O intense operations.
>>
>> However the "osd scrub sleep = 0.1" may make things worse in certain Jewel
>> versions, as they all went through the unified queue and this would cause
>> a sleep for ALL operations, not just the scrub ones.
>> I can't remember when this was fixed and the changelog is of no help, so
>> hopefully somebody who knows will pipe up.
>> If in doubt of course, experiment.
>>
>> In addition to that, if you have low usage times, set
>> your osd_scrub_(start|end)_hour accordingly and also check the ML archives
>> for other scrub scheduling tips.
>>
>> I'd also leave these:
>> filestore max sync interval = 100
>> filestore min sync interval = 50
>> filestore queue max ops  = 5000
>> filestore queue committing max ops  = 5000
>> journal max write entries  = 1000
>> journal queue max ops  = 5000
>>
>> at their defaults, playing with those parameters requires a good
>> understanding of how Ceph filestore works AND usually only makes sense
>> with SSD/NVMe setups.
>> Especially the first 2 could lead to quite the IO pileup.
>>
>>
>> > 2. Here is result from ceph -s:
>> > root@ceph1:/etc/ceph# ceph -s
>> > cluster 31154d30-b0d3-4411-9178-0bbe367a5578
>> >  health HEALTH_OK
>> >  monmap e3: 3 mons at {ceph1=
>> > 10.0.30.51:6789/0,ceph2=10.0.30.52:6789/0,ceph3=10.0.30.53:6789/0}
>> > election epoch 18, quorum 0,1,2 ceph1,ceph2,ceph3
>> >  osdmap e2473: 63 osds: 63 up, 63 in
>> > flags sortbitwise,require_jewel_osds
>> >   pgmap v34069952: 4096 pgs, 6 pools, 21534 GB data, 5696 kobjects
>> > 59762 GB used, 135 TB / 194 TB avail
>> > 4092 active+clean
>> >2 active+clean+scrubbing
>> >2 active+clean+scrubbing+deep
>> >   client io 36096 kB/s rd, 41611 kB/s wr, 1643 op/s rd, 1634 op/s wr
>> >
>> See above about deep-scrub, which will read A

[ceph-users] Fwd: Fwd: High IOWait Issue

2018-03-26 Thread Sam Huracan
Thanks for your information.
Here is the result when I run atop on one Ceph HDD host:
http://prntscr.com/iwmc86

Some disks are busy at over 100%, but the SSD journal is only about 3% utilized;
is that normal? Is there any way to make better use of the SSD journal? Could
you give me some keywords?

Here is the configuration of a Ceph HDD host (Dell PowerEdge R730xd, component x quantity):
 - PE R730/xd Motherboard x1
 - Intel Xeon E5-2620 v4 2.1GHz, 20M Cache, 8.0GT/s QPI, Turbo, HT, 8C/16T (85W), Max Mem 2133MHz x1
 - 16GB RDIMM, 2400MT/s, Dual Rank, x8 Data Width x2
 - 300GB 15K RPM SAS 12Gbps 2.5in Flex Bay Hard Drive - OS Drive (RAID 1) x2
 - 4TB 7.2K RPM NLSAS 12Gbps 512n 3.5in Hot-plug Hard Drive - OSD Drive x7
 - 200GB Solid State Drive SATA Mix Use MLC 6Gbps 2.5in Hot-plug Drive - Journal Drive (RAID 1) x2
 - PERC H730 Integrated RAID Controller, 1GB Cache (we are using writeback mode) x1
 - Dual, Hot-plug, Redundant Power Supply (1+1), 750W x1
 - Broadcom 5720 QP 1Gb Network Daughter Card x1
 - QLogic 57810 Dual Port 10Gb Direct Attach/SFP+ Network Adapter x1

For some reasons, we can't configure jumbo frames in this cluster. We'll
follow your suggestions about scrubbing.


2018-03-26 7:41 GMT+07:00 Christian Balzer <ch...@gol.com>:

>
> Hello,
>
> in general and as reminder for others, the more information you supply,
> the more likely are people to answer and answer with actually pertinent
> information.
> Since you haven't mentioned the hardware (actual HDD/SSD models, CPU/RAM,
> controllers, etc) we're still missing a piece of the puzzle that could be
> relevant.
>
> But given what we have some things are more likely than others.
> Also, an inline 90KB screenshot of a TEXT iostat output is a bit of a
> no-no, never mind that atop instead of top from the start would have given
> you and us much more insight.
>
> On Sun, 25 Mar 2018 14:35:57 +0700 Sam Huracan wrote:
>
> > Thank you all.
> >
> > 1. Here is my ceph.conf file:
> > https://pastebin.com/xpF2LUHs
> >
> As Lazlo noted (and it matches your iostat output beautifully), tuning
> down scrubs is likely going to have an immediate beneficial impact, as
> deep-scrubs in particular are VERY disruptive and I/O intense operations.
>
> However the "osd scrub sleep = 0.1" may make things worse in certain Jewel
> versions, as they all went through the unified queue and this would cause
> a sleep for ALL operations, not just the scrub ones.
> I can't remember when this was fixed and the changelog is of no help, so
> hopefully somebody who knows will pipe up.
> If in doubt of course, experiment.
>
> In addition to that, if you have low usage times, set
> your osd_scrub_(start|end)_hour accordingly and also check the ML archives
> for other scrub scheduling tips.
>
> I'd also leave these:
> filestore max sync interval = 100
> filestore min sync interval = 50
> filestore queue max ops  = 5000
> filestore queue committing max ops  = 5000
> journal max write entries  = 1000
> journal queue max ops  = 5000
>
> at their defaults, playing with those parameters requires a good
> understanding of how Ceph filestore works AND usually only makes sense
> with SSD/NVMe setups.
> Especially the first 2 could lead to quite the IO pileup.
>
>
> > 2. Here is result from ceph -s:
> > root@ceph1:/etc/ceph# ceph -s
> > cluster 31154d30-b0d3-4411-9178-0bbe367a5578
> >  health HEALTH_OK
> >  monmap e3: 3 mons at {ceph1=
> > 10.0.30.51:6789/0,ceph2=10.0.30.52:6789/0,ceph3=10.0.30.53:6789/0}
> > election epoch 18, quorum 0,1,2 ceph1,ceph2,ceph3
> >  osdmap e2473: 63 osds: 63 up, 63 in
> > flags sortbitwise,require_jewel_osds
> >   pgmap v34069952: 4096 pgs, 6 pools, 21534 GB data, 5696 kobjects
> > 59762 GB used, 135 TB / 194 TB avail
> > 4092 active+clean
> >2 active+clean+scrubbing
> >2 active+clean+scrubbing+deep
> >   client io 36096 kB/s rd, 41611 kB/s wr, 1643 op/s rd, 1634 op/s wr
> >
> See above about deep-scrub, which will read ALL the objects of the PG
> being scrubbed and thus not only saturates the OSDs involved with reads
> but ALSO dirties the pagecache with cold objects, making other reads on
> the nodes slow by requiring them to hit the disks, too.
>
> It would be interesting to see a "ceph -s" when your cluster is busy but
> NOT scrubbing, 1600 write op/s are about what 21 HDDs can handle.
> So for the time being, disable scrubs entirely and see if your problems
> go away.
> If so, you now know the limits of your current setup and will want to
> avoid hitting them again.
>
> Having a dedicated SSD pool for high-end VMs or a cache-tier (if it is a
> fit, n
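
A minimal sketch of the scrub throttling suggested above (Jewel-era commands;
the hours are illustrative, and the begin/end option names are the upstream
ones, osd_scrub_begin_hour / osd_scrub_end_hour):

# stop scrubbing entirely while investigating
ceph osd set noscrub
ceph osd set nodeep-scrub

# later: confine scrubs to a quiet window instead
ceph tell osd.* injectargs '--osd_scrub_begin_hour 1 --osd_scrub_end_hour 6'

# re-enable scrubbing once the window is in place
ceph osd unset noscrub
ceph osd unset nodeep-scrub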

Re: [ceph-users] Fwd: High IOWait Issue

2018-03-25 Thread Sam Huracan
Thank you all.

1. Here is my ceph.conf file:
https://pastebin.com/xpF2LUHs

2. Here is result from ceph -s:
root@ceph1:/etc/ceph# ceph -s
cluster 31154d30-b0d3-4411-9178-0bbe367a5578
 health HEALTH_OK
 monmap e3: 3 mons at {ceph1=
10.0.30.51:6789/0,ceph2=10.0.30.52:6789/0,ceph3=10.0.30.53:6789/0}
election epoch 18, quorum 0,1,2 ceph1,ceph2,ceph3
 osdmap e2473: 63 osds: 63 up, 63 in
flags sortbitwise,require_jewel_osds
  pgmap v34069952: 4096 pgs, 6 pools, 21534 GB data, 5696 kobjects
59762 GB used, 135 TB / 194 TB avail
4092 active+clean
   2 active+clean+scrubbing
   2 active+clean+scrubbing+deep
  client io 36096 kB/s rd, 41611 kB/s wr, 1643 op/s rd, 1634 op/s wr



3. We use 1 SSD (/dev/sdi) as the journal device for 7 HDDs, with 16GB per
journal partition. Here is the result of the ceph-disk list command:

/dev/sda :
 /dev/sda1 ceph data, active, cluster ceph, osd.0, journal /dev/sdi1
/dev/sdb :
 /dev/sdb1 ceph data, active, cluster ceph, osd.1, journal /dev/sdi2
/dev/sdc :
 /dev/sdc1 ceph data, active, cluster ceph, osd.2, journal /dev/sdi3
/dev/sdd :
 /dev/sdd1 ceph data, active, cluster ceph, osd.3, journal /dev/sdi4
/dev/sde :
 /dev/sde1 ceph data, active, cluster ceph, osd.4, journal /dev/sdi5
/dev/sdf :
 /dev/sdf1 ceph data, active, cluster ceph, osd.5, journal /dev/sdi6
/dev/sdg :
 /dev/sdg1 ceph data, active, cluster ceph, osd.6, journal /dev/sdi7
/dev/sdh :
 /dev/sdh3 other, LVM2_member
 /dev/sdh1 other, vfat, mounted on /boot/efi
/dev/sdi :
 /dev/sdi1 ceph journal, for /dev/sda1
 /dev/sdi2 ceph journal, for /dev/sdb1
 /dev/sdi3 ceph journal, for /dev/sdc1
 /dev/sdi4 ceph journal, for /dev/sdd1
 /dev/sdi5 ceph journal, for /dev/sde1
 /dev/sdi6 ceph journal, for /dev/sdf1
 /dev/sdi7 ceph journal, for /dev/sdg1

4. For iostat, we just run "iostat -x 2"; /dev/sdi is the journal SSD,
/dev/sdh is the OS disk, and the rest are OSD disks.
root@ceph1:/etc/ceph# lsblk
NAME MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda8:00   3.7T  0 disk
└─sda1 8:10   3.7T  0 part
/var/lib/ceph/osd/ceph-0
sdb8:16   0   3.7T  0 disk
└─sdb1 8:17   0   3.7T  0 part
/var/lib/ceph/osd/ceph-1
sdc8:32   0   3.7T  0 disk
└─sdc1 8:33   0   3.7T  0 part
/var/lib/ceph/osd/ceph-2
sdd8:48   0   3.7T  0 disk
└─sdd1 8:49   0   3.7T  0 part
/var/lib/ceph/osd/ceph-3
sde8:64   0   3.7T  0 disk
└─sde1 8:65   0   3.7T  0 part
/var/lib/ceph/osd/ceph-4
sdf8:80   0   3.7T  0 disk
└─sdf1 8:81   0   3.7T  0 part
/var/lib/ceph/osd/ceph-5
sdg8:96   0   3.7T  0 disk
└─sdg1 8:97   0   3.7T  0 part
/var/lib/ceph/osd/ceph-6
sdh8:112  0 278.9G  0 disk
├─sdh1 8:113  0   512M  0 part /boot/efi
└─sdh3 8:115  0 278.1G  0 part
  ├─hnceph--hdd1--vg-swap (dm-0) 252:00  59.6G  0 lvm  [SWAP]
  └─hnceph--hdd1--vg-root (dm-1) 252:10 218.5G  0 lvm  /
sdi8:128  0 185.8G  0 disk
├─sdi1 8:129  0  16.6G  0 part
├─sdi2 8:130  0  16.6G  0 part
├─sdi3 8:131  0  16.6G  0 part
├─sdi4 8:132  0  16.6G  0 part
├─sdi5 8:133  0  16.6G  0 part
├─sdi6 8:134  0  16.6G  0 part
└─sdi7 8:135  0  16.6G  0 part

Could you give me some ideas on what to check next?


2018-03-25 12:25 GMT+07:00 Budai Laszlo <laszlo.bu...@gmail.com>:

> could you post the result of "ceph -s" ? besides the health status there
> are other details that could help, like the status of your PGs., also the
> result of "ceph-disk list" would be useful to understand how your disks are
> organized. For instance with 1 SSD for 7 HDD the SSD could be the
> bottleneck.
> From the outputs you gave us we don't know which are the spinning disks
> and which is the ssd (looking at the numbers I suspect that sdi is your
> SSD). we also don't kow what parameters were you using when you've ran the
> iostat command.
>
> Unfortunately it's difficult to help you without knowing more about your
> system.
>
> Kind regards,
> Laszlo
>
> On 24.03.2018 20:19, Sam Huracan wrote:
> > This is from iostat:
> >
> > I'm using Ceph jewel, has no HW error.
> > Ceph  health OK, we've just use 50% total volume.
> >
> >
> > 2018-03-24 22:20 GMT+07:00 <c...@elchaka.de <mailto:c..

Re: [ceph-users] Fwd: High IOWait Issue

2018-03-24 Thread Sam Huracan
This is from iostat:

I'm using Ceph Jewel, with no hardware errors.
Ceph health is OK, and we've only used 50% of the total volume.


2018-03-24 22:20 GMT+07:00 <c...@elchaka.de>:

> I would also check the utilization of your disks with tools like atop.
> Perhaps there is something related in dmesg or thereabouts?
>
> - Mehmet
>
> On 24 March 2018 at 08:17:44 CET, Sam Huracan <
> nowitzki.sa...@gmail.com> wrote:
>>
>>
>> Hi guys,
>> We are running a production OpenStack backend by Ceph.
>>
>> At present, we are meeting an issue relating to high iowait in VM, in
>> some MySQL VM, we see sometime IOwait reaches  abnormal high peaks which
>> lead to slow queries increase, despite load is stable (we test with script
>> simulate real load), you can see in graph.
>> https://prnt.sc/ivndni
>>
>> MySQL VM are place on Ceph HDD Cluster, with 1 SSD journal for 7 HDD. In
>> this cluster, IOwait on each ceph host is about 20%.
>> https://prnt.sc/ivne08
>>
>>
>> Can you guy help me find the root cause of this issue, and how to
>> eliminate this high iowait?
>>
>> Thanks in advance.
>>
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fwd: High IOWait Issue

2018-03-24 Thread Sam Huracan
Hi guys,
We are running a production OpenStack deployment backed by Ceph.

At present, we are facing an issue with high iowait in VMs. In some MySQL VMs,
we sometimes see iowait reach abnormally high peaks, which leads to an increase in
slow queries, even though the load is stable (we test with a script that simulates
real load); you can see this in the graph:
https://prnt.sc/ivndni

The MySQL VMs are placed on a Ceph HDD cluster, with 1 SSD journal per 7 HDDs. In
this cluster, iowait on each Ceph host is about 20%.
https://prnt.sc/ivne08


Can you guys help me find the root cause of this issue, and how to eliminate
this high iowait?

Thanks in advance.
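
A few quick checks that go with this question, as a sketch only (assuming admin
access to a MON and to the OSD hosts):

# per-OSD commit/apply latency, to spot slow journals or disks
ceph osd perf

# overall cluster state, including any scrubbing/recovery competing with client IO
ceph -s

# on each OSD host: per-device utilisation and await
iostat -x 5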
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSDs going down/up at random

2018-01-09 Thread Sam Huracan
Hi Mike,

Could you show the system log from the moment the OSD went down and came back up?
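
For example, something along these lines on the affected host; a sketch only,
with the OSD id and time window as placeholders, and the journalctl step assuming
the OSDs run under systemd:

# messages around the flap in the cluster log and the OSD's own log
grep -i 'osd.12' /var/log/ceph/ceph.log
grep -iE 'heartbeat|fault|wrongly marked' /var/log/ceph/ceph-osd.12.log

# kernel/system side of the same window
journalctl -u ceph-osd@12 --since '2018-01-09 00:00' --until '2018-01-10 12:00'
dmesg -T | tail -n 200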

On Jan 10, 2018 12:52, "Mike O'Connor"  wrote:

> On 10/01/2018 3:52 PM, Linh Vu wrote:
> >
> > Have you checked your firewall?
> >
> There are no iptables rules at this time, but connection tracking is
> enabled. I would expect errors about running out of table space if that
> were an issue.
>
> Thanks
> Mike
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD killed by OOM when many cache available

2017-11-17 Thread Sam Huracan
@Eric: How can I check the status of the fs cache? Why could it be the root cause?

Thanks
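
A small sketch of how to look at this, mirroring the sysctls Eric quotes below
(the values are his, not recommendations verified here):

# how much of RAM is page cache / slab right now
grep -E 'MemFree|Cached|Buffers|Slab' /proc/meminfo

# current settings
sysctl vm.vfs_cache_pressure vm.min_free_kbytes

# applying Eric's values directly (equivalent of the chef sysctl_param blocks below)
sysctl -w vm.vfs_cache_pressure=400
sysctl -w vm.min_free_kbytes=4194304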

2017-11-18 7:30 GMT+07:00 Eric Nelson <ericnel...@gmail.com>:

> One thing that doesn't show up is fs cache, which is likely the cause
> here. We went through this on our SSDs and had to add the following to stop
> the crashes. I believe vm.vfs_cache_pressure and min_free_kbytes were the
> really helpful things in getting the crashes to stop. HTH!
>
> sysctl_param 'vm.vfs_cache_pressure' do
>
>   value 400
>
> end
>
> sysctl_param 'vm.dirty_ratio' do
>
>   value 20
>
> end
>
> sysctl_param 'vm.dirty_background_ratio' do
>
>   value 2
>
> end
>
> sysctl_param 'vm.min_free_kbytes' do
>
>   value 4194304
>
> end
>
>
>
> On Fri, Nov 17, 2017 at 4:24 PM, Sam Huracan <nowitzki.sa...@gmail.com>
> wrote:
>
>> I see some more logs about memory in syslog:
>>
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553749] Node 0 DMA free:14828kB
>> min:32kB low:40kB high:48kB active_anon:0kB inactive_anon:0kB
>> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
>> isolated(file):0kB present:15996kB managed:15896kB mlocked:0kB dirty:0kB
>> writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
>> slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
>> bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB
>> pages_scanned:0 all_unreclaimable? yes
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553753] lowmem_reserve[]: 0 1830
>> 32011 32011 32011
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553755] Node 0 DMA32 free:134848kB
>> min:3860kB low:4824kB high:5788kB active_anon:120628kB
>> inactive_anon:121792kB active_file:653404kB inactive_file:399272kB
>> unevictable:0kB isolated(anon):0kB isolated(file):36kB present:1981184kB
>> managed:1900752kB mlocked:0kB dirty:344kB writeback:0kB mapped:1200kB
>> shmem:0kB slab_reclaimable:239900kB slab_unreclaimable:154560kB
>> kernel_stack:43376kB pagetables:1176kB unstable:0kB bounce:0kB free_pcp:0kB
>> local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
>> all_unreclaimable? no
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553758] lowmem_reserve[]: 0 0
>> 30180 30180 30180
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553760] Node 0 Normal
>> free:237232kB min:63684kB low:79604kB high:95524kB active_anon:3075228kB
>> inactive_anon:629052kB active_file:12544336kB inactive_file:12570716kB
>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:31457280kB
>> managed:30905084kB mlocked:0kB dirty:74368kB writeback:3796kB
>> mapped:48516kB shmem:1276kB slab_reclaimable:713684kB
>> slab_unreclaimable:416404kB kernel_stack:40896kB pagetables:25288kB
>> unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553763] lowmem_reserve[]: 0 0 0 0 0
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553765] Node 0 DMA: 1*4kB (U)
>> 1*8kB (U) 0*16kB 1*32kB (U) 3*64kB (U) 2*128kB (U) 0*256kB 0*512kB 0*1024kB
>> 1*2048kB (M) 3*4096kB (M) = 14828kB
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553772] Node 0 DMA32: 10756*4kB
>> (UME) 11442*8kB (UME) 25*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
>> 0*1024kB 0*2048kB 0*4096kB = 134960kB
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553777] Node 0 Normal: 59473*4kB
>> (UME) 77*8kB (U) 10*16kB (H) 8*32kB (H) 6*64kB (H) 2*128kB (H) 1*256kB (H)
>> 1*512kB (H) 0*1024kB 0*2048kB 0*4096kB = 240332kB
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553784] Node 0 hugepages_total=0
>> hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553784] Node 0 hugepages_total=0
>> hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553785] 6544612 total pagecache
>> pages
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553786] 2347 pages in swap cache
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553787] Swap cache stats: add
>> 2697126, delete 2694779, find 38874122/39241548
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553788] Free swap  = 61498092kB
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553788] Total swap = 62498812kB
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553789] 8363615 pages RAM
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553790] 0 pages HighMem/MovableOnly
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553790] 158182 pages reserved
>> Nov 17 10:47:17 ceph1 kernel: [2810698.553791] 0 pages cma reserved
>>
>> Is it relate to page caches?
>>
>> 2017-11-18 7:22 GMT+07:00 Sam Huracan <nowitzki.sa...@gmail.com&g

Re: [ceph-users] OSD killed by OOM when many cache available

2017-11-17 Thread Sam Huracan
I see some more logs about memory in syslog:

Nov 17 10:47:17 ceph1 kernel: [2810698.553749] Node 0 DMA free:14828kB
min:32kB low:40kB high:48kB active_anon:0kB inactive_anon:0kB
active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:15996kB managed:15896kB mlocked:0kB dirty:0kB
writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB
pages_scanned:0 all_unreclaimable? yes
Nov 17 10:47:17 ceph1 kernel: [2810698.553753] lowmem_reserve[]: 0 1830
32011 32011 32011
Nov 17 10:47:17 ceph1 kernel: [2810698.553755] Node 0 DMA32 free:134848kB
min:3860kB low:4824kB high:5788kB active_anon:120628kB
inactive_anon:121792kB active_file:653404kB inactive_file:399272kB
unevictable:0kB isolated(anon):0kB isolated(file):36kB present:1981184kB
managed:1900752kB mlocked:0kB dirty:344kB writeback:0kB mapped:1200kB
shmem:0kB slab_reclaimable:239900kB slab_unreclaimable:154560kB
kernel_stack:43376kB pagetables:1176kB unstable:0kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
Nov 17 10:47:17 ceph1 kernel: [2810698.553758] lowmem_reserve[]: 0 0 30180
30180 30180
Nov 17 10:47:17 ceph1 kernel: [2810698.553760] Node 0 Normal free:237232kB
min:63684kB low:79604kB high:95524kB active_anon:3075228kB
inactive_anon:629052kB active_file:12544336kB inactive_file:12570716kB
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:31457280kB
managed:30905084kB mlocked:0kB dirty:74368kB writeback:3796kB
mapped:48516kB shmem:1276kB slab_reclaimable:713684kB
slab_unreclaimable:416404kB kernel_stack:40896kB pagetables:25288kB
unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Nov 17 10:47:17 ceph1 kernel: [2810698.553763] lowmem_reserve[]: 0 0 0 0 0
Nov 17 10:47:17 ceph1 kernel: [2810698.553765] Node 0 DMA: 1*4kB (U) 1*8kB
(U) 0*16kB 1*32kB (U) 3*64kB (U) 2*128kB (U) 0*256kB 0*512kB 0*1024kB
1*2048kB (M) 3*4096kB (M) = 14828kB
Nov 17 10:47:17 ceph1 kernel: [2810698.553772] Node 0 DMA32: 10756*4kB
(UME) 11442*8kB (UME) 25*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
0*1024kB 0*2048kB 0*4096kB = 134960kB
Nov 17 10:47:17 ceph1 kernel: [2810698.553777] Node 0 Normal: 59473*4kB
(UME) 77*8kB (U) 10*16kB (H) 8*32kB (H) 6*64kB (H) 2*128kB (H) 1*256kB (H)
1*512kB (H) 0*1024kB 0*2048kB 0*4096kB = 240332kB
Nov 17 10:47:17 ceph1 kernel: [2810698.553784] Node 0 hugepages_total=0
hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Nov 17 10:47:17 ceph1 kernel: [2810698.553784] Node 0 hugepages_total=0
hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Nov 17 10:47:17 ceph1 kernel: [2810698.553785] 6544612 total pagecache pages
Nov 17 10:47:17 ceph1 kernel: [2810698.553786] 2347 pages in swap cache
Nov 17 10:47:17 ceph1 kernel: [2810698.553787] Swap cache stats: add
2697126, delete 2694779, find 38874122/39241548
Nov 17 10:47:17 ceph1 kernel: [2810698.553788] Free swap  = 61498092kB
Nov 17 10:47:17 ceph1 kernel: [2810698.553788] Total swap = 62498812kB
Nov 17 10:47:17 ceph1 kernel: [2810698.553789] 8363615 pages RAM
Nov 17 10:47:17 ceph1 kernel: [2810698.553790] 0 pages HighMem/MovableOnly
Nov 17 10:47:17 ceph1 kernel: [2810698.553790] 158182 pages reserved
Nov 17 10:47:17 ceph1 kernel: [2810698.553791] 0 pages cma reserved

Is it related to the page cache?

2017-11-18 7:22 GMT+07:00 Sam Huracan <nowitzki.sa...@gmail.com>:

> Today, one of our Ceph OSDs was down, I've check syslog and see this OSD
> process was killed by OMM
>
>
> Nov 17 10:01:06 ceph1 kernel: [2807926.762304] Out of memory: Kill process
> 3330 (ceph-osd) score 7 or sacrifice child
> Nov 17 10:01:06 ceph1 kernel: [2807926.763745] Killed process 3330
> (ceph-osd) total-vm:2372392kB, anon-rss:559084kB, file-rss:7268kB
> Nov 17 10:01:06 ceph1 kernel: [2807926.985830] init: ceph-osd (ceph/6)
> main process (3330) killed by KILL signal
> Nov 17 10:01:06 ceph1 kernel: [2807926.985844] init: ceph-osd (ceph/6)
> main process ended, respawning
> Nov 17 10:03:39 ceph1 bash: root [15524]: sudo ceph health detail [0]
> Nov 17 10:03:40 ceph1 bash: root [15524]: sudo ceph health detail [0]
> Nov 17 10:17:01 ceph1 CRON[75167]: (root) CMD (   cd / && run-parts
> --report /etc/cron.hourly)
> Nov 17 10:47:17 ceph1 kernel: [2810698.553690] ceph-status.sh invoked
> oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0
> Nov 17 10:47:17 ceph1 kernel: [2810698.553693] ceph-status.sh cpuset=/
> mems_allowed=0
> Nov 17 10:47:17 ceph1 kernel: [2810698.553697] CPU: 5 PID: 194271 Comm:
> ceph-status.sh Not tainted 4.4.0-62-generic #83~14.04.1-Ubuntu
> Nov 17 10:47:17 ceph1 kernel: [2810698.553698] Hardware name: Dell Inc.
> PowerEdge R730xd/072T6D, BIOS 2.5.5 08/16/2017
> Nov 17 10:47:17 ceph1 kernel: [2810698.5

[ceph-users] OSD killed by OOM when many cache available

2017-11-17 Thread Sam Huracan
Today, one of our Ceph OSDs went down. I checked the syslog and saw that the OSD
process was killed by the OOM killer:


Nov 17 10:01:06 ceph1 kernel: [2807926.762304] Out of memory: Kill process
3330 (ceph-osd) score 7 or sacrifice child
Nov 17 10:01:06 ceph1 kernel: [2807926.763745] Killed process 3330
(ceph-osd) total-vm:2372392kB, anon-rss:559084kB, file-rss:7268kB
Nov 17 10:01:06 ceph1 kernel: [2807926.985830] init: ceph-osd (ceph/6) main
process (3330) killed by KILL signal
Nov 17 10:01:06 ceph1 kernel: [2807926.985844] init: ceph-osd (ceph/6) main
process ended, respawning
Nov 17 10:03:39 ceph1 bash: root [15524]: sudo ceph health detail [0]
Nov 17 10:03:40 ceph1 bash: root [15524]: sudo ceph health detail [0]
Nov 17 10:17:01 ceph1 CRON[75167]: (root) CMD (   cd / && run-parts
--report /etc/cron.hourly)
Nov 17 10:47:17 ceph1 kernel: [2810698.553690] ceph-status.sh invoked
oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0
Nov 17 10:47:17 ceph1 kernel: [2810698.553693] ceph-status.sh cpuset=/
mems_allowed=0
Nov 17 10:47:17 ceph1 kernel: [2810698.553697] CPU: 5 PID: 194271 Comm:
ceph-status.sh Not tainted 4.4.0-62-generic #83~14.04.1-Ubuntu
Nov 17 10:47:17 ceph1 kernel: [2810698.553698] Hardware name: Dell Inc.
PowerEdge R730xd/072T6D, BIOS 2.5.5 08/16/2017
Nov 17 10:47:17 ceph1 kernel: [2810698.553699]  
88001a857b38 813dc4ac 88001a857cf0
Nov 17 10:47:17 ceph1 kernel: [2810698.553701]  
88001a857bc8 811fb066 88083d29e200
Nov 17 10:47:17 ceph1 kernel: [2810698.553703]  88001a857cf0
88001a857c00 8808453cf000 
Nov 17 10:47:17 ceph1 kernel: [2810698.553704] Call Trace:
Nov 17 10:47:17 ceph1 kernel: [2810698.553709]  []
dump_stack+0x63/0x87
Nov 17 10:47:17 ceph1 kernel: [2810698.553713]  []
dump_header+0x5b/0x1d5
Nov 17 10:47:17 ceph1 kernel: [2810698.553717]  []
oom_kill_process+0x205/0x3d0
Nov 17 10:47:17 ceph1 kernel: [2810698.553718]  [] ?
oom_unkillable_task+0x9e/0xc0
Nov 17 10:47:17 ceph1 kernel: [2810698.553720]  []
out_of_memory+0x40b/0x460
Nov 17 10:47:17 ceph1 kernel: [2810698.553722]  []
__alloc_pages_slowpath.constprop.87+0x742/0x7ad
Nov 17 10:47:17 ceph1 kernel: [2810698.553725]  []
__alloc_pages_nodemask+0x237/0x240
Nov 17 10:47:17 ceph1 kernel: [2810698.553727]  []
alloc_kmem_pages_node+0x4d/0xd0
Nov 17 10:47:17 ceph1 kernel: [2810698.553730]  []
copy_process+0x185/0x1ce0
Nov 17 10:47:17 ceph1 kernel: [2810698.553732]  [] ?
security_file_alloc+0x33/0x50
Nov 17 10:47:17 ceph1 kernel: [2810698.553734]  []
_do_fork+0x8a/0x310
Nov 17 10:47:17 ceph1 kernel: [2810698.553737]  [] ?
sigprocmask+0x51/0x80
Nov 17 10:47:17 ceph1 kernel: [2810698.553738]  []
SyS_clone+0x19/0x20
Nov 17 10:47:17 ceph1 kernel: [2810698.553743]  []
entry_SYSCALL_64_fastpath+0x16/0x75
Nov 17 10:47:17 ceph1 kernel: [2810698.553744] Mem-Info:
Nov 17 10:47:17 ceph1 kernel: [2810698.553747] active_anon:798964
inactive_anon:187711 isolated_anon:0
Nov 17 10:47:17 ceph1 kernel: [2810698.553747]  active_file:3299435
inactive_file:3242497 isolated_file:9
Nov 17 10:47:17 ceph1 kernel: [2810698.553747]  unevictable:0 dirty:18678
writeback:949 unstable:0
Nov 17 10:47:17 ceph1 kernel: [2810698.553747]  slab_reclaimable:238396
slab_unreclaimable:142741
Nov 17 10:47:17 ceph1 kernel: [2810698.553747]  mapped:12429 shmem:319
pagetables:6616 bounce:0
Nov 17 10:47:17 ceph1 kernel: [2810698.553747]  free:96727 free_pcp:0
free_cma:0
Nov 17 10:47:17 ceph1 kernel: [2810698.553749] Node 0 DMA free:14828kB
min:32kB low:40kB high:48kB active_anon:0kB inactive_anon:0kB
active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:15996kB managed:15896kB mlocked:0kB dirty:0kB
writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB
pages_scanned:0 all_unreclaimable? yes
Nov 17 10:47:17 ceph1 kernel: [2810698.553753] lowmem_reserve[]: 0 1830
32011 32011 32011
Nov 17 10:47:17 ceph1 kernel: [2810698.553755] Node 0 DMA32 free:134848kB
min:3860kB low:4824kB high:5788kB active_anon:120628kB
inactive_anon:121792kB active_file:653404kB inactive_file:399272kB
unevictable:0kB isolated(anon):0kB isolated(file):36kB present:1981184kB
managed:1900752kB mlocked:0kB dirty:344kB writeback:0kB mapped:1200kB
shmem:0kB slab_reclaimable:239900kB slab_unreclaimable:154560kB
kernel_stack:43376kB pagetables:1176kB unstable:0kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no


We checked free memory; excluding buffers/cache, 24,705 MB is free:

root@ceph1:/var/log# free -m
             total       used       free     shared    buffers     cached
Mem:         32052      31729        323          1        125      24256
-/+ buffers/cache:       7347      24705
Swap:        61033        120      60913



It's so weird. Could you help me solve this problem? I'm afraid it will come
again.

Thanks in 

[ceph-users] Ceph cluster network bandwidth?

2017-11-16 Thread Sam Huracan
Hi,

We intend to build a new Ceph cluster with 6 OSD hosts and 10 SAS disks per
host, using a 10Gbps NIC for the client network; objects are replicated 3 times.

So, how should I size the cluster network for the best performance?
As I have read, 3x replication means 3x the client network bandwidth = 30 Gbps;
is that true? I think that is too much and would add a lot of cost.

Could you give me a suggestion?
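
A rough way to do the arithmetic, as an illustration only: with 3x replication, a
write arrives at the primary OSD over the client network and the primary then sends
two copies to its peers over the cluster network, so cluster-network traffic is
roughly 2x the aggregate client write rate, and reads add nothing to it. Per host,
a node ingesting, say, 2 Gbps of client writes as primary sends about 4 Gbps of
replica traffic out and, in a balanced cluster, receives a similar amount in, so a
single 10 Gbps cluster link per host already leaves headroom; 30 Gbps per host is
not required. These figures are only an illustration of the calculation, not a
measurement.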

Thanks in advance.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: Luminous RadosGW issue

2017-11-15 Thread Sam Huracan
Thanks Hans, I've fixed it.
Ceph Luminous automatically creates a user client.rgw; I didn't know that and
had made a new user, client.radosgw.
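
For anyone hitting the same thing, a minimal sketch of what the matching section
looks like: the section name has to match the running instance (client.rgw.radosgw
here), the values are the ones from this thread, and the auto-created keyring is
left as deployed:

[client.rgw.radosgw]
host = radosgw
rgw dns name = radosgw.demo.com
rgw print continue = false
log file = /var/log/ceph/ceph-client.rgw.radosgw.log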


On Nov 9, 2017 17:03, "Hans van den Bogert" <hansbog...@gmail.com> wrote:

> On Nov 9, 2017, at 5:25 AM, Sam Huracan <nowitzki.sa...@gmail.com> wrote:
>
> root@radosgw system]# ceph --admin-daemon 
> /var/run/ceph/ceph-client.rgw.radosgw.asok
> config show | grep log_file
> "log_file": "/var/log/ceph/ceph-client.rgw.radosgw.log”,
>
>
> The .asok filename resembles what should be used in your config. If I'm
> right, you should use 'client.rgw.radosgw' in your ceph.conf.
>
>
>
> On Nov 9, 2017, at 5:25 AM, Sam Huracan <nowitzki.sa...@gmail.com> wrote:
>
> @Hans: Yes, I tried to redeploy RGW, and ensure client.radosgw.gateway is
> the same in ceph.conf.
> Everything go well, service radosgw running, port 7480 is opened, but all
> my config of radosgw in ceph.conf can't be set, rgw_dns_name is still
> empty, and log file keeps default value.
>
> [root@radosgw system]# ceph --admin-daemon 
> /var/run/ceph/ceph-client.rgw.radosgw.asok
> config show | grep log_file
> "log_file": "/var/log/ceph/ceph-client.rgw.radosgw.log",
>
>
> [root@radosgw system]# cat /etc/ceph/ceph.client.radosgw.keyring
> [client.radosgw.gateway]
> key = AQCsywNaqQdDHxAAC24O8CJ0A9Gn6qeiPalEYg==
> caps mon = "allow rwx"
> caps osd = "allow rwx"
>
>
> 2017-11-09 6:11 GMT+07:00 Hans van den Bogert <hansbog...@gmail.com>:
>
>> Are you sure you deployed it with the client.radosgw.gateway name as
>> well? Try to redeploy the RGW and make sure the name you give it
>> corresponds to the name you give in the ceph.conf. Also, do not forget
>> to push the ceph.conf to the RGW machine.
>>
>> On Wed, Nov 8, 2017 at 11:44 PM, Sam Huracan <nowitzki.sa...@gmail.com>
>> wrote:
>> >
>> >
>> > Hi Cephers,
>> >
>> > I'm testing RadosGW in Luminous version.  I've already installed done
>> in separate host, service is running but RadosGW did not accept any my
>> configuration in ceph.conf.
>> >
>> > My Config:
>> > [client.radosgw.gateway]
>> > host = radosgw
>> > keyring = /etc/ceph/ceph.client.radosgw.keyring
>> > rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
>> > log file = /var/log/radosgw/client.radosgw.gateway.log
>> > rgw dns name = radosgw.demo.com
>> > rgw print continue = false
>> >
>> >
>> > When I show config of radosgw socket:
>> > [root@radosgw ~]# ceph --admin-daemon 
>> > /var/run/ceph/ceph-client.rgw.radosgw.asok
>> config show | grep dns
>> > "mon_dns_srv_name": "",
>> > "rgw_dns_name": "",
>> > "rgw_dns_s3website_name": "",
>> >
>> > rgw_dns_name is empty, hence S3 API is unable to access Ceph Object
>> Storage.
>> >
>> >
>> > Do anyone meet this issue?
>> >
>> > My ceph version I'm  using is ceph-radosgw-12.2.1-0.el7.x86_64
>> >
>> > Thanks in advance
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>>
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: Luminous RadosGW issue

2017-11-08 Thread Sam Huracan
I checked the Ceph pools; the cluster has these pools:

[ceph-deploy@ceph1 cluster-ceph]$ ceph osd lspools
2 rbd,3 .rgw.root,4 default.rgw.control,5 default.rgw.meta,6
default.rgw.log,



2017-11-09 11:25 GMT+07:00 Sam Huracan <nowitzki.sa...@gmail.com>:

> @Hans: Yes, I tried to redeploy RGW, and ensure client.radosgw.gateway is
> the same in ceph.conf.
> Everything go well, service radosgw running, port 7480 is opened, but all
> my config of radosgw in ceph.conf can't be set, rgw_dns_name is still
> empty, and log file keeps default value.
>
> [root@radosgw system]# ceph --admin-daemon 
> /var/run/ceph/ceph-client.rgw.radosgw.asok
> config show | grep log_file
> "log_file": "/var/log/ceph/ceph-client.rgw.radosgw.log",
>
>
> [root@radosgw system]# cat /etc/ceph/ceph.client.radosgw.keyring
> [client.radosgw.gateway]
> key = AQCsywNaqQdDHxAAC24O8CJ0A9Gn6qeiPalEYg==
> caps mon = "allow rwx"
> caps osd = "allow rwx"
>
>
> 2017-11-09 6:11 GMT+07:00 Hans van den Bogert <hansbog...@gmail.com>:
>
>> Are you sure you deployed it with the client.radosgw.gateway name as
>> well? Try to redeploy the RGW and make sure the name you give it
>> corresponds to the name you give in the ceph.conf. Also, do not forget
>> to push the ceph.conf to the RGW machine.
>>
>> On Wed, Nov 8, 2017 at 11:44 PM, Sam Huracan <nowitzki.sa...@gmail.com>
>> wrote:
>> >
>> >
>> > Hi Cephers,
>> >
>> > I'm testing RadosGW in Luminous version.  I've already installed done
>> in separate host, service is running but RadosGW did not accept any my
>> configuration in ceph.conf.
>> >
>> > My Config:
>> > [client.radosgw.gateway]
>> > host = radosgw
>> > keyring = /etc/ceph/ceph.client.radosgw.keyring
>> > rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
>> > log file = /var/log/radosgw/client.radosgw.gateway.log
>> > rgw dns name = radosgw.demo.com
>> > rgw print continue = false
>> >
>> >
>> > When I show config of radosgw socket:
>> > [root@radosgw ~]# ceph --admin-daemon 
>> > /var/run/ceph/ceph-client.rgw.radosgw.asok
>> config show | grep dns
>> > "mon_dns_srv_name": "",
>> > "rgw_dns_name": "",
>> > "rgw_dns_s3website_name": "",
>> >
>> > rgw_dns_name is empty, hence S3 API is unable to access Ceph Object
>> Storage.
>> >
>> >
>> > Do anyone meet this issue?
>> >
>> > My ceph version I'm  using is ceph-radosgw-12.2.1-0.el7.x86_64
>> >
>> > Thanks in advance
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: Luminous RadosGW issue

2017-11-08 Thread Sam Huracan
@Hans: Yes, I tried to redeploy the RGW and made sure client.radosgw.gateway is
the same in ceph.conf.
Everything goes well: the radosgw service is running and port 7480 is open, but
none of my radosgw settings in ceph.conf take effect; rgw_dns_name is still
empty, and the log file keeps its default value.

[root@radosgw system]# ceph --admin-daemon
/var/run/ceph/ceph-client.rgw.radosgw.asok config show | grep log_file
"log_file": "/var/log/ceph/ceph-client.rgw.radosgw.log",


[root@radosgw system]# cat /etc/ceph/ceph.client.radosgw.keyring
[client.radosgw.gateway]
key = AQCsywNaqQdDHxAAC24O8CJ0A9Gn6qeiPalEYg==
caps mon = "allow rwx"
caps osd = "allow rwx"


2017-11-09 6:11 GMT+07:00 Hans van den Bogert <hansbog...@gmail.com>:

> Are you sure you deployed it with the client.radosgw.gateway name as
> well? Try to redeploy the RGW and make sure the name you give it
> corresponds to the name you give in the ceph.conf. Also, do not forget
> to push the ceph.conf to the RGW machine.
>
> On Wed, Nov 8, 2017 at 11:44 PM, Sam Huracan <nowitzki.sa...@gmail.com>
> wrote:
> >
> >
> > Hi Cephers,
> >
> > I'm testing RadosGW in Luminous version.  I've already installed done in
> separate host, service is running but RadosGW did not accept any my
> configuration in ceph.conf.
> >
> > My Config:
> > [client.radosgw.gateway]
> > host = radosgw
> > keyring = /etc/ceph/ceph.client.radosgw.keyring
> > rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
> > log file = /var/log/radosgw/client.radosgw.gateway.log
> > rgw dns name = radosgw.demo.com
> > rgw print continue = false
> >
> >
> > When I show config of radosgw socket:
> > [root@radosgw ~]# ceph --admin-daemon 
> > /var/run/ceph/ceph-client.rgw.radosgw.asok
> config show | grep dns
> > "mon_dns_srv_name": "",
> > "rgw_dns_name": "",
> > "rgw_dns_s3website_name": "",
> >
> > rgw_dns_name is empty, hence S3 API is unable to access Ceph Object
> Storage.
> >
> >
> > Do anyone meet this issue?
> >
> > My ceph version I'm  using is ceph-radosgw-12.2.1-0.el7.x86_64
> >
> > Thanks in advance
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fwd: Luminous RadosGW issue

2017-11-08 Thread Sam Huracan
Hi Cephers,

I'm testing RadosGW on the Luminous version. I've already installed it on a
separate host; the service is running, but RadosGW does not pick up any of my
configuration from ceph.conf.

My Config:
[client.radosgw.gateway]
host = radosgw
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
log file = /var/log/radosgw/client.radosgw.gateway.log
*rgw dns name = radosgw.demo.com *
rgw print continue = false


When I show config of radosgw socket:
[root@radosgw ~]# ceph --admin-daemon
/var/run/ceph/ceph-client.rgw.radosgw.asok
config show | grep dns
"mon_dns_srv_name": "",
"*rgw_dns_name": "",*
"rgw_dns_s3website_name": "",

rgw_dns_name is empty, hence the S3 API is unable to access the Ceph object storage.


Do anyone meet this issue?

The Ceph version I'm using is ceph-radosgw-12.2.1-0.el7.x86_64.

Thanks in advance
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] BlueStore questions about workflow and performance

2017-10-02 Thread Sam Huracan
Hi,

I'm reading this document:
 http://storageconference.us/2017/Presentations/CephObjectStore-slides.pdf

I have 3 questions:

1. Does BlueStore write data (to the raw block device) and metadata (to
RocksDB) simultaneously, or sequentially?

2. In my opinion, the performance of BlueStore cannot match FileStore with an
SSD journal, because the performance of the raw disk is lower than when writes go
through a buffer (that is the buffer's purpose). What do you think?

3. Does putting the RocksDB and its WAL on SSD enhance only write performance,
only read performance, or both?

Hoping for your answers,
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: FileStore vs BlueStore

2017-09-20 Thread Sam Huracan
So why doesn't the journal write only metadata?
As I've read, it is there to ensure data consistency, but I don't know how that
works in detail. And how does BlueStore still ensure consistency without a
journal?

2017-09-20 16:03 GMT+07:00 <c...@jack.fr.eu.org>:

> On 20/09/2017 10:59, Sam Huracan wrote:
> > Hi Cephers,
> >
> > I've read about new BlueStore and have 2 questions:
> >
> > 1. The purpose of BlueStore is eliminating the drawbacks of POSIX when
> > using FileStore. These drawbacks also cause journal, result in double
> write
> > penalty. Could you explain me more detail about POSIX fails when using in
> > FileStore? and how Bluestore still guarantee consistency without journal?
> From what I guess, bluestore uses the definitive storage location as a
> journal
>
> So, filestore:
> - write data to the journal
> - write metadata
> - move data from journal to definitive storage
>
> Bluestore:
> - write data to definitive storage (free space, not overwriting anything)
> - write metadata
>
> >
> > I find a topic on reddit that told by journal, ceph avoid buffer cache,
> is
> > it true? is is drawback of POSIX?
> > https://www.reddit.com/r/ceph/comments/5wbp4d/so_will_
> > bluestore_make_ssd_journals_pointless/
> >
> > 2. We have to put journal on SSD to avoid double write, but it lead to
> > losing some OSDs when SSD fails. With BlueStore model, we can put all
> WAL,
> > metadata, data on 1 disk, make it easily for monitoring, maintaining. But
> > according to this post of Sebastian Han:
> > https://www.sebastien-han.fr/blog/2016/05/04/Ceph-Jewel-
> > configure-BlueStore-with-multiple-devices/
> > We could put WAL, metadata, and data on separate disks for increasing
> > performance, I think it is not different to FileStore model.
> > What is OSD deployment is most optimized? Put all on 1 disk or split on
> > multi disks?
> >
> > Thanks in advance
> >
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fwd: FileStore vs BlueStore

2017-09-20 Thread Sam Huracan
Hi Cephers,

I've read about new BlueStore and have 2 questions:

1. The purpose of BlueStore is to eliminate the drawbacks of POSIX when using
FileStore. These drawbacks are also the reason for the journal, which results in a
double-write penalty. Could you explain in more detail how POSIX falls short when
used in FileStore, and how BlueStore still guarantees consistency without a journal?

I found a topic on Reddit which says that, by using the journal, Ceph avoids the
buffer cache; is that true? Is that a drawback of POSIX?
https://www.reddit.com/r/ceph/comments/5wbp4d/so_will_bluestore_make_ssd_journals_pointless/

2. We have to put the journal on SSD to avoid the double write, but that leads to
losing several OSDs when the SSD fails. With the BlueStore model, we can put the
WAL, metadata, and data all on one disk, which makes monitoring and maintenance
easier. But according to this post by Sébastien Han:
https://www.sebastien-han.fr/blog/2016/05/04/Ceph-Jewel-configure-BlueStore-with-multiple-devices/
we could put the WAL, metadata, and data on separate disks to increase performance,
which I think is not that different from the FileStore model.
Which OSD deployment is most optimized: putting everything on one disk, or
splitting across multiple disks?
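
For concreteness, a sketch of the split layout that post describes (assuming a
Luminous-era ceph-disk; device names are illustrative):

# data on the HDD, RocksDB on one SSD partition, its WAL on another
ceph-disk prepare --bluestore /dev/sdb --block.db /dev/sdk1 --block.wal /dev/sdk2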

Thanks in advance
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Random Read Write Performance

2017-08-20 Thread Sam Huracan
Hi,

I have a question about Ceph's performance.
I've built a Ceph cluster with 3 OSD hosts; each host's configuration is:
 - CPU: 1 x Intel Xeon E5-2620 v4 2.1GHz
 - Memory: 2 x 16GB RDIMM
 - Disk: 2 x 300GB 15K RPM SAS 12Gbps (RAID 1 for OS)
4 x 800GB Solid State Drive SATA (non-RAID for OSD)(Intel SSD
DC S3610)
 - NIC: 1 x 10Gbps (bonding for both public and replicate network).

My ceph.conf: https://pastebin.com/r4pJ3P45
We use this cluster for OpenStack cinder's backend.

We benchmarked this cluster using 6 VMs running vdbench.
Our vdbench script: https://pastebin.com/9sxhrjie

After the test, we got these results:
 - 100% random read: 100,000 IOPS
 - 100% random write: 20,000 IOPS
 - 75% RR / 25% RW: 80,000 IOPS

Those results seem very low, because we calculated the expected performance of this
cluster as 112,000 IOPS write and 1,000,000 IOPS read.

We are using Ceph Jewel 10.2.5-1trusty, kernel 4.4.0-31-generic, Ubuntu
14.04.


Could you help me solve this issue?

Thanks in advance
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph OSDs advice

2017-02-14 Thread Sam Huracan
Hi Khang,

What file system do you use on the OSD nodes?
XFS always uses memory to cache data before writing it to disk.

So don't worry; it will always hold as much memory in your system as it can.
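
A quick way to tell reclaimable page cache apart from memory used by the OSD
daemons themselves (a sketch, nothing specific to this cluster):

# how much is page cache (reclaimable) vs. truly free
free -m

# resident memory of the OSD processes
ps -C ceph-osd -o pid,rss,cmd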



2017-02-15 10:35 GMT+07:00 Khang Nguyễn Nhật 
:

> Hi all,
> My Ceph OSDs are running on Fedora Server 24 with the following config:
> 128GB RAM DDR3, CPU Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, 72 OSDs
> (8TB per OSD). My cluster uses the Ceph object gateway with the S3 API. It currently
> holds 500GB of data but is already using > 50GB of RAM. I'm worried my OSDs will
> die if I continue putting files into the cluster. I had read "OSDs do not require as
> much RAM for regular operations (e.g., 500MB of RAM per daemon instance);
> however, during recovery they need significantly more RAM (e.g., ~1GB per
> 1TB of storage per daemon)." in the Ceph Hardware Recommendations. Can someone
> give me advice on this issue? Thanks
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] LibRBD_Show Real Size of RBD Image

2016-11-29 Thread Sam Huracan
Hi all,
I'm trying to use LIBRBD (Python)
http://docs.ceph.com/docs/jewel/rbd/librbdpy/

Is there a way to find the real (used) size of an RBD image through librbd?

I saw I can get it from the command line:
http://ceph.com/planet/real-size-of-a-ceph-rbd-image/

Thanks
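
In case it helps, a small librbd (Python) sketch of the same calculation the CLI
trick does, summing the extents that actually exist; the pool and image names are
placeholders, and this assumes the python rados/rbd bindings are installed:

import rados
import rbd

def rbd_used_bytes(pool_name, image_name, conf='/etc/ceph/ceph.conf'):
    used = [0]

    def cb(offset, length, exists):
        # diff_iterate against no snapshot visits every allocated extent
        if exists:
            used[0] += length

    cluster = rados.Rados(conffile=conf)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool_name)
        try:
            image = rbd.Image(ioctx, image_name)
            try:
                image.diff_iterate(0, image.size(), None, cb)
            finally:
                image.close()
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
    return used[0]

used = rbd_used_bytes('rbd', 'myimage')
print('used bytes: %d (%.1f MB)' % (used, used / 1048576.0))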
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Write process

2016-01-26 Thread Sam Huracan
Thanks Loris,
So after the client receives the ACK, if it immediately makes a read
request for this object, does it have to wait until the object is written
to the file store, or can it be read directly from the journal?

2016-01-25 17:12 GMT+07:00 Loris Cuoghi <l...@stella-telecom.fr>:

>
> On 25/01/2016 11:04, Sam Huracan wrote:
> > Hi Cephers,
> >
> > When an Ceph write made, does it write to all File Stores of Primary OSD
> > and Secondary OSD before sending ACK to client, or it writes to journal
> > of OSD and sending ACK without writing to File Store?
> >
> > I think it would write to journal of all OSD, so using SSD journal will
> > increase write IOPS.
>
> Hi
>
> That's it, writes are acknowledged as soon as they're written to each
> OSD's journal.
>
> Loris
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Write process

2016-01-25 Thread Sam Huracan
Hi Cephers,

When a Ceph write is made, is it written to the file stores of both the
primary OSD and the secondary OSDs before the ACK is sent to the client, or
is it written only to the OSD journals, with the ACK sent before the file
store write happens?

I think it would be written to the journal of every OSD, so using SSD
journals would increase write IOPS.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Swift use Rados backend

2016-01-08 Thread Sam Huracan
Hi,

How can I use Ceph as a backend for Swift?
I am following these Git repositories:
https://github.com/stackforge/swift-ceph-backend
https://github.com/enovance/swiftceph-ansible

I tried to install it manually, but I am stuck configuring the ring entries.
What device should I use in 'swift-ring-builder account.builder add
z1-10.10.10.53:6002/sdb1 100' if I use RADOS?

Thanks and regards
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] In production - Change osd config

2016-01-02 Thread Sam Huracan
Hi,
I intend to add the following config options, but how do I apply them in a
production system?

[Osd]
osd journal size = 0
osd mount options xfs = "rw,noatime,inode64,logbufs=8,logbsize=256k"
filestore min sync interval = 5
filestore max sync interval = 15
filestore queue max ops = 2048
filestore queue max bytes = 1048576000
filestore queue committing max ops = 4096
filestore queue committing max bytes = 1048576000
filestore op thread = 32
filestore journal writeahead = true
filestore merge threshold = 40
filestore split multiple = 8

journal max write bytes = 1048576000
journal max write entries = 4096
journal queue max ops = 8092
journal queue max bytes = 1048576000

osd max write size = 512
osd op threads = 16
osd disk threads = 2
osd op num threads per shard = 3
osd op num shards = 10
osd map cache size = 1024
osd max backfills = 1
osd recovery max active = 2

I tried restarting all the OSDs, but that is not practical.
Is there any way to apply these changes transparently to clients?
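
For what it's worth, a hedged sketch (not from this thread) of pushing a
few of these options to running OSDs without a restart, using the ceph
CLI's 'tell ... injectargs'. The option names and values below are examples
only, and some settings (journal sizes, thread/shard counts) generally
still require an OSD restart to take effect:

import subprocess

# Options/values here are examples only.
runtime_opts = {
    'osd_max_backfills': '1',
    'osd_recovery_max_active': '2',
}

for opt, val in runtime_opts.items():
    subprocess.check_call(
        ['ceph', 'tell', 'osd.*', 'injectargs', '--{0}={1}'.format(opt, val)])
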
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Random Write Fio Test Delay

2015-12-31 Thread Sam Huracan
Yep, it happens on every run. I have checked on other VMs that do not use
Ceph, and they have almost no delay, although the results are about the
same: 576 IOPS for the Ceph-backed VM and 650 for the non-Ceph VM. I use
one image for all tests: Ubuntu 14.04.1, kernel 3.13.0-32-generic.



2015-12-31 23:51 GMT+07:00 Jan Schermer <j...@schermer.cz>:

> Is it only on the first run or on every run?
> Fio first creates the file and that can take a while depeding on how
> fallocate() works on your system. In other words you are probably waiting
> for a 1G file to be written before the test actually starts.
>
> Jan
>
>
> On 31 Dec 2015, at 04:49, Sam Huracan <nowitzki.sa...@gmail.com> wrote:
>
> Hi Ceph-users.
>
> I have an issue with my new Ceph Storage, which is backend for OpenStack
> (Cinder, Glance, Nova).
> When I test random write in VMs with fio, there is a long delay (60s)
> before fio begin running.
> Here is my script test:
>
> fio --directory=/root/ --direct=1 --rw=randwrite --bs=4k --size=1G
> --numjobs=3 --iodepth=4 --runtime=60 --group_reporting --name=testIOPslan1
> --output=testwriterandom
>
> It is so strange, because I have not ever seen this method when I test in
> Physical machine.
>
> My Ceph system include 5 node, 2 SAS 15k  per node, I use both journal
> and Filestore in one disk, Public Network 1 Gbps/Node, Replica Network 2
> Gbps/Node
>
> Here is my ceph.conf in node compute:
> http://pastebin.com/raw/wsyDHiRw
>
>
> Could you help me solve this issue?
>
> Thanks and regards.
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Hardware for a new installation

2015-12-22 Thread Sam Huracan
Hi,
I think the ratio is based on the SSD's max write throughput divided by the
HDD's max write throughput.

For example, one SSD that can sustain about 400 MB/s could act as the
journal for four SAS drives at about 100 MB/s each.

This is my idea; I'm also building Ceph storage for OpenStack.
Could you share your experiences?
On Dec 23, 2015 03:04, "Pshem Kowalczyk"  wrote:

> Hi,
>
> We'll be building our first production-grade ceph cluster to back an
> openstack setup (a few hundreds of VMs). Initially we'll need only about
> 20-30TB of storage, but that's likely to grow. I'm unsure about required
> IOPs (there are multiple, very different classes of workloads to consider).
> Currently we use a mixture of on-blade disks and commercial storage
> solutions (NetApp and EMC).
>
> We're a Cisco UCS shop for compute and I would like to know if anyone here
> has experience with the C3160 storage server Cisco offers. Any particular
> pitfalls to avoid?
>
> I would like to use SSD for journals, but I'm not sure what's the
> performance (and durability) of the UCS-C3X60-12G0400 really is.
> What's considered a reasonable ratio of HDD to journal SSD? 5:1, 4:1?
>
> kind regards
> Pshem
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Enable RBD Cache

2015-12-17 Thread Sam Huracan
Hi,

I'm testing OpenStack Kilo with Ceph 0.94.5, installed on Ubuntu 14.04.

To enable the RBD cache, I followed this tutorial:
http://docs.ceph.com/docs/master/rbd/rbd-openstack/#configuring-nova

But when I check /var/run/ceph/guests on the compute nodes, there are no
.asok files at all.

How can I enable the RBD cache on the compute nodes, and how can I verify
that it is enabled?
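
As a hedged illustration of the verification step (assuming the sockets get
created at all, which typically requires an 'admin socket =
/var/run/ceph/guests/...' line in the [client] section of ceph.conf on the
compute node and a guests directory writable by the qemu/libvirt user),
something like this can dump the rbd cache settings each running guest is
actually using:

import glob
import subprocess

# The glob path follows the thread; sockets only appear once the admin
# socket option is configured for the [client] section on this node.
for sock in glob.glob('/var/run/ceph/guests/*.asok'):
    out = subprocess.check_output(
        ['ceph', '--admin-daemon', sock, 'config', 'show']).decode()
    print(sock)
    for line in out.splitlines():
        if 'rbd_cache' in line:
            print('  ' + line.strip())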

Thanks and regards.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Migrate Block Volumes and VMs

2015-12-15 Thread Sam Huracan
Hi everybody,

My OpenStack system uses Ceph as the backend for Glance, Cinder, and Nova.
In the future we intend to build a new Ceph cluster, and I can re-connect
the current OpenStack deployment to the new Ceph system.

After that, I tried exporting the rbd images and importing them into the
new Ceph cluster, but the VMs and volumes are clones of the Glance rbd
images, like this:

rbd children images/e2c852e1-28ce-408d-b2ec-6351db35d55a@snap

vms/8a4465fa-cbae-4559-b519-861eb4eda378_disk
volumes/volume-b5937629-5f44-40c8-9f92-5f88129d3171


How could I export all the rbd snapshots and their clones so that they can
be imported into the new Ceph cluster?

Or is there any other solution for moving all the VMs, volumes, and images
from the old Ceph cluster to the new one?
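
One possible approach, offered only as a hedged sketch: flatten each clone
so it no longer depends on its Glance parent snapshot, after which every
image can be exported and imported on its own (for example with rbd export
/ rbd import). A minimal python-rbd sketch, with pool names taken from the
listing above:

# Hedged sketch: flatten every clone in the given pools so the images no
# longer depend on parent snapshots in the 'images' pool. Test outside
# production first -- flattening copies data and uses space.
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    for pool in ('vms', 'volumes'):
        ioctx = cluster.open_ioctx(pool)
        try:
            for name in rbd.RBD().list(ioctx):
                image = rbd.Image(ioctx, name)
                try:
                    try:
                        image.parent_info()   # raises if not a clone
                    except rbd.ImageNotFound:
                        continue              # not a clone, nothing to do
                    image.flatten()           # detach image from its parent
                finally:
                    image.close()
        finally:
            ioctx.close()
finally:
    cluster.shutdown()

Flattening copies the parent's data into each clone, so it costs space and
I/O, but it removes the snapshot dependency that makes a plain
export/import awkward.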

Thanks and regards.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Sizing

2015-12-06 Thread Sam Huracan
Thank you all,

So using a cache is one solution for decreasing latency.
Ceph has built-in cache tiering. For my Ceph system, which serves
approximately 100 VMs simultaneously (up to a maximum of 700 VMs, including
SQL VMs), which cache solution would be most efficient for me?

Thanks and regards.

2015-12-04 3:22 GMT+07:00 Warren Wang - ISD <warren.w...@walmart.com>:

> I would be a lot more conservative in terms of what a spinning drive can
> do. The Mirantis presentation has pretty high expectations out of a
> spinning drive, as they're ignoring somewhat latency (til the last few
> slides). Look at the max latencies for anything above 1 QD on a spinning
> drive.
>
> If you factor in a latency requirement, the capability of the drives fall
> dramatically. You might be able to offset this by using NVMe or something
> as a cache layer between the journal and the OSD, using bcache, LVM cache,
> etc. In much of the performance testing that we've done, the average isn't
> too bad, but 90th percentile numbers tend to be quite bad. Part of it is
> probably from locking PGs during a flush, and the other part is just the
> nature of spinning drives.
>
> I'd try to get a handle on expected workloads before picking the gear, but
> if you have to pick before that, SSD if you have the budget :) You can
> offset it a little by using erasure coding for the RGW portion, or using
> spinning drives for that.
>
> I think picking gear for Ceph is tougher than running an actual cluster :)
> Best of luck. I think you're still starting with better, and more info
> than some of us did years ago.
>
> Warren Wang
>
>
>
>
> From:  Sam Huracan <nowitzki.sa...@gmail.com>
> Date:  Thursday, December 3, 2015 at 4:01 AM
> To:  Srinivasula Maram <srinivasula.ma...@sandisk.com>
> Cc:  Nick Fisk <n...@fisk.me.uk>, "ceph-us...@ceph.com"
> <ceph-us...@ceph.com>
> Subject:  Re: [ceph-users] Ceph Sizing
>
>
> I'm following this presentation of Mirantis team:
> http://www.slideshare.net/mirantis/ceph-talk-vancouver-20
>
> They calculate CEPH IOPS = Disk IOPS * HDD Quantity * 0.88 (4-8k random
> read proportion)
>
>
> And  VM IOPS = CEPH IOPS / VM Quantity
>
> But if I use replication of 3, Would VM IOPS be divided by 3?
>
>
> 2015-12-03 7:09 GMT+07:00 Sam Huracan <nowitzki.sa...@gmail.com>:
>
> IO size is 4 KB, and I need a Minimum sizing, cost optimized
> I intend use SuperMicro Devices
> http://www.supermicro.com/solutions/storage_Ceph.cfm
>
>
> What do you think?
>
>
> 2015-12-02 23:17 GMT+07:00 Srinivasula Maram
> <srinivasula.ma...@sandisk.com>:
>
> One more factor we need to consider here is IO size(block size) to get
> required IOPS, based on this we can calculate the bandwidth and design the
> solution.
>
> Thanks
> Srinivas
>
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Nick Fisk
> Sent: Wednesday, December 02, 2015 9:28 PM
> To: 'Sam Huracan'; ceph-us...@ceph.com
> Subject: Re: [ceph-users] Ceph Sizing
>
> You've left out an important factorcost. Otherwise I would just say
> buy enough SSD to cover the capacity.
>
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Sam Huracan
> > Sent: 02 December 2015 15:46
> > To: ceph-us...@ceph.com
> > Subject: [ceph-users] Ceph Sizing
> >
> > Hi,
> > I'm building a storage structure for OpenStack cloud System, input:
> > - 700 VM
> > - 150 IOPS per VM
> > - 20 Storage per VM (boot volume)
> > - Some VM run database (SQL or MySQL)
> >
> > I want to ask a sizing plan for Ceph to satisfy the IOPS requirement,
> > I list some factors considered:
> > - Amount of OSD (SAS Disk)
> > - Amount of Journal (SSD)
> > - Amount of OSD Servers
> > - Amount of MON Server
> > - Network
> > - Replica ( default is 3)
> >
> > I will divide to 3 pool with 3 Disk types: SSD, SAS 15k and SAS 10k
> > Should I use all 3 disk types in one server or build dedicated servers
> > for every pool? Example: 3 15k servers for Pool-1, 3 10k Servers for
> >Pool-2.
> >
> > Could you help me a formula to calculate the minimum devices needed
> > for above input.
> >
> > Thanks and regards.
>
>
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>
>
>
>
>
>
>
>
>
> This email and any files transmitted with it are confidential and intended
> solely for the individual or entity to whom they are addressed. If you have
> received this email in error destroy it immediately. *** Walmart
> Confidential ***
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Sizing

2015-12-03 Thread Sam Huracan
I'm following this presentation from the Mirantis team:
http://www.slideshare.net/mirantis/ceph-talk-vancouver-20

They calculate: Ceph IOPS = disk IOPS * HDD quantity * 0.88 (the 4-8k
random read proportion)

and: VM IOPS = Ceph IOPS / VM quantity

But if I use a replication factor of 3, would the VM IOPS be divided by 3?
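
For what it's worth, a small worked sketch of that arithmetic: reads are
served from primaries, so aggregate read IOPS are not divided by the
replica count, but every client write is performed on all three replicas,
so a common rule of thumb is to divide write IOPS by 3 (and by 2 again if
FileStore journals share the data disks). Disk counts and per-disk IOPS
below are illustrative assumptions only:

# Hedged back-of-the-envelope sizing using the formulas quoted above.
# All inputs are illustrative assumptions, not measurements.
disk_iops    = 175        # assumed 4k IOPS for one 15k SAS drive
hdd_quantity = 120        # assumed number of OSD data disks
vm_quantity  = 700
replicas     = 3

ceph_iops       = disk_iops * hdd_quantity * 0.88   # formula from the slides
ceph_write_iops = ceph_iops / replicas              # writes hit every replica

print('per-VM read IOPS :', ceph_iops / vm_quantity)
print('per-VM write IOPS:', ceph_write_iops / vm_quantity)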

2015-12-03 7:09 GMT+07:00 Sam Huracan <nowitzki.sa...@gmail.com>:

> IO size is 4 KB, and I need a Minimum sizing, cost optimized
> I intend use SuperMicro Devices
> http://www.supermicro.com/solutions/storage_Ceph.cfm
>
> What do you think?
>
> 2015-12-02 23:17 GMT+07:00 Srinivasula Maram <
> srinivasula.ma...@sandisk.com>:
>
>> One more factor we need to consider here is IO size(block size) to get
>> required IOPS, based on this we can calculate the bandwidth and design the
>> solution.
>>
>> Thanks
>> Srinivas
>>
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Nick Fisk
>> Sent: Wednesday, December 02, 2015 9:28 PM
>> To: 'Sam Huracan'; ceph-us...@ceph.com
>> Subject: Re: [ceph-users] Ceph Sizing
>>
>> You've left out an important factorcost. Otherwise I would just say
>> buy enough SSD to cover the capacity.
>>
>> > -Original Message-
>> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
>> > Of Sam Huracan
>> > Sent: 02 December 2015 15:46
>> > To: ceph-us...@ceph.com
>> > Subject: [ceph-users] Ceph Sizing
>> >
>> > Hi,
>> > I'm building a storage structure for OpenStack cloud System, input:
>> > - 700 VM
>> > - 150 IOPS per VM
>> > - 20 Storage per VM (boot volume)
>> > - Some VM run database (SQL or MySQL)
>> >
>> > I want to ask a sizing plan for Ceph to satisfy the IOPS requirement,
>> > I list some factors considered:
>> > - Amount of OSD (SAS Disk)
>> > - Amount of Journal (SSD)
>> > - Amount of OSD Servers
>> > - Amount of MON Server
>> > - Network
>> > - Replica ( default is 3)
>> >
>> > I will divide to 3 pool with 3 Disk types: SSD, SAS 15k and SAS 10k
>> > Should I use all 3 disk types in one server or build dedicated servers
>> > for every pool? Example: 3 15k servers for Pool-1, 3 10k Servers for
>> Pool-2.
>> >
>> > Could you help me a formula to calculate the minimum devices needed
>> > for above input.
>> >
>> > Thanks and regards.
>>
>>
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Sizing

2015-12-02 Thread Sam Huracan
Hi,
I'm building a storage system for an OpenStack cloud. The inputs:
- 700 VMs
- 150 IOPS per VM
- 20 GB of storage per VM (boot volume)
- Some VMs run databases (SQL or MySQL)

I want to ask for a sizing plan for Ceph that satisfies the IOPS
requirement. The factors I am considering:
- Number of OSDs (SAS disks)
- Number of journals (SSD)
- Number of OSD servers
- Number of MON servers
- Network
- Replicas (default is 3)

I will divide the cluster into 3 pools with 3 disk types: SSD, SAS 15k, and
SAS 10k.
Should I use all 3 disk types in one server, or build dedicated servers for
each pool? For example: 3 15k servers for pool 1, 3 10k servers for pool 2.

Could you help me with a formula to calculate the minimum number of devices
needed for the above input?

Thanks and regards.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Mon quorum fails

2015-12-02 Thread Sam Huracan
Hi,

My MON quorum includes 3 nodes. If 2 of the nodes fail accidentally, how
can I recover the cluster from the 1 node that is left?

Thanks and regards.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Sizing

2015-12-02 Thread Sam Huracan
The IO size is 4 KB, and I need a minimum, cost-optimized sizing.
I intend to use Supermicro devices:
http://www.supermicro.com/solutions/storage_Ceph.cfm

What do you think?

2015-12-02 23:17 GMT+07:00 Srinivasula Maram <srinivasula.ma...@sandisk.com>:

> One more factor we need to consider here is IO size(block size) to get
> required IOPS, based on this we can calculate the bandwidth and design the
> solution.
>
> Thanks
> Srinivas
>
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Nick Fisk
> Sent: Wednesday, December 02, 2015 9:28 PM
> To: 'Sam Huracan'; ceph-us...@ceph.com
> Subject: Re: [ceph-users] Ceph Sizing
>
> You've left out an important factorcost. Otherwise I would just say
> buy enough SSD to cover the capacity.
>
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Sam Huracan
> > Sent: 02 December 2015 15:46
> > To: ceph-us...@ceph.com
> > Subject: [ceph-users] Ceph Sizing
> >
> > Hi,
> > I'm building a storage structure for OpenStack cloud System, input:
> > - 700 VM
> > - 150 IOPS per VM
> > - 20 Storage per VM (boot volume)
> > - Some VM run database (SQL or MySQL)
> >
> > I want to ask a sizing plan for Ceph to satisfy the IOPS requirement,
> > I list some factors considered:
> > - Amount of OSD (SAS Disk)
> > - Amount of Journal (SSD)
> > - Amount of OSD Servers
> > - Amount of MON Server
> > - Network
> > - Replica ( default is 3)
> >
> > I will divide to 3 pool with 3 Disk types: SSD, SAS 15k and SAS 10k
> > Should I use all 3 disk types in one server or build dedicated servers
> > for every pool? Example: 3 15k servers for Pool-1, 3 10k Servers for
> Pool-2.
> >
> > Could you help me a formula to calculate the minimum devices needed
> > for above input.
> >
> > Thanks and regards.
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com