Re: [ceph-users] All pools have size=3 but MB data and MB used ratio is 1 to 5

2015-04-17 Thread Saverio Proto
 Do you by any chance have your OSDs placed at a local directory path
 rather than on a dedicated, otherwise unused physical disk?

No, I have 18 disks per server. Each OSD is mapped to a physical disk.

Here is the output from one server:
ansible@zrh-srv-m-cph02:~$ df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/vg01-root     28G  4.5G   22G  18% /
none                     4.0K     0  4.0K   0% /sys/fs/cgroup
udev                      48G  4.0K   48G   1% /dev
tmpfs                    9.5G  1.3M  9.5G   1% /run
none                     5.0M     0  5.0M   0% /run/lock
none                      48G   20K   48G   1% /run/shm
none                     100M     0  100M   0% /run/user
/dev/mapper/vg01-tmp     4.5G  9.4M  4.3G   1% /tmp
/dev/mapper/vg01-varlog  9.1G  5.1G  3.6G  59% /var/log
/dev/sdf1                932G   15G  917G   2% /var/lib/ceph/osd/ceph-3
/dev/sdg1                932G   15G  917G   2% /var/lib/ceph/osd/ceph-4
/dev/sdl1                932G   13G  919G   2% /var/lib/ceph/osd/ceph-8
/dev/sdo1                932G   15G  917G   2% /var/lib/ceph/osd/ceph-11
/dev/sde1                932G   15G  917G   2% /var/lib/ceph/osd/ceph-2
/dev/sdd1                932G   15G  917G   2% /var/lib/ceph/osd/ceph-1
/dev/sdt1                932G   15G  917G   2% /var/lib/ceph/osd/ceph-15
/dev/sdq1                932G   12G  920G   2% /var/lib/ceph/osd/ceph-12
/dev/sdc1                932G   14G  918G   2% /var/lib/ceph/osd/ceph-0
/dev/sds1                932G   17G  916G   2% /var/lib/ceph/osd/ceph-14
/dev/sdu1                932G   14G  918G   2% /var/lib/ceph/osd/ceph-16
/dev/sdm1                932G   15G  917G   2% /var/lib/ceph/osd/ceph-9
/dev/sdk1                932G   17G  915G   2% /var/lib/ceph/osd/ceph-7
/dev/sdn1                932G   14G  918G   2% /var/lib/ceph/osd/ceph-10
/dev/sdr1                932G   15G  917G   2% /var/lib/ceph/osd/ceph-13
/dev/sdv1                932G   14G  918G   2% /var/lib/ceph/osd/ceph-17
/dev/sdh1                932G   17G  916G   2% /var/lib/ceph/osd/ceph-5
/dev/sdj1                932G   14G  918G   2% /var/lib/ceph/osd/ceph-30
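(For reference, one quick way to double-check the OSD-to-disk and journal
mapping on a node like this is something along these lines; a sketch only,
assuming a ceph-disk style deployment, and the paths are just examples:)

# list data and journal partitions as ceph-disk sees them
sudo ceph-disk list
# show where each OSD's journal symlink points (external journal vs. co-located file)
for osd in /var/lib/ceph/osd/ceph-*; do
    echo "$osd -> $(readlink -f "$osd"/journal)"
done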


Re: [ceph-users] All pools have size=3 but MB data and MB used ratio is 1 to 5

2015-04-17 Thread Georgios Dimitrakakis

Hi!

Do you by any chance have your OSDs placed at a local directory path
rather than on a dedicated, otherwise unused physical disk?


If I remember correctly from a similar setup I ran in the past, the
ceph df command accounts for the entire disk and not just for the OSD
data directory. I am not sure whether this still applies, since it was
on an early Firefly release, but it is easy to check.


What I mean is: if your OSDs live at something like /var/lib/ceph/osd.X
(or wherever) and that path does not correspond to a mounted device
(e.g. /dev/sdc1) but sits on the disk that provides the / or /var
partition, then run df -h to see how much data is on that partition and
compare it with the ceph df output. The two should be (more or less)
the same.
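(As a concrete check, something along these lines should do; the OSD path
below is only an example:)

# how full the partition backing one OSD path is, according to the filesystem
df -h /var/lib/ceph/osd/ceph-0
# how much the OSD data directory itself actually holds
sudo du -sh /var/lib/ceph/osd/ceph-0/current
# and what ceph reports cluster-wide
ceph df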


Best,

George



Re: [ceph-users] All pools have size=3 but MB data and MB used ratio is 1 to 5

2015-04-14 Thread Saverio Proto
2015-03-27 18:27 GMT+01:00 Gregory Farnum g...@gregs42.com:
 Ceph has per-pg and per-OSD metadata overhead. You currently have 26000 PGs,
 suitable for use on a cluster of the order of 260 OSDs. You have placed
 almost 7GB of data into it (21GB replicated) and have about 7GB of
 additional overhead.

 You might try putting a suitable amount of data into the cluster before
 worrying about the ratio of space used to data stored. :)
 -Greg

Hello Greg,

I have put a suitable amount of data in now, and it looks like my ratio is still 1 to 5.
The folder:
/var/lib/ceph/osd/ceph-N/current/meta/
did not grow, so it looks like that is not the problem.
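(For reference, a rough way to check that is a du over the standard OSD
data paths, e.g.:)

for d in /var/lib/ceph/osd/ceph-*/current/meta; do
    sudo du -sh "$d"
done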

Do you have any hint on how to troubleshoot this issue?


ansible@zrh-srv-m-cph02:~$ ceph osd pool get .rgw.buckets size
size: 3
ansible@zrh-srv-m-cph02:~$ ceph osd pool get .rgw.buckets min_size
min_size: 2


ansible@zrh-srv-m-cph02:~$ ceph -w
cluster 4179fcec-b336-41a1-a7fd-4a19a75420ea
 health HEALTH_WARN pool .rgw.buckets has too few pgs
 monmap e4: 4 mons at
{rml-srv-m-cph01=10.120.50.20:6789/0,rml-srv-m-cph02=10.120.50.21:6789/0,rml-srv-m-stk03=10.120.50.32:6789/0,zrh-srv-m-cph02=10.120.50.2:6789/0},
election epoch 668, quorum 0,1,2,3
zrh-srv-m-cph02,rml-srv-m-cph01,rml-srv-m-cph02,rml-srv-m-stk03
 osdmap e2170: 54 osds: 54 up, 54 in
  pgmap v619041: 28684 pgs, 15 pools, 109 GB data, 7358 kobjects
518 GB used, 49756 GB / 50275 GB avail
   28684 active+clean

ansible@zrh-srv-m-cph02:~$ ceph df
GLOBAL:
    SIZE   AVAIL  RAW USED %RAW USED
    50275G 49756G 518G     1.03
POOLS:
    NAME               ID USED  %USED MAX AVAIL OBJECTS
    rbd                0  155   0     16461G    2
    gianfranco         7  156   0     16461G    2
    images             8  257M  0     16461G    38
    .rgw.root          9  840   0     16461G    3
    .rgw.control       10 0     0     16461G    8
    .rgw               11 21334 0     16461G    108
    .rgw.gc            12 0     0     16461G    32
    .users.uid         13 1575  0     16461G    6
    .users             14 72    0     16461G    6
    .rgw.buckets.index 15 0     0     16461G    30
    .users.swift       17 36    0     16461G    3
    .rgw.buckets       18 108G  0.22  16461G    7534745
    .intent-log        19 0     0     16461G    0
    .rgw.buckets.extra 20 0     0     16461G    0
    volumes            21 512M  0     16461G    161
ansible@zrh-srv-m-cph02:~$


Re: [ceph-users] All pools have size=3 but MB data and MB used ratio is 1 to 5

2015-03-27 Thread Saverio Proto
 I will now start to push a lot of data into the cluster to see if the
 metadata grows a lot or stays constant.

 Is there a way to clean up old metadata?

I pushed a lot more data into the cluster, then let the cluster sit
idle for the night.

This morning I found these values:

6841 MB data
25814 MB used

which is a bit more than 1 to 3.

It looks like the extra space is in these folders (for N from 1 to 36):

/var/lib/ceph/osd/ceph-N/current/meta/

These meta folders have a lot of data in them. I would really appreciate
pointers on what is stored in there and how to clean it up eventually.
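(A rough way to total them up across all OSDs on a node, just a sketch:)

for d in /var/lib/ceph/osd/ceph-*/current/meta; do
    sudo du -sm "$d"
done | awk '{sum += $1} END {print sum " MB of meta in total"}'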

The problem is that googling for ceph meta or ceph metadata mostly
produces results about the Ceph MDS, which is completely unrelated :(

thanks

Saverio


[ceph-users] All pools have size=3 but MB data and MB used ratio is 1 to 5

2015-03-26 Thread Saverio Proto
Thanks for the answer. Now the meaning of MB data and MB used is
clear, and since all the pools have size=3 I expect a ratio of 1 to 3
between the two values.

I still can't understand why MB used is so big in my setup.
All my pools have size=3, but the ratio of MB data to MB used is 1 to
5 instead of 1 to 3.
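(Just to spell out the arithmetic with the numbers from the pgmap line
quoted below, as a quick sanity check:)

# with size=3 I would expect raw usage of roughly data x 3
echo "expected: $((2379 * 3)) MB raw used; reported: 19788 MB raw used"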

My first guess was that I had written a wrong crushmap that was making
more than 3 copies (is it really possible to make such a mistake?).

So I replaced my crushmap with the default one, which just spreads
data across hosts, but I see no change: the ratio is still 1 to 5.
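(A sketch of how one could double-check what CRUSH is actually doing; the
pg id below is only an example:)

# decompile the crush map currently in use and inspect the rules
ceph osd getcrushmap -o /tmp/cm
crushtool -d /tmp/cm -o /tmp/cm.txt
# see how many OSDs a given pg is actually mapped to
ceph pg map 0.1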

I thought maybe my 3 monitors had different views of the pgmap, so I
restarted the monitors, but that did not help either.

What useful information can I share here to troubleshoot this issue further?
ceph version 0.87.1 (283c2e7cfa2457799f534744d7d549f83ea1335e)

Thank you

Saverio



2015-03-25 14:55 GMT+01:00 Gregory Farnum g...@gregs42.com:
 On Wed, Mar 25, 2015 at 1:24 AM, Saverio Proto ziopr...@gmail.com wrote:
 Hello there,

 I started to push data into my ceph cluster. There is something I
 cannot understand in the output of ceph -w.

 When I run ceph -w I get this kind of output:

 2015-03-25 09:11:36.785909 mon.0 [INF] pgmap v278788: 26056 pgs: 26056
 active+clean; 2379 MB data, 19788 MB used, 33497 GB / 33516 GB avail


 2379MB is actually the data I pushed into the cluster, I can see it
 also in the ceph df output, and the numbers are consistent.

 What I don't understand is 19788MB used. All my pools have size 3, so I
 expected something like 2379 * 3. Instead this number is very big.

 I really need to understand how MB used grows because I need to know
 how many disks to buy.

 MB used is the summation of (the programmatic equivalent to) df
 across all your nodes, whereas MB data is calculated by the OSDs
 based on data they've written down. Depending on your configuration
 MB used can include things like the OSD journals, or even totally
 unrelated data if the disks are shared with other applications.

 MB used including the space used by the OSD journals is my first
 guess about what you're seeing here, in which case you'll notice that
 it won't grow any faster than MB data does once the journal is fully
 allocated.
 -Greg


Re: [ceph-users] All pools have size=3 but MB data and MB used ratio is 1 to 5

2015-03-26 Thread Gregory Farnum
On Thu, Mar 26, 2015 at 2:56 AM, Saverio Proto ziopr...@gmail.com wrote:
 Thanks for the answer. Now the meaning of MB data and MB used is
 clear, and since all the pools have size=3 I expect a ratio of 1 to 3
 between the two values.

 I still can't understand why MB used is so big in my setup.
 All my pools have size=3, but the ratio of MB data to MB used is 1 to
 5 instead of 1 to 3.

 My first guess was that I had written a wrong crushmap that was making
 more than 3 copies (is it really possible to make such a mistake?).

 So I replaced my crushmap with the default one, which just spreads
 data across hosts, but I see no change: the ratio is still 1 to 5.

 I thought maybe my 3 monitors had different views of the pgmap, so I
 restarted the monitors, but that did not help either.

 What useful information can I share here to troubleshoot this issue further?
 ceph version 0.87.1 (283c2e7cfa2457799f534744d7d549f83ea1335e)

You just need to go look at one of your OSDs and see what data is
stored on it. Did you configure things so that the journals are using
a file on the same storage disk? If so, *that* is why the data used
is large.
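A quick way to check (osd.0 and its path are just an example):

# a symlink pointing at a separate device means an external journal;
# a regular file here means the journal shares the data disk
ls -l /var/lib/ceph/osd/ceph-0/journal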

I promise that your 5:1 ratio won't persist as you write more than 2GB
of data into the cluster.
-Greg


 Thank you

 Saverio



 2015-03-25 14:55 GMT+01:00 Gregory Farnum g...@gregs42.com:
 On Wed, Mar 25, 2015 at 1:24 AM, Saverio Proto ziopr...@gmail.com wrote:
 Hello there,

 I started to push data into my ceph cluster. There is something I
 cannot understand in the output of ceph -w.

 When I run ceph -w I get this kind of output:

 2015-03-25 09:11:36.785909 mon.0 [INF] pgmap v278788: 26056 pgs: 26056
 active+clean; 2379 MB data, 19788 MB used, 33497 GB / 33516 GB avail


 2379MB is actually the data I pushed into the cluster, I can see it
 also in the ceph df output, and the numbers are consistent.

 What I don't understand is 19788MB used. All my pools have size 3, so I
 expected something like 2379 * 3. Instead this number is very big.

 I really need to understand how MB used grows because I need to know
 how many disks to buy.

 MB used is the summation of (the programmatic equivalent to) df
 across all your nodes, whereas MB data is calculated by the OSDs
 based on data they've written down. Depending on your configuration
 MB used can include things like the OSD journals, or even totally
 unrelated data if the disks are shared with other applications.

 MB used including the space used by the OSD journals is my first
 guess about what you're seeing here, in which case you'll notice that
 it won't grow any faster than MB data does once the journal is fully
 allocated.
 -Greg


Re: [ceph-users] All pools have size=3 but MB data and MB used ratio is 1 to 5

2015-03-26 Thread Saverio Proto
 You just need to go look at one of your OSDs and see what data is
 stored on it. Did you configure things so that the journals are using
 a file on the same storage disk? If so, *that* is why the data used
 is large.

I followed your suggestion and this is the result of my troubleshooting.

Each OSD controls a disk that is mounted in a folder with the name:

/var/lib/ceph/osd/ceph-N

where N is the OSD number

The journal is stored on a separate disk drive. I have three extra SSD
drives per server, each partitioned into 6 partitions, and those
partitions are used as journal partitions.
I checked that the setup is correct: each
/var/lib/ceph/osd/ceph-N/journal points to a partition on another drive.

With df -h I can see the folders where my OSDs are mounted. The space
usage looks evenly distributed among all OSDs, as expected.

the data is always in a folder called:

/var/lib/ceph/osd/ceph-N/current

I used the tool ncdu to check where the data is stored inside the
current folders.
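(osd 0 as an example; any OSD will do:)

# interactive breakdown of one OSD's data directory
sudo ncdu /var/lib/ceph/osd/ceph-0/current
# or non-interactively, subdirectories sorted by size
sudo du -sh /var/lib/ceph/osd/ceph-0/current/* | sort -h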

in each OSD there is a folder with a lot of data called

/var/lib/ceph/osd/ceph-N/current/meta

If I sum the MB used by each meta folder, that accounts more or less for
the extra space consumed, leading to the 1 to 5 ratio.

The meta folder contains a lot of unreadable binary files, but judging
by the file names it looks like this is where the versions of the osdmap
are stored.

but it is really a lot of metadata.
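(For anyone curious, a quick way to peek at it, with osd 0 as an example:)

# many of the file names in meta/ refer to osdmap epochs
ls /var/lib/ceph/osd/ceph-0/current/meta | head -20
# current osdmap epoch, for comparison
ceph osd stat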

I will now start to push a lot of data into the cluster to see if the
metadata grows a lot or stays constant.

Is there a way to clean up old metadata?

thanks

Saverio