Hi Noah, Gregory and Sage,
First of all, thanks for your quick replies. Here are some answers to your
questions.
Gregory, I have got the output of "ceph -s" before and after this specific
TeraSort run, and to me it looks ok; all 30 osds are "up":
health HEALTH_OK
monmap e1: 1 mons at {0=192.168.111.18:6789/0}, election epoch 0, quorum 0 0
osdmap e22: 30 osds: 30 up, 30 in
pgmap v13688: 5760 pgs: 5760 active+clean; 1862 GB data, 1868 GB used, 6142 GB / 8366 GB avail
mdsmap e4: 1/1/1 up {0=0=up:active}

health HEALTH_OK
monmap e1: 1 mons at {0=192.168.111.18:6789/0}, election epoch 0, quorum 0 0
osdmap e22: 30 osds: 30 up, 30 in
pgmap v19657: 5760 pgs: 5760 active+clean; 1862 GB data, 1868 GB used, 6142 GB / 8366 GB avail
mdsmap e4: 1/1/1 up {0=0=up:active}
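
The "all OSDs up" reading of the status text above can also be checked mechanically. The following is only a minimal sketch: the helper names (osd_counts, all_pgs_clean) are hypothetical, and the regular expressions assume exactly the line format shown in the first snapshot, which may differ in other Ceph versions.

```python
import re

# Lines copied from the first "ceph -s" snapshot above.
STATUS = """\
health HEALTH_OK
osdmap e22: 30 osds: 30 up, 30 in
pgmap v13688: 5760 pgs: 5760 active+clean; 1862 GB data, 1868 GB used, 6142 GB / 8366 GB avail
"""


def osd_counts(status):
    """Return (total, up, in) from the osdmap line (hypothetical helper)."""
    m = re.search(r"osdmap e\d+: (\d+) osds: (\d+) up, (\d+) in", status)
    if m is None:
        raise ValueError("no osdmap line found")
    return tuple(int(g) for g in m.groups())


def all_pgs_clean(status):
    """True if every placement group is reported as active+clean."""
    m = re.search(r"pgmap v\d+: (\d+) pgs: (\d+) active\+clean;", status)
    return m is not None and m.group(1) == m.group(2)


total, up, in_ = osd_counts(STATUS)
print("all OSDs up and in:", total == up == in_)
print("all PGs active+clean:", all_pgs_clean(STATUS))
```
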
I do not have the full output of "ceph pg dump" for that specific TeraSort run,
but here is a typical output after automatically preparing Ceph for a benchmark
run (I removed almost all lines of the long pg_stat table, hoping that you do
not need them):
dumped all in format plain
version 403
last_osdmap_epoch 22
last_pg_scan 1
full_ratio 0.95
nearfull_ratio 0.85
pg_stat objects mip degr unf bytes log disklog state state_stamp v reported up acting last_scrub scrub_stamp
2.314 0 0 0 0 0 0 0 active+clean 2012-12-14 08:31:24.524152 0'0 11'17 [23,7] [23,7] 0'0 2012-12-14 08:31:24.524096
0.316 0 0 0 0 0 0 0 active+clean 2012-12-14 08:25:12.780643 0'0 11'19 [23] [23] 0'0 2012-12-14 08:24:08.394930
1.317 0 0 0 0 0 0 0 active+clean 2012-12-14 08:27:56.400997 0'0 3'17 [11,17] [11,17] 0'0 2012-12-14 08:27:56.400953
[...]
pool 0 1 0 0 0 4 136 136
pool 1 21 0 0 0 23745 5518 5518
pool 2 0 0 0 0 0 0 0
sum 22 0 0 0 23749 5654 5654
osdstat kbused kbavail kb hb in hb out
0 2724 279808588 292420608 [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29] []
1 2892 279808588 292420608 [3,4,5,6,8,9,11,12,13,14,15,16,17,18,20,22,24,25,26,27,28] []
2 2844 279808588 292420608 [3,4,5,6,7,8,9,10,11,12,13,15,16,17,18,19,20,22,23,24,25,26,27,29] []
3 2716 279808588 292420608 [0,1,2,6,7,8,9,10,11,12,13,14,15,16,17,19,20,22,23,24,25,26,27,28,29] []
4 2556 279808588 292420608 [1,2,7,8,9,12,13,14,15,16,17,18,19,20,21,22,24,25,26,27,28,29] []
5 2856 279808584 292420608 [0,2,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,28,29] []
6 2840 279808584 292420608 [0,1,2,3,4,5,9,10,11,12,13,14,15,16,17,18,19,20,22,24,25,26,27,28,29] []
7 2604 279808588 292420608 [1,2,3,4,5,9,10,11,12,13,15,17,18,19,20,21,23,24,25,26,27,28,29] []
8 2564 279808588 292420608 [1,2,3,4,5,9,10,11,12,14,16,17,18,19,20,21,22,23,24,25,27,28,29] []
9 2804 279808588 292420608 [1,2,3,4,5,6,8,12,13,14,15,16,17,18,19,20,21,22,23,24,26,27,29] []
10 2556 279808588 292420608 [0,1,2,4,5,6,7,8,12,13,14,15,16,17,19,20,21,22,23,24,25,26,27,28] []
11 3084 279808588 292420608 [0,1,2,3,4,5,6,7,8,12,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29] []
12 2572 279808588 292420608 [0,1,2,3,4,5,7,8,10,11,15,16,18,20,21,22,23,24,27,28,29] []
13 2912 279808560 292420608 [0,1,2,3,5,6,7,8,9,10,11,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29] []
14 2992 279808584 292420608 [1,2,3,4,5,6,7,8,9,10,11,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29] []
15 2652 279808588 292420608 [1,2,3,4,5,6,7,8,9,10,11,13,14,19,20,21,22,23,25,26,27,28,29] []
16 3028 279808588 292420608 [0,1,2,3,5,6,7,8,9,10,11,12,14,18,20,21,22,24,25,26,27,28,29] []
17 2772 279808588 292420608 [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,18,19,21,22,23,24,25,26,27,28,29] []
18 2804 279808588 292420608 [0,1,2,3,5,6,8,9,10,11,12,14,15,16,17,21,22,23,24,25,26,27,29] []
19 2620 279808588 292420608 [0,1,2,3,4,5,6,7,8,10,11,12,13,14,15,16,17,21,22,23,25,26,27,28,29] []
20 2956 279808588 292420608 [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,21,22,23,24,25,27,29] []
21 2876 279808588 292420608 [0,1,2,3,4,5,6,8,9,10,12,13,15,16,17,18,19,20,24,25,26,27,29] []
22 3044 279808588 292420608 [1,2,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,24,25,26,27,28,29] []
23 2752 279808584 292420608 [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,24,25,27,28,29] []
24 2948 279808588 292420608 [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,27,28,29] []
25 3068 279808588 292420608 [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,27,28,29] []
26 2540 279808588 292420608 [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,27,28] []
27 3012 279808588 292420608 [0,1,2,3,4,5,6,7,8,9,10,11,13,14,15,16,17,19,20,21,22,23,24,25,26] []
28 2800 279808560 292420608 [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,23,24,25,26] []
29 3052 279808588 292420608 [1,2,3,4,5,7,8,9,10,11,12,13,14,16,17,18,19,20,21,22,23,24,25,26] []
sum 84440 8394257568 8772618240
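
As a cross-check of the osdstat totals, the "sum" row can be recomputed from the per-OSD values and compared with the capacity reported by "ceph -s". This is only a sketch: it assumes the kb columns are in units of 1024 bytes and that the GB figures in "ceph -s" are binary (1024 * 1024 kB), which is an assumption rather than something stated in the output.

```python
# kbused column of the osdstat table above, OSD 0 through 29.
kbused = [
    2724, 2892, 2844, 2716, 2556, 2856, 2840, 2604, 2564, 2804,
    2556, 3084, 2572, 2912, 2992, 2652, 3028, 2772, 2804, 2620,
    2956, 2876, 3044, 2752, 2948, 3068, 2540, 3012, 2800, 3052,
]
kb_per_osd = 292420608   # kb column, identical for every OSD
kb_total = 8772618240    # kb column of the "sum" row

KB_PER_GB = 1024 * 1024  # assumption: binary units

# The "sum" row should equal the per-OSD values added up.
assert sum(kbused) == 84440
assert len(kbused) * kb_per_osd == kb_total

# Total capacity in whole GB, for comparison with "8366 GB avail" in "ceph -s".
total_gb = kb_total // KB_PER_GB
print(total_gb)  # 8366
```
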
Does this information help? Is the chunk size really 64 MB? That is what I had
assumed.
As I am relatively new to Ceph, I need some time to digest and understand all
your answers.
Regards,
Jutta.
[email protected], Fujitsu Technology Solutions PBG PDG ES&S SWE
SOL 4, "Infrastructure Solutions", MchD 5B, Tel. ..49-89-3222-2705, Company
Details: http://de.ts.fujitsu.com/imprint
-----Original Message-----
From: Noah Watkins [mailto:[email protected]]
Sent: Thursday, December 13, 2012 9:33 PM
To: Gregory Farnum
Cc: Cameron Bahar; Sage Weil; Lachfeld, Jutta; [email protected]; Noah
Watkins; Joe Buck
Subject: Re: Usage of CEPH FS versa HDFS for Hadoop: TeraSort benchmark
performance comparison issue
The bindings use the default Hadoop settings (e.g. 64 or 128 MB
chunks) when creating new files. The chunk size can also be specified on a
per-file basis using the same interface as Hadoop. Additionally, while Hadoop
doesn't provide an interface to configuration parameters beyond chunk size, we
will also let users fully configure for any Ceph striping strategy.
http://ceph.com/docs/master/dev/file-striping/
-Noah
On Thu, Dec 13, 2012 at 12:27 PM, Gregory Farnum <[email protected]> wrote:
> On Thu, Dec 13, 2012 at 12:23 PM, Cameron Bahar <[email protected]> wrote:
>> Is the chunk size tunable in A Ceph cluster. I don't mean dynamic, but even
>> statically configurable when a cluster is first installed?
>
> Yeah. You can set chunk size on a per-file basis; you just can't
> change it once the file has any data written to it.
> In the context of Hadoop the question is just if the bindings are
> configured correctly to do so automatically.
> -Greg