I have a configuration with 18 OSDs spread across 3 hosts. I am struggling to 
get an even distribution of placement groups between the OSDs for a specific 
pool. All the OSDs are the same size and have the same weight in the CRUSH 
map. The fill levels of the individual placement groups are very close when I 
put data into them; however, I get a fairly uneven spread of placement groups 
across the OSDs. This leads to the pool filling one OSD well before the 
others: a file system can report only 70% full in aggregate while one OSD is 
already full and can take no more data. It also clearly leads to an 
unbalanced load between devices which are all of the same physical type with 
the same throughput.

If I dump the placement groups and count them by OSD, I typically see 
something like this (the rough script I use to produce these counts is 
sketched below the table):

pool :    4    5    6    7    8    9    0    1   14    2   15    3 | SUM
-------------------------------------------------------------------------
osd.17    4    5    7    5   11    6    9    4   17   12   42    8 | 130
osd.4     7    8    8    7    4    6    1    4   12    8   23    8 |  96
osd.5     8    5   10   10    5    6    3    7   13    7   34   13 | 121
osd.6     9    6    8    2    3   10    1    4   12   10   26   10 | 101
osd.7     7   10    7    7    9   13    1    6   20    5   29    5 | 119
osd.8     6    7    4    6    6    3    7   11   20    7   28    9 | 114
osd.9     8   10    9    9    5    6    4    5   15    5   22    4 | 102
osd.10    3    2    4    5   11    9    3    4   20    7   38    8 | 114
osd.11    8   11   10    7    7   13    3    4   19    8   29    6 | 125
osd.12    7    6   10    5    8    4    2    8   18    6   37    9 | 120
osd.0     3    6   11   13    7    5    6   11   17    6   35    9 | 129
osd.13    7    8    5   10   11    8    4   13   18   11   35    5 | 135
osd.1    13    8    9    4    7    7    4    6   10   10   43    3 | 124
osd.14    8    7    4    7    8    3    3    8   16    3   28    6 | 101
osd.15    9    7    5    3    4   10    5    6   17    7   35    5 | 113
osd.2     7    9    9   11   11    8    2    8    9    6   34    9 | 123
osd.16    9    4    5    7    4    0    3    6   21    4   26    6 |  95
osd.3     5    9    3   10    7   11    3   13   14    6   32    5 | 118
-------------------------------------------------------------------------
SUM :   128  128  128  128  128  128   64  128  288  128  576  128 |
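
For reference, the counts above come from a rough script along these lines, 
fed by "ceph pg dump --format json". The JSON field names ("pg_stats", 
"pgid", "up") are what my version emits and may differ in newer releases, so 
treat this as a sketch rather than anything polished:

#!/usr/bin/env python3
# Count PG copies per OSD, per pool, from "ceph pg dump --format json".
# Usage:  ceph pg dump --format json | python3 pgs_per_osd.py
import json
import sys
from collections import defaultdict

dump = json.load(sys.stdin)
# Older releases put pg_stats at the top level; newer ones wrap it in pg_map.
pg_stats = dump.get("pg_stats") or dump.get("pg_map", {}).get("pg_stats", [])

counts = defaultdict(lambda: defaultdict(int))   # counts[osd][pool] -> PG copies
pools = set()

for pg in pg_stats:
    pool = pg["pgid"].split(".")[0]              # pgid is "<pool>.<pg>", e.g. "15.3f"
    pools.add(pool)
    for osd in pg["up"]:                         # every OSD in the up set holds a copy
        counts[osd][pool] += 1

pool_order = sorted(pools, key=int)
print("pool :\t" + "\t".join(pool_order) + "\t| SUM")
for osd in sorted(counts):
    row = [counts[osd][p] for p in pool_order]
    print("osd.%s\t%s\t| %d" % (osd, "\t".join(str(n) for n in row), sum(row)))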

In this example I want to put a file system across pools 14 and 15. The data 
pool (15) has between 22 and 43 placement groups per OSD, where an even 
spread over 18 OSDs would put 576 / 18 = 32 on each.

Am I just missing something here in defining the CRUSH map? All I can find 
are recommendations to get a more even balance by having more PGs per OSD, 
but eventually I just get warnings about too many placement groups per OSD.
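
As a rough sanity check on where I already am, assuming the per-OSD figure is 
simply the total number of PG replicas spread evenly over all OSDs:

# Back-of-the-envelope PG-per-OSD average, using the column sums from the
# table above (total PG replicas per pool across all OSDs).
num_osds = 18
pg_replicas_per_pool = {
    "0": 64, "1": 128, "2": 128, "3": 128, "4": 128, "5": 128,
    "6": 128, "7": 128, "8": 128, "9": 128, "14": 288, "15": 576,
}
total = sum(pg_replicas_per_pool.values())                        # 2080
print("average PGs per OSD: %.1f" % (total / num_osds))           # ~115.6
print("ideal per-OSD count for pool 15: %d" % (576 // num_osds))  # 32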

Or is the issue that there are multiple pools on this set of OSDs and 
placement groups are being created in parallel for several of them? In this 
case, though, pool 15 was created after all the other pools and their 
placement groups already existed, and even the first pool created is 
unevenly spread.

So, are there any controls that influence how placement groups are allocated 
to OSDs when a pool is initially created?

Thanks

   Steve

