[ 
https://issues.apache.org/jira/browse/KUDU-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangYao updated KUDU-3008:
---------------------------
    Description: 
I accidentally found that Kudu places all replicas of a new table in a single 
location when there are only 2 locations and the replication factor is odd. 
The case is as follows:

{{location /DEFAULT/22254 has 3 tservers}}
{{location /DEFAULT/22255 has 3 tservers}}
{{Table created: replication factor = 3, tablets = 8.}}

Before I created the table, the ksck tablet summary was:

{code:java}
Tablet Replica Count by Tablet Server
               UUID               |                Host                | Replica Count |    Location
----------------------------------+------------------------------------+---------------+----------------
 5f5ddec364834ce59282d37388010f06 | opencomputeoffline.xxxxxx.net:7056 | 10            | /DEFAULT/22255
 00f24c36d39a49e8b77ff43b3bcbf0c9 | opencomputeoffline.xxxxxx.net:7054 | 10            | /DEFAULT/22255
 d0091ae869704458865b9b079ad2389e | opencomputeoffline.xxxxxx.net:7055 | 9             | /DEFAULT/22255
 507547dd183c4474855d55f7bdd9d526 | opencomputeoffline.xxxxxx.net:7052 | 7             | /DEFAULT/22254
 c6a2b6e99f0a43308d9e5773b2d8c729 | opencomputeoffline.xxxxxx.net:7053 | 6             | /DEFAULT/22254
 031808c37385477fb063e50fbc614f44 | opencomputeoffline.xxxxxx.net:7050 | 6             | /DEFAULT/22254
{code}
After I created the table, the ksck tablet summary was:

{code:java}
Tablet Replica Count by Tablet Server
               UUID               |                Host                | Replica Count |    Location
----------------------------------+------------------------------------+---------------+----------------
 507547dd183c4474855d55f7bdd9d526 | opencomputeoffline.xxxxxx.net:7052 | 15            | /DEFAULT/22254
 c6a2b6e99f0a43308d9e5773b2d8c729 | opencomputeoffline.xxxxxx.net:7053 | 14            | /DEFAULT/22254
 031808c37385477fb063e50fbc614f44 | opencomputeoffline.xxxxxx.net:7050 | 14            | /DEFAULT/22254
 5f5ddec364834ce59282d37388010f06 | opencomputeoffline.xxxxxx.net:7056 | 10            | /DEFAULT/22255
 00f24c36d39a49e8b77ff43b3bcbf0c9 | opencomputeoffline.xxxxxx.net:7054 | 10            | /DEFAULT/22255
 d0091ae869704458865b9b079ad2389e | opencomputeoffline.xxxxxx.net:7055 | 9             | /DEFAULT/22255
{code}
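
As a quick sanity check on the two summaries, the per-location totals can be compared. The sketch below only re-adds the replica counts listed above; the helper name added_per_location is made up for this illustration and is not part of Kudu or ksck.

```python
# Per-tserver replica counts copied from the two ksck summaries above.
before = {
    "/DEFAULT/22254": [7, 6, 6],
    "/DEFAULT/22255": [10, 10, 9],
}
after = {
    "/DEFAULT/22254": [15, 14, 14],
    "/DEFAULT/22255": [10, 10, 9],
}


def added_per_location(before, after):
    """Return how many replicas each location gained between two summaries."""
    return {loc: sum(after[loc]) - sum(before[loc]) for loc in before}


delta = added_per_location(before, after)

# The new table has 8 tablets with replication factor 3.
expected_new_replicas = 8 * 3

print(delta)
print(sum(delta.values()) == expected_new_replicas)
```

If the counts above are read correctly, the whole increase lands in /DEFAULT/22254 and matches the number of replicas the new table needs.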
/DEFAULT/22255 received no new replicas: all 24 replicas of the new table 
(8 tablets x 3 replicas) were placed in /DEFAULT/22254. Looking into the code, 
I found that PlacementPolicy::SelectLocation, when there are 2 locations, only 
handles an even replication factor, in which case it tries to spread the 
replicas evenly across the 2 locations. I think we should also handle an odd 
replication factor. With 2 locations, one location must hold more than half 
of the replicas, but that is still better than one location holding all of them.
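
To make the proposal concrete, here is a minimal sketch of one possible rule for the 2-location case. It is not the actual PlacementPolicy::SelectLocation logic (which is C++ and not reproduced here). The function name split_across_two_locations, the cap of ceil(replication factor / 2) replicas per location, and the use of the total replica count as a location's load are all assumptions made for illustration.

```python
import math


def split_across_two_locations(replication_factor, load_by_location):
    """Decide how many replicas of one tablet go to each of 2 locations.

    Hypothetical rule: no location may hold more than ceil(rf / 2) replicas,
    and each replica goes to the least-loaded location that is still under
    that cap. Load is simplified to the location's total replica count.
    """
    if len(load_by_location) != 2:
        raise ValueError("this sketch only covers the 2-location case")

    cap = math.ceil(replication_factor / 2)
    placed = {loc: 0 for loc in load_by_location}
    load = dict(load_by_location)  # copy so the caller's dict is untouched

    for _ in range(replication_factor):
        candidates = [loc for loc in placed if placed[loc] < cap]
        # Break ties on the location name so the result is deterministic.
        target = min(candidates, key=lambda loc: (load[loc], loc))
        placed[target] += 1
        load[target] += 1
    return placed


# Location totals taken from the "before" ksck summary in this report.
result = split_across_two_locations(
    3, {"/DEFAULT/22254": 19, "/DEFAULT/22255": 29}
)
print(result)
```

Under these assumptions, a tablet with replication factor 3 would get 2 replicas in the less loaded location and 1 in the other, and an even replication factor would still be split evenly.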

> Don't put all replicas into one location with 2 locations and odd replica 
> factor.
> ---------------------------------------------------------------------------------
>
>                 Key: KUDU-3008
>                 URL: https://issues.apache.org/jira/browse/KUDU-3008
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: ZhangYao
>            Priority: Minor
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)