Ahmed Hussein created HADOOP-17346:
--------------------------------------

             Summary: Fair call queue is defeated by abusive service principals
                 Key: HADOOP-17346
                 URL: https://issues.apache.org/jira/browse/HADOOP-17346
             Project: Hadoop Common
          Issue Type: Bug
          Components: common, ipc
            Reporter: Ahmed Hussein
            Assignee: Ahmed Hussein


[~daryn] reported that the FCQ prioritizes based on the full Kerberos
principal (i.e. "user/host@realm") rather than the short name (i.e. "user").
This is intended to prevent service principals like the DNs and NMs from being
de-prioritized, since service principals are expected to be well behaved.
Notably, the DNs contribute a significant but important load, so the intent is
not to de-prioritize all DNs because their sum total load is high relative to
users.

This has the unfortunate side effect of allowing misbehaving, non-critical
service principals to abuse the FCQ. The gstorm/* principals are a prime
example. Each server is spamming opens as fast as possible, yet none of the
gstorm servers can be de-prioritized because each principal accounts for only a
fraction of the total load from all principals.
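
The effect can be illustrated with a small model. This is only a sketch, not
Hadoop code: it assumes a scheduler that assigns priority from each identity's
share of all recent calls, and the threshold values, host count, call counts,
and the "alice" user are hypothetical numbers chosen for illustration.

```python
from collections import Counter

# Assumed thresholds: share of total calls at which an identity drops one level.
THRESHOLDS = (0.125, 0.25, 0.5)


def priority_levels(call_counts, thresholds=THRESHOLDS):
    """Map each identity to a priority level (0 is highest) from its call share."""
    total = sum(call_counts.values())
    levels = {}
    for identity, count in call_counts.items():
        share = count / total if total else 0.0
        # Each threshold met or exceeded pushes the identity down one level.
        levels[identity] = sum(1 for t in thresholds if share >= t)
    return levels


def short_name(principal):
    """Simplified short name: the part before the first '/' or '@'."""
    return principal.split("@", 1)[0].split("/", 1)[0]


# Hypothetical load: 20 gstorm hosts at 1000 calls each, one ordinary user.
calls = Counter({f"gstorm/host{i}@REALM": 1000 for i in range(20)})
calls["alice@REALM"] = 500

# Keyed by full principal: every gstorm host holds under 5% of the load.
by_principal = priority_levels(calls)

# Keyed by short name: the gstorm load is summed under a single identity.
by_short = Counter()
for principal, count in calls.items():
    by_short[short_name(principal)] += count
by_short_levels = priority_levels(by_short)
```

In this model every full principal lands at level 0, while grouping by short
name puts "gstorm" at level 3 and leaves "alice" at level 0.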

The secondary and more devastating problem is that other abusive non-service
principals cannot be effectively de-prioritized. The sum total of all gstorm
load prevents other principals from surpassing the priority thresholds.
Those principals stay in the highest priority queues, which allows the abusive
principals to overflow the entire call queue for extended periods of time.
Notably, it prevents the FCQ from moderating the heavy create loads from p_gup @
DB, which cause significant performance degradation.

Prioritization should be based on short name with configurable exemptions for 
services like the DN/NM.
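
A minimal sketch of that rule follows, under stated assumptions. It is not the
patch applied on the clusters: the function name, the exempt service names
"dn" and "nm", and the simplified principal parsing are all hypothetical, and
real deployments derive short names through their own mapping rules.

```python
def scheduling_identity(principal, exempt_services=frozenset({"dn", "nm"})):
    """Return the identity under which a caller's load is accounted.

    Exempt services keep their full principal, so each host is tracked
    separately. Everyone else collapses to the short name, so load from all
    hosts of one service or user is summed together.
    """
    # Simplified parsing of "primary/instance@REALM"; an assumption for this sketch.
    primary = principal.split("@", 1)[0].split("/", 1)[0]
    if primary in exempt_services:
        return principal
    return primary


# Hypothetical principals for illustration.
ids = [
    scheduling_identity(p)
    for p in ("gstorm/host1@REALM", "gstorm/host2@REALM", "dn/host7@REALM", "alice@REALM")
]
```

Here both gstorm hosts map to the same identity, the DN keeps its per-host
identity, and an ordinary user maps to the plain user name.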

[~daryn] suggested a solution that we applied on our clusters.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
