[ 
https://issues.apache.org/jira/browse/HDFS-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chingachgook updated HDFS-17421:
--------------------------------
    Description: 
Hi

I have a question about two HDFS metrics. It came up while I was trying to 
estimate the load on the KDC for a production cluster

HDFS exposes two metrics:
RpcAuthenticationSuccesses - Total number of successful authentication attempts
RpcAuthenticationFailures - Total number of authentication failures
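
One way to read these two counters is the NameNode's JMX JSON servlet. The 
sketch below is only an illustration: the host name, the HTTP port 9870, the 
RPC port 8020 and the resulting bean name RpcActivityForPort8020 are 
assumptions about a default setup, not values taken from this cluster, and on 
a secured cluster the HTTP endpoint may additionally require SPNEGO.

```python
import json
from urllib.request import urlopen

# Assumed defaults (hypothetical host, default-looking ports); adjust as needed.
JMX_URL = (
    "http://namenode.example.com:9870/jmx"
    "?qry=Hadoop:service=NameNode,name=RpcActivityForPort8020"
)


def extract_auth_counters(jmx_payload):
    """Return (successes, failures) from a parsed /jmx JSON response."""
    bean = jmx_payload["beans"][0]
    return bean["RpcAuthenticationSuccesses"], bean["RpcAuthenticationFailures"]


def fetch_auth_counters(url=JMX_URL):
    """Fetch the counters over HTTP; not called here, needs a live NameNode."""
    with urlopen(url) as resp:
        return extract_auth_counters(json.load(resp))


# Offline example using the starting values reported in TEST 1.
sample = {
    "beans": [
        {
            "name": "Hadoop:service=NameNode,name=RpcActivityForPort8020",
            "RpcAuthenticationSuccesses": 208322,
            "RpcAuthenticationFailures": 0,
        }
    ]
}
print(extract_auth_counters(sample))  # (208322, 0)
```

Reading the counters immediately before and after a single command makes it 
easier to attribute a change to that command.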

I expect any data request in the Hadoop cluster to perform
a request to the KDC -> get a ticket,
a request to the NameNode,
after which exactly one counter should increase by 1: 
RpcAuthenticationSuccesses if authentication succeeded, or 
RpcAuthenticationFailures if it failed

However, in a test cluster with 4 DataNodes and 2 NameNodes (HA), I see 
values for these metrics that I cannot explain.

I also noticed that RpcAuthenticationSuccesses steadily increases by 1 
every 30 seconds


*TEST 1*
I made sure that
1. Only the HDFS-\{NN,DN,JN, ZKFC} and YARN-\{RM,NM} services are running
2. All other installed components (Hive, Spark HistoryServer) are stopped
3. There are no YARN jobs running and no user requests to HDFS

At the start of the test, RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208322

To generate load, I ran a spark-submit test with 
spark-examples_2.12-3.5.0.jar and the number of executors = 1
The job completed in 1 minute and 20 seconds
RpcAuthenticationSuccesses = 208338

In total, the counter grew by 16 during execution
Say +2 of that can be attributed to the +1 every 30 seconds that I described 
above. But what do the remaining +14 authentications mean?
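
The arithmetic for TEST 1 can be written out as a short sketch. It assumes 
the background increase is exactly one increment per full 30-second period, 
as observed above; the function name is only illustrative.

```python
def unexplained_increments(before, after, duration_s, background_period_s=30):
    """Split a counter delta into the expected background part and the rest."""
    delta = after - before
    # Whole background periods that fit into the run time.
    background = duration_s // background_period_s
    return delta, background, delta - background


# Values from TEST 1: 208322 -> 208338 over 1 minute 20 seconds (80 s).
delta, background, remainder = unexplained_increments(208322, 208338, 80)
print(delta, background, remainder)  # 16 2 14
```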

*TEST 2*
RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208388

hdfs dfs -ls /
RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208389
The counter increased by 1. Why?
I ran kinit long before the ls request, so I would expect the metric not to 
change, but maybe I'm wrong

*TEST 3*

Stopped:
- All DNs
- Standby NN
- All YARN services (RM, NM)

Still running:
- Three JNs, ZKFC
- One active NN

RpcAuthenticationSuccesses still increases by 1 every 30 seconds
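
For scale, the steady background rate can be extrapolated as below. This is 
only a sketch that assumes the observed period stays constant; it says 
nothing about what causes the increments.

```python
def increments_per_day(period_s):
    """Number of increments per day for one increment every period_s seconds."""
    seconds_per_day = 24 * 60 * 60
    return seconds_per_day // period_s


print(increments_per_day(30))  # 2880
```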

Either I misunderstand the meaning of these metrics, or they are being 
counted incorrectly

 

  was:
I have a question about two HDFS metrics. It came up while I was trying to 
estimate the load on the KDC for a production cluster

HDFS exposes two metrics:
RpcAuthenticationSuccesses - Total number of successful authentication attempts
RpcAuthenticationFailures - Total number of authentication failures

I expect any data request in the Hadoop cluster to perform
a request to the KDC -> get a ticket,
after which exactly one counter should increase by 1: 
RpcAuthenticationSuccesses if authentication succeeded, or 
RpcAuthenticationFailures if it failed

However, in a test cluster with 4 DataNodes and 2 NameNodes (HA), I see 
values for these metrics that I cannot explain.

I also noticed that RpcAuthenticationSuccesses steadily increases by 1 
every 30 seconds

 

*TEST 1*
I made sure that
1. Only the HDFS-\{NN,DN,JN, ZKFC} and YARN-\{RM,NM} services are running
2. All other installed components (Hive, Spark HistoryServer) are stopped
3. There are no YARN jobs running and no user requests to HDFS

At the start of the test, RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208322

To generate load, I ran a spark-submit test with 
spark-examples_2.12-3.5.0.jar and the number of executors = 1
The job completed in 1 minute and 20 seconds
RpcAuthenticationSuccesses = 208338

In total, the counter grew by 16 at runtime
Say +2 of that can be attributed to the +1 every 30 seconds that I described 
above. But what do the remaining +14 authentications mean?

*TEST 2*
RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208388

hdfs dfs -ls /
RpcAuthenticationFailures = 0
RpcAuthenticationSuccesses = 208389
The counter increased by 1. Why?
I ran kinit long before the ls request, so I would expect the metric not to 
change, but maybe I'm wrong

*TEST 3*

Stopped:
- All DNs
- Standby NN
- All YARN services (RM, NM)

Still running:
- Three JNs, ZKFC
- One active NN

RpcAuthenticationSuccesses still increases by 1 every 30 seconds

Either I misunderstand the meaning of these metrics, or they are being 
counted incorrectly


> Check the correctness of the calculation RpcAuthentication*
> -----------------------------------------------------------
>
>                 Key: HDFS-17421
>                 URL: https://issues.apache.org/jira/browse/HDFS-17421
>             Project: Hadoop HDFS
>          Issue Type: Test
>          Components: hdfs, metrics
>            Reporter: Chingachgook
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
