[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=497077&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497077
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 08/Oct/20 04:36
Start Date: 08/Oct/20 04:36
Worklog Time Spent: 10m 
  Work Description: ferhui commented on a change in pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#discussion_r501443141



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java
##
@@ -519,6 +525,20 @@ private Object invokeMethod(
 }
   }
 
+  /**
+   * For tracking the actual client address.
+   * It adds a clientIp/ip key/value pair to the caller context.
+   */
+  private void appendClientIpToCallerContext() {
+    final CallerContext ctx = CallerContext.getCurrent();
+    String origContext = ctx == null ? null : ctx.getContext();
+    byte[] origSignature = ctx == null ? null : ctx.getSignature();
+    CallerContext.setCurrent(
+        new CallerContext.Builder(origContext, clientConfiguration)

Review comment:
   OK, fixed, please review again, thanks!





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 497077)
Time Spent: 3h  (was: 2h 50m)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Locality is 
> left to HDFS-13248.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7343) HDFS smart storage management

2020-10-07 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209992#comment-17209992
 ] 

Brahma Reddy Battula commented on HDFS-7343:


Any update on this feature?

> HDFS smart storage management
> -
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Wei Zhou
>Priority: Major
> Attachments: HDFS-Smart-Storage-Management-update.pdf, 
> HDFS-Smart-Storage-Management.pdf, 
> HDFSSmartStorageManagement-General-20170315.pdf, 
> HDFSSmartStorageManagement-Phase1-20170315.pdf, access_count_tables.jpg, 
> move.jpg, tables_in_ssm.xlsx
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine that considers file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preference, etc.
> Modified the title to re-purpose the issue.
> We'd like to extend this effort a bit and aim to work on a comprehensive 
> solution that provides a smart storage management service, enabling convenient, 
> intelligent and effective use of erasure coding or replicas, the HDFS cache 
> facility, HSM offerings, and all kinds of tools (balancer, mover, disk 
> balancer and so on) in a large cluster.






[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-10-07 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209985#comment-17209985
 ] 

Hui Fei commented on HDFS-15240:


OK! I will take a look

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, 
> HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, 
> HDFS-15240.006.patch, image-2020-07-16-15-56-38-608.png
>
>
> When reading some lzo files we found some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), 
> and found the longest common substring (LCS) between b6' (decoded) and b6 
> (read from the DN), and likewise for b7'/b7 and b8'/b8.
> After selecting 6 blocks of the block group in each combination and iterating 
> through all cases, I found one case where the length of the LCS is the block 
> length - 64KB; 64KB is exactly the length of the ByteBuffer used by 
> StripedBlockReader. So the corrupt reconstruction block is made by a dirty 
> buffer.
> The following log snippet (only 2 of 28 cases shown) is my check program's 
> output. In my case, I knew the 3rd block was corrupt, so 5 other blocks were 
> needed to decode another 3 blocks, and the 1st block's LCS turned out to be 
> the block length - 64KB.
> It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, 
> and the dirty buffer was used before reading the 1st block.
> It must be noted that StripedBlockReader reads from offset 0 of the 1st block 
> after the dirty buffer was used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
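The brute-force comparison described above can be sketched with a standard longest-common-substring scan. This is illustrative code only, not the reporter's actual check program; the block contents and the simulated 64-byte "dirty prefix" are made up to show how a stale buffer produces an LCS of (block length - dirty length).

```java
import java.util.Arrays;

public class LcsCheck {
  // Classic O(n*m) dynamic-programming longest common substring length.
  static int longestCommonSubstring(byte[] a, byte[] b) {
    int best = 0;
    int[] prev = new int[b.length + 1];
    for (int i = 1; i <= a.length; i++) {
      int[] cur = new int[b.length + 1];
      for (int j = 1; j <= b.length; j++) {
        if (a[i - 1] == b[j - 1]) {
          cur[j] = prev[j - 1] + 1;
          best = Math.max(best, cur[j]);
        }
      }
      prev = cur;
    }
    return best;
  }

  public static void main(String[] args) {
    byte[] original = new byte[1024];
    for (int i = 0; i < original.length; i++) {
      original[i] = (byte) (i * 31);
    }
    // Simulate a "dirty buffer": the first 64 bytes of the decoded copy are stale.
    byte[] decoded = Arrays.copyOf(original, original.length);
    for (int i = 0; i < 64; i++) {
      decoded[i] = (byte) 0xFF;
    }
    // The matching tail is blockLength - dirtyPrefix = 1024 - 64 = 960.
    System.out.println("LCS length = " + longestCommonSubstring(original, decoded));
  }
}
```

With real 27 MB blocks this quadratic scan would be far too slow; the point is only the diagnostic signal: an LCS just short of the full block length pins the corruption to a buffer-sized prefix.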
> Now I know the dirty buffer causes the reconstruction block error, but how 
> does the dirty buffer come about?
> After digging into the code and the DN log, I found that the following DN log 
> reveals the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834) {code}
> Reading from a DN may time out (held by a future F) and output the INFO log 
> above, but the futures collection that contains the future F has already been 
> cleared:
> {code:java}
> return new StripingChunkReadResult(futures.remove(future),
> StripingChunkReadResult.CANCELLED); {code}
> futures.remove(future) causes an NPE. So the EC reconstruction is 
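A minimal sketch of that failure mode, with hypothetical names rather than Hadoop's actual classes: once the map of pending futures has been cleared, `remove(future)` returns null, and unboxing that null is exactly the kind of NullPointerException shown in the stack trace above.

```java
import java.util.HashMap;
import java.util.Map;

public class DirtyFutureDemo {
  // Stand-in for the map of in-flight striped reads (illustrative only).
  static final Map<String, Integer> futures = new HashMap<>();

  // Mirrors the shape of StripingChunkReadResult(futures.remove(future), ...):
  // unboxing the null Integer is where the NPE fires.
  static int chunkIndexOf(String future) {
    return futures.remove(future); // NPE if the entry was already cleared
  }

  public static void main(String[] args) {
    futures.put("f1", 0);
    futures.clear(); // reconstruction round ended; the map was cleared
    try {
      chunkIndexOf("f1");
    } catch (NullPointerException e) {
      System.out.println("NPE: entry already removed");
    }
  }
}
```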

[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=497054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-497054
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 08/Oct/20 03:01
Start Date: 08/Oct/20 03:01
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on a change in pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#discussion_r501420480



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java
##
@@ -519,6 +525,20 @@ private Object invokeMethod(
 }
   }
 
+  /**
+   * For tracking the actual client address.
+   * It adds a clientIp/ip key/value pair to the caller context.
+   */
+  private void appendClientIpToCallerContext() {
+    final CallerContext ctx = CallerContext.getCurrent();
+    String origContext = ctx == null ? null : ctx.getContext();
+    byte[] origSignature = ctx == null ? null : ctx.getSignature();
+    CallerContext.setCurrent(
+        new CallerContext.Builder(origContext, clientConfiguration)

Review comment:
   Can we pass the string separator instead of configuration to avoid 
unnecessary `Configuration.get()` for each RPC?
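The reviewer's suggestion can be sketched as follows. The class, field, and configuration-key names here are hypothetical stand-ins (the real plumbing lives in RouterRpcClient); the point is only the pattern of resolving the separator once at construction time instead of calling `Configuration.get()` on every RPC.

```java
import java.util.Map;

public class RouterRpcClientSketch {
  private final String contextFieldSeparator;

  RouterRpcClientSketch(Map<String, String> conf) {
    // One lookup at startup; each RPC then reuses the cached value.
    // "caller.context.separator" is an illustrative key, not a real one.
    this.contextFieldSeparator =
        conf.getOrDefault("caller.context.separator", ",");
  }

  // Append a clientIp field to an existing caller context string.
  String appendClientIp(String origContext, String clientIp) {
    String field = "clientIp:" + clientIp;
    return origContext == null
        ? field
        : origContext + contextFieldSeparator + field;
  }

  public static void main(String[] args) {
    RouterRpcClientSketch c = new RouterRpcClientSketch(Map.of());
    System.out.println(c.appendClientIp("clientContext", "10.0.0.1"));
  }
}
```

The design point is simply hoisting an invariant lookup out of the per-request hot path.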







Issue Time Tracking
---

Worklog Id: (was: 497054)
Time Spent: 2h 50m  (was: 2h 40m)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Locality is 
> left to HDFS-13248.






[jira] [Commented] (HDFS-15616) [SBN] Disable trigger edit Log Roll for Observers

2020-10-07 Thread Janus Chow (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209953#comment-17209953
 ] 

Janus Chow commented on HDFS-15616:
---

Hi, [~csun], I noticed you made a similar patch for the Observer to disable 
saving checkpoints. Could you help review this ticket?

> [SBN] Disable trigger edit Log Roll for Observers
> -
>
> Key: HDFS-15616
> URL: https://issues.apache.org/jira/browse/HDFS-15616
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Janus Chow
>Priority: Major
> Attachments: HDFS-15616.001.patch
>
>
> Currently, when an Observer is transitioned from StandbyState, the editLogTailer 
> will still send the request to roll the editLog to the Active NN; this should be 
> disabled to keep the definition of "logRollPeriodMs" clear.
> One thing I'm not sure about: in a cluster with multiple standby NameNodes, all 
> the standby NNs will trigger the roll. Should this feature be extended to all 
> standby NNs, or implemented on observers first?
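The proposed behavior can be sketched as a simple state gate. This is illustrative only, not the actual EditLogTailer code; the names and the STANDBY-only condition are assumptions made to show the shape of the change.

```java
// HA states relevant to the discussion (simplified).
enum HAState { ACTIVE, STANDBY, OBSERVER }

public class EditLogRollSketch {
  // Decide whether this NN should ask the Active NN to roll its edit log.
  static boolean shouldTriggerRoll(HAState state, long sinceLastRollMs,
                                   long logRollPeriodMs) {
    if (state == HAState.OBSERVER) {
      return false; // the proposed change: observers never request a roll
    }
    return state == HAState.STANDBY && sinceLastRollMs >= logRollPeriodMs;
  }

  public static void main(String[] args) {
    System.out.println(shouldTriggerRoll(HAState.OBSERVER, 600_000, 120_000));
    System.out.println(shouldTriggerRoll(HAState.STANDBY, 600_000, 120_000));
  }
}
```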






[jira] [Updated] (HDFS-15616) [SBN] Disable trigger edit Log Roll for Observers

2020-10-07 Thread Janus Chow (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janus Chow updated HDFS-15616:
--
Component/s: hdfs

> [SBN] Disable trigger edit Log Roll for Observers
> -
>
> Key: HDFS-15616
> URL: https://issues.apache.org/jira/browse/HDFS-15616
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Janus Chow
>Priority: Major
> Attachments: HDFS-15616.001.patch
>
>
> Currently, when an Observer is transitioned from StandbyState, the editLogTailer 
> will still send the request to roll the editLog to the Active NN; this should be 
> disabled to keep the definition of "logRollPeriodMs" clear.
> One thing I'm not sure about: in a cluster with multiple standby NameNodes, all 
> the standby NNs will trigger the roll. Should this feature be extended to all 
> standby NNs, or implemented on observers first?






[jira] [Updated] (HDFS-15616) [SBN] Disable trigger edit Log Roll for Observers

2020-10-07 Thread Janus Chow (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janus Chow updated HDFS-15616:
--
Description: 
Currently when Observer is transitioned from StandbyState, the editLogTailer 
will still send the request to roll editLog to ActiveNN, which should be 
disabled to keep the definition of "logRollPeriodMs" clear.

One thing I'm not sure is for a cluster with multi standby Namenode, all the 
standby NN will trigger the roll. Should this feature be extended to all 
standby NNs or implementing on observers first?

  was:
Currently when Observer is transitioned from StandbyState, the editLogTailer 
will still send the request to roll editLog to ActiveNN, which should be 
disabled to keep the definition of "

logRollPeriodMs" clear.

One thing I'm not sure is for a cluster with multi standby Namenode, all the 
standby NN will trigger the roll. Should this feature be extended to all 
standby NNs or implementing on observers first?


> [SBN] Disable trigger edit Log Roll for Observers
> -
>
> Key: HDFS-15616
> URL: https://issues.apache.org/jira/browse/HDFS-15616
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Janus Chow
>Priority: Major
> Attachments: HDFS-15616.001.patch
>
>
> Currently, when an Observer is transitioned from StandbyState, the editLogTailer 
> will still send the request to roll the editLog to the Active NN; this should be 
> disabled to keep the definition of "logRollPeriodMs" clear.
> One thing I'm not sure about: in a cluster with multiple standby NameNodes, all 
> the standby NNs will trigger the roll. Should this feature be extended to all 
> standby NNs, or implemented on observers first?






[jira] [Assigned] (HDFS-9409) DataNode shutdown does not guarantee full shutdown of all threads due to race condition.

2020-10-07 Thread Ahmed Hussein (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-9409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hussein reassigned HDFS-9409:
---

Assignee: Ahmed Hussein

> DataNode shutdown does not guarantee full shutdown of all threads due to race 
> condition.
> 
>
> Key: HDFS-9409
> URL: https://issues.apache.org/jira/browse/HDFS-9409
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Chris Nauroth
>Assignee: Ahmed Hussein
>Priority: Major
>
> {{DataNode#shutdown}} is documented to return "only after shutdown is 
> complete".  Even after completion of this method, it's possible that threads 
> started by the DataNode are still running.  Race conditions in the shutdown 
> sequence may cause it to skip stopping and joining the {{BPServiceActor}} 
> threads.
> This is likely not a big problem in normal operations, because these are 
> daemon threads that won't block overall process exit.  It is more of a 
> problem for tests, because it makes it impossible to write reliable 
> assertions that these threads exited cleanly.  For large test suites, it can 
> also cause an accumulation of unneeded threads, which might harm test 
> performance.
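The guarantee being asked for can be sketched as follows. The names are hypothetical, not the DataNode's actual fields; the point is that a shutdown method documented to "return only after shutdown is complete" must both signal and join every worker thread before returning, so no thread can outlive it.

```java
import java.util.ArrayList;
import java.util.List;

public class ShutdownJoinSketch {
  final List<Thread> workers = new ArrayList<>();
  private volatile boolean running = true;

  // Start n daemon worker threads (stand-ins for BPServiceActor-like actors).
  void start(int n) {
    for (int i = 0; i < n; i++) {
      Thread t = new Thread(() -> {
        while (running) {
          try {
            Thread.sleep(10);
          } catch (InterruptedException e) {
            return; // exit promptly when interrupted during shutdown
          }
        }
      }, "worker-" + i);
      t.setDaemon(true);
      t.start();
      workers.add(t);
    }
  }

  // Signal, then join, every worker: only after the loop does shutdown return.
  void shutdown() throws InterruptedException {
    running = false;
    for (Thread t : workers) {
      t.interrupt();
      t.join();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ShutdownJoinSketch s = new ShutdownJoinSketch();
    s.start(3);
    s.shutdown();
    for (Thread t : s.workers) {
      System.out.println(t.getName() + " alive=" + t.isAlive());
    }
  }
}
```

With this shape, a test can assert `!t.isAlive()` for every worker immediately after `shutdown()` returns, which is exactly the assertion the race in the real shutdown sequence makes unreliable.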






[jira] [Commented] (HDFS-15240) Erasure Coding: dirty buffer causes reconstruction block error

2020-10-07 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209917#comment-17209917
 ] 

Wei-Chiu Chuang commented on HDFS-15240:


[~ferhui] could you review this patch? Looks like a great fix from a quick look.

> Erasure Coding: dirty buffer causes reconstruction block error
> --
>
> Key: HDFS-15240
> URL: https://issues.apache.org/jira/browse/HDFS-15240
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Reporter: HuangTao
>Assignee: HuangTao
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15240.001.patch, HDFS-15240.002.patch, 
> HDFS-15240.003.patch, HDFS-15240.004.patch, HDFS-15240.005.patch, 
> HDFS-15240.006.patch, image-2020-07-16-15-56-38-608.png
>
>
> When reading some lzo files we found some blocks were broken.
> I read back all internal blocks (b0-b8) of the block group (RS-6-3-1024k) from 
> the DN directly, chose 6 blocks (b0-b5) to decode the other 3 (b6', b7', b8'), 
> and found the longest common substring (LCS) between b6' (decoded) and b6 
> (read from the DN), and likewise for b7'/b7 and b8'/b8.
> After selecting 6 blocks of the block group in each combination and iterating 
> through all cases, I found one case where the length of the LCS is the block 
> length - 64KB; 64KB is exactly the length of the ByteBuffer used by 
> StripedBlockReader. So the corrupt reconstruction block is made by a dirty 
> buffer.
> The following log snippet (only 2 of 28 cases shown) is my check program's 
> output. In my case, I knew the 3rd block was corrupt, so 5 other blocks were 
> needed to decode another 3 blocks, and the 1st block's LCS turned out to be 
> the block length - 64KB.
> It means the (0,1,2,4,5,6)th blocks were used to reconstruct the 3rd block, 
> and the dirty buffer was used before reading the 1st block.
> It must be noted that StripedBlockReader reads from offset 0 of the 1st block 
> after the dirty buffer was used.
> EDITED for readability.
> {code:java}
> decode from block[0, 2, 3, 4, 5, 7] to generate block[1', 6', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[6] and block[6'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4
> decode from block[0, 2, 3, 4, 5, 6] to generate block[1', 7', 8']
> Check the first 131072 bytes between block[1] and block[1'], the longest 
> common substring length is 65536
> CHECK AGAIN: all 27262976 bytes between block[1] and block[1'], the longest 
> common substring length is 27197440  # this one
> Check the first 131072 bytes between block[7] and block[7'], the longest 
> common substring length is 4
> Check the first 131072 bytes between block[8] and block[8'], the longest 
> common substring length is 4{code}
> Now I know the dirty buffer causes the reconstruction block error, but how 
> does the dirty buffer come about?
> After digging into the code and the DN log, I found that the following DN log 
> reveals the root cause.
> {code:java}
> [INFO] [stripedRead-1017] : Interrupted while waiting for IO on channel 
> java.nio.channels.SocketChannel[connected local=/:52586 
> remote=/:50010]. 18 millis timeout left.
> [WARN] [StripedBlockReconstruction-199] : Failed to reconstruct striped 
> block: BP-714356632--1519726836856:blk_-YY_3472979393
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.util.StripedBlockUtil.getNextCompletedStripedRead(StripedBlockUtil.java:314)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.doReadMinimumSources(StripedReader.java:308)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedReader.readMinimumSources(StripedReader.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:94)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834) {code}
> Reading from a DN may time out (held by a future F) and output the INFO log 
> above, but the futures collection that contains the future F has already been 
> cleared:
> {code:java}
> return new StripingChunkReadResult(futures.remove(future),
> 

[jira] [Commented] (HDFS-15597) ContentSummary.getSpaceConsumed does not consider replication

2020-10-07 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209910#comment-17209910
 ] 

Wei-Chiu Chuang commented on HDFS-15597:


Thanks for the patch.

I spent some time understanding it.

(1) This is FileContext. FileSystem#getContentSummary() has exactly the same 
implementation (and thus the same bug), but when it is used with HDFS, 
DistributedFileSystem#getContentSummary() overrides it and the NameNode 
provides the correct space usage. It is only when FileContext is used that we 
have this bug.

(2) The patch addresses the bug for HDFS. However, it will be incorrect for 
HDFS-EC (replication=0).
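The computation under discussion, and the EC caveat, can be sketched as follows. This is illustrative only, not the actual patch: the method name and the `max(replication, 1)` fallback are assumptions, and as the comment notes, a real fix for erasure-coded files must use the EC policy's storage overhead rather than a replication multiplier.

```java
public class SpaceConsumedSketch {
  // Naive replication-aware space usage. Replicated files report
  // replication >= 1; EC files report replication == 0, so multiplying
  // blindly would zero them out -- the guard keeps at least the raw length.
  static long spaceConsumed(long length, short replication) {
    return length * Math.max(replication, (short) 1);
  }

  public static void main(String[] args) {
    // The 3 MiB file from the report, replication factor 2.
    System.out.println(spaceConsumed(3145728L, (short) 2)); // 6291456
    // An EC file (replication == 0): falls back to the raw length,
    // which understates the true parity overhead.
    System.out.println(spaceConsumed(3145728L, (short) 0)); // 3145728
  }
}
```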

> ContentSummary.getSpaceConsumed does not consider replication
> -
>
> Key: HDFS-15597
> URL: https://issues.apache.org/jira/browse/HDFS-15597
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfs
>Affects Versions: 2.6.0
>Reporter: Ajmal Ahammed
>Assignee: Aihua Xu
>Priority: Minor
> Attachments: HDFS-15597.patch
>
>
> I am trying to get the disk space consumed by an HDFS directory using the 
> {{ContentSummary.getSpaceConsumed}} method, but I can't get the space 
> consumption to correctly account for the replication factor. The replication 
> factor is 2, and I was expecting twice the actual file size from the above 
> method.
> {code}
> ubuntu@ubuntu:~/ht$ sudo -u hdfs hdfs dfs -ls /var/lib/ubuntu
> Found 2 items
> -rw-r--r--   2 ubuntu ubuntu3145728 2020-09-08 09:55 
> /var/lib/ubuntu/size-test
> drwxrwxr-x   - ubuntu ubuntu  0 2020-09-07 06:37 /var/lib/ubuntu/test
> {code}
> But when I run the following code,
> {code}
> String path = "/etc/hadoop/conf/";
> conf.addResource(new Path(path + "core-site.xml"));
> conf.addResource(new Path(path + "hdfs-site.xml"));
> long size = 
> FileContext.getFileContext(conf).util().getContentSummary(fileStatus).getSpaceConsumed();
> System.out.println("Replication : " + fileStatus.getReplication());
> System.out.println("File size : " + size);
> {code}
> The output is
> {code}
> Replication : 0
> File size : 3145728
> {code}
> Both the file size and the replication factor seem to be incorrect.
> /etc/hadoop/conf/hdfs-site.xml contains the following config:
> {code}
>   <property>
>     <name>dfs.replication</name>
>     <value>2</value>
>   </property>
> {code}






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=496945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496945
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 07/Oct/20 21:19
Start Date: 07/Oct/20 21:19
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#discussion_r501316133



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java
##
@@ -1901,4 +1904,27 @@ private DFSClient getFileDFSClient(final String path) {
 }
 return null;
   }
+
+  @Test
+  public void testMkdirsWithCallerContext() throws IOException {
+    GenericTestUtils.LogCapturer auditlog =
+        GenericTestUtils.LogCapturer.captureLogs(FSNamesystem.auditLog);
+
+    // Current callerContext is null
+    assertNull(CallerContext.getCurrent());
+
+    // Set client context
+    CallerContext.setCurrent(
+        new CallerContext.Builder("clientContext").build());
+
+    // Create a directory via the router
+    String dirPath = "/test_dir_with_callercontext";
+    FsPermission permission = new FsPermission("755");
+    routerProtocol.mkdirs(dirPath, permission, false);
+
+    // The audit log should contain "callerContext=clientContext,clientIp:"
+    assertTrue(auditlog.getOutput()
+        .contains("callerContext=clientContext,clientIp:"));

Review comment:
   Correct, grabbing the proper client IP is not trivial and is error-prone, so 
I'm fine with this.







Issue Time Tracking
---

Worklog Id: (was: 496945)
Time Spent: 2h 40m  (was: 2.5h)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Locality is 
> left to HDFS-13248.






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=496852&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496852
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 07/Oct/20 18:55
Start Date: 07/Oct/20 18:55
Worklog Time Spent: 10m 
  Work Description: ferhui commented on a change in pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#discussion_r501240102



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java
##
@@ -1901,4 +1904,27 @@ private DFSClient getFileDFSClient(final String path) {
 }
 return null;
   }
+
+  @Test
+  public void testMkdirsWithCallerContext() throws IOException {
+    GenericTestUtils.LogCapturer auditlog =
+        GenericTestUtils.LogCapturer.captureLogs(FSNamesystem.auditLog);
+
+    // Current callerContext is null
+    assertNull(CallerContext.getCurrent());
+
+    // Set client context
+    CallerContext.setCurrent(
+        new CallerContext.Builder("clientContext").build());
+
+    // Create a directory via the router
+    String dirPath = "/test_dir_with_callercontext";
+    FsPermission permission = new FsPermission("755");
+    routerProtocol.mkdirs(dirPath, permission, false);
+
+    // The audit log should contain "callerContext=clientContext,clientIp:"
+    assertTrue(auditlog.getOutput()
+        .contains("callerContext=clientContext,clientIp:"));

Review comment:
   Sorry, I don't understand. I think if we check the actual IP, we should get 
the client's actual IP, e.g. "w.x.y.z", and then check that the audit log 
contains "callerContext=clientContext,clientIp:w.x.y.z", is that right?
   For now it's hard to get the client IP.







Issue Time Tracking
---

Worklog Id: (was: 496852)
Time Spent: 2.5h  (was: 2h 20m)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Locality is 
> left to HDFS-13248.






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=496845&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496845
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 07/Oct/20 18:38
Start Date: 07/Oct/20 18:38
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#discussion_r501230553



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java
##
@@ -1901,4 +1904,27 @@ private DFSClient getFileDFSClient(final String path) {
 }
 return null;
   }
+
+  @Test
+  public void testMkdirsWithCallerContext() throws IOException {
+    GenericTestUtils.LogCapturer auditlog =
+        GenericTestUtils.LogCapturer.captureLogs(FSNamesystem.auditLog);
+
+    // Current callerContext is null
+    assertNull(CallerContext.getCurrent());
+
+    // Set client context
+    CallerContext.setCurrent(
+        new CallerContext.Builder("clientContext").build());
+
+    // Create a directory via the router
+    String dirPath = "/test_dir_with_callercontext";
+    FsPermission permission = new FsPermission("755");
+    routerProtocol.mkdirs(dirPath, permission, false);
+
+    // The audit log should contain "callerContext=clientContext,clientIp:"
+    assertTrue(auditlog.getOutput()
+        .contains("callerContext=clientContext,clientIp:"));

Review comment:
   The only issue I see is that it may grab random logs, but I guess that is fine.







Issue Time Tracking
---

Worklog Id: (was: 496845)
Time Spent: 2h 20m  (was: 2h 10m)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Leave locality to
> HDFS-13248.






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=496778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496778
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 07/Oct/20 17:26
Start Date: 07/Oct/20 17:26
Worklog Time Spent: 10m 
  Work Description: ferhui commented on a change in pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#discussion_r501185702



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java
##
@@ -1901,4 +1904,27 @@ private DFSClient getFileDFSClient(final String path) {
     }
     return null;
   }
+
+  @Test
+  public void testMkdirsWithCallerContext() throws IOException {
+    GenericTestUtils.LogCapturer auditlog =
+        GenericTestUtils.LogCapturer.captureLogs(FSNamesystem.auditLog);
+
+    // Current callerContext is null
+    assertNull(CallerContext.getCurrent());
+
+    // Set client context
+    CallerContext.setCurrent(
+        new CallerContext.Builder("clientContext").build());
+
+    // Create a directory via the router
+    String dirPath = "/test_dir_with_callercontext";
+    FsPermission permission = new FsPermission("755");
+    routerProtocol.mkdirs(dirPath, permission, false);
+
+    // The audit log should contain "callerContext=clientContext,clientIp:"
+    assertTrue(auditlog.getOutput()
+        .contains("callerContext=clientContext,clientIp:"));

Review comment:
   Or just keep it that way, and do not modify the UT.







Issue Time Tracking
---

Worklog Id: (was: 496778)
Time Spent: 2h 10m  (was: 2h)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Leave locality to
> HDFS-13248.






[jira] [Work logged] (HDFS-15610) Reduce datanode upgrade/hardlink thread

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15610?focusedWorklogId=496769&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496769
 ]

ASF GitHub Bot logged work on HDFS-15610:
-

Author: ASF GitHub Bot
Created on: 07/Oct/20 17:19
Start Date: 07/Oct/20 17:19
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2365:
URL: https://github.com/apache/hadoop/pull/2365#issuecomment-705079088


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  7s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m  1s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 53s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m 40s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  trunk passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +0 :ok: |  spotbugs  |   3m  8s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m  6s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javac  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  3s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  javac  |   1m  3s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   0m 47s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML 
file.  |
   | +1 :green_heart: |  shadedclient  |  15m 45s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 45s |  |  the patch passed with JDK 
Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 22s |  |  the patch passed with JDK 
Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 12s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 109m 15s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2365/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 35s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 198m 35s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.web.TestWebHDFS |
   |   | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
   |   | hadoop.hdfs.TestDFSShell |
   |   | hadoop.hdfs.TestFileChecksum |
   |   | hadoop.hdfs.TestFileChecksumCompositeCrc |
   |   | hadoop.hdfs.TestSafeModeWithStripedFile |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2365/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2365 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle xml |
   | uname | Linux 98fe6dbf219b 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 

[jira] [Updated] (HDFS-13293) RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer

2020-10-07 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-13293:
---
Summary: RBF: The RouterRPCServer should transfer client IP via 
CallerContext to NamenodeRpcServer  (was: RBF: The RouterRPCServer should 
transfer client ip via CallerContext to NamenodeRpcServer)

> RBF: The RouterRPCServer should transfer client IP via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Leave locality to
> HDFS-13248.






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client ip via CallerContext to NamenodeRpcServer

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=496764&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496764
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 07/Oct/20 17:12
Start Date: 07/Oct/20 17:12
Worklog Time Spent: 10m 
  Work Description: ferhui commented on a change in pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#discussion_r501176989



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java
##
@@ -1901,4 +1904,27 @@ private DFSClient getFileDFSClient(final String path) {
     }
     return null;
   }
+
+  @Test
+  public void testMkdirsWithCallerContext() throws IOException {
+    GenericTestUtils.LogCapturer auditlog =
+        GenericTestUtils.LogCapturer.captureLogs(FSNamesystem.auditLog);
+
+    // Current callerContext is null
+    assertNull(CallerContext.getCurrent());
+
+    // Set client context
+    CallerContext.setCurrent(
+        new CallerContext.Builder("clientContext").build());
+
+    // Create a directory via the router
+    String dirPath = "/test_dir_with_callercontext";
+    FsPermission permission = new FsPermission("755");
+    routerProtocol.mkdirs(dirPath, permission, false);
+
+    // The audit log should contain "callerContext=clientContext,clientIp:"
+    assertTrue(auditlog.getOutput()
+        .contains("callerContext=clientContext,clientIp:"));

Review comment:
   I had thought about this. DFSClient & Client do not expose the IP, and
TestAuditLogger & TestAuditLogs do not check the client IP. So do you have any
suggestions?
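One possibility (a sketch only; `expectedAuditFragment` is a hypothetical helper, not part of the patch, and it assumes the test client and the router run on the same host, as in the MiniRouterDFSCluster-based tests): derive the expected IP from the local address and assert the full fragment.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ExpectedAuditFragment {

    // Build the audit-log fragment the router is expected to emit for the
    // given caller context, assuming the client connects from the local
    // host address (hypothetical helper for the UT, not Hadoop API).
    static String expectedAuditFragment(String context) throws UnknownHostException {
        String localIp = InetAddress.getLocalHost().getHostAddress();
        return "callerContext=" + context + ",clientIp:" + localIp;
    }

    public static void main(String[] args) throws UnknownHostException {
        // The UT could then assert auditlog.getOutput().contains(this value).
        System.out.println(expectedAuditFragment("clientContext"));
    }
}
```

This would tighten the assertion from a prefix match to the actual IP, at the cost of assuming the client address resolves to the same value the router observes.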







Issue Time Tracking
---

Worklog Id: (was: 496764)
Time Spent: 2h  (was: 1h 50m)

> RBF: The RouterRPCServer should transfer client ip via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Leave locality to
> HDFS-13248.






[jira] [Resolved] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec

2020-10-07 Thread Rakesh Radhakrishnan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh Radhakrishnan resolved HDFS-15253.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> Set default throttle value on dfs.image.transfer.bandwidthPerSec
> 
>
> Key: HDFS-15253
> URL: https://issues.apache.org/jira/browse/HDFS-15253
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The default value dfs.image.transfer.bandwidthPerSec is set to 0 so it can 
> use maximum available bandwidth for fsimage transfers during checkpoint. I 
> think we should throttle this. Many users have experienced namenode failover 
> when transferring a large image along with fsimage replication on 
> dfs.namenode.name.dir, e.g. >25 GB.
> Thought to set:
> dfs.image.transfer.bandwidthPerSec=52428800 (50 MB/s)
> dfs.namenode.checkpoint.txns=200 (the default is 1M; good for avoiding frequent 
> checkpoints. However, the default checkpoint runs once every 6 hours)
>  
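Expressed as an hdfs-site.xml fragment, the proposal above would look like this (a sketch of the values suggested in this issue, not the committed defaults):

```xml
<!-- Throttle fsimage transfer to 50 MB/s instead of unlimited (0). -->
<property>
  <name>dfs.image.transfer.bandwidthPerSec</name>
  <value>52428800</value>
</property>
<!-- Checkpoint transaction threshold as proposed in this issue. -->
<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>200</value>
</property>
```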






[jira] [Work logged] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15253?focusedWorklogId=496739&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496739
 ]

ASF GitHub Bot logged work on HDFS-15253:
-

Author: ASF GitHub Bot
Created on: 07/Oct/20 16:39
Start Date: 07/Oct/20 16:39
Worklog Time Spent: 10m 
  Work Description: rakeshadr merged pull request #2366:
URL: https://github.com/apache/hadoop/pull/2366


   





Issue Time Tracking
---

Worklog Id: (was: 496739)
Time Spent: 40m  (was: 0.5h)

> Set default throttle value on dfs.image.transfer.bandwidthPerSec
> 
>
> Key: HDFS-15253
> URL: https://issues.apache.org/jira/browse/HDFS-15253
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The default value dfs.image.transfer.bandwidthPerSec is set to 0 so it can 
> use maximum available bandwidth for fsimage transfers during checkpoint. I 
> think we should throttle this. Many users have experienced namenode failover 
> when transferring a large image along with fsimage replication on 
> dfs.namenode.name.dir, e.g. >25 GB.
> Thought to set:
> dfs.image.transfer.bandwidthPerSec=52428800 (50 MB/s)
> dfs.namenode.checkpoint.txns=200 (the default is 1M; good for avoiding frequent 
> checkpoints. However, the default checkpoint runs once every 6 hours)
>  






[jira] [Work logged] (HDFS-15253) Set default throttle value on dfs.image.transfer.bandwidthPerSec

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15253?focusedWorklogId=496728&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496728
 ]

ASF GitHub Bot logged work on HDFS-15253:
-

Author: ASF GitHub Bot
Created on: 07/Oct/20 16:24
Start Date: 07/Oct/20 16:24
Worklog Time Spent: 10m 
  Work Description: rakeshadr commented on pull request #2366:
URL: https://github.com/apache/hadoop/pull/2366#issuecomment-705048932


   +1 LGTM, test case failures are unrelated to the patch. Will merge it 
shortly.
   
   Thanks @karthikhw for the contribution.
   Thanks @mukul1987 for the reviews.
   
   





Issue Time Tracking
---

Worklog Id: (was: 496728)
Time Spent: 0.5h  (was: 20m)

> Set default throttle value on dfs.image.transfer.bandwidthPerSec
> 
>
> Key: HDFS-15253
> URL: https://issues.apache.org/jira/browse/HDFS-15253
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Karthik Palanisamy
>Assignee: Karthik Palanisamy
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The default value dfs.image.transfer.bandwidthPerSec is set to 0 so it can 
> use maximum available bandwidth for fsimage transfers during checkpoint. I 
> think we should throttle this. Many users have experienced namenode failover 
> when transferring a large image along with fsimage replication on 
> dfs.namenode.name.dir, e.g. >25 GB.
> Thought to set:
> dfs.image.transfer.bandwidthPerSec=52428800 (50 MB/s)
> dfs.namenode.checkpoint.txns=200 (the default is 1M; good for avoiding frequent 
> checkpoints. However, the default checkpoint runs once every 6 hours)
>  






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client ip via CallerContext to NamenodeRpcServer

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=496716&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496716
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 07/Oct/20 16:12
Start Date: 07/Oct/20 16:12
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#discussion_r501138092



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java
##
@@ -1901,4 +1904,27 @@ private DFSClient getFileDFSClient(final String path) {
     }
     return null;
   }
+
+  @Test
+  public void testMkdirsWithCallerContext() throws IOException {
+    GenericTestUtils.LogCapturer auditlog =
+        GenericTestUtils.LogCapturer.captureLogs(FSNamesystem.auditLog);
+
+    // Current callerContext is null
+    assertNull(CallerContext.getCurrent());
+
+    // Set client context
+    CallerContext.setCurrent(
+        new CallerContext.Builder("clientContext").build());
+
+    // Create a directory via the router
+    String dirPath = "/test_dir_with_callercontext";
+    FsPermission permission = new FsPermission("755");
+    routerProtocol.mkdirs(dirPath, permission, false);
+
+    // The audit log should contain "callerContext=clientContext,clientIp:"
+    assertTrue(auditlog.getOutput()
+        .contains("callerContext=clientContext,clientIp:"));

Review comment:
   Is there any way we can check for the actual IP?







Issue Time Tracking
---

Worklog Id: (was: 496716)
Time Spent: 1h 50m  (was: 1h 40m)

> RBF: The RouterRPCServer should transfer client ip via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Leave locality to
> HDFS-13248.






[jira] [Work logged] (HDFS-13293) RBF: The RouterRPCServer should transfer client ip via CallerContext to NamenodeRpcServer

2020-10-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?focusedWorklogId=496715&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496715
 ]

ASF GitHub Bot logged work on HDFS-13293:
-

Author: ASF GitHub Bot
Created on: 07/Oct/20 16:11
Start Date: 07/Oct/20 16:11
Worklog Time Spent: 10m 
  Work Description: goiri commented on a change in pull request #2363:
URL: https://github.com/apache/hadoop/pull/2363#discussion_r501137208



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcClient.java
##
@@ -116,10 +118,13 @@
   /** Optional perf monitor. */
   private final RouterRpcMonitor rpcMonitor;
 
+  private final Configuration clientConf;
+
   /** Pattern to parse a stack trace line. */
   private static final Pattern STACK_TRACE_PATTERN =
       Pattern.compile("\\tat (.*)\\.(.*)\\((.*):(\\d*)\\)");
 
+  private static final String CLIENT_IP_STR = "clientIp";

Review comment:
   Makes sense, let's just keep HDFS-13248 in mind when doing this.
   So far it looks like it is covered.
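The "clientIp" key discussed here can be illustrated with a minimal sketch (plain Java, no Hadoop classes; the real patch goes through CallerContext.Builder, so only the key name and the key:value / comma conventions below come from the patch):

```java
public class CallerContextSketch {

    // Key name taken from the patch; the separator convention (comma between
    // entries, colon between key and value) matches the audit-log output
    // checked in the UT.
    static final String CLIENT_IP_STR = "clientIp";

    static String appendClientIp(String origContext, String clientIp) {
        String entry = CLIENT_IP_STR + ":" + clientIp;
        // With no existing context, the entry becomes the whole context.
        return origContext == null ? entry : origContext + "," + entry;
    }

    public static void main(String[] args) {
        System.out.println(appendClientIp("clientContext", "127.0.0.1"));
        // -> clientContext,clientIp:127.0.0.1
        System.out.println(appendClientIp(null, "127.0.0.1"));
        // -> clientIp:127.0.0.1
    }
}
```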







Issue Time Tracking
---

Worklog Id: (was: 496715)
Time Spent: 1h 40m  (was: 1.5h)

> RBF: The RouterRPCServer should transfer client ip via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Leave locality to
> HDFS-13248.






[jira] [Updated] (HDFS-13293) RBF: The RouterRPCServer should transfer client ip via CallerContext to NamenodeRpcServer

2020-10-07 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-13293:
---
Summary: RBF: The RouterRPCServer should transfer client ip via 
CallerContext to NamenodeRpcServer  (was: RBF: The RouterRPCServer should 
transfer CallerContext and client ip to NamenodeRpcServer)

> RBF: The RouterRPCServer should transfer client ip via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Leave locality to
> HDFS-13248.






[jira] [Commented] (HDFS-13293) RBF: The RouterRPCServer should transfer client ip via CallerContext to NamenodeRpcServer

2020-10-07 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209490#comment-17209490
 ] 

Hui Fei commented on HDFS-13293:


Modified the caption and the description to focus on the audit log.

> RBF: The RouterRPCServer should transfer client ip via CallerContext to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Leave locality to
> HDFS-13248.






[jira] [Updated] (HDFS-13293) RBF: The RouterRPCServer should transfer CallerContext and client ip to NamenodeRpcServer

2020-10-07 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-13293:
---
Description: 
Otherwise, the namenode doesn't know the client's callerContext.

This jira focuses on audit log which logs real client ip. Leave locality to 
HDFS-13248

  was:
Otherwise, the namenode don't know the client's callerContext
This jira focuses on audit log which logs real client ip. Leave locality to 


> RBF: The RouterRPCServer should transfer CallerContext and client ip to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on the audit log, which logs the real client IP. Leave locality to
> HDFS-13248.






[jira] [Updated] (HDFS-13293) RBF: The RouterRPCServer should transfer CallerContext and client ip to NamenodeRpcServer

2020-10-07 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-13293:
---
Description: 
Otherwise, the namenode don't know the client's callerContext
This jira focuses on audit log which logs real client ip. Leave locality to 

  was:Otherwise, the namenode don't know the client's callerContext


> RBF: The RouterRPCServer should transfer CallerContext and client ip to 
> NamenodeRpcServer
> -
>
> Key: HDFS-13293
> URL: https://issues.apache.org/jira/browse/HDFS-13293
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: maobaolong
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13293.001.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Otherwise, the namenode doesn't know the client's callerContext.
> This jira focuses on audit log which logs real client ip. Leave locality to 






[jira] [Commented] (HDFS-15616) [SBN] Disable trigger edit Log Roll for Observers

2020-10-07 Thread Janus Chow (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209361#comment-17209361
 ] 

Janus Chow commented on HDFS-15616:
---

Updated a patch to disable edit log roll for Observers.

> [SBN] Disable trigger edit Log Roll for Observers
> -
>
> Key: HDFS-15616
> URL: https://issues.apache.org/jira/browse/HDFS-15616
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Priority: Major
> Attachments: HDFS-15616.001.patch
>
>
> Currently, when an Observer is transitioned from StandbyState, the editLogTailer 
> will still send the request to roll the editLog to the active NN. This should be 
> disabled to keep the definition of "logRollPeriodMs" clear.
> One thing I'm not sure about: in a cluster with multiple standby Namenodes, all the 
> standby NNs will trigger the roll. Should this feature be extended to all 
> standby NNs, or be implemented on observers first?






[jira] [Updated] (HDFS-15616) [SBN] Disable trigger edit Log Roll for Observers

2020-10-07 Thread Janus Chow (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janus Chow updated HDFS-15616:
--
Attachment: HDFS-15616.001.patch

> [SBN] Disable trigger edit Log Roll for Observers
> -
>
> Key: HDFS-15616
> URL: https://issues.apache.org/jira/browse/HDFS-15616
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Janus Chow
>Priority: Major
> Attachments: HDFS-15616.001.patch
>
>
> Currently, when an Observer is transitioned from StandbyState, the editLogTailer 
> will still send the request to roll the editLog to the active NN. This should be 
> disabled to keep the definition of "logRollPeriodMs" clear.
> One thing I'm not sure about: in a cluster with multiple standby Namenodes, all the 
> standby NNs will trigger the roll. Should this feature be extended to all 
> standby NNs, or be implemented on observers first?


