[jira] [Resolved] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist
[ https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-16923.
Resolution: Fixed

Key: HDFS-16923
URL: https://issues.apache.org/jira/browse/HDFS-16923
Project: Hadoop HDFS
Issue Type: Bug
Reporter: ZanderXu
Assignee: ZanderXu
Priority: Critical
Labels: pull-request-available
Fix For: 3.4.0, 3.3.6

The getListing RPC throws an NPE if the path does not exist. The stack trace is as below:

{code:java}
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574)
{code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
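The shape of the guard this issue calls for can be sketched as follows. This is an illustrative model, not the actual Hadoop patch; the class `ListingSketch` and its methods are invented for the example. The point is that a lookup of a non-existent path should surface as a null that callers check for, rather than being dereferenced inside the NameNode:

```java
import java.util.Arrays;
import java.util.List;

public class ListingSketch {
    // Stand-in for the NameNode's directory table (hypothetical data).
    static final List<String> KNOWN_PATHS = Arrays.asList("/user", "/tmp");

    /** Returns the listing, or null when the path does not exist. */
    static String[] getListing(String src) {
        if (!KNOWN_PATHS.contains(src)) {
            return null; // dereferencing this null is what produced the NPE
        }
        return new String[] {src + "/part-0"};
    }

    /** Caller-side guard mirroring the fix: check for null before use. */
    static int entryCount(String src) {
        String[] listing = getListing(src);
        return listing == null ? 0 : listing.length;
    }
}
```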
[jira] [Resolved] (HDFS-16764) ObserverNamenode handles addBlock rpc and throws a FileNotFoundException
[ https://issues.apache.org/jira/browse/HDFS-16764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-16764.
Resolution: Fixed

Key: HDFS-16764
URL: https://issues.apache.org/jira/browse/HDFS-16764
Project: Hadoop HDFS
Issue Type: Bug
Reporter: ZanderXu
Assignee: ZanderXu
Priority: Critical
Labels: pull-request-available
Fix For: 3.4.0, 3.3.9

The Observer NameNode can currently handle the addBlock RPC, but it may throw a FileNotFoundException when its state is stale:
* addBlock is not a coordinated method, so the Observer does not check the state id.
* addBlock does its validation with checkOperation(OperationCategory.READ).

So the Observer handles the addBlock RPC; if it has not yet replayed the edit that created the file, it throws a FileNotFoundException during validation. The related code is as follows:

{code:java}
checkOperation(OperationCategory.READ);
final FSPermissionChecker pc = getPermissionChecker();
FSPermissionChecker.setOperationType(operationName);
readLock();
try {
  checkOperation(OperationCategory.READ);
  r = FSDirWriteFileOp.validateAddBlock(this, pc, src, fileId, clientName,
      previous, onRetryBlock);
} finally {
  readUnlock(operationName);
}
{code}
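The staleness described in this issue can be modeled in a few lines. This is an illustrative sketch with invented names, not Hadoop's API: an Observer whose last applied transaction id lags the edit that created a file simply cannot see the file yet, so any validation on that path fails with "file not found":

```java
import java.util.HashMap;
import java.util.Map;

public class ObserverSketch {
    long lastAppliedTxId;                          // how far edits are replayed
    final Map<String, Long> createTxIdByPath = new HashMap<>();

    /** The file is visible only once its create edit has been replayed. */
    boolean fileVisible(String src) {
        Long createTxId = createTxIdByPath.get(src);
        return createTxId != null && createTxId <= lastAppliedTxId;
    }
}
```

A non-coordinated call like addBlock skips the state-id check that would otherwise delay the call until `lastAppliedTxId` catches up, which is why the exception surfaces to the client.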
[jira] [Resolved] (HDFS-16872) Fix log throttling by declaring LogThrottlingHelper as static members
[ https://issues.apache.org/jira/browse/HDFS-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-16872.
Resolution: Fixed

Key: HDFS-16872
URL: https://issues.apache.org/jira/browse/HDFS-16872
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 3.3.4
Reporter: Chengbing Liu
Priority: Major
Labels: pull-request-available
Fix For: 3.4.0, 3.2.5, 3.3.6

In our production cluster with Observer NameNode enabled, we have plenty of logs printed by {{FSEditLogLoader}} and {{RedundantEditLogInputStream}}. The {{LogThrottlingHelper}} doesn't seem to work:

{noformat}
2022-10-25 09:26:50,380 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Start loading edits file ByteStringEditLog[17686250688, 17686250688], ByteStringEditLog[17686250688, 17686250688], ByteStringEditLog[17686250688, 17686250688] maxTxnsToRead = 9223372036854775807
2022-10-25 09:26:50,380 INFO org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: Fast-forwarding stream 'ByteStringEditLog[17686250688, 17686250688], ByteStringEditLog[17686250688, 17686250688], ByteStringEditLog[17686250688, 17686250688]' to transaction ID 17686250688
2022-10-25 09:26:50,380 INFO org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: Fast-forwarding stream 'ByteStringEditLog[17686250688, 17686250688]' to transaction ID 17686250688
2022-10-25 09:26:50,380 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Loaded 1 edits file(s) (the last named ByteStringEditLog[17686250688, 17686250688], ByteStringEditLog[17686250688, 17686250688], ByteStringEditLog[17686250688, 17686250688]) of total size 527.0, total edits 1.0, total load time 0.0 ms
2022-10-25 09:26:50,387 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Start loading edits file ByteStringEditLog[17686250689, 17686250693], ByteStringEditLog[17686250689, 17686250693], ByteStringEditLog[17686250689, 17686250693] maxTxnsToRead = 9223372036854775807
2022-10-25 09:26:50,387 INFO org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: Fast-forwarding stream 'ByteStringEditLog[17686250689, 17686250693], ByteStringEditLog[17686250689, 17686250693], ByteStringEditLog[17686250689, 17686250693]' to transaction ID 17686250689
2022-10-25 09:26:50,387 INFO org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: Fast-forwarding stream 'ByteStringEditLog[17686250689, 17686250693]' to transaction ID 17686250689
2022-10-25 09:26:50,387 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Loaded 1 edits file(s) (the last named ByteStringEditLog[17686250689, 17686250693], ByteStringEditLog[17686250689, 17686250693], ByteStringEditLog[17686250689, 17686250693]) of total size 890.0, total edits 5.0, total load time 1.0 ms
{noformat}

After some digging, I found the cause: the {{LogThrottlingHelper}}s are declared as instance variables of all the enclosing classes, including {{FSImage}}, {{FSEditLogLoader}} and {{RedundantEditLogInputStream}}. Therefore the logging frequency is not limited across different instances. For classes with only a limited number of instances, such as {{FSImage}}, this is fine. For others whose instances are created frequently, such as {{FSEditLogLoader}} and {{RedundantEditLogInputStream}}, it results in plenty of logs.

This can be fixed by declaring the {{LogThrottlingHelper}}s as static members.
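The instance-versus-static distinction above can be demonstrated with a minimal throttle. This is a sketch with invented names, not Hadoop's LogThrottlingHelper API: when the throttle state lives in a static field, it is shared by every instance, so frequently recreated objects no longer defeat the rate limit.

```java
public class ThrottleSketch {
    static class Throttle {
        private final long periodMs;
        private long lastLogTimeMs;

        Throttle(long periodMs) {
            this.periodMs = periodMs;
            this.lastLogTimeMs = -periodMs; // so the very first call logs
        }

        synchronized boolean shouldLog(long nowMs) {
            if (nowMs - lastLogTimeMs >= periodMs) {
                lastLogTimeMs = nowMs;
                return true;
            }
            return false;
        }
    }

    // Static: one throttle shared by ALL ThrottleSketch instances. Were this
    // an instance field, every `new ThrottleSketch()` would log immediately,
    // which is exactly the bug described above.
    private static final Throttle SHARED = new Throttle(5000);

    boolean logMaybe(long nowMs) {
        return SHARED.shouldLog(nowMs);
    }
}
```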
[jira] [Resolved] (HDFS-16689) Standby NameNode crashes when transitioning to Active with in-progress tailer
[ https://issues.apache.org/jira/browse/HDFS-16689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-16689.
Resolution: Fixed

Key: HDFS-16689
URL: https://issues.apache.org/jira/browse/HDFS-16689
Project: Hadoop HDFS
Issue Type: Bug
Reporter: ZanderXu
Assignee: ZanderXu
Priority: Critical
Labels: pull-request-available
Fix For: 3.4.0
Time Spent: 50m
Remaining Estimate: 0h

The Standby NameNode crashes when transitioning to Active with an in-progress tailer. The error message is like below:

{code:java}
Caused by: java.lang.IllegalStateException: Cannot start writing at txid X when there is a stream available for read: ByteStringEditLog[X, Y], ByteStringEditLog[X, 0]
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.openForWrite(FSEditLog.java:344)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.openForWrite(FSEditLogAsync.java:113)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1423)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:2132)
    ... 36 more
{code}

After tracing, I found a critical bug in *EditlogTailer#catchupDuringFailover()* when *DFS_HA_TAILEDITS_INPROGRESS_KEY* is true: *catchupDuringFailover()* tries to replay all missed edits from the JournalNodes with *onlyDurableTxns=true*, so it may be unable to replay any edits when some JournalNodes are abnormal.

To reproduce, suppose:
- There are 2 NameNodes, NN0 and NN1, in Active and Standby state respectively, and 3 JournalNodes, JN0, JN1 and JN2.
- NN0 tries to sync 3 edits with first txid 3 to the JNs, but only succeeds on JN1 and JN2. JN0 is abnormal (e.g. GC, bad network, or restarted).
- NN1's lastAppliedTxId is 2, and at this moment we try to fail over from NN0 to NN1.
- NN1 gets only two responses, from JN0 and JN1, when selecting input streams with *fromTxnId=3* and *onlyDurableTxns=true*; the reported txn counts are 0 and 3 respectively. JN2 is abnormal (e.g. GC, bad network, or restarted).
- NN1 cannot replay any edits from *fromTxnId=3* because *maxAllowedTxns* is 0.

So I think the Standby NameNode should run *catchupDuringFailover()* with *onlyDurableTxns=false*, so that it can replay all missed edits from the JournalNodes.
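The quorum arithmetic behind the scenario above can be sketched as follows. This is an illustrative model with invented names, not Hadoop's QuorumJournalManager API: with onlyDurableTxns=true, a transaction only counts once a majority of all JournalNodes report having it, so the lowest count inside the majority bounds the result; with onlyDurableTxns=false the best single response suffices.

```java
import java.util.Arrays;

public class DurableTxnSketch {
    /**
     * Durable bound: the majority-th highest reported txn count.
     * With responses {0, 3} out of 3 JNs, the majority is 2, so the
     * second-highest response (0) wins and nothing can be replayed.
     */
    static long maxDurableTxns(long[] txnCounts, int totalJournalNodes) {
        int majority = totalJournalNodes / 2 + 1;
        long[] sorted = txnCounts.clone();
        Arrays.sort(sorted);                  // ascending
        int idx = sorted.length - majority;   // majority-th highest
        return idx < 0 ? 0 : sorted[idx];
    }

    /** Without the durability requirement, the best response suffices. */
    static long maxAvailableTxns(long[] txnCounts) {
        return Arrays.stream(txnCounts).max().orElse(0);
    }
}
```

This is why relaxing onlyDurableTxns during failover lets NN1 replay the 3 edits that JN1 confirmed.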
[jira] [Resolved] (HDFS-16852) Register the shutdown hook only when not in shutdown for KeyProviderCache constructor
[ https://issues.apache.org/jira/browse/HDFS-16852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-16852.
Resolution: Fixed

Key: HDFS-16852
URL: https://issues.apache.org/jira/browse/HDFS-16852
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs
Reporter: Xing Lin
Assignee: Xing Lin
Priority: Minor
Labels: pull-request-available
Fix For: 3.4.0, 2.10.3, 3.3.6

When an HDFS client is created, it registers a shutdown hook with the ShutdownHookManager. ShutdownHookManager does not allow adding a new shutdown hook when the process is already in shutdown, and throws an IllegalStateException instead.

This behavior is not ideal when a Spark program fails during pre-launch. In that case, during shutdown, Spark calls cleanupStagingDir() to clean the staging dir. In cleanupStagingDir(), it creates a FileSystem object to talk to HDFS. Since this is the first use of a FileSystem object in that process, it has to create an HDFS client and register the shutdown hook, and we hit the IllegalStateException. This IllegalStateException masks the actual exception that caused the Spark program to fail during pre-launch.

We propose to swallow the IllegalStateException in KeyProviderCache and log a warning. The TCP connection between the client and the NameNode is closed by the OS when the process shuts down.

Example stack trace:

{code:java}
13-09-2022 14:39:42 PDT INFO - 22/09/13 21:39:41 ERROR util.Utils: Uncaught exception in thread shutdown-hook-0
13-09-2022 14:39:42 PDT INFO - java.lang.IllegalStateException: Shutdown in progress, cannot add a shutdownHook
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.util.ShutdownHookManager.addShutdownHook(ShutdownHookManager.java:299)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.KeyProviderCache.<init>(KeyProviderCache.java:71)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.ClientContext.<init>(ClientContext.java:130)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.ClientContext.get(ClientContext.java:167)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:383)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:287)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:159)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3261)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3310)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3278)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475)
13-09-2022 14:39:42 PDT INFO - at org.apache.hadoop.fs.Path.getFileSystem(Path.java:356)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.deploy.yarn.ApplicationMaster.cleanupStagingDir(ApplicationMaster.scala:675)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.deploy.yarn.ApplicationMaster.$anonfun$run$2(ApplicationMaster.scala:259)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:214)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$2(ShutdownHookManager.scala:188)
13-09-2022 14:39:42 PDT INFO - at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2023)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$1(ShutdownHookManager.scala:188)
13-09-2022 14:39:42 PDT INFO - at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
13-09-2022 14:39:42 PDT INFO - at scala.util.Try$.apply(Try.scala:213)
13-09-2022 14:39:42 PDT INFO - at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
{code}
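The proposed behavior can be sketched as follows. This is an illustrative model, not the merged KeyProviderCache patch; the class and method names are invented. Registration failure during shutdown becomes a logged warning instead of a propagated exception, so it no longer masks the original failure:

```java
public class HookSketch {
    /**
     * Try to register a cleanup hook; swallow the "already in shutdown"
     * error and warn instead. Returns whether registration succeeded.
     */
    static boolean registerHook(Runnable hook, boolean inShutdown) {
        try {
            if (inShutdown) {
                // Mirrors ShutdownHookManager.addShutdownHook's behavior.
                throw new IllegalStateException(
                    "Shutdown in progress, cannot add a shutdownHook");
            }
            // ... a real implementation would register `hook` here ...
            return true;
        } catch (IllegalStateException e) {
            // Swallow and warn; OS teardown will close the client's sockets.
            System.err.println("WARN: " + e.getMessage());
            return false;
        }
    }
}
```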
[jira] [Resolved] (HDFS-16550) [SBN read] Improper cache-size for journal node may cause cluster crash
[ https://issues.apache.org/jira/browse/HDFS-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-16550.
Fix Version/s: 3.4.0
Resolution: Fixed

Key: HDFS-16550
URL: https://issues.apache.org/jira/browse/HDFS-16550
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Tao Li
Assignee: Tao Li
Priority: Major
Labels: pull-request-available
Fix For: 3.4.0
Attachments: image-2022-04-21-09-54-29-751.png, image-2022-04-21-09-54-57-111.png, image-2022-04-21-12-32-56-170.png
Time Spent: 1h
Remaining Estimate: 0h

When we introduced *SBN Read*, we encountered a problem while upgrading the JournalNodes.

Cluster info:
Active: nn0
Standby: nn1

1. Rolling restart of the JournalNodes (related config: {{fs.journalnode.edit-cache-size.bytes=1G}}, -Xms1G, -Xmx1G).
2. The cluster runs for a while; edits cache usage keeps increasing until memory is used up.
3. The Active NameNode (nn0) shuts down because of "Timed out waiting 12ms for a quorum of nodes to respond".
4. nn1 is transitioned to the Active state.
5. The new Active NameNode (nn1) also shuts down because of "Timed out waiting 12ms for a quorum of nodes to respond".
6. The cluster crashes.

Related code:

{code:java}
JournaledEditsCache(Configuration conf) {
  capacity = conf.getInt(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_KEY,
      DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_DEFAULT);
  if (capacity > 0.9 * Runtime.getRuntime().maxMemory()) {
    Journal.LOG.warn(String.format("Cache capacity is set at %d bytes but " +
        "maximum JVM memory is only %d bytes. It is recommended that you " +
        "decrease the cache size or increase the heap size.",
        capacity, Runtime.getRuntime().maxMemory()));
  }
  Journal.LOG.info("Enabling the journaled edits cache with a capacity " +
      "of bytes: " + capacity);
  ReadWriteLock lock = new ReentrantReadWriteLock(true);
  readLock = new AutoCloseableLock(lock.readLock());
  writeLock = new AutoCloseableLock(lock.writeLock());
  initialize(INVALID_TXN_ID);
}
{code}

Currently, {{fs.journalnode.edit-cache-size.bytes}} can be set larger than the memory requested by the process. If {{fs.journalnode.edit-cache-size.bytes}} > 0.9 * Runtime.getRuntime().maxMemory(), only a warning is logged during JournalNode startup. This is easily overlooked by users, but after the cluster has been running for some time, it is likely to crash.

NN log:
!image-2022-04-21-09-54-57-111.png|width=1012,height=47!
!image-2022-04-21-12-32-56-170.png|width=809,height=218!

IMO, we should not set the {{cache size}} to a fixed value, but to a ratio of the maximum memory, 0.2 by default. This avoids an oversized cache, and users can actively adjust the heap size when they need a larger cache.
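The proposed sizing rule can be sketched in a few lines. This is an illustrative sketch of the idea described in the issue, not the merged patch; method names are invented. Deriving the capacity from a fraction of the JVM max heap makes it impossible to configure a cache larger than the heap:

```java
public class CacheSizeSketch {
    /** Capacity as a fraction of the max heap (0.2 proposed as default). */
    static long capacityFor(long maxHeapBytes, double fraction) {
        return (long) (maxHeapBytes * fraction);
    }

    /** The existing startup check, which only warns when oversized. */
    static boolean isOversized(long capacityBytes, long maxHeapBytes) {
        return capacityBytes > 0.9 * maxHeapBytes;
    }
}
```

A ratio-derived capacity always passes the oversize check, whereas a fixed byte count can silently exceed it.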
[jira] [Resolved] (HDFS-16547) [SBN read] Namenode in safe mode should not be transferred to observer state
[ https://issues.apache.org/jira/browse/HDFS-16547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-16547.
Resolution: Fixed

Key: HDFS-16547
URL: https://issues.apache.org/jira/browse/HDFS-16547
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Tao Li
Assignee: Tao Li
Priority: Major
Labels: pull-request-available
Fix For: 3.4.0
Time Spent: 1h 50m
Remaining Estimate: 0h

Currently, when a NameNode is in safe mode (while starting up, or after entering safe mode manually), we can transfer it to Observer state by command. Such an Observer node may receive many requests and then throw a SafemodeException, causing unnecessary failover on the client. So a NameNode in safe mode should not be transferred to Observer state.
[jira] [Resolved] (HDFS-16832) [SBN READ] Fix NPE when checking the block location of an empty directory
[ https://issues.apache.org/jira/browse/HDFS-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-16832.
Resolution: Fixed

Key: HDFS-16832
URL: https://issues.apache.org/jira/browse/HDFS-16832
Project: Hadoop HDFS
Issue Type: Bug
Reporter: zhengchenyu
Assignee: zhengchenyu
Priority: Major
Labels: pull-request-available
Fix For: 3.4.0, 3.3.5

HDFS-16732 introduced a block-location check for getListing and getFileInfo, but checking the block location of an empty directory throws an NPE. The exception stack on the Tez client is below:

{code:java}
org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
    at org.apache.hadoop.ipc.Client.call(Client.java:1492)
    at org.apache.hadoop.ipc.Client.call(Client.java:1389)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
    at com.sun.proxy.$Proxy12.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:678)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at com.sun.proxy.$Proxy13.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1671)
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1212)
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1195)
    at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1140)
    at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1136)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1154)
    at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2054)
    at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:278)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:239)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:325)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:306)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:408)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:159)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:279)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:270)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:254)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
{code}
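The defensive check the fix calls for can be sketched as follows. This is an illustrative model with invented names, not the actual Hadoop patch: an empty directory has no blocks at all, so a location check must tolerate a null or empty block array instead of assuming at least one entry.

```java
public class LocationCheckSketch {
    /**
     * True when every block has at least one reported location.
     * Blocks are modeled as an array of location arrays (hypothetical
     * stand-in for LocatedBlocks).
     */
    static boolean allBlocksHaveLocations(String[][] blockLocations) {
        if (blockLocations == null || blockLocations.length == 0) {
            return true; // empty directory/file: nothing to verify, no NPE
        }
        for (String[] locs : blockLocations) {
            if (locs == null || locs.length == 0) {
                return false; // e.g. block report not yet replayed on observer
            }
        }
        return true;
    }
}
```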
[jira] [Resolved] (HDFS-16659) JournalNode should throw NewerTxnIdException if SinceTxId is bigger than HighestWrittenTxId
[ https://issues.apache.org/jira/browse/HDFS-16659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-16659.
Fix Version/s: 3.4.0
Resolution: Fixed

Key: HDFS-16659
URL: https://issues.apache.org/jira/browse/HDFS-16659
Project: Hadoop HDFS
Issue Type: Bug
Reporter: ZanderXu
Assignee: ZanderXu
Priority: Critical
Labels: pull-request-available
Fix For: 3.4.0
Time Spent: 1h
Remaining Estimate: 0h

The JournalNode should throw a {{NewerTxnIdException}} if {{sinceTxId}} is bigger than {{highestWrittenTxId}} while handling the {{getJournaledEdits}} RPC from the NameNodes. The current logic can leave the in-progress edit log tailer unable to replay any edits from the JournalNodes in some corner cases, so the Observer NameNode cannot serve requests from clients.

Suppose there are 3 JournalNodes, JN0 ~ JN2:
* JN0 hits an abnormal condition while the Active NameNode is syncing 10 edits with first txid 11.
* The NameNode ignores the abnormal JN0 and continues to sync the edits to JN1 and JN2.
* JN0 comes back to health.
* The NameNode syncs another 10 edits with first txid 21.
* At this point, edits 11 ~ 30 are not in JN0's cache.
* The Observer NameNode tries to select an EditLogInputStream through {{getJournaledEdits}} with since txid 21.
* JN2 hits an abnormal condition (GC, bad network, etc.) and responds slowly.

The expected result: the response should contain the 10 edits from txid 21 to txid 30, because the Active NameNode successfully wrote these edits to JN1 and JN2 and only failed to write them to JN0.

But in the current implementation, the response set is [Response(0) from JN0, Response(10) from JN1], because JN2's slow response is not counted. So {{maxAllowedTxns}} is 0 and the NameNode does not replay any edits.

As above, the root cause is that the JournalNode should throw a {{NewerTxnIdException}} when {{sinceTxId}} is greater than {{highestWrittenTxId}}. The buggy code is below:

{code:java}
if (sinceTxId > getHighestWrittenTxId()) {
  // Requested edits that don't exist yet; short-circuit the cache here
  metrics.rpcEmptyResponses.incr();
  return GetJournaledEditsResponseProto.newBuilder().setTxnCount(0).build();
}
{code}
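The fix described above can be sketched as follows. This is an illustrative model with simplified signatures, not the Hadoop patch: instead of returning an empty response when the requested txid is ahead of what this JournalNode has written, the node signals the caller explicitly, so the NameNode can exclude it from the quorum calculation rather than treating "nothing here" as a valid count of 0.

```java
public class JournalSketch {
    static class NewerTxnIdException extends RuntimeException {
        NewerTxnIdException(String msg) { super(msg); }
    }

    /** Returns the number of txns served from sinceTxId, inclusive. */
    static long journaledEdits(long sinceTxId, long highestWrittenTxId) {
        if (sinceTxId > highestWrittenTxId) {
            // Old behavior returned txnCount=0 here, poisoning the quorum math.
            throw new NewerTxnIdException("sinceTxId " + sinceTxId
                + " > highestWrittenTxId " + highestWrittenTxId);
        }
        return highestWrittenTxId - sinceTxId + 1;
    }
}
```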
[jira] [Resolved] (HDFS-16732) [SBN READ] Avoid getting locations from the observer when the block report is delayed.
[ https://issues.apache.org/jira/browse/HDFS-16732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-16732.
Fix Version/s: 3.4.0, 3.3.9
Resolution: Fixed

Merged PR 4756 to trunk and branch-3.3. Thanks [~zhengchenyu]!

Key: HDFS-16732
URL: https://issues.apache.org/jira/browse/HDFS-16732
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs
Affects Versions: 3.2.1
Reporter: zhengchenyu
Assignee: zhengchenyu
Priority: Critical
Labels: pull-request-available
Fix For: 3.4.0, 3.3.9

Hive-on-Tez applications fail occasionally after the Observer is enabled; the log is shown below:

{code:java}
2022-08-18 15:22:06,914 [ERROR] [Dispatcher thread {Central}] |impl.VertexImpl|: Vertex Input: namenodeinfo_stg initializer failed, vertex=vertex_1660618571916_4839_1_00 [Map 1]
org.apache.tez.dag.app.dag.impl.AMUserCodeException: java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallback.onFailure(RootInputInitializerManager.java:329)
    at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.afterRanInterruptibly(TrustedListenableFutureTask.java:133)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:80)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.hadoop.mapred.FileInputFormat.identifyHosts(FileInputFormat.java:748)
    at org.apache.hadoop.mapred.FileInputFormat.getSplitHostsAndCachedHosts(FileInputFormat.java:714)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:378)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:306)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:408)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:159)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:279)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:270)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:254)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
    ... 4 more
{code}

As described in MAPREDUCE-7082, this exception is thrown when a block is missing, but my cluster had no missing blocks.

In this example, I found that getListing returns location information. When the Observer's block report is delayed, it returns the block without locations. HDFS-13924 was introduced to solve this problem, but it only considers getBlockLocations. On the Observer node, every method that may return locations should check whether the locations are empty.
[jira] [Resolved] (HDFS-16181) [SBN Read] Fix metric of RpcRequestCacheMissAmount can't display when tailEditLog from JN
[ https://issues.apache.org/jira/browse/HDFS-16181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-16181.
Fix Version/s: 3.4.0, 2.10.2, 3.3.2, 3.2.4, 3.1.5
Hadoop Flags: Reviewed
Resolution: Fixed

Thank you [~jianghuazhu]! This is my mistake; I just updated the JIRA status.

Key: HDFS-16181
URL: https://issues.apache.org/jira/browse/HDFS-16181
Project: Hadoop HDFS
Issue Type: Bug
Reporter: wangzhaohui
Assignee: wangzhaohui
Priority: Critical
Labels: pull-request-available
Fix For: 3.4.0, 2.10.2, 3.3.2, 3.2.4, 3.1.5
Attachments: after.jpg, before.jpg
Time Spent: 2h 10m
Remaining Estimate: 0h

I found that the JN has the edit cache turned on, but the metric rpcRequestCacheMissAmount does not display.
[jira] [Resolved] (HDFS-16233) Do not use exception handler to implement copy-on-write for EnumCounters
[ https://issues.apache.org/jira/browse/HDFS-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-16233. Resolution: Fixed > Do not use exception handler to implement copy-on-write for EnumCounters > > > Key: HDFS-16233 > URL: https://issues.apache.org/jira/browse/HDFS-16233 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 2.10.2, 3.2.3, 3.3.2, 3.1.5 > > Attachments: Screen Shot 2021-09-22 at 1.59.59 PM.png, > profile_c7_delete_asyncaudit.html > > Time Spent: 1h 10m > Remaining Estimate: 0h > > HDFS-14547 saves the NameNode heap space occupied by EnumCounters by > essentially implementing a copy-on-write strategy. > At the beginning, all EnumCounters refer to the same ConstEnumCounters to save > heap space. When one is modified, an exception is thrown and the exception > handler converts the ConstEnumCounters to an EnumCounters object and updates it. > Using an exception handler for anything more than occasional control flow is bad for > performance. > Proposal: use the instanceof keyword to detect the type of the object and do COW > accordingly.
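The proposed instanceof-based copy-on-write can be sketched like this. Class and field names are simplified stand-ins for Hadoop's EnumCounters/ConstEnumCounters, not the actual implementation: all holders start out sharing one immutable instance, and the first write detects the shared type with instanceof and swaps in a private mutable copy, rather than letting the immutable instance throw and converting inside a catch block.

```java
/**
 * Sketch of instanceof-based copy-on-write (names are hypothetical
 * simplifications of EnumCounters/ConstEnumCounters). The shared
 * immutable instance is never mutated; the first write makes a copy.
 */
public class CowCounters {
    static class Counters {
        long[] values;
        Counters(long[] values) { this.values = values; }
    }
    /** Immutable shared instance; writes must never reach it. */
    static class ConstCounters extends Counters {
        ConstCounters(long[] values) { super(values); }
    }
    static final ConstCounters SHARED = new ConstCounters(new long[]{0, 0});

    /** Adds delta to slot i, copying first if the counters are still shared. */
    static Counters add(Counters c, int i, long delta) {
        if (c instanceof ConstCounters) {        // COW: detect the shared type
            c = new Counters(c.values.clone()); // make a private mutable copy
        }
        c.values[i] += delta;
        return c;
    }

    public static void main(String[] args) {
        Counters c = add(SHARED, 0, 5);
        System.out.println(c != SHARED);       // true: a copy was made
        System.out.println(SHARED.values[0]);  // 0: shared instance untouched
        System.out.println(c.values[0]);       // 5
    }
}
```

An instanceof test on the hot path is a cheap, predictable branch, whereas filling in an exception's stack trace on every first write is comparatively expensive, which is the performance point the proposal makes.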
[jira] [Created] (HDFS-15032) Balancer crashes when it fails to contact an NN via ObserverReadProxyProvider
Erik Krogen created HDFS-15032: -- Summary: Balancer crashes when it fails to contact an NN via ObserverReadProxyProvider Key: HDFS-15032 URL: https://issues.apache.org/jira/browse/HDFS-15032 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.10.0 Reporter: Erik Krogen Assignee: Erik Krogen When trying to run the Balancer using ObserverReadProxyProvider (to allow it to read from the Observer Node as described in HDFS-14979), if one of the NNs isn't running, the Balancer will crash.
[jira] [Created] (HDFS-14979) [Observer Node] Balancer should submit getBlocks to Observer Node when possible
Erik Krogen created HDFS-14979: -- Summary: [Observer Node] Balancer should submit getBlocks to Observer Node when possible Key: HDFS-14979 URL: https://issues.apache.org/jira/browse/HDFS-14979 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover, hdfs Reporter: Erik Krogen Assignee: Erik Krogen In HDFS-14162, we made it so that the Balancer could function when {{ObserverReadProxyProvider}} was in use. However, the Balancer would still read from the active NameNode, because {{getBlocks}} wasn't annotated as {{@ReadOnly}}. This task is to enable the Balancer to actually read from the Observer Node to alleviate load from the active NameNode.
[jira] [Created] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
Erik Krogen created HDFS-14973: -- Summary: Balancer getBlocks RPC dispersal does not function properly Key: HDFS-14973 URL: https://issues.apache.org/jira/browse/HDFS-14973 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 3.0.0, 2.8.2, 2.7.4, 2.9.0 Reporter: Erik Krogen Assignee: Erik Krogen In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls issued by the Balancer/Mover more dispersed, to alleviate load on the NameNode, since {{getBlocks}} can be very expensive and the Balancer should not impact normal cluster operation. Unfortunately, this functionality does not function as expected, especially when the dispatcher thread count is low. The primary issue is that the delay is applied only to the first N threads that are submitted to the dispatcher's executor, where N is the size of the dispatcher's threadpool, but *not* to the first R threads, where R is the number of allowed {{getBlocks}} QPS (currently hardcoded to 20). For example, if the threadpool size is 100 (the default), threads 0-19 have no delay, 20-99 have increased levels of delay, and 100+ have no delay. As I understand it, the intent of the logic was that the delay applied to the first 100 threads would force the dispatcher executor's threads to all be consumed, thus blocking subsequent (non-delayed) threads until the delay period has expired. However, threads 0-19 can finish very quickly (their work can often be fulfilled in the time it takes to execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), thus opening up 20 new slots in the executor, which are then consumed by non-delayed threads 100-119, and so on. So, although 80 threads have had a delay applied, the non-delay threads rush through in the 20 non-delay slots. This problem gets even worse when the dispatcher threadpool size is less than the max {{getBlocks}} QPS. 
For example, if the threadpool size is 10, _no threads ever have a delay applied_, and the feature is not enabled at all.
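The delay schedule described above can be modeled in a few lines. This is a hypothetical simplification, not the actual Balancer code: with threadpool size N and a getBlocks QPS cap of R, a delay is applied only to submitted calls with index in [R, N), so the first R calls and every call past N run immediately.

```java
/**
 * Minimal model of the getBlocks delay schedule described above
 * (hypothetical simplification; not the actual Balancer/Dispatcher code).
 * poolSize = dispatcher threadpool size (N), maxQps = allowed getBlocks
 * QPS (R, hardcoded to 20 in the real code).
 */
public class DelayModel {
    /** Delay, in QPS periods, applied to the k-th submitted getBlocks call. */
    static int delayPeriods(int k, int poolSize, int maxQps) {
        if (k < maxQps || k >= poolSize) {
            return 0;                // no delay outside [R, N)
        }
        return k / maxQps;           // increasing delay bands within the pool
    }

    public static void main(String[] args) {
        // Default pool of 100, QPS cap of 20: calls 0-19 and 100+ get no delay,
        // so non-delayed work keeps flowing through the 20 fast slots.
        System.out.println(delayPeriods(5, 100, 20));    // 0
        System.out.println(delayPeriods(45, 100, 20));   // 2
        System.out.println(delayPeriods(120, 100, 20));  // 0
        // Pool of 10 (< QPS cap): every k satisfies k < 20 or k >= 10,
        // so no call is ever delayed and the feature is effectively off.
        boolean anyDelay = false;
        for (int k = 0; k < 1000; k++) {
            anyDelay |= delayPeriods(k, 10, 20) > 0;
        }
        System.out.println(anyDelay);                    // false
    }
}
```

The model makes the two failure modes concrete: non-delayed calls leapfrog through the R fast slots when N > R, and the delayed band [R, N) is empty whenever N <= R.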
[jira] [Resolved] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-14245. Resolution: Fixed > Class cast error in GetGroups with ObserverReadProxyProvider > > > Key: HDFS-14245 > URL: https://issues.apache.org/jira/browse/HDFS-14245 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: HDFS-12943 >Reporter: Shen Yinjie >Assignee: Erik Krogen >Priority: Major > Fix For: 2.10.0, 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-14245.000.patch, HDFS-14245.001.patch, > HDFS-14245.002.patch, HDFS-14245.003.patch, HDFS-14245.004.patch, > HDFS-14245.005.patch, HDFS-14245.006.patch, HDFS-14245.007.patch, > HDFS-14245.patch > > > Run "hdfs groups" with ObserverReadProxyProvider, Exception throws as : > {code:java} > Exception in thread "main" java.io.IOException: Couldn't create proxy > provider class > org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider > at > org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261) > at > org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119) > at > org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95) > at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87) > at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > 
org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245) > ... 7 more > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be > cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory > at > org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.(ObserverReadProxyProvider.java:123) > at > org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.(ObserverReadProxyProvider.java:112) > ... 12 more > {code} > similar to HDFS-14116, we did a simple fix.
[jira] [Reopened] (HDFS-14162) Balancer should work with ObserverNode
[ https://issues.apache.org/jira/browse/HDFS-14162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen reopened HDFS-14162: Re-opening for backport to older branches, which should have been done from the start. > Balancer should work with ObserverNode > -- > > Key: HDFS-14162 > URL: https://issues.apache.org/jira/browse/HDFS-14162 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Konstantin Shvachko >Assignee: Erik Krogen >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14162-HDFS-12943.wip0.patch, HDFS-14162.000.patch, > HDFS-14162.001.patch, HDFS-14162.002.patch, HDFS-14162.003.patch, > HDFS-14162.004.patch, ReflectionBenchmark.java, > testBalancerWithObserver-3.patch, testBalancerWithObserver.patch > > > Balancer provides a substantial RPC load on NameNode. It would be good to > divert Balancer RPCs {{getBlocks()}}, etc. to ObserverNode. The main problem > is that Balancer uses {{NamenodeProtocol}}, while ORPP currently supports > only {{ClientProtocol}}.
[jira] [Reopened] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider
[ https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen reopened HDFS-14245: Re-opening for backport to other branches, which should have been done from the start. > Class cast error in GetGroups with ObserverReadProxyProvider > > > Key: HDFS-14245 > URL: https://issues.apache.org/jira/browse/HDFS-14245 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: HDFS-12943 >Reporter: Shen Yinjie >Assignee: Erik Krogen >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14245.000.patch, HDFS-14245.001.patch, > HDFS-14245.002.patch, HDFS-14245.003.patch, HDFS-14245.004.patch, > HDFS-14245.005.patch, HDFS-14245.006.patch, HDFS-14245.007.patch, > HDFS-14245.patch > > > Run "hdfs groups" with ObserverReadProxyProvider, Exception throws as : > {code:java} > Exception in thread "main" java.io.IOException: Couldn't create proxy > provider class > org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider > at > org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261) > at > org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119) > at > org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95) > at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87) > at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at 
java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245) > ... 7 more > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be > cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory > at > org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.(ObserverReadProxyProvider.java:123) > at > org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.(ObserverReadProxyProvider.java:112) > ... 12 more > {code} > similar to HDFS-14116, we did a simple fix.
[jira] [Created] (HDFS-14829) [Dynamometer] Update TestDynamometerInfra to be Hadoop 3.2+ compatible
Erik Krogen created HDFS-14829: -- Summary: [Dynamometer] Update TestDynamometerInfra to be Hadoop 3.2+ compatible Key: HDFS-14829 URL: https://issues.apache.org/jira/browse/HDFS-14829 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen Currently the integration test included with Dynamometer, {{TestDynamometerInfra}}, is executing against version 3.1.2 of Hadoop. We should update it to run against a more recent version by default (3.2.x) and add support for 3.3 in anticipation of HDFS-14412.
[jira] [Created] (HDFS-14667) Backport [HDFS-14403] "Cost-based FairCallQueue" to branch-2
Erik Krogen created HDFS-14667: -- Summary: Backport [HDFS-14403] "Cost-based FairCallQueue" to branch-2 Key: HDFS-14667 URL: https://issues.apache.org/jira/browse/HDFS-14667 Project: Hadoop HDFS Issue Type: Improvement Reporter: Erik Krogen Assignee: Erik Krogen We would like to target pulling HDFS-14403, an important operability enhancement, into branch-2.
[jira] [Created] (HDFS-14643) [Dynamometer] Merge extra commits from GitHub to Hadoop
Erik Krogen created HDFS-14643: -- Summary: [Dynamometer] Merge extra commits from GitHub to Hadoop Key: HDFS-14643 URL: https://issues.apache.org/jira/browse/HDFS-14643 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen Assignee: Erik Krogen While Dynamometer was in the process of being committed to Hadoop, a few patches went in to the GitHub version that haven't yet made it into the version committed here. Some of them are related to TravisCI and Bintray deployment, which can safely be ignored in a Hadoop context, but a few are relevant: {code} * 2d2591e 2019-05-24 Make XML parsing error message more explicit (PR #97) [lfengnan ] * 755a298 2019-04-04 Fix misimplemented CountTimeWritable setter and update the README docs regarding the output file (PR #96) [Christopher Gregorian ] * 66d3e19 2019-03-14 Modify AuditReplay workflow to output count and latency of operations (PR #92) [Christopher Gregorian ] * 5c1d8cd 2019-02-28 Fix issues with the start-workload.sh script (PR #84) [Erik Krogen ] {code} I will use this ticket to track porting these 4 commits into Hadoop's Dynamometer.
[jira] [Created] (HDFS-14640) [Dynamometer] Fix TestDynamometerInfra failures
Erik Krogen created HDFS-14640: -- Summary: [Dynamometer] Fix TestDynamometerInfra failures Key: HDFS-14640 URL: https://issues.apache.org/jira/browse/HDFS-14640 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen Assignee: Erik Krogen I've been seeing Jenkins reporting some failures of the {{TestDynamometerInfra}} test (basically a big integration test). It seems like it's timing out after 15 minutes.
[jira] [Created] (HDFS-14639) [Dynamometer] Unnecessary duplicate bin directory appears in dist layout
Erik Krogen created HDFS-14639: -- Summary: [Dynamometer] Unnecessary duplicate bin directory appears in dist layout Key: HDFS-14639 URL: https://issues.apache.org/jira/browse/HDFS-14639 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, test Reporter: Erik Krogen The bin files get put into the {{share/hadoop/tools/dynamometer/dynamometer-*/bin}} locations as expected: {code} ekrogen at ekrogen-mn6 in ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.3.0-SNAPSHOT on ekrogen-HDFS-14410-dyno-docs! ± ls share/hadoop/tools/dynamometer/dynamometer-*/bin share/hadoop/tools/dynamometer/dynamometer-blockgen/bin: generate-block-lists.sh share/hadoop/tools/dynamometer/dynamometer-infra/bin: create-slim-hadoop-tar.sh parse-metrics.sh start-dynamometer-cluster.sh upload-fsimage.sh share/hadoop/tools/dynamometer/dynamometer-workload/bin: parse-start-timestamp.sh start-workload.sh {code} But for blockgen specifically, it also ends up in another folder: {code} ekrogen at ekrogen-mn6 in ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.3.0-SNAPSHOT on ekrogen-HDFS-14410-dyno-docs! ± ls share/hadoop/tools/dynamometer-blockgen/bin generate-block-lists.sh {code}
[jira] [Created] (HDFS-14638) [Dynamometer] Fix scripts to refer to current build structure
Erik Krogen created HDFS-14638: -- Summary: [Dynamometer] Fix scripts to refer to current build structure Key: HDFS-14638 URL: https://issues.apache.org/jira/browse/HDFS-14638 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, test Reporter: Erik Krogen The scripts within the Dynamometer build dirs all refer to the old distribution structure with a single {{bin}} directory and a single {{lib}} directory. We need to update them to refer to the Hadoop-standard layout.
[jira] [Created] (HDFS-14539) Remove Dynamometer's reliance on the tar utility
Erik Krogen created HDFS-14539: -- Summary: Remove Dynamometer's reliance on the tar utility Key: HDFS-14539 URL: https://issues.apache.org/jira/browse/HDFS-14539 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen Dynamometer currently relies on the tar utility, which is cumbersome and means that it won't work on Windows. We should remove this dependency.
[jira] [Resolved] (HDFS-14500) NameNode StartupProgress continues to report edit log segments after the LOADING_EDITS phase is finished
[ https://issues.apache.org/jira/browse/HDFS-14500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-14500. Resolution: Fixed > NameNode StartupProgress continues to report edit log segments after the > LOADING_EDITS phase is finished > > > Key: HDFS-14500 > URL: https://issues.apache.org/jira/browse/HDFS-14500 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 3.1.2 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Fix For: 2.10.0, 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14500-branch-2.001.patch, HDFS-14500.000.patch, > HDFS-14500.001.patch > > > When testing out a cluster with the edit log tailing fast path feature > enabled (HDFS-13150), an unrelated issue caused the NameNode to remain in > safe mode for an extended period of time, preventing the NameNode from fully > completing its startup sequence. We noticed that the Startup Progress web UI > displayed many edit log segments (millions of them). > I traced this problem back to {{StartupProgress}}. Within > {{FSEditLogLoader}}, the loader continually tries to update the startup > progress with a new {{Step}} any time that it loads edits. Per the Javadoc > for {{StartupProgress}}, this should be a no-op once startup is completed: > {code:title=StartupProgress.java} > * After startup completes, the tracked data is frozen. Any subsequent > updates > * or counter increments are no-ops. > {code} > However, {{StartupProgress}} only implements that logic once the _entire_ > startup sequence has been completed. When {{FSEditLogLoader}} calls > {{addStep()}}, it adds it into the {{LOADING_EDITS}} phase: > {code:title=FSEditLogLoader.java} > StartupProgress prog = NameNode.getStartupProgress(); > Step step = createStartupProgressStep(edits); > prog.beginStep(Phase.LOADING_EDITS, step); > {code} > This phase, in our case, ended long before, so it is nonsensical to continue > to add steps to it. 
I believe it is a bug that {{StartupProgress}} accepts > such steps instead of ignoring them; once a phase is complete, it should no > longer change.
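The per-phase freeze argued for above can be sketched as follows. The class and method names here are simplified stand-ins for Hadoop's StartupProgress, not its real implementation: instead of freezing only when the entire startup sequence ends, beginStep() becomes a no-op as soon as its own phase has been marked complete.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/**
 * Sketch of per-phase freezing (hypothetical simplification of Hadoop's
 * StartupProgress). Steps added after endPhase() are silently ignored,
 * so a long-running LOADING_EDITS tail cannot inflate the UI.
 */
public class PhaseProgress {
    enum Phase { LOADING_FSIMAGE, LOADING_EDITS, SAFEMODE }

    private final Map<Phase, List<String>> steps = new EnumMap<>(Phase.class);
    private final Map<Phase, Boolean> complete = new EnumMap<>(Phase.class);

    void beginStep(Phase phase, String step) {
        if (Boolean.TRUE.equals(complete.get(phase))) {
            return; // phase already finished: ignore late steps
        }
        steps.computeIfAbsent(phase, p -> new ArrayList<>()).add(step);
    }

    void endPhase(Phase phase) { complete.put(phase, true); }

    int stepCount(Phase phase) {
        return steps.getOrDefault(phase, Collections.emptyList()).size();
    }

    public static void main(String[] args) {
        PhaseProgress prog = new PhaseProgress();
        prog.beginStep(Phase.LOADING_EDITS, "segment-1");
        prog.endPhase(Phase.LOADING_EDITS);
        prog.beginStep(Phase.LOADING_EDITS, "segment-2"); // ignored
        System.out.println(prog.stepCount(Phase.LOADING_EDITS)); // 1
    }
}
```

With this shape, callers like FSEditLogLoader need no changes: their beginStep() calls simply stop having any effect once LOADING_EDITS has ended.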
[jira] [Reopened] (HDFS-14500) NameNode StartupProgress continues to report edit log segments after the LOADING_EDITS phase is finished
[ https://issues.apache.org/jira/browse/HDFS-14500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen reopened HDFS-14500: > NameNode StartupProgress continues to report edit log segments after the > LOADING_EDITS phase is finished > > > Key: HDFS-14500 > URL: https://issues.apache.org/jira/browse/HDFS-14500 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 3.1.2 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-14500.000.patch, HDFS-14500.001.patch > > > When testing out a cluster with the edit log tailing fast path feature > enabled (HDFS-13150), an unrelated issue caused the NameNode to remain in > safe mode for an extended period of time, preventing the NameNode from fully > completing its startup sequence. We noticed that the Startup Progress web UI > displayed many edit log segments (millions of them). > I traced this problem back to {{StartupProgress}}. Within > {{FSEditLogLoader}}, the loader continually tries to update the startup > progress with a new {{Step}} any time that it loads edits. Per the Javadoc > for {{StartupProgress}}, this should be a no-op once startup is completed: > {code:title=StartupProgress.java} > * After startup completes, the tracked data is frozen. Any subsequent > updates > * or counter increments are no-ops. > {code} > However, {{StartupProgress}} only implements that logic once the _entire_ > startup sequence has been completed. When {{FSEditLogLoader}} calls > {{addStep()}}, it adds it into the {{LOADING_EDITS}} phase: > {code:title=FSEditLogLoader.java} > StartupProgress prog = NameNode.getStartupProgress(); > Step step = createStartupProgressStep(edits); > prog.beginStep(Phase.LOADING_EDITS, step); > {code} > This phase, in our case, ended long before, so it is nonsensical to continue > to add steps to it. 
I believe it is a bug that {{StartupProgress}} accepts > such steps instead of ignoring them; once a phase is complete, it should no > longer change.
[jira] [Created] (HDFS-14500) NameNode StartupProgress continues to report edit log segments after the LOADING_EDITS phase is finished
Erik Krogen created HDFS-14500: -- Summary: NameNode StartupProgress continues to report edit log segments after the LOADING_EDITS phase is finished Key: HDFS-14500 URL: https://issues.apache.org/jira/browse/HDFS-14500 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.1.2, 2.8.5, 3.0.3, 2.9.2, 3.2.0 Reporter: Erik Krogen Assignee: Erik Krogen When testing out a cluster with the edit log tailing fast path feature enabled (HDFS-13150), an unrelated issue caused the NameNode to remain in safe mode for an extended period of time, preventing the NameNode from fully completing its startup sequence. We noticed that the Startup Progress web UI displayed many edit log segments (millions of them). I traced this problem back to {{StartupProgress}}. Within {{FSEditLogLoader}}, the loader continually tries to update the startup progress with a new {{Step}} any time that it loads edits. Per the Javadoc for {{StartupProgress}}, this should be a no-op once startup is completed: {code:title=StartupProgress.java} * After startup completes, the tracked data is frozen. Any subsequent updates * or counter increments are no-ops. {code} However, {{StartupProgress}} only implements that logic once the _entire_ startup sequence has been completed. When {{FSEditLogLoader}} calls {{addStep()}}, it adds it into the {{LOADING_EDITS}} phase: {code:title=FSEditLogLoader.java} StartupProgress prog = NameNode.getStartupProgress(); Step step = createStartupProgressStep(edits); prog.beginStep(Phase.LOADING_EDITS, step); {code} This phase, in our case, ended long before, so it is nonsensical to continue to add steps to it. I believe it is a bug that {{StartupProgress}} accepts such steps instead of ignoring them; once a phase is complete, it should no longer change.
[jira] [Created] (HDFS-14462) WebHDFS throws "Error writing request body to server" instead of NSQuotaExceededException
Erik Krogen created HDFS-14462: -- Summary: WebHDFS throws "Error writing request body to server" instead of NSQuotaExceededException Key: HDFS-14462 URL: https://issues.apache.org/jira/browse/HDFS-14462 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 3.1.2, 2.7.7, 2.8.5, 3.0.3, 2.9.2, 3.2.0 Reporter: Erik Krogen We noticed recently in our environment that, when writing data to HDFS via WebHDFS, a quota exception is returned to the client as: {code} java.io.IOException: Error writing request body to server at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3536) ~[?:1.8.0_172] at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3519) ~[?:1.8.0_172] at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[?:1.8.0_172] at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[?:1.8.0_172] at java.io.FilterOutputStream.flush(FilterOutputStream.java:140) ~[?:1.8.0_172] at java.io.DataOutputStream.flush(DataOutputStream.java:123) ~[?:1.8.0_172] {code} It is entirely opaque to the user that this exception was caused because they exceeded their quota. Yet in the DataNode logs: {code} 2019-04-24 02:13:09,639 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /foo/path/here is exceeded: quota = B = X TB but diskspace consumed = B = X TB at org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyStoragespaceQuota(DirectoryWithQuotaFeature.java:211) at org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:239) {code} This was on a 2.7.x cluster, but I verified that the same logic exists on trunk. I believe we need to fix some of the logic within the {{ExceptionHandler}} to add special handling for the quota exception. 
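The kind of special handling proposed for the quota case can be sketched like this. Everything here is hypothetical: the real fix would live in WebHDFS's {{ExceptionHandler}}, and the class names below are stand-ins, not Hadoop types. The point is only that a quota violation should be mapped to an explicit, self-describing HTTP error instead of surfacing to the client as a dropped request body.

```java
/**
 * Hypothetical sketch of quota-aware error mapping (stand-in types; the
 * real change would go in WebHDFS's ExceptionHandler). A quota failure
 * is turned into an explicit 403 response carrying the quota message,
 * rather than the client seeing only a broken write stream.
 */
public class QuotaAwareHandler {
    /** Stand-in for HDFS's quota-exceeded exception type. */
    static class QuotaExceededException extends RuntimeException {
        QuotaExceededException(String msg) { super(msg); }
    }

    /** Maps an exception raised during a write to an HTTP status line. */
    static String toResponse(Exception e) {
        if (e instanceof QuotaExceededException) {
            // Surface the quota failure to the client instead of the opaque
            // "Error writing request body to server".
            return "403 " + e.getMessage();
        }
        return "500 Internal Server Error";
    }

    public static void main(String[] args) {
        System.out.println(toResponse(
            new QuotaExceededException("DiskSpace quota of /foo exceeded")));
        System.out.println(toResponse(new IllegalStateException()));
    }
}
```

The same instanceof-dispatch pattern extends to other server-side exceptions that currently reach WebHDFS clients as generic connection errors.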
[jira] [Created] (HDFS-14442) Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId
Erik Krogen created HDFS-14442: -- Summary: Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId Key: HDFS-14442 URL: https://issues.apache.org/jira/browse/HDFS-14442 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.3.0 Reporter: Erik Krogen While working on HDFS-14245, we noticed a discrepancy in some proxy-handling code. The description of {{RpcInvocationHandler.getConnectionId()}} states: {code} /** * Returns the connection id associated with the InvocationHandler instance. * @return ConnectionId */ ConnectionId getConnectionId(); {code} It does not make any claims about whether this connection ID will be an active proxy or not. Yet in {{HAUtil}} we have: {code} /** * Get the internet address of the currently-active NN. This should rarely be * used, since callers of this method who connect directly to the NN using the * resulting InetSocketAddress will not be able to connect to the active NN if * a failover were to occur after this method has been called. * * @param fs the file system to get the active address of. * @return the internet address of the currently-active NN. * @throws IOException if an error occurs while resolving the active NN. */ public static InetSocketAddress getAddressOfActive(FileSystem fs) throws IOException { if (!(fs instanceof DistributedFileSystem)) { throw new IllegalArgumentException("FileSystem " + fs + " is not a DFS."); } // force client address resolution. fs.exists(new Path("/")); DistributedFileSystem dfs = (DistributedFileSystem) fs; DFSClient dfsClient = dfs.getClient(); return RPC.getServerAddress(dfsClient.getNamenode()); } {code} Where the call {{RPC.getServerAddress()}} eventually terminates into {{RpcInvocationHandler#getConnectionId()}}, via {{RPC.getServerAddress()}} -> {{RPC.getConnectionIdForProxy()}} -> {{RpcInvocationHandler#getConnectionId()}}. 
{{HAUtil}} appears to be making an incorrect assumption that {{RpcInvocationHandler}} will necessarily return an _active_ connection ID. {{ObserverReadProxyProvider}} demonstrates a counter-example to this, since the current connection ID may be pointing at, for example, an Observer NameNode.
[jira] [Created] (HDFS-14435) ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs
Erik Krogen created HDFS-14435: -- Summary: ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs Key: HDFS-14435 URL: https://issues.apache.org/jira/browse/HDFS-14435 Project: Hadoop HDFS Issue Type: Bug Components: ha, nn Affects Versions: 3.3.0 Reporter: Erik Krogen Assignee: Erik Krogen We have been seeing issues during testing of the Consistent Read from Standby feature that indicate that ORPP is unable to call {{getHAServiceState}} on Standby NNs, as they are rejected with a {{StandbyException}}. Upon further investigation, we realized that although the Standby allows the {{getHAServiceState()}} call, reading a delegation token is not allowed in Standby state, thus the call will fail when using DT-based authentication. This hasn't caused issues in practice, since ORPP assumes that the state is Standby if it is unable to fetch the state, but we should fix the logic to properly handle this scenario.
[jira] [Created] (HDFS-14413) HA Support for Dynamometer
Erik Krogen created HDFS-14413: -- Summary: HA Support for Dynamometer Key: HDFS-14413 URL: https://issues.apache.org/jira/browse/HDFS-14413 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen It would be nice if Dynamometer could handle spinning up a full 2 NN + 3 QJM cluster instead of just a single NN. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-14412) Enable Dynamometer to use the local build of Hadoop by default
Erik Krogen created HDFS-14412: -- Summary: Enable Dynamometer to use the local build of Hadoop by default Key: HDFS-14412 URL: https://issues.apache.org/jira/browse/HDFS-14412 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen Currently, by default, Dynamometer will download a Hadoop tarball from the internet to use as the Hadoop version-under-test. Since it is bundled inside of Hadoop now, it would make more sense for it to use the current version of Hadoop by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-14411) Combine Dynamometer's SimulatedDataNodes into DataNodeCluster
Erik Krogen created HDFS-14411: -- Summary: Combine Dynamometer's SimulatedDataNodes into DataNodeCluster Key: HDFS-14411 URL: https://issues.apache.org/jira/browse/HDFS-14411 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen Dynamometer has a {{SimulatedDataNodes}} class, which is very similar to {{DataNodeCluster}} but with some different functionality. It would be better to combine the two to keep maintenance changes in a single place. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-14410) Make Dynamometer documentation properly compile onto the Hadoop site
Erik Krogen created HDFS-14410: -- Summary: Make Dynamometer documentation properly compile onto the Hadoop site Key: HDFS-14410 URL: https://issues.apache.org/jira/browse/HDFS-14410 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen The documentation included with Dynamometer doesn't properly appear on the site; we need to twiddle with this a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-14409) Improve Dynamometer test suite
Erik Krogen created HDFS-14409: -- Summary: Improve Dynamometer test suite Key: HDFS-14409 URL: https://issues.apache.org/jira/browse/HDFS-14409 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen The testing within Dynamometer now is mostly one big integration test. It could really use better testing throughout. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-14370) Edit log tailing fast-path should allow for backoff
Erik Krogen created HDFS-14370: -- Summary: Edit log tailing fast-path should allow for backoff Key: HDFS-14370 URL: https://issues.apache.org/jira/browse/HDFS-14370 Project: Hadoop HDFS Issue Type: Improvement Components: namenode, qjm Affects Versions: 3.3.0 Reporter: Erik Krogen Assignee: Erik Krogen As part of HDFS-13150, in-progress edit log tailing was changed to use an RPC-based mechanism, thus allowing the edit log tailing frequency to be turned way down, and allowing standby/observer NameNodes to be only a few milliseconds stale as compared to the Active NameNode. When there is a high volume of transactions on the system, each RPC fetches transactions and takes some time to process them, self-rate-limiting how frequently an RPC is submitted. In a lightly loaded cluster, however, most of these RPCs return an empty set of transactions, consuming a high (de)serialization overhead for very little benefit. This was reported by [~jojochuang] in HDFS-14276 and I have also reported it on a test cluster where the SbNN was submitting 8000 RPCs per second that returned empty. I propose we add some sort of backoff to the tailing, so that if an empty response is received, it will wait a longer period of time before submitting a new RPC. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
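As a rough illustration of the proposed backoff (the class and method names below are invented for the sketch, not actual HDFS code): an idle tailer could double its sleep interval on every empty response, bounded by a cap, and reset to the base tailing period as soon as transactions arrive.

```java
// Illustrative sketch of exponential backoff for edit log tailing.
// EditTailBackoff is a hypothetical name; the real fix would live in
// the tailer thread and make both periods configurable.
class EditTailBackoff {
    private final long basePeriodMs;
    private final long maxPeriodMs;
    private long currentPeriodMs;

    EditTailBackoff(long basePeriodMs, long maxPeriodMs) {
        this.basePeriodMs = basePeriodMs;
        this.maxPeriodMs = maxPeriodMs;
        this.currentPeriodMs = basePeriodMs;
    }

    /** Returns how long to sleep before the next tailing RPC. */
    long nextSleepMs(int txnsReceived) {
        if (txnsReceived > 0) {
            currentPeriodMs = basePeriodMs; // busy: tail at full speed
        } else {
            // idle: back off exponentially, bounded by the cap
            currentPeriodMs = Math.min(currentPeriodMs * 2, maxPeriodMs);
        }
        return currentPeriodMs;
    }
}
```

With a 5ms base and a 1s cap, a quiet cluster quickly settles at one RPC per second instead of thousands, while a busy cluster keeps the low-latency fast path.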
[jira] [Created] (HDFS-14349) Edit log may be rolled more frequently than necessary with multiple Standby nodes
Erik Krogen created HDFS-14349: -- Summary: Edit log may be rolled more frequently than necessary with multiple Standby nodes Key: HDFS-14349 URL: https://issues.apache.org/jira/browse/HDFS-14349 Project: Hadoop HDFS Issue Type: Bug Components: ha, hdfs, qjm Reporter: Erik Krogen Assignee: Ekanth Sethuramalingam When HDFS-14317 was fixed, we tackled the problem that in a cluster with in-progress edit log tailing enabled, a Standby NameNode may _never_ roll the edit logs, which can eventually cause data loss. Unfortunately, in the process, it was made so that if there are multiple Standby NameNodes, they will all roll the edit logs at their specified frequency, so the edit log will be rolled X times more frequently than it should be (where X is the number of Standby NNs). This is not as bad as the original bug since rolling frequently does not affect correctness or data availability, but may degrade performance by creating more edit log segments than necessary. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
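One possible shape for such a fix (purely illustrative; the actual patch may differ): have each Standby decide whether to trigger a roll based on the observed age of the active's in-progress edit log segment, a fact all Standbys share, rather than on its own last-trigger time, so X Standbys no longer multiply the roll frequency by X.

```java
// Hypothetical sketch: a Standby triggers a roll only when the
// in-progress segment (as observed from tailed edits) is older than
// the roll period. Because every Standby evaluates the same shared
// fact (segment age), a fresh roll by one Standby suppresses the rest.
class RollDecision {
    private final long rollPeriodMs;

    RollDecision(long rollPeriodMs) {
        this.rollPeriodMs = rollPeriodMs;
    }

    /**
     * @param segmentStartTimeMs when the current in-progress segment
     *        began, as observed by this Standby from tailing
     * @param nowMs current time
     */
    boolean shouldTriggerRoll(long segmentStartTimeMs, long nowMs) {
        return nowMs - segmentStartTimeMs >= rollPeriodMs;
    }
}
```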
[jira] [Created] (HDFS-14279) [SBN Read] Race condition in ObserverReadProxyProvider
Erik Krogen created HDFS-14279: -- Summary: [SBN Read] Race condition in ObserverReadProxyProvider Key: HDFS-14279 URL: https://issues.apache.org/jira/browse/HDFS-14279 Project: Hadoop HDFS Issue Type: Bug Components: hdfs, namenode Reporter: Erik Krogen Assignee: Erik Krogen

There is a race condition in {{ObserverReadProxyProvider#getCurrentProxy()}}:
{code}
private NNProxyInfo getCurrentProxy() {
  if (currentProxy == null) {
    changeProxy(null);
  }
  return currentProxy;
}
{code}
{{currentProxy}} is {{volatile}}. Another {{changeProxy()}} could occur after the {{changeProxy(null)}} call and before the {{return}}, making the return value incorrect. I have seen this result in an NPE. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
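The usual fix for this pattern is to read the volatile field into a local variable so the null check and the return see the same value. A stand-alone sketch of both versions, with {{Object}} standing in for {{NNProxyInfo}} and a trivial {{changeProxy()}}:

```java
// Minimal model of the volatile-read race and its standard fix.
// ProxyCache is a simplified stand-in for ObserverReadProxyProvider.
class ProxyCache {
    private volatile Object currentProxy;

    /** Racy version: the field is read twice, so a concurrent
     *  changeProxy() between the check and the return can make the
     *  returned value stale -- or, in the worst case, null. */
    Object getCurrentProxyRacy() {
        if (currentProxy == null) {
            changeProxy();
        }
        return currentProxy; // second volatile read
    }

    /** Fixed version: read the volatile once per decision and return
     *  the captured local, never a fresh re-read. */
    Object getCurrentProxySafe() {
        Object proxy = currentProxy;
        if (proxy == null) {
            changeProxy();
            proxy = currentProxy; // capture the freshly installed proxy
        }
        return proxy;
    }

    private void changeProxy() {
        currentProxy = new Object(); // simplified: always installs a proxy
    }
}
```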
[jira] [Created] (HDFS-14211) [Consistent Observer Reads] Allow for configurable "always msync" mode
Erik Krogen created HDFS-14211: -- Summary: [Consistent Observer Reads] Allow for configurable "always msync" mode Key: HDFS-14211 URL: https://issues.apache.org/jira/browse/HDFS-14211 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Erik Krogen To allow for reads to be serviced from an ObserverNode (see HDFS-12943) in a consistent way, an {{msync}} API was introduced (HDFS-13688) to allow for a client to fetch the latest transaction ID from the Active NN, thereby ensuring that subsequent reads from the ObserverNode will be up-to-date with the current state of the Active. Using this properly, however, requires application-side changes: for example, a NodeManager should call {{msync}} before localizing the resources for a client, since it received notification of the existence of those resources via communication that is out-of-band to HDFS, and thus could potentially attempt to localize them prior to the availability of those resources on the ObserverNode. Until such application-side changes can be made, which will be a longer-term effort, we need to provide a mechanism for unchanged clients to utilize the ObserverNode without exposing such a client to inconsistencies. This is essentially phase 3 of the roadmap outlined in the [design document|https://issues.apache.org/jira/secure/attachment/12915990/ConsistentReadsFromStandbyNode.pdf] for HDFS-12943. The design document proposes some heuristics based on understanding of how common applications (e.g. MR) use HDFS for resources. As an initial pass, we can simply have a flag which tells a client to call {{msync}} before _every single_ read operation. This may seem counterintuitive, as it turns every read operation into two RPCs: an {{msync}} to the Active followed by an actual read operation to the Observer.
However, the {{msync}} operation is extremely lightweight, as it does not acquire the {{FSNamesystemLock}}, and in experiments we have found that this approach can easily scale to well over 100,000 {{msync}} operations per second on the Active (while still servicing approx. 10,000 write op/s). Combined with the fast-path edit log tailing for standby/observer nodes (HDFS-13150), this "always msync" approach should introduce only a few ms of extra latency to each read call. Below are results from experiments which convert a normal RPC workload into one in which all read operations are turned into an {{msync}}. The baseline is a workload of 1.5k write op/s and 25k read op/s.
||Rate Multiplier|2|4|6|8||
||RPC Queue Avg Time (ms)|14.2|53.2|110.4|125.3||
||RPC Queue NumOps Avg (k)|51.4|102.3|147.8|177.9||
||RPC Queue NumOps Max (k)|148.8|269.5|306.3|312.4||
Results are promising up to between 4x and 6x of the baseline workload, which is approx. 100-150k read op/s. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
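In client terms the flag is simple: every read is prefixed with an {{msync}} to the Active. A toy model (the interfaces below are illustrative stand-ins, not Hadoop's actual {{ClientProtocol}}):

```java
// Toy model of "always msync": each read becomes two RPCs, an msync
// to the Active NN followed by the read against the Observer.
// Active and Observer are invented stand-in interfaces.
interface Active { long msync(); }                    // returns latest txid
interface Observer { String read(long minTxid, String path); }

class AlwaysMsyncClient {
    private final Active active;
    private final Observer observer;
    private final boolean alwaysMsync;
    private long lastSeenTxid;

    AlwaysMsyncClient(Active active, Observer observer, boolean alwaysMsync) {
        this.active = active;
        this.observer = observer;
        this.alwaysMsync = alwaysMsync;
    }

    String read(String path) {
        if (alwaysMsync) {
            // extra RPC, but cheap: msync does not take FSNamesystemLock
            lastSeenTxid = active.msync();
        }
        // the Observer serves the read only once caught up to lastSeenTxid
        return observer.read(lastSeenTxid, path);
    }
}
```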
[jira] [Resolved] (HDFS-14155) Update "Consistent Read from Observer" User Guide with Edit Tailing Frequency
[ https://issues.apache.org/jira/browse/HDFS-14155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-14155. Resolution: Duplicate > Update "Consistent Read from Observer" User Guide with Edit Tailing Frequency > - > > Key: HDFS-14155 > URL: https://issues.apache.org/jira/browse/HDFS-14155 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: documentation, hdfs >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > > Currently the user guide created in HDFS-14131 does not make any mention of > the recommendation for {{dfs.ha.tail-edits.period}}, but the default works > very poorly in combination with this feature. We should update the > documentation to reflect this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-14155) Update "Consistent Read from Observer" User Guide with Edit Tailing Frequency
Erik Krogen created HDFS-14155: -- Summary: Update "Consistent Read from Observer" User Guide with Edit Tailing Frequency Key: HDFS-14155 URL: https://issues.apache.org/jira/browse/HDFS-14155 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation, hdfs Reporter: Erik Krogen Assignee: Erik Krogen Currently the user guide created in HDFS-14131 does not make any mention of the recommendation for {{dfs.ha.tail-edits.period}}, but the default works very poorly in combination with this feature. We should update the documentation to reflect this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-13873) ObserverNode should reject read requests when it is too far behind.
[ https://issues.apache.org/jira/browse/HDFS-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-13873. Resolution: Fixed > ObserverNode should reject read requests when it is too far behind. > --- > > Key: HDFS-13873 > URL: https://issues.apache.org/jira/browse/HDFS-13873 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client, namenode >Affects Versions: HDFS-12943 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko >Priority: Major > Fix For: HDFS-12943 > > Attachments: HDFS-13873-HDFS-12943.001.patch, > HDFS-13873-HDFS-12943.002.patch, HDFS-13873-HDFS-12943.003.patch, > HDFS-13873-HDFS-12943.004.patch, HDFS-13873-HDFS-12943.005.patch > > > Add a server-side threshold for ObserverNode to reject read requests when it > is too far behind. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Reopened] (HDFS-14048) DFSOutputStream close() throws exception on subsequent call after DataNode restart
[ https://issues.apache.org/jira/browse/HDFS-14048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen reopened HDFS-14048: Re-opening for branch-2 commit. Sorry for the trouble [~elgoiri], I have just attached the branch-2 patch. Since I'm not sure if Jenkins will run properly given the branch-2 build issues, I also executed all of the following tests locally without any failures: {{TestClientProtocolForPipelineRecovery,TestDFSOutputStream,TestClientBlockVerification,TestDatanodeRestart}} > DFSOutputStream close() throws exception on subsequent call after DataNode > restart > -- > > Key: HDFS-14048 > URL: https://issues.apache.org/jira/browse/HDFS-14048 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.3.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Fix For: 3.0.4, 3.1.2, 3.3.0, 3.2.1 > > Attachments: HDFS-14048.000.patch > > > We recently discovered an issue in which, during a rolling upgrade, some jobs > were failing with exceptions like (sadly this is the whole stack trace): > {code} > java.io.IOException: A datanode is restarting: > DatanodeInfoWithStorage[1.1.1.1:71,BP-,DISK] > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:877) > {code} > with an earlier statement in the log like: > {code} > INFO [main] org.apache.hadoop.hdfs.DFSClient: A datanode is restarting: > DatanodeInfoWithStorage[1.1.1.1:71,BP-,DISK] > {code} > Strangely we did not see any other logs about the {{DFSOutputStream}} failing > after waiting for the DataNode restart. We eventually realized that in some > cases {{DFSOutputStream#close()}} may be called more than once, and that if > so, the {{IOException}} above is thrown on the _second_ call to {{close()}} > (this is even with HDFS-5335; prior to this it would have been thrown on all > calls to {{close()}} besides the first). 
> The problem is that in {{DataStreamer#createBlockOutputStream()}}, after the
> new output stream is created, it resets the error states:
> {code}
> errorState.resetInternalError();
> // remove all restarting nodes from failed nodes list
> failed.removeAll(restartingNodes);
> restartingNodes.clear();
> {code}
> But it forgets to clear {{lastException}}. When
> {{DFSOutputStream#closeImpl()}} is called a second time, this block is
> triggered:
> {code}
> if (isClosed()) {
>   LOG.debug("Closing an already closed stream. [Stream:{}, streamer:{}]",
>       closed, getStreamer().streamerClosed());
>   try {
>     getStreamer().getLastException().check(true);
> {code}
> The second time, {{isClosed()}} is true, so the exception checking occurs and
> the "Datanode is restarting" exception is thrown even though the stream has
> already been successfully closed.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-14048) DFSOutputStream close() throws exception on subsequent call after DataNode restart
Erik Krogen created HDFS-14048: -- Summary: DFSOutputStream close() throws exception on subsequent call after DataNode restart Key: HDFS-14048 URL: https://issues.apache.org/jira/browse/HDFS-14048 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.3.0 Reporter: Erik Krogen Assignee: Erik Krogen

We recently discovered an issue in which, during a rolling upgrade, some jobs were failing with exceptions like (sadly this is the whole stack trace):
{code}
java.io.IOException: A datanode is restarting: DatanodeInfoWithStorage[1.1.1.1:71,BP-,DISK]
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:877)
{code}
with an earlier statement in the log like:
{code}
INFO [main] org.apache.hadoop.hdfs.DFSClient: A datanode is restarting: DatanodeInfoWithStorage[1.1.1.1:71,BP-,DISK]
{code}
Strangely we did not see any other logs about the {{DFSOutputStream}} failing after waiting for the DataNode restart. We eventually realized that in some cases {{DFSOutputStream#close()}} may be called more than once, and that if so, the {{IOException}} above is thrown on the _second_ call to {{close()}} (this is even with HDFS-5335; prior to this it would have been thrown on all calls to {{close()}} besides the first). The problem is that in {{DataStreamer#createBlockOutputStream()}}, after the new output stream is created, it resets the error states:
{code}
errorState.resetInternalError();
// remove all restarting nodes from failed nodes list
failed.removeAll(restartingNodes);
restartingNodes.clear();
{code}
But it forgets to clear {{lastException}}. When {{DFSOutputStream#closeImpl()}} is called a second time, this block is triggered:
{code}
if (isClosed()) {
  LOG.debug("Closing an already closed stream. [Stream:{}, streamer:{}]",
      closed, getStreamer().streamerClosed());
  try {
    getStreamer().getLastException().check(true);
{code}
The second time, {{isClosed()}} is true, so the exception checking occurs and the "Datanode is restarting" exception is thrown even though the stream has already been successfully closed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
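A simplified model of the bug described above (names are invented for the sketch; the real fix touches {{DataStreamer}}): once restart recovery succeeds, the remembered exception must also be cleared so that a redundant {{close()}} on a successfully closed stream stays a no-op.

```java
// Simplified model of the double-close bug: a streamer that remembers
// a transient "DataNode restarting" exception. Clearing the remembered
// exception on successful recovery (the recoverFromRestart(true) path)
// mirrors what createBlockOutputStream() should do with lastException.
class MiniStream {
    private Exception lastException;
    private boolean closed;

    void recordRestart() {
        lastException = new java.io.IOException("A datanode is restarting");
    }

    /** Called when a new block output stream is successfully created. */
    void recoverFromRestart(boolean clearLastException) {
        if (clearLastException) {
            lastException = null; // the step missing in the original bug
        }
    }

    void close() throws Exception {
        if (closed) {
            // second close: re-checks the remembered exception, which is
            // where the stale "restarting" exception escapes in the bug
            if (lastException != null) {
                throw lastException;
            }
            return;
        }
        closed = true;
    }
}
```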
[jira] [Created] (HDFS-14034) Support getQuotaUsage API in WebHDFS
Erik Krogen created HDFS-14034: -- Summary: Support getQuotaUsage API in WebHDFS Key: HDFS-14034 URL: https://issues.apache.org/jira/browse/HDFS-14034 Project: Hadoop HDFS Issue Type: Improvement Components: fs, webhdfs Reporter: Erik Krogen Assignee: Erik Krogen HDFS-8898 added support for a new API, {{getQuotaUsage}} which can fetch quota usage on a directory with significantly lower impact than the similar {{getContentSummary}}. This JIRA is to track adding support for this API to WebHDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13977) NameNode can kill itself if it tries to send too many txns to a QJM simultaneously
Erik Krogen created HDFS-13977: -- Summary: NameNode can kill itself if it tries to send too many txns to a QJM simultaneously Key: HDFS-13977 URL: https://issues.apache.org/jira/browse/HDFS-13977 Project: Hadoop HDFS Issue Type: Bug Components: namenode, qjm Affects Versions: 2.7.7 Reporter: Erik Krogen Assignee: Erik Krogen

h3. Problem & Logs

We recently encountered an issue on a large cluster (running 2.7.4) in which the NameNode killed itself because it was unable to communicate with the JNs via QJM. We discovered that it was the result of the NameNode trying to send a huge batch of over 1 million transactions to the JNs in a single RPC:
{code:title=NameNode Logs}
WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Remote journal X.X.X.X: failed to write txns 1000-11153636. Will try to write to this JN again after the next log roll.
...
WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Took 1098ms to send a batch of 1153637 edits (335886611 bytes) to remote journal X.X.X.X:
{code}
{code:title=JournalNode Logs}
INFO org.apache.hadoop.ipc.Server: Socket Reader #1 for port 8485: readAndProcess from client X.X.X.X threw exception [java.io.IOException: Requested data length 335886776 is longer than maximum configured RPC length 67108864. RPC came from X.X.X.X]
java.io.IOException: Requested data length 335886776 is longer than maximum configured RPC length 67108864. RPC came from X.X.X.X
  at org.apache.hadoop.ipc.Server$Connection.checkDataLength(Server.java:1610)
  at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1672)
  at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:897)
  at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:753)
  at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:724)
{code}
The JournalNodes rejected the RPC because it had a size well over the 64MB default {{ipc.maximum.data.length}}.
This was triggered by a huge number of files all hitting a hard lease timeout simultaneously, causing the NN to force-close them all at once. This can be a particularly nasty bug as the NN will attempt to re-send this same huge RPC on restart, as it loads an fsimage which still has all of these open files that need to be force-closed.

h3. Proposed Solution

To solve this we propose to modify {{EditsDoubleBuffer}} to add a "hard limit" based on the value of {{ipc.maximum.data.length}}. When {{writeOp()}} or {{writeRaw()}} is called, first check the size of {{bufCurrent}}. If it exceeds the hard limit, block the writer until the buffer is flipped and {{bufCurrent}} becomes {{bufReady}}. This gives some self-throttling to prevent the NameNode from killing itself in this way. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
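A minimal sketch of the proposed throttle, assuming a toy string-based buffer in place of the real serialized edit buffers (all names are hypothetical): the writer is refused once {{bufCurrent}} would exceed the hard limit, until the buffers are flipped. The real proposal would block the writer at that point rather than return a flag.

```java
// Sketch of self-throttling for a double buffer. In the real proposal
// the hard limit derives from ipc.maximum.data.length and writeOp()
// blocks until the flip; here writeOp() just reports refusal so the
// idea stays single-threaded and testable.
class ThrottledDoubleBuffer {
    private StringBuilder bufCurrent = new StringBuilder();
    private StringBuilder bufReady = new StringBuilder();
    private final int hardLimitBytes;

    ThrottledDoubleBuffer(int hardLimitBytes) {
        this.hardLimitBytes = hardLimitBytes;
    }

    /** Returns false when the write must wait for setReadyToFlush(). */
    boolean writeOp(String op) {
        if (bufCurrent.length() + op.length() > hardLimitBytes) {
            return false; // at the hard limit: caller must wait for a flip
        }
        bufCurrent.append(op);
        return true;
    }

    /** Swap buffers: bufCurrent becomes bufReady for sending to JNs. */
    void setReadyToFlush() {
        StringBuilder tmp = bufReady;
        bufReady = bufCurrent;
        bufCurrent = tmp;
        bufCurrent.setLength(0);
    }
}
```

The key property is that no single flush batch can grow past the limit, so the RPC to the JournalNodes can never exceed what they will accept.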
[jira] [Resolved] (HDFS-13930) Fix crlf line endings in HDFS-12943 branch
[ https://issues.apache.org/jira/browse/HDFS-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-13930. Resolution: Fixed > Fix crlf line endings in HDFS-12943 branch > -- > > Key: HDFS-13930 > URL: https://issues.apache.org/jira/browse/HDFS-13930 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-13930-HDFS-12943.000.patch, > branch-HDFS-12943-before.pdf > > > One of the merge commits introduced the wrong line endings to some {{*.cmd}} > files. Looks like it was commit {{1363eff69c3}} that broke it. > The tree is: > {code} > * | 1363eff69c3 2018-09-17 Merge commit > '9af96d4ed4b6f80d3ca53a2b003d2ef768650dd4' into HDFS-12943 [Konstantin V > Shvachko ] > |\ \ > | |/ > | * 9af96d4ed4b 2018-09-05 HADOOP-15707. Add IsActiveServlet to be used for > Load Balancers. Contributed by Lukas Majercak. [Giovanni Matteo Fumarola > ] > * | 94d7f90e93b 2018-09-17 Merge commit > 'e780556ae9229fe7a90817eb4e5449d7eed35dd8' into HDFS-12943 [Konstantin V > Shvachko ] > {code} > So that merge commit should have only introduced a single new commit > {{9af96d4ed4b}}. But: > {code} > ± git show --stat 9af96d4ed4b | cat > commit 9af96d4ed4b6f80d3ca53a2b003d2ef768650dd4 > Author: Giovanni Matteo Fumarola > Date: Wed Sep 5 10:50:25 2018 -0700 > HADOOP-15707. Add IsActiveServlet to be used for Load Balancers. > Contributed by Lukas Majercak. 
> .../org/apache/hadoop/http/IsActiveServlet.java| 71 > .../apache/hadoop/http/TestIsActiveServlet.java| 95 > ++ > .../federation/router/IsRouterActiveServlet.java | 37 + > .../server/federation/router/RouterHttpServer.java | 9 ++ > .../src/site/markdown/HDFSRouterFederation.md | 2 +- > .../server/namenode/IsNameNodeActiveServlet.java | 33 > .../hdfs/server/namenode/NameNodeHttpServer.java | 3 + > .../site/markdown/HDFSHighAvailabilityWithQJM.md | 8 ++ > .../IsResourceManagerActiveServlet.java| 38 + > .../server/resourcemanager/ResourceManager.java| 5 ++ > .../resourcemanager/webapp/RMWebAppFilter.java | 3 +- > .../src/site/markdown/ResourceManagerHA.md | 5 ++ > 12 files changed, 307 insertions(+), 2 deletions(-) > {code} > that commit has no changes to the cmd, whereas the merge commit does: > {code} > ± git show --stat 1363eff69c3 | cat > commit 1363eff69c36c4f2085194b59a86370505cc00cd > Merge: 94d7f90e93b 9af96d4ed4b > Author: Konstantin V Shvachko > Date: Mon Sep 17 17:39:11 2018 -0700 > Merge commit '9af96d4ed4b6f80d3ca53a2b003d2ef768650dd4' into HDFS-12943 > # Conflicts: > # > hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md > .../hadoop-common/src/main/bin/start-all.cmd | 104 > ++--- > .../hadoop-common/src/main/bin/stop-all.cmd| 104 > ++--- > .../org/apache/hadoop/http/IsActiveServlet.java| 71 ++ > .../apache/hadoop/http/TestIsActiveServlet.java| 95 +++ > .../federation/router/IsRouterActiveServlet.java | 37 > .../server/federation/router/RouterHttpServer.java | 9 ++ > .../src/site/markdown/HDFSRouterFederation.md | 2 +- > .../hadoop-hdfs/src/main/bin/hdfs-config.cmd | 86 - > .../hadoop-hdfs/src/main/bin/start-dfs.cmd | 82 > .../hadoop-hdfs/src/main/bin/stop-dfs.cmd | 82 > .../server/namenode/IsNameNodeActiveServlet.java | 33 +++ > .../hdfs/server/namenode/NameNodeHttpServer.java | 3 + > .../site/markdown/HDFSHighAvailabilityWithQJM.md | 8 ++ > hadoop-mapreduce-project/bin/mapred-config.cmd | 86 - > 
hadoop-tools/hadoop-streaming/src/test/bin/cat.cmd | 36 +++ > .../hadoop-streaming/src/test/bin/xargs_cat.cmd| 36 +++ > hadoop-yarn-project/hadoop-yarn/bin/start-yarn.cmd | 94 +-- > hadoop-yarn-project/hadoop-yarn/bin/stop-yarn.cmd | 94 +-- > .../IsResourceManagerActiveServlet.java| 38 > .../server/resourcemanager/ResourceManager.java| 5 + > .../resourcemanager/webapp/RMWebAppFilter.java | 3 +- > .../src/site/markdown/ResourceManagerHA.md | 5 + > 22 files changed, 709 insertions(+), 404 deletions(-) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13930) Fix crlf line endings in HDFS-12943 branch
Erik Krogen created HDFS-13930: -- Summary: Fix crlf line endings in HDFS-12943 branch Key: HDFS-13930 URL: https://issues.apache.org/jira/browse/HDFS-13930 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen Assignee: Erik Krogen One of the merge commits introduced the wrong line endings to some {{*.cmd}} files. Looks like it was commit {{1363eff69c3}} that broke it. The tree is: {code} * | 1363eff69c3 2018-09-17 Merge commit '9af96d4ed4b6f80d3ca53a2b003d2ef768650dd4' into HDFS-12943 [Konstantin V Shvachko ] |\ \ | |/ | * 9af96d4ed4b 2018-09-05 HADOOP-15707. Add IsActiveServlet to be used for Load Balancers. Contributed by Lukas Majercak. [Giovanni Matteo Fumarola ] * | 94d7f90e93b 2018-09-17 Merge commit 'e780556ae9229fe7a90817eb4e5449d7eed35dd8' into HDFS-12943 [Konstantin V Shvachko ] {code} So that merge commit should have only introduced a single new commit {{9af96d4ed4b}}. But: {code} ± git show --stat 9af96d4ed4b | cat commit 9af96d4ed4b6f80d3ca53a2b003d2ef768650dd4 Author: Giovanni Matteo Fumarola Date: Wed Sep 5 10:50:25 2018 -0700 HADOOP-15707. Add IsActiveServlet to be used for Load Balancers. Contributed by Lukas Majercak. 
.../org/apache/hadoop/http/IsActiveServlet.java| 71 .../apache/hadoop/http/TestIsActiveServlet.java| 95 ++ .../federation/router/IsRouterActiveServlet.java | 37 + .../server/federation/router/RouterHttpServer.java | 9 ++ .../src/site/markdown/HDFSRouterFederation.md | 2 +- .../server/namenode/IsNameNodeActiveServlet.java | 33 .../hdfs/server/namenode/NameNodeHttpServer.java | 3 + .../site/markdown/HDFSHighAvailabilityWithQJM.md | 8 ++ .../IsResourceManagerActiveServlet.java| 38 + .../server/resourcemanager/ResourceManager.java| 5 ++ .../resourcemanager/webapp/RMWebAppFilter.java | 3 +- .../src/site/markdown/ResourceManagerHA.md | 5 ++ 12 files changed, 307 insertions(+), 2 deletions(-) {code} that commit has no changes to the cmd, whereas the merge commit does: {code} ± git show --stat 1363eff69c3 | cat commit 1363eff69c36c4f2085194b59a86370505cc00cd Merge: 94d7f90e93b 9af96d4ed4b Author: Konstantin V Shvachko Date: Mon Sep 17 17:39:11 2018 -0700 Merge commit '9af96d4ed4b6f80d3ca53a2b003d2ef768650dd4' into HDFS-12943 # Conflicts: # hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSHighAvailabilityWithQJM.md .../hadoop-common/src/main/bin/start-all.cmd | 104 ++--- .../hadoop-common/src/main/bin/stop-all.cmd| 104 ++--- .../org/apache/hadoop/http/IsActiveServlet.java| 71 ++ .../apache/hadoop/http/TestIsActiveServlet.java| 95 +++ .../federation/router/IsRouterActiveServlet.java | 37 .../server/federation/router/RouterHttpServer.java | 9 ++ .../src/site/markdown/HDFSRouterFederation.md | 2 +- .../hadoop-hdfs/src/main/bin/hdfs-config.cmd | 86 - .../hadoop-hdfs/src/main/bin/start-dfs.cmd | 82 .../hadoop-hdfs/src/main/bin/stop-dfs.cmd | 82 .../server/namenode/IsNameNodeActiveServlet.java | 33 +++ .../hdfs/server/namenode/NameNodeHttpServer.java | 3 + .../site/markdown/HDFSHighAvailabilityWithQJM.md | 8 ++ hadoop-mapreduce-project/bin/mapred-config.cmd | 86 - hadoop-tools/hadoop-streaming/src/test/bin/cat.cmd | 36 +++ .../hadoop-streaming/src/test/bin/xargs_cat.cmd| 36 
+++ hadoop-yarn-project/hadoop-yarn/bin/start-yarn.cmd | 94 +-- hadoop-yarn-project/hadoop-yarn/bin/stop-yarn.cmd | 94 +-- .../IsResourceManagerActiveServlet.java| 38 .../server/resourcemanager/ResourceManager.java| 5 + .../resourcemanager/webapp/RMWebAppFilter.java | 3 +- .../src/site/markdown/ResourceManagerHA.md | 5 + 22 files changed, 709 insertions(+), 404 deletions(-) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13904) ContentSummary does not always respect processing limit, resulting in long lock acquisitions
Erik Krogen created HDFS-13904: -- Summary: ContentSummary does not always respect processing limit, resulting in long lock acquisitions Key: HDFS-13904 URL: https://issues.apache.org/jira/browse/HDFS-13904 Project: Hadoop HDFS Issue Type: Bug Components: hdfs, namenode Reporter: Erik Krogen Assignee: Erik Krogen HDFS-4995 added a config {{dfs.content-summary.limit}} which allows for an administrator to set a limit on the number of entries processed during a single acquisition of the {{FSNamesystemLock}} during the creation of a content summary. This is useful to prevent very long (multiple seconds) pauses on the NameNode when {{getContentSummary}} is called on large directories. However, even on versions with HDFS-4995, we have seen warnings like:
{code}
INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem read lock held for 9398 ms via
java.lang.Thread.getStackTrace(Thread.java:1552)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:950)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.readUnlock(FSNamesystemLock.java:188)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.readUnlock(FSNamesystem.java:1486)
org.apache.hadoop.hdfs.server.namenode.ContentSummaryComputationContext.yield(ContentSummaryComputationContext.java:109)
org.apache.hadoop.hdfs.server.namenode.INodeDirectory.computeDirectoryContentSummary(INodeDirectory.java:679)
org.apache.hadoop.hdfs.server.namenode.INodeDirectory.computeContentSummary(INodeDirectory.java:642)
org.apache.hadoop.hdfs.server.namenode.INodeDirectory.computeDirectoryContentSummary(INodeDirectory.java:656)
{code}
happen quite consistently when {{getContentSummary}} was called on a large directory on a heavily-loaded NameNode. Such long pauses completely destroy the performance of the NameNode. We have the limit set to its default of 5000; if it was respected, clearly there would not be a 10-second pause.
The current {{yield()}} code within {{ContentSummaryComputationContext}} looks like:
{code}
public boolean yield() {
  // Are we set up to do this?
  if (limitPerRun <= 0 || dir == null || fsn == null) {
    return false;
  }

  // Have we reached the limit?
  long currentCount = counts.getFileCount() +
      counts.getSymlinkCount() +
      counts.getDirectoryCount() +
      counts.getSnapshotableDirectoryCount();
  if (currentCount <= nextCountLimit) {
    return false;
  }

  // Update the next limit
  nextCountLimit = currentCount + limitPerRun;

  boolean hadDirReadLock = dir.hasReadLock();
  boolean hadDirWriteLock = dir.hasWriteLock();
  boolean hadFsnReadLock = fsn.hasReadLock();
  boolean hadFsnWriteLock = fsn.hasWriteLock();

  // sanity check.
  if (!hadDirReadLock || !hadFsnReadLock || hadDirWriteLock ||
      hadFsnWriteLock || dir.getReadHoldCount() != 1 ||
      fsn.getReadHoldCount() != 1) {
    // cannot relinquish
    return false;
  }

  // unlock
  dir.readUnlock();
  fsn.readUnlock("contentSummary");

  try {
    Thread.sleep(sleepMilliSec, sleepNanoSec);
  } catch (InterruptedException ie) {
  } finally {
    // reacquire
    fsn.readLock();
    dir.readLock();
  }

  yieldCount++;
  return true;
}
{code}
We believe that this check in particular is the culprit:
{code}
if (!hadDirReadLock || !hadFsnReadLock || hadDirWriteLock ||
    hadFsnWriteLock || dir.getReadHoldCount() != 1 ||
    fsn.getReadHoldCount() != 1) {
  // cannot relinquish
  return false;
}
{code}
The content summary computation will only relinquish the lock if it is currently the _only_ holder of the lock. Given the high volume of read requests on a heavily loaded NameNode, especially when unfair locking is enabled, it is likely that another holder of the read lock is performing some short-lived operation. By refusing to give up the lock in this case, the content summary computation ends up never relinquishing the lock. We propose to simply remove the readHoldCount checks from this {{yield()}}. 
This should alleviate the case described above by giving up the read lock and allowing other short-lived operations to complete (while the content summary thread sleeps) so that the lock can finally be given up completely. This has the drawback that the content summary may sometimes give up its hold unnecessarily, if the read lock is still held by other operations by the time the thread continues again. The only negative impact from this is to make some large content summary operations slightly slower, with the tradeoff of reducing NameNode-wide performance impact. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail:
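To make the proposal concrete, here is a minimal, self-contained sketch (hypothetical class and method names; a plain {{ReentrantReadWriteLock}} stands in for the real {{FSNamesystemLock}} and {{FSDirectory}} locks) of a yield that relinquishes the lock without the readHoldCount sanity checks:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Simplified, hypothetical stand-in for ContentSummaryComputationContext#yield(). */
class YieldSketch {
  private final ReentrantReadWriteLock fsnLock = new ReentrantReadWriteLock();
  private final long limitPerRun = 5000;   // dfs.content-summary.limit default
  private long nextCountLimit = limitPerRun;
  private long currentCount = 0;
  long yieldCount = 0;

  /** Caller must hold the read lock; returns true if it was released and reacquired. */
  private boolean yieldIfNeeded() {
    if (currentCount <= nextCountLimit) {
      return false;                        // limit not reached yet
    }
    nextCountLimit = currentCount + limitPerRun;
    // Proposed change: no getReadHoldCount() != 1 sanity check here, so the
    // lock is relinquished even when other short-lived readers also hold it.
    fsnLock.readLock().unlock();
    try {
      Thread.sleep(1);                     // let queued writers/readers proceed
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();  // preserve interrupt status
    } finally {
      fsnLock.readLock().lock();           // reacquire before continuing the walk
    }
    yieldCount++;
    return true;
  }

  /** Simulate processing a batch of namespace entries under the read lock. */
  void process(long entries) {
    fsnLock.readLock().lock();
    try {
      currentCount += entries;
      yieldIfNeeded();
    } finally {
      fsnLock.readLock().unlock();
    }
  }
}
```

With the check removed, a walk that crosses the per-run limit always yields, regardless of how many other readers are active at that moment.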
[jira] [Resolved] (HDFS-13872) Only some protocol methods should perform msync wait
[ https://issues.apache.org/jira/browse/HDFS-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-13872. Resolution: Duplicate Closing in favor of HDFS-13880 > Only some protocol methods should perform msync wait > > > Key: HDFS-13872 > URL: https://issues.apache.org/jira/browse/HDFS-13872 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-13872-HDFS-12943.000.patch > > > Currently the implementation of msync added in HDFS-13767 waits until the > server has caught up to the client-specified transaction ID regardless of > what the inbound RPC is. This particularly causes problems for > ObserverReadProxyProvider (see HDFS-13779) when we try to fetch the state > from an observer/standby; this should be a quick operation, but it has to > wait for the node to catch up to the most current state. I initially thought > all {{HAServiceProtocol}} methods should thus be excluded from the wait > period, but actually I think the right approach is that _only_ > {{ClientProtocol}} methods should be subjected to the wait period. I propose > that we can do this via an annotation on client protocol which can then be > checked within {{ipc.Server}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13872) Only ClientProtocol should perform msync wait
Erik Krogen created HDFS-13872: -- Summary: Only ClientProtocol should perform msync wait Key: HDFS-13872 URL: https://issues.apache.org/jira/browse/HDFS-13872 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen Currently the implementation of msync added in HDFS-13767 waits until the server has caught up to the client-specified transaction ID regardless of what the inbound RPC is. This particularly causes problems for ObserverReadProxyProvider (see HDFS-13779) when we try to fetch the state from an observer/standby; this should be a quick operation, but it has to wait for the node to catch up to the most current state. I initially thought all {{HAServiceProtocol}} methods should thus be excluded from the wait period, but actually I think the right approach is that _only_ {{ClientProtocol}} methods should be subjected to the wait period. I propose that we can do this via an annotation on client protocol which can then be checked within {{ipc.Server}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
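The annotation-plus-check approach described above can be sketched as follows. All names here ({{RequiresMsyncWait}}, the Fake* interfaces) are invented for illustration; the real annotation and where it lives were to be settled in the patch itself:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical marker: protocols carrying it are subject to the msync wait. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface RequiresMsyncWait {}

/** ClientProtocol-like interface: annotated, so its RPCs wait for the txid. */
@RequiresMsyncWait
interface FakeClientProtocol { void getListing(); }

/** HAServiceProtocol-like interface: not annotated, so never held back. */
interface FakeHAServiceProtocol { void monitorHealth(); }

class MsyncGate {
  /** The kind of check ipc.Server could perform before applying the wait period. */
  static boolean shouldWait(Class<?> protocol) {
    return protocol.isAnnotationPresent(RequiresMsyncWait.class);
  }
}
```

The server-side dispatcher would consult {{shouldWait}} on the target protocol class and skip the catch-up wait entirely for unannotated protocols such as the HA health checks.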
[jira] [Resolved] (HDFS-12421) Balancer to emit standard metrics
[ https://issues.apache.org/jira/browse/HDFS-12421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-12421. Resolution: Duplicate Just noticed this is a dup of HDFS-10648 > Balancer to emit standard metrics > - > > Key: HDFS-12421 > URL: https://issues.apache.org/jira/browse/HDFS-12421 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover, metrics >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Minor > > The Balancer currently prints some statistics about its operation to stdout > while it is running. This is fine if the balancer is manually run via CLI by > an operator, but for the more common case of it being a scheduled execution, > it is cumbersome to have to track down the logs to be able to monitor its > progress. > We already have a standard metrics system in place; I propose that we have > the Balancer emit metrics while it is running so that they can be tracked via > standard metrics infrastructure. We can start with just the things that the > balancer already prints to stdout: bytes already moved, bytes left to move, > bytes currently being moved, and iteration number. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13791) Limit logging frequency of edit tail related statements
Erik Krogen created HDFS-13791: -- Summary: Limit logging frequency of edit tail related statements Key: HDFS-13791 URL: https://issues.apache.org/jira/browse/HDFS-13791 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs, qjm Reporter: Erik Krogen Assignee: Erik Krogen There are a number of log statements that occur every time new edits are tailed by a Standby NameNode. When edits are tailing only on the order of every tens of seconds, this is fine. With the work in HDFS-13150, however, edits may be tailed every few milliseconds, which can flood the logs with tailing-related statements. We should throttle it to limit it to printing at most, say, once per 5 seconds. We can implement logic similar to that used in HDFS-10713. This may be slightly more tricky since the log statements are distributed across a few classes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
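The throttling logic described above can be sketched as a small helper (a hypothetical class, not the actual HDFS-10713 implementation) that lets a statement through at most once per interval, safely across threads:

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: allow a log statement through at most once per interval. */
class LogThrottle {
  private final long intervalMs;
  private final AtomicLong lastLoggedMs = new AtomicLong(Long.MIN_VALUE);

  LogThrottle(long intervalMs) {
    this.intervalMs = intervalMs;
  }

  /** Returns true if the caller should emit the statement now. */
  boolean shouldLog(long nowMs) {
    long last = lastLoggedMs.get();
    if (last != Long.MIN_VALUE && nowMs - last < intervalMs) {
      return false;                              // still inside the quiet window
    }
    // Only one thread wins the CAS per window, so at most one line is printed.
    return lastLoggedMs.compareAndSet(last, nowMs);
  }
}
```

Each edit-tail log site would hold one such throttle (or a shared one, since the statements span a few classes) and guard its INFO statement with {{shouldLog(System.currentTimeMillis())}}.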
[jira] [Created] (HDFS-13789) Reduce logging frequency of QuorumJournalManager#selectInputStreams
Erik Krogen created HDFS-13789: -- Summary: Reduce logging frequency of QuorumJournalManager#selectInputStreams Key: HDFS-13789 URL: https://issues.apache.org/jira/browse/HDFS-13789 Project: Hadoop HDFS Issue Type: Improvement Components: namenode, qjm Affects Versions: HDFS-12943 Reporter: Erik Krogen Assignee: Erik Krogen As part of HDFS-13150, a logging statement was added to indicate whenever an edit tail is performed via the RPC mechanism. To enable low latency tailing, the tail frequency must be set very low, so this log statement gets printed much too frequently at an INFO level. We should decrease to DEBUG. Note that if there are actually edits available to tail, other log messages will get printed; this is just targeting the case when it attempts to tail and there are no new edits. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-13150) [Edit Tail Fast Path] Allow SbNN to tail in-progress edits from JN via RPC
[ https://issues.apache.org/jira/browse/HDFS-13150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-13150. Resolution: Fixed Fix Version/s: HDFS-12943 Closing this as all sub-issues (HDFS-13607, HDFS-13608, HDFS-13609, HDFS-13610) have been completed. Thanks to all who helped with this new feature! > [Edit Tail Fast Path] Allow SbNN to tail in-progress edits from JN via RPC > -- > > Key: HDFS-13150 > URL: https://issues.apache.org/jira/browse/HDFS-13150 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, hdfs, journal-node, namenode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Fix For: HDFS-12943 > > Attachments: edit-tailing-fast-path-design-v0.pdf, > edit-tailing-fast-path-design-v1.pdf, edit-tailing-fast-path-design-v2.pdf > > > In the interest of making coordinated/consistent reads easier to complete > with low latency, it is advantageous to reduce the time between when a > transaction is applied on the ANN and when it is applied on the SbNN. We > propose adding a new "fast path" which can be used to tail edits when low > latency is desired. We leave the existing tailing logic in place, and fall > back to this path on startup, recovery, and when the fast path encounters > unrecoverable errors. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13689) NameNodeRpcServer getEditsFromTxid assumes it is run on active NameNode
Erik Krogen created HDFS-13689: -- Summary: NameNodeRpcServer getEditsFromTxid assumes it is run on active NameNode Key: HDFS-13689 URL: https://issues.apache.org/jira/browse/HDFS-13689 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs, namenode Reporter: Erik Krogen Assignee: Erik Krogen {{NameNodeRpcServer#getEditsFromTxid}} currently decides which transactions are able to be served, i.e. which transactions are durable, using the following logic:
{code}
long syncTxid = log.getSyncTxId();
// If we haven't synced anything yet, we can only read finalized
// segments since we can't reliably determine which txns in in-progress
// segments have actually been committed (e.g. written to a quorum of JNs).
// If we have synced txns, we can definitely read up to syncTxid since
// syncTxid is only updated after a transaction is committed to all
// journals. (In-progress segments written by old writers are already
// discarded for us, so if we read any in-progress segments they are
// guaranteed to have been written by this NameNode.)
boolean readInProgress = syncTxid > 0;
{code}
This assumes that the NameNode serving this request is the current writer/active NameNode, which may not be true in the ObserverNode situation. Since {{selectInputStreams}} now has an {{onlyDurableTxns}} flag, which, if enabled, will only return durable/committed transactions, we can instead leverage this to provide the same functionality. We should utilize this to avoid consistency issues when serving this request from the ObserverNode. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
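For illustration only (this is not the actual {{selectInputStreams}} implementation): under quorum journaling, a transaction counts as durable/committed once a majority of JournalNodes have acknowledged it, which can be computed from the per-JN highest-acked txids like so:

```java
import java.util.Arrays;

/** Sketch: highest transaction durable on a quorum of JournalNodes. */
class QuorumCommit {
  /** ackedTxIds[i] = highest txid acknowledged by JournalNode i. */
  static long committedTxId(long[] ackedTxIds) {
    long[] sorted = ackedTxIds.clone();
    Arrays.sort(sorted);
    // A txid is durable once a majority of JNs have it; with the acks sorted
    // ascending, that is the value at index (n - majority).
    int majority = sorted.length / 2 + 1;
    return sorted[sorted.length - majority];
  }
}
```

Serving only edits up to this committed txid is safe on any node, active or observer, because such edits can never be lost in a failover.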
[jira] [Created] (HDFS-13610) [Edit Tail Fast Path Pt 4] Cleanup: integration test, documentation, remove unnecessary dummy sync
Erik Krogen created HDFS-13610: -- Summary: [Edit Tail Fast Path Pt 4] Cleanup: integration test, documentation, remove unnecessary dummy sync Key: HDFS-13610 URL: https://issues.apache.org/jira/browse/HDFS-13610 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, journal-node, namenode Reporter: Erik Krogen Assignee: Erik Krogen See HDFS-13150 for full design. This JIRA is targeted at cleanup tasks:
* Add in integration testing. We can expand {{TestStandbyInProgressTail}}
* Documentation in HDFSHighAvailabilityWithQJM
* Remove the dummy sync added as part of HDFS-10519; it is unnecessary since now in-progress tailing does not rely as heavily on the JN committedTxnId
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13609) [Edit Tail Fast Path Pt 3] NameNode-side changes to support tailing edits via RPC
Erik Krogen created HDFS-13609: -- Summary: [Edit Tail Fast Path Pt 3] NameNode-side changes to support tailing edits via RPC Key: HDFS-13609 URL: https://issues.apache.org/jira/browse/HDFS-13609 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, namenode Reporter: Erik Krogen Assignee: Erik Krogen See HDFS-13150 for the full design. This JIRA is targeted at the NameNode-side changes to enable tailing in-progress edits via the RPC mechanism added in HDFS-13608. Most changes are in the QuorumJournalManager. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13608) [Edit Tail Fast Path Pt 2] Add ability for JournalNode to serve edits via RPC
Erik Krogen created HDFS-13608: -- Summary: [Edit Tail Fast Path Pt 2] Add ability for JournalNode to serve edits via RPC Key: HDFS-13608 URL: https://issues.apache.org/jira/browse/HDFS-13608 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Erik Krogen Assignee: Erik Krogen See HDFS-13150 for full design. This JIRA is to make the JournalNode-side changes necessary to support serving edits via RPC. This includes interacting with the cache added in HDFS-13607. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-13125) Improve efficiency of JN -> Standby Pipeline Under Frequent Edit Tailing
[ https://issues.apache.org/jira/browse/HDFS-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-13125. Resolution: Duplicate This was subsumed by HDFS-13150 > Improve efficiency of JN -> Standby Pipeline Under Frequent Edit Tailing > > > Key: HDFS-13125 > URL: https://issues.apache.org/jira/browse/HDFS-13125 > Project: Hadoop HDFS > Issue Type: Improvement > Components: journal-node, namenode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > > The current edit tailing pipeline is designed for > * High resiliency > * High throughput > and was _not_ designed for low latency. > It was designed under the assumption that each edit log segment would > typically be read all at once, e.g. on startup or the SbNN tailing the entire > thing after it is finalized. The ObserverNode should be reading constantly > from the JournalNodes' in-progress edit logs with low latency, to reduce the > lag time from when a transaction is committed on the ANN and when it is > visible on the ObserverNode. > Due to the critical nature of this pipeline to the health of HDFS, it would > be better not to redesign it altogether. Based on some experiments it seems > if we mitigate the following issues, lag times are reduced to low levels (low > hundreds of milliseconds even under very high write load): > * The overhead of creating a new HTTP connection for each time new edits are > fetched. This makes sense when you're expecting to tail an entire segment; it > does not when you may only be fetching a small number of edits. We can > mitigate this by allowing edits to be tailed via an RPC call, or by adding a > connection pool for the existing connections to the journal. > * The overhead of transmitting a whole file at once. Right now when an edit > segment is requested, the JN sends the entire segment, and on the SbNN it > will ignore edits up to the ones it wants. 
How to solve this one may be more > tricky, but one suggestion would be to keep recently logged edits in memory, > avoiding the need to serve them from file at all, allowing the JN to quickly > serve only the required edits. > We can implement these as optimizations on top of the existing logic, with > fallbacks to the current slow-but-resilient pipeline. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13602) Optimize checkOperation(WRITE) check in FSNamesystem getBlockLocations
Erik Krogen created HDFS-13602: -- Summary: Optimize checkOperation(WRITE) check in FSNamesystem getBlockLocations Key: HDFS-13602 URL: https://issues.apache.org/jira/browse/HDFS-13602 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Erik Krogen Assignee: Chao Sun Similar to the work done in HDFS-4591 to avoid having to take a write lock before checking if an operation category is allowed, we can do the same for the write lock that is taken sometimes (when updating access time) within getBlockLocations. This is particularly useful when using the standby read feature (HDFS-12943), as it will be the case on an observer node that the operationCategory(READ) check succeeds but the operationCategory(WRITE) check fails. It would be ideal to fail this check _before_ acquiring the write lock. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
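A minimal sketch of the proposed ordering, with hypothetical names (the real code lives in {{FSNamesystem#getBlockLocations}} and throws a standby-specific exception), showing the WRITE-category check failing before the write lock is ever taken:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Sketch: fail the WRITE-category check before acquiring the write lock. */
class AccessTimeUpdateSketch {
  enum Category { READ, WRITE }

  private final boolean observer;
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  boolean lockWasTaken = false;

  AccessTimeUpdateSketch(boolean observer) {
    this.observer = observer;
  }

  /** Stand-in for checkOperation: an observer cannot serve WRITE operations. */
  private void checkOperation(Category c) {
    if (observer && c == Category.WRITE) {
      throw new IllegalStateException("Operation category WRITE is not supported");
    }
  }

  /** Stand-in for the access-time update inside getBlockLocations. */
  void maybeUpdateAccessTime() {
    checkOperation(Category.WRITE);   // proposed: fail here, before locking
    lock.writeLock().lock();
    lockWasTaken = true;
    try {
      // ... the setTimes / access-time update would go here ...
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

On an observer the check throws before the write lock is touched, so the expensive lock acquisition is skipped entirely for an operation that was doomed to fail anyway.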
[jira] [Resolved] (HDFS-13595) Edit tailing period configuration should accept time units
[ https://issues.apache.org/jira/browse/HDFS-13595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-13595. Resolution: Invalid This is already done. Looked at the wrong branch, my mistake. > Edit tailing period configuration should accept time units > -- > > Key: HDFS-13595 > URL: https://issues.apache.org/jira/browse/HDFS-13595 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ha, namenode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > > The {{dfs.ha.tail-edits.period}} config should accept time units so that it > can be more easily specified across a wide range; in particular, for > HDFS-13150 it is useful to have a period shorter than 1 second, which is not > currently possible. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13595) Edit tailing period configuration should accept time units
Erik Krogen created HDFS-13595: -- Summary: Edit tailing period configuration should accept time units Key: HDFS-13595 URL: https://issues.apache.org/jira/browse/HDFS-13595 Project: Hadoop HDFS Issue Type: Improvement Components: ha, namenode Reporter: Erik Krogen Assignee: Erik Krogen The {{dfs.ha.tail-edits.period}} config should accept time units so that it can be more easily specified across a wide range; in particular, for HDFS-13150 it is useful to have a period shorter than 1 second, which is not currently possible. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13523) Support observer nodes in MiniDFSCluster
Erik Krogen created HDFS-13523: -- Summary: Support observer nodes in MiniDFSCluster Key: HDFS-13523 URL: https://issues.apache.org/jira/browse/HDFS-13523 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, test Reporter: Erik Krogen MiniDFSCluster should support Observer nodes so that we can write decent integration tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13522) Support observer node from Router-Based Federation
Erik Krogen created HDFS-13522: -- Summary: Support observer node from Router-Based Federation Key: HDFS-13522 URL: https://issues.apache.org/jira/browse/HDFS-13522 Project: Hadoop HDFS Issue Type: Sub-task Components: federation, namenode Reporter: Erik Krogen Changes will need to occur to the router to support the new observer node. One such change will be to make the router understand the observer state, e.g. {{FederationNamenodeServiceState}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13493) Reduce the HttpServer2 thread count on DataNodes
Erik Krogen created HDFS-13493: -- Summary: Reduce the HttpServer2 thread count on DataNodes Key: HDFS-13493 URL: https://issues.apache.org/jira/browse/HDFS-13493 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Erik Krogen Assignee: Erik Krogen Given that HFTP was removed in Hadoop 3 and WebHDFS is handled via Netty, the HttpServer2 instance within the DataNode is only used for very basic tasks such as the web UI. Thus we can safely reduce the thread count used here. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13400) WebHDFS append returned stream has incorrectly set position
Erik Krogen created HDFS-13400: -- Summary: WebHDFS append returned stream has incorrectly set position Key: HDFS-13400 URL: https://issues.apache.org/jira/browse/HDFS-13400 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 3.0.1, 2.7.5, 2.8.3, 2.9.0 Reporter: Erik Krogen -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13272) DataNodeHttpServer hard-codes HttpServer2 threads at 10
Erik Krogen created HDFS-13272: -- Summary: DataNodeHttpServer hard-codes HttpServer2 threads at 10 Key: HDFS-13272 URL: https://issues.apache.org/jira/browse/HDFS-13272 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Erik Krogen Assignee: Erik Krogen In HDFS-7279, the Jetty server on the DataNode was hard-coded to use 10 threads. In addition to the possibility of this being too few threads, it is much higher than necessary in resource constrained environments such as MiniDFSCluster. To avoid compatibility issues, rather than using {{HttpServer2#HTTP_MAX_THREADS}} directly, we can introduce a new configuration for the DataNode's thread pool size. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13265) MiniDFSCluster should set reasonable defaults to reduce resource consumption
Erik Krogen created HDFS-13265: -- Summary: MiniDFSCluster should set reasonable defaults to reduce resource consumption Key: HDFS-13265 URL: https://issues.apache.org/jira/browse/HDFS-13265 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, namenode, test Reporter: Erik Krogen MiniDFSCluster takes its defaults from {{DFSConfigKeys}} defaults, but many of these are not suitable for a unit test environment. For example, the default handler thread count of 10 is definitely more than necessary for (almost?) any unit test. We should set reasonable, lower defaults unless a test specifically requires more. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13264) CacheReplicationMonitor should be able to be disabled completely
Erik Krogen created HDFS-13264: -- Summary: CacheReplicationMonitor should be able to be disabled completely Key: HDFS-13264 URL: https://issues.apache.org/jira/browse/HDFS-13264 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Erik Krogen Currently there is no way to completely disable the CacheReplicationMonitor, even if the feature is not being used at all. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13263) DiskBalancer should not start a thread if it is disabled
Erik Krogen created HDFS-13263: -- Summary: DiskBalancer should not start a thread if it is disabled Key: HDFS-13263 URL: https://issues.apache.org/jira/browse/HDFS-13263 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Erik Krogen Assignee: Erik Krogen -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13262) Services should not start threads unnecessarily
Erik Krogen created HDFS-13262: -- Summary: Services should not start threads unnecessarily Key: HDFS-13262 URL: https://issues.apache.org/jira/browse/HDFS-13262 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, namenode, test Reporter: Erik Krogen There are a number of services in HDFS that start a thread even if they are disabled. Some services which may not be strictly necessary do not have a way to be disabled. This is particularly bad for the unit tests, in which the number of threads spawned by concurrent MiniDFSCluster-based tests can grow to be very large (e.g. see HDFS-12711) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13150) Create fast path for SbNN tailing edits from JNs
Erik Krogen created HDFS-13150: -- Summary: Create fast path for SbNN tailing edits from JNs Key: HDFS-13150 URL: https://issues.apache.org/jira/browse/HDFS-13150 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs, journal-node, namenode Reporter: Erik Krogen Assignee: Erik Krogen In the interest of making coordinated/consistent reads easier to complete with low latency, it is advantageous to reduce the time between when a transaction is applied on the ANN and when it is applied on the SbNN. We propose adding a new "fast path" which can be used to tail edits when low latency is desired. We leave the existing tailing logic in place, and fall back to this path on startup, recovery, and when the fast path encounters unrecoverable errors. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-13122) Tailing edits should not update quota counts on ObserverNode
[ https://issues.apache.org/jira/browse/HDFS-13122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-13122. Resolution: Duplicate > Tailing edits should not update quota counts on ObserverNode > > > Key: HDFS-13122 > URL: https://issues.apache.org/jira/browse/HDFS-13122 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, namenode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > > Currently in {{FSImage#loadEdits()}}, after applying a set of edits, we call > {code} > updateCountForQuota(target.getBlockManager().getStoragePolicySuite(), > target.dir.rootDir); > {code} > to update the quota counts for the entire namespace, which can be very > expensive. This makes sense if we are about to become the ANN, since we need > valid quotas, but not on an ObserverNode which does not need to enforce > quotas. > This is related to increasing the frequency with which the SbNN can tail > edits from the ANN to decrease the lag time for transactions to appear on the > Observer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13126) Re-enable HTTP Request Logging for WebHDFS
Erik Krogen created HDFS-13126: -- Summary: Re-enable HTTP Request Logging for WebHDFS Key: HDFS-13126 URL: https://issues.apache.org/jira/browse/HDFS-13126 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, webhdfs Affects Versions: 2.7.0 Reporter: Erik Krogen Due to HDFS-7279, starting in 2.7.0, the DataNode HTTP Request logs no longer include WebHDFS requests because the HTTP Request handling is done internal to {{HttpServer2}}, which is no longer used. If the request logging is enabled, we should add a Netty [LoggingHandler|https://netty.io/4.0/api/io/netty/handler/logging/LoggingHandler.html] to the ChannelPipeline for the http(s) servers used by the DataNode. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13125) Improve efficiency of JN -> Standby Pipeline Under Frequent Edit Tailing
Erik Krogen created HDFS-13125: -- Summary: Improve efficiency of JN -> Standby Pipeline Under Frequent Edit Tailing Key: HDFS-13125 URL: https://issues.apache.org/jira/browse/HDFS-13125 Project: Hadoop HDFS Issue Type: Improvement Components: journal-node, namenode Reporter: Erik Krogen Assignee: Erik Krogen The current edit tailing pipeline is designed for
* High resiliency
* High throughput
and was _not_ designed for low latency. It was designed under the assumption that each edit log segment would typically be read all at once, e.g. on startup or the SbNN tailing the entire thing after it is finalized. The ObserverNode should be reading constantly from the JournalNodes' in-progress edit logs with low latency, to reduce the lag time from when a transaction is committed on the ANN and when it is visible on the ObserverNode. Due to the critical nature of this pipeline to the health of HDFS, it would be better not to redesign it altogether. Based on some experiments, it seems that if we mitigate the following issues, lag times are reduced to low levels (low hundreds of milliseconds even under very high write load):
* The overhead of creating a new HTTP connection each time new edits are fetched. This makes sense when you're expecting to tail an entire segment; it does not when you may only be fetching a small number of edits. We can mitigate this by allowing edits to be tailed via an RPC call, or by adding a connection pool for the existing connections to the journal.
* The overhead of transmitting a whole file at once. Right now when an edit segment is requested, the JN sends the entire segment, and on the SbNN it will ignore edits up to the ones it wants. How to solve this one may be more tricky, but one suggestion would be to keep recently logged edits in memory, avoiding the need to serve them from file at all, allowing the JN to quickly serve only the required edits. 
We can implement these as optimizations on top of the existing logic, with fallbacks to the current slow-but-resilient pipeline. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
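The "keep recently logged edits in memory" idea above can be sketched as a bounded in-memory buffer keyed by transaction ID, with a fallback signal when a reader is too far behind. This is a minimal illustration with hypothetical names, not the actual JournalNode implementation:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * Hypothetical sketch: the JN retains the last N serialized edits so a
 * low-latency tailer can be served from memory instead of re-reading the
 * segment file. Readers that have fallen behind the buffer get null and
 * fall back to the existing slow-but-resilient file-based path.
 */
class RecentEditsCache {
    private final int capacity;
    private final Deque<long[]> cache = new ArrayDeque<>(); // {txid, payload stub}

    RecentEditsCache(int capacity) { this.capacity = capacity; }

    synchronized void logEdit(long txid, long payload) {
        cache.addLast(new long[] {txid, payload});
        if (cache.size() > capacity) {
            cache.removeFirst(); // evict oldest; very-lagged readers use the file path
        }
    }

    /** Edits with txid > fromTxId, or null if the cache no longer covers them. */
    synchronized List<long[]> getEditsAfter(long fromTxId) {
        if (cache.isEmpty() || cache.peekFirst()[0] > fromTxId + 1) {
            return null; // gap between caller position and oldest cached edit
        }
        List<long[]> out = new ArrayList<>();
        for (long[] e : cache) {
            if (e[0] > fromTxId) out.add(e);
        }
        return out;
    }
}
```

The null return is the fallback hook: the caller would then fetch via the current HTTP segment transfer.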
[jira] [Created] (HDFS-13122) FSImage should not update quota counts on ObserverNode
Erik Krogen created HDFS-13122: -- Summary: FSImage should not update quota counts on ObserverNode Key: HDFS-13122 URL: https://issues.apache.org/jira/browse/HDFS-13122 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs, namenode Reporter: Erik Krogen Assignee: Erik Krogen Currently in {{FSImage#loadEdits()}}, after applying a set of edits, we call
{code}
updateCountForQuota(target.getBlockManager().getStoragePolicySuite(), target.dir.rootDir);
{code}
to update the quota counts for the entire namespace, which can be very expensive. This makes sense if we are about to become the ANN, since we need valid quotas, but not on an ObserverNode, which does not need to enforce quotas. This is related to increasing the frequency with which the SbNN can tail edits from the ANN to decrease the lag time for transactions to appear on the Observer. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
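The proposed fix amounts to gating the expensive full-namespace quota recomputation on the node's role. A minimal sketch with hypothetical names (not the actual FSImage code):

```java
/**
 * Illustrative sketch of skipping the O(namespace) quota walk on an
 * ObserverNode: only a node that may become the ANN needs valid quota
 * counts after applying edits. Class and method names are hypothetical.
 */
class EditLoader {
    enum Role { ACTIVE_CANDIDATE, OBSERVER }

    private final Role role;
    int quotaRecomputations = 0; // exposed so the behavior is observable

    EditLoader(Role role) { this.role = role; }

    void loadEdits(int numEdits) {
        // ... apply numEdits edits to the in-memory namespace (elided) ...
        if (role != Role.OBSERVER) {
            updateCountForQuota(); // expensive full-namespace recount
        }
    }

    private void updateCountForQuota() { quotaRecomputations++; }
}
```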
[jira] [Created] (HDFS-12828) OIV ReverseXML Processor Fails With Escaped Characters
Erik Krogen created HDFS-12828: -- Summary: OIV ReverseXML Processor Fails With Escaped Characters Key: HDFS-12828 URL: https://issues.apache.org/jira/browse/HDFS-12828 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Affects Versions: 2.8.0 Reporter: Erik Krogen The HDFS OIV ReverseXML processor fails if the XML file contains escaped characters:
{code}
ekrogen at ekrogen-ld1 in ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.0.0-beta1-SNAPSHOT on trunk!
± $HADOOP_HOME/bin/hdfs dfs -fs hdfs://localhost:9000/ -ls /
Found 4 items
drwxr-xr-x - ekrogen supergroup 0 2017-11-16 14:48 /foo
drwxr-xr-x - ekrogen supergroup 0 2017-11-16 14:49 /foo"
drwxr-xr-x - ekrogen supergroup 0 2017-11-16 14:50 /foo`
drwxr-xr-x - ekrogen supergroup 0 2017-11-16 14:49 /foo&
{code}
Then after doing {{saveNamespace}} on that NameNode...
{code}
ekrogen at ekrogen-ld1 in ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.0.0-beta1-SNAPSHOT on trunk!
± $HADOOP_HOME/bin/hdfs oiv -i /tmp/hadoop-ekrogen/dfs/name/current/fsimage_008 -o /tmp/hadoop-ekrogen/dfs/name/current/fsimage_008.xml -p XML
ekrogen at ekrogen-ld1 in ~/dev/hadoop/trunk/hadoop-dist/target/hadoop-3.0.0-beta1-SNAPSHOT on trunk!
± $HADOOP_HOME/bin/hdfs oiv -i /tmp/hadoop-ekrogen/dfs/name/current/fsimage_008.xml -o /tmp/hadoop-ekrogen/dfs/name/current/fsimage_008.xml.rev -p ReverseXML
OfflineImageReconstructor failed: unterminated entity ref starting with &
org.apache.hadoop.hdfs.util.XMLUtils$UnmanglingError: unterminated entity ref starting with &
	at org.apache.hadoop.hdfs.util.XMLUtils.unmangleXmlString(XMLUtils.java:232)
	at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.loadNodeChildrenHelper(OfflineImageReconstructor.java:383)
	at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.loadNodeChildrenHelper(OfflineImageReconstructor.java:379)
	at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.loadNodeChildren(OfflineImageReconstructor.java:418)
	at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.access$1000(OfflineImageReconstructor.java:95)
	at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor$INodeSectionProcessor.process(OfflineImageReconstructor.java:524)
	at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.processXml(OfflineImageReconstructor.java:1710)
	at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.run(OfflineImageReconstructor.java:1765)
	at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewerPB.run(OfflineImageViewerPB.java:191)
	at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewerPB.main(OfflineImageViewerPB.java:134)
{code}
See attachments for relevant fsimage XML file. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
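The round-trip that breaks here is the standard XML entity escape/unescape pair for path names. A self-contained sketch of what a correct round-trip looks like (illustrative only; Hadoop's actual handling lives in {{XMLUtils}}):

```java
/**
 * Path names containing XML-significant characters must be escaped on the
 * way out (XML processor) and unescaped on the way back (ReverseXML).
 * Hypothetical helper, not the Hadoop implementation.
 */
class PathXmlEscaper {
    static String escape(String path) {
        StringBuilder sb = new StringBuilder();
        for (char c : path.toCharArray()) {
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    static String unescape(String xml) {
        // &amp; must be handled last so "&amp;lt;" round-trips to "&lt;"
        return xml.replace("&quot;", "\"").replace("&lt;", "<")
                  .replace("&gt;", ">").replace("&amp;", "&");
    }
}
```

The bug manifests exactly when {{unescape(escape(path))}} fails to equal the original path (or the parser rejects a bare `&`).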
[jira] [Created] (HDFS-12823) Backport HDFS-9259 "Make SO_SNDBUF size configurable at DFSClient" to branch-2.7
Erik Krogen created HDFS-12823: -- Summary: Backport HDFS-9259 "Make SO_SNDBUF size configurable at DFSClient" to branch-2.7 Key: HDFS-12823 URL: https://issues.apache.org/jira/browse/HDFS-12823 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs, hdfs-client Reporter: Erik Krogen Assignee: Erik Krogen Given the pretty significant performance implications of HDFS-9259 (see discussion in HDFS-10326) when doing transfers across high latency links, it would be helpful to have this configurability exist in the 2.7 series. Opening a new JIRA since the original HDFS-9259 has been closed for a while and there are conflicts due to a few classes moving. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12818) Support multiple storages in DataNodeCluster / SimulatedFSDataset
Erik Krogen created HDFS-12818: -- Summary: Support multiple storages in DataNodeCluster / SimulatedFSDataset Key: HDFS-12818 URL: https://issues.apache.org/jira/browse/HDFS-12818 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, test Reporter: Erik Krogen Assignee: Erik Krogen Priority: Minor Currently {{SimulatedFSDataset}} (and thus, {{DataNodeCluster}} with {{-simulated}}) only supports a single storage per {{DataNode}}. Given that the number of storages can have important implications on the performance of block report processing, it would be useful for these classes to support a multiple storage configuration. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-12803) We should not lock FsNamesystem even we operate a sub directory, we should refinement the lock
[ https://issues.apache.org/jira/browse/HDFS-12803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-12803. Resolution: Duplicate This is a longstanding request tracked in HDFS-5453 > We should not lock FsNamesystem even we operate a sub directory, we should > refinement the lock > -- > > Key: HDFS-12803 > URL: https://issues.apache.org/jira/browse/HDFS-12803 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 2.7.1, 3.0.0-alpha3 >Reporter: maobaolong > > An example: > If a client is doing mkdir or delete a file, other client will wait for the > FSNamesystem's lock to do some operation. > I think we have to refinement the lock. we can lock the parent inode only. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12746) DataNode Audit Logger
Erik Krogen created HDFS-12746: -- Summary: DataNode Audit Logger Key: HDFS-12746 URL: https://issues.apache.org/jira/browse/HDFS-12746 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, logging Reporter: Erik Krogen I would like to discuss adding in an audit logger for the Datanodes. We have audit logging on pretty much all other components: Namenode, ResourceManager, NodeManager. It seems the DN should have a similar concept to log, at minimum, all block reads/writes. I think all of the interesting information does already appear in the DN logs at INFO level but it would be nice to have a specific audit class that this gets logged through, a la {{RMAuditLogger}} and {{NMAuditLogger}}, to enable special handling. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-11707) TestDirectoryScanner#testThrottling fails on OSX
[ https://issues.apache.org/jira/browse/HDFS-11707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-11707. Resolution: Duplicate > TestDirectoryScanner#testThrottling fails on OSX > > > Key: HDFS-11707 > URL: https://issues.apache.org/jira/browse/HDFS-11707 > Project: Hadoop HDFS > Issue Type: Test > Components: test >Affects Versions: 2.8.0 >Reporter: Erik Krogen >Priority: Minor > > In branch-2 and trunk, {{TestDirectoryScanner#testThrottling}} consistently > fails on OS X (I'm running 10.11 specifically) with: > {code} > java.lang.AssertionError: Throttle is too permissive > {code} > It seems to work alright on Unix systems. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12533) NNThroughputBenchmark threads get stuck on UGI.getCurrentUser()
Erik Krogen created HDFS-12533: -- Summary: NNThroughputBenchmark threads get stuck on UGI.getCurrentUser() Key: HDFS-12533 URL: https://issues.apache.org/jira/browse/HDFS-12533 Project: Hadoop HDFS Issue Type: Improvement Reporter: Erik Krogen In {{NameNode#getRemoteUser()}}, it first attempts to fetch from the RPC user (not a synchronized operation), and if there is no RPC call, it will call {{UserGroupInformation#getCurrentUser()}} (which is {{synchronized}}). This makes it efficient for RPC operations (the bulk) so that there is not too much contention. In NNThroughputBenchmark, however, there is no RPC call since we bypass that layer, so with a high thread count many of the threads get stuck. At one point I attached a profiler and found that quite a few threads had been waiting on {{#getCurrentUser()}} for 2 minutes (!). Removing this call yielded a noticeable improvement in the throughput numbers I was seeing. To more closely emulate a real NN, we should address this issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
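The contention pattern, and one way to sidestep it, can be shown in miniature: a {{synchronized}} user lookup becomes a bottleneck under many threads, while a per-thread cache pays the synchronized cost only once per thread. Hypothetical names; this is not the UGI implementation:

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Sketch of the benchmark bottleneck described above. getCurrentUserSlow()
 * stands in for the synchronized UserGroupInformation.getCurrentUser();
 * the ThreadLocal cache means each thread takes the lock exactly once.
 */
class UserLookup {
    static final AtomicInteger slowCalls = new AtomicInteger();

    static synchronized String getCurrentUserSlow() {
        slowCalls.incrementAndGet();
        return "benchmark-user";
    }

    // Per-thread cache of the (stable) current user.
    private static final ThreadLocal<String> CACHED =
        ThreadLocal.withInitial(UserLookup::getCurrentUserSlow);

    static String getCurrentUserCached() { return CACHED.get(); }
}
```

Whether caching is safe depends on the user being stable for the thread's lifetime, which holds in the benchmark setting.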
[jira] [Created] (HDFS-12421) Balancer to emit standard metrics
Erik Krogen created HDFS-12421: -- Summary: Balancer to emit standard metrics Key: HDFS-12421 URL: https://issues.apache.org/jira/browse/HDFS-12421 Project: Hadoop HDFS Issue Type: Improvement Components: balancer & mover Reporter: Erik Krogen Assignee: Erik Krogen Priority: Minor The Balancer currently prints some statistics about its operation to stdout while it is running. This is fine if the balancer is manually run via CLI by an operator, but for the more common case of it being a scheduled execution, it is cumbersome to have to track down the logs to be able to monitor its progress. We already have a standard metrics system in place; I propose that we have the Balancer emit metrics while it is running so that they can be tracked via standard metrics infrastructure. We can start with just the things that the balancer already prints to stdout: bytes already moved, bytes left to move, bytes currently being moved, and iteration number. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Reopened] (HDFS-12131) Add some of the FSNamesystem JMX values as metrics
[ https://issues.apache.org/jira/browse/HDFS-12131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen reopened HDFS-12131: > Add some of the FSNamesystem JMX values as metrics > -- > > Key: HDFS-12131 > URL: https://issues.apache.org/jira/browse/HDFS-12131 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Minor > Fix For: 2.9.0, 3.0.0-beta1, 2.8.3 > > Attachments: HDFS-12131.000.patch, HDFS-12131.001.patch, > HDFS-12131.002.patch, HDFS-12131.002.patch, HDFS-12131.003.patch, > HDFS-12131.004.patch, HDFS-12131.005.patch, HDFS-12131.006.patch, > HDFS-12131-branch-2.006.patch, HDFS-12131-branch-2.8.006.patch > > > A number of useful numbers are emitted via the FSNamesystem JMX, but not > through the metrics system. These would be useful to be able to track over > time, e.g. to alert on via standard metrics systems or to view trends and > rate changes: > * NumLiveDataNodes > * NumDeadDataNodes > * NumDecomLiveDataNodes > * NumDecomDeadDataNodes > * NumDecommissioningDataNodes > * NumStaleStorages > * VolumeFailuresTotal > * EstimatedCapacityLostTotal > * NumInMaintenanceLiveDataNodes > * NumInMaintenanceDeadDataNodes > * NumEnteringMaintenanceDataNodes > This is a simple change that just requires annotating the JMX methods with > {{@Metric}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12342) Differentiate webhdfs vs. swebhdfs calls in audit log
Erik Krogen created HDFS-12342: -- Summary: Differentiate webhdfs vs. swebhdfs calls in audit log Key: HDFS-12342 URL: https://issues.apache.org/jira/browse/HDFS-12342 Project: Hadoop HDFS Issue Type: Improvement Components: logging Reporter: Erik Krogen Assignee: Erik Krogen Currently the audit log only logs {{webhdfs}} vs {{rpc}} as the {{proto}}. It is useful to be able to audit whether certain commands were carried out via webhdfs or swebhdfs as this has different security and potentially performance implications. We have been running this internally for a while and have found it useful for looking at usage patterns. Proposal is just to continue logging {{webhdfs}} as the proto for {{http}} WebHDFS commands, but log {{swebhdfs}} for SWebHDFS (over {{https}}). This will be incompatible. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12219) Javadoc for FSNamesystem#getMaxObjects is incorrect
Erik Krogen created HDFS-12219: -- Summary: Javadoc for FSNamesystem#getMaxObjects is incorrect Key: HDFS-12219 URL: https://issues.apache.org/jira/browse/HDFS-12219 Project: Hadoop HDFS Issue Type: Bug Reporter: Erik Krogen Assignee: Erik Krogen Priority: Trivial The Javadoc states that this represents the total number of objects in the system, but it really represents the maximum allowed number of objects (as correctly stated on the Javadoc for {{FSNamesystemMBean#getMaxObjects()}}). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12160) Fix broken NameNode metrics documentation
Erik Krogen created HDFS-12160: -- Summary: Fix broken NameNode metrics documentation Key: HDFS-12160 URL: https://issues.apache.org/jira/browse/HDFS-12160 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0-alpha4, 2.8.0 Reporter: Erik Krogen Assignee: Erik Krogen Priority: Trivial HDFS-11261 introduced documentation for the metrics added in HDFS-10872. The metrics have a pipe ({{|}}) in them which breaks the markdown table. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12131) Add some of the FSNamesystem JMX values as metrics
Erik Krogen created HDFS-12131: -- Summary: Add some of the FSNamesystem JMX values as metrics Key: HDFS-12131 URL: https://issues.apache.org/jira/browse/HDFS-12131 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs, namenode Reporter: Erik Krogen Assignee: Erik Krogen Priority: Minor A number of useful numbers are emitted via the FSNamesystem JMX, but not through the metrics system. These would be useful to be able to track over time, e.g. to alert on via standard metrics systems or to view trends and rate changes: * NumLiveDataNodes * NumDeadDataNodes * NumDecomLiveDataNodes * NumDecomDeadDataNodes * NumDecommissioningDataNodes * NumStaleStorages This is a simple change that just requires annotating the JMX methods with {{@Metric}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-12004) Namenode UI continues to list DNs that have been removed from include and exclude
[ https://issues.apache.org/jira/browse/HDFS-12004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Krogen resolved HDFS-12004. Resolution: Duplicate > Namenode UI continues to list DNs that have been removed from include and > exclude > - > > Key: HDFS-12004 > URL: https://issues.apache.org/jira/browse/HDFS-12004 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.3.0 >Reporter: Erik Krogen >Priority: Minor > > Initially in HDFS, after a DN was decommissioned and subsequently removed from > the exclude file (thus removing all references to it), it would still appear > in the NN UI as a "dead" node until the NN was restarted. In HDFS-1773, > this was discussed, and it was decided that the web UI should not > show these nodes. However, when HDFS-5334 went through and the NN web UI was > reimplemented client-side, the behavior reverted back to pre-HDFS-1773, and > dead+decommissioned nodes once again showed in the dead list. This can be > operationally confusing for the same reasons as discussed in HDFS-1773. > I would like to open this discussion to determine if the regression was > intentional or if we should carry forward the logic implemented in HDFS-1773 > into the new UI. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12004) Namenode UI continues to list DNs that have been removed from include and exclude
Erik Krogen created HDFS-12004: -- Summary: Namenode UI continues to list DNs that have been removed from include and exclude Key: HDFS-12004 URL: https://issues.apache.org/jira/browse/HDFS-12004 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.3.0 Reporter: Erik Krogen Priority: Minor Initially in HDFS, after a DN was decommissioned and subsequently removed from the exclude file (thus removing all references to it), it would still appear in the NN UI as a "dead" node until the NN was restarted. In HDFS-1773, this was discussed, and it was decided that the web UI should not show these nodes. However, when HDFS-5334 went through and the NN web UI was reimplemented client-side, the behavior reverted back to pre-HDFS-1773, and dead+decommissioned nodes once again showed in the dead list. This can be operationally confusing for the same reasons as discussed in HDFS-1773. I would like to open this discussion to determine if the regression was intentional or if we should carry forward the logic implemented in HDFS-1773 into the new UI. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-11717) Add unit test for HDFS-11709
Erik Krogen created HDFS-11717: -- Summary: Add unit test for HDFS-11709 Key: HDFS-11717 URL: https://issues.apache.org/jira/browse/HDFS-11717 Project: Hadoop HDFS Issue Type: Task Components: ha, namenode Affects Versions: 2.9.0, 2.7.4, 3.0.0-alpha3, 2.8.1 Reporter: Erik Krogen Assignee: Erik Krogen Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-11707) TestDirectoryScanner#testThrottling fails on OSX
Erik Krogen created HDFS-11707: -- Summary: TestDirectoryScanner#testThrottling fails on OSX Key: HDFS-11707 URL: https://issues.apache.org/jira/browse/HDFS-11707 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 2.8.0 Reporter: Erik Krogen Priority: Minor In branch-2 and trunk, {{TestDirectoryScanner#testThrottling}} consistently fails on OS X (I'm running 10.11 specifically) with: {code} java.lang.AssertionError: Throttle is too permissive {code} It seems to work alright on Unix systems. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-11615) FSNamesystemLock metrics can be inaccurate due to millisecond precision
Erik Krogen created HDFS-11615: -- Summary: FSNamesystemLock metrics can be inaccurate due to millisecond precision Key: HDFS-11615 URL: https://issues.apache.org/jira/browse/HDFS-11615 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Affects Versions: 2.7.4 Reporter: Erik Krogen Assignee: Erik Krogen Currently the {{FSNamesystemLock}} metrics created in HDFS-10872 track the lock hold time using {{Timer.monotonicNow()}}, which has millisecond-level precision. However, many of these operations hold the lock for less than a millisecond, making these metrics inaccurate. We should instead use {{System.nanoTime()}} for higher accuracy. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
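The precision gap is easy to demonstrate standalone: timing a sub-millisecond critical section with a millisecond clock typically reports 0 ms held, while {{System.nanoTime()}} still resolves it. This is an illustration of the problem, not the FSNamesystemLock code:

```java
/**
 * Times a Runnable with each of the two clocks. For work that completes in
 * well under a millisecond, timeMillis() usually returns 0 while
 * timeNanos() returns a meaningful positive duration.
 */
class LockTiming {
    static volatile long sink; // prevents the JIT from eliding the workload

    static long timeMillis(Runnable r) {
        long start = System.currentTimeMillis();
        r.run();
        return System.currentTimeMillis() - start;
    }

    static long timeNanos(Runnable r) {
        long start = System.nanoTime();
        r.run();
        return System.nanoTime() - start;
    }
}
```

Note that {{nanoTime()}} is also the right choice for interval measurement because it is monotonic, whereas {{currentTimeMillis()}} can jump with wall-clock adjustments.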
[jira] [Created] (HDFS-11352) Potential deadlock in NN when failing over
Erik Krogen created HDFS-11352: -- Summary: Potential deadlock in NN when failing over Key: HDFS-11352 URL: https://issues.apache.org/jira/browse/HDFS-11352 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.4, 2.6.6 Reporter: Erik Krogen Assignee: Erik Krogen HDFS-11180 fixed a general class of deadlock that can occur when failing over between the MetricsSystemImpl and FSEditLog (see comments on that JIRA for more details). In trunk and branch-2/branch-2.8 this fix was successful by making the metrics calls not synchronize on FSEditLog. In branch-2.6 and branch-2.7 there is one more method, {{FSNamesystem#getTransactionsSinceLastCheckpoint}}, which still requires the lock on FSEditLog and thus can result in the same deadlock scenario. This can be seen by running {{TestFSNamesystemMBean#testWithFSEditLogLock}} _with the patch in HDFS-11290_ on either of these branches (it fails currently). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-11208) Deadlock in WebHDFS on shutdown
Erik Krogen created HDFS-11208: -- Summary: Deadlock in WebHDFS on shutdown Key: HDFS-11208 URL: https://issues.apache.org/jira/browse/HDFS-11208 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 3.0.0-alpha1, 2.6.5, 2.7.3, 2.8.0 Reporter: Erik Krogen Assignee: Erik Krogen Currently on the client side if the {{DelegationTokenRenewer}} attempts to renew a WebHdfs delegation token while the client system is shutting down (i.e. {{FileSystem.Cache.ClientFinalizer}} is running) a deadlock may occur. This happens because {{ClientFinalizer}} calls {{FileSystem.Cache.closeAll()}} which first takes a lock on the {{FileSystem.Cache}} object and then locks each file system in the cache as it iterates over them. {{DelegationTokenRenewer}} takes a lock on a filesystem object while it is renewing that filesystem's token, but within {{TokenAspect.TokenManager.renew()}} (used for renewal of WebHdfs tokens) {{FileSystem.get}} is called, which in turn takes a lock on the FileSystem cache object, potentially causing deadlock if {{ClientFinalizer}} is currently running. 
See below for example deadlock output:
{code}
Found one Java-level deadlock:
==============================
"Thread-8572":
  waiting to lock monitor 0x7eff401f9878 (object 0x00051ec3f930, a dali.hdfs.web.WebHdfsFileSystem),
  which is held by "FileSystem-DelegationTokenRenewer"
"FileSystem-DelegationTokenRenewer":
  waiting to lock monitor 0x7f005c08f5c8 (object 0x00050389c8b8, a dali.fs.FileSystem$Cache),
  which is held by "Thread-8572"

Java stack information for the threads listed above:
===================================================
"Thread-8572":
	at dali.hdfs.web.WebHdfsFileSystem.close(WebHdfsFileSystem.java:864)
	- waiting to lock <0x00051ec3f930> (a dali.hdfs.web.WebHdfsFileSystem)
	at dali.fs.FilterFileSystem.close(FilterFileSystem.java:449)
	at dali.fs.FileSystem$Cache.closeAll(FileSystem.java:2407)
	- locked <0x00050389c8b8> (a dali.fs.FileSystem$Cache)
	at dali.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2424)
	- locked <0x00050389c8d0> (a dali.fs.FileSystem$Cache$ClientFinalizer)
	at dali.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
"FileSystem-DelegationTokenRenewer":
	at dali.fs.FileSystem$Cache.getInternal(FileSystem.java:2343)
	- waiting to lock <0x00050389c8b8> (a dali.fs.FileSystem$Cache)
	at dali.fs.FileSystem$Cache.get(FileSystem.java:2332)
	at dali.fs.FileSystem.get(FileSystem.java:369)
	at dali.hdfs.web.TokenAspect$TokenManager.getInstance(TokenAspect.java:92)
	at dali.hdfs.web.TokenAspect$TokenManager.renew(TokenAspect.java:72)
	at dali.security.token.Token.renew(Token.java:373)
	at dali.fs.DelegationTokenRenewer$RenewAction.renew(DelegationTokenRenewer.java:127)
	- locked <0x00051ec3f930> (a dali.hdfs.web.WebHdfsFileSystem)
	at dali.fs.DelegationTokenRenewer$RenewAction.access$300(DelegationTokenRenewer.java:57)
	at dali.fs.DelegationTokenRenewer.run(DelegationTokenRenewer.java:258)

Found 1 deadlock.
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
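The cycle in that trace is the classic inverted-lock-order shape: one thread takes the cache lock then a filesystem lock, the other takes a filesystem lock then the cache lock. One general remedy is to make both paths acquire the two locks in a single global order. A minimal sketch with hypothetical stand-in locks (not the actual FileSystem/Cache classes, and not necessarily the fix applied in this JIRA):

```java
/**
 * Both code paths below acquire CACHE_LOCK before FS_LOCK, so no cycle of
 * lock dependencies can form. In the deadlocked version, renewToken()
 * would have taken FS_LOCK first.
 */
class LockOrdering {
    static final Object CACHE_LOCK = new Object();
    static final Object FS_LOCK = new Object();

    static int closeAll() {          // shutdown hook path
        synchronized (CACHE_LOCK) {
            synchronized (FS_LOCK) {
                return 1;            // close each cached filesystem
            }
        }
    }

    static int renewToken() {        // token renewer path
        synchronized (CACHE_LOCK) {  // look up the FS cache first...
            synchronized (FS_LOCK) { // ...then lock the filesystem to renew
                return 2;
            }
        }
    }
}
```

The alternative remedy, also common, is to avoid holding either lock while calling into code that may take the other.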
[jira] [Created] (HDFS-11021) Add FSNamesystemLock metrics for BlockManager operations
Erik Krogen created HDFS-11021: -- Summary: Add FSNamesystemLock metrics for BlockManager operations Key: HDFS-11021 URL: https://issues.apache.org/jira/browse/HDFS-11021 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Erik Krogen Assignee: Erik Krogen Right now the operations which the {{BlockManager}} issues to the {{Namesystem}} will not emit metrics about which operation caused the {{FSNamesystemLock}} to be held; they are all grouped under "OTHER". We should fix this since the {{BlockManager}} creates many acquisitions of both the read and write locks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
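The per-operation accounting this asks for can be sketched as a concurrent map from operation name to accumulated hold time, with unknown callers falling into "OTHER". Illustrative names only, not the FSNamesystemLock implementation:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical per-operation lock-hold metrics. BlockManager call sites
 * would pass an operation name instead of defaulting to "OTHER", making
 * their (frequent) read/write lock acquisitions attributable.
 */
class LockMetrics {
    private final Map<String, LongAdder> holdNanosByOp = new ConcurrentHashMap<>();

    void recordHold(String opName, long nanos) {
        holdNanosByOp.computeIfAbsent(opName == null ? "OTHER" : opName,
            k -> new LongAdder()).add(nanos);
    }

    long totalFor(String opName) {
        LongAdder a = holdNanosByOp.get(opName);
        return a == null ? 0 : a.sum();
    }
}
```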
[jira] [Created] (HDFS-10896) Move lock logging logic from FSNamesystem into FSNamesystemLock
Erik Krogen created HDFS-10896: -- Summary: Move lock logging logic from FSNamesystem into FSNamesystemLock Key: HDFS-10896 URL: https://issues.apache.org/jira/browse/HDFS-10896 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Erik Krogen Assignee: Erik Krogen There are a number of tickets (HDFS-10742, HDFS-10817, HDFS-10713, this subtask's story HDFS-10475) which are adding/improving logging/metrics around the {{FSNamesystemLock}}. All of this is done in {{FSNamesystem}} right now, which is polluting the namesystem with ThreadLocal variables, timing counters, etc. which are only relevant to the lock itself and the number of these increases as the logging/metrics become more sophisticated. It would be best to move these all into FSNamesystemLock to keep the metrics/logging tied directly to the item of interest. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org