[jira] [Updated] (HDFS-16625) Unit tests aren't checking for PMDK availability
[ https://issues.apache.org/jira/browse/HDFS-16625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16625: Priority: Major (was: Blocker) > Unit tests aren't checking for PMDK availability > > > Key: HDFS-16625 > URL: https://issues.apache.org/jira/browse/HDFS-16625 > Project: Hadoop HDFS > Issue Type: Test > Components: test >Affects Versions: 3.4.0, 3.3.9 >Reporter: Steve Vaughan >Assignee: Steve Vaughan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > Time Spent: 1.5h > Remaining Estimate: 0h > > There are unit tests that require native PMDK libraries but aren't checking > whether the library is available, resulting in unsuccessful tests. Adding the > following to the test setup addresses the problem. > {code:java} > assumeTrue("Requires PMDK", NativeIO.POSIX.isPmdkAvailable()); {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16625) Unit tests aren't checking for PMDK availability
[ https://issues.apache.org/jira/browse/HDFS-16625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki resolved HDFS-16625. - Hadoop Flags: Reviewed Resolution: Fixed > Unit tests aren't checking for PMDK availability > > > Key: HDFS-16625 > URL: https://issues.apache.org/jira/browse/HDFS-16625 > Project: Hadoop HDFS > Issue Type: Test > Components: test >Affects Versions: 3.4.0, 3.3.9 >Reporter: Steve Vaughan >Assignee: Steve Vaughan >Priority: Blocker > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > Time Spent: 1.5h > Remaining Estimate: 0h > > There are unit tests that require native PMDK libraries but aren't checking > whether the library is available, resulting in unsuccessful tests. Adding the > following to the test setup addresses the problem. > {code:java} > assumeTrue("Requires PMDK", NativeIO.POSIX.isPmdkAvailable()); {code}
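The one-line guard quoted in the issue relies on JUnit's assumeTrue and Hadoop's NativeIO.POSIX.isPmdkAvailable(). The skip-instead-of-fail pattern it uses can be sketched in self-contained Java (all names below are illustrative stand-ins so the example runs without JUnit or Hadoop; the real fix simply calls assumeTrue in the test setup method):

```java
import java.util.function.BooleanSupplier;

public class CapabilityGuard {

    /** Signals "skip this test", analogous to JUnit's AssumptionViolatedException. */
    public static class SkippedException extends RuntimeException {
        public SkippedException(String msg) { super(msg); }
    }

    /** Mirrors JUnit's assumeTrue(message, condition): throws a skip signal, not a failure. */
    public static void assumeTrue(String message, boolean condition) {
        if (!condition) {
            throw new SkippedException(message);
        }
    }

    /**
     * Runs a test body only when the capability probe passes; returns whether it ran.
     * The probe stands in for NativeIO.POSIX.isPmdkAvailable().
     */
    public static boolean runIfAvailable(BooleanSupplier probe, Runnable testBody) {
        try {
            assumeTrue("Requires PMDK", probe.getAsBoolean());
        } catch (SkippedException e) {
            return false;  // capability absent: test is skipped, not failed
        }
        testBody.run();
        return true;
    }

    public static void main(String[] args) {
        // With the native library absent, the test body never runs and nothing fails.
        System.out.println("ran=" + runIfAvailable(() -> false, () -> {}));
    }
}
```

The point of the fix is exactly this distinction: without the guard, a missing native library surfaces as a test *failure*; with it, the test is reported as skipped.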
[jira] [Assigned] (HDFS-16714) Remove okhttp and kotlin dependencies
[ https://issues.apache.org/jira/browse/HDFS-16714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki reassigned HDFS-16714: --- Assignee: Cheng Pan > Remove okhttp and kotlin dependencies > - > > Key: HDFS-16714 > URL: https://issues.apache.org/jira/browse/HDFS-16714 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.3.4 >Reporter: Cheng Pan >Assignee: Cheng Pan >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > hadoop-common already has Apache HttpClient dependencies, so okhttp is > unnecessary
[jira] [Commented] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17565906#comment-17565906 ] Masatake Iwasaki commented on HDFS-14084: - update the targets to 3.2.5 for preparing 3.2.4 release. > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed: once used as a map-reduce filesystem, it is > now becoming more of a general-purpose filesystem. In most cases the issues > are with the Namenode, so we have metrics to know the workload or > stress on the Namenode. > However, there is a need to collect more statistics for the different > operations/RPCs in DFSClient, to know which RPC operations are taking longer > or how frequent an operation is. These statistics can > be exposed to the users of the DFS Client, who can periodically log them or apply > some sort of flow control if the response is slow. This will also help to > isolate HDFS issues in a mixed environment where, say, a node runs Spark, > HBase and Impala together. We can check the throughput of different > operations across clients and isolate problems caused by a noisy > neighbor, network congestion or a shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. 
If we had metrics or stats > in DFSClient, we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 (client-side deadlock)
[jira] [Updated] (HDFS-14084) Need for more stats in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-14084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-14084: Target Version/s: 3.2.5 (was: 3.2.4) > Need for more stats in DFSClient > > > Key: HDFS-14084 > URL: https://issues.apache.org/jira/browse/HDFS-14084 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Pranay Singh >Priority: Minor > Attachments: HDFS-14084.001.patch, HDFS-14084.002.patch, > HDFS-14084.003.patch, HDFS-14084.004.patch, HDFS-14084.005.patch, > HDFS-14084.006.patch, HDFS-14084.007.patch, HDFS-14084.008.patch, > HDFS-14084.009.patch, HDFS-14084.010.patch, HDFS-14084.011.patch, > HDFS-14084.012.patch, HDFS-14084.013.patch, HDFS-14084.014.patch, > HDFS-14084.015.patch, HDFS-14084.016.patch, HDFS-14084.017.patch, > HDFS-14084.018.patch > > > The usage of HDFS has changed: once used as a map-reduce filesystem, it is > now becoming more of a general-purpose filesystem. In most cases the issues > are with the Namenode, so we have metrics to know the workload or > stress on the Namenode. > However, there is a need to collect more statistics for the different > operations/RPCs in DFSClient, to know which RPC operations are taking longer > or how frequent an operation is. These statistics can > be exposed to the users of the DFS Client, who can periodically log them or apply > some sort of flow control if the response is slow. This will also help to > isolate HDFS issues in a mixed environment where, say, a node runs Spark, > HBase and Impala together. We can check the throughput of different > operations across clients and isolate problems caused by a noisy > neighbor, network congestion or a shared JVM. > We have dealt with several problems from the field for which there is no > conclusive evidence as to what caused the problem. 
If we had metrics or stats > in DFSClient, we would be better equipped to solve such complex problems. > List of jiras for reference: > - > HADOOP-15538 HADOOP-15530 (client-side deadlock)
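The issue asks for per-operation statistics inside DFSClient. As a rough illustration of what such client-side stats could look like — this is a hypothetical sketch, not the actual DFSClient or FileSystem.Statistics API — a thread-safe per-RPC latency recorder might be:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-RPC stats collector: counts and mean latency per operation name. */
public class ClientOpStats {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

    /** Record one completed operation (e.g. "getFileInfo") and its duration in nanos. */
    public void record(String op, long nanos) {
        counts.computeIfAbsent(op, k -> new LongAdder()).increment();
        totalNanos.computeIfAbsent(op, k -> new LongAdder()).add(nanos);
    }

    /** How many times the operation was recorded. */
    public long count(String op) {
        LongAdder a = counts.get(op);
        return a == null ? 0 : a.sum();
    }

    /** Mean latency in nanos, or 0 if the operation was never recorded. */
    public long meanNanos(String op) {
        long n = count(op);
        return n == 0 ? 0 : totalNanos.get(op).sum() / n;
    }
}
```

A client wrapper would call record() around each RPC; exposing count() and meanNanos() per operation is exactly the kind of data the reporter wants for isolating slow RPCs in a mixed Spark/HBase/Impala environment.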
[jira] [Commented] (HDFS-14571) Command line to force volume failures
[ https://issues.apache.org/jira/browse/HDFS-14571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17565904#comment-17565904 ] Masatake Iwasaki commented on HDFS-14571: - update the targets to 3.2.5 for preparing 3.2.4 release. > Command line to force volume failures > - > > Key: HDFS-14571 > URL: https://issues.apache.org/jira/browse/HDFS-14571 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs > Environment: Linux >Reporter: Scott A. Wehner >Priority: Major > Labels: disks, volumes > Original Estimate: 48h > Remaining Estimate: 48h > > A datanode with failed hard drives reports to the namenode that it has a > failed volume. In line with enabling slow-datanode detection, when we have a > failing drive that has not failed outright, or has uncorrectable sectors, I want to > be able to run a command to force-fail a datanode volume based on storageID > or target storage location (a.k.a. mount point).
[jira] [Commented] (HDFS-14349) Edit log may be rolled more frequently than necessary with multiple Standby nodes
[ https://issues.apache.org/jira/browse/HDFS-14349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17565905#comment-17565905 ] Masatake Iwasaki commented on HDFS-14349: - update the targets to 3.2.5 for preparing 3.2.4 release. > Edit log may be rolled more frequently than necessary with multiple Standby > nodes > - > > Key: HDFS-14349 > URL: https://issues.apache.org/jira/browse/HDFS-14349 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, hdfs, qjm >Reporter: Erik Krogen >Assignee: Ekanth Sethuramalingam >Priority: Major > Labels: multi-sbnn > > When HDFS-14317 was fixed, we tackled the problem that in a cluster with > in-progress edit log tailing enabled, a Standby NameNode may _never_ roll the > edit logs, which can eventually cause data loss. > Unfortunately, in the process, it was made so that if there are multiple > Standby NameNodes, they will all roll the edit logs at their specified > frequency, so the edit log will be rolled X times more frequently than they > should be (where X is the number of Standby NNs). This is not as bad as the > original bug since rolling frequently does not affect correctness or data > availability, but may degrade performance by creating more edit log segments > than necessary.
[jira] [Updated] (HDFS-15289) Allow viewfs mounts with HDFS/HCFS scheme and centralized mount table
[ https://issues.apache.org/jira/browse/HDFS-15289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15289: Target Version/s: 3.4.0, 3.3.9, 3.2.5 (was: 3.4.0, 3.2.4, 3.3.9) > Allow viewfs mounts with HDFS/HCFS scheme and centralized mount table > - > > Key: HDFS-15289 > URL: https://issues.apache.org/jira/browse/HDFS-15289 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 3.2.0 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Attachments: ViewFSOverloadScheme - V1.0.pdf, ViewFSOverloadScheme.png > > > ViewFS provides the flexibility to mount different filesystem types via a mount > points configuration table. This approach solves the scalability > problems, but users need to reconfigure the filesystem to ViewFS and to its > scheme. This is problematic for paths persisted in meta > stores, ex: Hive. Systems like Hive store uris in the meta store, so > changing the file system scheme creates a burden to upgrade/recreate meta > stores. In our experience, many users are not ready to do that. > Router based federation is another implementation that provides coordinated > mount points for HDFS federation clusters. Even though it handles mount points > easily, it does not allow > other (non-HDFS) file systems to be mounted, so it does not serve the purpose > when users want to mount external (non-HDFS) filesystems. > So, the problem here is: even though many users want to adopt the scalable > fs options available, the technical challenges of changing schemes (ex: in meta > stores) in deployments are obstructing them. > So, we propose to allow the hdfs scheme in a ViewFS-like client side mount system > and let users create mount links without changing URI paths. > I will upload a detailed design doc shortly. 
[jira] [Updated] (HDFS-14571) Command line to force volume failures
[ https://issues.apache.org/jira/browse/HDFS-14571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-14571: Target Version/s: 3.2.5 (was: 3.2.4) > Command line to force volume failures > - > > Key: HDFS-14571 > URL: https://issues.apache.org/jira/browse/HDFS-14571 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, hdfs > Environment: Linux >Reporter: Scott A. Wehner >Priority: Major > Labels: disks, volumes > Original Estimate: 48h > Remaining Estimate: 48h > > A datanode with failed hard drives reports to the namenode that it has a > failed volume. In line with enabling slow-datanode detection, when we have a > failing drive that has not failed outright, or has uncorrectable sectors, I want to > be able to run a command to force-fail a datanode volume based on storageID > or target storage location (a.k.a. mount point).
[jira] [Commented] (HDFS-15289) Allow viewfs mounts with HDFS/HCFS scheme and centralized mount table
[ https://issues.apache.org/jira/browse/HDFS-15289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17565902#comment-17565902 ] Masatake Iwasaki commented on HDFS-15289: - update the targets to 3.2.5 for preparing 3.2.4 release. > Allow viewfs mounts with HDFS/HCFS scheme and centralized mount table > - > > Key: HDFS-15289 > URL: https://issues.apache.org/jira/browse/HDFS-15289 > Project: Hadoop HDFS > Issue Type: New Feature > Components: fs >Affects Versions: 3.2.0 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Attachments: ViewFSOverloadScheme - V1.0.pdf, ViewFSOverloadScheme.png > > > ViewFS provides the flexibility to mount different filesystem types via a mount > points configuration table. This approach solves the scalability > problems, but users need to reconfigure the filesystem to ViewFS and to its > scheme. This is problematic for paths persisted in meta > stores, ex: Hive. Systems like Hive store uris in the meta store, so > changing the file system scheme creates a burden to upgrade/recreate meta > stores. In our experience, many users are not ready to do that. > Router based federation is another implementation that provides coordinated > mount points for HDFS federation clusters. Even though it handles mount points > easily, it does not allow > other (non-HDFS) file systems to be mounted, so it does not serve the purpose > when users want to mount external (non-HDFS) filesystems. > So, the problem here is: even though many users want to adopt the scalable > fs options available, the technical challenges of changing schemes (ex: in meta > stores) in deployments are obstructing them. > So, we propose to allow the hdfs scheme in a ViewFS-like client side mount system > and let users create mount links without changing URI paths. > I will upload a detailed design doc shortly. 
[jira] [Updated] (HDFS-14349) Edit log may be rolled more frequently than necessary with multiple Standby nodes
[ https://issues.apache.org/jira/browse/HDFS-14349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-14349: Target Version/s: 3.2.5 (was: 3.2.4) > Edit log may be rolled more frequently than necessary with multiple Standby > nodes > - > > Key: HDFS-14349 > URL: https://issues.apache.org/jira/browse/HDFS-14349 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, hdfs, qjm >Reporter: Erik Krogen >Assignee: Ekanth Sethuramalingam >Priority: Major > Labels: multi-sbnn > > When HDFS-14317 was fixed, we tackled the problem that in a cluster with > in-progress edit log tailing enabled, a Standby NameNode may _never_ roll the > edit logs, which can eventually cause data loss. > Unfortunately, in the process, it was made so that if there are multiple > Standby NameNodes, they will all roll the edit logs at their specified > frequency, so the edit log will be rolled X times more frequently than they > should be (where X is the number of Standby NNs). This is not as bad as the > original bug since rolling frequently does not affect correctness or data > availability, but may degrade performance by creating more edit log segments > than necessary.
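The multiplier described in the issue is easy to quantify: if each of X Standby NameNodes independently triggers a roll at the configured period, the cluster sees roughly X times as many rolls (and edit log segments) as intended. A minimal sketch of that arithmetic, assuming no coordination between Standbys and even spacing:

```java
public class RollRate {
    /**
     * Approximate rolls observed per hour when each of numStandbys
     * independently triggers a roll every periodMinutes (integer division,
     * so use periods that divide 60 for exact results).
     */
    public static long rollsPerHour(int numStandbys, int periodMinutes) {
        return (60L / periodMinutes) * numStandbys;
    }

    public static void main(String[] args) {
        // Intended rate vs. observed rate with 3 Standby NNs rolling every 5 minutes.
        System.out.println("intended=" + rollsPerHour(1, 5));
        System.out.println("observed=" + rollsPerHour(3, 5));
    }
}
```

This is why the report calls the behavior a performance concern rather than a correctness bug: the segment count grows by the Standby count, but every roll is still a valid roll.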
[jira] [Updated] (HDFS-16022) matlab mapreduce v95 demos can't run hadoop-3.2.2 run time
[ https://issues.apache.org/jira/browse/HDFS-16022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16022: Target Version/s: 3.2.5 (was: 3.2.4) > matlab mapreduce v95 demos can't run hadoop-3.2.2 run time > -- > > Key: HDFS-16022 > URL: https://issues.apache.org/jira/browse/HDFS-16022 > Project: Hadoop HDFS > Issue Type: Bug > Components: dfsclient >Affects Versions: 3.2.2 > Environment: hadoop-3.2.2 + matlab run time+ centos7, the > maxArrivalDelay.ctf file is generated in win10+matlab2018b(V95) by hadoop > compiler tools. the airlinesmall.csv upload the HDFS. hadoop can run well by > the hadoop-mapreduce-examples-3.2.2.jar wordcount demos, even, jar compiled > by the source code in win10+ eclipses env. please help, I have got no idea > about this >Reporter: cathonxiong >Priority: Blocker > Attachments: matlab_errorlog > > > hadoop \ hadoop \> jar > /usr/local/MATLAB/MATLAB_Runtime/v95/toolbox/mlhadoop/jar/a2.2.0/mwmapreduce.jar > \> com.mathworks.hadoop.MWMapReduceDriver \> -D > mw.mcrroot=/usr/local/MATLAB/MATLAB_Runtime/v95 \> > /usr/local/MATLAB/MATLAB_Runtime/v95/maxArrivalDelay.ctf \> > hdfs://hadoop.namenode:50070/user/matlab/datasets/airlinesmall.csv \> > hdfs://hadoop.namenode:50070/user/matlab/resultsjava.library.path: > /usr/local/hadoop-3.2.2/lib/nativeHDFSCTFPath=hdfs://hadoop.namenode:8020/user/root/maxArrivalDelay/maxArrivalDelay.ctfUploading > CTF into distributed cache completed.mapred.child.env: > MCR_CACHE_ROOT=/tmp,LD_LIBRARY_PATH=/usr/local/MATLAB/MATLAB_Runtime/v95/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/os/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/opengl/lib/glnxa64mapred.child.java.opts: > > 
-Djava.library.path=/usr/local/MATLAB/MATLAB_Runtime/v95/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/os/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/opengl/lib/glnxa64New > java.library.path: > /usr/local/hadoop-3.2.2/lib/native:/usr/local/MATLAB/MATLAB_Runtime/v95/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/os/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/opengl/lib/glnxa64Using > MATLAB mapper.Set input format class to: ChunkFileRecordReader.Using MATLAB > reducer.Set outputformat class to: class > org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormatSet map output > key class to: class com.mathworks.hadoop.MxArrayWritable2Set map output value > class to: class com.mathworks.hadoop.MxArrayWritable2Set reduce output key > class to: class com.mathworks.hadoop.MxArrayWritable2Set reduce output value > class to: class com.mathworks.hadoop.MxArrayWritable2*** run > **2021-05-11 14:58:47,043 INFO client.RMProxy: Connecting to > ResourceManager at hadoop.namenode/192.168.0.25:80322021-05-11 14:58:47,139 > WARN net.NetUtils: Unable to wrap exception of type class > org.apache.hadoop.ipc.RpcException: it has no (String) > constructorjava.lang.NoSuchMethodException: > org.apache.hadoop.ipc.RpcException.(java.lang.String) at > java.lang.Class.getConstructor0(Class.java:3082) at > java.lang.Class.getConstructor(Class.java:1825) at > org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:835) at > org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:811) at > org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1566) at > org.apache.hadoop.ipc.Client.call(Client.java:1508) at > org.apache.hadoop.ipc.Client.call(Client.java:1405) at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) > at 
com.sun.proxy.$Proxy9.getFileInfo(Unknown Source) at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:910) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) > at > org.apache.hadoop.io.retry.RetryInvocationHan
[jira] [Commented] (HDFS-16022) matlab mapreduce v95 demos can't run hadoop-3.2.2 run time
[ https://issues.apache.org/jira/browse/HDFS-16022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17565900#comment-17565900 ] Masatake Iwasaki commented on HDFS-16022: - update the targets to 3.2.5 for preparing 3.2.4 release. > matlab mapreduce v95 demos can't run hadoop-3.2.2 run time > -- > > Key: HDFS-16022 > URL: https://issues.apache.org/jira/browse/HDFS-16022 > Project: Hadoop HDFS > Issue Type: Bug > Components: dfsclient >Affects Versions: 3.2.2 > Environment: hadoop-3.2.2 + matlab run time+ centos7, the > maxArrivalDelay.ctf file is generated in win10+matlab2018b(V95) by hadoop > compiler tools. the airlinesmall.csv upload the HDFS. hadoop can run well by > the hadoop-mapreduce-examples-3.2.2.jar wordcount demos, even, jar compiled > by the source code in win10+ eclipses env. please help, I have got no idea > about this >Reporter: cathonxiong >Priority: Blocker > Attachments: matlab_errorlog > > > hadoop \ hadoop \> jar > /usr/local/MATLAB/MATLAB_Runtime/v95/toolbox/mlhadoop/jar/a2.2.0/mwmapreduce.jar > \> com.mathworks.hadoop.MWMapReduceDriver \> -D > mw.mcrroot=/usr/local/MATLAB/MATLAB_Runtime/v95 \> > /usr/local/MATLAB/MATLAB_Runtime/v95/maxArrivalDelay.ctf \> > hdfs://hadoop.namenode:50070/user/matlab/datasets/airlinesmall.csv \> > hdfs://hadoop.namenode:50070/user/matlab/resultsjava.library.path: > /usr/local/hadoop-3.2.2/lib/nativeHDFSCTFPath=hdfs://hadoop.namenode:8020/user/root/maxArrivalDelay/maxArrivalDelay.ctfUploading > CTF into distributed cache completed.mapred.child.env: > MCR_CACHE_ROOT=/tmp,LD_LIBRARY_PATH=/usr/local/MATLAB/MATLAB_Runtime/v95/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/os/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/opengl/lib/glnxa64mapred.child.java.opts: > > 
-Djava.library.path=/usr/local/MATLAB/MATLAB_Runtime/v95/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/os/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/opengl/lib/glnxa64New > java.library.path: > /usr/local/hadoop-3.2.2/lib/native:/usr/local/MATLAB/MATLAB_Runtime/v95/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/os/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/opengl/lib/glnxa64Using > MATLAB mapper.Set input format class to: ChunkFileRecordReader.Using MATLAB > reducer.Set outputformat class to: class > org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormatSet map output > key class to: class com.mathworks.hadoop.MxArrayWritable2Set map output value > class to: class com.mathworks.hadoop.MxArrayWritable2Set reduce output key > class to: class com.mathworks.hadoop.MxArrayWritable2Set reduce output value > class to: class com.mathworks.hadoop.MxArrayWritable2*** run > **2021-05-11 14:58:47,043 INFO client.RMProxy: Connecting to > ResourceManager at hadoop.namenode/192.168.0.25:80322021-05-11 14:58:47,139 > WARN net.NetUtils: Unable to wrap exception of type class > org.apache.hadoop.ipc.RpcException: it has no (String) > constructorjava.lang.NoSuchMethodException: > org.apache.hadoop.ipc.RpcException.(java.lang.String) at > java.lang.Class.getConstructor0(Class.java:3082) at > java.lang.Class.getConstructor(Class.java:1825) at > org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:835) at > org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:811) at > org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1566) at > org.apache.hadoop.ipc.Client.call(Client.java:1508) at > org.apache.hadoop.ipc.Client.call(Client.java:1405) at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) > at 
com.sun.proxy.$Proxy9.getFileInfo(Unknown Source) at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:910) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocati
[jira] [Updated] (HDFS-16022) matlab mapreduce v95 demos can't run hadoop-3.2.2 run time
[ https://issues.apache.org/jira/browse/HDFS-16022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16022: Priority: Major (was: Blocker) > matlab mapreduce v95 demos can't run hadoop-3.2.2 run time > -- > > Key: HDFS-16022 > URL: https://issues.apache.org/jira/browse/HDFS-16022 > Project: Hadoop HDFS > Issue Type: Bug > Components: dfsclient >Affects Versions: 3.2.2 > Environment: hadoop-3.2.2 + matlab run time+ centos7, the > maxArrivalDelay.ctf file is generated in win10+matlab2018b(V95) by hadoop > compiler tools. the airlinesmall.csv upload the HDFS. hadoop can run well by > the hadoop-mapreduce-examples-3.2.2.jar wordcount demos, even, jar compiled > by the source code in win10+ eclipses env. please help, I have got no idea > about this >Reporter: cathonxiong >Priority: Major > Attachments: matlab_errorlog > > > hadoop \ hadoop \> jar > /usr/local/MATLAB/MATLAB_Runtime/v95/toolbox/mlhadoop/jar/a2.2.0/mwmapreduce.jar > \> com.mathworks.hadoop.MWMapReduceDriver \> -D > mw.mcrroot=/usr/local/MATLAB/MATLAB_Runtime/v95 \> > /usr/local/MATLAB/MATLAB_Runtime/v95/maxArrivalDelay.ctf \> > hdfs://hadoop.namenode:50070/user/matlab/datasets/airlinesmall.csv \> > hdfs://hadoop.namenode:50070/user/matlab/resultsjava.library.path: > /usr/local/hadoop-3.2.2/lib/nativeHDFSCTFPath=hdfs://hadoop.namenode:8020/user/root/maxArrivalDelay/maxArrivalDelay.ctfUploading > CTF into distributed cache completed.mapred.child.env: > MCR_CACHE_ROOT=/tmp,LD_LIBRARY_PATH=/usr/local/MATLAB/MATLAB_Runtime/v95/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/os/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/opengl/lib/glnxa64mapred.child.java.opts: > > -Djava.library.path=/usr/local/MATLAB/MATLAB_Runtime/v95/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/os/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/opengl/lib/glnxa64New > 
java.library.path: > /usr/local/hadoop-3.2.2/lib/native:/usr/local/MATLAB/MATLAB_Runtime/v95/runtime/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/bin/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/os/glnxa64:/usr/local/MATLAB/MATLAB_Runtime/v95/sys/opengl/lib/glnxa64Using > MATLAB mapper.Set input format class to: ChunkFileRecordReader.Using MATLAB > reducer.Set outputformat class to: class > org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormatSet map output > key class to: class com.mathworks.hadoop.MxArrayWritable2Set map output value > class to: class com.mathworks.hadoop.MxArrayWritable2Set reduce output key > class to: class com.mathworks.hadoop.MxArrayWritable2Set reduce output value > class to: class com.mathworks.hadoop.MxArrayWritable2*** run > **2021-05-11 14:58:47,043 INFO client.RMProxy: Connecting to > ResourceManager at hadoop.namenode/192.168.0.25:80322021-05-11 14:58:47,139 > WARN net.NetUtils: Unable to wrap exception of type class > org.apache.hadoop.ipc.RpcException: it has no (String) > constructorjava.lang.NoSuchMethodException: > org.apache.hadoop.ipc.RpcException.(java.lang.String) at > java.lang.Class.getConstructor0(Class.java:3082) at > java.lang.Class.getConstructor(Class.java:1825) at > org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:835) at > org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:811) at > org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1566) at > org.apache.hadoop.ipc.Client.call(Client.java:1508) at > org.apache.hadoop.ipc.Client.call(Client.java:1405) at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) > at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source) at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:910) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.inv
[jira] [Updated] (HDFS-16177) Bug fix for Util#receiveFile
[ https://issues.apache.org/jira/browse/HDFS-16177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16177:
Fix Version/s: 3.2.4 (was: 3.2.3)

> Bug fix for Util#receiveFile
> ----------------------------
>
> Key: HDFS-16177
> URL: https://issues.apache.org/jira/browse/HDFS-16177
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Tao Li
> Assignee: Tao Li
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
> Attachments: download-fsimage.jpg
> Time Spent: 1h
> Remaining Estimate: 0h
>
> The time taken to write the file was miscalculated in Util#receiveFile.
> !download-fsimage.jpg|width=578,height=134!
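The issue only says the write time was miscalculated (the details are in the attached screenshot, not reproduced here). As a hedged illustration of correct elapsed-time measurement around a transfer, a monotonic clock read immediately before and after the copy avoids both wall-clock skew and mis-placed start points; the class and method names below are hypothetical, not the actual Util code:

```java
// Minimal sketch (hypothetical names): time a transfer with a monotonic clock.
public class ReceiveTimingSketch {
    // Runs the transfer and returns the elapsed wall time in milliseconds.
    public static long timeMillis(Runnable transfer) {
        long start = System.nanoTime();   // read immediately before the copy
        transfer.run();
        // System.nanoTime() is monotonic, so this stays correct even if the
        // system clock is adjusted while the file is being received.
        return (System.nanoTime() - start) / 1_000_000L;
    }
}
```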
[jira] [Updated] (HDFS-16198) Short circuit read leaks Slot objects when InvalidToken exception is thrown
[ https://issues.apache.org/jira/browse/HDFS-16198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16198:
Fix Version/s: 3.2.4 (was: 3.2.3)

> Short circuit read leaks Slot objects when InvalidToken exception is thrown
> ---------------------------------------------------------------------------
>
> Key: HDFS-16198
> URL: https://issues.apache.org/jira/browse/HDFS-16198
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Eungsop Yoo
> Assignee: Eungsop Yoo
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 2.10.2, 3.3.2, 3.2.4
> Attachments: HDFS-16198.patch, screenshot-2.png
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> In secure mode, 'dfs.block.access.token.enable' should be set to 'true'. With
> this configuration a SecretManager.InvalidToken exception may be thrown if the
> access token expires during a short-circuit read. That by itself is harmless,
> because the failed read will be retried. But it leaks ShortCircuitShm.Slot
> objects.
>
> We found this problem in our secure HBase clusters: the number of open file
> descriptors of RegionServers kept increasing when short-circuit reads were used.
> !screenshot-2.png!
>
> It was caused by the leakage of shared memory segments used by short-circuit
> reads.
> {code:java}
> [root ~]# lsof -p $(ps -ef | grep proc_regionserver | grep -v grep | awk '{print $2}') | grep /dev/shm | wc -l
> 3925
> [root ~]# lsof -p $(ps -ef | grep proc_regionserver | grep -v grep | awk '{print $2}') | grep /dev/shm | head -5
> java 86309 hbase DEL REG 0,19 2308279984 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_743473959
> java 86309 hbase DEL REG 0,19 2306359893 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_1594162967
> java 86309 hbase DEL REG 0,19 2305496758 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_2043027439
> java 86309 hbase DEL REG 0,19 2304784261 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_689571088
> java 86309 hbase DEL REG 0,19 2302621988 /dev/shm/HadoopShortCircuitShm_DFSClient_NONMAPREDUCE_-1107866286_1_347008590
> {code}
>
> We finally found that the root cause is the leakage of ShortCircuitShm.Slot.
>
> The fix is trivial: just free the slot when the InvalidToken exception is thrown.
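The "free the slot on the InvalidToken path" pattern can be sketched as below. This is a simplified stand-in, not the actual HDFS classes: `Slot` and `InvalidTokenException` here are hypothetical types illustrating the resource-release-on-exception fix, with the retry left to the caller:

```java
// Simplified sketch of the fix pattern (hypothetical types, not HDFS code).
public class SlotLeakSketch {
    public static class Slot {
        public boolean freed = false;
        public void free() { freed = true; }   // returns the shared-memory slot
    }
    public static class InvalidTokenException extends Exception { }

    // Returns true on success. On an expired token it frees the slot before
    // reporting failure -- without the free() call, every expired-token retry
    // leaks one slot (and eventually a /dev/shm segment).
    public static boolean attemptRead(Slot slot, boolean tokenValid) {
        try {
            if (!tokenValid) {
                throw new InvalidTokenException();  // token expired
            }
            return true;          // success: caller keeps and later reuses the slot
        } catch (InvalidTokenException e) {
            slot.free();          // the essence of the fix: release before retrying
            return false;         // caller retries through a non-short-circuit path
        }
    }
}
```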
[jira] [Updated] (HDFS-16241) Standby close reconstruction thread
[ https://issues.apache.org/jira/browse/HDFS-16241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16241:
Fix Version/s: 3.2.4 (was: 3.2.3)

> Standby close reconstruction thread
> -----------------------------------
>
> Key: HDFS-16241
> URL: https://issues.apache.org/jira/browse/HDFS-16241
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: zhanghuazong
> Assignee: zhanghuazong
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
> Attachments: HDFS-16241
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> If the "Reconstruction Queue Initializer" thread of the active NameNode has
> not yet stopped when the NameNode transitions to standby, that thread should
> be closed.
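The shutdown behavior described above amounts to interrupting and joining the initializer thread on the state transition. The sketch below uses hypothetical names (it is not the NameNode's actual HA transition code) to show the interrupt-then-join pattern:

```java
// Hypothetical sketch of stopping a long-running initializer thread on a
// state transition (not the actual NameNode code).
public class ReconstructionThreadSketch {
    public Thread initializer;   // stand-in for "Reconstruction Queue Initializer"

    public void startInitializer() {
        initializer = new Thread(() -> {
            try {
                Thread.sleep(60_000);       // simulates a long queue scan
            } catch (InterruptedException e) {
                // interrupted by the active -> standby transition: exit promptly
            }
        });
        initializer.start();
    }

    // Called when the NameNode leaves active state.
    public void stopInitializer() {
        if (initializer != null) {
            initializer.interrupt();        // wake the thread out of its work
            try {
                initializer.join();         // wait until it has really exited
            } catch (InterruptedException ignored) { }
            initializer = null;
        }
    }
}
```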
[jira] [Updated] (HDFS-16187) SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN restarts with checkpointing
[ https://issues.apache.org/jira/browse/HDFS-16187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16187:
Fix Version/s: 3.2.4 (was: 3.2.3)

> SnapshotDiff behaviour with Xattrs and Acls is not consistent across NN
> restarts with checkpointing
> -----------------------------------------------------------------------
>
> Key: HDFS-16187
> URL: https://issues.apache.org/jira/browse/HDFS-16187
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: snapshots
> Reporter: Srinivasu Majeti
> Assignee: Shashikant Banerjee
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
> Time Spent: 5h
> Remaining Estimate: 0h
>
> The test below shows that the snapshot diff across snapshots is not
> consistent with Xattrs (here an encryption zone setting the Xattr) across NN
> restarts with a checkpointed FsImage.
> {code:java}
> @Test
> public void testEncryptionZonesWithSnapshots() throws Exception {
>   final Path snapshottable = new Path("/zones");
>   fsWrapper.mkdir(snapshottable, FsPermission.getDirDefault(), true);
>   dfsAdmin.allowSnapshot(snapshottable);
>   dfsAdmin.createEncryptionZone(snapshottable, TEST_KEY, NO_TRASH);
>   fs.createSnapshot(snapshottable, "snap1");
>   SnapshotDiffReport report =
>       fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
>   report = fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   System.out.println(report);
>   Assert.assertEquals(0, report.getDiffList().size());
>   fs.setSafeMode(SafeModeAction.SAFEMODE_ENTER);
>   fs.saveNamespace();
>   fs.setSafeMode(SafeModeAction.SAFEMODE_LEAVE);
>   cluster.restartNameNode(true);
>   report = fs.getSnapshotDiffReport(snapshottable, "snap1", "");
>   Assert.assertEquals(0, report.getDiffList().size());
> }{code}
> {code:java}
> Pre Restart:
> Difference between snapshot snap1 and current directory under directory /zones:
>
> Post Restart:
> Difference between snapshot snap1 and current directory under directory /zones:
> M .{code}
> The side effect of this behavior is that distcp with snapshot diff would fail
> with the below error, complaining that the target cluster has some data changed:
> {code:java}
> WARN tools.DistCp: The target has been modified since snapshot x
> {code}
[jira] [Updated] (HDFS-16182) numOfReplicas is given the wrong value in BlockPlacementPolicyDefault$chooseTarget can cause DataStreamer to fail with Heterogeneous Storage
[ https://issues.apache.org/jira/browse/HDFS-16182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16182:
Fix Version/s: 3.2.4 (was: 3.2.3)

> numOfReplicas is given the wrong value in
> BlockPlacementPolicyDefault$chooseTarget can cause DataStreamer to fail with
> Heterogeneous Storage
> ---------------------------------------------------------------------------
>
> Key: HDFS-16182
> URL: https://issues.apache.org/jira/browse/HDFS-16182
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namanode
> Affects Versions: 3.4.0
> Reporter: Max Xie
> Assignee: Max Xie
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
> Attachments: HDFS-16182.patch
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> In our HDFS cluster, we use heterogeneous storage to store data on SSD for
> better performance. Sometimes, when an HDFS client transfers data in a
> pipeline, it throws an IOException and exits. The exception log is below:
> ```
> java.io.IOException: Failed to replace a bad datanode on the existing
> pipeline due to no more good datanodes being available to try. (Nodes:
> current=[DatanodeInfoWithStorage[dn01_ip:5004,DS-ef7882e0-427d-4c1e-b9ba-a929fac44fb4,DISK],
> DatanodeInfoWithStorage[dn02_ip:5004,DS-3871282a-ad45-4332-866a-f000f9361ecb,DISK],
> DatanodeInfoWithStorage[dn03_ip:5004,DS-a388c067-76a4-4014-a16c-ccc49c8da77b,SSD],
> DatanodeInfoWithStorage[dn04_ip:5004,DS-b81da262-0dd9-4567-a498-c516fab84fe0,SSD],
> DatanodeInfoWithStorage[dn05_ip:5004,DS-34e3af2e-da80-46ac-938c-6a3218a646b9,SSD]],
> original=[DatanodeInfoWithStorage[dn01_ip:5004,DS-ef7882e0-427d-4c1e-b9ba-a929fac44fb4,DISK],
> DatanodeInfoWithStorage[dn02_ip:5004,DS-3871282a-ad45-4332-866a-f000f9361ecb,DISK]]).
> The current failed datanode replacement policy is DEFAULT, and a client may
> configure this via
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its
> configuration.
> ```
> After looking into it, I found that when an existing pipeline needs to
> replace a DN to keep transferring data, the client gets one additional DN
> from the NameNode and checks that the number of DNs is the original number + 1:
> ```
> ## DataStreamer$findNewDatanode
> if (nodes.length != original.length + 1) {
>   throw new IOException(
>       "Failed to replace a bad datanode on the existing pipeline "
>       + "due to no more good datanodes being available to try. "
>       + "(Nodes: current=" + Arrays.asList(nodes)
>       + ", original=" + Arrays.asList(original) + "). "
>       + "The current failed datanode replacement policy is "
>       + dfsClient.dtpReplaceDatanodeOnFailure
>       + ", and a client may configure this via '"
>       + BlockWrite.ReplaceDatanodeOnFailure.POLICY_KEY
>       + "' in its configuration.");
> }
> ```
> The root cause is that Namenode$getAdditionalDatanode returns multiple
> datanodes, not one, in DataStreamer.addDatanode2ExistingPipeline.
>
> Maybe we can fix it in BlockPlacementPolicyDefault$chooseTarget. I think
> numOfReplicas should not be assigned from requiredStorageTypes.
[jira] [Updated] (HDFS-16350) Datanode start time should be set after RPC server starts successfully
[ https://issues.apache.org/jira/browse/HDFS-16350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16350:
Fix Version/s: 3.2.4 (was: 3.2.3)

> Datanode start time should be set after RPC server starts successfully
> ----------------------------------------------------------------------
>
> Key: HDFS-16350
> URL: https://issues.apache.org/jira/browse/HDFS-16350
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
> Attachments: Screenshot 2021-11-23 at 4.32.04 PM.png
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> We set the start time of the Datanode when the class is instantiated, but it
> should ideally be set only after the RPC server has started and RPC handlers
> are initialized to serve client requests.
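The ordering change described above can be sketched in a few lines. The class and field names here are hypothetical stand-ins (the real DataNode wiring is more involved); the point is only where the timestamp is recorded:

```java
// Hypothetical sketch: record startTime after the RPC server is serving,
// not in the constructor.
public class DataNodeStartSketch {
    public long startTime;          // stays 0 until the node actually serves
    public boolean rpcStarted;

    public void startRpcServer() {  // stand-in for the real RPC server startup
        rpcStarted = true;
    }

    public void start() {
        startRpcServer();
        // The fix: the start time is set only once handlers can take client
        // requests, so uptime reported on the web UI reflects real service time.
        startTime = System.currentTimeMillis();
    }
}
```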
[jira] [Updated] (HDFS-16337) Show start time of Datanode on Web
[ https://issues.apache.org/jira/browse/HDFS-16337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16337:
Fix Version/s: 3.2.4 (was: 3.2.3)

> Show start time of Datanode on Web
> ----------------------------------
>
> Key: HDFS-16337
> URL: https://issues.apache.org/jira/browse/HDFS-16337
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Tao Li
> Assignee: Tao Li
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2, 3.2.4
> Attachments: image-2021-11-19-08-55-58-343.png
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Show _start time_ of Datanode on Web.
> !image-2021-11-19-08-55-58-343.png|width=540,height=155!
[jira] [Updated] (HDFS-16352) return the real datanode numBlocks in #getDatanodeStorageReport
[ https://issues.apache.org/jira/browse/HDFS-16352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16352:
Fix Version/s: 3.2.4

> return the real datanode numBlocks in #getDatanodeStorageReport
> ---------------------------------------------------------------
>
> Key: HDFS-16352
> URL: https://issues.apache.org/jira/browse/HDFS-16352
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: qinyuren
> Assignee: qinyuren
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.9
> Attachments: image-2021-11-23-22-04-06-131.png
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> #getDatanodeStorageReport returns an array of DatanodeStorageReport, each of
> which contains a DatanodeInfo, but the numBlocks in DatanodeInfo is always
> zero, which is confusing.
> !image-2021-11-23-22-04-06-131.png|width=683,height=338!
> We could instead return the real numBlocks in DatanodeInfo when
> #getDatanodeStorageReport is called.
[jira] [Updated] (HDFS-16430) Validate maximum blocks in EC group when adding an EC policy
[ https://issues.apache.org/jira/browse/HDFS-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16430:
Fix Version/s: 3.2.4 (was: 3.2.3)

> Validate maximum blocks in EC group when adding an EC policy
> ------------------------------------------------------------
>
> Key: HDFS-16430
> URL: https://issues.apache.org/jira/browse/HDFS-16430
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: ec, erasure-coding
> Affects Versions: 3.3.0, 3.3.1
> Reporter: daimin
> Assignee: daimin
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.9
> Time Spent: 50m
> Remaining Estimate: 0h
>
> HDFS EC uses the last 4 bits of the block ID to store the block index within
> the EC block group. Therefore the maximum number of blocks in an EC block
> group is 2^4 = 16, which is defined in HdfsServerConstants#MAX_BLOCKS_IN_GROUP.
> Currently there is no limitation or warning when adding a bad EC policy with
> numDataUnits + numParityUnits > 16. It only results in read/write errors on
> EC files using the bad policy, which is not very straightforward to users.
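The proposed validation is a simple arithmetic check at policy-add time. The sketch below is a hedged illustration (method name and message are hypothetical; only the 16-block limit comes from the issue):

```java
// Sketch of the proposed check: reject an EC policy whose total units exceed
// the 16-block limit implied by the 4-bit block index.
public class EcPolicyValidationSketch {
    // 2^4 = 16: the block index occupies the last 4 bits of the block ID.
    public static final int MAX_BLOCKS_IN_GROUP = 1 << 4;

    public static void validate(int numDataUnits, int numParityUnits) {
        if (numDataUnits + numParityUnits > MAX_BLOCKS_IN_GROUP) {
            // Fail fast when the policy is added, instead of surfacing as
            // read/write errors on files later.
            throw new IllegalArgumentException(
                "numDataUnits + numParityUnits = " + (numDataUnits + numParityUnits)
                + " exceeds the maximum of " + MAX_BLOCKS_IN_GROUP
                + " blocks in an EC group");
        }
    }
}
```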
[jira] [Updated] (HDFS-16403) Improve FUSE IO performance by supporting FUSE parameter max_background
[ https://issues.apache.org/jira/browse/HDFS-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16403:
Fix Version/s: 3.2.4 (was: 3.2.3)

> Improve FUSE IO performance by supporting FUSE parameter max_background
> -----------------------------------------------------------------------
>
> Key: HDFS-16403
> URL: https://issues.apache.org/jira/browse/HDFS-16403
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: fuse-dfs
> Affects Versions: 3.3.0, 3.3.1
> Reporter: daimin
> Assignee: daimin
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.9
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> When examining FUSE IO performance on HDFS, we found that the number of
> simultaneous IO requests is limited to a fixed number, like 12. This
> limitation makes IO performance on the FUSE client quite unacceptable. We did
> some research on this and, inspired by the article [Performance and Resource
> Utilization of FUSE User-Space File
> Systems|https://dl.acm.org/doi/fullHtml/10.1145/3310148], found that the FUSE
> parameter '{{{}max_background{}}}' decides the number of simultaneous IO
> requests, which is 12 by default.
> We add 'max_background' to the fuse_dfs mount options; the FUSE kernel picks
> it up when an option value is given.
[jira] [Updated] (HDFS-16437) ReverseXML processor doesn't accept XML files without the SnapshotDiffSection.
[ https://issues.apache.org/jira/browse/HDFS-16437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16437:
Fix Version/s: 3.2.4 (was: 3.2.3)

> ReverseXML processor doesn't accept XML files without the SnapshotDiffSection.
> ------------------------------------------------------------------------------
>
> Key: HDFS-16437
> URL: https://issues.apache.org/jira/browse/HDFS-16437
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 3.1.1, 3.3.0
> Reporter: yanbin.zhang
> Assignee: yanbin.zhang
> Priority: Critical
> Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
> Time Spent: 5h 40m
> Remaining Estimate: 0h
>
> In a cluster environment without snapshots, trying to convert the generated
> XML back to an fsimage reports an error:
> {code:java}
> [test@test001 ~]$ hdfs oiv -p ReverseXML -i fsimage_0257220.xml -o fsimage_0257220
> OfflineImageReconstructor failed: FSImage XML ended prematurely, without
> including section(s) SnapshotDiffSection
> java.io.IOException: FSImage XML ended prematurely, without including
> section(s) SnapshotDiffSection
>   at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.processXml(OfflineImageReconstructor.java:1765)
>   at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageReconstructor.run(OfflineImageReconstructor.java:1842)
>   at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewerPB.run(OfflineImageViewerPB.java:211)
>   at org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewerPB.main(OfflineImageViewerPB.java:149)
> 22/01/25 15:56:52 INFO util.ExitUtil: Exiting with status 1: ExitException
> {code}
[jira] [Updated] (HDFS-11041) Unable to unregister FsDatasetState MBean if DataNode is shutdown twice
[ https://issues.apache.org/jira/browse/HDFS-11041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-11041:
Fix Version/s: 3.2.4 (was: 3.2.3)

> Unable to unregister FsDatasetState MBean if DataNode is shutdown twice
> -----------------------------------------------------------------------
>
> Key: HDFS-11041
> URL: https://issues.apache.org/jira/browse/HDFS-11041
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Priority: Trivial
> Fix For: 3.4.0, 2.10.2, 3.2.4, 3.3.3
> Attachments: HDFS-11041.01.patch, HDFS-11041.02.patch, HDFS-11041.03.patch
>
> I saw error messages like the following in some tests:
> {noformat}
> 2016-10-21 04:09:03,900 [main] WARN util.MBeans (MBeans.java:unregister(114)) - Error unregistering Hadoop:service=DataNode,name=FSDatasetState-33cd714c-0b1a-471f-8efe-f431d7d874bc
> javax.management.InstanceNotFoundException: Hadoop:service=DataNode,name=FSDatasetState-33cd714c-0b1a-471f-8efe-f431d7d874bc
>   at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
>   at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
>   at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
>   at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:546)
>   at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:112)
>   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:2127)
>   at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:2016)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1985)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1962)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1936)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1929)
>   at org.apache.hadoop.hdfs.TestDatanodeReport.testDatanodeReport(TestDatanodeReport.java:144)
> {noformat}
> The test shuts down the datanode and then shuts down the cluster, which shuts
> down the datanode twice. Resetting the FsDatasetSpi reference in DataNode to
> null resolves the issue.
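The null-the-reference fix makes shutdown idempotent: the second call finds no dataset to tear down and so never tries to unregister the MBean again. A minimal sketch with hypothetical names (the real code works on FsDatasetSpi):

```java
// Sketch of the idempotent-shutdown pattern (hypothetical Dataset type).
public class ShutdownSketch {
    public interface Dataset { void shutdown(); }  // stand-in for FsDatasetSpi

    public Dataset data;                           // nulled after first shutdown
    public ShutdownSketch(Dataset d) { this.data = d; }

    public void shutdown() {
        if (data != null) {
            data.shutdown();   // unregisters the MBean exactly once
            data = null;       // the fix: a second shutdown() becomes a no-op
        }
    }
}
```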
[jira] [Updated] (HDFS-16428) Source path with storagePolicy cause wrong typeConsumed while rename
[ https://issues.apache.org/jira/browse/HDFS-16428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16428:
Fix Version/s: 3.2.4 (was: 3.2.3)

> Source path with storagePolicy cause wrong typeConsumed while rename
> --------------------------------------------------------------------
>
> Key: HDFS-16428
> URL: https://issues.apache.org/jira/browse/HDFS-16428
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, namenode
> Reporter: lei w
> Assignee: lei w
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.3
> Attachments: example.txt
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> When computing quota in a rename operation, we use the storage policy of the
> target directory to compute the src quota usage. This causes a wrong
> typeConsumed value when the source path has a storage policy set. I provided
> a unit test to demonstrate this situation.
[jira] [Updated] (HDFS-15737) Don't remove datanodes from outOfServiceNodeBlocks while checking in DatanodeAdminManager
[ https://issues.apache.org/jira/browse/HDFS-15737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15737:
Fix Version/s: (was: 2.10.2)

> Don't remove datanodes from outOfServiceNodeBlocks while checking in
> DatanodeAdminManager
> ---------------------------------------------------------------------
>
> Key: HDFS-15737
> URL: https://issues.apache.org/jira/browse/HDFS-15737
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Ye Ni
> Assignee: Ye Ni
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> With CyclicIteration, removing an item while iterating causes either a dead
> loop or a ConcurrentModificationException.
> The item should instead be removed via
> {{toRemove.add(dn);}}
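The `toRemove` approach above is the classic deferred-removal pattern: collect keys to drop in a side list while iterating, and only mutate the map after the loop finishes. A self-contained sketch (hypothetical method and map, not the DatanodeAdminManager code):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of deferred removal: never mutate the map inside the iteration,
// which avoids ConcurrentModificationException (or a never-ending cyclic scan).
public class DeferredRemovalSketch {
    // Removes entries whose value is true (e.g. "decommission finished"),
    // but only after iteration ends. Returns the removed keys.
    public static List<String> drainFinished(Map<String, Boolean> outOfService) {
        List<String> toRemove = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : outOfService.entrySet()) {
            if (e.getValue()) {
                toRemove.add(e.getKey());   // defer the removal
            }
        }
        outOfService.keySet().removeAll(toRemove);  // safe: the loop is done
        return toRemove;
    }
}
```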
[jira] [Commented] (HDFS-13678) StorageType is incompatible when rolling upgrade to 2.6/2.6+ versions
[ https://issues.apache.org/jira/browse/HDFS-13678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541340#comment-17541340 ] Masatake Iwasaki commented on HDFS-13678:
Updated the target version in preparation for the 2.10.2 release.

> StorageType is incompatible when rolling upgrade to 2.6/2.6+ versions
> ---------------------------------------------------------------------
>
> Key: HDFS-13678
> URL: https://issues.apache.org/jira/browse/HDFS-13678
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: rolling upgrades
> Affects Versions: 2.5.0
> Reporter: Yiqun Lin
> Priority: Major
>
> In version 2.6.0, we supported more storage types in HDFS, implemented in
> HDFS-6584. But this seems to be an incompatible change when we rolling-upgrade
> our cluster from 2.5.0 to 2.6.0; it throws the following error:
> {noformat}
> 2018-06-14 11:43:39,246 ERROR [DataNode: [[[DISK]file:/home/vipshop/hard_disk/dfs/, [DISK]file:/data1/dfs/, [DISK]file:/data2/dfs/]] heartbeating to xx.xx.xx.xx:8022] org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService for Block pool BP-670256553-xx.xx.xx.xx-1528795419404 (Datanode Uuid ab150e05-fcb7-49ed-b8ba-f05c27593fee) service to xx.xx.xx.xx:8022
> java.lang.ArrayStoreException
>   at java.util.ArrayList.toArray(ArrayList.java:412)
>   at java.util.Collections$UnmodifiableCollection.toArray(Collections.java:1034)
>   at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1030)
>   at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:836)
>   at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:146)
>   at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:566)
>   at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:664)
>   at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:835)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> The scenario is that the old DN fails to parse the StorageType it got from
> the new NN. This error takes place when sending heartbeats to the NN, and
> blocks won't be reported to the NN successfully, which leads to subsequent
> errors.
> Corresponding logic in 2.5.0:
> {code}
> public static BlockCommand convert(BlockCommandProto blkCmd) {
>   ...
>   StorageType[][] targetStorageTypes = new StorageType[targetList.size()][];
>   List targetStorageTypesList = blkCmd.getTargetStorageTypesList();
>   if (targetStorageTypesList.isEmpty()) { // missing storage types
>     for(int i = 0; i < targetStorageTypes.length; i++) {
>       targetStorageTypes[i] = new StorageType[targets[i].length];
>       Arrays.fill(targetStorageTypes[i], StorageType.DEFAULT);
>     }
>   } else {
>     for(int i = 0; i < targetStorageTypes.length; i++) {
>       List p = targetStorageTypesList.get(i).getStorageTypesList();
>       targetStorageTypes[i] = p.toArray(new StorageType[p.size()]); < error here
>     }
>   }
> {code}
> But given the current logic, it would be better to return the default type
> instead of throwing an exception in case StorageType changed (new fields added
> or new types) in new versions during a rolling upgrade:
> {code:java}
> public static StorageType convertStorageType(StorageTypeProto type) {
>   switch(type) {
>   case DISK:
>     return StorageType.DISK;
>   case SSD:
>     return StorageType.SSD;
>   case ARCHIVE:
>     return StorageType.ARCHIVE;
>   case RAM_DISK:
>     return StorageType.RAM_DISK;
>   case PROVIDED:
>     return StorageType.PROVIDED;
>   default:
>     throw new IllegalStateException(
>         "BUG: StorageTypeProto not found, type=" + type);
>   }
> }
> {code}
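The proposed "fall back to a default instead of throwing" behavior can be illustrated with a small sketch. Note the hedge: the real PBHelper converts protobuf enum values, not strings, and the fallback is the reporter's proposal, not current Hadoop behavior; enum constants follow the snippet above:

```java
// String-based illustration of the proposed graceful fallback for storage
// types an older version does not know about (hypothetical simplification;
// the real code converts StorageTypeProto enum values).
public class StorageTypeConvertSketch {
    public enum StorageType { DISK, SSD, ARCHIVE, RAM_DISK, PROVIDED }
    public static final StorageType DEFAULT = StorageType.DISK;

    public static StorageType convert(String protoName) {
        try {
            return StorageType.valueOf(protoName);
        } catch (IllegalArgumentException unknownNewerType) {
            // Proposed change: degrade to the default during a rolling upgrade
            // instead of failing the heartbeat with an exception.
            return DEFAULT;
        }
    }
}
```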
[jira] [Updated] (HDFS-13678) StorageType is incompatible when rolling upgrade to 2.6/2.6+ versions
[ https://issues.apache.org/jira/browse/HDFS-13678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-13678:
Target Version/s: 2.10.3, 2.9.3 (was: 2.9.3, 2.10.2)

> StorageType is incompatible when rolling upgrade to 2.6/2.6+ versions
> ---------------------------------------------------------------------
>
> Key: HDFS-13678
> URL: https://issues.apache.org/jira/browse/HDFS-13678
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: rolling upgrades
> Affects Versions: 2.5.0
> Reporter: Yiqun Lin
> Priority: Major
[jira] [Commented] (HDFS-14794) [SBN read] reportBadBlock is rejected by Observer.
[ https://issues.apache.org/jira/browse/HDFS-14794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541339#comment-17541339 ] Masatake Iwasaki commented on HDFS-14794:
Updated the target version in preparation for the 2.10.2 release.

> [SBN read] reportBadBlock is rejected by Observer.
> --------------------------------------------------
>
> Key: HDFS-14794
> URL: https://issues.apache.org/jira/browse/HDFS-14794
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Affects Versions: 2.10.0
> Reporter: Konstantin Shvachko
> Priority: Major
>
> {{reportBadBlock}} is rejected by Observer via StandbyException:
> {code}StandbyException: Operation category WRITE is not supported in state observer{code}
> We should investigate what the consequences of this are, and whether we should
> treat {{reportBadBlock}} as IBRs. Note that {{reportBadBlock}} is part of
> both {{ClientProtocol}} and {{DatanodeProtocol}}.
[jira] [Updated] (HDFS-14794) [SBN read] reportBadBlock is rejected by Observer.
[ https://issues.apache.org/jira/browse/HDFS-14794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-14794: Target Version/s: 2.10.3 (was: 2.10.2) > [SBN read] reportBadBlock is rejected by Observer. > -- > > Key: HDFS-14794 > URL: https://issues.apache.org/jira/browse/HDFS-14794 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: 2.10.0 >Reporter: Konstantin Shvachko >Priority: Major > > {{reportBadBlock}} is rejected by Observer via StandbyException > {code}StandbyException: Operation category WRITE is not supported in state > observer{code} > We should investigate what the consequences of this are and whether we should > treat {{reportBadBlock}} as IBRs. Note that {{reportBadBlock}} is a part of > both {{ClientProtocol}} and {{DatanodeProtocol}} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15037) Encryption Zone operations should not block other RPC calls while retrieving encryption keys.
[ https://issues.apache.org/jira/browse/HDFS-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15037: Target Version/s: 2.10.3 (was: 2.10.2) > Encryption Zone operations should not block other RPC calls while retrieving > encryption keys. > - > > Key: HDFS-15037 > URL: https://issues.apache.org/jira/browse/HDFS-15037 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, namenode >Affects Versions: 2.10.0 >Reporter: Konstantin Shvachko >Priority: Major > > I believe the intention was to avoid blocking other operations while > retrieving keys while holding {{FSDirectory.dirLock}}. But in reality all > other operations first enter {{FSNamesystemLock}} and then {{dirLock}}. So they > are all blocked waiting for the key. > We see a substantial increase in RPC wait time ({{RpcQueueTimeAvgTime}}) on > NameNode when encryption operations are intermixed with regular workloads. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
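The lock-ordering problem described above can be sketched as follows. The class and method names are illustrative, not the actual FSNamesystem code: the point is where the slow KMS call happens relative to the global namesystem lock.

```java
// Sketch of the locking issue described above. A slow KMS call made while
// holding the global lock stalls every other RPC handler; fetching the key
// first keeps the critical section short. All names here are hypothetical.
class CreateEncryptedFileSketch {
  private final Object fsNamesystemLock = new Object();

  // Stand-in for a remote KMS call that can take tens of milliseconds.
  byte[] fetchKeyFromKms(String zoneKeyName) {
    return new byte[] {1, 2, 3};
  }

  // Problematic shape: every other RPC handler queues behind the slow call.
  void createFileBlocking(String src, String zoneKeyName) {
    synchronized (fsNamesystemLock) {
      byte[] edek = fetchKeyFromKms(zoneKeyName); // slow call under the lock
      updateNamespace(src, edek);
    }
  }

  // Better shape: fetch the key first, then take the lock only to mutate.
  void createFileNonBlocking(String src, String zoneKeyName) {
    byte[] edek = fetchKeyFromKms(zoneKeyName);   // slow call outside the lock
    synchronized (fsNamesystemLock) {
      updateNamespace(src, edek);                 // re-validate state as needed
    }
  }

  private void updateNamespace(String src, byte[] edek) {
    // Stand-in for the actual namespace mutation.
  }
}
```

Moving the fetch outside the lock requires re-validating the namespace state after reacquiring it, since another operation may have changed it in the meantime.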
[jira] [Commented] (HDFS-15357) Do not trust bad block reports from clients
[ https://issues.apache.org/jira/browse/HDFS-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541336#comment-17541336 ] Masatake Iwasaki commented on HDFS-15357: - updated the target version for preparing 2.10.2 release. > Do not trust bad block reports from clients > --- > > Key: HDFS-15357 > URL: https://issues.apache.org/jira/browse/HDFS-15357 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Priority: Major > > {{reportBadBlocks()}} is implemented by both ClientNamenodeProtocol and > DatanodeProtocol. When DFSClient calls it, a faulty client can cause > data availability issues in a cluster. > In the past we had such an incident where a node with a faulty NIC was > randomly corrupting data. All clients that ran on the machine reported all > accessed blocks and all associated replicas to be corrupt. More recently, a > single faulty client process caused a small number of missing blocks. In > all cases, the actual data was fine. > The bad block reports from clients shouldn't be trusted blindly. Instead, the > namenode should send a datanode command to verify the claim. A bonus would be > to keep the record for a while and ignore repeated reports from the same > nodes. > At a minimum, there should be an option to ignore bad block reports from > clients, perhaps after logging them. A very crude way would be to make it > short-circuit in {{ClientNamenodeProtocolServerSideTranslatorPB#reportBadBlocks()}}. > A more sophisticated way would be to check for the datanode user name in > {{FSNamesystem#reportBadBlocks()}} so that it can be easily logged, or > optionally do further processing. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
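A crude form of the "ignore client reports" option floated above could look like this sketch; the configuration flag and the caller classification are hypothetical illustrations, not an existing Hadoop option.

```java
// Sketch of the "ignore bad block reports from clients" option suggested
// above. The trustClientReports flag and the caller classification are
// hypothetical; they do not correspond to an actual Hadoop configuration key.
class BadBlockReportPolicy {
  private final boolean trustClientReports;

  BadBlockReportPolicy(boolean trustClientReports) {
    this.trustClientReports = trustClientReports;
  }

  /** Returns true if the report should be processed, false if only logged. */
  boolean shouldProcess(boolean reportedByDatanode) {
    if (reportedByDatanode) {
      return true;           // DataNodes verify replica checksums themselves
    }
    // Client reports: either trust them, or log and drop, so a single faulty
    // client cannot mark healthy replicas corrupt cluster-wide.
    return trustClientReports;
  }
}
```

The issue's preferred fix, having the NameNode dispatch a DataNode command to verify the claim, would replace the boolean drop with an asynchronous verification step.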
[jira] [Updated] (HDFS-15004) Refactor TestBalancer for faster execution.
[ https://issues.apache.org/jira/browse/HDFS-15004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15004: Target Version/s: 2.10.3 (was: 2.10.2) > Refactor TestBalancer for faster execution. > --- > > Key: HDFS-15004 > URL: https://issues.apache.org/jira/browse/HDFS-15004 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, test >Affects Versions: 2.10.0 >Reporter: Konstantin Shvachko >Priority: Major > > {{TestBalancer}} is a big test by itself, and it is also a part of many other > tests. Running these tests involves spinning up {{MiniDFSCluster}} and > shutting it down for every test case, which is inefficient. Many of the test > cases can run using the same instance of {{MiniDFSCluster}}, but not all of > them. It would be good to refactor the tests to optimize their running time. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15004) Refactor TestBalancer for faster execution.
[ https://issues.apache.org/jira/browse/HDFS-15004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541338#comment-17541338 ] Masatake Iwasaki commented on HDFS-15004: - updated the target version for preparing 2.10.2 release. > Refactor TestBalancer for faster execution. > --- > > Key: HDFS-15004 > URL: https://issues.apache.org/jira/browse/HDFS-15004 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, test >Affects Versions: 2.10.0 >Reporter: Konstantin Shvachko >Priority: Major > > {{TestBalancer}} is a big test by itself, and it is also a part of many other > tests. Running these tests involves spinning up {{MiniDFSCluster}} and > shutting it down for every test case, which is inefficient. Many of the test > cases can run using the same instance of {{MiniDFSCluster}}, but not all of > them. It would be good to refactor the tests to optimize their running time. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15037) Encryption Zone operations should not block other RPC calls while retrieving encryption keys.
[ https://issues.apache.org/jira/browse/HDFS-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541337#comment-17541337 ] Masatake Iwasaki commented on HDFS-15037: - updated the target version for preparing 2.10.2 release. > Encryption Zone operations should not block other RPC calls while retrieving > encryption keys. > - > > Key: HDFS-15037 > URL: https://issues.apache.org/jira/browse/HDFS-15037 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, namenode >Affects Versions: 2.10.0 >Reporter: Konstantin Shvachko >Priority: Major > > I believe the intention was to avoid blocking other operations while > retrieving keys while holding {{FSDirectory.dirLock}}. But in reality all > other operations first enter {{FSNamesystemLock}} and then {{dirLock}}. So they > are all blocked waiting for the key. > We see a substantial increase in RPC wait time ({{RpcQueueTimeAvgTime}}) on > NameNode when encryption operations are intermixed with regular workloads. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15357) Do not trust bad block reports from clients
[ https://issues.apache.org/jira/browse/HDFS-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15357: Target Version/s: 3.4.0, 2.10.3 (was: 3.4.0, 2.10.2) > Do not trust bad block reports from clients > --- > > Key: HDFS-15357 > URL: https://issues.apache.org/jira/browse/HDFS-15357 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Kihwal Lee >Priority: Major > > {{reportBadBlocks()}} is implemented by both ClientNamenodeProtocol and > DatanodeProtocol. When DFSClient calls it, a faulty client can cause > data availability issues in a cluster. > In the past we had such an incident where a node with a faulty NIC was > randomly corrupting data. All clients that ran on the machine reported all > accessed blocks and all associated replicas to be corrupt. More recently, a > single faulty client process caused a small number of missing blocks. In > all cases, the actual data was fine. > The bad block reports from clients shouldn't be trusted blindly. Instead, the > namenode should send a datanode command to verify the claim. A bonus would be > to keep the record for a while and ignore repeated reports from the same > nodes. > At a minimum, there should be an option to ignore bad block reports from > clients, perhaps after logging them. A very crude way would be to make it > short-circuit in {{ClientNamenodeProtocolServerSideTranslatorPB#reportBadBlocks()}}. > A more sophisticated way would be to check for the datanode user name in > {{FSNamesystem#reportBadBlocks()}} so that it can be easily logged, or > optionally do further processing. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16165) Backport the Hadoop 3.x Kerberos synchronization fix to Hadoop 2.x
[ https://issues.apache.org/jira/browse/HDFS-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16165: Target Version/s: 2.10.3 (was: 2.10.2) > Backport the Hadoop 3.x Kerberos synchronization fix to Hadoop 2.x > -- > > Key: HDFS-16165 > URL: https://issues.apache.org/jira/browse/HDFS-16165 > Project: Hadoop HDFS > Issue Type: Wish > Environment: Can be reproduced in docker HDFS environment with > Kerberos > https://github.com/vdesabou/kafka-docker-playground/blob/93a93de293ad2f9bb22afb244f2d8729a178296e/connect/connect-hdfs2-sink/hdfs2-sink-ha-kerberos-repro-gss-exception.sh >Reporter: Daniel Osvath >Priority: Major > Labels: Confluent > > *Problem Description* > For more than a year, Apache Kafka Connect users have been running into a > Kerberos renewal issue that causes our HDFS2 connectors to fail. > We have been able to consistently reproduce the issue under high load with 40 > connectors (threads) that use the library. When we try an alternate > workaround that uses the Kerberos keytab on the system, the connector operates > without issues. > We identified the root cause to be a race condition bug in the Hadoop 2.x > library that causes the ticket renewal to fail with the error below: > {code:java} > Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt)] > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) We > reached the conclusion of the root cause once we tried the same environment > (40 connectors) with Hadoop 3.x and our HDFS3 connectors, which operated > without renewal issues. Additionally, identifying that the synchronization > issue has been fixed in the newer Hadoop 3.x releases, we confirmed our > hypothesis about the root cause.
> {code} > There are many changes in HDFS 3 > [UserGroupInformation.java|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java] > related to UGI synchronization, which were done as part of > https://issues.apache.org/jira/browse/HADOOP-9747, and those changes suggest > some race conditions were happening with the older version, i.e. HDFS 2.x, which > would explain why we can reproduce the problem with HDFS2. > For example (among others): > {code:java} > private void relogin(HadoopLoginContext login, boolean ignoreLastLoginTime) > throws IOException { > // ensure the relogin is atomic to avoid leaving credentials in an > // inconsistent state. prevents other ugi instances, SASL, and SPNEGO > // from accessing or altering credentials during the relogin. > synchronized(login.getSubjectLock()) { > // another racing thread may have beat us to the relogin. > if (login == getLogin()) { > unprotectedRelogin(login, ignoreLastLoginTime); > } > } > } > {code} > Those changes were not backported to Hadoop 2.x (our HDFS2 connector uses > 2.10.1), on which several CDH distributions are based. > *Request* > We would like to ask for the synchronization fix to be backported to Hadoop > 2.x so that our users can operate without issues. > *Impact* > The older 2.x Hadoop version is used by our HDFS connector, which is used in > production by our community. Currently, the issue causes our HDFS connector > to fail, as it is unable to recover and renew the ticket at a later point. > Having the backported fix would allow our users to operate without issues > that require manual intervention every week (or every few days in some cases). The > only workaround available to the community for the issue is to run a command or > restart their workers.
-- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
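The double-checked pattern in the quoted {{relogin}} snippet can be illustrated with a minimal generic sketch. The names are illustrative; the real UserGroupInformation code compares login contexts under the subject lock, not a counter.

```java
// Generic sketch of the guarded-relogin pattern quoted above: re-check under
// the lock that the login we observed is still the current one, so two racing
// threads cannot both rewrite the credentials. Counter-based for illustration.
class ReloginGuard {
  private final Object subjectLock = new Object();
  private volatile long loginGeneration = 0;
  private int actualRelogins = 0;

  void reloginIfNeeded(long observedGeneration) {
    synchronized (subjectLock) {
      // Another racing thread may have beaten us to the relogin;
      // only act if the login we observed is still the current one.
      if (observedGeneration == loginGeneration) {
        loginGeneration++;        // stand-in for the real credential refresh
        actualRelogins++;
      }
    }
  }

  long currentGeneration() { return loginGeneration; }

  int relogins() { return actualRelogins; }
}
```

Without the re-check, two threads that both observed an expired ticket would each perform a relogin, which is exactly the inconsistent-credentials window the HADOOP-9747 rework closes.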
[jira] [Commented] (HDFS-16165) Backport the Hadoop 3.x Kerberos synchronization fix to Hadoop 2.x
[ https://issues.apache.org/jira/browse/HDFS-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541334#comment-17541334 ] Masatake Iwasaki commented on HDFS-16165: - updated the target version for preparing 2.10.2 release. > Backport the Hadoop 3.x Kerberos synchronization fix to Hadoop 2.x > -- > > Key: HDFS-16165 > URL: https://issues.apache.org/jira/browse/HDFS-16165 > Project: Hadoop HDFS > Issue Type: Wish > Environment: Can be reproduced in docker HDFS environment with > Kerberos > https://github.com/vdesabou/kafka-docker-playground/blob/93a93de293ad2f9bb22afb244f2d8729a178296e/connect/connect-hdfs2-sink/hdfs2-sink-ha-kerberos-repro-gss-exception.sh >Reporter: Daniel Osvath >Priority: Major > Labels: Confluent > > *Problem Description* > For more than a year, Apache Kafka Connect users have been running into a > Kerberos renewal issue that causes our HDFS2 connectors to fail. > We have been able to consistently reproduce the issue under high load with 40 > connectors (threads) that use the library. When we try an alternate > workaround that uses the Kerberos keytab on the system, the connector operates > without issues. > We identified the root cause to be a race condition bug in the Hadoop 2.x > library that causes the ticket renewal to fail with the error below: > {code:java} > Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: No valid credentials provided (Mechanism level: Failed to find > any Kerberos tgt)] > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) We > reached the conclusion of the root cause once we tried the same environment > (40 connectors) with Hadoop 3.x and our HDFS3 connectors, which operated > without renewal issues. Additionally, identifying that the synchronization > issue has been fixed in the newer Hadoop 3.x releases, we confirmed our > hypothesis about the root cause.
> {code} > There are many changes in HDFS 3 > [UserGroupInformation.java|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java] > related to UGI synchronization, which were done as part of > https://issues.apache.org/jira/browse/HADOOP-9747, and those changes suggest > some race conditions were happening with the older version, i.e. HDFS 2.x, which > would explain why we can reproduce the problem with HDFS2. > For example (among others): > {code:java} > private void relogin(HadoopLoginContext login, boolean ignoreLastLoginTime) > throws IOException { > // ensure the relogin is atomic to avoid leaving credentials in an > // inconsistent state. prevents other ugi instances, SASL, and SPNEGO > // from accessing or altering credentials during the relogin. > synchronized(login.getSubjectLock()) { > // another racing thread may have beat us to the relogin. > if (login == getLogin()) { > unprotectedRelogin(login, ignoreLastLoginTime); > } > } > } > {code} > Those changes were not backported to Hadoop 2.x (our HDFS2 connector uses > 2.10.1), on which several CDH distributions are based. > *Request* > We would like to ask for the synchronization fix to be backported to Hadoop > 2.x so that our users can operate without issues. > *Impact* > The older 2.x Hadoop version is used by our HDFS connector, which is used in > production by our community. Currently, the issue causes our HDFS connector > to fail, as it is unable to recover and renew the ticket at a later point. > Having the backported fix would allow our users to operate without issues > that require manual intervention every week (or every few days in some cases). The > only workaround available to the community for the issue is to run a command or > restart their workers.
-- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14277) [SBN read] Observer benchmark results
[ https://issues.apache.org/jira/browse/HDFS-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541287#comment-17541287 ] Masatake Iwasaki commented on HDFS-14277: - I updated the target version and priority for preparing 2.10.3. > [SBN read] Observer benchmark results > - > > Key: HDFS-14277 > URL: https://issues.apache.org/jira/browse/HDFS-14277 > Project: Hadoop HDFS > Issue Type: Task > Components: ha, namenode >Affects Versions: 2.10.0, 3.3.0 > Environment: Hardware: 4-node cluster, each node has 4 cores, Xeon > 2.5Ghz, 25GB memory. > Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, > RPC encryption + Data Transfer Encryption, Cloudera Navigator. >Reporter: Wei-Chiu Chuang >Priority: Major > Attachments: Observer profiler.png, Screen Shot 2019-02-14 at > 11.50.37 AM.png, observer RPC queue processing time.png > > > Ran a few benchmarks and a profiler (VisualVM) today on an Observer-enabled > cluster. Would like to share the results with the community. The cluster has > 1 Observer node. > h2. NNThroughputBenchmark > Generate 1 million files and send fileStatus RPCs. > {code:java} > hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs > -op fileStatus -threads 100 -files 100 -useExisting > -keepResults > {code} > h3. Kerberos, SSL, RPC encryption, Data Transfer Encryption enabled: > ||Node||fileStatus (Ops per sec)|| > |Active NameNode|4865| > |Observer|3996| > h3. Kerberos, SSL: > ||Node||fileStatus (Ops per sec)|| > |Active NameNode|7078| > |Observer|6459| > Observation: > * due to the edit tailing overhead, the Observer node consumes 30% CPU > utilization even when the cluster is idle. > * While the Active NN has less than 1ms RPC processing time, the Observer node has > > 5ms RPC processing time. I am still looking for the source of the longer > processing time. The longer RPC processing time may be the cause of the > performance degradation compared to that of the Active NN.
Note the cluster has > Cloudera Navigator installed, which adds additional overhead to RPC processing > time. > * {{GlobalStateIdContext#isCoordinatedCall()}} pops up as one of the top > hotspots in the profiler. > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
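From the throughput tables above, the Observer's relative slowdown works out to roughly 18% with full encryption enabled and roughly 9% with Kerberos and SSL only; a quick sketch of the arithmetic:

```java
// Relative fileStatus throughput drop of the Observer vs. the Active NN,
// computed from the benchmark tables quoted above.
class ObserverSlowdown {
  static double slowdownPercent(double activeOps, double observerOps) {
    return (activeOps - observerOps) / activeOps * 100.0;
  }
}
```

With full encryption: (4865 - 3996) / 4865 ≈ 17.9% slower; with Kerberos and SSL only: (7078 - 6459) / 7078 ≈ 8.7% slower.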
[jira] [Updated] (HDFS-14277) [SBN read] Observer benchmark results
[ https://issues.apache.org/jira/browse/HDFS-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-14277: Target Version/s: 2.10.3 (was: 2.10.2) > [SBN read] Observer benchmark results > - > > Key: HDFS-14277 > URL: https://issues.apache.org/jira/browse/HDFS-14277 > Project: Hadoop HDFS > Issue Type: Task > Components: ha, namenode >Affects Versions: 2.10.0, 3.3.0 > Environment: Hardware: 4-node cluster, each node has 4 cores, Xeon > 2.5Ghz, 25GB memory. > Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, > RPC encryption + Data Transfer Encryption, Cloudera Navigator. >Reporter: Wei-Chiu Chuang >Priority: Blocker > Attachments: Observer profiler.png, Screen Shot 2019-02-14 at > 11.50.37 AM.png, observer RPC queue processing time.png > > > Ran a few benchmarks and a profiler (VisualVM) today on an Observer-enabled > cluster. Would like to share the results with the community. The cluster has > 1 Observer node. > h2. NNThroughputBenchmark > Generate 1 million files and send fileStatus RPCs. > {code:java} > hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs > -op fileStatus -threads 100 -files 100 -useExisting > -keepResults > {code} > h3. Kerberos, SSL, RPC encryption, Data Transfer Encryption enabled: > ||Node||fileStatus (Ops per sec)|| > |Active NameNode|4865| > |Observer|3996| > h3. Kerberos, SSL: > ||Node||fileStatus (Ops per sec)|| > |Active NameNode|7078| > |Observer|6459| > Observation: > * due to the edit tailing overhead, the Observer node consumes 30% CPU > utilization even when the cluster is idle. > * While the Active NN has less than 1ms RPC processing time, the Observer node has > > 5ms RPC processing time. I am still looking for the source of the longer > processing time. The longer RPC processing time may be the cause of the > performance degradation compared to that of the Active NN.
Note the cluster has > Cloudera Navigator installed, which adds additional overhead to RPC processing > time. > * {{GlobalStateIdContext#isCoordinatedCall()}} pops up as one of the top > hotspots in the profiler. > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14277) [SBN read] Observer benchmark results
[ https://issues.apache.org/jira/browse/HDFS-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-14277: Priority: Major (was: Blocker) > [SBN read] Observer benchmark results > - > > Key: HDFS-14277 > URL: https://issues.apache.org/jira/browse/HDFS-14277 > Project: Hadoop HDFS > Issue Type: Task > Components: ha, namenode >Affects Versions: 2.10.0, 3.3.0 > Environment: Hardware: 4-node cluster, each node has 4 cores, Xeon > 2.5Ghz, 25GB memory. > Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, > RPC encryption + Data Transfer Encryption, Cloudera Navigator. >Reporter: Wei-Chiu Chuang >Priority: Major > Attachments: Observer profiler.png, Screen Shot 2019-02-14 at > 11.50.37 AM.png, observer RPC queue processing time.png > > > Ran a few benchmarks and a profiler (VisualVM) today on an Observer-enabled > cluster. Would like to share the results with the community. The cluster has > 1 Observer node. > h2. NNThroughputBenchmark > Generate 1 million files and send fileStatus RPCs. > {code:java} > hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs > -op fileStatus -threads 100 -files 100 -useExisting > -keepResults > {code} > h3. Kerberos, SSL, RPC encryption, Data Transfer Encryption enabled: > ||Node||fileStatus (Ops per sec)|| > |Active NameNode|4865| > |Observer|3996| > h3. Kerberos, SSL: > ||Node||fileStatus (Ops per sec)|| > |Active NameNode|7078| > |Observer|6459| > Observation: > * due to the edit tailing overhead, the Observer node consumes 30% CPU > utilization even when the cluster is idle. > * While the Active NN has less than 1ms RPC processing time, the Observer node has > > 5ms RPC processing time. I am still looking for the source of the longer > processing time. The longer RPC processing time may be the cause of the > performance degradation compared to that of the Active NN.
Note the cluster has > Cloudera Navigator installed, which adds additional overhead to RPC processing > time. > * {{GlobalStateIdContext#isCoordinatedCall()}} pops up as one of the top > hotspots in the profiler. > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12548) HDFS Jenkins build is unstable on branch-2
[ https://issues.apache.org/jira/browse/HDFS-12548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537599#comment-17537599 ] Masatake Iwasaki commented on HDFS-12548: - updated target version for preparing 2.10.2 release. reduced priority. > HDFS Jenkins build is unstable on branch-2 > -- > > Key: HDFS-12548 > URL: https://issues.apache.org/jira/browse/HDFS-12548 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 2.9.0 >Reporter: Rushabh Shah >Priority: Major > > Feel free to move the ticket to another project (e.g. infra). > Recently I attached a branch-2 patch while working on one jira > [HDFS-12386|https://issues.apache.org/jira/browse/HDFS-12386?focusedCommentId=16180676&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16180676] > There were at least 100 failed and timed-out tests. I am sure they are not > related to my patch. > Also I came across another jira which was just a javadoc-related change, and > there were around 100 failed tests. > Below are the details for pre-commits that failed on branch-2 > 1. [HDFS-12386 attempt > 1|https://issues.apache.org/jira/browse/HDFS-12386?focusedCommentId=16180069&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16180069] > {noformat} > Ran on slave: asf912.gq1.ygridcore.net/H12 > Failed with following error message: > Build timed out (after 300 minutes). Marking the build as aborted. > Build was aborted > Performing Post build task... > {noformat} > 2.
[HDFS-12386 attempt > 2|https://issues.apache.org/jira/browse/HDFS-12386?focusedCommentId=16180676&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16180676] > {noformat} > Ran on slave: asf900.gq1.ygridcore.net > Failed with following error message: > FATAL: command execution failed > Command close created at > at hudson.remoting.Command.<init>(Command.java:60) > at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1123) > at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1121) > at hudson.remoting.Channel.close(Channel.java:1281) > at hudson.remoting.Channel.close(Channel.java:1263) > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1128) > Caused: hudson.remoting.Channel$OrderlyShutdown > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1129) > at hudson.remoting.Channel$1.handle(Channel.java:527) > at > hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:83) > Caused: java.io.IOException: Backing channel 'H0' is disconnected.
> at > hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192) > at > hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257) > at com.sun.proxy.$Proxy125.isAlive(Unknown Source) > at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043) > at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035) > at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155) > at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109) > at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66) > at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) > at > hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:735) > at hudson.model.Build$BuildExecution.build(Build.java:206) > at hudson.model.Build$BuildExecution.doRun(Build.java:163) > at > hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:490) > at hudson.model.Run.execute(Run.java:1735) > at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) > at hudson.model.ResourceController.execute(ResourceController.java:97) > at hudson.model.Executor.run(Executor.java:405) > {noformat} > 3. 
[HDFS-12531 attempt > 1|https://issues.apache.org/jira/browse/HDFS-12531?focusedCommentId=16176493&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16176493] > {noformat} > Ran on slave: asf911.gq1.ygridcore.net > Failed with following error message: > FATAL: command execution failed > Command close created at > at hudson.remoting.Command.<init>(Command.java:60) > at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1123) > at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1121) > at hudson.remoting.Channel.close(Channel.java:1281) > at hudson.remoting.Channel.close(Channel.java:1263) > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1128) > Caused: hudson.remoting.Channel$OrderlyShutdown > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1129) >
[jira] [Updated] (HDFS-12548) HDFS Jenkins build is unstable on branch-2
[ https://issues.apache.org/jira/browse/HDFS-12548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-12548: Target Version/s: 2.10.3 (was: 2.10.2) > HDFS Jenkins build is unstable on branch-2 > -- > > Key: HDFS-12548 > URL: https://issues.apache.org/jira/browse/HDFS-12548 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 2.9.0 >Reporter: Rushabh Shah >Priority: Major > > Feel free to move the ticket to another project (e.g. infra). > Recently I attached a branch-2 patch while working on one jira > [HDFS-12386|https://issues.apache.org/jira/browse/HDFS-12386?focusedCommentId=16180676&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16180676] > There were at least 100 failed and timed-out tests. I am sure they are not > related to my patch. > Also I came across another jira which was just a javadoc-related change, and > there were around 100 failed tests. > Below are the details for pre-commits that failed on branch-2 > 1. [HDFS-12386 attempt > 1|https://issues.apache.org/jira/browse/HDFS-12386?focusedCommentId=16180069&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16180069] > {noformat} > Ran on slave: asf912.gq1.ygridcore.net/H12 > Failed with following error message: > Build timed out (after 300 minutes). Marking the build as aborted. > Build was aborted > Performing Post build task... > {noformat} > 2.
[HDFS-12386 attempt > 2|https://issues.apache.org/jira/browse/HDFS-12386?focusedCommentId=16180676&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16180676] > {noformat} > Ran on slave: asf900.gq1.ygridcore.net > Failed with following error message: > FATAL: command execution failed > Command close created at > at hudson.remoting.Command.<init>(Command.java:60) > at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1123) > at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1121) > at hudson.remoting.Channel.close(Channel.java:1281) > at hudson.remoting.Channel.close(Channel.java:1263) > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1128) > Caused: hudson.remoting.Channel$OrderlyShutdown > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1129) > at hudson.remoting.Channel$1.handle(Channel.java:527) > at > hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:83) > Caused: java.io.IOException: Backing channel 'H0' is disconnected.
> at > hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192) > at > hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257) > at com.sun.proxy.$Proxy125.isAlive(Unknown Source) > at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043) > at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035) > at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155) > at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109) > at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66) > at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) > at > hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:735) > at hudson.model.Build$BuildExecution.build(Build.java:206) > at hudson.model.Build$BuildExecution.doRun(Build.java:163) > at > hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:490) > at hudson.model.Run.execute(Run.java:1735) > at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) > at hudson.model.ResourceController.execute(ResourceController.java:97) > at hudson.model.Executor.run(Executor.java:405) > {noformat} > 3. 
[HDFS-12531 attempt > 1|https://issues.apache.org/jira/browse/HDFS-12531?focusedCommentId=16176493&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16176493] > {noformat} > Ran on slave: asf911.gq1.ygridcore.net > Failed with following error message: > FATAL: command execution failed > Command close created at > at hudson.remoting.Command.(Command.java:60) > at hudson.remoting.Channel$CloseCommand.(Channel.java:1123) > at hudson.remoting.Channel$CloseCommand.(Channel.java:1121) > at hudson.remoting.Channel.close(Channel.java:1281) > at hudson.remoting.Channel.close(Channel.java:1263) > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1128) > Caused: hudson.remoting.Channel$OrderlyShutdown > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1129) > at hudson.remoting.Channel$1.handle(Channel.java:527) > at > hudson.remoting
[jira] [Updated] (HDFS-12548) HDFS Jenkins build is unstable on branch-2
[ https://issues.apache.org/jira/browse/HDFS-12548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-12548: Priority: Major (was: Critical) > HDFS Jenkins build is unstable on branch-2 > -- > > Key: HDFS-12548 > URL: https://issues.apache.org/jira/browse/HDFS-12548 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 2.9.0 >Reporter: Rushabh Shah >Priority: Major > > Feel free to move the ticket to another project (e.g. infra). > Recently I attached a branch-2 patch while working on one jira > [HDFS-12386|https://issues.apache.org/jira/browse/HDFS-12386?focusedCommentId=16180676&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16180676] > There were at least 100 failed and timed-out tests. I am sure they are not > related to my patch. > I also came across another jira which was just a javadoc-related change, and > there were around 100 failed tests. > Below are the details for the pre-commits that failed on branch-2. > 1. [HDFS-12386 attempt > 1|https://issues.apache.org/jira/browse/HDFS-12386?focusedCommentId=16180069&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16180069] > {noformat} > Ran on slave: asf912.gq1.ygridcore.net/H12 > Failed with following error message: > Build timed out (after 300 minutes). Marking the build as aborted. > Build was aborted > Performing Post build task... > {noformat} > 2. 
[HDFS-12386 attempt > 2|https://issues.apache.org/jira/browse/HDFS-12386?focusedCommentId=16180676&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16180676] > {noformat} > Ran on slave: asf900.gq1.ygridcore.net > Failed with following error message: > FATAL: command execution failed > Command close created at > at hudson.remoting.Command.(Command.java:60) > at hudson.remoting.Channel$CloseCommand.(Channel.java:1123) > at hudson.remoting.Channel$CloseCommand.(Channel.java:1121) > at hudson.remoting.Channel.close(Channel.java:1281) > at hudson.remoting.Channel.close(Channel.java:1263) > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1128) > Caused: hudson.remoting.Channel$OrderlyShutdown > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1129) > at hudson.remoting.Channel$1.handle(Channel.java:527) > at > hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:83) > Caused: java.io.IOException: Backing channel 'H0' is disconnected. 
> at > hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:192) > at > hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:257) > at com.sun.proxy.$Proxy125.isAlive(Unknown Source) > at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043) > at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035) > at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155) > at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109) > at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66) > at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) > at > hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:735) > at hudson.model.Build$BuildExecution.build(Build.java:206) > at hudson.model.Build$BuildExecution.doRun(Build.java:163) > at > hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:490) > at hudson.model.Run.execute(Run.java:1735) > at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) > at hudson.model.ResourceController.execute(ResourceController.java:97) > at hudson.model.Executor.run(Executor.java:405) > {noformat} > 3. 
[HDFS-12531 attempt > 1|https://issues.apache.org/jira/browse/HDFS-12531?focusedCommentId=16176493&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16176493] > {noformat} > Ran on slave: asf911.gq1.ygridcore.net > Failed with following error message: > FATAL: command execution failed > Command close created at > at hudson.remoting.Command.(Command.java:60) > at hudson.remoting.Channel$CloseCommand.(Channel.java:1123) > at hudson.remoting.Channel$CloseCommand.(Channel.java:1121) > at hudson.remoting.Channel.close(Channel.java:1281) > at hudson.remoting.Channel.close(Channel.java:1263) > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1128) > Caused: hudson.remoting.Channel$OrderlyShutdown > at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1129) > at hudson.remoting.Channel$1.handle(Channel.java:527) > at > hudson.remoting.Synchr
[jira] [Updated] (HDFS-14630) Configuration.getTimeDurationHelper() should not log time unit warning in info log.
[ https://issues.apache.org/jira/browse/HDFS-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-14630: Fix Version/s: 3.2.3 (was: 3.2.4) > Configuration.getTimeDurationHelper() should not log time unit warning in > info log. > --- > > Key: HDFS-14630 > URL: https://issues.apache.org/jira/browse/HDFS-14630 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: Hemanth Boyina >Priority: Minor > Fix For: 3.3.0, 3.2.3 > > Attachments: HDFS-14630.001.patch, HDFS-14630.patch > > > To solve the [HDFS-12920|https://issues.apache.org/jira/browse/HDFS-12920] issue, > we configured "dfs.client.datanode-restart.timeout" without a time unit. Now the log > file is full of > {noformat} > 2019-06-22 20:13:14,605 | INFO | pool-12-thread-1 | No unit for > dfs.client.datanode-restart.timeout(30) assuming SECONDS > org.apache.hadoop.conf.Configuration.logDeprecation(Configuration.java:1409){noformat} > There is no need to log this; just document the behavior in the property description. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14630) Configuration.getTimeDurationHelper() should not log time unit warning in info log.
[ https://issues.apache.org/jira/browse/HDFS-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509376#comment-17509376 ] Masatake Iwasaki commented on HDFS-14630: - and branch-3.2.3. > Configuration.getTimeDurationHelper() should not log time unit warning in > info log. > --- > > Key: HDFS-14630 > URL: https://issues.apache.org/jira/browse/HDFS-14630 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: Hemanth Boyina >Priority: Minor > Fix For: 3.3.0, 3.2.4 > > Attachments: HDFS-14630.001.patch, HDFS-14630.patch > > > To solve the [HDFS-12920|https://issues.apache.org/jira/browse/HDFS-12920] issue, > we configured "dfs.client.datanode-restart.timeout" without a time unit. Now the log > file is full of > {noformat} > 2019-06-22 20:13:14,605 | INFO | pool-12-thread-1 | No unit for > dfs.client.datanode-restart.timeout(30) assuming SECONDS > org.apache.hadoop.conf.Configuration.logDeprecation(Configuration.java:1409){noformat} > There is no need to log this; just document the behavior in the property description. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14630) Configuration.getTimeDurationHelper() should not log time unit warning in info log.
[ https://issues.apache.org/jira/browse/HDFS-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17508676#comment-17508676 ] Masatake Iwasaki commented on HDFS-14630: - cherry-picked this to branch-3.2. https://lists.apache.org/thread/2pn7go8wx6w8tftwf3gotjh7rvzndv6z > Configuration.getTimeDurationHelper() should not log time unit warning in > info log. > --- > > Key: HDFS-14630 > URL: https://issues.apache.org/jira/browse/HDFS-14630 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: Hemanth Boyina >Priority: Minor > Fix For: 3.3.0, 3.2.4 > > Attachments: HDFS-14630.001.patch, HDFS-14630.patch > > > To solve the [HDFS-12920|https://issues.apache.org/jira/browse/HDFS-12920] issue, > we configured "dfs.client.datanode-restart.timeout" without a time unit. Now the log > file is full of > {noformat} > 2019-06-22 20:13:14,605 | INFO | pool-12-thread-1 | No unit for > dfs.client.datanode-restart.timeout(30) assuming SECONDS > org.apache.hadoop.conf.Configuration.logDeprecation(Configuration.java:1409){noformat} > There is no need to log this; just document the behavior in the property description. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14630) Configuration.getTimeDurationHelper() should not log time unit warning in info log.
[ https://issues.apache.org/jira/browse/HDFS-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-14630: Fix Version/s: 3.2.4 > Configuration.getTimeDurationHelper() should not log time unit warning in > info log. > --- > > Key: HDFS-14630 > URL: https://issues.apache.org/jira/browse/HDFS-14630 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Surendra Singh Lilhore >Assignee: Hemanth Boyina >Priority: Minor > Fix For: 3.3.0, 3.2.4 > > Attachments: HDFS-14630.001.patch, HDFS-14630.patch > > > To solve the [HDFS-12920|https://issues.apache.org/jira/browse/HDFS-12920] issue, > we configured "dfs.client.datanode-restart.timeout" without a time unit. Now the log > file is full of > {noformat} > 2019-06-22 20:13:14,605 | INFO | pool-12-thread-1 | No unit for > dfs.client.datanode-restart.timeout(30) assuming SECONDS > org.apache.hadoop.conf.Configuration.logDeprecation(Configuration.java:1409){noformat} > There is no need to log this; just document the behavior in the property description. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
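For context on the log message quoted in HDFS-14630: Hadoop's Configuration accepts duration values with a unit suffix ("30s", "5m") and assumes the caller's default unit when the suffix is missing, which is exactly the case that triggered the logging. A simplified, hypothetical sketch of that suffix handling (class and method names are illustrative, not Hadoop's actual implementation):

```java
import java.util.concurrent.TimeUnit;

// Simplified sketch of time-duration parsing with a default unit,
// modeled on (not copied from) Configuration.getTimeDuration.
public class TimeDurationSketch {
    // Returns the duration expressed in defaultUnit. A value with no
    // suffix ("30") is assumed to already be in defaultUnit; "30s",
    // "2m", etc. are converted.
    public static long parse(String value, TimeUnit defaultUnit) {
        String v = value.trim();
        TimeUnit unit = defaultUnit;
        int end = v.length();
        // Check "ms" before "s", since "30ms" also ends with "s".
        if (v.endsWith("ms")) { unit = TimeUnit.MILLISECONDS; end -= 2; }
        else if (v.endsWith("s")) { unit = TimeUnit.SECONDS; end -= 1; }
        else if (v.endsWith("m")) { unit = TimeUnit.MINUTES; end -= 1; }
        else if (v.endsWith("h")) { unit = TimeUnit.HOURS; end -= 1; }
        else if (v.endsWith("d")) { unit = TimeUnit.DAYS; end -= 1; }
        // else: no unit suffix; this is the branch where Hadoop logged
        // "No unit for <key>(<value>) assuming <defaultUnit>".
        long raw = Long.parseLong(v.substring(0, end));
        return defaultUnit.convert(raw, unit);
    }

    public static void main(String[] args) {
        System.out.println(parse("30", TimeUnit.SECONDS));  // 30
        System.out.println(parse("30s", TimeUnit.SECONDS)); // 30
        System.out.println(parse("2m", TimeUnit.SECONDS));  // 120
    }
}
```

The issue's point is that the suffix-missing branch is legitimate configuration, so it warrants a note in the property description rather than an INFO log on every read.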
[jira] [Updated] (HDFS-16354) Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc
[ https://issues.apache.org/jira/browse/HDFS-16354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16354: Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc > > > Key: HDFS-16354 > URL: https://issues.apache.org/jira/browse/HDFS-16354 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > HDFS-16091 added GETSNAPSHOTDIFFLISTING op leveraging > ClientProtocol#getSnapshotDiffReportListing. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16358) HttpFS implementation for getSnapshotDiffReportListing
[ https://issues.apache.org/jira/browse/HDFS-16358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16358: Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > HttpFS implementation for getSnapshotDiffReportListing > -- > > Key: HDFS-16358 > URL: https://issues.apache.org/jira/browse/HDFS-16358 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h > Remaining Estimate: 0h > > HttpFS should support getSnapshotDiffReportListing API for improved snapshot > diff. WebHdfs implementation available on HDFS-16091. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16354) Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc
[ https://issues.apache.org/jira/browse/HDFS-16354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16354: Status: Patch Available (was: Open) > Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc > > > Key: HDFS-16354 > URL: https://issues.apache.org/jira/browse/HDFS-16354 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HDFS-16091 added GETSNAPSHOTDIFFLISTING op leveraging > ClientProtocol#getSnapshotDiffReportListing. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16354) Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc
Masatake Iwasaki created HDFS-16354: --- Summary: Add description of GETSNAPSHOTDIFFLISTING to WebHDFS doc Key: HDFS-16354 URL: https://issues.apache.org/jira/browse/HDFS-16354 Project: Hadoop HDFS Issue Type: Improvement Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki HDFS-16091 added GETSNAPSHOTDIFFLISTING op leveraging ClientProtocol#getSnapshotDiffReportListing. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16091) WebHDFS should support getSnapshotDiffReportListing
[ https://issues.apache.org/jira/browse/HDFS-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16091: Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > WebHDFS should support getSnapshotDiffReportListing > --- > > Key: HDFS-16091 > URL: https://issues.apache.org/jira/browse/HDFS-16091 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 3h > Remaining Estimate: 0h > > When there are millions of diffs between two snapshots, the old > getSnapshotDiffReport() isn't scalable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16248) Add builder to SnapshotDiffReportListing
Masatake Iwasaki created HDFS-16248: --- Summary: Add builder to SnapshotDiffReportListing Key: HDFS-16248 URL: https://issues.apache.org/jira/browse/HDFS-16248 Project: Hadoop HDFS Issue Type: Improvement Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16249) Avoid calling getSnapshotDiffReportListing multiple times if not supported
Masatake Iwasaki created HDFS-16249: --- Summary: Avoid calling getSnapshotDiffReportListing multiple times if not supported Key: HDFS-16249 URL: https://issues.apache.org/jira/browse/HDFS-16249 Project: Hadoop HDFS Issue Type: Improvement Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16240) Replace unshaded guava in HttpFSServerWebServer
[ https://issues.apache.org/jira/browse/HDFS-16240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16240: Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > Replace unshaded guava in HttpFSServerWebServer > --- > > Key: HDFS-16240 > URL: https://issues.apache.org/jira/browse/HDFS-16240 > Project: Hadoop HDFS > Issue Type: Bug > Components: httpfs >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 40m > Remaining Estimate: 0h > > HDFS-16129 added use of com.google.common.annotations.VisibleForTesting to > HttpFSServerWebServer. It is replaced by the replace-guava replacer of > HADOOP-17288 at build time. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16129) HttpFS signature secret file misusage
[ https://issues.apache.org/jira/browse/HDFS-16129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16129: Fix Version/s: 3.4.0 > HttpFS signature secret file misusage > - > > Key: HDFS-16129 > URL: https://issues.apache.org/jira/browse/HDFS-16129 > Project: Hadoop HDFS > Issue Type: Bug > Components: httpfs >Affects Versions: 3.4.0 >Reporter: Tamas Domok >Assignee: Tamas Domok >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 8h > Remaining Estimate: 0h > > I started to work on the YARN-10814 issue and found this bug in HttpFS. > I investigated the problem and already have a fix for it. > > If the deprecated *httpfs.authentication.signature.secret.file* is not set in > the configuration (e.g. httpfs-site.xml), then the new > *hadoop.http.authentication.signature.secret.file* config option won't be > used; it silently falls back to the random secret provider. > The _HttpFSServerWebServer_ sets an _authFilterConfigurationPrefix_ when > building the server for the old path (*httpfs.authentication.*). Later, > _AuthenticationFilter.constructSecretProvider_ immediately falls back to > +random+, because the config won't contain the file. If the old path was set > too, the file was handled and the provider was set to the +file+ type. > The configuration should be based on both the old and the new prefix filter, > merging the two. The new config option should win, in my opinion. > > There is another, closely related issue in the _HttpFSAuthenticationFilter_. > If both config options are set, then the _HttpFSAuthenticationFilter_ will fail > with an impossible file path (e.g. > *${httpfs.config.dir}/httpfs-signature.secret*). > _HttpFSAuthenticationFilter_ constructs the configuration, filtering first > by the new config prefix, then by the old prefix. The old-prefix code works > correctly: it uses _conf.get(key)_ > instead of _entry.getValue()_, which gives back the file path mentioned > earlier. The code duplication can be eliminated, and I think it would be better > to change the order: first add the config options from the old path, > then the new, with the new overwriting the old values and a warning log > message. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
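The fix direction described in HDFS-16129 (merge the old httpfs.authentication.* and new hadoop.http.authentication.* prefixes, with the new values winning) can be sketched with plain maps. The two prefixes are the real config prefixes; everything else here is a hypothetical stand-in, not HttpFS code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PrefixMergeSketch {
    static final String OLD_PREFIX = "httpfs.authentication.";
    static final String NEW_PREFIX = "hadoop.http.authentication.";

    // Hypothetical helper: keep entries under the given prefix,
    // with the prefix stripped from the key.
    static Map<String, String> withPrefix(Map<String, String> conf, String prefix) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                out.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return out;
    }

    // Old-prefix values go in first; new-prefix values overwrite them,
    // so the new config option wins when both are set.
    static Map<String, String> mergedAuthConf(Map<String, String> conf) {
        Map<String, String> merged = new LinkedHashMap<>(withPrefix(conf, OLD_PREFIX));
        merged.putAll(withPrefix(conf, NEW_PREFIX));
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put(OLD_PREFIX + "signature.secret.file", "/old/secret");
        conf.put(NEW_PREFIX + "signature.secret.file", "/new/secret");
        // Prints /new/secret: the new prefix overrides the deprecated one.
        System.out.println(mergedAuthConf(conf).get("signature.secret.file"));
    }
}
```

With only the old prefix present, the merge still yields the old value, so deprecated configurations keep working instead of silently falling back to the random secret provider.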
[jira] [Updated] (HDFS-16240) Replace unshaded guava in HttpFSServerWebServer
[ https://issues.apache.org/jira/browse/HDFS-16240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16240: Status: Patch Available (was: Open) > Replace unshaded guava in HttpFSServerWebServer > --- > > Key: HDFS-16240 > URL: https://issues.apache.org/jira/browse/HDFS-16240 > Project: Hadoop HDFS > Issue Type: Bug > Components: httpfs >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HDFS-16129 added use of com.google.common.annotations.VisibleForTesting to > HttpFSServerWebServer. It is replaced by the replace-guava replacer of > HADOOP-17288 at build time. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16240) Replace unshaded guava in HttpFSServerWebServer
Masatake Iwasaki created HDFS-16240: --- Summary: Replace unshaded guava in HttpFSServerWebServer Key: HDFS-16240 URL: https://issues.apache.org/jira/browse/HDFS-16240 Project: Hadoop HDFS Issue Type: Bug Components: httpfs Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki HDFS-16129 added use of com.google.common.annotations.VisibleForTesting to HttpFSServerWebServer. It is replaced by the replace-guava replacer of HADOOP-17288 at build time. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16091) WebHDFS should support getSnapshotDiffReportListing
[ https://issues.apache.org/jira/browse/HDFS-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-16091: Status: Patch Available (was: Open) > WebHDFS should support getSnapshotDiffReportListing > --- > > Key: HDFS-16091 > URL: https://issues.apache.org/jira/browse/HDFS-16091 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > When there are millions of diffs between two snapshots, the old > getSnapshotDiffReport() isn't scalable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12920) HDFS default value change (with adding time unit) breaks old version MR tarball work with new version (3.0) of hadoop
[ https://issues.apache.org/jira/browse/HDFS-12920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17386221#comment-17386221 ] Masatake Iwasaki commented on HDFS-12920: - +1 on reverting HDFS-10845. > HDFS default value change (with adding time unit) breaks old version MR > tarball work with new version (3.0) of hadoop > - > > Key: HDFS-12920 > URL: https://issues.apache.org/jira/browse/HDFS-12920 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Junping Du >Priority: Blocker > > After HADOOP-15059 got resolved, I tried to deploy the 2.9.0 tarball with 3.0.0 > RC1 and ran the job with the following errors: > {noformat} > 2017-12-12 13:29:06,824 INFO [main] > org.apache.hadoop.service.AbstractService: Service > org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause: > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > java.lang.NumberFormatException: For input string: "30s" > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > java.lang.NumberFormatException: For input string: "30s" > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:542) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:522) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1764) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:522) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:308) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1722) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886) > at > 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1719) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1650) > {noformat} > This is because of HDFS-10845: we added time units to hdfs-default.xml, but > they cannot be recognized by old-version MR jars. > This breaks our rolling-upgrade story, so it should be marked as a blocker. > A quick workaround is to add values in hdfs-site.xml with all time > units removed. But the right way may be to revert HDFS-10845 (and get rid of the noisy > warnings). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
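The incompatibility behind the stack trace above is easy to reproduce: a client built before time-unit suffixes existed effectively reads the value with a plain Long.parseLong, which cannot digest the new suffixed default "30s". A minimal illustration with a hypothetical reader method (not the actual MR code path):

```java
public class OldParserSketch {
    // An old-style reader that expects a bare number of seconds,
    // as code written before time-unit suffixes effectively did.
    static long readTimeoutSeconds(String configuredValue) {
        // "30s" -> NumberFormatException, exactly as in the MRAppMaster trace.
        return Long.parseLong(configuredValue);
    }

    public static void main(String[] args) {
        System.out.println(readTimeoutSeconds("30")); // old default: works
        try {
            readTimeoutSeconds("30s"); // new suffixed default from hdfs-default.xml
        } catch (NumberFormatException e) {
            System.out.println("old parser fails on: 30s");
        }
    }
}
```

This is why the workaround of putting suffix-free values into hdfs-site.xml restores rolling upgrades: the old jars only ever see bare numbers again.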
[jira] [Updated] (HDFS-13916) Distcp SnapshotDiff to support WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-13916: Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > Distcp SnapshotDiff to support WebHDFS > -- > > Key: HDFS-13916 > URL: https://issues.apache.org/jira/browse/HDFS-13916 > Project: Hadoop HDFS > Issue Type: New Feature > Components: distcp, webhdfs >Affects Versions: 3.0.1, 3.1.1 >Reporter: Xun REN >Assignee: Xun REN >Priority: Major > Labels: easyfix, newbie, patch > Fix For: 3.4.0 > > Attachments: HDFS-13916.002.patch, HDFS-13916.003.patch, > HDFS-13916.004.patch, HDFS-13916.005.patch, HDFS-13916.006.patch, > HDFS-13916.007.patch, HDFS-13916.patch > > > [~ljain] has worked on the JIRA: HDFS-13052 to provide the possibility to > make DistCP of SnapshotDiff with WebHDFSFileSystem. However, in the patch, > there is no modification for the real java class which is used by launching > the command "hadoop distcp ..." > > You can check in the latest version here: > [https://github.com/apache/hadoop/blob/branch-3.1.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java#L96-L100] > In the method "preSyncCheck" of the class "DistCpSync", we still check if the > file system is DFS. > So I propose to change the class DistCpSync in order to take into > consideration what was committed by Lokesh Jain. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
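The change HDFS-13916 describes is essentially replacing a concrete-class check ("is this filesystem a DFS?") in DistCpSync's preSyncCheck with a capability-based one, so WebHDFS also qualifies. A shape-only sketch with hypothetical stand-in types (none of these are Hadoop's classes):

```java
public class SyncCheckSketch {
    // Hypothetical capability: any filesystem that can serve a snapshot diff.
    interface SnapshotDiffCapable {
        java.util.List<String> snapshotDiff(String path, String from, String to);
    }

    // Stand-ins for DistributedFileSystem and WebHdfsFileSystem.
    static class FakeDfs implements SnapshotDiffCapable {
        public java.util.List<String> snapshotDiff(String p, String f, String t) {
            return java.util.List.of("M " + p);
        }
    }
    static class FakeWebHdfs implements SnapshotDiffCapable {
        public java.util.List<String> snapshotDiff(String p, String f, String t) {
            return java.util.List.of("M " + p);
        }
    }

    // The pre-sync check tests for the capability, not the concrete class,
    // so both DFS-like and WebHDFS-like filesystems pass.
    static boolean supportsSync(Object fs) {
        return fs instanceof SnapshotDiffCapable;
    }

    public static void main(String[] args) {
        System.out.println(supportsSync(new FakeDfs()));     // true
        System.out.println(supportsSync(new FakeWebHdfs())); // true
        System.out.println(supportsSync(new Object()));      // false
    }
}
```

Checking a capability instead of a concrete class is what lets "hadoop distcp -diff" work over WebHDFS without DistCpSync enumerating every supported filesystem implementation.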
[jira] [Commented] (HDFS-13916) Distcp SnapshotDiff to support WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17370069#comment-17370069 ] Masatake Iwasaki commented on HDFS-13916: - Thanks, [~weichiu] and [~ayushtkn]. I committed the patch and filed HDFS-16091 as a follow-up. > Distcp SnapshotDiff to support WebHDFS > -- > > Key: HDFS-13916 > URL: https://issues.apache.org/jira/browse/HDFS-13916 > Project: Hadoop HDFS > Issue Type: New Feature > Components: distcp, webhdfs >Affects Versions: 3.0.1, 3.1.1 >Reporter: Xun REN >Assignee: Xun REN >Priority: Major > Labels: easyfix, newbie, patch > Attachments: HDFS-13916.002.patch, HDFS-13916.003.patch, > HDFS-13916.004.patch, HDFS-13916.005.patch, HDFS-13916.006.patch, > HDFS-13916.007.patch, HDFS-13916.patch > > > [~ljain] has worked on the JIRA: HDFS-13052 to provide the possibility to > make DistCP of SnapshotDiff with WebHDFSFileSystem. However, in the patch, > there is no modification for the real java class which is used by launching > the command "hadoop distcp ..." > > You can check in the latest version here: > [https://github.com/apache/hadoop/blob/branch-3.1.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java#L96-L100] > In the method "preSyncCheck" of the class "DistCpSync", we still check if the > file system is DFS. > So I propose to change the class DistCpSync in order to take into > consideration what was committed by Lokesh Jain. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16091) WebHDFS should support getSnapshotDiffReportListing
Masatake Iwasaki created HDFS-16091: --- Summary: WebHDFS should support getSnapshotDiffReportListing Key: HDFS-16091 URL: https://issues.apache.org/jira/browse/HDFS-16091 Project: Hadoop HDFS Issue Type: Improvement Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki When there are millions of diffs between two snapshots, the old getSnapshotDiffReport() isn't scalable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
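The scalability point in HDFS-16091 is about pagination: getSnapshotDiffReport() returns the entire diff in one response, while the listing variant returns it page by page, driven by a cursor. A shape-only sketch of how a client consumes such a paged API, using hypothetical types (the real API lives in ClientProtocol#getSnapshotDiffReportListing):

```java
import java.util.ArrayList;
import java.util.List;

public class PagedDiffSketch {
    // Hypothetical stand-in for one page of a SnapshotDiffReportListing.
    static class DiffPage {
        final List<String> entries;
        final String nextCursor; // null when there are no more pages
        DiffPage(List<String> entries, String nextCursor) {
            this.entries = entries;
            this.nextCursor = nextCursor;
        }
    }

    // Hypothetical paged source; a null cursor asks for the first page.
    interface DiffSource {
        DiffPage page(String cursor);
    }

    // Accumulate all pages, as a client of the listing API would.
    // Each RPC stays small even when the full diff has millions of entries.
    static List<String> allDiffs(DiffSource source) {
        List<String> all = new ArrayList<>();
        String cursor = null;
        do {
            DiffPage p = source.page(cursor);
            all.addAll(p.entries);
            cursor = p.nextCursor;
        } while (cursor != null);
        return all;
    }

    public static void main(String[] args) {
        // A fake two-page source for demonstration.
        DiffSource src = cursor -> cursor == null
            ? new DiffPage(List.of("M ./a", "+ ./b"), "page2")
            : new DiffPage(List.of("- ./c"), null);
        System.out.println(allDiffs(src)); // [M ./a, + ./b, - ./c]
    }
}
```

A streaming consumer could also process each page as it arrives instead of accumulating, which is what keeps memory bounded for multi-million-entry diffs.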
[jira] [Commented] (HDFS-13916) Distcp SnapshotDiff to support WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17366562#comment-17366562 ] Masatake Iwasaki commented on HDFS-13916: - I rebased the patch on behalf of [~renxunsaky]. Attached 007. > Distcp SnapshotDiff to support WebHDFS > -- > > Key: HDFS-13916 > URL: https://issues.apache.org/jira/browse/HDFS-13916 > Project: Hadoop HDFS > Issue Type: New Feature > Components: distcp, webhdfs >Affects Versions: 3.0.1, 3.1.1 >Reporter: Xun REN >Assignee: Xun REN >Priority: Major > Labels: easyfix, newbie, patch > Attachments: HDFS-13916.002.patch, HDFS-13916.003.patch, > HDFS-13916.004.patch, HDFS-13916.005.patch, HDFS-13916.006.patch, > HDFS-13916.007.patch, HDFS-13916.patch > > > [~ljain] has worked on the JIRA: HDFS-13052 to provide the possibility to > make DistCP of SnapshotDiff with WebHDFSFileSystem. However, in the patch, > there is no modification for the real java class which is used by launching > the command "hadoop distcp ..." > > You can check in the latest version here: > [https://github.com/apache/hadoop/blob/branch-3.1.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java#L96-L100] > In the method "preSyncCheck" of the class "DistCpSync", we still check if the > file system is DFS. > So I propose to change the class DistCpSync in order to take into > consideration what was committed by Lokesh Jain. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13916) Distcp SnapshotDiff to support WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-13916: Attachment: HDFS-13916.007.patch > Distcp SnapshotDiff to support WebHDFS > -- > > Key: HDFS-13916 > URL: https://issues.apache.org/jira/browse/HDFS-13916 > Project: Hadoop HDFS > Issue Type: New Feature > Components: distcp, webhdfs >Affects Versions: 3.0.1, 3.1.1 >Reporter: Xun REN >Assignee: Xun REN >Priority: Major > Labels: easyfix, newbie, patch > Attachments: HDFS-13916.002.patch, HDFS-13916.003.patch, > HDFS-13916.004.patch, HDFS-13916.005.patch, HDFS-13916.006.patch, > HDFS-13916.007.patch, HDFS-13916.patch > > > [~ljain] has worked on the JIRA: HDFS-13052 to provide the possibility to > make DistCP of SnapshotDiff with WebHDFSFileSystem. However, in the patch, > there is no modification for the real java class which is used by launching > the command "hadoop distcp ..." > > You can check in the latest version here: > [https://github.com/apache/hadoop/blob/branch-3.1.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java#L96-L100] > In the method "preSyncCheck" of the class "DistCpSync", we still check if the > file system is DFS. > So I propose to change the class DistCpSync in order to take into > consideration what was committed by Lokesh Jain. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13916) Distcp SnapshotDiff to support WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-13916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17348950#comment-17348950 ] Masatake Iwasaki commented on HDFS-13916: - [~renxunsaky] Could you rebase the patch on current trunk? I filed HADOOP-17719 and submitted similar PR before finding this JIRA. I can add supplemental fix in HADOOP-17719 after this comes in. > Distcp SnapshotDiff to support WebHDFS > -- > > Key: HDFS-13916 > URL: https://issues.apache.org/jira/browse/HDFS-13916 > Project: Hadoop HDFS > Issue Type: New Feature > Components: distcp, webhdfs >Affects Versions: 3.0.1, 3.1.1 >Reporter: Xun REN >Assignee: Xun REN >Priority: Major > Labels: easyfix, newbie, patch > Attachments: HDFS-13916.002.patch, HDFS-13916.003.patch, > HDFS-13916.004.patch, HDFS-13916.005.patch, HDFS-13916.006.patch, > HDFS-13916.patch > > > [~ljain] has worked on the JIRA: HDFS-13052 to provide the possibility to > make DistCP of SnapshotDiff with WebHDFSFileSystem. However, in the patch, > there is no modification for the real java class which is used by launching > the command "hadoop distcp ..." > > You can check in the latest version here: > [https://github.com/apache/hadoop/blob/branch-3.1.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java#L96-L100] > In the method "preSyncCheck" of the class "DistCpSync", we still check if the > file system is DFS. > So I propose to change the class DistCpSync in order to take into > consideration what was committed by Lokesh Jain. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
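The preSyncCheck change described in this issue amounts to widening a filesystem type check so WebHDFS is accepted alongside DFS. A toy sketch of that idea using stub classes (the real types are org.apache.hadoop.fs.FileSystem, DistributedFileSystem, and WebHdfsFileSystem; supportsSnapshotDiff is a hypothetical helper, not the actual DistCpSync method):

```java
public class PreSyncCheckDemo {
    // Stub types standing in for the Hadoop classes of the same names.
    static class FileSystem {}
    static class DistributedFileSystem extends FileSystem {}
    static class WebHdfsFileSystem extends FileSystem {}

    // Before the change: only DistributedFileSystem passed the check.
    // After: WebHdfsFileSystem is accepted for snapshot-diff-based sync too.
    static boolean supportsSnapshotDiff(FileSystem fs) {
        return fs instanceof DistributedFileSystem
                || fs instanceof WebHdfsFileSystem;
    }

    public static void main(String[] args) {
        System.out.println(supportsSnapshotDiff(new DistributedFileSystem())); // true
        System.out.println(supportsSnapshotDiff(new WebHdfsFileSystem()));     // true
        System.out.println(supportsSnapshotDiff(new FileSystem()));            // false
    }
}
```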
[jira] [Updated] (HDFS-15740) Make basename cross-platform
[ https://issues.apache.org/jira/browse/HDFS-15740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15740: Fix Version/s: (was: 3.3.1) 3.4.0 > Make basename cross-platform > > > Key: HDFS-15740 > URL: https://issues.apache.org/jira/browse/HDFS-15740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: libhdfs++ >Affects Versions: 3.4.0 >Reporter: Gautham Banasandra >Assignee: Gautham Banasandra >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Original Estimate: 24h > Time Spent: 9.5h > Remaining Estimate: 14.5h > > The *basename* function isn't available on Visual Studio 2019 compiler. We > need to make it cross platform. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-15802) Address build failure of hadoop-hdfs-native-client on RHEL/CentOS 8
[ https://issues.apache.org/jira/browse/HDFS-15802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki resolved HDFS-15802. - Resolution: Duplicate > Address build failure of hadoop-hdfs-native-client on RHEL/CentOS 8 > --- > > Key: HDFS-15802 > URL: https://issues.apache.org/jira/browse/HDFS-15802 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs++ >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > > Building environment described in BUILDING.txt does not work for RHEL CentOS > 8 due to HDFS-15740. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15802) Address build failure of hadoop-hdfs-native-client on RHEL/CentOS 8
[ https://issues.apache.org/jira/browse/HDFS-15802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17275474#comment-17275474 ] Masatake Iwasaki edited comment on HDFS-15802 at 1/30/21, 5:07 AM: --- For gcc, using gcc-toolset-9 of AppStream or adding {{-lstdc++fs}} on linking might be an option. For cmake, installing CMake 3.19 or above from source might be needed. was (Author: iwasakims): For gcc, using gcc-devtoolset-9-toolchain of AppStream or adding {{-lstdc++fs}} on linking might be an option. For cmake, installing CMake 3.19 or above from source might be needed. > Address build failure of hadoop-hdfs-native-client on RHEL/CentOS 8 > --- > > Key: HDFS-15802 > URL: https://issues.apache.org/jira/browse/HDFS-15802 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs++ >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > > Building environment described in BUILDING.txt does not work for RHEL CentOS > 8 due to HDFS-15740. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15802) Address build failure of hadoop-hdfs-native-client on RHEL/CentOS 8
[ https://issues.apache.org/jira/browse/HDFS-15802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17275474#comment-17275474 ] Masatake Iwasaki commented on HDFS-15802: - For gcc, using gcc-devtoolset-9-toolchain of AppStream or adding {{-lstdc++fs}} on linking might be an option. For cmake, installing CMake 3.19 or above from source might be needed. > Address build failure of hadoop-hdfs-native-client on RHEL/CentOS 8 > --- > > Key: HDFS-15802 > URL: https://issues.apache.org/jira/browse/HDFS-15802 > Project: Hadoop HDFS > Issue Type: Bug > Components: libhdfs++ >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > > Building environment described in BUILDING.txt does not work for RHEL CentOS > 8 due to HDFS-15740. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
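The two gcc-side workarounds mentioned in the comment could look roughly like this on RHEL/CentOS 8 (illustrative commands only, not taken from the JIRA):

{noformat}
# Option 1: build under the newer AppStream toolchain,
# which has complete C++17 <filesystem> support.
sudo dnf install -y gcc-toolset-9
scl enable gcc-toolset-9 bash

# Option 2: stay on the default gcc 8.x and link the separate
# filesystem support library when building C++17 code.
g++ -std=c++17 demo.cc -lstdc++fs
{noformat}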
[jira] [Created] (HDFS-15802) Address build failure of hadoop-hdfs-native-client on RHEL/CentOS 8
Masatake Iwasaki created HDFS-15802: --- Summary: Address build failure of hadoop-hdfs-native-client on RHEL/CentOS 8 Key: HDFS-15802 URL: https://issues.apache.org/jira/browse/HDFS-15802 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs++ Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Building environment described in BUILDING.txt does not work for RHEL CentOS 8 due to HDFS-15740. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15648) TestFileChecksum should be parameterized
[ https://issues.apache.org/jira/browse/HDFS-15648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15648: Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > TestFileChecksum should be parameterized > > > Key: HDFS-15648 > URL: https://issues.apache.org/jira/browse/HDFS-15648 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > {{TestFileChecksumCompositeCrc}} extends {{TestFileChecksum}} overriding 3 > methods that return a constant flag True/False. > The class is useless and it causes confusion with two different jiras, while > the main bug should be in TestFileChecksum. > The {{TestFileChecksum}} should be parameterized -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
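The parameterization described above replaces per-subclass boolean overrides with a test parameter. A standalone sketch of the shape (the real change would use JUnit's @Parameterized runner inside TestFileChecksum; this plain-Java loop just shows the idea, and the mode names mirroring dfs.checksum.combine.mode values are assumptions):

```java
public class ChecksumParamDemo {
    // Formerly a method overridden by TestFileChecksumCompositeCrc;
    // as a parameter, one class covers both configurations.
    final boolean compositeCrc;

    ChecksumParamDemo(boolean compositeCrc) {
        this.compositeCrc = compositeCrc;
    }

    String checksumCombineMode() {
        return compositeCrc ? "COMPOSITE_CRC" : "MD5MD5CRC";
    }

    public static void main(String[] args) {
        // A parameterized runner would expand this list into
        // independently reported test runs.
        for (boolean flag : new boolean[] {false, true}) {
            System.out.println(new ChecksumParamDemo(flag).checksumCombineMode());
        }
    }
}
```

This removes the confusing second class, so failures in either mode are reported against a single JIRA-visible test name.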
[jira] [Updated] (HDFS-15743) Fix -Pdist build failure of hadoop-hdfs-native-client
[ https://issues.apache.org/jira/browse/HDFS-15743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15743: Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > Fix -Pdist build failure of hadoop-hdfs-native-client > - > > Key: HDFS-15743 > URL: https://issues.apache.org/jira/browse/HDFS-15743 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 40m > Remaining Estimate: 0h > > {noformat} > [INFO] --- exec-maven-plugin:1.3.1:exec (pre-dist) @ > hadoop-hdfs-native-client --- > tar: ./*: Cannot stat: No such file or directory > tar: Exiting with failure status due to previous errors > Checking to bundle with: > bundleoption=false, liboption=snappy.lib, pattern=libsnappy. libdir= > Checking to bundle with: > bundleoption=false, liboption=zstd.lib, pattern=libzstd. libdir= > Checking to bundle with: > bundleoption=false, liboption=openssl.lib, pattern=libcrypto. libdir= > Checking to bundle with: > bundleoption=false, liboption=isal.lib, pattern=libisal. libdir= > Checking to bundle with: > bundleoption=, liboption=pmdk.lib, pattern=pmdk libdir= > Bundling bin files failed > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15743) Fix -Pdist build failure of hadoop-hdfs-native-client
[ https://issues.apache.org/jira/browse/HDFS-15743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15743: Status: Patch Available (was: Open) > Fix -Pdist build failure of hadoop-hdfs-native-client > - > > Key: HDFS-15743 > URL: https://issues.apache.org/jira/browse/HDFS-15743 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {noformat} > [INFO] --- exec-maven-plugin:1.3.1:exec (pre-dist) @ > hadoop-hdfs-native-client --- > tar: ./*: Cannot stat: No such file or directory > tar: Exiting with failure status due to previous errors > Checking to bundle with: > bundleoption=false, liboption=snappy.lib, pattern=libsnappy. libdir= > Checking to bundle with: > bundleoption=false, liboption=zstd.lib, pattern=libzstd. libdir= > Checking to bundle with: > bundleoption=false, liboption=openssl.lib, pattern=libcrypto. libdir= > Checking to bundle with: > bundleoption=false, liboption=isal.lib, pattern=libisal. libdir= > Checking to bundle with: > bundleoption=, liboption=pmdk.lib, pattern=pmdk libdir= > Bundling bin files failed > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15743) Fix -Pdist build failure of hadoop-hdfs-native-client
Masatake Iwasaki created HDFS-15743: --- Summary: Fix -Pdist build failure of hadoop-hdfs-native-client Key: HDFS-15743 URL: https://issues.apache.org/jira/browse/HDFS-15743 Project: Hadoop HDFS Issue Type: Bug Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki {noformat} [INFO] --- exec-maven-plugin:1.3.1:exec (pre-dist) @ hadoop-hdfs-native-client --- tar: ./*: Cannot stat: No such file or directory tar: Exiting with failure status due to previous errors Checking to bundle with: bundleoption=false, liboption=snappy.lib, pattern=libsnappy. libdir= Checking to bundle with: bundleoption=false, liboption=zstd.lib, pattern=libzstd. libdir= Checking to bundle with: bundleoption=false, liboption=openssl.lib, pattern=libcrypto. libdir= Checking to bundle with: bundleoption=false, liboption=isal.lib, pattern=libisal. libdir= Checking to bundle with: bundleoption=, liboption=pmdk.lib, pattern=pmdk libdir= Bundling bin files failed {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
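The {{tar: ./*: Cannot stat: No such file or directory}} line in the log above is the classic unexpanded-glob failure: when the native library directory is empty, the shell leaves {{./*}} unexpanded and tar receives it as a literal filename. A minimal reproduction (illustrative only, not the actual dist-layout script):

```shell
# In an empty directory, './*' matches nothing, so tar is handed the
# literal string './*' and fails to stat it.
workdir=$(mktemp -d)
cd "$workdir"
if ! tar cf /tmp/demo.tar ./* 2>/dev/null; then
  echo "tar failed on unexpanded glob"
fi
```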
[jira] [Resolved] (HDFS-15670) Testcase TestBalancer#testBalancerWithPinnedBlocks always fails
[ https://issues.apache.org/jira/browse/HDFS-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki resolved HDFS-15670. - Resolution: Cannot Reproduce I'm closing this as not reproducible. I guess it should be an environmental issue. Feel free to reopen if you have updates. > Testcase TestBalancer#testBalancerWithPinnedBlocks always fails > --- > > Key: HDFS-15670 > URL: https://issues.apache.org/jira/browse/HDFS-15670 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0-beta1 >Reporter: Jianfei Jiang >Priority: Major > Attachments: HADOOP-15108.000.patch > > > When running testcases without any code changes, the function > testBalancerWithPinnedBlocks in TestBalancer.java never succeeded. I tried to > use Ubuntu 16.04 and redhat 7, maybe the failure is not related to various > linux environment. I am not sure if there is some bug in this case or I used > wrong environment and settings. Could anyone give some advice. > --- > Test set: org.apache.hadoop.hdfs.server.balancer.TestBalancer > --- > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 100.389 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.server.balancer.TestBalancer > testBalancerWithPinnedBlocks(org.apache.hadoop.hdfs.server.balancer.TestBalancer) > Time elapsed: 100.134 sec <<< ERROR! 
> java.lang.Exception: test timed out after 10 milliseconds > at java.lang.Object.wait(Native Method) > at > org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:903) > at > org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:773) > at > org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:870) > at > org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842) > at > org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) > at > org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101) > at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:441) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerWithPinnedBlocks(TestBalancer.java:515) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15648) TestFileChecksum should be parameterized
[ https://issues.apache.org/jira/browse/HDFS-15648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15648: Summary: TestFileChecksum should be parameterized (was: TestFileChecksum should be parameterized with a boolean flag) > TestFileChecksum should be parameterized > > > Key: HDFS-15648 > URL: https://issues.apache.org/jira/browse/HDFS-15648 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > {{TestFileChecksumCompositeCrc}} extends {{TestFileChecksum}} overriding 3 > methods that return a constant flag True/False. > The class is useless and it causes confusion with two different jiras, while > the main bug should be in TestFileChecksum. > The {{TestFileChecksum}} should be parameterized -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15702) Fix intermittent falilure of TestDecommission#testAllocAndIBRWhileDecommission
[ https://issues.apache.org/jira/browse/HDFS-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15702: Status: Patch Available (was: Open) > Fix intermittent falilure of TestDecommission#testAllocAndIBRWhileDecommission > -- > > Key: HDFS-15702 > URL: https://issues.apache.org/jira/browse/HDFS-15702 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Masatake Iwasaki >Assignee: Masatake Iwasaki >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {noformat} > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.hdfs.TestDecommission.testAllocAndIBRWhileDecommission(TestDecommission.java:1025) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15702) Fix intermittent falilure of TestDecommission#testAllocAndIBRWhileDecommission
Masatake Iwasaki created HDFS-15702: --- Summary: Fix intermittent falilure of TestDecommission#testAllocAndIBRWhileDecommission Key: HDFS-15702 URL: https://issues.apache.org/jira/browse/HDFS-15702 Project: Hadoop HDFS Issue Type: Improvement Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki {noformat} java.lang.AssertionError: expected: but was: at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.hdfs.TestDecommission.testAllocAndIBRWhileDecommission(TestDecommission.java:1025) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-15676) TestRouterRpcMultiDestination#testNamenodeMetrics fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki resolved HDFS-15676. - Resolution: Duplicate I'm closing this as fixed by HDFS-15677. I will reopen this if the failure of testNameNodeMetrics alone rises again. > TestRouterRpcMultiDestination#testNamenodeMetrics fails on trunk > > > Key: HDFS-15676 > URL: https://issues.apache.org/jira/browse/HDFS-15676 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > > qbt report (Nov 8, 2020, 11:28 AM) shows failures in testNamenodeMetrics -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15677) TestRouterRpcMultiDestination#testGetCachedDatanodeReport fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15677: Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > TestRouterRpcMultiDestination#testGetCachedDatanodeReport fails on trunk > > > Key: HDFS-15677 > URL: https://issues.apache.org/jira/browse/HDFS-15677 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 40m > Remaining Estimate: 0h > > qbt report (Nov 8, 2020, 11:28 AM) shows failures in > testGetCachedDatanodeReport -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15677) TestRouterRpcMultiDestination#testGetCachedDatanodeReport fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15677: Status: Patch Available (was: Open) > TestRouterRpcMultiDestination#testGetCachedDatanodeReport fails on trunk > > > Key: HDFS-15677 > URL: https://issues.apache.org/jira/browse/HDFS-15677 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > qbt report (Nov 8, 2020, 11:28 AM) shows failures in > testGetCachedDatanodeReport -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15676) TestRouterRpcMultiDestination#testNamenodeMetrics fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240554#comment-17240554 ] Masatake Iwasaki commented on HDFS-15676: - The test fails only if I run both testNamenodeMetrics and testGetCachedDatanodeReport. Running {{-Dtest=TestRouterRpcMultiDestination#testNamenodeMetrics}} alone has no issue. Since the TestRouterRpcMultiDestination reuses a mini cluster in the test cases, {{testNamenodeMetrics}} could be fixed by addressing the {{testGetCachedDatanodeReport}} issue. {noformat} $ for i in `seq 100` ; do echo $i && mvn test -DignoreTestFailure=false -Dtest=TestRouterRpcMultiDestination#testNamenodeMetrics,TestRouterRpcMultiDestination#testGetCachedDatanodeReport || break ; done ... [ERROR] Failures: [ERROR] TestRouterRpcMultiDestination>TestRouterRpc.testNamenodeMetrics:1682 expected:<12> but was:<11> [ERROR] Errors: [ERROR] TestRouterRpcMultiDestination>TestRouterRpc.testGetCachedDatanodeReport:1829 » Timeout [INFO] [ERROR] Tests run: 2, Failures: 1, Errors: 1, Skipped: 0 {noformat} > TestRouterRpcMultiDestination#testNamenodeMetrics fails on trunk > > > Key: HDFS-15676 > URL: https://issues.apache.org/jira/browse/HDFS-15676 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > > qbt report (Nov 8, 2020, 11:28 AM) shows failures in testNamenodeMetrics -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15677) TestRouterRpcMultiDestination#testGetCachedDatanodeReport fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki reassigned HDFS-15677: --- Assignee: Masatake Iwasaki > TestRouterRpcMultiDestination#testGetCachedDatanodeReport fails on trunk > > > Key: HDFS-15677 > URL: https://issues.apache.org/jira/browse/HDFS-15677 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > > qbt report (Nov 8, 2020, 11:28 AM) shows failures in > testGetCachedDatanodeReport -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15676) TestRouterRpcMultiDestination#testNamenodeMetrics fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki reassigned HDFS-15676: --- Assignee: Masatake Iwasaki > TestRouterRpcMultiDestination#testNamenodeMetrics fails on trunk > > > Key: HDFS-15676 > URL: https://issues.apache.org/jira/browse/HDFS-15676 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > > qbt report (Nov 8, 2020, 11:28 AM) shows failures in testNamenodeMetrics -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15648) TestFileChecksum should be parameterized with a boolean flag
[ https://issues.apache.org/jira/browse/HDFS-15648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15648: Status: Patch Available (was: Open) > TestFileChecksum should be parameterized with a boolean flag > > > Key: HDFS-15648 > URL: https://issues.apache.org/jira/browse/HDFS-15648 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > {{TestFileChecksumCompositeCrc}} extends {{TestFileChecksum}} overriding 3 > methods that return a constant flag True/False. > The class is useless and it causes confusion with two different jiras, while > the main bug should be in TestFileChecksum. > The {{TestFileChecksum}} should be parameterized -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15648) TestFileChecksum should be parameterized with a boolean flag
[ https://issues.apache.org/jira/browse/HDFS-15648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki reassigned HDFS-15648: --- Assignee: Masatake Iwasaki > TestFileChecksum should be parameterized with a boolean flag > > > Key: HDFS-15648 > URL: https://issues.apache.org/jira/browse/HDFS-15648 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > > {{TestFileChecksumCompositeCrc}} extends {{TestFileChecksum}} overriding 3 > methods that return a constant flag True/False. > The class is useless and it causes confusion with two different jiras, while > the main bug should be in TestFileChecksum. > The {{TestFileChecksum}} should be parameterized -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-15296) TestBPOfferService#testMissBlocksWhenReregister is flaky
[ https://issues.apache.org/jira/browse/HDFS-15296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki resolved HDFS-15296. - Resolution: Duplicate I'm closing this as duplicate of HDFS-15654. > TestBPOfferService#testMissBlocksWhenReregister is flaky > > > Key: HDFS-15296 > URL: https://issues.apache.org/jira/browse/HDFS-15296 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test >Affects Versions: 3.4.0 >Reporter: Mingliang Liu >Priority: Major > > TestBPOfferService.testMissBlocksWhenReregister fails intermittently in > {{trunk}} branch, not sure about other branches. Example failures are > - > https://builds.apache.org/job/hadoop-multibranch/job/PR-1964/4/testReport/org.apache.hadoop.hdfs.server.datanode/TestBPOfferService/testMissBlocksWhenReregister/ > - > https://builds.apache.org/job/PreCommit-HDFS-Build/29175/testReport/org.apache.hadoop.hdfs.server.datanode/TestBPOfferService/testMissBlocksWhenReregister/ > Sample exception stack is: > {quote} > Stacktrace > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister(TestBPOfferService.java:350) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15672) TestBalancerWithMultipleNameNodes#testBalancingBlockpoolsWithBlockPoolPolicy fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15672: Status: Patch Available (was: Open) > TestBalancerWithMultipleNameNodes#testBalancingBlockpoolsWithBlockPoolPolicy > fails on trunk > --- > > Key: HDFS-15672 > URL: https://issues.apache.org/jira/browse/HDFS-15672 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > qbt report shows the following error: > {code:bash} > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancingBlockpoolsWithBlockPoolPolicy > Failing for the past 1 build (Since Failed#317 ) > Took 10 min. > Error Message > test timed out after 60 milliseconds > Stacktrace > org.junit.runners.model.TestTimedOutException: test timed out after 60 > milliseconds > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.sleep(TestBalancerWithMultipleNameNodes.java:353) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.wait(TestBalancerWithMultipleNameNodes.java:159) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runBalancer(TestBalancerWithMultipleNameNodes.java:175) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runTest(TestBalancerWithMultipleNameNodes.java:550) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancingBlockpoolsWithBlockPoolPolicy(TestBalancerWithMultipleNameNodes.java:609) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15674) TestBPOfferService#testMissBlocksWhenReregister fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15674: Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > TestBPOfferService#testMissBlocksWhenReregister fails on trunk > -- > > Key: HDFS-15674 > URL: https://issues.apache.org/jira/browse/HDFS-15674 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h > Remaining Estimate: 0h > > qbt report (Nov 8, 2020, 11:28 AM) shows failures timing out in > testMissBlocksWhenReregister
[jira] [Assigned] (HDFS-15672) TestBalancerWithMultipleNameNodes#testBalancingBlockpoolsWithBlockPoolPolicy fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki reassigned HDFS-15672: --- Assignee: Masatake Iwasaki > TestBalancerWithMultipleNameNodes#testBalancingBlockpoolsWithBlockPoolPolicy > fails on trunk > --- > > Key: HDFS-15672 > URL: https://issues.apache.org/jira/browse/HDFS-15672 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > > qbt report shows the following error: > {code:bash} > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancingBlockpoolsWithBlockPoolPolicy > Failing for the past 1 build (Since Failed#317 ) > Took 10 min. > Error Message > test timed out after 60 milliseconds > Stacktrace > org.junit.runners.model.TestTimedOutException: test timed out after 60 > milliseconds > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.sleep(TestBalancerWithMultipleNameNodes.java:353) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.wait(TestBalancerWithMultipleNameNodes.java:159) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runBalancer(TestBalancerWithMultipleNameNodes.java:175) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.runTest(TestBalancerWithMultipleNameNodes.java:550) > at > org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancingBlockpoolsWithBlockPoolPolicy(TestBalancerWithMultipleNameNodes.java:609) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Commented] (HDFS-15674) TestBPOfferService#testMissBlocksWhenReregister fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17233582#comment-17233582 ] Masatake Iwasaki commented on HDFS-15674: - The test intermittently fails when the count of blocks reported in IBRs does not reach the expected value (4000). {noformat} [ERROR] Failures: [ERROR] TestBPOfferService.testMissBlocksWhenReregister:341 Timed out wait for IBR counts FBRCount = 2441, IBRCount = 3999; expected = 4000. Exception: Timed out waiting for condition. Thread diagnostics: Timestamp: 2020-11-16 12:45:57,425 {noformat} This waiting condition does not make sense because pending IBRs can be deleted by {{clearIBRs()}} in {{BPServiceActor#reRegister}}. That is fine because the succeeding full block report covers them, based on the discussion in HDFS-15113. The tests should check that all blocks are reported, rather than counting the number of blocks reported in FBRs/IBRs independently. I submitted [PR #2467|https://github.com/apache/hadoop/pull/2467] for this. > TestBPOfferService#testMissBlocksWhenReregister fails on trunk > -- > > Key: HDFS-15674 > URL: https://issues.apache.org/jira/browse/HDFS-15674 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > qbt report (Nov 8, 2020, 11:28 AM) shows failures timing out in > testMissBlocksWhenReregister
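The fix direction described in that comment (assert on the set of blocks actually reported, not on independent FBR/IBR counters) can be sketched with a self-contained toy model. This is not the Hadoop code; the class and method names below are illustrative assumptions that only mimic how clearing pending IBRs during re-registration interacts with a per-IBR counter.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model (not Hadoop classes): a DataNode reports blocks through
// incremental block reports (IBRs) and full block reports (FBRs).
public class BlockReportModel {
    private final Set<Long> allBlocks = new HashSet<>();    // blocks the DN holds
    private final Set<Long> pendingIbrs = new HashSet<>();  // queued, not yet sent
    private final Set<Long> reported = new HashSet<>();     // blocks the NN has seen
    private int ibrCount = 0;                               // counter the old test asserted on

    public void addBlock(long id) {
        allBlocks.add(id);
        pendingIbrs.add(id);
    }

    // Sending pending IBRs reports those blocks and bumps the IBR counter.
    public void sendIbrs() {
        reported.addAll(pendingIbrs);
        ibrCount += pendingIbrs.size();
        pendingIbrs.clear();
    }

    // Re-registration drops pending IBRs, mirroring clearIBRs() in
    // BPServiceActor#reRegister; the blocks are not lost because the
    // next full block report covers them.
    public void reRegister() {
        pendingIbrs.clear();
    }

    public void sendFullBlockReport() {
        reported.addAll(allBlocks);
    }

    public int getIbrCount() { return ibrCount; }

    public int getReportedCount() { return reported.size(); }

    public static void main(String[] args) {
        BlockReportModel dn = new BlockReportModel();
        for (long b = 0; b < 2000; b++) dn.addBlock(b);
        dn.sendIbrs();                                   // first half reported via IBR
        for (long b = 2000; b < 4000; b++) dn.addBlock(b);
        dn.reRegister();                                 // second half dropped from the IBR queue
        dn.sendFullBlockReport();                        // but the FBR still covers everything
        System.out.println("ibrCount = " + dn.getIbrCount()
            + ", reported = " + dn.getReportedCount());
        // prints "ibrCount = 2000, reported = 4000"
    }
}
```

Here an assertion on `ibrCount == 4000` times out even though nothing went wrong, while an assertion that all 4000 blocks were reported passes; the latter is the shape of check the comment argues the test should use.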
[jira] [Updated] (HDFS-15674) TestBPOfferService#testMissBlocksWhenReregister fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15674: Status: Patch Available (was: Open) > TestBPOfferService#testMissBlocksWhenReregister fails on trunk > -- > > Key: HDFS-15674 > URL: https://issues.apache.org/jira/browse/HDFS-15674 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > qbt report (Nov 8, 2020, 11:28 AM) shows failures timing out in > testMissBlocksWhenReregister
[jira] [Assigned] (HDFS-15674) TestBPOfferService#testMissBlocksWhenReregister fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-15674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki reassigned HDFS-15674: --- Assignee: Masatake Iwasaki > TestBPOfferService#testMissBlocksWhenReregister fails on trunk > -- > > Key: HDFS-15674 > URL: https://issues.apache.org/jira/browse/HDFS-15674 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ahmed Hussein >Assignee: Masatake Iwasaki >Priority: Major > > qbt report (Nov 8, 2020, 11:28 AM) shows failures timing out in > testMissBlocksWhenReregister
[jira] [Commented] (HDFS-15296) TestBPOfferService#testMissBlocksWhenReregister is flaky
[ https://issues.apache.org/jira/browse/HDFS-15296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17232468#comment-17232468 ] Masatake Iwasaki commented on HDFS-15296: - This looks like a duplicate of HDFS-15654; HDFS-15674 is the follow-up. > TestBPOfferService#testMissBlocksWhenReregister is flaky > > > Key: HDFS-15296 > URL: https://issues.apache.org/jira/browse/HDFS-15296 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: test >Affects Versions: 3.4.0 >Reporter: Mingliang Liu >Priority: Major > > TestBPOfferService.testMissBlocksWhenReregister fails intermittently in the > {{trunk}} branch; not sure about other branches. Example failures are > - > https://builds.apache.org/job/hadoop-multibranch/job/PR-1964/4/testReport/org.apache.hadoop.hdfs.server.datanode/TestBPOfferService/testMissBlocksWhenReregister/ > - > https://builds.apache.org/job/PreCommit-HDFS-Build/29175/testReport/org.apache.hadoop.hdfs.server.datanode/TestBPOfferService/testMissBlocksWhenReregister/ > Sample exception stack is: > {quote} > Stacktrace > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hdfs.server.datanode.TestBPOfferService.testMissBlocksWhenReregister(TestBPOfferService.java:350) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {quote}
[jira] [Commented] (HDFS-15567) [SBN Read] HDFS should expose msync() API to allow downstream applications to call it explicitly.
[ https://issues.apache.org/jira/browse/HDFS-15567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17194711#comment-17194711 ] Masatake Iwasaki commented on HDFS-15567: - I updated the target version. > [SBN Read] HDFS should expose msync() API to allow downstream applications > to call it explicitly. > -- > > Key: HDFS-15567 > URL: https://issues.apache.org/jira/browse/HDFS-15567 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ha, hdfs-client >Reporter: Konstantin Shvachko >Priority: Major > > Consistent reads from Standby introduced the {{msync()}} API in HDFS-13688, which > updates the client's state ID with the current state of the Active NameNode to > guarantee consistency of subsequent calls to an ObserverNode. Currently this > API is exposed only via {{DFSClient}}, which makes it hard for applications > to access {{msync()}}. One way is to use something like this: > {code} > if (fs instanceof DistributedFileSystem) { > ((DistributedFileSystem) fs).getClient().msync(); > } > {code} > This should be exposed for both {{FileSystem}} and {{FileContext}}.
[jira] [Updated] (HDFS-15567) [SBN Read] HDFS should expose msync() API to allow downstream applications to call it explicitly.
[ https://issues.apache.org/jira/browse/HDFS-15567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated HDFS-15567: Target Version/s: 2.10.2 (was: 2.10.1) > [SBN Read] HDFS should expose msync() API to allow downstream applications > to call it explicitly. > -- > > Key: HDFS-15567 > URL: https://issues.apache.org/jira/browse/HDFS-15567 > Project: Hadoop HDFS > Issue Type: Improvement > Components: ha, hdfs-client >Reporter: Konstantin Shvachko >Priority: Major > > Consistent reads from Standby introduced the {{msync()}} API in HDFS-13688, which > updates the client's state ID with the current state of the Active NameNode to > guarantee consistency of subsequent calls to an ObserverNode. Currently this > API is exposed only via {{DFSClient}}, which makes it hard for applications > to access {{msync()}}. One way is to use something like this: > {code} > if (fs instanceof DistributedFileSystem) { > ((DistributedFileSystem) fs).getClient().msync(); > } > {code} > This should be exposed for both {{FileSystem}} and {{FileContext}}.
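The API shape proposed in HDFS-15567 can be sketched as a base-class method that throws by default and is overridden where the semantics exist. The classes below are stand-ins, not the real org.apache.hadoop.fs types, and the state-ID handling is a deliberately simplified assumption made for illustration.

```java
import java.io.IOException;

// Stand-in types (not org.apache.hadoop.fs): a sketch of surfacing msync()
// on the FileSystem base class instead of requiring a cast to DFSClient.
abstract class BaseFileSystem {
    // Default: most FileSystem implementations have no NameNode to sync with.
    public void msync() throws IOException {
        throw new UnsupportedOperationException("msync is not supported by this FileSystem");
    }
}

class ModelDistributedFileSystem extends BaseFileSystem {
    // Simplified stand-ins for the client's and the Active NameNode's state IDs.
    long clientStateId = 0;
    long activeNamenodeStateId = 42;

    @Override
    public void msync() {
        // Advance the client's state ID to the Active NameNode's current one,
        // so subsequent reads against an ObserverNode see at least this state.
        clientStateId = activeNamenodeStateId;
    }
}

public class MsyncSketch {
    public static void main(String[] args) throws IOException {
        BaseFileSystem fs = new ModelDistributedFileSystem();
        fs.msync();  // no instanceof check or cast needed at the call site
        System.out.println("clientStateId = "
            + ((ModelDistributedFileSystem) fs).clientStateId);
    }
}
```

With this shape, the instanceof/cast workaround shown in the issue description collapses to a plain `fs.msync()` call, and file systems without observer-read semantics fail loudly rather than silently skipping the sync.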