[jira] [Commented] (HDFS-14993) checkDiskError doesn't work during datanode startup

2019-12-06 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990370#comment-16990370
 ] 

Yang Yun commented on HDFS-14993:
-

Thanks [~ayushtkn] and [~weichiu] for the review. Changed according to the 
comments.

> checkDiskError doesn't work during datanode startup
> ---
>
> Key: HDFS-14993
> URL: https://issues.apache.org/jira/browse/HDFS-14993
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Major
> Attachments: HDFS-14993.patch, HDFS-14993.patch, HDFS-14993.patch
>
>
> The function checkDiskError() is called before addBlockPool, but the 
> bpSlices map is still empty at that time, so the function check() in 
> FsVolumeImpl.java does nothing.
> @Override
> public VolumeCheckResult check(VolumeCheckContext ignored)
>     throws DiskErrorException {
>   // TODO:FEDERATION valid synchronization
>   for (BlockPoolSlice s : bpSlices.values()) {
>     s.checkDirs();
>   }
>   return VolumeCheckResult.HEALTHY;
> }
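
To illustrate why the check passes vacuously, here is a minimal, self-contained sketch. It does not use the real Hadoop classes: SketchVolume, addSlice, and the Runnable standing in for BlockPoolSlice.checkDirs() are hypothetical names introduced only for this example. It shows that looping over an empty map runs no directory checks and still reports the volume as healthy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration; these are not the Hadoop classes.
class EmptyVolumeCheckSketch {

  static class SketchVolume {
    // Each Runnable plays the role of one BlockPoolSlice.checkDirs() call.
    private final Map<String, Runnable> bpSlices = new ConcurrentHashMap<>();
    int checksRun = 0;

    void addSlice(String bpid, Runnable dirCheck) {
      bpSlices.put(bpid, dirCheck);
    }

    /** Mirrors the loop in the quoted check(): one check per registered slice. */
    String check() {
      for (Runnable dirCheck : bpSlices.values()) {
        dirCheck.run();
        checksRun++;
      }
      return "HEALTHY";
    }
  }

  public static void main(String[] args) {
    SketchVolume volume = new SketchVolume();
    // No slice registered yet: nothing is checked, but the result is HEALTHY.
    System.out.println(volume.check() + ", checks run: " + volume.checksRun);

    // Once a slice is registered, a failing directory check surfaces.
    volume.addSlice("BP-1", () -> {
      throw new IllegalStateException("disk error");
    });
    try {
      volume.check();
    } catch (IllegalStateException e) {
      System.out.println("after adding a slice: " + e.getMessage());
    }
  }
}
```

The same failing check is invisible as long as it has not been registered, which is the situation the issue describes for datanode startup.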



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14993) checkDiskError doesn't work during datanode startup

2019-12-06 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-14993:

Attachment: HDFS-14993.patch
Status: Patch Available  (was: Open)

> checkDiskError doesn't work during datanode startup
> ---
>
> Key: HDFS-14993
> URL: https://issues.apache.org/jira/browse/HDFS-14993
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Major
> Attachments: HDFS-14993.patch, HDFS-14993.patch, HDFS-14993.patch
>
>
> The function checkDiskError() is called before addBlockPool, but the 
> bpSlices map is still empty at that time, so the function check() in 
> FsVolumeImpl.java does nothing.
> @Override
> public VolumeCheckResult check(VolumeCheckContext ignored)
>     throws DiskErrorException {
>   // TODO:FEDERATION valid synchronization
>   for (BlockPoolSlice s : bpSlices.values()) {
>     s.checkDirs();
>   }
>   return VolumeCheckResult.HEALTHY;
> }






[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-06 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990369#comment-16990369
 ] 

Xieming Li commented on HDFS-14983:
---

Hi [~elgoiri], thank you for your prompt response.

I have fixed almost all the issues pointed out.
{quote}Can we avoid the SuppressWarnings in 
TestRouterRefreshSuperUserGroupsConfiguration?
{quote}
I have deleted the SuppressWarnings from that file, but this produces a 
LineLength Checkstyle error, because the "\{@link 
org.apache.hadoop.hdfs.server.federation.router.RouterAdminServer#refreshSuperUserGroupsConfiguration}"
 in the javadoc cannot be broken into two lines.

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.003.patch, 
> HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Updated] (HDFS-14993) checkDiskError doesn't work during datanode startup

2019-12-06 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-14993:

Status: Open  (was: Patch Available)

> checkDiskError doesn't work during datanode startup
> ---
>
> Key: HDFS-14993
> URL: https://issues.apache.org/jira/browse/HDFS-14993
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Major
> Attachments: HDFS-14993.patch, HDFS-14993.patch
>
>
> The function checkDiskError() is called before addBlockPool, but the 
> bpSlices map is still empty at that time, so the function check() in 
> FsVolumeImpl.java does nothing.
> @Override
> public VolumeCheckResult check(VolumeCheckContext ignored)
>     throws DiskErrorException {
>   // TODO:FEDERATION valid synchronization
>   for (BlockPoolSlice s : bpSlices.values()) {
>     s.checkDirs();
>   }
>   return VolumeCheckResult.HEALTHY;
> }






[jira] [Updated] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-06 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14983:
--
Status: Open  (was: Patch Available)

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.003.patch, 
> HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Updated] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-06 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14983:
--
Attachment: HDFS-14983.003.patch
Status: Patch Available  (was: Open)

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.003.patch, 
> HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Commented] (HDFS-15037) Encryption Zone operations should not block other RPC calls while retreiving encryption keys.

2019-12-06 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990275#comment-16990275
 ] 

Konstantin Shvachko commented on HDFS-15037:


Ah, OK. Good catch, Wei-Chiu. So maybe {{dirLock}}-only locking in these 
methods was introduced by mistake, and we should fix it in a different 
issue? [~xiaochen], maybe you could give some background here.
In any case, it would be good to make KMS calls outside of the namesystem lock.
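
The idea of making the slow key retrieval outside of the namesystem lock can be sketched as follows. This is only an illustration under stated assumptions, not NameNode code: the class, the two ReentrantReadWriteLock fields, and createZone are hypothetical names, and the Supplier stands in for a KMS call. The sketch fetches the key with no lock held and then takes the outer lock before the inner one just for the in-memory update.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical sketch; names do not correspond to the real NameNode classes.
class KeyFetchOutsideLockSketch {
  private final ReentrantReadWriteLock namesystemLock = new ReentrantReadWriteLock();
  private final ReentrantReadWriteLock dirLock = new ReentrantReadWriteLock();
  private String zoneKey;

  /** Returns true if the calling thread holds either write lock. */
  boolean holdsAnyWriteLock() {
    return namesystemLock.isWriteLockedByCurrentThread()
        || dirLock.isWriteLockedByCurrentThread();
  }

  /** Slow remote call first, with no lock held; locks guard only the update. */
  void createZone(Supplier<String> slowKeyFetch) {
    String key = slowKeyFetch.get();      // stand-in for the KMS call, no lock held
    namesystemLock.writeLock().lock();    // outer lock first
    try {
      dirLock.writeLock().lock();         // then the inner lock
      try {
        zoneKey = key;
      } finally {
        dirLock.writeLock().unlock();
      }
    } finally {
      namesystemLock.writeLock().unlock();
    }
  }

  String zoneKey() {
    return zoneKey;
  }

  public static void main(String[] args) {
    KeyFetchOutsideLockSketch ns = new KeyFetchOutsideLockSketch();
    ns.createZone(() -> {
      System.out.println("write lock held during fetch: " + ns.holdsAnyWriteLock());
      return "key-v1";
    });
    System.out.println("stored key: " + ns.zoneKey());
  }
}
```

One consequence of this shape is that state read before the fetch may have changed by the time the locks are taken, so a real implementation would presumably need to re-validate it under the lock.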

> Encryption Zone operations should not block other RPC calls while retreiving 
> encryption keys.
> -
>
> Key: HDFS-15037
> URL: https://issues.apache.org/jira/browse/HDFS-15037
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Priority: Major
>
> I believe the intention was to avoid blocking other operations while 
> retrieving keys by holding only {{FSDirectory.dirLock}}. But in reality all 
> other operations first enter {{FSNamesystemLock}} and then {{dirLock}}, so they 
> are all blocked waiting for the key.
> We see a substantial increase in RPC wait time ({{RpcQueueTimeAvgTime}}) on 
> the NameNode when encryption operations are intermixed with regular workloads.






[jira] [Updated] (HDFS-15037) Encryption Zone operations should not block other RPC calls while retreiving encryption keys.

2019-12-06 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-15037:
---
Summary: Encryption Zone operations should not block other RPC calls while 
retreiving encryption keys.  (was: Encryption Zone operations should not block 
other RPC calls while retreivingencryption keys.)

> Encryption Zone operations should not block other RPC calls while retreiving 
> encryption keys.
> -
>
> Key: HDFS-15037
> URL: https://issues.apache.org/jira/browse/HDFS-15037
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Priority: Major
>
> I believe the intention was to avoid blocking other operations while 
> retrieving keys by holding only {{FSDirectory.dirLock}}. But in reality all 
> other operations first enter {{FSNamesystemLock}} and then {{dirLock}}, so they 
> are all blocked waiting for the key.
> We see a substantial increase in RPC wait time ({{RpcQueueTimeAvgTime}}) on 
> the NameNode when encryption operations are intermixed with regular workloads.






[jira] [Commented] (HDFS-15032) Balancer crashes when it fails to contact an unavailable NN via ObserverReadProxyProvider

2019-12-06 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990272#comment-16990272
 ] 

Konstantin Shvachko commented on HDFS-15032:


Erik, the patch looks good and the tests work as expected for me.
I did not understand what you are trying to achieve with the method {{toString()}}. 
It is a good thing to define it for class {{ProxyCombiner}}, because then I can 
see all the proxies it combines in the debugger. But I don't see why invoking 
{{toString()}} on, say, {{ClientProtocol}} should be diverted to 
{{ProxyCombiner.toString()}}. I just don't see what it is useful for, and it adds 
a string comparison to all other calls, hurting performance.
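
The concern can be made concrete with a small dynamic-proxy sketch. It is not the ProxyCombiner code from the patch: Greeter, CombinerHandler, and wrap are hypothetical names, and the handler is only assumed to resemble the pattern under discussion. It shows that an invocation handler which special-cases toString() by method name pays for that comparison on every proxied call, not just on toString().

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical sketch; this is not Hadoop's ProxyCombiner.
class CombinerToStringSketch {

  interface Greeter {
    String greet();
  }

  /** Answers toString() itself and forwards every other call to the target. */
  static class CombinerHandler implements InvocationHandler {
    private final Object target;
    int nameComparisons = 0;

    CombinerHandler(Object target) {
      this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
      nameComparisons++; // the name comparison below runs for every call
      if ("toString".equals(method.getName()) && method.getParameterCount() == 0) {
        return "CombinedProxy[" + target + "]";
      }
      return method.invoke(target, args);
    }
  }

  static Greeter wrap(CombinerHandler handler) {
    return (Greeter) Proxy.newProxyInstance(
        Greeter.class.getClassLoader(), new Class<?>[] {Greeter.class}, handler);
  }

  public static void main(String[] args) {
    Greeter real = new Greeter() {
      @Override
      public String greet() {
        return "hello";
      }

      @Override
      public String toString() {
        return "RealGreeter";
      }
    };
    CombinerHandler handler = new CombinerHandler(real);
    Greeter proxy = wrap(handler);
    System.out.println(proxy.greet());
    System.out.println(proxy);
    System.out.println("name comparisons: " + handler.nameComparisons);
  }
}
```

Defining toString() only on the combiner class itself, as the comment suggests, would keep the debugging benefit without adding the comparison to the hot path.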

> Balancer crashes when it fails to contact an unavailable NN via 
> ObserverReadProxyProvider
> -
>
> Key: HDFS-15032
> URL: https://issues.apache.org/jira/browse/HDFS-15032
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 2.10.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-15032.000.patch, HDFS-15032.001.patch, 
> HDFS-15032.002.patch
>
>
> When trying to run the Balancer using ObserverReadProxyProvider (to allow it 
> to read from the Observer Node as described in HDFS-14979), if one of the NNs 
> isn't running, the Balancer will crash.






[jira] [Commented] (HDFS-15017) Remove redundant import of AtomicBoolean in NameNodeConnector.

2019-12-06 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990267#comment-16990267
 ] 

Wei-Chiu Chuang commented on HDFS-15017:


+1

> Remove redundant import of AtomicBoolean in NameNodeConnector.
> --
>
> Key: HDFS-15017
> URL: https://issues.apache.org/jira/browse/HDFS-15017
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, hdfs
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
>  Labels: newbie
> Attachments: HDFS-15017-branch-2.000.patch
>
>
> Should remove redundant import.
> Looks like it is specific to branch 2.10. Trunk and 3x branches don't have it.






[jira] [Updated] (HDFS-15028) Keep the capacity of volume and reduce a system call

2019-12-06 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-15028:

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed. Thanks, [~hadoop_yangyun].

> Keep the capacity of volume and reduce a system call
> 
>
> Key: HDFS-15028
> URL: https://issues.apache.org/jira/browse/HDFS-15028
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-15028.patch, HDFS-15028.patch, HDFS-15028.patch, 
> HDFS-15028.patch, HDFS-15028.patch
>
>
> The capacity of the local volume does not change, so keep the first value of 
> the capacity and reuse it for each heartbeat.
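
The idea in this description can be sketched as a simple cached value. This is an assumption-laden illustration rather than the committed change: CachedCapacitySketch and its LongSupplier parameter are hypothetical, and the sketch omits any configuration the real patch may have added. It queries the filesystem once and serves later calls from memory.

```java
import java.io.File;
import java.util.function.LongSupplier;

// Hypothetical sketch; this is not the FsVolumeImpl code from the patch.
class CachedCapacitySketch {
  private final LongSupplier capacityQuery; // stand-in for the system call
  private long cachedCapacity = -1;         // -1 means "not queried yet"
  int queries = 0;

  CachedCapacitySketch(LongSupplier capacityQuery) {
    this.capacityQuery = capacityQuery;
  }

  /** Queries the underlying filesystem only on the first call. */
  synchronized long getCapacity() {
    if (cachedCapacity < 0) {
      cachedCapacity = capacityQuery.getAsLong();
      queries++;
    }
    return cachedCapacity;
  }

  public static void main(String[] args) {
    File root = new File(".");
    CachedCapacitySketch volume = new CachedCapacitySketch(root::getTotalSpace);
    // Three simulated heartbeats reuse the value from the first query.
    for (int heartbeat = 0; heartbeat < 3; heartbeat++) {
      volume.getCapacity();
    }
    System.out.println("filesystem queries: " + volume.queries);
  }
}
```

The trade-off is that a volume whose size does change, for example after an online resize, would keep reporting the stale value until the cache is reset.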






[jira] [Commented] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted

2019-12-06 Thread Íñigo Goiri (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990257#comment-16990257
 ] 

Íñigo Goiri commented on HDFS-15031:


I think we can fix the checkstyle warnings.

> Allow BootstrapStandby to download FSImage if the directory is already 
> formatted
> 
>
> Key: HDFS-15031
> URL: https://issues.apache.org/jira/browse/HDFS-15031
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Minor
> Attachments: HDFS-15031.000.patch, HDFS-15031.001.patch, 
> HDFS-15031.002.patch, HDFS-15031.003.patch, HDFS-15031.005.patch, 
> HDFS-15031.006.patch
>
>
> Currently, BootstrapStandby will only download the latest FSImage if it has 
> formatted the local image directory. This can be an issue when there are out 
> of date FSImages on a Standby NameNode, as the non-interactive mode will not 
> format the image directory, and BootstrapStandby will return an error code. 
> The changes here simply allow BootstrapStandby to download the latest FSImage 
> to the image directory, without needing to format first.






[jira] [Commented] (HDFS-15017) Remove redundant import of AtomicBoolean in NameNodeConnector.

2019-12-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990247#comment-16990247
 ] 

Hadoop QA commented on HDFS-15017:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 57s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || branch-2 Compile Tests ||
| +1 | mvninstall | 10m 23s | branch-2 passed |
| +1 | compile | 0m 53s | branch-2 passed with JDK v1.7.0_95 |
| +1 | compile | 0m 47s | branch-2 passed with JDK v1.8.0_222 |
| +1 | checkstyle | 0m 29s | branch-2 passed |
| +1 | mvnsite | 0m 54s | branch-2 passed |
| +1 | findbugs | 2m 4s | branch-2 passed |
| +1 | javadoc | 1m 8s | branch-2 passed with JDK v1.7.0_95 |
| +1 | javadoc | 0m 44s | branch-2 passed with JDK v1.8.0_222 |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 0m 48s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 48s | the patch passed |
| +1 | compile | 0m 43s | the patch passed with JDK v1.8.0_222 |
| +1 | javac | 0m 43s | the patch passed |
| +1 | checkstyle | 0m 25s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 2 unchanged - 1 fixed = 2 total (was 3) |
| +1 | mvnsite | 0m 52s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 8s | the patch passed |
| +1 | javadoc | 1m 6s | the patch passed with JDK v1.7.0_95 |
| +1 | javadoc | 0m 40s | the patch passed with JDK v1.8.0_222 |
|| || || || Other Tests ||
| -1 | unit | 70m 45s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 99m 28s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:f555aa740b5 |
| JIRA Issue | HDFS-15017 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987788/HDFS-15017-branch-2.000.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux e830bfdd6c26 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HDFS-15028) Keep the capacity of volume and reduce a system call

2019-12-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990237#comment-16990237
 ] 

Hudson commented on HDFS-15028:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17736 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17736/])
HDFS-15028. Keep the capacity of volume and reduce a system call. (iwasakims: 
rev 11cd5b6e39adbf159891852f3482aebdde5459fb)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsVolumeList.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java


> Keep the capacity of volume and reduce a system call
> 
>
> Key: HDFS-15028
> URL: https://issues.apache.org/jira/browse/HDFS-15028
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15028.patch, HDFS-15028.patch, HDFS-15028.patch, 
> HDFS-15028.patch, HDFS-15028.patch
>
>
> The capacity of the local volume does not change, so keep the first value of 
> the capacity and reuse it for each heartbeat.






[jira] [Commented] (HDFS-15028) Keep the capacity of volume and reduce a system call

2019-12-06 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990232#comment-16990232
 ] 

Masatake Iwasaki commented on HDFS-15028:
-

+1. committing this.

> Keep the capacity of volume and reduce a system call
> 
>
> Key: HDFS-15028
> URL: https://issues.apache.org/jira/browse/HDFS-15028
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15028.patch, HDFS-15028.patch, HDFS-15028.patch, 
> HDFS-15028.patch, HDFS-15028.patch
>
>
> The capacity of the local volume does not change, so keep the first value of 
> the capacity and reuse it for each heartbeat.






[jira] [Commented] (HDFS-15017) Remove redundant import of AtomicBoolean in NameNodeConnector.

2019-12-06 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990229#comment-16990229
 ] 

Konstantin Shvachko commented on HDFS-15017:


+1 Thanks Chao for the patch.

> Remove redundant import of AtomicBoolean in NameNodeConnector.
> --
>
> Key: HDFS-15017
> URL: https://issues.apache.org/jira/browse/HDFS-15017
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, hdfs
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
>  Labels: newbie
> Attachments: HDFS-15017-branch-2.000.patch
>
>
> Should remove redundant import.
> Looks like it is specific to branch 2.10. Trunk and 3x branches don't have it.






[jira] [Commented] (HDFS-14751) Synchronize on diffs in DirectoryScanner

2019-12-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990228#comment-16990228
 ] 

Hudson commented on HDFS-14751:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17735 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17735/])
HDFS-14751. Synchronize on diffs in DirectoryScanner. Contributed by (weichiu: 
rev ecd461f940efcd8c75f4833cf09bc7a52cc0b559)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java


> Synchronize on diffs in DirectoryScanner
> 
>
> Key: HDFS-14751
> URL: https://issues.apache.org/jira/browse/HDFS-14751
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-14751.001.patch, HDFS-14751.002.patch, 
> HDFS-14751.003.patch, HDFS-14751.004.patch
>
>
> {code:java}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 21.693 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency
> [ERROR] 
> testGenerationStampInFuture(org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency)
>   Time elapsed: 7.572 s  <<< ERROR!
> java.util.ConcurrentModificationException
>   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
>   at java.util.ArrayList$Itr.next(ArrayList.java:859)
>   at 
> com.google.common.collect.AbstractMapBasedMultimap$Itr.next(AbstractMapBasedMultimap.java:1153)
>   at 
> java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1044)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:433)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNodeTestUtils.runDirectoryScanner(DataNodeTestUtils.java:202)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency.testGenerationStampInFuture(TestNameNodeMetadataConsistency.java:92)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
> Ref:[https://builds.apache.org/job/PreCommit-HDFS-Build/27567/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
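
The failure mode in this stack trace, and the kind of fix the issue title suggests, can be sketched in isolation. The code below is a simplified assumption, not DirectoryScanner itself: the real diffs structure appears from the trace to be a multimap, while this sketch uses a plain ArrayList, and all names are hypothetical. It reproduces the exception deterministically by mutating a list during iteration, then shows writers and the iterating reader synchronizing on the same collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical sketch; this is not the DirectoryScanner code.
class SynchronizedDiffsSketch {
  private final List<String> diffs = new ArrayList<>();

  void addDiff(String diff) {
    synchronized (diffs) {
      diffs.add(diff);
    }
  }

  /** Iterates under the same monitor that writers use, so no writer interleaves. */
  int reconcile() {
    int handled = 0;
    synchronized (diffs) {
      for (String diff : diffs) {
        if (diff != null) {
          handled++;
        }
      }
    }
    return handled;
  }

  /** Mutating an ArrayList while iterating over it trips the fail-fast iterator. */
  static boolean unsafeIterationFails() {
    List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
    try {
      for (String s : list) {
        list.add(s + "'");
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("unsafe iteration throws: " + unsafeIterationFails());

    SynchronizedDiffsSketch scanner = new SynchronizedDiffsSketch();
    Thread writer = new Thread(() -> {
      for (int i = 0; i < 10_000; i++) {
        scanner.addDiff("blk_" + i);
      }
    });
    writer.start();
    while (writer.isAlive()) {
      scanner.reconcile(); // safe even while the writer is running
    }
    writer.join();
    System.out.println("diffs after writer finished: " + scanner.reconcile());
  }
}
```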




[jira] [Commented] (HDFS-14476) lock too long when fix inconsistent blocks between disk and in-memory

2019-12-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990227#comment-16990227
 ] 

Hudson commented on HDFS-14476:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17735 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17735/])
HDFS-14476. lock too long when fix inconsistent blocks between disk and 
(weichiu: rev 313b76f8e92643e3412a98dc73f83437729f3984)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DirectoryScanner.java


> lock too long when fix inconsistent blocks between disk and in-memory
> -
>
> Key: HDFS-14476
> URL: https://issues.apache.org/jira/browse/HDFS-14476
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.6.0, 2.7.0, 3.0.3
>Reporter: Sean Chow
>Assignee: Sean Chow
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14476-branch-2.01.patch, HDFS-14476.00.patch, 
> HDFS-14476.002.patch, HDFS-14476.01.patch, HDFS-14476.branch-3.2.001.patch, 
> datanode-with-patch-14476.png
>
>
> When DirectoryScanner has the results of the differences between on-disk and 
> in-memory blocks, it runs {{checkAndUpdate}} to fix them. However, 
> {{FsDatasetImpl.checkAndUpdate}} is a synchronized call.
> I have about 6 million blocks on every datanode, and each 6-hour scan finds 
> about 25000 abnormal blocks to fix. That leads to a long hold of the lock on 
> the FsDatasetImpl object.
> Assuming every block needs 10 ms to fix (because of the latency of SAS disks), 
> 25000 blocks cost 250 seconds to finish. That means all reads and writes will 
> be blocked for about 4 minutes on that datanode.
>  
> {code:java}
> 2019-05-06 08:06:51,704 INFO 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
> BP-1644920766-10.223.143.220-1450099987967 Total blocks: 6850197, missing 
> metadata files:23574, missing block files:23574, missing blocks in 
> memory:47625, mismatched blocks:0
> ...
> 2019-05-06 08:16:41,625 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Took 588402ms to process 1 commands from NN
> {code}
> Take long time to process command from nn because threads are blocked. And 
> namenode will see long lastContact time for this datanode.
> Maybe this affect all hdfs versions.
> *how to fix:*
> just like process invalidate command from namenode with 1000 batch size, fix 
> these abnormal block should be handled with batch too and sleep 2 seconds 
> between the batch to allow normal reading/writing blocks.
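
The batching idea proposed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the HDFS-14476 patch: `BatchedFixer`, `fixInBatches`, and the plain `ReentrantLock` standing in for the FsDatasetImpl lock are hypothetical names chosen for illustration. With the figures from the report (25000 blocks at about 10 ms each) and a batch size of 1000, each lock hold would shrink to roughly 10 seconds, at the cost of 24 pauses of 2 seconds.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class BatchedFixer {
  private BatchedFixer() {}

  /**
   * Applies fixOne to every item, holding the lock for at most batchSize
   * items at a time and sleeping pauseMillis between batches so that other
   * threads can take the lock. Returns the number of batches processed.
   */
  static <T> int fixInBatches(List<T> items, int batchSize, long pauseMillis,
      ReentrantLock lock, Consumer<? super T> fixOne) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    int batches = 0;
    for (int start = 0; start < items.size(); start += batchSize) {
      int end = Math.min(start + batchSize, items.size());
      lock.lock();
      try {
        // Only this slice is fixed while the lock is held.
        for (T item : items.subList(start, end)) {
          fixOne.accept(item);
        }
      } finally {
        lock.unlock();
      }
      batches++;
      // Pause between batches (not after the last one).
      if (end < items.size() && pauseMillis > 0) {
        try {
          Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
          // Restore the interrupt flag and stop early.
          Thread.currentThread().interrupt();
          return batches;
        }
      }
    }
    return batches;
  }
}
```

The trade-off is that the whole reconciliation takes longer end to end, but no single reader or writer waits for more than one batch.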



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15005) Backport HDFS-12300 to branch-2

2019-12-06 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15005:
---
Fix Version/s: 2.11.0
   2.10.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~csun]. Committed the patch to branch-2 and branch-2.10.

> Backport HDFS-12300 to branch-2
> ---
>
> Key: HDFS-15005
> URL: https://issues.apache.org/jira/browse/HDFS-15005
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Fix For: 2.10.1, 2.11.0
>
> Attachments: HDFS-15005-branch-2.000.patch, 
> HDFS-15005-branch-2.001.patch, HDFS-15005-branch-2.002.patch, 
> HDFS-15005-branch-2.003.patch
>
>
> Having DT-related information in the audit log is very useful. This tracks 
> the effort to backport HDFS-12300 to branch-2.






[jira] [Updated] (HDFS-14751) Synchronize on diffs in DirectoryScanner

2019-12-06 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14751:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks [~leosun08]

> Synchronize on diffs in DirectoryScanner
> 
>
> Key: HDFS-14751
> URL: https://issues.apache.org/jira/browse/HDFS-14751
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-14751.001.patch, HDFS-14751.002.patch, 
> HDFS-14751.003.patch, HDFS-14751.004.patch
>
>
> {code:java}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 21.693 s <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency
> [ERROR] testGenerationStampInFuture(org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency)  Time elapsed: 7.572 s  <<< ERROR!
> java.util.ConcurrentModificationException
>   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
>   at java.util.ArrayList$Itr.next(ArrayList.java:859)
>   at com.google.common.collect.AbstractMapBasedMultimap$Itr.next(AbstractMapBasedMultimap.java:1153)
>   at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1044)
>   at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:433)
>   at org.apache.hadoop.hdfs.server.datanode.DataNodeTestUtils.runDirectoryScanner(DataNodeTestUtils.java:202)
>   at org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency.testGenerationStampInFuture(TestNameNodeMetadataConsistency.java:92)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
> Ref:[https://builds.apache.org/job/PreCommit-HDFS-Build/27567/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
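
The trace above is the usual signature of an ArrayList being modified while an iterator over it is in use. A minimal sketch of the "synchronize on the collection" pattern that the issue title refers to follows. It is an assumption about the general approach, not the HDFS-14751 patch itself; `SynchronizedDiffs` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a shared list of scan differences. */
final class SynchronizedDiffs {
  private final List<String> diffs = new ArrayList<>();

  /** Writers take the same monitor that readers use while iterating. */
  void add(String blockId) {
    synchronized (diffs) {
      diffs.add(blockId);
    }
  }

  /** Iterates while holding the monitor, so no writer can interleave. */
  void forEach(Consumer<String> action) {
    synchronized (diffs) {
      for (String d : diffs) {
        action.accept(d);
      }
    }
  }

  /** Copies under the monitor; callers may iterate the copy freely. */
  List<String> snapshot() {
    synchronized (diffs) {
      return new ArrayList<>(diffs);
    }
  }
}
```

Note that the action passed to `forEach` must not call `add` itself: the monitor is reentrant, so a same-thread modification during iteration would still throw. Iterating a `snapshot()` avoids that at the cost of a copy.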






[jira] [Resolved] (HDFS-14476) lock too long when fix inconsistent blocks between disk and in-memory

2019-12-06 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-14476.

Resolution: Fixed

Pushed to trunk.

> lock too long when fix inconsistent blocks between disk and in-memory
> -
>
> Key: HDFS-14476
> URL: https://issues.apache.org/jira/browse/HDFS-14476
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.6.0, 2.7.0, 3.0.3
>Reporter: Sean Chow
>Assignee: Sean Chow
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14476-branch-2.01.patch, HDFS-14476.00.patch, 
> HDFS-14476.002.patch, HDFS-14476.01.patch, HDFS-14476.branch-3.2.001.patch, 
> datanode-with-patch-14476.png
>
>






[jira] [Updated] (HDFS-14751) Synchronize on diffs in DirectoryScanner

2019-12-06 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14751:
---
Fix Version/s: 3.3.0

> Synchronize on diffs in DirectoryScanner
> 
>
> Key: HDFS-14751
> URL: https://issues.apache.org/jira/browse/HDFS-14751
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-14751.001.patch, HDFS-14751.002.patch, 
> HDFS-14751.003.patch, HDFS-14751.004.patch
>
>






[jira] [Updated] (HDFS-14476) lock too long when fix inconsistent blocks between disk and in-memory

2019-12-06 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14476:
---
Fix Version/s: 3.3.0

> lock too long when fix inconsistent blocks between disk and in-memory
> -
>
> Key: HDFS-14476
> URL: https://issues.apache.org/jira/browse/HDFS-14476
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.6.0, 2.7.0, 3.0.3
>Reporter: Sean Chow
>Assignee: Sean Chow
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14476-branch-2.01.patch, HDFS-14476.00.patch, 
> HDFS-14476.002.patch, HDFS-14476.01.patch, HDFS-14476.branch-3.2.001.patch, 
> datanode-with-patch-14476.png
>
>






[jira] [Commented] (HDFS-14751) Synchronize on diffs in DirectoryScanner

2019-12-06 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990218#comment-16990218
 ] 

Wei-Chiu Chuang commented on HDFS-14751:


+1
I'll add version 03 on top of HDFS-14476.

> Synchronize on diffs in DirectoryScanner
> 
>
> Key: HDFS-14751
> URL: https://issues.apache.org/jira/browse/HDFS-14751
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Minor
> Attachments: HDFS-14751.001.patch, HDFS-14751.002.patch, 
> HDFS-14751.003.patch, HDFS-14751.004.patch
>
>






[jira] [Commented] (HDFS-15012) NN fails to parse Edit logs after applying HDFS-13101

2019-12-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990206#comment-16990206
 ] 

Hadoop QA commented on HDFS-15012:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 21m 25s | trunk passed |
| +1 | compile | 1m 0s | trunk passed |
| +1 | checkstyle | 0m 45s | trunk passed |
| +1 | mvnsite | 1m 7s | trunk passed |
| +1 | shadedclient | 15m 4s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 57s | trunk passed |
| +1 | javadoc | 1m 26s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 16s | the patch passed |
| +1 | compile | 1m 13s | the patch passed |
| +1 | javac | 1m 13s | the patch passed |
| -0 | checkstyle | 0m 53s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 141 unchanged - 0 fixed = 143 total (was 141) |
| +1 | mvnsite | 1m 29s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 16m 4s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 57s | the patch passed |
| +1 | javadoc | 1m 27s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 117m 33s | hadoop-hdfs in the patch failed. |
| -1 | asflicense | 0m 47s | The patch generated 1 ASF License warnings. |
| | | 187m 58s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
|   | hadoop.hdfs.web.TestWebHDFSAcl |
|   | hadoop.hdfs.web.TestWebHDFSForHA |
|   | hadoop.hdfs.server.namenode.TestAuditLogs |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.web.TestWebHDFS |
|   | hadoop.hdfs.server.namenode.TestFSEditLogLoader |
|   | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
|   | hadoop.hdfs.server.namenode.TestLeaseManager |
|   | hadoop.hdfs.TestFileChecksum |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery |
|   | hadoop.hdfs.server.namenode.TestPersistentStoragePolicySatisfier |
|   | hadoop.hdfs.web.TestWebHDFSXAttr |
|   | hadoop.hdfs.TestEncryptedTransfer |
|   | hadoop.hdfs.web.TestWebHdfsTokens |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15012 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987780/HDFS-15012.000.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 

[jira] [Commented] (HDFS-15037) Encryption Zone operations should not block other RPC calls while retrieving encryption keys.

2019-12-06 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990204#comment-16990204
 ] 

Wei-Chiu Chuang commented on HDFS-15037:


Thanks [~shv] for reporting the issue. I believe these three calls were made to 
support reencrypt (HDFS-10899), which was added in Hadoop 3. Did you 
backport reencrypt to your branch?

> Encryption Zone operations should not block other RPC calls while 
> retrieving encryption keys.
> 
>
> Key: HDFS-15037
> URL: https://issues.apache.org/jira/browse/HDFS-15037
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Priority: Major
>
> I believe the intention was to avoid blocking other operations while keys 
> are retrieved under {{FSDirectory.dirLock}}. But in reality all other 
> operations first enter {{FSNamesystemLock}} and then {{dirLock}}, so they 
> are all blocked waiting for the key.
> We see a substantial increase in RPC wait time ({{RpcQueueTimeAvgTime}}) on 
> the NameNode when encryption operations are intermixed with regular workloads.
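
One general way to keep a slow key lookup from stalling everything queued behind a lock is to do the lookup before taking the lock and hold the lock only for the in-memory update. The sketch below illustrates that pattern only. It is not the NameNode code and not a proposed patch: `ZoneKeyRegistry`, `createZone`, and the `Supplier` standing in for the key provider call are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical registry; illustrates lock scope only. */
final class ZoneKeyRegistry {
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  private final Map<String, String> zoneKeys = new HashMap<>();

  /** Fetches the key with no lock held, then records it under the lock. */
  String createZone(String path, Supplier<String> keyFetcher) {
    // Slow, possibly remote call: no lock is held here.
    String key = keyFetcher.get();
    // Short critical section: only the in-memory update is guarded.
    fsLock.writeLock().lock();
    try {
      zoneKeys.put(path, key);
    } finally {
      fsLock.writeLock().unlock();
    }
    return key;
  }

  String keyFor(String path) {
    fsLock.readLock().lock();
    try {
      return zoneKeys.get(path);
    } finally {
      fsLock.readLock().unlock();
    }
  }

  /** Exposed so callers can check what is held during the fetch. */
  boolean isWriteLockedByCurrentThread() {
    return fsLock.isWriteLockedByCurrentThread();
  }
}
```

A real implementation would also need to revalidate any state it read before the fetch, since other operations can run between the fetch and the locked update.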






[jira] [Updated] (HDFS-15017) Remove redundant import of AtomicBoolean in NameNodeConnector.

2019-12-06 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-15017:

Status: Patch Available  (was: Open)

> Remove redundant import of AtomicBoolean in NameNodeConnector.
> --
>
> Key: HDFS-15017
> URL: https://issues.apache.org/jira/browse/HDFS-15017
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, hdfs
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
>  Labels: newbie
> Attachments: HDFS-15017-branch-2.000.patch
>
>
> Should remove redundant import.
> Looks like it is specific to branch 2.10. Trunk and 3x branches don't have it.






[jira] [Updated] (HDFS-15017) Remove redundant import of AtomicBoolean in NameNodeConnector.

2019-12-06 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-15017:

Attachment: HDFS-15017-branch-2.000.patch

> Remove redundant import of AtomicBoolean in NameNodeConnector.
> --
>
> Key: HDFS-15017
> URL: https://issues.apache.org/jira/browse/HDFS-15017
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, hdfs
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
>  Labels: newbie
> Attachments: HDFS-15017-branch-2.000.patch
>
>
> Should remove redundant import.
> Looks like it is specific to branch 2.10. Trunk and 3x branches don't have it.






[jira] [Commented] (HDFS-15017) Remove redundant import of AtomicBoolean in NameNodeConnector.

2019-12-06 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990201#comment-16990201
 ] 

Chao Sun commented on HDFS-15017:
-

Seems like a trivial change - the import was added by HDFS-7073

> Remove redundant import of AtomicBoolean in NameNodeConnector.
> --
>
> Key: HDFS-15017
> URL: https://issues.apache.org/jira/browse/HDFS-15017
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, hdfs
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
>  Labels: newbie
> Attachments: HDFS-15017-branch-2.000.patch
>
>
> Should remove redundant import.
> Looks like it is specific to branch 2.10. Trunk and 3x branches don't have it.






[jira] [Updated] (HDFS-15017) Remove redundant import of AtomicBoolean in NameNodeConnector.

2019-12-06 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-15017:

Attachment: (was: HDFS-15017-branch-2.000.patch)

> Remove redundant import of AtomicBoolean in NameNodeConnector.
> --
>
> Key: HDFS-15017
> URL: https://issues.apache.org/jira/browse/HDFS-15017
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, hdfs
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
>  Labels: newbie
>
> Should remove redundant import.
> Looks like it is specific to branch 2.10. Trunk and 3x branches don't have it.






[jira] [Updated] (HDFS-15017) Remove redundant import of AtomicBoolean in NameNodeConnector.

2019-12-06 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-15017:

Attachment: HDFS-15017-branch-2.000.patch

> Remove redundant import of AtomicBoolean in NameNodeConnector.
> --
>
> Key: HDFS-15017
> URL: https://issues.apache.org/jira/browse/HDFS-15017
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, hdfs
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
>  Labels: newbie
>
> Should remove redundant import.
> Looks like it is specific to branch 2.10. Trunk and 3x branches don't have it.






[jira] [Commented] (HDFS-14852) Remove of LowRedundancyBlocks do NOT remove the block from all queues

2019-12-06 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990192#comment-16990192
 ] 

Wei-Chiu Chuang commented on HDFS-14852:


[~ferhui]
If the symptom you saw was "web UI reporting missing blocks but the file path 
was empty", it would have been HDFS-13999.
But since you reported this issue on a Hadoop 3 cluster, that wouldn't be 
possible.

I added ec as a component, but it looks like I was wrong: it doesn't seem to be 
EC related.

Additionally, I would like to see a test added to cover the change inside 
BlockManager. The attached test code covers LowRedundancyBlocks, and I am 
concerned because BlockManager is a hugely complex piece of code.

[~sodonnell] you've looked at LowRedundancyBlocks recently. What do you think 
about the change?

> Remove of LowRedundancyBlocks do NOT remove the block from all queues
> -
>
> Key: HDFS-14852
> URL: https://issues.apache.org/jira/browse/HDFS-14852
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Affects Versions: 3.2.0, 3.0.3, 3.1.2, 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: CorruptBlocksMismatch.png, HDFS-14852.001.patch, 
> HDFS-14852.002.patch, HDFS-14852.003.patch, HDFS-14852.004.patch, 
> HDFS-14852.005.patch, screenshot-1.png
>
>
> LowRedundancyBlocks.java
> {code:java}
> // Some comments here
>     if (priLevel >= 0 && priLevel < LEVEL
>         && priorityQueues.get(priLevel).remove(block)) {
>       NameNode.blockStateChangeLog.debug(
>           "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block {}"
>               + " from priority queue {}",
>           block, priLevel);
>       decrementBlockStat(block, priLevel, oldExpectedReplicas);
>       return true;
>     } else {
>       // Try to remove the block from all queues if the block was
>       // not found in the queue for the given priority level.
>       for (int i = 0; i < LEVEL; i++) {
>         if (i != priLevel && priorityQueues.get(i).remove(block)) {
>           NameNode.blockStateChangeLog.debug(
>               "BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block" +
>                   " {} from priority queue {}", block, i);
>           decrementBlockStat(block, i, oldExpectedReplicas);
>           return true;
>         }
>       }
>     }
>     return false;
>   }
> {code}
> The source code is above; the comment reads as follows:
> {quote}
>   // Try to remove the block from all queues if the block was
>   // not found in the queue for the given priority level.
> {quote}
> However, the function "remove" does NOT remove the block from all queues: it 
> returns after the first queue in which the block is found.
> The function add in LowRedundancyBlocks.java is used in several places, so one 
> block may end up in two or more queues.
> We found that the corrupt blocks do not match the corrupt files on the NN web 
> UI. Maybe it is related to this.
> An initial patch is uploaded.
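
The difference between "remove the first match" and "remove from all queues" can be sketched as follows. This is a simplified illustration, not the HDFS patch: the class {{PriorityQueuesSketch}}, its method names, the use of plain block IDs and the value of {{LEVEL}} are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for LowRedundancyBlocks: one set of block IDs per priority level.
class PriorityQueuesSketch {
  static final int LEVEL = 5; // assumed number of priority levels
  private final List<Set<Long>> priorityQueues = new ArrayList<>();

  PriorityQueuesSketch() {
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new HashSet<>());
    }
  }

  void add(long block, int priLevel) {
    priorityQueues.get(priLevel).add(block);
  }

  // Mirrors the quoted logic: stops at the first queue that held the block.
  boolean removeFirstMatch(long block, int priLevel) {
    if (priLevel >= 0 && priLevel < LEVEL
        && priorityQueues.get(priLevel).remove(block)) {
      return true;
    }
    for (int i = 0; i < LEVEL; i++) {
      if (i != priLevel && priorityQueues.get(i).remove(block)) {
        return true;
      }
    }
    return false;
  }

  // Alternative: visits every queue, so no stale copy is left behind.
  boolean removeFromAll(long block) {
    boolean removed = false;
    for (Set<Long> queue : priorityQueues) {
      removed |= queue.remove(block);
    }
    return removed;
  }

  // Number of queues that currently contain the block.
  int copiesOf(long block) {
    int copies = 0;
    for (Set<Long> queue : priorityQueues) {
      if (queue.contains(block)) {
        copies++;
      }
    }
    return copies;
  }
}
```

If a block was added at two levels, {{removeFirstMatch}} leaves one copy behind while {{removeFromAll}} clears both. A real change would also have to keep the per-level statistics consistent, which this sketch ignores.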






[jira] [Commented] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted

2019-12-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990189#comment-16990189
 ] 

Hadoop QA commented on HDFS-15031:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 20s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 14s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 35s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 2 new + 26 unchanged - 0 fixed = 28 total (was 26) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 24s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 25s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}150m 12s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestRedudantBlocks |
|   | hadoop.hdfs.TestDeadNodeDetection |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-15031 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987778/HDFS-15031.006.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 3f9021029e81 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 76bb297 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28475/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | 

[jira] [Updated] (HDFS-15037) Encryption Zone operations should not block other RPC calls while retrieving encryption keys.

2019-12-06 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-15037:
---
Description: 
I believe the intention was to avoid blocking other operations while 
retrieving keys while holding only {{FSDirectory.dirLock}}. But in reality all 
other operations first acquire {{FSNamesystemLock}} and then {{dirLock}}, so they 
are all blocked waiting for the key.
We see a substantial increase in RPC wait time ({{RpcQueueTimeAvgTime}}) on 
the NameNode when encryption operations are intermixed with regular workloads.

  was:
I believe it was an intention to avoid blocking other operations while 
retrieving keys with holding {{[FSDirectory.dirLock}}. But in reality all other 
operations enter first {{FSNamesystemLock}} then {{dirLock}}. So they are all 
blocked waiting for the key.
We see substantial increase in RPC wait time ({{RpcQueueTimeAvgTime}}) on 
NameNode when encryption operations are intermixed with regular workloads.


Here are the three methods, which hold only {{FSDirectory.dirLock}}, but not 
{{FSNamesystemLock}}:
* {{ReencryptionHandler.run()}}
* {{FSDirEncryptionZoneOp.getKeyNameForZone()}}
* {{EncryptionZoneManager.pauseForTestingAfterNthCheckpoint()}}

It looks to me like the code needs to be rearranged to use some other lock for 
key retrieval.
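
One way to picture the rearrangement is to perform the slow key retrieval before taking the lock and to keep only the short state update inside it. The sketch below is a deliberately simplified assumption: it models a single lock ({{fsLock}}) instead of the real {{FSNamesystemLock}} and {{dirLock}} pair, and {{KeyFetchOutsideLockSketch}}, {{fetchInsideLock}} and {{fetchOutsideLock}} are hypothetical names, not HDFS APIs.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical illustration of moving a slow call out of a critical section.
class KeyFetchOutsideLockSketch {
  // Stand-in for a global lock that every other operation must also acquire.
  final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  String lastKey;

  // Problematic pattern: the slow provider call runs while the lock is held,
  // so every other operation waits for the key as well.
  void fetchInsideLock(Supplier<String> keyProvider) {
    fsLock.writeLock().lock();
    try {
      lastKey = keyProvider.get();
    } finally {
      fsLock.writeLock().unlock();
    }
  }

  // Rearranged pattern: fetch first, then lock only for the short update.
  void fetchOutsideLock(Supplier<String> keyProvider) {
    String key = keyProvider.get(); // slow call, no lock held
    fsLock.writeLock().lock();
    try {
      lastKey = key; // short critical section
    } finally {
      fsLock.writeLock().unlock();
    }
  }
}
```

A real change would probably also need to re-validate, after acquiring the lock, that the state observed before the fetch is still current; that step is omitted here.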

> Encryption Zone operations should not block other RPC calls while 
> retrieving encryption keys.
> 
>
> Key: HDFS-15037
> URL: https://issues.apache.org/jira/browse/HDFS-15037
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Priority: Major
>
> I believe the intention was to avoid blocking other operations while 
> retrieving keys while holding only {{FSDirectory.dirLock}}. But in reality all 
> other operations first acquire {{FSNamesystemLock}} and then {{dirLock}}, so 
> they are all blocked waiting for the key.
> We see a substantial increase in RPC wait time ({{RpcQueueTimeAvgTime}}) on 
> the NameNode when encryption operations are intermixed with regular workloads.






[jira] [Created] (HDFS-15037) Encryption Zone operations should not block other RPC calls while retrieving encryption keys.

2019-12-06 Thread Konstantin Shvachko (Jira)
Konstantin Shvachko created HDFS-15037:
--

 Summary: Encryption Zone operations should not block other RPC 
calls while retrieving encryption keys.
 Key: HDFS-15037
 URL: https://issues.apache.org/jira/browse/HDFS-15037
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption, namenode
Affects Versions: 2.10.0
Reporter: Konstantin Shvachko


I believe the intention was to avoid blocking other operations while 
retrieving keys while holding only {{FSDirectory.dirLock}}. But in reality all 
other operations first acquire {{FSNamesystemLock}} and then {{dirLock}}, so they 
are all blocked waiting for the key.
We see a substantial increase in RPC wait time ({{RpcQueueTimeAvgTime}}) on 
the NameNode when encryption operations are intermixed with regular workloads.






[jira] [Commented] (HDFS-13616) Batch listing of multiple directories

2019-12-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990172#comment-16990172
 ] 

Hadoop QA commented on HDFS-13616:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 12s{color} | {color:red} HDFS-13616 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13616 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28477/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 3.2.0
>Reporter: Andrew Wang
>Assignee: Chao Sun
>Priority: Major
> Attachments: BenchmarkListFiles.java, HDFS-13616.001.patch, 
> HDFS-13616.002.patch
>
>
> One of the dominant workloads for external metadata services is listing of 
> partition directories. This can end up being bottlenecked on round-trip time 
> (RTT) when partition directories contain a small number of files. This is 
> fairly common, since fine-grained partitioning is used for partition pruning 
> by the query engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. 
> Initial benchmarks show a 10-20x improvement in metadata loading performance.
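
The amortization argument can be made concrete with a back-of-the-envelope cost model. The sketch below is only an illustration under stated assumptions: {{BatchedListingCost}} and its methods are hypothetical, each call is assumed to cost one RTT plus a fixed server time per directory, and any numbers fed into it are made-up inputs rather than the benchmark results mentioned above.

```java
// Hypothetical cost model: every RPC costs one RTT plus server time per directory listed.
class BatchedListingCost {
  // One RPC per directory: the RTT is paid once for every directory.
  static double unbatchedMillis(int dirs, double rttMs, double serverMsPerDir) {
    return dirs * (rttMs + serverMsPerDir);
  }

  // One RPC per batch: the RTT is paid once per batch, server time is unchanged.
  static double batchedMillis(int dirs, int batchSize, double rttMs,
      double serverMsPerDir) {
    int calls = (dirs + batchSize - 1) / batchSize; // ceiling division
    return calls * rttMs + dirs * serverMsPerDir;
  }
}
```

With small directories the per-directory server time is small compared to the RTT, so the batched total is dominated by the number of calls rather than the number of directories.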






[jira] [Commented] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-06 Thread Konstantin Shvachko (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990155#comment-16990155
 ] 

Konstantin Shvachko commented on HDFS-15036:


This can happen during checkpointing or preparing for a rolling upgrade.
 We observed it during rolling upgrade, when Standby was reporting: _"Rollback 
image has been created. Proceed to upgrade daemons."_ While Active still 
reported _" Rollback image has not been created."_

In the logs for ANN I see that it started receiving the image:
{code:java}
 
2019-12-05 23:14:56,328 INFO 
org.apache.hadoop.hdfs.server.namenode.ImageServlet: ImageServlet allowing 
checkpointer: hdfs/active.namenode.com 
{code}
But ANN did not print anything related to the image transfer afterwards. And 
the transferred image is missing in its storage directory.
 The ANN log message comes from {{isValidRequestor()}} called by 
{{ImageServlet.doPut()}}.

SBN log indicates that the image was fully and successfully transferred to ANN
{code:java}
 
2019-12-05 23:22:29,526 INFO 
org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Sending fileName: 
/hdfs-storage-dir/current/fsimage_rollback_00773999609, fileSize: 
1889021016. Sent total: 1889021016 bytes. Size of last segment intended to 
send: -1 bytes.
{code}
The SBN log message comes from {{TransferFsImage.copyFileToStream()}}.

Looking at the code in {{ImageServlet.doPut()}}, I see that in one of the 
methods it calls, {{Util.receiveFile()}}, if an exception is thrown inside the 
while-loop that reads from the input (socket) stream and writes to the output 
(image file) stream, execution passes through a series of finally blocks 
without catching the exception, logging it, or reporting the error to the 
sender.

We should:
 # Catch and log any exceptions occurring there
 # Notify SBN about the error, so that it could retry the transfer
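
A minimal sketch of the first point, under the assumption that the receive loop is a plain stream copy: the exception is logged where it occurs and then rethrown, so that the caller has a chance to report it to the sender. {{ReceiveFileSketch}} and {{receive}} are hypothetical names, and {{java.util.logging}} is used only to keep the example self-contained; this is not the actual {{Util.receiveFile()}} code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical receive loop that does not fail silently.
class ReceiveFileSketch {
  private static final Logger LOG = Logger.getLogger("ReceiveFileSketch");

  // Copies in to out and returns the number of bytes copied. Any I/O failure
  // is logged here and rethrown so the caller can notify the sending side.
  static long receive(InputStream in, OutputStream out) throws IOException {
    byte[] buf = new byte[4096];
    long total = 0;
    try {
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
        total += n;
      }
      return total;
    } catch (IOException e) {
      LOG.log(Level.SEVERE, "Image transfer failed after " + total + " bytes", e);
      throw e;
    } finally {
      out.close(); // cleanup still runs, but the failure is no longer swallowed
    }
  }
}
```

For the second point, a caller such as a servlet could catch the rethrown exception and answer with an error status, which would let the sender decide whether to retry.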

> Active NameNode should not silently fail the image transfer
> ---
>
> Key: HDFS-15036
> URL: https://issues.apache.org/jira/browse/HDFS-15036
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
>
> Image transfer from Standby NameNode to  Active silently fails on Active, 
> without any logging and not notifying the receiver side.






[jira] [Assigned] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-06 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun reassigned HDFS-15036:
---

Assignee: Chao Sun

> Active NameNode should not silently fail the image transfer
> ---
>
> Key: HDFS-15036
> URL: https://issues.apache.org/jira/browse/HDFS-15036
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Chao Sun
>Priority: Major
>
> Image transfer from Standby NameNode to  Active silently fails on Active, 
> without any logging and not notifying the receiver side.






[jira] [Created] (HDFS-15036) Active NameNode should not silently fail the image transfer

2019-12-06 Thread Konstantin Shvachko (Jira)
Konstantin Shvachko created HDFS-15036:
--

 Summary: Active NameNode should not silently fail the image 
transfer
 Key: HDFS-15036
 URL: https://issues.apache.org/jira/browse/HDFS-15036
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.10.0
Reporter: Konstantin Shvachko


Image transfer from Standby NameNode to  Active silently fails on Active, 
without any logging and not notifying the receiver side.






[jira] [Commented] (HDFS-14993) checkDiskError doesn't work during datanode startup

2019-12-06 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990139#comment-16990139
 ] 

Wei-Chiu Chuang commented on HDFS-14993:


nit:
I would really love to use slf4j to log messages rather than using 
System.out.println in the tests. 
Other than that lgtm

> checkDiskError doesn't work during datanode startup
> ---
>
> Key: HDFS-14993
> URL: https://issues.apache.org/jira/browse/HDFS-14993
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Major
> Attachments: HDFS-14993.patch, HDFS-14993.patch
>
>
> The function checkDiskError() is called before addBlockPool(), but the map 
> bpSlices is still empty at that point, so the function check() in 
> FsVolumeImpl.java does nothing:
> @Override
> public VolumeCheckResult check(VolumeCheckContext ignored)
>     throws DiskErrorException {
>   // TODO:FEDERATION valid synchronization
>   for (BlockPoolSlice s : bpSlices.values()) {
>     s.checkDirs();
>   }
>   return VolumeCheckResult.HEALTHY;
> }
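
The effect described in the quoted issue can be reproduced with a small stand-alone sketch. Everything here is a simplified assumption: {{VolumeCheckSketch}}, {{Slice}} and the string result are hypothetical stand-ins for {{FsVolumeImpl}}, {{BlockPoolSlice}} and {{VolumeCheckResult}}.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a volume whose health check iterates its block pool slices.
class VolumeCheckSketch {
  interface Slice {
    void checkDirs(); // throws an unchecked exception if a directory is bad
  }

  final Map<String, Slice> bpSlices = new LinkedHashMap<>();
  int checksRun = 0; // counts how many slices were actually checked

  // Same shape as the quoted check(): with an empty map the loop body never
  // runs, so the volume is reported healthy without touching the disk.
  String check() {
    for (Slice s : bpSlices.values()) {
      s.checkDirs();
      checksRun++;
    }
    return "HEALTHY";
  }
}
```

Calling {{check()}} before any slice is registered returns the healthy result with zero checks performed; once a slice is present, a failing directory check surfaces as an exception.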






[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-06 Thread Íñigo Goiri (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990129#comment-16990129
 ] 

Íñigo Goiri commented on HDFS-14983:


Thanks [~risyomei], minor comments:
* Add a line break between the imports and the javadocs (e.g., 
RefreshSuperUserGroupsConfigurationResponse, 
RefreshSuperUserGroupsConfigurationRequest,...).
* What is the {{address}} parameter in 
RouterAdmin#refreshSuperUserGroupsConfiguration()?
* I'm not sure there is a point in having both 
RouterAdmin#refreshSuperUserGroupsConfiguration() and 
RouterAdmin#refreshSuperUserGroupsExecutor(); we could have a single method.
* Can we avoid the SuppressWarnings in 
TestRouterRefreshSuperUserGroupsConfiguration?
* For TestRouterRefreshSuperUserGroupsConfiguration#initializeClientConfig(), I 
think it is cleaner to return a new configuration instead of messing around 
with the internal one. Actually, I would try to get a full new client 
configuration.

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Commented] (HDFS-15005) Backport HDFS-12300 to branch-2

2019-12-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990121#comment-16990121
 ] 

Hadoop QA commented on HDFS-15005:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 51s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 11s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 55s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 49s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 36s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 57s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 10s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  8s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 45s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 50s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 46s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 46s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 31s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 9 new + 235 unchanged - 1 fixed = 244 total (was 236) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  7s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 43s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 12s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}113m 43s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys |
|   | hadoop.hdfs.TestSecureEncryptionZoneWithKMS |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:f555aa740b5 |
| JIRA Issue | HDFS-15005 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987774/HDFS-15005-branch-2.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f4a71bf356af 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (HDFS-15012) NN fails to parse Edit logs after applying HDFS-13101

2019-12-06 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDFS-15012:
---
Status: Patch Available  (was: Open)

Patch v0 adds a unit test and a fix to address the issue.

> NN fails to parse Edit logs after applying HDFS-13101
> -
>
> Key: HDFS-15012
> URL: https://issues.apache.org/jira/browse/HDFS-15012
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Eric Lin
>Assignee: Shashikant Banerjee
>Priority: Blocker
>  Labels: release-blocker
> Attachments: HDFS-15012.000.patch
>
>
> After applying HDFS-13101, and deleting and creating a large number of 
> snapshots, the SNN exited with the error below:
>   
> {code:sh}
> 2019-11-18 08:28:06,528 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation DeleteSnapshotOp [snapshotRoot=/path/to/hdfs/file, snapshotName=distcp-3479-31-old, RpcClientId=b16a6cb5-bdbb-45ae-9f9a-f7dc57931f37, RpcCallId=1]
> java.lang.AssertionError: Element already exists: element=partition_isactive=true, DELETED=[partition_isactive=true]
>     at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:193)
>     at org.apache.hadoop.hdfs.util.Diff.delete(Diff.java:239)
>     at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:462)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.initChildren(DirectoryWithSnapshotFeature.java:240)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.iterator(DirectoryWithSnapshotFeature.java:250)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:755)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
>     at org.apache.hadoop.hdfs.server.namenode.INodeReference.cleanSubtree(INodeReference.java:332)
>     at org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.cleanSubtree(INodeReference.java:583)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:760)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:235)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:259)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:301)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:688)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:903)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:756)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:324)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1144)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:796)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {code}
> We confirmed that fsimage and edit files were NOT corrupted, as reverting 
> HDFS-13101 fixed the issue. So the logic introduced in HDFS-13101 is broken 
> and failed to parse edit log files.




[jira] [Updated] (HDFS-15012) NN fails to parse Edit logs after applying HDFS-13101

2019-12-06 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated HDFS-15012:
---
Attachment: HDFS-15012.000.patch

> NN fails to parse Edit logs after applying HDFS-13101
> -
>
> Key: HDFS-15012
> URL: https://issues.apache.org/jira/browse/HDFS-15012
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Eric Lin
>Assignee: Shashikant Banerjee
>Priority: Blocker
>  Labels: release-blocker
> Attachments: HDFS-15012.000.patch
>
>
> After applying HDFS-13101, and deleting and creating a large number of 
> snapshots, the SNN exited with the error below:
>   
> {code:sh}
> 2019-11-18 08:28:06,528 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception on operation DeleteSnapshotOp [snapshotRoot=/path/to/hdfs/file, snapshotName=distcp-3479-31-old, RpcClientId=b16a6cb5-bdbb-45ae-9f9a-f7dc57931f37, RpcCallId=1]
> java.lang.AssertionError: Element already exists: element=partition_isactive=true, DELETED=[partition_isactive=true]
>     at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:193)
>     at org.apache.hadoop.hdfs.util.Diff.delete(Diff.java:239)
>     at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:462)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.initChildren(DirectoryWithSnapshotFeature.java:240)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.iterator(DirectoryWithSnapshotFeature.java:250)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:755)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
>     at org.apache.hadoop.hdfs.server.namenode.INodeReference.cleanSubtree(INodeReference.java:332)
>     at org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.cleanSubtree(INodeReference.java:583)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:760)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:235)
>     at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:259)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:301)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:688)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:903)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:756)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:324)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1144)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:796)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {code}
> We confirmed that fsimage and edit files were NOT corrupted, as reverting 
> HDFS-13101 fixed the issue. So the logic introduced in HDFS-13101 is broken 
> and failed to parse edit log files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Commented] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2019-12-06 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990092#comment-16990092
 ] 

hemanthboyina commented on HDFS-6874:
-

I have implemented getfileblocklocations, and it works fine with httpfs. 
However, there is an issue with httpfswithWebHdfs: on getfileblocklocations, 
webhdfs tries to access getblocklocations in httpfs, which doesn't exist.
I think we need to implement getblocklocations in httpfs and call 
getfileblocklocations.

Please correct me if I am wrong.

> Add GETFILEBLOCKLOCATIONS operation to HttpFS
> -
>
> Key: HDFS-6874
> URL: https://issues.apache.org/jira/browse/HDFS-6874
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 2.4.1, 2.7.3
>Reporter: Gao Zhong Liang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6874-1.patch, HDFS-6874-branch-2.6.0.patch, 
> HDFS-6874.02.patch, HDFS-6874.03.patch, HDFS-6874.04.patch, 
> HDFS-6874.05.patch, HDFS-6874.06.patch, HDFS-6874.07.patch, 
> HDFS-6874.08.patch, HDFS-6874.09.patch, HDFS-6874.10.patch, HDFS-6874.patch
>
>
> GETFILEBLOCKLOCATIONS operation is missing in HttpFS, which is already 
> supported in WebHDFS.  For the request of GETFILEBLOCKLOCATIONS in 
> org.apache.hadoop.fs.http.server.HttpFSServer, BAD_REQUEST is returned so far:
> ...
>  case GETFILEBLOCKLOCATIONS: {
> response = Response.status(Response.Status.BAD_REQUEST).build();
> break;
>   }
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-06 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990087#comment-16990087
 ] 

Chao Sun commented on HDFS-14963:
-

It seems this issue and HDFS-15024 are solving very similar problems, and the 
solution there could be much simpler. Should we pursue that approach instead? I 
also tend to echo [~shv]'s point, and I'm not sure that having clients write to 
a local file is a good idea.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls at 
> the 1st namenode, simply polls each one in turn, and finally determines the 
> current Active namenode. 
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN. 
> When a client starts an rpc with the 1st NN, it is silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs. In some scenarios these logs can be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for every 
> hdfs cluster, cache its current Active NameNode index in a separate cache 
> file named by its uri. *Note that these cache files are shared by all hdfs 
> client processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  #  When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly makes 
> an rpc call to the right ANN.
>  #  After each failover, the client needs to write the latest Active 
> NameNode index to the corresponding cache file based on the target hdfs uri.
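
The read-on-start and write-after-failover steps described above could look roughly like the sketch below. This is only an illustration under stated assumptions: the class `ActiveNnIndexCache` and its methods are hypothetical names that do not come from any HDFS-14963 patch, the cache file is assumed to hold a single decimal index, and a missing or unreadable file is assumed to mean "start from index 0".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper, not taken from any HDFS-14963 patch. */
final class ActiveNnIndexCache {
  private final Path cacheDir;

  ActiveNnIndexCache(Path cacheDir) {
    this.cacheDir = cacheDir;
  }

  /** Returns the cached Active NameNode index, or 0 if the file is absent or unreadable. */
  int read(String nameservice) {
    Path file = cacheDir.resolve(nameservice);
    try {
      String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
      int index = Integer.parseInt(text);
      return index >= 0 ? index : 0;
    } catch (IOException | NumberFormatException e) {
      // Fall back to the existing behaviour: start from the first NameNode.
      return 0;
    }
  }

  /**
   * Writes the index to a temp file in the same directory and renames it into place,
   * so that other client processes sharing the file do not read a half-written value.
   * Whether ATOMIC_MOVE replaces an existing target is implementation specific.
   */
  void write(String nameservice, int activeIndex) throws IOException {
    Path tmp = Files.createTempFile(cacheDir, nameservice, ".tmp");
    Files.write(tmp, Integer.toString(activeIndex).getBytes(StandardCharsets.UTF_8));
    Files.move(tmp, cacheDir.resolve(nameservice), StandardCopyOption.ATOMIC_MOVE);
  }
}
```

Because the files are shared by every client process on the machine, the temp-file-plus-rename step is one possible way to keep concurrent readers safe; the concern raised in the comment above about clients writing to local files still applies to any such design.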



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted

2019-12-06 Thread Danny Becker (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Becker updated HDFS-15031:

Attachment: HDFS-15031.006.patch

> Allow BootstrapStandby to download FSImage if the directory is already 
> formatted
> 
>
> Key: HDFS-15031
> URL: https://issues.apache.org/jira/browse/HDFS-15031
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, namenode
>Reporter: Danny Becker
>Assignee: Danny Becker
>Priority: Minor
> Attachments: HDFS-15031.000.patch, HDFS-15031.001.patch, 
> HDFS-15031.002.patch, HDFS-15031.003.patch, HDFS-15031.005.patch, 
> HDFS-15031.006.patch
>
>
> Currently, BootstrapStandby will only download the latest FSImage if it has 
> formatted the local image directory. This can be an issue when there are out 
> of date FSImages on a Standby NameNode, as the non-interactive mode will not 
> format the image directory, and BootstrapStandby will return an error code. 
> The changes here simply allow BootstrapStandby to download the latest FSImage 
> to the image directory, without needing to format first.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.

2019-12-06 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990076#comment-16990076
 ] 

hemanthboyina commented on HDFS-14908:
--

Thanks for the ping [~elgoiri]. Either unifying the methods or using 
DFSUtil.isParentEntry would be fine with me.

> LeaseManager should check parent-child relationship when filter open files.
> ---
>
> Key: HDFS-14908
> URL: https://issues.apache.org/jira/browse/HDFS-14908
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.1
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Minor
> Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, 
> HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, 
> HDFS-14908.006.patch, HDFS-14908.007.patch, HDFS-14908.008.patch, 
> HDFS-14908.TestV4.patch, Test.java, TestV2.java, TestV3.java
>
>
> Now when doing listOpenFiles(), LeaseManager only checks whether the filter 
> path is the prefix of the open files. We should check whether the filter path 
> is the parent/ancestor of the open files.
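
To illustrate the difference between a prefix check and a parent/ancestor check, here is a minimal sketch. The class and method names are hypothetical and this is not the LeaseManager code or DFSUtil.isParentEntry; it assumes plain absolute path strings with "/" as the separator.

```java
/** Hypothetical illustration, not the actual LeaseManager or DFSUtil code. */
final class PathFilterCheck {
  /** Plain string prefix test: also matches sibling paths such as /a/bc for filter /a/b. */
  static boolean isPrefix(String filter, String path) {
    return path.startsWith(filter);
  }

  /** True only if filter equals path or is a parent/ancestor directory of path. */
  static boolean isAncestorOrSelf(String filter, String path) {
    if (filter.equals("/")) {
      return path.startsWith("/");
    }
    // Drop a trailing separator so "/a/b/" and "/a/b" behave the same.
    String f = filter.endsWith("/") ? filter.substring(0, filter.length() - 1) : filter;
    return path.equals(f) || path.startsWith(f + "/");
  }
}
```

With filter `/a/b` and open file `/a/bc/file`, `isPrefix` returns true while `isAncestorOrSelf` returns false, which is the mismatch the issue describes.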



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time

2019-12-06 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990059#comment-16990059
 ] 

Chao Sun commented on HDFS-15024:
-

{quote}
Chao Sun I think the msync case is just a case, maybe the current problem is a 
common problem for Support more than 2 NameNodes?
{quote}

Yes, you are correct. This is a more general problem for the multi-sbn feature, 
but I think we could optimize {{msync}} specifically to avoid the retry backoff. 

Regarding patch v1, it seems to handle only the first few retries. Later on, 
once {{times}} has incremented beyond {{numNameNodes - 1}}, it will still do 
exponential backoff on all the SBNs.

> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a 
> condition of calculation of sleep time
> ---
>
> Key: HDFS-15024
> URL: https://issues.apache.org/jira/browse/HDFS-15024
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.10.0, 3.3.0, 3.2.1
>Reporter: huhaiyang
>Assignee: huhaiyang
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-15024.001.patch, client_error.log
>
>
> When we enable the ONN, there will be three NN nodes in the client 
> configuration, 
> for example:
> <property>
>   <name>dfs.ha.namenodes.ns1</name>
>   <value>nn2,nn3,nn1</value>
> </property>
> Currently, 
> nn2 is in standby state
> nn3 is in observer state 
> nn1 is in active state
> When the user performs an HDFS access operation:
> ./bin/hadoop --loglevel debug fs 
> -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  -mkdir /user/haiyang1/test8
> The msync method needs to reach nn1, the active NN.
> The client actually connects to nn2 first, so a failover is required.
> The next connection, to nn3, does not meet the requirements either, so another 
> failover is needed, but this time the client sleeps for a period before 
> failing over.
> Only after that sleep does the request finally connect to nn1 successfully.
> In FailoverOnNetworkExceptionRetry, getFailoverOrRetrySleepTime by default 
> calculates a sleep time once more than one failover has been performed.
> I think using the number of NameNodes as a condition for calculating the sleep 
> time is more reasonable.
> That is, in the current test, the failover away from nn3 should not need to 
> sleep; the client should connect directly to the next NN.
> See client_error.log for details
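
The difference between the current and the proposed sleep-time condition can be sketched as follows. This is a simplified illustration, not the Hadoop implementation: the class `FailoverSleepSketch`, its method names, and the deterministic doubling backoff are all assumptions made for the example (the real retry policy may randomize its backoff), and the threshold `numNameNodes - 1` is taken from the review comment above rather than from the patch itself.

```java
/** Hypothetical sketch; names and the deterministic backoff are assumptions. */
final class FailoverSleepSketch {
  /** Simplified exponential backoff: base * 2^failovers, capped at maxMillis. */
  static long backoff(int failovers, long baseMillis, long maxMillis) {
    int shift = Math.min(failovers, 30); // guard against overflow for large counts
    return Math.min(maxMillis, baseMillis << shift);
  }

  /** Behaviour described in the issue: only the very first failover skips the sleep. */
  static long currentSleepMillis(int failovers, long baseMillis, long maxMillis) {
    return failovers == 0 ? 0 : backoff(failovers, baseMillis, maxMillis);
  }

  /** Proposed idea: do not sleep until every configured NameNode has been tried once. */
  static long proposedSleepMillis(int failovers, int numNameNodes,
      long baseMillis, long maxMillis) {
    if (failovers < numNameNodes - 1) {
      return 0;
    }
    return backoff(failovers, baseMillis, maxMillis);
  }
}
```

With three NameNodes ordered nn2, nn3, nn1, the second failover (count 1, from nn3 to nn1) sleeps under the current condition but not under the proposed one. As noted in the comment above, once the count passes `numNameNodes - 1` this sketch backs off on every NameNode again, so it does not address that concern.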



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14998) [SBN read] Update Observer Namenode doc for ZKFC after HDFS-14130

2019-12-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990028#comment-16990028
 ] 

Hudson commented on HDFS-14998:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17733 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17733/])
HDFS-14998. [SBN read] Update Observer Namenode doc for ZKFC after 
(ayushsaxena: rev 705b172b95db345a99adf088fca83c67bd13a691)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ObserverNameNode.md


> [SBN read] Update Observer Namenode doc for ZKFC after HDFS-14130
> -
>
> Key: HDFS-14998
> URL: https://issues.apache.org/jira/browse/HDFS-14998
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, 
> HDFS-14998.003.patch, HDFS-14998.004.patch, HDFS-14998.005.patch, 
> HDFS-14998.006.patch
>
>
> After HDFS-14130, we should update observer namenode doc, observer namenode 
> can run with ZKFC running



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15005) Backport HDFS-12300 to branch-2

2019-12-06 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-15005:

Attachment: HDFS-15005-branch-2.003.patch

> Backport HDFS-12300 to branch-2
> ---
>
> Key: HDFS-15005
> URL: https://issues.apache.org/jira/browse/HDFS-15005
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-15005-branch-2.000.patch, 
> HDFS-15005-branch-2.001.patch, HDFS-15005-branch-2.002.patch, 
> HDFS-15005-branch-2.003.patch
>
>
> Having DT related information is very useful in audit log. This tracks effort 
> to backport HDFS-12300 to branch-2.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15005) Backport HDFS-12300 to branch-2

2019-12-06 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990023#comment-16990023
 ] 

Chao Sun commented on HDFS-15005:
-

Rebased to the latest branch-2. [~weichiu] pls take a look.

> Backport HDFS-12300 to branch-2
> ---
>
> Key: HDFS-15005
> URL: https://issues.apache.org/jira/browse/HDFS-15005
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-15005-branch-2.000.patch, 
> HDFS-15005-branch-2.001.patch, HDFS-15005-branch-2.002.patch, 
> HDFS-15005-branch-2.003.patch
>
>
> Having DT related information is very useful in audit log. This tracks effort 
> to backport HDFS-12300 to branch-2.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14998) [SBN read] Update Observer Namenode doc for ZKFC after HDFS-14130

2019-12-06 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-14998:

Hadoop Flags: Reviewed
  Resolution: Fixed
  Status: Resolved  (was: Patch Available)

Committed to trunk.
Thanx [~ferhui] for the contribution and [~csun] for the review!!!

> [SBN read] Update Observer Namenode doc for ZKFC after HDFS-14130
> -
>
> Key: HDFS-14998
> URL: https://issues.apache.org/jira/browse/HDFS-14998
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, 
> HDFS-14998.003.patch, HDFS-14998.004.patch, HDFS-14998.005.patch, 
> HDFS-14998.006.patch
>
>
> After HDFS-14130, we should update observer namenode doc, observer namenode 
> can run with ZKFC running



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14998) [SBN read] Update Observer Namenode doc for ZKFC after HDFS-14130

2019-12-06 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990010#comment-16990010
 ] 

Ayush Saxena commented on HDFS-14998:
-

+1, Committing Shortly.

> [SBN read] Update Observer Namenode doc for ZKFC after HDFS-14130
> -
>
> Key: HDFS-14998
> URL: https://issues.apache.org/jira/browse/HDFS-14998
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Minor
> Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, 
> HDFS-14998.003.patch, HDFS-14998.004.patch, HDFS-14998.005.patch, 
> HDFS-14998.006.patch
>
>
> After HDFS-14130, we should update observer namenode doc, observer namenode 
> can run with ZKFC running



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15032) Balancer crashes when it fails to contact an unavailable NN via ObserverReadProxyProvider

2019-12-06 Thread Erik Krogen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989926#comment-16989926
 ] 

Erik Krogen commented on HDFS-15032:


[~shv] can you take a look at the v2 patch when you have a chance? I don't 
think the test failures are related.

> Balancer crashes when it fails to contact an unavailable NN via 
> ObserverReadProxyProvider
> -
>
> Key: HDFS-15032
> URL: https://issues.apache.org/jira/browse/HDFS-15032
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer  mover
>Affects Versions: 2.10.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-15032.000.patch, HDFS-15032.001.patch, 
> HDFS-15032.002.patch
>
>
> When trying to run the Balancer using ObserverReadProxyProvider (to allow it 
> to read from the Observer Node as described in HDFS-14979), if one of the NNs 
> isn't running, the Balancer will crash.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989817#comment-16989817
 ] 

Hadoop QA commented on HDFS-14983:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  3m 45s{color} | 
{color:red} hadoop-hdfs-project generated 3 new + 16 unchanged - 3 fixed = 19 
total (was 19) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}113m 20s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
15s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}202m 19s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
|   | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14983 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987702/HDFS-14983.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 0e12d2d5e49c 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 

[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989774#comment-16989774
 ] 

Hadoop QA commented on HDFS-14740:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
44s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 53s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
37s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
33s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 16m 33s{color} | 
{color:red} root generated 3 new + 23 unchanged - 3 fixed = 26 total (was 26) 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 11s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m  
2s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}113m 10s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
54s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}237m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.TestFileCorruption |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.tools.TestHdfsConfigFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | HDFS-14740 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987690/HDFS-14740.007.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  

[jira] [Created] (HDFS-15035) Fix Rename API in BasicOzoneFileSystem

2019-12-06 Thread Ayush Saxena (Jira)
Ayush Saxena created HDFS-15035:
---

 Summary: Fix Rename API in BasicOzoneFileSystem
 Key: HDFS-15035
 URL: https://issues.apache.org/jira/browse/HDFS-15035
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ayush Saxena
Assignee: Ayush Saxena


In the Rename API:
1. This check doesn't work if one of the paths contains the URI and the other doesn't.

{code:java}
if (src.equals(dst)) {
  return true;
}
{code}

2. This check is supposed to be done only for directories, but it is done for 
files too. It can be moved after getting the FileStatus and checking the type. 

{code:java}
// Some comments here
public String getFoo()
{
return foo;
}
{code}

3.  This check doesn't work either (similar to 1.)

{code:java}
if (srcStatus.isDirectory()) {
  if (dst.toString().startsWith(src.toString() + OZONE_URI_DELIMITER)) {
    LOG.trace("Cannot rename a directory to a subdirectory of self");
    return false;
  }
{code}

4. Rename even succeeds if the provided URI belongs to a different FileSystem.
In general, HDFS and other file systems throw IllegalArgumentException if the 
path doesn't belong to the same FS.
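
The four points above can be illustrated with a small, self-contained sketch. It is not the BasicOzoneFileSystem code and is not taken from any patch on this issue; the class name, the helper names, and the {{DELIMITER}} constant (standing in for OZONE_URI_DELIMITER) are all hypothetical. It only shows one possible way to compare paths by their path component, so that a fully qualified URI and a bare path compare equal, and to reject a path that names a different file system.

```java
import java.net.URI;

// Hypothetical helper class, for illustration only.
final class RenamePathChecks {
  // Assumption: stands in for OZONE_URI_DELIMITER mentioned in the issue.
  static final String DELIMITER = "/";

  // Returns only the path component, without scheme, authority, or a trailing delimiter.
  static String pathOnly(String p) {
    String path = URI.create(p).getPath();
    if (path.length() > 1 && path.endsWith(DELIMITER)) {
      path = path.substring(0, path.length() - 1);
    }
    return path;
  }

  // Points 1 and 2: equality that ignores whether a path carries a URI prefix.
  static boolean samePath(String src, String dst) {
    return pathOnly(src).equals(pathOnly(dst));
  }

  // Point 3: true if dst lies strictly below src, again ignoring any URI prefix.
  static boolean isSubdirectory(String src, String dst) {
    String s = pathOnly(src);
    String d = pathOnly(dst);
    String prefix = s.endsWith(DELIMITER) ? s : s + DELIMITER;
    return !d.equals(s) && d.startsWith(prefix);
  }

  // Point 4: reject a path whose scheme or authority names another file system.
  static void checkSameFileSystem(String fsUri, String path) {
    URI fs = URI.create(fsUri);
    URI p = URI.create(path);
    if (p.getScheme() == null) {
      return; // no scheme: the path is interpreted against this file system
    }
    boolean sameScheme = p.getScheme().equalsIgnoreCase(fs.getScheme());
    boolean sameAuthority = p.getAuthority() == null
        || p.getAuthority().equalsIgnoreCase(fs.getAuthority());
    if (!sameScheme || !sameAuthority) {
      throw new IllegalArgumentException(
          "Wrong FS: " + path + ", expected: " + fsUri);
    }
  }
}
```

In this sketch the source type (file or directory) would still have to be checked by the caller before applying {{samePath}} or {{isSubdirectory}}, as suggested in point 2.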






[jira] [Updated] (HDFS-14668) Support Fuse with Users from multiple Security Realms

2019-12-06 Thread Istvan Fajth (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Fajth updated HDFS-14668:

Description: 
UPDATE:
See 
[this|https://issues.apache.org/jira/browse/HDFS-14668?focusedCommentId=16979466=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16979466]
 comment for the complete description of what is happening here.


Users from non-default  krb5 domain can't use hadoop-fuse.
There are 2 Realms with kdc. 
-one realm is for human users  (USERS.COM.US) 
-the other is for service principals.   (SERVICE.COM.US) 
Cross realm trust is setup.
In krb5.conf  the default domain  is set to SERVICE.COM.US

Users within USERS.COM.US Realm are not able to put any files to Fuse mounted 
location

The client shows:
  cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
Input/output error

  was:
UPDATE:
See this comment for the complete description of what is happening here.


Users from non-default  krb5 domain can't use hadoop-fuse.
There are 2 Realms with kdc. 
-one realm is for human users  (USERS.COM.US) 
-the other is for service principals.   (SERVICE.COM.US) 
Cross realm trust is setup.
In krb5.conf  the default domain  is set to SERVICE.COM.US

Users within USERS.COM.US Realm are not able to put any files to Fuse mounted 
location

The client shows:
  cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
Input/output error


> Support Fuse with Users from multiple Security Realms
> -
>
> Key: HDFS-14668
> URL: https://issues.apache.org/jira/browse/HDFS-14668
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Sailesh Patel
>Assignee: Istvan Fajth
>Priority: Critical
>  Labels: regression
>
> UPDATE:
> See 
> [this|https://issues.apache.org/jira/browse/HDFS-14668?focusedCommentId=16979466=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16979466]
>  comment for the complete description of what is happening here.
> Users from non-default  krb5 domain can't use hadoop-fuse.
> There are 2 Realms with kdc. 
> -one realm is for human users  (USERS.COM.US) 
> -the other is for service principals.   (SERVICE.COM.US) 
> Cross realm trust is setup.
> In krb5.conf  the default domain  is set to SERVICE.COM.US
> Users within USERS.COM.US Realm are not able to put any files to Fuse mounted 
> location
> The client shows:
>   cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
> Input/output error






[jira] [Updated] (HDFS-14668) Support Fuse with Users from multiple Security Realms

2019-12-06 Thread Istvan Fajth (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Fajth updated HDFS-14668:

Description: 
UPDATE:
See this comment for the complete description of what is happening here.


Users from non-default  krb5 domain can't use hadoop-fuse.
There are 2 Realms with kdc. 
-one realm is for human users  (USERS.COM.US) 
-the other is for service principals.   (SERVICE.COM.US) 
Cross realm trust is setup.
In krb5.conf  the default domain  is set to SERVICE.COM.US

Users within USERS.COM.US Realm are not able to put any files to Fuse mounted 
location

The client shows:
  cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
Input/output error

  was:
Users from non-default  krb5 domain can't use hadoop-fuse.
There are 2 Realms with kdc. 
-one realm is for human users  (USERS.COM.US) 
-the other is for service principals.   (SERVICE.COM.US) 
Cross realm trust is setup.
In krb5.conf  the default domain  is set to SERVICE.COM.US

Users within USERS.COM.US Realm are not able to put any files to Fuse mounted 
location

The client shows:
  cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
Input/output error


> Support Fuse with Users from multiple Security Realms
> -
>
> Key: HDFS-14668
> URL: https://issues.apache.org/jira/browse/HDFS-14668
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Sailesh Patel
>Assignee: Istvan Fajth
>Priority: Critical
>  Labels: regression
>
> UPDATE:
> See this comment for the complete description of what is happening here.
> Users from non-default  krb5 domain can't use hadoop-fuse.
> There are 2 Realms with kdc. 
> -one realm is for human users  (USERS.COM.US) 
> -the other is for service principals.   (SERVICE.COM.US) 
> Cross realm trust is setup.
> In krb5.conf  the default domain  is set to SERVICE.COM.US
> Users within USERS.COM.US Realm are not able to put any files to Fuse mounted 
> location
> The client shows:
>   cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
> Input/output error






[jira] [Updated] (HDFS-14668) Support Fuse with Users from multiple Security Realms

2019-12-06 Thread Istvan Fajth (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Fajth updated HDFS-14668:

Labels: regression  (was: )

> Support Fuse with Users from multiple Security Realms
> -
>
> Key: HDFS-14668
> URL: https://issues.apache.org/jira/browse/HDFS-14668
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Sailesh Patel
>Assignee: Istvan Fajth
>Priority: Critical
>  Labels: regression
>
> Users from non-default  krb5 domain can't use hadoop-fuse.
> There are 2 Realms with kdc. 
> -one realm is for human users  (USERS.COM.US) 
> -the other is for service principals.   (SERVICE.COM.US) 
> Cross realm trust is setup.
> In krb5.conf  the default domain  is set to SERVICE.COM.US
> Users within USERS.COM.US Realm are not able to put any files to Fuse mounted 
> location
> The client shows:
>   cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
> Input/output error






[jira] [Updated] (HDFS-14668) Support Fuse with Users from multiple Security Realms

2019-12-06 Thread Istvan Fajth (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Fajth updated HDFS-14668:

Priority: Critical  (was: Minor)

> Support Fuse with Users from multiple Security Realms
> -
>
> Key: HDFS-14668
> URL: https://issues.apache.org/jira/browse/HDFS-14668
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Reporter: Sailesh Patel
>Assignee: Istvan Fajth
>Priority: Critical
>
> Users from non-default  krb5 domain can't use hadoop-fuse.
> There are 2 Realms with kdc. 
> -one realm is for human users  (USERS.COM.US) 
> -the other is for service principals.   (SERVICE.COM.US) 
> Cross realm trust is setup.
> In krb5.conf  the default domain  is set to SERVICE.COM.US
> Users within USERS.COM.US Realm are not able to put any files to Fuse mounted 
> location
> The client shows:
>   cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
> Input/output error






[jira] [Updated] (HDFS-14668) Support Fuse with Users from multiple Security Realms

2019-12-06 Thread Istvan Fajth (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Fajth updated HDFS-14668:

Affects Version/s: 3.1.0
   3.0.3

> Support Fuse with Users from multiple Security Realms
> -
>
> Key: HDFS-14668
> URL: https://issues.apache.org/jira/browse/HDFS-14668
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Sailesh Patel
>Assignee: Istvan Fajth
>Priority: Critical
>
> Users from non-default  krb5 domain can't use hadoop-fuse.
> There are 2 Realms with kdc. 
> -one realm is for human users  (USERS.COM.US) 
> -the other is for service principals.   (SERVICE.COM.US) 
> Cross realm trust is setup.
> In krb5.conf  the default domain  is set to SERVICE.COM.US
> Users within USERS.COM.US Realm are not able to put any files to Fuse mounted 
> location
> The client shows:
>   cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
> Input/output error






[jira] [Updated] (HDFS-14668) Support Fuse with Users from multiple Security Realms

2019-12-06 Thread Istvan Fajth (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Fajth updated HDFS-14668:

Issue Type: Bug  (was: Improvement)

> Support Fuse with Users from multiple Security Realms
> -
>
> Key: HDFS-14668
> URL: https://issues.apache.org/jira/browse/HDFS-14668
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Reporter: Sailesh Patel
>Assignee: Istvan Fajth
>Priority: Minor
>
> Users from non-default  krb5 domain can't use hadoop-fuse.
> There are 2 Realms with kdc. 
> -one realm is for human users  (USERS.COM.US) 
> -the other is for service principals.   (SERVICE.COM.US) 
> Cross realm trust is setup.
> In krb5.conf  the default domain  is set to SERVICE.COM.US
> Users within USERS.COM.US Realm are not able to put any files to Fuse mounted 
> location
> The client shows:
>   cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
> Input/output error






[jira] [Commented] (HDFS-14668) Support Fuse with Users from multiple Security Realms

2019-12-06 Thread Istvan Fajth (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989730#comment-16989730
 ] 

Istvan Fajth commented on HDFS-14668:
-

After a couple of days of thinking and a few hours of testing, I decided to come 
up with the given PR.

The main reasons I chose this solution are the following:
- the affected UGI API calls are public and may be used in other projects, 
where the necessary tunings might already have happened.
- there does not seem to be a good way of deciding whether the given username 
is a valid principal name, and we cannot implement FUSE-specific solutions in 
the UGI code.
- I am not familiar enough with how other projects are using the UGI, and this 
phenomenon might cause problems there as well. I am not sure why it was 
necessary to always add the username as a principal from the UGI, and it is 
not clear whether this scenario was considered at that time. Without [~daryn] 
we might never get this information, so removing the newly added behaviour 
does not seem to be a good option and could cause trouble in other areas.
- this change has the least effect on any other code that has been written.


The solution itself changes the connection builder setup: in a kerberized 
environment FUSE does not set the username, which properly renders the value 
as null on the Java level, so that the Java Kerberos layer inside the UGI 
calls determines the principal's name from the provided ticket cache.
In non-kerberized environments we still need to provide the username, as 
in that case we are checking permissions against the OS user name, and we don't 
want to lose this inside the FUSE logic either.

While checking this, I came across the fact that inside FUSE we could check 
the HADOOP_USER_NAME environment variable and, if it is set, use its value, 
but we currently do not use it anywhere. I filed HDFS-15034 to track this 
improvement.
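
The decision described above can be sketched in a few lines. The actual change is in the native fuse-dfs connection setup, so this Java fragment is only an illustration of the rule, not the code from the PR; the class and method names are hypothetical.

```java
// Hypothetical illustration of the username rule described in the comment.
final class FuseUserName {
  /**
   * Returns the user name to hand to the connection builder.
   * In a kerberized environment this is null, so that the Kerberos layer
   * can derive the principal from the ticket cache; otherwise the OS user
   * name is passed through, because permissions are checked against it.
   */
  static String userNameForConnection(boolean kerberized, String osUserName) {
    if (kerberized) {
      return null;
    }
    return osUserName;
  }
}
```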

> Support Fuse with Users from multiple Security Realms
> -
>
> Key: HDFS-14668
> URL: https://issues.apache.org/jira/browse/HDFS-14668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fuse-dfs
>Reporter: Sailesh Patel
>Assignee: Istvan Fajth
>Priority: Minor
>
> Users from non-default  krb5 domain can't use hadoop-fuse.
> There are 2 Realms with kdc. 
> -one realm is for human users  (USERS.COM.US) 
> -the other is for service principals.   (SERVICE.COM.US) 
> Cross realm trust is setup.
> In krb5.conf  the default domain  is set to SERVICE.COM.US
> Users within USERS.COM.US Realm are not able to put any files to Fuse mounted 
> location
> The client shows:
>   cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
> Input/output error






[jira] [Commented] (HDFS-14869) Data loss in case of distcp using snapshot diff. Replication should include rename records if file was skipped in the previous iteration

2019-12-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989699#comment-16989699
 ] 

Hudson commented on HDFS-14869:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17732 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17732/])
HDFS-14869 Copy renamed files which are not excluded anymore by filter 
(shashikant: rev fc97034b29243a0509633849de55aa734859)
* (edit) 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpSync.java
* (edit) 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCp.java
* (edit) 
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/TestDistCpSync.java


> Data loss in case of distcp using snapshot diff. Replication should include 
> rename records if file was skipped in the previous iteration
> 
>
> Key: HDFS-14869
> URL: https://issues.apache.org/jira/browse/HDFS-14869
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Fix For: 3.1.4
>
>
> This issue arises when a directory or file is excluded by the exclusion 
> filter during distcp replication. If the directory is later renamed to a 
> name which is not excluded by the filter, the snapshot diff reports only a 
> rename operation.  The directory is never copied to the target even though 
> it is not excluded now. This also doesn't throw any error, so there is no 
> way to detect the issue. 
> Steps to reproduce
>  * Create a directory in hdfs to copy using distcp.
>  * Include a staging folder in the directory.
> {code:java}
> [hdfs@ctr-e141-1563959304486-33995-01-03 hadoop-mapreduce]$ hadoop fs -ls 
> /tmp/tocopy
> Found 4 items
> -rw-r--r--   3 hdfs hdfs 16 2019-09-12 10:32 /tmp/tocopy/.b.txt
> drwxr-xr-x   - hdfs hdfs  0 2019-09-23 09:18 /tmp/tocopy/.staging
> -rw-r--r--   3 hdfs hdfs 12 2019-09-12 10:32 /tmp/tocopy/a.txt
> -rw-r--r--   3 hdfs hdfs  4 2019-09-20 08:23 /tmp/tocopy/foo.txt{code}
>  * The exclusion filter is set to exclude any staging directory
> {code:java}
> [hdfs@ctr-e141-1563959304486-33995-01-03 hadoop-mapreduce]$ cat 
> /tmp/filter
> .*\.Trash.*
> .*\.staging.*{code}
>  * Do a copy using distcp snapshots, the staging directory is not replicated.
> {code:java}
> hadoop jar hadoop-distcp-3.3.0-SNAPSHOT.jar 
> -Dmapreduce.job.user.classpath.first=true -filters /tmp/filter 
> /tmp/tocopy/.snapshot/s1 /tmp/target
> [hdfs@ctr-e141-1563959304486-33995-01-03 root]$ hadoop fs -ls /tmp/target
> Found 3 items
> -rw-r--r--   3 hdfs hdfs 16 2019-09-24 06:56 /tmp/target/.b.txt
> -rw-r--r--   3 hdfs hdfs 12 2019-09-24 06:56 /tmp/target/a.txt
> -rw-r--r--   3 hdfs hdfs  4 2019-09-24 06:56 /tmp/target/foo.txt{code}
>  * Rename the staging directory to final
> {code:java}
> [hdfs@ctr-e141-1563959304486-33995-01-03 hadoop-mapreduce]$ hadoop fs -mv 
> /tmp/tocopy/.staging /tmp/tocopy/final{code}
>  * Do a copy using snapshot diff.
> {code:java}
> [hdfs@ctr-e141-1563959304486-33995-01-03 hadoop-mapreduce]$ hdfs 
> snapshotDiff /tmp/tocopy s1 s2
> Difference between snapshot s1 and snapshot s2 under directory /tmp/tocopy:
> M .
> R ./.staging -> ./final
> {code}
>  * The diff report just has a rename record and the new final directory is 
> never copied.
> {code:java}
> [hdfs@ctr-e141-1563959304486-33995-01-03 hadoop-mapreduce]$ hadoop jar 
> hadoop-distcp-3.3.0-SNAPSHOT.jar -Dmapreduce.job.user.classpath.first=true 
> -filters /tmp/filter -diff s1 s2 -update /tmp/tocopy /tmp/target
> 19/09/24 07:05:32 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, 
> ignoreFailures=false, overwrite=false, append=false, useDiff=true, 
> useRdiff=false, fromSnapshot=s1, toSnapshot=s2, skipCRC=false, blocking=true, 
> numListstatusThreads=0, maxMaps=20, mapBandwidth=0.0, 
> copyStrategy='uniformsize', preserveStatus=[BLOCKSIZE], atomicWorkPath=null, 
> logPath=null, sourceFileListing=null, sourcePaths=[/tmp/tocopy], 
> targetPath=/tmp/target, filtersFile='/tmp/filter', blocksPerChunk=0, 
> copyBufferSize=8192, verboseLog=false, directWrite=false}, 
> sourcePaths=[/tmp/tocopy], targetPathExists=true, preserveRawXattrsfalse
> 19/09/24 07:05:32 INFO client.RMProxy: Connecting to ResourceManager at 
> ctr-e141-1563959304486-33995-01-03.hwx.site/172.27.68.128:8050
> 19/09/24 07:05:33 INFO client.AHSProxy: Connecting to Application History 
> server at ctr-e141-1563959304486-33995-01-03.hwx.site/172.27.68.128:10200
> 

[jira] [Resolved] (HDFS-14869) Data loss in case of distcp using snapshot diff. Replication should include rename records if file was skipped in the previous iteration

2019-12-06 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee resolved HDFS-14869.

Fix Version/s: 3.1.4
   Resolution: Fixed

Thanks [~aasha] for the contribution and [~ste...@apache.org] for the review. I 
have committed this.

> Data loss in case of distcp using snapshot diff. Replication should include 
> rename records if file was skipped in the previous iteration
> 
>
> Key: HDFS-14869
> URL: https://issues.apache.org/jira/browse/HDFS-14869
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Fix For: 3.1.4
>
>
> This issue arises when a directory or file is excluded by the exclusion 
> filter during distcp replication. If the directory is later renamed to a 
> name which is not excluded by the filter, the snapshot diff reports only a 
> rename operation.  The directory is never copied to the target even though 
> it is not excluded now. This also doesn't throw any error, so there is no 
> way to detect the issue. 
> Steps to reproduce
>  * Create a directory in hdfs to copy using distcp.
>  * Include a staging folder in the directory.
> {code:java}
> [hdfs@ctr-e141-1563959304486-33995-01-03 hadoop-mapreduce]$ hadoop fs -ls 
> /tmp/tocopy
> Found 4 items
> -rw-r--r--   3 hdfs hdfs 16 2019-09-12 10:32 /tmp/tocopy/.b.txt
> drwxr-xr-x   - hdfs hdfs  0 2019-09-23 09:18 /tmp/tocopy/.staging
> -rw-r--r--   3 hdfs hdfs 12 2019-09-12 10:32 /tmp/tocopy/a.txt
> -rw-r--r--   3 hdfs hdfs  4 2019-09-20 08:23 /tmp/tocopy/foo.txt{code}
>  * The exclusion filter is set to exclude any staging directory
> {code:java}
> [hdfs@ctr-e141-1563959304486-33995-01-03 hadoop-mapreduce]$ cat 
> /tmp/filter
> .*\.Trash.*
> .*\.staging.*{code}
>  * Do a copy using distcp snapshots, the staging directory is not replicated.
> {code:java}
> hadoop jar hadoop-distcp-3.3.0-SNAPSHOT.jar 
> -Dmapreduce.job.user.classpath.first=true -filters /tmp/filter 
> /tmp/tocopy/.snapshot/s1 /tmp/target
> [hdfs@ctr-e141-1563959304486-33995-01-03 root]$ hadoop fs -ls /tmp/target
> Found 3 items
> -rw-r--r--   3 hdfs hdfs 16 2019-09-24 06:56 /tmp/target/.b.txt
> -rw-r--r--   3 hdfs hdfs 12 2019-09-24 06:56 /tmp/target/a.txt
> -rw-r--r--   3 hdfs hdfs  4 2019-09-24 06:56 /tmp/target/foo.txt{code}
>  * Rename the staging directory to final
> {code:java}
> [hdfs@ctr-e141-1563959304486-33995-01-03 hadoop-mapreduce]$ hadoop fs -mv 
> /tmp/tocopy/.staging /tmp/tocopy/final{code}
>  * Do a copy using snapshot diff.
> {code:java}
> [hdfs@ctr-e141-1563959304486-33995-01-03 hadoop-mapreduce]$ hdfs 
> snapshotDiff /tmp/tocopy s1 s2
> Difference between snapshot s1 and snapshot s2 under directory /tmp/tocopy:
> M .
> R ./.staging -> ./final
> {code}
>  * The diff report just has a rename record and the new final directory is 
> never copied.
> {code:java}
> [hdfs@ctr-e141-1563959304486-33995-01-03 hadoop-mapreduce]$ hadoop jar 
> hadoop-distcp-3.3.0-SNAPSHOT.jar -Dmapreduce.job.user.classpath.first=true 
> -filters /tmp/filter -diff s1 s2 -update /tmp/tocopy /tmp/target
> 19/09/24 07:05:32 INFO tools.DistCp: Input Options: 
> DistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, 
> ignoreFailures=false, overwrite=false, append=false, useDiff=true, 
> useRdiff=false, fromSnapshot=s1, toSnapshot=s2, skipCRC=false, blocking=true, 
> numListstatusThreads=0, maxMaps=20, mapBandwidth=0.0, 
> copyStrategy='uniformsize', preserveStatus=[BLOCKSIZE], atomicWorkPath=null, 
> logPath=null, sourceFileListing=null, sourcePaths=[/tmp/tocopy], 
> targetPath=/tmp/target, filtersFile='/tmp/filter', blocksPerChunk=0, 
> copyBufferSize=8192, verboseLog=false, directWrite=false}, 
> sourcePaths=[/tmp/tocopy], targetPathExists=true, preserveRawXattrsfalse
> 19/09/24 07:05:32 INFO client.RMProxy: Connecting to ResourceManager at 
> ctr-e141-1563959304486-33995-01-03.hwx.site/172.27.68.128:8050
> 19/09/24 07:05:33 INFO client.AHSProxy: Connecting to Application History 
> server at ctr-e141-1563959304486-33995-01-03.hwx.site/172.27.68.128:10200
> 19/09/24 07:05:33 INFO tools.DistCp: Number of paths in the copy list: 0
> 19/09/24 07:05:33 INFO client.RMProxy: Connecting to ResourceManager at 
> ctr-e141-1563959304486-33995-01-03.hwx.site/172.27.68.128:8050
> 19/09/24 07:05:33 INFO client.AHSProxy: Connecting to Application History 
> server at ctr-e141-1563959304486-33995-01-03.hwx.site/172.27.68.128:10200
> 19/09/24 07:05:33 INFO 

[jira] [Created] (HDFS-15034) fuse-dfs does not respect HADOOP_USER_NAME envvar with simple auth

2019-12-06 Thread Istvan Fajth (Jira)
Istvan Fajth created HDFS-15034:
---

 Summary: fuse-dfs does not respect HADOOP_USER_NAME envvar with 
simple auth
 Key: HDFS-15034
 URL: https://issues.apache.org/jira/browse/HDFS-15034
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: fuse-dfs
Reporter: Istvan Fajth


In the fuse code, there is an explicit mapping from the context uid to the 
username on the OS level with the help of the getpwuid() system call.
As we already have a way to access the caller's environment, to determine the 
Kerberos ticket cache path, we can respect the HADOOP_USER_NAME setting in a 
SIMPLE_AUTH based environment, so that the host where the mount is does not 
need to have all the users that are defined and used on HDFS.
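
A rough sketch of the proposed lookup order follows. fuse-dfs itself is native code, so this Java fragment only illustrates the idea; the class and method names are hypothetical, and the caller's environment is modelled as a plain map.

```java
import java.util.Map;

// Hypothetical illustration of the proposed lookup order.
final class EffectiveUser {
  /**
   * With simple auth, prefer HADOOP_USER_NAME from the caller's environment
   * when it is set and non-empty; otherwise fall back to the OS user name
   * (the one resolved from the context uid).
   */
  static String effectiveUser(Map<String, String> callerEnv,
                              String osUserName,
                              boolean simpleAuth) {
    if (simpleAuth) {
      String override = callerEnv.get("HADOOP_USER_NAME");
      if (override != null && !override.isEmpty()) {
        return override;
      }
    }
    return osUserName;
  }
}
```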






[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-06 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989634#comment-16989634
 ] 

Xieming Li commented on HDFS-14983:
---

I have added documentation, javadoc, and a unit test.

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Updated] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-06 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14983:
--
Attachment: HDFS-14983.002.patch
Status: Patch Available  (was: Open)

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.002.patch, HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Updated] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option

2019-12-06 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated HDFS-14983:
--
Status: Open  (was: Patch Available)

> RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
> ---
>
> Key: HDFS-14983
> URL: https://issues.apache.org/jira/browse/HDFS-14983
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HDFS-14983.draft.001.patch
>
>
> NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration 
> without restarting but DFSRouter cannot. It would be better for DFSRouter to 
> have such functionality to be compatible with NameNode.






[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-06 Thread Rakesh Radhakrishnan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989600#comment-16989600
 ] 

Rakesh Radhakrishnan commented on HDFS-14740:
-

Thanks [~PhiloHe] for the updates.

How about giving the two pmem-related configs matching names, like below:

{{'dfs.datanode.pmem.cache.restore'}} and {{'dfs.datanode.pmem.cache.dirs'}} ?

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, for simplifying the 
> initial implementation, the previous cache data will be cleaned up during 
> DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking 
> advantage of PM's data persistence characteristic, i.e., recovering the 
> status for cached data, if any, when DataNode restarts, thus, cache warm up 
> time can be saved for user.






[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-06 Thread Feilong He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989587#comment-16989587
 ] 

Feilong He commented on HDFS-14740:
---

[^HDFS-14740.007.patch] has been uploaded to change a property to 
'dfs.datanode.cache.restore.enabled'. Comments are welcome!

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: caching, datanode
>Reporter: Feilong He
>Assignee: Feilong He
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, HDFS-14740.007.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, to simplify the 
> initial implementation, the previous cache data is cleaned up when the 
> DataNode restarts. Here, we propose to improve the HDFS PM cache by taking 
> advantage of PM's data persistence, i.e., recovering the 
> status of cached data, if any, when the DataNode restarts, so that cache 
> warm-up time can be saved for users.



[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-06 Thread Feilong He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Attachment: HDFS-14740.007.patch




[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-06 Thread Feilong He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989573#comment-16989573
 ] 

Feilong He commented on HDFS-14740:
---

Thanks [~rakeshr] so much for your comments. Sorry for the late reply.
 # Yes, 'dfs.datanode.cache.persistence.enabled' looks a bit ambiguous to users. 
This property controls whether the cache on pmem should be restored after a 
DataNode restart, to avoid unnecessarily pulling data to pmem again. I 
prefer to use 'dfs.datanode.cache.restore.enabled'. If you have other comments, 
please kindly let me know.
 # I have conducted some tests on the case you mentioned. 1) In the first test, 
a file was cached to pmem by HDFS with the above flag set to true. Then I 
shut down the cluster and set the flag to false. After restarting the cluster, 
I observed that the previous cache on pmem was dropped and the DataNode had to 
recache the block data to pmem, as expected. 2) In the second test, a 
file was cached to pmem by HDFS with the above flag set to false. Then I 
shut down the cluster and set the flag to true. During the DataNode restart, 
I could see that the previous cache was restored, as expected. To sum 
up, the behavior in the two tests aligns with the purpose of this flag. 
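
The behavior observed in the two tests can be sketched as follows. This is a 
minimal, self-contained illustration and not code from the patch: the class 
CacheRestoreSketch, the method blocksAfterRestart, the block IDs, and the 
default value "false" are all hypothetical. Only the property name comes from 
the comment above. The point it shows is that the outcome depends solely on 
the flag value read at restart, not on the value in effect when the file was 
cached.

```java
import java.util.HashSet;
import java.util.Properties;
import java.util.Set;

class CacheRestoreSketch {
    // Property name proposed in the comment above.
    static final String RESTORE_KEY = "dfs.datanode.cache.restore.enabled";

    /**
     * Hypothetical model of a DataNode restart: returns the block IDs that
     * remain cached on pmem once the restart has completed.
     * The default of "false" is an assumption made for this sketch.
     */
    static Set<Long> blocksAfterRestart(Properties conf, Set<Long> blocksOnPmem) {
        boolean restore =
            Boolean.parseBoolean(conf.getProperty(RESTORE_KEY, "false"));
        // Flag true: keep what is already on pmem. Flag false: drop it,
        // so the DataNode has to recache the block data.
        return restore ? new HashSet<>(blocksOnPmem) : new HashSet<>();
    }

    public static void main(String[] args) {
        Set<Long> cached = new HashSet<>();
        cached.add(1001L); // made-up block ID

        Properties conf = new Properties();

        // Test 1: cached with flag=true, restarted with flag=false.
        conf.setProperty(RESTORE_KEY, "false");
        System.out.println("restart with flag=false: "
            + blocksAfterRestart(conf, cached));

        // Test 2: cached with flag=false, restarted with flag=true.
        conf.setProperty(RESTORE_KEY, "true");
        System.out.println("restart with flag=true: "
            + blocksAfterRestart(conf, cached));
    }
}
```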

> Recover data blocks from persistent memory read cache during datanode restarts
> --
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, 
> HDFS-14740.005.patch, HDFS-14740.006.patch, 
> HDFS_Persistent_Read-Cache_Design-v1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.1.pdf, 
> HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, to simplify the 
> initial implementation, the previous cache data is cleaned up when the 
> DataNode restarts. Here, we propose to improve the HDFS PM cache by taking 
> advantage of PM's data persistence, i.e., recovering the 
> status of cached data, if any, when the DataNode restarts, so that cache 
> warm-up time can be saved for users.



[jira] [Assigned] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts

2019-12-06 Thread Feilong He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He reassigned HDFS-14740:
-

Assignee: Feilong He  (was: Rui Mo)



