[jira] [Resolved] (HDDS-1705) Recon: Add estimatedTotalCount to the response of containers and containers/{id} endpoints

2019-07-08 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDDS-1705.
-
  Resolution: Fixed
   Fix Version/s: 0.4.1
Target Version/s:   (was: 0.5.0)

I've committed this. Thanks for the contribution [~vivekratnavel] and thanks 
for the review [~swagle].

> Recon: Add estimatedTotalCount to the response of containers and 
> containers/{id} endpoints
> --
>
> Key: HDDS-1705
> URL: https://issues.apache.org/jira/browse/HDDS-1705
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Recon
>Affects Versions: 0.4.0
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.4.1
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>







[jira] [Reopened] (HDFS-12748) NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY

2019-07-08 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang reopened HDFS-12748:


> NameNode memory leak when accessing webhdfs GETHOMEDIRECTORY
> 
>
> Key: HDFS-12748
> URL: https://issues.apache.org/jira/browse/HDFS-12748
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.2
>Reporter: Jiandan Yang 
>Assignee: Weiwei Yang
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12748-branch-3.1.01.patch, HDFS-12748.001.patch, 
> HDFS-12748.002.patch, HDFS-12748.003.patch, HDFS-12748.004.patch, 
> HDFS-12748.005.patch
>
>
> In our production environment, the standby NN often does full GC. Using MAT we 
> found that the largest object is FileSystem$Cache, which contains 7,844,890 
> DistributedFileSystem instances.
> By viewing the call hierarchy of FileSystem.get(), I found that only 
> NamenodeWebHdfsMethods#get calls FileSystem.get(). I don't know why a different 
> DistributedFileSystem is created every time instead of getting a FileSystem 
> from the cache.
> {code:java}
> case GETHOMEDIRECTORY: {
>   final String js = JsonUtil.toJsonString("Path",
>   FileSystem.get(conf != null ? conf : new Configuration())
>   .getHomeDirectory().toUri().getPath());
>   return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
> }
> {code}
> When we close the FileSystem after GETHOMEDIRECTORY, the NN no longer does full gc.
> {code:java}
> case GETHOMEDIRECTORY: {
>   FileSystem fs = null;
>   try {
> fs = FileSystem.get(conf != null ? conf : new Configuration());
> final String js = JsonUtil.toJsonString("Path",
> fs.getHomeDirectory().toUri().getPath());
> return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
>   } finally {
> if (fs != null) {
>   fs.close();
> }
>   }
> }
> {code}
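> For illustration only (a sketch, not the committed patch): since FileSystem is 
> Closeable, the same fix can also be written with try-with-resources so the 
> instance is closed even if building the JSON response throws.
> {code:java}
> case GETHOMEDIRECTORY: {
>   // Sketch: close the FileSystem in all cases via try-with-resources.
>   try (FileSystem fs =
>       FileSystem.get(conf != null ? conf : new Configuration())) {
>     final String js = JsonUtil.toJsonString("Path",
>         fs.getHomeDirectory().toUri().getPath());
>     return Response.ok(js).type(MediaType.APPLICATION_JSON).build();
>   }
> }
> {code}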






[jira] [Created] (HDDS-1775) Make OM KeyDeletingService compatible with HA model

2019-07-08 Thread Hanisha Koneru (JIRA)
Hanisha Koneru created HDDS-1775:


 Summary: Make OM KeyDeletingService compatible with HA model
 Key: HDDS-1775
 URL: https://issues.apache.org/jira/browse/HDDS-1775
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Hanisha Koneru
Assignee: Hanisha Koneru


Currently, the OM KeyDeletingService deletes all the keys in the DeletedTable 
directly after deleting the corresponding blocks through SCM. For HA compatibility, 
the key purging should happen through the OM Ratis server. This Jira introduces a 
PurgeKeys request in the OM protocol. This request will be submitted to the OM's 
Ratis server after SCM deletes the blocks corresponding to the deleted keys.
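
A rough sketch of the intended flow follows. The names used here 
(PurgeKeysRequest, omRatisServer.submitRequest, scmBlockClient, deletedTable) are 
hypothetical placeholders, not the actual classes introduced by the patch:

{code:java}
// Hypothetical sketch only; the real request/response types are defined in the
// OM protocol by this Jira's patch.
List<String> purgedKeys = new ArrayList<>();
for (DeletedKey deletedKey : deletedTable.getPendingDeletes()) {
  // 1. Delete the underlying blocks through SCM, as the service does today.
  scmBlockClient.deleteBlocks(deletedKey.getBlockIDs());
  purgedKeys.add(deletedKey.getKeyName());
}
// 2. Instead of removing the entries from DeletedTable directly, submit a
//    PurgeKeys request so the purge is replicated through the OM Ratis server
//    and applied consistently on all OM replicas.
omRatisServer.submitRequest(
    PurgeKeysRequest.newBuilder().addAllKeys(purgedKeys).build());
{code}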






[jira] [Created] (HDDS-1774) Add disk hang test to fault injection test

2019-07-08 Thread Eric Yang (JIRA)
Eric Yang created HDDS-1774:
---

 Summary: Add disk hang test to fault injection test
 Key: HDDS-1774
 URL: https://issues.apache.org/jira/browse/HDDS-1774
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Eric Yang


When a disk is corrupted, it may appear to hang when data is accessed.  
One simulation that can be performed is to set the disk IO throughput to 0 
bytes/sec to simulate a disk hang.  The Ozone file system client can detect the 
disk access timeout and proceed to read/write data on another datanode.
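
As a generic illustration of the client-side behavior described above (this is 
not Ozone client code; DatanodeReader is a hypothetical stand-in for a 
per-datanode read call), a timeout-plus-failover pattern looks roughly like this:

{code:java}
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutFailoverSketch {

  /** Hypothetical stand-in for reading a block replica from one datanode. */
  interface DatanodeReader {
    byte[] read() throws Exception;
  }

  static byte[] readWithFailover(List<DatanodeReader> replicas,
      long timeoutMillis) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      for (DatanodeReader replica : replicas) {
        Future<byte[]> future = executor.submit(replica::read);
        try {
          // A hung disk shows up as a timeout; move on to the next replica.
          return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
          future.cancel(true);
        }
      }
      throw new Exception("All replicas timed out or failed");
    } finally {
      executor.shutdownNow();
    }
  }
}
{code}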






[jira] [Created] (HDFS-14637) Namenode may not replicate blocks to meet the policy after enabling upgradeDomain

2019-07-08 Thread Stephen O'Donnell (JIRA)
Stephen O'Donnell created HDFS-14637:


 Summary: Namenode may not replicate blocks to meet the policy 
after enabling upgradeDomain
 Key: HDFS-14637
 URL: https://issues.apache.org/jira/browse/HDFS-14637
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.3.0
Reporter: Stephen O'Donnell
Assignee: Stephen O'Donnell


After changing the network topology or placement policy on a cluster and 
restarting the namenode, the namenode will scan all blocks on the cluster at 
startup, and check if they meet the current placement policy. If they do not, 
they are added to the replication queue and the namenode will arrange for them 
to be replicated to ensure the placement policy is used.

If you start with a cluster with no UpgradeDomain and then enable 
UpgradeDomain, then on restart the NN does notice that all the blocks violate 
the placement policy and adds them to the replication queue. I believe there 
are some issues in the logic that prevent the blocks from replicating, 
depending on the setup:

With UD enabled, but no racks configured, and possibly on a 2-rack cluster, the 
queued replication work never makes any progress, because in 
blockManager.validateReconstructionWork() it checks whether the new replica 
increases the number of racks, and if it does not, it skips the block and tries 
again later.
{code:java}
DatanodeStorageInfo[] targets = rw.getTargets();
if ((numReplicas.liveReplicas() >= requiredRedundancy) &&
(!isPlacementPolicySatisfied(block)) ) {
  if (!isInNewRack(rw.getSrcNodes(), targets[0].getDatanodeDescriptor())) {
// No use continuing, unless a new rack in this case
return false;
  }
  // mark that the reconstruction work is to replicate internal block to a
  // new rack.
  rw.setNotEnoughRack();
}

{code}
Additionally, in blockManager.scheduleReconstruction() there is some logic that 
sets the number of new replicas required to one if the live replicas >= 
requiredRedundancy:
{code:java}
int additionalReplRequired;
if (numReplicas.liveReplicas() < requiredRedundancy) {
  additionalReplRequired = requiredRedundancy - numReplicas.liveReplicas()
  - pendingNum;
} else {
  additionalReplRequired = 1; // Needed on a new rack
}
{code}
With UD, it is possible for 2 new replicas to be needed to meet the block 
placement policy, if all existing replicas are on nodes with the same upgrade 
domain. For traditional '2 rack redundancy', only 1 new replica would ever have 
been needed in this scenario.
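
To make the last point concrete, here is a small illustrative calculation (not 
the actual BlockManager code; the variable names and the upgrade domain factor 
value are assumptions for the example):

{code:java}
// Example: replication factor 3, upgrade domain factor 3, and all three live
// replicas currently sit in the same upgrade domain "ud1".
int requiredRedundancy = 3;
int upgradeDomainFactor = 3;
java.util.Set<String> domainsOfLiveReplicas = new java.util.HashSet<>(
    java.util.Arrays.asList("ud1", "ud1", "ud1"));  // collapses to {"ud1"}

// Two more distinct upgrade domains are still needed, so one extra replica is
// not enough; hard-coding additionalReplRequired = 1 can leave the block stuck.
int additionalForUpgradeDomains =
    Math.min(upgradeDomainFactor, requiredRedundancy)
        - domainsOfLiveReplicas.size();  // = 2
{code}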






[jira] [Created] (HDDS-1773) Add intermittent IO disk test to fault injection test

2019-07-08 Thread Eric Yang (JIRA)
Eric Yang created HDDS-1773:
---

 Summary: Add intermittent IO disk test to fault injection test
 Key: HDDS-1773
 URL: https://issues.apache.org/jira/browse/HDDS-1773
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Eric Yang


Disk errors can also be simulated by setting the cgroup blkio rate to 0 while 
the Ozone cluster is running.  
This test will be added to the corruption test project, and it will only be 
performed if there is write access to the host cgroup to control the throttling 
of disk IO.

Expected result:
When a datanode becomes unresponsive due to slow IO, SCM must flag the node as 
unhealthy.







[jira] [Created] (HDDS-1772) Add disk full test to fault injection test

2019-07-08 Thread Eric Yang (JIRA)
Eric Yang created HDDS-1772:
---

 Summary: Add disk full test to fault injection test
 Key: HDDS-1772
 URL: https://issues.apache.org/jira/browse/HDDS-1772
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Eric Yang


In the read-only test, one of the simulations to verify is the data disk 
becoming full.  This can be tested by using a small Docker data disk to 
simulate a full disk.  When the data disk is full, Ozone should continue to 
operate and provide read access to the Ozone file system.






[jira] [Created] (HDDS-1771) Add slow IO disk test to fault injection test

2019-07-08 Thread Eric Yang (JIRA)
Eric Yang created HDDS-1771:
---

 Summary: Add slow IO disk test to fault injection test
 Key: HDDS-1771
 URL: https://issues.apache.org/jira/browse/HDDS-1771
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Eric Yang


In the fault injection test, one possible simulation to run is to create slow 
disk IO to assist in developing a set of timing profiles that works for the 
Ozone cluster.  When we write to a file, the data travels across a number of 
buffers and caches before it is effectively written to the disk.  By 
controlling the cgroup blkio rate in the Linux kernel, we can simulate slow 
disk reads and writes.  Docker provides the following parameters to control 
the cgroup:

{code}
--device-read-bps=""
--device-write-bps=""
--device-read-iops=""
--device-write-iops=""
{code}

The test will be added to the read/write test, with the docker-compose file as 
a parameter, to test the timing profiles.






[jira] [Resolved] (HDDS-1338) ozone shell commands are throwing InvocationTargetException

2019-07-08 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HDDS-1338.

Resolution: Duplicate

> ozone shell commands are throwing InvocationTargetException
> ---
>
> Key: HDDS-1338
> URL: https://issues.apache.org/jira/browse/HDDS-1338
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Nilotpal Nandi
>Priority: Major
>
> ozone version
> {noformat}
> Source code repository g...@github.com:hortonworks/ozone.git -r 
> 310ebf5dc83b6c9e68d09246ed6c6f7cf6370fde
> Compiled by jenkins on 2019-03-21T22:06Z
> Compiled with protoc 2.5.0
> From source with checksum 9c367143ad43b81ca84bfdaafd1c3f
> Using HDDS 0.4.0.3.0.100.0-388
> Source code repository g...@github.com:hortonworks/ozone.git -r 
> 310ebf5dc83b6c9e68d09246ed6c6f7cf6370fde
> Compiled by jenkins on 2019-03-21T22:06Z
> Compiled with protoc 2.5.0
> From source with checksum f3297cbd3a5f59fb4e5fd551afa05ba9
> {noformat}
> Here is the ozone volume create failure output :
> {noformat}
> hdfs@ctr-e139-1542663976389-91321-01-02 ~]$ ozone sh volume create 
> testvolume11
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.0.100.0-388/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.0.100.0-388/hadoop-ozone/share/ozone/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 19/03/26 17:31:37 ERROR client.OzoneClientFactory: Couldn't create protocol 
> class org.apache.hadoop.ozone.client.rpc.RpcClient exception:
> java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:291)
>  at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:169)
>  at 
> org.apache.hadoop.ozone.web.ozShell.OzoneAddress.createClient(OzoneAddress.java:111)
>  at 
> org.apache.hadoop.ozone.web.ozShell.volume.CreateVolumeHandler.call(CreateVolumeHandler.java:70)
>  at 
> org.apache.hadoop.ozone.web.ozShell.volume.CreateVolumeHandler.call(CreateVolumeHandler.java:38)
>  at picocli.CommandLine.execute(CommandLine.java:919)
>  at picocli.CommandLine.access$700(CommandLine.java:104)
>  at picocli.CommandLine$RunLast.handle(CommandLine.java:1083)
>  at picocli.CommandLine$RunLast.handle(CommandLine.java:1051)
>  at 
> picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959)
>  at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242)
>  at picocli.CommandLine.parseWithHandler(CommandLine.java:1181)
>  at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:61)
>  at org.apache.hadoop.ozone.web.ozShell.Shell.execute(Shell.java:82)
>  at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:52)
>  at org.apache.hadoop.ozone.web.ozShell.Shell.main(Shell.java:93)
> Caused by: java.lang.VerifyError: Cannot inherit from final class
>  at java.lang.ClassLoader.defineClass1(Native Method)
>  at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>  at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>  at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
>  at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
>  at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
>  at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.(OzoneManagerProtocolClientSideTranslatorPB.java:169)
>  at org.apache.hadoop.ozone.client.rpc.RpcClient.(RpcClient.java:142)
>  ... 20 more
> Couldn't create protocol class org.apache.hadoop.ozone.client.rpc.RpcClient
> {noformat}
>  




[jira] [Resolved] (HDDS-1305) Robot test containers: hadoop client can't access o3fs

2019-07-08 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton resolved HDDS-1305.

Resolution: Duplicate

Thanks for reporting this issue. It will be fixed in HDDS-1717.

(Based on the timeline that one is the duplicate, but we have a working patch 
there, so I am closing this one.)

> Robot test containers: hadoop client can't access o3fs
> --
>
> Key: HDDS-1305
> URL: https://issues.apache.org/jira/browse/HDDS-1305
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: Sandeep Nemuri
>Assignee: Anu Engineer
>Priority: Major
> Attachments: run.log
>
>
> Run the robot test using:
> {code:java}
> ./test.sh --keep --env ozonefs
> {code}
> Log in to the OM container and check whether the desired volume/bucket/key 
> got created by the robot tests.
> {code:java}
> [root@o3new ~]$ docker exec -it ozonefs_om_1 /bin/bash
> bash-4.2$ ozone fs -ls o3fs://bucket1.fstest/
> Found 3 items
> -rw-rw-rw-   1 hadoop hadoop  22990 2019-03-15 17:28 
> o3fs://bucket1.fstest/KEY.txt
> drwxrwxrwx   - hadoop hadoop  0 1970-01-01 00:00 
> o3fs://bucket1.fstest/testdir
> drwxrwxrwx   - hadoop hadoop  0 2019-03-15 17:27 
> o3fs://bucket1.fstest/testdir1
> {code}
> {code:java}
> [root@o3new ~]$ docker exec -it ozonefs_hadoop3_1 /bin/bash
> bash-4.4$ hadoop classpath
> /opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/share/hadoop/yarn:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*:/opt/ozone/share/ozone/lib/hadoop-ozone-filesystem-lib-current-0.5.0-SNAPSHOT.jar
> bash-4.4$ hadoop fs -ls o3fs://bucket1.fstest/
> 2019-03-18 19:12:42 INFO  Configuration:3204 - Removed undeclared tags:
> 2019-03-18 19:12:42 ERROR OzoneClientFactory:294 - Couldn't create protocol 
> class org.apache.hadoop.ozone.client.rpc.RpcClient exception:
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:291)
>   at 
> org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:169)
>   at 
> org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.(OzoneClientAdapterImpl.java:127)
>   at 
> org.apache.hadoop.fs.ozone.OzoneFileSystem.initialize(OzoneFileSystem.java:189)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3354)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
>   at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
>   at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:325)
>   at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:249)
>   at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:232)
>   at 
> org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:104)
>   at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
>   at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
> Caused by: java.lang.VerifyError: Cannot inherit from final class
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at 

[jira] [Resolved] (HDDS-1644) Overload RpcClient#createKey to pass non-default acls

2019-07-08 Thread Ajay Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar resolved HDDS-1644.
--
Resolution: Won't Fix

> Overload RpcClient#createKey to pass non-default acls
> -
>
> Key: HDDS-1644
> URL: https://issues.apache.org/jira/browse/HDDS-1644
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: Ajay Kumar
>Assignee: Anu Engineer
>Priority: Major
>
> Overload RpcClient#createKey to pass default acls as function parameters.






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2019-07-08 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/

No changes




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-documentstore
 
   Unread field:TimelineEventSubDoc.java:[line 56] 
   Unread field:TimelineMetricSubDoc.java:[line 44] 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
 
   Class org.apache.hadoop.applications.mawo.server.common.TaskStatus 
implements Cloneable but does not define or use clone method At 
TaskStatus.java:does not define or use clone method At TaskStatus.java:[lines 
39-346] 
   Equals method for 
org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument 
is of type WorkerId At WorkerId.java:the argument is of type WorkerId At 
WorkerId.java:[line 114] 
   
org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does 
not check for null argument At WorkerId.java:null argument At 
WorkerId.java:[lines 114-115] 

FindBugs :

   module:hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-infra 
   org.apache.hadoop.tools.dynamometer.Client.addFileToZipRecursively(File, 
File, ZipOutputStream) may fail to clean up java.io.InputStream on checked 
exception Obligation to clean up resource created at Client.java:to clean up 
java.io.InputStream on checked exception Obligation to clean up resource 
created at Client.java:[line 859] is not discharged 
   Exceptional return value of java.io.File.mkdirs() ignored in 
org.apache.hadoop.tools.dynamometer.DynoInfraUtils.fetchHadoopTarball(File, 
String, Configuration, Logger) At DynoInfraUtils.java:ignored in 
org.apache.hadoop.tools.dynamometer.DynoInfraUtils.fetchHadoopTarball(File, 
String, Configuration, Logger) At DynoInfraUtils.java:[line 138] 
   Found reliance on default encoding in 
org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]):in 
org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]): new 
java.io.InputStreamReader(InputStream) At SimulatedDataNodes.java:[line 149] 
   org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]) 
invokes System.exit(...), which shuts down the entire virtual machine At 
SimulatedDataNodes.java:down the entire virtual machine At 
SimulatedDataNodes.java:[line 123] 
   org.apache.hadoop.tools.dynamometer.SimulatedDataNodes.run(String[]) may 
fail to close stream At SimulatedDataNodes.java:stream At 
SimulatedDataNodes.java:[line 149] 

FindBugs :

   module:hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-blockgen 
   Self assignment of field BlockInfo.replication in new 
org.apache.hadoop.tools.dynamometer.blockgenerator.BlockInfo(BlockInfo) At 
BlockInfo.java:in new 
org.apache.hadoop.tools.dynamometer.blockgenerator.BlockInfo(BlockInfo) At 
BlockInfo.java:[line 78] 

Failed junit tests :

   hadoop.util.TestDiskCheckerWithDiskIo 
   hadoop.hdfs.server.datanode.TestDirectoryScanner 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.hdfs.server.federation.router.TestRouterWithSecureStartup 
   hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken 
   hadoop.hdfs.server.federation.router.TestRouterRpcMultiDestination 
   hadoop.ozone.container.ozoneimpl.TestOzoneContainer 
   hadoop.ozone.om.TestOzoneManagerHA 
   hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis 
   hadoop.ozone.client.rpc.TestOzoneAtRestEncryption 
   hadoop.ozone.client.rpc.TestOzoneRpcClient 
   hadoop.ozone.client.rpc.TestSecureOzoneRpcClient 
   hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-compile-javac-root.txt
  [336K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-checkstyle-root.txt
  [17M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-patch-hadolint.txt
  [8.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1191/artifact/out/diff-patch-pylint.txt
  [120K]

   shellcheck:

   

[jira] [Created] (HDDS-1770) SCM crashes when ReplicationManager is trying to re-replicate under replicated containers

2019-07-08 Thread Nanda kumar (JIRA)
Nanda kumar created HDDS-1770:
-

 Summary: SCM crashes when ReplicationManager is trying to 
re-replicate under replicated containers
 Key: HDDS-1770
 URL: https://issues.apache.org/jira/browse/HDDS-1770
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: SCM
Reporter: Nanda kumar


SCM crashes with the following exception when the ReplicationManager is trying 
to re-replicate under-replicated containers:
{noformat}
2019-07-08 12:46:36 ERROR ReplicationManager:215 - Exception in Replication 
Monitor Thread.
java.lang.IllegalArgumentException: Affinity node /default-rack/aab15e2d07cc is 
not a member of topology
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.checkAffinityNode(NetworkTopologyImpl.java:767)
at 
org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.chooseRandom(NetworkTopologyImpl.java:407)
at 
org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseNode(SCMContainerPlacementRackAware.java:242)
at 
org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseDatanodes(SCMContainerPlacementRackAware.java:168)
at 
org.apache.hadoop.hdds.scm.container.ReplicationManager.handleUnderReplicatedContainer(ReplicationManager.java:487)
at 
org.apache.hadoop.hdds.scm.container.ReplicationManager.processContainer(ReplicationManager.java:293)
at 
java.base/java.util.concurrent.ConcurrentHashMap$KeySetView.forEach(ConcurrentHashMap.java:4698)
at 
java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1083)
at 
org.apache.hadoop.hdds.scm.container.ReplicationManager.run(ReplicationManager.java:205)
at java.base/java.lang.Thread.run(Thread.java:834)
2019-07-08 12:46:36 INFO  ExitUtil:210 - Exiting with status 1: 
java.lang.IllegalArgumentException: Affinity node /default-rack/aab15e2d07cc is 
not a member of topology
2019-07-08 12:46:36 INFO  StorageContainerManagerStarter:51 - SHUTDOWN_MSG: 
/
SHUTDOWN_MSG: Shutting down StorageContainerManager at 
8c763563f672/192.168.112.2
/
{noformat}






Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2019-07-08 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/

No changes




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
 
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 

FindBugs :

   module:hadoop-common-project/hadoop-common 
   Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient 
non-serializable instance field map In GlobalStorageStatistics.java:instance 
field map In GlobalStorageStatistics.java 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
 
   Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:[line 335] 

Failed junit tests :

   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.hdfs.server.datanode.TestDirectoryScanner 
   hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead 
   hadoop.registry.secure.TestSecureLogins 
   
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
 
   hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 
   hadoop.yarn.sls.TestSLSRunner 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt
  [328K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-cc-root-jdk1.8.0_212.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-compile-javac-root-jdk1.8.0_212.txt
  [308K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-shellcheck.txt
  [72K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/whitespace-tabs.txt
  [1.2M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/xml.txt
  [12K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt
  [16K]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/diff-javadoc-javadoc-root-jdk1.8.0_212.txt
  [1.1M]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/376/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [228K]
   

[jira] [Created] (HDFS-14636) SBN: If the default proxy provider is configured, read requests still go to the Observer namenode only.

2019-07-08 Thread Harshakiran Reddy (JIRA)
Harshakiran Reddy created HDFS-14636:


 Summary: SBN: If the default proxy provider is configured, read 
requests still go to the Observer namenode only.
 Key: HDFS-14636
 URL: https://issues.apache.org/jira/browse/HDFS-14636
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.1.1
Reporter: Harshakiran Reddy


{noformat}
In an Observer cluster, if the default proxy provider is configured instead of 
"org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider", read 
requests still go to the Observer namenode only.{noformat}
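
For context, a sketch of the client-side setting being compared (the 
nameservice ID "mycluster" is a placeholder; the expectation is that only 
ObserverReadProxyProvider should route reads to Observer NameNodes):

{code:java}
Configuration conf = new HdfsConfiguration();
// Default-style failover provider: reads are expected to go to the active NN.
conf.set("dfs.client.failover.proxy.provider.mycluster",
    "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
// Observer-aware provider: reads may be served by Observer NameNodes.
// conf.set("dfs.client.failover.proxy.provider.mycluster",
//     "org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider");
{code}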






[jira] [Created] (HDFS-14635) Support to refresh the rack awareness dynamically

2019-07-08 Thread liying (JIRA)
liying created HDFS-14635:
-

 Summary: Support to refresh the rack awareness dynamically 
 Key: HDFS-14635
 URL: https://issues.apache.org/jira/browse/HDFS-14635
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 2.7.2
Reporter: liying


At present, there are two ways to load the rack script in the Hadoop code. 
The class ScriptBasedMapping is the caching way, and the class 
ScriptBasedMapping#RawScriptBasedMapping loads the script every time (on every 
request). The cache is the better way to implement this feature, because 
loading the script for every request costs CPU. But there is another problem: 
we can't refresh the cache, so it is important to support refreshing the rack 
awareness dynamically.
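
A minimal sketch of the kind of refresh hook being requested (a hypothetical 
class, not existing Hadoop code; it wraps whatever DNSToSwitchMapping runs the 
topology script):

{code:java}
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.net.DNSToSwitchMapping;

// Hypothetical: a cached rack mapping whose cache can be dropped on demand,
// so changed rack assignments are picked up without restarting the NameNode.
public class RefreshableRackMapping {
  private final DNSToSwitchMapping rawMapping;
  private final Map<String, String> cache = new ConcurrentHashMap<>();

  public RefreshableRackMapping(DNSToSwitchMapping rawMapping) {
    this.rawMapping = rawMapping;
  }

  /** Cached resolve: only consults the topology script on a cache miss. */
  public String resolve(String host) {
    return cache.computeIfAbsent(host,
        h -> rawMapping.resolve(Collections.singletonList(h)).get(0));
  }

  /** Admin-triggered refresh: clear the cache so the script is re-consulted. */
  public void refresh() {
    cache.clear();
  }
}
{code}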


