[jira] [Created] (HDDS-2270) Avoid buffer copying in ContainerStateMachine.loadSnapshot/persistContainerSet

2019-10-08 Thread Tsz-wo Sze (Jira)
Tsz-wo Sze created HDDS-2270:


 Summary: Avoid buffer copying in 
ContainerStateMachine.loadSnapshot/persistContainerSet
 Key: HDDS-2270
 URL: https://issues.apache.org/jira/browse/HDDS-2270
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Reporter: Tsz-wo Sze
Assignee: Tsz-wo Sze


ContainerStateMachine:
- In loadSnapshot(..), it first reads the snapshotFile into a byte[] and then 
parses it to ContainerProtos.Container2BCSIDMapProto.  The buffer copying can 
be avoided.
{code}
try (FileInputStream fin = new FileInputStream(snapshotFile)) {
  byte[] container2BCSIDData = IOUtils.toByteArray(fin);
  ContainerProtos.Container2BCSIDMapProto proto =
      ContainerProtos.Container2BCSIDMapProto.parseFrom(container2BCSIDData);
  ...
}
{code}

- persistContainerSet(..) has a similar problem.
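One possible fix (a sketch only, not compiled here): protobuf-java generated messages also provide parseFrom(InputStream), so the proto can be parsed straight from the file stream without materializing the intermediate byte[]:

```java
// Sketch: parse directly from the stream; no IOUtils.toByteArray copy.
try (FileInputStream fin = new FileInputStream(snapshotFile)) {
  ContainerProtos.Container2BCSIDMapProto proto =
      ContainerProtos.Container2BCSIDMapProto.parseFrom(fin);
  ...
}
```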



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-2269) Provide config for fair/non-fair for OM RW Lock

2019-10-08 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2269:


 Summary: Provide config for fair/non-fair for OM RW Lock
 Key: HDDS-2269
 URL: https://issues.apache.org/jira/browse/HDDS-2269
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


Provide a config option in the OzoneManager lock to choose fair vs. non-fair mode for the OM read/write lock.
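For illustration: java.util.concurrent's ReentrantReadWriteLock already accepts a fairness flag in its constructor, so the proposed config would mainly need to be threaded through to it. A minimal self-contained sketch (the config key name below is made up for the example, not an actual Ozone property):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FairLockDemo {
    public static void main(String[] args) {
        // "ozone.om.lock.fair" is a hypothetical key used only for this sketch.
        boolean fair = Boolean.parseBoolean(
            System.getProperty("ozone.om.lock.fair", "false"));
        // ReentrantReadWriteLock takes the fairness policy directly.
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(fair);
        System.out.println(lock.isFair()); // prints "false" when the property is unset
    }
}
```

A fair lock hands the longest-waiting thread the lock next, which avoids writer starvation at some throughput cost; exposing the flag lets operators pick the trade-off.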






[jira] [Created] (HDDS-2268) Incorrect container checksum upon downgrade

2019-10-08 Thread Attila Doroszlai (Jira)
Attila Doroszlai created HDDS-2268:
--

 Summary: Incorrect container checksum upon downgrade
 Key: HDDS-2268
 URL: https://issues.apache.org/jira/browse/HDDS-2268
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Datanode, upgrade
Reporter: Attila Doroszlai


Container file checksum is calculated over all YAML fields present in a given 
Ozone version.  If the same container file is read by an older Ozone version, 
which has fewer fields, the expected checksum will differ.

Example: origin pipeline ID and origin node ID were added for HDDS-837 in Ozone 
0.4.0.  Starting Ozone 0.3.0 with the same data results in a checksum error.

{noformat}
datanode_1  | ... ERROR ContainerReader:166 - Failed to parse ContainerFile for 
ContainerID: 1
datanode_1  | 
org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: 
Container checksum error for ContainerID: 1.
datanode_1  | Stored Checksum: 
7a6ec508d6e3796c5fe5fd52574b3d3437b0a0eaa4e053f7a96a5e39f4abb374
datanode_1  | Expected Checksum: 
fee023a02d3ced2f7b0b42c116cce5f03da6b57b29965ca878dc46d1213230b6
datanode_1  |   at 
org.apache.hadoop.ozone.container.common.helpers.ContainerUtils.verifyChecksum(ContainerUtils.java:259)
datanode_1  |   at 
org.apache.hadoop.ozone.container.keyvalue.helpers.KeyValueContainerUtil.parseKVContainerData(KeyValueContainerUtil.java:165)
datanode_1  |   at 
org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyContainerData(ContainerReader.java:180)
datanode_1  |   at 
org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyContainerFile(ContainerReader.java:164)
datanode_1  |   at 
org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.readVolume(ContainerReader.java:142)
{noformat}
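The effect can be reproduced in isolation: hashing the same logical container metadata with and without the 0.4.0-only fields yields different digests. A simplified, self-contained sketch (the field names and hashing the map's string form are illustrative; the real code serializes the container YAML):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.TreeMap;

public class ChecksumDriftDemo {
    static String sha256(TreeMap<String, String> fields) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        // Digest the serialized field map (simplified stand-in for the YAML bytes).
        md.update(fields.toString().getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Fields an (illustrative) 0.3.0-era container file would carry.
        TreeMap<String, String> oldFields = new TreeMap<>();
        oldFields.put("containerID", "1");
        oldFields.put("state", "OPEN");

        // 0.4.0 added origin pipeline/node IDs (HDDS-837): same container,
        // more fields, hence different bytes under the digest.
        TreeMap<String, String> newFields = new TreeMap<>(oldFields);
        newFields.put("originPipelineId", "pipeline-1");
        newFields.put("originNodeId", "node-1");

        System.out.println(sha256(oldFields).equals(sha256(newFields))); // prints "false"
    }
}
```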






[jira] [Resolved] (HDDS-2244) Use new ReadWrite lock in OzoneManager

2019-10-08 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2244.
--
Resolution: Fixed

> Use new ReadWrite lock in OzoneManager
> --
>
> Key: HDDS-2244
> URL: https://issues.apache.org/jira/browse/HDDS-2244
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Use new ReadWriteLock added in HDDS-2223.






[jira] [Resolved] (HDDS-2260) Avoid evaluation of LOG.trace and LOG.debug statement in the read/write path (HDDS)

2019-10-08 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2260.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Avoid evaluation of LOG.trace and LOG.debug statement in the read/write path 
> (HDDS)
> ---
>
> Key: HDDS-2260
> URL: https://issues.apache.org/jira/browse/HDDS-2260
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client, Ozone Datanode
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Arguments to LOG.trace and LOG.debug calls are evaluated even when 
> debug/trace logging is disabled. This jira proposes to wrap all trace/debug 
> logging in LOG.isDebugEnabled and LOG.isTraceEnabled checks to avoid that 
> cost on the read/write path.






[jira] [Created] (HDDS-2267) Container metadata scanner interval mismatch

2019-10-08 Thread Attila Doroszlai (Jira)
Attila Doroszlai created HDDS-2267:
--

 Summary: Container metadata scanner interval mismatch
 Key: HDDS-2267
 URL: https://issues.apache.org/jira/browse/HDDS-2267
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Datanode
Reporter: Attila Doroszlai
Assignee: Attila Doroszlai


The container metadata scanner can be configured to run at a specific time 
interval, e.g. hourly ({{hdds.containerscrub.metadata.scan.interval}}).  
However, the actual run interval does not match the configuration: after a 
datanode restart it runs several times in quick succession, and later it runs 
at apparently random intervals.

{noformat:title=sample log}
datanode_1  | 2019-10-08 14:05:30 INFO  ContainerMetadataScanner:88 - Completed 
an iteration of container metadata scrubber in 0 minutes. Number of  iterations 
(since the data-node restart) : 1, Number of containers scanned in this 
iteration : 0, Number of unhealthy containers found in this iteration : 0
datanode_1  | 2019-10-08 14:09:33 INFO  ContainerMetadataScanner:88 - Completed 
an iteration of container metadata scrubber in 0 minutes. Number of  iterations 
(since the data-node restart) : 1, Number of containers scanned in this 
iteration : 6, Number of unhealthy containers found in this iteration : 0
...
datanode_1  | 2019-10-08 14:09:33 INFO  ContainerMetadataScanner:88 - Completed 
an iteration of container metadata scrubber in 0 minutes. Number of  iterations 
(since the data-node restart) : 28, Number of containers scanned in this 
iteration : 6, Number of unhealthy containers found in this iteration : 0
datanode_1  | 2019-10-08 14:21:01 INFO  ContainerMetadataScanner:88 - Completed 
an iteration of container metadata scrubber in 0 minutes. Number of  iterations 
(since the data-node restart) : 29, Number of containers scanned in this 
iteration : 6, Number of unhealthy containers found in this iteration : 0
datanode_1  | 2019-10-08 14:21:01 INFO  ContainerMetadataScanner:88 - Completed 
an iteration of container metadata scrubber in 0 minutes. Number of  iterations 
(since the data-node restart) : 30, Number of containers scanned in this 
iteration : 6, Number of unhealthy containers found in this iteration : 0
datanode_1  | 2019-10-08 15:30:38 INFO  ContainerMetadataScanner:88 - Completed 
an iteration of container metadata scrubber in 0 minutes. Number of  iterations 
(since the data-node restart) : 31, Number of containers scanned in this 
iteration : 6, Number of unhealthy containers found in this iteration : 0
datanode_1  | 2019-10-08 16:45:01 INFO  ContainerMetadataScanner:88 - Completed 
an iteration of container metadata scrubber in 0 minutes. Number of  iterations 
(since the data-node restart) : 32, Number of containers scanned in this 
iteration : 6, Number of unhealthy containers found in this iteration : 0
{noformat}

The problem is that the elapsed time is measured in nanoseconds, while the 
configured interval is in milliseconds.
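The mismatch is easy to demonstrate in isolation with java.util.concurrent.TimeUnit (the values are illustrative):

```java
import java.util.concurrent.TimeUnit;

public class IntervalUnitDemo {
    public static void main(String[] args) {
        long intervalMs = TimeUnit.HOURS.toMillis(1);     // 3,600,000 ms configured
        long elapsedNanos = TimeUnit.SECONDS.toNanos(4);  // last run was 4 s ago

        // Buggy pattern (units mixed): 4,000,000,000 ns >= 3,600,000 "ms",
        // so the scanner believes a full hour has already passed and runs again.
        boolean buggyDue = elapsedNanos >= intervalMs;

        // Fixed: convert to a common unit before comparing.
        boolean fixedDue = TimeUnit.NANOSECONDS.toMillis(elapsedNanos) >= intervalMs;

        System.out.println(buggyDue + " " + fixedDue); // prints "true false"
    }
}
```

The million-fold unit error explains both symptoms: back-to-back runs after restart (every iteration looks overdue) and seemingly random spacing otherwise.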






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2019-10-08 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1283/

[Oct 7, 2019 4:04:36 AM] (shashikant) HDDS-2169. Avoid buffer copies while 
submitting client requests in
[Oct 7, 2019 7:38:08 AM] (aajisaka) HADOOP-16512. [hadoop-tools] Fix order of 
actual and expected expression
[Oct 7, 2019 9:35:39 AM] (elek) HDDS-2252. Enable gdpr robot test in daily build
[Oct 7, 2019 12:07:46 PM] (stevel) HADOOP-16587. Make ABFS AAD endpoints 
configurable.
[Oct 7, 2019 5:17:25 PM] (bharat) HDDS-2239. Fix TestOzoneFsHAUrls (#1600)
[Oct 7, 2019 6:44:30 PM] (surendralilhore) HDFS-14373. EC : Decoding is failing 
when block group last incomplete
[Oct 7, 2019 8:59:49 PM] (aengineer) HDDS-2238. Container Data Scrubber spams 
log in empty cluster
[Oct 7, 2019 9:10:57 PM] (aengineer) HDDS-2264. Improve output of 
TestOzoneContainer
[Oct 7, 2019 9:30:23 PM] (aengineer) HDDS-2259. Container Data Scrubber 
computes wrong checksum
[Oct 7, 2019 9:38:54 PM] (aengineer) HDDS-2262. SLEEP_SECONDS: command not found
[Oct 7, 2019 10:41:42 PM] (aengineer) HDDS-2245. Use dynamic ports for SCM in 
TestSecureOzoneCluster




-1 overall


The following subsystems voted -1:
asflicense compile findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
 
   Class org.apache.hadoop.applications.mawo.server.common.TaskStatus 
implements Cloneable but does not define or use clone method At 
TaskStatus.java:does not define or use clone method At TaskStatus.java:[lines 
39-346] 
   Equals method for 
org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument 
is of type WorkerId At WorkerId.java:the argument is of type WorkerId At 
WorkerId.java:[line 114] 
   
org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does 
not check for null argument At WorkerId.java:null argument At 
WorkerId.java:[lines 114-115] 

FindBugs :

   module:hadoop-cloud-storage-project/hadoop-cos 
   Redundant nullcheck of dir, which is known to be non-null in 
org.apache.hadoop.fs.cosn.BufferPool.createDir(String) Redundant null check at 
BufferPool.java:is known to be non-null in 
org.apache.hadoop.fs.cosn.BufferPool.createDir(String) Redundant null check at 
BufferPool.java:[line 66] 
   org.apache.hadoop.fs.cosn.CosNInputStream$ReadBuffer.getBuffer() may 
expose internal representation by returning CosNInputStream$ReadBuffer.buffer 
At CosNInputStream.java:by returning CosNInputStream$ReadBuffer.buffer At 
CosNInputStream.java:[line 87] 
   Found reliance on default encoding in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFile(String, File, 
byte[]):in org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFile(String, 
File, byte[]): new String(byte[]) At CosNativeFileSystemStore.java:[line 199] 
   Found reliance on default encoding in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFileWithRetry(String, 
InputStream, byte[], long):in 
org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFileWithRetry(String, 
InputStream, byte[], long): new String(byte[]) At 
CosNativeFileSystemStore.java:[line 178] 
   org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.uploadPart(File, 
String, String, int) may fail to clean up java.io.InputStream Obligation to 
clean up resource created at CosNativeFileSystemStore.java:fail to clean up 
java.io.InputStream Obligation to clean up resource created at 
CosNativeFileSystemStore.java:[line 252] is not discharged 

FindBugs :

   module:hadoop-ozone/csi 
   Useless control flow in 
csi.v1.Csi$CapacityRange$Builder.maybeForceBuilderInitialization() At Csi.java: 
At Csi.java:[line 15977] 
   Class csi.v1.Csi$ControllerExpandVolumeRequest defines non-transient 
non-serializable instance field secrets_ In Csi.java:instance field secrets_ In 
Csi.java 
   Useless control flow in 
csi.v1.Csi$ControllerExpandVolumeRequest$Builder.maybeForceBuilderInitialization()
 At Csi.java: At Csi.java:[line 

[jira] [Created] (HDFS-14902) NullPointer When Misconfigured

2019-10-08 Thread David Mollitor (Jira)
David Mollitor created HDFS-14902:
-

 Summary: NullPointer When Misconfigured
 Key: HDFS-14902
 URL: https://issues.apache.org/jira/browse/HDFS-14902
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: rbf
Affects Versions: 3.2.0
Reporter: David Mollitor


Admittedly the server was misconfigured, but the failure should be handled a bit more elegantly.

{code:none}
2019-10-08 11:19:52,505 ERROR router.NamenodeHeartbeatService: Unhandled 
exception updating NN registration for null:null
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.federation.protocol.proto.HdfsServerFederationProtos$NamenodeMembershipRecordProto$Builder.setServiceAddress(HdfsServerFederationProtos.java:3831)
at 
org.apache.hadoop.hdfs.server.federation.store.records.impl.pb.MembershipStatePBImpl.setServiceAddress(MembershipStatePBImpl.java:119)
at 
org.apache.hadoop.hdfs.server.federation.store.records.MembershipState.newInstance(MembershipState.java:108)
at 
org.apache.hadoop.hdfs.server.federation.resolver.MembershipNamenodeResolver.registerNamenode(MembershipNamenodeResolver.java:259)
at 
org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateState(NamenodeHeartbeatService.java:223)
at 
org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.periodicInvoke(NamenodeHeartbeatService.java:159)
at 
org.apache.hadoop.hdfs.server.federation.router.PeriodicService$1.run(PeriodicService.java:178)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}
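One way to make this more elegant (a sketch, not the actual RBF code) is to validate the configured service address up front and fail with a descriptive message, instead of letting the protobuf builder throw a bare NPE deep in the registration path:

```java
import java.util.Objects;

public class ConfigValidationDemo {
    // Hypothetical stand-in for the registration path: fail fast with a clear
    // message instead of a bare NullPointerException from the record builder.
    static void register(String serviceAddress) {
        Objects.requireNonNull(serviceAddress,
            "NameNode service address is not configured; "
            + "check dfs.namenode.service.rpc-address");
        // ... build MembershipState and register the namenode ...
    }

    public static void main(String[] args) {
        try {
            register(null);  // simulates the misconfigured server
        } catch (NullPointerException e) {
            System.out.println(e.getMessage().startsWith("NameNode service address"));
            // prints "true": the operator sees what to fix, not a bare NPE
        }
    }
}
```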






[jira] [Created] (HDFS-14901) RBF: Add Encryption Zone related ClientProtocol APIs

2019-10-08 Thread hemanthboyina (Jira)
hemanthboyina created HDFS-14901:


 Summary: RBF: Add Encryption Zone related ClientProtocol APIs
 Key: HDFS-14901
 URL: https://issues.apache.org/jira/browse/HDFS-14901
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: hemanthboyina
Assignee: hemanthboyina









Please cherry pick commits to lower branches

2019-10-08 Thread Wei-Chiu Chuang
I spent the whole of last week cherry-picking commits from trunk/branch-3.2 to
branch-3.1 (this should have been done prior to the 3.1.4 code freeze). There
were about 50-60 of them; many were conflict-free, and several were critical
bug fixes.

If your commit stays only in trunk, it is of no use to the community until the
next minor release, and for many months after that while people migrate to the
new release.

Here are a few tips:
(1) Dependency updates that address a known security vulnerability should be
cherry-picked into all lower branches, especially when they only bump the
maintenance release number. Example: updating commons-compress from 1.18 to
1.19.

(2) Blocker/critical bug fixes should be backported to all applicable
branches.

(3) Because of the removal of commons-logging and a few code refactors, a
commit may apply cleanly but fail to compile on branch-3.2, branch-3.1, or
lower branches. Please spend the time to verify that a backported commit is
good.

Best
Weichiu


Re: [DISCUSS] Release Docs pointers Hadoop site

2019-10-08 Thread Elek, Marton

To be honest, I have no idea. I don't know about the historical meaning.

But as there is no other feedback, here are my guesses based on pure logic:

 * current -> should point to the release with the highest number (3.2.1)
 * stable -> to the stable 3.x release with the highest number (3.2.1 
as of now)


current2 -> latest 2.x release
stable2 -> latest stable 2.x release

>> 1. But if the release manager of 3.1 line thinks 3.1.3 is stable, and 3.2
>> line is also in stable state, which release should get precedence to be
>> called as *stable* in any release line (2.x or 3.x) ?

It depends on whether stable2 means (the second-highest stable) or (the stable 
release from the 2.x line). I think the second meaning is more reasonable.


>> 3.1.3 is getting released now, could
>> http://hadoop.apache.org/docs/current/ shall be updated to 3.1.3 ? is it
>> the norms ?

No. The stable link should point to the highest stable release, not to the 
stable release that happened to be published most recently.


Marton

On 9/30/19 10:09 AM, Sunil Govindan wrote:

Bumping up this thread again for feedback.
@Zhankun Tang   is now waiting for a confirmation to
complete 3.1.3 release publish activities.

- Sunil

On Fri, Sep 27, 2019 at 11:03 AM Sunil Govindan  wrote:


Hi Folks,

At present,
http://hadoop.apache.org/docs/stable/  points to *Apache Hadoop 3.2.1*
http://hadoop.apache.org/docs/current/ points to *Apache Hadoop 3.2.1*
http://hadoop.apache.org/docs/stable2/  points to *Apache Hadoop 2.9.2*
http://hadoop.apache.org/docs/current2/ points to *Apache Hadoop 2.9.2*

3.2.1 was released the other day. *Now 3.1.3 has completed voting* and it is in
the final stages of staging.
As I see it,
a) http://hadoop.apache.org/docs/stable/ will still be pointing to 3.2.1?
b) http://hadoop.apache.org/docs/current/ should be pointing to 3.1.3?

Now my questions:
1. If the release manager of the 3.1 line thinks 3.1.3 is stable, and the 3.2
line is also in a stable state, which release should get precedence to be
called *stable* in each release line (2.x or 3.x)?
Or do we need a vote or discuss thread to decide which release shall be
called stable per release line?
2. Given that 3.2.1 is released and is pointed to as stable, when 3.1.3 is
getting released now, should http://hadoop.apache.org/docs/current/ be
updated to 3.1.3? Is that the norm?

Thanks
Sunil




