Re: [VOTE] Merge HDFS-12943 branch to trunk - Consistent Reads from Standby

2018-12-05 Thread Yongjun Zhang
Great work guys.

Wonder if we can elaborate on the impact of not having #2 fixed, and why #2
is not needed for the feature to be complete?
2. Need to fix automatic failover with ZKFC. Currently it does not know
about ObserverNodes and tries to convert them to SBNs.

Thanks.
--Yongjun


On Wed, Dec 5, 2018 at 5:27 PM Konstantin Shvachko 
wrote:

> Hi Hadoop developers,
>
> I would like to propose to merge to trunk the feature branch HDFS-12943 for
> Consistent Reads from Standby Node. The feature is intended to scale read
> RPC workloads. On large clusters reads comprise 95% of all RPCs to the
> NameNode. We should be able to accommodate higher overall RPC workloads (up
> to 4x by some estimates) by adding multiple ObserverNodes.
>
> The main functionality has been implemented; see the sub-tasks of HDFS-12943.
> We followed up with the test plan. Testing was done on two independent
> clusters (see HDFS-14058 and HDFS-14059) with security enabled.
> We ran standard HDFS commands, MR jobs, admin commands including manual
> failover.
> We know of one cluster running this feature in production.
>
> There are a few outstanding issues:
> 1. Need to provide proper documentation - a user guide for the new feature
> 2. Need to fix automatic failover with ZKFC. Currently it does not know
> about ObserverNodes and tries to convert them to SBNs.
> 3. Scale testing and performance fine-tuning
> 4. As testing progresses, we continue fixing non-critical bugs like
> HDFS-14116.
>
> I attached a unified patch to the umbrella jira for the review and Jenkins
> build.
> Please vote on this thread. The vote will run for 7 days until Wed Dec 12.
>
> Thanks,
> --Konstantin
>


Re: [DISCUSS] Hadoop RPC encryption performance improvements

2018-12-05 Thread Wei-Chiu Chuang
Thanks Daryn for your work. I saw you filed an upstream jira, HADOOP-15977,
and uploaded some patches for review.
I'm watching the jira and will review as soon as I can.

Best


On Wed, Oct 31, 2018 at 7:39 AM Daryn Sharp  wrote:

> Various KMS tasks have been delaying my RPC encryption work – which is 2nd
> on my TODO list.  It's becoming a top priority for us so I'll try my best to
> get a preliminary netty server patch (sans TLS) up this week if that helps.
>
> The two cited jiras had some critical flaws.  Skimming my comments, both
> use blocking IO (an obvious nonstarter).  HADOOP-10768 is a hand-rolled,
> TLS-like encryption scheme which I don't feel the community can or
> should maintain from a security standpoint.
>
> Daryn
>
> On Wed, Oct 31, 2018 at 8:43 AM Wei-Chiu Chuang 
> wrote:
>
>> Ping. Anyone? Cloudera is interested in moving forward with the RPC
>> encryption improvements, but I'd just like to get a consensus on which
>> approach to go with.
>>
>> Otherwise I'll pick HADOOP-10768 since it's ready for commit, and I've
>> spent time on testing it.
>>
>> On Thu, Oct 25, 2018 at 11:04 AM Wei-Chiu Chuang 
>> wrote:
>>
>> > Folks,
>> >
>> > I would like to invite all to discuss the various Hadoop RPC encryption
>> > performance improvements. As you probably know, Hadoop RPC encryption
>> > currently relies on Java SASL and has _really_ bad performance (in terms
>> > of the number of RPCs per second, around 15~20% of the rate without SASL).
>> >
>> > There have been some attempts to address this, most notably HADOOP-10768
>> > (Optimize Hadoop RPC encryption performance) and HADOOP-13836 (Securing
>> > Hadoop RPC using SSL). But it looks like neither attempt has been progressing.
>> >
>> > During the recent Hadoop contributor meetup, Daryn Sharp mentioned he's
>> > working on another approach that leverages Netty for its SSL encryption,
>> > and then integrates Netty with Hadoop RPC so that Hadoop RPC automatically
>> > benefits from Netty's SSL encryption performance.
>> >
>> > So there are at least 3 attempts to address this issue as I see it. Do we
>> > have a consensus on:
>> > 1. whether this is an important problem
>> > 2. which approach we want to move forward with
>> >
>> > --
>> > A very happy Hadoop contributor
>> >
>>
>>
>> --
>> A very happy Hadoop contributor
>>
>
>
> --
>
> Daryn
>
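To make the Netty approach discussed above concrete, here is a rough sketch of
a TLS-enabled Netty 4 server bootstrap. It is not taken from Daryn's
HADOOP-15977 patches; the port, certificate files, and the commented-out RPC
handler are placeholders:
{code:java}
import java.io.File;

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.ssl.SslContext;
import io.netty.handler.ssl.SslContextBuilder;

public class NettyTlsServerSketch {
  public static void main(String[] args) throws Exception {
    // Build an SslContext from a PEM certificate chain and private key.
    final SslContext sslCtx = SslContextBuilder
        .forServer(new File("server.crt"), new File("server.key"))
        .build();

    NioEventLoopGroup boss = new NioEventLoopGroup(1);
    NioEventLoopGroup workers = new NioEventLoopGroup();
    try {
      ServerBootstrap b = new ServerBootstrap()
          .group(boss, workers)
          .channel(NioServerSocketChannel.class)
          .childHandler(new ChannelInitializer<SocketChannel>() {
            @Override
            protected void initChannel(SocketChannel ch) {
              // TLS is a single non-blocking handler at the head of the
              // pipeline; the RPC codec would be appended after it.
              ch.pipeline().addLast(sslCtx.newHandler(ch.alloc()));
              // ch.pipeline().addLast(new HadoopRpcHandler()); // hypothetical
            }
          });
      b.bind(8020).sync().channel().closeFuture().sync();
    } finally {
      boss.shutdownGracefully();
      workers.shutdownGracefully();
    }
  }
}
{code}
The appeal over the cited jiras is that encryption becomes one non-blocking
pipeline handler maintained by Netty, rather than blocking IO or a hand-rolled
scheme.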


[jira] [Created] (HDFS-14129) RBF : Create new policy provider for router

2018-12-05 Thread Surendra Singh Lilhore (JIRA)
Surendra Singh Lilhore created HDFS-14129:
-

 Summary: RBF : Create new policy provider for router
 Key: HDFS-14129
 URL: https://issues.apache.org/jira/browse/HDFS-14129
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Affects Versions: HDFS-13532
Reporter: Surendra Singh Lilhore


Router is using *{{HDFSPolicyProvider}}*. We can't add a new protocol to this
class for the router; it is better to create a new policy provider for the Router.
{code:java}
// Set service-level authorization security policy
if (conf.getBoolean(HADOOP_SECURITY_AUTHORIZATION, false)) {
  this.adminServer.refreshServiceAcl(conf, new HDFSPolicyProvider());
}
{code}
I hit this issue while verifying HDFS-14079 on a secure cluster.
{noformat}
./bin/hdfs dfsrouteradmin -ls /
ls: Protocol interface org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocol is 
not known.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
 Protocol interface org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocol is 
not known.
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1520)
at org.apache.hadoop.ipc.Client.call(Client.java:1466)
{noformat}
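A minimal sketch of what a Router-specific provider could look like (the ACL
property key is an assumed name for illustration; RouterAdminProtocol is the
interface from the error above):
{code:java}
import org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocol;
import org.apache.hadoop.security.authorize.PolicyProvider;
import org.apache.hadoop.security.authorize.Service;

/** Sketch only: a policy provider covering the Router's own protocols. */
public class RouterPolicyProvider extends PolicyProvider {

  private static final Service[] SERVICES = new Service[] {
      // "security.router.admin.protocol.acl" is a hypothetical key name.
      new Service("security.router.admin.protocol.acl",
          RouterAdminProtocol.class)
  };

  @Override
  public Service[] getServices() {
    return SERVICES;
  }
}
{code}
The Router would then pass such a provider to refreshServiceAcl() in place of
HDFSPolicyProvider.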






Re: [VOTE] Merge HDFS-12943 branch to trunk - Consistent Reads from Standby

2018-12-05 Thread Zhe Zhang
+1 (binding)

Thanks Konstantin for leading the merge effort!

I worked very closely with Chen, Konstantin, and Erik in the testing stage,
and I feel confident that the feature has now completed its designed
functionality and has proven to be stable.

Great team work with contributors from multiple companies!

On Wed, Dec 5, 2018 at 5:27 PM Konstantin Shvachko 
wrote:

> Hi Hadoop developers,
>
> I would like to propose to merge to trunk the feature branch HDFS-12943 for
> Consistent Reads from Standby Node. The feature is intended to scale read
> RPC workloads. On large clusters reads comprise 95% of all RPCs to the
> NameNode. We should be able to accommodate higher overall RPC workloads (up
> to 4x by some estimates) by adding multiple ObserverNodes.
>
> The main functionality has been implemented; see the sub-tasks of HDFS-12943.
> We followed up with the test plan. Testing was done on two independent
> clusters (see HDFS-14058 and HDFS-14059) with security enabled.
> We ran standard HDFS commands, MR jobs, admin commands including manual
> failover.
> We know of one cluster running this feature in production.
>
> There are a few outstanding issues:
> 1. Need to provide proper documentation - a user guide for the new feature
> 2. Need to fix automatic failover with ZKFC. Currently it does not know
> about ObserverNodes and tries to convert them to SBNs.
> 3. Scale testing and performance fine-tuning
> 4. As testing progresses, we continue fixing non-critical bugs like
> HDFS-14116.
>
> I attached a unified patch to the umbrella jira for the review and Jenkins
> build.
> Please vote on this thread. The vote will run for 7 days until Wed Dec 12.
>
> Thanks,
> --Konstantin
>
-- 
Zhe Zhang
Apache Hadoop Committer
http://zhe-thoughts.github.io/about/ | @oldcap


[jira] [Created] (HDFS-14128) .Trash location

2018-12-05 Thread George Huang (JIRA)
George Huang created HDFS-14128:
---

 Summary: .Trash location
 Key: HDFS-14128
 URL: https://issues.apache.org/jira/browse/HDFS-14128
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.0.0
Reporter: George Huang
Assignee: George Huang


Currently some customers have user accounts that are functional IDs (FIDs) used
to manage applications and application data under the path /data/FID. These
FIDs also get a home directory under the /user path. The users' home directories
are limited by a 60 GB space quota. When these FIDs delete data, the customer's
deletion policy places it in the /user/<user>/.Trash location, and the accounts
run over quota.

For now they are increasing quotas for these functional users, but considering
growing applications they would like the .Trash location to be configurable,
e.g. something like /trash/{userid} that is owned by the user.

What should the configurable path look like to make this happen? For example,
one consideration is whether we want to configure it per user or per
cluster, etc.
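
One possible shape for this (purely a sketch, not a committed design): the trash
policy is already pluggable, typically via the fs.trash.classname setting, so a
custom TrashPolicy could redirect the trash root. The class below and the /trash
layout are hypothetical:
{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.TrashPolicyDefault;
import org.apache.hadoop.security.UserGroupInformation;

/** Hypothetical policy routing trash to /trash/<user> instead of the home dir. */
public class PerUserTrashPolicy extends TrashPolicyDefault {
  @Override
  public Path getCurrentTrashDir() {
    try {
      // Resolve the caller and point its trash at a user-owned /trash root.
      String user = UserGroupInformation.getCurrentUser().getShortUserName();
      return new Path("/trash/" + user + "/Current");
    } catch (IOException e) {
      throw new RuntimeException("Cannot determine current user", e);
    }
  }
}
{code}
Note this sketch ignores the encryption-zone case shown below, where the trash
root has to stay inside the zone.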

Here is current behavior:

fs.TrashPolicyDefault: Moved: 'hdfs://ns1/user/hdfs/test/1.txt' to trash at: 
hdfs://ns1/user/hdfs/.Trash/Current/user/hdfs/test/1.txt

for path under encryption zone:

fs.TrashPolicyDefault: Moved: 'hdfs://ns1/scale/2.txt' to trash at 
hdfs://ns1/scale/.Trash/hdfs/Current/scale/2.txt

 






[VOTE] Merge HDFS-12943 branch to trunk - Consistent Reads from Standby

2018-12-05 Thread Konstantin Shvachko
Hi Hadoop developers,

I would like to propose to merge to trunk the feature branch HDFS-12943 for
Consistent Reads from Standby Node. The feature is intended to scale read
RPC workloads. On large clusters reads comprise 95% of all RPCs to the
NameNode. We should be able to accommodate higher overall RPC workloads (up
to 4x by some estimates) by adding multiple ObserverNodes.
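
For readers unfamiliar with the setup, a minimal client-side sketch of opting
into observer reads (assuming the ObserverReadProxyProvider from this branch;
the nameservice "ns1" is a placeholder):
{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class ObserverReadClientSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    // Route this client through the observer-aware failover proxy provider.
    conf.set("dfs.client.failover.proxy.provider.ns1",
        "org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider");
    try (FileSystem fs = FileSystem.get(URI.create("hdfs://ns1"), conf)) {
      // Read RPCs like listStatus may now be served by an ObserverNode;
      // writes still go to the Active NameNode.
      fs.listStatus(new Path("/"));
    }
  }
}
{code}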

The main functionality has been implemented; see the sub-tasks of HDFS-12943.
We followed up with the test plan. Testing was done on two independent
clusters (see HDFS-14058 and HDFS-14059) with security enabled.
We ran standard HDFS commands, MR jobs, admin commands including manual
failover.
We know of one cluster running this feature in production.

There are a few outstanding issues:
1. Need to provide proper documentation - a user guide for the new feature
2. Need to fix automatic failover with ZKFC. Currently it does not know
about ObserverNodes and tries to convert them to SBNs.
3. Scale testing and performance fine-tuning
4. As testing progresses, we continue fixing non-critical bugs like
HDFS-14116.

I attached a unified patch to the umbrella jira for the review and Jenkins
build.
Please vote on this thread. The vote will run for 7 days until Wed Dec 12.

Thanks,
--Konstantin


[jira] [Resolved] (HDFS-14059) Test reads from standby on a secure cluster with Configured failover

2018-12-05 Thread Konstantin Shvachko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko resolved HDFS-14059.

Resolution: Done

Thanks [~zero45]! Some cool tests there with multiple observers. Great that
there were no problems with DTs and failover.
We still have an outstanding issue to support automatic failover with ZKFC. 
Will create a jira for that.
Closing this one as done.

> Test reads from standby on a secure cluster with Configured failover
> 
>
> Key: HDFS-14059
> URL: https://issues.apache.org/jira/browse/HDFS-14059
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Plamen Jeliazkov
>Priority: Major
>
> Run standard HDFS tests to verify reading from ObserverNode on a secure HA 
> cluster with {{ConfiguredFailoverProxyProvider}}.






[jira] [Resolved] (HDFS-14058) Test reads from standby on a secure cluster with IP failover

2018-12-05 Thread Konstantin Shvachko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko resolved HDFS-14058.

Resolution: Done

Thanks, [~vagarychen]!
This was a lot of testing. With all related issues resolved and retested I 
think we can close this one.
Load testing and performance tuning should go into the next step.

> Test reads from standby on a secure cluster with IP failover
> 
>
> Key: HDFS-14058
> URL: https://issues.apache.org/jira/browse/HDFS-14058
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Konstantin Shvachko
>Assignee: Chen Liang
>Priority: Major
> Attachments: dfsio_crs.no-crs.txt, dfsio_crs.with-crs.txt
>
>
> Run standard HDFS tests to verify reading from ObserverNode on a secure HA 
> cluster with {{IPFailoverProxyProvider}}.






[jira] [Created] (HDDS-903) Compile OzoneManager Ratis protobuf files with proto3 compiler

2018-12-05 Thread Hanisha Koneru (JIRA)
Hanisha Koneru created HDDS-903:
---

 Summary: Compile OzoneManager Ratis protobuf files with proto3 
compiler
 Key: HDDS-903
 URL: https://issues.apache.org/jira/browse/HDDS-903
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
Reporter: Hanisha Koneru
Assignee: Hanisha Koneru


Ratis requires the payload to be a proto3 ByteString. If the OM Ratis protos are
compiled using proto2, we would have to convert to proto3 for every request
submitted to Ratis. Instead, we can compile the OM's Ratis protocol files using
the proto3 compiler.
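
To illustrate the per-request cost being avoided, a schematic sketch (type names
are placeholders; in the real setup the proto2 runtime's ByteString is a
different class from the shaded proto3 one Ratis uses):
{code:java}
import com.google.protobuf.ByteString;
import com.google.protobuf.GeneratedMessageV3;

/** Schematic only: the extra copy paid when OM protos stay on proto2. */
public class OmRatisPayloadSketch {

  // proto2 path: serialize the message, then copy the bytes into the
  // ByteString type of the runtime Ratis links against.
  static ByteString fromProto2(com.google.protobuf.GeneratedMessage proto2Msg) {
    return ByteString.copyFrom(proto2Msg.toByteArray());
  }

  // proto3 path: the compiled message already yields a compatible ByteString.
  static ByteString fromProto3(GeneratedMessageV3 proto3Msg) {
    return proto3Msg.toByteString();
  }
}
{code}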






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2018-12-05 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/

[Dec 4, 2018 1:39:10 PM] (stevel) HADOOP-15968. ABFS: add try catch for UGI 
failure when initializing
[Dec 4, 2018 1:45:29 PM] (aajisaka) HADOOP-15970. Upgrade plexus-utils from 
2.0.5 to 3.1.0.
[Dec 4, 2018 2:42:35 PM] (msingh) HDDS-890. Handle OverlappingFileLockException 
during
[Dec 4, 2018 3:35:43 PM] (stevel) HADOOP-15966. Hadoop Kerberos broken on macos 
as
[Dec 4, 2018 6:08:45 PM] (yufei) YARN-9041. Performance Optimization of method
[Dec 4, 2018 8:57:28 PM] (gifuma) Revert "HADOOP-15852. Refactor QuotaUsage. 
Contributed by Beluga Behr."
[Dec 4, 2018 9:44:03 PM] (jlowe) HADOOP-15974. Upgrade Curator version to 
2.13.0 to fix ZK tests.
[Dec 4, 2018 10:13:06 PM] (wangda) Revert "YARN-8870. [Submarine] Add submarine 
installation scripts. (Xun




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.util.TestReadWriteDiskValidator 
   hadoop.registry.secure.TestSecureLogins 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/diff-compile-javac-root.txt
  [336K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/diff-checkstyle-root.txt
  [17M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/diff-patch-pylint.txt
  [40K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/diff-patch-shelldocs.txt
  [12K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/whitespace-eol.txt
  [9.3M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/whitespace-tabs.txt
  [1.1M]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-hdds_client.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-hdds_container-service.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-hdds_framework.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-hdds_server-scm.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-hdds_tools.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-ozone_client.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-ozone_common.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-ozone_objectstore-service.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-ozone_ozone-manager.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-ozone_ozonefs.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-ozone_s3gateway.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/branch-findbugs-hadoop-ozone_tools.txt
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/diff-javadoc-javadoc-root.txt
  [752K]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
  [164K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/patch-unit-hadoop-common-project_hadoop-registry.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/978/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [324K]
 

[jira] [Created] (HDFS-14127) Add a description about the observer read configuration

2018-12-05 Thread xiangheng (JIRA)
xiangheng created HDFS-14127:


 Summary: Add a description about the observer read configuration
 Key: HDFS-14127
 URL: https://issues.apache.org/jira/browse/HDFS-14127
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: xiangheng


hdfs-default.xml lacks a description of the observer read configuration, which
can easily lead users to configure observer read mode as if it were a normal HA
mode.


