[jira] [Created] (HDFS-14383) Compute datanode load based on StoragePolicy

2019-03-19 Thread Karthik Palanisamy (JIRA)
Karthik Palanisamy created HDFS-14383:
-

 Summary: Compute datanode load based on StoragePolicy
 Key: HDFS-14383
 URL: https://issues.apache.org/jira/browse/HDFS-14383
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs, namenode
Affects Versions: 3.1.2, 2.7.3
Reporter: Karthik Palanisamy
Assignee: Karthik Palanisamy


The datanode load check logic needs to be changed because the existing 
computation does not consider the StoragePolicy.

DatanodeManager#getInServiceXceiverAverage

{code}

public double getInServiceXceiverAverage() {
  double avgLoad = 0;
  final int nodes = getNumDatanodesInService();
  if (nodes != 0) {
    final int xceivers = heartbeatManager
        .getInServiceXceiverCount();
    avgLoad = (double) xceivers / nodes;
  }
  return avgLoad;
}

{code}

 

For example: with 10 nodes (HOT) averaging 50 xceivers and 90 nodes (COLD) 
averaging 10 xceivers, the threshold calculated by the NN is 28 
(((500 + 900)/100)*2), which means those 10 nodes (the whole HOT tier) become 
unavailable while the COLD-tier nodes are barely in use. Turning this check 
off helps to mitigate the issue; however, dfs.namenode.replication.considerLoad 
is what keeps the load of the DNs "balanced", so turning it off can lead to 
situations where specific DNs become "overloaded".
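
A self-contained sketch of the per-tier averaging this description points 
toward is below; the class, the tier enum, and the per-tier counters are 
illustrative stand-ins (not existing DatanodeManager fields or API), and the 
factor of 2 simply matches the multiplier used in the example above.

{code}
import java.util.EnumMap;
import java.util.Map;

// Illustrative sketch only (stand-in class, not DatanodeManager): track
// in-service node counts and xceiver counts per storage tier so the average
// load -- and hence the busy-node threshold -- is computed within a tier.
public class PerTierXceiverAverage {
  enum Tier { HOT, COLD }

  private final Map<Tier, Integer> inServiceNodes = new EnumMap<>(Tier.class);
  private final Map<Tier, Integer> inServiceXceivers = new EnumMap<>(Tier.class);

  double getInServiceXceiverAverage(Tier tier) {
    int nodes = inServiceNodes.getOrDefault(tier, 0);
    if (nodes == 0) {
      return 0;
    }
    return (double) inServiceXceivers.getOrDefault(tier, 0) / nodes;
  }

  public static void main(String[] args) {
    PerTierXceiverAverage dm = new PerTierXceiverAverage();
    dm.inServiceNodes.put(Tier.HOT, 10);      // 10 HOT nodes ...
    dm.inServiceXceivers.put(Tier.HOT, 500);  // ... averaging 50 xceivers
    dm.inServiceNodes.put(Tier.COLD, 90);     // 90 COLD nodes ...
    dm.inServiceXceivers.put(Tier.COLD, 900); // ... averaging 10 xceivers
    // Per-tier thresholds (x2): HOT -> 100.0, COLD -> 20.0, instead of the
    // single cluster-wide threshold of 28 computed today.
    for (Tier t : Tier.values()) {
      System.out.println(t + " threshold: " + dm.getInServiceXceiverAverage(t) * 2);
    }
  }
}
{code}

Under such a scheme a HOT node with 50 xceivers would stay well under its own 
tier's threshold instead of being excluded by the cluster-wide average.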



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-747) Update MiniOzoneCluster to work with security protocol from SCM

2019-03-19 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao resolved HDDS-747.
-
Resolution: Invalid

This won't work because different components require separate Kerberos logins 
with different principals in the same JVM. We will look into 
[https://www.testcontainers.org/] to test secure docker compose in the next 
release.

> Update MiniOzoneCluster to work with security protocol from SCM
> ---
>
> Key: HDDS-747
> URL: https://issues.apache.org/jira/browse/HDDS-747
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Priority: Major
>  Labels: ozone-security
>
> [HDDS-103] introduces a new security protocol in SCM. MiniOzoneCluster should 
> be updated to utilize it once its implementation is completed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Docker build process

2019-03-19 Thread Eric Yang
Hi Jonathan,

Thank you for your input.  A Google query for dockerfile-maven-plugin 
site:github.com returns 15,300 matches, and 377 matches when restricted to 
Apache-hosted projects.  I see that many projects opt in to using a profile to 
avoid building docker images every time, while others keep the process inline.  
People have the right to opt out of using the effective root user to compile 
by passing the -DskipDocker flag.  Hence, the effective root user requirement 
is not permanent.

People did not change their viewpoints after the discussions in this email 
thread.  I understand that no one likes disruptive changes.  I don't expect 
that calling a vote on this issue will change the outcome.  There are 
sufficient facts presented from both points of view in this email thread.  I 
feel enough push back from the community on a mandatory inline process, and I 
am flexible about making the change to a profile-based process.  I don't need 
to feel guilty for implementing a half-baked release process, and I respect 
the community decision.  Let's digest the presented facts for the rest of the 
day.  I am OK with not calling the vote unless others think a voting procedure 
is required.

Regards,
Eric

From: Jonathan Eagles 
Date: Tuesday, March 19, 2019 at 11:48 AM
To: Eric Yang 
Cc: "Elek, Marton" , Hadoop Common 
, "yarn-...@hadoop.apache.org" 
, Hdfs-dev , Eric 
Badger , Eric Payne , 
Jim Brennan 
Subject: Re: [DISCUSS] Docker build process

This email discussion thread is the result of failing to reach consensus in the 
JIRA. If you participate in this discussion thread, please recognize that a 
considerable effort has been made by contributors on this JIRA. On the other 
hand, contributors to this JIRA need to listen carefully to the comments in 
this discussion thread since they represent the thoughts and voices of the open 
source community that will a) benefit from and b) bear the burden of this 
feature. Failing to listen to these voices is failing to deliver a feature in 
its best form.

My thoughts-

As shown from my comments on YARN-7129, I have particular concerns that 
resonate with other posters on this thread.
https://issues.apache.org/jira/browse/YARN-7129?focusedCommentId=16790842=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16790842
- Docker images don't evolve at the same rate as Hadoop (tends to favor a 
separate release cycle, perhaps project)
- Docker images could have many flavors and favoring one flavor (say ubuntu, or 
windows) over another takes away from Apache Hadoop's platform neutral stance 
(providing a single "one image fits all" stance is optimistic).
- Introduces release processes that could limit the community's ability to 
produce releases at a regular rate. (Effective root user permissions needed to 
create image limiting who can release, extra Docker image only releases)
- In addition, I worry this sends a complicated message to our consumers and 
will stagnate release adoption.

> I will make adjustment accordingly unless 7 more people comes out and say 
> otherwise.

I'm sorry if this is a bit of humor which is lost on me. However, Apache Hadoop 
has a set of bylaws that dictate the community's process on decision making.
https://hadoop.apache.org/bylaws.html

Best Regards,
jeagles


Re: [DISCUSS] Docker build process

2019-03-19 Thread Eric Yang
Hi Arpit,

On Docker Hub, Hadoop is tagged with version identifiers that look like 
docker-hadoop-runner-latest or jdk11.  It is hard to tell whether the jdk11 
image is Hadoop 3 or Hadoop 2 because there is no consistency in tag format 
usage.  This is my reasoning against tagging however one's heart desires: 
flexible naming causes confusion over the long run.

There is a good article on performing a Maven release with the M2 Release 
Plugin in Jenkins: https://dzone.com/articles/running-maven-release-plugin
Jenkins performs the Maven release, tags the source code with the version 
number, automatically uploads the artifacts to Nexus, and then resets the 
version number to the next SNAPSHOT.  If the dockerfile plugin is used, it can 
upload the artifact to Docker Hub as part of the release.

The proposed adjustment is to put the docker build in a Maven profile.  Users 
who want to build it will need to add the -Pdocker flag to trigger the build.

Regards,
Eric

On 3/19/19, 12:48 PM, "Arpit Agarwal"  wrote:

Hi Eric,

> Dockerfile is most likely to change to apply the security fix.

I am not sure this is always. Marton’s point about revising docker images 
independent of Hadoop versions is valid. 


> When maven release is automated through Jenkins, this is a breeze
> of clicking a button.  Jenkins even increment the target version
> automatically with option to edit. 

I did not understand this suggestion. Could you please explain in simpler 
terms or share a link to the description?


> I will make adjustment accordingly unless 7 more people comes
> out and say otherwise.

What adjustment is this?

Thanks,
Arpit


> On Mar 19, 2019, at 10:19 AM, Eric Yang  wrote:
> 
> Hi Marton,
> 
> Thank you for your input.  I agree with most of what you said with a few 
exceptions.  Security fix should result in a different version of the image 
instead of replace of a certain version.  Dockerfile is most likely to change 
to apply the security fix.  If it did not change, the source has instability 
over time, and result in non-buildable code over time.  When maven release is 
automated through Jenkins, this is a breeze of clicking a button.  Jenkins even 
increment the target version automatically with option to edit.  It makes 
release manager's job easier than Homer Simpson's job.
> 
> If versioning is done correctly, older branches can have the same docker 
subproject, and Hadoop 2.7.8 can be released for older Hadoop branches.  We 
don't generate timeline paradox to allow changing the history of Hadoop 2.7.1.  
That release has passed and let it stay that way.
> 
> There are mounting evidence that Hadoop community wants docker profile 
for developer image.  Precommit build will not catch some build errors because 
more codes are allowed to slip through using profile build process.  I will 
make adjustment accordingly unless 7 more people comes out and say otherwise.
> 
> Regards,
> Eric
> 
> On 3/19/19, 1:18 AM, "Elek, Marton"  wrote:
> 
> 
> 
>Thank you Eric to describe the problem.
> 
>I have multiple small comments, trying to separate them.
> 
>I. separated vs in-build container image creation
> 
>> The disadvantages are:
>> 
>> 1.  Require developer to have access to docker.
>> 2.  Default build takes longer.
> 
> 
>These are not the only disadvantages (IMHO) as I wrote it in in the
>previous thread and the issue [1]
> 
>Using in-build container image creation doesn't enable:
> 
>1. to modify the image later (eg. apply security fixes to the container
>itself or apply improvements for the startup scripts)
>2. create images for older releases (eg. hadoop 2.7.1)
> 
>I think there are two kind of images:
> 
>a) images for released artifacts
>b) developer images
> 
>I would prefer to manage a) with separated branch repositories but b)
>with (optional!) in-build process.
> 
>II. Agree with Steve. I think it's better to make it optional as most 
of
>the time it's not required. I think it's better to support the default
>dev build with the default settings (=just enough to start)
> 
>III. Maven best practices
> 
>(https://dzone.com/articles/maven-profile-best-practices)
> 
>I think this is a good article. But this is not against profiles but
>creating multiple versions from the same artifact with the same name
>(eg. jdk8/jdk11). In Hadoop, profiles are used to introduce optional
>steps. I think it's fine as the maven lifecycle/phase model is very
>static (compare it with the tree based approach in Gradle).
> 
>Marton
> 
>[1]: https://issues.apache.org/jira/browse/HADOOP-16091
> 
>On 3/13/19 11:24 PM, Eric Yang 

[jira] [Created] (HDDS-1312) Add more unit tests to verify BlockOutputStream functionalities

2019-03-19 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-1312:
-

 Summary: Add more unit tests to verify BlockOutputStream 
functionalities
 Key: HDDS-1312
 URL: https://issues.apache.org/jira/browse/HDDS-1312
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Affects Versions: 0.5.0
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee
 Fix For: 0.5.0


This jira aims to add more unit test coverage for BlockOutputStream 
functionalities.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Docker build process

2019-03-19 Thread Arpit Agarwal
Hi Eric,

> Dockerfile is most likely to change to apply the security fix.

I am not sure this is always the case. Marton’s point about revising docker 
images independently of Hadoop versions is valid. 


> When maven release is automated through Jenkins, this is a breeze
> of clicking a button.  Jenkins even increment the target version
> automatically with option to edit. 

I did not understand this suggestion. Could you please explain in simpler terms 
or share a link to the description?


> I will make adjustment accordingly unless 7 more people comes
> out and say otherwise.

What adjustment is this?

Thanks,
Arpit


> On Mar 19, 2019, at 10:19 AM, Eric Yang  wrote:
> 
> Hi Marton,
> 
> Thank you for your input.  I agree with most of what you said with a few 
> exceptions.  Security fix should result in a different version of the image 
> instead of replace of a certain version.  Dockerfile is most likely to change 
> to apply the security fix.  If it did not change, the source has instability 
> over time, and result in non-buildable code over time.  When maven release is 
> automated through Jenkins, this is a breeze of clicking a button.  Jenkins 
> even increment the target version automatically with option to edit.  It 
> makes release manager's job easier than Homer Simpson's job.
> 
> If versioning is done correctly, older branches can have the same docker 
> subproject, and Hadoop 2.7.8 can be released for older Hadoop branches.  We 
> don't generate timeline paradox to allow changing the history of Hadoop 
> 2.7.1.  That release has passed and let it stay that way.
> 
> There are mounting evidence that Hadoop community wants docker profile for 
> developer image.  Precommit build will not catch some build errors because 
> more codes are allowed to slip through using profile build process.  I will 
> make adjustment accordingly unless 7 more people comes out and say otherwise.
> 
> Regards,
> Eric
> 
> On 3/19/19, 1:18 AM, "Elek, Marton"  wrote:
> 
> 
> 
>Thank you Eric to describe the problem.
> 
>I have multiple small comments, trying to separate them.
> 
>I. separated vs in-build container image creation
> 
>> The disadvantages are:
>> 
>> 1.  Require developer to have access to docker.
>> 2.  Default build takes longer.
> 
> 
>These are not the only disadvantages (IMHO) as I wrote it in in the
>previous thread and the issue [1]
> 
>Using in-build container image creation doesn't enable:
> 
>1. to modify the image later (eg. apply security fixes to the container
>itself or apply improvements for the startup scripts)
>2. create images for older releases (eg. hadoop 2.7.1)
> 
>I think there are two kind of images:
> 
>a) images for released artifacts
>b) developer images
> 
>I would prefer to manage a) with separated branch repositories but b)
>with (optional!) in-build process.
> 
>II. Agree with Steve. I think it's better to make it optional as most of
>the time it's not required. I think it's better to support the default
>dev build with the default settings (=just enough to start)
> 
>III. Maven best practices
> 
>(https://dzone.com/articles/maven-profile-best-practices)
> 
>I think this is a good article. But this is not against profiles but
>creating multiple versions from the same artifact with the same name
>(eg. jdk8/jdk11). In Hadoop, profiles are used to introduce optional
>steps. I think it's fine as the maven lifecycle/phase model is very
>static (compare it with the tree based approach in Gradle).
> 
>Marton
> 
>[1]: https://issues.apache.org/jira/browse/HADOOP-16091
> 
>On 3/13/19 11:24 PM, Eric Yang wrote:
>> Hi Hadoop developers,
>> 
>> In the recent months, there were various discussions on creating docker 
>> build process for Hadoop.  There was convergence to make docker build 
>> process inline in the mailing list last month when Ozone team is planning 
>> new repository for Hadoop/ozone docker images.  New feature has started to 
>> add docker image build process inline in Hadoop build.
>> A few lessons learnt from making docker build inline in YARN-7129.  The 
>> build environment must have docker to have a successful docker build.  
>> BUILD.txt stated for easy build environment use Docker.  There is logic in 
>> place to ensure that absence of docker does not trigger docker build.  The 
>> inline process tries to be as non-disruptive as possible to existing 
>> development environment with one exception.  If docker’s presence is 
>> detected, but user does not have rights to run docker.  This will cause the 
>> build to fail.
>> 
>> Now, some developers are pushing back on inline docker build process because 
>> existing environment did not make docker build process mandatory.  However, 
>> there are benefits to use inline docker build process.  The listed benefits 
>> are:
>> 
>> 1.  Source code tag, maven repository artifacts and docker hub artifacts can 
>> all be 

Re: [DISCUSS] Docker build process

2019-03-19 Thread Jonathan Eagles
This email discussion thread is the result of failing to reach consensus in
the JIRA. If you participate in this discussion thread, please recognize
that a considerable effort has been made by contributors on this JIRA. On
the other hand, contributors to this JIRA need to listen carefully to the
comments in this discussion thread since they represent the thoughts and
voices of the open source community that will a) benefit from and b) bear
the burden of this feature. Failing to listen to these voices is failing to
deliver a feature in its best form.

My thoughts-

As shown from my comments on YARN-7129, I have particular concerns that
resonate with other posters on this thread.
https://issues.apache.org/jira/browse/YARN-7129?focusedCommentId=16790842=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16790842
- Docker images don't evolve at the same rate as Hadoop (tends to favor a
separate release cycle, perhaps project)
- Docker images could have many flavors and favoring one flavor (say
ubuntu, or windows) over another takes away from Apache Hadoop's platform
neutral stance (providing a single "one image fits all" stance is
optimistic).
- Introduces release processes that could limit the community's ability to
produce releases at a regular rate. (Effective root user permissions needed
to create image limiting who can release, extra Docker image only releases)
- In addition, I worry this sends a complicated message to our consumers and
will stagnate release adoption.

> I will make adjustment accordingly unless 7 more people comes out and say
otherwise.

I'm sorry if this is a bit of humor which is lost on me. However, Apache
Hadoop has a set of bylaws that dictate the community's process on decision
making.
https://hadoop.apache.org/bylaws.html

Best Regards,
jeagles


[jira] [Created] (HDDS-1311) Make Install Snapshot option configurable

2019-03-19 Thread Hanisha Koneru (JIRA)
Hanisha Koneru created HDDS-1311:


 Summary: Make Install Snapshot option configurable
 Key: HDDS-1311
 URL: https://issues.apache.org/jira/browse/HDDS-1311
 Project: Hadoop Distributed Data Store
  Issue Type: New Feature
Reporter: Hanisha Koneru
Assignee: Hanisha Koneru


This Jira aims to make the install snapshot command from leader to follower 
configurable. By default, install snapshot should be enabled. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Docker build process

2019-03-19 Thread Jim Brennan
I agree with Steve and Marton.   I am ok with having the docker build as an
option, but I don't want it to be the default.
Jim


On Tue, Mar 19, 2019 at 12:19 PM Eric Yang  wrote:

> Hi Marton,
>
> Thank you for your input.  I agree with most of what you said with a few
> exceptions.  Security fix should result in a different version of the image
> instead of replace of a certain version.  Dockerfile is most likely to
> change to apply the security fix.  If it did not change, the source has
> instability over time, and result in non-buildable code over time.  When
> maven release is automated through Jenkins, this is a breeze of clicking a
> button.  Jenkins even increment the target version automatically with
> option to edit.  It makes release manager's job easier than Homer Simpson's
> job.
>
> If versioning is done correctly, older branches can have the same docker
> subproject, and Hadoop 2.7.8 can be released for older Hadoop branches.  We
> don't generate timeline paradox to allow changing the history of Hadoop
> 2.7.1.  That release has passed and let it stay that way.
>
> There are mounting evidence that Hadoop community wants docker profile for
> developer image.  Precommit build will not catch some build errors because
> more codes are allowed to slip through using profile build process.  I will
> make adjustment accordingly unless 7 more people comes out and say
> otherwise.
>
> Regards,
> Eric
>
> On 3/19/19, 1:18 AM, "Elek, Marton"  wrote:
>
>
>
> Thank you Eric to describe the problem.
>
> I have multiple small comments, trying to separate them.
>
> I. separated vs in-build container image creation
>
> > The disadvantages are:
> >
> > 1.  Require developer to have access to docker.
> > 2.  Default build takes longer.
>
>
> These are not the only disadvantages (IMHO) as I wrote it in in the
> previous thread and the issue [1]
>
> Using in-build container image creation doesn't enable:
>
> 1. to modify the image later (eg. apply security fixes to the container
> itself or apply improvements for the startup scripts)
> 2. create images for older releases (eg. hadoop 2.7.1)
>
> I think there are two kind of images:
>
> a) images for released artifacts
> b) developer images
>
> I would prefer to manage a) with separated branch repositories but b)
> with (optional!) in-build process.
>
> II. Agree with Steve. I think it's better to make it optional as most
> of
> the time it's not required. I think it's better to support the default
> dev build with the default settings (=just enough to start)
>
> III. Maven best practices
>
> (https://dzone.com/articles/maven-profile-best-practices)
>
> I think this is a good article. But this is not against profiles but
> creating multiple versions from the same artifact with the same name
> (eg. jdk8/jdk11). In Hadoop, profiles are used to introduce optional
> steps. I think it's fine as the maven lifecycle/phase model is very
> static (compare it with the tree based approach in Gradle).
>
> Marton
>
> [1]: https://issues.apache.org/jira/browse/HADOOP-16091
>
> On 3/13/19 11:24 PM, Eric Yang wrote:
> > Hi Hadoop developers,
> >
> > In the recent months, there were various discussions on creating
> docker build process for Hadoop.  There was convergence to make docker
> build process inline in the mailing list last month when Ozone team is
> planning new repository for Hadoop/ozone docker images.  New feature has
> started to add docker image build process inline in Hadoop build.
> > A few lessons learnt from making docker build inline in YARN-7129.
> The build environment must have docker to have a successful docker build.
> BUILD.txt stated for easy build environment use Docker.  There is logic in
> place to ensure that absence of docker does not trigger docker build.  The
> inline process tries to be as non-disruptive as possible to existing
> development environment with one exception.  If docker’s presence is
> detected, but user does not have rights to run docker.  This will cause the
> build to fail.
> >
> > Now, some developers are pushing back on inline docker build process
> because existing environment did not make docker build process mandatory.
> However, there are benefits to use inline docker build process.  The listed
> benefits are:
> >
> > 1.  Source code tag, maven repository artifacts and docker hub
> artifacts can all be produced in one build.
> > 2.  Less manual labor to tag different source branches.
> > 3.  Reduce intermediate build caches that may exist in multi-stage
> builds.
> > 4.  Release engineers and developers do not need to search a maze of
> build flags to acquire artifacts.
> >
> > The disadvantages are:
> >
> > 1.  Require developer to have access to docker.
> > 2.  Default build takes longer.
> >
> > There is workaround for above 

Re: [DISCUSS] Docker build process

2019-03-19 Thread Eric Yang
Hi Marton,

Thank you for your input.  I agree with most of what you said, with a few 
exceptions.  A security fix should result in a different version of the image 
instead of replacing a certain version.  The Dockerfile is most likely to 
change to apply the security fix.  If it did not change, the source has 
instability over time and results in non-buildable code over time.  When the 
maven release is automated through Jenkins, this is a breeze of clicking a 
button.  Jenkins even increments the target version automatically, with an 
option to edit.  It makes the release manager's job easier than Homer 
Simpson's job.

If versioning is done correctly, older branches can have the same docker 
subproject, and Hadoop 2.7.8 can be released for older Hadoop branches.  We 
don't generate a timeline paradox by allowing the history of Hadoop 2.7.1 to 
change.  That release has passed; let it stay that way.

There is mounting evidence that the Hadoop community wants a docker profile 
for the developer image.  The precommit build will not catch some build errors 
because more code is allowed to slip through using a profile build process.  
I will make adjustments accordingly unless 7 more people come out and say 
otherwise.

Regards,
Eric

On 3/19/19, 1:18 AM, "Elek, Marton"  wrote:



Thank you Eric to describe the problem.

I have multiple small comments, trying to separate them.

I. separated vs in-build container image creation

> The disadvantages are:
>
> 1.  Require developer to have access to docker.
> 2.  Default build takes longer.


These are not the only disadvantages (IMHO) as I wrote it in in the
previous thread and the issue [1]

Using in-build container image creation doesn't enable:

1. to modify the image later (eg. apply security fixes to the container
itself or apply improvements for the startup scripts)
2. create images for older releases (eg. hadoop 2.7.1)

I think there are two kind of images:

a) images for released artifacts
b) developer images

I would prefer to manage a) with separated branch repositories but b)
with (optional!) in-build process.

II. Agree with Steve. I think it's better to make it optional as most of
the time it's not required. I think it's better to support the default
dev build with the default settings (=just enough to start)

III. Maven best practices

(https://dzone.com/articles/maven-profile-best-practices)

I think this is a good article. But this is not against profiles but
creating multiple versions from the same artifact with the same name
(eg. jdk8/jdk11). In Hadoop, profiles are used to introduce optional
steps. I think it's fine as the maven lifecycle/phase model is very
static (compare it with the tree based approach in Gradle).

Marton

[1]: https://issues.apache.org/jira/browse/HADOOP-16091

On 3/13/19 11:24 PM, Eric Yang wrote:
> Hi Hadoop developers,
> 
> In the recent months, there were various discussions on creating docker 
build process for Hadoop.  There was convergence to make docker build process 
inline in the mailing list last month when Ozone team is planning new 
repository for Hadoop/ozone docker images.  New feature has started to add 
docker image build process inline in Hadoop build.
> A few lessons learnt from making docker build inline in YARN-7129.  The 
build environment must have docker to have a successful docker build.  
BUILD.txt stated for easy build environment use Docker.  There is logic in 
place to ensure that absence of docker does not trigger docker build.  The 
inline process tries to be as non-disruptive as possible to existing 
development environment with one exception.  If docker’s presence is detected, 
but user does not have rights to run docker.  This will cause the build to fail.
> 
> Now, some developers are pushing back on inline docker build process 
because existing environment did not make docker build process mandatory.  
However, there are benefits to use inline docker build process.  The listed 
benefits are:
> 
> 1.  Source code tag, maven repository artifacts and docker hub artifacts 
can all be produced in one build.
> 2.  Less manual labor to tag different source branches.
> 3.  Reduce intermediate build caches that may exist in multi-stage builds.
> 4.  Release engineers and developers do not need to search a maze of 
build flags to acquire artifacts.
> 
> The disadvantages are:
> 
> 1.  Require developer to have access to docker.
> 2.  Default build takes longer.
> 
> There is workaround for above disadvantages by using -DskipDocker flag to 
avoid docker build completely or -pl !modulename to bypass subprojects.
> Hadoop development did not follow Maven best practice because a full 
Hadoop build requires a number of profile and configuration parameters.  Some 

[jira] [Created] (HDFS-14382) The hdfs fsck command docs do not explain the meaning of the reported fields

2019-03-19 Thread Daniel Templeton (JIRA)
Daniel Templeton created HDFS-14382:
---

 Summary: The hdfs fsck command docs do not explain the meaning of 
the reported fields
 Key: HDFS-14382
 URL: https://issues.apache.org/jira/browse/HDFS-14382
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Affects Versions: 3.2.0
Reporter: Daniel Templeton


The {{hdfs fsck}} command shows something like:

{noformat}FSCK started by root (auth:SIMPLE) from /172.17.0.2 for path /tmp at 
Tue Mar 19 15:50:24 UTC 2019
.Status: HEALTHY
 Total size:179159051 B
 Total dirs:11
 Total files:   1
 Total symlinks:0
 Total blocks (validated):  2 (avg. block size 89579525 B)
 Minimally replicated blocks:   2 (100.0 %)
 Over-replicated blocks:0 (0.0 %)
 Under-replicated blocks:   0 (0.0 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor:1
 Average block replication: 1.0
 Corrupt blocks:0
 Missing replicas:  0 (0.0 %)
 Number of data-nodes:  1
 Number of racks:   1
FSCK ended at Tue Mar 19 15:50:24 UTC 2019 in 3 milliseconds


The filesystem under path '/tmp' is HEALTHY{noformat}

The fields are presumed to be self-explanatory, but I think that's a bold 
assumption.  In particular, it's not obvious how "mis-replicated" blocks differ 
from "under-replicated" or "over-replicated" blocks.  It would be nice to 
explain the meaning of all the fields clearly in the docs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-14381) Add option to hdfs dfs -cat to ignore corrupt blocks

2019-03-19 Thread Daniel Templeton (JIRA)
Daniel Templeton created HDFS-14381:
---

 Summary: Add option to hdfs dfs -cat to ignore corrupt blocks
 Key: HDFS-14381
 URL: https://issues.apache.org/jira/browse/HDFS-14381
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: tools
Affects Versions: 3.2.0
Reporter: Daniel Templeton


If I have a file in HDFS that contains 100 blocks, and I happen to lose the 
first block (for whatever obscure/unlikely/dumb reason), I can no longer access 
the 99% of the file that's still there and accessible.  In the case of some 
data formats (e.g. text), the remaining data may still be useful.  It would be 
nice to have a way to extract the remaining data without having to manually 
reassemble the file contents from the block files.  Something like {{hdfs dfs 
-cat -ignoreCorrupt }}.  It could insert some marker to show where the 
missing blocks are.
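
A rough client-side sketch of what such an option might do, under the 
assumptions in this description: read sequentially, and when a block cannot be 
served (e.g. a {{BlockMissingException}}), print a marker and seek to the next 
block boundary.  This is only an illustration of the idea, not the proposed 
implementation or an existing {{hdfs dfs}} flag.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.BlockMissingException;

// Illustrative sketch: "cat" a file but skip blocks that cannot be read,
// emitting a marker where data is missing.
public class CatIgnoreCorrupt {
  public static void main(String[] args) throws Exception {
    Path path = new Path(args[0]);
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(conf);
         FSDataInputStream in = fs.open(path)) {
      long blockSize = fs.getFileStatus(path).getBlockSize();
      long length = fs.getFileStatus(path).getLen();
      byte[] buf = new byte[8192];
      while (in.getPos() < length) {
        try {
          int n = in.read(buf);
          if (n < 0) {
            break;
          }
          System.out.write(buf, 0, n);
        } catch (BlockMissingException e) {
          // Marker for the missing range, then skip to the next block boundary.
          long nextBlock = (in.getPos() / blockSize + 1) * blockSize;
          System.out.println("<missing block at offset " + in.getPos() + ">");
          in.seek(Math.min(nextBlock, length));
        }
      }
      System.out.flush();
    }
  }
}
{code}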



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2019-03-19 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1080/

[Mar 17, 2019 9:08:29 AM] (sammichen) HDDS-699. Detect Ozone Network topology. 
Contributed by Sammi Chen.
[Mar 18, 2019 11:45:01 AM] (templedf) MAPREDUCE-7188. [Clean-up] Remove NULL 
check before instanceof and fix
[Mar 18, 2019 1:18:08 PM] (stevel) HADOOP-16182. Update abfs storage back-end 
with "close" flag when
[Mar 18, 2019 2:08:37 PM] (templedf) YARN-9340. [Clean-up] Remove NULL check 
before instanceof in
[Mar 18, 2019 2:10:26 PM] (templedf) HDFS-14328. [Clean-up] Remove NULL check 
before instanceof in TestGSet
[Mar 18, 2019 3:13:43 PM] (xkrogen) HADOOP-16192. Fix CallQueue backoff bugs: 
perform backoff when add() is
[Mar 18, 2019 3:38:55 PM] (7813154+ajayydv) HDDS-1296. Fix checkstyle issue 
from Nightly run. Contributed by Xiaoyu
[Mar 18, 2019 5:04:49 PM] (eyang) HADOOP-16167.  Fixed Hadoop shell script for 
Ubuntu 18.   
[Mar 18, 2019 5:16:34 PM] (eyang) YARN-9385.  Fixed ApiServiceClient to use 
current UGI.
[Mar 18, 2019 5:57:18 PM] (eyang) YARN-9363.  Replaced debug logging with SLF4J 
parameterized log message.
[Mar 18, 2019 7:13:13 PM] (stevel) HADOOP-16124. Extend documentation in 
testing.md about S3 endpoint
[Mar 18, 2019 8:51:44 PM] (bharat) HDDS-1250. In OM HA AllocateBlock call where 
connecting to SCM from OM
[Mar 18, 2019 9:21:57 PM] (arp) Revert "HDDS-1284. Adjust default values of 
pipline recovery for more
[Mar 18, 2019 11:58:42 PM] (eyang) YARN-9364.  Remove commons-logging 
dependency from YARN.




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   hadoop-build-tools/src/main/resources/checkstyle/checkstyle.xml 
   hadoop-build-tools/src/main/resources/checkstyle/suppressions.xml 
   hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml 
   hadoop-tools/hadoop-azure/src/config/checkstyle.xml 
   hadoop-tools/hadoop-resourceestimator/src/config/checkstyle.xml 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-documentstore
 
   
org.apache.hadoop.yarn.server.timelineservice.documentstore.collection.document.entity.TimelineEntityDocument.setEvents(Map)
 makes inefficient use of keySet iterator instead of entrySet iterator At 
TimelineEntityDocument.java:keySet iterator instead of entrySet iterator At 
TimelineEntityDocument.java:[line 159] 
   
org.apache.hadoop.yarn.server.timelineservice.documentstore.collection.document.entity.TimelineEntityDocument.setMetrics(Map)
 makes inefficient use of keySet iterator instead of entrySet iterator At 
TimelineEntityDocument.java:keySet iterator instead of entrySet iterator At 
TimelineEntityDocument.java:[line 142] 
   Unread field:TimelineEventSubDoc.java:[line 56] 
   Unread field:TimelineMetricSubDoc.java:[line 44] 
   Switch statement found in 
org.apache.hadoop.yarn.server.timelineservice.documentstore.collection.document.flowrun.FlowRunDocument.aggregate(TimelineMetric,
 TimelineMetric) where default case is missing At 
FlowRunDocument.java:TimelineMetric) where default case is missing At 
FlowRunDocument.java:[lines 121-136] 
   
org.apache.hadoop.yarn.server.timelineservice.documentstore.collection.document.flowrun.FlowRunDocument.aggregateMetrics(Map)
 makes inefficient use of keySet iterator instead of entrySet iterator At 
FlowRunDocument.java:keySet iterator instead of entrySet iterator At 
FlowRunDocument.java:[line 103] 
   Possible doublecheck on 
org.apache.hadoop.yarn.server.timelineservice.documentstore.reader.cosmosdb.CosmosDBDocumentStoreReader.client
 in new 
org.apache.hadoop.yarn.server.timelineservice.documentstore.reader.cosmosdb.CosmosDBDocumentStoreReader(Configuration)
 At CosmosDBDocumentStoreReader.java:new 
org.apache.hadoop.yarn.server.timelineservice.documentstore.reader.cosmosdb.CosmosDBDocumentStoreReader(Configuration)
 At CosmosDBDocumentStoreReader.java:[lines 73-75] 
   Possible doublecheck on 
org.apache.hadoop.yarn.server.timelineservice.documentstore.writer.cosmosdb.CosmosDBDocumentStoreWriter.client
 in new 
org.apache.hadoop.yarn.server.timelineservice.documentstore.writer.cosmosdb.CosmosDBDocumentStoreWriter(Configuration)
 At CosmosDBDocumentStoreWriter.java:new 
org.apache.hadoop.yarn.server.timelineservice.documentstore.writer.cosmosdb.CosmosDBDocumentStoreWriter(Configuration)
 At CosmosDBDocumentStoreWriter.java:[lines 66-68] 

Failed junit tests :

   hadoop.hdfs.server.datanode.TestBPOfferService 
   hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap 
   

[jira] [Created] (HDDS-1310) In datanode once a container becomes unhealthy, datanode restart fails.

2019-03-19 Thread Sandeep Nemuri (JIRA)
Sandeep Nemuri created HDDS-1310:


 Summary: In datanode once a container becomes unhealthy, datanode 
restart fails.
 Key: HDDS-1310
 URL: https://issues.apache.org/jira/browse/HDDS-1310
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Datanode
Affects Versions: 0.3.0
Reporter: Sandeep Nemuri


When a container is marked as {{UNHEALTHY}} in a datanode, a subsequent 
restart of that datanode fails because it can no longer generate 
ContainerReports.  The unhealthy state of a container is not handled in 
ContainerReport generation inside the datanode.

We get the below exception when a datanode tries to generate a 
ContainerReport that contains unhealthy container(s):
{noformat}
2019-03-19 13:51:13,646 [Datanode State Machine Thread - 0] ERROR  - Unable 
to communicate to SCM server at x.x.xxx:9861 for past 3300 seconds.
org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: 
Invalid Container state found: 86
at 
org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.getHddsState(KeyValueContainer.java:623)
at 
org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.getContainerReport(KeyValueContainer.java:593)
at 
org.apache.hadoop.ozone.container.common.impl.ContainerSet.getContainerReport(ContainerSet.java:204)
at 
org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.getContainerReport(ContainerController.java:82)
at 
org.apache.hadoop.ozone.container.common.states.endpoint.RegisterEndpointTask.call(RegisterEndpointTask.java:114)
at 
org.apache.hadoop.ozone.container.common.states.endpoint.RegisterEndpointTask.call(RegisterEndpointTask.java:47)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

{noformat}
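
Based on the stack trace, the state mapping in 
{{KeyValueContainer#getHddsState}} appears to have no branch for the UNHEALTHY 
state, so building a report throws instead of reporting the container as 
unhealthy.  A self-contained sketch of the general fix pattern follows; the 
enums below are illustrative stand-ins, not the actual Ozone/HDDS types.

{code}
// Illustrative stand-in types, not org.apache.hadoop.ozone classes.
public class ContainerStateMapping {
  enum ContainerDataState { OPEN, CLOSING, QUASI_CLOSED, CLOSED, UNHEALTHY }
  enum ReportState { OPEN, CLOSING, QUASI_CLOSED, CLOSED, UNHEALTHY }

  static ReportState toReportState(ContainerDataState state) {
    switch (state) {
      case OPEN:         return ReportState.OPEN;
      case CLOSING:      return ReportState.CLOSING;
      case QUASI_CLOSED: return ReportState.QUASI_CLOSED;
      case CLOSED:       return ReportState.CLOSED;
      // Without a branch like this, an unhealthy container makes report
      // generation throw ("Invalid Container state found"), as seen above.
      case UNHEALTHY:    return ReportState.UNHEALTHY;
      default:
        throw new IllegalStateException("Invalid Container state found: " + state);
    }
  }

  public static void main(String[] args) {
    // Prints UNHEALTHY instead of throwing, so a report can still be built.
    System.out.println(toReportState(ContainerDataState.UNHEALTHY));
  }
}
{code}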



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-14380) webhdfs failover append to stand-by namenode fails

2019-03-19 Thread gael URBAUER (JIRA)
gael URBAUER created HDFS-14380:
---

 Summary: webhdfs failover append to stand-by namenode fails
 Key: HDFS-14380
 URL: https://issues.apache.org/jira/browse/HDFS-14380
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: webhdfs
Affects Versions: 2.7.3
 Environment: HDP 2.6.2

HA namenode activated
Reporter: gael URBAUER


I'm using DataStage to create files in Hadoop through WebHDFS.

When a namenode failover happens, DataStage sometimes ends up talking to the 
standby namenode.

The CREATE operation then succeeds, but when files are bigger than the buffer 
size, DataStage calls the APPEND operation and gets back a 403 response.

It seems inconsistent that some write operations are allowed on the standby 
while others aren't.

 

Regards,

 

Gaël



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2019-03-19 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/

[Mar 18, 2019 4:00:40 PM] (xkrogen) HADOOP-16192. Fix CallQueue backoff bugs: 
perform backoff when add() is




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   hadoop-build-tools/src/main/resources/checkstyle/checkstyle.xml 
   hadoop-build-tools/src/main/resources/checkstyle/suppressions.xml 
   
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml
 
   hadoop-tools/hadoop-azure/src/config/checkstyle.xml 
   hadoop-tools/hadoop-resourceestimator/src/config/checkstyle.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 

FindBugs :

   module:hadoop-common-project/hadoop-common 
   Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient 
non-serializable instance field map In GlobalStorageStatistics.java:instance 
field map In GlobalStorageStatistics.java 

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Dead store to state in 
org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Saver.save(OutputStream,
 INodeSymlink) At 
FSImageFormatPBINode.java:org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Saver.save(OutputStream,
 INodeSymlink) At FSImageFormatPBINode.java:[line 623] 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client
 
   Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:[line 335] 

Failed junit tests :

   hadoop.security.authentication.client.TestKerberosAuthenticator 
   hadoop.util.TestBasicDiskValidator 
   hadoop.util.TestDiskCheckerWithDiskIo 
   hadoop.crypto.key.kms.server.TestKMS 
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.hdfs.server.balancer.TestBalancerRPCDelay 
   hadoop.yarn.client.api.impl.TestAMRMProxy 
   hadoop.registry.secure.TestSecureLogins 
   hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor 
   hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt
  [328K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/diff-compile-cc-root-jdk1.8.0_191.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/diff-compile-javac-root-jdk1.8.0_191.txt
  [308K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/diff-patch-shellcheck.txt
  [72K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/whitespace-tabs.txt
  [1.2M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/265/artifact/out/xml.txt
  [20K]

   findbugs:

   

[jira] [Created] (HDFS-14379) WebHdfsFileSystem.toUrl double encodes characters

2019-03-19 Thread Boris Vulikh (JIRA)
Boris Vulikh created HDFS-14379:
---

 Summary: WebHdfsFileSystem.toUrl double encodes characters
 Key: HDFS-14379
 URL: https://issues.apache.org/jira/browse/HDFS-14379
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs, hdfs-client
Affects Versions: 3.2.0
Reporter: Boris Vulikh


When using DistCP over HTTPFS with data that contains Spark partitions, DistCP 
fails to access the partitioned parquet files since the "=" characters in the 
file path get double encoded:
 {{"/test/spark/partition/year=2019/month=1/day=1"}}
 to
 {{"/test/spark/partition/year%253D2019/month%253D1/day%253D1"}}

This happens since {{fsPathItem}} containing the character 
{color:#d04437}'='{color} is encoded by {{URLEncoder._encode_(fsPathItem, 
"UTF-8")}} to {color:#d04437}'%3D'{color} and then encoded again by {{new 
Path()}} to {color:#d04437}'%253D'{color}.
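
A minimal standalone reproduction of the same double-encoding pattern using 
only java.net classes (Hadoop's {{Path}} wraps {{java.net.URI}}, whose 
multi-argument constructors always quote the '%' character, so the effect is 
analogous to what is described above):

{code}
import java.net.URI;
import java.net.URLEncoder;

public class DoubleEncodeDemo {
  public static void main(String[] args) throws Exception {
    String item = "year=2019";

    // First pass: URLEncoder turns '=' into "%3D".
    String once = URLEncoder.encode(item, "UTF-8");
    System.out.println(once);                 // year%3D2019

    // Second pass: URI's multi-argument constructor quotes the '%' itself,
    // turning "%3D" into "%253D" -- the double encoding reported here.
    URI uri = new URI(null, null, "/test/spark/partition/" + once, null);
    System.out.println(uri.toASCIIString());  // /test/spark/partition/year%253D2019
  }
}
{code}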



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: [DISCUSS] Docker build process

2019-03-19 Thread Elek, Marton



Thank you Eric for describing the problem.

I have multiple small comments, trying to separate them.

I. separated vs in-build container image creation

> The disadvantages are:
>
> 1.  Require developer to have access to docker.
> 2.  Default build takes longer.


These are not the only disadvantages (IMHO), as I wrote in the
previous thread and in the issue [1]

Using in-build container image creation doesn't enable:

1. to modify the image later (eg. apply security fixes to the container
itself or apply improvements for the startup scripts)
2. create images for older releases (eg. hadoop 2.7.1)

I think there are two kinds of images:

a) images for released artifacts
b) developer images

I would prefer to manage a) with separated branch repositories but b)
with (optional!) in-build process.

II. Agree with Steve. I think it's better to make it optional as most of
the time it's not required. I think it's better to support the default
dev build with the default settings (=just enough to start)

III. Maven best practices

(https://dzone.com/articles/maven-profile-best-practices)

I think this is a good article. But this is not against profiles but
creating multiple versions from the same artifact with the same name
(eg. jdk8/jdk11). In Hadoop, profiles are used to introduce optional
steps. I think it's fine as the maven lifecycle/phase model is very
static (compare it with the tree based approach in Gradle).

Marton

[1]: https://issues.apache.org/jira/browse/HADOOP-16091

On 3/13/19 11:24 PM, Eric Yang wrote:
> Hi Hadoop developers,
> 
> In the recent months, there were various discussions on creating docker build 
> process for Hadoop.  There was convergence to make docker build process 
> inline in the mailing list last month when Ozone team is planning new 
> repository for Hadoop/ozone docker images.  New feature has started to add 
> docker image build process inline in Hadoop build.
> A few lessons learnt from making docker build inline in YARN-7129.  The build 
> environment must have docker to have a successful docker build.  BUILD.txt 
> stated for easy build environment use Docker.  There is logic in place to 
> ensure that absence of docker does not trigger docker build.  The inline 
> process tries to be as non-disruptive as possible to existing development 
> environment with one exception.  If docker’s presence is detected, but user 
> does not have rights to run docker.  This will cause the build to fail.
> 
> Now, some developers are pushing back on inline docker build process because 
> existing environment did not make docker build process mandatory.  However, 
> there are benefits to use inline docker build process.  The listed benefits 
> are:
> 
> 1.  Source code tag, maven repository artifacts and docker hub artifacts can 
> all be produced in one build.
> 2.  Less manual labor to tag different source branches.
> 3.  Reduce intermediate build caches that may exist in multi-stage builds.
> 4.  Release engineers and developers do not need to search a maze of build 
> flags to acquire artifacts.
> 
> The disadvantages are:
> 
> 1.  Require developer to have access to docker.
> 2.  Default build takes longer.
> 
> There is workaround for above disadvantages by using -DskipDocker flag to 
> avoid docker build completely or -pl !modulename to bypass subprojects.
> Hadoop development did not follow Maven best practice because a full Hadoop 
> build requires a number of profile and configuration parameters.  Some 
> evolutions are working against Maven design and require fork of separate 
> source trees for different subprojects and pom files.  Maven best practice 
> (https://dzone.com/articles/maven-profile-best-practices) has explained that 
> do not use profile to trigger different artifact builds because it will 
> introduce maven artifact naming conflicts on maven repository using this 
> pattern.  Maven offers flags to skip certain operations, such as -DskipTests 
> -Dmaven.javadoc.skip=true -pl or -DskipDocker.  It seems worthwhile to make 
> some corrections to follow best practice for Hadoop build.
> 
> Some developers have advocated for separate build process for docker images.  
> We need consensus on the direction that will work best for Hadoop development 
> community.  Hence, my questions are:
> 
> Do we want to have inline docker build process in maven?
> If yes, it would be developer’s responsibility to pass -DskipDocker flag to 
> skip docker.  Docker is mandatory for default build.
> If no, what is the release flow for docker images going to look like?
> 
> Thank you for your feedback.
> 
> Regards,
> Eric
> 

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org