Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2018-03-06 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/

[Mar 5, 2018 3:49:43 AM] (wwei) YARN-7995. Remove unnecessary boxings and 
unboxings from
[Mar 5, 2018 2:06:20 PM] (stevel) HADOOP-13761. S3Guard: implement retries for 
DDB failures and
[Mar 5, 2018 5:08:44 PM] (billie) YARN-7915. Trusted image log message repeated 
multiple times.
[Mar 5, 2018 6:12:38 PM] (stevel) HADOOP-15288. 
TestSwiftFileSystemBlockLocation doesn't compile.
[Mar 5, 2018 7:24:17 PM] (arun suresh) YARN-7972. Support inter-app placement 
constraints for allocation tags
[Mar 5, 2018 7:54:24 PM] (aajisaka) YARN-7736. Fix itemization in YARN 
federation document
[Mar 5, 2018 11:07:58 PM] (aajisaka) HADOOP-15271. Remove unicode multibyte 
characters from JavaDoc




-1 overall


The following subsystems voted -1:
findbugs unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
   org.apache.hadoop.yarn.api.records.Resource.getResources() may expose 
internal representation by returning Resource.resources At Resource.java:by 
returning Resource.resources At Resource.java:[line 234] 

Failed junit tests :

   hadoop.fs.shell.TestCopyFromLocal 
   hadoop.crypto.key.kms.server.TestKMS 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure 
   hadoop.yarn.server.nodemanager.webapp.TestContainerLogsPage 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/diff-compile-javac-root.txt
  [296K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/diff-checkstyle-root.txt
  [17M]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/diff-patch-shelldocs.txt
  [12K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/whitespace-eol.txt
  [9.2M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/whitespace-tabs.txt
  [288K]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/xml.txt
  [4.0K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-warnings.html
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/diff-javadoc-javadoc-root.txt
  [760K]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
  [164K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/patch-unit-hadoop-common-project_hadoop-kms.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [320K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
  [48K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applications-distributedshell.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/713/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt
  [84K]

Powered by Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org

[jira] [Created] (MAPREDUCE-7063) Update Logging to DEBUG Level

2018-03-06 Thread BELUGA BEHR (JIRA)
BELUGA BEHR created MAPREDUCE-7063:
--

 Summary: Update Logging to DEBUG Level
 Key: MAPREDUCE-7063
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7063
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: client, mrv2
Affects Versions: 3.0.0
Reporter: BELUGA BEHR


[https://github.com/apache/hadoop/blob/178751ed8c9d47038acf8616c226f1f52e884feb/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java#L428-L429]

 
{code:java}
LOG.info("DEBUG: Terminated node allocation with : CompletedNodes: "
    + completedNodes.size() + ", size left: " + totalLength);
{code}

# Use SLF4J parameterized logging
# The message literal string contains the word "DEBUG", but this is _INFO_-level logging. Remove the word "DEBUG" and set the logging level to _DEBUG_.
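
A hedged sketch of the suggested fix, assuming {{LOG}} is (or is migrated to) an org.slf4j.Logger; the message text is kept from the snippet above, with the "DEBUG:" prefix dropped:

{code:java}
// Parameterized SLF4J logging at DEBUG level: the {} placeholders defer
// string concatenation, so arguments are only formatted when DEBUG is enabled.
LOG.debug("Terminated node allocation with : CompletedNodes: {}, size left: {}",
    completedNodes.size(), totalLength);
{code}

With parameterized logging there is also no need to guard the call with an {{isDebugEnabled()}} check just to avoid concatenation cost.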



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-06 Thread Jitendra Pandey
Hi Andrew, 
  
 I think we can eliminate the maintenance costs even in the same repo. We can 
make the following changes, which incorporate suggestions from Daryn and Owen as well.
1. Hadoop-hdsl-project will sit at the root of the hadoop repo, in a separate 
directory.
2. There will be no dependencies from common, yarn, and hdfs to hdsl/ozone.
3. Based on Daryn’s suggestion, HDSL can optionally be loaded (via config) in 
the DN as a pluggable module. 
 If not loaded, there will be absolutely no code path through hdsl or ozone.
4. To further make it easier for folks building hadoop, we can support a maven 
profile for hdsl/ozone. If the profile is deactivated, hdsl/ozone will not be 
built.
 For example, Cloudera can choose to skip even compiling/building 
hdsl/ozone, and therefore incur no maintenance overhead whatsoever.
 HADOOP-14453 has a patch that shows how it can be done.
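
As a rough sketch of point (4) — the module and profile names below are hypothetical, and HADOOP-14453 has the actual patch — an opt-out Maven profile in the parent pom could look like:

```xml
<!-- Hypothetical fragment of the parent pom.xml: the hdsl/ozone module is
     built only while the "hdsl" profile is active (active by default). -->
<profiles>
  <profile>
    <id>hdsl</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <modules>
      <module>hadoop-hdsl-project</module>
    </modules>
  </profile>
</profiles>
```

A downstream builder could then skip compiling it entirely with `mvn install -P '!hdsl'`, which deactivates the profile and drops the module from the reactor.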

Arguably, there are two kinds of maintenance costs: costs for developers and 
costs for users.
- Developers: A maven profile, as noted in points (3) and (4) above, completely 
addresses the concern for developers, 
 as there are no compile-time dependencies and, 
further, they can choose not to build ozone/hdsl.
- Users: The cost to users is completely alleviated if ozone/hdsl is not loaded, 
as mentioned in point (3) above.

jitendra

From: Andrew Wang 
Date: Monday, March 5, 2018 at 3:54 PM
To: Wangda Tan 
Cc: Owen O'Malley , Daryn Sharp 
, Jitendra Pandey , hdfs-dev 
, "common-...@hadoop.apache.org" 
, "yarn-...@hadoop.apache.org" 
, "mapreduce-dev@hadoop.apache.org" 

Subject: Re: [VOTE] Merging branch HDFS-7240 to trunk

Hi Owen, Wangda, 

Thanks for clearly laying out the subproject options, that helps the discussion.

I'm all onboard with the idea of regular releases, and it's something I tried 
to do with the 3.0 alphas and betas. The problem though isn't a lack of 
commitment from feature developers like Sanjay or Jitendra; far from it! I 
think every feature developer makes a reasonable effort to test their code 
before it's merged. Yet, my experience as an RM is that more code comes with 
more risk. I don't believe that Ozone is special or different in this regard. 
It comes with a maintenance cost, not a maintenance benefit.


I'm advocating for #3: separate source, separate release. Since HDSL stability 
and FSN/BM refactoring are still a ways out, I don't want to incur a 
maintenance cost now. I sympathize with the sentiment that working cross-repo 
is harder than within same repo, but the right tooling can make this a lot 
easier (e.g. git submodule, Google's repo tool). We have experience doing this 
internally here at Cloudera, and I'm happy to share knowledge and possibly code.

Best,
Andrew

On Fri, Mar 2, 2018 at 4:41 PM, Wangda Tan  wrote:
I like the idea of same source / same release and put Ozone's source under a 
different directory. 

As Owen mentioned, it is going to be important for all parties to keep a regular 
and shorter release cycle for Hadoop, e.g. 3-4 months between minor releases. 
Users can try features and give feedback to stabilize features earlier; 
developers can be happier since their efforts reach users soon after 
features get merged. In addition, if features are merged to trunk after 
reasonable tests/review, Andrew's concern may no longer be a problem: 

bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements.

Thanks,
Wangda


On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley  wrote:
On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang 
wrote:

> Owen mentioned making a Hadoop subproject; we'd have to
> hash out what exactly this means (I assume a separate repo still managed by
> the Hadoop project), but I think we could make this work if it's more
> attractive than incubation or a new TLP.


Ok, there are multiple levels of sub-projects that all make sense:

   - Same source tree, same releases - examples like HDFS & YARN
   - Same master branch, separate releases and release branches - Hive's
   Storage API vs Hive. It is in the source tree for the master branch, but
   has distinct releases and release branches.
   - Separate source, separate release - Apache Commons.

There are advantages and disadvantages to each. I'd propose that we use the
same source, same release pattern for Ozone. Note that we tried and later
reverted doing Common, HDFS, and YARN as separate source, separate release
because it was too much trouble. I like Daryn's idea of putting it as a top
level directory in Hadoop and making sure that nothing in Common, HDFS, or
YARN depend on it. That way if a Release Manager doesn't think it is ready
for release, it can be trivially removed before the release.

One thing about using the same releases, Sanjay and Jitendra are signing up
to make much more regular b

Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-06 Thread J. Rottinghuis
Sorry for jumping in late into the fray of this discussion.

It seems Ozone is a large feature. I appreciate the development effort and
the desire to get this into the hands of users.
I understand the need to iterate quickly and to reduce overhead for
development.
I also agree that Hadoop can benefit from a quicker release cycle. For our
part, this is a challenge as we have a large installation with multiple
clusters and thousands of users. It is a constant balance between jumping
to the newest release and the cost of this integration and test at our
scale, especially when things aren't backwards compatible. We try to be
good citizens and upstream our changes and contribute back.

The point was made that splitting the projects such as common and Yarn
didn't work and had to be reverted. That was painful and a lot of work for
those involved for sure. This project may be slightly different in that
hadoop-common, Yarn and HDFS made for one consistent whole. One couldn't
run a project without the other.

Having a separate block management layer with possibly multiple block
implementation as pluggable under the covers would be a good future
development for HDFS. Some users would choose Ozone as that layer, some
might use S3, others GCS, or Azure, or something else.
If the argument is made that nobody will be able to run Hadoop as a
consistent stack without Ozone, then that would be a strong case to keep
things in the same repo.

Obviously when people do want to use Ozone, then having it in the same repo
is easier. The flipside is that, separate top-level project in the same
repo or not, it adds to the Hadoop releases. If there is a change in Ozone
and a new release needed, it would have to wait for a Hadoop release. Ditto
if there is a Hadoop release and there is an issue with Ozone. The case
that one could turn off Ozone through a Maven profile works only to some
extent.
If we have done a 3.x release with Ozone in it, would it make sense to do a
3.y release with y>x without Ozone in it? That would be weird.

This does sound like a Hadoop 4 feature. Compatibility with lots of new
features in Hadoop 3 needs to be worked out. We're still working on jumping
to a Hadoop 2.9 release and then on getting a stepping-stone release to
3.0 to bridge compatibility issues. I'm afraid that adding a very large new
feature into trunk now essentially makes going to Hadoop 3 not viable for
quite a while. That would be a bummer for all the feature work that has
gone into Hadoop 3. Encryption and erasure coding are very appealing
features, especially in light of meeting GDPR requirements.

I'd argue to pull out those pieces that make sense in Hadoop 3, merge those
in and keep the rest in a separate project. Iterate quickly in that
separate project, you can have a separate set of committers, you can do
separate release cycle. If that develops Ozone into _the_ new block layer
for all use cases (even when people want to give up on encryption or erasure
coding, or when feature parity is reached), then we can jump off that bridge
when we reach it. I think adding a very large chunk of code that relatively
few people in the community are familiar with isn't necessarily going to
help Hadoop at this time.

Cheers,

Joep

On Tue, Mar 6, 2018 at 2:32 PM, Jitendra Pandey 
wrote:

> Hi Andrew,
>
>  I think we can eliminate the maintenance costs even in the same repo. We
> can make following changes that incorporate suggestions from Daryn and Owen
> as well.
> 1. Hadoop-hdsl-project will be at the root of hadoop repo, in a separate
> directory.
> 2. There will be no dependencies from common, yarn and hdfs to hdsl/ozone.
> 3. Based on Daryn’s suggestion, the Hdsl can be optionally (via config) be
> loaded in DN as a pluggable module.
>  If not loaded, there will be absolutely no code path through hdsl or
> ozone.
> 4. To further make it easier for folks building hadoop, we can support a
> maven profile for hdsl/ozone. If the profile is deactivated hdsl/ozone will
> not be built.
>  For example, Cloudera can choose to skip even compiling/building
> hdsl/ozone and therefore no maintenance overhead whatsoever.
>  HADOOP-14453 has a patch that shows how it can be done.
>
> Arguably, there are two kinds of maintenance costs. Costs for developers
> and the cost for users.
> - Developers: A maven profile as noted in point (3) and (4) above
> completely addresses the concern for developers
>  as there are no compile time dependencies
> and further, they can choose not to build ozone/hdsl.
> - User: Cost to users will be completely alleviated if ozone/hdsl is not
> loaded as mentioned in point (3) above.
>
> jitendra
>
> From: Andrew Wang 
> Date: Monday, March 5, 2018 at 3:54 PM
> To: Wangda Tan 
> Cc: Owen O'Malley , Daryn Sharp
> , Jitendra Pandey ,
> hdfs-dev , "common-...@hadoop.apache.org" <
> common-...@hadoop.apache.org>, "yarn-...@hadoop.apache.org" <
> yarn-...@hadoop.apache.org>, "mapreduce-dev@hadoop.

Re: [EVENT] HDFS Bug Bash: March 12

2018-03-06 Thread Chris Douglas
Found a meetup alternative (thanks Subru):
https://meetingstar.io/event/fk13172f1d75KN

So we can get a rough headcount, please add (local) if you plan to
attend in-person. -C


On Mon, Mar 5, 2018 at 4:03 PM, Chris Douglas  wrote:
> [Cross-posting, as this affects the rest of the project]
>
> Hey folks-
>
> As discussed last month [1], the HDFS build hasn't been healthy
> recently. We're dedicating a bug bash to stabilize the build and
> address some longstanding issues with our unit tests. We rely on our
> CI infrastructure to keep the project releasable, and in its current
> state, it's not protecting us from regressions. While we probably
> won't achieve all our goals in this session, we can develop the
> conditions for reestablishing a firm foundation.
>
> If you're new to the project, please consider attending and
> contributing. Committers often prioritize large or complicated
> patches, and the issues that make the project livable don't get enough
> attention. A bug bash is a great opportunity to pull reviewers'
> collars, and fix the annoyances that slow us all down.
>
> If you're a committer, please join us! While some of the proposed
> repairs are rote, many unit tests rely on implementation details and
> non-obvious invariants. We need domain experts to help untangle
> complex dependencies and to prevent breakage of deliberate, but
> counter-intuitive code.
>
> We're collecting tasks in the wiki [2] and will include a dial-in option
> for folks who aren't local.
>
> Meetup has started charging for creating new events, so we'll have to
> find another way to get an approximate headcount and publish the
> address. Please ping me if you have a preferred alternative. -C
>
> [1]: https://s.apache.org/nEoQ
> [2]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75965105
