Re: Automated documentation build for Apache Hadoop

2017-04-03 Thread Andrew Wang
Nice work, Akira! Appreciate the help with trunk development.

On Mon, Apr 3, 2017 at 1:56 AM, Akira Ajisaka wrote:

> Hi folks,
>
> I've created a repository to build and push the Apache Hadoop documentation
> (trunk) via Travis CI.
> https://github.com/aajisaka/hadoop-document
>
> The documentation is updated daily by a Travis CI cron job.
> https://aajisaka.github.io/hadoop-document/hadoop-project/
>
> Hope it helps!
>
> Regards,
> Akira
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


Re: [DISCUSS] Changing the default class path for clients

2017-04-03 Thread Andrew Wang
Thanks for digging that up. I agree with your analysis of our public
documentation, though we still need a transition path. Officially, our
classpath is not covered by compatibility, though we know that in reality,
classpath changes are quite impactful to users.

While we were having a related discussion on YARN container classpath
isolation, the plan was to still provide the existing set of JARs by
default, with applications having to explicitly opt in to a clean
classpath. This feels similar.

How do you feel about providing e.g. `hadoop userclasspath` and `hadoop
daemonclasspath`, and having `hadoop classpath` continue to default to
`daemonclasspath` for now? We could then deprecate+remove `hadoop
classpath` in a future release.
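
To make the proposal concrete, here is a rough Python sketch of the dispatch. It is only a model of the idea, not the actual shell script logic: the jar lists, the `resolve_classpath` helper, and the `warn_on_legacy` flag are all made up for illustration.

```python
import os
import warnings

# Hypothetical entries; the real lists are assembled by the Hadoop shell scripts.
USER_CLASSPATH = ["etc/hadoop", "hadoop-client-runtime.jar"]
DAEMON_CLASSPATH = ["etc/hadoop", "share/hadoop/common/*", "share/hadoop/hdfs/*"]


def resolve_classpath(subcommand, warn_on_legacy=False):
    """Return the classpath string for one of the proposed subcommands."""
    if subcommand == "userclasspath":
        entries = USER_CLASSPATH
    elif subcommand == "daemonclasspath":
        entries = DAEMON_CLASSPATH
    elif subcommand == "classpath":
        # Legacy name: keep today's behavior by aliasing the daemon classpath.
        if warn_on_legacy:
            # A later release could turn this on before removing the alias.
            warnings.warn(
                "'classpath' is deprecated; use 'userclasspath' or "
                "'daemonclasspath'",
                DeprecationWarning,
                stacklevel=2,
            )
        entries = DAEMON_CLASSPATH
    else:
        raise ValueError(f"unknown subcommand: {subcommand}")
    return os.pathsep.join(entries)
```

Existing scripts that call `classpath` would see no change at first; the warning could be enabled in a later release, ahead of removal.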

On Mon, Apr 3, 2017 at 11:08 AM, Allen Wittenauer wrote:

>
> 1.0.4:
>
> "Prints the class path needed to get the Hadoop jar and the
> required libraries."
>
> 2.8.0 and 3.0.0:
>
> "Prints the class path needed to get the Hadoop jar and the
> required libraries. If called without arguments, then prints the classpath
> set up by the command scripts, which is likely to contain wildcards in the
> classpath entries."
>
> I would take that to mean “what gives me all the public APIs?”
> Which, by definition, should all be in hadoop-client-runtime (with the
> possible exception of the DistributedFileSystem Quota APIs, since for some
> reason those are marked public.)
>
> Let me ask it a different way:
>
> Why should ‘yarn jar’, ‘mapred jar’, ‘hadoop distcp’, ‘hadoop fs’,
> etc, etc, etc, have anything but hadoop-client-runtime as the provided jar?
> Yes, some things might break, but given this is 3.0, some changes should be
> expected anyway. Given the definition above, "needed to get the Hadoop jar
> and the required libraries", switching this over seems correct.
>
>
> > On Apr 3, 2017, at 10:37 AM, Esteban Gutierrez wrote:
> >
> >
> > I agree with Andrew too. Users have relied for years on `hadoop
> > classpath` in their scripts to launch jobs or other tools, so it is
> > perhaps not the best idea to change the behavior without providing a
> > proper deprecation path.
> >
> > thanks!
> > esteban.
> >
> > --
> > Cloudera, Inc.
> >
> >
> > On Mon, Apr 3, 2017 at 10:26 AM, Andrew Wang wrote:
> > What's the current contract for `hadoop classpath`? Would it be safer to
> > introduce `hadoop userclasspath` or similar for this behavior?
> >
> > I'm betting that changing `hadoop classpath` will lead to some breakages,
> > so I'd prefer to make this new behavior opt-in.
> >
> > Best,
> > Andrew
> >
> > On Mon, Apr 3, 2017 at 9:04 AM, Allen Wittenauer <a...@effectivemachines.com> wrote:
> >
> > >
> > > This morning I had a bit of a shower thought:
> > >
> > > With the new shaded hadoop client in 3.0, is there any reason the
> > > default classpath should remain the full blown jar list?  e.g., shouldn’t
> > > ‘hadoop classpath’ just return configuration, user supplied bits (e.g.,
> > > HADOOP_USER_CLASSPATH, etc), HADOOP_OPTIONAL_TOOLS, and
> > > hadoop-client-runtime? We’d obviously have to add some plumbing for daemons
> > > and the capability for the user to get the full list, but that should be
> > > trivial.
> > >
> > > Thoughts?


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-04-03 Thread Apache Jenkins Server
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/

[Apr 3, 2017 4:06:54 AM] (aajisaka) MAPREDUCE-6824. TaskAttemptImpl#createCommonContainerLaunchContext is




-1 overall


The following subsystems voted -1:
asflicense unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.security.TestShellBasedUnixGroupsMapping 
   hadoop.security.TestRaceWhenRelogin 
   hadoop.fs.sftp.TestSFTPFileSystem 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure 
   hadoop.hdfs.server.datanode.checker.TestThrottledAsyncCheckerTimeout 
   hadoop.hdfs.server.namenode.ha.TestHAAppend 
   hadoop.hdfs.TestReadStripedFileWithMissingBlocks 
   hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker 
   hadoop.hdfs.server.datanode.TestDataNodeUUID 
   hadoop.yarn.server.nodemanager.containermanager.TestContainerManager 
   hadoop.yarn.server.resourcemanager.TestResourceTrackerService 
   hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer 
   hadoop.yarn.server.resourcemanager.TestRMAdminService 
   hadoop.yarn.server.TestContainerManagerSecurity 
   hadoop.yarn.server.TestMiniYarnClusterNodeUtilization 
   hadoop.yarn.client.api.impl.TestAMRMClient 
   hadoop.mapred.TestMRTimelineEventHandling 
   hadoop.mapreduce.TestMRJobClient 
   hadoop.tools.TestDistCpSystem 

Timed out junit tests :

   org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStorePerf

   cc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-compile-cc-root.txt  [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-compile-javac-root.txt  [184K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-checkstyle-root.txt  [17M]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-patch-pylint.txt  [20K]

   shellcheck:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-patch-shellcheck.txt  [24K]

   shelldocs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-patch-shelldocs.txt  [12K]

   whitespace:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/whitespace-eol.txt  [12M]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/whitespace-tabs.txt  [1.2M]

   javadoc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-javadoc-javadoc-root.txt  [2.2M]

   unit:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt  [148K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt  [444K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt  [36K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt  [60K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt  [324K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt  [12K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt  [88K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-tools_hadoop-distcp.txt  [16K]

   asflicense:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-asflicense-problems.txt  [4.0K]

Powered by Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org




Apache Hadoop qbt Report: trunk+JDK8 on Linux/ppc64le

2017-04-03 Thread Apache Jenkins Server
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/

[Apr 3, 2017 4:06:54 AM] (aajisaka) MAPREDUCE-6824. TaskAttemptImpl#createCommonContainerLaunchContext is




-1 overall


The following subsystems voted -1:
compile unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc javac


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.hdfs.TestEncryptedTransfer 
   hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer 
   hadoop.hdfs.server.mover.TestMover 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.yarn.server.timeline.TestRollingLevelDB 
   hadoop.yarn.server.timeline.TestTimelineDataManager 
   hadoop.yarn.server.timeline.TestLeveldbTimelineStore 
   hadoop.yarn.server.timeline.recovery.TestLeveldbTimelineStateStore 
   hadoop.yarn.server.timeline.TestRollingLevelDBTimelineStore 
   hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer 
   hadoop.yarn.server.resourcemanager.recovery.TestLeveldbRMStateStore 
   hadoop.yarn.server.resourcemanager.TestRMRestart 
   hadoop.yarn.server.TestMiniYarnClusterNodeUtilization 
   hadoop.yarn.server.TestContainerManagerSecurity 
   hadoop.yarn.client.api.impl.TestAMRMProxy 
   hadoop.yarn.server.timeline.TestLevelDBCacheTimelineStore 
   hadoop.yarn.server.timeline.TestOverrideTimelineStoreYarnClient 
   hadoop.yarn.server.timeline.TestEntityGroupFSTimelineStore 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
   hadoop.mapred.TestShuffleHandler 
   hadoop.mapreduce.v2.app.TestRuntimeEstimators 
   hadoop.mapreduce.v2.hs.TestHistoryServerLeveldbStateStoreService 
   hadoop.mapreduce.TestMRJobClient 

Timed out junit tests :

   org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean 
   org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache 
  

   compile:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-compile-root.txt  [136K]

   cc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-compile-root.txt  [136K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-compile-root.txt  [136K]

   unit:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt  [232K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt  [16K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt  [52K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt  [72K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt  [324K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt  [16K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timeline-pluginstorage.txt  [28K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applications-distributedshell.txt  [12K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-ui.txt  [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-shuffle.txt  [8.0K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt  [20K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs.txt  [16K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt  [88K]

Re: [DISCUSS] Changing the default class path for clients

2017-04-03 Thread Esteban Gutierrez
I agree with Andrew too. Users have relied for years on `hadoop classpath`
in their scripts to launch jobs or other tools, so it is perhaps not the best
idea to change the behavior without providing a proper deprecation path.

thanks!
esteban.

--
Cloudera, Inc.


On Mon, Apr 3, 2017 at 10:26 AM, Andrew Wang wrote:

> What's the current contract for `hadoop classpath`? Would it be safer to
> introduce `hadoop userclasspath` or similar for this behavior?
>
> I'm betting that changing `hadoop classpath` will lead to some breakages,
> so I'd prefer to make this new behavior opt-in.
>
> Best,
> Andrew
>
> On Mon, Apr 3, 2017 at 9:04 AM, Allen Wittenauer wrote:
>
> >
> > This morning I had a bit of a shower thought:
> >
> > With the new shaded hadoop client in 3.0, is there any reason the
> > default classpath should remain the full blown jar list?  e.g., shouldn’t
> > ‘hadoop classpath’ just return configuration, user supplied bits (e.g.,
> > HADOOP_USER_CLASSPATH, etc), HADOOP_OPTIONAL_TOOLS, and
> > hadoop-client-runtime? We’d obviously have to add some plumbing for daemons
> > and the capability for the user to get the full list, but that should be
> > trivial.
> >
> > Thoughts?


[DISCUSS] Changing the default class path for clients

2017-04-03 Thread Allen Wittenauer

This morning I had a bit of a shower thought:

With the new shaded hadoop client in 3.0, is there any reason the 
default classpath should remain the full blown jar list?  e.g., shouldn’t 
‘hadoop classpath’ just return configuration, user supplied bits (e.g., 
HADOOP_USER_CLASSPATH, etc), HADOOP_OPTIONAL_TOOLS, and hadoop-client-runtime? 
We’d obviously have to add some plumbing for daemons and the capability for the 
user to get the full list, but that should be trivial.  

Thoughts?



[jira] [Created] (MAPREDUCE-6874) Make DistributedCache check if the content of a directory has changed

2017-04-03 Thread Attila Sasvari (JIRA)
Attila Sasvari created MAPREDUCE-6874:
-----------------------------------------

             Summary: Make DistributedCache check if the content of a directory has changed
                 Key: MAPREDUCE-6874
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6874
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
            Reporter: Attila Sasvari


DistributedCache does not check recursively whether the content of a directory 
has changed when adding files to it with {{DistributedCache.addCacheFile()}}. 

h5. Background
I have an Oozie workflow on HDFS:
{code}
example_workflow
├── job.properties
├── lib
│   ├── components
│   │   ├── sub-component.sh
│   │   └── subsub
│   │       └── subsub.sh
│   ├── main.sh
│   └── sub.sh
└── workflow.xml
{code}
I executed the workflow, then made some changes in {{subsub.sh}} and replaced 
the file on HDFS. When I re-ran the workflow, DistributedCache did not notice 
the changes because the timestamp on the {{components}} directory did not 
change. As a result, the old script was materialized.

This behaviour might be related to 
[determineTimestamps()|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/filecache/ClientDistributedCacheManager.java#L84].
In order to use the new script during workflow execution, I had to update the 
whole {{components}} directory.
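
To illustrate the kind of check being requested, here is a minimal Python sketch that uses the local filesystem as a stand-in for HDFS. It is not how {{ClientDistributedCacheManager}} works today and not a proposed patch; {{newest_mtime}} is a hypothetical helper that walks the tree and returns the latest modification time it finds, so that a change to a nested file such as {{subsub.sh}} is visible even when the top-level directory timestamp is unchanged.

```python
import os


def newest_mtime(root):
    """Return the most recent modification time found anywhere under root.

    Looks at root itself plus every nested directory and file, instead of
    relying on the timestamp of the top-level directory alone.
    """
    newest = os.lstat(root).st_mtime
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            entry = os.path.join(dirpath, name)
            # lstat avoids failing on dangling symlinks.
            newest = max(newest, os.lstat(entry).st_mtime)
    return newest
```

Comparing {{newest_mtime(path)}} against the value recorded when the directory was last cached would flag the directory as changed in the scenario above, at the cost of a full walk of the tree.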


h6. Some more info:
In Oozie, 
[DistributedCache.addCacheFile()|https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java#L625]
is used to add files to the distributed cache.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
