Re: Automated documentation build for Apache Hadoop
Nice work Akira! Appreciate the help with trunk development.

On Mon, Apr 3, 2017 at 1:56 AM, Akira Ajisaka wrote:
> Hi folks,
>
> I've created a repository to build and push the Apache Hadoop documentation (trunk)
> via Travis CI.
> https://github.com/aajisaka/hadoop-document
>
> The documentation is updated daily by a Travis CI cron job.
> https://aajisaka.github.io/hadoop-document/hadoop-project/
>
> Hope it helps!
>
> Regards,
> Akira
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
Re: [DISCUSS] Changing the default class path for clients
Thanks for digging that up. I agree with your analysis of our public documentation, though we still need a transition path. Officially, our classpath is not covered by compatibility guarantees, though we know that in reality classpath changes are quite impactful to users. While we were having a related discussion on YARN container classpath isolation, the plan was to still provide the existing set of JARs by default, with applications having to explicitly opt in to a clean classpath. This feels similar.

How do you feel about providing e.g. `hadoop userclasspath` and `hadoop daemonclasspath`, and having `hadoop classpath` continue to default to `daemonclasspath` for now? We could then deprecate and remove `hadoop classpath` in a future release.

On Mon, Apr 3, 2017 at 11:08 AM, Allen Wittenauer wrote:
>
> 1.0.4:
>
> "Prints the class path needed to get the Hadoop jar and the
> required libraries."
>
> 2.8.0 and 3.0.0:
>
> "Prints the class path needed to get the Hadoop jar and the
> required libraries. If called without arguments, then prints the classpath
> set up by the command scripts, which is likely to contain wildcards in the
> classpath entries."
>
> I would take that to mean "what gives me all the public APIs?"
> Which, by definition, should all be in hadoop-client-runtime (with the
> possible exception of the DistributedFileSystem Quota APIs, since for some
> reason those are marked public.)
>
> Let me ask it a different way:
>
> Why should 'yarn jar', 'mapred jar', 'hadoop distcp', 'hadoop fs',
> etc, etc, etc, have anything but hadoop-client-runtime as the provided jar?
> Yes, some things might break, but given this is 3.0, some changes should be
> expected anyway. Given the definition above, "needed to get the Hadoop jar
> and the required libraries", switching this over seems correct.
>
> > On Apr 3, 2017, at 10:37 AM, Esteban Gutierrez wrote:
> >
> > I agree with Andrew too. Users have relied for years on `hadoop
> > classpath` in their scripts to launch jobs or other tools; it is perhaps
> > not the best idea to change the behavior without providing a proper
> > deprecation path.
> >
> > thanks!
> > esteban.
> >
> > --
> > Cloudera, Inc.
> >
> > On Mon, Apr 3, 2017 at 10:26 AM, Andrew Wang wrote:
> > What's the current contract for `hadoop classpath`? Would it be safer to
> > introduce `hadoop userclasspath` or similar for this behavior?
> >
> > I'm betting that changing `hadoop classpath` will lead to some breakages,
> > so I'd prefer to make this new behavior opt-in.
> >
> > Best,
> > Andrew
> >
> > On Mon, Apr 3, 2017 at 9:04 AM, Allen Wittenauer <a...@effectivemachines.com> wrote:
> > >
> > > This morning I had a bit of a shower thought:
> > >
> > > With the new shaded hadoop client in 3.0, is there any reason the
> > > default classpath should remain the full blown jar list? e.g., shouldn't
> > > 'hadoop classpath' just return configuration, user supplied bits (e.g.,
> > > HADOOP_USER_CLASSPATH, etc), HADOOP_OPTIONAL_TOOLS, and
> > > hadoop-client-runtime? We'd obviously have to add some plumbing for
> > > daemons and the capability for the user to get the full list, but that
> > > should be trivial.
> > >
> > > Thoughts?
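The split discussed in this thread could be sketched as follows. This is not actual Hadoop code: the `user_classpath`/`daemon_classpath` function names, jar paths, and version number are all illustrative assumptions standing in for the hypothetical `hadoop userclasspath` and `hadoop daemonclasspath` subcommands.

```shell
# Sketch only: hypothetical layout, not the real hadoop command scripts.
HADOOP_CONF_DIR=/etc/hadoop/conf
HADOOP_HOME=/opt/hadoop
HADOOP_USER_CLASSPATH=/home/user/extra.jar   # example user-supplied bits
HADOOP_OPTIONAL_TOOLS=""                     # none enabled in this example

# Minimal "user" classpath: configuration, user-supplied bits, optional
# tools, and the shaded hadoop-client-runtime jar -- nothing else.
user_classpath() {
  cp="${HADOOP_CONF_DIR}"
  [ -n "${HADOOP_USER_CLASSPATH}" ] && cp="${cp}:${HADOOP_USER_CLASSPATH}"
  [ -n "${HADOOP_OPTIONAL_TOOLS}" ] && cp="${cp}:${HADOOP_OPTIONAL_TOOLS}"
  echo "${cp}:${HADOOP_HOME}/share/hadoop/client/hadoop-client-runtime-3.0.0.jar"
}

# Full "daemon" classpath: the wildcard-laden jar list clients get today.
daemon_classpath() {
  echo "${HADOOP_CONF_DIR}:${HADOOP_HOME}/share/hadoop/common/*:${HADOOP_HOME}/share/hadoop/hdfs/*:${HADOOP_HOME}/share/hadoop/yarn/*"
}

user_classpath
# -> /etc/hadoop/conf:/home/user/extra.jar:/opt/hadoop/share/hadoop/client/hadoop-client-runtime-3.0.0.jar
```

Keeping `hadoop classpath` as an alias for the daemon variant, as proposed above, would preserve existing scripts while still giving clients an opt-in minimal classpath.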
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/

[Apr 3, 2017 4:06:54 AM] (aajisaka) MAPREDUCE-6824. TaskAttemptImpl#createCommonContainerLaunchContext is

-1 overall

The following subsystems voted -1:
    asflicense unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

    Failed junit tests:
        hadoop.security.TestShellBasedUnixGroupsMapping
        hadoop.security.TestRaceWhenRelogin
        hadoop.fs.sftp.TestSFTPFileSystem
        hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
        hadoop.hdfs.TestDFSStripedOutputStreamWithFailure000
        hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
        hadoop.hdfs.server.datanode.checker.TestThrottledAsyncCheckerTimeout
        hadoop.hdfs.server.namenode.ha.TestHAAppend
        hadoop.hdfs.TestReadStripedFileWithMissingBlocks
        hadoop.hdfs.server.datanode.checker.TestThrottledAsyncChecker
        hadoop.hdfs.server.datanode.TestDataNodeUUID
        hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
        hadoop.yarn.server.resourcemanager.TestResourceTrackerService
        hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer
        hadoop.yarn.server.resourcemanager.TestRMAdminService
        hadoop.yarn.server.TestContainerManagerSecurity
        hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
        hadoop.yarn.client.api.impl.TestAMRMClient
        hadoop.mapred.TestMRTimelineEventHandling
        hadoop.mapreduce.TestMRJobClient
        hadoop.tools.TestDistCpSystem

    Timed out junit tests:
        org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStorePerf

    cc:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-compile-cc-root.txt [4.0K]
    javac:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-compile-javac-root.txt [184K]
    checkstyle:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-checkstyle-root.txt [17M]
    pylint:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-patch-pylint.txt [20K]
    shellcheck:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-patch-shellcheck.txt [24K]
    shelldocs:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-patch-shelldocs.txt [12K]
    whitespace:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/whitespace-eol.txt [12M]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/whitespace-tabs.txt [1.2M]
    javadoc:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/diff-javadoc-javadoc-root.txt [2.2M]
    unit:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt [148K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [444K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt [36K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt [60K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt [324K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt [12K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt [88K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-unit-hadoop-tools_hadoop-distcp.txt [16K]
    asflicense:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/365/artifact/out/patch-asflicense-problems.txt [4.0K]

Powered by Apache Yetus 0.5.0-SNAPSHOT
http://yetus.apache.org
Apache Hadoop qbt Report: trunk+JDK8 on Linux/ppc64le
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/

[Apr 3, 2017 4:06:54 AM] (aajisaka) MAPREDUCE-6824. TaskAttemptImpl#createCommonContainerLaunchContext is

-1 overall

The following subsystems voted -1:
    compile unit

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc javac

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

    Failed junit tests:
        hadoop.hdfs.TestEncryptedTransfer
        hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer
        hadoop.hdfs.server.mover.TestMover
        hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
        hadoop.hdfs.web.TestWebHdfsTimeouts
        hadoop.yarn.server.timeline.TestRollingLevelDB
        hadoop.yarn.server.timeline.TestTimelineDataManager
        hadoop.yarn.server.timeline.TestLeveldbTimelineStore
        hadoop.yarn.server.timeline.recovery.TestLeveldbTimelineStateStore
        hadoop.yarn.server.timeline.TestRollingLevelDBTimelineStore
        hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer
        hadoop.yarn.server.resourcemanager.recovery.TestLeveldbRMStateStore
        hadoop.yarn.server.resourcemanager.TestRMRestart
        hadoop.yarn.server.TestMiniYarnClusterNodeUtilization
        hadoop.yarn.server.TestContainerManagerSecurity
        hadoop.yarn.client.api.impl.TestAMRMProxy
        hadoop.yarn.server.timeline.TestLevelDBCacheTimelineStore
        hadoop.yarn.server.timeline.TestOverrideTimelineStoreYarnClient
        hadoop.yarn.server.timeline.TestEntityGroupFSTimelineStore
        hadoop.yarn.applications.distributedshell.TestDistributedShell
        hadoop.mapred.TestShuffleHandler
        hadoop.mapreduce.v2.app.TestRuntimeEstimators
        hadoop.mapreduce.v2.hs.TestHistoryServerLeveldbStateStoreService
        hadoop.mapreduce.TestMRJobClient

    Timed out junit tests:
        org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
        org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache

    compile:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-compile-root.txt [136K]
    cc:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-compile-root.txt [136K]
    javac:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-compile-root.txt [136K]
    unit:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [232K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt [16K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt [52K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt [72K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt [324K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt [16K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timeline-pluginstorage.txt [28K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applications-distributedshell.txt [12K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-ui.txt [8.0K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-shuffle.txt [8.0K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt [20K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-hs.txt [16K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/277/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt [88K]
Re: [DISCUSS] Changing the default class path for clients
I agree with Andrew too. Users have relied for years on `hadoop classpath` in their scripts to launch jobs or other tools; it is perhaps not the best idea to change the behavior without providing a proper deprecation path.

thanks!
esteban.

--
Cloudera, Inc.

On Mon, Apr 3, 2017 at 10:26 AM, Andrew Wang wrote:
> What's the current contract for `hadoop classpath`? Would it be safer to
> introduce `hadoop userclasspath` or similar for this behavior?
>
> I'm betting that changing `hadoop classpath` will lead to some breakages,
> so I'd prefer to make this new behavior opt-in.
>
> Best,
> Andrew
>
> On Mon, Apr 3, 2017 at 9:04 AM, Allen Wittenauer wrote:
> >
> > This morning I had a bit of a shower thought:
> >
> > With the new shaded hadoop client in 3.0, is there any reason the
> > default classpath should remain the full blown jar list? e.g., shouldn't
> > 'hadoop classpath' just return configuration, user supplied bits (e.g.,
> > HADOOP_USER_CLASSPATH, etc), HADOOP_OPTIONAL_TOOLS, and
> > hadoop-client-runtime? We'd obviously have to add some plumbing for
> > daemons and the capability for the user to get the full list, but that
> > should be trivial.
> >
> > Thoughts?
[DISCUSS] Changing the default class path for clients
This morning I had a bit of a shower thought:

With the new shaded hadoop client in 3.0, is there any reason the default classpath should remain the full blown jar list? e.g., shouldn’t ‘hadoop classpath’ just return configuration, user supplied bits (e.g., HADOOP_USER_CLASSPATH, etc), HADOOP_OPTIONAL_TOOLS, and hadoop-client-runtime? We’d obviously have to add some plumbing for daemons and the capability for the user to get the full list, but that should be trivial.

Thoughts?
[jira] [Created] (MAPREDUCE-6874) Make DistributedCache check if the content of a directory has changed
Attila Sasvari created MAPREDUCE-6874:
-

Summary: Make DistributedCache check if the content of a directory has changed
Key: MAPREDUCE-6874
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6874
Project: Hadoop Map/Reduce
Issue Type: New Feature
Reporter: Attila Sasvari

DistributedCache does not check recursively whether the content of a directory has changed when adding files to it with {{DistributedCache.addCacheFile()}}.

h5. Background

I have an Oozie workflow on HDFS:
{code}
example_workflow
├── job.properties
├── lib
│   ├── components
│   │   ├── sub-component.sh
│   │   └── subsub
│   │       └── subsub.sh
│   ├── main.sh
│   └── sub.sh
└── workflow.xml
{code}

I executed the workflow, then made some changes to {{subsub.sh}} and replaced the file on HDFS. When I re-ran the workflow, DistributedCache did not notice the changes because the timestamp on the {{components}} directory did not change. As a result, the old script was materialized. This behaviour might be related to [determineTimestamps()|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/filecache/ClientDistributedCacheManager.java#L84]. In order to use the new script during workflow execution, I had to update the whole {{components}} directory.

h6. Some more info:

In Oozie, [DistributedCache.addCacheFile()|https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java#L625] is used to add files to the distributed cache.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
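The reported behaviour can be illustrated on a plain local filesystem, since POSIX directory timestamps work the same way as the HDFS directory modification times involved here: a directory's mtime only changes when entries are added, removed, or renamed in it, not when a file deeper in the tree is modified. The sketch below is not Hadoop code; the directory names merely mirror the workflow layout from the report, and the final `find` line shows what a recursive check would have to look at instead.

```shell
# Reproduce the timestamp blind spot with plain POSIX tools.
workdir=$(mktemp -d)
mkdir -p "$workdir/components/subsub"
echo 'echo v1' > "$workdir/components/subsub/subsub.sh"

# GNU stat (-c %Y); on BSD/macOS use `stat -f %m` instead.
before=$(stat -c %Y "$workdir/components")
sleep 1
echo 'echo v2' > "$workdir/components/subsub/subsub.sh"  # modify nested file
after=$(stat -c %Y "$workdir/components")

# The ancestor directory's mtime is unchanged even though the tree changed,
# which is exactly why a check on components/ alone misses the update.
[ "$before" = "$after" ] && echo "components/ mtime unchanged"

# A recursive check would instead take the newest mtime anywhere in the tree:
newest=$(find "$workdir/components" -type f -exec stat -c %Y {} + | sort -n | tail -1)

rm -rf "$workdir"
```

Under this model, {{determineTimestamps()}} would detect the change only if it walked the tree and compared something like `newest` above, rather than the top-level directory's own timestamp.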