Apache Hadoop qbt Report: trunk+JDK8 on Windows/x64
For more details, see https://builds.apache.org/job/hadoop-trunk-win/530/

[Jul 16, 2018 4:24:21 PM] (ericp) YARN-8421: when moving app, activeUsers is increased, even though app
[Jul 16, 2018 4:46:21 PM] (inigoiri) HDFS-13475. RBF: Admin cannot enforce Router enter SafeMode. Contributed
[Jul 16, 2018 5:51:23 PM] (weichiu) HDFS-13524. Occasional "All datanodes are bad" error in
[Jul 16, 2018 5:54:41 PM] (wangda) YARN-8511. When AM releases a container, RM removes allocation tags
[Jul 16, 2018 5:57:37 PM] (wangda) YARN-8361. Change App Name Placement Rule to use App Name instead of App
[Jul 16, 2018 5:58:00 PM] (wangda) YARN-8524. Single parameter Resource / LightWeightResource constructor
[Jul 16, 2018 6:32:45 PM] (weichiu) HADOOP-15598. DataChecksum calculate checksum is contented on hashtable
[Jul 16, 2018 8:19:53 PM] (xiao) HDFS-13690. Improve error message when creating encryption zone while
[Jul 16, 2018 9:38:49 PM] (eyang) YARN-8538. Fixed memory leaks in container-executor and test cases.
[Jul 16, 2018 9:41:23 PM] (eyang) YARN-8299. Added CLI and REST API for query container status.
[Jul 16, 2018 10:45:55 PM] (weichiu) HDFS-13485. DataNode WebHDFS endpoint throws NPE. Contributed by Siyao
[Jul 17, 2018 1:24:18 AM] (shv) Fix potential FSImage corruption. Contributed by Ekanth Sethuramalingam
[Jul 17, 2018 2:45:08 AM] (yqlin) HDFS-13733. RBF: Add Web UI configurations and descriptions to RBF

-1 overall

The following subsystems voted -1:
   compile mvninstall pathlen unit

The following subsystems voted -1 but were configured to be filtered/ignored:
   cc javac

The following subsystems are considered long running (runtime bigger than 1h 00m 00s):
   unit

Specific tests:

   Failed junit tests :
      hadoop.crypto.TestCryptoStreamsWithOpensslAesCtrCryptoCodec
      hadoop.fs.contract.rawlocal.TestRawlocalContractAppend
      hadoop.fs.TestFileUtil
      hadoop.fs.TestFsShellCopy
      hadoop.fs.TestFsShellList
      hadoop.http.TestHttpServer
      hadoop.http.TestHttpServerLogs
      hadoop.io.nativeio.TestNativeIO
      hadoop.ipc.TestSocketFactory
      hadoop.metrics2.impl.TestStatsDMetrics
      hadoop.security.TestSecurityUtil
      hadoop.security.TestShellBasedUnixGroupsMapping
      hadoop.security.token.TestDtUtilShell
      hadoop.util.TestDiskCheckerWithDiskIo
      hadoop.util.TestNativeCodeLoader
      hadoop.hdfs.qjournal.server.TestJournalNode
      hadoop.hdfs.qjournal.server.TestJournalNodeSync
      hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks
      hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles
      hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistLockedMemory
      hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistPolicy
      hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaPlacement
      hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery
      hadoop.hdfs.server.datanode.fsdataset.impl.TestProvidedImpl
      hadoop.hdfs.server.datanode.TestBlockPoolSliceStorage
      hadoop.hdfs.server.datanode.TestBlockScanner
      hadoop.hdfs.server.datanode.TestDataNodeFaultInjector
      hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
      hadoop.hdfs.server.datanode.TestDirectoryScanner
      hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport
      hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand
      hadoop.hdfs.server.diskbalancer.TestDiskBalancerRPC
      hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
      hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
      hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints
      hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
      hadoop.hdfs.TestDatanodeStartupFixesLegacyStorageIDs
      hadoop.hdfs.TestDFSShell
      hadoop.hdfs.TestDFSStripedInputStream
      hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy
      hadoop.hdfs.TestDFSStripedOutputStreamWithFailure
      hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy
      hadoop.hdfs.TestDFSUpgradeFromImage
      hadoop.hdfs.TestFetchImage
      hadoop.hdfs.TestFileConcurrentReader
      hadoop.hdfs.TestHDFSFileSystemContract
      hadoop.hdfs.TestPread
      hadoop.hdfs.TestSecureEncryptionZoneWithKMS
      hadoop.hdfs.TestTrashWithSecureEncryptionZones
      hadoop.hdfs.tools.TestDFSAdmin
      hadoop.hdfs.web.TestWebHDFS
      hadoop.hdfs.web.TestWebHdfsUrl
      hadoop.fs.http.server.TestHttpFSServerWebServer
      hadoop.yarn.logaggregation.filecontroller.ifile.TestLogAggregationIndexFileController
      hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch
      hadoop.yarn.server.nodemanager.containermanager.TestAuxServices
      hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
      hadoop.yarn.server.nodemanager.r
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/840/

[Jul 17, 2018 2:45:08 AM] (yqlin) HDFS-13733. RBF: Add Web UI configurations and descriptions to RBF

-1 overall

The following subsystems voted -1:
   docker

Powered by Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
Re: Hadoop 3.2 Release Plan proposal
On 16 Jul 2018, at 23:45, Sunil G <sun...@apache.org> wrote:

> I would also like to take this opportunity to come up with a detailed plan.
> - Feature freeze date : all features should be merged by August 10, 2018.

That's three weeks from now. While I appreciate a faster cadence, saying "you now have three weeks to get everything in" is a bit sudden, and pretty aggressive given that many patches can languish unreviewed for months at a time. Also: it's the summer holidays.

> - Code freeze date : blockers/critical only, no improvements and non blocker/critical bug-fixes, August 24, 2018.

If the feature freeze is three weeks from now, I reserve the right to add new bits of code to the hadoop-aws and hadoop-azure modules as I see fit, up until this cutoff.

> - Release date: August 31, 2018

At least one RC will have to be built, put out, and voted on. I think it'd be good to have that RC generation process nailed down before Aug 24, but you have to build the vote *and the possibility of it failing* into the schedule.

> Please let me know if I missed any features targeted to 3.2 per this

Well, there are these big todo lists for S3 & S3Guard:
https://issues.apache.org/jira/browse/HADOOP-15226
https://issues.apache.org/jira/browse/HADOOP-15220

There's a bigger bit of work coming for Azure Datalake Gen 2:
https://issues.apache.org/jira/browse/HADOOP-15407

I don't think this is quite ready yet. I've been doing work on it, but if we have a three-week deadline, I'm going to expect some timely reviews on
https://issues.apache.org/jira/browse/HADOOP-15546

I've uprated that to a blocker feature; I will review the S3 & S3Guard JIRAs to see which of those are blocking. Then there is some pressing "guava, Java 9 prep" work on that timeline.

> I would like to volunteer myself as release manager of 3.2.0 release.

Well volunteered!

> Please let me know if you have any suggestions.

I think this raises a good question: what timetable should we have for the 3.2 & 3.3 releases? If we do want a faster cadence, then having an outline time from the 3.2 to the 3.3 release means there's less concern about things not making the 3.2 deadline.

-Steve
[jira] [Created] (HDFS-13741) Cosmetic code improvement in XAttrFormat
Xiao Chen created HDFS-13741:
--------------------------------

             Summary: Cosmetic code improvement in XAttrFormat
                 Key: HDFS-13741
                 URL: https://issues.apache.org/jira/browse/HDFS-13741
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Xiao Chen
            Assignee: Daniel Templeton

In an offline review, [~templedf] had a comment about the following code snippet:

{code:java title=XAttrFormat.java}
static int toInt(XAttr.NameSpace namespace, String name) {
  long xattrStatusInt = 0; // <-- this can be combined with the line below
  xattrStatusInt = NAMESPACE.BITS
      .combine(namespace.ordinal(), xattrStatusInt);
  int nid = XAttrStorage.getNameSerialNumber(name);
  xattrStatusInt = NAME.BITS // <-- no line break necessary
      .combine(nid, xattrStatusInt);
  return (int) xattrStatusInt;
}
{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
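For illustration only, here is a self-contained sketch of what the suggested cleanup amounts to. Note this is not the actual patch: `NAME_BITS`, the shift-based packing, and `getNameSerialNumber` here are hypothetical stand-ins for Hadoop's `LongBitFormat.combine` and `XAttrStorage` internals, which are not reproduced here.

```java
// Hypothetical stand-in for XAttrFormat's bit-packing: the low 24 bits hold
// the name serial number, the bits above them hold the namespace ordinal.
public class XAttrFormatSketch {
    private static final int NAME_BITS = 24;

    // Stand-in for XAttrStorage.getNameSerialNumber(name).
    static int getNameSerialNumber(String name) {
        return name.hashCode() & ((1 << NAME_BITS) - 1);
    }

    // The reviewed method after the cosmetic cleanup: the declaration and the
    // first combine are merged into one statement, and the second combine
    // fits on a single line.
    static int toInt(int namespaceOrdinal, String name) {
        long xattrStatusInt = ((long) namespaceOrdinal) << NAME_BITS;
        xattrStatusInt |= getNameSerialNumber(name);
        return (int) xattrStatusInt;
    }

    public static void main(String[] args) {
        int packed = toInt(1, "user.attr");
        // Round-trip check: the namespace ordinal sits above the name id.
        System.out.println(packed >>> NAME_BITS);                 // namespace
        System.out.println(packed & ((1 << NAME_BITS) - 1));      // name id
    }
}
```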
[jira] [Resolved] (HDFS-13150) [Edit Tail Fast Path] Allow SbNN to tail in-progress edits from JN via RPC
[ https://issues.apache.org/jira/browse/HDFS-13150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen resolved HDFS-13150.
--------------------------------
       Resolution: Fixed
    Fix Version/s: HDFS-12943

Closing this, as all sub-issues (HDFS-13607, HDFS-13608, HDFS-13609, HDFS-13610) have been completed. Thanks to all who helped with this new feature!

> [Edit Tail Fast Path] Allow SbNN to tail in-progress edits from JN via RPC
> --------------------------------------------------------------------------
>
>                 Key: HDFS-13150
>                 URL: https://issues.apache.org/jira/browse/HDFS-13150
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, hdfs, journal-node, namenode
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>            Priority: Major
>             Fix For: HDFS-12943
>
>      Attachments: edit-tailing-fast-path-design-v0.pdf, edit-tailing-fast-path-design-v1.pdf, edit-tailing-fast-path-design-v2.pdf
>
>
> In the interest of making coordinated/consistent reads easier to complete with low latency, it is advantageous to reduce the time between when a transaction is applied on the ANN and when it is applied on the SbNN. We propose adding a new "fast path" which can be used to tail edits when low latency is desired. We leave the existing tailing logic in place, and fall back to it on startup, on recovery, and when the fast path encounters unrecoverable errors.
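The fallback strategy the description outlines (try the low-latency RPC tail first, drop back to the existing tailing logic when the fast path fails) follows a common pattern. A minimal, hypothetical sketch is below; none of these class or method names are Hadoop's actual implementation, which lives in the HDFS-12943 branch.

```java
import java.util.function.Supplier;

// Hypothetical sketch of a "fast path with fallback" tailer, illustrating the
// strategy the JIRA describes. Real HDFS class names and signatures differ.
public class FallbackTailer {
    private final Supplier<Long> fastPath;  // e.g. in-progress edit tail via RPC
    private final Supplier<Long> slowPath;  // e.g. existing finalized-segment tailing
    private boolean fastPathHealthy = true;

    public FallbackTailer(Supplier<Long> fastPath, Supplier<Long> slowPath) {
        this.fastPath = fastPath;
        this.slowPath = slowPath;
    }

    // Tail one batch of edits, returning the number of transactions applied.
    public long tailOnce() {
        if (fastPathHealthy) {
            try {
                return fastPath.get();
            } catch (RuntimeException e) {
                // Unrecoverable fast-path error: revert to the existing logic.
                fastPathHealthy = false;
            }
        }
        return slowPath.get();
    }
}
```

The one-way `fastPathHealthy` flag mirrors the "fall back ... when the fast path encounters unrecoverable errors" wording; a real implementation would also re-enable the fast path after recovery.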
Re: Hadoop 3.2 Release Plan proposal
Hi Sunil,

For YARN service and docker related work, I'd like to propose the following features to be merged:

YARN-7129 Application Catalog for YARN applications
YARN-7512 Support service upgrade via YARN Service API and CLI
YARN-8220 Running Tensorflow on YARN with GPU and Docker - Examples

Thanks Regards,
Eric

On 7/17/18, 12:14 AM, "Zian Chen" wrote:

    Hi Sunil,

    Thanks for bringing up this proposal. For the feature release, should we put this feature in?

    YARN-8379 Preempt to balance in already satisfied queues

    Thanks
    Zian

    > On Jul 16, 2018, at 11:45 PM, Sunil G wrote:
    >
    > Hi All,
    >
    > To continue a faster cadence of releases to accommodate more features, we could plan a Hadoop 3.2 release around August end.
    >
    > To start the process sooner, and to establish a timeline, I propose to target the Hadoop 3.2.0 release by August end 2018 (about 1.5 months from now).
    >
    > I would also like to take this opportunity to come up with a detailed plan.
    >
    > - Feature freeze date : all features should be merged by August 10, 2018.
    > - Code freeze date : blockers/critical only, no improvements and non blocker/critical bug-fixes, August 24, 2018.
    > - Release date: August 31, 2018
    >
    > I have tried to come up with a list of features on my radar which could be candidates for a 3.2 release:
    >
    > - YARN-3409, Node Attributes support. (Owner: Naganarasimha/Sunil)
    > - YARN-8135, Hadoop Submarine project for DeepLearning workloads in YARN (Owner: Wangda Tan)
    > - YARN Native Service / Docker feature hardening and stabilization works in YARN
    >
    > There are several other HDFS features that want to be released with 3.2 as well; I am quoting a few here:
    >
    > - HDFS-10285 Storage Policy Satisfier (Owner: Uma/Rakesh)
    > - Improvements to HDFS-12615 Router-based HDFS federation
    >
    > Please let me know if I missed any features targeted to 3.2 per this timeline. I would like to volunteer myself as release manager of the 3.2.0 release.
    >
    > Please let me know if you have any suggestions.
    >
    > Thanks,
    > Sunil Govindan
[jira] [Created] (HDFS-13740) Prometheus /metrics http endpoint for metrics monitoring integration
Hari Sekhon created HDFS-13740:
--------------------------------

             Summary: Prometheus /metrics http endpoint for metrics monitoring integration
                 Key: HDFS-13740
                 URL: https://issues.apache.org/jira/browse/HDFS-13740
             Project: Hadoop HDFS
          Issue Type: New Feature
    Affects Versions: 2.7.3
            Reporter: Hari Sekhon

Feature request to add a Prometheus /metrics http endpoint for monitoring integration:

https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cscrape_config%3E
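For context on what such an endpoint would serve: Prometheus scrapes a plain-text exposition format over HTTP. A hypothetical sketch of rendering one metric in that format follows; the metric name and helper are illustrative and not part of any existing Hadoop code.

```java
// Hypothetical sketch: format a gauge in the Prometheus text exposition
// format, the kind of output a /metrics endpoint would serve.
public class PrometheusFormat {
    // Render one gauge as HELP, TYPE, and sample lines.
    static String gauge(String name, String help, double value) {
        return "# HELP " + name + " " + help + "\n"
             + "# TYPE " + name + " gauge\n"
             + name + " " + value + "\n";
    }

    public static void main(String[] args) {
        // Illustrative metric name; real metric names would be mapped from
        // Hadoop's existing JMX metrics.
        System.out.print(gauge("hdfs_namenode_capacity_used_bytes",
                               "Used cluster capacity in bytes", 1024.0));
    }
}
```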
[jira] [Created] (HDFS-13739) Option to disable Rack Local Write Preference to avoid Major Storage Imbalance across DataNodes caused by uneven spread of Datanodes across Racks
Hari Sekhon created HDFS-13739:
--------------------------------

             Summary: Option to disable Rack Local Write Preference to avoid Major Storage Imbalance across DataNodes caused by uneven spread of Datanodes across Racks
                 Key: HDFS-13739
                 URL: https://issues.apache.org/jira/browse/HDFS-13739
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: balancer & mover, block placement, datanode, fs, hdfs, hdfs-client, namenode, nn, performance
    Affects Versions: 2.7.3
         Environment: Hortonworks HDP 2.6
            Reporter: Hari Sekhon

The current HDFS write pattern of "local node, rack-local node, other-rack node" is good for most purposes, but when there is an uneven layout of datanodes across racks it can cause major storage imbalance across nodes, with some nodes filling up and others staying half empty. I have observed this on a cluster where half the nodes were 85% full and the other half were only 50% full.

Rack layouts like the following illustrate this: the nodes that share a rack will only choose each other for their rack-local replicas, so they fill up first, while the other nodes receive far fewer replica blocks:

{code:java}
NumNodes - Rack
2 - rack 1
2 - rack 2
1 - rack 3
1 - rack 4
1 - rack 5
1 - rack 6{code}

In this case, if I reduce the number of replicas to 2, I get an almost perfect spread of blocks across all datanodes, because HDFS has no choice but to place the only second replica on a different rack. If I increase the replicas back to 3, it goes back to 85% on half the nodes and 50% on the other half, because the extra replicas are placed only on rack-local nodes.

Why not just run the HDFS balancer to fix it, you might say? This is a heavily loaded HBase cluster. Aside from destroying HBase's data locality and performance by moving blocks out from underneath RegionServers, as soon as an HBase major compaction occurs (at least weekly), all blocks get re-written by HBase, and the HDFS client will again write to local node, rack-local node, other-rack node, resulting in the same storage imbalance. Hence this cannot be solved by running the HDFS balancer, either on HBase clusters or for any application sitting on top of HDFS that has any HDFS block churn.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
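The imbalance described above can be reproduced with a toy simulation. This is a deliberately simplified model of the default placement policy (first replica on the writer's node, second on a random node in another rack, third on another node in the second replica's rack when one exists, otherwise any remaining node); it is not Hadoop's `BlockPlacementPolicyDefault` code, and all names are illustrative.

```java
import java.util.*;

// Simplified model of default HDFS 3-replica placement over an uneven rack
// layout, to show the storage skew the report describes. Hypothetical code.
public class RackImbalanceSim {

    static Map<String, Integer> simulate(String[][] racks, int blocks, Random rnd) {
        Map<String, Integer> rackOf = new HashMap<>();
        List<String> nodes = new ArrayList<>();
        for (int r = 0; r < racks.length; r++)
            for (String n : racks[r]) { rackOf.put(n, r); nodes.add(n); }

        Map<String, Integer> count = new HashMap<>();
        for (int b = 0; b < blocks; b++) {
            // Replica 1: the (uniformly random) writer's local node.
            String r1 = nodes.get(rnd.nextInt(nodes.size()));
            // Replica 2: a random node on a different rack.
            String r2;
            do { r2 = nodes.get(rnd.nextInt(nodes.size())); }
            while (rackOf.get(r2).equals(rackOf.get(r1)));
            // Replica 3: another node on replica 2's rack if one exists,
            // otherwise any remaining node.
            String r3 = null;
            for (String n : racks[rackOf.get(r2)])
                if (!n.equals(r2)) r3 = n;
            while (r3 == null) {
                String c = nodes.get(rnd.nextInt(nodes.size()));
                if (!c.equals(r1) && !c.equals(r2)) r3 = c;
            }
            for (String n : Arrays.asList(r1, r2, r3))
                count.merge(n, 1, Integer::sum);
        }
        return count;
    }

    public static void main(String[] args) {
        // Layout from the report: two racks with 2 datanodes, four with 1.
        String[][] racks = {{"a1","a2"},{"b1","b2"},{"c1"},{"d1"},{"e1"},{"f1"}};
        Map<String, Integer> c = simulate(racks, 100000, new Random(42));
        // Nodes in the two-DN racks end up with noticeably more replicas,
        // because rack-local third replicas can only land next to them.
        for (Map.Entry<String, Integer> e : new TreeMap<>(c).entrySet())
            System.out.println(e.getKey() + " " + e.getValue());
    }
}
```

Under this model the nodes in the two-datanode racks accumulate roughly a third more replicas than the singleton-rack nodes, which is consistent with the 85%-vs-50% fill levels reported above.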