Repository: tez
Updated Branches:
  refs/heads/branch-0.5 261197438 -> f31319e48
TEZ-2768. Log a useful error message when the summary stream cannot be closed when shutting down an AM. (Jeff Zhang via hitesh)

Project: http://git-wip-us.apache.org/repos/asf/tez/repo
Commit: http://git-wip-us.apache.org/repos/asf/tez/commit/f31319e4
Tree: http://git-wip-us.apache.org/repos/asf/tez/tree/f31319e4
Diff: http://git-wip-us.apache.org/repos/asf/tez/diff/f31319e4

Branch: refs/heads/branch-0.5
Commit: f31319e4828266f94971eba03229847213020669
Parents: 2611974
Author: Hitesh Shah <[email protected]>
Authored: Fri Sep 4 16:30:53 2015 -0700
Committer: Hitesh Shah <[email protected]>
Committed: Fri Sep 4 16:30:53 2015 -0700

----------------------------------------------------------------------
 CHANGES.txt                                     | 235 +------------------
 .../dag/api/client/TimelineReaderFactory.java   |   7 +-
 .../dag/history/recovery/RecoveryService.java   |  18 +-
 3 files changed, 22 insertions(+), 238 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/tez/blob/f31319e4/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index e15aa89..9102456 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,241 +1,14 @@
 Apache Tez Change Log
 =====================
 
-<<<<<<< HEAD
-=======
-Release 0.6.3: Unreleased
-
-INCOMPATIBLE CHANGES
-
-ALL CHANGES:
-  TEZ-2745. ClassNotFoundException of user code should fail dag
-  TEZ-2752. logUnsuccessful completion in Attempt should write original finish
-    time to ATS
-  TEZ-2742. VertexImpl.finished() terminationCause hides member var of the
-    same name
-  TEZ-2732. DefaultSorter throws ArrayIndex exceptions on 2047 Mb size sort buffers
-  TEZ-2290. Scale memory for Default Sorter down to a max of 2047 MB if configured higher.
-  TEZ-2734. Add a test to verify the filename generated by OnDiskMerge.
-  TEZ-2687. ATS History shutdown happens before the min-held containers are released
-  TEZ-2629. LimitExceededException in Tez client when DAG has exceeds the default max counters
-  TEZ-2719. Consider reducing logs in unordered fetcher with shared-fetch option
-  TEZ-2630. TezChild receives IP address instead of FQDN.
-
-Release 0.6.2: 2015-08-07
-
-INCOMPATIBLE CHANGES
-
-ALL CHANGES:
-  TEZ-2311. AM can hang if kill received while recovering from previous attempt.
-  TEZ-2623. Fix module dependencies related to hadoop-auth.
-  TEZ-2560. fix tex-ui build for maven 3.3+
-  TEZ-2600. When used with HDFS federation(viewfs) ,tez will throw a error
-  TEZ-2579. Incorrect comparison of TaskAttemptId
-  TEZ-2549. Reduce Counter Load on the Timeline Server
-  TEZ-2548. TezClient submitDAG can hang if the AM is in the process of shutting down.
-  TEZ-2534. Error handling summary event when shutting down AM.
-  TEZ-2511. Add exitCode to diagnostics when container fails.
-  TEZ-2541. DAGClientImpl enable TimelineClient check is wrong.
-  TEZ-2489. Disable warn log for Timeline ACL error when tez.allow.disabled.timeline-domains set to true.
-  TEZ-2509. YarnTaskSchedulerService should not try to allocate containers if AM is shutting down.
-  TEZ-1529. ATS and TezClient integration in secure kerberos enabled cluster
-  TEZ-2483. Tez should close task if processor fail
-
-Release 0.6.1: 2015-05-18
-
-INCOMPATIBLE CHANGES
-
-ALL CHANGES:
-  TEZ-2437. FilterLinesByWord NPEs when run in Localmode
-  TEZ-2057. tez-dag/pom.xml contains versions for dependencies.
-  TEZ-2282. Delimit reused yarn container logs (stderr, stdout, syslog) with task attempt start/stop events
-  TEZ-2396. pig-tez-tfile-parser pom is hard coded to depend on 0.6.0-SNAPSHOT version.
-  TEZ-2237. Valid events should be sent out when an Output is not started.
-  TEZ-2399. Tez UI: add proper dependencies for computed properties
-  TEZ-1988. Tez UI: does not work when using file:// in a browser
-  TEZ-2256. Avoid use of BufferTooSmallException to signal end of buffer in UnorderedPartitionedKVWriter.
-  TEZ-2385. branch-0.6 compile failure caused by TEZ-2226.
-  TEZ-2390. tez-tools swimlane tool fails to parse large jobs >8K containers
-  TEZ-2380. Disable fall back to reading from timeline if timeline disabled.
-  TEZ-2226. Disable writing history to timeline if domain creation fails.
-  TEZ-2259. Push additional data to Timeline for Recovery for better consumption in UI.
-  TEZ-2365. Update tez-ui war's license/notice to reflect OFL license correctly.
-  TEZ-1969. Stop the DAGAppMaster when a local mode client is stopped
-  TEZ-2329. UI Query on final dag status performance improvement
-  TEZ-2287. Deprecate VertexManagerPluginContext.getTaskContainer().
-  TEZ-1909. Remove need to copy over all events from attempt 1 to attempt 2 dir
-  TEZ-2061. Tez UI: vertex id column and filter on tasks page should be changed to vertex name
-  TEZ-2242. Refactor ShuffleVertexManager code
-  TEZ-2205. Tez still tries to post to ATS when yarn.timeline-service.enabled=false.
-  TEZ-2047. Build fails against hadoop-2.2 post TEZ-2018
-  TEZ-2064. SessionNotRunning Exception not thrown is all cases
-  TEZ-2189. Tez UI live AM tracking url only works for localhost addresses
-  TEZ-2179. Timeline relatedentries missing cause exaggerated warning.
-  TEZ-2168. Fix application dependencies on mutually exclusive artifacts: tez-yarn-timeline-history
-    and tez-yarn-timeline-history-with-acls.
-  TEZ-2190. TestOrderedWordCount fails when generateSplitsInClient set to true.
-  TEZ-2091. Add support for hosting TEZ_UI with nodejs.
-  TEZ-2165. Tez UI: DAG shows running status if killed by RM in some cases.
-  TEZ-2158. TEZ UI: Display dag/vertex names, and task/attempt index in breadcrumb.
-  TEZ-2160. Tez UI: App tracking URL should support navigation back.
-  TEZ-2147. Swimlanes: Improved tooltip
-  TEZ-2142. TEZ UI: Breadcrumb border color looks out of place in wrapped mode.
-  TEZ-2134. TEZ UI: On request failure, display request URL and server name in error bar.
-  TEZ-2136. Some enhancements to the new Tez UI.
-  TEZ-2135. ACL checks handled incorrectly in AMWebController.
-  TEZ-1990. Tez UI: DAG details page shows Nan for end time when a DAG is running.
-  TEZ-2116. Tez UI: dags page filter does not work if more than one filter is specified.
-  TEZ-2106. TEZ UI: Display data load time, and add a refresh button for items that can be refreshed.
-  TEZ-2114. Tez UI: task/task attempt status is not available when its running.
-  TEZ-2112. Tez UI: fix offset calculation, add home button to breadcrumbs.
-  TEZ-2038. TEZ-UI DAG is always running in tez-ui when the app is failed but no DAGFinishedEvent is logged.
-  TEZ-2102. Tez UI: DAG view has hidden edges, dragging DAG by holding vertex causes unintended click.
-  TEZ-2101. Tez UI: Issues on displaying a table.
-  TEZ-2092. Tez UI history url handler injects spurious trailing slash.
-  TEZ-2098. Tez UI: Dag details should be the default page for dag, fix invalid time entries for failed Vertices.
-  TEZ-2024. TaskFinishedEvent may not be logged in recovery.
-  TEZ-2031. Tez UI: horizontal scrollbars do not appear in tables, causing them to look truncated.
-  TEZ-2073. SimpleHistoryLoggingService cannot be read by log aggregation (umask)
-  TEZ-2078. Tez UI: Task logs url use in-progress url causing various errors.
-  TEZ-2077. Tez UI: No diagnostics on Task Attempt Details page if task attempt failed.
-  TEZ-2065. Setting up tez.tez-ui.history-url.base with a trailing slash can result in failures to redirect correctly.
-  TEZ-2068. Tez UI: Dag view should use full window height, disable webuiservice in localmode.
-  TEZ-2079. Tez UI: trailing slash in timelineBaseUrl in ui should be handled.
-  TEZ-2069. Tez UI: appId should link to application in dag details view.
-  TEZ-2063. Tez UI: Flaky log url in tasks table.
-  TEZ-2062. Tez UI: Showing 50 elements not working properly.
-  TEZ-1661. LocalTaskScheduler hangs when shutdown.
-  TEZ-1917. Examples should extend TezExampleBase.
-  TEZ-2056. Tez UI: fix VertexID filter,show only tez configs by default,fix appattemptid.
-  TEZ-2052. Tez UI: log view fixes, show version from build, better handling of ats url config.
-  TEZ-2043. Tez UI: add progress info from am webservice to dag and vertex views.
-  TEZ-2032. Update CHANGES.txt to show 0.6.0 is released
-  TEZ-2018. App Tracking and History URL should point to the Tez UI.
-  TEZ-2035. Make timeline server putDomain exceptions non-fatal - work-around
-  TEZ-1929. pre-empted tasks should be marked as killed instead of failed
-  TEZ-2017. TEZ UI - Dag view throwing error whild re-displaying additionals in some dags.
-  TEZ-2013. TEZ UI - App Details Page UI Nits
-  TEZ-2014. Tez UI: Nits : All tables, Vertices Page UI.
-  TEZ-2012. TEZ UI: Show page number in all tables, and display more readable task/attempt ids.
-  TEZ-1973. Dag View
-  TEZ-2010. History payload generated from conf has ${var} placeholders.
-  TEZ-1946. Tez UI: add source & sink views, add counters to vertices/all task views.
-  TEZ-1987. Tez UI non-standalone mode uses invalid protocol.
-  TEZ-1983. Tez UI swimlane task attempt link is broken
-  TEZ-2326. Update branch 0.6 version to 0.6.1-SNAPSHOT.
-
-Release 0.6.0: 2015-01-23
-
-INCOMPATIBLE CHANGES
-
-ALL CHANGES:
-  TEZ-1977. Fixup CHANGES.txt with Tez UI jiras
-  TEZ-1743. Add versions-maven-plugins artifacts to gitignore
-  TEZ-1968. Tez UI - All vertices of DAG are not listed in vertices page
-  TEZ-1890. tez-ui web.tar.gz also being uploaded to maven repository
-  TEZ-1938. Build warning duplicate jersey-json definitions
-  TEZ-1910. Build fails against hadoop-2.2.0.
-  TEZ-1882. Tez UI build does not work on Windows
-  TEZ-1915. Add public key to KEYS
-  TEZ-1907. Fix javadoc warnings in tez codebase
-  TEZ-1891. Incorrect number of Javadoc warnings reported
-  TEZ-1762. Lots of unit tests do not have timeout parameter set.
-  TEZ-1886. remove deprecation warnings for tez-ui on the console.
-  TEZ-1875. dropdown filters do not work on vertices and task attempts page.
-  TEZ-1873. TestTezAMRMClient fails due to host resolution timing out.
-  TEZ-1881. Setup initial test-patch script for TEZ-1313.
-  TEZ-1864. move initialization code dependent on config params to App.ready.
-  TEZ-1870. Time displayed in the UI is in GMT.
-  TEZ-1858. Docs for deploying/using the Tez UI.
-  TEZ-1859. TestGroupedSplits has commented out test: testGzip.
-  TEZ-1868. Document how to do Windows builds due to with ACL symlink build changes.
-  TEZ-1872. docs/src/site/custom/project-info-report.properties needs license header.
-  TEZ-1850. Enable deploy for tez-ui war.
-  TEZ-1841. Remove range versions for dependencies in tez-ui.
-  TEZ-1854. Failing tests due to host resolution timing out.
-  TEZ-1860. mvn apache-rat:check broken for tez-ui.
-  TEZ-1866. remove the "original" directory under tez-ui
-  TEZ-1591. Add multiDAG session test and move TestLocalMode to tez-tests
-  TEZ-1769. ContainerCompletedWhileRunningTransition should inherit from TerminatedWhileRunningTransition
-  TEZ-1849. Fix tez-ui war file licensing.
-  TEZ-1840. Document TezTaskOutput.
-  TEZ-1576. Class level comment in {{MiniTezCluster}} ends abruptly.
-  TEZ-1838. tez-ui/src/main/webapp/bower.json gets updated after compiling source code.
-  TEZ-1789. Move speculator processing off the central dispatcher.
-  TEZ-1610. Add additional task counters for fetchers, merger.
-  TEZ-1847. Fix package name for MiniTezClusterWithTimeline.
-  TEZ-1846. Build fails with package org.apache.tez.dag.history.logging.ats does not exist.
-  TEZ-1696. Make Tez use the domain-based timeline ACLs.
-  TEZ-1835. TestFaultTolerance#testRandomFailingTasks is timing out
-  TEZ-1832. TestSecureShuffle fails with NoClassDefFoundError: org/bouncycastle/x509/X509V1CertificateGenerator
-  TEZ-1672. Update jetty to use stable 7.x version - 7.6.16.v20140903.
-  TEZ-1822. Docs for Timeline/ACLs/HistoryText.
-  TEZ-1252. Change wording on http://tez.apache.org/team-list.html related to member confusion.
-  TEZ-1805. Tez client DAG cycle detection should detect self loops
-  TEZ-1816. It is possible to receive START event when DAG is failed
-  TEZ-1787. Counters for speculation
-  TEZ-1773. Add attempt failure cause enum to the attempt failed/killed
-    history record
-  TEZ-14. Support MR like speculation capabilities based on latency deviation
-    from the mean
-  TEZ-1733. TezMerger should sort FileChunks on size when merging
-  TEZ-1738. Tez tfile parser for log parsing
-  TEZ-1627. Remove OUTPUT_CONSUMABLE and related Event in TaskAttemptImpl
-  TEZ-1736. Add support for Inputs/Outputs in runtime-library to generate history text data.
-  TEZ-1721. Update INSTALL instructions for clarifying tez client jars
-    compatibility with runtime tarball on HDFS.
-  TEZ-1690. TestMultiMRInput tests fail because of user collisions.
-  TEZ-1687. Use logIdentifier of Vertex for logging.
-  TEZ-1737. Should add taskNum in VertexFinishedEvent.
-  TEZ-1772. Failing tests post TEZ-1737.
-  TEZ-1785. Remove unused snappy-java dependency.
-  TEZ-1685. Remove YARNMaster which is never used.
-  TEZ-1797. Create necessary content for Tez DOAP file.
-  TEZ-1650. Please create a DOAP file for your TLP.
-  TEZ-1697. DAG submission fails if a local resource added is already part of tez.lib.uris
-  TEZ-1060 Add randomness to fault tolerance tests
-  TEZ-1790. DeallocationTaskRequest may been handled before corresponding AllocationTaskRequest in local mode
-  TEZ-1949. Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges.
-
-TEZ-UI CHANGES (TEZ-8):
-  TEZ-1823. default ATS url should be the same host as ui
-  TEZ-1817. Add configuration and build details to README
-  TEZ-1813. display loading and other error messages in tez-ui
-  TEZ-1809. Provide a error bar to report errors to users in Tez-UI
-  TEZ-1810. Do not deploy tez-ui war to maven repo
-  TEZ-1709. Bunch of files in tez-ui missing Apache license header
-  TEZ-1784. Attempt details in tasks table.
-  TEZ-1801. Update build instructions for tez-ui
-  TEZ-1757. Column selector for tables.
-  TEZ-1794. Vertex view needs a task attempt rollup
-  TEZ-1791. breadcrumbs for moving between pages.
-  TEZ-1781. Configurations view ~ New design
-  TEZ-1605. Landing page for Tez UI
-  TEZ-1600. Swimlanes View for Tez UI
-  TEZ-1799. Enable Cross Origin Support in Tez UI
-  TEZ-1634. Fix compressed IFile shuffle errors
-  TEZ-1615. Skeleton framework for Tez UI
-  TEZ-1604. Task View for Tez UI
-  TEZ-1603. Vertex View for Tez UI.
-  TEZ-1720. Allow filters in all tables and also to pass in filters using url params.
-  TEZ-1708. Make UI part of TEZ build process.
-  TEZ-1617. Shim layer for Tez UI for use within Ambari.
-  TEZ-1741. App view.
-  TEZ-1751. Log view & download links in task and task attempt view.
-  TEZ-1753. Queue in dags view.
-  TEZ-1765. Allow dropdown lists in table filters.
-  TEZ-1606. Counters View for DAG, Vertex, and Task.
-  TEZ-1768. follow up jira to address minor issues in Tez-ui.
-  TEZ-1783. Wrapper in standalone mode.
-  TEZ-1820. Fix wrong links.
-
->>>>>>> f8cabdc... TEZ-2752. logUnsuccessful completion in Attempt should write original finish time to ATS (bikas)
 Release 0.5.5: Unreleased
 
 INCOMPATIBLE CHANGES
   TEZ-2552. CRC errors can cause job to run for very long time in large jobs.
 
 ALL CHANGES:
+  TEZ-2768. Log a useful error message when the summary stream cannot be closed when shutting
+    down an AM.
   TEZ-2745. ClassNotFoundException of user code should fail dag
   TEZ-2752. logUnsuccessful completion in Attempt should write original finish
     time to ATS
@@ -246,12 +19,8 @@ ALL CHANGES:
   TEZ-2734. Add a test to verify the filename generated by OnDiskMerge.
   TEZ-2687. ATS History shutdown happens before the min-held containers are released
   TEZ-2629. LimitExceededException in Tez client when DAG has exceeds the default max counters
-<<<<<<< HEAD
   TEZ-2719. Consider reducing logs in unordered fetcher with shared-fetch option
-  TEZ-2630. TezChild receives IP address instead of FQDN. (hitesh)
-=======
   TEZ-2630. TezChild receives IP address instead of FQDN.
->>>>>>> f8cabdc... TEZ-2752. logUnsuccessful completion in Attempt should write original finish time to ATS (bikas)
   TEZ-2635. Limit number of attempts being downloaded in unordered fetch.
   TEZ-2636. MRInput and MultiMRInput should work for cases when there are 0 physical inputs.
   TEZ-2600. When used with HDFS federation(viewfs) ,tez will throw a error

http://git-wip-us.apache.org/repos/asf/tez/blob/f31319e4/tez-api/src/main/java/org/apache/tez/dag/api/client/TimelineReaderFactory.java
----------------------------------------------------------------------
diff --git a/tez-api/src/main/java/org/apache/tez/dag/api/client/TimelineReaderFactory.java b/tez-api/src/main/java/org/apache/tez/dag/api/client/TimelineReaderFactory.java
index c0569dd..4a8e172 100644
--- a/tez-api/src/main/java/org/apache/tez/dag/api/client/TimelineReaderFactory.java
+++ b/tez-api/src/main/java/org/apache/tez/dag/api/client/TimelineReaderFactory.java
@@ -37,6 +37,9 @@ import com.sun.jersey.api.client.config.DefaultClientConfig;
 import com.sun.jersey.client.urlconnection.HttpURLConnectionFactory;
 import com.sun.jersey.client.urlconnection.URLConnectionClientHandler;
 import com.sun.jersey.json.impl.provider.entity.JSONRootElementProvider;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
 import org.apache.hadoop.classification.InterfaceAudience;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.security.UserGroupInformation;
@@ -46,8 +49,6 @@ import org.apache.hadoop.security.authentication.client.ConnectionConfigurator;
 import org.apache.hadoop.security.ssl.SSLFactory;
 import org.apache.tez.common.ReflectionUtils;
 import org.apache.tez.dag.api.TezException;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 
 /*
  * TimelineReaderFactory getTimelineReaderStrategy returns a Strategy class, which is used to
@@ -63,7 +64,7 @@ import org.slf4j.LoggerFactory;
 @InterfaceAudience.Private
 public class TimelineReaderFactory {
 
-  private static final Logger LOG = LoggerFactory.getLogger(TimelineReaderFactory.class);
+  private static final Log LOG = LogFactory.getLog(TimelineReaderFactory.class);
 
   private static final String KERBEROS_DELEGATION_TOKEN_AUTHENTICATOR_CLAZZ_NAME =
       "org.apache.hadoop.security.token.delegation.web.KerberosDelegationTokenAuthenticator";

http://git-wip-us.apache.org/repos/asf/tez/blob/f31319e4/tez-dag/src/main/java/org/apache/tez/dag/history/recovery/RecoveryService.java
----------------------------------------------------------------------
diff --git a/tez-dag/src/main/java/org/apache/tez/dag/history/recovery/RecoveryService.java b/tez-dag/src/main/java/org/apache/tez/dag/history/recovery/RecoveryService.java
index 365c6a6..130e856 100644
--- a/tez-dag/src/main/java/org/apache/tez/dag/history/recovery/RecoveryService.java
+++ b/tez-dag/src/main/java/org/apache/tez/dag/history/recovery/RecoveryService.java
@@ -212,7 +212,13 @@ public class RecoveryService extends AbstractService {
         summaryStream.hflush();
         summaryStream.close();
       } catch (IOException ioe) {
-        LOG.warn("Error when closing summary stream", ioe);
+        if (!recoveryDirFS.exists(recoveryPath)) {
+          LOG.warn("Ignoring error while closing summary stream."
+              + " The recovery directory at " + recoveryPath
+              + " has already been deleted externally");
+        } else {
+          LOG.warn("Error when closing summary stream", ioe);
+        }
       }
     }
     for (Entry<TezDAGID, FSDataOutputStream> entry : outputStreamMap.entrySet()) {
@@ -221,7 +227,15 @@ public class RecoveryService extends AbstractService {
         entry.getValue().hflush();
         entry.getValue().close();
       } catch (IOException ioe) {
-        LOG.warn("Error when closing output stream", ioe);
+        if (!recoveryDirFS.exists(recoveryPath)) {
+          LOG.warn("Ignoring error while closing output stream."
+              + " The recovery directory at " + recoveryPath
+              + " has already been deleted externally");
+          // avoid closing other outputStream as the recovery directory has already been deleted.
+          break;
+        } else {
+          LOG.warn("Error when closing output stream", ioe);
+        }
       }
     }
   }
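
For readers skimming the RecoveryService hunks, the following is a minimal standalone sketch of the same idea: when closing a stream fails, check whether the recovery directory still exists before deciding which warning to emit. It is not the Tez code. It assumes plain java.io/java.nio types in place of Hadoop's FileSystem and FSDataOutputStream, and the class and method names (CloseErrorSketch, closeAndDescribe) are hypothetical, chosen only for illustration. It returns the message instead of logging it so the behavior is easy to check.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not part of the Tez commit.
public class CloseErrorSketch {

    /**
     * Closes the stream. Returns null on success, otherwise the warning
     * text that would be logged for this failure.
     */
    public static String closeAndDescribe(Closeable stream, Path recoveryDir, String streamName) {
        try {
            stream.close();
            return null;
        } catch (IOException ioe) {
            if (!Files.exists(recoveryDir)) {
                // The directory is gone, so the close failure is expected noise.
                return "Ignoring error while closing " + streamName + "."
                    + " The recovery directory at " + recoveryDir
                    + " has already been deleted externally";
            }
            // The directory is still there, so report the real failure.
            return "Error when closing " + streamName + ": " + ioe.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("recovery");
        // A stream whose close() always fails, standing in for a broken output stream.
        Closeable failing = () -> { throw new IOException("simulated close failure"); };

        System.out.println(closeAndDescribe(failing, dir, "summary stream"));
        Files.delete(dir);
        System.out.println(closeAndDescribe(failing, dir, "summary stream"));
    }
}
```

In the commit itself the second loop also breaks out once the directory is found missing, since the remaining per-DAG streams live under the same deleted path; the sketch leaves that loop out for brevity.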
