[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383438#comment-14383438 ] Zhijie Shen commented on YARN-3047:
------------------------------------

Yeah, I did {{patch -p0}}, but applying failed at {{yarn.cmd}}. [~gtCarrera9], does this patch work on your box?

bq. My assumption here is that we only care about per-application partial order, but not a total order on the timeline server.

IMHO, we may care about total order. To be more accurate, it may not be "total", but among the applications of the same flow. I think eventually we should support reading from the collector to get up-to-date/realtime timeline data.

bq. If later we deem that the view is too stale and we need to get the more recent data from the timeline collector, we could add that functionality in a separate effort.

Exactly, I agree with this prioritization. Let's start from reading the backend, and we will get a better sense of how stale the timeline data will be.

> [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-3047
>                 URL: https://issues.apache.org/jira/browse/YARN-3047
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Sangjin Lee
>            Assignee: Varun Saxena
>         Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.02.patch, YARN-3047.04.patch
>
> Per design in YARN-2938, set up the ATS reader as a service and implement the basic structure as a service. It includes lifecycle management, request serving, and so on.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3405) FairScheduler's preemption cannot happen between sibling in some case
Peng Zhang created YARN-3405:
--------------------------------

             Summary: FairScheduler's preemption cannot happen between siblings in some cases
                 Key: YARN-3405
                 URL: https://issues.apache.org/jira/browse/YARN-3405
             Project: Hadoop YARN
          Issue Type: Bug
          Components: fairscheduler
    Affects Versions: 2.7.0
            Reporter: Peng Zhang
            Priority: Critical

Queue hierarchy described as below:
{noformat}
        root
        /
    queue-1
    /     \
queue-1-1  queue-1-2
{noformat}
1. queue-1-1 is active and has been assigned all resources.
2. queue-1-2 becomes active and causes a new preemption request.
3. But preemption now starts from root, finds that queue-1 is not over its fair share, and so does not recurse into queue-1-1.
4. Finally queue-1-2 keeps waiting for resources to be released from queue-1-1 itself.
[jira] [Updated] (YARN-3405) FairScheduler's preemption cannot happen between sibling in some case
[ https://issues.apache.org/jira/browse/YARN-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peng Zhang updated YARN-3405:
---------------------------------

    Description:
Queue hierarchy described as below:
{noformat}
        root
         |
     queue-1
     /     \
queue-1-1  queue-1-2
{noformat}
1. queue-1-1 is active and has been assigned all resources.
2. queue-1-2 becomes active and causes a new preemption request.
3. But preemption now starts from root, finds that queue-1 is not over its fair share, and so does not recurse into queue-1-1.
4. Finally queue-1-2 keeps waiting for resources to be released from queue-1-1 itself.

  was:
Queue hierarchy described as below:
{noformat}
        root
        /
    queue-1
    /     \
queue-1-1  queue-1-2
{noformat}
1. queue-1-1 is active and has been assigned all resources.
2. queue-1-2 becomes active and causes a new preemption request.
3. But preemption now starts from root, finds that queue-1 is not over its fair share, and so does not recurse into queue-1-1.
4. Finally queue-1-2 keeps waiting for resources to be released from queue-1-1 itself.

> FairScheduler's preemption cannot happen between siblings in some cases
> ------------------------------------------------------------------------
>
>                 Key: YARN-3405
>                 URL: https://issues.apache.org/jira/browse/YARN-3405
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.7.0
>            Reporter: Peng Zhang
>            Priority: Critical
>
> Queue hierarchy described as below:
> {noformat}
>         root
>          |
>      queue-1
>      /     \
> queue-1-1  queue-1-2
> {noformat}
> 1. queue-1-1 is active and has been assigned all resources.
> 2. queue-1-2 becomes active and causes a new preemption request.
> 3. But preemption now starts from root, finds that queue-1 is not over its fair share, and so does not recurse into queue-1-1.
> 4. Finally queue-1-2 keeps waiting for resources to be released from queue-1-1 itself.
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383440#comment-14383440 ] Varun Saxena commented on YARN-3047:

Ok, we can look at it later and implement it if required.
[jira] [Created] (YARN-3406) Add a numUsedContainer for RM Web UI and REST API
Ryu Kobayashi created YARN-3406:
-----------------------------------

             Summary: Add a numUsedContainer for RM Web UI and REST API
                 Key: YARN-3406
                 URL: https://issues.apache.org/jira/browse/YARN-3406
             Project: Hadoop YARN
          Issue Type: Improvement
            Reporter: Ryu Kobayashi
            Priority: Minor

Show the number of used containers in the list of all applications, and also add it to the REST API.
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383444#comment-14383444 ] Varun Saxena commented on YARN-3047:

Are you applying it on a Windows machine or on Linux? It might have to do with CRLF characters. I converted the yarn.cmd part of the patch using the unix2dos command, because on the normal Jenkins server it doesn't apply without this.
[jira] [Updated] (YARN-3406) Add a numUsedContainer for RM Web UI and REST API
[ https://issues.apache.org/jira/browse/YARN-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryu Kobayashi updated YARN-3406:
------------------------------------

    Attachment: YARN-3406.1.patch

> Add a numUsedContainer for RM Web UI and REST API
> --------------------------------------------------
>
>                 Key: YARN-3406
>                 URL: https://issues.apache.org/jira/browse/YARN-3406
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Ryu Kobayashi
>            Priority: Minor
>         Attachments: YARN-3406.1.patch, screenshot.png
>
> Show the number of used containers in the list of all applications, and also add it to the REST API.
[jira] [Updated] (YARN-3406) Add a numUsedContainer for RM Web UI and REST API
[ https://issues.apache.org/jira/browse/YARN-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryu Kobayashi updated YARN-3406:
------------------------------------

    Attachment: screenshot.png
[jira] [Commented] (YARN-3405) FairScheduler's preemption cannot happen between sibling in some case
[ https://issues.apache.org/jira/browse/YARN-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383469#comment-14383469 ] Peng Zhang commented on YARN-3405:
--------------------------------------

And there is another related case which can cause a livelock between preemption and scheduling. If necessary, I will create a separate issue for it. Queue hierarchy described as below:
{noformat}
             root
           /  |  \
    queue-1 queue-2 queue-3
     /    \
queue-1-1  queue-1-2
{noformat}
# Assume the cluster resource is 100G of memory.
# Assume queue-1 has a max resource limit of 20G.
# queue-1-1 is active and will get at most 20G memory (equal to its fair share).
# queue-2 becomes active and requires 30G memory (less than its fair share).
# queue-3 becomes active and is assigned all remaining resources, 50G memory (larger than its fair share).
# queue-1-2 becomes active, causing a new preemption request (10G memory; intuitively it can only preempt from its sibling queue-1-1).
# Actually preemption starts from root, finds queue-3 is most over its fair share, and preempts some resources from queue-3.
# But during scheduling, queue-1 itself has reached its max resource limit and cannot be assigned any resource, so the resources are assigned to queue-3 again.
And then it repeats between the last two steps.
[jira] [Commented] (YARN-3403) Nodemanager dies after a small typo in mapred-site.xml is induced
[ https://issues.apache.org/jira/browse/YARN-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383484#comment-14383484 ] Nikhil Mulley commented on YARN-3403:
-----------------------------------------

Hi [~Naganarasimha], this is with Apache Hadoop 2.5.1.

> Nodemanager dies after a small typo in mapred-site.xml is induced
> ------------------------------------------------------------------
>
>                 Key: YARN-3403
>                 URL: https://issues.apache.org/jira/browse/YARN-3403
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Nikhil Mulley
>            Priority: Critical
>
> Hi,
> We have noticed that a small typo in an XML config (mapred-site.xml) can cause the nodemanager to go down completely without anyone stopping/restarting it externally.
> I find it a little weird that editing the config files on the filesystem can cause the running slave daemon, the yarn nodemanager, to shut down.
> In this case, a closing tag '/' was missing in a property, and that caused the nodemanager to go down in a cluster.
> Why would the nodemanager reload the configs while it is running? Are they not picked up when it is started? Even if it is automated to pick up new configs dynamically, I think an xmllint/config check should run before the nodemanager is asked to reload/restart.
> ---
> java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: file:/etc/hadoop/conf/mapred-site.xml; lineNumber: 228; columnNumber: 3; The element type "value" must be terminated by the matching end-tag "</value>".
>    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2348)
> ---
> Please shed light on this.
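The pre-validation suggested above can be sketched as a plain JAXP well-formedness check. This is illustrative only, not the NodeManager's actual config-reload path:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class ConfigPreCheck {
  /**
   * Returns true iff the given XML text is well-formed. A config file that
   * fails this check (e.g. an unterminated value element) could be rejected
   * before any running daemon is asked to reload it.
   */
  public static boolean isWellFormed(String xml) {
    try {
      DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      return true;
    } catch (Exception e) { // SAXParseException for malformed input
      return false;
    }
  }
}
```

Running such a check against the quoted mapred-site.xml would flag the missing end-tag up front instead of letting Configuration.loadResource throw at runtime.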
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383494#comment-14383494 ] Varun Saxena commented on YARN-3047:

[~gtCarrera], the threads I am talking about in the design doc are the HttpServer2 threads. HttpServer2 has a thread pool and will launch multiple threads to process requests, and hence our REST calls will also run in parallel. The thread pool in HttpServer2 works like this: if an idle thread is available, a job is dispatched to it directly; otherwise the job is queued. After queuing a job, if the total number of threads is less than the maximum pool size, a new thread is spawned. This is what I was talking about in the doc. Am I missing something?
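The queue-then-maybe-spawn behavior described above can be sketched as a toy pool. This is a simplification for illustration, not the actual code of the pool backing HttpServer2:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy pool: jobs always go through a queue; a new worker is spawned only
 *  when no worker is idle and the pool is still below its maximum size. */
class QueueThenSpawnPool {
  private final BlockingQueue<Runnable> jobs = new LinkedBlockingQueue<>();
  private final AtomicInteger threads = new AtomicInteger();
  private final AtomicInteger idle = new AtomicInteger();
  private final int maxThreads;

  QueueThenSpawnPool(int maxThreads) { this.maxThreads = maxThreads; }

  void dispatch(Runnable job) {
    jobs.add(job);                               // always enqueue first
    if (idle.get() == 0 && threads.get() < maxThreads) {
      threads.incrementAndGet();                 // no idle worker: grow pool
      Thread t = new Thread(this::workerLoop);
      t.setDaemon(true);
      t.start();
    }
  }

  private void workerLoop() {
    while (true) {
      idle.incrementAndGet();
      Runnable job;
      try {
        job = jobs.take();                       // block until a job arrives
      } catch (InterruptedException e) {
        return;
      }
      idle.decrementAndGet();
      job.run();
    }
  }
}
```

The point of the sketch is the ordering: work is parked in the queue first, and pool growth is a reaction to having no idle worker, bounded by maxThreads.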
[jira] [Created] (YARN-3407) HttpServer2 Max threads in TimelineCollectorManager should be more than 10
Varun Saxena created YARN-3407:
----------------------------------

             Summary: HttpServer2 Max threads in TimelineCollectorManager should be more than 10
                 Key: YARN-3407
                 URL: https://issues.apache.org/jira/browse/YARN-3407
             Project: Hadoop YARN
          Issue Type: Bug
          Components: timelineserver
            Reporter: Varun Saxena
            Assignee: Varun Saxena

Currently TimelineCollectorManager sets HttpServer2.HTTP_MAX_THREADS to just 10. This value might be too low for serving put requests. By default HttpServer2 has a max threads value of 250. We can probably make it configurable too, so that an optimum value can be configured based on the number of requests coming to the server. Thoughts?
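A minimal sketch of the configurability being proposed, with java.util.Properties standing in for Hadoop's Configuration class; the property name here is hypothetical and would be decided on this JIRA:

```java
import java.util.Properties;

class TimelineHttpConfig {
  // Hypothetical key; HttpServer2's own default max-threads value is 250.
  static final String MAX_THREADS_KEY = "yarn.timeline-service.http-max-threads";
  static final int DEFAULT_MAX_THREADS = 250;

  /** Reads the configured thread cap, falling back to the default. */
  static int maxThreads(Properties conf) {
    String v = conf.getProperty(MAX_THREADS_KEY);
    return v == null ? DEFAULT_MAX_THREADS : Integer.parseInt(v);
  }
}
```

With a knob like this, the collector would no longer hard-code 10 and operators could size the pool to their put-request load.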
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383504#comment-14383504 ] Varun Saxena commented on YARN-3047:

And I think the max number of HttpServer threads should not be merely 10. Filed YARN-3407 for it.
[jira] [Commented] (YARN-3405) FairScheduler's preemption cannot happen between sibling in some case
[ https://issues.apache.org/jira/browse/YARN-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383513#comment-14383513 ] zhihai xu commented on YARN-3405:
-------------------------------------

It looks like the code will still check queue-1-1 (leaf queue) even if queue-1 (parent queue) is not over its fair share. This is the code for FSParentQueue#preemptContainer; for this case candidateQueue will become queue-1 because candidateQueue is null at the beginning.
{code}
public RMContainer preemptContainer() {
  RMContainer toBePreempted = null;

  // Find the childQueue which is most over fair share
  FSQueue candidateQueue = null;
  Comparator<Schedulable> comparator = policy.getComparator();
  for (FSQueue queue : childQueues) {
    if (candidateQueue == null ||
        comparator.compare(queue, candidateQueue) > 0) {
      candidateQueue = queue;
    }
  }

  // Let the selected queue choose which of its container to preempt
  if (candidateQueue != null) {
    toBePreempted = candidateQueue.preemptContainer();
  }
  return toBePreempted;
}
{code}
Only a leaf queue will refuse preemption if it is not over its fair share. The following is the code for FSLeafQueue#preemptContainer:
{code}
public RMContainer preemptContainer() {
  RMContainer toBePreempted = null;

  // If this queue is not over its fair share, reject
  if (!preemptContainerPreCheck()) {
    return toBePreempted;
  }

  if (LOG.isDebugEnabled()) {
    LOG.debug("Queue " + getName() + " is going to preempt a container " +
        "from its applications.");
  }

  // Choose the app that is most over fair share
  Comparator<Schedulable> comparator = policy.getComparator();
  FSAppAttempt candidateSched = null;
  readLock.lock();
  try {
    for (FSAppAttempt sched : runnableApps) {
      if (candidateSched == null ||
          comparator.compare(sched, candidateSched) > 0) {
        candidateSched = sched;
      }
    }
  } finally {
    readLock.unlock();
  }

  // Preempt from the selected app
  if (candidateSched != null) {
    toBePreempted = candidateSched.preemptContainer();
  }
  return toBePreempted;
}
{code}
preemptContainerPreCheck is only called at the leaf queue. So for this case, the leaf queue queue-1-1 is over its fair share and will be preempted. Am I missing the code that prevents queue-1 (parent queue) from being recursively preempted?
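The reading above (the pre-check runs only at leaves; parents just recurse into the most-over-fair-share child) can be illustrated with a stripped-down model. These are toy classes for illustration, not the real FSQueue hierarchy:

```java
import java.util.ArrayList;
import java.util.List;

abstract class ToyQueue {
  final String name;
  final double usage, fairShare;
  ToyQueue(String name, double usage, double fairShare) {
    this.name = name; this.usage = usage; this.fairShare = fairShare;
  }
  double overage() { return usage - fairShare; }
  /** Name of the leaf queue a container would be preempted from, or null. */
  abstract String pickVictim();
}

class ToyLeaf extends ToyQueue {
  ToyLeaf(String n, double u, double f) { super(n, u, f); }
  String pickVictim() {
    // the pre-check: a leaf not over its fair share refuses preemption
    return overage() > 0 ? name : null;
  }
}

class ToyParent extends ToyQueue {
  final List<ToyQueue> children = new ArrayList<>();
  ToyParent(String n, double u, double f) { super(n, u, f); }
  String pickVictim() {
    // no pre-check here: recurse into the most-over-fair-share child
    ToyQueue candidate = null;
    for (ToyQueue q : children) {
      if (candidate == null || q.overage() > candidate.overage()) {
        candidate = q;
      }
    }
    return candidate == null ? null : candidate.pickVictim();
  }
}
```

With root holding only queue-1 (usage 100, fair share 100) and queue-1's leaves being queue-1-1 (usage 100, fair share 50) and queue-1-2 (usage 0, fair share 50), the walk selects queue-1 despite it not being over its fair share, and queue-1-1 then passes the leaf pre-check — matching the observation that queue-1-1 is still reachable for preemption.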
[jira] [Updated] (YARN-2901) Add errors and warning stats to RM, NM web UI
[ https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-2901:
------------------------------------

    Attachment: apache-yarn-2901.2.patch

Thanks for the review [~leftnoteasy]!
{quote}
1.1 Better to place in yarn-server-common?
1.2 If you agree above, how about put into package o.a.h.y.server.metrics (or utils)?
{quote}
I'd prefer not to move it. All the common web UI classes (for the existing web UI) are in hadoop-yarn-common, and I'd have to move everything over to hadoop-yarn-server-common.

bq. 1.3 Rename it to Log4jWarnErrorMetricsAppender?

Fixed.
{quote}
1.4 Comments about implementation: I think currently, the implementation of cleanup can be improved. Now the cutoff process of message/count basically loops over all stored items, which could be inefficient (imagine the number of stored messages exceeding the threshold); the existing logic in the patch could lead to lots of stored messages (tons of messages could be generated in 5 min, which is the purge-message task's run interval).
{quote}
Changed the purge implementation. I maintain a purge information structure that makes purging more efficient. I've also added the appender information to log4j.properties so that the appender can be enabled/disabled using YARN_ROOT_LOGGER.

> Add errors and warning stats to RM, NM web UI
> ----------------------------------------------
>
>                 Key: YARN-2901
>                 URL: https://issues.apache.org/jira/browse/YARN-2901
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, resourcemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>         Attachments: Exception collapsed.png, Exception expanded.jpg, Screen Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch, apache-yarn-2901.1.patch, apache-yarn-2901.2.patch
>
> It would be really useful to have statistics on the number of errors and warnings in the RM and NM web UI. I'm thinking about -
> 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day
> 2. The top 'n' (20?) most common exceptions in the past 5 min/1 hour/12 hours/day
> By errors and warnings I'm referring to the log level.
> I suspect we can probably achieve this by writing a custom appender? (I'm open to suggestions on alternate mechanisms for implementing this.)
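The custom-appender idea boils down to counting WARN/ERROR records by message. A sketch of that counting logic, written against java.util.logging to stay self-contained — the actual patch implements a log4j appender, not this class:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;

/** Counts WARNING-and-above log records, keyed by message text. */
class WarnErrorCountingHandler extends Handler {
  final Map<String, Integer> counts = new ConcurrentHashMap<>();

  @Override
  public void publish(LogRecord record) {
    // Only records at WARNING level or above are tallied.
    if (record.getLevel().intValue() >= Level.WARNING.intValue()) {
      counts.merge(record.getMessage(), 1, Integer::sum);
    }
  }

  @Override public void flush() {}
  @Override public void close() {}
}
```

Per-window stats (5 min / 1 hour / 12 hours / day) would additionally need timestamps per entry plus the periodic purge discussed in the review.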
[jira] [Commented] (YARN-3405) FairScheduler's preemption cannot happen between sibling in some case
[ https://issues.apache.org/jira/browse/YARN-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383546#comment-14383546 ] Peng Zhang commented on YARN-3405:
--------------------------------------

Thanks, my mistake on the code detail. I will verify that it works now.

Also, if the queue-1 level has some other sibling queue (like queue-2) whose usage/fair share equals queue-1's, "candidateQueue" may still not be queue-1 itself, because they compare as equal and the choice depends on the queue order. Then queue-1-2 still cannot preempt its sibling, causing a livelock issue like the second scenario above.
[jira] [Updated] (YARN-3396) Handle URISyntaxException in ResourceLocalizationService
[ https://issues.apache.org/jira/browse/YARN-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated YARN-3396:
-------------------------------------------

    Attachment: YARN-3396.patch

> Handle URISyntaxException in ResourceLocalizationService
> ---------------------------------------------------------
>
>                 Key: YARN-3396
>                 URL: https://issues.apache.org/jira/browse/YARN-3396
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.0
>            Reporter: Chengbing Liu
>            Assignee: Brahma Reddy Battula
>         Attachments: YARN-3396.patch
>
> There are two occurrences of the following code snippet:
> {code}
> //TODO fail? Already translated several times...
> {code}
> It should be handled correctly in case the resource URI is incorrect.
[jira] [Commented] (YARN-3405) FairScheduler's preemption cannot happen between sibling in some case
[ https://issues.apache.org/jira/browse/YARN-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383615#comment-14383615 ] Peng Zhang commented on YARN-3405:
--------------------------------------

[~zxu] I have verified that there's no problem in the first scenario. The second scenario problem still exists.

bq. Also, if the queue-1 level has some other sibling queue (like queue-2) whose usage/fair share equals queue-1's, "candidateQueue" may still not be queue-1 itself, because they compare as equal and the choice depends on the queue order. Then queue-1-2 still cannot preempt its sibling, causing a livelock issue like the second scenario above.

I think for the above scenario it may just result in preemptContainerPreCheck() failing for queue-2 (leaf queue), so queue-1-2 cannot preempt any resources; a livelock will not happen. I'll update the description once the above bad cases are confirmed. Thanks.
[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383616#comment-14383616 ] Anubhav Dhoot commented on YARN-3304:
-----------------------------------------

Overall the API looks usable and will fix the issues. But the implementation has to duplicate a lot of code to implement the boolean API. The CPU isAvailable is straightforward, but the memory calculations seem to rely on duplicating the code that does the actual calculation. Wouldn't using -1 be easier, instead of duplicating code to keep it as 0?

> ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3304
>                 URL: https://issues.apache.org/jira/browse/YARN-3304
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Junping Du
>            Assignee: Karthik Kambatla
>            Priority: Blocker
>         Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, YARN-3304.patch
>
> Per discussions in YARN-3296, getCpuUsagePercent() will return -1 for the unavailable case while other resource metrics return 0 in the same case, which sounds inconsistent.
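The two API shapes under discussion can be contrasted in a small sketch; the names here are illustrative, not the actual ResourceCalculatorProcessTree API:

```java
/** Contrasts the sentinel (-1) style with the boolean-availability style. */
class CpuMetric {
  static final float UNAVAILABLE = -1.0f;
  private final Float rawPercent;  // null when the platform reports nothing

  CpuMetric(Float rawPercent) { this.rawPercent = rawPercent; }

  // Sentinel style: a single getter; the caller compares with UNAVAILABLE.
  float getCpuUsagePercent() {
    return rawPercent == null ? UNAVAILABLE : rawPercent;
  }

  // Boolean style: availability is a separate query, so the getter can keep
  // returning 0 -- at the cost of a second code path per metric.
  boolean isAvailable() { return rawPercent != null; }
  float getCpuUsagePercentOrZero() { return rawPercent == null ? 0f : rawPercent; }
}
```

The comment above is arguing that the sentinel style needs only the one getter, whereas the boolean style forces each metric's calculation to be reachable from two methods.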
[jira] [Commented] (YARN-3258) FairScheduler: Need to add more logging to investigate allocations
[ https://issues.apache.org/jira/browse/YARN-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383630#comment-14383630 ] Tsuyoshi Ozawa commented on YARN-3258:
------------------------------------------

[~adhoot] thank you for taking this issue! Looks good to me overall. One minor nit:
{code}
+ getName() + "fairShare" + getFairShare());
{code}
Should we add a space after the word "fairShare"?

> FairScheduler: Need to add more logging to investigate allocations
> -------------------------------------------------------------------
>
>                 Key: YARN-3258
>                 URL: https://issues.apache.org/jira/browse/YARN-3258
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>            Priority: Minor
>         Attachments: YARN-3258.001.patch
>
> It's hard to investigate allocation failures without any logging.
[jira] [Commented] (YARN-3396) Handle URISyntaxException in ResourceLocalizationService
[ https://issues.apache.org/jira/browse/YARN-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383657#comment-14383657 ] Hadoop QA commented on YARN-3396:
-------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12707729/YARN-3396.patch against trunk revision af618f2.

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:

org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7121//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7121//console

This message is automatically generated.
[jira] [Commented] (YARN-3400) [JDK 8] Build Failure due to unreported exceptions in RPCUtil
[ https://issues.apache.org/jira/browse/YARN-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383680#comment-14383680 ] Hudson commented on YARN-3400: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #145 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/145/]) YARN-3400. [JDK 8] Build Failure due to unreported exceptions in RPCUtil (rkanter) (rkanter: rev 87130bf6b22f538c5c26ad5cef984558a8117798) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java * hadoop-yarn-project/CHANGES.txt > [JDK 8] Build Failure due to unreported exceptions in RPCUtil > -- > > Key: YARN-3400 > URL: https://issues.apache.org/jira/browse/YARN-3400 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 2.8.0 > > Attachments: YARN-3400.patch > > > When I try compiling Hadoop with JDK 8 like this > {noformat} > mvn clean package -Pdist -Dtar -DskipTests -Djavac.version=1.8 > {noformat} > I get this error: > {noformat} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hadoop-yarn-common: Compilation failure: Compilation failure: > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[101,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[104,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[107,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > {noformat} 
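The failure mode can be reproduced in miniature (a simplified pattern, not the actual RPCUtil code): a generic helper whose type parameter has to be inferred at a throw site infers to its bound, which javac then reports as an undeclared checked exception; an explicit type witness pins the intended type.

```java
class InferenceDemo {
  static class DemoException extends Exception {
    DemoException(String msg) { super(msg); }
  }

  @SuppressWarnings("unchecked")
  static <T extends Exception> T instantiate(String msg) {
    return (T) new DemoException(msg);
  }

  static void fail(String msg) throws DemoException {
    // "throw instantiate(msg);" would infer T = Exception (the bound), and
    // javac reports: unreported exception java.lang.Exception; must be
    // caught or declared to be thrown. The explicit witness below pins T
    // to the declared checked type, so the method compiles:
    throw InferenceDemo.<DemoException>instantiate(msg);
  }
}
```

This is the same shape as the "unreported exception java.lang.Throwable" errors quoted above: the inferred type at the RPCUtil throw sites widens to the type parameter's bound under the JDK 8 compiler.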
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
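The build failure above is characteristic of JDK 8's stricter target typing for generic methods: when a helper's inferred return type is Throwable, a `throw helper(...)` call site must catch or declare Throwable. A minimal sketch of the pattern and one way out (this is illustrative only, not the actual RPCUtil code; the class and method names here are hypothetical):

```java
public class ThrowableInference {
    // Bounding T by Exception rather than Throwable keeps call sites such as
    // `throw instantiate(...)` compilable without a `throws Throwable` clause
    // under JDK 8's stricter exception checking.
    static <T extends Exception> T instantiate(Class<T> cls, String msg) throws Exception {
        // Reflectively build the exception from its (String) constructor.
        return cls.getConstructor(String.class).newInstance(msg);
    }

    public static void main(String[] args) throws Exception {
        IllegalStateException e = instantiate(IllegalStateException.class, "boom");
        System.out.println(e.getMessage());
    }
}
```

With the bound at Throwable instead, javac 1.8 reports exactly the "unreported exception java.lang.Throwable; must be caught or declared to be thrown" error quoted in the issue.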
[jira] [Commented] (YARN-3400) [JDK 8] Build Failure due to unreported exceptions in RPCUtil
[ https://issues.apache.org/jira/browse/YARN-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383693#comment-14383693 ] Hudson commented on YARN-3400: -- FAILURE: Integrated in Hadoop-Yarn-trunk #879 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/879/]) YARN-3400. [JDK 8] Build Failure due to unreported exceptions in RPCUtil (rkanter) (rkanter: rev 87130bf6b22f538c5c26ad5cef984558a8117798) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java > [JDK 8] Build Failure due to unreported exceptions in RPCUtil > -- > > Key: YARN-3400 > URL: https://issues.apache.org/jira/browse/YARN-3400 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 2.8.0 > > Attachments: YARN-3400.patch > > > When I try compiling Hadoop with JDK 8 like this > {noformat} > mvn clean package -Pdist -Dtar -DskipTests -Djavac.version=1.8 > {noformat} > I get this error: > {noformat} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hadoop-yarn-common: Compilation failure: Compilation failure: > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[101,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[104,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[107,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > {noformat} -- This 
message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2901) Add errors and warning stats to RM, NM web UI
[ https://issues.apache.org/jira/browse/YARN-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383712#comment-14383712 ] Hadoop QA commented on YARN-2901: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12707723/apache-yarn-2901.2.patch against trunk revision af618f2. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRMHA org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7120//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7120//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7120//console This message is automatically generated. > Add errors and warning stats to RM, NM web UI > - > > Key: YARN-2901 > URL: https://issues.apache.org/jira/browse/YARN-2901 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: Exception collapsed.png, Exception expanded.jpg, Screen > Shot 2015-03-19 at 7.40.02 PM.png, apache-yarn-2901.0.patch, > apache-yarn-2901.1.patch, apache-yarn-2901.2.patch > > > It would be really useful to have statistics on the number of errors and > warnings in the RM and NM web UI. I'm thinking about - > 1. The number of errors and warnings in the past 5 min/1 hour/12 hours/day > 2. The top 'n'(20?) most common exceptions in the past 5 min/1 hour/12 > hours/day > By errors and warnings I'm referring to the log level. > I suspect we can probably achieve this by writing a custom appender?(I'm open > to suggestions on alternate mechanisms for implementing this). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3084) YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes to run
[ https://issues.apache.org/jira/browse/YARN-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383715#comment-14383715 ] Varun Vasudev commented on YARN-3084: - [~Xquery] My apologies for not replying for so long. For some reason I didn't get any notification that you had replied to me. In response to your questions - {quote} 1. First, for launching the application master, I see I need to provide it in the request. I tried to do it, but couldn’t find AppMaster.jar on the HDFS/local FS. I assume there is AppMaster per YARN-based application (MR, Pig, Hive etc.). Can you let me know how can I find/install/download such AppMaster jar? {quote} I would recommend using the DistributedShell jar. You can build it yourself from the Hadoop source code. The jar can be found at hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/target. Copy this jar to a location on HDFS that's readable by the user running the job. I've attached a sample json file which you can use to submit to the REST API. One thing you should note - please update the values for "size" and "timestamp" under the "AppMaster.jar" key. The value for "resource" should be the full path to the jar on HDFS. The DistributedShell app runs a command on multiple machines and exits. The commands write their output to a local file, and that output is aggregated at the end of the job if log aggregation is enabled. In the json I've uploaded, copy the script you want to run to HDFS (making sure it's readable by the user submitting the job). Set the value of the key "DISTRIBUTEDSHELLSCRIPTTIMESTAMP" to the timestamp for this script (on HDFS), the value of the key "DISTRIBUTEDSHELLSCRIPTLEN" to the size of the script, and the value of "DISTRIBUTEDSHELLSCRIPTLOCATION" to the location on HDFS. The 'num_containers' parameter (part of the "command" key) is the number of containers you wish to launch. {quote} 2. 
After I launched the application master, in order to run the map reduce remotely, I need to run another rest api request (I guess), but couldn’t find any example for it. Do you have REST API example of how to run map reduce using REST? (or an explained how to/steps) {quote} I don't have an example for running MapReduce using REST. The MapReduce client for YARN is a thick client which does a lot of calculations such as creating the splits for the map before it submits the job. You will have to implement that logic yourself if you wish to submit MapReduce jobs. You don't need to run any other API once you submit the job. The AppMaster is responsible for scheduling your mappers and reducers. {quote} Also, if I want to run the application as user A password B, where I supposed to add my credentials and Identify; When I submit my map reduce job, isn’t yarn expects me to identify? {quote} Hadoop requires you to setup kerberos for secure mode. In secure mode, jobs are executed as the user who submitted the job. Credentials are picked up when you submit the job. I'm going to close this issue since it doesn't seem like an issue with the REST API itself. If you have any further questions, we can discuss them offline. > YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes > to run > > > Key: YARN-3084 > URL: https://issues.apache.org/jira/browse/YARN-3084 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, webapp >Affects Versions: 2.6.0 > Environment: Using eclipse on windows 7 (client)to run the map reduce > job on the host of Hortonworks HDP 2.2 (hortonworks is on vmware version > 6.0.2 build-1744117) >Reporter: Michael Br >Priority: Minor > Attachments: submit-app.json, > yarn-yarn-resourcemanager-sandbox.hortonworks.com.log > > > Hello, > 1.I want to run the simple Map Reduce job example (with the REST API 2.6 > for yarn applications) and to calculate PI… for now it doesn’t work. 
> When I use the command in the hortonworks terminal it works: “hadoop jar > /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar > pi 10 10”. > But I want to submit the job with the REST API and not in the terminal as a > command line. > [http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_APISubmit_Application] > 2.I do succeed with other REST API requests: get state, get new > application id and even kill(change state), but when I try to submit my > example, the response is: > -- > -- > The Response Header: > Key : null ,Value : [HTTP/1.1 202 Accepted] > Key : Date ,Value : [Thu, 22 Jan 2015 07:47:24 GMT, Thu,
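To make the submission flow above concrete, here is a sketch of assembling a minimal JSON body for the RM's Submit Application REST endpoint. Field names follow the ResourceManager REST API documentation; the jar path, size, and timestamp below are placeholders that must be replaced with the real values of the jar on HDFS, as the comment above explains:

```java
public class SubmitAppPayload {
    // Builds a minimal submit-application JSON body by hand so the sketch
    // stays dependency-free; a real client would use a JSON library.
    static String buildPayload(String appId, String jarHdfsPath, long size, long timestamp) {
        return "{\n"
            + "  \"application-id\": \"" + appId + "\",\n"
            + "  \"application-name\": \"distributed-shell-demo\",\n"
            + "  \"application-type\": \"YARN\",\n"
            + "  \"am-container-spec\": {\n"
            + "    \"local-resources\": { \"entry\": [ {\n"
            + "      \"key\": \"AppMaster.jar\",\n"
            + "      \"value\": { \"resource\": \"" + jarHdfsPath + "\",\n"
            + "                   \"type\": \"FILE\", \"visibility\": \"APPLICATION\",\n"
            + "                   \"size\": " + size + ", \"timestamp\": " + timestamp + " } } ] },\n"
            + "    \"commands\": { \"command\": \"{{JAVA_HOME}}/bin/java ...\" }\n"
            + "  },\n"
            + "  \"resource\": { \"memory\": 1024, \"vCores\": 1 }\n"
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(buildPayload("application_1421661392788_0038",
            "hdfs:///apps/distributedshell/AppMaster.jar", 43004L, 1405452071209L));
    }
}
```

The resulting string would be POSTed with Content-Type application/json to http://rm-host:8088/ws/v1/cluster/apps, matching the 202 Accepted exchange quoted in the issue.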
[jira] [Updated] (YARN-3084) YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes to run
[ https://issues.apache.org/jira/browse/YARN-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-3084: Attachment: submit-app.json > YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes > to run > > > Key: YARN-3084 > URL: https://issues.apache.org/jira/browse/YARN-3084 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, webapp >Affects Versions: 2.6.0 > Environment: Using eclipse on windows 7 (client)to run the map reduce > job on the host of Hortonworks HDP 2.2 (hortonworks is on vmware version > 6.0.2 build-1744117) >Reporter: Michael Br >Priority: Minor > Attachments: submit-app.json, > yarn-yarn-resourcemanager-sandbox.hortonworks.com.log > > > Hello, > 1.I want to run the simple Map Reduce job example (with the REST API 2.6 > for yarn applications) and to calculate PI… for now it doesn’t work. > When I use the command in the hortonworks terminal it works: “hadoop jar > /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar > pi 10 10”. > But I want to submit the job with the REST API and not in the terminal as a > command line. 
> [http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_APISubmit_Application] > 2.I do succeed with other REST API requests: get state, get new > application id and even kill(change state), but when I try to submit my > example, the response is: > -- > -- > The Response Header: > Key : null ,Value : [HTTP/1.1 202 Accepted] > Key : Date ,Value : [Thu, 22 Jan 2015 07:47:24 GMT, Thu, 22 Jan 2015 07:47:24 > GMT] > Key : Content-Length ,Value : [0] > Key : Expires ,Value : [Thu, 22 Jan 2015 07:47:24 GMT, Thu, 22 Jan 2015 > 07:47:24 GMT] > Key : Location ,Value : [http://[my > port]:8088/ws/v1/cluster/apps/application_1421661392788_0038] > Key : Content-Type ,Value : [application/json] > Key : Server ,Value : [Jetty(6.1.26.hwx)] > Key : Pragma ,Value : [no-cache, no-cache] > Key : Cache-Control ,Value : [no-cache] > The Respone Body: > Null (No Response) > -- > -- > 3.I need help with the http request body filling. I am doing a POST http > request and I know that I am doing it right (in java). > 4.I think the problem is in the request body. > 5.I used this guy’s answer to help me build my map reduce example xml but > it does not work: > [http://hadoop-forum.org/forum/general-hadoop-discussion/miscellaneous/2136-how-can-i-run-mapreduce-job-by-rest-api]. > 6.What am I missing? (the description is not clear to me in the submit > section of the rest api 2.6) > 7.Does someone have an xml example for using a simple MR job? > 8.Thanks! 
Here is the XML file I am using for the request body: > -- > -- > > > application_1421661392788_0038 > test_21_1 > default > 3 > > > > CLASSPATH > > /usr/hdp/2.2.0.0-2041/hadoop/conf/usr/hdp/2.2.0.0-2041/hadoop/lib/* /usr/hdp/2.2.0.0-2041/hadoop/.//* /usr/hdp/2.2.0.0-2041/hadoop-hdfs/./ /usr/hdp/2.2.0.0-2041/hadoop-hdfs/lib/* /usr/hdp/2.2.0.0-2041/hadoop-hdfs/.//* /usr/hdp/2.2.0.0-2041/hadoop-yarn/lib/* /usr/hdp/2.2.0.0-2041/hadoop-yarn/.//* /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/lib/* /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/.//* /usr/share/java/mysql-connector-java-5.1.17.jar /usr/share/java/mysql-connector-java.jar /usr/hdp/current/hadoop-mapreduce-client/* /usr/hdp/current/tez-client/* /usr/hdp/current/tez-client/lib/* /etc/tez/conf/ /usr/hdp/2.2.0.0-2041/tez/* /usr/hdp/2.2.0.0-2041/tez/lib/* /etc/tez/conf > > > > hadoop jar > /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar > pi 10 10 > > > false > 2 > > 1024 > 1 > > MAPREDUCE > > false > > Michael > PI example > > > -- > -- -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3084) YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes to run
[ https://issues.apache.org/jira/browse/YARN-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev resolved YARN-3084. - Resolution: Invalid > YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes > to run > > > Key: YARN-3084 > URL: https://issues.apache.org/jira/browse/YARN-3084 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, webapp >Affects Versions: 2.6.0 > Environment: Using eclipse on windows 7 (client)to run the map reduce > job on the host of Hortonworks HDP 2.2 (hortonworks is on vmware version > 6.0.2 build-1744117) >Reporter: Michael Br >Priority: Minor > Attachments: submit-app.json, > yarn-yarn-resourcemanager-sandbox.hortonworks.com.log > > > Hello, > 1.I want to run the simple Map Reduce job example (with the REST API 2.6 > for yarn applications) and to calculate PI… for now it doesn’t work. > When I use the command in the hortonworks terminal it works: “hadoop jar > /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar > pi 10 10”. > But I want to submit the job with the REST API and not in the terminal as a > command line. 
> [http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_APISubmit_Application] > 2.I do succeed with other REST API requests: get state, get new > application id and even kill(change state), but when I try to submit my > example, the response is: > -- > -- > The Response Header: > Key : null ,Value : [HTTP/1.1 202 Accepted] > Key : Date ,Value : [Thu, 22 Jan 2015 07:47:24 GMT, Thu, 22 Jan 2015 07:47:24 > GMT] > Key : Content-Length ,Value : [0] > Key : Expires ,Value : [Thu, 22 Jan 2015 07:47:24 GMT, Thu, 22 Jan 2015 > 07:47:24 GMT] > Key : Location ,Value : [http://[my > port]:8088/ws/v1/cluster/apps/application_1421661392788_0038] > Key : Content-Type ,Value : [application/json] > Key : Server ,Value : [Jetty(6.1.26.hwx)] > Key : Pragma ,Value : [no-cache, no-cache] > Key : Cache-Control ,Value : [no-cache] > The Respone Body: > Null (No Response) > -- > -- > 3.I need help with the http request body filling. I am doing a POST http > request and I know that I am doing it right (in java). > 4.I think the problem is in the request body. > 5.I used this guy’s answer to help me build my map reduce example xml but > it does not work: > [http://hadoop-forum.org/forum/general-hadoop-discussion/miscellaneous/2136-how-can-i-run-mapreduce-job-by-rest-api]. > 6.What am I missing? (the description is not clear to me in the submit > section of the rest api 2.6) > 7.Does someone have an xml example for using a simple MR job? > 8.Thanks! 
Here is the XML file I am using for the request body: > -- > -- > > > application_1421661392788_0038 > test_21_1 > default > 3 > > > > CLASSPATH > > /usr/hdp/2.2.0.0-2041/hadoop/conf/usr/hdp/2.2.0.0-2041/hadoop/lib/* /usr/hdp/2.2.0.0-2041/hadoop/.//* /usr/hdp/2.2.0.0-2041/hadoop-hdfs/./ /usr/hdp/2.2.0.0-2041/hadoop-hdfs/lib/* /usr/hdp/2.2.0.0-2041/hadoop-hdfs/.//* /usr/hdp/2.2.0.0-2041/hadoop-yarn/lib/* /usr/hdp/2.2.0.0-2041/hadoop-yarn/.//* /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/lib/* /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/.//* /usr/share/java/mysql-connector-java-5.1.17.jar /usr/share/java/mysql-connector-java.jar /usr/hdp/current/hadoop-mapreduce-client/* /usr/hdp/current/tez-client/* /usr/hdp/current/tez-client/lib/* /etc/tez/conf/ /usr/hdp/2.2.0.0-2041/tez/* /usr/hdp/2.2.0.0-2041/tez/lib/* /etc/tez/conf > > > > hadoop jar > /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar > pi 10 10 > > > false > 2 > > 1024 > 1 > > MAPREDUCE > > false > > Michael > PI example > > > -- > -- -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383762#comment-14383762 ] Junping Du commented on YARN-3304: -- Thanks [~adhoot] for the comments. bq. Overall the API looks usable and will fix the issues. Thanks for confirming this; that is important. bq. But the implementation has to do a lot of duplication of code to implement the boolean api. The CPU isAvailable is straightforward but the memory calculations seem to rely on duplicating code that does the actual calculation. I admit there is some duplicated code (for getting the CPU/memory values) here. The reason I didn't do more refactoring is to lower the risk of changing existing logic at this moment. I think we can refactor it later (in the next release) to share logic between the boolean API and the value getters. In any case, this belongs to the implementation, not the API, so we don't have to finalize it now. bq. Wouldn't using it as -1 be easier instead of duplicating code to keep it as 0? First, changing the default memory resource from 0 to -1 also takes extra effort and requires extra checks that the resource value is not -1 when calculating resources. Second, as I mentioned above, the consumer side (ContainerMetrics, the MR resource counters, the new TimelineService, etc.) would need to handle -1 explicitly, and the existing MR resource counters have never held negative values for memory. Last but not least, as a public API, a negative resource value is more misleading than a boolean API for resource availability. 
> ResourceCalculatorProcessTree#getCpuUsagePercent default return value is > inconsistent with other getters > > > Key: YARN-3304 > URL: https://issues.apache.org/jira/browse/YARN-3304 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Junping Du >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, YARN-3304.patch > > > Per discussions in YARN-3296, getCpuUsagePercent() will return -1 for > unavailable case while other resource metrics are return 0 in the same case > which sounds inconsistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
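The trade-off debated in the comment above can be sketched in a few lines. This is illustrative only, not the actual ResourceCalculatorProcessTree API: it contrasts leaking a -1 sentinel to consumers with exposing an explicit boolean availability check while keeping getters non-negative:

```java
public class CpuUsage {
    // Internal sentinel: -1 means the platform could not report a value.
    private final float cpuPercent;

    CpuUsage(float cpuPercent) { this.cpuPercent = cpuPercent; }

    // Explicit boolean API: consumers ask "is it available?" instead of
    // having to special-case a negative sentinel in every calculation.
    boolean isCpuAvailable() { return cpuPercent >= 0; }

    // Getter never leaks the sentinel into downstream counters, matching
    // the existing behavior where counters never go negative.
    float getCpuUsagePercent() { return isCpuAvailable() ? cpuPercent : 0.0f; }
}
```

Consumers such as metrics counters then guard with `isCpuAvailable()` and can rely on the getter always returning a value that is safe to aggregate.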
[jira] [Assigned] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reassigned YARN-3304: Assignee: Junping Du (was: Karthik Kambatla) > ResourceCalculatorProcessTree#getCpuUsagePercent default return value is > inconsistent with other getters > > > Key: YARN-3304 > URL: https://issues.apache.org/jira/browse/YARN-3304 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Junping Du >Assignee: Junping Du >Priority: Blocker > Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, YARN-3304.patch > > > Per discussions in YARN-3296, getCpuUsagePercent() will return -1 for > unavailable case while other resource metrics are return 0 in the same case > which sounds inconsistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3400) [JDK 8] Build Failure due to unreported exceptions in RPCUtil
[ https://issues.apache.org/jira/browse/YARN-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383805#comment-14383805 ] Hudson commented on YARN-3400: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #145 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/145/]) YARN-3400. [JDK 8] Build Failure due to unreported exceptions in RPCUtil (rkanter) (rkanter: rev 87130bf6b22f538c5c26ad5cef984558a8117798) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java * hadoop-yarn-project/CHANGES.txt > [JDK 8] Build Failure due to unreported exceptions in RPCUtil > -- > > Key: YARN-3400 > URL: https://issues.apache.org/jira/browse/YARN-3400 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 2.8.0 > > Attachments: YARN-3400.patch > > > When I try compiling Hadoop with JDK 8 like this > {noformat} > mvn clean package -Pdist -Dtar -DskipTests -Djavac.version=1.8 > {noformat} > I get this error: > {noformat} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hadoop-yarn-common: Compilation failure: Compilation failure: > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[101,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[104,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[107,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > 
{noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3169) drop the useless yarn overview document
[ https://issues.apache.org/jira/browse/YARN-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383821#comment-14383821 ] Tsuyoshi Ozawa commented on YARN-3169: -- [~brahmareddy] thank you for taking this issue. I think we can just drop the document since Architecture.md has the same sentences; that is, we can just execute git rm index.md. Also, we should check that hyperlinks to the document are updated correctly. > drop the useless yarn overview document > --- > > Key: YARN-3169 > URL: https://issues.apache.org/jira/browse/YARN-3169 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Attachments: YARN-3169.patch > > > It's pretty superfluous given there is a site index on the left. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3189) Yarn application usage command should not give -appstate and -apptype
[ https://issues.apache.org/jira/browse/YARN-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anushri updated YARN-3189: -- Attachment: YARN-3189_1.patch modified the test cases > Yarn application usage command should not give -appstate and -apptype > - > > Key: YARN-3189 > URL: https://issues.apache.org/jira/browse/YARN-3189 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Anushri >Assignee: Anushri >Priority: Minor > Attachments: YARN-3189.patch, YARN-3189.patch, YARN-3189_1.patch > > > Yarn application usage command should not give -appstate and -apptype since > these two are applicable to --list command.. > *Can somebody please assign this issue to me* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3407) HttpServer2 Max threads in TimelineCollectorManager should be more than 10
[ https://issues.apache.org/jira/browse/YARN-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3407: - Affects Version/s: YARN-2928 > HttpServer2 Max threads in TimelineCollectorManager should be more than 10 > -- > > Key: YARN-3407 > URL: https://issues.apache.org/jira/browse/YARN-3407 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > > Currently TimelineCollectorManager sets HttpServer2.HTTP_MAX_THREADS to just > 10. This value may be too low for serving put requests. By default > HttpServer2 has a max threads value of 250. We can probably make it configurable too, so that an optimum value can be configured based on the number > of requests coming to the server. Thoughts? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
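A minimal sketch of the configurability suggested in YARN-3407's description. The property key here is hypothetical, and java.util.Properties stands in for Hadoop's Configuration class so the example stays self-contained:

```java
import java.util.Properties;

public class CollectorHttpConfig {
    // Hypothetical key name; the real configuration key would live in
    // YarnConfiguration if this were implemented.
    static final String MAX_THREADS_KEY =
        "yarn.timeline-service.collector.http.max-threads";
    // 250 mirrors HttpServer2's own default, per the issue description.
    static final int DEFAULT_MAX_THREADS = 250;

    // Returns the configured value, falling back to the higher default
    // instead of the hard-coded 10 the issue complains about.
    static int maxThreads(Properties conf) {
        String v = conf.getProperty(MAX_THREADS_KEY);
        return v == null ? DEFAULT_MAX_THREADS : Integer.parseInt(v);
    }
}
```

An operator could then tune the pool size per cluster load rather than being capped at 10 threads.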
[jira] [Commented] (YARN-3400) [JDK 8] Build Failure due to unreported exceptions in RPCUtil
[ https://issues.apache.org/jira/browse/YARN-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383835#comment-14383835 ] Hudson commented on YARN-3400: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2095 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2095/]) YARN-3400. [JDK 8] Build Failure due to unreported exceptions in RPCUtil (rkanter) (rkanter: rev 87130bf6b22f538c5c26ad5cef984558a8117798) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java * hadoop-yarn-project/CHANGES.txt > [JDK 8] Build Failure due to unreported exceptions in RPCUtil > -- > > Key: YARN-3400 > URL: https://issues.apache.org/jira/browse/YARN-3400 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 2.8.0 > > Attachments: YARN-3400.patch > > > When I try compiling Hadoop with JDK 8 like this > {noformat} > mvn clean package -Pdist -Dtar -DskipTests -Djavac.version=1.8 > {noformat} > I get this error: > {noformat} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hadoop-yarn-common: Compilation failure: Compilation failure: > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[101,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[104,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[107,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > {noformat} 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3189) Yarn application usage command should not give -appstate and -apptype
[ https://issues.apache.org/jira/browse/YARN-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383894#comment-14383894 ] Hadoop QA commented on YARN-3189: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12707766/YARN-3189_1.patch against trunk revision af618f2. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: org.apache.hadoop.yarn.client.TestApplicationMasterServiceProtocolOnHA org.apache.hadoop.yarn.client.cli.TestYarnCLI org.apache.hadoop.yarn.client.TestResourceTrackerOnHA org.apache.hadoop.yarn.client.api.impl.TestNMClient org.apache.hadoop.yarn.client.TestResourceManagerAdministrationProtocolPBClientImpl org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA org.apache.hadoop.yarn.client.TestRMFailover org.apache.hadoop.yarn.client.api.impl.TestYarnClient org.apache.hadoop.yarn.client.api.impl.TestAMRMClient org.apache.hadoop.yarn.client.TestGetGroups Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7123//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7123//console This message is automatically generated. 
> Yarn application usage command should not give -appstate and -apptype > - > > Key: YARN-3189 > URL: https://issues.apache.org/jira/browse/YARN-3189 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Anushri >Assignee: Anushri >Priority: Minor > Attachments: YARN-3189.patch, YARN-3189.patch, YARN-3189_1.patch > > > Yarn application usage command should not give -appstate and -apptype since > these two are applicable to --list command.. > *Can somebody please assign this issue to me* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3212) RMNode State Transition Update with DECOMMISSIONING state
[ https://issues.apache.org/jira/browse/YARN-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383896#comment-14383896 ] Hadoop QA commented on YARN-3212: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12706943/YARN-3212-v3.patch against trunk revision af618f2. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup org.apache.hadoop.yarn.server.resourcemanager.TestRM org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterService org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebappAuthentication org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService org.apache.hadoop.yarn.server.resourcemanager.TestRMHA Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7122//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7122//console This message is automatically generated. > RMNode State Transition Update with DECOMMISSIONING state > - > > Key: YARN-3212 > URL: https://issues.apache.org/jira/browse/YARN-3212 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Junping Du >Assignee: Junping Du > Attachments: RMNodeImpl - new.png, YARN-3212-v1.patch, > YARN-3212-v2.patch, YARN-3212-v3.patch > > > As proposed in YARN-914, a new state of “DECOMMISSIONING” will be added and > can transition from “running” state triggered by a new event - > “decommissioning”. 
> This new state can transition to the “decommissioned” state on > Resource_Update if there are no running apps on this NM, when the NM reconnects after restart, > or when it receives a DECOMMISSIONED event (after a timeout from the CLI). > In addition, it can go back to “running” if the user decides to cancel the previous > decommission by calling recommission on the same node. The reaction to other > events is similar to the RUNNING state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3408) TestDistributedShell get failed due to RM start failure.
Junping Du created YARN-3408: Summary: TestDistributedShell get failed due to RM start failure. Key: YARN-3408 URL: https://issues.apache.org/jira/browse/YARN-3408 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Junping Du Assignee: Junping Du The log exception: {code} 2015-03-27 14:43:17,190 WARN [RM-0] mortbay.log (Slf4jLog.java:warn(89)) - Failed startup of context org.mortbay.jetty.webapp.WebAppContext@2d2d0132{/,file:/Users/jdu/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/classes/webapps/cluster} javax.servlet.ServletException: java.lang.RuntimeException: Could not read signature secret file: /Users/jdu/hadoop-http-auth-signature-secret at org.apache.hadoop.security.authentication.server.AuthenticationFilter.initializeSecretProvider(AuthenticationFilter.java:266) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.init(AuthenticationFilter.java:225) at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.init(DelegationTokenAuthenticationFilter.java:161) at org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.init(RMAuthenticationFilter.java:53) at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713) at org.mortbay.jetty.servlet.Context.startContext(Context.java:140) at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282) at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518) at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152) at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156) at 
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130) at org.mortbay.jetty.Server.doStart(Server.java:224) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:773) at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:274) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:989) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1089) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.yarn.server.MiniYARNCluster$2.run(MiniYARNCluster.java:312) Caused by: java.lang.RuntimeException: Could not read signature secret file: /Users/jdu/hadoop-http-auth-signature-secret at org.apache.hadoop.security.authentication.util.FileSignerSecretProvider.init(FileSignerSecretProvider.java:59) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.initializeSecretProvider(AuthenticationFilter.java:264) ... 23 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
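The failure above boils down to the HTTP authentication filter being configured with a file-backed signer secret (the `hadoop.http.authentication.signature.secret.file` setting) that points at a file missing from the test environment. A minimal sketch of that failure mode — illustrative only, not the actual Hadoop `FileSignerSecretProvider` code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SecretFileCheck {
    // Mirrors what the file-backed secret provider effectively requires at
    // startup: the configured signature secret file must exist and be readable.
    static byte[] readSecret(Path secretFile) {
        try {
            return Files.readAllBytes(secretFile);
        } catch (IOException e) {
            throw new RuntimeException(
                "Could not read signature secret file: " + secretFile, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Creating the file beforehand is what lets the web app start.
        Path tmp = Files.createTempFile("http-auth-secret", "");
        Files.write(tmp, "dummy-secret".getBytes());
        System.out.println("secret length = " + readSecret(tmp).length);
        Files.delete(tmp);

        // Without the file, startup fails exactly like the trace above.
        try {
            readSecret(Paths.get("/nonexistent/hadoop-http-auth-signature-secret"));
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Pre-creating the expected secret file (or configuring a different secret provider) would be the corresponding workaround for the test run.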
[jira] [Commented] (YARN-3400) [JDK 8] Build Failure due to unreported exceptions in RPCUtil
[ https://issues.apache.org/jira/browse/YARN-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383938#comment-14383938 ] Hudson commented on YARN-3400: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2077 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2077/]) YARN-3400. [JDK 8] Build Failure due to unreported exceptions in RPCUtil (rkanter) (rkanter: rev 87130bf6b22f538c5c26ad5cef984558a8117798) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java > [JDK 8] Build Failure due to unreported exceptions in RPCUtil > -- > > Key: YARN-3400 > URL: https://issues.apache.org/jira/browse/YARN-3400 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 2.8.0 > > Attachments: YARN-3400.patch > > > When I try compiling Hadoop with JDK 8 like this > {noformat} > mvn clean package -Pdist -Dtar -DskipTests -Djavac.version=1.8 > {noformat} > I get this error: > {noformat} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hadoop-yarn-common: Compilation failure: Compilation failure: > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[101,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[104,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[107,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > {noformat} -- This 
message was sent by Atlassian JIRA (v6.3.4#6332)
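The compile errors come from a generic helper whose thrown type JDK 8's stricter inference resolves to a bare, unreported `Throwable`. One common repair — a sketch of the pattern, not necessarily the committed YARN-3400 change — is to parameterize the helper's return type so the call site sees the precise checked exception:

```java
import java.io.IOException;
import java.lang.reflect.Constructor;

public class Rethrow {
    // Returning T (rather than a raw Throwable) means the compiler sees the
    // exact exception type at the call site, so a "throw instantiate(...)"
    // is a properly reported, typed throw instead of an unreported Throwable.
    static <T extends Throwable> T instantiate(Class<T> cls, String msg) throws Exception {
        Constructor<T> c = cls.getConstructor(String.class);
        return c.newInstance(msg);
    }

    public static void main(String[] args) throws Exception {
        try {
            // Statically typed as IOException, so "catch (IOException)" compiles.
            throw instantiate(IOException.class, "boom");
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

With a raw `Throwable` return type, the `throw` statement would need `Throwable` caught or declared — exactly the javac error quoted above.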
[jira] [Commented] (YARN-3405) FairScheduler's preemption cannot happen between sibling in some case
[ https://issues.apache.org/jira/browse/YARN-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383953#comment-14383953 ] Karthik Kambatla commented on YARN-3405: [~peng.zhang] - I haven't looked at the code yet; I think the issue described in the description exists, but I don't quite see the livelock. Is this the confirmation you are looking for? The back-and-forth is a little confusing; could you update the description with what you think the real problem is? If there is a second problem, let us handle that in a different JIRA and link the two if necessary. > FairScheduler's preemption cannot happen between sibling in some case > - > > Key: YARN-3405 > URL: https://issues.apache.org/jira/browse/YARN-3405 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.0 >Reporter: Peng Zhang >Priority: Critical > > Queue hierarchy described as below: > {noformat} > root > | > queue-1 > / \ > queue-1-1    queue-1-2 > {noformat} > 1. When queue-1-1 is active, it has been assigned all resources. > 2. When queue-1-2 becomes active, it causes some new preemption requests. > 3. But when preemption is done, it now starts from root, and finds queue-1 is not > over its fair share, so there is no recursive preemption into queue-1-1. > 4. Finally queue-1-2 will be waiting for resource release from queue-1-1 > itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
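The traversal flaw in steps 3-4 can be sketched as follows — a hypothetical simplification of a top-down preemption walk, not the actual FairScheduler classes: if the walk descends only into queues that exceed their fair share, an over-share leaf hiding under an at-share parent is never visited.

```java
import java.util.ArrayList;
import java.util.List;

public class PreemptionSketch {
    static class Queue {
        final String name;
        final double usage, fairShare;
        final List<Queue> children = new ArrayList<>();
        Queue(String name, double usage, double fairShare) {
            this.name = name; this.usage = usage; this.fairShare = fairShare;
        }
    }

    // Simplified top-down walk: descend into a child only if that child is
    // over its fair share. queue-1 sits exactly at its fair share, so the
    // walk never reaches the over-share leaf queue-1-1.
    static List<String> findPreemptable(Queue q, List<String> out) {
        for (Queue c : q.children) {
            if (c.usage > c.fairShare) {
                if (c.children.isEmpty()) out.add(c.name);
                else findPreemptable(c, out);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Queue root = new Queue("root", 1.0, 1.0);
        Queue q1  = new Queue("queue-1", 1.0, 1.0);   // at, not over, fair share
        Queue q11 = new Queue("queue-1-1", 1.0, 0.5); // holds all resources
        Queue q12 = new Queue("queue-1-2", 0.0, 0.5); // starved sibling
        q1.children.add(q11);
        q1.children.add(q12);
        root.children.add(q1);
        // Nothing is found preemptable, so queue-1-2 waits indefinitely.
        System.out.println(findPreemptable(root, new ArrayList<>()));
    }
}
```

The fix direction would be to compare each queue against its own share while still recursing, so siblings under a satisfied parent can be rebalanced.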
[jira] [Commented] (YARN-3400) [JDK 8] Build Failure due to unreported exceptions in RPCUtil
[ https://issues.apache.org/jira/browse/YARN-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383983#comment-14383983 ] Hudson commented on YARN-3400: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk-Java8 #136 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/136/]) YARN-3400. [JDK 8] Build Failure due to unreported exceptions in RPCUtil (rkanter) (rkanter: rev 87130bf6b22f538c5c26ad5cef984558a8117798) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java * hadoop-yarn-project/CHANGES.txt > [JDK 8] Build Failure due to unreported exceptions in RPCUtil > -- > > Key: YARN-3400 > URL: https://issues.apache.org/jira/browse/YARN-3400 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Robert Kanter >Assignee: Robert Kanter > Fix For: 2.8.0 > > Attachments: YARN-3400.patch > > > When I try compiling Hadoop with JDK 8 like this > {noformat} > mvn clean package -Pdist -Dtar -DskipTests -Djavac.version=1.8 > {noformat} > I get this error: > {noformat} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) > on project hadoop-yarn-common: Compilation failure: Compilation failure: > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[101,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[104,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > [ERROR] > /Users/rkanter/dev/hadoop-common2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/ipc/RPCUtil.java:[107,11] > unreported exception java.lang.Throwable; must be caught or declared to be > thrown > {noformat} 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3169) drop the useless yarn overview document
[ https://issues.apache.org/jira/browse/YARN-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated YARN-3169: --- Attachment: YARN-3169-002.patch > drop the useless yarn overview document > --- > > Key: YARN-3169 > URL: https://issues.apache.org/jira/browse/YARN-3169 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Attachments: YARN-3169-002.patch, YARN-3169.patch > > > It's pretty superfluous given there is a site index on the left. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3169) drop the useless yarn overview document
[ https://issues.apache.org/jira/browse/YARN-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384002#comment-14384002 ] Brahma Reddy Battula commented on YARN-3169: Thanks a lot for taking a look into this issue. Yes, I have now deleted index.md, which is not required anymore. > drop the useless yarn overview document > --- > > Key: YARN-3169 > URL: https://issues.apache.org/jira/browse/YARN-3169 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Attachments: YARN-3169-002.patch, YARN-3169.patch > > > It's pretty superfluous given there is a site index on the left. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3047: --- Attachment: YARN-3047.006.patch > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384024#comment-14384024 ] Varun Saxena commented on YARN-3047: [~zjshen], I have uploaded another patch, {{YARN-3047.006.patch}}. Please check whether this one applies. In this patch I have not changed the line endings; the Windows script normally has issues applying a patch because of line endings. You may have cloned the repo with line endings that are not the same as what is in my directory. The previous patch was according to what applies in Jenkins. I had faced this issue with another patch earlier. > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384035#comment-14384035 ] Hadoop QA commented on YARN-3047: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12707786/YARN-3047.006.patch against trunk revision 05499b1. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7125//console This message is automatically generated. > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3169) drop the useless yarn overview document
[ https://issues.apache.org/jira/browse/YARN-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384049#comment-14384049 ] Hadoop QA commented on YARN-3169: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12707783/YARN-3169-002.patch against trunk revision 05499b1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7124//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7124//console This message is automatically generated. > drop the useless yarn overview document > --- > > Key: YARN-3169 > URL: https://issues.apache.org/jira/browse/YARN-3169 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Attachments: YARN-3169-002.patch, YARN-3169.patch > > > It's pretty superfluous given there is a site index on the left. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3334) [Event Producers] NM TimelineClient life cycle handling and container metrics posting to new timeline service.
[ https://issues.apache.org/jira/browse/YARN-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3334: - Attachment: YARN-3334-v3.patch Rebased the v2 patch against recent changes on YARN-2928 (merging patches from trunk). TestDistributedShell cannot work because of HADOOP-10670 (the RM cannot be started in insecure mode); after applying the patch from HADOOP-11763, I verified that it works. > [Event Producers] NM TimelineClient life cycle handling and container metrics > posting to new timeline service. > -- > > Key: YARN-3334 > URL: https://issues.apache.org/jira/browse/YARN-3334 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: YARN-2928 >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-3334-demo.patch, YARN-3334-v1.patch, > YARN-3334-v2.patch, YARN-3334-v3.patch > > > After YARN-3039, we have service discovery mechanism to pass app-collector > service address among collectors, NMs and RM. In this JIRA, we will handle > service address setting for TimelineClients in NodeManager, and put container > metrics to the backend storage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3409) Add constraint node labels
Wangda Tan created YARN-3409: Summary: Add constraint node labels Key: YARN-3409 URL: https://issues.apache.org/jira/browse/YARN-3409 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wangda Tan Assignee: Wangda Tan Specifying only one label for each node (in other words, partitioning a cluster) is a way to determine how resources of a special set of nodes could be shared by a group of entities (like teams, departments, etc.). Partitions of a cluster have the following characteristics: - ACL/priority (only the market team has priority to use the partition). Percentage of capacities (the market team has 40% minimum capacity and the dev team has 60% minimum capacity of the partition). - One node can only belong to one partition. - One resource request can only ask for one partition. Constraints are orthogonal to partitions; they describe attributes of a node's hardware/software, just for affinity. Some examples of constraints: - glibc version - JDK version - Type of CPU (x86_64/i686) - Type of OS (windows, linux, etc.) With this, an application can ask for a resource that has (glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
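The constraint expression at the end of that description suggests matching resource requests against per-node attribute maps. A hypothetical sketch of such matching — the attribute names and comparison rules here are invented for illustration and are not a YARN API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

public class ConstraintMatch {
    // A node advertises attributes; a request carries predicates over them.
    // A node matches only if every requested attribute exists and passes.
    static boolean matches(Map<String, String> nodeAttrs,
                           Map<String, Predicate<String>> constraints) {
        for (Map.Entry<String, Predicate<String>> c : constraints.entrySet()) {
            String v = nodeAttrs.get(c.getKey());
            if (v == null || !c.getValue().test(v)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> node = new HashMap<>();
        node.put("glibc.version", "2.21");
        node.put("cpu.arch", "x86_64");

        Map<String, Predicate<String>> ask = new HashMap<>();
        // Lexicographic compare as a stand-in; a real implementation would
        // need proper version-number comparison.
        ask.put("glibc.version", v -> v.compareTo("2.20") >= 0);
        ask.put("cpu.arch", "x86_64"::equals);

        System.out.println(matches(node, ask)); // true
    }
}
```

Unlike partitions, such constraints would not carry capacities or ACLs; they only filter which nodes a request is willing to run on.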
[jira] [Commented] (YARN-3214) Add non-exclusive node labels
[ https://issues.apache.org/jira/browse/YARN-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384136#comment-14384136 ] Wangda Tan commented on YARN-3214: -- [~lohit], I created and added some rough description for constraints at YARN-3409. I'm working on a design doc for constraint, should be ready to review soon. We can continue discuss it on YARN-3409. Thanks, > Add non-exclusive node labels > -- > > Key: YARN-3214 > URL: https://issues.apache.org/jira/browse/YARN-3214 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: Non-exclusive-Node-Partition-Design.pdf > > > Currently node labels partition the cluster to some sub-clusters so resources > cannot be shared between partitioned cluster. > With the current implementation of node labels we cannot use the cluster > optimally and the throughput of the cluster will suffer. > We are proposing adding non-exclusive node labels: > 1. Labeled apps get the preference on Labeled nodes > 2. If there is no ask for labeled resources we can assign those nodes to non > labeled apps > 3. If there is any future ask for those resources , we will preempt the non > labeled apps and give them back to labeled apps. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3409) Add constraint node labels
[ https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3409: - Description: Specify only one label for each node (IAW, partition a cluster) is a way to determinate how resources of a special set of nodes could be shared by a group of entities (like teams, departments, etc.). Partitions of a cluster has following characteristics: - Cluster divided to several disjoint sub clusters. - ACL/priority can apply on partition (Only market team / marke team has priority to use the partition). - Percentage of capacities can apply on partition (Market team has 40% minimum capacity and Dev team has 60% of minimum capacity of the partition). Constraints are orthogonal to partition, they’re describing attributes of node’s hardware/software just for affinity. Some example of constraints: - glibc version - JDK version - Type of CPU (x86_64/i686) - Type of OS (windows, linux, etc.) With this, application can be able to ask for resource has (glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64). was: Specify only one label for each node (IAW, partition a cluster) is a way to determinate how resources of a special set of nodes could be shared by a group of entities (like teams, departments, etc.). Partitions of a cluster has following characteristics: - ACL/priority (Only market team / marke team has priority to use the partition). Percentage of capacities (Market team has 40% minimum capacity and Dev team has 60% of minimum capacity of the partition). - One node can only belong to one partition. - One resource request can only ask for one partition Constraints are orthogonal to partition, they’re describing attributes of node’s hardware/software just for affinity. Some example of constraints: - glibc version - JDK version - Type of CPU (x86_64/i686) - Type of OS (windows, linux, etc.) 
With this, application can be able to ask for resource has (glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64). > Add constraint node labels > -- > > Key: YARN-3409 > URL: https://issues.apache.org/jira/browse/YARN-3409 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > > Specify only one label for each node (IAW, partition a cluster) is a way to > determinate how resources of a special set of nodes could be shared by a > group of entities (like teams, departments, etc.). Partitions of a cluster > has following characteristics: > - Cluster divided to several disjoint sub clusters. > - ACL/priority can apply on partition (Only market team / marke team has > priority to use the partition). > - Percentage of capacities can apply on partition (Market team has 40% > minimum capacity and Dev team has 60% of minimum capacity of the partition). > Constraints are orthogonal to partition, they’re describing attributes of > node’s hardware/software just for affinity. Some example of constraints: > - glibc version > - JDK version > - Type of CPU (x86_64/i686) > - Type of OS (windows, linux, etc.) > With this, application can be able to ask for resource has (glibc.version >= > 2.20 && JDK.version >= 8u20 && x86_64). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3409) Add constraint node labels
[ https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3409: - Component/s: (was: resourcemanager) capacityscheduler > Add constraint node labels > -- > > Key: YARN-3409 > URL: https://issues.apache.org/jira/browse/YARN-3409 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, capacityscheduler, client >Reporter: Wangda Tan >Assignee: Wangda Tan > > Specify only one label for each node (IAW, partition a cluster) is a way to > determinate how resources of a special set of nodes could be shared by a > group of entities (like teams, departments, etc.). Partitions of a cluster > has following characteristics: > - Cluster divided to several disjoint sub clusters. > - ACL/priority can apply on partition (Only market team / marke team has > priority to use the partition). > - Percentage of capacities can apply on partition (Market team has 40% > minimum capacity and Dev team has 60% of minimum capacity of the partition). > Constraints are orthogonal to partition, they’re describing attributes of > node’s hardware/software just for affinity. Some example of constraints: > - glibc version > - JDK version > - Type of CPU (x86_64/i686) > - Type of OS (windows, linux, etc.) > With this, application can be able to ask for resource has (glibc.version >= > 2.20 && JDK.version >= 8u20 && x86_64). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3214) Add non-exclusive node labels
[ https://issues.apache.org/jira/browse/YARN-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384176#comment-14384176 ] Wangda Tan commented on YARN-3214: -- And to your concerns about regression: YARN-2694 temporarily restricted each node and each resource request to only one node label, because there are lots of problems when doing otherwise. We need to address these problems before letting people use it. 2.6 doesn't have multiple-node-label support, so the proposal attached in this JIRA isn't a regression. In general, the node label feature is still in the alpha stage; after the following tasks are completed, it will be in much better shape: - YARN-2495, distributed configuration for node labels - YARN-3214, non-exclusive node labels - YARN-3409, constraints support - YARN-3362, UI support Hope this makes sense to you; looking forward to your thoughts/suggestions. Thanks, > Add non-exclusive node labels > -- > > Key: YARN-3214 > URL: https://issues.apache.org/jira/browse/YARN-3214 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: Non-exclusive-Node-Partition-Design.pdf > > > Currently node labels partition the cluster to some sub-clusters so resources > cannot be shared between partitioned cluster. > With the current implementation of node labels we cannot use the cluster > optimally and the throughput of the cluster will suffer. > We are proposing adding non-exclusive node labels: > 1. Labeled apps get the preference on Labeled nodes > 2. If there is no ask for labeled resources we can assign those nodes to non > labeled apps > 3. If there is any future ask for those resources , we will preempt the non > labeled apps and give them back to labeled apps. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3410) YARN admin should be able to remove individual application record from RMStateStore
Wangda Tan created YARN-3410: Summary: YARN admin should be able to remove individual application record from RMStateStore Key: YARN-3410 URL: https://issues.apache.org/jira/browse/YARN-3410 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, yarn Reporter: Wangda Tan Priority: Critical When the RM state store enters an unexpected state (one example is YARN-2340, where an attempt is not in a final state but the app has already completed), the RM can never come up unless the RMStateStore is formatted. I think we should support removing individual application records from the RMStateStore, so the RM admin is unblocked and can choose between waiting for a fix and formatting the state store. In addition, the RM should be able to report all fatal errors (which will shut down the RM) when doing app recovery; this can save the admin some time in removing apps in a bad state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3410) YARN admin should be able to remove individual application records from RMStateStore
[ https://issues.apache.org/jira/browse/YARN-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3410: - Summary: YARN admin should be able to remove individual application records from RMStateStore (was: YARN admin should be able to remove individual application record from RMStateStore) > YARN admin should be able to remove individual application records from > RMStateStore > > > Key: YARN-3410 > URL: https://issues.apache.org/jira/browse/YARN-3410 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, yarn >Reporter: Wangda Tan >Priority: Critical > > When RM state store entered an unexpected state, one example is YARN-2340, > when an attempt is not in final state but app already completed, RM can never > get up unless format RMStateStore. > I think we should support remove individual application records from > RMStateStore to unblock RM admin make choice of either waiting for a fix or > format state store. > In addition, RM should be able to report all fatal errors (which will > shutdown RM) when doing app recovery, this can save admin some time to remove > apps in bad state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384244#comment-14384244 ] Sangjin Lee commented on YARN-3047: --- bq. My assumption here is that we only care about per-application partial order, but not a total order on the timeline server. So I assume there are multiple "timelines" on the server, one for each application, but not a global timeline (which cost much more). Based on this assumption I think we can focus on reading from storage level firstly. I'm not quite sure what you meant by this. Could you kindly elaborate on the "partial order" and the "total order"? Maybe I missed some context. > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-2890: Attachment: YARN-2890.1.patch [~hitesh], I have attached new patch addressing your comments. > MiniMRYarnCluster should turn on timeline service if configured to do so > > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Attachments: YARN-2890.1.patch, YARN-2890.patch, YARN-2890.patch, > YARN-2890.patch, YARN-2890.patch, YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. The MiniYarnCluster should only > start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384274#comment-14384274 ] Ravi Prakash commented on YARN-3324: Hi Ravindra! Is there a reason you want to delete the image every time? Wouldn't that mean that it would have to be downloaded for each test run? Unless there's a good reason I'd be a -1 on the change. > TestDockerContainerExecutor should clean test docker image from local > repository after test is done > --- > > Key: YARN-3324 > URL: https://issues.apache.org/jira/browse/YARN-3324 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Chen He > Attachments: YARN-3324-branch-2.6.0.002.patch, > YARN-3324-trunk.002.patch > > > Current TestDockerContainerExecutor only cleans the temp directory in local > file system but leaves the test docker image in local docker repository. It > should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384275#comment-14384275 ] Mit Desai commented on YARN-2890: - The test {{testTimelineServiceStartInMiniCluster}} includes the following scenarios. 1) Timeline service should not start if {{TIMELINE_SERVICE_ENABLED == false}} and {{enableAHS}} is not set 2) Timeline service should not start if {{TIMELINE_SERVICE_ENABLED == false}} and {{enableAHS == false}} 3) Timeline service should start if {{TIMELINE_SERVICE_ENABLED == true}} and {{enableAHS == false}} 4) Timeline service should start if {{TIMELINE_SERVICE_ENABLED == false}} and {{enableAHS == true}} The following scenarios are not included: 1) {{TIMELINE_SERVICE_ENABLED == true}} and {{enableAHS}} is not set:: This case is already covered by the other tests in TestJobHistoryEventHandler 2) {{TIMELINE_SERVICE_ENABLED == true}} and {{enableAHS == true}}:: The timeline service will start if either of these is true, so this case would be a duplicate of scenarios 3 and 4. > MiniMRYarnCluster should turn on timeline service if configured to do so > > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Attachments: YARN-2890.1.patch, YARN-2890.patch, YARN-2890.patch, > YARN-2890.patch, YARN-2890.patch, YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. The MiniYarnCluster should only > start the timeline service if it is configured to do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
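The four covered scenarios above reduce to a simple disjunction. As a minimal sketch of the check being tested (the class and method names here are hypothetical, not the actual MiniYARNCluster code; only the truth table follows the comment):

```java
public final class TimelineServiceCheck {
    // Mirrors the default of the timeline-service-enabled flag (false).
    static final boolean TIMELINE_SERVICE_ENABLED_DEFAULT = false;

    /**
     * The timeline service should start if the config flag is true OR the
     * cluster was constructed with enableAHS == true. A null flag models
     * "not set", falling back to the default.
     */
    static boolean shouldStartTimelineService(Boolean timelineServiceEnabled,
                                              boolean enableAHS) {
        boolean enabled = timelineServiceEnabled != null
            ? timelineServiceEnabled
            : TIMELINE_SERVICE_ENABLED_DEFAULT;
        return enabled || enableAHS;
    }

    public static void main(String[] args) {
        // The four scenarios from the test description:
        System.out.println(shouldStartTimelineService(false, false)); // scenario 2
        System.out.println(shouldStartTimelineService(true, false));  // scenario 3
        System.out.println(shouldStartTimelineService(false, true));  // scenario 4
        System.out.println(shouldStartTimelineService(null, false));  // scenario 1
    }
}
```

Under this reading, the two excluded scenarios (`true` with `enableAHS` unset or true) are indeed redundant: the disjunction is already true from the first operand.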
[jira] [Commented] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384280#comment-14384280 ] Ravi Prakash commented on YARN-3324: And to clarify, I see the image as a dependency for the test, much like any other jar that may be needed to run a test. We don't delete the jars a test depends on after every run, so neither should we delete docker images. > TestDockerContainerExecutor should clean test docker image from local > repository after test is done > --- > > Key: YARN-3324 > URL: https://issues.apache.org/jira/browse/YARN-3324 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Chen He > Attachments: YARN-3324-branch-2.6.0.002.patch, > YARN-3324-trunk.002.patch > > > Current TestDockerContainerExecutor only cleans the temp directory in local > file system but leaves the test docker image in local docker repository. It > should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384285#comment-14384285 ] Sangjin Lee commented on YARN-3047: --- Took a quick look at the latest patch (v06). Looks good for the most part, but Field.java and TestTimelineReaderServer.java are missing the license. Could you please fix that quickly? The patch applies cleanly for me BTW. bq. And I think max number of HttpServer threads should not be merely 10. Filed YARN-3407 for it If I'm not mistaken, it looks like the number of threads can be controlled by HTTP_MAX_THREADS? And also by default jetty maxes out at 250? Wouldn't that be enough? > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3402) Security support for new timeline service.
[ https://issues.apache.org/jira/browse/YARN-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli resolved YARN-3402. --- Resolution: Duplicate Dup of YARN-3053. > Security support for new timeline service. > -- > > Key: YARN-3402 > URL: https://issues.apache.org/jira/browse/YARN-3402 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Junping Du >Assignee: Junping Du > > We should support YARN security for new TimelineService. > Basically, there should be security token exchange between AM, NMs and > app-collectors to prevent anyone who knows the service address of > app-collector can post faked/unwanted information. Also, there should be > tokens exchange between app-collector/RMTimelineCollector and backend storage > (HBase, Phoenix, etc.) that enabling security. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384338#comment-14384338 ] Chen He commented on YARN-3324: --- Good question, [~raviprak]. My original concern was the space issue and the cleanness of the machine that runs the tests. Since the local docker registry is a shared place on the machine, it may accumulate a lot of test images if we do not remove them. Test-generated temporary data should be cleaned up after the test is done. I think this is a common problem for future docker-related tests. > TestDockerContainerExecutor should clean test docker image from local > repository after test is done > --- > > Key: YARN-3324 > URL: https://issues.apache.org/jira/browse/YARN-3324 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Chen He > Attachments: YARN-3324-branch-2.6.0.002.patch, > YARN-3324-trunk.002.patch > > > Current TestDockerContainerExecutor only cleans the temp directory in local > file system but leaves the test docker image in local docker repository. It > should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3405) FairScheduler's preemption cannot happen between sibling in some case
[ https://issues.apache.org/jira/browse/YARN-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384363#comment-14384363 ] zhihai xu commented on YARN-3405: - [~kasha] is right; there is a possibility for the first scenario. If we have another queue, queue-2, which is queue-1's sibling and which the preemption comparator always ranks ahead of queue-1, then queue-2 will always be picked for preemption and queue-1 will never get a chance to be preempted. > FairScheduler's preemption cannot happen between sibling in some case > - > > Key: YARN-3405 > URL: https://issues.apache.org/jira/browse/YARN-3405 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.7.0 >Reporter: Peng Zhang >Priority: Critical > > Queue hierarchy described as below: > {noformat} > root > | > queue-1 > / \ > queue-1-1 queue-1-2 > {noformat} > 1. When queue-1-1 is active and it has been assigned with all resources. > 2. When queue-1-2 is active, and it cause some new preemption request. > 3. But when do preemption, it now starts from root, and found queue-1 is not > over fairshare, so no recursion preemption to queue-1-1. > 4. Finally queue-1-2 will be waiting for resource release form queue-1-1 > itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
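The walk described in steps 3-4 of the quoted report can be sketched as follows. This is an illustrative model of the recursion under the reporter's description, not the actual FairScheduler code; all class, field, and method names are made up:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the preemption walk; not actual FairScheduler code.
class Queue {
    final String name;
    final long usage;      // resources currently held by this queue
    final long fairShare;  // the queue's computed fair share
    final List<Queue> children = new ArrayList<>();

    Queue(String name, long usage, long fairShare) {
        this.name = name; this.usage = usage; this.fairShare = fairShare;
    }

    /** Step 3: the recursion only descends into queues OVER their fair share. */
    void collectPreemptable(List<Queue> out) {
        if (usage <= fairShare) {
            return;  // queue-1 is exactly at fair share, so the walk stops here
        }
        if (children.isEmpty()) {
            out.add(this);
        } else {
            for (Queue c : children) {
                c.collectPreemptable(out);
            }
        }
    }
}

public class PreemptionWalkDemo {
    public static void main(String[] args) {
        Queue root = new Queue("root", 100, 100);
        Queue q1 = new Queue("queue-1", 100, 100);    // at (not over) fair share
        Queue q11 = new Queue("queue-1-1", 100, 50);  // holds everything
        Queue q12 = new Queue("queue-1-2", 0, 50);    // starved
        root.children.add(q1);
        q1.children.add(q11);
        q1.children.add(q12);

        List<Queue> candidates = new ArrayList<>();
        root.collectPreemptable(candidates);
        // queue-1 is not over its fair share, so queue-1-1 is never reached.
        System.out.println(candidates.isEmpty());  // true
    }
}
```

Running the sketch yields no preemption candidates at all, matching step 4: queue-1-2 must wait for queue-1-1 to release resources on its own.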
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384361#comment-14384361 ] Sangjin Lee commented on YARN-3047: --- One more thing: I notice that TimelineReaderManager uses object types (Long) for primitive types in its methods. Is there a strong reason to do this? If not, I would prefer staying with primitive types to avoid unnecessary boxing and unboxing. I'm also going to go over the document. > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
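The boxing concern above can be illustrated with a small comparison. This is a generic Java sketch, not code from the patch; the method names are made up:

```java
public class BoxingDemo {
    // Boxed parameters: each call with a long literal autoboxes to a Long,
    // and the addition unboxes both arguments (with an NPE risk on null).
    static long sumBoxed(Long a, Long b) {
        return a + b;
    }

    // Primitive parameters: no allocation, no null case to worry about.
    static long sumPrimitive(long a, long b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1L, 2L));      // boxes, then unboxes
        System.out.println(sumPrimitive(1L, 2L));  // stays primitive throughout
        // The boxed variant can also throw NullPointerException at runtime:
        // sumBoxed(null, 2L);
    }
}
```

Beyond the per-call overhead, the primitive signature also rules out `null` at compile time, which is why primitives are usually preferred unless "value absent" must be representable.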
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384381#comment-14384381 ] Li Lu commented on YARN-3047: - I tried the latest (006) patch on my machine. There were some hunk failures with the .cmd file: {code} patching file hadoop-yarn-project/hadoop-yarn/bin/yarn patching file hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd Hunk #1 FAILED at 151. Hunk #2 FAILED at 242. Hunk #3 FAILED at 317. 3 out of 3 hunks FAILED -- saving rejects to file hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd.rej {code} BTW, why is TimelineEvents.java still here? For the documentation: I know what a thread pool is, but my point is that maybe we don't need to reach into the HttpServer2 level for now. We may also want to focus on building the whole thing up rather than deciding the best parameters for performance tuning. > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384389#comment-14384389 ] Li Lu commented on YARN-3051: - Hi [~varun_saxena], any progress on the reader API side for now? The new reader API is blocking our storage implementations, so if you have any bandwidth problems feel free to let us know. I can take it over if necessary. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3047: --- Attachment: YARN-3047.007.patch > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.007.patch, YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384415#comment-14384415 ] Hadoop QA commented on YARN-3047: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12707853/YARN-3047.007.patch against trunk revision 05499b1. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7127//console This message is automatically generated. > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.007.patch, YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384422#comment-14384422 ] Varun Saxena commented on YARN-3051: I am relatively free this weekend, so I will be able to work on this on priority. Will let you know if I run into bandwidth issues. We had decided on the three APIs below, which are somewhat similar to what existed in ATS v1. Now, as you mentioned in a comment elsewhere, we need to support metrics too. So, what kind of queries have we decided to support? For instance, queries such as "get apps which have a particular metric's value less than or greater than something"? {code} TimelineEntities getEntities(String entityType, long limit, long windowStart, Long windowEnd, String fromId, long fromTs, Collection<NameValuePair> filters, EnumSet<Field> fieldsToRetrieve) throws IOException; TimelineEntity getEntity(String entityId, String entityType, EnumSet<Field> fieldsToRetrieve) throws IOException; TimelineEvents getEntityTimelines(String entityType, SortedSet<String> entityIds, long limit, long windowStart, long windowEnd, Set<String> eventTypes) throws IOException; {code} > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
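Assuming signatures along the lines of the proposal above, a toy in-memory backend for {{getEntity}} might look like this. Everything here (the class name, the entity stand-in, the key scheme) is hypothetical and only illustrates the shape of the read interface, not any real ATS class:

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a storage reader behind the proposed API;
// TimelineEntity here is a minimal stand-in, not the real ATS type.
public class InMemoryTimelineReader {
    enum Field { EVENTS, CONFIGS, METRICS }

    static class TimelineEntity {
        String entityId;
        String entityType;
    }

    private final Map<String, TimelineEntity> store = new HashMap<>();

    // Entities are keyed by (type, id); the separator is arbitrary.
    private static String key(String entityId, String entityType) {
        return entityType + "!" + entityId;
    }

    public void put(TimelineEntity e) {
        store.put(key(e.entityId, e.entityType), e);
    }

    /** Mirrors the proposed getEntity(entityId, entityType, fieldsToRetrieve). */
    public TimelineEntity getEntity(String entityId, String entityType,
            EnumSet<Field> fieldsToRetrieve) {
        // A real backend would use fieldsToRetrieve to trim the response
        // (the shallow-view question below); this toy version ignores it
        // and would declare throws IOException like the proposed API.
        return store.get(key(entityId, entityType));
    }
}
```

The `fieldsToRetrieve` parameter is where the shallow-versus-full-view decision discussed later on this thread would be enforced by a real implementation.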
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384427#comment-14384427 ] Varun Saxena commented on YARN-3047: I think the 005 patch would work for you then. This issue has to do with line endings. TimelineEvents has been copied over from ATS v1 to support the getEvents API. I guess we have to support this v1 API. > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.007.patch, YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384434#comment-14384434 ] Varun Saxena commented on YARN-3047: Updated a new patch > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.007.patch, YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384439#comment-14384439 ] Vinod Kumar Vavilapalli commented on YARN-3304: --- I too feel the same w.r.t. the code duplication. It is going to be very hard maintaining this. To me, all ideas seem to converge on either throwing an exception or returning -1. Given there were no external developers using this before and this is the first release where we are making this public, I think we should go with the -1 approach. I also think we should clearly javadoc this class saying it should not be used by external users; it is more of an SPI for developers to extend and include their own process-tree implementation. (Writing it that way, I don't see this as public at all, but we screwed up way back when we made this a user-visible configuration.) For doing this, we should make getResourceCalculatorProcessTree() an @Private method. > ResourceCalculatorProcessTree#getCpuUsagePercent default return value is > inconsistent with other getters > > > Key: YARN-3304 > URL: https://issues.apache.org/jira/browse/YARN-3304 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Junping Du >Assignee: Junping Du >Priority: Blocker > Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, YARN-3304.patch > > > Per discussions in YARN-3296, getCpuUsagePercent() will return -1 for > unavailable case while other resource metrics are return 0 in the same case > which sounds inconsistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage
Sangjin Lee created YARN-3411: - Summary: [Storage implementation] explore the native HBase write schema for storage Key: YARN-3411 URL: https://issues.apache.org/jira/browse/YARN-3411 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Priority: Critical There is work that's in progress to implement the storage based on a Phoenix schema (YARN-3134). In parallel, we would like to explore an implementation based on a native HBase schema for the write path. Such a schema does not exclude using Phoenix, especially for reads and offline queries. Once we have basic implementations of both options, we could evaluate them in terms of performance, scalability, usability, etc. and make a call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage
[ https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C reassigned YARN-3411: Assignee: Vrushali C > [Storage implementation] explore the native HBase write schema for storage > -- > > Key: YARN-3411 > URL: https://issues.apache.org/jira/browse/YARN-3411 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Vrushali C >Priority: Critical > > There is work that's in progress to implement the storage based on a Phoenix > schema (YARN-3134). > In parallel, we would like to explore an implementation based on a native > HBase schema for the write path. Such a schema does not exclude using > Phoenix, especially for reads and offline queries. > Once we have basic implementations of both options, we could evaluate them in > terms of performance, scalability, usability, etc. and make a call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384468#comment-14384468 ] Hadoop QA commented on YARN-2890: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12707822/YARN-2890.1.patch against trunk revision 05499b1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests: org.apache.hadoop.mapreduce.jobhistory.TestJobHistoryEventHandler org.apache.hadoop.mapreduce.TestMRJobClient org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter org.apache.hadoop.conf.TestNoDefaultsJobConf org.apache.hadoop.mapreduce.v2.TestNonExistentJob org.apache.hadoop.mapreduce.v2.TestMROldApiJobs org.apache.hadoop.mapreduce.TestChild org.apache.hadoop.mapred.TestReduceFetch org.apache.hadoop.mapred.TestLazyOutput org.apache.hadoop.mapred.TestMiniMRChildTask org.apache.hadoop.mapreduce.v2.TestRMNMInfo org.apache.hadoop.mapred.TestJobName org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers org.apache.hadoop.mapreduce.security.TestBinaryTokenFile org.apache.hadoop.mapreduce.security.TestMRCredentials org.apache.hadoop.mapreduce.v2.TestUberAM org.apache.hadoop.mapreduce.v2.TestMRJobs org.apache.hadoop.mapreduce.v2.TestMRAMWithNonNormalizedCapabilities org.apache.hadoop.mapred.TestMiniMRClasspath org.apache.hadoop.mapred.TestMerge org.apache.hadoop.mapred.TestMRTimelineEventHandling org.apache.hadoop.mapreduce.v2.TestMRAppWithCombiner org.apache.hadoop.mapred.TestClusterMapReduceTestCase org.apache.hadoop.mapred.TestJobCounters org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath org.apache.hadoop.mapred.TestJobSysDirWithDFS org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler org.apache.hadoop.mapred.TestMiniMRBringup org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService org.apache.hadoop.mapreduce.security.ssl.TestEncryptedShuffle org.apache.hadoop.mapred.TestMRIntermediateDataEncryption org.apache.hadoop.mapreduce.TestMapReduceLazyOutput 
org.apache.hadoop.mapred.TestJobCleanup org.apache.hadoop.ipc.TestMRCJCSocketFactory org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution org.apache.hadoop.mapreduce.TestLargeSort org.apache.hadoop.mapred.TestClusterMRNotification org.apache.hadoop.mapred.TestNetworkedJob org.apache.hadoop.mapred.TestMiniMRClientCluster org.apache.hadoop.mapred.TestReduceFetchFromPartialMem org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShellWithNodeLabels org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell org.apache.hadoop.yarn.server.TestMiniYARNClusterForHA org.apache.hadoop.yarn.server.TestContainerManagerSecurity org.apache.hadoop.yarn.server.TestDiskFailures Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7126//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7126//console This message is automatically generated. > MiniMRYarnCluster should turn on
[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage
[ https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384531#comment-14384531 ] Li Lu commented on YARN-3411: - Hi [~vrushalic], thanks for working on this! It would be good for us to have both HBase and Phoenix storage implementations for comparison. Just keeping a record here: I think we should do the evaluation before we move into implementing the aggregations. That way we may save duplicated effort in designing and implementing aggregations. > [Storage implementation] explore the native HBase write schema for storage > -- > > Key: YARN-3411 > URL: https://issues.apache.org/jira/browse/YARN-3411 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Vrushali C >Priority: Critical > > There is work that's in progress to implement the storage based on a Phoenix > schema (YARN-3134). > In parallel, we would like to explore an implementation based on a native > HBase schema for the write path. Such a schema does not exclude using > Phoenix, especially for reads and offline queries. > Once we have basic implementations of both options, we could evaluate them in > terms of performance, scalability, usability, etc. and make a call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384566#comment-14384566 ] Li Lu commented on YARN-3051: - bq. We had decided on below three APIs' which are somewhat similar to what existed in ATS v1. Isn't that what we already have in YARN-3047? > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384578#comment-14384578 ] Varun Saxena commented on YARN-3051: No... I had initially kept it there but later moved it out so that the store implementation can be in YARN-3051. This JIRA will have the File System implementation. > [Storage abstraction] Create backing storage read interface for ATS readers > --- > > Key: YARN-3051 > URL: https://issues.apache.org/jira/browse/YARN-3051 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > > Per design in YARN-2928, create backing storage read interface that can be > implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384609#comment-14384609 ] Li Lu commented on YARN-3047: - Given the current discussion in YARN-3051, it seems we have not yet reached even a rough conclusion on the v2 timeline reader APIs. In the patch for this JIRA, we're simply assuming the minimal subset of reader APIs, but in the foreseeable future we will probably add more to support more v2 queries. We may want to minimize that kind of iteration. IMHO, the reader layer should depend on the reader APIs, but not the other way around. That said, we may want to pause this JIRA until we have a relatively workable v2 API. Once we feel we're fine with the APIs, we can then rebase the patch to connect the relatively simple wire-up work here. > [Data Serving] Set up ATS reader with basic request serving structure and > lifecycle > --- > > Key: YARN-3047 > URL: https://issues.apache.org/jira/browse/YARN-3047 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Varun Saxena > Attachments: Timeline_Reader(draft).pdf, YARN-3047.001.patch, > YARN-3047.003.patch, YARN-3047.005.patch, YARN-3047.006.patch, > YARN-3047.007.patch, YARN-3047.02.patch, YARN-3047.04.patch > > > Per design in YARN-2938, set up the ATS reader as a service and implement the > basic structure as a service. It includes lifecycle management, request > serving, and so on.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384615#comment-14384615 ] Sangjin Lee commented on YARN-3051: --- A couple of things to discuss: In principle, a *shallow* view of the entity will be returned by default, right? Specifically, I'm wondering whether all configs and metrics should be included in the default view or not. I wonder what ATS v.1 does in this regard? FYI, I believe most of the YARN REST APIs return a shallow view of objects. Note that the size of the responses could become quite big if we include configs and metrics by default. On a related note, if we decide to return shallow views by default, then the question is, how do we ask the reader to get things like configs and metrics? The reader API as well as the reader storage interface should be able to support calls to retrieve configs/metrics, perhaps with new methods. bq. For instance, queries such as get apps which have a particular metric's value less than or greater than something ? Metric/config-based queries will probably need changes to the API. We would want to be able to ask queries like "return apps where config X = Y" or "return apps where metric A > B". But we can consider them advanced queries.
[jira] [Updated] (YARN-2890) MiniMRYarnCluster should turn on timeline service if configured to do so
[ https://issues.apache.org/jira/browse/YARN-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-2890: Attachment: YARN-2890.2.patch Verified that the tests run fine on my box. Attaching another version of the patch where I fixed a silly mistake. > MiniMRYarnCluster should turn on timeline service if configured to do so > > > Key: YARN-2890 > URL: https://issues.apache.org/jira/browse/YARN-2890 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Attachments: YARN-2890.1.patch, YARN-2890.2.patch, YARN-2890.patch, > YARN-2890.patch, YARN-2890.patch, YARN-2890.patch, YARN-2890.patch > > > Currently the MiniMRYarnCluster does not consider the configuration value for > enabling timeline service before starting. The MiniYarnCluster should only > start the timeline service if it is configured to do so.
[jira] [Updated] (YARN-3331) Hadoop should have the option to use directory other than tmp for extracting and loading leveldbjni
[ https://issues.apache.org/jira/browse/YARN-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-3331: -- Component/s: (was: nodemanager) Target Version/s: 2.8.0 Summary: Hadoop should have the option to use directory other than tmp for extracting and loading leveldbjni (was: NodeManager should use directory other than tmp for extracting and loading leveldbjni) Updated the JIRA a bit, as the issue isn't limited to the NM. All components that use leveldb for storage will be affected. > Hadoop should have the option to use directory other than tmp for extracting > and loading leveldbjni > --- > > Key: YARN-3331 > URL: https://issues.apache.org/jira/browse/YARN-3331 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-3331.001.patch, YARN-3331.002.patch > > > /tmp can be required to be noexec in many environments. This causes a > problem when nodemanager tries to load the leveldbjni library which can get > unpacked and executed from /tmp.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384634#comment-14384634 ] Varun Saxena commented on YARN-3051: That is what was initially decided. We can handle the file-system implementation in another JIRA as well. But as the file-system implementation will be the default, we thought we could handle it here.
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384645#comment-14384645 ] Varun Saxena commented on YARN-3047: This JIRA was just meant to start the reader as a daemon and bring up some basic services. This can act as base code that we add to as we reach consensus on the reader API. Thoughts?
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384656#comment-14384656 ] Varun Saxena commented on YARN-3051: configs and metrics will be retrieved as part of an entity. We can filter which fields to retrieve based on {{EnumSet<Field> fieldsToRetrieve}}; null means all fields will be retrieved. So if we do not want all configs and metrics, we can leave them out and mention the other fields in fieldsToRetrieve. This can be specified in the REST URL as {{fields=}}
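The {{EnumSet}}-based filtering proposed here can be sketched in a few lines of Java. This is only an illustration; the {{Field}} enum values and the helper name are assumptions, not the actual YARN-3051 code:

```java
import java.util.EnumSet;

public class FieldFilterSketch {

    // Hypothetical field names; the real enum may differ.
    enum Field { INFO, EVENTS, RELATIONS, CONFIGS, METRICS }

    // null fieldsToRetrieve means "retrieve all fields", per the comment above.
    static boolean shouldRetrieve(Field f, EnumSet<Field> fieldsToRetrieve) {
        return fieldsToRetrieve == null || fieldsToRetrieve.contains(f);
    }

    public static void main(String[] args) {
        // Leave configs/metrics out by naming only the other fields.
        EnumSet<Field> shallow = EnumSet.of(Field.INFO, Field.EVENTS, Field.RELATIONS);
        System.out.println(shouldRetrieve(Field.CONFIGS, shallow)); // false
        System.out.println(shouldRetrieve(Field.CONFIGS, null));    // true
    }
}
```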
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384675#comment-14384675 ] Li Lu commented on YARN-3051: - bq. configs and metrics will be retrieved as part of an entity. The most significant concern here is the size of configs and metrics. I think that's why [~sjlee0] is proposing a shallow view here. Still waiting for [~zjshen]'s confirmation for v1, but for v2 I think we may need something like this.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384684#comment-14384684 ] Varun Saxena commented on YARN-3051: Keeping this in mind, do you think a new method will be required to fetch configs and metrics? I guess not.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384699#comment-14384699 ] Varun Saxena commented on YARN-3051: To elaborate on what the getEntities API will do: it will support filters similar to secondary filters by matching the info field. Yes, the API would need to be enhanced to support queries based on configs and metrics. I think it can be part of the same getEntities API. As mentioned above, for configs equality can be checked, and for metrics all the relational operators will have to be supported. We can probably have 2 additional parameters in the API, namely configFilters and metricsFilters. I guess that should do. I don't think there will be any other field on the basis of which filtering will be done.
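The configFilters/metricsFilters idea above could look roughly like the following in-memory matcher. All names here are illustrative assumptions, not the actual API: configs are matched by equality only, while metrics accept arbitrary relational predicates:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongPredicate;

public class EntityFilterSketch {

    // Bare-bones stand-in for a timeline entity's configs and metrics.
    static class Entity {
        final Map<String, String> configs = new HashMap<>();
        final Map<String, Long> metrics = new HashMap<>();
    }

    // configFilters: config name -> required value (equality check only).
    // metricFilters: metric name -> relational predicate, e.g. v -> v > 100.
    static boolean matches(Entity e,
                           Map<String, String> configFilters,
                           Map<String, LongPredicate> metricFilters) {
        for (Map.Entry<String, String> cf : configFilters.entrySet()) {
            if (!cf.getValue().equals(e.configs.get(cf.getKey()))) {
                return false;
            }
        }
        for (Map.Entry<String, LongPredicate> mf : metricFilters.entrySet()) {
            Long v = e.metrics.get(mf.getKey());
            if (v == null || !mf.getValue().test(v)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Entity e = new Entity();
        e.configs.put("X", "Y");
        e.metrics.put("A", 200L);

        Map<String, String> configFilters = new HashMap<>();
        configFilters.put("X", "Y");                // "config X = Y" style query
        Map<String, LongPredicate> metricFilters = new HashMap<>();
        metricFilters.put("A", v -> v > 100);       // "metric A > B" style query

        System.out.println(matches(e, configFilters, metricFilters)); // true
    }
}
```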
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384698#comment-14384698 ] Varun Saxena commented on YARN-3051: Yeah, I meant it can still be supported if the client specifies which fields are to be retrieved. But I do understand the concern here. The default view should return all fields except configs and metrics.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384709#comment-14384709 ] Varun Saxena commented on YARN-3051: For the point about the shallow view of an entity: we can then say that if {{fieldsToRetrieve}} is null, i.e. the client does not specify which fields to retrieve, the store implementation will return all fields except configs and metrics. I can add another special field called "all" which would indicate that all fields have to be retrieved. So if the client specifies fields=all in the REST URL, the storage implementation will fetch all the fields. Thoughts?
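A rough sketch of the proposed {{fields=}} handling (the field names and the default field set here are assumptions for illustration): an absent parameter gives the shallow default view without configs/metrics, and "all" means everything:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class FieldsParamSketch {

    // Hypothetical field names used only for this illustration.
    static final Set<String> ALL_FIELDS =
            new HashSet<>(Arrays.asList("info", "events", "relations", "configs", "metrics"));

    static Set<String> parseFields(String fieldsParam) {
        if (fieldsParam == null) {
            // Shallow default view: everything except configs and metrics.
            Set<String> shallow = new HashSet<>(ALL_FIELDS);
            shallow.remove("configs");
            shallow.remove("metrics");
            return shallow;
        }
        if (fieldsParam.equals("all")) {
            // Special "all" value proposed above: fetch every field.
            return new HashSet<>(ALL_FIELDS);
        }
        return new HashSet<>(Arrays.asList(fieldsParam.split(",")));
    }

    public static void main(String[] args) {
        System.out.println(parseFields(null).contains("configs"));  // false
        System.out.println(parseFields("all").contains("configs")); // true
        System.out.println(parseFields("configs,metrics").size());  // 2
    }
}
```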
[jira] [Updated] (YARN-3258) FairScheduler: Need to add more logging to investigate allocations
[ https://issues.apache.org/jira/browse/YARN-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3258: Attachment: YARN-3528.002.patch Addressed feedback > FairScheduler: Need to add more logging to investigate allocations > -- > > Key: YARN-3258 > URL: https://issues.apache.org/jira/browse/YARN-3258 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot >Priority: Minor > Attachments: YARN-3258.001.patch, YARN-3528.002.patch > > > It's hard to investigate allocation failures without any logging.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384761#comment-14384761 ] Varun Saxena commented on YARN-3051: bq. We may not only need to do queries for timeline entities, but also something solely for their configs and/or metrics But IIUC, metrics and configs would still be tied to, or encapsulated inside, an entity. The entity may be a cluster, an application, or something else. So when I want all configs for an app, I do that by specifying fields=configs in the REST URL. And if I want both metrics and configs for an app, I can say fields=configs,metrics.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384783#comment-14384783 ] Li Lu commented on YARN-3051: - bq. So when I say get all configs for an app. I do that by specifying fields=configs in REST URL. And if I want metrics and configs for an app, I can say fields=configs,metrics. OK, I'm just thinking out loud. So do we need to touch both the entity table and the config/metric table in the underlying storage? Now suppose I already have a timeline entity, without its metrics, and I'd like to draw a time series for its hdfs_bytes_write. Do I need to regenerate the timeline entity together with the metric, or can I say something like "get hdfs_bytes_write for this context"? BTW, we may want to consider the relationship between the context and timeline entities on the reader side. The context information is the PK of the timeline entity rows.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384785#comment-14384785 ] Varun Saxena commented on YARN-3051: Just to elaborate further, the API below will be used to serve the use case above. {code} TimelineEntity getEntity(String entityId, String entityType, EnumSet<Field> fieldsToRetrieve) {code} Assuming the entity id will be the same as the app id if the entity type is "application", we can fetch configs for application_12345_0001 like below: {{http:///application/application_12345_0001?fields=configs}}
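Putting the pieces of this thread together, a minimal in-memory mock of the quoted {{getEntity}} signature might look like this. The toy storage, the {{Field}} enum, and the shallow-by-default behavior (configs/metrics only on explicit request) are assumptions based on the discussion, not the real implementation:

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

public class ReaderSketch {

    enum Field { INFO, CONFIGS, METRICS }

    static class TimelineEntity {
        String id;
        String type;
        final Map<String, String> configs = new HashMap<>();
        final Map<String, Long> metrics = new HashMap<>();
    }

    // Toy backing store keyed by "type/id".
    final Map<String, TimelineEntity> store = new HashMap<>();

    TimelineEntity getEntity(String entityId, String entityType,
                             EnumSet<Field> fieldsToRetrieve) {
        TimelineEntity stored = store.get(entityType + "/" + entityId);
        if (stored == null) {
            return null;
        }
        TimelineEntity view = new TimelineEntity();
        view.id = stored.id;
        view.type = stored.type;
        // Shallow default: copy configs/metrics only when explicitly asked for.
        if (fieldsToRetrieve != null && fieldsToRetrieve.contains(Field.CONFIGS)) {
            view.configs.putAll(stored.configs);
        }
        if (fieldsToRetrieve != null && fieldsToRetrieve.contains(Field.METRICS)) {
            view.metrics.putAll(stored.metrics);
        }
        return view;
    }
}
```

A GET with fields=configs for an application entity would then map to something like {{getEntity("application_12345_0001", "application", EnumSet.of(Field.CONFIGS))}}.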
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384788#comment-14384788 ] Varun Saxena commented on YARN-3051: Hmm... if you don't mind, can you share the schema decided for the Phoenix-based storage? That will be helpful in designing the API.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384792#comment-14384792 ] Li Lu commented on YARN-3051: - Sure. Will post it soon.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384797#comment-14384797 ] Varun Saxena commented on YARN-3051: Thanks.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384796#comment-14384796 ] Li Lu commented on YARN-3051: - BTW, the reader APIs are not only for the Phoenix storage itself; we also need to consider the HBase implementation. On the design side, we may want to consider the common strategies, and I don't think a single storage implementation should block the progress of this JIRA.
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384802#comment-14384802 ] Varun Saxena commented on YARN-3051: Yeah, it should not block the progress of this JIRA. I was just trying to understand your use case better.
[jira] [Commented] (YARN-3304) ResourceCalculatorProcessTree#getCpuUsagePercent default return value is inconsistent with other getters
[ https://issues.apache.org/jira/browse/YARN-3304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384830#comment-14384830 ] Junping Du commented on YARN-3304: -- Thanks [~vinodkv] for the comments! I agree that the second comment is very important, and we should put it on the public interface now. For the boolean API vs. negative value question, it sounds like more people prefer the latter so far. I would like to hear more of your comments when you see what the final code looks like. For v4, I will post 2 patches: one for the boolean-API way (also addressing the 2nd comment above, based on the v3 patch), and the other for the negative-value way. Honestly speaking, I don't feel very comfortable writing the patch the latter way, as every return of -1 reads like a warning to external developers to follow the implicit contract here... Also, please note that the negative-value way is not complete work; it needs YARN-3392 ([~adhoot] has a demo patch there) and a fix in the MapReduce resource counter. > ResourceCalculatorProcessTree#getCpuUsagePercent default return value is > inconsistent with other getters > > > Key: YARN-3304 > URL: https://issues.apache.org/jira/browse/YARN-3304 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Junping Du >Assignee: Junping Du >Priority: Blocker > Attachments: YARN-3304-v2.patch, YARN-3304-v3.patch, YARN-3304.patch > > > Per discussions in YARN-3296, getCpuUsagePercent() will return -1 for the > unavailable case while other resource metrics return 0 in the same case, > which sounds inconsistent.
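The trade-off being debated here (an explicit boolean availability API versus a negative-value sentinel with an implicit contract) can be sketched as below. The names and the -1 constant are illustrative, not taken from the actual YARN-3304 patches:

```java
public class CpuUsageSketch {

    // Sentinel meaning "not yet measured / unavailable".
    static final float UNAVAILABLE = -1.0f;

    private float cpuUsagePercent = UNAVAILABLE;

    // Style 1: negative-value sentinel. Every caller must remember the
    // implicit contract and check for a value < 0 before using it.
    float getCpuUsagePercent() {
        return cpuUsagePercent;
    }

    // Style 2: explicit boolean. The availability contract is visible in
    // the API itself instead of being implied by a magic value.
    boolean isCpuUsageAvailable() {
        return cpuUsagePercent >= 0;
    }

    void setCpuUsagePercent(float percent) {
        cpuUsagePercent = percent;
    }

    public static void main(String[] args) {
        CpuUsageSketch tree = new CpuUsageSketch();
        System.out.println(tree.isCpuUsageAvailable()); // false
        tree.setCpuUsagePercent(42.5f);
        System.out.println(tree.isCpuUsageAvailable()); // true
    }
}
```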