[jira] [Commented] (YARN-6467) CSQueueMetrics needs to update the current metrics for default partition only
[ https://issues.apache.org/jira/browse/YARN-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065923#comment-16065923 ]

Manikandan R commented on YARN-6467:
------------------------------------

Fixed findbugs error. Junit failures are not related to this patch.

> CSQueueMetrics needs to update the current metrics for default partition only
> -----------------------------------------------------------------------------
>
>                 Key: YARN-6467
>                 URL: https://issues.apache.org/jira/browse/YARN-6467
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha2
>            Reporter: Naganarasimha G R
>            Assignee: Manikandan R
>             Fix For: 3.0.0-alpha4
>
>         Attachments: YARN-6467.001.patch, YARN-6467.001.patch, YARN-6467.002.patch, YARN-6467.003.patch, YARN-6467.004.patch, YARN-6467.005.patch, YARN-6467.006.patch, YARN-6467-branch-2.007.patch, YARN-6467-branch-2.008.patch, YARN-6467-branch-2.8.009.patch, YARN-6467-branch-2.8.010.patch, YARN-6467-branch-2.8.011.patch
>
>
> As a followup to YARN-6195, we need to update existing metrics to only default Partition.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6467) CSQueueMetrics needs to update the current metrics for default partition only
[ https://issues.apache.org/jira/browse/YARN-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikandan R updated YARN-6467:
-------------------------------
    Attachment: YARN-6467-branch-2.8.011.patch
[jira] [Commented] (YARN-6689) PlacementRule should be configurable
[ https://issues.apache.org/jira/browse/YARN-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065906#comment-16065906 ]

Hadoop QA commented on YARN-6689:
---------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec     |   0m 49s | Docker mode activated. |
| +1 | @author    |   0m  0s | The patch does not contain any @author tags. |
| -1 | test4tests |   0m  0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|  0 | mvndep     |   0m 45s | Maven dependency ordering for branch |
| +1 | mvninstall |  13m  3s | trunk passed |
| +1 | compile    |   8m 39s | trunk passed |
| +1 | checkstyle |   0m 48s | trunk passed |
| +1 | mvnsite    |   1m 40s | trunk passed |
| +1 | findbugs   |   3m  5s | trunk passed |
| +1 | javadoc    |   1m 15s | trunk passed |
|  0 | mvndep     |   0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall |   1m 19s | the patch passed |
| +1 | compile    |   5m 18s | the patch passed |
| +1 | javac      |   5m 18s | the patch passed |
| +1 | checkstyle |   0m 51s | the patch passed |
| +1 | mvnsite    |   1m 46s | the patch passed |
| +1 | whitespace |   0m  0s | The patch has no whitespace issues. |
| +1 | xml        |   0m  1s | The patch has no ill-formed XML file. |
| +1 | findbugs   |   3m 32s | the patch passed |
| +1 | javadoc    |   1m 29s | the patch passed |
| +1 | unit       |   0m 33s | hadoop-yarn-api in the patch passed. |
| +1 | unit       |   2m 33s | hadoop-yarn-common in the patch passed. |
| -1 | unit       |  44m 27s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense |   0m 34s | The patch does not generate ASF License warnings. |
|    |            | 100m 56s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
|                    | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6689 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12874795/YARN-6689.004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml |
| uname | Linux 4d90ed21dad0 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a5c0476 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/16261/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results |
[jira] [Commented] (YARN-6428) Queue AM limit is not honored in CS always
[ https://issues.apache.org/jira/browse/YARN-6428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065864#comment-16065864 ]

Naganarasimha G R commented on YARN-6428:
-----------------------------------------

As discussed offline with [~sunilg], what [~bibinchundatt] wanted to say is: in the ResourceCalculator.multiplyAndNormalizeUp implementations, we first multiply resource.mem (a long value; CPU usually will not have this problem) by 10^6, so the memory value cannot exceed {{Long.MAX_VALUE / 10^N}}; beyond that it hits the limit and overflows into a negative value.

> Queue AM limit is not honored in CS always
> ------------------------------------------
>
>                 Key: YARN-6428
>                 URL: https://issues.apache.org/jira/browse/YARN-6428
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>         Attachments: YARN-6428.0001.patch, YARN-6428.0002.patch
>
>
> Steps to reproduce:
> Set up a cluster with 40 GB and 40 vcores, with 4 NodeManagers of 10 GB each.
> Configure the default queue with 100% capacity and a max AM limit of 10%.
> Set the scheduler minimum allocation to 512 MB memory and 1 vcore.
> *Expected*
> AM limit of 4096 MB and 4 vcores
> *Actual*
> AM limit of 4096+512 MB and 4+1 vcores
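The overflow described above can be reproduced with plain long arithmetic. The sketch below is not Hadoop's actual ResourceCalculator code; multiplyUp and the 10^6 scale factor are simplified stand-ins for the multiply-then-divide shape of multiplyAndNormalizeUp:

```java
// Sketch (assumed, simplified): multiplying a large long memory value by a
// big ratio numerator before dividing, as a multiplyAndNormalizeUp-style
// computation does, silently overflows long and flips negative.
public class OverflowSketch {

    // Hypothetical helper with the multiply-then-divide shape of the bug:
    // value * numerator overflows once value > Long.MAX_VALUE / numerator.
    static long multiplyUp(long value, long numerator, long denominator) {
        return (value * numerator) / denominator;
    }

    public static void main(String[] args) {
        long scale = 1_000_000L;                    // the 10^6 factor mentioned above
        long safeMem = Long.MAX_VALUE / scale;      // largest value that survives

        // Within the limit the result stays non-negative...
        System.out.println(multiplyUp(safeMem, scale, scale) >= 0);
        // ...one unit past it, the intermediate product wraps negative.
        System.out.println(multiplyUp(safeMem + 1, scale, scale) < 0);
    }
}
```

This is exactly why the comment bounds memory by Long.MAX_VALUE / 10^N: the overflow happens in the intermediate product, before the normalizing division can bring the value back into range.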
[jira] [Updated] (YARN-6223) [Umbrella] Natively support GPU configuration/discovery/scheduling/isolation on YARN
[ https://issues.apache.org/jira/browse/YARN-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-6223:
-----------------------------
    Attachment: YARN-6223.wip.2.patch

Attached ver.2 WIP patch, which I discussed with [~chris.douglas] / [~vinodkv] / [~sunil.gov...@gmail.com] / [~sidharta-s] / [~vvasudev]. I would like you to help look at the overall implementation on the C side. I made it better modularized, and the new logic can be turned on/off. This is related to YARN-5673 but doesn't include refactoring of existing logic, dynamically loaded modules, etc.

TODO items:
- More validation / tests are needed on the C side; it doesn't yet handle memory leaks, etc. properly.
- Doesn't include logic for GPU allocation recovery on NM restart.

[~tangzhankun], this may be related to FPGA isolation as well. I will be traveling till early next week, so please expect delayed responses :).

> [Umbrella] Natively support GPU configuration/discovery/scheduling/isolation on YARN
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-6223
>                 URL: https://issues.apache.org/jira/browse/YARN-6223
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-6223.Natively-support-GPU-on-YARN-v1.pdf, YARN-6223.wip.1.patch, YARN-6223.wip.2.patch
>
>
> As varieties of workloads move to YARN, including machine learning / deep learning that can be sped up by leveraging GPU computation power, workloads should be able to request GPUs from YARN as simply as CPU and memory.
> *To make a complete GPU story, we should support the following pieces:*
> 1) GPU discovery/configuration: Admins can either configure GPU resources and architectures on each node, or, more advanced, the NodeManager can automatically discover GPU resources and architectures and report them to the ResourceManager.
> 2) GPU scheduling: The YARN scheduler should account for GPU as a resource type, just like CPU and memory.
> 3) GPU isolation/monitoring: once a task is launched with GPU resources, the NodeManager should properly isolate and monitor the task's resource usage.
> For #2, YARN-3926 can support it natively. For #3, YARN-3611 has introduced an extensible framework to support isolation for different resource types and different runtimes.
> *Related JIRAs:*
> There are a couple of JIRAs (YARN-4122/YARN-5517) filed with similar goals but different solutions:
> For scheduling:
> - YARN-4122/YARN-5517 both add a new GPU resource type to the Resource protocol instead of leveraging YARN-3926.
> For isolation:
> - YARN-4122 proposed to use CGroups for isolation, which cannot solve the problems listed at https://github.com/NVIDIA/nvidia-docker/wiki/GPU-isolation#challenges, such as minor device number mapping, loading the nvidia_uvm module, and mismatched CUDA/driver versions.
[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object
[ https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065840#comment-16065840 ]

Wangda Tan commented on YARN-6593:
----------------------------------

Thanks [~kkaranasos]/[~chris.douglas]/[~subru] for putting this together! I haven't checked many details in the patch, but the user-facing PlacementConstraints API looks very clear.

One question so far: is it possible to reduce the number of {{@Public}} APIs? For example, the Visitor APIs should not be user-facing, correct?

I will be traveling till next Tue; could you wait for me to look at the details of the patch next week?

> [API] Introduce Placement Constraint object
> -------------------------------------------
>
>                 Key: YARN-6593
>                 URL: https://issues.apache.org/jira/browse/YARN-6593
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Konstantinos Karanasos
>            Assignee: Konstantinos Karanasos
>             Fix For: 3.0.0-alpha3
>
>         Attachments: YARN-6593.001.patch, YARN-6593.002.patch, YARN-6593.003.patch, YARN-6593.004.patch
>
>
> This JIRA introduces an object for defining placement constraints.
[jira] [Commented] (YARN-6689) PlacementRule should be configurable
[ https://issues.apache.org/jira/browse/YARN-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065830#comment-16065830 ]

Jonathan Hung commented on YARN-6689:
-------------------------------------

004 fixes checkstyle again.

> PlacementRule should be configurable
> ------------------------------------
>
>                 Key: YARN-6689
>                 URL: https://issues.apache.org/jira/browse/YARN-6689
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>         Attachments: YARN-6689.001.patch, YARN-6689.002.patch, YARN-6689.003.patch, YARN-6689.004.patch
>
>
> YARN-3635 introduces PlacementRules for placing applications in queues. It is currently hardcoded to one rule, {{UserGroupMappingPlacementRule}}. This should be configurable, as mentioned in the comments:
> {noformat}
>   private void updatePlacementRules() throws IOException {
>     List<PlacementRule> placementRules = new ArrayList<>();
>     // Initialize UserGroupMappingPlacementRule
>     // TODO, need make this defineable by configuration.
>     UserGroupMappingPlacementRule ugRule = getUserGroupMappingPlacementRule();
>     if (null != ugRule) {
>       placementRules.add(ugRule);
>     }
>     rmContext.getQueuePlacementManager().updateRules(placementRules);
>   }
> {noformat}
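For illustration, "configurable" here could mean instantiating rule classes listed in a configuration value via reflection. This is only a sketch under stated assumptions: the PlacementRule interface, the sample rule class, and the loadRules helper below are simplified local stand-ins, not YARN's actual classes, config keys, or the approach taken by the patch:

```java
// Sketch: build a placement-rule list from a comma-separated class-name
// string, the way a config-driven updatePlacementRules might. All names
// below are illustrative, not YARN's real API.
import java.util.ArrayList;
import java.util.List;

interface PlacementRule {          // simplified stand-in for YARN's PlacementRule
    String getName();
}

class UserGroupMappingPlacementRule implements PlacementRule {
    public String getName() { return "user-group-mapping"; }
}

public class ConfigurableRules {
    // Hypothetical helper: a YARN config value could hold these class names.
    static List<PlacementRule> loadRules(String classNames) {
        List<PlacementRule> rules = new ArrayList<>();
        for (String name : classNames.split(",")) {
            try {
                // Instantiate each configured rule via its no-arg constructor.
                rules.add((PlacementRule) Class.forName(name.trim())
                        .getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot instantiate rule: " + name, e);
            }
        }
        return rules;
    }

    public static void main(String[] args) {
        List<PlacementRule> rules = loadRules("UserGroupMappingPlacementRule");
        System.out.println(rules.get(0).getName());
    }
}
```

The design point is that the hardcoded `getUserGroupMappingPlacementRule()` call becomes just the default entry in a configured list, so new rules can be added without touching scheduler code.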
[jira] [Updated] (YARN-6689) PlacementRule should be configurable
[ https://issues.apache.org/jira/browse/YARN-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hung updated YARN-6689:
--------------------------------
    Attachment: YARN-6689.004.patch
[jira] [Commented] (YARN-6708) Nodemanager container crash after ext3 folder limit
[ https://issues.apache.org/jira/browse/YARN-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065823#comment-16065823 ]

Bibin A Chundatt commented on YARN-6708:
----------------------------------------

Thank you [~jlowe] for the review.

{quote}
Is there a reason to use 755 permissions on the intermediate directories in the user cache? Note that we only allow 710 permissions on the final directly, and it seems intermediate directories should only require that as well, or 750 at the most. I don't see any reason to allow any "other" permissions on user-specific directories in the local cache.
{quote}

{{755}} is the existing directory permission for cache folders in {{FSDownload#cacheperms}}. If the NodeManager service user and the application users are in different groups, the NodeManager should still be able to check the availability of existing cache folders during download and recovery.

{{LocalResourcesTrackerImpl#handle}}:
{code}
    case REQUEST:
      if (rsrc != null && (!isResourcePresent(rsrc))) {
        LOG.info("Resource " + rsrc.getLocalPath()
            + " is missing, localizing it again");
        removeResource(req);
        rsrc = null;
      }
{code}

{quote}
It would improve readability if we moved the directory handling stuff to a utility method like createDirAndParents or something similar and pass the desired permissions for the dirs as an argument.
{quote}

Will update in the next patch.

{code}
vm2:/opt/hadoop/release/data/nmlocal/usercache/mapred/filecache # l
total 28
drwx--x--- 7 mapred hadoop 4096 Jun 10 14:35 ./
drwxr-s--- 4 mapred hadoop 4096 Jun 10 12:07 ../
drwxr-x--- 3 mapred users  4096 Jun 10 14:36 0/
drwxr-xr-x 3 mapred users  4096 Jun 10 12:15 10/
drwxr-xr-x 3 mapred users  4096 Jun 10 12:22 11/
drwxr-xr-x 3 mapred users  4096 Jun 10 12:27 12/
drwxr-xr-x 3 mapred users  4096 Jun 10 12:31 13/
{code}

{quote}
If the parent of destDirPath is the cache root then we won't set the permissions of destDirPath but otherwise we will?
{quote}

The already existing {{FSDownload}} code handles this case: {{FSDownload}} *cacheperms* sets the directory permissions to *755*. Shouldn't {{FSDownload}} have been in {{nodemanager}}, since it is tightly coupled to the directory permissions w.r.t. localization? Or am I missing something?

{quote}
AtomicLong use is overkill in the test since there's no thread contention on that object.
{quote}

Yes, we don't require it; will change. In the test I tried to cover the complete flow with multiple base directories, a single base directory, etc. On second thought we really don't require that, and the LocalCacheDirectoryManager part we could skip. Creating the paths {{12}}, {{1/14}} and {{0/0/85}} should be enough for the current code change.

{{FSDownload}} handles the final cache directory permissions. Even if 0/0/85 is created before download, in FSDownload the permissions for {{85}} could get reset, right? The directory permission is 755, and in Jenkins the umask is 022; to validate directory rights for the code change, reflection is used. The ContainerLocalizer USERCACHE permission could be package-private, but per the above point {{FSDownload}} will set the rights to 0755. Or should we be checking only {{0/0}}?

> Nodemanager container crash after ext3 folder limit
> ---------------------------------------------------
>
>                 Key: YARN-6708
>                 URL: https://issues.apache.org/jira/browse/YARN-6708
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>         Attachments: YARN-6708.001.patch, YARN-6708.002.patch, YARN-6708.003.patch, YARN-6708.004.patch
>
>
> Configure umask as *027* for the nodemanager service user and {{yarn.nodemanager.local-cache.max-files-per-directory}} as {{40}}. After 4 *private* dir localizations, the next directory will be *0/14*.
> Local Directory cache manager:
> {code}
> vm2:/opt/hadoop/release/data/nmlocal/usercache/mapred/filecache # l
> total 28
> drwx--x--- 7 mapred hadoop 4096 Jun 10 14:35 ./
> drwxr-s--- 4 mapred hadoop 4096 Jun 10 12:07 ../
> drwxr-x--- 3 mapred users  4096 Jun 10 14:36 0/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:15 10/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:22 11/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:27 12/
> drwxr-xr-x 3 mapred users  4096 Jun 10 12:31 13/
> {code}
> *drwxr-x--- 3 mapred users 4096 Jun 10 14:36 0/* is only *750*. The NodeManager user will not be able to check whether the localization path exists or not.
> {{LocalResourcesTrackerImpl}}:
> {code}
> case REQUEST:
>   if (rsrc != null && (!isResourcePresent(rsrc))) {
>     LOG.info("Resource " + rsrc.getLocalPath()
>         + " is missing, localizing it again");
>     removeResource(req);
>     rsrc = null;
>   }
>   if (null == rsrc) {
>     rsrc = new LocalizedResource(req, dispatcher);
>     localrsrc.put(req, rsrc);
>   }
>   break;
> {code}
> *isResourcePresent* will always return false and same
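The createDirAndParents idea discussed above can be sketched with java.nio: create each missing ancestor and then set its permissions explicitly, so the process umask (e.g. 027) cannot strip the group/other bits off intermediate cache directories. This is an assumed, simplified illustration of the approach, not the actual patch or Hadoop's FSDownload code:

```java
// Sketch: create a directory and all missing parents with explicit POSIX
// permissions, independent of the process umask. Names are illustrative.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class CreateDirWithPerms {

    // Hypothetical helper resembling the proposed createDirAndParents:
    // make every missing ancestor, then force the requested permissions.
    static void createDirAndParents(Path dir, Set<PosixFilePermission> perms)
            throws IOException {
        Path parent = dir.getParent();
        if (parent != null && !Files.exists(parent)) {
            createDirAndParents(parent, perms);
        }
        if (!Files.exists(dir)) {
            Files.createDirectory(dir);
            // Applied after creation, so the umask cannot mask out the bits.
            Files.setPosixFilePermissions(dir, perms);
        }
    }

    // Creates <tmp>/0/0/85 and reports whether the intermediate "0/0"
    // directory ended up with the requested 0755 bits despite the umask.
    static boolean demo() {
        try {
            Path base = Files.createTempDirectory("filecache");
            Set<PosixFilePermission> p = PosixFilePermissions.fromString("rwxr-xr-x");
            createDirAndParents(base.resolve("0").resolve("0").resolve("85"), p);
            return Files.getPosixFilePermissions(
                    base.resolve("0").resolve("0")).equals(p);
        } catch (IOException | UnsupportedOperationException e) {
            return false; // non-POSIX file system or I/O failure
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // "true" on a POSIX file system
    }
}
```

With this shape, a restrictive service-user umask like 027 no longer produces 750 intermediate directories such as `0/`, which is exactly the condition that made isResourcePresent fail in the bug report.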
[jira] [Commented] (YARN-6689) PlacementRule should be configurable
[ https://issues.apache.org/jira/browse/YARN-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065811#comment-16065811 ]

Hadoop QA commented on YARN-6689:
---------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec     |   0m 21s | Docker mode activated. |
| +1 | @author    |   0m  0s | The patch does not contain any @author tags. |
| -1 | test4tests |   0m  0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|  0 | mvndep     |   0m 47s | Maven dependency ordering for branch |
| +1 | mvninstall |  14m 20s | trunk passed |
| +1 | compile    |   9m 12s | trunk passed |
| +1 | checkstyle |   0m 55s | trunk passed |
| +1 | mvnsite    |   1m 51s | trunk passed |
| +1 | findbugs   |   3m 21s | trunk passed |
| +1 | javadoc    |   1m 30s | trunk passed |
|  0 | mvndep     |   0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall |   1m 24s | the patch passed |
| +1 | compile    |   5m 43s | the patch passed |
| +1 | javac      |   5m 43s | the patch passed |
| -0 | checkstyle |   0m 53s | hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 286 unchanged - 0 fixed = 287 total (was 286) |
| +1 | mvnsite    |   1m 49s | the patch passed |
| +1 | whitespace |   0m  0s | The patch has no whitespace issues. |
| +1 | xml        |   0m  1s | The patch has no ill-formed XML file. |
| +1 | findbugs   |   3m 48s | the patch passed |
| +1 | javadoc    |   1m 26s | the patch passed |
| +1 | unit       |   0m 32s | hadoop-yarn-api in the patch passed. |
| +1 | unit       |   2m 26s | hadoop-yarn-common in the patch passed. |
| -1 | unit       |  46m 33s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense |   0m 36s | The patch does not generate ASF License warnings. |
|    |            | 105m 32s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
|                    | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation |
| Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6689 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12874780/YARN-6689.003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml |
| uname | Linux 5798ab17b22d 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 686a634 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle |
[jira] [Commented] (YARN-4161) Capacity Scheduler : Assign single or multiple containers per heart beat driven by configuration
[ https://issues.apache.org/jira/browse/YARN-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065803#comment-16065803 ]

Wangda Tan commented on YARN-4161:
----------------------------------

Thanks [~ywskycn] for the patch. [~sunilg], could you help review the patch?

> Capacity Scheduler : Assign single or multiple containers per heart beat driven by configuration
> -------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4161
>                 URL: https://issues.apache.org/jira/browse/YARN-4161
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: capacity scheduler
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>              Labels: oct16-medium
>         Attachments: YARN-4161.patch, YARN-4161.patch.1
>
>
> The Capacity Scheduler right now schedules multiple containers per heartbeat if there are more resources available in the node. This approach works fine; however, in some cases it does not distribute the load across the cluster, so cluster throughput suffers. I am adding a feature to drive this via configuration, so that we can control the number of containers assigned per heartbeat.
[jira] [Commented] (YARN-6600) Enhance default lifetime of application at LeafQueue level.
[ https://issues.apache.org/jira/browse/YARN-6600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065777#comment-16065777 ]

Hadoop QA commented on YARN-6600:
---------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec     |   0m 20s | Docker mode activated. |
| +1 | @author    |   0m  0s | The patch does not contain any @author tags. |
| +1 | test4tests |   0m  0s | The patch appears to include 3 new or modified test files. |
|  0 | mvndep     |   0m  9s | Maven dependency ordering for branch |
| +1 | mvninstall |  13m 36s | trunk passed |
| +1 | compile    |   9m 22s | trunk passed |
| +1 | checkstyle |   0m 58s | trunk passed |
| +1 | mvnsite    |   2m 18s | trunk passed |
| +1 | findbugs   |   3m 58s | trunk passed |
| +1 | javadoc    |   2m  7s | trunk passed |
|  0 | mvndep     |   0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall |   1m 51s | the patch passed |
| +1 | compile    |   6m  1s | the patch passed |
| +1 | javac      |   6m  1s | the patch passed |
| -0 | checkstyle |   1m 10s | hadoop-yarn-project/hadoop-yarn: The patch generated 8 new + 574 unchanged - 1 fixed = 582 total (was 575) |
| +1 | mvnsite    |   2m 23s | the patch passed |
| -1 | whitespace |   0m  0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | xml        |   0m  1s | The patch has no ill-formed XML file. |
| +1 | findbugs   |   4m 25s | the patch passed |
| +1 | javadoc    |   1m 47s | the patch passed |
| +1 | unit       |   0m 32s | hadoop-yarn-api in the patch passed. |
| +1 | unit       |   2m 29s | hadoop-yarn-common in the patch passed. |
| -1 | unit       |  48m 59s | hadoop-yarn-server-resourcemanager in the patch failed. |
| -1 | unit       |  19m 56s | hadoop-yarn-client in the patch failed. |
| +1 | asflicense |   0m 31s | The patch does not generate ASF License warnings. |
|    |            | 131m 23s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
|                    | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
|                    | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation |
|                    | hadoop.yarn.client.api.impl.TestAMRMProxy |
| Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6600 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12874772/YARN-6600.01.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml |
| uname | Linux ef6ef01060a1 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
[jira] [Commented] (YARN-6492) Generate queue metrics for each partition
[ https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065753#comment-16065753 ] Naganarasimha G R commented on YARN-6492: - Hi [~jhung], YARN-6467 is the base for this jira, and [~maniraj...@gmail.com] is already working on it; it is almost on the verge of being committed. He has ensured that the required things are already handled in it. We were already working on it with this jira in mind, so I hope you don't mind if Mani takes over this jira! > Generate queue metrics for each partition > - > > Key: YARN-6492 > URL: https://issues.apache.org/jira/browse/YARN-6492 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Jonathan Hung >Assignee: Naganarasimha G R > > We are interested in having queue metrics for all partitions. Right now each > queue has one QueueMetrics object which captures metrics either in the default > partition or across all partitions. (After YARN-6467 it will be in the default > partition.) > But having the partition metrics would be very useful. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6467) CSQueueMetrics needs to update the current metrics for default partition only
[ https://issues.apache.org/jira/browse/YARN-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065748#comment-16065748 ] Naganarasimha G R edited comment on YARN-6467 at 6/28/17 12:57 AM: --- Hi [~maniraj...@gmail.com], can you please check whether the findbugs error and the test case failures are related to the patch? was (Author: naganarasimha): Hi > CSQueueMetrics needs to update the current metrics for default partition only > - > > Key: YARN-6467 > URL: https://issues.apache.org/jira/browse/YARN-6467 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha2 >Reporter: Naganarasimha G R >Assignee: Manikandan R > Fix For: 3.0.0-alpha4 > > Attachments: YARN-6467.001.patch, YARN-6467.001.patch, > YARN-6467.002.patch, YARN-6467.003.patch, YARN-6467.004.patch, > YARN-6467.005.patch, YARN-6467.006.patch, YARN-6467-branch-2.007.patch, > YARN-6467-branch-2.008.patch, YARN-6467-branch-2.8.009.patch, > YARN-6467-branch-2.8.010.patch > > > As a followup to YARN-6195, we need to update existing metrics to only > default Partition. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6467) CSQueueMetrics needs to update the current metrics for default partition only
[ https://issues.apache.org/jira/browse/YARN-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065748#comment-16065748 ] Naganarasimha G R commented on YARN-6467: - Hi > CSQueueMetrics needs to update the current metrics for default partition only > - > > Key: YARN-6467 > URL: https://issues.apache.org/jira/browse/YARN-6467 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 2.8.0, 2.7.3, 3.0.0-alpha2 >Reporter: Naganarasimha G R >Assignee: Manikandan R > Fix For: 3.0.0-alpha4 > > Attachments: YARN-6467.001.patch, YARN-6467.001.patch, > YARN-6467.002.patch, YARN-6467.003.patch, YARN-6467.004.patch, > YARN-6467.005.patch, YARN-6467.006.patch, YARN-6467-branch-2.007.patch, > YARN-6467-branch-2.008.patch, YARN-6467-branch-2.8.009.patch, > YARN-6467-branch-2.8.010.patch > > > As a followup to YARN-6195, we need to update existing metrics to only > default Partition. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6689) PlacementRule should be configurable
[ https://issues.apache.org/jira/browse/YARN-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065696#comment-16065696 ] Jonathan Hung commented on YARN-6689: - 003 patch fixes the TestYarnConfigurationFields test (others seem unrelated) and also fixes checkstyle. > PlacementRule should be configurable > > > Key: YARN-6689 > URL: https://issues.apache.org/jira/browse/YARN-6689 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Assignee: Jonathan Hung > Attachments: YARN-6689.001.patch, YARN-6689.002.patch, > YARN-6689.003.patch > > > YARN-3635 introduces PlacementRules for placing applications in queues. It is > currently hardcoded to one rule, {{UserGroupMappingPlacementRule}}. This > should be configurable as mentioned in the comments:{noformat} private void > updatePlacementRules() throws IOException { > List<PlacementRule> placementRules = new ArrayList<>(); > // Initialize UserGroupMappingPlacementRule > // TODO, need make this defineable by configuration. > UserGroupMappingPlacementRule ugRule = getUserGroupMappingPlacementRule(); > if (null != ugRule) { > placementRules.add(ugRule); > } > rmContext.getQueuePlacementManager().updateRules(placementRules); > }{noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6689) PlacementRule should be configurable
[ https://issues.apache.org/jira/browse/YARN-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated YARN-6689: Attachment: YARN-6689.003.patch > PlacementRule should be configurable > > > Key: YARN-6689 > URL: https://issues.apache.org/jira/browse/YARN-6689 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Assignee: Jonathan Hung > Attachments: YARN-6689.001.patch, YARN-6689.002.patch, > YARN-6689.003.patch > > > YARN-3635 introduces PlacementRules for placing applications in queues. It is > currently hardcoded to one rule, {{UserGroupMappingPlacementRule}}. This > should be configurable as mentioned in the comments:{noformat} private void > updatePlacementRules() throws IOException { > List<PlacementRule> placementRules = new ArrayList<>(); > // Initialize UserGroupMappingPlacementRule > // TODO, need make this defineable by configuration. > UserGroupMappingPlacementRule ugRule = getUserGroupMappingPlacementRule(); > if (null != ugRule) { > placementRules.add(ugRule); > } > rmContext.getQueuePlacementManager().updateRules(placementRules); > }{noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources
[ https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065682#comment-16065682 ] Wangda Tan commented on YARN-5881: -- Created a branch (YARN-5881) for this feature. > Enable configuration of queue capacity in terms of absolute resources > - > > Key: YARN-5881 > URL: https://issues.apache.org/jira/browse/YARN-5881 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sean Po >Assignee: Wangda Tan > Attachments: > YARN-5881.Support.Absolute.Min.Max.Resource.In.Capacity.Scheduler.design-doc.v1.pdf, > YARN-5881.v0.patch, YARN-5881.v1.patch > > > Currently, Yarn RM supports the configuration of queue capacity in terms of a > proportion to cluster capacity. In the context of Yarn being used as a public > cloud service, it makes more sense if queues can be configured absolutely. > This will allow administrators to set usage limits more concretely and > simplify customer expectations for cluster allocation. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6689) PlacementRule should be configurable
[ https://issues.apache.org/jira/browse/YARN-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065674#comment-16065674 ] Hadoop QA commented on YARN-6689: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 37s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 42s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 48s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 16 new + 286 unchanged - 0 fixed = 302 total (was 286) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 29s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 5s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 89m 3s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields | | | hadoop.yarn.server.resourcemanager.TestRMRestart | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation | | Timed out junit tests | org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6689 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12874765/YARN-6689.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 889aff0cd2bc 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 63ce159 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/16258/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt | | unit |
[jira] [Commented] (YARN-6600) Enhance default lifetime of application at LeafQueue level.
[ https://issues.apache.org/jira/browse/YARN-6600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065656#comment-16065656 ] Rohith Sharma K S commented on YARN-6600: - High-level changes in the patch: # Introduced a CS scheduler config key to load the max lifetime value. # On submission, RMAppImpl validates the scheduler max lifetime value and sets it accordingly. # On update, RMAppManager validates the scheduler max lifetime value. If the updated value exceeds the queue max lifetime, it is cut off to the queue max lifetime and the new timeout value is sent in the response. # Changed the update response proto to carry the new timeout value in ISO 8601 format. # Tests are modified accordingly. > Enhance default lifetime of application at LeafQueue level. > --- > > Key: YARN-6600 > URL: https://issues.apache.org/jira/browse/YARN-6600 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: YARN-6600.01.patch, [YARN-6600] Extend lifetime to > scheduler Leaf Queue.pdf > > > Setting a timeout at LeafQueue level allows the admin to guard against bad apps > which use most of the resources all the time. > Example: Any application submitted to a particular queue, i.e. QUEUE-1, should > not run more than N hours. Even if the user sets the lifetime to N+1 hours, the > application will be killed after N hours. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
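The queue-level cut-off described in the comment above amounts to clamping the requested lifetime to the queue's maximum. A minimal sketch follows; the class name, method name, and {{UNLIMITED}} sentinel are illustrative, not the patch's actual identifiers.

```java
// Sketch of the lifetime clamp described in the comment above: if the
// requested lifetime exceeds the queue's configured maximum, it is cut
// off to that maximum. Names are illustrative, not the patch's identifiers.
class LifetimeClamp {
    static final long UNLIMITED = -1L;

    static long effectiveLifetime(long requestedSeconds, long queueMaxSeconds) {
        if (queueMaxSeconds == UNLIMITED) {
            return requestedSeconds;      // queue imposes no cap
        }
        if (requestedSeconds == UNLIMITED || requestedSeconds > queueMaxSeconds) {
            return queueMaxSeconds;       // cut off to the queue maximum
        }
        return requestedSeconds;
    }
}
```

This matches the example in the issue description: with a queue maximum of N hours, a user-supplied lifetime of N+1 hours is clamped back to N.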
[jira] [Updated] (YARN-6600) Enhance default lifetime of application at LeafQueue level.
[ https://issues.apache.org/jira/browse/YARN-6600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-6600: Attachment: YARN-6600.01.patch > Enhance default lifetime of application at LeafQueue level. > --- > > Key: YARN-6600 > URL: https://issues.apache.org/jira/browse/YARN-6600 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: YARN-6600.01.patch, [YARN-6600] Extend lifetime to > scheduler Leaf Queue.pdf > > > Setting a timeout at LeafQueue level allows the admin to guard against bad apps > which use most of the resources all the time. > Example: Any application submitted to a particular queue, i.e. QUEUE-1, should > not run more than N hours. Even if the user sets the lifetime to N+1 hours, the > application will be killed after N hours. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6600) Enhance default lifetime of application at LeafQueue level.
[ https://issues.apache.org/jira/browse/YARN-6600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-6600: Attachment: [YARN-6600] Extend lifetime to scheduler Leaf Queue.pdf > Enhance default lifetime of application at LeafQueue level. > --- > > Key: YARN-6600 > URL: https://issues.apache.org/jira/browse/YARN-6600 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: [YARN-6600] Extend lifetime to scheduler Leaf Queue.pdf > > > Setting a timeout at LeafQueue level allows the admin to guard against bad apps > which use most of the resources all the time. > Example: Any application submitted to a particular queue, i.e. QUEUE-1, should > not run more than N hours. Even if the user sets the lifetime to N+1 hours, the > application will be killed after N hours. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5311) Document graceful decommission CLI and usage
[ https://issues.apache.org/jira/browse/YARN-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065623#comment-16065623 ] Hadoop QA commented on YARN-5311: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 48s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | 
{color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 0s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 0s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 206 unchanged - 0 fixed = 207 total (was 206) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 32s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 15s{color} | {color:green} hadoop-yarn-site in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 48m 18s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-5311 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12860373/YARN-5311.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 22b5dc001794 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / bc4dfe9 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle |
[jira] [Commented] (YARN-5311) Document graceful decommission CLI and usage
[ https://issues.apache.org/jira/browse/YARN-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065612#comment-16065612 ] Hadoop QA commented on YARN-5311: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 51s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | 
{color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 51s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 58s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 206 unchanged - 0 fixed = 207 total (was 206) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 34s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 16s{color} | {color:green} hadoop-yarn-site in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 49m 40s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-5311 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12860373/YARN-5311.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 6b7519b2290e 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / bc4dfe9 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle |
[jira] [Commented] (YARN-6738) LevelDBCacheTimelineStore should reuse ObjectMapper instances
[ https://issues.apache.org/jira/browse/YARN-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065610#comment-16065610 ] Hudson commented on YARN-6738: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11934 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11934/]) YARN-6738. LevelDBCacheTimelineStore should reuse ObjectMapper (jlowe: rev 63ce1593c5b78eb172773e7498d9c321debe81e8) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/main/java/org/apache/hadoop/yarn/server/timeline/LevelDBCacheTimelineStore.java > LevelDBCacheTimelineStore should reuse ObjectMapper instances > - > > Key: YARN-6738 > URL: https://issues.apache.org/jira/browse/YARN-6738 > Project: Hadoop YARN > Issue Type: Improvement > Components: timelineserver >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2 > > Attachments: Screen Shot 2017-06-23 at 2.43.06 PM.png, > YARN-6738.1.patch, YARN-6738.2.patch > > > Using Tez UI sometimes times out, and the cause was that the query was > quite large and the leveldb handler seems to recreate the > {{ObjectMapper}} for every read. This is unfortunate, since the ObjectMapper > has to rescan the class annotations, which may take some time. > Keeping the ObjectMapper reduces the ATS call time from 17 seconds to 3 > seconds for me, which was enough to get my tez-ui working again :) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
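The pattern behind this fix — build the expensive mapper once and share it across reads instead of recreating it per read — can be sketched as follows. {{CostlyMapper}} here is a stand-in for Jackson's ObjectMapper (which is thread-safe once configured), since the real class is not shown in this digest; the construction counter exists only to make the saving observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the reuse pattern behind this fix: construct the expensive
// mapper once and share it, instead of rebuilding it on every read.
// CostlyMapper is a stand-in for Jackson's ObjectMapper; the counter
// only makes the saving observable.
class MapperReuse {
    static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

    /** Stand-in for ObjectMapper: expensive to build, cheap to use. */
    static class CostlyMapper {
        CostlyMapper() { CONSTRUCTIONS.incrementAndGet(); } // e.g. annotation scan
        String read(String raw) { return raw.trim(); }
    }

    // Before the fix: a new CostlyMapper inside every read -> N constructions.
    // After the fix: one shared instance -> exactly one construction.
    private static final CostlyMapper MAPPER = new CostlyMapper();

    static String readEntity(String raw) {
        return MAPPER.read(raw);
    }
}
```

However many times {{readEntity}} is called, the costly construction (in Jackson's case, the class-annotation scan) happens once, which is where the reported 17s-to-3s improvement comes from.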
[jira] [Comment Edited] (YARN-6689) PlacementRule should be configurable
[ https://issues.apache.org/jira/browse/YARN-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065564#comment-16065564 ] Jonathan Hung edited comment on YARN-6689 at 6/27/17 10:15 PM: --- Hi [~xgong]/[~leftnoteasy], as discussed, attached 002 patch which: # changes userGroup to user-group (for UserGroupMappingPlacementRule) # tries to load PlacementRule using reflection if keyword is not found was (Author: jhung): Hi [~xgong], as discussed, attached 002 patch which: # changes userGroup to user-group (for UserGroupMappingPlacementRule) # tries to load PlacementRule using reflection if keyword is not found > PlacementRule should be configurable > > > Key: YARN-6689 > URL: https://issues.apache.org/jira/browse/YARN-6689 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Assignee: Jonathan Hung > Attachments: YARN-6689.001.patch, YARN-6689.002.patch > > > YARN-3635 introduces PlacementRules for placing applications in queues. It is > currently hardcoded to one rule, {{UserGroupMappingPlacementRule}}. This > should be configurable as mentioned in the comments:{noformat} private void > updatePlacementRules() throws IOException { > List<PlacementRule> placementRules = new ArrayList<>(); > // Initialize UserGroupMappingPlacementRule > // TODO, need make this defineable by configuration. > UserGroupMappingPlacementRule ugRule = getUserGroupMappingPlacementRule(); > if (null != ugRule) { > placementRules.add(ugRule); > } > rmContext.getQueuePlacementManager().updateRules(placementRules); > }{noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6689) PlacementRule should be configurable
[ https://issues.apache.org/jira/browse/YARN-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065564#comment-16065564 ] Jonathan Hung commented on YARN-6689: - Hi [~xgong], as discussed, attached 002 patch which: # changes userGroup to user-group (for UserGroupMappingPlacementRule) # tries to load PlacementRule using reflection if keyword is not found > PlacementRule should be configurable > > > Key: YARN-6689 > URL: https://issues.apache.org/jira/browse/YARN-6689 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Assignee: Jonathan Hung > Attachments: YARN-6689.001.patch, YARN-6689.002.patch > > > YARN-3635 introduces PlacementRules for placing applications in queues. It is > currently hardcoded to one rule, {{UserGroupMappingPlacementRule}}. This > should be configurable as mentioned in the comments:{noformat} private void > updatePlacementRules() throws IOException { > List<PlacementRule> placementRules = new ArrayList<>(); > // Initialize UserGroupMappingPlacementRule > // TODO, need make this defineable by configuration. > UserGroupMappingPlacementRule ugRule = getUserGroupMappingPlacementRule(); > if (null != ugRule) { > placementRules.add(ugRule); > } > rmContext.getQueuePlacementManager().updateRules(placementRules); > }{noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
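The keyword-then-reflection lookup described in the comment above can be sketched as follows. All names here ({{PlacementRuleResolver}}, {{RULE_KEYWORDS}}, the {{UserGroupRule}} stand-in) are hypothetical illustrations, not the identifiers in the actual 002 patch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the 002 patch's lookup order: resolve a configured
// placement-rule name against known keywords first, then fall back to
// loading it as a fully-qualified class name via reflection. All names
// here are hypothetical, not the identifiers in the actual patch.
class PlacementRuleResolver {
    /** Stand-in for UserGroupMappingPlacementRule. */
    static class UserGroupRule { }

    private static final Map<String, Class<?>> RULE_KEYWORDS = new HashMap<>();
    static {
        RULE_KEYWORDS.put("user-group", UserGroupRule.class);
    }

    static Class<?> resolve(String configured) throws ClassNotFoundException {
        Class<?> clazz = RULE_KEYWORDS.get(configured);
        if (clazz != null) {
            return clazz;                 // known keyword
        }
        return Class.forName(configured); // reflection fallback
    }
}
```

An unknown keyword that is not a loadable class name surfaces as a {{ClassNotFoundException}}, which the scheduler configuration layer would presumably translate into a configuration error.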
[jira] [Updated] (YARN-6689) PlacementRule should be configurable
[ https://issues.apache.org/jira/browse/YARN-6689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hung updated YARN-6689: Attachment: YARN-6689.002.patch > PlacementRule should be configurable > > > Key: YARN-6689 > URL: https://issues.apache.org/jira/browse/YARN-6689 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Assignee: Jonathan Hung > Attachments: YARN-6689.001.patch, YARN-6689.002.patch > > > YARN-3635 introduces PlacementRules for placing applications in queues. It is > currently hardcoded to one rule, {{UserGroupMappingPlacementRule}}. This > should be configurable as mentioned in the comments:{noformat} private void > updatePlacementRules() throws IOException { > List<PlacementRule> placementRules = new ArrayList<>(); > // Initialize UserGroupMappingPlacementRule > // TODO, need make this defineable by configuration. > UserGroupMappingPlacementRule ugRule = getUserGroupMappingPlacementRule(); > if (null != ugRule) { > placementRules.add(ugRule); > } > rmContext.getQueuePlacementManager().updateRules(placementRules); > }{noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6708) Nodemanager container crash after ext3 folder limit
[ https://issues.apache.org/jira/browse/YARN-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065535#comment-16065535 ] Jason Lowe commented on YARN-6708: -- Thanks for updating the patch! This looks a lot cleaner. Is there a reason to use 755 permissions on the intermediate directories in the user cache? Note that we only allow 710 permissions on the final directory, and it seems intermediate directories should only require that as well, or 750 at the most. I don't see any reason to allow any "other" permissions on user-specific directories in the local cache. If the parent of destDirPath is the cache root then we won't set the permissions of destDirPath but otherwise we will? Seems like we could end up with inconsistent permissions on dest directories since sometimes the disk validator will end up creating it. I think we should just always do the while loop rather than special-case the no-intermediate-directories case -- or am I missing something? Nit: It would improve readability if we moved the directory handling code to a utility method like createDirAndParents or something similar and passed the desired permissions for the dirs as an argument. There should be an @After method that deletes {{basedir}} so we don't leave cruft on the filesystem if a unit test fails. The AtomicLong use is overkill in the test since there's no thread contention on that object. Actually the test could be a _lot_ simpler if we just told the container localizer to download a single file to basedir/0/0/85/ and verify the permissions of the directories are correct. No need to download a bunch of times -- we're not testing the LocalCacheDirectoryManager here, and if the LocalCacheDirectoryManager internals change then this test is going to fail since it knows a bit too much about it. Similarly I do not see the need to actually create a jar and copy it.
We can mock out the downloading process (see the {{mockOutDownloads}} method) so we don't have to have a real file on disk to download. The localizer will still create the directories but the executor service won't actually call FSDownload, and that's fine for what we're trying to test. The test should not be using reflection to modify access to objects. The user cache permissions can be package private, and we can just replicate the FSDownload permissions for this test if really necessary. > Nodemanager container crash after ext3 folder limit > --- > > Key: YARN-6708 > URL: https://issues.apache.org/jira/browse/YARN-6708 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: YARN-6708.001.patch, YARN-6708.002.patch, > YARN-6708.003.patch, YARN-6708.004.patch > > > Configure umask as *027* for nodemanager service user > and {{yarn.nodemanager.local-cache.max-files-per-directory}} as {{40}}. After > 4 *private* dir localization next directory will be *0/14* > Local Directory cache manager > {code} > vm2:/opt/hadoop/release/data/nmlocal/usercache/mapred/filecache # l > total 28 > drwx--x--- 7 mapred hadoop 4096 Jun 10 14:35 ./ > drwxr-s--- 4 mapred hadoop 4096 Jun 10 12:07 ../ > drwxr-x--- 3 mapred users 4096 Jun 10 14:36 0/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:15 10/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:22 11/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:27 12/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:31 13/ > {code} > *drwxr-x---* 3 mapred users 4096 Jun 10 14:36 0/ is only *750* > Nodemanager user will not be able check for localization path exists or not. 
> {{LocalResourcesTrackerImpl}} > {code} > case REQUEST: > if (rsrc != null && (!isResourcePresent(rsrc))) { > LOG.info("Resource " + rsrc.getLocalPath() > + " is missing, localizing it again"); > removeResource(req); > rsrc = null; > } > if (null == rsrc) { > rsrc = new LocalizedResource(req, dispatcher); > localrsrc.put(req, rsrc); > } > break; > {code} > *isResourcePresent* will always return false and same resource will be > localized to {{0}} to next unique number -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
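A minimal sketch of the createDirAndParents utility suggested in the review above, written with plain java.nio so it is self-contained (the method name, permission string, and explicit-chmod policy are assumptions; the real patch would go through Hadoop's FileContext/FsPermission APIs):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;

public class DirUtil {
    // Create the target directory and any missing ancestors, applying the
    // same permissions at every level with an explicit chmod so the result
    // is not subject to the process umask (the root cause in this JIRA).
    static void createDirAndParents(Path dir, String perms) throws IOException {
        Deque<Path> missing = new ArrayDeque<>();
        for (Path p = dir; p != null && !Files.exists(p); p = p.getParent()) {
            missing.push(p); // shallowest missing ancestor ends up on top
        }
        Set<PosixFilePermission> mode = PosixFilePermissions.fromString(perms);
        while (!missing.isEmpty()) {
            Path p = missing.pop();
            Files.createDirectory(p);
            Files.setPosixFilePermissions(p, mode); // explicit chmod, umask-proof
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("filecache");
        Path target = base.resolve("0").resolve("14");
        createDirAndParents(target, "rwx--x---"); // 710, matching the final dir
        System.out.println(Files.getPosixFilePermissions(target));
    }
}
```

Chmod-after-mkdir is what makes the intermediate directories consistent regardless of the service user's umask, which a bare mkdirs() cannot guarantee.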
[jira] [Commented] (YARN-5311) Document graceful decommission CLI and usage
[ https://issues.apache.org/jira/browse/YARN-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065541#comment-16065541 ] Junping Du commented on YARN-5311: -- Sorry for coming late on this. Latest patch LGTM now; I think we can continue with improvements later to make the doc better. +1. I will commit it tomorrow if there are no further comments from others. > Document graceful decommission CLI and usage > > > Key: YARN-5311 > URL: https://issues.apache.org/jira/browse/YARN-5311 > Project: Hadoop YARN > Issue Type: Sub-task > Components: documentation >Affects Versions: 2.9.0 >Reporter: Junping Du >Assignee: Elek, Marton > Attachments: YARN-5311.001.patch, YARN-5311.002.patch, > YARN-5311.003.patch, YARN-5311.004.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4161) Capacity Scheduler : Assign single or multiple containers per heart beat driven by configuration
[ https://issues.apache.org/jira/browse/YARN-4161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-4161: -- Attachment: YARN-4161.patch.1 Rebase for trunk, by moving the assignMultiple check to scheduler-level, instead of queue-level. [~wangda], could you help take a look? > Capacity Scheduler : Assign single or multiple containers per heart beat > driven by configuration > > > Key: YARN-4161 > URL: https://issues.apache.org/jira/browse/YARN-4161 > Project: Hadoop YARN > Issue Type: New Feature > Components: capacity scheduler >Reporter: Mayank Bansal >Assignee: Mayank Bansal > Labels: oct16-medium > Attachments: YARN-4161.patch, YARN-4161.patch.1 > > > Capacity Scheduler right now schedules multiple containers per heart beat if > there are more resources available in the node. > This approach works fine however in some cases its not distribute the load > across the cluster hence throughput of the cluster suffers. I am adding > feature to drive that using configuration by that we can control the number > of containers assigned per heart beat. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6738) LevelDBCacheTimelineStore should reuse ObjectMapper instances
[ https://issues.apache.org/jira/browse/YARN-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065519#comment-16065519 ] Junping Du commented on YARN-6738: -- +1. Patch LGTM as well. > LevelDBCacheTimelineStore should reuse ObjectMapper instances > - > > Key: YARN-6738 > URL: https://issues.apache.org/jira/browse/YARN-6738 > Project: Hadoop YARN > Issue Type: Improvement > Components: timelineserver >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: Screen Shot 2017-06-23 at 2.43.06 PM.png, > YARN-6738.1.patch, YARN-6738.2.patch > > > Using TezUI sometimes times out...and the cause of it was that the query was > quite large; and the leveldb handler seems like recreates the > {{ObjectMapper}} for every read - this is unfortunate; since the ObjectMapper > have to rescan the class annotations which may take some time. > Keeping the objectmapper reduces the ATS call time from 17 seconds to 3 > seconds for me...which was enough to get my tez-ui working again :) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
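The pattern behind this fix can be shown in a few lines. Since Jackson is not assumed here, {{ExpensiveCodec}} below is a stand-in for {{ObjectMapper}} whose constructor models the annotation-scanning cost; in LevelDBCacheTimelineStore the shared field would be a real ObjectMapper, which is safe to share across threads once configured.

```java
// Illustration of the fix in this JIRA: construct the expensive codec once
// and reuse it for every read, instead of rebuilding it per call.
public class CodecCache {
    static int constructions = 0;

    static class ExpensiveCodec {               // stand-in for ObjectMapper
        ExpensiveCodec() { constructions++; }   // models annotation scanning
        String write(Object o) { return String.valueOf(o); }
    }

    // Before the patch: a new codec inside every read path.
    // After: one shared instance, created at class load.
    private static final ExpensiveCodec CODEC = new ExpensiveCodec();

    static String serialize(Object o) { return CODEC.write(o); }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) serialize(i);
        System.out.println(constructions); // 1: built once, reused 1000 times
    }
}
```

Amortizing the one-time setup cost across all reads is exactly what cut the reported ATS call time from 17 seconds to 3.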
[jira] [Assigned] (YARN-6744) Recover component information on YARN native services AM restart
[ https://issues.apache.org/jira/browse/YARN-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi reassigned YARN-6744: Assignee: Billie Rinaldi > Recover component information on YARN native services AM restart > > > Key: YARN-6744 > URL: https://issues.apache.org/jira/browse/YARN-6744 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-native-services >Reporter: Billie Rinaldi >Assignee: Billie Rinaldi > Fix For: yarn-native-services > > > The new RoleInstance#Container constructor does not populate all the > information needed for a RoleInstance. This is the constructor used when > recovering running containers in AppState#addRestartedContainer. We will have > to figure out a way to determine this information for a running container. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4266) Allow whitelisted users to disable user re-mapping/squashing when launching docker containers
[ https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065414#comment-16065414 ] Eric Badger commented on YARN-4266: --- bq. I'm not really a fan of hard coding bind mounts like we've done for /sys/fs/cgroup if we can help it. Yea I'm really not a fan either. I would strongly prefer a better, cleaner solution to this problem if there is one. bq. Are you aware of any security implications with this socket mounted read-only in the container? I haven't done any research into it, but I imagine bind-mounting a socket would have more security implications than a regular directory. If we decide this route is worth delving into, then I will do my due diligence of trying to identify security risks and whether they are acceptable or not. bq. Also, are there any clients that might be required to be installed in the container depending on how nsswitch is configured? If you bind mount /var/run/nscd then it will use the process listening on that socket. This will end up using nsswitch on the host, not in the container. So it would completely leverage host services, not services within the container. This actually makes things nice for remote user lookups, because it can use the host's cache. This means that the container won't have to do a remote user lookup every time a container is started, assuming that it's in the host's user cache, if it has one (such as nscd). So I don't believe that any extra services would have to be installed within the container for this to be used, only on the host. And if no service is listening on the nscd socket, then the user lookup would simply proceed like it normally would. bq. Alternatively, why does MRAppMaster do the user lookup in this case? Is there a way to remove that limitation? Are you finding other AM's have a similar issue? I'm looking into this. 
I'm hoping that we can get around this so that we can optionally add the bind mount, but not require it for the {{--user}} option. I have not yet tested other AMs. > Allow whitelisted users to disable user re-mapping/squashing when launching > docker containers > - > > Key: YARN-4266 > URL: https://issues.apache.org/jira/browse/YARN-4266 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Sidharta Seethana >Assignee: luhuichun > Attachments: YARN-4266.001.patch, YARN-4266.001.patch, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping.pdf, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v2.pdf, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v3.pdf, > YARN-4266-branch-2.8.001.patch > > > Docker provides a mechanism (the --user switch) that enables us to specify > the user the container processes should run as. We use this mechanism today > when launching docker containers . In non-secure mode, we run the docker > container based on > `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in > secure mode, as the submitting user. However, this mechanism breaks down with > a large number of 'pre-created' images which don't necessarily have the users > available within the image. Examples of such images include shared images > that need to be used by multiple users. We need a way in which we can allow a > pre-defined set of users to run containers based on existing images, without > using the --user switch. There are some implications of disabling this user > squashing that we'll need to work through : log aggregation, artifact > deletion etc., -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4266) Allow whitelisted users to disable user re-mapping/squashing when launching docker containers
[ https://issues.apache.org/jira/browse/YARN-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065260#comment-16065260 ] Shane Kumpf commented on YARN-4266: --- Thanks, [~ebadger]! It sounds like you are making good progress. {quote} To do this, I propose mounting /var/run/nscd so that the docker container can lookup users via the host according to whatever method is defined in nsswitch.conf on the host.{quote} I believe YARN-5534 will be required for this. I'm not really a fan of hard coding bind mounts like we've done for /sys/fs/cgroup if we can help it. Are you aware of any security implications with this socket mounted read-only in the container? Also, are there any clients that might be required to be installed in the container depending on how nsswitch is configured? Alternatively, why does MRAppMaster do the user lookup in this case? Is there a way to remove that limitation? Are you finding other AM's have a similar issue? > Allow whitelisted users to disable user re-mapping/squashing when launching > docker containers > - > > Key: YARN-4266 > URL: https://issues.apache.org/jira/browse/YARN-4266 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Sidharta Seethana >Assignee: luhuichun > Attachments: YARN-4266.001.patch, YARN-4266.001.patch, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping.pdf, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v2.pdf, > YARN-4266_Allow_whitelisted_users_to_disable_user_re-mapping_v3.pdf, > YARN-4266-branch-2.8.001.patch > > > Docker provides a mechanism (the --user switch) that enables us to specify > the user the container processes should run as. We use this mechanism today > when launching docker containers . In non-secure mode, we run the docker > container based on > `yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user` and in > secure mode, as the submitting user. 
However, this mechanism breaks down with > a large number of 'pre-created' images which don't necessarily have the users > available within the image. Examples of such images include shared images > that need to be used by multiple users. We need a way in which we can allow a > pre-defined set of users to run containers based on existing images, without > using the --user switch. There are some implications of disabling this user > squashing that we'll need to work through : log aggregation, artifact > deletion etc., -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2919) Potential race between renew and cancel in DelegationTokenRenwer
[ https://issues.apache.org/jira/browse/YARN-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065226#comment-16065226 ] Junping Du commented on YARN-2919: -- Thanks [~Naganarasimha] for contributing the patch. Latest patch looks good to me. Just a minor comment for DelegationTokenRenewer: if the token is cancelling, shall we log something different than "The token was removed already."? Everything else looks fine to me. By the way, are the failed tests related to the patch? If not, do we have some JIRA to track them? > Potential race between renew and cancel in DelegationTokenRenwer > - > > Key: YARN-2919 > URL: https://issues.apache.org/jira/browse/YARN-2919 > Project: Hadoop YARN > Issue Type: Bug > Components: security >Affects Versions: 2.6.0 >Reporter: Karthik Kambatla >Assignee: Naganarasimha G R >Priority: Critical > Attachments: YARN-2919.002.patch, YARN-2919.003.patch, > YARN-2919.004.patch, YARN-2919.20141209-1.patch > > > YARN-2874 fixes a deadlock in DelegationTokenRenewer, but there is still a > race because of which a renewal in flight isn't interrupted by a cancel. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6708) Nodemanager container crash after ext3 folder limit
[ https://issues.apache.org/jira/browse/YARN-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065200#comment-16065200 ] Hadoop QA commented on YARN-6708: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 8s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 43s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in trunk has 5 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 16s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 4 new + 54 unchanged - 1 fixed = 58 total (was 55) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 46s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 32m 44s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6708 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12874715/YARN-6708.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 9619f3540737 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 94e39c6 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-YARN-Build/16255/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/16255/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16255/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16255/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Nodemanager container crash after ext3 folder limit > --- > > Key:
[jira] [Commented] (YARN-6507) Support FPGA abstraction framework on NM side
[ https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065193#comment-16065193 ] Wangda Tan commented on YARN-6507: -- Thanks [~tangzhankun] for the patch. I've recently been working on GPU isolation patches, with inputs from [~chris.douglas] / [~vinodkv] / [~subru] / [~sunilg]. Some comments on the implementation: 1) For isolation of cgroups devices, we need root permission, which means we have to move some functionalities to the container-executor c-binary. 2) The existing implementation of container-executor makes it very hard to add/enable/disable new features; it's better to do refactoring before adding more logic. 3) We may need to consider recovering allocated resources for NM restart. Once NM restart finishes, we should recover the mapping of FPGAs/GPUs and running containers. Otherwise we may assign multiple containers to a single device. For #1/#2, I'm working on a prototype; if everything goes well, I can upload a WIP patch today. For #3, I think we may need some unified implementation on the NM side -- it may not be necessary to reuse the device scheduling logic; the state-store of device allocation and recovery could be a common module in the NM. > Support FPGA abstraction framework on NM side > - > > Key: YARN-6507 > URL: https://issues.apache.org/jira/browse/YARN-6507 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang >Assignee: Zhankun Tang > Fix For: YARN-3926 > > Attachments: YARN-6507-branch-YARN-3926.001.patch > > > Support local FPGA resource scheduler to assign/cleanup N FPGA slots to/from > a container. > Support vendor plugin framework with basic features that meets vendor > requirements -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6428) Queue AM limit is not honored in CS always
[ https://issues.apache.org/jira/browse/YARN-6428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065182#comment-16065182 ] Sunil G commented on YARN-6428: --- bq. LimitedPrivate only applicable for specified projects.. So i don't think its an issue. I guess it might still be a problem. In the scheduler and RM, we have various plugin interfaces. Also, a recent change of one API in this class broke Tez compilation. Hence I do not think it's a good idea to change from double to float and break an existing API. {{Math.floor(value * 10^N) / 10^N}} will be doing the conversion to float. A precision of 6 should be good enough, right? Do you think there are some cases where precision 6 will be a problem? > Queue AM limit is not honored in CS always > --- > > Key: YARN-6428 > URL: https://issues.apache.org/jira/browse/YARN-6428 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6428.0001.patch, YARN-6428.0002.patch > > > Steps to reproduce > > Setup cluster with 40 GB and 40 vcores with 4 Node managers with 10 GB each. > Configure 100% to default queue as capacity and max am limit as 10 % > Minimum scheduler memory and vcore as 512,1 > *Expected* > AM limit 4096 and 4 vores > *Actual* > AM limit 4096+512 and 4+1 vcore -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
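The truncation Sunil describes can be checked with a few lines (precision 6 is assumed here, per the discussion; the helper name is illustrative):

```java
public class PrecisionDemo {
    // Truncate (not round) a double to n decimal places, i.e. the
    // Math.floor(value * 10^N) / 10^N expression from the comment above.
    static double truncate(double value, int n) {
        double scale = Math.pow(10, n);
        return Math.floor(value * scale) / scale;
    }

    public static void main(String[] args) {
        // With precision 6, sub-microscopic drift collapses away, which is
        // the point: tiny double/float noise stops perturbing AM limits.
        System.out.println(truncate(0.1000004, 6));   // 0.1
        System.out.println(truncate(0.123456789, 6)); // 0.123456
    }
}
```

Truncating to a fixed precision avoids the API break of switching the field from double to float while still giving stable comparisons.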
[jira] [Commented] (YARN-6720) Support updating FPGA related constraint node label after FPGA device re-configuration
[ https://issues.apache.org/jira/browse/YARN-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065170#comment-16065170 ] Wangda Tan commented on YARN-6720: -- Thanks [~tangzhankun] and [~YuQiang Ye] for this proposal. In general this approach looks good to me. However, this proposal depends on YARN-3409, which needs more time to get done, and I'm not sure if this is a blocker/critical item for this feature; IIRC, offline you mentioned that downloading FPGA firmware and reconfiguring FPGA devices only takes a few seconds, which means it is generally fine without this improvement. > Support updating FPGA related constraint node label after FPGA device > re-configuration > -- > > Key: YARN-6720 > URL: https://issues.apache.org/jira/browse/YARN-6720 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang > Attachments: > Storing-and-Updating-extra-FPGA-resource-attributes-in-hdfs_v1.pdf > > > In order to provide a global optimal scheduling for mutable FPGA resource, it > seems an easy and direct way to utilize constraint node labels(YARN-3409) > instead of extending the global scheduler(YARN-3926) to match both resource > count and attributes. > The rough idea is that the AM sets the constraint node label expression to > request containers on the nodes whose FPGA devices has the matching IP, and > then NM resource handler update the node constraint label if there's FPGA > device re-configuration. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6708) Nodemanager container crash after ext3 folder limit
[ https://issues.apache.org/jira/browse/YARN-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-6708: --- Attachment: YARN-6708.004.patch [~jlowe] Thank you for looking into the issue. Attaching a patch with the following modifications. # Moved folder creation to the container Localizer. # Test case to verify folder permissions are based on configuration. > Nodemanager container crash after ext3 folder limit > --- > > Key: YARN-6708 > URL: https://issues.apache.org/jira/browse/YARN-6708 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: YARN-6708.001.patch, YARN-6708.002.patch, > YARN-6708.003.patch, YARN-6708.004.patch > > > Configure umask as *027* for nodemanager service user > and {{yarn.nodemanager.local-cache.max-files-per-directory}} as {{40}}. After > 4 *private* dir localization next directory will be *0/14* > Local Directory cache manager > {code} > vm2:/opt/hadoop/release/data/nmlocal/usercache/mapred/filecache # l > total 28 > drwx--x--- 7 mapred hadoop 4096 Jun 10 14:35 ./ > drwxr-s--- 4 mapred hadoop 4096 Jun 10 12:07 ../ > drwxr-x--- 3 mapred users 4096 Jun 10 14:36 0/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:15 10/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:22 11/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:27 12/ > drwxr-xr-x 3 mapred users 4096 Jun 10 12:31 13/ > {code} > *drwxr-x---* 3 mapred users 4096 Jun 10 14:36 0/ is only *750* > Nodemanager user will not be able check for localization path exists or not. 
> {{LocalResourcesTrackerImpl}} > {code} > case REQUEST: > if (rsrc != null && (!isResourcePresent(rsrc))) { > LOG.info("Resource " + rsrc.getLocalPath() > + " is missing, localizing it again"); > removeResource(req); > rsrc = null; > } > if (null == rsrc) { > rsrc = new LocalizedResource(req, dispatcher); > localrsrc.put(req, rsrc); > } > break; > {code} > *isResourcePresent* will always return false and same resource will be > localized to {{0}} to next unique number -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5006) ResourceManager quit due to ApplicationStateData exceed the limit size of znode in zk
[ https://issues.apache.org/jira/browse/YARN-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065090#comment-16065090 ] Naganarasimha G R commented on YARN-5006: - Yeah saw that, I am fine with it ! > ResourceManager quit due to ApplicationStateData exceed the limit size of > znode in zk > -- > > Key: YARN-5006 > URL: https://issues.apache.org/jira/browse/YARN-5006 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0, 2.7.2 >Reporter: dongtingting >Assignee: Bibin A Chundatt >Priority: Critical > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: YARN-5006.001.patch, YARN-5006.002.patch, > YARN-5006.003.patch, YARN-5006.004.patch, YARN-5006.005.patch, > YARN-5006-branch-2.005.patch > > > Client submit a job, this job add 1 file into DistributedCache. when the > job is submitted, ResourceManager sotre ApplicationStateData into zk. > ApplicationStateData is exceed the limit size of znode. RM exit 1. > The related code in RMStateStore.java : > {code} > private static class StoreAppTransition > implements SingleArcTransition{ > @Override > public void transition(RMStateStore store, RMStateStoreEvent event) { > if (!(event instanceof RMStateStoreAppEvent)) { > // should never happen > LOG.error("Illegal event type: " + event.getClass()); > return; > } > ApplicationState appState = ((RMStateStoreAppEvent) > event).getAppState(); > ApplicationId appId = appState.getAppId(); > ApplicationStateData appStateData = ApplicationStateData > .newInstance(appState); > LOG.info("Storing info for app: " + appId); > try { > store.storeApplicationStateInternal(appId, appStateData); //store > the appStateData > store.notifyApplication(new RMAppEvent(appId, >RMAppEventType.APP_NEW_SAVED)); > } catch (Exception e) { > LOG.error("Error storing app: " + appId, e); > store.notifyStoreOperationFailed(e); //handle fail event, system > exit > } > }; > } > {code} > The Exception log: > {code} > ... 
[jira] [Assigned] (YARN-6738) LevelDBCacheTimelineStore should reuse ObjectMapper instances
[ https://issues.apache.org/jira/browse/YARN-6738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned YARN-6738: Assignee: Zoltan Haindrich > LevelDBCacheTimelineStore should reuse ObjectMapper instances > - > > Key: YARN-6738 > URL: https://issues.apache.org/jira/browse/YARN-6738 > Project: Hadoop YARN > Issue Type: Improvement > Components: timelineserver >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: Screen Shot 2017-06-23 at 2.43.06 PM.png, > YARN-6738.1.patch, YARN-6738.2.patch > > > Using the Tez UI sometimes times out...and the cause was that the query was > quite large, and the leveldb handler seems to recreate the > {{ObjectMapper}} for every read. This is unfortunate, since the ObjectMapper > has to rescan the class annotations, which may take some time. > Keeping the ObjectMapper reduces the ATS call time from 17 seconds to 3 > seconds for me...which was enough to get my tez-ui working again :) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
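[Editor's note] The fix Zoltan describes is a standard Jackson pattern: ObjectMapper is thread-safe once configured, so a single shared instance avoids re-scanning class annotations on every read. A minimal sketch of the caching pattern, using a hypothetical `AnnotationScanner` class as a stand-in for Jackson's ObjectMapper (Jackson itself is not assumed here):

```java
// Construct the expensive mapper once and reuse it, instead of rebuilding
// it on every read. AnnotationScanner is a hypothetical stand-in for
// Jackson's ObjectMapper, whose constructor rescans class annotations.
public class MapperCache {
    // Counts the simulated one-time setup cost (annotation scanning in the
    // real ObjectMapper).
    static int constructions = 0;

    static final class AnnotationScanner {
        AnnotationScanner() { constructions++; }
        String read(String json) { return json.trim(); }
    }

    // Shared, lazily created instance. Jackson's ObjectMapper is safe to
    // share across threads once configured, which is what makes this valid.
    private static volatile AnnotationScanner shared;

    static AnnotationScanner mapper() {
        if (shared == null) {
            synchronized (MapperCache.class) {
                if (shared == null) {
                    shared = new AnnotationScanner();
                }
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            mapper().read(" {\"k\":1} ");
        }
        // The scanner is built only once despite 1000 reads.
        System.out.println(constructions);
    }
}
```

The same shape applies whether the cache lives in a static field or in the LevelDBCacheTimelineStore instance; the point is that construction happens once per process, not once per query.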
[jira] [Created] (YARN-6744) Recover component information on YARN native services AM restart
Billie Rinaldi created YARN-6744: Summary: Recover component information on YARN native services AM restart Key: YARN-6744 URL: https://issues.apache.org/jira/browse/YARN-6744 Project: Hadoop YARN Issue Type: Sub-task Components: yarn-native-services Reporter: Billie Rinaldi Fix For: yarn-native-services The new RoleInstance#Container constructor does not populate all the information needed for a RoleInstance. This is the constructor used when recovering running containers in AppState#addRestartedContainer. We will have to figure out a way to determine this information for a running container. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5006) ResourceManager quit due to ApplicationStateData exceed the limit size of znode in zk
[ https://issues.apache.org/jira/browse/YARN-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064960#comment-16064960 ] Daniel Templeton commented on YARN-5006: It's such a minor thing that I didn't think it was worth an addendum patch. There's a community of people who love to do these tiny fixes. In fact, someone has already claimed the JIRA I filed. > ResourceManager quit due to ApplicationStateData exceed the limit size of > znode in zk > -- > > Key: YARN-5006 > URL: https://issues.apache.org/jira/browse/YARN-5006 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0, 2.7.2 >Reporter: dongtingting >Assignee: Bibin A Chundatt >Priority: Critical > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: YARN-5006.001.patch, YARN-5006.002.patch, > YARN-5006.003.patch, YARN-5006.004.patch, YARN-5006.005.patch, > YARN-5006-branch-2.005.patch > > > A client submits a job that adds 1 file to the DistributedCache. When the > job is submitted, the ResourceManager stores ApplicationStateData into ZooKeeper. > The ApplicationStateData exceeds the znode size limit, and the RM exits with code 1.
> The related code in RMStateStore.java:
> {code}
> private static class StoreAppTransition
>     implements SingleArcTransition<RMStateStore, RMStateStoreEvent> {
>   @Override
>   public void transition(RMStateStore store, RMStateStoreEvent event) {
>     if (!(event instanceof RMStateStoreAppEvent)) {
>       // should never happen
>       LOG.error("Illegal event type: " + event.getClass());
>       return;
>     }
>     ApplicationState appState = ((RMStateStoreAppEvent) event).getAppState();
>     ApplicationId appId = appState.getAppId();
>     ApplicationStateData appStateData = ApplicationStateData
>         .newInstance(appState);
>     LOG.info("Storing info for app: " + appId);
>     try {
>       store.storeApplicationStateInternal(appId, appStateData); // store the appStateData
>       store.notifyApplication(new RMAppEvent(appId,
>           RMAppEventType.APP_NEW_SAVED));
>     } catch (Exception e) {
>       LOG.error("Error storing app: " + appId, e);
>       store.notifyStoreOperationFailed(e); // handle fail event, system exit
>     }
>   };
> }
> {code}
> The Exception log:
> {code}
> ...
> 2016-04-20 11:26:35,732 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore AsyncDispatcher event handler: Maxed out ZK retries. Giving up!
> 2016-04-20 11:26:35,732 ERROR org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore AsyncDispatcher event handler: Error storing app: application_1461061795989_17671
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>         at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:931)
>         at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:911)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:936)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:933)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1075)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1096)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:933)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:947)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:956)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:626)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:138)
>         at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:123)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at
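[Editor's note] The failure mode above suggests the obvious guard: measure the serialized ApplicationStateData before writing it to ZooKeeper and reject the single oversized application instead of letting the store call fail and take down the RM. A hedged sketch only; the constant and method names are illustrative, not the actual patch. ZooKeeper's default request limit (jute.maxbuffer) is roughly 1 MB:

```java
// Defensive size check before storing application state in a znode.
// DEFAULT_MAX_ZNODE_BYTES approximates ZooKeeper's default jute.maxbuffer
// (about 1 MB) minus some headroom for the request envelope; both the
// headroom and the names here are assumptions for this sketch.
public class ZnodeSizeGuard {
    static final int DEFAULT_MAX_ZNODE_BYTES = 1024 * 1024 - 1024;

    static boolean fitsInZnode(byte[] serializedState, int maxBytes) {
        return serializedState != null && serializedState.length <= maxBytes;
    }

    public static void main(String[] args) {
        byte[] small = new byte[512];
        // e.g. an app submission context bloated by DistributedCache metadata
        byte[] huge = new byte[2 * 1024 * 1024];
        System.out.println(fitsInZnode(small, DEFAULT_MAX_ZNODE_BYTES)); // true
        System.out.println(fitsInZnode(huge, DEFAULT_MAX_ZNODE_BYTES));  // false
    }
}
```

With a guard like this, the store path can fail the one application with a clear error rather than exhausting ZK retries and calling notifyStoreOperationFailed, which is what exits the RM.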
[jira] [Commented] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics
[ https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064848#comment-16064848 ] Hadoop QA commented on YARN-5148: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 0m 57s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-5148 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12874689/YARN-5148.13.patch | | Optional Tests | asflicense | | uname | Linux d1f035d8bd4b 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 07defa4 | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16254/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. 
> [YARN-3368] Add page to new YARN UI to view server side > configurations/logs/JVM-metrics > --- > > Key: YARN-5148 > URL: https://issues.apache.org/jira/browse/YARN-5148 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp, yarn-ui-v2 >Reporter: Wangda Tan >Assignee: Kai Sasaki > Labels: oct16-medium > Attachments: pretty-json-metrics.png, Screen Shot 2016-09-11 at > 23.28.31.png, Screen Shot 2016-09-13 at 22.27.00.png, > UsingStringifyPrint.png, YARN-5148.07.patch, YARN-5148.08.patch, > YARN-5148.09.patch, YARN-5148.10.patch, YARN-5148.11.patch, > YARN-5148.12.patch, YARN-5148.13.patch, YARN-5148-YARN-3368.01.patch, > YARN-5148-YARN-3368.02.patch, YARN-5148-YARN-3368.03.patch, > YARN-5148-YARN-3368.04.patch, YARN-5148-YARN-3368.05.patch, > YARN-5148-YARN-3368.06.patch, yarn-conf.png, yarn-tools.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics
[ https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated YARN-5148: - Attachment: YARN-5148.13.patch > [YARN-3368] Add page to new YARN UI to view server side > configurations/logs/JVM-metrics > --- > > Key: YARN-5148 > URL: https://issues.apache.org/jira/browse/YARN-5148 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp, yarn-ui-v2 >Reporter: Wangda Tan >Assignee: Kai Sasaki > Labels: oct16-medium > Attachments: pretty-json-metrics.png, Screen Shot 2016-09-11 at > 23.28.31.png, Screen Shot 2016-09-13 at 22.27.00.png, > UsingStringifyPrint.png, YARN-5148.07.patch, YARN-5148.08.patch, > YARN-5148.09.patch, YARN-5148.10.patch, YARN-5148.11.patch, > YARN-5148.12.patch, YARN-5148.13.patch, YARN-5148-YARN-3368.01.patch, > YARN-5148-YARN-3368.02.patch, YARN-5148-YARN-3368.03.patch, > YARN-5148-YARN-3368.04.patch, YARN-5148-YARN-3368.05.patch, > YARN-5148-YARN-3368.06.patch, yarn-conf.png, yarn-tools.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics
[ https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064829#comment-16064829 ] Hadoop QA commented on YARN-5148: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 2 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 51s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 1m 43s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-5148 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12874687/YARN-5148.12.patch | | Optional Tests | asflicense | | uname | Linux df096c7528f3 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 07defa4 | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/16253/artifact/patchprocess/whitespace-tabs.txt | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16253/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. 
> [YARN-3368] Add page to new YARN UI to view server side > configurations/logs/JVM-metrics > --- > > Key: YARN-5148 > URL: https://issues.apache.org/jira/browse/YARN-5148 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp, yarn-ui-v2 >Reporter: Wangda Tan >Assignee: Kai Sasaki > Labels: oct16-medium > Attachments: pretty-json-metrics.png, Screen Shot 2016-09-11 at > 23.28.31.png, Screen Shot 2016-09-13 at 22.27.00.png, > UsingStringifyPrint.png, YARN-5148.07.patch, YARN-5148.08.patch, > YARN-5148.09.patch, YARN-5148.10.patch, YARN-5148.11.patch, > YARN-5148.12.patch, YARN-5148-YARN-3368.01.patch, > YARN-5148-YARN-3368.02.patch, YARN-5148-YARN-3368.03.patch, > YARN-5148-YARN-3368.04.patch, YARN-5148-YARN-3368.05.patch, > YARN-5148-YARN-3368.06.patch, yarn-conf.png, yarn-tools.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5927) BaseContainerManagerTest::waitForNMContainerState timeout accounting is not accurate
[ https://issues.apache.org/jira/browse/YARN-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064825#comment-16064825 ] Kai Sasaki commented on YARN-5927: -- [~ka...@cloudera.com] Sorry again. But could you take a look when you have time? > BaseContainerManagerTest::waitForNMContainerState timeout accounting is not > accurate > > > Key: YARN-5927 > URL: https://issues.apache.org/jira/browse/YARN-5927 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Kai Sasaki >Priority: Trivial > Labels: newbie > Attachments: YARN-5917.01.patch, YARN-5917.02.patch, > YARN-5927.03.patch > > > See below that timeoutSecs is increased twice. We also do a sleep right away > before even checking the observed value.
> {code}
> do {
>   Thread.sleep(2000);
>   ...
>   timeoutSecs += 2;
> } while (!finalStates.contains(currentState)
>     && timeoutSecs++ < timeOutMax);
> {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
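[Editor's note] The quoted loop over-counts because it adds 2 to timeoutSecs in the body and then increments it again in the while condition, and it sleeps before the first check. A corrected sketch of the wait loop; the state supplier here stands in for the real container-state lookup and is not the actual test code:

```java
import java.util.Set;
import java.util.function.Supplier;

public class WaitLoop {
    // Returns true if a final state is observed within timeOutMaxSecs.
    // The state is checked before the first sleep, and each 2-second wait
    // is counted exactly once.
    static boolean waitForState(Supplier<String> currentState,
                                Set<String> finalStates,
                                int timeOutMaxSecs) {
        int waitedSecs = 0;
        while (!finalStates.contains(currentState.get())) {
            if (waitedSecs >= timeOutMaxSecs) {
                return false; // timed out
            }
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            waitedSecs += 2;
        }
        return true;
    }

    public static void main(String[] args) {
        // State is already final, so no sleeping is needed at all.
        boolean ok = waitForState(() -> "DONE", Set.of("DONE"), 10);
        System.out.println(ok);
    }
}
```

Checking before sleeping also makes the fast path (container already in the expected state) return immediately, which matters in a test helper that is called very often.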
[jira] [Updated] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics
[ https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated YARN-5148: - Attachment: YARN-5148.12.patch > [YARN-3368] Add page to new YARN UI to view server side > configurations/logs/JVM-metrics > --- > > Key: YARN-5148 > URL: https://issues.apache.org/jira/browse/YARN-5148 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp, yarn-ui-v2 >Reporter: Wangda Tan >Assignee: Kai Sasaki > Labels: oct16-medium > Attachments: pretty-json-metrics.png, Screen Shot 2016-09-11 at > 23.28.31.png, Screen Shot 2016-09-13 at 22.27.00.png, > UsingStringifyPrint.png, YARN-5148.07.patch, YARN-5148.08.patch, > YARN-5148.09.patch, YARN-5148.10.patch, YARN-5148.11.patch, > YARN-5148.12.patch, YARN-5148-YARN-3368.01.patch, > YARN-5148-YARN-3368.02.patch, YARN-5148-YARN-3368.03.patch, > YARN-5148-YARN-3368.04.patch, YARN-5148-YARN-3368.05.patch, > YARN-5148-YARN-3368.06.patch, yarn-conf.png, yarn-tools.png > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6428) Queue AM limit is not honored in CS always
[ https://issues.apache.org/jira/browse/YARN-6428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064392#comment-16064392 ] Bibin A Chundatt commented on YARN-6428: [~sunilg] {quote} I think there might be a compatibility break though its from Resources class {quote} LimitedPrivate is only applicable to the specified projects, so I don't think it's an issue. {code} @InterfaceAudience.LimitedPrivate({"YARN", "MapReduce"}) @Unstable {code} {quote} So using Math.floor(value * 10^N) / 10^N where N could be 6 will help to resolve the problem for now. {quote} The issue with the solution mentioned is that we are limiting the maximum resource value that can be configured for attributes to {{MAX_VALUE / 10^N}}; I wouldn't prefer it for these reasons. If you feel all of the above can be ignored, then I can go ahead and make the change. > Queue AM limit is not honored in CS always > --- > > Key: YARN-6428 > URL: https://issues.apache.org/jira/browse/YARN-6428 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6428.0001.patch, YARN-6428.0002.patch > > > Steps to reproduce > > Set up a cluster with 40 GB and 40 vcores, with 4 NodeManagers of 10 GB each. > Configure the default queue with 100% capacity and a max AM limit of 10%, > and the minimum scheduler allocation as 512 MB and 1 vcore. > *Expected* > AM limit of 4096 MB and 4 vcores > *Actual* > AM limit of 4096+512 MB and 4+1 vcores -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
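[Editor's note] Both sides of the discussion above fit in a few lines: truncating a ratio to N decimal places does absorb the tiny floating-point excess that pushes the AM limit one container too high, but pre-multiplying by 10^N breaks for values near the top of the representable range, which is the objection. The numbers below are illustrative only, not taken from the scheduler:

```java
// Sketch of the proposed Math.floor(value * 10^N) / 10^N workaround and of
// the range problem it introduces.
public class RatioTruncation {
    static double truncate(double value, int places) {
        double scale = Math.pow(10, places);
        return Math.floor(value * scale) / scale;
    }

    public static void main(String[] args) {
        // Classic floating-point excess: 0.1 * 3 is 0.30000000000000004,
        // which a ceiling-based limit would round up; truncation pins it
        // back to 0.3.
        System.out.println(truncate(0.1 * 3, 6));
        // But scaling first overflows for very large inputs: here the
        // intermediate product is infinite, so the value is destroyed.
        System.out.println(truncate(Double.MAX_VALUE, 6)); // Infinity
    }
}
```

This is why capping the configurable resource range at MAX_VALUE / 10^N is a real cost of the workaround rather than a theoretical one.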
[jira] [Assigned] (YARN-6507) Support FPGA abstraction framework on NM side
[ https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhankun Tang reassigned YARN-6507: -- Assignee: Zhankun Tang > Support FPGA abstraction framework on NM side > - > > Key: YARN-6507 > URL: https://issues.apache.org/jira/browse/YARN-6507 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang >Assignee: Zhankun Tang > Fix For: YARN-3926 > > Attachments: YARN-6507-branch-YARN-3926.001.patch > > > Support local FPGA resource scheduler to assign/cleanup N FPGA slots to/from > a container. > Support vendor plugin framework with basic features that meets vendor > requirements -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6507) Support FPGA abstraction framework on NM side
[ https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16064373#comment-16064373 ] Zhankun Tang commented on YARN-6507: Some thoughts about the next step: 1. For the vendor-specific plugin, a better way may be to load the plugin dynamically from configuration instead of merging the code into the YARN code base. 2. The local scheduler logic and cgroup isolation seem to be the same for FPGA and GPU; maybe it's better to abstract a common component? > Support FPGA abstraction framework on NM side > - > > Key: YARN-6507 > URL: https://issues.apache.org/jira/browse/YARN-6507 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang > Fix For: YARN-3926 > > Attachments: YARN-6507-branch-YARN-3926.001.patch > > > Support local FPGA resource scheduler to assign/cleanup N FPGA slots to/from > a container. > Support vendor plugin framework with basic features that meet vendor > requirements -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
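[Editor's note] The first point, loading the vendor plugin by reflection from a configuration key rather than compiling it into YARN, might be sketched as below. The interface name, the stub, and the property key are hypothetical, not the real YARN API:

```java
import java.util.Properties;

// Sketch of configuration-driven plugin loading: the class name comes from
// a config property and is instantiated reflectively, so vendor code never
// needs to live in the YARN code base.
public class PluginLoader {
    // Hypothetical vendor-neutral plugin contract.
    public interface FpgaVendorPlugin {
        String vendorName();
    }

    // Hypothetical fallback used when no vendor class is configured.
    public static class StubPlugin implements FpgaVendorPlugin {
        public String vendorName() { return "stub"; }
    }

    static FpgaVendorPlugin load(Properties conf) {
        // The property key is illustrative, not an actual YARN setting.
        String cls = conf.getProperty("yarn.nodemanager.fpga.plugin.class",
                                      StubPlugin.class.getName());
        try {
            return (FpgaVendorPlugin) Class.forName(cls)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot load FPGA plugin " + cls, e);
        }
    }

    public static void main(String[] args) {
        // With no property set, the loader falls back to the stub.
        FpgaVendorPlugin plugin = load(new Properties());
        System.out.println(plugin.vendorName());
    }
}
```

Vendors would then ship a jar on the NM classpath and set the property, instead of patching YARN itself.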