Re: Wrong resource consumption on scheduler

2016-11-03 Thread Clayton Coleman
No, but the global default for any project will steer you away from infra
nodes.
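
If you want to check what your project is actually restricted to: the per-project selector is an annotation on the namespace, and the fallback is the global default in the master config. A rough sketch (assuming a project named "myproject" that you want pinned to the primary region; adjust names to your setup):

    # show the project's node selector annotation, if any
    oc get namespace myproject -o yaml | grep openshift.io/node-selector

    # pin the project to the primary region explicitly
    oc annotate namespace myproject openshift.io/node-selector="region=primary" --overwrite

The cluster-wide fallback is projectConfig.defaultNodeSelector in the master config.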

On Wed, Nov 2, 2016 at 9:32 AM, Frank Liauw  wrote:

> No, it does not. Are nodes without region labels automatically classified
> as infra region?
>
> Frank
> Systems Engineer
>
> VSee: fr...@vsee.com  | Cell: +65 9338 0035
>
>
> On Wed, Nov 2, 2016 at 9:24 PM, Clayton Coleman 
> wrote:
>
>> Does your namespace have a namespace restriction annotation set? If not,
>> you'll be defaulted to the global restriction, which usually excludes
>> the infra region.
>>
>> On Nov 2, 2016, at 8:14 AM, Frank Liauw  wrote:
>>
>> There's no node selector on the pod. The pod is under a service.
>>
>> The affinity rules are left unmodified from install:
>>
>> {
>>     "predicates": [{
>>         "name": "MatchNodeSelector"
>>     }, {
>>         "name": "PodFitsResources"
>>     }, {
>>         "name": "PodFitsPorts"
>>     }, {
>>         "name": "NoDiskConflict"
>>     }, {
>>         "argument": {
>>             "serviceAffinity": {
>>                 "labels": ["region"]
>>             }
>>         },
>>         "name": "Region"
>>     }],
>>     "kind": "Policy",
>>     "priorities": [{
>>         "name": "LeastRequestedPriority",
>>         "weight": 1
>>     }, {
>>         "name": "SelectorSpreadPriority",
>>         "weight": 1
>>     }, {
>>         "argument": {
>>             "serviceAntiAffinity": {
>>                 "label": "zone"
>>             }
>>         },
>>         "weight": 2,
>>         "name": "Zone"
>>     }],
>>     "apiVersion": "v1"
>> }
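>>
>> (For what it's worth, the region/zone labels those rules key on can be listed per node with the standard label-column flag, e.g.:
>>
>> oc get nodes -L region,zone
>>
>> which makes it easy to see which nodes the Region predicate can actually group together.)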
>>
>> What puzzles me more is that my custom labels as well as the extra labels
>> were not introduced in previous scaleup runs by Ansible; this is not the
>> first time I've added new nodes to the cluster.
>>
>> Frank
>> Systems Engineer
>>
>> VSee: fr...@vsee.com  | Cell: +65 9338 0035
>>
>>
>> On Fri, Oct 28, 2016 at 9:47 PM, Clayton Coleman 
>> wrote:
>>
>>> What node selector / tolerations / affinity rules were on your pod?  Is
>>> the pod under a service?
>>>
>>> On Oct 28, 2016, at 4:03 AM, Frank Liauw  wrote:
>>>
>>> After giving the event log some thought, I realised that the 'Region fit
>>> failure' is a different error from 'Node didn't have enough resource', and
>>> that OpenShift is trying to force the deployment onto
>>> node4.openshift.internal, a node I added recently.
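>>>
>>> (If I read those numbers as scheduler requests rather than live usage, node4 at least adds up: 32448771072 bytes already requested plus the 2147483648 this pod asks for comes to 34596254720, which is over the 32511942656 capacity, so the 'not enough resource' part looks genuine; the other nodes are being rejected by the Region predicate instead.)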
>>>
>>> My nodes had these sets of labels:
>>>
>>> kubernetes.io/hostname=node1.openshift.internal,logging-infra-fluentd=true
>>> kubernetes.io/hostname=node2.openshift.internal,logging-infra-fluentd=true,public-router=true
>>> core-router=true,kubernetes.io/hostname=node3.openshift.internal,logging-infra-fluentd=true
>>> beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=node4.openshift.internal,logging-infra-fluentd=true,region=primary,zone=layer42
>>> kubernetes.io/hostname=node2.openshift.ncal,logging-infra-fluentd=true,public-router=true
>>>
>>> Removing the labels 'beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux' fixed the issue.
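>>>
>>> (For the record, I removed them with the usual trailing-dash label syntax, roughly:
>>>
>>> oc label node node4.openshift.internal beta.kubernetes.io/arch- beta.kubernetes.io/os-
>>>
>>> since node4 was the only node that had picked them up.)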
>>>
>>> Why is it so?
>>>
>>> Frank
>>> Systems Engineer
>>>
>>> VSee: fr...@vsee.com  | Cell: +65 9338 0035
>>>
>>>
>>> On Fri, Oct 28, 2016 at 3:50 PM, Frank Liauw  wrote:
>>>
 Hi,

 My pods are not deploying despite there being plenty of spare resources
 on my nodes; the event log on the pod seems to report much higher resource
 usage than what I'm seeing on my nodes:

 pod (redshiftlogger-14-8zaty) failed to fit in any node
 fit failure on node (node1.openshift.internal): Region
 fit failure on node (node2.openshift.internal): Region
 fit failure on node (node2.openshift.ncal): Region
 fit failure on node (node3.openshift.internal): Region
 fit failure on node (node4.openshift.internal): Node didn't have enough resource: Memory, requested: 2147483648, used: 32448771072, capacity: 32511942656

 Node                      CPU Requests  CPU Limits  Memory Requests    Memory Limits
 ------------------------  ------------  ----------  -----------------  -----------------
 node1.openshift.internal  100m (1%)     100m (1%)   2560Mi (8%)        10Gi (33%)
 node2.openshift.internal  100m (1%)     100m (1%)   5623141Ki (46%)    13487461Ki (111%)
 node3.openshift.internal  200m (2%)     200m (2%)   7680Mi (48%)       15Gi (96%)
 node4.openshift.internal  100m (1%)     100m (1%)   32448771072 (99%)  32448771072 (99%)
 node2.openshift.ncal      100m (2%)     100m (2%)   4147483648 (25%)   4147483648 (25%)
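
 (These look like the Allocated resources summaries reported by something like:

 oc describe node node4.openshift.internal

 run for each node in turn.)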

 Restarting the master didn't help; I restarted one of two masters.

 Frank
