Hello Michal,
nice to read you! :) Yes, I'm still on 3.2. Could you be so kind as to try to
backport it? Thank you very much!
azur
__
> From: "Michal Hocko"
> To: azurIt
> Date: 06.06.2013 18:04
> Subject: Re: [PATCH for 3.2.34]
Hi,
I am really sorry it took so long, but I was constantly preempted by
other stuff. I hope I have good news for you, though. Johannes has
found a nice way to overcome the deadlock issues from memcg OOM which
might help you. Would you be willing to test with his patch
On Fri 22-02-13 13:54:42, azurIt wrote:
> >I am not sure how much time I'll have for this today but just to make
> >sure we are on the same page, could you point me to the two patches you
> >have applied in the mean time?
>
>
> Here:
> http://watchdog.sk/lkml/patches2
OK, looks correct.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the
Hi,
On Fri 22-02-13 09:23:32, azurIt wrote:
[...]
> sorry that I didn't respond for a while. Today I installed the kernel
> with your two patches and I'm running it now.
I am not sure how much time I'll have for this today but just to make
sure we are on the same page, could you point me to the two patches you
have applied in the mean time?
Unfortunately I am not able to reproduce this behavior even if I try
to hammer OOM like mad so I am afraid I cannot help you much without
further debugging patches.
I do realize that experimenting in your environment is a problem but I
do not have many options left. Please do not use strace and rather
On Sun 10-02-13 17:46:19, azurIt wrote:
> >stuck in the ptrace code.
>
>
> But this happens _after_ the cgroup was frozen and I tried to strace
> one of its processes (to see what's happening):
>
> Feb 8 01:29:46 server01 kernel: [ 1187.540672] grsec: From 178.40.250.111:
> process /usr/lib/apache2/mpm-itk/apache2(apache2:18211) attached to via
On Fri 08-02-13 22:02:43, azurIt wrote:
> >
> >I assume you have checked that the killed processes eventually die,
> >right?
>
>
> When I killed them by hand, yes, they disappeared from the process list (I
> saw it). I don't know if they really died when OOM killed them.
>
>
> >Well, I do not see anything suspicious during that time period
> >(timestamps
On Fri 08-02-13 16:58:05, azurIt wrote:
[...]
> I took the kernel log from yesterday from the same time frame:
>
> $ grep "killed as a result of limit" kern2.log | sed 's@.*\] @@' | sort |
> uniq -c | sort -k1 -n
> 1 Task in /1252/uid killed as a result of limit of /1252
> 1 Task in
>Which means that the oom killer didn't try to kill any task more than
>once which is good because it tells us that the killed task manages to
>die before we trigger oom again. So this is definitely not a deadlock.
>You are just hitting OOM very often.
>$ grep "killed as a result of limit" kern2.log |
On Fri 08-02-13 14:56:16, azurIt wrote:
> >kernel log would be sufficient.
>
>
> Full kernel log from kernel with your newest patch:
> http://watchdog.sk/lkml/kern2.log
OK, so the log says that there is a little slaughter on your yard:
$ grep "Memory cgroup out of memory:" kern2.log | wc -l
220
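The log-mining pipeline used in this exchange can be tried on any kernel log. Below is a self-contained sketch against a tiny synthetic log (the file name, host, timestamps and cgroup ids are all invented for illustration); it shows both the raw kill count and how `uniq -c` would expose a cgroup being OOM-killed repeatedly:

```shell
#!/bin/sh
# Synthetic kernel log in the same shape as the kern2.log lines quoted
# in this thread; every value here is made up.
cat > /tmp/sample-kern.log <<'EOF'
Feb  8 01:10:01 server01 kernel: [ 100.1] Task in /1252/uid killed as a result of limit of /1252
Feb  8 01:10:05 server01 kernel: [ 104.2] Task in /1337/uid killed as a result of limit of /1337
Feb  8 01:10:09 server01 kernel: [ 108.3] Task in /1337/uid killed as a result of limit of /1337
EOF

# Total number of kills (the analogue of the "220" count above).
grep -c "killed as a result of limit" /tmp/sample-kern.log

# Strip everything up to the "] " closing the timestamp bracket, then count
# identical messages. A count above 1 for a single cgroup means that group
# hit its limit repeatedly, the pattern Michal checked for when ruling out
# a deadlocked (never-dying) victim.
grep "killed as a result of limit" /tmp/sample-kern.log \
  | sed 's@.*\] @@' | sort | uniq -c | sort -k1 -n
```

Here `/1337` shows up with a count of 2, so under Michal's reasoning that group was OOM-killed twice in the sampled window.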
On Fri 08-02-13 14:56:16, azurIt wrote:
> Data are inside memcg-bug-5.tar.gz in directories bug/timestamp/pids/
ohh, I didn't get those were timestamp directories. It makes more sense
now.
--
Michal Hocko
SUSE Labs
>This limit is for top level groups, right? Those seem to be children which
>have 62MB charged - is that a limit for those children?
It was the limit for the parent cgroup and
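For readers following along, the parent-versus-children limit question maps onto the cgroup v1 memory controller files. The sketch below mocks that layout under `/tmp` so it runs anywhere; on a real system the files live under the memory controller mount (commonly `/sys/fs/cgroup/memory`), and all group names and byte values here are invented:

```shell
#!/bin/sh
# Mock of a cgroup v1 memory hierarchy: a parent group carries the hard
# limit while its children only accumulate usage charged against it.
ROOT=/tmp/mock-memcg/1252
mkdir -p "$ROOT/child-a" "$ROOT/child-b"
echo 104857600 > "$ROOT/memory.limit_in_bytes"          # 100 MB parent limit
echo 65011712  > "$ROOT/child-a/memory.usage_in_bytes"  # ~62 MB charged
echo 4194304   > "$ROOT/child-b/memory.usage_in_bytes"  #   4 MB charged

# The question raised above: is the limit on the parent or on the children?
# In this mock only the parent has memory.limit_in_bytes set, matching
# azurIt's description of a parent limit with ~62MB charged in a child.
echo "parent limit: $(cat "$ROOT/memory.limit_in_bytes")"
for child in "$ROOT"/*/; do
  echo "$child usage: $(cat "${child}memory.usage_in_bytes")"
done
```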
On Fri 08-02-13 12:02:49, azurIt wrote:
> >
> >Do you have logs from that time period?
> >
> >I have only glanced through the stacks and most of the threads are
> >waiting in the mem_cgroup_handle_oom (mostly from the page fault path
> >where we do not have other options than waiting) which suggests that
> >your memory limit is seriously underestimated. If
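Task stacks like the ones Michal glanced through come from `/proc`. A minimal sketch of collecting per-task state for a group of PIDs follows; the tasks file here is a mock (a real one would be the cgroup's `tasks` file), and note that `/proc/<pid>/stack`, which would actually name `mem_cgroup_handle_oom`, is root-only, so this sketch reads the world-readable `status` file instead:

```shell
#!/bin/sh
# Sketch: snapshot the name and scheduler state of every PID listed in a
# tasks file, roughly the per-pid data collected in memcg-bug-5.tar.gz.
# A task parked waiting for OOM resolution shows up in state D
# (uninterruptible sleep); /proc/<pid>/stack (root only, needs
# CONFIG_STACKTRACE) would then show the exact kernel function.
TASKS=/tmp/mock-tasks   # stands in for /sys/fs/cgroup/memory/<group>/tasks
OUT=/tmp/memcg-state
echo 1 > "$TASKS"       # PID 1 always exists; a real tasks file lists many
mkdir -p "$OUT"
while read -r pid; do
  grep -E '^(Name|State):' "/proc/$pid/status" > "$OUT/$pid"
done < "$TASKS"
```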
On Fri 08-02-13 06:03:04, azurIt wrote:
> Michal, thank you very much but it just didn't work and broke
> everything :(
I am sorry to hear that. The patch should help to solve the deadlock you
have seen earlier. It in no way can solve side effects of failing writes
and it also cannot help much if
Michal, thank you very much but it just didn't work and broke everything :(
This happened:
The problem started to occur really often immediately after booting the new kernel,
every few minutes for one of my users. But everything else seemed to work fine
so I gave it a try for a day (which was a
On Wed 06-02-13 15:22:19, Michal Hocko wrote:
> On Wed 06-02-13 15:01:19, Michal Hocko wrote:
> > On Wed 06-02-13 02:17:21, azurIt wrote:
> > > >5-memcg-fix-1.patch is not complete. It doesn't contain the follow-up I
> > > >mentioned in a follow-up email. Here is the full patch:
> > >
> > >
> > >
On Wed 06-02-13 15:22:19, Michal Hocko wrote:
On Wed 06-02-13 15:01:19, Michal Hocko wrote:
On Wed 06-02-13 02:17:21, azurIt wrote:
5-memcg-fix-1.patch is not complete. It doesn't contain the folloup I
mentioned in a follow up email. Here is the full patch:
Here is the log