Re: Memory / resource leak in 0.10.1.1 release

2016-12-30 Thread Jon Yeargers
FWIW: I went through and removed all the 'custom' serdes from my code and
replaced them with string serdes. The memory leak problem went away.

The code is a bit more cumbersome now, as it's constantly flipping back and
forth between objects and JSON, but that seems to be what it takes to keep
it running.
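
For reference, a minimal sketch of this workaround, assuming a Jackson
ObjectMapper and an illustrative SumRecord class (the field names, topic names
and the trivial "work" step are placeholders, not the actual application
code): every topic is read and written with Serdes.String(), and the
JSON/object conversion happens by hand inside the topology.

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class StringSerdeWorkaround {

    // Illustrative stand-in for the real record class.
    public static class SumRecord {
        public String account;
        public int amount;
    }

    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static void buildTopology(KStreamBuilder builder) {
        Serde<String> stringSerde = Serdes.String();

        // Everything stays a plain string on the wire; no custom serde is registered.
        KStream<String, String> raw =
                builder.stream(stringSerde, stringSerde, "transactions");

        KStream<String, String> processed = raw.mapValues(json -> {
            try {
                // Flip from JSON to the object...
                SumRecord record = MAPPER.readValue(json, SumRecord.class);
                record.amount += 1; // ...do whatever work is actually needed...
                // ...and flip back to JSON before the value leaves the processor.
                return MAPPER.writeValueAsString(record);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });

        processed.to(stringSerde, stringSerde, "transactions-out");
    }
}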

On Thu, Dec 29, 2016 at 9:42 PM, Guozhang Wang  wrote:

> Hello Jon,
>
> It is hard to tell, since I cannot see how your Aggregate() function is
> implemented either.
>
> Note that the deserializer of transactionSerde is used in both `aggregate`
> and `KStreamBuilder.stream`, while the serializer of transactionSerde is
> only used in `aggregate`. So if you suspect transactionSerde is the root
> cause, you can narrow it down by reducing the topology to
>
>
> KStream<String, SumRecord> transactionKStream = kStreamBuilder.stream(
>     stringSerde, transactionSerde, TOPIC);
>
> transactionKStream.to(TOPIC-2);
>
> where TOPIC-2 should be pre-created.
>
> The above topology will also trigger both the serializer and deserializer
> of the transactionSerde, and if it also leads to a memory leak, then the
> problem is not related to your aggregate function.
>
>
> Guozhang
>
>
> On Sun, Dec 25, 2016 at 4:15 AM, Jon Yeargers 
> wrote:
>
> > I narrowed this problem down to this part of the topology (and yes, it's
> > 100% repro - for me):
> >
> > KStream<String, SumRecord> transactionKStream =
> >     kStreamBuilder.stream(stringSerde, transactionSerde, TOPIC);
> >
> > KTable<Windowed<String>, SumRecordCollector> ktAgg =
> >     transactionKStream.groupByKey().aggregate(
> >         SumRecordCollector::new,
> >         new Aggregate(),
> >         TimeWindows.of(20 * 60 * 1000L),
> >         collectorSerde, "table_stream");
> >
> > Given that this is a pretty trivial, well-traveled piece of Kafka, I can't
> > imagine it has a memory leak.
> >
> > So I'm guessing that the serde I'm using is causing a problem somehow. The
> > 'transactionSerde' just converts the 'SumRecord' object to and from JSON.
> > That object is just a bunch of String and int fields, so nothing
> > interesting there either.
> >
> > I'm attaching the two parts of the transactionSerde to see if anyone has
> > suggestions on how to find / fix this.
> >
> >
> >
> > On Thu, Dec 22, 2016 at 9:26 AM, Jon Yeargers 
> > wrote:
> >
> >> Yes - that's the one. It's 100% reproducible (for me).
> >>
> >>
> >> On Thu, Dec 22, 2016 at 8:03 AM, Damian Guy 
> wrote:
> >>
> >>> Hi Jon,
> >>>
> >>> Is this for the topology where you are doing something like:
> >>>
> >>> topology: kStream -> groupByKey.aggregate(minute) -> foreach
> >>>  \-> groupByKey.aggregate(hour) -> foreach
> >>>
> >>> I'm trying to understand how i could reproduce your problem. I've not
> >>> seen
> >>> any such issues with 0.10.1.1, but then i'm not sure what you are
> doing.
> >>>
> >>> Thanks,
> >>> Damian
> >>>
> >>> On Thu, 22 Dec 2016 at 15:26 Jon Yeargers 
> >>> wrote:
> >>>
> >>> > Im still hitting this leak with the released version of 0.10.1.1.
> >>> >
> >>> > Process mem % grows over the course of 10-20 minutes and eventually
> >>> the OS
> >>> > kills it.
> >>> >
> >>> > Messages like this appear in /var/log/messages:
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.793692] java
> invoked
> >>> > oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.798383] java
> >>> cpuset=/
> >>> > mems_allowed=0
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.801079] CPU: 0
> PID:
> >>> 9550
> >>> > Comm: java Tainted: GE   4.4.19-29.55.amzn1.x86_64 #1
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Hardware
> >>> name:
> >>> > Xen HVM domU, BIOS 4.2.amazon 11/11/2016
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> >>> >  88071c517a70 812c958f 88071c517c58
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> >>> >  88071c517b00 811ce76d 8109db14
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> >>> > 810b2d91  0010 817d0fe9
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Call
> Trace:
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> >>> > [] dump_stack+0x63/0x84
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> >>> > [] dump_header+0x5e/0x1d8
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> >>> > [] ? set_next_entity+0xa4/0x710
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> >>> > [] ? __raw_callee_save___pv_queued_
> >>> spin_unlock+0x11/0x20
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> >>> > [] oom_kill_process+0x205/0x3d0
> >>> >
> >>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> >>> > [] out_of_memory

Re: Memory / resource leak in 0.10.1.1 release

2016-12-29 Thread Guozhang Wang
Hello Jon,

It is hard to tell, since I cannot see how your Aggregate() function is
implemented either.

Note that the deserializer of transactionSerde is used in both `aggregate`
and `KStreamBuilder.stream`, while the serializer of transactionSerde is
only used in `aggregate`. So if you suspect transactionSerde is the root
cause, you can narrow it down by reducing the topology to


KStream<String, SumRecord> transactionKStream = kStreamBuilder.stream(
    stringSerde, transactionSerde, TOPIC);

transactionKStream.to(TOPIC-2);

where TOPIC-2 should be pre-created.

The above topology will also trigger both the serializer and deserializer
of the transactionSerde, and if it also leads to a memory leak, then the
problem is not related to your aggregate function.
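
For reference, a sketch of that isolation test, keeping the variable names
from the snippet above (stringSerde, transactionSerde, TOPIC); the output
topic name below is a placeholder standing in for the pre-created TOPIC-2,
and the serdes are passed explicitly to `to()` so the custom serializer is
exercised even if it is not the configured default:

KStreamBuilder kStreamBuilder = new KStreamBuilder();

// Deserialize with the custom serde on the way in...
KStream<String, SumRecord> transactionKStream =
        kStreamBuilder.stream(stringSerde, transactionSerde, TOPIC);

// ...and re-serialize with it on the way out, with no aggregation in between.
// If this pass-through alone leaks, the aggregate function is off the hook.
transactionKStream.to(stringSerde, transactionSerde, "transactions-passthrough");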


Guozhang


On Sun, Dec 25, 2016 at 4:15 AM, Jon Yeargers 
wrote:

> I narrowed this problem down to this part of the topology (and yes, it's
> 100% repro - for me):
>
> KStream<String, SumRecord> transactionKStream =
>     kStreamBuilder.stream(stringSerde, transactionSerde, TOPIC);
>
> KTable<Windowed<String>, SumRecordCollector> ktAgg =
>     transactionKStream.groupByKey().aggregate(
>         SumRecordCollector::new,
>         new Aggregate(),
>         TimeWindows.of(20 * 60 * 1000L),
>         collectorSerde, "table_stream");
>
> Given that this is a pretty trivial, well-traveled piece of Kafka, I can't
> imagine it has a memory leak.
>
> So I'm guessing that the serde I'm using is causing a problem somehow. The
> 'transactionSerde' just converts the 'SumRecord' object to and from JSON.
> That object is just a bunch of String and int fields, so nothing interesting
> there either.
>
> I'm attaching the two parts of the transactionSerde to see if anyone has
> suggestions on how to find / fix this.
>
>
>
> On Thu, Dec 22, 2016 at 9:26 AM, Jon Yeargers 
> wrote:
>
>> Yes - that's the one. It's 100% reproducible (for me).
>>
>>
>> On Thu, Dec 22, 2016 at 8:03 AM, Damian Guy  wrote:
>>
>>> Hi Jon,
>>>
>>> Is this for the topology where you are doing something like:
>>>
>>> topology: kStream -> groupByKey.aggregate(minute) -> foreach
>>>  \-> groupByKey.aggregate(hour) -> foreach
>>>
>>> I'm trying to understand how i could reproduce your problem. I've not
>>> seen
>>> any such issues with 0.10.1.1, but then i'm not sure what you are doing.
>>>
>>> Thanks,
>>> Damian
>>>
>>> On Thu, 22 Dec 2016 at 15:26 Jon Yeargers 
>>> wrote:
>>>
>>> > Im still hitting this leak with the released version of 0.10.1.1.
>>> >
>>> > Process mem % grows over the course of 10-20 minutes and eventually
>>> the OS
>>> > kills it.
>>> >
>>> > Messages like this appear in /var/log/messages:
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.793692] java invoked
>>> > oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.798383] java
>>> cpuset=/
>>> > mems_allowed=0
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.801079] CPU: 0 PID:
>>> 9550
>>> > Comm: java Tainted: GE   4.4.19-29.55.amzn1.x86_64 #1
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Hardware
>>> name:
>>> > Xen HVM domU, BIOS 4.2.amazon 11/11/2016
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> >  88071c517a70 812c958f 88071c517c58
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> >  88071c517b00 811ce76d 8109db14
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > 810b2d91  0010 817d0fe9
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Call Trace:
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > [] dump_stack+0x63/0x84
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > [] dump_header+0x5e/0x1d8
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > [] ? set_next_entity+0xa4/0x710
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > [] ? __raw_callee_save___pv_queued_
>>> spin_unlock+0x11/0x20
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > [] oom_kill_process+0x205/0x3d0
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > [] out_of_memory+0x431/0x480
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > [] __alloc_pages_nodemask+0x91e/0xa60
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > [] alloc_pages_current+0x88/0x120
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > [] __page_cache_alloc+0xb4/0xc0
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > [] filemap_fault+0x188/0x3e0
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>>> > [] ext4_filemap_fault+0x36/0x50 [ext4]
>>> >
>>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [29898

Re: Memory / resource leak in 0.10.1.1 release

2016-12-25 Thread Jon Yeargers
I narrowed this problem down to this part of the topology (and yes, it's
100% repro - for me):

KStream<String, SumRecord> transactionKStream =
    kStreamBuilder.stream(stringSerde, transactionSerde, TOPIC);

KTable<Windowed<String>, SumRecordCollector> ktAgg =
    transactionKStream.groupByKey().aggregate(
        SumRecordCollector::new,
        new Aggregate(),
        TimeWindows.of(20 * 60 * 1000L),
        collectorSerde, "table_stream");

Given that this is a pretty trivial, well-traveled piece of Kafka, I can't
imagine it has a memory leak.

So I'm guessing that the serde I'm using is causing a problem somehow. The
'transactionSerde' just converts the 'SumRecord' object to and from JSON.
That object is just a bunch of String and int fields, so nothing interesting
there either.

I'm attaching the two parts of the transactionSerde to see if anyone has
suggestions on how to find / fix this.
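
Since attachments don't come through in the archive, here is only a sketch of
what a Jackson-based serde of that shape usually looks like; the SumRecord
fields and class names are invented for illustration and are not the attached
code. One thing worth checking in the real classes is whether either half
allocates something per record (streams, buffers, native handles) that is
never closed.

import java.util.Map;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.Serializer;

public class SumRecordSerdeSketch {

    // Illustrative POJO: "just a bunch of String and int fields".
    public static class SumRecord {
        public String account;
        public int amount;
    }

    public static class SumRecordSerializer implements Serializer<SumRecord> {
        private final ObjectMapper mapper = new ObjectMapper(); // one reusable instance

        @Override public void configure(Map<String, ?> configs, boolean isKey) { }

        @Override
        public byte[] serialize(String topic, SumRecord data) {
            try {
                return data == null ? null : mapper.writeValueAsBytes(data);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }

        @Override public void close() { }
    }

    public static class SumRecordDeserializer implements Deserializer<SumRecord> {
        private final ObjectMapper mapper = new ObjectMapper();

        @Override public void configure(Map<String, ?> configs, boolean isKey) { }

        @Override
        public SumRecord deserialize(String topic, byte[] data) {
            try {
                return data == null ? null : mapper.readValue(data, SumRecord.class);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }

        @Override public void close() { }
    }

    // Wire the two halves into a Serde the topology can use.
    public static Serde<SumRecord> serde() {
        return Serdes.serdeFrom(new SumRecordSerializer(), new SumRecordDeserializer());
    }
}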



On Thu, Dec 22, 2016 at 9:26 AM, Jon Yeargers 
wrote:

> Yes - that's the one. It's 100% reproducible (for me).
>
>
> On Thu, Dec 22, 2016 at 8:03 AM, Damian Guy  wrote:
>
>> Hi Jon,
>>
>> Is this for the topology where you are doing something like:
>>
>> topology: kStream -> groupByKey.aggregate(minute) -> foreach
>>  \-> groupByKey.aggregate(hour) -> foreach
>>
>> I'm trying to understand how i could reproduce your problem. I've not seen
>> any such issues with 0.10.1.1, but then i'm not sure what you are doing.
>>
>> Thanks,
>> Damian
>>
>> On Thu, 22 Dec 2016 at 15:26 Jon Yeargers 
>> wrote:
>>
>> > Im still hitting this leak with the released version of 0.10.1.1.
>> >
>> > Process mem % grows over the course of 10-20 minutes and eventually the
>> OS
>> > kills it.
>> >
>> > Messages like this appear in /var/log/messages:
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.793692] java invoked
>> > oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.798383] java cpuset=/
>> > mems_allowed=0
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.801079] CPU: 0 PID:
>> 9550
>> > Comm: java Tainted: GE   4.4.19-29.55.amzn1.x86_64 #1
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Hardware
>> name:
>> > Xen HVM domU, BIOS 4.2.amazon 11/11/2016
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> >  88071c517a70 812c958f 88071c517c58
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> >  88071c517b00 811ce76d 8109db14
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > 810b2d91  0010 817d0fe9
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Call Trace:
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] dump_stack+0x63/0x84
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] dump_header+0x5e/0x1d8
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] ? set_next_entity+0xa4/0x710
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] ? __raw_callee_save___pv_queued_
>> spin_unlock+0x11/0x20
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] oom_kill_process+0x205/0x3d0
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] out_of_memory+0x431/0x480
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] __alloc_pages_nodemask+0x91e/0xa60
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] alloc_pages_current+0x88/0x120
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] __page_cache_alloc+0xb4/0xc0
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] filemap_fault+0x188/0x3e0
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] ext4_filemap_fault+0x36/0x50 [ext4]
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] __do_fault+0x3d/0x70
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] handle_mm_fault+0xf27/0x1870
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] ? __raw_callee_save___pv_queued_
>> spin_unlock+0x11/0x20
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] __do_page_fault+0x183/0x3f0
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] do_page_fault+0x22/0x30
>> >
>> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>> > [] page_fault+0x28/0x30
>> >
>>
>
>


Re: Memory / resource leak in 0.10.1.1 release

2016-12-22 Thread Jon Yeargers
Yes - that's the one. It's 100% reproducible (for me).


On Thu, Dec 22, 2016 at 8:03 AM, Damian Guy  wrote:

> Hi Jon,
>
> Is this for the topology where you are doing something like:
>
> topology: kStream -> groupByKey.aggregate(minute) -> foreach
>  \-> groupByKey.aggregate(hour) -> foreach
>
> I'm trying to understand how i could reproduce your problem. I've not seen
> any such issues with 0.10.1.1, but then i'm not sure what you are doing.
>
> Thanks,
> Damian
>
> On Thu, 22 Dec 2016 at 15:26 Jon Yeargers 
> wrote:
>
> > Im still hitting this leak with the released version of 0.10.1.1.
> >
> > Process mem % grows over the course of 10-20 minutes and eventually the
> OS
> > kills it.
> >
> > Messages like this appear in /var/log/messages:
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.793692] java invoked
> > oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.798383] java cpuset=/
> > mems_allowed=0
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.801079] CPU: 0 PID:
> 9550
> > Comm: java Tainted: GE   4.4.19-29.55.amzn1.x86_64 #1
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Hardware name:
> > Xen HVM domU, BIOS 4.2.amazon 11/11/2016
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> >  88071c517a70 812c958f 88071c517c58
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> >  88071c517b00 811ce76d 8109db14
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > 810b2d91  0010 817d0fe9
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Call Trace:
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] dump_stack+0x63/0x84
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] dump_header+0x5e/0x1d8
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] ? set_next_entity+0xa4/0x710
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] ? __raw_callee_save___pv_queued_
> spin_unlock+0x11/0x20
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] oom_kill_process+0x205/0x3d0
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] out_of_memory+0x431/0x480
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] __alloc_pages_nodemask+0x91e/0xa60
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] alloc_pages_current+0x88/0x120
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] __page_cache_alloc+0xb4/0xc0
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] filemap_fault+0x188/0x3e0
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] ext4_filemap_fault+0x36/0x50 [ext4]
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] __do_fault+0x3d/0x70
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] handle_mm_fault+0xf27/0x1870
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] ? __raw_callee_save___pv_queued_
> spin_unlock+0x11/0x20
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] __do_page_fault+0x183/0x3f0
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] do_page_fault+0x22/0x30
> >
> > Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> > [] page_fault+0x28/0x30
> >
>


Re: Memory / resource leak in 0.10.1.1 release

2016-12-22 Thread Damian Guy
Hi Jon,

Is this for the topology where you are doing something like:

topology: kStream -> groupByKey.aggregate(minute) -> foreach
                  \-> groupByKey.aggregate(hour)  -> foreach

I'm trying to understand how I could reproduce your problem. I've not seen
any such issues with 0.10.1.1, but then I'm not sure what you are doing.
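
For reference, a sketch of that general shape of topology, reusing the class
and serde names from Jon's earlier snippet (SumRecord, SumRecordCollector,
Aggregate, collectorSerde); the window sizes, store names, and topic name here
are illustrative, not the actual code:

KStreamBuilder builder = new KStreamBuilder();

KStream<String, SumRecord> stream =
        builder.stream(stringSerde, transactionSerde, "transactions");

// Branch 1: aggregate over one-minute windows, then consume the results.
stream.groupByKey()
      .aggregate(SumRecordCollector::new, new Aggregate(),
                 TimeWindows.of(60 * 1000L), collectorSerde, "minute_store")
      .toStream()
      .foreach((windowedKey, collector) -> System.out.println("minute " + windowedKey));

// Branch 2: the same stream aggregated over one-hour windows.
stream.groupByKey()
      .aggregate(SumRecordCollector::new, new Aggregate(),
                 TimeWindows.of(60 * 60 * 1000L), collectorSerde, "hour_store")
      .toStream()
      .foreach((windowedKey, collector) -> System.out.println("hour " + windowedKey));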

Thanks,
Damian

On Thu, 22 Dec 2016 at 15:26 Jon Yeargers  wrote:

> Im still hitting this leak with the released version of 0.10.1.1.
>
> Process mem % grows over the course of 10-20 minutes and eventually the OS
> kills it.
>
> Messages like this appear in /var/log/messages:
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.793692] java invoked
> oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.798383] java cpuset=/
> mems_allowed=0
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.801079] CPU: 0 PID: 9550
> Comm: java Tainted: GE   4.4.19-29.55.amzn1.x86_64 #1
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Hardware name:
> Xen HVM domU, BIOS 4.2.amazon 11/11/2016
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>  88071c517a70 812c958f 88071c517c58
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
>  88071c517b00 811ce76d 8109db14
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> 810b2d91  0010 817d0fe9
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Call Trace:
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] dump_stack+0x63/0x84
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] dump_header+0x5e/0x1d8
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] ? set_next_entity+0xa4/0x710
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] oom_kill_process+0x205/0x3d0
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] out_of_memory+0x431/0x480
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] __alloc_pages_nodemask+0x91e/0xa60
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] alloc_pages_current+0x88/0x120
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] __page_cache_alloc+0xb4/0xc0
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] filemap_fault+0x188/0x3e0
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] ext4_filemap_fault+0x36/0x50 [ext4]
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] __do_fault+0x3d/0x70
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] handle_mm_fault+0xf27/0x1870
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] __do_page_fault+0x183/0x3f0
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] do_page_fault+0x22/0x30
>
> Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
> [] page_fault+0x28/0x30
>


Memory / resource leak in 0.10.1.1 release

2016-12-22 Thread Jon Yeargers
I'm still hitting this leak with the released version of 0.10.1.1.

The process's memory usage grows over the course of 10-20 minutes and
eventually the OS kills it.
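
Not from the original report, but one way to narrow this down: the OOM killer
acts on the whole process's resident memory, so it helps to log the JVM's own
view of heap and non-heap usage while the job runs and compare it with the RSS
the kernel reports. If the Java heap stays flat while the process keeps
growing, the leak is in native memory (for example the embedded RocksDB state
stores or other unreleased native resources) rather than on the heap. A small
self-contained watcher, with an arbitrary one-minute interval:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryWatcher {
    // Start a daemon thread that logs heap and non-heap usage once a minute.
    public static void start() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        Thread t = new Thread(() -> {
            while (true) {
                MemoryUsage heap = memory.getHeapMemoryUsage();
                MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
                System.out.printf("heap used=%d MB, non-heap used=%d MB%n",
                        heap.getUsed() >> 20, nonHeap.getUsed() >> 20);
                try {
                    Thread.sleep(60_000);
                } catch (InterruptedException e) {
                    return;
                }
            }
        }, "memory-watcher");
        t.setDaemon(true);
        t.start();
    }
}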

Messages like this appear in /var/log/messages:

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.793692] java invoked
oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.798383] java cpuset=/
mems_allowed=0

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.801079] CPU: 0 PID: 9550
Comm: java Tainted: GE   4.4.19-29.55.amzn1.x86_64 #1

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Hardware name:
Xen HVM domU, BIOS 4.2.amazon 11/11/2016

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
 88071c517a70 812c958f 88071c517c58

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
 88071c517b00 811ce76d 8109db14

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
810b2d91  0010 817d0fe9

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072] Call Trace:

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] dump_stack+0x63/0x84

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] dump_header+0x5e/0x1d8

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] ? set_next_entity+0xa4/0x710

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] oom_kill_process+0x205/0x3d0

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] out_of_memory+0x431/0x480

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] __alloc_pages_nodemask+0x91e/0xa60

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] alloc_pages_current+0x88/0x120

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] __page_cache_alloc+0xb4/0xc0

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] filemap_fault+0x188/0x3e0

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] ext4_filemap_fault+0x36/0x50 [ext4]

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] __do_fault+0x3d/0x70

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] handle_mm_fault+0xf27/0x1870

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] __do_page_fault+0x183/0x3f0

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] do_page_fault+0x22/0x30

Dec 22 13:31:22 ip-172-16-101-108 kernel: [2989844.805072]
[] page_fault+0x28/0x30