Re: [openstack-dev] [oslo][monasca] Can we uncap python-kafka ?

2017-01-17 Thread Keen, Joe
Tony, I have some observations on the new client based on a short-term
test and a long-running test.

For short-term use it consumes roughly 2x the memory of the older client.
The logic that deals with receiving partial messages from Kafka was
completely rewritten in the 1.x series, and with logging enabled I see
continual warnings about truncated messages.  I don't lose any data
because of this, but I haven't been able to verify whether it's doing more
reads than necessary.  I don't know that either of these problems is really
a sticking point for Monasca, but the increase in memory usage is
potentially a problem.
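
Something along these lines is enough to reproduce that kind of observation;
it's only an illustrative sketch (the topic name, broker address, and the
resource-based memory sample are placeholders, not our actual test harness).
The client emits the truncated-message warnings through the standard logging
module, so enabling that is all it takes to see them:

    # Illustrative sketch only -- placeholder topic/broker, not our real test.
    import logging
    import resource

    from kafka import KafkaConsumer

    # kafka-python logs its warnings (including the truncated-message ones)
    # through the standard logging module.
    logging.basicConfig(level=logging.WARNING)

    consumer = KafkaConsumer('metrics',                      # placeholder topic
                             bootstrap_servers='localhost:9092',
                             consumer_timeout_ms=10000)      # stop when idle

    count = 0
    for message in consumer:
        count += 1
        if count % 100000 == 0:
            # ru_maxrss is reported in kilobytes on Linux
            rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            print('consumed %d messages, max RSS %d kB' % (count, rss))

    consumer.close()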

Long-term testing showed some additional problems.  On a Kafka server that
has been running for a couple of weeks I can write data in, but the
kafka-python library is no longer able to read data from Kafka.  Clients
written in other languages are able to read successfully.  Profiling of
the python-kafka client shows that it's spending all its time in a loop
attempting to connect to Kafka:

     ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      27615    0.086    0.000    0.086    0.000 {method 'acquire' of 'thread.lock' objects}
      43152    0.250    0.000    0.385    0.000 types.py:15(_unpack)
      43153    0.135    0.000    0.135    0.000 {_struct.unpack}
48040/47798    0.164    0.000    0.165    0.000 {len}
      60351    0.201    0.000    0.201    0.000 {method 'read' of '_io.BytesIO' objects}
    7389962   23.985    0.000   23.985    0.000 {method 'keys' of 'dict' objects}
        738  104.931    0.000  395.654    0.000 conn.py:560(recv)
        738   58.342    0.000  100.005    0.000 conn.py:722(_requests_timed_out)
        738   97.787    0.000  167.568    0.000 conn.py:588(_recv)
    7390071   46.596    0.000   46.596    0.000 {method 'recv' of '_socket.socket' objects}
    7390145   23.151    0.000   23.151    0.000 conn.py:458(connected)
    7390266   21.417    0.000   21.417    0.000 {method 'tell' of '_io.BytesIO' objects}
    7395664   41.695    0.000   41.695    0.000 {time.time}
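
To reproduce a dump like the one above, something along these lines works;
it's only a sketch (placeholder topic and broker, trivial consume loop), not
the exact harness I profiled:

    # Sketch of a cProfile harness -- placeholder names, not the exact test.
    import cProfile
    import pstats

    from kafka import KafkaConsumer

    def consume_some(limit=100000):
        # Trivial consume loop against a local broker; stops when idle or
        # after `limit` messages.
        consumer = KafkaConsumer('metrics',
                                 bootstrap_servers='localhost:9092',
                                 consumer_timeout_ms=30000)
        read = 0
        for _ in consumer:
            read += 1
            if read >= limit:
                break
        consumer.close()
        return read

    profiler = cProfile.Profile()
    profiler.enable()
    consume_some()
    profiler.disable()

    # Print the 20 hottest entries by cumulative time, as in the dump above.
    pstats.Stats(profiler).sort_stats('cumulative').print_stats(20)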



I also see additional problems with the use of the deprecated
SimpleConsumer and SimpleProducer clients.  We really do need to
investigate migrating to the new async-only Producer objects while still
maintaining the reliability guarantees that Monasca requires.
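
To make that concrete, the approach I'd expect us to investigate is blocking
on the future that send() returns, roughly as sketched below.  This is only
an illustration (placeholder topic and payload), and we still need to verify
it actually gives the durability we get from the old synchronous producer:

    # Illustrative only -- not a verified Monasca implementation.
    from kafka import KafkaProducer
    from kafka.errors import KafkaError

    producer = KafkaProducer(bootstrap_servers='localhost:9092',
                             acks='all',   # wait for the full ISR to ack
                             retries=3)

    def send_sync(topic, payload):
        future = producer.send(topic, value=payload)
        try:
            # Block until the broker acknowledges (or the send fails), which
            # is what makes the async producer behave synchronously.
            return future.get(timeout=10)
        except KafkaError:
            # Surface the failure instead of silently dropping the metric.
            raise

    send_sync('metrics', b'{"name": "cpu.idle_perc", "value": 99.0}')
    producer.flush()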


On 12/13/16, 10:01 PM, "Tony Breeds" <t...@bakeyournoodle.com> wrote:

>On Mon, Dec 05, 2016 at 04:03:13AM +, Keen, Joe wrote:
>
>> I don’t know, yet, that we can.  Unless we can find an answer to the
>> questions I had above I’m not sure that this new library will be
>> performant and durable enough for the use cases Monasca has.  I’m fairly
>> confident that we can make it work but the performance issues with
>> previous versions prevented us from even trying to integrate so it will
>> take us some time.  If you need an answer more quickly than a week or
>>so,
>> and if anyone in the community is willing, I can walk them through the
>> testing I’d expect to happen to validate the new library.
>
>Any updates Joe?  It's been 10 days and we're running close to Christmas
>so at this rate it'll be next year before we know if this is workable.
>
>Yours Tony.



Re: [openstack-dev] [oslo][monasca] Can we uncap python-kafka ?

2016-12-04 Thread Keen, Joe


On 12/4/16, 7:36 PM, "Tony Breeds" <t...@bakeyournoodle.com> wrote:

>On Fri, Dec 02, 2016 at 06:18:39PM +, Keen, Joe wrote:
>> 
>> 
>> On 12/2/16, 1:29 AM, "Mehdi Abaakouk" <sil...@sileht.net> wrote:
>> 
>> >On Fri, Dec 02, 2016 at 03:29:59PM +1100, Tony Breeds wrote:
>> >>On Thu, Dec 01, 2016 at 04:52:52PM +, Keen, Joe wrote:
>> >>
>> >>> Unfortunately there's nothing wrong on the Monasca side so far as we
>> >>>know.
>> >>>  We test new versions of the kafka-python library outside of Monasca
>> >>> before we bother to try integrating a new version.  Since 1.0 the
>> >>> kafka-python library has suffered from crashes and memory leaks
>>severe
>> >>> enough that we've never attempted using it in Monasca itself.  We
>> >>>reported
>> >>> the bugs we found to the kafka-python project but they were closed
>>once
>> >>> they released a new version.
>> >>
>> >>So opening bugs isn't working.  What about writing code?
>> >
>> >The bug https://github.com/dpkp/kafka-python/issues/55
>> >
>> >Reopening it would be the right solution here.
>> >
>> >I can't reproduce the segfault either and I agree with dpkp, that
>>looks
>> >like a
>> >ujson issue.
>> 
>> 
>> The bug I had was: https://github.com/dpkp/kafka-python/issues/551
>> 
>> In the case of that bug ujson was not an issue.  The behaviour remained
>> even using the standard json library.  The primary issue I found with it
>> was a memory leak over successive runs of the test script.  Eventually
>>the
>> leak became so bad that the OOM killer killed the process which caused
>>the
>> segfault I was seeing.  The last version I tested was 1.2.1 and it still
>> leaked badly.  I'll need to let the benchmark script run for a while and
>> make sure it's not still leaking.
>
>So you wrote that on Friday, so you should know by now if it's leaking;
>care to
>give us an update?

I wasn’t able to set a test up on Friday and with all the other work I
have for the next few days I doubt I’ll be able to get to it much before
Wednesday.

>
>> >And my bench seems to confirm the perf issue has been solved:
>> >(but not in the pointed version...)
>> >
>> >$ pifpaf run kafka python kafka_test.py
>> >kafka-python version: 0.9.5
>> >...
>> >fetch size 179200 -> 45681.8728864 messages per second
>> >fetch size 204800 -> 47724.3810674 messages per second
>> >fetch size 230400 -> 47209.9841092 messages per second
>> >fetch size 256000 -> 48340.7719787 messages per second
>> >fetch size 281600 -> 49192.9896743 messages per second
>> >fetch size 307200 -> 50915.3291133 messages per second
>> >
>> >$ pifpaf run kafka python kafka_test.py
>> >kafka-python version: 1.0.2
>> >
>> >fetch size 179200 -> 8546.77931323 messages per second
>> >fetch size 204800 -> 9213.30958314 messages per second
>> >fetch size 230400 -> 10316.668006 messages per second
>> >fetch size 256000 -> 11476.2285269 messages per second
>> >fetch size 281600 -> 12353.7254386 messages per second
>> >fetch size 307200 -> 13131.2367288 messages per second
>> >
>> >(1.1.1 and 1.2.5 also have the same issue)
>> >
>> >$ pifpaf run kafka python kafka_test.py
>> >kafka-python version: 1.3.1
>> >fetch size 179200 -> 44636.9371873 messages per second
>> >fetch size 204800 -> 44324.7085365 messages per second
>> >fetch size 230400 -> 45235.8283208 messages per second
>> >fetch size 256000 -> 45793.1044121 messages per second
>> >fetch size 281600 -> 44648.6357019 messages per second
>> >fetch size 307200 -> 44877.8445987 messages per second
>> >fetch size 332800 -> 47166.9176281 messages per second
>> >fetch size 358400 -> 47391.0057622 messages per second
>> >
>> >Looks like it works well now :)
>> 
>> It's good that the performance problem has been fixed.  The remaining
>> issues on the Monasca side are verifying that the batch send method we
>> were using in 0.9.5 still works with the new async behaviour, seeing if
>> our consumer auto balance still functions or converting to use the Kafka
>> internal auto balance in Kafka 0.10, and finding a way to do efficient
>> synchronous writes with the new async methods.
>
>Can you +1 https://review.openstack

Re: [openstack-dev] [oslo][monasca] Can we uncap python-kafka ?

2016-12-02 Thread Keen, Joe


On 12/2/16, 1:29 AM, "Mehdi Abaakouk" <sil...@sileht.net> wrote:

>On Fri, Dec 02, 2016 at 03:29:59PM +1100, Tony Breeds wrote:
>>On Thu, Dec 01, 2016 at 04:52:52PM +, Keen, Joe wrote:
>>
>>> Unfortunately there's nothing wrong on the Monasca side so far as we
>>>know.
>>>  We test new versions of the kafka-python library outside of Monasca
>>> before we bother to try integrating a new version.  Since 1.0 the
>>> kafka-python library has suffered from crashes and memory leaks severe
>>> enough that we've never attempted using it in Monasca itself.  We
>>>reported
>>> the bugs we found to the kafka-python project but they were closed once
>>> they released a new version.
>>
>>So opening bugs isn't working.  What about writing code?
>
>The bug https://github.com/dpkp/kafka-python/issues/55
>
>Reopening it would be the right solution here.
>
>I can't reproduce the segfault either and I agree with dpkp, that looks
>like a
>ujson issue.


The bug I had was: https://github.com/dpkp/kafka-python/issues/551

In the case of that bug ujson was not an issue.  The behaviour remained
even using the standard json library.  The primary issue I found with it
was a memory leak over successive runs of the test script.  Eventually the
leak became so bad that the OOM killer killed the process which caused the
segfault I was seeing.  The last version I tested was 1.2.1 and it still
leaked badly.  I'll need to let the benchmark script run for a while and
make sure it's not still leaking.
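
The shape of that check is simple: run the same produce/consume cycle
repeatedly and watch whether the process's memory keeps climbing.  Roughly
like the sketch below (placeholder topic, broker, and payload; this is not
the actual script attached to the bug report):

    # Rough sketch of the leak check -- not the script from the issue above.
    import resource

    from kafka import KafkaConsumer, KafkaProducer

    def one_cycle(count=100000):
        producer = KafkaProducer(bootstrap_servers='localhost:9092')
        for _ in range(count):
            producer.send('leak-test', b'x' * 200)  # ~200 byte placeholder payload
        producer.flush()
        producer.close()

        consumer = KafkaConsumer('leak-test',
                                 bootstrap_servers='localhost:9092',
                                 group_id='leak-check',         # placeholder group
                                 auto_offset_reset='earliest',
                                 consumer_timeout_ms=10000)
        read = sum(1 for _ in consumer)
        consumer.close()
        return read

    for run in range(20):
        one_cycle()
        # If this number keeps growing run after run, something is leaking.
        rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        print('run %d done, max RSS %d kB' % (run, rss))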

>
>And my bench seems to confirm the perf issue has been solved:
>(but not in the pointed version...)
>
>$ pifpaf run kafka python kafka_test.py
>kafka-python version: 0.9.5
>...
>fetch size 179200 -> 45681.8728864 messages per second
>fetch size 204800 -> 47724.3810674 messages per second
>fetch size 230400 -> 47209.9841092 messages per second
>fetch size 256000 -> 48340.7719787 messages per second
>fetch size 281600 -> 49192.9896743 messages per second
>fetch size 307200 -> 50915.3291133 messages per second
>
>$ pifpaf run kafka python kafka_test.py
>kafka-python version: 1.0.2
>
>fetch size 179200 -> 8546.77931323 messages per second
>fetch size 204800 -> 9213.30958314 messages per second
>fetch size 230400 -> 10316.668006 messages per second
>fetch size 256000 -> 11476.2285269 messages per second
>fetch size 281600 -> 12353.7254386 messages per second
>fetch size 307200 -> 13131.2367288 messages per second
>
>(1.1.1 and 1.2.5 also have the same issue)
>
>$ pifpaf run kafka python kafka_test.py
>kafka-python version: 1.3.1
>fetch size 179200 -> 44636.9371873 messages per second
>fetch size 204800 -> 44324.7085365 messages per second
>fetch size 230400 -> 45235.8283208 messages per second
>fetch size 256000 -> 45793.1044121 messages per second
>fetch size 281600 -> 44648.6357019 messages per second
>fetch size 307200 -> 44877.8445987 messages per second
>fetch size 332800 -> 47166.9176281 messages per second
>fetch size 358400 -> 47391.0057622 messages per second
>
>Looks like it works well now :)

It's good that the performance problem has been fixed.  The remaining
issues on the Monasca side are verifying that the batch send method we
were using in 0.9.5 still works with the new async behaviour, seeing if
our consumer auto balance still functions or converting to use the Kafka
internal auto balance in Kafka 0.10, and finding a way to do efficient
synchronous writes with the new async methods.
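
On the consumer balancing point, the new library's own mechanism is the
group coordination that kicks in when consumers share a group_id against a
new enough broker; something like the sketch below is what we'd have to
evaluate against our existing balancing code.  Topic and group names are
placeholders, and the explicit commit is just to show that auto_commit can
be turned off:

    # Sketch of the broker-side group balancing -- illustration only.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer('metrics',                    # placeholder topic
                             bootstrap_servers='localhost:9092',
                             group_id='persister',         # placeholder group
                             enable_auto_commit=False)

    # Every consumer started with the same group_id shares the topic's
    # partitions and is rebalanced by the broker-side coordinator.
    for message in consumer:
        # ... handle message.value ...
        consumer.commit()  # commit only after the message is safely handled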




Re: [openstack-dev] [oslo][monasca] Can we uncap python-kafka ?

2016-12-01 Thread Keen, Joe
On 12/1/16, 9:44 AM, "Julien Danjou" <jul...@danjou.info> wrote:

>On Thu, Dec 01 2016, Keen, Joe wrote:
>
>Hi Joe,
>
>[…]
>
>> The message context wrapped around every message is extra overhead we
>> don't want.
>> There's no support for batch sends to Kafka.
>> There's no support for keyed producers.
>> The forced auto_commit on consumers isn't something we can tolerate.
>> There is no auto balance of multiple consumers in the same consumer
>>group
>> on a topic.
>
>These sound like interesting features and optimizations to add to
>oslo.messaging. Did you already send patches or a spec to oslo.messaging
>to improve that? What were the issues you encountered in fixing those
>problems / adding those new abilities?
>
>-- 
>Julien Danjou
>// Free Software hacker
>// https://julien.danjou.info


I haven’t submitted any patches to oslo.messaging.  When I wrote the Kafka
interfaces for Monasca I didn’t know oslo.messaging existed.  Since then I
just haven’t had the time to try merging the two approaches into one
library.  If anyone else is interested in what we’ve done I can help with
that.



Re: [openstack-dev] [oslo][monasca] Can we uncap python-kafka ?

2016-12-01 Thread Keen, Joe


On 12/1/16, 9:41 AM, "Joshua Harlow" <harlo...@fastmail.com> wrote:

>Keen, Joe wrote:
>> I'll look into testing the newest version of kafka-python and see if it
>> meets our needs.  If it still isn't stable and performant enough what
>>are
>> the available options?
>
>Fix the kafka-python library or fix monasca; those seem to be the
>options to me :)
>
>I'd also not like to block the rest of the world (from using newer
>versions of kafka-python) during this as well. But then this may
>diverge/expand into a discussion we had a few summits ago, about getting
>rid of co-installability...
>
>-Josh

Unfortunately there’s nothing wrong on the Monasca side so far as we know.
 We test new versions of the kafka-python library outside of Monasca
before we bother to try integrating a new version.  Since 1.0 the
kafka-python library has suffered from crashes and memory leaks severe
enough that we’ve never attempted using it in Monasca itself.  We reported
the bugs we found to the kafka-python project but they were closed once
they released a new version.



Re: [openstack-dev] [oslo][monasca] Can we uncap python-kafka ?

2016-12-01 Thread Keen, Joe
On 12/1/16, 3:11 AM, "Julien Danjou"  wrote:

>On Thu, Dec 01 2016, Mehdi Abaakouk wrote:
>
>> I'm aware of all of that; the oslo.messaging patch for the new version
>> has been ready for 8 months now. And we are still blocked... capping
>> libraries, whatever the reason, is very annoying and just freezes people's work.
>>
>> From the API pov python-kafka hasn't broken anything; the API is still
>> here and documented (and deprecated). What Monasca raises is a performance
>> issue due to how they use the library, and assumptions about how it
>>works
>> internally. Blocking all projects for that does not look fair to me.
>>
>> As Nadya said, we now have users that prefer using an unmerged
>> patch and the new lib instead of using the upstream supported
>> version with the old lib. This is not an acceptable situation to me but
>>that's
>> just my thought.
>>
>> Where is the solution to allow the oslo.messaging work, blocked for 8
>> months now, to continue?
>
>+1
>
>And if Monasca is using messaging, I wonder why they don't rely on
>oslo.messaging, which would also solve this entire problem in a snap.


Julien, I looked at the current oslo.messaging Kafka driver and
unfortunately I don't think it meets our needs.  There are several major
pieces that are a problem.

The message context wrapped around every message is extra overhead we
don't want.
There's no support for batch sends to Kafka.
There's no support for keyed producers.
The forced auto_commit on consumers isn't something we can tolerate.
There is no auto balance of multiple consumers in the same consumer group
on a topic.

The current Kafka driver we use in monasca-common has all these features.
Without them we can't meet the durability and performance levels we need.
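
Just to illustrate a couple of those items, this is roughly how keyed sends
and client-side batching look when expressed directly against the newer
kafka-python producer API.  Placeholder names and values; this is not our
actual monasca-common driver, which still uses the 0.9.x interfaces:

    # Illustration with the 1.x KafkaProducer -- not the monasca-common driver.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers='localhost:9092',
                             linger_ms=50,         # let sends accumulate into batches
                             batch_size=64 * 1024)

    metrics = [(b'cpu.idle_perc', b'{"value": 99.0}'),
               (b'cpu.user_perc', b'{"value": 0.7}')]  # placeholder metrics

    for key, value in metrics:
        # Keyed send: records with the same key hash to the same partition.
        producer.send('metrics', key=key, value=value)

    producer.flush()  # block until the accumulated batch is on the wire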

What sort of performance levels is oslo.messaging tested at?

I'll look into testing the newest version of kafka-python and see if it
meets our needs.  If it still isn't stable and performant enough what are
the available options?




Re: [openstack-dev] [oslo][monasca][requirements] Kafka 1.x for Oslo.Messaging and Monasca

2016-06-28 Thread Keen, Joe
Tony,
  Psutil is the only library we've seen so far that was already in the
global requirements with a version we didn't support.  Because of that, and
because we wanted to use the upper constraints in our devstack gate jobs, we
don't currently have it as a requirement for the Monasca Agent; we just
install it externally for now.  Once that patch set lands we can add it
back into the requirements for the Monasca Agent.  We have several other
libraries we’re in the process of removing that aren’t strictly necessary
and that aren’t worth trying to add to global requirements right now.  We
currently have a review up to add our monasca-common repo to the
requirements [1].  Once that goes through we should be able to immediately
add reviews for our monasca-api and monasca-persister repos.


  I’m in the process of running some more Kafka tests and putting together
more details about the Kafka problems from the Monasca perspective.  The
short version is that we have a simple Kafka test we used to validate
version 0.9.5 that writes 100,000 messages to Kafka, at the average
message size that Monasca expects, and then records how long it takes to
consume them at a variety of fetch sizes from 50K to 2MB.  With 0.9.5,
depending on the fetch size, we could get throughput rates of 50K to 60K
messages per second.  With the 1.0.x series performance degraded to 7K to
10K messages per second, and after running my test program repeatedly it
eventually consumed all the RAM in my VM and the kernel killed it.  I
recently tested
version 1.2.2 and while the performance was better, a consistent ~40K per
second regardless of fetch size, that only worked for a single run of my
test program.  A subsequent run only managed ~12K per second and did not
complete the test.  It consumed ~7GB of RAM before running my VM out of
RAM.
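
For context, the shape of that test is roughly the loop below, approximated
here with the 1.x KafkaConsumer where max_partition_fetch_bytes stands in
for the old fetch size.  The topic and broker are placeholders and this is
not the exact script I ran against 0.9.5:

    # Approximate sketch of the fetch-size sweep -- not the original script.
    import time

    from kafka import KafkaConsumer

    N_MESSAGES = 100000

    for fetch_bytes in (50 * 1024, 256 * 1024, 1024 * 1024, 2 * 1024 * 1024):
        consumer = KafkaConsumer('bench',                   # placeholder topic
                                 bootstrap_servers='localhost:9092',
                                 auto_offset_reset='earliest',
                                 max_partition_fetch_bytes=fetch_bytes,
                                 consumer_timeout_ms=30000)
        start = time.time()
        read = 0
        for _ in consumer:
            read += 1
            if read >= N_MESSAGES:
                break
        elapsed = time.time() - start
        consumer.close()
        print('fetch size %d -> %.1f messages per second'
              % (fetch_bytes, read / elapsed))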

Aside from those performance issues the 1.x library is a fundamental
change from the 0.9.x library.  The producers became inherently
asynchronous while Monasca requires synchronous writes.  We should be able
to configure the new producers to operate in a synchronous manner but we
have not been able to verify that.  The older synchronous producers have
been deprecated and based on bugs filed seem to have degraded reliability.
 The consumers attempt to balance themselves inherently now if you’re
using a Kafka version > 0.9 but we have not been able to verify that the
new consumer code functions with the consumer balancing methods that are
required with Kafka versions < 0.9.

We do want to upgrade once things stabilize, especially since we want to
take advantage of Kafka Streaming in the new 0.10 version of Kafka that
the 0.9.5 library doesn’t support, but because we’re still running into
performance and stability problems we haven’t been able to verify that the
new versions can support the Monasca use case.

[1] https://review.openstack.org/#/c/334625/


On 6/23/16, 9:37 PM, "Tony Breeds" <t...@bakeyournoodle.com> wrote:

>On Wed, Jun 22, 2016 at 05:34:27AM +, Keen, Joe wrote:
>> Davanum,
>>   We started work on getting Monasca into the global requirements with
>>two
>> reviews [1] [2] that add gate jobs and check requirements jobs for the
>> Monasca repositories.  Some repositories are being adapted to use
>>versions
>> of libraries that OpenStack currently accepts [3] and we're looking at
>>the
>> libraries we use that are not currently part of OpenStack and seeing if
>> they're worth trying to add to the global requirements.  We're hoping to
>> be able to start adding the global requirements reviews within a week or
>> two.
>> 
>> We definitely want to talk with the oslo.messaging team and explain the
>> ways we use Kafka and what effects the move to the 1.x versions of the
>> library has on us.  I've attempted to contact the oslo.messaging team in
>> the oslo IRC channel to see if we can talk about this at a weekly
>>meeting
>> but I wasn't able to connect with anyone.  Would you prefer that
>> conversation happen on the mailing list here or could we add that topic
>>to
>> the next weekly meeting?
>> 
>> [1] https://review.openstack.org/#/c/316293/
>> [2] https://review.openstack.org/#/c/323567/
>
>These 2 are merged.
>
>> [3] https://review.openstack.org/#/c/323598/
>
>Taking a tangent here:
>
>In 2014[1] we added a cap to psutil because 2.x wasn't compatible with 1.x,
>which is fine, but 2 years later we have 4.3.0 and because of the cap I'm
>guessing we've done very little to work towards 4.3.0.
>
>I've added an item for the requirements team to look at what's involved in
>raising the minimum for psutil, but:
>Requirement: psutil<2.0.0,>=1.1.1 (used by 41 projects)
>it won't happen soon.
>
>Is psutil the last of the "old" libraries you need to deal with?
>
>Getting back t

Re: [openstack-dev] [oslo][monasca][requirements] Kafka 1.x for Oslo.Messaging and Monasca

2016-06-21 Thread Keen, Joe
Davanum,
  We started work on getting Monasca into the global requirements with two
reviews [1] [2] that add gate jobs and check requirements jobs for the
Monasca repositories.  Some repositories are being adapted to use versions
of libraries that OpenStack currently accepts [3] and we're looking at the
libraries we use that are not currently part of OpenStack and seeing if
they're worth trying to add to the global requirements.  We're hoping to
be able to start adding the global requirements reviews within a week or
two.

We definitely want to talk with the oslo.messaging team and explain the
ways we use Kafka and what effects the move to the 1.x versions of the
library has on us.  I've attempted to contact the oslo.messaging team in
the oslo IRC channel to see if we can talk about this at a weekly meeting
but I wasn't able to connect with anyone.  Would you prefer that
conversation happen on the mailing list here or could we add that topic to
the next weekly meeting?

[1] https://review.openstack.org/#/c/316293/
[2] https://review.openstack.org/#/c/323567/
[3] https://review.openstack.org/#/c/323598/

On 6/21/16, 6:53 AM, "Davanum Srinivas"  wrote:

>Roland Hochmuth, Joe Keen,
>
>The oslo.messaging folks have been trying off and on again [1] to get
>to python-kafka 1.x. Right now they are waiting on the Monasca team as
>we reverted python-kafka 1.x in [2] for the Monasca Team.
>
>I still don't see a review for adding Monasca to projects.txt which is
>concerning. Is there work being done to get to that point?
>
>The oslo.messaging team will file a review to update to 1.x again when
>they are ready, when that happens, we will end up breaking Monasca CI
>jobs again. Unless of course Monasca is in projects.txt by that time
>and adhering to requirements.
>
>If both oslo.messaging and Monasca are under requirements, we still
>need a way to move forward together. So please treat this email thread
>as an opportunity to talk to each other :) which ideally should have
>started when we did [2].
>
>Thanks,
>Dims
>
>[1] 
>https://review.openstack.org/#/q/kafka+project:openstack/oslo.messaging+NO
>T+is:merged+NOT+is:abandoned
>[2] https://review.openstack.org/#/c/316259/
>
>-- 
>Davanum Srinivas :: https://twitter.com/dims
>

