On 11/05/16 12:09 +0000, Hayes, Graham wrote:
On 10/05/2016 23:28, Gregory Haynes wrote:
On Tue, May 10, 2016, at 11:10 AM, Hayes, Graham wrote:
On 10/05/2016 01:01, Gregory Haynes wrote:

On Mon, May 9, 2016, at 03:54 PM, John Dickinson wrote:
On 9 May 2016, at 13:16, Gregory Haynes wrote:

This is a bit of an aside, but I am sure others are wondering the same
thing - is there some info (specs/etherpad/ML thread/etc.) that has more
details on the bottleneck you're running into? Given that the only
clients of your service are the public-facing DNS servers, I am now even
more surprised that you're hitting a Python-inherent bottleneck.

In Swift's case, the summary is that it's hard[0] to write a network
service in Python that shuffles data between the network and a block
device (hard drive) and effectively utilizes all of the hardware
available. So far, we've done very well by fork()'ing child processes,
using cooperative concurrency via eventlet, and basic "write more
efficient code" optimizations. However, when it comes down to it,
managing all of the async operations across many cores and many drives
is really hard, and there just isn't a good, efficient interface for
that in Python.

This is a pretty big difference from hitting an unsolvable performance
issue in the language - instead it is a case of language preference,
which is fine. I don't really want to fall into the language-comparison
trap, but I think more detailed reasoning for why it is preferable over
Python in the specific use cases we have hit is good info to include /
discuss in the document you're drafting :). Essentially it's a matter of
weighing the costs (which lots of people have hit on, so I won't)
against the potential benefits, and unless the benefits are made very
clear (especially if those benefits are technical) it's pretty hard to
evaluate IMO.

There seemed to be an assumption in some of the Designate rewrite posts
that there is some language-inherent performance issue causing a
bottleneck. If this does actually exist then that is a good reason for
rewriting in another language, and is something that would be very
useful to clearly document as a case where we support this type of
thing. I am highly skeptical that this is the case, but I am trying hard
to keep an open mind...

The way this component works makes it quite difficult to make any major
improvement.

OK, I'll bite.

I had a look at the code and there's a *ton* of low-hanging fruit. I
decided to hack in some fixes (or emulations of fixes) to see whether I
could get any major improvements. For each test I ran 4 workers using
SO_REUSEPORT, timed 1k AXFRs with 4 in parallel at a time, and recorded
5 timings. I also applied these changes on top of one another in the
order they follow.
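
(For illustration only - a minimal sketch of how multiple worker
processes can share one DNS port via SO_REUSEPORT, not the actual test
harness; the port number and the echo "handler" are made up.)

    import os
    import socket

    def serve(port=5354):
        # Each worker binds the same UDP port; with SO_REUSEPORT the kernel
        # load-balances incoming DNS packets across the workers.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        sock.bind(("0.0.0.0", port))
        while True:
            data, addr = sock.recvfrom(65535)
            # ... parse the DNS query and build a real response here ...
            sock.sendto(data, addr)

    if __name__ == "__main__":
        for _ in range(3):          # parent + 3 children = 4 workers
            if os.fork() == 0:
                break
        serve()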

Thanks for the analysis - any suggestions about how we can improve the
current design are more than welcome.

For this test, was it a single static zone? What size was it?


Base timings: [9.223, 9.030, 8.942, 8.657, 9.190]

Stop spawning a thread per request - there are a lot of ways to do this
better, but let's not even mess with that and just literally move the
thread spawning out of the per-request path, because it's a silly idea
there: [8.579, 8.732, 8.217, 8.522, 8.214] (almost a 10% improvement).
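
(A sketch of what that kind of change looks like - hypothetical names,
not the actual Designate code: dispatch to a pool created once at
startup, or just handle the request inline if it never blocks.)

    from concurrent.futures import ThreadPoolExecutor

    pool = ThreadPoolExecutor(max_workers=32)   # created once, at startup

    def handle(data, addr, sock):
        # placeholder for the real DNS request handler
        sock.sendto(data, addr)

    def on_packet(data, addr, sock):
        # Before: threading.Thread(target=handle, args=(data, addr, sock)).start()
        # After: reuse pooled workers instead of paying thread-creation cost per packet.
        pool.submit(handle, data, addr, sock)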

Stop instantiating an oslo.config object per request - this should be a
no-brainer; we don't need to parse config inside of a request handler:
[8.544, 8.191, 8.318, 8.086] (a few more percent).
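
(Roughly what the fix looks like, assuming the standard oslo.config
pattern; the option name is hypothetical.)

    from oslo_config import cfg

    CONF = cfg.CONF
    CONF.register_opts([cfg.IntOpt('default_ttl', default=3600)])  # hypothetical option

    # Parse the configuration exactly once, at service startup
    # (a real service would pass sys.argv[1:] and its config files here).
    CONF([], project='designate')

    def handle_query(request):
        ttl = CONF.default_ttl   # cheap attribute lookup in the handler, no re-parsing
        ...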

Now, the slightly less low-hanging fruit - there are 3 round trips to
the database on *every request*. This is where the vast majority of
request time is spent (not in Python). I didn't actually implement a
full-on cache (I just hacked around the DB queries), but this should be
trivial to do since Designate does know when to invalidate the cache
data. Some numbers on how much a warm cache will help:

Caching zone: [5.968, 5.942, 5.936, 5.797, 5.911]

Caching records: [3.450, 3.357, 3.364, 3.459, 3.352].

I would also expect real-world usage to be similar, in that you should
only get 1 cache miss per worker per NOTIFY, and then all the other
public DNS servers would be getting cache hits. You could also remove
the cost of that 1 cache miss by pre-loading data into the cache.
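
(A minimal sketch of the kind of cache meant here - hypothetical
interface, not actual Designate code: keep the loaded zone data in
process memory and drop it explicitly when Designate knows the zone
changed.)

    _zone_cache = {}

    def get_zone(zone_name, load_from_db):
        try:
            return _zone_cache[zone_name]
        except KeyError:
            data = load_from_db(zone_name)   # the DB round trips happen only on a miss
            _zone_cache[zone_name] = data
            return data

    def invalidate_zone(zone_name):
        # called from the code path that already knows the zone was updated
        _zone_cache.pop(zone_name, None)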

I would actually expect real-world use of this to result in most of the
servers getting a cache miss.

We shuffle the order of the miniDNS servers sent out to the user-facing
DNS servers, so I would expect them to hit different miniDNS servers at
nearly the same time, and each of them would try to generate the cache
entry.

For pre-loading - this could work, but I *really* don't like relying on
a cache for one of the critical-path components.


All said and done, I think that's almost a 3x speed increase with
minimal effort. So, can we stop saying that this has anything to do with
Python as a language, when it has everything to do with the algorithms
being used?

As I have said before - for us, the return on time spent, in terms of
performance improvement, is just much higher (for our dev team at least)
with Go.

The problem I see here is that you're considering the time required for
your team to improve this situation without considering the impact this
choice has on the community and the time other members of the community
will need to spend on it. I'm not saying you're doing this in bad faith;
what I'm saying is that it's our job to take care of the
non-project-specific aspects of these proposals.

The way I read your paragraph above is that it's easier to write
good/bad code in Go that seems to be faster (better?) than the existing
Python code than to actually try to fix the Python code, which we're
already familiar with.

While I don't think the above is entirely wrong, it does strike me as a
surprise. As Thierry said in another reply, I'm sure the Swift team has
spent quite some time improving the service's performance. Why stop
now?

Flavio


We saw a 50x improvement for small SOA queries, and a ~10x improvement
for a 2000-record AXFR (without caching). The majority of your
improvement came from caching, so I would imagine that would speed up
the Go implementation as well.


MiniDNS (the component) takes data and sends a zone transfer every time
a recordset gets updated. That is a full (AXFR) zone transfer, so every
record in the zone gets sent to each of the DNS servers that end users
can hit.

This can be quite a large number - ns[1-6].example.com. may well be
tens or hundreds of servers behind anycast IPs and load balancers.
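
(For anyone not familiar with AXFR, a minimal dnspython example of what
one of those user-facing servers does when it pulls the zone; the
address, port, and zone name are made up.)

    import dns.query
    import dns.zone

    # Pull every record in the zone from one of the miniDNS endpoints.
    xfr = dns.query.xfr('192.0.2.1', 'example.com', port=5354)
    zone = dns.zone.from_xfr(xfr)
    for name, node in zone.nodes.items():
        print(name, node.to_text(name))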


This design sounds like a *perfect* contender for caching. If you're
designing this properly, it's purely a question of how quickly you can
shove memory over the wire, and as a result your choice of language will
have almost no effect - it'll be an entirely I/O-bound problem.
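
(To make that point concrete, a hypothetical sketch - not anything in
Designate today: if the AXFR response is rendered to wire format once
and cached as bytes, serving a transfer is essentially just socket
writes.)

    _wire_cache = {}

    def send_axfr(sock, zone_name, render_wire):
        payload = _wire_cache.get(zone_name)
        if payload is None:
            payload = render_wire(zone_name)   # expensive: build the messages once
            _wire_cache[zone_name] = payload
        sock.sendall(payload)                  # cheap: the kernel copies cached bytes out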

In many cases, internal zones (or even external zones) can be quite
large - I have seen zones that are 200-300 MB. If a zone is high-traffic
(say, cloud.example.com., where a record is added / removed for each
boot / destroy, or the reverse DNS zones for a cloud), there can be a
lot of data sent out from this component.

Great! It's even more I/O bound, then.


We are a small development team, and after looking at our options and
judging the amount of developer hours we had available, a different
language was the route we decided on. I was going to implement a few
POCs and see what was most suitable.

This is what especially surprises me. The problems going on here are
purely algorithmic, and the thinking is that rather than solve those
issues, the small amount of development time needs to be spent on a
reimplementation which is also going to have costs for the wider
community due to the language choice.


Golang was then being proposed as a new "blessed" language, and as it
was a language that we had a pre-existing POC in, we decided to keep it
within the list of potential new languages.

As I said before, we did not just randomly decide this. We have been
talking about it for a while; at this summit we dedicated an entire
session to it and decided to do it.

Cheers,
Greg
