On 25 February 2015 at 02:35, Robert Collins <robe...@robertcollins.net> wrote:

> On 24 February 2015 at 01:07, Salvatore Orlando <sorla...@nicira.com>
> wrote:
> > Lazy-Stacker summary:
> ...
> > In the medium term, there are a few things we might consider for
> > Neutron's "built-in IPAM".
> > 1) Move the allocation logic out of the driver, thus making IPAM an
> > independent service. The API workers will then communicate with the IPAM
> > service through a message bus, where IP allocation requests will be
> > "naturally serialized"
> > 2) Use third-party software such as dogpile, zookeeper or even memcached
> > to implement distributed coordination. I have nothing against it, and I
> > reckon Neutron can only benefit from it (in case you're considering
> > arguing that "it does not scale", please also provide solid arguments to
> > support your claim!). Nevertheless, I do believe API request processing
> > should proceed undisturbed as much as possible. If processing an API
> > request requires distributed coordination among several components then
> > it probably means that an asynchronous paradigm is more suitable for
> > that API request.
> So data is great. It sounds like as long as we have an appropriate
> retry decorator in place, that write locks are better here, at least
> for up to 30 threads. But can we trust the data?

Not unless you can prove that the process used to obtain them is correct.
Otherwise we'd still think the sun revolves around the earth.
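
For reference, a minimal sketch of the kind of retry decorator mentioned in
the quoted text. It is not the code used in the experiments: the exception
class DBDeadlock, the decorator name and its parameters are all hypothetical
stand-ins for whatever the database layer actually raises on a write conflict.

```python
import functools
import random
import time


class DBDeadlock(Exception):
    """Hypothetical stand-in for the error raised on a write conflict."""


def retry_on_conflict(max_retries=10, base_delay=0.01):
    """Re-run the wrapped function when it raises DBDeadlock."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    if attempt == max_retries:
                        # Out of retries: let the caller see the failure.
                        raise
                    # Randomised, growing back-off so that colliding
                    # workers do not retry in lockstep.
                    time.sleep(random.uniform(0, base_delay * (attempt + 1)))
        return wrapper
    return decorator
```

Each attempt is expected to run in its own transaction, so the decorated
function should open and commit (or roll back) the transaction itself.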

> One thing I'm not clear on is the SQL statement count.  You say 100
> queries for A-1 with a time on Galera of 0.06*1.2=0.072 seconds per
> allocation ? So is that 2 queries over 50 allocations over 20 threads?

So the query number reported in the thread is for a single-node test. The
numbers for the galera tests are on github, and if you have a galera
environment you can try and run the experiment there too.
The algorithm should indeed perform a single select query for each IP
allocation, so the number does appear to be far too high. It comes from
sqlalchemy hooks, so I guess it's reliable. It's worth noting that I reported
the count for all queries, including those for setting up the environment
and verifying the algorithm's successful completion, so those should be
removed. I can easily enable debug logging and provide a detailed breakdown
of db operations for every algorithm.
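
A rough sketch of how such a breakdown could separate setup and verification
queries from the allocation queries themselves. It uses the standard library
sqlite3 trace callback as a stand-in for the sqlalchemy hooks used in the
experiments; the table layout, the phase names and the QueryCounter class are
assumptions made only for this illustration.

```python
import collections
import sqlite3


class QueryCounter:
    """Count SQL statements per phase, keyed by their first keyword."""

    def __init__(self):
        self.phase = "setup"
        self.counts = collections.Counter()

    def __call__(self, statement):
        parts = statement.split(None, 1)
        verb = parts[0].upper() if parts else ""
        self.counts[(self.phase, verb)] += 1


def run_experiment(n_allocations=3):
    counter = QueryCounter()
    # isolation_level=None keeps sqlite3 from issuing implicit BEGINs,
    # so only the statements written below are traced.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.set_trace_callback(counter)

    # Phase 1: environment setup (hypothetical pool of ten addresses).
    conn.execute(
        "CREATE TABLE ip_pool (ip TEXT PRIMARY KEY, allocated INTEGER)")
    for i in range(1, 11):
        conn.execute("INSERT INTO ip_pool VALUES (?, 0)", ("10.0.0.%d" % i,))

    # Phase 2: the allocations whose cost we actually want to measure.
    counter.phase = "allocation"
    for _ in range(n_allocations):
        row = conn.execute(
            "SELECT ip FROM ip_pool WHERE allocated = 0 ORDER BY ip LIMIT 1"
        ).fetchone()
        conn.execute("UPDATE ip_pool SET allocated = 1 WHERE ip = ?", row)

    # Phase 3: verification of successful completion.
    counter.phase = "verification"
    conn.execute("SELECT COUNT(*) FROM ip_pool WHERE allocated = 1").fetchone()
    conn.close()
    return counter.counts
```

Reporting only the ("allocation", ...) entries would give the per-allocation
query count without the setup and verification overhead.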

> I'm not clear on what the request parameter in the test json files
> does, and AFAICT your threads each do one request each. As such I
> suspect that you may be seeing less concurrency - and thus contention
> - than real-world setups where APIs are deployed to run worker
> processes in separate processes and requests are coming in
> willy-nilly. The size of each algorithm's workload is so small that it's
> feasible to imagine the thread completing before the GIL bytecount
> code trigger (see
> https://docs.python.org/2/library/sys.html#sys.setcheckinterval) and
> the GIL's lack of fairness would exacerbate that.

I have a retry counter which testifies that contention is actually occurring.
Indeed, algorithms which do sequential allocation see a lot of contention,
so I do not think that I'm just fooling myself and that the tests are
actually running serially!
Anyway, the multiprocess suggestion is very valid and I will repeat the
experiments (I'm afraid that won't happen before Friday), because I did not
consider the GIL aspect you mention: I dumbly expected that python would
simply spawn a different pthread for each thread and let the OS do the
scheduling.
> If I may suggest:
>  - use multiprocessing or some other worker-pool approach rather than
> threads
>  - or set setcheckinterval down low (e.g. to 20 or something)
>  - do multiple units of work (in separate transactions) within each
> worker, aim for e.g. 10 seconds of work or some such.

This last suggestion also makes sense.
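
A minimal sketch of how the suggestions quoted above might be combined:
separate processes instead of threads, with several units of work per worker.
The function names and the allocate_one placeholder are hypothetical; the
real experiment would run one IP allocation, in its own transaction, per
unit. Note that sys.setcheckinterval only exists in Python 2 and early
Python 3 releases (it was removed in 3.9); sys.setswitchinterval is its
closest replacement.

```python
import multiprocessing
import os
import time


def allocate_one(worker_id, unit):
    """Hypothetical placeholder for one allocation in its own transaction."""
    return "worker-%d-unit-%d" % (worker_id, unit)


def run_worker(args):
    """Run several units of work and record (pid, start, end) for each."""
    worker_id, units = args
    records = []
    for unit in range(units):
        start = time.time()
        allocate_one(worker_id, unit)
        records.append((os.getpid(), start, time.time()))
    return records


def run_pool(n_workers=4, units_per_worker=50):
    """Run the jobs in a pool of processes rather than in threads.

    Separate processes do not share a GIL; the pool itself decides which
    process picks up which job.
    """
    jobs = [(worker_id, units_per_worker) for worker_id in range(n_workers)]
    with multiprocessing.Pool(processes=n_workers) as pool:
        per_worker = pool.map(run_worker, jobs)
    # Flatten the per-worker lists into a single list of timing records.
    return [record for records in per_worker for record in records]
```

run_pool() should be called from under an `if __name__ == "__main__":` guard,
which multiprocessing requires on platforms that spawn rather than fork.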

>  - log with enough detail that we can report on the actual concurrency
> achieved. E.g. log the time in us when each transaction starts and
> finishes, then we can assess how many concurrent requests were
> actually running.

I put simple output on github only, but full debug logging can be achieved
by simply changing a constant.
However, I'm collecting the number of retries for each thread as an
indirect marker of concurrency level.
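
As a more direct measure, the start and finish timestamps suggested in the
quoted text could be turned into concurrency figures along these lines. This
is only a sketch, assuming each transaction is logged as a (start, end) pair
in a common time unit such as microseconds; the function names are made up
for the example.

```python
def max_concurrency(intervals):
    """Return the peak number of overlapping (start, end) intervals."""
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end, -1))
    # At equal timestamps, process ends (-1) before starts (+1) so that
    # back-to-back transactions are not counted as overlapping.
    events.sort(key=lambda event: (event[0], event[1]))
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


def mean_concurrency(intervals):
    """Average number of transactions in flight over the whole run."""
    if not intervals:
        return 0.0
    busy = sum(end - start for start, end in intervals)
    wall = (max(end for _, end in intervals)
            - min(start for start, _ in intervals))
    return busy / wall if wall > 0 else 0.0
```

A mean close to 1.0 would indicate that the workers effectively ran one
after another, whatever the nominal thread or process count was.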

> If the results are still the same - great, full steam ahead. If not,
> well let's revisit :)

Obviously. We're not religious here. We'll simply do what the data suggest
as the best way forward.

> -Rob
> --
> Robert Collins <rbtcoll...@hp.com>
> Distinguished Technologist
> HP Converged Cloud
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev