On 05/23/2017 03:16 PM, Edward Leafe wrote:
> On May 23, 2017, at 1:43 PM, Jay Pipes <jaypi...@gmail.com> wrote:
>> Witness the join constructs in Golang in Kubernetes as they work around
>> etcd not being a relational data store:
>
> Maybe it’s just me, but I found that Go code more understandable than some of
> the SQL we are using in the placement engine. :)
>
> I assume that the SQL in a relational engine is faster than the same thing in
> code, but is that difference significant? For extremely large data sets I think
> that the database processing may be rate limiting, but is that the case here?
> Sometimes it seems that we are overly obsessed with optimizing data handling
> when the amount of data is relatively small. A few million records should be
> fast enough using just about anything.
When you write your app fresh and put some data into it, a few hundred
rows, the difference doesn't matter at all. Pull it all into memory and
sort/filter all you want; SQL is too hard anyway. Push it to production!
Works great. Send the customer your bill.
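The "pull it all into memory" habit versus pushing the filter into SQL can be sketched with the stdlib sqlite3 module. The `orders` table and its columns here are made up for illustration, not from the thread; both approaches return the same rows, but only one scales with the size of the answer rather than the size of the table.

```python
import sqlite3

# Hypothetical table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, "open" if i % 10 == 0 else "closed") for i in range(1000)],
)

# The "works great at a few hundred rows" approach: fetch everything,
# then filter in Python.  Memory and transfer cost grow with the whole table.
all_rows = conn.execute("SELECT id, status FROM orders").fetchall()
open_in_python = [r for r in all_rows if r[1] == "open"]

# Pushing the filter into SQL: only the matching rows come back.
open_in_sql = conn.execute(
    "SELECT id, status FROM orders WHERE status = 'open'"
).fetchall()

# Same answer, very different scaling behavior.
assert open_in_python == open_in_sql
```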
6 months later: the customer has 10K rows. The tools their contractor
wrote seem a little sticky. Not sure when that happened.
A year later: the customer is at 300K rows, nowhere near "a few million"
records, and the application regularly crashes when asked to search and
filter results. The Python interpreter uses a fair amount of memory for a
result set; multiply the per-row overhead of a Python object() / dict()
across the set and you get hundreds or thousands of megs of memory to
hold 300,000 objects at once. Multiply again by dozens of threads /
processes handling concurrent requests, and remember that the Python
interpreter rarely returns memory to the OS. Then add the latency of
fetching 300K rows over the wire and converting them to objects.
Concurrent requests pile up because they're slower, which means more
processes, which means more memory.
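The per-row arithmetic above can be roughed out with sys.getsizeof. The four-column row below is invented for illustration, and getsizeof only counts the dict's own structure plus the keys and values we add up here, so this *understates* the real footprint of a full ORM object; exact numbers vary by Python version and platform.

```python
import sys

# Hypothetical 4-column result row, represented as a plain dict.
row = {"id": 1, "name": "widget", "status": "active", "owner": "acme"}

# getsizeof on the dict alone excludes its contents, so add the keys
# and values explicitly.  This is still a lower bound on the true cost.
per_row = sys.getsizeof(row) + sum(
    sys.getsizeof(k) + sys.getsizeof(v) for k, v in row.items()
)

rows = 300_000
total_mb = per_row * rows / 1024 / 1024
print(f"~{per_row} bytes/row -> ~{total_mb:.0f} MB for {rows:,} rows")
```

Even this lower bound lands in the hundreds of megabytes for one result set, before multiplying by concurrent worker processes.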
A new contractor is called in to rewrite the whole thing in MongoDB. Now
it's fast again! Proceed to chapter 2, "So you decided to use …"