On Monday, August 01, 2016 08:43:49 AM james wrote:
> On 08/01/2016 02:16 AM, J. Roeleveld wrote:
> > On Saturday, July 30, 2016 06:38:01 AM Rich Freeman wrote:
> >> On Sat, Jul 30, 2016 at 6:24 AM, Alan McKinnon <alan.mckin...@gmail.com>
> > 
> > wrote:
> >>> On 29/07/2016 22:58, Mick wrote:
> >>>> Interesting article explaining why Uber are moving away from
> >>>> PostgreSQL.
> >>>> I am
> >>>> running both DBs on different desktop PCs for akonadi and I'm also
> >>>> running
> >>>> MySQL on a number of websites.  Let's see which one goes sideways first.
> >>>> :p
> >>>> 
> >>>>  https://eng.uber.com/mysql-migration/
> >>> 
> >>> I don't think your akonadi and some web sites compares in any way to
> >>> Uber
> >>> and what they do.
> >>> 
> >>> FWIW, my Dev colleagues support an entire large corporate ISP's
> >>> operational and customer data on PostgreSQL-9.3. With clustering. With
> >>> no
> >>> db-related issues :-)
> >> 
> >> Agree, you'd need to be fairly large-scale to have their issues,
> > 
> > And also have had your database designed by people who think MySQL
> > actually follows common SQL standards.
> > 
> >> but I
> >> think the article was something anybody interested in databases should
> >> read.  If nothing else it is a really easy to follow explanation of
> >> the underlying architectures.
> > 
> > Check the link posted by Douglas.
> > Uber's article has some misunderstandings about the architecture, with
> > conclusions drawn that are, at least in part, caused by their own database
> > design and usage.
> > 
> >> I'll probably post this to my LUG mailing list.  I think one of the
> >> Postgres devs lurks there so I'm curious to his impressions.
> >> 
> >> I was a bit surprised to hear about the data corruption bug.  I've
> >> always considered Postgres to have a better reputation for data
> >> integrity.
> > 
> > They do.
> > 
> >> And of course almost any FOSS project could have a bug.  I
> >> don't know if either project does the kind of regression testing to
> >> reliably detect this sort of issue.
> > 
> > Not sure either, but I do think PostgreSQL does a lot of regression testing.
> > 
> >> I'd think that it is more likely
> >> that the likes of Oracle would (for their flagship DB (not for MySQL),
> > 
> > Never worked with Oracle (or other big software vendors), have you? :)
> > 
> >> and they'd probably be more likely to send out an engineer to beg
> >> forgiveness while they fix your database).
> > 
> > Only if you're a big (as in, spend a lot of money with them) customer.
> > 
> >> Of course, if you're Uber
> >> the hit you'd take from downtime/etc isn't made up for entirely by
> >> having somebody take a few days to get everything fixed.
> > 
> > --
> > Joost
> 
> I certainly respect your skills and posts on databases, Joost, as
> everything you have posted in the past is 'spot on'.

Comes with a keen interest and long-term (think decades) experience of working 
with different databases.

> Granted, I'm no database expert, far from it.

Not many people are, nor do they need to be.

> But I want to share a few things with you,
> and hope you  (and others) will 'chime in' on these comments.
> 
> Way back, when the earth was cooling and we all had dinosaurs for pets,
> some of us hacked on AT&T "3B2" unix systems. They were known for their
> 'roll back and recovery', triplicated (or more) transaction processes
> and 'voter' systems to ferret out whether a transaction was complete and
> correct. There was no ACID, the current 'gold standard' if you believe
> what Douglas and others write about concerning databases.
> 
> In essence (from crusted-up memories), a basic (SS7) transaction related
> to the local telephone switch was run on 3 machines. The results were
> compared. If they matched, the transaction went forward as valid. If 2/3
> matched,

And what about the likely case where only 1 was correct?
Have you seen the movie "Minority Report"?
If yes, think back to why Tom Cruise was found 'guilty' when he wasn't and how 
often this actually occurred.

> and the switch was configured, then the code would
> essentially 'vote' and majority ruled. This is what led to phone calls
> (switched phone calls) having variable delays, often on the order of
> seconds, mis-connections and other problems we all encountered during
> periods of excessive demand.

Not sure if that was the cause in the past, but these days it can also still 
take a few seconds before the other end rings. This is due to the phone system 
(all PBXs in the path) needing to set up the routing between both end-points 
prior to the ring-tone actually starting.
When the system is busy, these lookups will take time and can even time out. 
(Try wishing everyone you know a happy new year using a wired phone and you'll 
see what I mean. Mobile phones have a separate problem at that time.)

> That scenario was at the heart of how old, crappy AT&T unix (SVR?) could
> perform so well and therefore established the gold standard for RT
> transaction processing, aka the "five 9s": 99.999% uptime (about 5
> minutes per year of downtime).

"Unscheduled" downtime. Regular maintenance will require more than 5 minutes 
per year.
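
For reference, the back-of-envelope arithmetic behind that figure:

    365.25 days * 24 * 60 minutes * (1 - 0.99999) ~ 5.26 minutes of
    allowed downtime per year.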

> Sure this part is only related to
> transaction processing as there was much more to the "five 9s" legacy,
> but imho, that is the heart of what was the precursor to the ACID properties
> now so greatly espoused in the SQL world that Douglas refers to.
> 
> Do folks concur or disagree at this point?

ACID is about data integrity. The "best 2 out of 3" voting was, in my opinion, 
a work-around for unreliable hardware. It is based on a clever idea, but when 
2 computers with the same data and logic come up with 2 different answers, I 
wouldn't trust either of them.
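
To make the voting idea concrete: it boils down to majority agreement over 
redundant runs, something like this toy Python sketch (made-up names, nothing 
like the actual 3B2/SS7 implementation):

    from collections import Counter

    def vote(results):
        """Return the majority result, or None if there is no clear majority."""
        value, count = Counter(results).most_common(1)[0]
        return value if count > len(results) / 2 else None

    def run_replicated(transaction, replicas=3):
        """Run the same transaction on several 'machines' and vote on the outcome."""
        return vote([transaction() for _ in range(replicas)])

The catch is exactly the one above: if two of the three runs share the same 
bug, the "majority" will happily agree on a wrong answer.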

> The reason this is important to me (and others?) is that, if this idea
> (granted there is much more detail to it) is still valid, then it can
> form the basis for building up superior-ACID processes that meet or
> exceed the properties of an expensive (think Oracle) transaction
> process on distributed (parallel) or clustered systems, to a degree of
> accuracy limited only by the number of odd-numbered voter
> codes involved in the distributed and replicated parts of the
> transaction. I even added some code where replicated routines were
> written in different languages, and the results compared to add an
> additional layer of verification before the voter step. (gotta love
> assembler?)

You have seen how "democracies" work, right? :)
The more voters involved, the longer it takes for all the votes to be counted.
With a small number, it might actually still scale, but when you pass a magic 
number (no clue what this would be), the counting time starts to exceed any 
time you might have gained by adding more voters.

Also, this, to me, seems to counteract the whole reason for using clusters: 
Have different nodes handle a different part of the problem.

Clusters of multiple compute-nodes are a quick and "simple" way of increasing 
the number of computational cores to throw at problems that can be broken down 
into a lot of individual steps with minimal inter-dependencies.
I say "simple" because I think designing a 1,000-core chip is more difficult 
than building a 1,000-node cluster using single-core, single-CPU boxes.
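
That "many independent steps" shape is why something as simple as a parallel 
map already gets you most of the benefit; a minimal single-machine stand-in 
using Python's multiprocessing (a real cluster would use a job scheduler or 
MPI, but the idea is the same):

    from multiprocessing import Pool

    def crunch(chunk):
        # Stand-in for a unit of work with no dependencies on other chunks.
        return sum(x * x for x in chunk)

    if __name__ == "__main__":
        chunks = [range(i * 1000, (i + 1) * 1000) for i in range(100)]
        with Pool() as pool:              # one worker per core by default
            results = pool.map(crunch, chunks)
        print(sum(results))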

I would still consider the cluster to be a single "machine".

> I guess my point is 'Douglas' is full of stuffing, OR that is what folks
> are doing when they 'roll their own solution specifically customized to
> their specific needs' as he alludes to near the end of his commentary?

The response Douglas linked to is closer to what seems to work when dealing 
with large amounts of data.

> (I'd like your opinion of this and maybe some links to current schemes
> for how to have ACID/99.999% accurate transactions on clusters of various
> architectures.)  Douglas, like yourself, writes of these things in a
> very lucid fashion, so that is why I'm asking you for your thoughts.

The way Uber created their cluster is useful when having 1 node handle all the 
updates and multiple nodes provide read-only access while also providing 
failover functionality.
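
In practice that usually comes down to some form of read/write splitting in 
front of the database; a very rough Python sketch of the idea (the connection 
objects are placeholders, not Uber's actual setup or any particular driver's 
API):

    import random

    class ReadWriteSplitter:
        """Toy read/write splitter: writes go to the primary, reads to replicas."""

        def __init__(self, primary, replicas):
            self.primary = primary        # placeholder connection-like objects
            self.replicas = replicas

        def execute(self, sql, params=()):
            # Anything that modifies data must go to the single writable node.
            if sql.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE")):
                return self.primary.execute(sql, params)
            # Reads are spread over the replicas; fall back to the primary on failure.
            replica = random.choice(self.replicas) if self.replicas else self.primary
            try:
                return replica.execute(sql, params)
            except Exception:
                return self.primary.execute(sql, params)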

> Robustness of transactions in a distributed (clustered) environment is
> fundamental to the usefulness of most codes that are trying to migrate
> to cluster-based processes in (VM/container/HPC) environments.

While I do consider clusters to be very useful, not all workloads can be 
redesigned to scale properly.

> I do
> not have the old articles handy, but I'm sure that many/most of those
> types of inherent processes can be formulated in the algebraic domain,
> normalized and used to solve decisions, often where other forms of
> advanced logic failed (not that I'm taking a cheap shot at modern
> programming languages) (wink wink nudge nudge); or at least that's how
> we did it.... as young whippersnappers back in the day...

If you know what you are doing, the language is just a tool. Sometimes a 
hammer is sufficient, other times one might need to use a screwdriver.

> --an_old_farts_logic

Thinking back on how long I've been playing with computers, I wonder how long 
it will be until I am in the "old fart" category?

--
Joost
