Re: [Pulp-dev] Transition from Mongo to Postgre

Brian Bouterse Tue, 13 Sep 2016 08:02:55 -0700

In addition to what @mhrivnak said, for me the big motivation is transaction support. A single Pulp sync or publish can issue thousands of writes to the database. A failure in the middle leaves the database "half-updated", and Pulp has no feasible way to roll back these changes. This creates a major problem for data correctness in the face of failures. Transaction support at the database layer will give Pulp an opportunity to recover from these failures and preserve correctness.
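
To make the rollback point concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the `unit` table and `sync_units` function are hypothetical names made up for this illustration, not Pulp code. The idea is only that every write in a "sync" either lands together or not at all.

```python
import sqlite3


def sync_units(conn, names):
    """Write every unit inside one transaction (hypothetical helper)."""
    # Used as a context manager, the connection commits if the block
    # finishes and rolls back if an exception escapes it.
    with conn:
        for name in names:
            if name is None:
                # Simulate a failure partway through a sync.
                raise ValueError("failure in the middle of the sync")
            conn.execute("INSERT INTO unit (name) VALUES (?)", (name,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unit (name TEXT NOT NULL)")
conn.commit()

# Two writes succeed before the failure; the rollback discards both.
try:
    sync_units(conn, ["a", "b", None, "c"])
except ValueError:
    pass
count_after_failure = conn.execute("SELECT COUNT(*) FROM unit").fetchone()[0]

# A sync that completes is committed as a whole.
sync_units(conn, ["a", "b"])
count_after_success = conn.execute("SELECT COUNT(*) FROM unit").fetchone()[0]

print(count_after_failure, count_after_success)
```

The failed run leaves no rows behind, so nothing is "half-updated". With a driver for PostgreSQL the same commit-or-rollback pattern would apply, though the exact API would differ.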

From a high level, Pulp's transition to PostgreSQL is about correctness, not performance. We don't want to give up performance, but performance is a secondary concern behind correctness. Pulp 2.y hasn't done much to have the write and read performance really benefit from "the mongodb way" [0], so in switching I expect to see "similar" performance. We would need to benchmark and quantify the performance of 2.y versus 3.y to really know. We are not planning to do that, so we may never know, but here is a writeup of an outline to track performance [1].


[0]: loosening write/read consistency and deployments that use sharding
[1]: https://etherpad.net/p/pulp_performance_test_plan

-Brian

On 09/13/2016 09:11 AM, Michael Hrivnak wrote:

We have a thread here about a lot of the 3.0 stack choices, although it
seems to skip past the assumption that we're moving to postgres:

https://www.redhat.com/archives/pulp-list/2016-May/msg00042.html

I can't quickly find another summary of why, so I'll describe the
highlights here:

- Pulp has highly relational data. The core use case is managing the
relationships between content and repositories. Using a relational DB
makes that a lot easier.
- A schemaless DB makes it easy to do writes, but you have to be very
careful when doing reads that your software is prepared for whatever
data structure comes out. If you want to enforce a schema, it has to be
done in software. It's doable, but requires great care.
- Transactions!
- The HA story with mongodb is more complex than most people realize
(certainly more complex than we expected). To get real HA with data
safety, you have to do a lot of the work in your own software.
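
As a rough illustration of the first two points, the sketch below lets the database itself hold the content/repository relationship and enforce the schema. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and all table, column, and value names are hypothetical, chosen for this example rather than taken from Pulp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key checks off unless asked; this is a SQLite detail.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE repository (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE content (id INTEGER PRIMARY KEY, filename TEXT NOT NULL);
    CREATE TABLE repository_content (
        repository_id INTEGER NOT NULL REFERENCES repository(id),
        content_id INTEGER NOT NULL REFERENCES content(id),
        PRIMARY KEY (repository_id, content_id)
    );
""")

conn.execute("INSERT INTO repository (id, name) VALUES (1, 'example-repo')")
conn.execute("INSERT INTO content (id, filename) VALUES (10, 'example.rpm')")
conn.execute("INSERT INTO repository_content VALUES (1, 10)")


def try_insert(sql):
    """Return True if the database accepts the write, False if it rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A link to content that does not exist is refused by the foreign key.
dangling_ok = try_insert("INSERT INTO repository_content VALUES (1, 999)")
# A content row without a filename is refused by the NOT NULL constraint.
missing_ok = try_insert("INSERT INTO content (id, filename) VALUES (11, NULL)")

# Reads can rely on the structure, and the relationship is a plain join.
rows = conn.execute(
    "SELECT r.name, c.filename FROM repository_content rc "
    "JOIN repository r ON r.id = rc.repository_id "
    "JOIN content c ON c.id = rc.content_id"
).fetchall()
print(dangling_ok, missing_ok, rows)
```

Both bad writes are rejected at write time, so code doing reads does not have to defend against dangling links or missing fields.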

MongoDB is great at what it does and a good fit for some use cases, but
we learned that it's not the best fit for Pulp.

Michael

On Tue, Sep 13, 2016 at 3:21 AM, Filip Nguyen <[email protected]> wrote:

    I heard that Pulp is switching from Mongo to Postgre. Just out of
    curiosity, I would like to learn more about the reasons why you
    decided to go this direction. Is there any document/email thread
    about it?

    _______________________________________________
    Pulp-dev mailing list
    [email protected] <mailto:[email protected]>
    https://www.redhat.com/mailman/listinfo/pulp-dev
    <https://www.redhat.com/mailman/listinfo/pulp-dev>



