Re: [SDRuby] Reddit's open DB schema

Eric MacAdie Sat, 06 Oct 2012 22:07:14 -0700

This is the part I found interesting:

*Doing too much work on one box causes a lot of context switching between
jobs. Try to make each database server serve the same kind of database in
the same way. This means all your indices will be cached and they won’t be
paged in and out. Keep everything as similar as possible together. Don’t
use Python threads. They are slow. They put everything in separate multiple
processes. Services like spam, and thumbnails, query caching. It allows you
to put them on different machines easily. You already solved problems of
communicating between process. Once solved it keeps the architecture clean
and it's easier to grow.*
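
To make the "separate processes" idea concrete, here is a minimal sketch in Python, not Reddit's actual code. It runs a toy spam-scoring service as its own process and talks to it with one JSON message per line over pipes. The worker, the `run_jobs` helper, and the scoring rule are all hypothetical names invented for illustration.

```python
import json
import subprocess
import sys

# Hypothetical worker: reads one JSON job per line on stdin and writes one
# JSON result per line on stdout. It stands in for a service such as spam
# scoring or thumbnailing.
WORKER_SOURCE = r"""
import json, sys
for line in sys.stdin:
    job = json.loads(line)
    # Toy "spam score": fraction of letters that are upper-case.
    letters = [c for c in job["text"] if c.isalpha()]
    score = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    print(json.dumps({"id": job["id"], "score": score}), flush=True)
"""


def run_jobs(jobs):
    """Send jobs to a separate worker process and collect its replies."""
    proc = subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    payload = "".join(json.dumps(job) + "\n" for job in jobs)
    # communicate() writes the payload, closes stdin, and reads all output.
    out, _ = proc.communicate(payload)
    return [json.loads(line) for line in out.splitlines()]


if __name__ == "__main__":
    print(run_jobs([{"id": 1, "text": "BUY NOW"}, {"id": 2, "text": "hello"}]))
```

Because the caller only exchanges messages with the worker, swapping the pipe for a socket or a queue would, in principle, let the same worker run on another machine, which is the point the quoted passage makes.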


- Eric MacAdie

On Sat, Oct 6, 2012 at 11:54 PM, Thomaz Leite <[email protected]> wrote:

> Good point, but I think the whole point is that you don't necessarily have
> to do lots of joins. Since you have only two tables, everything is (most of
> the time) two queries away. In my experience, joins only work well in a
> college relational algebra class. I think the drawback here is that you end
> up writing yourself most of the logic that would otherwise live inside the
> database.
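
A rough sketch of what "two queries away" could look like, using Python's sqlite3 module purely for illustration. The table names (`thing`, `data`), the column names, and both helper functions are assumptions, not Reddit's real schema or code.

```python
import sqlite3

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE thing (id INTEGER PRIMARY KEY, kind TEXT NOT NULL);
CREATE TABLE data (
    thing_id INTEGER NOT NULL REFERENCES thing(id),
    name     TEXT NOT NULL,
    value    TEXT,
    PRIMARY KEY (thing_id, name)
);
""")


def create_thing(kind, **attrs):
    """Insert one row into thing and one data row per attribute."""
    cur = conn.execute("INSERT INTO thing (kind) VALUES (?)", (kind,))
    thing_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO data (thing_id, name, value) VALUES (?, ?, ?)",
        [(thing_id, name, str(value)) for name, value in attrs.items()],
    )
    return thing_id


def load_thing(thing_id):
    """Rebuild an object with two queries and no join."""
    # Query 1: the row in the thing table.
    row = conn.execute(
        "SELECT kind FROM thing WHERE id = ?", (thing_id,)
    ).fetchone()
    if row is None:
        return None
    # Query 2: every attribute stored for that thing.
    attrs = conn.execute(
        "SELECT name, value FROM data WHERE thing_id = ?", (thing_id,)
    ).fetchall()
    return {"id": thing_id, "kind": row[0], **dict(attrs)}
```

Note that every value comes back as text, so type conversion and validation fall to the application. That is one instance of the drawback described above: work the database would normally do ends up in your own code.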
>
> On Saturday, October 6, 2012 2:37:04 PM UTC-7, Gisborne wrote:
>
>> On Oct 6, 2012, at 2:34 PM, Thomaz Leite <[email protected]> wrote:
>>
>> Reading this[1] article on High Scalability I found out Reddit has (or
>> had at some point) only two tables in their database. It's an interesting
>> approach to delay decisions about the schema, but I wonder if the drawbacks
>> are worth it. What do you think?
>>
>> [1]: http://highscalability.com/blog/2010/5/17/7-lessons-learned-while-building-reddit-to-270-million-page.html
>>
>>
>> I think I’d be inclined to have at least some of the application using a
>> regular schema. The poor database has to do an awful lot of joins with
>> this. Still, clearly it works at least to some extent, and if you ran the
>> thing out of an SSD, the joins wouldn’t be such an issue.
>>
>> Note that you can get quite a lot of this kind of flexibility in Postgres
>> using Array and HStore field types, which I mentioned in a presentation a
>> couple of months back.
>>
>>  --
> SD Ruby mailing list
> [email protected]
> http://groups.google.com/group/sdruby
>

