Re: [PERFORM] Partitioning / Clustering

2005-05-14 Thread PFC
If you make the assertion that you are transferring equal or less session data between your session server (let's say an RDBMS) and the app server than you are between the app server and the client, an out of band 100Mb network for session information is plenty of bandwidth. So if you count on a m

Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread Josh Berkus
Ross, > Memcached is a PG memory store, I gather, Nope. It's a hyperfast resident-in-memory hash that allows you to stash stuff like user session information and even materialized query set results. Thanks to SeanC, we even have a plugin, pgmemcached. > but...what is squid, lighttpd? > anyt

Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread Alex Stapleton
On 12 May 2005, at 18:33, Josh Berkus wrote: People, In general I think your point is valid. Just remember that it probably also matters how you count page views. Because technically images are a separate page (and this thread did discuss serving up images). So if there are 20 graphics on a sp

Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread Josh Berkus
People, > In general I think your point is valid. Just remember that it probably > also matters how you count page views. Because technically images are a > separate page (and this thread did discuss serving up images). So if > there are 20 graphics on a specific page, that is 20 server hits just

Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread PFC
100 hits a second = 8,640,000 hits a day. I work on a site which does > 100 million dynamic pages a day. In comparison Yahoo probably does > 100,000,000,000 (100 billion) views a day if I am interpreting Alexa's charts correctly. Which is about 1,150,000 a second. Read the help on Alexa
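The rate conversions quoted in this message can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names are made up for illustration and the traffic figures are the ones quoted in the message, not independent measurements.

```python
# Convert between per-second and per-day request rates.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def per_day(rate_per_second):
    """Requests per day for a steady per-second rate."""
    return rate_per_second * SECONDS_PER_DAY


def per_second(rate_per_day):
    """Average requests per second for a daily total."""
    return rate_per_day / SECONDS_PER_DAY


print(per_day(100))                        # 8640000
print(round(per_second(100_000_000_000)))  # 1157407
```

Both results agree with the figures in the message: 8,640,000 hits a day, and roughly 1,150,000 a second for 100 billion views a day.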

Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread John A Meinel
Alex Turner wrote: Ok - my common sense alarm is going off here... There are only 6.446 billion people worldwide. 100 Billion page views would require every person in the world to view 18 pages of yahoo every day. Not very likely. http://www.internetworldstats.com/stats.htm suggests that there ar

Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread Alex Turner
Ok - my common sense alarm is going off here... There are only 6.446 billion people worldwide. 100 Billion page views would require every person in the world to view 18 pages of yahoo every day. Not very likely. http://www.internetworldstats.com/stats.htm suggests that there are around 1 billio
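The sanity check in this message is easy to reproduce. The sketch below uses only the two figures quoted above (6.446 billion people, 100 billion page views a day); the function name is hypothetical.

```python
# Page views per person per day implied by a daily total.
WORLD_POPULATION = 6.446e9   # figure quoted in the message
CLAIMED_DAILY_VIEWS = 100e9  # the 100 billion views/day estimate


def views_per_person(daily_views, population):
    """Average daily page views each person would have to generate."""
    return daily_views / population


print(round(views_per_person(CLAIMED_DAILY_VIEWS, WORLD_POPULATION), 1))  # 15.5
```

The quotient comes out nearer 15.5 than the 18 quoted, but the point of the message stands either way: the estimate implies an implausible number of views per person.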

Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread Alex Stapleton
On 12 May 2005, at 15:08, Alex Turner wrote: Having local sessions is unnecessary, and here is my logic: Generally most people have less than 100Mb of bandwidth to the internet. If you make the assertion that you are transferring equal or less session data between your session server (let's say an

Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread Alex Turner
Having local sessions is unnecessary, and here is my logic: Generally most people have less than 100Mb of bandwidth to the internet. If you make the assertion that you are transferring equal or less session data between your session server (let's say an RDBMS) and the app server than you are between

Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread PFC
machines. Which has its own set of issues entirely. I am not entirely sure that memcached actually does serialize data when it's committed into I think it does, i.e. it's a simple mapping of [string key] => [string value]. memcached either, although I could be wrong, I have not looked at the
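If the store really is a plain [string key] => [string value] mapping, as this message suggests, the application has to serialize session data itself before storing it. The sketch below illustrates that idea only: `StringStore` is a hypothetical in-memory stand-in, not the memcached client API, and JSON is an assumed serialization format.

```python
import json


class StringStore:
    """Hypothetical stand-in for a string-to-string cache (not memcached's API)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # Enforce the string-only contract described in the message.
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("keys and values must be strings")
        self._data[key] = value

    def get(self, key):
        # Returns None when the key is absent, like a cache miss.
        return self._data.get(key)


def save_session(store, session_id, session):
    """Serialize the session dict to a JSON string and store it."""
    store.set("session:" + session_id, json.dumps(session))


def load_session(store, session_id):
    """Fetch and deserialize a session, or return None on a miss."""
    raw = store.get("session:" + session_id)
    return None if raw is None else json.loads(raw)


store = StringStore()
save_session(store, "abc", {"user_id": 42, "cart": [1, 2]})
print(load_session(store, "abc"))
```

Because the data is disposable, a miss (`None`) is handled by rebuilding the session rather than treating it as an error.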

Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread Alex Stapleton
On 11 May 2005, at 23:35, PFC wrote: However, memcached (and for us, pg_memcached) is an excellent way to improve horizontal scalability by taking disposable data (like session information) out of the database and putting it in protected RAM. So, what is the advantage of such a system ve

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread PFC
However, memcached (and for us, pg_memcached) is an excellent way to improve horizontal scalability by taking disposable data (like session information) out of the database and putting it in protected RAM. So, what is the advantage of such a system versus, say, a "sticky sessions" system wh

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Jim C. Nasby
On Wed, May 11, 2005 at 08:57:57AM +0100, David Roussel wrote: > For an interesting look at scalability, clustering, caching, etc for a > large site have a look at how livejournal did it. > http://www.danga.com/words/2004_lisa/lisa04.pdf > > They have 2.6 Million active users, posting 200 new blog

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Simon Riggs
On Wed, 2005-05-11 at 17:13 +0800, Christopher Kings-Lynne wrote: > > Alex Stapleton wrote > > Be more helpful, and less arrogant please. > > Simon told you all the reasons clearly and politely. Thanks Chris for your comments. PostgreSQL can always do with one more developer and my sole intent

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Greg Stark
Alex Stapleton <[EMAIL PROTECTED]> writes: > Acceptable Answers to 'So, when/is PG meant to be getting a decent > partitioning system?': ... > 3. You're welcome to take a stab at it, I expect the community would > support your efforts as well. As long as we're being curt all around, this one'

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Josh Berkus
David, > It's interesting that the solution livejournal have arrived at is quite > similar in ways to the way google is set up. Yes, although again, they're using memcached as pseudo-clustering software, and as a result are limited to what fits in RAM (RAM on 27 machines, but it's still RAM).

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Tom Lane
Mischa Sandberg <[EMAIL PROTECTED]> writes: > So, simplicity dictates something like: > table pg_remote(schemaname text, connectby text, remoteschema text) Previous discussion of this sort of thing concluded that we wanted to follow the SQL-MED standard. regards, tom lane

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Christopher Kings-Lynne
Acceptable Answers to 'So, when/is PG meant to be getting a decent partitioning system?': 1. Person X is working on it I believe. 2. It's on the list, but nobody has done anything about it yet 3. You're welcome to take a stab at it, I expect the community would support your efforts a

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Alex Stapleton
On 11 May 2005, at 09:50, Alex Stapleton wrote: On 11 May 2005, at 08:57, David Roussel wrote: For an interesting look at scalability, clustering, caching, etc for a large site have a look at how livejournal did it. http://www.danga.com/words/2004_lisa/lisa04.pdf I have implemented similar syst

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Alex Stapleton
On 11 May 2005, at 08:57, David Roussel wrote: For an interesting look at scalability, clustering, caching, etc for a large site have a look at how livejournal did it. http://www.danga.com/words/2004_lisa/lisa04.pdf I have implemented similar systems in the past, it's a pretty good technique, unf

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Alex Stapleton
On 11 May 2005, at 08:16, Simon Riggs wrote: On Tue, 2005-05-10 at 11:03 +0100, Alex Stapleton wrote: So, when/is PG meant to be getting a decent partitioning system? ISTM that your question seems to confuse where code comes from. Without meaning to pick on you, or reply rudely, I'd like to explo

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread David Roussel
For an interesting look at scalability, clustering, caching, etc for a large site have a look at how livejournal did it. http://www.danga.com/words/2004_lisa/lisa04.pdf They have 2.6 Million active users, posting 200 new blog entries per minute, plus many comments and countless page views. Althou

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Neil Conway
Josh Berkus wrote: The other problem, as I was told it at OSCON, was that these were not high-availability clusters; it's impossible to add a server to an existing cluster Yeah, that's a pretty significant problem. a server going down is liable to take the whole cluster down. That's news to me. D

Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Simon Riggs
On Tue, 2005-05-10 at 11:03 +0100, Alex Stapleton wrote: > So, when/is PG meant to be getting a decent partitioning system? ISTM that your question seems to confuse where code comes from. Without meaning to pick on you, or reply rudely, I'd like to explore that question. Perhaps it should be a F

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Josh Berkus
Neil, > Sure, but that hardly makes it not "usable". Considering the price of > RAM these days, having enough RAM to hold the database (distributed over > the entire cluster) is perfectly acceptable for quite a few people. The other problem, as I was told it at OSCON, was that these were not hig

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Neil Conway
Joshua D. Drake wrote: Neil Conway wrote: Oh? What's wrong with MySQL's clustering implementation? Ram only tables :) Sure, but that hardly makes it not "usable". Considering the price of RAM these days, having enough RAM to hold the database (distributed over the entire cluster) is perfectly acc

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Bruno Wolff III
On Tue, May 10, 2005 at 08:02:50 -0700, Adam Haberlach <[EMAIL PROTECTED]> wrote: > > > With all the Opteron v. Xeon around here, and talk of $30,000 machines, > perhaps it would be worth exploring the option of buying 10 cheapass > machines for $300 each. At the moment, that $300 buys you, fr

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Joshua D. Drake
Neil Conway wrote: Josh Berkus wrote: Don't hold your breath. MySQL, to judge by their first "clustering" implementation, has a *long* way to go before they have anything usable. Oh? What's wrong with MySQL's clustering implementation? Ram only tables :) -Neil

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Neil Conway
Josh Berkus wrote: Don't hold your breath. MySQL, to judge by their first "clustering" implementation, has a *long* way to go before they have anything usable. Oh? What's wrong with MySQL's clustering implementation? -Neil

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Mischa Sandberg
Quoting "Jim C. Nasby" <[EMAIL PROTECTED]>: > To the best of my knowledge no such work has been done. There is a > project (whose name escapes me) that lets you run queries against a > remote postgresql server from a postgresql connection to a different > server, which could serve as the basis for

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Mischa Sandberg
Quoting Christopher Kings-Lynne <[EMAIL PROTECTED]>: > > >>*laff* > >>Yeah, like they've been working on views for the last 5 years, and > >>still haven't released them :D :D :D > > > > ? > > http://dev.mysql.com/doc/mysql/en/create-view.html > > ...for MySQL 5.0.1+ ? > > Give me a call when i

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Christopher Kings-Lynne
*laff* Yeah, like they've been working on views for the last 5 years, and still haven't released them :D :D :D ? http://dev.mysql.com/doc/mysql/en/create-view.html ...for MySQL 5.0.1+ ? Give me a call when it's RELEASED. Chris

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Joshua D. Drake
Mischa Sandberg wrote: Quoting Christopher Kings-Lynne <[EMAIL PROTECTED]>: This is why I mention partitioning. It solves this issue by storing different data sets on different machines under the same schema. These separate chunks of the table can then be replicated as well for data redundancy a

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Mischa Sandberg
Quoting Christopher Kings-Lynne <[EMAIL PROTECTED]>: > > This is why I mention partitioning. It solves this issue by storing > > different data sets on different machines under the same schema. > > These separate chunks of the table can then be replicated as well for > > data redundancy and so o

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Christopher Kings-Lynne
This is why I mention partitioning. It solves this issue by storing different data sets on different machines under the same schema. These separate chunks of the table can then be replicated as well for data redundancy and so on. MySQL are working on these things *laff* Yeah, like they've bee

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Jim C. Nasby
On Tue, May 10, 2005 at 02:55:55PM -0700, Mischa Sandberg wrote: > just beyond belief, for both updates and queries. At Acxiom, the > datasets are so large, even after partitioning, that they just > constantly cycle them through memory, and commands are executed in > convoys --- sort of like riding

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Mischa Sandberg
Quoting Alex Stapleton <[EMAIL PROTECTED]>: > This is why I mention partitioning. It solves this issue by storing > different data sets on different machines under the same schema. > These separate chunks of the table can then be replicated as well for > data redundancy and so on. MySQL are wor

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Jim C. Nasby
On Tue, May 10, 2005 at 07:29:59PM +0200, PFC wrote: > I wonder how Oracle does it ;) Oracle *clustering* demands shared storage. So you've shifted your money from big-iron CPUs to big-iron disk arrays. Oracle replication works similarly to Slony, though it supports a lot more modes (ie: sync

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Mischa Sandberg
Quoting [EMAIL PROTECTED]: > > exploring the option of buying 10 cheapass > > machines for $300 each. At the moment, that $300 buys you, from > Dell, a > > 2.5Ghz Pentium 4 > > Buy cheaper ass Dells with an AMD 64 3000+. Beats the crap out of > the 2.5 > GHz Pentium, especially for PostgreSQL.

RE: [PERFORM] Partitioning / Clustering

2005-05-10 Thread tdrayton
Hi Alex, Actually, our product can partition data among several clustered nodes running PostgreSQL, if that is what you are looking for. Data is distributed based on a designated column. Other tables can be replicated to all nodes. For SELECTs, it also knows when it can join locally or it needs
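Distributing rows "based on a designated column" can be pictured as hashing that column's value to pick a node. The sketch below is a generic illustration under that assumption, not a description of how the product mentioned in this message works; `node_for`, `distribute`, and the `customer_id` column are hypothetical names.

```python
import zlib


def node_for(key, node_count):
    """Map a column value to a node index using a stable hash (CRC32)."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return digest % node_count


def distribute(rows, column, node_count):
    """Group rows into per-node lists according to the designated column."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for(row[column], node_count)].append(row)
    return nodes


rows = [{"customer_id": i, "total": i * 10} for i in range(100)]
nodes = distribute(rows, "customer_id", 4)
print([len(n) for n in nodes])
```

Two tables distributed on the same column with the same function place matching keys on the same node, which is the case where a join can run locally without moving rows between machines.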

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Josh Berkus
Alex, > This is why I mention partitioning. It solves this issue by storing > different data sets on different machines under the same schema. That's clustering, actually. Partitioning is simply dividing up a table into chunks and using the chunks intelligently. Putting those chunks on se

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread PFC
SELECT row1, row2 FROM table1_on_machine_a NATURAL JOIN table2_on_machine_b WHERE restrict_table_1 AND restrict_table_2 AND restrict_1_based_on_2; I don't think that's ever going to be efficient... What would be efficient would be, for instance, a Join of a part of a table against another pa

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread John A Meinel
Adam Haberlach wrote: I think that perhaps he was trying to avoid having to buy "Big Iron" at all. With all the Opteron v. Xeon around here, and talk of $30,000 machines, perhaps it would be worth exploring the option of buying 10 cheapass machines for $300 each. At the moment, that $300 buys you,

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Richard_D_Levine
) > > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of John A Meinel > Sent: Tuesday, May 10, 2005 7:41 AM > To: Alex Stapleton > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] Partitioning / Clustering > &

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Alex Stapleton
lto:[EMAIL PROTECTED] On Behalf Of John A Meinel Sent: Tuesday, May 10, 2005 7:41 AM To: Alex Stapleton Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Partitioning / Clustering Alex Stapleton wrote: What is the status of Postgres support for any sort of multi-machine scaling support? Wha

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Alex Stapleton
On 10 May 2005, at 15:41, John A Meinel wrote: Alex Stapleton wrote: What is the status of Postgres support for any sort of multi-machine scaling support? What are you meant to do once you've upgraded your box and tuned the conf files as much as you can? But your query load is just too high for

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Adam Haberlach
ss huge clusters of cheap hardware. It can't be _that_ hard, can it. :) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John A Meinel Sent: Tuesday, May 10, 2005 7:41 AM To: Alex Stapleton Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] P

Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread John A Meinel
Alex Stapleton wrote: What is the status of Postgres support for any sort of multi-machine scaling support? What are you meant to do once you've upgraded your box and tuned the conf files as much as you can? But your query load is just too high for a single machine? Upgrading stock Dell boxes (I