Hi.
> >> It's now working in pgpool-II 2.1 RC1. No more problems. And yes I use > >> PHP > >> PDO and AdoDB. My previous post has more details. So what's the > >> replicate_def table and replicate_select for? Thanks again. :) > > > > replicate_def is to replicate data, not to distribute. > > > > ex) partitioning data > > node1: dist1 -> (1, 2, 3) dist2 -> (4, 5, 6) > > node2: dist1 -> (4, 5, 6) dist2 -> (7, 8, 9) > > node3: dist1 -> (7, 8, 9) dist2 -> (10, 11, 12) > > > > When pgpool executes the follwing query, the query is not parallel > > execution. Because we need to join dist1 table on node2 and dist2 > > table on node1. > > > > SELECT * FROM dist1 INNER JOIN dist2 on dist1.a = dist2.a; > > > > > > ex) partitioning data > > node1: dist1 -> (1, 2, 3) rep1 -> (4, 5, 6, 7, 8, 9, 10, 11, 12) > > node2: dist1 -> (4, 5, 6) rep1 -> (4, 5, 6, 7, 8, 9, 10, 11, 12) > > node3: dist1 -> (7, 8, 9) rep1 -> (4, 5, 6, 7, 8, 9, 10, 11, 12) > > > > pgpool can execute the following query concurrently. Because rep1 > > table is the same on all nodes. > > > > SELECT * FROM dist1 INNER JOIN rep1 ON dist1.a = rep1.a; > > > > I think small tables are defined in replicate_def. > > If small tables are ideal in replicate_def then, it may not be applicable > to our situation because here's the structure of the whole thing as > planned: > > database: loadbal > schema: public > tables: salesdetail > salesheader > > Tables salesdetail and salesheader both contain transaction date and > branchcode columns. We decided to partition the data by transaction date > because, as calculated, it partitions data more evenly as compared to by > branchcode partitioning. We then extract the MONTH part of the transaction > date then use it as an argument on the SELECT CASE function. Therefore, > possible values are 1-12. The 2 tables are very very large (millions of > records per table because we have like 100+ branches and > hundreds/thousands of transactions per day). So that is the reason why my > function looks like this: Do you want to execute following query ? SELECT * FROM salesdetail INNER JOIN salesheader using(transaction_date); ex) partitioning data node1: salesdetail -> (1, 2, 3, 4) salesheader -> (1, 2, 3, 4) node2: salesdetail -> (5, 6, 7, 8) salesheader -> (5, 6, 7, 8) node3: salesdetail -> (9, 10, 11, 12) salesheader -> (9, 10, 11, 12) If JOIN is concluded by a single node, you can use the VIEW. Define VIEW to each node. CREATE VIEW detail_and_header as SELECT * FROM salesdetail INNER JOIN salesheader using(transaction_date); Define VIEW as partitioning table to dist_def table. You can use parallel query execution. SELECT * FROM detail_and_header; > > *** Note: $1 = transaction date > > SELECT CASE WHEN EXTRACT(MONTH FROM $1) => 1 AND EXTRACT(MONTH FROM $1) <= > 4 THEN 0 > > *** Months 1-4 goes to node0 > > Question: Is it safe to say that it is not advisable to use replicate_def > because the tables are very large or should I use it to replicate the > records on all nodes? > > Whenever I enter a value in the transactiondate field, ex. 2008-07-15, it > goes to node1 alone and no copy on others. Is this the right behavior? Are > we going to be able to take advantage of parallel query if this is going > to be the setup? If not then what would possibly be the best partitioning > scheme to take full advantage of parallel query given the structure/setup > above? Our goal again is to prove that pgpool-II parallel query can cut > the SELECT query time shorter as compared to a single MySQL server with > same amount of data (millions). Ideally, time taken should divided by 3 > because of 3 backends. > > Thank you very much for all your help. > > Regards, > Viril Calimlim > > > > > replicate_select is a parameter for replication mode. So you don't > > need to set it. > > > > I'm not good at English. So if you cannot understand, feel free to ask > > more. > > > > Regards, > > -- > > Yoshiyuki Asaba > > [EMAIL PROTECTED] > > > >> > >> Regards, > >> Viril Calimlim > >> > >> > Hi, > >> > > >> > From: [EMAIL PROTECTED] > >> > Subject: Re: [Pgpool-general] Parallel Query Mode - SELECT, > >> Phppgadmin, > >> > Psql > >> > Date: Thu, 10 Jul 2008 17:25:41 +0800 (PHT) > >> > > >> >> This is really serious because we code using PHP and we wouldn't be > >> able > >> >> to make use of pgpool-II if we couldn't solve this problem. > >> > > >> > Do you use PHP-PDO? > >> > > >> > By the way, > >> >> where can I get more information regarding other parameters like > >> >> replicate_select (yes it replicates the SELECT on all nodes but > >> what's > >> >> the > >> >> advantage? did I miss anything on the manual?), replicate_def table, > >> >> etc. > >> >> I am more than willing to further test the software for you to > >> eliminate > >> >> more bugs. Thank you so much for the speedy response. > >> > > >> > Could you try pgpool-II 2.1 RC1? > >> > > >> > Regards, > >> > -- > >> > Yoshiyuki Asaba > >> > [EMAIL PROTECTED] > >> > > >> > > >> > > >> >> > >> >> Best regards, > >> >> Viril Calimlim > >> >> > >> >> > Hi, > >> >> > > >> >> > Do you set parallel_mode to true? > >> >> > > >> >> > Regards, > >> >> > -- > >> >> > Yoshiyuki Asaba > >> >> > [EMAIL PROTECTED] > >> >> > > >> >> > > >> >> > From: [EMAIL PROTECTED] > >> >> > Subject: [Pgpool-general] Parallel Query Mode - SELECT, Phppgadmin, > >> >> Psql > >> >> > Date: Thu, 10 Jul 2008 15:45:11 +0800 (PHT) > >> >> > > >> >> >> Hello! > >> >> >> > >> >> >> First of all, thank you for taking the time to reply and excellent > >> >> tip > >> >> >> regarding my last question about Parallel Query. However, I've got > >> a > >> >> new > >> >> >> problem. > >> >> >> > >> >> >> In Parallel Query Mode, as per the Tutorial guidelines, I created > >> the > >> >> >> following: > >> >> >> > >> >> >> - Database "test" > >> >> >> - Schema "testschema" > >> >> >> - Table "testtable" > >> >> >> - Fields "id integer", "name varchar(20)" > >> >> >> - Function pgpool_catalog.dist_def_test with same ranges > >> (1..100000 > >> >> etc) > >> >> >> - Inserted a row in dist_def table with key "id" > >> >> >> > >> >> >> I have 3 backend servers 0, 1, 2 and 1 pgpool-II dedicated server. > >> I > >> >> did > >> >> >> not insert a row in replicate_def because I still don't understand > >> >> that > >> >> >> part (sorry). > >> >> >> > >> >> >> My question is that whenever I issue the command: > >> >> >> > >> >> >> [EMAIL PROTECTED]:~$ psql -p 9999 -c "INSERT INTO > >> >> >> testschema.testtable VALUES (100000,'testname')" test > >> >> >> > >> >> >> ... I checked all the backends and it send the data to node0 ALONE > >> >> and > >> >> >> when I do a SELECT *, it shows only 1 row. > >> >> >> > >> >> >> But when I use Phppgadmin or even run a PHP script (using PDO) on > >> >> port > >> >> >> 9999 of the pgpool-II server to insert data, I checked all the > >> >> backends > >> >> >> and all of them has the copy of the inserted row. When I do SELECT > >> *, > >> >> I > >> >> >> get 3 rows of the same data. Another thing about Phppgadmin, when > >> I > >> >> >> click > >> >> >> on the BROWSE button on testtable, it just keeps on Loading... > >> >> meaning, > >> >> >> it > >> >> >> stuck forever for some reason and I had to kill all pgpool > >> processes. > >> >> >> > >> >> >> tail postgresql log: lots of "LOG: unexpected EOF on client > >> >> connection" > >> >> >> tail pgpool log: > >> >> >> > >> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: read_kind_from_backend: read > >> >> kind > >> >> >> from 0 th backend Z NUM_BACKENDS: 3 > >> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: read_kind_from_backend: read > >> >> kind > >> >> >> from 1 th backend Z NUM_BACKENDS: 3 > >> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: read_kind_from_backend: read > >> >> kind > >> >> >> from 2 th backend Z NUM_BACKENDS: 3 > >> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: pool_process_query: kind from > >> >> >> backend: Z > >> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: pool_read_message_length: > >> slot: > >> >> 0 > >> >> >> length: 5 > >> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: pool_read_message_length: > >> slot: > >> >> 1 > >> >> >> length: 5 > >> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: pool_read_message_length: > >> slot: > >> >> 2 > >> >> >> length: 5 > >> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: ReadyForQuery: message > >> length: 5 > >> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: ReadyForQuery: transaction > >> >> state: I > >> >> >> 2008-07-10 15:41:28 DEBUG: pid 6621: pool_connection_pool_timer: > >> set > >> >> >> close > >> >> >> time 1215675688 > >> >> >> > >> >> >> Version: pgpool-II 2.1beta2 > >> >> >> OS: Ubuntu 8.04 Server > >> >> >> PostgreSQL: 8.3.1 > >> >> >> > >> >> >> Please help. I really want this to work. Ive been after this for 3 > >> >> >> months > >> >> >> now. > >> >> >> > >> >> >> Thank you very much for your help. > >> >> >> > >> >> >> Regards, > >> >> >> Viril > >> >> >> > >> >> >> _______________________________________________ > >> >> >> Pgpool-general mailing list > >> >> >> [email protected] > >> >> >> http://pgfoundry.org/mailman/listinfo/pgpool-general > >> >> > _______________________________________________ > >> >> > Pgpool-general mailing list > >> >> > [email protected] > >> >> > http://pgfoundry.org/mailman/listinfo/pgpool-general > >> >> > > >> >> > > >> >> > >> >> > >> > > >> > > >> > >> > > > > > > > > _______________________________________________ > Pgpool-general mailing list > [email protected] > http://pgfoundry.org/mailman/listinfo/pgpool-general > -- SRA OSS, Inc. 日本支社 Yoshiharu Mori <[EMAIL PROTECTED]> http://www.sraoss.co.jp/ _______________________________________________ Pgpool-general mailing list [email protected] http://pgfoundry.org/mailman/listinfo/pgpool-general
