Re: [HACKERS] Fragmentation project

2007-04-27 Thread Marko Kreen

On 4/25/07, Gustavo Tonini [EMAIL PROTECTED] wrote:

On 4/24/07, Marko Kreen [EMAIL PROTECTED] wrote:
 On 4/23/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:
  Oh, you're talking about distributing partitions across different nodes
  and parallelizing queries. No, we don't do that today.

 PL/Proxy actually works like that, only in smaller scope -
 for function calls only.


I think that proposed funcionalities cannot be implemented in a PL scope...


Oh, sure.  PL/proxy just proves if you can adhere to specific coding-style
- all db access goes via functions - you can solve the problem today,
with simple tools.  You dont need even PL/proxy for that, any PL that
can do connections (plpython, plperl, ..) can be used for proxy functions.
PL/proxy just makes it so much easier.

Function based DB API can give additional benefits:
- easy upgradeablility
- easy to change db structure without apps knowing
- easy to monitor/audit

And ofcourse:
- if database hits hardware limits, you can replace all functions
with PLproxy functions and spread data over several partitions.

I think most OLTP apps can be designed around db-functions, with OLAP
it will be bit harder, but there it is less critical also.

--
marko

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Fragmentation project

2007-04-25 Thread Gustavo Tonini

Josh,

On 4/23/07, Josh Berkus [EMAIL PROTECTED] wrote:

Gustavo,

  Oh, you're talking about distributing partitions across different nodes
  and parallelizing queries. No, we don't do that today.

 Yes.This is the goal. Well, I will try it. I'll send the project
 reports to this list. Comments will be valuable. Desire me good
 luck...

You might join/look at the PgPoolII project, which is working on parallel
query amoung other things.



The pgpool is an interesting approach to this, but I think that the
funcionality of inserting a record at a backend which will be
redirectioned to other and verifying deadlocks under network demands
in acquiring locks on the referenced records/tables in several hosts.
Then, IMO, this may be implemented inside dbms. How Marko wrote, this
is a non-trivial solution...
Really, It could be improved on pgpool to be a process coordinator,
but will need some changes in backend too.
This is a non trivial implementation, but there are several users
waiting for an effective solution for data distributing in a cluster.
These users actually buy commercial solutions or build themselves one.

Gustavo.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Fragmentation project

2007-04-25 Thread Gustavo Tonini

Marko,

On 4/24/07, Marko Kreen [EMAIL PROTECTED] wrote:

On 4/23/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:
 Oh, you're talking about distributing partitions across different nodes
 and parallelizing queries. No, we don't do that today.

PL/Proxy actually works like that, only in smaller scope -
for function calls only.



I think that proposed funcionalities cannot be implemented in a PL scope...

Gustavo.


General solution that partitions free-form SQL
will be non-trivial...



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Fragmentation project

2007-04-25 Thread Josh Berkus

Gustavo,


The pgpool is an interesting approach to this, but I think that the
funcionality of inserting a record at a backend which will be
redirectioned to other and verifying deadlocks under network demands
in acquiring locks on the referenced records/tables in several hosts.
Then, IMO, this may be implemented inside dbms. How Marko wrote, this
is a non-trivial solution...
Really, It could be improved on pgpool to be a process coordinator,
but will need some changes in backend too.
This is a non trivial implementation, but there are several users
waiting for an effective solution for data distributing in a cluster.
These users actually buy commercial solutions or build themselves one.


Yeah, I was just thinking that if you start from scratch you're not 
likely to get very far ... you might look at some of the existing 
partial solutions (pgPoolII, PostgreSQLForest, ExtenDB, etc.) and see if 
you can improve them.


--Josh

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Fragmentation project

2007-04-24 Thread Marko Kreen

On 4/23/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:

Oh, you're talking about distributing partitions across different nodes
and parallelizing queries. No, we don't do that today.


PL/Proxy actually works like that, only in smaller scope -
for function calls only.

General solution that partitions free-form SQL
will be non-trivial...

--
marko

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Fragmentation project

2007-04-23 Thread Heikki Linnakangas

Gustavo Tonini wrote:

Well, I'm thinking in define (maybe via SQL) a set of servers as a
cluster and make the fragmentation rules based on select clauses,
storing this configuration in a specific catalog in global schema.
For example: when a record is inserted in a server which not store
this fragment (no rule matches), it will be automatically moved do the
correct server (the server who matches that record), transparently to
the user. This process involves transaction management and a lot more
validations.

A propose is:

create partition name on site as select clause

This create a replication in this site with the rules of the select clause.

We can provide, for example, users create a foreign key in a site
referencing a table who stored in other(s). In this case, the foreign
key will be created in global schema but the validations of the local
schema's would be managed by DRDBMS.

These are my ideas. Is this supported today?


Oh, you're talking about distributing partitions across different nodes 
and parallelizing queries. No, we don't do that today.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Fragmentation project

2007-04-23 Thread Gustavo Tonini

On 4/23/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:

Gustavo Tonini wrote:
 Well, I'm thinking in define (maybe via SQL) a set of servers as a
 cluster and make the fragmentation rules based on select clauses,
 storing this configuration in a specific catalog in global schema.
 For example: when a record is inserted in a server which not store
 this fragment (no rule matches), it will be automatically moved do the
 correct server (the server who matches that record), transparently to
 the user. This process involves transaction management and a lot more
 validations.

 A propose is:

 create partition name on site as select clause

 This create a replication in this site with the rules of the select clause.

 We can provide, for example, users create a foreign key in a site
 referencing a table who stored in other(s). In this case, the foreign
 key will be created in global schema but the validations of the local
 schema's would be managed by DRDBMS.

 These are my ideas. Is this supported today?

Oh, you're talking about distributing partitions across different nodes
and parallelizing queries. No, we don't do that today.


Yes.This is the goal. Well, I will try it. I'll send the project
reports to this list. Comments will be valuable. Desire me good
luck...

Thanks,
Gustavo.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Fragmentation project

2007-04-23 Thread Josh Berkus
Gustavo,

  Oh, you're talking about distributing partitions across different nodes
  and parallelizing queries. No, we don't do that today.

 Yes.This is the goal. Well, I will try it. I'll send the project
 reports to this list. Comments will be valuable. Desire me good
 luck...

You might join/look at the PgPoolII project, which is working on parallel 
query amoung other things.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Fragmentation project

2007-04-21 Thread Gustavo Tonini

Well, I'm thinking in define (maybe via SQL) a set of servers as a
cluster and make the fragmentation rules based on select clauses,
storing this configuration in a specific catalog in global schema.
For example: when a record is inserted in a server which not store
this fragment (no rule matches), it will be automatically moved do the
correct server (the server who matches that record), transparently to
the user. This process involves transaction management and a lot more
validations.

A propose is:

create partition name on site as select clause

This create a replication in this site with the rules of the select clause.

We can provide, for example, users create a foreign key in a site
referencing a table who stored in other(s). In this case, the foreign
key will be created in global schema but the validations of the local
schema's would be managed by DRDBMS.

These are my ideas. Is this supported today?

Gustavo.

P.S.: sorry by the English mistakes...

On 4/19/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:

Gustavo Tonini wrote:
 I'm thinking in project and implement data fragmentation based on
 catalogs inside
 pgcluster as my university final project. I would like suggestions and
 would be happy if anyone help me to define it. Sorry if this is the
 incorrect list...I had no answer on pgcluster-general...

This is the correct list.

I'm not sure what you mean by data fragmentation, but we do support
horizontal partitioning by table constraints. And toasting is like
vertical partitioning.

What exactly are you thinking of implementing?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Fragmentation project

2007-04-19 Thread Gustavo Tonini

Hello people,
I'm thinking in project and implement data fragmentation based on
catalogs inside
pgcluster as my university final project. I would like suggestions and
would be happy if anyone help me to define it. Sorry if this is the
incorrect list...I had no answer on pgcluster-general...

Thanks,
Gustavo.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Fragmentation project

2007-04-19 Thread Heikki Linnakangas

Gustavo Tonini wrote:

I'm thinking in project and implement data fragmentation based on
catalogs inside
pgcluster as my university final project. I would like suggestions and
would be happy if anyone help me to define it. Sorry if this is the
incorrect list...I had no answer on pgcluster-general...


This is the correct list.

I'm not sure what you mean by data fragmentation, but we do support 
horizontal partitioning by table constraints. And toasting is like 
vertical partitioning.


What exactly are you thinking of implementing?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate