Re: [HACKERS] help with plug-in function for additional (partition/shard) visibility checks

2011-09-02 Thread PostgreSQL - Hans-Jürgen Schönig
hello …

i have been thinking about this issue for quite a while ...
given your idea i am not sure how this can work at all.

consider:
begin;
insert 1
insert 2
commit

assume this ends up in the same node,
now you split it into two …
1 and 2 will have exactly the same visibility to and transaction.
i am not sure how you can get this right without looking at the data.

alternative idea: what if the proxy would add / generate a filter by looking at 
the data?
a quick idea would be that once you split you add a simple directive such as 
FILTER GENERATOR $1 or so to the PL/proxy code.
it would then behind the scene arrange the filter passed on.
what do you think?

regards,

hans



On Sep 1, 2011, at 10:13 AM, Hannu Krosing wrote:

 Hallow hackers
 
 I have the following problem to solve and would like to get advice on
 the best way to do it.
 
 The problem:
 
 When growing a pl/proxy based database cluster, one of the main
 operations is splitting a partition. The standard flow is as follows:
 
 1) make a copy of the partitions table(s) to another database
 2) reconfigure pl/proxy to use 2 partitions instead of one
 
 The easy part is making a copy of all or half of the table to another
 database. The hard part is fast deletion (i mean milliseconds,
 comparable to TRUNCATE) the data that should not be in a partition (so
 that RUN ON ALL functions will continue to return right results).
 
 It would be relatively easy, if we still had the RULES for select
 available for plain tables, but even then the eventual cleanup would
 usually mean at least 3 passes of disk writes (set xmax, write deleted
 flag, vacuum and remove)
 
 What I would like to have is possibility for additional visibility
 checks, which would run some simple C function over tuple data (usually
 hash(fieldval) + and + or ) and return visibility (is in this partition)
 as a result. It would be best if this is run at so low level that also
 vacuum would use it and can clean up the foreign partition data in one
 pass, without doing the delete dance first.
 
 So finally the QUESTION :
 
 where in code would be the best place to check for this so that
 
 1) both regular queries and VACUUM see it
 2) the tuple data (and not only system fields or just xmin/xmax) would
 be available for the function to use
 
 
 -- 
 ---
 Hannu Krosing
 PostgreSQL Unlimited Scalability and Performance Consultant
 2ndQuadrant Nordic
 PG Admin Book: http://www.2ndQuadrant.com/books/
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers
 

--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] help with plug-in function for additional (partition/shard) visibility checks

2011-09-02 Thread Hannu Krosing
On Fri, 2011-09-02 at 14:01 +0200, PostgreSQL - Hans-Jürgen Schönig
wrote:
 hello …
 
 i have been thinking about this issue for quite a while ...
 given your idea i am not sure how this can work at all.
 
 consider:
   begin;
   insert 1
   insert 2
   commit
 
 assume this ends up in the same node,
 now you split it into two …
 1 and 2 will have exactly the same visibility to and transaction.
 i am not sure how you can get this right without looking at the data.

It has to consider the data when determining visibility, that's the
whole point of the plug-in .

The idea is, that each row belongs to a certain partition, as
determined by some function over it's fields. Most often this function
is hash of primary key OR-ed by a bitmap representing cluster size and
AND-ed by bitmap for partition(s) stored in this database. 

when you split the parition, then some row's don't belong in the old
partition database anymore (and if you did a full copy, then the other
half dont belong to the new one), so they should be handled as
invisible / deleted. As this can be only done by looking at the tuple
data, this needs an additional visibility function. And as this is only
needed for partitioned databases, it makes sense to implement it as a
plogin, so it would not wast cycles on non-partitioned databases

 alternative idea: what if the proxy would add / generate a filter by 
 looking at the data?
 a quick idea would be that once you split you add a simple directive 
 such as FILTER GENERATOR $1 or so to the PL/proxy code.
 it would then behind the scene arrange the filter passed on.
 what do you think?

Hmm. I'm not sure I understand what you are trying to say. Can you
elaborate a little ?

 
   regards,
 
   hans
 
 
 
 On Sep 1, 2011, at 10:13 AM, Hannu Krosing wrote:
 
  Hallow hackers
  
  I have the following problem to solve and would like to get advice on
  the best way to do it.
  
  The problem:
  
  When growing a pl/proxy based database cluster, one of the main
  operations is splitting a partition. The standard flow is as follows:
  
  1) make a copy of the partitions table(s) to another database
  2) reconfigure pl/proxy to use 2 partitions instead of one
  
  The easy part is making a copy of all or half of the table to another
  database. The hard part is fast deletion (i mean milliseconds,
  comparable to TRUNCATE) the data that should not be in a partition (so
  that RUN ON ALL functions will continue to return right results).
  
  It would be relatively easy, if we still had the RULES for select
  available for plain tables, but even then the eventual cleanup would
  usually mean at least 3 passes of disk writes (set xmax, write deleted
  flag, vacuum and remove)
  
  What I would like to have is possibility for additional visibility
  checks, which would run some simple C function over tuple data (usually
  hash(fieldval) + and + or ) and return visibility (is in this partition)
  as a result. It would be best if this is run at so low level that also
  vacuum would use it and can clean up the foreign partition data in one
  pass, without doing the delete dance first.
  
  So finally the QUESTION :
  
  where in code would be the best place to check for this so that
  
  1) both regular queries and VACUUM see it
  2) the tuple data (and not only system fields or just xmin/xmax) would
  be available for the function to use
  
  
  -- 
  ---
  Hannu Krosing
  PostgreSQL Unlimited Scalability and Performance Consultant
  2ndQuadrant Nordic
  PG Admin Book: http://www.2ndQuadrant.com/books/
  
  
  -- 
  Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-hackers
  
 
 --
 Cybertec Schönig  Schönig GmbH
 Gröhrmühlgasse 26
 A-2700 Wiener Neustadt, Austria
 Web: http://www.postgresql-support.de
 
 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] help with plug-in function for additional (partition/shard) visibility checks

2011-09-02 Thread PostgreSQL - Hans-Jürgen Schönig
hello …

the goal of the entire proxy thing is to make the right query go to the right 
node / nodes.
we determine this by using a partitioning function and so on …
currently PL/proxy has only a handful of commands - one is RUN ON … which tells 
us where to put things.
assume you issue a select … some select will fall out on the target node. 
to restrict the data coming from the node you could add an additional 
constraint on the way …

say:
SELECT * FROM proxy_table WHERE a = 20;

what you want to reach the node after a split is …

SELECT * FROM proxy_table WHERE a = 20 AND col = filter the wrong half 
away

my idea is to add an additional command to the PL/proxy command set.
it should call a function generating this additional filter.
maybe somehow like that …

RUN ON hashtext($1) 
-- this one already knows about the increased cluster
GENERATE FILTER my_create_the_bloody_filter_func($1)-- this one 
would massage the query going to the node.

it would actually open the door for a lot of additional trickery.
the function would tell the proxy what to append - and: this what would be 
under your full control.

what do you think?
i got to think about it futher but i can envision that this could be feasible 
...

hans


On Sep 2, 2011, at 2:36 PM, Hannu Krosing wrote:

 On Fri, 2011-09-02 at 14:01 +0200, PostgreSQL - Hans-Jürgen Schönig
 wrote:
 hello …
 
 i have been thinking about this issue for quite a while ...
 given your idea i am not sure how this can work at all.
 
 consider:
  begin;
  insert 1
  insert 2
  commit
 
 assume this ends up in the same node,
 now you split it into two …
 1 and 2 will have exactly the same visibility to and transaction.
 i am not sure how you can get this right without looking at the data.
 
 It has to consider the data when determining visibility, that's the
 whole point of the plug-in .
 
 The idea is, that each row belongs to a certain partition, as
 determined by some function over it's fields. Most often this function
 is hash of primary key OR-ed by a bitmap representing cluster size and
 AND-ed by bitmap for partition(s) stored in this database. 
 
 when you split the parition, then some row's don't belong in the old
 partition database anymore (and if you did a full copy, then the other
 half dont belong to the new one), so they should be handled as
 invisible / deleted. As this can be only done by looking at the tuple
 data, this needs an additional visibility function. And as this is only
 needed for partitioned databases, it makes sense to implement it as a
 plogin, so it would not wast cycles on non-partitioned databases
 
 alternative idea: what if the proxy would add / generate a filter by 
 looking at the data?
 a quick idea would be that once you split you add a simple directive 
 such as FILTER GENERATOR $1 or so to the PL/proxy code.
 it would then behind the scene arrange the filter passed on.
 what do you think?
 
 Hmm. I'm not sure I understand what you are trying to say. Can you
 elaborate a little ?
 
 
  regards,
 
  hans
 
 
 
 On Sep 1, 2011, at 10:13 AM, Hannu Krosing wrote:
 
 Hallow hackers
 
 I have the following problem to solve and would like to get advice on
 the best way to do it.
 
 The problem:
 
 When growing a pl/proxy based database cluster, one of the main
 operations is splitting a partition. The standard flow is as follows:
 
 1) make a copy of the partitions table(s) to another database
 2) reconfigure pl/proxy to use 2 partitions instead of one
 
 The easy part is making a copy of all or half of the table to another
 database. The hard part is fast deletion (i mean milliseconds,
 comparable to TRUNCATE) the data that should not be in a partition (so
 that RUN ON ALL functions will continue to return right results).
 
 It would be relatively easy, if we still had the RULES for select
 available for plain tables, but even then the eventual cleanup would
 usually mean at least 3 passes of disk writes (set xmax, write deleted
 flag, vacuum and remove)
 
 What I would like to have is possibility for additional visibility
 checks, which would run some simple C function over tuple data (usually
 hash(fieldval) + and + or ) and return visibility (is in this partition)
 as a result. It would be best if this is run at so low level that also
 vacuum would use it and can clean up the foreign partition data in one
 pass, without doing the delete dance first.
 
 So finally the QUESTION :
 
 where in code would be the best place to check for this so that
 
 1) both regular queries and VACUUM see it
 2) the tuple data (and not only system fields or just xmin/xmax) would
 be available for the function to use
 
 
 -- 
 ---
 Hannu Krosing
 PostgreSQL Unlimited Scalability and Performance Consultant
 2ndQuadrant Nordic
 PG Admin Book: http://www.2ndQuadrant.com/books/
 
 
 -- 
 Sent via 

Re: [HACKERS] help with plug-in function for additional (partition/shard) visibility checks

2011-09-02 Thread Hannu Krosing
On Fri, 2011-09-02 at 14:51 +0200, PostgreSQL - Hans-Jürgen Schönig
wrote:
 hello …
 
 the goal of the entire proxy thing is to make the right query go to the right 
 node / nodes.
 we determine this by using a partitioning function and so on …
 currently PL/proxy has only a handful of commands - one is RUN ON … which 
 tells us where to put things.
 assume you issue a select … some select will fall out on the target node. 
 to restrict the data coming from the node you could add an additional 
 constraint on the way …
 
 say:
   SELECT * FROM proxy_table WHERE a = 20;
 
 what you want to reach the node after a split is …
 
   SELECT * FROM proxy_table WHERE a = 20 AND col = filter the wrong half 
 away
 
 my idea is to add an additional command to the PL/proxy command set.
 it should call a function generating this additional filter.
 maybe somehow like that …
 
   RUN ON hashtext($1) 
 -- this one already knows about the increased cluster
   GENERATE FILTER my_create_the_bloody_filter_func($1)-- this one 
 would massage the query going to the node.
 
 it would actually open the door for a lot of additional trickery.
 the function would tell the proxy what to append - and: this what would be 
 under your full control.
 
 what do you think?

Hmm, could work for simplest cases, but this has 2 main problems:

1) you need a full SQL parser to make this generally useful for plain
SQL

and

2) it still won't work for pl/proxy's main usecase - calling the same
_function_ on partition.

 i got to think about it futher but i can envision that this could be feasible 
 ...
 
   hans
 
 
 On Sep 2, 2011, at 2:36 PM, Hannu Krosing wrote:
 
  On Fri, 2011-09-02 at 14:01 +0200, PostgreSQL - Hans-Jürgen Schönig
  wrote:
  hello …
  
  i have been thinking about this issue for quite a while ...
  given your idea i am not sure how this can work at all.
  
  consider:
 begin;
 insert 1
 insert 2
 commit
  
  assume this ends up in the same node,
  now you split it into two …
  1 and 2 will have exactly the same visibility to and transaction.
  i am not sure how you can get this right without looking at the data.
  
  It has to consider the data when determining visibility, that's the
  whole point of the plug-in .
  
  The idea is, that each row belongs to a certain partition, as
  determined by some function over it's fields. Most often this function
  is hash of primary key OR-ed by a bitmap representing cluster size and
  AND-ed by bitmap for partition(s) stored in this database. 
  
  when you split the parition, then some row's don't belong in the old
  partition database anymore (and if you did a full copy, then the other
  half dont belong to the new one), so they should be handled as
  invisible / deleted. As this can be only done by looking at the tuple
  data, this needs an additional visibility function. And as this is only
  needed for partitioned databases, it makes sense to implement it as a
  plogin, so it would not wast cycles on non-partitioned databases
  
  alternative idea: what if the proxy would add / generate a filter by 
  looking at the data?
  a quick idea would be that once you split you add a simple directive 
  such as FILTER GENERATOR $1 or so to the PL/proxy code.
  it would then behind the scene arrange the filter passed on.
  what do you think?
  
  Hmm. I'm not sure I understand what you are trying to say. Can you
  elaborate a little ?
  
  
 regards,
  
 hans
  
  
  
  On Sep 1, 2011, at 10:13 AM, Hannu Krosing wrote:
  
  Hallow hackers
  
  I have the following problem to solve and would like to get advice on
  the best way to do it.
  
  The problem:
  
  When growing a pl/proxy based database cluster, one of the main
  operations is splitting a partition. The standard flow is as follows:
  
  1) make a copy of the partitions table(s) to another database
  2) reconfigure pl/proxy to use 2 partitions instead of one
  
  The easy part is making a copy of all or half of the table to another
  database. The hard part is fast deletion (i mean milliseconds,
  comparable to TRUNCATE) the data that should not be in a partition (so
  that RUN ON ALL functions will continue to return right results).
  
  It would be relatively easy, if we still had the RULES for select
  available for plain tables, but even then the eventual cleanup would
  usually mean at least 3 passes of disk writes (set xmax, write deleted
  flag, vacuum and remove)
  
  What I would like to have is possibility for additional visibility
  checks, which would run some simple C function over tuple data (usually
  hash(fieldval) + and + or ) and return visibility (is in this partition)
  as a result. It would be best if this is run at so low level that also
  vacuum would use it and can clean up the foreign partition data in one
  pass, without doing the delete dance first.
  
  So finally the 

Re: [HACKERS] help with plug-in function for additional (partition/shard) visibility checks

2011-09-02 Thread PostgreSQL - Hans-Jürgen Schönig

On Sep 2, 2011, at 2:59 PM, Hannu Krosing wrote:

 On Fri, 2011-09-02 at 14:51 +0200, PostgreSQL - Hans-Jürgen Schönig
 wrote:
 hello …
 
 the goal of the entire proxy thing is to make the right query go to the 
 right node / nodes.
 we determine this by using a partitioning function and so on …
 currently PL/proxy has only a handful of commands - one is RUN ON … which 
 tells us where to put things.
 assume you issue a select … some select will fall out on the target node. 
 to restrict the data coming from the node you could add an additional 
 constraint on the way …
 
 say:
  SELECT * FROM proxy_table WHERE a = 20;
 
 what you want to reach the node after a split is …
 
  SELECT * FROM proxy_table WHERE a = 20 AND col = filter the wrong half 
 away
 
 my idea is to add an additional command to the PL/proxy command set.
 it should call a function generating this additional filter.
 maybe somehow like that …
 
  RUN ON hashtext($1) 
 -- this one already knows about the increased cluster
  GENERATE FILTER my_create_the_bloody_filter_func($1)-- this one 
 would massage the query going to the node.
 
 it would actually open the door for a lot of additional trickery.
 the function would tell the proxy what to append - and: this what would be 
 under your full control.
 
 what do you think?
 
 Hmm, could work for simplest cases, but this has 2 main problems:
 
 1) you need a full SQL parser to make this generally useful for plain
 SQL


i think that everything beyond a simple case is pretty hard to achieve anyway. 
to me it looks pretty impossible to solve this in a generic way without same 
insane amount of labor input - at listen given the ideas coming to me in the 
past.
and yes, functions are an issue. unless you have some sort of virtually 
private database thing it is close to impossible (unless you want to try some 
nightmare based on views / constraint exclusion on the partitions or so).

regards,

hans


--
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] help with plug-in function for additional (partition/shard) visibility checks

2011-09-01 Thread Hannu Krosing
Hallow hackers

I have the following problem to solve and would like to get advice on
the best way to do it.

The problem:

When growing a pl/proxy based database cluster, one of the main
operations is splitting a partition. The standard flow is as follows:

1) make a copy of the partitions table(s) to another database
2) reconfigure pl/proxy to use 2 partitions instead of one

The easy part is making a copy of all or half of the table to another
database. The hard part is fast deletion (i mean milliseconds,
comparable to TRUNCATE) the data that should not be in a partition (so
that RUN ON ALL functions will continue to return right results).

It would be relatively easy, if we still had the RULES for select
available for plain tables, but even then the eventual cleanup would
usually mean at least 3 passes of disk writes (set xmax, write deleted
flag, vacuum and remove)

What I would like to have is possibility for additional visibility
checks, which would run some simple C function over tuple data (usually
hash(fieldval) + and + or ) and return visibility (is in this partition)
as a result. It would be best if this is run at so low level that also
vacuum would use it and can clean up the foreign partition data in one
pass, without doing the delete dance first.

So finally the QUESTION :

where in code would be the best place to check for this so that

1) both regular queries and VACUUM see it
2) the tuple data (and not only system fields or just xmin/xmax) would
be available for the function to use


-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers