Re: [HACKERS] Vertical Partitioning with TOAST

2006-03-04 Thread Hannu Krosing
Ühel kenal päeval, N, 2006-03-02 kell 22:15, kirjutas Bruce Momjian:
 Is there still interst in this idea for TODO?

Just to voice my support - Yes, I think that being able to set lower
thresolds for TOAST is very useful in several cases.

Also getting rid of toast index and start using ctids directly would be
a big bonus.

When using direct ctids we could use either ctid chains or some sort of
skiplist for access to N-th TOAST chunk.


Hannu



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Vertical Partitioning with TOAST

2006-03-04 Thread Martijn van Oosterhout
On Sat, Mar 04, 2006 at 12:15:46PM +0200, Hannu Krosing wrote:
 Also getting rid of toast index and start using ctids directly would be
 a big bonus.
 
 When using direct ctids we could use either ctid chains or some sort of
 skiplist for access to N-th TOAST chunk.

I suppose this would mean that you couldn't use vacuum on the toast
table anymore. Or teach vacuum that everytime it moves a tuple it needs
to update the original table (sequential scan). What exactly are you
trying to save here?

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


signature.asc
Description: Digital signature


Re: [HACKERS] Vertical Partitioning with TOAST

2006-03-04 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 On Sat, Mar 04, 2006 at 12:15:46PM +0200, Hannu Krosing wrote:
 Also getting rid of toast index and start using ctids directly would be
 a big bonus.
 When using direct ctids we could use either ctid chains or some sort of
 skiplist for access to N-th TOAST chunk.

 I suppose this would mean that you couldn't use vacuum on the toast
 table anymore.

Another problem with it is that it'd destroy the current optimizations
that allow partial fetches of uncompressed TOASTed fields to be fast.
You couldn't fetch page N of a TOAST datum without reading all the pages
before it.

I suppose the objection that toast tables wouldn't be regular tables
anymore might not be fatal, but you'll certainly get some pushback if
you try to take away the direct-access optimizations.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Vertical Partitioning with TOAST

2006-03-04 Thread Hannu Krosing
Ühel kenal päeval, L, 2006-03-04 kell 10:31, kirjutas Tom Lane:
 Martijn van Oosterhout kleptog@svana.org writes:
  On Sat, Mar 04, 2006 at 12:15:46PM +0200, Hannu Krosing wrote:
  Also getting rid of toast index and start using ctids directly would be
  a big bonus.
  When using direct ctids we could use either ctid chains or some sort of
  skiplist for access to N-th TOAST chunk.
 
  I suppose this would mean that you couldn't use vacuum on the toast
  table anymore.
 
 Another problem with it is that it'd destroy the current optimizations
 that allow partial fetches of uncompressed TOASTed fields to be fast.
 You couldn't fetch page N of a TOAST datum without reading all the pages
 before it.
 
 I suppose the objection that toast tables wouldn't be regular tables
 anymore might not be fatal, but you'll certainly get some pushback if
 you try to take away the direct-access optimizations.

That's why I was suggesting skiplist instead on simple linked lists.

Another way would be to put a list of all toast ctids for your whole
toasted field in the first page(s) of the toast.

That way you will still have option of fast access to any partial of the
field, most likely even faster than with current implementation, as you
have to touch less pages. And you can have it also for compressed fields
if you store uncompressed offsets.

-
Hannu





---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Vertical Partitioning with TOAST

2006-03-03 Thread Jim C. Nasby
If this would be accepted I might actually be able to accomplish this.
Maybe. :) But having a TODO wouldn't be a bad idea as well...

Would this require 2 new fields in pg_attribute, or is there a better
way to store the thresholds? I'm thinking that each field would need two
special values; 0 for 'no external/compression' and -1 for 'use default'
(hrm, I guess this means we should add at least one GUC to control that
default...)

I suspect there's folks on -general who would express interest if you
want me to ask there...

On Thu, Mar 02, 2006 at 10:15:19PM -0500, Bruce Momjian wrote:
 
 Is there still interst in this idea for TODO?
 
 ---
  As Tom suggested, I think it would be best to be able to change the size
  at which a field gets stored externally. I think it also makes sense to
  have this reverse the normal order of compress first, then if it still
  doesn't fit store it externally. I forsee this typically being useful
  when you have fields that are between ~100 and 1000 bytes in size, and
  I'm doubtful that compression would do much good there. But I wouldn't
  rule out this being useful on fields that can also sometimes contain
  much larger amounts of data, so I don't think it makes sense to disable
  compression completely. So, I think this leaves two new options:
  
  SET STORAGE EXTERNAL [THRESHOLD x]
  If a field is over x in size, it's stored externally.
  
  SET STORAGE EXTENDED [THRESHOLD x]
  If a field is over x in size, it's stored externally. If it's over
  BLCKSZ/4 it will also be compressed (I think that's how things work
  now).
  
  Actually, that's rather ugly. I think it would be better to just break
  external storage and compression out into their own attributes:
  
  SET STORAGE [ ALLOW EXTERNAL [THRESHOLD x] ] [ ALLOW COMPRESSION [THRESHOLD 
  x] ]
  
  ALLOW EXTERNAL: if a field is larger than x bytes (BLCKSZ/4 by default)
  then it will be stored externally. May be specified along with ALLOW
  COMPRESSION.
  
  ALLOW COMPRESSION: if a field is larger than x bytes (BLCKSZ/4 by default)
  then it will be COMPRESSED. May be specified along with ALLOW EXTERNAL.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Vertical Partitioning with TOAST

2006-03-02 Thread Bruce Momjian

Is there still interst in this idea for TODO?

---

Jim C. Nasby wrote:
 On Thu, Dec 08, 2005 at 10:03:43AM -0500, Bruce Momjian wrote:
  Jim C. Nasby wrote:
   On Thu, Dec 08, 2005 at 12:59:47AM -0500, Tom Lane wrote:
Jim C. Nasby [EMAIL PROTECTED] writes:
 This seems like a useful feature to add, allowing for easy built-in
 verticle partitioning. Are there issues with the patch as-is?

Other than the ones mentioned by the poster?

It seemed to me more like a not-too-successful experiment than something
ready for application.  If you take the viewpoint that this is just
another TOAST storage strategy, I think it's pretty useless.  A large
field value is going to get toasted anyway with the regular strategy,
and if your column happens to contain some values that are not large,
forcing them out-of-line anyway is simply silly.  (You could make a case
for making the threshold size user-controllable, but I don't see the
case for setting the threshold to zero, which is what this amounts to.)
   
   Valid point. I do think there's a lot of benefit to being able to set
   the limit much lower than what it currently defaults to today. We have a
   client that has a queue-type table that is updated very frequently. One
   of the fields is text, that is not updated as frequently. Keeping this
   table vacuumed well enough has proven to be problematic, because any
   delay to vacuuming quickly results in a very large amount of bloat.
   Moving that text field into a seperate table would most likely be a win.
   
   Presumably this would need to be settable on at least a per-table basis.
   
   Would adding such a variable be a good beginner TODO, or is it too
   invasive?
  
  Well, we have now:
  
 ALTER TABLE ALTER [ COLUMN ] column 
  SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }
  
  What else is needed?
 
 As Tom suggested, I think it would be best to be able to change the size
 at which a field gets stored externally. I think it also makes sense to
 have this reverse the normal order of compress first, then if it still
 doesn't fit store it externally. I forsee this typically being useful
 when you have fields that are between ~100 and 1000 bytes in size, and
 I'm doubtful that compression would do much good there. But I wouldn't
 rule out this being useful on fields that can also sometimes contain
 much larger amounts of data, so I don't think it makes sense to disable
 compression completely. So, I think this leaves two new options:
 
 SET STORAGE EXTERNAL [THRESHOLD x]
 If a field is over x in size, it's stored externally.
 
 SET STORAGE EXTENDED [THRESHOLD x]
 If a field is over x in size, it's stored externally. If it's over
 BLCKSZ/4 it will also be compressed (I think that's how things work
 now).
 
 Actually, that's rather ugly. I think it would be better to just break
 external storage and compression out into their own attributes:
 
 SET STORAGE [ ALLOW EXTERNAL [THRESHOLD x] ] [ ALLOW COMPRESSION [THRESHOLD 
 x] ]
 
 ALLOW EXTERNAL: if a field is larger than x bytes (BLCKSZ/4 by default)
 then it will be stored externally. May be specified along with ALLOW
 COMPRESSION.
 
 ALLOW COMPRESSION: if a field is larger than x bytes (BLCKSZ/4 by default)
 then it will be COMPRESSED. May be specified along with ALLOW EXTERNAL.
 --
 Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
 Pervasive Software  http://pervasive.comwork: 512-231-6117
 vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
 
 ---(end of broadcast)---
 TIP 3: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faq
 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  SRA OSS, Inc.   http://www.sraoss.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Vertical Partitioning with TOAST

2005-12-08 Thread Junji TERAMOTO
Hello all,

Thank you for having the interest.

Jim C. Nasby wrote:
 Valid point. I do think there's a lot of benefit to being able to set
 the limit much lower than what it currently defaults to today. We have a
 client that has a queue-type table that is updated very frequently. One
 of the fields is text, that is not updated as frequently. Keeping this
 table vacuumed well enough has proven to be problematic, because any
 delay to vacuuming quickly results in a very large amount of bloat.
 Moving that text field into a seperate table would most likely be a win.

Yes, our team think that this patch is effective that the tuple can be
partially updated.
For instance, DBT-2 updates frequently contents excluding c_data in the
customer table. Because c_data(about 400bytes: The size of the entire
tuple is 500bytes.) is copied together in every case, it is thought that
it has decreased the performance.
That is more important than the vertical partitioning function.

Of course, it is important to change DDL of the table. However, I think
it might be useful when it is not possible to change.

As pointed out by Tom, this is a patch to verify the idea.
I want to know that community is how much interested in a partial update.
Of course, it is interested whether to want the vertical partitioning
function in PostgreSQL, too. :-)


By the way, should I send the patch to -patches again?

-- 
Junji Teramoto / teramoto.junji (a) lab.ntt.co.jp

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Vertical Partitioning with TOAST

2005-12-08 Thread Bruce Momjian
Jim C. Nasby wrote:
 On Thu, Dec 08, 2005 at 12:59:47AM -0500, Tom Lane wrote:
  Jim C. Nasby [EMAIL PROTECTED] writes:
   This seems like a useful feature to add, allowing for easy built-in
   verticle partitioning. Are there issues with the patch as-is?
  
  Other than the ones mentioned by the poster?
  
  It seemed to me more like a not-too-successful experiment than something
  ready for application.  If you take the viewpoint that this is just
  another TOAST storage strategy, I think it's pretty useless.  A large
  field value is going to get toasted anyway with the regular strategy,
  and if your column happens to contain some values that are not large,
  forcing them out-of-line anyway is simply silly.  (You could make a case
  for making the threshold size user-controllable, but I don't see the
  case for setting the threshold to zero, which is what this amounts to.)
 
 Valid point. I do think there's a lot of benefit to being able to set
 the limit much lower than what it currently defaults to today. We have a
 client that has a queue-type table that is updated very frequently. One
 of the fields is text, that is not updated as frequently. Keeping this
 table vacuumed well enough has proven to be problematic, because any
 delay to vacuuming quickly results in a very large amount of bloat.
 Moving that text field into a seperate table would most likely be a win.
 
 Presumably this would need to be settable on at least a per-table basis.
 
 Would adding such a variable be a good beginner TODO, or is it too
 invasive?

Well, we have now:

   ALTER TABLE ALTER [ COLUMN ] column 
SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }

What else is needed?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Vertical Partitioning with TOAST

2005-12-08 Thread Jan Wieck

On 12/8/2005 1:42 PM, Jim C. Nasby wrote:


On Thu, Dec 08, 2005 at 10:03:43AM -0500, Bruce Momjian wrote:

Jim C. Nasby wrote:
 On Thu, Dec 08, 2005 at 12:59:47AM -0500, Tom Lane wrote:
  Jim C. Nasby [EMAIL PROTECTED] writes:
   This seems like a useful feature to add, allowing for easy built-in
   verticle partitioning. Are there issues with the patch as-is?
  
  Other than the ones mentioned by the poster?
  
  It seemed to me more like a not-too-successful experiment than something

  ready for application.  If you take the viewpoint that this is just
  another TOAST storage strategy, I think it's pretty useless.  A large
  field value is going to get toasted anyway with the regular strategy,
  and if your column happens to contain some values that are not large,
  forcing them out-of-line anyway is simply silly.  (You could make a case
  for making the threshold size user-controllable, but I don't see the
  case for setting the threshold to zero, which is what this amounts to.)
 
 Valid point. I do think there's a lot of benefit to being able to set

 the limit much lower than what it currently defaults to today. We have a
 client that has a queue-type table that is updated very frequently. One
 of the fields is text, that is not updated as frequently. Keeping this
 table vacuumed well enough has proven to be problematic, because any
 delay to vacuuming quickly results in a very large amount of bloat.
 Moving that text field into a seperate table would most likely be a win.
 
 Presumably this would need to be settable on at least a per-table basis.
 
 Would adding such a variable be a good beginner TODO, or is it too

 invasive?

Well, we have now:

   ALTER TABLE ALTER [ COLUMN ] column 
		SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }


What else is needed?


As Tom suggested, I think it would be best to be able to change the size
at which a field gets stored externally. I think it also makes sense to
have this reverse the normal order of compress first, then if it still
doesn't fit store it externally. I forsee this typically being useful
when you have fields that are between ~100 and 1000 bytes in size, and
I'm doubtful that compression would do much good there. But I wouldn't
rule out this being useful on fields that can also sometimes contain
much larger amounts of data, so I don't think it makes sense to disable
compression completely. So, I think this leaves two new options:


It's not the size of a field that triggers toasting. It is the size of 
the entire tuple. As long as that is  BLKSIZE/4, the toaster will pick 
the currently largest inline value and do something with it. 
something is either compressing or (if not allowed or already done) 
moving external.



Jan



SET STORAGE EXTERNAL [THRESHOLD x]
If a field is over x in size, it's stored externally.

SET STORAGE EXTENDED [THRESHOLD x]
If a field is over x in size, it's stored externally. If it's over
BLCKSZ/4 it will also be compressed (I think that's how things work
now).

Actually, that's rather ugly. I think it would be better to just break
external storage and compression out into their own attributes:

SET STORAGE [ ALLOW EXTERNAL [THRESHOLD x] ] [ ALLOW COMPRESSION [THRESHOLD x] ]

ALLOW EXTERNAL: if a field is larger than x bytes (BLCKSZ/4 by default)
then it will be stored externally. May be specified along with ALLOW
COMPRESSION.

ALLOW COMPRESSION: if a field is larger than x bytes (BLCKSZ/4 by default)
then it will be COMPRESSED. May be specified along with ALLOW EXTERNAL.
--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Vertical Partitioning with TOAST

2005-12-07 Thread Jim C. Nasby
This seems like a useful feature to add, allowing for easy built-in
verticle partitioning. Are there issues with the patch as-is? (Other
than it probably should have gone to -patches...)

On Thu, Dec 01, 2005 at 05:59:08PM +0900, Junji TERAMOTO wrote:
 Hi all,
 
 I wrote a experimental patch for a vertical partitioning
 function.
 
 I decided to use the code of TOAST to create the function
 easily. In a word, the row that the user specified is forcedly
 driven out with TOAST.
 
 The performance gain of 10% was seen by driving out c_data of the
 customer table in the DBT-2 benchmark in our environment.
 
 The mechanism of TOAST is an overdesigned system to use it for a
 vertical partitioning. Because the overhead of processing is large,
 the performance might down according to the environment.
 
 There are seriously a lot of things that should be considered if
 a vertical partitioning is mounted.
 For instance, TOAST index is omitted, and ctid is used for link.
 
 Your comments are welcome. Thanks.
 
 ---
 How To Use
 ---
 Use ALTER TABLE command.
 http://www.postgresql.org/docs/8.1/static/sql-altertable.html
 
  ALTER TABLE name ALTER COLUMN column SET STRAGE FORCEEXTERNAL;
 
 I do not understand whether FORCEEXTERNAL is an appropriate
 word. Please teach when there is a better word...
 
 
 -- 
 Junji Teramoto

 diff -purN postgresql-8.1.0.org/src/backend/access/heap/heapam.c 
 postgresql-8.1.0/src/backend/access/heap/heapam.c
 --- postgresql-8.1.0.org/src/backend/access/heap/heapam.c 2005-10-15 
 11:49:08.0 +0900
 +++ postgresql-8.1.0/src/backend/access/heap/heapam.c 2005-12-01 
 15:31:38.307713257 +0900
 @@ -1070,6 +1070,36 @@ heap_get_latest_tid(Relation relation,
   }   /* end of loop 
 */
  }
  
 +// Add by junji from here
 +/*
 + *   has_rel_forceexternal - Is there SET STORAGE FORCEEXTERNALed rows?
 + */
 +bool
 +has_rel_forceexternal(Relation relation)
 +{
 + TupleDesc   tupleDesc;
 + Form_pg_attribute *att;
 + int numAttrs;
 + int i;
 +
 + /*
 +  * Get the tuple descriptor and break down the tuple(s) into fields.
 +  */
 + tupleDesc = relation-rd_att;
 + att = tupleDesc-attrs;
 + numAttrs = tupleDesc-natts;
 +
 + for (i = 0; i  numAttrs; i++)
 + {
 + if (att[i]-attstorage == 'f')
 + return true;
 + }
 + 
 + return false;
 +}
 +// Add by junji to here
 +
 +
  /*
   *   heap_insert - insert tuple into a heap
   *
 @@ -1130,6 +1160,9 @@ heap_insert(Relation relation, HeapTuple
* out-of-line attributes from some other relation, invoke the toaster.
*/
   if (HeapTupleHasExternal(tup) ||
 +// Add by junji from here
 + (has_rel_forceexternal(relation)) ||
 +// Add by junji to here
   (MAXALIGN(tup-t_len)  TOAST_TUPLE_THRESHOLD))
   heap_tuple_toast_attrs(relation, tup, NULL);
  
 @@ -1762,6 +1795,9 @@ l2:
*/
   need_toast = (HeapTupleHasExternal(oldtup) ||
 HeapTupleHasExternal(newtup) ||
 +// Add by junji from here
 +   (has_rel_forceexternal(relation)) ||
 +// Add by junji to here
 (MAXALIGN(newtup-t_len)  
 TOAST_TUPLE_THRESHOLD));
  
   newtupsize = MAXALIGN(newtup-t_len);
 diff -purN postgresql-8.1.0.org/src/backend/access/heap/tuptoaster.c 
 postgresql-8.1.0/src/backend/access/heap/tuptoaster.c
 --- postgresql-8.1.0.org/src/backend/access/heap/tuptoaster.c 2005-10-15 
 11:49:09.0 +0900
 +++ postgresql-8.1.0/src/backend/access/heap/tuptoaster.c 2005-12-01 
 15:29:29.722579466 +0900
 @@ -512,6 +512,46 @@ toast_insert_or_update(Relation rel, Hea
   }
   }
  
 +// Add by junji from here
 + /*
 +  * We look for attributes of attstorage 'f'.
 +  */
 + if (rel-rd_rel-reltoastrelid != InvalidOid)
 + {
 + Datum   old_value;
 +
 + /*--
 +  * Search for the biggest yet inlined attribute with
 +  * attstorage equals 'x' or 'e'
 +  *--
 +  */
 + for (i = 0; i  numAttrs; i++)
 + {
 + if (toast_action[i] == 'p')
 + continue;
 + if (VARATT_IS_EXTERNAL(toast_values[i]))
 + continue;
 + if (att[i]-attstorage != 'f')
 + continue;
 +
 + /*
 +  * Store this external
 +  */
 + old_value = toast_values[i];
 + toast_action[i] = 'p';
 + toast_values[i] = toast_save_datum(rel, 
 toast_values[i]);
 + if (toast_free[i])
 + pfree(DatumGetPointer(old_value));
 +
 + 

Re: [HACKERS] Vertical Partitioning with TOAST

2005-12-07 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 This seems like a useful feature to add, allowing for easy built-in
 verticle partitioning. Are there issues with the patch as-is?

Other than the ones mentioned by the poster?

It seemed to me more like a not-too-successful experiment than something
ready for application.  If you take the viewpoint that this is just
another TOAST storage strategy, I think it's pretty useless.  A large
field value is going to get toasted anyway with the regular strategy,
and if your column happens to contain some values that are not large,
forcing them out-of-line anyway is simply silly.  (You could make a case
for making the threshold size user-controllable, but I don't see the
case for setting the threshold to zero, which is what this amounts to.)

The poster was not actually suggesting applying it in the form of a
force-external TOAST strategy; he was using this as a prototype to try
to interest people in the idea of out-of-line storage mechanisms with
lower overhead than TOAST.  But that part is all speculation not code.

Personally, I'd rather look into whether we couldn't speed up TOAST
without changing any of its basic assumptions.  The current
implementation isn't awful, but it was built to allow the existing table
and index mechanisms to be re-used for TOAST data.  Now that we know for
certain TOAST is a good idea, it would be reasonable to take a second
look at whether we could improve the performance with another round of
implementation effort.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Vertical Partitioning with TOAST

2005-12-07 Thread Jim C. Nasby
On Thu, Dec 08, 2005 at 12:59:47AM -0500, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  This seems like a useful feature to add, allowing for easy built-in
  verticle partitioning. Are there issues with the patch as-is?
 
 Other than the ones mentioned by the poster?
 
 It seemed to me more like a not-too-successful experiment than something
 ready for application.  If you take the viewpoint that this is just
 another TOAST storage strategy, I think it's pretty useless.  A large
 field value is going to get toasted anyway with the regular strategy,
 and if your column happens to contain some values that are not large,
 forcing them out-of-line anyway is simply silly.  (You could make a case
 for making the threshold size user-controllable, but I don't see the
 case for setting the threshold to zero, which is what this amounts to.)

Valid point. I do think there's a lot of benefit to being able to set
the limit much lower than what it currently defaults to today. We have a
client that has a queue-type table that is updated very frequently. One
of the fields is text, that is not updated as frequently. Keeping this
table vacuumed well enough has proven to be problematic, because any
delay to vacuuming quickly results in a very large amount of bloat.
Moving that text field into a seperate table would most likely be a win.

Presumably this would need to be settable on at least a per-table basis.

Would adding such a variable be a good beginner TODO, or is it too
invasive?

 Personally, I'd rather look into whether we couldn't speed up TOAST
 without changing any of its basic assumptions.  The current
 implementation isn't awful, but it was built to allow the existing table
 and index mechanisms to be re-used for TOAST data.  Now that we know for
 certain TOAST is a good idea, it would be reasonable to take a second
 look at whether we could improve the performance with another round of
 implementation effort.

I've often wondered about all the overhead of storing toast data in what
amounts to a regular table. Sounds like another TODO...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Vertical Partitioning with TOAST

2005-12-01 Thread Junji TERAMOTO
Hi all,

I wrote a experimental patch for a vertical partitioning
function.

I decided to use the code of TOAST to create the function
easily. In a word, the row that the user specified is forcedly
driven out with TOAST.

The performance gain of 10% was seen by driving out c_data of the
customer table in the DBT-2 benchmark in our environment.

The mechanism of TOAST is an overdesigned system to use it for a
vertical partitioning. Because the overhead of processing is large,
the performance might down according to the environment.

There are seriously a lot of things that should be considered if
a vertical partitioning is mounted.
For instance, TOAST index is omitted, and ctid is used for link.

Your comments are welcome. Thanks.

---
How To Use
---
Use ALTER TABLE command.
http://www.postgresql.org/docs/8.1/static/sql-altertable.html

 ALTER TABLE name ALTER COLUMN column SET STRAGE FORCEEXTERNAL;

I do not understand whether FORCEEXTERNAL is an appropriate
word. Please teach when there is a better word...


-- 
Junji Teramoto
diff -purN postgresql-8.1.0.org/src/backend/access/heap/heapam.c 
postgresql-8.1.0/src/backend/access/heap/heapam.c
--- postgresql-8.1.0.org/src/backend/access/heap/heapam.c   2005-10-15 
11:49:08.0 +0900
+++ postgresql-8.1.0/src/backend/access/heap/heapam.c   2005-12-01 
15:31:38.307713257 +0900
@@ -1070,6 +1070,36 @@ heap_get_latest_tid(Relation relation,
}   /* end of loop 
*/
 }
 
+// Add by junji from here
+/*
+ * has_rel_forceexternal - Is there SET STORAGE FORCEEXTERNALed rows?
+ */
+bool
+has_rel_forceexternal(Relation relation)
+{
+   TupleDesc   tupleDesc;
+   Form_pg_attribute *att;
+   int numAttrs;
+   int i;
+
+   /*
+* Get the tuple descriptor and break down the tuple(s) into fields.
+*/
+   tupleDesc = relation-rd_att;
+   att = tupleDesc-attrs;
+   numAttrs = tupleDesc-natts;
+
+   for (i = 0; i  numAttrs; i++)
+   {
+   if (att[i]-attstorage == 'f')
+   return true;
+   }
+   
+   return false;
+}
+// Add by junji to here
+
+
 /*
  * heap_insert - insert tuple into a heap
  *
@@ -1130,6 +1160,9 @@ heap_insert(Relation relation, HeapTuple
 * out-of-line attributes from some other relation, invoke the toaster.
 */
if (HeapTupleHasExternal(tup) ||
+// Add by junji from here
+   (has_rel_forceexternal(relation)) ||
+// Add by junji to here
(MAXALIGN(tup-t_len)  TOAST_TUPLE_THRESHOLD))
heap_tuple_toast_attrs(relation, tup, NULL);
 
@@ -1762,6 +1795,9 @@ l2:
 */
need_toast = (HeapTupleHasExternal(oldtup) ||
  HeapTupleHasExternal(newtup) ||
+// Add by junji from here
+ (has_rel_forceexternal(relation)) ||
+// Add by junji to here
  (MAXALIGN(newtup-t_len)  
TOAST_TUPLE_THRESHOLD));
 
newtupsize = MAXALIGN(newtup-t_len);
diff -purN postgresql-8.1.0.org/src/backend/access/heap/tuptoaster.c 
postgresql-8.1.0/src/backend/access/heap/tuptoaster.c
--- postgresql-8.1.0.org/src/backend/access/heap/tuptoaster.c   2005-10-15 
11:49:09.0 +0900
+++ postgresql-8.1.0/src/backend/access/heap/tuptoaster.c   2005-12-01 
15:29:29.722579466 +0900
@@ -512,6 +512,46 @@ toast_insert_or_update(Relation rel, Hea
}
}
 
+// Add by junji from here
+   /*
+* We look for attributes of attstorage 'f'.
+*/
+   if (rel-rd_rel-reltoastrelid != InvalidOid)
+   {
+   Datum   old_value;
+
+   /*--
+* Search for the biggest yet inlined attribute with
+* attstorage equals 'x' or 'e'
+*--
+*/
+   for (i = 0; i  numAttrs; i++)
+   {
+   if (toast_action[i] == 'p')
+   continue;
+   if (VARATT_IS_EXTERNAL(toast_values[i]))
+   continue;
+   if (att[i]-attstorage != 'f')
+   continue;
+
+   /*
+* Store this external
+*/
+   old_value = toast_values[i];
+   toast_action[i] = 'p';
+   toast_values[i] = toast_save_datum(rel, 
toast_values[i]);
+   if (toast_free[i])
+   pfree(DatumGetPointer(old_value));
+
+   toast_free[i] = true;
+   toast_sizes[i] = VARATT_SIZE(toast_values[i]);
+
+   need_change = true;
+   need_free = true;
+   }
+   }
+// Add by junji to here
+
/* --
 *