My next suggestion would be to leave EXTERN_TUPLES_PER_PAGE as is, but:
- split data wider than a page into page-sized chunks, as long as they fill whole pages;
- split the rest with EXTERN_TUPLES_PER_PAGE (4) as now.
This would not waste more space than we do currently, but it would improve performance. (A sketch of the rule follows below.)
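To make that concrete, here is a minimal self-contained C sketch of the proposed two-tier rule. PAGE_CHUNK_SIZE and SMALL_CHUNK_SIZE are illustrative stand-ins, not the real tuptoaster constants, which are computed down from BLCKSZ to leave room for page and tuple headers:

    #include <stddef.h>

    #define PAGE_CHUNK_SIZE  8000   /* ~ one whole page per chunk (illustrative) */
    #define SMALL_CHUNK_SIZE 2000   /* ~ EXTERN_TUPLES_PER_PAGE = 4, as now */

    static size_t
    next_chunk_size(size_t bytes_remaining)
    {
        /* emit page-sized chunks while they fill whole pages ... */
        if (bytes_remaining >= PAGE_CHUNK_SIZE)
            return PAGE_CHUNK_SIZE;
        /* ... then split the rest into quarter-page chunks as today */
        if (bytes_remaining > SMALL_CHUNK_SIZE)
            return SMALL_CHUNK_SIZE;
        return bytes_remaining;
    }

The point is that only the tail of a large value is ever stored in sub-page chunks, so the per-page waste cannot exceed what we have today.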
Gregory Stark wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
Well, it is summarized here:
http://momjian.us/expire/TOAST/SUMMARY.html
It made non-TOAST access 2x faster, but TOAST 7x slower, and that seemed
like a good compromise.
Is this still testing with all data fitting in RAM?
Bruce Momjian [EMAIL PROTECTED] writes:
Gregory Stark wrote:
Is this still testing with all data fitting in RAM?
Yes. Having things out of RAM is going to make access even slower, but
it is going to allow the heap to be in RAM more often.
It would let us measure the actual impact of ...
Zeugswetter Andreas ADI SD wrote:
No, you misunderstood. Bruce was suggesting changing the target to 512. That means if a row is wider than ~2k, the toaster will try to toast until the base row is ~512 bytes. I would not do that part for 8.3.
OK, what do you suggest for 8.3? Attached are my suggestion to use 512 and a ...
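For readers following along, a minimal self-contained sketch of the threshold/target behavior described above. row_size() and move_largest_attr_to_toast() are hypothetical helpers, and the constants are round approximations of the real macros:

    #include <stddef.h>

    #define TOAST_TUPLE_THRESHOLD 2000   /* approx: rows wider than ~2k trigger toasting */
    #define TOAST_TUPLE_TARGET     512   /* proposed: shrink the base row to ~512 bytes */

    typedef struct Row Row;                           /* hypothetical row type */
    extern size_t row_size(const Row *row);           /* hypothetical */
    extern int move_largest_attr_to_toast(Row *row);  /* hypothetical; 0 = nothing toastable left */

    static void
    toast_row_sketch(Row *row)
    {
        if (row_size(row) <= TOAST_TUPLE_THRESHOLD)
            return;                                   /* narrow rows stay fully inline */

        /* keep pushing the widest attribute out until the target is met */
        while (row_size(row) > TOAST_TUPLE_TARGET)
            if (!move_largest_attr_to_toast(row))
                break;
    }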
While I agree that 2 might be a good compromise with low risk for now, I think that toasting all rows down to ~512 bytes is too narrowly targeted at not reading wider columns.
The big question is do we want to drop the target tuple size down to 512, and increase the chunk size to 8k for 8.3? Dropping the tuple size down to 512 is going to give us some smaller TOAST values to fill in free space created by the 8k chunk size, assuming you have both types ...
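Rough arithmetic behind that question, as a self-contained C sketch. The chunk sizes are round illustrative numbers; the real ones are slightly smaller to fit tuple headers:

    #include <stddef.h>
    #include <stdio.h>

    /* how many chunks a datum needs, and how short the last one is */
    static void
    leftover(size_t datum, size_t chunk)
    {
        size_t nchunks = (datum + chunk - 1) / chunk;
        size_t last = datum - (nchunks - 1) * chunk;

        printf("%zu-byte datum, %zu-byte chunks: %zu chunks, last one %zu bytes\n",
               datum, chunk, nchunks, last);
    }

    int
    main(void)
    {
        leftover(10000, 8000);  /* 8k chunks: last chunk 2000 bytes, ~6k hole */
        leftover(10000, 2000);  /* 2k chunks: all five chunks full */
        return 0;
    }

With 8k chunks, the last page of a big value is mostly empty, which is exactly the space that the ~512-byte values produced by a lower target could fill.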
On Fri, Jun 01, 2007 at 01:50:12PM -0400, Bruce Momjian wrote:
I think the long-term solution is to go to a 2k/8k fragment/block model,
but that isn't going to happen for 8.3.
There might well have been lessons learned since UFS (anyone know what ZFS does in this regard?), but I agree that we ...
Jim C. Nasby [EMAIL PROTECTED] writes:
On Fri, Jun 01, 2007 at 01:50:12PM -0400, Bruce Momjian wrote:
The big question is do we want to drop the target tuple size down to 512, and increase the chunk size to 8k for 8.3?
If we do that people could see their disk space usage increase by up to ...
Gregory Stark wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
shared_buffers again was 32MB so all the data was in memory.
The case where all the data is in memory is simply not interesting. The cost of TOAST is the random access seeks it causes. You seem to be intentionally avoiding testing the precise thing we're ...
Zeugswetter Andreas ADI SD wrote:
It is good. It shows that we even see a small advantage in the everything-cached case.
What we don't have yet is numbers for whether EXTERN_TUPLES_PER_PAGE=1 substantially increases the toast table size for real-life scenarios, and what happens in the worst case ...
I tested EXTERN_TUPLES_PER_PAGE for values 4 (default), 2, and 1:

    4    15.596
    2    15.197
    1    14.6

which is basically a 3% decrease at each step (4 to 2, and 2 to 1). The test script and results are here:
http://momjian.us/expire/TOAST2/
shared_buffers again was 32MB so all the data was in memory.
I reran the tests with hashtext(), and created a SUMMARY.HTML chart:
http://momjian.us/expire/TOAST/
What you will see is that pushing TEXT to a TOAST column allows quick access to non-TOAST values and single-row TOAST values, but accessing all TOAST columns is slower than accessing them in the heap, by a factor of 3-18x.
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
My expectation would be that we want to allow a toast tuple to fill a whole page (TOAST_TUPLES_PER_PAGE = 1),
I've been wondering about that too. It certainly needs to be experimented with, independently of TOAST_TUPLE_THRESHOLD, now that ...
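Two similarly named knobs keep coming up in this thread: TOAST_TUPLES_PER_PAGE drives TOAST_TUPLE_THRESHOLD, the heap row width that triggers toasting, while EXTERN_TUPLES_PER_PAGE drives the chunk size used inside the toast table. A self-contained approximation, ignoring the header overhead the real macros subtract:

    #include <stdio.h>

    #define BLCKSZ 8192

    int
    main(void)
    {
        /* halving the tuples-per-page divisor doubles the derived size */
        for (int per_page = 4; per_page >= 1; per_page /= 2)
            printf("%d tuple(s)/page => ~%d bytes\n",
                   per_page, BLCKSZ / per_page);
        return 0;
    }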
Tom Lane [EMAIL PROTECTED] writes:
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
My expectation would be that we want to allow a toast tuple to fill a whole page (TOAST_TUPLES_PER_PAGE = 1),
I thought that previously, but then I started thinking maybe that's not true. Picture ...
Gregory Stark [EMAIL PROTECTED] writes:
Whereas if you set toast_tuples_per_page to 8k then the only option for
Postgres will be to put each datum in its own page and waste 1-3k on every
page.
No, because actually the code is designed to make the toast chunk size just enough less than 8K that the tuples fit.
Tom Lane [EMAIL PROTECTED] writes:
The shorter-than-normal tuples carrying the last chunk of any particular datum are going to result in wasted space to the extent that we can't pack them together on a page, but that's true now. Right now, if you have a large toasted datum, it mostly will ...
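Here is that "just enough less than 8K" point in self-contained form: the chunk size is computed down from the page size so that a whole number of chunk tuples, headers included, just fits. The overhead constants below are approximations of the real ones in the tuptoaster header:

    #include <stdio.h>

    #define BLCKSZ          8192
    #define PAGE_HEADER       24   /* approx SizeOfPageHeaderData */
    #define LINE_POINTER       4   /* sizeof(ItemIdData) */
    #define TUPLE_OVERHEAD    32   /* approx MAXALIGN'd heap tuple header */
    #define CHUNK_METADATA    12   /* chunk_id Oid + chunk_seq int32 + varlena header */

    static int
    max_chunk_size(int tuples_per_page)
    {
        int usable = BLCKSZ - PAGE_HEADER - tuples_per_page * LINE_POINTER;

        return usable / tuples_per_page - TUPLE_OVERHEAD - CHUNK_METADATA;
    }

    int
    main(void)
    {
        /* one chunk per page: just under 8K, so a full chunk tuple
         * exactly fills a page rather than spilling onto a second one */
        printf("1 per page: ~%d bytes\n", max_chunk_size(1));
        printf("4 per page: ~%d bytes\n", max_chunk_size(4));
        return 0;
    }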
Bruce Momjian wrote:
What you will see is that pushing TEXT to a TOAST column allows quick
access to non-TOAST values and single-row TOAST values, but accessing
all TOAST columns is slower than accessing them in the heap, by a factor
of 3-18x.
Looking at the chart, it seems 512 is the proper ...
Bruce Momjian [EMAIL PROTECTED] writes:
Uh, am I supposed to be running more TOAST tests? Would someone explain
what they want tested?
If you want my opinion I would say we need two tests:
1) For TOAST_TUPLE_TARGET: we need to run the test scripts you have already for sizes that cause ...
Bruce Momjian [EMAIL PROTECTED] writes:
I tested TOAST using a method similar to the above method against CVS HEAD, with default shared_buffers = 32MB and no assert()s. I created backends with power-of-2 settings for TOAST_TUPLES_PER_PAGE (4 (default), 8, 16, 32, 64), which gives ...
Gregory Stark [EMAIL PROTECTED] writes:
Is your database initialized with C locale? If so then length(text) is optimized to not have to detoast:

    if (pg_database_encoding_max_length() == 1)
        PG_RETURN_INT32(toast_raw_datum_size(str) - VARHDRSZ);

Of course I got that wrong.
Alvaro Herrera wrote:
Strangely, 128 bytes seems to be the break-even point for TOAST and
non-TOAST, even for sequential scans of the entire heap touching all
long row values. I am somewhat confused why TOAST has faster access
than inline heap data.
Is your database initialized with C locale?
Bruce Momjian [EMAIL PROTECTED] writes:
Gregory Stark wrote:
Is your database initialized with C locale? If so then length(text) is optimized to not have to detoast:

    if (pg_database_encoding_max_length() == 1)
        PG_RETURN_INT32(toast_raw_datum_size(str) - VARHDRSZ);

Wow, ...
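For context, a paraphrased sketch of the shape of that optimization (not the exact source): for single-byte encodings the character count equals the byte count, and toast_raw_datum_size() needs only the toast pointer, so the datum is never detoasted:

    #include "postgres.h"
    #include "fmgr.h"
    #include "access/tuptoaster.h"
    #include "mb/pg_wchar.h"

    static Datum
    textlen_sketch(PG_FUNCTION_ARGS)
    {
        Datum str = PG_GETARG_DATUM(0);

        if (pg_database_encoding_max_length() == 1)
        {
            /* single-byte encoding: length from the stored size alone */
            PG_RETURN_INT32(toast_raw_datum_size(str) - VARHDRSZ);
        }
        else
        {
            /* multibyte: must detoast and walk the characters */
            text *t = DatumGetTextP(str);

            PG_RETURN_INT32(pg_mbstrlen_with_len(VARDATA(t),
                                                 VARSIZE(t) - VARHDRSZ));
        }
    }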
Bruce Momjian wrote:
My test uses random data, which I figured was as close to real-world as I could get, and I have a test that makes sure the data was pushed to the TOAST table. Should I still try EXTERNAL?
My point is that you probably want to measure separately the effect of compression ...
Bruce Momjian [EMAIL PROTECTED] writes:
No, we did substring() too :)
Uh, I looked at text_substring(), and while there is an optimization to do character counting for encoding length == 1, it is still accessing the data.
Sure, but it'll only access the first chunk. There are two chunks in ...
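Sketch of that behavior: a hypothetical function using the slice interface, which detoasts only the chunks overlapping the requested byte range, so a short prefix reads a single ~2K chunk no matter how big the value is. first_100_bytes is an illustrative name, not a real builtin:

    #include "postgres.h"
    #include "fmgr.h"

    PG_FUNCTION_INFO_V1(first_100_bytes);   /* hypothetical example function */

    Datum
    first_100_bytes(PG_FUNCTION_ARGS)
    {
        /* fetch only bytes 0..99: at the default chunk size this
         * detoasts just the first chunk of the toast table entry */
        text *prefix = PG_GETARG_TEXT_P_SLICE(0, 0, 100);

        PG_RETURN_TEXT_P(prefix);
    }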
Bruce Momjian [EMAIL PROTECTED] writes:
I reran the tests with hashtext(), and created a SUMMARY.HTML chart:
http://momjian.us/expire/TOAST/
I don't understand what the numbers in this chart are?
regards, tom lane
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
I reran the tests with hashtext(), and created a SUMMARY.HTML chart:
http://momjian.us/expire/TOAST/
I don't understand what the numbers in this chart are?
They are taken from the test script and output files that are also in ...
[EMAIL PROTECTED] (Bruce Momjian) writes:
The results are here:
http://momjian.us/expire/TOAST/
I'll take a look and see if there's anything further it makes sense
for me to try testing. Thanks for following up so quickly; what with
the cold I have had, I haven't yet gotten back to the