Re: [PATCHES] Fix pgstatindex using for large indexes
Tatsuhito Kasahara wrote: Tatsuhito Kasahara wrote: I fix the patch. Oops, I forgot to attach the patch for pgstattuple.sql. I send it again. Hmm, this followup patch is wrong though -- the SQL definition is still using BIGINT where it should be using double. And the other changes to use BIGINT where the original values were int4 seem unnecessary. One thing I'm not clear about is the change from %d to %u to represent int4 values. Since the SQL datatype is signed, this can't really work, now, can it? -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Fix pgstatindex using for large indexes
Tatsuhito Kasahara [EMAIL PROTECTED] writes: Tom Lane wrote: Most places where we've dealt with this before, we use double, which is guaranteed to be available whereas uint64 is not ... Oh I see. I fix the patch. # I changed max_avail and free_space to double. I took a closer look at this, and noticed that you were still redefining the output parameters of the function as BIGINT: *** *** 33,48 -- pgstatindex -- CREATE OR REPLACE FUNCTION pgstatindex(IN relname text, ! OUT version int4, ! OUT tree_level int4, ! OUT index_size int4, ! OUT root_block_no int4, ! OUT internal_pages int4, ! OUT leaf_pages int4, ! OUT empty_pages int4, ! OUT deleted_pages int4, ! OUT avg_leaf_density float8, ! OUT leaf_fragmentation float8) AS 'MODULE_PATHNAME', 'pgstatindex' LANGUAGE C STRICT; --- 33,48 -- pgstatindex -- CREATE OR REPLACE FUNCTION pgstatindex(IN relname text, ! OUT version INT, ! OUT tree_level INT, ! OUT index_size BIGINT, ! OUT root_block_no INT, ! OUT internal_pages BIGINT, ! OUT leaf_pages BIGINT, ! OUT empty_pages BIGINT, ! OUT deleted_pages BIGINT, ! OUT avg_leaf_density FLOAT8, ! OUT leaf_fragmentation FLOAT8) AS 'MODULE_PATHNAME', 'pgstatindex' LANGUAGE C STRICT; This is inconsistent --- if int64 isn't actually available, it's not likely to work very well. It seems to me that we should either change the output parameters to float8, or stick with the original version of the patch and just accept that it will give overflowed answers on machines without working int64. Given that there is no problem until you get into multi-terabyte indexes, which are hardly likely to be getting pushed around on ancient systems, it's hard to argue that there's really any case against using bigint. Also I now see that pgstattuple() is using bigint for numbers that are really much more likely to overflow a broken int64 than pgstatindex() is, so the argument for float would require us to change both functions. In short, I'm willing to drop my opposition to the original form of the patch. regards, tom lane -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Fix pgstatindex using for large indexes
Alvaro Herrera [EMAIL PROTECTED] writes: Hmm, this followup patch is wrong though -- the SQL definition is still using BIGINT where it should be using double. And the other changes to use BIGINT where the original values were int4 seem unnecessary. I'm on this one now ... regards, tom lane -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Fix pgstatindex using for large indexes
I wrote: In short, I'm willing to drop my opposition to the original form of the patch. Original version applied with some minor tweaks (notably, pg_relpages had the same problem). regards, tom lane -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] Fix pgstatindex using for large indexes
On Mon, Feb 25, 2008 at 11:50:11AM -0500, Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: Is there any currently supported platform which does not have uint64? I don't know, and neither do you. Maybe we should look at some reasonable way of getting the info out of a compiled instance. How about if we get pg_config to output the value of INT64_IS_BUSTED? We know all the buildfarm machines have working int64, because they'd fail the bigint regression test if not. That's not the point here. If we don't have buildfarm coverage for machines where INT64_IS_BUSTED, how do we know we support those architectures at all? Cheers, David. -- David Fetter [EMAIL PROTECTED] http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: [EMAIL PROTECTED] Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] Fix pgstatindex using for large indexes
Tom Lane napsal(a): Florian G. Pflug [EMAIL PROTECTED] writes: Maybe we should just bite the bullet, and implement int64 emulation for platforms that don't provide one? Why? Workarounds such as use double where needed have served us perfectly fine so far, with far less effort and notational ugliness than this would involve. Disadvantage of double usage is in CPUs. Current CPUs are perfectly optimized for int operation but there are limitation for floating point operatim. For example T1 (niagara) has only one FP unit per whole CPU (T2 has one FP per core) and AMD64 has three parallel execution unit for microcode but only one for FP and FP shares registry with MMX unit. And operation system has more work with saving FP register and so on... Not all these limitations are related to PostgreSQL but it is good to know. By my opinion avoid double is important in critical(bottleneck) places. How you mentioned double is OK for pgstattuple. There will come a time where either there's a really good reason to rely on int64, or we feel that it's moot because any platform without int64 is certainly dead anyway. I'm not sure how far off that time is, but it's probably some fairly small number of years. I don't think support more than five years older platform makes sense for new version of PostgreSQL. Very often users buy a new hardware for new database. IIRC all platform (x86, SPARC, MIPS, ALPHA, PARISC ...) do not have problem with 64bit for longer time. Zdenek ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] Fix pgstatindex using for large indexes
Tom Lane napsal(a): Tatsuhito Kasahara [EMAIL PROTECTED] writes: In pgstatindex.c and pgstattuple.sql, some variables are defined with int type. So when we try to get informations about a large index by using pgstatindex, we get strange value of size and density. Because the values exceed int-max. ... I think that max_avail and free_space should be uint64. Most places where we've dealt with this before, we use double, which is guaranteed to be available whereas uint64 is not ... Is this requirement still valid? Is there any currently supported platform which does not have uint64? IIRC we are going to change datetime to integer for 8.4. By my opinion uint64 is suitable for head and probably for 8.2 and 8.3 as well. 64bit integer is already used on many places: e.g. ./commands/copy.c ./tcop/utility.c. ./optimizer/plan/planner.c Zdenek ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Fix pgstatindex using for large indexes
Tom Lane wrote: Is there any currently supported platform which does not have uint64? I don't know, and neither do you. Maybe we should look at some reasonable way of getting the info out of a compiled instance. How about if we get pg_config to output the value of INT64_IS_BUSTED? Then we could add a step to the buildfarm client to catch all the output from pg_config. cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] Fix pgstatindex using for large indexes
Zdenek Kotala [EMAIL PROTECTED] writes: Tom Lane napsal(a): Most places where we've dealt with this before, we use double, which is guaranteed to be available whereas uint64 is not ... Is this requirement still valid? Yes. Is there any currently supported platform which does not have uint64? I don't know, and neither do you. IIRC we are going to change datetime to integer for 8.4. We are going to change the *default* to integer. It will still be possible to select float datetimes for use on platforms without working int64. There is not any core functionality that will fail completely without int64, and none will be introduced anytime soon if I have anything to say about it. Now having said that, there are places that use int64 with the understanding that it might degrade to int32, and that that will be good enough --- the statistics counters are an example. However, the only point of the patch being presented for pgstatindex is to be able to count higher than 2^32, so ISTM we may as well make it use double. There isn't any particular reason it *shouldn't* be coded that way, AFAICS, and there is precedent in that VACUUM/ANALYZE use doubles for similar purposes. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Fix pgstatindex using for large indexes
Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: Is there any currently supported platform which does not have uint64? I don't know, and neither do you. Maybe we should look at some reasonable way of getting the info out of a compiled instance. How about if we get pg_config to output the value of INT64_IS_BUSTED? We know all the buildfarm machines have working int64, because they'd fail the bigint regression test if not. That's not the point here. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] Fix pgstatindex using for large indexes
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Mon, 25 Feb 2008 11:21:18 -0500 Tom Lane [EMAIL PROTECTED] wrote: IIRC we are going to change datetime to integer for 8.4. We are going to change the *default* to integer. Thank goodness. Now I can stop recompiling rpms. Thanks for this - -hackers. Joshua D. Drake - -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL SPI Liaison | SPI Director | PostgreSQL political pundit -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFHwvkdATb/zqfZUUQRAnl8AJ4ou850ROztMxHZyGeUaD/uXQRMPACeMN9x 72FC2K3hKKV/Aq+FPWI8UJ4= =EeIA -END PGP SIGNATURE- ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] Fix pgstatindex using for large indexes
Tom Lane wrote: Zdenek Kotala [EMAIL PROTECTED] writes: Tom Lane napsal(a): Most places where we've dealt with this before, we use double, which is guaranteed to be available whereas uint64 is not ... Is this requirement still valid? Yes. Maybe we should just bite the bullet, and implement int64 emulation for platforms that don't provide one? I was thinking that something like typedef struct { int32 low, int32 high } int64, plus a few Macros for basic arithmetic should do the trick. Not particularly rewarding work, given that all major platforms *do* support int64 - but it'd prevent the discussion that starts everytime someone proposes a patch that depends on int64. regards, Florian Pflug ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] Fix pgstatindex using for large indexes
Florian G. Pflug [EMAIL PROTECTED] writes: Maybe we should just bite the bullet, and implement int64 emulation for platforms that don't provide one? Why? Workarounds such as use double where needed have served us perfectly fine so far, with far less effort and notational ugliness than this would involve. There will come a time where either there's a really good reason to rely on int64, or we feel that it's moot because any platform without int64 is certainly dead anyway. I'm not sure how far off that time is, but it's probably some fairly small number of years. My position is simply that pgstattuple does not present a reason to make that decision today, especially not when making it rely on int64 is at variance with the coding method already in use in related parts of the core backend. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] Fix pgstatindex using for large indexes
Hi. Tom Lane wrote: I think that max_avail and free_space should be uint64. Most places where we've dealt with this before, we use double, which is guaranteed to be available whereas uint64 is not ... Oh I see. I fix the patch. # I changed max_avail and free_space to double. Best regards. -- NTT OSS Center Tatsuhito Kasahara kasahara.tatsuhito _at_ oss.ntt.co.jp *** postgresql-8.3.0.org/contrib/pgstattuple/pgstatindex.c 2007-11-16 06:14:31.0 +0900 --- postgresql-8.3.0/contrib/pgstattuple/pgstatindex.c 2008-02-24 19:35:09.0 +0900 *** *** 68,75 uint32 empty_pages; uint32 deleted_pages; ! uint32 max_avail; ! uint32 free_space; uint32 fragments; } BTIndexStat; --- 68,75 uint32 empty_pages; uint32 deleted_pages; ! double max_avail; ! double free_space; uint32 fragments; } BTIndexStat; *** *** 87,94 Relationrel; RangeVar *relrv; Datum result; ! uint32 nblocks; ! uint32 blkno; BTIndexStat indexStat; if (!superuser()) --- 87,94 Relationrel; RangeVar *relrv; Datum result; ! BlockNumber nblocks; ! BlockNumber blkno; BTIndexStat indexStat; if (!superuser()) *** *** 207,231 values[j] = palloc(32); snprintf(values[j++], 32, %d, indexStat.level); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, (indexStat.root_pages + ! indexStat.leaf_pages + ! indexStat.internal_pages + ! indexStat.deleted_pages + ! indexStat.empty_pages) * BLCKSZ); values[j] = palloc(32); snprintf(values[j++], 32, %d, indexStat.root_blkno); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.internal_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.leaf_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.empty_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.deleted_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %.2f, 100.0 - (float) indexStat.free_space / (float) indexStat.max_avail * 100.0); values[j] = palloc(32); ! snprintf(values[j++], 32, %.2f, (float) indexStat.fragments / (float) indexStat.leaf_pages * 100.0); tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc), values); --- 207,231 values[j] = palloc(32); snprintf(values[j++], 32, %d, indexStat.level); values[j] = palloc(32); ! snprintf(values[j++], 32, %.0f, ( (double) indexStat.root_pages + ! (double) indexStat.leaf_pages + ! (double) indexStat.internal_pages + ! (double) indexStat.deleted_pages + ! (double) indexStat.empty_pages) * BLCKSZ); values[j] = palloc(32); snprintf(values[j++], 32, %d, indexStat.root_blkno); values[j] = palloc(32); ! snprintf(values[j++], 32, %u, indexStat.internal_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %u, indexStat.leaf_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %u, indexStat.empty_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %u, indexStat.deleted_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %.2f, 100.0 - indexStat.free_space / indexStat.max_avail * 100.0); values[j] = palloc(32); ! snprintf(values[j++], 32, %.2f, (double) indexStat.fragments / (double) indexStat.leaf_pages * 100.0); tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc), values);
Re: [PATCHES] Fix pgstatindex using for large indexes
Tatsuhito Kasahara wrote: I fix the patch. Oops, I forgot to attach the patch for pgstattuple.sql. I send it again. Best regards. -- NTT OSS Center Tatsuhito Kasahara kasahara.tatsuhito _at_ oss.ntt.co.jp diff -crN postgresql-8.3.0.org/contrib/pgstattuple/pgstatindex.c postgresql-8.3.0/contrib/pgstattuple/pgstatindex.c *** postgresql-8.3.0.org/contrib/pgstattuple/pgstatindex.c 2007-11-16 06:14:31.0 +0900 --- postgresql-8.3.0/contrib/pgstattuple/pgstatindex.c 2008-02-24 19:35:09.0 +0900 *** *** 68,75 uint32 empty_pages; uint32 deleted_pages; ! uint32 max_avail; ! uint32 free_space; uint32 fragments; } BTIndexStat; --- 68,75 uint32 empty_pages; uint32 deleted_pages; ! double max_avail; ! double free_space; uint32 fragments; } BTIndexStat; *** *** 87,94 Relationrel; RangeVar *relrv; Datum result; ! uint32 nblocks; ! uint32 blkno; BTIndexStat indexStat; if (!superuser()) --- 87,94 Relationrel; RangeVar *relrv; Datum result; ! BlockNumber nblocks; ! BlockNumber blkno; BTIndexStat indexStat; if (!superuser()) *** *** 207,231 values[j] = palloc(32); snprintf(values[j++], 32, %d, indexStat.level); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, (indexStat.root_pages + ! indexStat.leaf_pages + ! indexStat.internal_pages + ! indexStat.deleted_pages + ! indexStat.empty_pages) * BLCKSZ); values[j] = palloc(32); snprintf(values[j++], 32, %d, indexStat.root_blkno); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.internal_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.leaf_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.empty_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.deleted_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %.2f, 100.0 - (float) indexStat.free_space / (float) indexStat.max_avail * 100.0); values[j] = palloc(32); ! snprintf(values[j++], 32, %.2f, (float) indexStat.fragments / (float) indexStat.leaf_pages * 100.0); tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc), values); --- 207,231 values[j] = palloc(32); snprintf(values[j++], 32, %d, indexStat.level); values[j] = palloc(32); ! snprintf(values[j++], 32, %.0f, ( (double) indexStat.root_pages + ! (double) indexStat.leaf_pages + ! (double) indexStat.internal_pages + ! (double) indexStat.deleted_pages + ! (double) indexStat.empty_pages) * BLCKSZ); values[j] = palloc(32); snprintf(values[j++], 32, %d, indexStat.root_blkno); values[j] = palloc(32); ! snprintf(values[j++], 32, %u, indexStat.internal_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %u, indexStat.leaf_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %u, indexStat.empty_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %u, indexStat.deleted_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %.2f, 100.0 - indexStat.free_space / indexStat.max_avail * 100.0); values[j] = palloc(32); ! snprintf(values[j++], 32, %.2f, (double) indexStat.fragments / (double) indexStat.leaf_pages * 100.0); tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc), values); diff -crN
[PATCHES] Fix pgstatindex using for large indexes
Hi. In pgstatindex.c and pgstattuple.sql, some variables are defined with int type. So when we try to get informations about a large index by using pgstatindex, we get strange value of size and density. Because the values exceed int-max. # Like following output. I used pgstatindex just after data load. So density is should be nearly 90. test=# SELECT * FROM pgstatindex('large_index'); -[ RECORD 1 ]--+ version| 2 tree_level | 4 index_size | -1349410816 ★ root_block_no | 119666 internal_pages | 28936 leaf_pages | 1379204 empty_pages| 0 deleted_pages | 0 avg_leaf_density | 60.33 ★ leaf_fragmentation | 0 I think that max_avail and free_space should be uint64. And the output format for index_size should be %lld (INT64_FORMAT). I made the patch and tryed it. (And it seemed OK.) test=# SELECT * FROM pgstatindex('large_index'); -[ RECORD 1 ]--+ version| 2 tree_level | 4 index_size | 11535491072 root_block_no | 119666 internal_pages | 28936 leaf_pages | 1379204 empty_pages| 0 deleted_pages | 0 avg_leaf_density | 90.64 leaf_fragmentation | 0 I also fix *_pages variables just in case. Please confirm this. Best regards. -- NTT OSS Center Tatsuhito Kasahara kasahara.tatsuhito _at_ oss.ntt.co.jp diff -crN postgresql-8.3.0.org/contrib/pgstattuple/pgstatindex.c postgresql-8.3.0/contrib/pgstattuple/pgstatindex.c *** postgresql-8.3.0.org/contrib/pgstattuple/pgstatindex.c 2007-11-16 06:14:31.0 +0900 --- postgresql-8.3.0/contrib/pgstattuple/pgstatindex.c 2008-02-21 22:34:40.0 +0900 *** *** 63,77 uint32 level; uint32 root_pages; ! uint32 internal_pages; ! uint32 leaf_pages; ! uint32 empty_pages; ! uint32 deleted_pages; ! uint32 max_avail; ! uint32 free_space; ! uint32 fragments; } BTIndexStat; /* -- --- 63,77 uint32 level; uint32 root_pages; ! uint64 internal_pages; ! uint64 leaf_pages; ! uint64 empty_pages; ! uint64 deleted_pages; ! uint64 max_avail; ! uint64 free_space; ! uint64 fragments; } BTIndexStat; /* -- *** *** 87,94 Relationrel; RangeVar *relrv; Datum result; ! uint32 nblocks; ! uint32 blkno; BTIndexStat indexStat; if (!superuser()) --- 87,94 Relationrel; RangeVar *relrv; Datum result; ! BlockNumber nblocks; ! BlockNumber blkno; BTIndexStat indexStat; if (!superuser()) *** *** 207,213 values[j] = palloc(32); snprintf(values[j++], 32, %d, indexStat.level); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, (indexStat.root_pages + indexStat.leaf_pages + indexStat.internal_pages + indexStat.deleted_pages + --- 207,213 values[j] = palloc(32); snprintf(values[j++], 32, %d, indexStat.level); values[j] = palloc(32); ! snprintf(values[j++], 32, INT64_FORMAT, (indexStat.root_pages + indexStat.leaf_pages + indexStat.internal_pages + indexStat.deleted_pages + *** *** 215,231 values[j] = palloc(32); snprintf(values[j++], 32, %d, indexStat.root_blkno); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.internal_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.leaf_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.empty_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %d, indexStat.deleted_pages); values[j] = palloc(32); ! snprintf(values[j++], 32, %.2f, 100.0 - (float) indexStat.free_space / (float) indexStat.max_avail * 100.0); values[j] = palloc(32); ! snprintf(values[j++], 32,
Re: [PATCHES] Fix pgstatindex using for large indexes
Tatsuhito Kasahara [EMAIL PROTECTED] writes: In pgstatindex.c and pgstattuple.sql, some variables are defined with int type. So when we try to get informations about a large index by using pgstatindex, we get strange value of size and density. Because the values exceed int-max. ... I think that max_avail and free_space should be uint64. Most places where we've dealt with this before, we use double, which is guaranteed to be available whereas uint64 is not ... regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly