Re: [HACKERS] ERROR: index row size 2960 exceeds btree maximum
On Wed, 2007-05-30 at 21:41, Rodrigo Sakai wrote: Hello, I'm developing an application that needs a different data type, so I have implemented this new data type inside PostgreSQL using C, as the documentation describes. Basically, the data type is a composition of two timestamps, like (timestamp, timestamp), and it is called 'period'. So, almost everything is fine. The data type is OK, the operators are OK, but the index doesn't work. ... Here is the error from the last INSERT: ERROR: index row size 2960 exceeds btree maximum, 2713 SQL state: 54000 Hint: Values larger than 1/3 of a buffer page cannot be indexed. It seems that your C datatype is buggy, generating a binary representation of Period that is bigger than 2713 bytes. - Hannu ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
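Hannu's diagnosis is easy to sanity-check: PostgreSQL timestamps are 8-byte values, so a fixed-size period of two timestamps should be a 16-byte datum, nowhere near the ~2713-byte btree limit. A quick sketch (illustrative Python, not the poster's C code; the packed values are arbitrary epoch microseconds):

```python
import struct

# Two 8-byte timestamps packed back to back, as a fixed-size
# 'period' datum would be laid out (values are made up).
start = 1_180_500_000_000_000
end = 1_180_600_000_000_000
period = struct.pack("!qq", start, end)
print(len(period))  # 16
```

An index row of 2960 bytes therefore points at a bug in the type's length header or representation, not at any inherent size of the data.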
Re: [HACKERS] Ye olde drop-the-database-you-just-left problem
However, it suddenly struck me that we could probably make most of the problem go away if we put that same wait into DROP DATABASE itself --- that is, if we see other backends in the target DB, sleep for a second or two and then recheck before erroring out. Yup, waiting in drop database up to 10-30 secs would imho be fine. Andreas
Re: [HACKERS] TOAST usage setting
Bruce Momjian [EMAIL PROTECTED] writes: shared_buffers again was 32MB so all the data was in memory. The case where all the data is in memory is simply not interesting. The cost of TOAST is the random access seeks it causes. You seem to be intentionally avoiding testing the precise thing we're interested in. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] TOAST usage setting
Gregory Stark [EMAIL PROTECTED] writes: Bruce Momjian [EMAIL PROTECTED] writes: shared_buffers again was 32MB so all the data was in memory. The case where all the data is in memory is simply not interesting. The cost of TOAST is the random access seeks it causes. You seem to be intentionally avoiding testing the precise thing we're interested in. Also, something's not right with these results. 100,000 tuples --even if all they contain is a toast pointer-- won't fit on a single page. And the toast tables should vary in size depending on how many toast chunks are created. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
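Rough arithmetic behind the "won't fit on a single page" point, assuming the default 8 KB page and approximate per-tuple and toast-pointer sizes (the exact figures vary by version and alignment, so treat these constants as assumptions):

```python
PAGE_SIZE = 8192        # default BLCKSZ
TUPLE_OVERHEAD = 28     # assumed: heap tuple header + line pointer
TOAST_POINTER = 18      # assumed size of a toast-pointer varattrib

per_page = PAGE_SIZE // (TUPLE_OVERHEAD + TOAST_POINTER)
pages_needed = -(-100_000 // per_page)  # ceiling division
print(per_page, pages_needed)  # on the order of 178 tuples/page, ~562 pages
```

So even the skinniest possible heap tuples put 100,000 rows across hundreds of pages, which is why single-page numbers look suspicious.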
Re: [HACKERS] [COMMITTERS] pgsql: Make large sequential scans and VACUUMs work in a limited-size
Alvaro Herrera wrote: Tom Lane wrote: Log Message: --- Make large sequential scans and VACUUMs work in a limited-size ring of buffers, rather than blowing out the whole shared-buffer arena. Aside from avoiding cache spoliation, this fixes the problem that VACUUM formerly tended to cause a WAL flush for every page it modified, because we had it hacked to use only a single buffer. Those flushes will now occur only once per ring-ful. The exact ring size, and the threshold for seqscans to switch into the ring usage pattern, remain under debate; but the infrastructure seems done. The key bit of infrastructure is a new optional BufferAccessStrategy object that can be passed to ReadBuffer operations; this replaces the former StrategyHintVacuum API. I think now is the time to re-test the patch for advancing OldestXmin during vacuum? Thanks for the reminder, I'll schedule those tests. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
[HACKERS] Backend crash during explain
Hi, This is on Intel OSX, anon CVS download today. Build process: 1. make distclean 2. ./configure --enable-debug --enable-cassert --enable-integer-datetimes --prefix=/Users/grant/Development/bin/pgsql --enable-depend 3. make all install The query with no EXPLAIN (ANALYSE) completes fine. The query with EXPLAIN ANALYSE completes fine. foo=# explain analyse select this_.id as id6_2_, this_1_.created_at as created2_6_2_, this_1_.created_by as created3_6_2_, this_1_.updated_at as updated4_6_2_, this_1_.updated_by as updated5_6_2_, this_1_.from_date as from6_6_2_, this_1_.party_id as party8_6_2_, this_1_.thru_date as thru7_6_2_, this_1_.type_id as type9_6_2_, this_.tag as tag14_2_, this_.taggedBy_id as taggedBy4_14_2_, this_.taggedDate as taggedDate14_2_, partyrolet2_.id as id3_0_, partyrolet2_.created_at as created2_3_0_, partyrolet2_.created_by as created3_3_0_, partyrolet2_.updated_at as updated4_3_0_, partyrolet2_.updated_by as updated5_3_0_, partyrolet2_.description as descript6_3_0_, partyrolet2_.name as name3_0_, tagimplant3_.id as id6_1_, tagimplant3_1_.created_at as created2_6_1_, tagimplant3_1_.created_by as created3_6_1_, tagimplant3_1_.updated_at as updated4_6_1_, tagimplant3_1_.updated_by as updated5_6_1_, tagimplant3_1_.from_date as from6_6_1_, tagimplant3_1_.party_id as party8_6_1_, tagimplant3_1_.thru_date as thru7_6_1_, tagimplant3_1_.type_id as type9_6_1_ from tagged_asset this_ inner join party_role this_1_ on this_.id=this_1_.id inner join party_role_type partyrolet2_ on this_1_.type_id=partyrolet2_.id left outer join tag_implanter tagimplant3_ on this_.taggedBy_id=tagimplant3_.id left outer join party_role tagimplant3_1_ on tagimplant3_.id=tagimplant3_1_.id where (lower(this_.tag) like '1f76%') limit 100; QUERY PLAN -- Limit (cost=8.31..24.50 rows=1 width=3748) (actual time=23.057..209.191 rows=77 loops=1) -> Nested Loop (cost=8.31..24.50 rows=1 width=3748) (actual time=23.055..209.142 rows=77 loops=1) -> Nested Loop Left Join (cost=8.31..24.22 rows=1 width=2170) (actual time=23.036..208.326 rows=77 loops=1) -> Nested Loop Left Join (cost=8.31..18.62 rows=1 width=1098) (actual time=23.033..208.204 rows=77 loops=1) -> Merge Join (cost=8.31..10.34 rows=1 width=1094) (actual time=23.024..208.015 rows=77 loops=1) Merge Cond: (this_1_.id = this_.id) -> Index Scan Backward using party_role_pkey on party_role this_1_ (cost=0.00..18672.18 rows=581325 width=1076) (actual time=0.102..142.963 rows=240384 loops=1) -> Sort (cost=8.31..8.32 rows=1 width=22) (actual time=0.856..0.902 rows=77 loops=1) Sort Key: this_.id Sort Method: quicksort Memory: 20kB -> Index Scan using tagged_asset_tag_key on tagged_asset this_ (cost=0.01..8.30 rows=1 width=22) (actual time=0.109..0.739 rows=77 loops=1) Index Cond: ((lower((tag)::text) >= '1f76'::text) AND (lower((tag)::text) < '1f77'::text)) Filter: (lower((tag)::text) ~~ '1f76%'::text) -> Index Scan using tag_implanter_pkey on tag_implanter tagimplant3_ (cost=0.00..8.27 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=77) Index Cond: (this_.taggedby_id = tagimplant3_.id) -> Index Scan using party_role_pkey on party_role tagimplant3_1_ (cost=0.00..5.59 rows=1 width=1076) (actual time=0.000..0.000 rows=0 loops=77) Index Cond: (tagimplant3_.id = tagimplant3_1_.id) -> Index Scan using party_role_type_pkey on party_role_type partyrolet2_ (cost=0.00..0.27 rows=1 width=1578) (actual time=0.008..0.009 rows=1 loops=77) Index Cond: (partyrolet2_.id = this_1_.type_id) Total runtime: 209.699 ms (20 rows) However, with just EXPLAIN (no ANALYSE) foo=# explain select this_.id as id6_2_, this_1_.created_at as created2_6_2_, this_1_.created_by as created3_6_2_, this_1_.updated_at as updated4_6_2_, this_1_.updated_by as updated5_6_2_, this_1_.from_date as from6_6_2_, this_1_.party_id as party8_6_2_, this_1_.thru_date as thru7_6_2_, this_1_.type_id as type9_6_2_, this_.tag as tag14_2_, this_.taggedBy_id as taggedBy4_14_2_, this_.taggedDate as taggedDate14_2_, partyrolet2_.id as id3_0_,
partyrolet2_.created_at as created2_3_0_, partyrolet2_.created_by as created3_3_0_, partyrolet2_.updated_at as updated4_3_0_, partyrolet2_.updated_by
Re: [HACKERS] Backend crash during explain
Grant Finnemore wrote: CrashReporter trace: Date/Time: 2007-05-31 10:21:39.285 +0200 OS Version: 10.4.9 (Build 8P2137) Report Version: 4 Command: postmaster Path: ./bin/postmaster Parent: postmaster [23091] Version: ??? (???) PID: 23096 Thread: 0 Exception: EXC_BAD_ACCESS (0x0001) Codes: KERN_PROTECTION_FAILURE (0x0002) at 0x0018 Thread 0 Crashed: 0 postmaster 0x00116ec6 ExecSetSlotDescriptor + 77 (execTuples.c:344) 1 postmaster 0x001182f9 ExecAssignScanTypeFromOuterPlan + 33 (execUtils.c:771) 2 postmaster 0x001240c8 ExecInitSort + 168 (nodeSort.c:211) It looks like tupDesc contains an invalid pointer. I found a strange assignment in the ExecAssignScanTypeFromOuterPlan function. See the comment below. outerPlanState expects a PlanState structure instead of a ScanState. 00762 ExecAssignScanTypeFromOuterPlan(ScanState *scanstate) 00763 { 00764 PlanState *outerPlan; 00765 TupleDesc tupDesc; 00766 00767 outerPlan = outerPlanState(scanstate); ^ scanstate->ps ?? 00768 tupDesc = ExecGetResultType(outerPlan); 00769 00770 ExecAssignScanType(scanstate, tupDesc); 00771 } Zdenek
Re: [HACKERS] ERROR: index row size 2960 exceeds btree maximum
Rodrigo Sakai wrote: Basically, the data type is a composition of two timestamps, like: (timestamp, timestamp) and it is called 'period'. So, almost everything is fine. Data type is ok, operators are ok, but the index doesn't work fine. Does type tinterval not suit you? It is not very well documented but it should work, or at least serve as a basis for developing your own types. -- Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J Essentially, you're proposing Kevlar shoes as a solution for the problem that you want to walk around carrying a loaded gun aimed at your foot. (Tom Lane)
Re: [HACKERS] TOAST usage setting
I tested EXTERN_TUPLES_PER_PAGE for values 4 (default), 2, and 1: 4: 15.596, 2: 15.197, 1: 14.6, which is basically a 3% decrease from 4->2 and from 2->1. The test script and result are here: http://momjian.us/expire/TOAST2/ shared_buffers again was 32MB so all the data was in memory. Thanks for the test. (The test is for 1 row that is 100k wide.) It is good: it shows that we even see a small advantage in the everything-cached case. What we don't have yet is numbers for whether EXTERN_TUPLES_PER_PAGE=1 substantially increases the toast table size for real-life scenarios, what happens in the worst case (~48% wastage compared to the previous 12%), and whether 1 row per page works well with autovacuum. The bad case (with EXTERN_TUPLES_PER_PAGE=1) is when most toast tuples have a size over TOAST_MAX_CHUNK_SIZE_for_2+1 but enough smaller than a page that we care about the wastage. Maybe we can special-case that range. Maybe determine (and lock) the free space of any cheap-to-get-at non-empty page (e.g. the current insert target page) and split the toast data there. Andreas
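The worst case Andreas mentions is simple to picture: with one chunk per page, a toast chunk just over half the usable page still claims the whole page. A back-of-the-envelope sketch (the usable-bytes figure is an assumption that ignores page and tuple overheads):

```python
USABLE = 8140  # assumed usable bytes on an 8 KB page, overheads ignored

chunk = USABLE // 2 + 1   # just too big to share a page with a second chunk
wasted = USABLE - chunk   # the rest of the page sits empty
print(round(100 * wasted / USABLE))  # ~50 here; roughly 48% once real overheads count
```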
[HACKERS] SQLF Optimization question
Hello, (First of all, sorry for my English), I'm in a project that involves adding SQLf sentences to make Postgres understand fuzzy queries; I mean, letting people create fuzzy predicates, which are words like fat, tall, young, old, in an SQL sentence. I guess the first step is complete. The parser knows when a user writes CREATE FUZZY PREDICATE name ON domain AS fuzzy set. For instance, a person can create the young word as a fuzzy predicate like this: CREATE FUZZY PREDICATE young ON 0..120 AS (0,0,25,50); that means, in a domain of 0..120, a young person is between 25 and 50. The fuzzy set is as follows: (beg,core1,core2,end), (begin of the set, first core, second core, end of the set). The sentence creates a row in a catalog named pg_fuzzypred, and we're good till then. But when I write Select * from students where age=young, it brings me a message telling me the node doesn't exist. My teacher, who is very involved in the project, tells me that Postgres has to apply the derivation principle before it gets to the executor (which sounds logical), in order for the optimizer to calculate all costs properly. It's at this point where I need help: what do I have to modify or create to make Postgres understand the derivation principle, and where is the point where the parser tree becomes the query tree and the optimization is done? I send a file with all the modifications that my teacher and I have made to Postgres (it has some words in Spanish, but I guess it's understandable), and a file with the errors. Thanks.
*) Additions in /src/backend/parser/keywords.c: line 156: {fuzzy, FUZZY}, line 267: {predicate, PREDICATE}, line 177: {infinite, INFINITE}, *) Additions in /src/backend/parser/gram.y: line 154: CreateFuzzyPredStmt line 381: FUZZY line 386: INFINITE line 410: PREDICATE line 540: | CreateFuzzyPredStmt line 604: /* * * Create a new Fuzzy Predicate :) * */ CreateFuzzyPredStmt: CREATE FUZZY PREDICATE ColId ON Iconst'.''.'Iconst AS '('Iconst','Iconst','Iconst','Iconst')' { CreateFuzzyPredStmt *n = makeNode(CreateFuzzyPredStmt); n->pred=$4; n->begd=$6; n->endd=$9; n->minfp=$12; n->core1=$14; n->core2=$16; n->maxfp=$18; n->typefp=1; $$ = (Node *)n; } | CREATE FUZZY PREDICATE ColId ON Iconst'.''.'Iconst AS '('INFINITE','INFINITE','Iconst','Iconst')' { CreateFuzzyPredStmt *n = makeNode(CreateFuzzyPredStmt); n->pred=$4; n->begd=$6; n->endd=$9; n->core2=$16; n->maxfp=$18; n->typefp=2; $$ = (Node *)n; } | CREATE FUZZY PREDICATE ColId ON Iconst'.''.'Iconst AS '('Iconst','Iconst','INFINITE','INFINITE')' { CreateFuzzyPredStmt *n = makeNode(CreateFuzzyPredStmt); n->pred=$4; n->begd=$6; n->endd=$9; n->minfp=$13; n->core1=$15; n->typefp=3; $$ = (Node *)n; } ; *) Additions in /src/include/nodes/nodes.h: line 296: T_CreateFuzzyPredStmt, line 299: T_A_FuzzyPred, *) Additions in /src/include/nodes/parsenodes.h: line 252: /* * A_FuzzyPred - a Fuzzy Predicate expression */ typedef struct A_FuzzyPred { NodeTag type; char *pred; int minfp; int core1; int core2; int maxfp; int typefp; } A_FuzzyPred; line 1282: /* -- * {Create} FUZZY PREDICATE Statement * -- */ typedef struct CreateFuzzyPredStmt { NodeTag type; int begd; /* begin domain */ int endd; /* end domain */ int minfp; /* min trapezoid */ int core1; /* core 1 trapezoid */ int core2; /* core 2 trapezoid */ int maxfp; /* max trapezoid */ int typefp; /* type=1 (trapezoid), 2 (at least), 3 (at most) */ } CreateFuzzyPredStmt; *) Additions
Re: [HACKERS] Backend crash during explain
Grant Finnemore [EMAIL PROTECTED] writes: The query with no EXPLAIN (ANALYSE) completes fine. The query with EXPLAIN ANALYSE completes fine. However, with just EXPLAIN (no ANALYSE) Need a complete test case please, not just the query. All I get here is ERROR: relation tagged_asset does not exist regards, tom lane
Re: [HACKERS] New cast between inet/cidr and bytea
Zoltan Boszormenyi [EMAIL PROTECTED] writes: Bruce Momjian wrote: What is the use case for such a cast? The application doesn't want to parse the textual IP address when all the parsing and checking intelligence is already there in the inet/cidr type checks. This presumes exactly the assumption we are questioning, namely that there's a universal binary representation for these things. There might be such for bare IP addresses (ignoring endianness) but the argument doesn't scale to CIDR. You've also failed to make the case that this application designer has made a sane judgment about whether avoiding parsing is a good tradeoff here. Also: to the extent that the application is willing to deal with a Postgres-specific inet/cidr representation (which, in the end, is what this would be) it can do that *today* using binary output format. So I'm still not seeing an argument for exposing a cast to bytea. regards, tom lane
[HACKERS] timestamp datatype problem
Hi, I have found a problem using the timestamp datatype in PostgreSQL. The minimum value should be '4713-11-24 00:00:00 BC' but it's possible to use '4714-11-24 00:00:00 BC'. I found the problem using a binary cursor where I get the value as a double. Is this the right list for posting this problem? Mathias Palm
Re: [HACKERS] timestamp datatype problem
Mathias Palm wrote: Hi, I have found a problem using the timestamp datatype in PostgreSQL. The minimum value should be '4713-11-24 00:00:00 BC' but it's possible to use '4714-11-24 00:00:00 BC'. I found the problem using a binary cursor where I get the value as a double. Is this the right list for posting this problem? I don't know if it's a problem or not, but I'd guess the relevant bit is in src/include/utils/datetime.h #define JULIAN_MINYEAR (-4713) #define JULIAN_MINMONTH (11) #define JULIAN_MINDAY (24) #define JULIAN_MAXYEAR (5874898) #define IS_VALID_JULIAN(y,m,d) ((((y) > JULIAN_MINYEAR) || (((y) == JULIAN_MINYEAR) && (((m) > JULIAN_MINMONTH) || (((m) == JULIAN_MINMONTH) && ((d) >= JULIAN_MINDAY))))) && ((y) < JULIAN_MAXYEAR)) I'm guessing -4713 == 4714 BC (no year 0 between 1 BC and 1 AD). Presumably this can only happen if using floating-point datetimes and not 64-bit integers? -- Richard Huxton Archonet Ltd
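Richard's reading of the macro can be checked by transliterating it: with no year 0, -4713 corresponds to 4714 BC, so 4714-11-24 BC is exactly the first valid day. A sketch of the bounds check (a Python rendering of the C macro, not the server code itself):

```python
JULIAN_MINYEAR, JULIAN_MINMONTH, JULIAN_MINDAY = -4713, 11, 24
JULIAN_MAXYEAR = 5874898

def is_valid_julian(y, m, d):
    # Mirrors IS_VALID_JULIAN: on or after -4713-11-24 (4714-11-24 BC)
    # and strictly before JULIAN_MAXYEAR.
    return ((y > JULIAN_MINYEAR
             or (y == JULIAN_MINYEAR
                 and (m > JULIAN_MINMONTH
                      or (m == JULIAN_MINMONTH and d >= JULIAN_MINDAY))))
            and y < JULIAN_MAXYEAR)

print(is_valid_julian(-4713, 11, 24))  # True: 4714-11-24 BC is the floor
print(is_valid_julian(-4713, 11, 23))  # False: one day earlier
```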
Re: [HACKERS] New cast between inet/cidr and bytea
Tom Lane wrote: Zoltan Boszormenyi [EMAIL PROTECTED] writes: Bruce Momjian wrote: What is the use case for such a cast? The application doesn't want to parse the textual IP address when all the parsing and checking intelligence is already there in the inet/cidr type checks. This presumes exactly the assumption we are questioning, namely that there's a universal binary representation for these things. But there is: network order. There might be such for bare IP addresses (ignoring endianness) but the argument doesn't scale to CIDR. Would you enlighten me why not? You've also failed to make the case that this application designer has made a sane judgment about whether avoiding parsing is a good tradeoff here. So, is reinventing the wheel always the way to go? Even when the app is actually storing those IP addresses with the type and features PostgreSQL provides? Also: to the extent that the application is willing to deal with a Postgres-specific inet/cidr representation (which, in the end, is what this would be) it can do that *today* using binary output format. So I'm still not seeing an argument for exposing a cast to bytea. regards, tom lane But the binary output of inet/cidr needs another round of parsing which requires using internal server headers. Would you prefer a 4/8/16/32-byte output using the IP only, or the IP plus a fully represented netmask? Best regards, -- Zoltán Böszörményi Cybertec Geschwinde & Schönig GmbH http://www.postgresql.at/
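For bare addresses, "network order" is indeed well defined: it is what inet_pton produces. A small sketch of the raw bytes an application could consume without text parsing (this covers addresses only; how to lay out the cidr masklen is exactly the disputed part):

```python
import socket

# Big-endian (network-order) bytes for an IPv4 address; AF_INET6
# yields 16 bytes the same way.  The masklen of a cidr value has
# no comparable single standard layout.
raw = socket.inet_pton(socket.AF_INET, "192.168.0.1")
print(list(raw))  # [192, 168, 0, 1]
```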
Re: [HACKERS] New cast between inet/cidr and bytea
Zoltan Boszormenyi wrote: Also: to the extent that the application is willing to deal with a Postgres-specific inet/cidr representation (which, in the end, is what this would be) it can do that *today* using binary output format. So I'm still not seeing an argument for exposing a cast to bytea. regards, tom lane But the binary output of inet/cidr needs another round of parsing which requires using internal server headers. Would you like a 4/8/16/32 byte output using IP only or IP + fully represented netmask better? How are you getting the bytea output? If as text then you're going to be doing parsing anyway; if as binary, why not just get the binary of the base type directly? It is not clear to me why we should provide this facility just for inet/cidr - if it is justified in that case it should be required for all types. cheers andrew
Re: [HACKERS] Query plan degradation 8.2 -- 8.3
Josh Berkus [EMAIL PROTECTED] writes: It does the right thing if t_s_symb is declared as text instead of varchar. When it's varchar, even setting enable_sort off won't make it pick the right plan, which suggests that it fails to recognize that the index can match the query's ORDER BY. I'm guessing I overlooked a binary-compatibility case when I rejiggered the handling of PathKeys in connection with the NULLS FIRST/LAST stuff. No time to look deeper right now. Yeah, that looks like the case. We'll move it to TEXT for the tests right now, but I'll make sure we don't forget this bug during beta. Thanks! I've applied a patch that fixes this case, but I'm not yet 100% convinced that there are no other cases where it'll prevent matching things that should match. Please test. regards, tom lane
Re: [HACKERS] Command tags in create/drop scripts
On 5/31/07, Peter Eisentraut [EMAIL PROTECTED] wrote: The messages output by the scripts always seemed unnecessary to me, e.g., $ createdb foo CREATE DATABASE A Unix program (hi Magnus) shouldn't need to say anything if the requested action succeeded. I believe the history of this output is actually that these scripts simply used to call psql and no one really questioned why the output should be this way. There are also some inconsistencies, e.g., createlang doesn't output anything, and createuser outputs CREATE ROLE. I would just remove all this, but I suppose this will be controversial? Can it be made to return an integer like most UNIX commands do? This helps immensely when writing shell scripts. Thanks!
Re: [HACKERS] Command tags in create/drop scripts
CAJ CAJ [EMAIL PROTECTED] writes: Can it be made to return an integer like most UNIX commands do? This helps immensely when writing shell scripts. Don't they do that already? If not, that's a bug quite independent of Peter's cosmetic concern. regards, tom lane
Re: [HACKERS] Query plan degradation 8.2 -- 8.3
Tom, I've applied a patch that fixes this case, but I'm not yet 100% convinced that there are no other cases where it'll prevent matching things that should match. Please test. Will do. We're having trouble building from CVS on the TPCE test rig, so it'll wait for tomorrow's snapshot. -- Josh Berkus PostgreSQL @ Sun San Francisco
Re: [HACKERS] SQLF Optimization question
On Thu, May 31, 2007 at 09:58:27AM -0400, Werner Echezuria wrote: Hello, (First of all, sorry for my English), I'm in a project that involves adding SQLf sentences to make Postgres understand fuzzy queries; I mean, letting people create fuzzy predicates, which are words like fat, tall, young, old, in an SQL sentence. I guess the first step is complete. The parser knows when a user writes CREATE FUZZY PREDICATE name ON domain AS fuzzy set. For instance, a person can create the young word as a fuzzy predicate like this: CREATE FUZZY PREDICATE young ON 0..120 AS (0,0,25,50); that means, in a domain of 0..120, a young person is between 25 and 50. The fuzzy set is as follows: (beg,core1,core2,end), (begin of the set, first core, second core, end of the set). The sentence creates a row in a catalog named pg_fuzzypred, and we're good till then. But when I write Select * from students where age=young, it brings me a message telling me the node doesn't exist. My teacher, who is very involved in the project, tells me that Postgres has to apply the derivation principle before it gets to the executor (which sounds logical), in order for the optimizer to calculate all costs properly. Ok, I didn't totally follow the patch, but it seems to me you need to, somewhere, get Postgres to see an unadorned word as a fuzzy predicate. Somewhere involving IDENT, probably. However, it seems to me you're opening yourself up to problems: what if someone has a column name the same as your fuzzy predicate? I suggest you also add syntax for the matching, something like: Select * from students where age FUZZYMATCH young; This means that you don't interfere with the existing grammar, but you have your own bit which you can completely control. Hope this helps, -- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate.
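For reference, the (beg,core1,core2,end) fuzzy set describes a trapezoidal membership function. A sketch of how a row's age could be scored against CREATE FUZZY PREDICATE young ON 0..120 AS (0,0,25,50) (illustrative only, not the project's actual code):

```python
def trapezoid(x, beg, core1, core2, end):
    # Membership is 1 inside [core1, core2], ramps linearly on the
    # shoulders, and is 0 outside [beg, end].
    if x < beg or x > end:
        return 0.0
    if x < core1:
        return (x - beg) / (core1 - beg)
    if x <= core2:
        return 1.0
    return (end - x) / (end - core2)

young = lambda age: trapezoid(age, 0, 0, 25, 50)
print(young(20), young(30), young(60))  # 1.0 0.8 0.0
```

A WHERE clause could then keep rows whose membership exceeds some threshold instead of demanding an exact match.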
[HACKERS] EXPLAIN feature request
Hello, -> Index Scan using tickets_q1_import6 on tickets_q1 tickets (cost=0.00..2.06 rows=1 width=0) -> Index Scan using tickets_q1_import6 on tickets_q1 tickets (cost=0.00..2.06 rows=1 width=0) Note that those are actually *2* different tables in different schemas. It would be great if it looked like this instead: -> Index Scan using tickets_q1_import6 on foo.tickets_q1 public.tickets (cost=0.00..2.06 rows=1 width=0) -> Index Scan using tickets_q1_import6 on bar.tickets_q1 public.tickets (cost=0.00..2.06 rows=1 width=0) Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] Command tags in create/drop scripts
On 5/31/07, CAJ CAJ [EMAIL PROTECTED] wrote: Can it be made to return an integer like most UNIX commands do? This helps immensely when writing shell scripts. Don't they do that already? If not, that's a bug quite independent of Peter's cosmetic concern. Ah, I just tested it. The create/drop user commands do return an integer. It's not mentioned in the manpages though. BTW, there is a -q option to quieten the output.
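The script-friendly pattern is to branch on the exit status rather than on printed output. A sketch using stand-in child processes (a real script would run createdb itself; the Python one-liners here just simulate a success and a failure exit):

```python
import subprocess
import sys

# Stand-ins for `createdb foo`: child processes exiting 0 and 1.
ok = subprocess.run([sys.executable, "-c", "raise SystemExit(0)"])
bad = subprocess.run([sys.executable, "-c", "raise SystemExit(1)"])
print(ok.returncode, bad.returncode)  # 0 1
```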
Re: [HACKERS] Autovacuum versus rolled-back transactions
Tom Lane wrote: It may boil down to whether we would like the identity n_live_tup = n_tup_ins - n_tup_del to continue to hold, or the similar one for n_dead_tup. The problem basically is that pgstats is computing n_live_tup and n_dead_tup using those identities rather than by tracking what really happens. Thanks for fixing this. For the record, I don't think I ever actually *considered* the effect of rolled back transactions in the tuple counts; at the time I wrote the code, I was just mirroring what the old autovac code did, and I didn't stop to think whether the assumptions were actually correct. I think the committed fix was the most appropriate -- changing the semantics of n_ins_tup etc would defeat the original purpose they were written for, I think. Regarding the idea of counting dead tuples left behind by vacuum to update pgstats at the end, I think the idea of counting them individually is good, but it doesn't account for dead tuples created in areas that were scanned earlier. So I think that Takahiro-san's idea of using the value accumulated in pgstats is better. If we apply Heikki's idea of advancing OldestXmin, I think what we should do is grab the value from pgstats when vacuum starts, and each time we're going to advance OldestXmin, grab the value from pgstats again; accumulate the differences from the various pgstat grabs. At the end we send the accumulated differences as the new dead tuple count. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
[HACKERS] Hash joins vs small-integer join values
I was idly thinking about Joseph Shraibman's problem here: http://archives.postgresql.org/pgsql-general/2007-05/msg01011.php in which a large hash join seemed to be blowing out memory. By chance I tried the following test case: js=# create table ml (jid int); CREATE TABLE js=# insert into ml select random()*1000 from generate_series(1,185391404); INSERT 0 185391404 js=# create table tempjr (id int); CREATE TABLE js=# insert into tempjr select random()*1000 from generate_series(1,6); INSERT 0 6 js=# analyze ml; ANALYZE js=# select count(*) from tempjr join ml on (jid=id) group by jid; Since I hadn't remembered to increase work_mem beyond the default, this set up a hash join with 4111 buckets in each of 8192 batches, which didn't seem too awfully unreasonable, so I let it go. Imagine my horror as I watched it stuff all 185 million ml rows into batch 4365. Naturally, when it got to trying to process that batch, the in-memory hashtable blew out real good. I'm not certain this is what happened to Joseph, since I don't know the stats of his jid column, but in any case it's got to be fixed. Hash join is a probabilistic algorithm, so there will always be some input distributions for which it sucks, but I don't think we can tolerate uniformly distributed on the integers 0-N as being one of them. The problem comes from the rather simplistic assignment of bucket and batch numbers in ExecHashGetBucketAndBatch(): * Note: on-the-fly increases of nbatch must not change the bucket number * for a given hash code (since we don't move tuples to different hash * chains), and must only cause the batch number to remain the same or * increase. Our algorithm is * bucketno = hashvalue MOD nbuckets * batchno = (hashvalue DIV nbuckets) MOD nbatch * where nbuckets should preferably be prime so that all bits of the * hash value can affect both bucketno and batchno. * nbuckets doesn't change over the course of the join. 
This would be fine if the hashvalues were reasonably randomly distributed over all uint32 values, but take a look at hashint4 --- it's just a one's-complement: Datum hashint4(PG_FUNCTION_ARGS) { PG_RETURN_UINT32(~PG_GETARG_UINT32(0)); } Two inputs that differ by 1 will have hash values also differing by 1. Therefore, in my test case with 4111 buckets, consecutive ranges of 4111 input values map to the same batch --- different buckets in the batch, but the same batch. My example with inputs 0..999 would have mapped to either 1 or 2 batches depending on luck. With a more realistic work_mem, nbuckets would have been larger, making this problem worse not better. 8.1 and up are broken this way; in 8.0 and before we were calculating the batch number in a different way that doesn't seem vulnerable to this particular failure mode. Arguably, the problem here is a chintzy hash function, and we should fix it by making the integer hash functions use hash_any(). I'm inclined to do that for 8.3. The problem is that this is not a back-patchable answer, because changing the hash functions would corrupt existing hash indexes. The best idea I can come up with for the back branches is to make ExecHashGetBucketAndBatch do hash_any internally, say if (nbatch > 1) { *bucketno = hashvalue % nbuckets; /* since nbatch is a power of 2, can do MOD by masking */ - *batchno = (hashvalue / nbuckets) & (nbatch - 1); + *batchno = hash_any(hashvalue, sizeof(int32)) & (nbatch - 1); } else { *bucketno = hashvalue % nbuckets; *batchno = 0; } Comments, better ideas? regards, tom lane
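The clustering Tom describes is easy to reproduce outside the server, using the same batchno formula from the comment and a ones-complement stand-in for hashint4 (sizes taken from the example join):

```python
NBUCKETS, NBATCH = 4111, 8192   # the sizes from the example hash join

def hashint4(v):
    # ones-complement "hash", as in the pre-8.3 hashint4
    return ~v & 0xFFFFFFFF

def batchno(h):
    # batchno = (hashvalue DIV nbuckets) MOD nbatch
    return (h // NBUCKETS) % NBATCH

batches = {batchno(hashint4(v)) for v in range(1000)}
print(sorted(batches))  # [4365, 4366] -- inputs 0..999 land in just 2 batches
```

A thousand uniformly distributed keys collapse into two adjacent batches, which is exactly the everything-in-one-batch blowout scaled down.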
Re: [HACKERS] Changing checkpoint_timeout to another name?
Bruce Momjian wrote: Joshua D. Drake wrote: Joshua D. Drake wrote: Hello, I am currently writing some curriculum on managing IO with PostgreSQL and I keep running into the parameter checkpolint_timeout. *cough* checkpoint_timeout This seems to be incorrect as it is not really a timeout as much as an interval... It is a timeout because if WAL fills up before the timeout, you get a checkpoint and the timer is reset. How is it an interval? Because if WAL doesn't fill up, it will checkpoint at that timeout, which is an interval :). It depends on how you look at it, I guess. For me, I try to make sure we are never rolling our logs until checkpoint, so it is an interval. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Autovacuum versus rolled-back transactions
Alvaro Herrera [EMAIL PROTECTED] writes: If we apply Heikki's idea of advancing OldestXmin, I think what we should do is grab the value from pgstats when vacuum starts, and each time we're going to advance OldestXmin, grab the value from pgstats again; accumulate the differences from the various pgstat grabs. At the end we send the accumulated differences as the new dead tuple count. Considering that each of those values will be up to half a second old, I can hardly think that this will accomplish anything except to introduce a great deal of noise ... regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Autovacuum versus rolled-back transactions
Alvaro Herrera [EMAIL PROTECTED] wrote: Tom Lane wrote: It may boil down to whether we would like the identity n_live_tup = n_tup_ins - n_tup_del to continue to hold, or the similar one for n_dead_tup. The problem basically is that pgstats is computing n_live_tup and n_dead_tup using those identities rather than by tracking what really happens.

On a related note, there is a discrepancy between the documentation and the implementation of the auto-analyze threshold in HEAD. (Only HEAD; it is ok in 8.2 and before.) Our documentation says

| analyze threshold = analyze base threshold
|     + analyze scale factor * number of tuples
| is compared to the total number of tuples inserted, updated, or deleted
| since the last ANALYZE.
http://momjian.us/main/writings/pgsql/sgml/routine-vacuuming.html#AUTOVACUUM

but deleted tuples are not counted in that total, because the delta of {n_live_tuples + n_dead_tuples} is not changed by DELETE. We subtract the number of deleted tuples from n_live_tuples and add it to n_dead_tuples:

| pgstat.c
|     t_new_live_tuples += tuples_inserted - tuples_deleted;
|     t_new_dead_tuples += tuples_deleted;
| autovacuum.c
|     anltuples = n_live_tuples + n_dead_tuples - last_anl_tuples;

Delete-only databases are rare in the real world, so this is probably not a serious problem, but we should fix the documentation if the behavior is intentional.

Regards, --- ITAGAKI Takahiro NTT Open Source Software Center ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Autovacuum versus rolled-back transactions
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: If we apply Heikki's idea of advancing OldestXmin, I think what we should do is grab the value from pgstats when vacuum starts, and each time we're going to advance OldestXmin, grab the value from pgstats again; accumulate the differences from the various pgstat grabs. At the end we send the accumulated differences as the new dead tuple count. Considering that each of those values will be up to half a second old, I can hardly think that this will accomplish anything except to introduce a great deal of noise ... Normally, yes, but the values can be older if the vacuum_cost_delay is large. -- Alvaro Herrera    http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Autovacuum versus rolled-back transactions
ITAGAKI Takahiro [EMAIL PROTECTED] writes: Our documentation says | analyze threshold = analyze base threshold | + analyze scale factor * number of tuples | is compared to the total number of tuples inserted, updated, or deleted | since the last ANALYZE. but deleted tuples are not considered in the total number, because the delta of {n_live_tuples + n_dead_tuples} is not changed by DELETE. We add the number of DELETE into n_live_tuples and subtract it from n_dead_tuples. Yeah, I was concerned about that when I was making the patch, but didn't see any simple fix. A large number of DELETEs (without any inserts or updates) would trigger a VACUUM but not an ANALYZE, which in the worst case would be bad because the stats could have shifted. We could fix this at the cost of carrying another per-table counter in the stats info, but I'm not sure it's worth it. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Autovacuum versus rolled-back transactions
Alvaro Herrera [EMAIL PROTECTED] writes: Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: If we apply Heikki's idea of advancing OldestXmin, I think what we should do is grab the value from pgstats when vacuum starts, and each time we're going to advance OldestXmin, grab the value from pgstats Considering that each of those values will be up to half a second old, I can hardly think that this will accomplish anything except to introduce a great deal of noise ... Normally, yes, but the values can be older if the vacuum_cost_delay is large. I'm not sure we're on the same page. I meant that whatever you read from pgstats is going to be stale by an uncertain amount of time. Taking the deltas of such numbers over relatively short intervals is going to be mighty noisy. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Hash joins vs small-integer join values
Tom, The problem is that this is not a back-patchable answer, because changing the hash functions would corrupt existing hash indexes. Does anyone *use* hash indexes? Comments, better ideas? I was just talking to Luke today and he said they had a considerable amount of cleanup on hash join they were planning to contribute for 8.4. Luke? --Josh ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Hash joins vs small-integer join values
Tom Lane [EMAIL PROTECTED] writes: The best idea I can come up with for the back branches is to make ExecHashGetBucketAndBatch do hash_any internally, say hash_any of a 4-byte value degenerates to pretty much just a call to mix(). Perhaps we should just expose a hash12() that takes three integers and calls mix() on them like hash_any does. The reason I'm thinking that is that we'll want to do the same thing for bigint, float4, float8, etc. And that fix you committed a while back to improve the catcache hash function made a huge difference. Now I'm wondering if it shouldn't just be invoking hash_any() or mix() too. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Hash joins vs small-integer join values
Josh Berkus [EMAIL PROTECTED] writes: Tom, The problem is that this is not a back-patchable answer, because changing the hash functions would corrupt existing hash indexes. Does anyone *use* hash indexes? We get bug reports on 'em, so yes ... regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings