Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
On Sat, 2007-03-17 at 12:43 +0530, Pavan Deolasee wrote: > How do we move forward with the CREATE INDEX issue with > HOT ? There are quite a few suggestions and objections. > Can we please discuss and decide on the plan ? I am very > comfortable with the current state of HOT, the results > are encouraging and I hope this issue does not become > a showstopper. My suggested plan would be: First, we should make CREATE INDEX CONCURRENTLY work, since we have a design to do that. Many other things are acceptable if that works fully and correctly with HOT, if it cannot, we have problems and that is a showstopper. Second, I can't put my finger on it exactly, but I think having a DDL command to chill a table, as Alvaro suggests, sounds like a very wise utility to include, whatever we do later. That does have the useful by-product of making CREATE INDEX work without change, but I would not want to rely on it in the longer term. While those are happening we can explore the other possibilities in more detail, but they do seem like safe bets to me. > Here is what different people have suggested: > > 1. Simon and I posted couple of designs which include prior > suggestions from Heikki, Tom, Hannu and others. I believe > these designs are not very distinct and if done correctly > would help us keep the current behavior of CREATE INDEX > and CREATE INDEX CONCURRENTLY same. Agreed. Apologies for confusing things by posting the second design, that wasn't very helpful or polite. > There are concerns > about race conditions and deadlock issues though. What > is a general feeling at this point ? Are these issues very > hard to address ? At a high level, IMO we need some or all > of these things to make either of these designs work: > > - ability to enable/disable HOT-updates on the table > - ability to enable/disable HOT-fetches on the table > - ability to wait for concurrent index scans to finish > - ability to allow only one backend to CHILL the table That seems like a good summary of what we need. I would add only that these need not be explicit user-level commands, just internal capabilities. > How hard would these be without causing any race conditions > and deadlocks ? We should be able to take advantage of two things: - We only need to chill some of the tuples, in some cases. - Chilling only requires us to lock the indexes, which would be more practical if the index locking were more transitory. I note that IndexScans hold the locks they have on an index until transaction end, even after the IndexScans and BitmapIndexScans complete. ISTM we could relax that locking; that seems OK since we earlier discussed removing locks completely on indexes. The lock on the heap would not be upgraded or changed. > 2. Heikki suggested an approach where we add a byte > to tuple header and track HOT-ness of different indexes. > The idea looks good but had a downside of increasing tuple > header and complexity. I'm very comfortable with the idea that HOT can be turned on/off for a table. That gives us a workaround to bugs. Previously, changing things like WITHOUT OIDS was done over two releases, so I'd suggest the same thing here. Add the option now, disabled, then look to make it the default option in the next release. We can override that with the default_use_hot parameter for those that feel bold, at least initially. I know Bruce has been long opposed to the idea of a table-level switch, which is why we've been trying so hard to avoid it. So we should add his -1 to this idea from the start. Right now, ideas around (2) sound like too much complexity. Maybe there's a good idea there waiting to break out, so we should pursue that also - but I'm not sure you can wait for that to happen. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [PATCHES] Bitmapscan changes
On Mar 16, 2007, at 10:12 PM, Heikki Linnakangas wrote: You'll obviously need to run it with the patch applied. I'd suggest to enable stats_block_level to see the effect on buffer cache hit/ miss ratio. groupeditems-42-pghead.patch.gz is enough, or it needs maintain_cluster_order_v5.patch ?? -- Grzegorz Jaskiewicz C/C++ freelance for hire ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On Fri, Mar 16, 2007 at 09:56:23AM -0400, Andrew Dunstan wrote: > [EMAIL PROTECTED] wrote: > > > >>Does hstore nest? [...] > > > >If what you mean is to have "mappings of mappings" then no. > > [...] think "hash" for perl folks [...] > As a perl folk I think of hashes as nestable :-) Heh. Point taken :-) > Anyway, this means that you can't use hstore to cover the same field as > YAML or JSON. That doesn't mean it's not useful - far from it. I think that hstore might be extensible in this direction. After all, the keys and (key, value) combinations just get hashed into a bitmap. There is no reason why one shouldn't be able to hash more complex data structures (unless, of course, the bit map gets so densely poked with ones that it stops being useful). The difficult part might be to reach a consensus on what a "complex data structure" might look like. The purists among the relational folks are sharpening their knives already, I can hear that ;-) Regards - -- tomás -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQFF++GaBcgs9XrR2kYRAs/hAJ9vfRy36T23vJ6eIYj6efrQVk9roQCeMkqQ kWqzNbgbkMOXKIDQuzwDeFY= =fCiG -END PGP SIGNATURE- ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] Bison 2.1 on win32
I just tried building with Bison 2.1 on msvc, and it broke. For one thing, the .BAT file rejects 2.1 as broken instead of 2.0, which is obviously incorrect :-) But the generated C file also does not compile causing the error on http://msdn2.microsoft.com/en-us/library/93az0868.aspx, because msvc doesn't define __STDC__, which causes Bison to generate code it can't compile. Defining __STDC__ globally breaks several other places, since it affects a lot of include files that aren't necessarily others. The attached patch seems to fix the build issue. Does it seem acceptable/the right thing to do? Another option would be to just reject both 2.0 and 2.1 as broken to build pg with, I guess... //Magnus Index: src/backend/parser/gram.y === RCS file: /projects/cvsroot/pgsql/src/backend/parser/gram.y,v retrieving revision 2.581 diff -c -r2.581 gram.y *** src/backend/parser/gram.y 13 Mar 2007 00:33:41 - 2.581 --- src/backend/parser/gram.y 17 Mar 2007 13:14:40 - *** *** 62,67 --- 62,71 #include "utils/numeric.h" #include "utils/xml.h" + /* MSVC does not define __STDC__, but Bison 2.1 generates broken code without it */ + #ifdef WIN32_ONLY_COMPILER + #define __STDC__ 1 + #endif /* Location tracking support --- simpler than bison's default */ #define YYLLOC_DEFAULT(Current, Rhs, N) \ Index: src/backend/bootstrap/bootparse.y === RCS file: /projects/cvsroot/pgsql/src/backend/bootstrap/bootparse.y,v retrieving revision 1.88 diff -c -r1.88 bootparse.y *** src/backend/bootstrap/bootparse.y 13 Mar 2007 00:33:39 - 1.88 --- src/backend/bootstrap/bootparse.y 17 Mar 2007 13:14:56 - *** *** 51,56 --- 51,61 #include "tcop/dest.h" #include "utils/rel.h" + /* MSVC does not define __STDC__, but Bison 2.1 generates broken code without it */ + #ifdef WIN32_ONLY_COMPILER + #define __STDC__ 1 + #endif + #define atooid(x) ((Oid) strtoul((x), NULL, 10)) Index: src/pl/plpgsql/src/gram.y === RCS file: /projects/cvsroot/pgsql/src/pl/plpgsql/src/gram.y,v retrieving revision 1.99 diff -c -r1.99 gram.y *** src/pl/plpgsql/src/gram.y 19 Feb 2007 03:18:51 - 1.99 --- src/pl/plpgsql/src/gram.y 17 Mar 2007 13:15:11 - *** *** 18,23 --- 18,27 #include "parser/parser.h" + /* MSVC does not define __STDC__, but Bison 2.1 generates broken code without it */ + #ifdef WIN32_ONLY_COMPILER + #define __STDC__ 1 + #endif static PLpgSQL_expr *read_sql_construct(int until, int until2, ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Bison 2.1 on win32
Magnus Hagander wrote: I just tried building with Bison 2.1 on msvc, and it broke. For one thing, the .BAT file rejects 2.1 as broken instead of 2.0, which is obviously incorrect :-) But the generated C file also does not compile causing the error on http://msdn2.microsoft.com/en-us/library/93az0868.aspx, because msvc doesn't define __STDC__, which causes Bison to generate code it can't compile. Defining __STDC__ globally breaks several other places, since it affects a lot of include files that aren't necessarily others. The attached patch seems to fix the build issue. Does it seem acceptable/the right thing to do? Another option would be to just reject both 2.0 and 2.1 as broken to build pg with, I guess... I rolled back to 1.875 to get MSVC builds working. In the longer term, though, falling behind upstream is probably not a good idea. Should this be reported to the bison people? For now I could live with either of your solutions. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] Bitmapscan changes
Grzegorz Jaskiewicz wrote: > > On Mar 16, 2007, at 10:12 PM, Heikki Linnakangas wrote: > > >> >> You'll obviously need to run it with the patch applied. I'd suggest to >> enable stats_block_level to see the effect on buffer cache hit/miss >> ratio. > > groupeditems-42-pghead.patch.gz is enough, or it needs > maintain_cluster_order_v5.patch ?? He has a patched source ball here of the whole thing, which is what I used: http://community.enterprisedb.com/git/pgsql-git-20070315.tar.gz The you just need to run the tests. > > > > > --Grzegorz Jaskiewicz > > C/C++ freelance for hire > > > > > > > ---(end of broadcast)--- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] xpath_array with namespaces support
What about it? W/o this not large patch XML functionality in 8.3 will be weak... Will it be accepted? On 3/5/07, Nikolay Samokhvalov <[EMAIL PROTECTED]> wrote: On 3/4/07, Nikolay Samokhvalov <[EMAIL PROTECTED]> wrote: > I'll fix these issues and extend the patch with resgression tests and > docs for xpath_array(). I'll resubmit it very soon. Here is a new version of the patch. I didn't change any part of docs yet. Since there were no objections I've changed the name of the function to xmlpath(). -- Best regards, Nikolay Index: src/backend/utils/adt/xml.c === RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/xml.c,v retrieving revision 1.34 diff -u -r1.34 xml.c --- src/backend/utils/adt/xml.c 3 Mar 2007 19:32:55 - 1.34 +++ src/backend/utils/adt/xml.c 5 Mar 2007 01:14:57 - @@ -47,6 +47,8 @@ #include #include #include +#include +#include #endif /* USE_LIBXML */ #include "catalog/namespace.h" @@ -67,6 +69,7 @@ #include "utils/datetime.h" #include "utils/lsyscache.h" #include "utils/memutils.h" +#include "access/tupmacs.h" #include "utils/xml.h" @@ -88,6 +91,7 @@ static int parse_xml_decl(const xmlChar *str, size_t *lenp, xmlChar **version, xmlChar **encoding, int *standalone); static bool print_xml_decl(StringInfo buf, const xmlChar *version, pg_enc encoding, int standalone); static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace, xmlChar *encoding); +static text *xml_xmlnodetotext(xmlNodePtr cur); #endif /* USE_LIBXML */ @@ -1463,7 +1467,6 @@ return buf.data; } - /* * Map SQL value to XML value; see SQL/XML:2003 section 9.16. */ @@ -2403,3 +2406,247 @@ else appendStringInfoString(result, "\n\n"); } + + +/* + * XPath related functions + */ + +#ifdef USE_LIBXML +/* + * Convert XML node to text (return value only, it's not dumping) + */ +text * +xml_xmlnodetotext(xmlNodePtr cur) +{ + xmlChar *str; + text *result; + size_t len; + + str = xmlXPathCastNodeToString(cur); + len = strlen((char *) str); + result = (text *) palloc(len + VARHDRSZ); + SET_VARSIZE(result, len + VARHDRSZ); + memcpy(VARDATA(result), str, len); + + return result; +} +#endif + +/* + * Evaluate XPath expression and return array of XML values. + * As we have no support of XQuery sequences yet, this functions seems + * to be the most useful one (array of XML functions plays a role of + * some kind of substritution for XQuery sequences). + + * Workaround here: we parse XML data in different way to allow XPath for + * fragments (see "XPath for fragment" TODO comment inside). + */ +Datum +xmlpath(PG_FUNCTION_ARGS) +{ +#ifdef USE_LIBXML + ArrayBuildState *astate = NULL; + xmlParserCtxtPtr ctxt = NULL; + xmlDocPtr doc = NULL; + xmlXPathContextPtr xpathctx = NULL; + xmlXPathCompExprPtr xpathcomp = NULL; + xmlXPathObjectPtr xpathobj = NULL; + int32len, xpath_len; + xmlChar*string, *xpath_expr; + boolres_is_null = FALSE; + int i; + xmltype*data; + text*xpath_expr_text; + ArrayType *namespaces; + int *dims, ndims, ns_count = 0, bitmask = 1; + char*ptr; + bits8*bitmap; + char**ns_names = NULL, **ns_uris = NULL; + int16typlen; + booltypbyval; + chartypalign; + + /* the function is not strict, we must check first two args */ + if (PG_ARGISNULL(0) || PG_ARGISNULL(1)) + PG_RETURN_NULL(); + + xpath_expr_text = PG_GETARG_TEXT_P(0); + data = PG_GETARG_XML_P(1); + + /* Namespace mappings passed as text[]. + * Assume that 2-dimensional array has been passed, + * the 1st subarray is array of names, the 2nd -- array of URIs, + * example: ARRAY[ARRAY['myns', 'myns2'], ARRAY['http://example.com', 'http://example2.com']]. + */ + if (!PG_ARGISNULL(2)) + { + namespaces = PG_GETARG_ARRAYTYPE_P(2); + ndims = ARR_NDIM(namespaces); + dims = ARR_DIMS(namespaces); + + /* Sanity check */ + if (ndims != 2) + ereport(ERROR, (errmsg("invalid array passed for namespace mappings"), + errdetail("Only 2-dimensional array may be used for namespace mappings."))); + + Assert(ARR_ELEMTYPE(namespaces) == TEXTOID); + + ns_count = ArrayGetNItems(ndims, dims) / 2; + get_typlenbyvalalign(ARR_ELEMTYPE(namespaces), + &typlen, &typbyval, &typalign); + ns_names = (char **) palloc(ns_count * sizeof(char *)); + ns_uris = (char **) palloc(ns_count * sizeof(char *)); + ptr = ARR_DATA_PTR(namespaces); + bitmap = ARR_NULLBITMAP(namespaces); + bitmask = 1; + + for (i = 0; i < ns_count * 2; i++) + { + if (bitmap && (*bitmap & bitmask) == 0) +ereport(ERROR, (errmsg("neither namespace nor URI may be NULL"))); /* TODO: better message */ + else + { +if (i < ns_count) + ns_names[i] = DatumGetCString(DirectFunctionCall1(textout, + PointerGetDatum(ptr))); +else + ns_uris[i - ns_count] = DatumGetCString(DirectFunctionCall1(textout, + PointerGetDatum(ptr))); +ptr = att_ad
Re: [HACKERS] [PATCHES] xpath_array with namespaces support
Nikolay Samokhvalov wrote: What about it? W/o this not large patch XML functionality in 8.3 will be weak... Will it be accepted? In principle I am in favor of the patch. Would it be better to use some more unlikely name for the dummy root element used to process fragments than ? Perhaps even something in a special namespace? cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [COMMITTERS] pgsql: Add GUC temp_tablespaces to provide a default location for
On 3/5/07, Tom Lane <[EMAIL PROTECTED]> wrote: In the second place, it's a serious violation of what little modularity and layering we have for fd.c to be calling into commands/tablespace.c. This is not merely cosmetic but has real consequences: one being that it's now unsafe to call OpenTemporaryFile outside a transaction. ok, you are right... what do you suggest? maybe move the GetTempTablespace function to somewhere in src/backend/utils? -- regards, Jaime Casanova "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning." Richard Cook ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] SoC ECPG Enhancements
> > ECPG Enhancments: Enable ECPG to generate calls directly for libpq > > rather than calls to its own libraries. > > > > What would be the point of that? > > More importantly, Michael hasn't volunteered to mentor this year, so I think > we should probably take the idea down. And most importantly in my opinion: this simply doesn't make sense. :-) I already refused to add this last year and unless someone can explain to me what I am misunderstanding I you can simply remove it. Michael -- Michael Meskes Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED] Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL! ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [COMMITTERS] pgsql: Add GUC temp_tablespaces to provide a default location for
"Jaime Casanova" <[EMAIL PROTECTED]> writes: > On 3/5/07, Tom Lane <[EMAIL PROTECTED]> wrote: >> In the second place, it's a serious violation of what little modularity >> and layering we have for fd.c to be calling into commands/tablespace.c. >> This is not merely cosmetic but has real consequences: one being that >> it's now unsafe to call OpenTemporaryFile outside a transaction. > ok, you are right... what do you suggest? > maybe move the GetTempTablespace function to somewhere in src/backend/utils? You missed the point entirely. Relocating the code to some other file wouldn't change the objection: the problem is that fd.c mustn't invoke any transactional facilities such as catalog lookups. It's too low level for that. You could perhaps do it the other way around: some transactional code (eg the assign hook for a GUC variable) tells fd.c to save some private state controlling future temp file creations. BTW, if we are now thinking of temp files as being directed to particular tablespaces, is there any reason still to have per-database subdirectories for them? It might simplify life if there were just one default temp directory, $PGDATA/base/pgsql_tmp/, plus one per configured temp tablespace, $PGDATA/pg_tblspc//pgsql_tmp/. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_class attributes and race conditions ?)
"Pavan Deolasee" <[EMAIL PROTECTED]> writes: > While creating an index, if a HEAP_ONLY tuple is found, > CREATE INDEX [CONCURRENTLY] fails with an error and the > user needs to SET HOT OFF and then try again. While turning > HOT off, the entire table is CHILLed, holding AccessExclusive > lock on the table. Once the new index is created, user > can turn HOT on again. It hardly seems acceptable to require exclusive lock to chill a table. In production situations, knowing that you'd have to do that to do index maintenance on a large table would probably scare you off of ever enabling the feature at all. Last year we were getting beaten up about how it wasn't acceptable for CREATE INDEX to lock out writes for a long time; how is it suddenly acceptable to need to lock out both reads and writes for a long time before you can even think about creating an index? regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Bug in UTF8-Validation Code?
Jeff Davis wrote: On Wed, 2007-03-14 at 01:29 -0600, Michael Fuhr wrote: On Tue, Mar 13, 2007 at 04:42:35PM +0100, Mario Weilguni wrote: Am Dienstag, 13. März 2007 16:38 schrieb Joshua D. Drake: Is this any different than the issues of moving 8.0.x to 8.1 UTF8? Where we had to use iconv? What issues? I've upgraded several 8.0 database to 8.1. without having to use iconv. Did I miss something? http://www.postgresql.org/docs/8.1/interactive/release-8-1.html "Some users are having problems loading UTF-8 data into 8.1.X. This is because previous versions allowed invalid UTF-8 byte sequences to be entered into the database, and this release properly accepts only valid UTF-8 sequences. One way to correct a dumpfile is to run the command iconv -c -f UTF-8 -t UTF-8 -o cleanfile.sql dumpfile.sql." If the above quote were actually true, then Mario wouldn't be having a problem. Instead, it's half-true: Invalid byte sequences are rejected in some situations and accepted in others. If postgresql consistently rejected or consistently accepted invalid byte sequences, that would not cause problems with COPY (meaning problems with pg_dump, slony, etc.). How can we fix this? Frankly, the statement in the docs warning about making sure that escaped sequences are valid in the server encoding is a cop-out. We don't accept invalid data elsewhere, and this should be no different IMNSHO. I don't see why this should be any different from, say, date or numeric data. For years people have sneered at MySQL because it accepted dates like Feb 31st, and rightly so. But this seems to me to be like our own version of the same problem. Last year Jeff suggested adding something like: pg_verifymbstr(string,strlen(string),0); to each relevant input routine. Would that be an acceptable solution? If not, what would be? cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Bison 2.1 on win32
Magnus Hagander <[EMAIL PROTECTED]> writes: > The attached patch seems to fix the build issue. Does it seem > acceptable/the right thing to do? No, it seems pretty bletcherous. > Another option would be to just reject both 2.0 and 2.1 as broken to > build pg with, I guess... In bison 2.3 (which is shipping with Fedora Core 6), all the uses of __STDC__ seem to also test _MSC_VER: #if (defined __STDC__ || defined __C99__FUNC__ \ || defined __cplusplus || defined _MSC_VER) If this fixes the problem, then I'd vote for just stating you need bison >= 2.3 (or 2.2?) to build on MSVC. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Bug in UTF8-Validation Code?
Andrew Dunstan <[EMAIL PROTECTED]> writes: > Last year Jeff suggested adding something like: > pg_verifymbstr(string,strlen(string),0); > to each relevant input routine. Would that be an acceptable solution? The problem with that is that it duplicates effort: in many cases (especially COPY IN) the data's already been validated. I'm not sure how to fix that, but I think you'll get some push-back if you double the encoding verification work in COPY for nothing. Given that we are moving away from backslash-enabled literals, I'm not as convinced as some that this must be fixed... regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Bison 2.1 on win32
Tom Lane wrote: > Magnus Hagander <[EMAIL PROTECTED]> writes: >> The attached patch seems to fix the build issue. Does it seem >> acceptable/the right thing to do? > > No, it seems pretty bletcherous. That's kind of what I thought :-) >> Another option would be to just reject both 2.0 and 2.1 as broken to >> build pg with, I guess... > > In bison 2.3 (which is shipping with Fedora Core 6), all the uses of > __STDC__ seem to also test _MSC_VER: > > #if (defined __STDC__ || defined __C99__FUNC__ \ > || defined __cplusplus || defined _MSC_VER) > > If this fixes the problem, then I'd vote for just stating you need > bison >= 2.3 (or 2.2?) to build on MSVC. It certainly looks like it would fix it. However, the gnuwin32 project doesn't have 2.3, not even 2.2. But let's add a check and say we need it, and then we can fix it again once they do release that version if it's broken still... Do you happen to have a 2.2 around so you can see what happens there? Or does someone else have that? So I know which version to test against... //Magnus ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Bug in UTF8-Validation Code?
Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: Last year Jeff suggested adding something like: pg_verifymbstr(string,strlen(string),0); to each relevant input routine. Would that be an acceptable solution? The problem with that is that it duplicates effort: in many cases (especially COPY IN) the data's already been validated. I'm not sure how to fix that, but I think you'll get some push-back if you double the encoding verification work in COPY for nothing. Given that we are moving away from backslash-enabled literals, I'm not as convinced as some that this must be fixed... They will still be available in E'\nn' form, won't they? One thought I had was that it might make sense to have a flag that would inhibit the check, that could be set (and reset) by routines that check for themselves, such as COPY IN. Then bulk load performance should not be hit much. cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Bison 2.1 on win32
Magnus Hagander wrote: > I just tried building with Bison 2.1 on msvc, and it broke. For one > thing, the .BAT file rejects 2.1 as broken instead of 2.0, which is > obviously incorrect :-) Actually, looking at the GNU ftp site, there isn't even a version 2.2 available. There is a 2.1a which should have the fix (based on file dates - they don't use branches or tags in their cvs repository). But I'll go ahead and just say it's fixed in 2.3, and patch accordingly. //Magnus ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Bison 2.1 on win32
Magnus Hagander <[EMAIL PROTECTED]> writes: > Actually, looking at the GNU ftp site, there isn't even a version 2.2 > available. There is a 2.1a which should have the fix (based on file > dates - they don't use branches or tags in their cvs repository). Huh? At http://ftp.gnu.org/gnu/bison/ I see 2.0, 2.1, 2.2, 2.3, and no 2.1a. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Bison 2.1 on win32
Tom Lane wrote: > Magnus Hagander <[EMAIL PROTECTED]> writes: >> Actually, looking at the GNU ftp site, there isn't even a version 2.2 >> available. There is a 2.1a which should have the fix (based on file >> dates - they don't use branches or tags in their cvs repository). > > Huh? At > http://ftp.gnu.org/gnu/bison/ > I see 2.0, 2.1, 2.2, 2.3, and no 2.1a. Gah. I clicked the wrong link that was titled "test releases" ;-) I just found a second ftp link when the first one didn't work. Should've read more carefully. Ok. So it looks like 2.2 should be fine as well, as soon as they put out a new win32 ver... (FYI, the code seems to come from data/c.m4) /Magnus ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Bison 2.1 on win32
Magnus Hagander <[EMAIL PROTECTED]> writes: > Do you happen to have a 2.2 around so you can see what happens there? Or > does someone else have that? So I know which version to test against... 2.2 and 2.3 seem to use _MSC_VER in the same way. I had occasion to test both last fall, and they generate only trivially different outputs from our grammar. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
On Sat, 2007-03-17 at 11:45 -0400, Tom Lane wrote: > "Pavan Deolasee" <[EMAIL PROTECTED]> writes: > > While creating an index, if a HEAP_ONLY tuple is found, > > CREATE INDEX [CONCURRENTLY] fails with an error and the > > user needs to SET HOT OFF and then try again. While turning > > HOT off, the entire table is CHILLed, holding AccessExclusive > > lock on the table. Once the new index is created, user > > can turn HOT on again. > > It hardly seems acceptable to require exclusive lock to chill a table. > In production situations, knowing that you'd have to do that to do > index maintenance on a large table would probably scare you off of > ever enabling the feature at all. Last year we were getting beaten up > about how it wasn't acceptable for CREATE INDEX to lock out writes > for a long time; how is it suddenly acceptable to need to lock out > both reads and writes for a long time before you can even think > about creating an index? I agree with you: It isn't acceptable for us to contemplate an AccessExclusiveLock before we can build any index. We *must* make CREATE INDEX CONCURRENTLY work with HOT. The good news is I think we can without significant difficulty. The problems are with CREATE INDEX, in some cases. I regret that I did not see those difficulties until recently, which is why I was concerned that we spent time on VACUUM FULL rather than this issue. I'm glad now that you both pressed ahead and solved that though. As a result of the issues, I think Pavan is playing safe, to make sure there is *an* option, so that we can build upwards from there. The proposal is pragmatism only, while we discuss other approaches. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Bug in UTF8-Validation Code?
Andrew Dunstan <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> The problem with that is that it duplicates effort: in many cases >> (especially COPY IN) the data's already been validated. > One thought I had was that it might make sense to have a flag that would > inhibit the check, that could be set (and reset) by routines that check > for themselves, such as COPY IN. Then bulk load performance should not > be hit much. Actually, I have to take back that objection: on closer look, COPY validates the data only once and does so before applying its own backslash-escaping rules. So there is a risk in that path too. It's still pretty annoying to be validating the data twice in the common case where no backslash reduction occurred, but I'm not sure I see any good way to avoid it. I don't much want to add another argument to input functions, and the global flag that you suggest above seems too ugly/risky. Would someone do some performance checking on the cost of adding mbverify to textin()? If it could be shown that it adds only negligible overhead to COPY, on say hundred-byte-wide text fields, then we could decide that this isn't worth worrying about. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_class attributes and race conditions ?)
Tom Lane wrote: > "Pavan Deolasee" <[EMAIL PROTECTED]> writes: >> While creating an index, if a HEAP_ONLY tuple is found, >> CREATE INDEX [CONCURRENTLY] fails with an error and the >> user needs to SET HOT OFF and then try again. While turning >> HOT off, the entire table is CHILLed, holding AccessExclusive >> lock on the table. Once the new index is created, user >> can turn HOT on again. > > It hardly seems acceptable to require exclusive lock to chill a table. > In production situations, knowing that you'd have to do that to do > index maintenance on a large table would probably scare you off of > ever enabling the feature at all. Last year we were getting beaten up > about how it wasn't acceptable for CREATE INDEX to lock out writes > for a long time; how is it suddenly acceptable to need to lock out > both reads and writes for a long time before you can even think > about creating an index? > Yeah, I agree. I was proposing this as a stop-gap solution though. Something which would help us solve the problem without changing the current behavior for non-HOT tables. So what do you suggest ? Do you feel that there is no way we can solve the problem ? ISTM that if we run CHILL as a seperate transaction (just like VACUUM), we should be able to CHILL the table with ShareUpdateExclusiveLock. Running it as a seperate transaction would reduce the risk of causing deadlocks. Is that a fair assessment ? If we need to CHILL with ShareUpdateExclusiveLock, IMHO we would again be back to something similar to the first approach. I know you said its fragile and full of race conditions, but do you think we can handle it better if we have a seperate DDL command, running within its own transaction ? The algorithm would look like: 1. Disable HOT-updates 2. CHILL the table by inserting appropriate index entries and marking tuples CHILL_IN_PROGRESS 3. Establish a point when there are no open index scans 4. Disable HOT-fetches 5. Scan the heap again, reset CHILL_IN_PROGRESS, HEAP_ONLY and HOT_UPDATED flags 6. Establish a point when there are no open index scans 7. Enable HOT-fetches 8. Enable HOT-updates I need help to do the step 1,3,4,6,7 and 8 (well all :-)) in a deadlock and race condition free mannner. Any suggestions ? Where do we keep the global state about HOT-updates/HOT-fetches ? If we keep it in pg_class, a crash of the CHILL command or the server may leave the pg_class row in a stale state. That does not look like a problem though. In the worst case, we might not be able to do HOT-updates without manual intervention. Again comments, suggestions ? I really appreciate everyone's time and patience. Help is what I need to solve this problem. Thanks, Pavan -- EnterpriseDBhttp://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
Simon Riggs wrote: > > As a result of the issues, I think Pavan is playing safe, to make sure > there is *an* option, so that we can build upwards from there. The > proposal is pragmatism only, while we discuss other approaches. > Absolutely true. I agree that CHILLing the table with AccessExclusive lock is not a solution that I like. I was looking at it just as a stop-gap solution, given the time left for feature freeze. But if there is something better that we can do, I am for it. Thanks, Pavan -- EnterpriseDBhttp://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Bug in UTF8-Validation Code?
I wrote: > Actually, I have to take back that objection: on closer look, COPY > validates the data only once and does so before applying its own > backslash-escaping rules. So there is a risk in that path too. > It's still pretty annoying to be validating the data twice in the > common case where no backslash reduction occurred, but I'm not sure > I see any good way to avoid it. Further thought here: if we put encoding verification into textin() and related functions, could we *remove* it from COPY IN, in the common case where client and server encodings are the same? Currently, copy.c forces a trip through pg_client_to_server for multibyte encodings even when the encodings are the same, so as to perform validation. But I'm wondering whether we'd still need that. There's no risk of SQL injection in COPY data. Bogus input encoding could possibly make for confusion about where the field boundaries are, but bad data is bad data in any case. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] xpath_array with namespaces support
Nikolay Samokhvalov wrote: On 3/17/07, Andrew Dunstan <[EMAIL PROTECTED]> wrote: In principle I am in favor of the patch. Would it be better to use some more unlikely name for the dummy root element used to process fragments than ? Perhaps even something in a special namespace? I did think about it, but I didn't find any difficulties with simple The thing is that regardless the element name we have corresponding shift in XPath epression -- so, there cannot be any problem from my point of view... But maybe I don't see something and it's better to avoid _possible_ problem. It depends on PostgreSQL code style itself -- what is the best approach in such cases? To avoid unknown possible difficulties or to be clear? If you are sure that it won't cause a problem then I think it's ok to leave it, as long as there is a comment in the code that says why we are sure it's ok. cheers andrew ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Project suggestion: benchmark utility for PostgreSQL
Hello everybody. I am a student and I would like to have your opinion on a project I plan to submit for GSoC. At school, when I work with relational databases I have problems to test tables' structure and queries because I need to insert test data manually, which is very unpleasant. Therefore, I suggest creating a benchmark utility for PostgreSQL. This utility would be divided in two parts: first part is the **random data generator** to fill in the tables with test data, and second part is a **test module** in which we can execute a set of queries and analyze response time and results. The random data generator would be able to generate data of every type (from 8.1 Numeric Types to 8.10 Arrays in official documentation). When starting, the program connects to a database, retrieves the tables and proposes adding random data to the user. The user uses pattern and dictionaries to make those generators work. The test module would consist in a sort of unit tests module. The user adds queries and possibly sets the expected results of those queries. The module executes the queries and gives for each one the results, the time it took to execute, and if the results are those which were expected. And finally, I did not talk about that before. But I think it would be a great idea for a complete benchmark utility... if it would give some advices to optimize the tables' structure (columns' type, indexes, foreign keys). Of course, those advices would be for a novice user as expert already knows how to optimize the columns types, the joins... For a programming language, as it would be for GSoC, it has to be realized in three month and I believe the utility has to be cross-platform (anyway I want it to be). So I think Java would be good. I am very used to Java and Swing programming. What do you think about that choice? If you feel Java is a bad choice, there is C++/Boost/wxWidget/ (like pgAdmin). But with wxWidget, I am not sure if a GUI works under Windows and Linux it will work under MacOS without hacks. Thank you for having read me until there. I am very excited about such project. I wait for your comments. Best regards, Mickael Deloison ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
On Sat, 2007-03-17 at 23:11 +0530, Pavan Deolasee wrote: > The algorithm would look like: > > 1. Disable HOT-updates The lock held by CREATE INDEX already prevents HOT updates. So steps 1 and 8 aren't needed. We need to be clear that we already have a solution to CREATE INDEX CONCURRENTLY. Do you agree that we do? Does anyone see a problem with the posted design for that? Hopefully it is only CREATE INDEX that we need to think about. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: CREATE INDEX and HOT (was [HACKERS] Question: pg_classattributes and race conditions ?)
Simon Riggs wrote: > > We need to be clear that we already have a solution to CREATE INDEX > CONCURRENTLY. Do you agree that we do? Does anyone see a problem with > the posted design for that? > > Hopefully it is only CREATE INDEX that we need to think about. > I agree. Lets first decide whether its only CREATE INDEX that needs solution or its both. Lets also decide whether we want to fix CREATE INDEX [CONCURRENTLY] or we want to provide a seperate DLL to CHILL the table and then build index normally. Tom has already rejected the idea of holding exclusive lock while chilling, but if we get around that, is rest of the approach acceptable ? If its going to be a seperate DLL, lets decide whether its acceptable to run it as a seperate transaction. Thanks, Pavan -- EnterpriseDBhttp://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Bison 2.1 on win32
Hi. I was operating in a tentative way by 2.3. Still, it is not sufficient. However, it moves. I will think that I am glad, if it can adjust with Magnus. http://winpg.jp/~saito/MinGW/bison-2.3_win32_src.tgz Regards, Hiroshi Saito - Original Message - From: "Tom Lane" <[EMAIL PROTECTED]> To: "Magnus Hagander" <[EMAIL PROTECTED]> Cc: "PGSQL Hackers" Sent: Sunday, March 18, 2007 2:06 AM Subject: Re: [HACKERS] Bison 2.1 on win32 Magnus Hagander <[EMAIL PROTECTED]> writes: Do you happen to have a 2.2 around so you can see what happens there? Or does someone else have that? So I know which version to test against... 2.2 and 2.3 seem to use _MSC_VER in the same way. I had occasion to test both last fall, and they generate only trivially different outputs from our grammar. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Bug in UTF8-Validation Code?
Tom Lane wrote: I wrote: Actually, I have to take back that objection: on closer look, COPY validates the data only once and does so before applying its own backslash-escaping rules. So there is a risk in that path too. It's still pretty annoying to be validating the data twice in the common case where no backslash reduction occurred, but I'm not sure I see any good way to avoid it. Further thought here: if we put encoding verification into textin() and related functions, could we *remove* it from COPY IN, in the common case where client and server encodings are the same? Currently, copy.c forces a trip through pg_client_to_server for multibyte encodings even when the encodings are the same, so as to perform validation. But I'm wondering whether we'd still need that. There's no risk of SQL injection in COPY data. Bogus input encoding could possibly make for confusion about where the field boundaries are, but bad data is bad data in any case. regards, tom lane Here are some timing tests in 1m rows of random utf8 encoded 100 char data. It doesn't look to me like the saving you're suggesting is worth the trouble. baseline: Time: 28228.325 ms Time: 25987.740 ms Time: 25950.707 ms Time: 25756.371 ms Time: 27589.719 ms Time: 25774.417 ms after adding suggested extra test to textin(): Time: 26722.376 ms Time: 28343.226 ms Time: 26529.364 ms Time: 28020.140 ms Time: 24836.853 ms Time: 24860.530 ms Script is: \timing create table xyz (x text); copy xyz from '/tmp/utf8.data'; truncate xyz; copy xyz from '/tmp/utf8.data'; truncate xyz; copy xyz from '/tmp/utf8.data'; truncate xyz; copy xyz from '/tmp/utf8.data'; truncate xyz; copy xyz from '/tmp/utf8.data'; truncate xyz; copy xyz from '/tmp/utf8.data'; drop table xyz; Test platform: FC6, Athlon64. cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Project suggestion: benchmark utility for PostgreSQL
Mickael DELOISON wrote: For a programming language, as it would be for GSoC, it has to be realized in three month and I believe the utility has to be cross-platform (anyway I want it to be). So I think Java would be good. I am very used to Java and Swing programming. What do you think about that choice? If you feel Java is a bad choice, there is C++/Boost/wxWidget/ (like pgAdmin). But with wxWidget, I am not sure if a GUI works under Windows and Linux it will work under MacOS without hacks. I just wanted to comment on the wxWidgest-under-OSX part. I works pretty well, but of course there are some bugs in wxWidgets that plague only a particular plattform. And since OSX seems to be one of the lesser used one, there appear to be more bugs of this kind. But if you grep through the pgAdmin3 sources, you'll find only little OSX-specific code. One nice thing about wxWidgets is that applications that use if feel like native applications on the platforms that wxWidgets supports. Java-GUIs often feel rather alien, at least IMHO. greetings, Florian Pflug ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: CREATE INDEX and HOT (was [HACKERS] Question:pg_classattributes and race conditions ?)
On Sun, 2007-03-18 at 00:44 +0530, Pavan Deolasee wrote: > Simon Riggs wrote: > > > > We need to be clear that we already have a solution to CREATE INDEX > > CONCURRENTLY. Do you agree that we do? Does anyone see a problem with > > the posted design for that? If we have solved CREATE INDEX CONCURRENTLY, then I would propose that this becomes the default option for creating an index, when the statement is issued outside of a statement block. That seems better than reminding everybody to run with the CONCURRENTLY option, or advise them of different performance characteristics or behaviour of the normal CREATE INDEX route. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] Bitmapscan changes
This is on dual ultra 2 sparc. with ultrawide 320 scsi drives. 512MB ram. I had to drop size of DB, because the DB drive is 4GB (I do welecome bigger drives as donation, if someone asks - UWscsi 320). here are my results. With only 4.2 patch (no maintain cluster order v5 patch). If the v5 patch was needed, please tell me - I am going rerun it with. hope it is usefull. Repeat 3 times to ensure repeatable results. Timing is on. select_with_normal_index -- 10 (1 row) Time: 1727891.334 ms select_with_normal_index -- 10 (1 row) Time: 1325561.252 ms select_with_normal_index -- 10 (1 row) Time: 1348530.100 ms Timing is off. And now run the same tests with clustered index Timing is on. select_with_clustered_index - 10 (1 row) Time: 870246.856 ms select_with_clustered_index - 10 (1 row) Time: 477089.456 ms select_with_clustered_index - 10 (1 row) Time: 381880.965 ms Timing is off. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Bug in UTF8-Validation Code?
Andrew Dunstan <[EMAIL PROTECTED]> writes: > Here are some timing tests in 1m rows of random utf8 encoded 100 char > data. It doesn't look to me like the saving you're suggesting is worth > the trouble. Hmm ... not sure I believe your numbers. Using a test file of 1m lines of 100 random latin1 characters converted to utf8 (thus, about half and half 7-bit ASCII and 2-byte utf8 characters), I get this in SQL_ASCII encoding: regression=# \timing Timing is on. regression=# create temp table test(f1 text); CREATE TABLE Time: 5.047 ms regression=# copy test from '/home/tgl/zzz1m'; COPY 100 Time: 4337.089 ms and this in UTF8 encoding: utf8=# \timing Timing is on. utf8=# create temp table test(f1 text); CREATE TABLE Time: 5.108 ms utf8=# copy test from '/home/tgl/zzz1m'; COPY 100 Time: 7776.583 ms The numbers aren't super repeatable, but it sure looks to me like the encoding check adds at least 50% to the runtime in this example; so doing it twice seems unpleasant. (This is CVS HEAD, compiled without assert checking, on an x86_64 Fedora Core 6 box.) regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Bug in UTF8-Validation Code?
Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: Here are some timing tests in 1m rows of random utf8 encoded 100 char data. It doesn't look to me like the saving you're suggesting is worth the trouble. Hmm ... not sure I believe your numbers. Using a test file of 1m lines of 100 random latin1 characters converted to utf8 (thus, about half and half 7-bit ASCII and 2-byte utf8 characters), I get this in SQL_ASCII encoding: regression=# \timing Timing is on. regression=# create temp table test(f1 text); CREATE TABLE Time: 5.047 ms regression=# copy test from '/home/tgl/zzz1m'; COPY 100 Time: 4337.089 ms and this in UTF8 encoding: utf8=# \timing Timing is on. utf8=# create temp table test(f1 text); CREATE TABLE Time: 5.108 ms utf8=# copy test from '/home/tgl/zzz1m'; COPY 100 Time: 7776.583 ms The numbers aren't super repeatable, but it sure looks to me like the encoding check adds at least 50% to the runtime in this example; so doing it twice seems unpleasant. (This is CVS HEAD, compiled without assert checking, on an x86_64 Fedora Core 6 box.) Are you comparing apples with apples? The db is utf8 in both of my cases. cheers andrew ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] Bitmapscan changes
Wow, nice! Can you tell us: - how big is the table - cardinality of the column - how big is the index in each case - how much memory on the machine - query and explain analyze Thanks! - Luke Msg is shrt cuz m on ma treo -Original Message- From: Grzegorz Jaskiewicz [mailto:[EMAIL PROTECTED] Sent: Saturday, March 17, 2007 05:16 PM Eastern Standard Time To: Joshua D.Drake Cc: Heikki Linnakangas; PostgreSQL-development Hackers Subject:Re: [HACKERS] [PATCHES] Bitmapscan changes This is on dual ultra 2 sparc. with ultrawide 320 scsi drives. 512MB ram. I had to drop size of DB, because the DB drive is 4GB (I do welecome bigger drives as donation, if someone asks - UWscsi 320). here are my results. With only 4.2 patch (no maintain cluster order v5 patch). If the v5 patch was needed, please tell me - I am going rerun it with. hope it is usefull. Repeat 3 times to ensure repeatable results. Timing is on. select_with_normal_index -- 10 (1 row) Time: 1727891.334 ms select_with_normal_index -- 10 (1 row) Time: 1325561.252 ms select_with_normal_index -- 10 (1 row) Time: 1348530.100 ms Timing is off. And now run the same tests with clustered index Timing is on. select_with_clustered_index - 10 (1 row) Time: 870246.856 ms select_with_clustered_index - 10 (1 row) Time: 477089.456 ms select_with_clustered_index - 10 (1 row) Time: 381880.965 ms Timing is off. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Bitmapscan changes
On Mar 17, 2007, at 10:33 PM, Luke Lonergan wrote: Wow, nice! Can you tell us: - how big is the table - cardinality of the column - how big is the index in each case - how much memory on the machine - query and explain analyze All I changed, was the 400k to 150k 512MB of ram, as I said earlier. And it is running 64bit kernel, 32bit user-land on linux 2.6.20 query and explain is going to run for a while, so I'll leave it - as it is going to be the same on other machines (much faster ones). postgres=# select pg_size_pretty( pg_relation_size ( 'narrowtable_index' ) ); pg_size_pretty 321 MB (1 row) postgres=# select pg_size_pretty( pg_relation_size ( 'narrowtable2_clustered_index' ) ); pg_size_pretty 3960 kB (1 row) (so there's quite a difference). Judging from noises coming out of machine, there was pretty loads of I/O activity. and funny enough, one CPU was stucked on 'wait' up to 80% most of the time. the 'cardinality', as I guess, uniqueness is the same as intended in original test. Like I said, only table size was changed. select count(distinct key) from narrowtable; and select count(*) from narrowtable; are the same - 1500 hth. -- Grzegorz Jaskiewicz C/C++ freelance for hire ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Bug in UTF8-Validation Code?
Tom Lane wrote: Andrew Dunstan <[EMAIL PROTECTED]> writes: Here are some timing tests in 1m rows of random utf8 encoded 100 char data. It doesn't look to me like the saving you're suggesting is worth the trouble. Hmm ... not sure I believe your numbers. Using a test file of 1m lines of 100 random latin1 characters converted to utf8 (thus, about half and half 7-bit ASCII and 2-byte utf8 characters), I get this in SQL_ASCII encoding: regression=# \timing Timing is on. regression=# create temp table test(f1 text); CREATE TABLE Time: 5.047 ms regression=# copy test from '/home/tgl/zzz1m'; COPY 100 Time: 4337.089 ms and this in UTF8 encoding: utf8=# \timing Timing is on. utf8=# create temp table test(f1 text); CREATE TABLE Time: 5.108 ms utf8=# copy test from '/home/tgl/zzz1m'; COPY 100 Time: 7776.583 ms The numbers aren't super repeatable, but it sure looks to me like the encoding check adds at least 50% to the runtime in this example; so doing it twice seems unpleasant. (This is CVS HEAD, compiled without assert checking, on an x86_64 Fedora Core 6 box.) Here are some test results that are closer to yours. I used a temp table and had cassert off and fsync off, and tried with several encodings. The additional load from the test isn't 50%, (I think you have added the cost of going from ascii to utf8 to the cost of the test to get that 50%) but it is nevertheless appreciable. I agree that we should look at not testing if the client and server encodings are the same, so we can reduce the difference. cheers andrew Run SQL_ASCII LATIN1 UTF8 1 4659.38 4766.07 9134.53 2 7999.64 4003.13 6231.41 3 4178.46 6178.89 7266.39 Without test 44201.7 3930.84 10154.38 5 4092.44 .52 9438.24 6 3977.34 4197.09 8866.56 Average 4851.49 4586.76 8515.25 1 11993.86 12625.8 10109.89 2 4647.16 9192.53 11251.27 With test 3 4211.02 9903.77 10097.37 4 9203.62 7045.06 10372.25 5 4121.39 4138.78 10386.92 6 3722.73 4552.09 7432.56 Average 6316.63 7909.67 9941.71 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly