Re: [PATCHES] hash index improving v3
"Alex Hunsaker" <[EMAIL PROTECTED]> writes: > - tbm_add_tuples(tbm, &scan->xs_ctup.t_self, 1, false); > + tbm_add_tuples(tbm, &scan->xs_ctup.t_self, 1, true); Hah, I bet that explains Jonah's complaint that recheck didn't seem to be happening in his tests. Nice catch. regards, tom lane -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] hash index improving v3
"Alex Hunsaker" <[EMAIL PROTECTED]> writes: > On Fri, Sep 5, 2008 at 2:21 PM, Alex Hunsaker <[EMAIL PROTECTED]> wrote: >> Ok now that I made it so it actually *test* collisions, with the patch >> it always returns all rows that matched the hashed "key". > And here is the fix, we just forget to set the recheck flag for bitmap scans. For the convenience of anyone intending to test, here is an updated patch against CVS HEAD that incorporates Alex's fix. regards, tom lane binwQERo3PM5e.bin Description: hash-v5.patch.gz -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] still alive?
Abhijit Menon-Sen wrote: > I thought -patches was supposed to die. What happened? I was wondering the same thing. Peter? -- Bruce Momjian <[EMAIL PROTECTED]>http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] to_date() validation
On Fri, Aug 29, 2008 at 7:39 PM, Brendan Jurd <[EMAIL PROTECTED]> wrote: > Hi all, > > Well, it's been a long time coming, but I've finally got a patch to > improve the error handling of to_date() and to_timestamp(), as > previously discussed on -hackers. [1][2] BTW -patches is obsolete just submit pathces to -hackers. Im just following this: http://wiki.postgresql.org/wiki/Reviewing_a_Patch so lets get started. Submission review: Everything looks good. Tests seem reasonable patch applies etc. Usability review: I think both linked threads in the parent mail give sufficient justification. Feature test: Everything seems to work as advertised. However before via sscanf() most limited the max length of the input. Before they were just silently ignored now they get added to the result: patched: # SELECT to_timestamp('1', 'HHMM'); to_timestamp -- 0009-03-19 11:00:00-06:59:56 8.3.3: # SELECT to_timestamp('1', 'HHMM'); to_timestamp --- 0001-11-01 11:00:00-07 BC Performance review: simple pgbench of to_timestamp did not show any noticeable differences Coding review: seemed fine to me, code matched surrounding style, comments made sense etc Code review: one minor nit *** a/src/backend/utils/adt/formatting.c --- b/src/backend/utils/adt/formatting.c *** *** 781,787 static const KeyWord DCH_keywords[] = { {"y", 1, DCH_Y, TRUE, FROM_CHAR_DATE_GREGORIAN}, /* last */ ! {NULL, 0, 0, 0} }; /* -- --- 781,787 {"y", 1, DCH_Y, TRUE, FROM_CHAR_DATE_GREGORIAN}, /* last */ ! {NULL, 0, 0, 0, 0} }; /* -- *** -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [PgFoundry] Unsigned Data Types [1 of 2]
On Sun, Aug 31, 2008 at 3:35 PM, Ryan Bradetich <[EMAIL PROTECTED]> wrote: > Hello all, > a few comments. - i think you have to add some more comments in uint.c file and maybe a header indicating this is part of the postgresql project or that is intended to use with postgres or something of the like - what is uint1? i know int, int2, int4, int8 so i think we should have uint, uint2, uint4 (maybe uint8?) >uint-base.tar.bz2 -- The core of the unsigned integer type. seems there is something wrong in the unlikely macro (i'm using GCC 4.2.3 in Ubuntu 4.2.3-2ubuntu7 with amd64) postgres=# select -256::uint1; ERROR: uint1 out of range STATEMENT: select -256::uint1; ERROR: uint1 out of range postgres=# select -255::uint1; ?column? -- -255 (1 row) postgres=# select -2::uint1; ?column? -- -2 (1 row) postgres=# select -5::uint1 + 30::uint1; ?column? -- 25 (1 row) >uint-tests.tar.bz2 -- The regression tests. > here failed two regression tests but that is because the path >* Converted build system to use PGXS (more portable). the Makefile doesn't work here... i have installed postgres 8.3.3 from ubuntu package and the test env i compile manually (the uint module tried to install in the ubuntu location while it should in the env location) attached a Makefile that fix that i still have to make some more test... -- regards, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. (593) 87171157 Makefile Description: Binary data regression.diffs Description: Binary data -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [PgFoundry] Unsigned Data Types [1 of 2]
"Jaime Casanova" <[EMAIL PROTECTED]> writes: > seems there is something wrong in the unlikely macro (i'm using GCC > 4.2.3 in Ubuntu 4.2.3-2ubuntu7 with amd64) > postgres=# select -256::uint1; > ERROR: uint1 out of range No, that's just because this is parsed as -(256::uint1) regards, tom lane -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker)
"David Rowley" <[EMAIL PROTECTED]> writes: > I've made and attached the changes Heikki recommended. I looked this over a bit and was immediately confused by one thing: the introductory comment says that the skip table size ought to be based on the length of the haystack, which makes sense to me, but the code is actually initializing it on the basis of len2, ie, the length of the needle. Isn't that a bug? Was the same bug present in the tests you made to determine the best table sizes? Stylistic issues: I really dislike having two copies of the heuristic size-setting code. I think it's worth introducing a second test of use_wchar in order to arrange text_position_setup like this: ... existing code ... choose skiptablesize initialize skip table (this loop doesn't depend on char width) if (!state->use_wchar) process char needle else process wide-char needle Also it might be worth having local-variable copies of skiptablesize and len2 for this initialization loop: for (ai = 0; ai <= state->skiptablesize; ai++) state->skiptable[ai] = state->len2; I'm not at all sure whether a C compiler will think that it's allowed to avoid fetching these values again on each iteration, seeing that the loop is storing into other integer members of *state. Given that this loop is exactly the overhead we're trying to control by adjusting skiptablesize, making it as tight as possible seems worthwhile. > Now that the skip table is a member of TextPositionState, I was not quite > sure if I should #define a size for it. It would certainly look neater, only > the code that defines the skip table size in text_position_start assumes 256 > elements. I tend to agree that a #define is pointless there, since there's no easy way to make the relevant code do anything with it. It would be worthwhile to point out adjacent to the code what the max length is, though. I had gotten as far as rewriting your introductory comment like this /* * Prepare the skip table for Boyer-Moore-Horspool searching. First we * must determine how big a skip table to use. The declaration of * TextPositionState allows up to 256 elements, but with haystack lengths * of only a few characters we don't really want to have to initialize so * many elements --- it would take too long in comparison to the actual * search time. So we choose a useful skip table size based on the * haystack length. before noticing that what I was writing wasn't what the code actually did. Another point that doesn't seem to be included in your comments is that the skip table size *must* be a power of 2 because we use bit masking. (In fact state->skiptablesize might better be named skiptablemask since it's really 1 less than the table size.) BTW, try to avoid introducing so much random vertical space. IMHO it does little for readability and much to make routines too long to fit in one screenful. Generally the PG code doesn't use double blank lines anywhere inside a function, nor blank lines before a right brace line. pg_indent will clean this up to some extent, but it would be better if the code looked more like the project style in the first place. regards, tom lane -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker)
I wrote: > I looked this over a bit and was immediately confused by one thing: > the introductory comment says that the skip table size ought to be based > on the length of the haystack, which makes sense to me, but the code is > actually initializing it on the basis of len2, ie, the length of the > needle. Isn't that a bug? Was the same bug present in the tests you > made to determine the best table sizes? BTW, to the extent that you feel like testing a different idea, I would suggest: * don't bother initializing the skiptable when len1 < len2 * otherwise, choose its size based on len1 - len2, not just len1 or len2. This is (one less than) the maximum number of search loop consultations of the skip table that can happen, so it seems like a plausible number, and better than either length alone. regards, tom lane -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [PgFoundry] Unsigned Data Types [1 of 2]
On Sat, Sep 6, 2008 at 3:57 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > "Jaime Casanova" <[EMAIL PROTECTED]> writes: >> seems there is something wrong in the unlikely macro (i'm using GCC >> 4.2.3 in Ubuntu 4.2.3-2ubuntu7 with amd64) > >> postgres=# select -256::uint1; >> ERROR: uint1 out of range > > No, that's just because this is parsed as -(256::uint1) > actually, i thought that case is right but the -255::uint1 returning a negative number (aka -255) is what bothers me -- regards, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. (593) 87171157 -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [PgFoundry] Unsigned Data Types [1 of 2]
On Sat, Sep 6, 2008 at 3:57 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > "Jaime Casanova" <[EMAIL PROTECTED]> writes: >> seems there is something wrong in the unlikely macro (i'm using GCC >> 4.2.3 in Ubuntu 4.2.3-2ubuntu7 with amd64) > >> postgres=# select -256::uint1; >> ERROR: uint1 out of range > > No, that's just because this is parsed as -(256::uint1) > ah! ok, i see the point... postgres=# select 256::uint1; ERROR: uint1 out of range but is right that way of parsing? so i get a negative number instead of an error? -- regards, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. (593) 87171157 -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker)
Tom Lane wrote: > I looked this over a bit and was immediately confused by one thing: > the introductory comment says that the skip table size ought to be based > on the length of the haystack, which makes sense to me, but the code is > actually initializing it on the basis of len2, ie, the length of the > needle. Isn't that a bug? Was the same bug present in the tests you > made to determine the best table sizes? Yes Bug. That's what I get for making last minute changes. It didn't affect the benchmark, that was done before moving the code into postgresql for testing. The function I tested with had an extra parameter to set the skip table size. Each possible size was tested and the best time was saved along with the best size. > BTW, to the extent that you feel like testing a different idea, > I would suggest: > * don't bother initializing the skiptable when len1 < len2 Good plan. > * otherwise, choose its size based on len1 - len2, not just len1 or > len2. This is (one less than) the maximum number of search loop > consultations of the skip table that can happen, so it seems like a > plausible number, and better than either length alone. That seems like a better idea. I had considered len1 * len2, giving that's the worst case for BMH. Of course the lengths would need to be raised quite a bit. I'll go with len1 - len2 I'll make the above changes and then run my benchmark again to see if the skip table size logic should be updated. I'll also benchmark and update the benchmark spreadsheet to see what's changed. David. -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [PgFoundry] Unsigned Data Types [1 of 2]
"Jaime Casanova" <[EMAIL PROTECTED]> writes: > On Sat, Sep 6, 2008 at 3:57 PM, Tom Lane <[EMAIL PROTECTED]> wrote: >>> postgres=# select -256::uint1; >>> ERROR: uint1 out of range >> >> No, that's just because this is parsed as -(256::uint1) > actually, i thought that case is right but the -255::uint1 returning a > negative number (aka -255) is what bothers me Well, again, that's -(255::uint1). I suppose uint1 hasn't got a negation operator (what would it do??), so probably the sequence of events is to form 255::uint1, then implicitly promote it to some signed type or other (most likely int4), then negate. Not much to be done about this unless you want to get rid of the implicit coercion to signed types, which would probably defeat most of the purpose. Now, if (-255)::uint1 fails to throw error, that would be a bug IMHO. Casting any negative value to uint ought to fail, no? regards, tom lane -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] hash index improving v3
On Sat, Sep 6, 2008 at 1:09 PM, Tom Lane <[EMAIL PROTECTED]> wrote: >For the convenience of anyone intending to test, here is an updated >patch against CVS HEAD that incorporates Alex's fix. Here are the results for a table containing 1 million entries that will generate hash collisions. It paints a bad picture for the patch but then again im not sure how relevant the issue is. For example yesterday I imported a table with 10 million collisions and the create index is still running (now at about ~18 hours). Maybe we should warn if there are lots of collisions when creating the index and suggest you use a btree? Anyway here are the results. ./pgbench -c1 -n -t10 -f bench_createindex.sql cvs head: tps = 0.002169 v5 : tps = 0.002196 pgbench -c1 -n -t1000 -f bench_bitmap.sql cvs head: tps = 24.011871 v5: tps = 2.543123 pgbench -c1 -n -t1000 -f bench_index.sql cvs head: tps = 51.614502 v5: tps = 3.205542 pgbench -c1 -n -t1000 -f bench_seqscan.sql cvs head: tps = 8.553318 v5: tps = 9.836091 Table created via: create table test_hash (num int8); ./hash | psql -c 'copy test_hash from stdin;' bench_create.sql Description: Binary data bench_index.sql Description: Binary data bench_seqscan.sql Description: Binary data int8collide.patch Description: Binary data #include #include #include int main(void) { unsigned long y = 0; unsigned cnt = 0; while(cnt < 100) { y += UINT_MAX; y += 1; printf("%ld\n", y); cnt++; } } -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] hash index improving v3
2008/9/6 Alex Hunsaker <[EMAIL PROTECTED]>: > pgbench -c1 -n -t1000 -f bench_bitmap.sql > cvs head: tps = 24.011871 > v5: tps = 2.543123 oops forgot to attach bench_bitmap.sql bench_bitmap.sql Description: Binary data -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [PgFoundry] Unsigned Data Types [1 of 2]
On Sat, Sep 6, 2008 at 7:10 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > > Now, if (-255)::uint1 fails to throw error, that would be a bug IMHO. > Casting any negative value to uint ought to fail, no? > then the patch is right but it seems to me like that is broking the law of less surprise i expected -2::uint1 to be equivalent to (-2)::uint1 that should be at least documented, no? -- regards, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. (593) 87171157 -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [PgFoundry] Unsigned Data Types [1 of 2]
"Jaime Casanova" <[EMAIL PROTECTED]> writes: > then the patch is right but it seems to me like that is broking the > law of less surprise i expected -2::uint1 to be equivalent to > (-2)::uint1 that should be at least documented, no? See the precedence table here: http://www.postgresql.org/docs/8.3/static/sql-syntax-lexical.html#SQL-PRECEDENCE :: binds more tightly than -, and always has. regards, tom lane -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches
Re: [PATCHES] [PgFoundry] Unsigned Data Types [1 of 2]
On Sat, Sep 6, 2008 at 3:41 PM, Jaime Casanova <[EMAIL PROTECTED]> wrote: > > i still have to make some more test... > why i need the cast in this case? even if the cast is really necesary (the message seems realy ugly) contrib_regression=# select * from t1 where f1 > 35; ERROR: unsupported type: 16486 contrib_regression=# select * from t1 where f1 > 35::uint4; f1 - 36 37 38 -- regards, Jaime Casanova Soporte y capacitación de PostgreSQL Asesoría y desarrollo de sistemas Guayaquil - Ecuador Cel. (593) 87171157 -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches