Re: [HACKERS] Regex code versus Unicode chars beyond codepoint 255

2012-02-17 Thread David Smith

on 2010-11-24 at 15:56, Tom Lane wrote:

 Bug #5766 points out that we're still not there yet in terms of having
 sane behavior for locale-specific regex operations in Unicode
 encoding. The reason it's not working is that regc_locale does this to
 expand the set of characters that are considered to match [[:alnum:]]
 : SNIP

and it would appear that nobody answered the email.

I am currently implementing a library system that needs to search by
whole word. I am using \m...\M regexes, and the DB is utf8, which
includes text in Hebrew, Greek, Arabic and various European character
sets. I need a solution to do whole word searches on the data, and this
either means fixing the value of alnum for utf8 to include all character
sets, or manually generating a list of all characters and reimplementing
a word-start/end in regex myself. I would prefer to avoid the latter if
at all possible!

What is the current status regarding a full character list for alnum for
utf8, and is there anything I can do to help get it working?

Thanks,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in intarray?

2012-02-17 Thread Guillaume Lelarge
On Thu, 2012-02-16 at 19:27 -0500, Tom Lane wrote:
 Guillaume Lelarge guilla...@lelarge.info writes:
  This query:
SELECT ARRAY[-1,3,1]  ARRAY[1, 2];
  should give {1} as a result.
 
  But, on HEAD (and according to his tests, on 9.0.6 and 9.1.2), it
  appears to give en empty array.
 
 Definitely a bug, and I'll bet it goes all the way back.
 
  Digging on this issue, another user (Julien Rouhaud) made an interesting
  comment on this line of code:
 
  if (i + j == 0 || (i + j  0  *(dr - 1) != db[j]))
 
  (line 159 of contrib/intarray/_int_tool.c, current HEAD)
 
  Apparently, the code tries to check the current value of the right side
  array with the previous value of the resulting array. Which clearly
  cannot work if there is no previous value in the resulting array.
 
  So I worked on a patch to fix this, as I think it is a bug (but I may be
  wrong). Patch is attached and fixes the issue AFAICT.
 
 Yeah, this code is bogus, but it's also pretty unreadable.  I think
 it's better to get rid of the inconsistently-used pointer arithmetic
 and the fundamentally wrong/irrelevant test on i+j, along the lines
 of the attached.
 

Completely agree.

Thank you.


-- 
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Notes about fixing regexes and UTF-8 (yet again)

2012-02-17 Thread Heikki Linnakangas

On 16.02.2012 01:06, Tom Lane wrote:

In bug #6457 it's pointed out that we *still* don't have full
functionality for locale-dependent regexp behavior with UTF8 encoding.
The reason is that there's old crufty code in regc_locale.c that only
considers character codes up to 255 when searching for characters that
should be considered letters, digits, etc.  We could fix that, for
some value of fix, by iterating up to perhaps 0x when dealing with
UTF8 encoding, but the time that would take is unappealing.  Especially
so considering that this code is executed afresh anytime we compile a
regex that requires locale knowledge.

I looked into the upstream Tcl code and observed that they deal with
this by having hard-wired tables of which Unicode code points are to be
considered letters etc.  The tables are directly traceable to the
Unicode standard (they provide a script to regenerate them from files
available from unicode.org).  Nonetheless, I do not find that approach
appealing, mainly because we'd be risking deviating from the libc locale
code's behavior within regexes when we follow it everywhere else.
It seems entirely likely to me that a particular locale setting might
consider only some of what Unicode says are letters to be letters.

However, we could possibly compromise by using Unicode-derived tables
as a guide to which code points are worth probing libc for.  That is,
assume that a utf8-based locale will never claim that some code is a
letter that unicode.org doesn't think is a letter.  That would cut the
number of required probes by a pretty large factor.

The other thing that seems worth doing is to install some caching.
We could presumably assume that the behavior of iswupper() et al are
fixed for the duration of a database session, so that we only need to
run the probe loop once when first asked to create a cvec for a
particular category.

Thoughts, better ideas?


Here's a wild idea: keep the class of each codepoint in a hash table. 
Initialize it with all codepoints up to 0x. After that, whenever a 
string contains a character that's not in the hash table yet, query the 
class of that character, and add it to the hash table. Then recompile 
the whole regex and restart the matching engine.


Recompiling is expensive, but if you cache the results for the session, 
it would probably be acceptable.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-02-17 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Wait, we already have ALTER TRIGGER bob ON ANY COMMAND SET DISABLED;

 Eh, so what happens then if someone sets a command trigger on ALTER TRIGGER?

We should remove support for command triggers on alter command triggers.
Well I could also go with the GUC idea, it's only that I'm not entirely
sold it's the best we can do yet and I'd like to avoid yet another GUC.

 Why would we do it that way (a single entry for multiple commands)?  The
 way it is now, it's only syntactic sugar, so I think it's easier to
 implement, document and use.

 Well, for one thing, it's consistent with how we handle it for regular
 triggers. For two things, if you create an object named bob, you

I don't think so, if you attach the same procedure to more than one
table each time with the same name, you get multiple entries in
pg_trigger:

pg_trigger_tgrelid_tgname_index UNIQUE, btree (tgrelid, tgname)

create trigger footg after insert on tg.foo for each row execute procedure 
tg.trigfunc();
create trigger footg after insert on tg.bar for each row execute procedure 
tg.trigfunc();
create trigger footg after insert on tg.baz for each row execute procedure 
tg.trigfunc();

select oid, tgrelid::regclass, tgname, tgfoid, tgtype, tgenabled from 
pg_trigger;
  oid   | tgrelid | tgname | tgfoid | tgtype | tgenabled 
+-++++---
 533210 | tg.foo  | footg  | 533209 |  5 | O
 533211 | tg.bar  | footg  | 533209 |  5 | O
 533212 | tg.baz  | footg  | 533209 |  5 | O
(3 rows)

The difference I see is that in the table trigger case you don't have a
syntax that allows you to do the 3 operations I did above in 1 command,
and it's easy to provide for this capability with command triggers (and
the use case is much bigger too, as all command triggers are given the
same arguments and all expected to return void).

 expect to end up with an object named bob - not 47 objects (or
 whatever) that are all named bob.  Also, suppose you create a trigger
 on ALL COMMANDS, and then a new version of PG adds a new command.

You create a trigger on ANY command :)

 When you dump and reload, do you expect to end up with a trigger on
 all commands that existed in the old version, or all the commands that
 exist in the new version?  Or conversely, suppose we get rid of a
 command in a future release.  How will we handle that?  I can't think
 of another example of where a CREATE command creates multiple objects
 like that.

ANY COMMAND triggers are just one entry in pg_cmdtrigger, with the
command name registered as ANY, which is only safe as long as we don't
provide a new SQL command whose command tag is ANY. We could decide that
we want to name this magic ANY command __ANY__, but it does not look
like it fits the project usual naming style.

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Scaling XLog insertion (was Re: [HACKERS] Moving more work outside WALInsertLock)

2012-02-17 Thread Heikki Linnakangas

On 16.02.2012 13:31, Fujii Masao wrote:

On Thu, Feb 16, 2012 at 6:15 PM, Fujii Masaomasao.fu...@gmail.com  wrote:

BTW, when I ran the test on my Ubuntu, I could not reproduce the problem.
I could reproduce the problem only in MacOS.


+   nextslot = Insert-nextslot;
+   if (NextSlotNo(nextslot) == lastslot)
+   {
+   /*
+* Oops, we've caught our tail and the oldest slot is still 
in use.
+* Have to wait for it to become vacant.
+*/
+   SpinLockRelease(Insert-insertpos_lck);
+   WaitForXLogInsertionSlotToBecomeFree();
+   goto retry;
+   }
+   myslot =XLogCtl-XLogInsertSlots[nextslot];
+   nextslot = NextSlotNo(nextslot);

nextslot can reach NumXLogInsertSlots, which would be a bug, I guess.
When I did the quick-fix and ran the test, I could not reproduce the problem
any more. I'm not sure if this is really the cause of the problem, though.


Ah, I see. That explains why you only see it on some platforms - 
depending on ALIGNOF_XLOG_BUFFER, there is often enough padding after 
the last valid slot to accommodate the extra bogus slot. Thanks for the 
debugging!


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread premanand
Hi,

Thanks for the reply. It is not possible to change the query in our
environment. So we need to do casting. I'm a new bee, so it will be more
helpful if you give me some simple examples.

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/MySQL-search-query-is-not-executing-in-Postgres-DB-tp5491531p5491947.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Displaying accumulated autovacuum cost

2012-02-17 Thread Fujii Masao
On Sat, Nov 26, 2011 at 10:10 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Nov 25, 2011 at 11:39 AM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 I'm going to push this now anyway, thanks.

 This patch adds a count of the number of buffers dirtied to VACUUM,
 but it strikes me that it would be useful to add similar tracking to
 pgBufferUsage.  Attached is a patch for that.  You can see the new
 counters through pg_stat_statements or with EXPLAIN (ANALYZE,
 BUFFERS).  This is useful because the number of buffers that a query
 *writes* doesn't necessarily have much to do with anything - it may
 end up writing buffers dirtied by other queries while being read-only
 itself, or conversely it may not write anything at all even though it
 dirties quite a bit.

 Thoughts?  Comments?  Objections?

Here are review comments:

The document about EXPLAIN needs to be updated.

You forgot to add the long-integer-valued property of shared/local_blks_dirtied.
So when I ran EXPLAIN and used json as a format, no information about
blks_dirtied
was reported.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server

2012-02-17 Thread Albe Laurenz
Shigeru Hanada wrote:
 - Since a rescan is done by rewinding the cursor, is it necessary
to have any other remote isolation level than READ COMMITED?
There is only one query issued per transaction.
 
 If multiple foreign tables on a foreign server is used in a local
query,
 multiple queries are executed in a remote transaction.  So IMO
isolation
 levels are useful even if remote query is executed only once.

Oh, I see. You are right.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Andrew Dunstan



On 02/17/2012 04:22 AM, premanand wrote:

Hi,

Thanks for the reply. It is not possible to change the query in our
environment. So we need to do casting. I'm a new bee, so it will be more
helpful if you give me some simple examples.



That's not what this list is for, You should be asking in pgsql-general, 
not here, (so don't just reply to this email). But in any case, you 
should take Heikki's advice seriously. What you are doing is broken. You 
should fix it and not just say Oh, we can't.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-02-17 Thread Dimitri Fontaine
Alvaro Herrera alvhe...@commandprompt.com writes:
 ... after looking at it more closely, I think only this line needs to be
 in a separate file:

 typedef struct CommandContextData *CommandContext;

Files like src/backend/commands/tablecmds.c and others need both the
structure and the pointer, so we need both. What about putting those
definitions into src/include/catalog/pg_cmdtrigger.h?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Triggers with DO functionality

2012-02-17 Thread Thom Brown
Hi,

This may have already been discussed before, but I can't find any
mention of it.  Would it be desirable to add support for triggers that
contain their own anonymous functions (i.e. DO)?

So instead of

CREATE TRIGGER...
EXECUTE PROCEDURE functioname();

you'd have:

CREATE TRIGGER...
DO $$
...
$$;

The purpose being to only have a single statement to set up the
trigger rather than setting up a separate trigger function which will
unlikely be re-used by other triggers... or is this of dubious
benefit?

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-02-17 Thread Dimitri Fontaine
Alvaro Herrera alvhe...@commandprompt.com writes:
 I didn't like the new cmdtrigger.h file.  It's included by a lot of
 other headers, and it's also itself including execnodes.h and

It turns around that this file does not need including execnode.h, I've
cleaned that up now (compile ok, make installcheck ok).

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-02-17 Thread Marko Kreen
On Fri, Feb 17, 2012 at 10:54 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 Wait, we already have ALTER TRIGGER bob ON ANY COMMAND SET DISABLED;

 Eh, so what happens then if someone sets a command trigger on ALTER TRIGGER?

 We should remove support for command triggers on alter command triggers.
 Well I could also go with the GUC idea, it's only that I'm not entirely
 sold it's the best we can do yet and I'd like to avoid yet another GUC.

Btw, we already have a GUC for triggers: session_replication_role,
how will the command triggers follow that?

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-02-17 Thread Dimitri Fontaine
Marko Kreen mark...@gmail.com writes:
 Btw, we already have a GUC for triggers: session_replication_role,
 how will the command triggers follow that?

Note that the replica here in my mind would have been an Hot Standby
node, and having the standby run the replica/always command triggers is
not implemented yet, because you can't run DDL on the standby.

Now that you mention it we should also provide support the GUC here and
only fire the triggers matching it. I'm working on that now.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-02-17 Thread Robert Haas
On Fri, Feb 17, 2012 at 3:54 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 Wait, we already have ALTER TRIGGER bob ON ANY COMMAND SET DISABLED;

 Eh, so what happens then if someone sets a command trigger on ALTER TRIGGER?

 We should remove support for command triggers on alter command triggers.
 Well I could also go with the GUC idea, it's only that I'm not entirely
 sold it's the best we can do yet and I'd like to avoid yet another GUC.

I'm OK with not supporting command triggers on command triggers, but I
still think the GUC is useful.  Keep in mind that flipping a GUC is
really cheap compared to a catalog change, and can affect just one
session.  Those are significant advantages.  However, if you want to
just not support triggers on statements that modify command triggers,
I'm OK with that, too.

 Why would we do it that way (a single entry for multiple commands)?  The
 way it is now, it's only syntactic sugar, so I think it's easier to
 implement, document and use.

 Well, for one thing, it's consistent with how we handle it for regular
 triggers. For two things, if you create an object named bob, you

 I don't think so, if you attach the same procedure to more than one
 table each time with the same name, you get multiple entries in
 pg_trigger:

    pg_trigger_tgrelid_tgname_index UNIQUE, btree (tgrelid, tgname)

 create trigger footg after insert on tg.foo for each row execute procedure 
 tg.trigfunc();
 create trigger footg after insert on tg.bar for each row execute procedure 
 tg.trigfunc();
 create trigger footg after insert on tg.baz for each row execute procedure 
 tg.trigfunc();

Sure, but if you run the same trigger on multiple operations - INSERT
OR UPDATE OR DELETE.

 expect to end up with an object named bob - not 47 objects (or
 whatever) that are all named bob.  Also, suppose you create a trigger
 on ALL COMMANDS, and then a new version of PG adds a new command.

 You create a trigger on ANY command :)

Oh.  Well, then +1 for me on the ANY COMMAND thing, but -1 on ALL
COMMANDS.  I can't see that there's enough utility to having a
bulk-create functionality to justify its existence.  The ANY COMMAND
thing I think is what people will want.

 ANY COMMAND triggers are just one entry in pg_cmdtrigger, with the
 command name registered as ANY, which is only safe as long as we don't
 provide a new SQL command whose command tag is ANY. We could decide that
 we want to name this magic ANY command __ANY__, but it does not look
 like it fits the project usual naming style.

I am thinking that we should ditch the idea of keeping track of
commands using strings and instead assign a bunch of integer constants
using a big enum.  The parser can translate from what the user enters
to these constants and then use those throughout, including in the
system catalogs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-02-17 Thread Marko Kreen
On Fri, Feb 17, 2012 at 4:04 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Marko Kreen mark...@gmail.com writes:
 Btw, we already have a GUC for triggers: session_replication_role,
 how will the command triggers follow that?

 Note that the replica here in my mind would have been an Hot Standby
 node, and having the standby run the replica/always command triggers is
 not implemented yet, because you can't run DDL on the standby.

But we will be able?  Thats news to me.

I'm more interested whether it follows ordinary trigger
behaviour on Slony/Londiste slave node.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Notes about fixing regexes and UTF-8 (yet again)

2012-02-17 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Here's a wild idea: keep the class of each codepoint in a hash table. 
 Initialize it with all codepoints up to 0x. After that, whenever a 
 string contains a character that's not in the hash table yet, query the 
 class of that character, and add it to the hash table. Then recompile 
 the whole regex and restart the matching engine.

 Recompiling is expensive, but if you cache the results for the session, 
 it would probably be acceptable.

Dunno ... recompiling is so expensive that I can't see this being a win;
not to mention that it would require fundamental surgery on the regex
code.

In the Tcl implementation, no codepoints above U+ have any locale
properties (alpha/digit/punct/etc), period.  Personally I'd not have a
problem imposing the same limitation, so that dealing with stuff above
that range isn't really a consideration anyway.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Notes about fixing regexes and UTF-8 (yet again)

2012-02-17 Thread Andrew Dunstan



On 02/17/2012 09:39 AM, Tom Lane wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:

Here's a wild idea: keep the class of each codepoint in a hash table.
Initialize it with all codepoints up to 0x. After that, whenever a
string contains a character that's not in the hash table yet, query the
class of that character, and add it to the hash table. Then recompile
the whole regex and restart the matching engine.
Recompiling is expensive, but if you cache the results for the session,
it would probably be acceptable.

Dunno ... recompiling is so expensive that I can't see this being a win;
not to mention that it would require fundamental surgery on the regex
code.

In the Tcl implementation, no codepoints above U+ have any locale
properties (alpha/digit/punct/etc), period.  Personally I'd not have a
problem imposing the same limitation, so that dealing with stuff above
that range isn't really a consideration anyway.



up to U+ is the BMP which is described as containing characters for 
almost all modern languages, and a large number of special characters. 
It seems very likely to be acceptable not to bother about the locale of 
code points in the supplementary planes.


See http://en.wikipedia.org/wiki/Plane_%28Unicode%29 for descriptions 
of which sets of characters are involved.



cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Notes about fixing regexes and UTF-8 (yet again)

2012-02-17 Thread Robert Haas
On Fri, Feb 17, 2012 at 3:48 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Here's a wild idea: keep the class of each codepoint in a hash table.
 Initialize it with all codepoints up to 0x. After that, whenever a
 string contains a character that's not in the hash table yet, query the
 class of that character, and add it to the hash table. Then recompile the
 whole regex and restart the matching engine.

 Recompiling is expensive, but if you cache the results for the session, it
 would probably be acceptable.

What if you did this ONCE and wrote the results to a file someplace?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Robert Haas
On Fri, Feb 17, 2012 at 2:45 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Of course, the best fix would be to change your queries. It's quite sloppy
 to rely on integer LIKE text without an explicit cast in the query.

Why?

I understand why it's a bad idea to rely on integer = text doing
anything sane - is that text equality or numeric equality?

And in theory the same issue could exist here if there were another
meaning for LIKE.  But there isn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Notes about fixing regexes and UTF-8 (yet again)

2012-02-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Feb 17, 2012 at 3:48 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Recompiling is expensive, but if you cache the results for the session, it
 would probably be acceptable.

 What if you did this ONCE and wrote the results to a file someplace?

That's still a cache, you've just defaulted on your obligation to think
about what conditions require the cache to be flushed.  (In the case at
hand, the trigger for a cache rebuild would probably need to be a glibc
package update, which we have no way of knowing about.)

Before going much further with this, we should probably do some timings
of 64K calls of iswupper and friends, just to see how bad a dumb
implementation will be.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I understand why it's a bad idea to rely on integer = text doing
 anything sane - is that text equality or numeric equality?

 And in theory the same issue could exist here if there were another
 meaning for LIKE.  But there isn't.

Really?  LIKE is just a different spelling for operator ~~, and there is
nothing stopping people from defining more operators named that, not to
mention that there are already four of them in core PG.  In particular
the bytea ~~ bytea version is explicitly intended to provide a LIKE
implementation for non-text types.  I see some operators named ~~ in
contrib as well.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Copyright notice for contrib/cube?

2012-02-17 Thread Jay Levitt
I'm basing an extension off contrib/cube. I'm going to open-source it under 
the existing PostgreSQL license, but I'm not sure how the copyright notice 
should look - there isn't one at the moment. (In fact, there's no LICENSE or 
COPYRIGHT file at all.)


Should it be something like

Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group

Portions Copyright (c) 2012, TipTap Inc.

?

Jay

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread David E. Wheeler
On Feb 17, 2012, at 5:22 AM, Thom Brown wrote:

 The purpose being to only have a single statement to set up the
 trigger rather than setting up a separate trigger function which will
 unlikely be re-used by other triggers... or is this of dubious
 benefit?

+1, though I imagine it would just give it a generated name and save it anyway, 
eh?

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Copyright notice for contrib/cube?

2012-02-17 Thread Marti Raudsepp
On Fri, Feb 17, 2012 at 17:42, Jay Levitt jay.lev...@gmail.com wrote:
 Should it be something like

 Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
 Portions Copyright (c) 2012, TipTap Inc.

Please don't add that, just change 2011 to 2012. This is what the wiki says:

Q: May I add my own copyright notice where appropriate?
A: No, please don't. We like to keep the legal information short and
crisp. Additionally, we've heard that could possibly pose problems for
corporate users.

Q: Doesn't the PostgreSQL license itself require to keep the copyright
notice intact?
A: Yes, it does. And it is, because the PostgreSQL Global Development
Group covers all copyright holders. Also note that US law doesn't
require any copyright notice for getting the copyright granted, just
like most European laws.

https://wiki.postgresql.org/wiki/Developer_FAQ

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Andrew Dunstan



On 02/17/2012 11:29 AM, David E. Wheeler wrote:

On Feb 17, 2012, at 5:22 AM, Thom Brown wrote:


The purpose being to only have a single statement to set up the
trigger rather than setting up a separate trigger function which will
unlikely be re-used by other triggers... or is this of dubious
benefit?

+1, though I imagine it would just give it a generated name and save it anyway, 
eh?





Before we rush into this, let's consider all the wrinkles. For example, 
what if you need to change the function? And how would you edit the 
function in psql? It might be a bit more involved that it seems at first 
glance, although my initial reaction was the same as David's.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Thom Brown
On 17 February 2012 16:29, David E. Wheeler da...@justatheory.com wrote:
 On Feb 17, 2012, at 5:22 AM, Thom Brown wrote:

 The purpose being to only have a single statement to set up the
 trigger rather than setting up a separate trigger function which will
 unlikely be re-used by other triggers... or is this of dubious
 benefit?

 +1, though I imagine it would just give it a generated name and save it 
 anyway, eh?

I had thought about that, yes, but I didn't want to get bogged down in
implementation.

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Thom Brown
On 17 February 2012 16:43, Andrew Dunstan and...@dunslane.net wrote:


 On 02/17/2012 11:29 AM, David E. Wheeler wrote:

 On Feb 17, 2012, at 5:22 AM, Thom Brown wrote:

 The purpose being to only have a single statement to set up the
 trigger rather than setting up a separate trigger function which will
 unlikely be re-used by other triggers... or is this of dubious
 benefit?

 +1, though I imagine it would just give it a generated name and save it
 anyway, eh?




 Before we rush into this, let's consider all the wrinkles. For example, what
 if you need to change the function? And how would you edit the function in
 psql? It might be a bit more involved that it seems at first glance,
 although my initial reaction was the same as David's.

Why not just...

CREATE OR REPLACE TRIGGER my_trigger...

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Copyright notice for contrib/cube?

2012-02-17 Thread Jay Levitt

Marti Raudsepp wrote:

On Fri, Feb 17, 2012 at 17:42, Jay Levittjay.lev...@gmail.com  wrote:

Should it be something like

Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
Portions Copyright (c) 2012, TipTap Inc.


Please don't add that, just change 2011 to 2012. This is what the wiki says:

Q: May I add my own copyright notice where appropriate?


To clarify, this is for an extension to be distributed separately on PGXN 
and GitHub, not for a contribution to the PostgreSQL distribution. It will 
differ greatly from contrib/cube when it's done, but cube is the scaffolding 
I'm starting with.


That said:


Q: Doesn't the PostgreSQL license itself require to keep the copyright
notice intact?
A: Yes, it does. And it is, because the PostgreSQL Global Development
Group covers all copyright holders.


Is that true for separately-distributed extensions as well - if I push this 
to GitHub, my company is part of the PGDG? Where is the PGDG defined?


If not (and perhaps even if so), I think I could still add an additional 
copyright notice without violating the license, since the copyright notice 
and following two paragraphs still appear in all copies. But perhaps it's 
not necessary.


I think the edge case is something stupid like In five years, there is no 
remaining contrib code, and we get bought by MonsantoOracleHalliburton, and 
they want to close-source the code in a way that's somehow incompatible with 
the PostgreSQL license.. can they?


But that does raise two other points:

- cube seems to post-date any work at UC. Should I still include the 
Portions Copyright (c) 1994, The Regents of the University of California?


- Technically, the license could be read such that the above copyright 
notice (singular) refers to the UC copyright notice but not the PGDG 
notice; next time the lawyers run through it, you might want to add an s 
to notices..


Jay

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Robert Haas
On Fri, Feb 17, 2012 at 10:27 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I understand why it's a bad idea to rely on integer = text doing
 anything sane - is that text equality or numeric equality?

 And in theory the same issue could exist here if there were another
 meaning for LIKE.  But there isn't.

 Really?  LIKE is just a different spelling for operator ~~, and there is
 nothing stopping people from defining more operators named that, not to
 mention that there are already four of them in core PG.  In particular
 the bytea ~~ bytea version is explicitly intended to provide a LIKE
 implementation for non-text types.  I see some operators named ~~ in
 contrib as well.

As far as I know, the fact that LIKE is another spelling for ~~ is a
PostgreSQL implementation detail with which users ought not to be
burdened.  But even given that, there are many situations where we
currently complain about ambiguity even though the ambiguity is
entirely hypothetical: there COULD be a ~~(int,text) operator, but
there actually isn't.  Now, I'll admit that this is not an easy
problem to solve without giving up something somewhere, since it's
clearly undesirable for the meaning of something that worked before to
silently change when and if someone defines a new operator.

But on the other hand, I think that labeling the user's coding
practices as sloppy is a cop-out.  There are many, many people running
on not only MySQL but also on Oracle who have written large amounts of
code that requires fewer type casts on those systems than it does on
PostgreSQL.  That fact presents serious migration challenges for such
users when they move over to PostgreSQL.  Labeling the code as the
problem excuses us from the need to think about how to make our type
system work any better than it does today.  Boo, hiss.  If we're not
doing anything about this because we have carefully examined the
subject and decided that this is a trade-off we must make because
MySQL or Oracle doesn't support XYZ and we do, then let's give that
explanation to the user instead of telling them the problem is that
their code stinks.  Otherwise, we have some soul-searching to do, as
time permits.

I remember there was a time when you couldn't say SELECT a x FROM
foo in PostgreSQL.  We told people that it was because our syntax was
more flexible - we have postfix operators, or something.  I no longer
remember the details of where the grammar conflict came from.  But
somebody (probably you or Hiroshi Saito, judging by the commit log)
figured out a way to get around the problem, and now that syntax works
fine in 99% of the cases people care about.  That is a huge usability
improvement for people coming from other database systems where AS was
never required.   I don't know whether a similar improvement is
possible in this area, but we're certainly not going to get there by
labeling the user's expectations as unreasonable.  I don't think they
are, and the people who wrote MySQL and Oracle evidently agree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Andrew Dunstan



On 02/17/2012 11:46 AM, Thom Brown wrote:

On 17 February 2012 16:43, Andrew Dunstanand...@dunslane.net  wrote:


On 02/17/2012 11:29 AM, David E. Wheeler wrote:

On Feb 17, 2012, at 5:22 AM, Thom Brown wrote:


The purpose being to only have a single statement to set up the
trigger rather than setting up a separate trigger function which will
unlikely be re-used by other triggers... or is this of dubious
benefit?

+1, though I imagine it would just give it a generated name and save it
anyway, eh?




Before we rush into this, let's consider all the wrinkles. For example, what
if you need to change the function? And how would you edit the function in
psql? It might be a bit more involved that it seems at first glance,
although my initial reaction was the same as David's.

Why not just...

CREATE OR REPLACE TRIGGER my_trigger...




Maybe that would do it. You might also want a \e command for psql to 
match it.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-02-17 Thread Dimitri Fontaine
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 I think it would be better to bail on trying to use CREATE TRIGGER and
 DROP TRIGGER as a basis for this functionality, and instead create
 completely new toplevel statements CREATE COMMAND TRIGGER and DROP
 COMMAND TRIGGER.  Then, you could decide that all command triggers
 live in the same namespace, and therefore to get rid of the command
 trigger called bob you can just say DROP COMMAND TRIGGER bob,
 without having to specify the type of command it applies to.  It's

 I guess that's a point to change the grammar the way you're hinting:

   CREATE COMMAND TRIGGER
   DROP COMMAND TRIGGER
   ALTER COMMAND TRIGGER

 That also needs each their own reference page.  It will be easier on the
 users I guess.  Will work on that.

FWIW I've pushed such a change to my github repository, I'm not spamming
the list with v10 already though, unless someone wants to see it.

  
https://github.com/dimitri/postgres/commit/82996b45aae10f12818f1e3097ba805fff22a97b

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-02-17 Thread Robert Haas
On Fri, Feb 17, 2012 at 10:42 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Done.  Of course at the time the command trigger is created you can't
 distinguish if the CREATE INDEX command will be run CONCURRENTLY or not,
 so I've decided to issue a WARNING about it.

That seems icky.  Whatever warnings need to be given should be in the
documentation, not at runtime.

Another idea here would be to treat CREATE INDEX CONCURRENTLY as if it
were a separate toplevel command, for command-trigger purposes only.
But I'm not sure that's any better.

 You would need to set a command trigger on ALTER COMMAND TRIGGER and
 that's not supported. Triggers on command ALTER TRIGGER in fact will
 not get fired on ALTER TRIGGER ... ON COMMAND ...

 I guess that's a point to change the grammar the way you're hinting:

Indeed it is.  :-)

  CREATE COMMAND TRIGGER
  DROP COMMAND TRIGGER
  ALTER COMMAND TRIGGER

 That also needs each their own reference page.  It will be easier on the
 users I guess.  Will work on that.

Yeah, I think that will be much more clear, and not really that much
work for you.  It will also make the reference pages simpler, I think,
since there are significant behavioral differences between ordinary
triggers and command triggers.

 Both done, if you agree with using session_replication_role here.

It's better than a sharp stick in the eye.  I'm not convinced it's
ideal, but I don't feel strongly enough about the issue to push on it
for now, as long as we disallow command triggers on CREATE/ALTER/DROP
COMMAND TRIGGER.

 Sure, but if you run the same trigger on multiple operations - INSERT
 OR UPDATE OR DELETE.

 I failed to see that analogy. The other problem with the current way of
 doing things is that I can't integrate with RemoveObjects(), and I think
 you won't like that :)

I sure won't.  I think ultimately you won't like it either, since the
objectaddress infrastructure is also needed to make this work with
extensions.  And I assume you would agree with me that extensions are
an important feature.  :-)

 You create a trigger on ANY command :)

 Oh.  Well, then +1 for me on the ANY COMMAND thing, but -1 on ALL
 COMMANDS.  I can't see that there's enough utility to having a
 bulk-create functionality to justify its existence.  The ANY COMMAND
 thing I think is what people will want.

 There's no such thing as ALL COMMANDS in the patch, there's a syntactic
 sugar allowing you to create and drop more than one command trigger in a
 single command, much as we have DROP TABLE foo, bar, baz;

OK, I'll look more carefully.

 I am thinking that we should ditch the idea of keeping track of
 commands using strings and instead assign a bunch of integer constants
 using a big enum.  The parser can translate from what the user enters
 to these constants and then use those throughout, including in the
 system catalogs.

 It's not really command strings but the Command Tag we've historically
 been using up until now.  You're saying that it should remain the same
 for users but change internally.  No strong opinion from me here, apart
 from it being more code for doing the same thing.

Well, the reason I thought it might be better is for caching purposes.
 If you have a cache of which triggers need to be run for which
commands, an integer index into an array will be a lot faster than a
hash table lookup.  But it may bear more examination, so I don't feel
this is a must-do at this point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Copyright notice for contrib/cube?

2012-02-17 Thread Tom Lane
Jay Levitt jay.lev...@gmail.com writes:
 Marti Raudsepp wrote:
 Please don't add that, just change 2011 to 2012. This is what the wiki says:
 Q: May I add my own copyright notice where appropriate?

 To clarify, this is for an extension to be distributed separately on PGXN 
 and GitHub, not for a contribution to the PostgreSQL distribution. It will 
 differ greatly from contrib/cube when it's done, but cube is the scaffolding 
 I'm starting with.

If you don't have any ambitions of eventually getting the code into
Postgres contrib or core, then you can pretty much add any copyright
notice or terms you like.  The wiki text Marti is quoting is only meant
to cover code that is submitted for inclusion in Postgres.

 - cube seems to post-date any work at UC. Should I still include the 
 Portions Copyright (c) 1994, The Regents of the University of California?

We typically do that even in new files, on the grounds that there is
almost always a certain amount of copying-and-pasting involved in a new
file, and so some part of it could be traced to UCB if you tried hard
enough.  In any event, if we were to get really anal about it, we'd be
trying to have the exact same copyright notice in every single file,
and that one-size-fits-all version would definitely have to credit UCB.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Notes about fixing regexes and UTF-8 (yet again)

2012-02-17 Thread Robert Haas
On Fri, Feb 17, 2012 at 10:19 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 What if you did this ONCE and wrote the results to a file someplace?

 That's still a cache, you've just defaulted on your obligation to think
 about what conditions require the cache to be flushed.

Yep.  Unfortunately, I don't have a good idea how to handle that; I
was hoping someone else did.

 Before going much further with this, we should probably do some timings
 of 64K calls of iswupper and friends, just to see how bad a dumb
 implementation will be.

Can't hurt.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I remember there was a time when you couldn't say SELECT a x FROM
 foo in PostgreSQL.  We told people that it was because our syntax was
 more flexible - we have postfix operators, or something.

Which it was, and yes that was the reason.  We eventually thought of a
kluge solution that lets you omit AS 90% of the time, which is better
than nothing; but I doubt it would ever have been accepted if it weren't
a matter of improving standards compliance.  I am pretty sure that the
SQL spec doesn't say that you should be able to apply LIKE directly to
an integer, so that issue isn't comparable to this one.

 I don't know whether a similar improvement is
 possible in this area, but we're certainly not going to get there by
 labeling the user's expectations as unreasonable.  I don't think they
 are, and the people who wrote MySQL and Oracle evidently agree.

The people who wrote MySQL had very poor taste in a lot of areas, and
we are not going to blindly follow their lead.  Oracle is not a terribly
presentable system either.  Having said that, I don't object to any
clean improvements we can think of in this area --- but make it work
more like MySQL had better not be the only argument for it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Displaying accumulated autovacuum cost

2012-02-17 Thread Robert Haas
On Fri, Feb 17, 2012 at 5:04 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Here are review comments:

 The document about EXPLAIN needs to be updated.

 You forgot to add the long-integer-valued property of 
 shared/local_blks_dirtied.
 So when I ran EXPLAIN and used json as a format, no information about
 blks_dirtied
 was reported.

Thanks for the review.  Updated patch attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


dirty_buffers_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Peter Eisentraut
On fre, 2012-02-17 at 13:22 +, Thom Brown wrote:
 So instead of
 
 CREATE TRIGGER...
 EXECUTE PROCEDURE functioname();
 
 you'd have:
 
 CREATE TRIGGER...
 DO $$
 ...
 $$; 

I had wished for this many times and was about to propose something
similar.

We might wish to review the SQL standard and other implementations to
make porting triggers a bit easier too.

Also, whatever ALTER functionality functions have would have to be made
available here as well.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 As far as I know, the fact that LIKE is another spelling for ~~ is
a
 PostgreSQL implementation detail with which users ought not to be
 burdened.
 
+1
 
LIKE is well defined by the standard, and the ~~ operator is not
mentioned there anywhere.
 
On the other hand, LIKE is defined to either work on character
strings or binary strings -- there is nothing in the standard about
using it with other data types or automatic casting to support that.
Any such support would be a non-standard PostgreSQL extension. As
such, anyone wanting to write portable code should avoid that by
explicit casting (which should be portable).
 
 I remember there was a time when you couldn't say SELECT a x FROM
 foo in PostgreSQL.
 
That was in violation of the SQL standard, which makes AS an allowed
but optional noise word.
 
In spite of all that, perhaps we should have a compatibility
extension which provides more casts, in an attempt to ease the
transition from other databases?  Personally, I like having the
default behavior this strict -- I think it reduces the chance of
errors, reduces the chances of accidentally having type mismatches
which defeat optimizations, and improves portability.  But I have
nothing against allowing someone to give all that up to ease
transition from another product.  My biggest concern is whether we
might paint ourselves into a corner by including such an
extension.  It might shut off avenues for other cool features
because anyone using the extension would have conflicts.  Perhaps
such a thing would be more appropriate on PGXN with admonitions that
it was only intended to ease conversion and that users were
encouraged to migrate to standard syntax as soon as possible.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Pavel Stehule
2012/2/17 Kevin Grittner kevin.gritt...@wicourts.gov:
 Robert Haas robertmh...@gmail.com wrote:

 As far as I know, the fact that LIKE is another spelling for ~~ is
 a
 PostgreSQL implementation detail with which users ought not to be
 burdened.

 +1

 LIKE is well defined by the standard, and the ~~ operator is not
 mentioned there anywhere.

 On the other hand, LIKE is defined to either work on character
 strings or binary strings -- there is nothing in the standard about
 using it with other data types or automatic casting to support that.
 Any such support would be a non-standard PostgreSQL extension. As
 such, anyone wanting to write portable code should avoid that by
 explicit casting (which should be portable).

 I remember there was a time when you couldn't say SELECT a x FROM
 foo in PostgreSQL.

 That was in violation of the SQL standard, which makes AS an allowed
 but optional noise word.

 In spite of all that, perhaps we should have a compatibility
 extension which provides more casts, in an attempt to ease the
 transition from other databases?  Personally, I like having the
 default behavior this strict -- I think it reduces the chance of
 errors, reduces the chances of accidentally having type mismatches
 which defeat optimizations, and improves portability.  But I have
 nothing against allowing someone to give all that up to ease
 transition from another product.  My biggest concern is whether we
 might paint ourselves into a corner by including such an
 extension.  It might shut off avenues for other cool features
 because anyone using the extension would have conflicts.  Perhaps
 such a thing would be more appropriate on PGXN with admonitions that
 it was only intended to ease conversion and that users were
 encouraged to migrate to standard syntax as soon as possible.


+1

Pavel

 -Kevin

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-02-17 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 On Fri, Feb 17, 2012 at 10:42 AM, Dimitri Fontaine
 dimi...@2ndquadrant.fr wrote:
 Done.  Of course at the time the command trigger is created you can't
 distinguish if the CREATE INDEX command will be run CONCURRENTLY or not,
 so I've decided to issue a WARNING about it.

 That seems icky.  Whatever warnings need to be given should be in the
 documentation, not at runtime.

Agreed.  We're still making user visible changes though, so I wanted to
defer docs editing some more.  Even documented, a WARNING seems a good
idea to me, but maybe you would prefer a NOTICE?

 Another idea here would be to treat CREATE INDEX CONCURRENTLY as if it
 were a separate toplevel command, for command-trigger purposes only.
 But I'm not sure that's any better.

The patch as it stands will fire the AFTER command trigger only when not
using the CONCURRENTLY variant, which I think is enough, once documented.

 Yeah, I think that will be much more clear, and not really that much
 work for you.  It will also make the reference pages simpler, I think,
 since there are significant behavioral differences between ordinary
 triggers and command triggers.

Yeah done this way, still needed an overview section in triggers.sgml I
think.

 Both done, if you agree with using session_replication_role here.

 It's better than a sharp stick in the eye.  I'm not convinced it's
 ideal, but I don't feel strongly enough about the issue to push on it
 for now, as long as we disallow command triggers on CREATE/ALTER/DROP
 COMMAND TRIGGER.

We simply don't support those commands as far as command triggers are
concerned, which seems to be like a sane limitation.

 I sure won't.  I think ultimately you won't like it either, since the
 objectaddress infrastructure is also needed to make this work with
 extensions.  And I assume you would agree with me that extensions are
 an important feature.  :-)

How you'd guess about that :)

Will see about it later tonight, I'd like to keep the multiple command
drop command trigger spelling.

 It's not really command strings but the Command Tag we've historically
 been using up until now.  You're saying that it should remain the same
 for users but change internally.  No strong opinion from me here, apart
 from it being more code for doing the same thing.

 Well, the reason I thought it might be better is for caching purposes.
  If you have a cache of which triggers need to be run for which
 commands, an integer index into an array will be a lot faster than a
 hash table lookup.  But it may bear more examination, so I don't feel
 this is a must-do at this point.

I've been trying to get a feeling of the runtime performance with
command triggers in the line you suggested, even if I'd be very
surprised that a couple of index scans are anything but noise when
completing a DDL command.

I'm having those results on my development machine:

duration: 30 s
number of transactions actually processed: 42390
tps = 1413.004051 (including connections establishing)
tps = 1413.505517 (excluding connections establishing)
statement latencies in milliseconds:
0.705843create or replace function plus1(int) returns bigint 
language sql as $$ select $1::bigint + 1; $$;

I don't have the setup to compare that easily to current master's
branch, I was hoping you would run tests on your side (btw the previous
patch version is rebased against master and cleaned up, should be fine
now — oh and in context format).

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Thom Brown
On 17 February 2012 17:26, Peter Eisentraut pete...@gmx.net wrote:
 On fre, 2012-02-17 at 13:22 +, Thom Brown wrote:
 So instead of

 CREATE TRIGGER...
 EXECUTE PROCEDURE functioname();

 you'd have:

 CREATE TRIGGER...
 DO $$
 ...
 $$;

 I had wished for this many times and was about to propose something
 similar.

 We might wish to review the SQL standard and other implementations to
 make porting triggers a bit easier too.

I had looked at how a couple other RDBMS's do it, and there are:

CREATE TRIGGER...
BEGIN
END;

and

CREATE TRIGGER...
AS
BEGIN
END;

And thinking about it, DO is a bit nonsense here, so maybe we'd just
have something like:

CREATE TRIGGER...
AS $$
BEGIN
END;
$$;

i.e. the same as a function.

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Robert Haas
On Fri, Feb 17, 2012 at 12:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I don't know whether a similar improvement is
 possible in this area, but we're certainly not going to get there by
 labeling the user's expectations as unreasonable.  I don't think they
 are, and the people who wrote MySQL and Oracle evidently agree.

 The people who wrote MySQL had very poor taste in a lot of areas, and
 we are not going to blindly follow their lead.  Oracle is not a terribly
 presentable system either.  Having said that, I don't object to any
 clean improvements we can think of in this area --- but make it work
 more like MySQL had better not be the only argument for it.

Hey, if I preferred MySQL to PostgreSQL, I wouldn't be here.  That
doesn't mean that there are exactly 0 things that they do better than
we do.  What I'm unhappy about isn't that we're not bug-compatible
with MySQL, but rather that, in this case, I like MySQL's behavior
better, and the fact that they've made it work means it's not
theoretically impossible.  It just involves some trade-off that I
don't believe we've thought about hard enough.

Standards compliance is a means to an end.  The purpose of having
standards is to allow for interoperable implementations of the same
underlying functionality.  That doesn't mean we should copy
non-standard warts, of course, but it isn't obvious to me that this is
a wart.  No one has suggested that the user's actual query has more
than one reasonable interpretation, so complaining that it's ambiguous
doesn't impress me very much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Andrew Dunstan



On 02/17/2012 12:59 PM, Robert Haas wrote:

On Fri, Feb 17, 2012 at 12:14 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

I don't know whether a similar improvement is
possible in this area, but we're certainly not going to get there by
labeling the user's expectations as unreasonable.  I don't think they
are, and the people who wrote MySQL and Oracle evidently agree.

The people who wrote MySQL had very poor taste in a lot of areas, and
we are not going to blindly follow their lead.  Oracle is not a terribly
presentable system either.  Having said that, I don't object to any
clean improvements we can think of in this area --- but make it work
more like MySQL had better not be the only argument for it.

Hey, if I preferred MySQL to PostgreSQL, I wouldn't be here.  That
doesn't mean that there are exactly 0 things that they do better than
we do.  What I'm unhappy about isn't that we're not bug-compatible
with MySQL, but rather that, in this case, I like MySQL's behavior
better, and the fact that they've made it work means it's not
theoretically impossible.  It just involves some trade-off that I
don't believe we've thought about hard enough.

Standards compliance is a means to an end.  The purpose of having
standards is to allow for interoperable implementations of the same
underlying functionality.  That doesn't mean we should copy
non-standard warts, of course, but it isn't obvious to me that this is
a wart.  No one has suggested that the user's actual query has more
than one reasonable interpretation, so complaining that it's ambiguous
doesn't impress me very much.



Assuming we had the cast, What would intval like '1%' mean? You're 
going to match 1, 10..19, 100..199, 1000..1999 ...


Now maybe there's a good use for such a test, but I'm have a VERY hard 
time imagining what it might be.



cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Jeff MacDonald
Greetings,

  My biggest concern is whether we
  might paint ourselves into a corner by including such an
  extension.  It might shut off avenues for other cool features
  because anyone using the extension would have conflicts.  Perhaps
  such a thing would be more appropriate on PGXN with admonitions that
  it was only intended to ease conversion and that users were
  encouraged to migrate to standard syntax as soon as possible.
 

IMHO if you give someone syntax surgar like this and tell them to fix it ASAP 
it will never get fixed properly. I'm all for getting new users to pgsql, but 
this is not the way to do it.

Regards,
J

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Designing an extension for feature-space similarity search

2012-02-17 Thread Jay Levitt

Tom Lane wrote:

Jay Levittjay.lev...@gmail.com  writes:

- Does KNN-GiST run into problems when-  returns values that don't make
sense in the physical world?


If the indexed entities are records, it would be
entirely your own business how you handled individual fields being NULL.


This turns out to be a bit challenging. Let's say I'm building a 
nullable_point type that allows the Y axis to be NULL (or any sentinel value 
for missing data), where the semantics are NULL is infinitely far from 
the query.   I'll need my GiST functions to return useful results with NULL 
- not just correct results, but results that help partition the tree nicely.


At first I thought this posed a challenge for union; if I have these points:

(1,2)
(2,1)
(1,NULL)

what's the union? I think the answer is to treat NULL box coordinates like 
LL = -infinity, UR = infinity, or (equivalently, I think) to store a 
saw_nulls bit in addition to LL and UR.


The real challenge is probably in picksplit and penalty - where in the tree 
should I stick (1,NULL)? - at which point you say Yes, algorithms for 
efficient indexes are hard work and computer-science-y and point me at 
surrogate splitters.


Just thinking out loud, I guess; if other GiST types have addressed this 
problem, I'd love to hear about it.


Jay

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Simulating Clog Contention

2012-02-17 Thread Robert Haas
On Mon, Jan 30, 2012 at 12:26 PM, Robert Haas robertmh...@gmail.com wrote:
 I was thinking the opposite.  That -i should only print progress
 indication when -d is given.  Or at least knock an order of magnitude
 or two off of how often it does so.

 I'd be in all in favor of having -i emit progress reports 10x less
 often; even on a laptop, the current reports are very chatty.  But I
 think 100x less often might be taking it too far.

Trivial patch for that attached.

 Either way, if we're going to have an option for inserts, they should
 produce the same progress reports that COPY does - though possibly
 more often, since I'm guessing it's likely to be way slower.

I looked at this a little more and I'm coming around to the view that
Heikki expressed originally: I think this too much of a one-tricky
pony to justify including it in pgbench.  It's an interesting hack for
testing, but the thing is that, to really make it do anything
interesting, you've got to not only use INSERT instead of COPY and
create the indexes before loading the table, BUT ALSO frob the
existing code to prevent the WAL bypass from doing its thing.  I
suppose we could have a separate option for all THREE of those
behaviors, rather than just the two I mentioned in my previous email,
but that seems over the top.  So if we're going to have this at all,
we might as well just call it --artificially-inflate-clog-pain and be
done with it.

But I think that's kind of a narrow special case that isn't really
worth catering for.  Our CLOG contention right now is not so bad that
we need to focus a major development effort on making it less, and
even if we do want to do that that there's no real evidence that a
half-hour pgbench run isn't sufficient to demonstrate the problem
perfectly adequately with the code we have right now.  After a few
minutes the frequency of hitting previously-updated rows is high
enough to measure the problem anyway.  In the process of developing
the various performance improvements we've committed for 9.2, I and
others have developed various test cases - Heikki has one he uses for
his xlog scaling patch, for example.  We can't commit all of those as
pgbench options, or we'll go nuts.  Ideally it would be nice if
pgbench was flexible enough to handle these kinds of uses cases via
configuration rather than by hard-coding them.

Given all the above, I'm inclined to conclude that this is just
another special-purpose test case which we should use for testing and
that's it, as Simon already proposed upthread.  However, I do think
that we should go ahead and make -n work in initialization mode,
because I've wanted that a few times.  So, patch for that attached,
too.

(Note that no actual benchmarking is happening right now with regard
to the CLOG history patch because, as previously noted, the most
recent version does not compile.  This probably doesn't matter for
that patch hugely anyway, since that mechanism as currently designed
does not kick in until a million transactions have been processed, and
by that time you'll have quite a spread of XIDs in pgbench_acounts
anyway.  I don't believe there are any other remaining patches in this
CommitFest to which the test case would be applicable; please let me
know if I am wrong.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgbench-n.patch
Description: Binary data


pgbench-quiet.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Designing an extension for feature-space similarity search

2012-02-17 Thread Alexander Korotkov
On Fri, Feb 17, 2012 at 11:00 PM, Jay Levitt jay.lev...@gmail.com wrote:

 Tom Lane wrote:

 Jay Levittjay.lev...@gmail.com  writes:

 - Does KNN-GiST run into problems when-  returns values that don't
 make

 sense in the physical world?


 If the indexed entities are records, it would be
 entirely your own business how you handled individual fields being NULL.


 This turns out to be a bit challenging. Let's say I'm building a
 nullable_point type that allows the Y axis to be NULL (or any sentinel
 value for missing data), where the semantics are NULL is infinitely far
 from the query.   I'll need my GiST functions to return useful results
 with NULL - not just correct results, but results that help partition the
 tree nicely.

 At first I thought this posed a challenge for union; if I have these
 points:

 (1,2)
 (2,1)
 (1,NULL)

 what's the union? I think the answer is to treat NULL box coordinates like
 LL = -infinity, UR = infinity, or (equivalently, I think) to store a
 saw_nulls bit in addition to LL and UR.

 The real challenge is probably in picksplit and penalty - where in the
 tree should I stick (1,NULL)? - at which point you say Yes, algorithms for
 efficient indexes are hard work and computer-science-y and point me at
 surrogate splitters.

 Just thinking out loud, I guess; if other GiST types have addressed this
 problem, I'd love to hear about it.


Similar problem appears at GiST indexing of ranges, because range can be
empty. There additional contain empty flag was introduced. This contain
empty flag indicates that underlying value can be empty. So, this flag is
set when union with empty range or other range with this flag set. It's
likely you need similar flag for each dimension.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Christopher Browne
On Fri, Feb 17, 2012 at 1:21 PM, Andrew Dunstan and...@dunslane.net wrote:
 Assuming we had the cast, What would intval like '1%' mean? You're going
 to match 1, 10..19, 100..199, 1000..1999 ...

 Now maybe there's a good use for such a test, but I'm have a VERY hard time
 imagining what it might be.

Well, I can readily see someone encoding parts of their application
into this sort of encoding, so that, for instance, all customer
numbers beginning with 1 are deemed to be internal accounts.

It's a pretty terrible approach to encoding data for an application;
it leads to stuff like the oops, once they have generated 30,000
invoices, the system reaches doomsday and can't work anymore.
   http://thedailywtf.com/Articles/Ive-Got-Your-Number.aspx

But nothing prevents users from designing their applications to encode
information in their ID prefixes.

And I have *zero* confidence that for PostgreSQL to rule out LIKE
'1%' is preventing those designs from getting built...

When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Designing an extension for feature-space similarity search

2012-02-17 Thread Jay Levitt

Alexander Korotkov wrote:

On Fri, Feb 17, 2012 at 11:00 PM, Jay Levitt jay.lev...@gmail.com
mailto:jay.lev...@gmail.com wrote:

At first I thought this posed a challenge for union; if I have these 
points:


(1,2)
(2,1)
(1,NULL)

what's the union? I think the answer is to treat NULL box coordinates
like LL = -infinity, UR = infinity, or (equivalently, I think) to store
a saw_nulls bit in addition to LL and UR.

Similar problem appears at GiST indexing of ranges, because range can be
empty. There additional contain empty flag was introduced. This contain
empty flag indicates that underlying value can be empty. So, this flag is
set when union with empty range or other range with this flag set. It's
likely you need similar flag for each dimension.


Ah, yes, exactly the same problem. So what led you to add a flag instead of 
using the range NULL..NULL? I'm on the fence about choosing.


Jay

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Don Baccus

On Feb 17, 2012, at 11:28 AM, Christopher Browne wrote:
 And I have *zero* confidence that for PostgreSQL to rule out LIKE
 '1%' is preventing those designs from getting built...
 
 When confronted by a difficult problem, solve it by reducing it to the
 question, How would the Lone Ranger handle this?

He would turn to Tonto, who undoubtably would advise:

If you want to treat an integer like a string, figure out how to convert your 
integer into a string …

The original query strikes me as being similar to expecting the Lone Ranger's 
six-shooter to be capable of shooting Tonto's arrows.


Don Baccus
http://donb.photo.net
http://birdnotes.net
http://openacs.org







-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Designing an extension for feature-space similarity search

2012-02-17 Thread Alexander Korotkov
On Fri, Feb 17, 2012 at 11:32 PM, Jay Levitt jay.lev...@gmail.com wrote:

 Alexander Korotkov wrote:

 On Fri, Feb 17, 2012 at 11:00 PM, Jay Levitt jay.lev...@gmail.com
 mailto:jay.lev...@gmail.com wrote:

 At first I thought this posed a challenge for union; if I have these
 points:


(1,2)
(2,1)
(1,NULL)

what's the union? I think the answer is to treat NULL box coordinates
like LL = -infinity, UR = infinity, or (equivalently, I think) to store
a saw_nulls bit in addition to LL and UR.

 Similar problem appears at GiST indexing of ranges, because range can be
 empty. There additional contain empty flag was introduced. This contain
 empty flag indicates that underlying value can be empty. So, this flag is
 set when union with empty range or other range with this flag set. It's
 likely you need similar flag for each dimension.


 Ah, yes, exactly the same problem. So what led you to add a flag instead
 of using the range NULL..NULL? I'm on the fence about choosing.


At first, range bounds can't be NULL :) At second, if we have range
(a;b)+contain empty in internal page, both facts:
1) All normal underlying ranges are contained in (a;b).
2) There can be empty underlying ranges.
are useful for search.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Andrew Dunstan



On 02/17/2012 02:28 PM, Christopher Browne wrote:

On Fri, Feb 17, 2012 at 1:21 PM, Andrew Dunstanand...@dunslane.net  wrote:

Assuming we had the cast, What would intval like '1%' mean? You're going
to match 1, 10..19, 100..199, 1000..1999 ...

Now maybe there's a good use for such a test, but I'm have a VERY hard time
imagining what it might be.

Well, I can readily see someone encoding parts of their application
into this sort of encoding, so that, for instance, all customer
numbers beginning with 1 are deemed to be internal accounts.

It's a pretty terrible approach to encoding data for an application;
it leads to stuff like the oops, once they have generated 30,000
invoices, the system reaches doomsday and can't work anymore.
http://thedailywtf.com/Articles/Ive-Got-Your-Number.aspx

But nothing prevents users from designing their applications to encode
information in their ID prefixes.

And I have *zero* confidence that for PostgreSQL to rule out LIKE
'1%' is preventing those designs from getting built...

When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?




Strings of digits used that way should not be stored in numeric fields 
at all, IMNSHO, just as zip codes and phone numbers should not be. They 
should be text in the first place, and if the OP had done that he 
wouldn't have had any difficulty about this. I hope that's what the Lone 
Ranger would do ...


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Robert Haas
On Fri, Feb 17, 2012 at 1:21 PM, Andrew Dunstan and...@dunslane.net wrote:
 Assuming we had the cast, What would intval like '1%' mean? You're going
 to match 1, 10..19, 100..199, 1000..1999 ...

Yep.

 Now maybe there's a good use for such a test, but I'm have a VERY hard time
 imagining what it might be.

Dunno.  Presumably the test is meaningful for the OP's IDs, or he
wouldn't have written the query that way.

The time I got bitten by this was actually with LPAD(), rather than
LIKE.  I had a serial column which I wanted to use to generate record
identifiers off of a sequence: B1, B2, B3, B4, etc.
So I wrote 'B' || lpad(id, 5, '0').   When the implicit casting
changes came along, I had to go back and change that to id::text.
Fortunately that wasn't a lot of work, especially since by that time I
was following pgsql-hackers enough to understand immediately why it
broke, but it did and does seem unnecessary, because there is no real
ambiguity there.  Yeah, there could be ambiguity, if someone created
another LPAD() function... but no one did.

Here's yet another case where the current rules are thoroughly disagreeable.

rhaas=# create or replace function z(smallint) returns smallint as
$$select $1+1$$ language sql;
ERROR:  return type mismatch in function declared to return smallint
DETAIL:  Actual return type is integer.
CONTEXT:  SQL function z

So cast the result from an integer to a smallint.  What's the big deal?

But, OK, I'll do it your way:

rhaas=# create or replace function z(smallint) returns smallint as
$$select $1+1::smallint$$ language sql;
CREATE FUNCTION
rhaas=# select z(1);
ERROR:  function z(integer) does not exist
LINE 1: select z(1);
   ^
HINT:  No function matches the given name and argument types. You
might need to add explicit type casts.

Come on, really?  Note that the above example works without casts if
you use int *or* bigint *or* numeric, but not smallint.  That could be
fixed by causing sufficiently-small integers to lex as smallints, but
if you think implicit datatype coercions are evil, you ought to be
outraged by the fact that we are already going out of our way to blur
the line between int, bigint, and numeric.  We let people write 2.0 +
3 and get 5.0 - surely it's only a short step from there to human
sacrifice, cats and dogs living together... mass hysteria!   I mean,
the whole reason for rejecting integer = text is that we aren't sure
whether to coerce the text to an integer or the integer to a string,
and it's better to throw an error than to guess.  But in the case of
2.0 + 3, we feel 100% confident in predicting that the user will be
happy to convert the integer to a numeric rather than the numeric to
an integer, so no error.  We do that because we know that the domain
of numeric is a superset of the domain of integer, or in other words,
we are using context clues to deduce what the user probably meant
rather than forcing them to be explicit about it.

And yet in other cases, such as LIKE or LPAD with an integer rather
than a string, or just about anything involving smallint, the user is
required to be explicit, even though in most cases there is only one
reasonable implementation of the query.  What is the value in erroring
out on a query that's not actually ambiguous?  Numerous people here
are defending that behavior as if it were somehow morally superior,
but I think it's merely an accident of how the post-8.3 type system
happens to work.  On pre-8.3 systems this all works differently, and
some of those old behaviors are worse than what we have now, while
others are better.  If we really believed that implicit casts any form
were evil, we would have removed them entirely instead of trimming
them back.  I don't see why it's heretical to suggest that the 8.3
casting changes brought us to exactly that point in the universe where
everything is perfect and nothing can be further improved; does anyone
seriously believe that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Assuming we had the cast, What would intval like '1%' mean? You're 
 going to match 1, 10..19, 100..199, 1000..1999 ...

 Now maybe there's a good use for such a test, but I'm have a VERY hard 
 time imagining what it might be.

Yeah, that's another point worth asking.  Coercing an integer to text
and then doing LIKE on it is an extremely inefficient way to do what's
probably the wrong thing anyhow.  I would be interested to know exactly
why the OP wants to do this, and whether it couldn't be done better with
some arithmetical test.

In this connection it's worth remembering that when we removed some
implicit casts in 8.3, we heard lots of yelling, and we also heard from
lots of people who found bugs in their SQL code that the implicit casts
had masked.  Allowing LIKE-on-anything could be a rerun of that mess.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Robert Haas
On Fri, Feb 17, 2012 at 2:44 PM, Andrew Dunstan and...@dunslane.net wrote:
 Strings of digits used that way should not be stored in numeric fields at
 all, IMNSHO, just as zip codes and phone numbers should not be. They should
 be text in the first place, and if the OP had done that he wouldn't have had
 any difficulty about this. I hope that's what the Lone Ranger would do ...

The argument isn't about whether the user made the right design
choices; it's about whether he should be forced to insert an explicit
type cast to get the query to do what it is unambiguously intended to
do.  I don't believe it's entirely self-evident that it's always
better to store strings of integers in a text column rather than as an
integer or bigint - integers are pretty fast and compact.  Even
granting that a text field would have been better, nobody's arguing
that you can't do LIKE against an integer column; we're just
discussing what syntax is required to make the user's intent
acceptably clear.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Andrew Dunstan



On 02/17/2012 02:52 PM, Robert Haas wrote:
If we really believed that implicit casts any form were evil, we would 
have removed them entirely instead of trimming them back. I don't see 
why it's heretical to suggest that the 8.3 casting changes brought us 
to exactly that point in the universe where everything is perfect and 
nothing can be further improved; does anyone seriously believe that? 


I don't believe we are necessarily at a perfect place, nor have I said 
it, nor has anyone else that I'm aware of. Neither am I opposed to 
implementing MySQL features (or doing them better) when appropriate.


But I do believe that a test for intval like '1%' is very likely to 
come from a broken design.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Dimitri Fontaine
Thom Brown t...@linux.com writes:
 And thinking about it, DO is a bit nonsense here, so maybe we'd just
 have something like:

 CREATE TRIGGER...
 AS $$
 BEGIN
 END;
 $$;

 i.e. the same as a function.

I like that.  How do you tell which language the trigger is written in?
I'm not so sure about other function properties (SET, COST, ROWS,
SECURITY DEFINER etc) because applying default and punting users to go
use the full CREATE FUNCTION syntax would be a practical answer here.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Designing an extension for feature-space similarity search

2012-02-17 Thread Jay Levitt

Alexander Korotkov wrote:

On Fri, Feb 17, 2012 at 11:32 PM, Jay Levitt jay.lev...@gmail.com



Ah, yes, exactly the same problem. So what led you to add a flag instead
of using the range NULL..NULL? I'm on the fence about choosing.


At first, range bounds can't be NULL :) At second, if we have range
(a;b)+contain empty in internal page, both facts:
1) All normal underlying ranges are contained in (a;b).
2) There can be empty underlying ranges.
are useful for search.


That makes sense; you're essentially keeping one bit of stats about the 
values present in the range.


I wonder: if I'm indexing a rowtype, then for each column in the row I need 
to store a lower-left and an upper-right bound, plus a might-have-nulls 
flag.  Sounds a lot like a range. Should I just use ranges for that? See a 
downside (overhead)? See an upside (seems less duplicative somehow)? I'm 
fine depending on 9.2.


Jay

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Thom Brown
On 17 February 2012 20:40, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Thom Brown t...@linux.com writes:
 And thinking about it, DO is a bit nonsense here, so maybe we'd just
 have something like:

 CREATE TRIGGER...
 AS $$
 BEGIN
 END;
 $$;

 i.e. the same as a function.

 I like that.  How do you tell which language the trigger is written in?

Exactly the same as a function I'd imagine.  Just tack LANGUAGE
language; at the end.

 I'm not so sure about other function properties (SET, COST, ROWS,
 SECURITY DEFINER etc) because applying default and punting users to go
 use the full CREATE FUNCTION syntax would be a practical answer here.

*shrug* There's also the question about the stability of the trigger's
own in-line function too (i.e. IMMUTABLE, STABLE, VOLATILE).

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Andrew Dunstan



On 02/17/2012 03:58 PM, Thom Brown wrote:

On 17 February 2012 20:40, Dimitri Fontainedimi...@2ndquadrant.fr  wrote:

Thom Brownt...@linux.com  writes:

And thinking about it, DO is a bit nonsense here, so maybe we'd just
have something like:

CREATE TRIGGER...
AS $$
BEGIN
END;
$$;

i.e. the same as a function.

I like that.  How do you tell which language the trigger is written in?

Exactly the same as a function I'd imagine.  Just tack LANGUAGE
language; at the end.


I'm not so sure about other function properties (SET, COST, ROWS,
SECURITY DEFINER etc) because applying default and punting users to go
use the full CREATE FUNCTION syntax would be a practical answer here.

*shrug* There's also the question about the stability of the trigger's
own in-line function too (i.e. IMMUTABLE, STABLE, VOLATILE).



This is going to be pretty much a piece of syntactic sugar. Would it 
matter that much if the trigger functions made thus are all volatile? If 
someone wants the full function feature set they can always use CREATE 
FUNCTION first. I think I'm with Dimitri - let's keep it simple.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Thom Brown
On 17 February 2012 21:07, Andrew Dunstan and...@dunslane.net wrote:


 On 02/17/2012 03:58 PM, Thom Brown wrote:

 On 17 February 2012 20:40, Dimitri Fontainedimi...@2ndquadrant.fr
  wrote:

 Thom Brownt...@linux.com  writes:

 And thinking about it, DO is a bit nonsense here, so maybe we'd just
 have something like:

 CREATE TRIGGER...
 AS $$
 BEGIN
 END;
 $$;

 i.e. the same as a function.

 I like that.  How do you tell which language the trigger is written in?

 Exactly the same as a function I'd imagine.  Just tack LANGUAGE
 language; at the end.

 I'm not so sure about other function properties (SET, COST, ROWS,
 SECURITY DEFINER etc) because applying default and punting users to go
 use the full CREATE FUNCTION syntax would be a practical answer here.

 *shrug* There's also the question about the stability of the trigger's
 own in-line function too (i.e. IMMUTABLE, STABLE, VOLATILE).


 This is going to be pretty much a piece of syntactic sugar. Would it matter
 that much if the trigger functions made thus are all volatile? If someone
 wants the full function feature set they can always use CREATE FUNCTION
 first. I think I'm with Dimitri - let's keep it simple.

Yes, always best to start with essential functionality.

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 02/17/2012 03:58 PM, Thom Brown wrote:
 On 17 February 2012 20:40, Dimitri Fontainedimi...@2ndquadrant.fr  wrote:
 I'm not so sure about other function properties (SET, COST, ROWS,
 SECURITY DEFINER etc) because applying default and punting users to go
 use the full CREATE FUNCTION syntax would be a practical answer here.

 This is going to be pretty much a piece of syntactic sugar. Would it 
 matter that much if the trigger functions made thus are all volatile? If 
 someone wants the full function feature set they can always use CREATE 
 FUNCTION first. I think I'm with Dimitri - let's keep it simple.

Volatility is a complete no-op for a trigger function anyway, as are
other planner parameters such as cost/rows, because there is no planning
involved in trigger calls.

Of the existing CREATE FUNCTION options, I think only LANGUAGE, SECURITY
DEFINER, and SET are of any possible interest for a trigger function.
And I don't have any problem deeming SET a second-order thing that you
should have to go use CREATE FUNCTION for.  But perhaps SECURITY DEFINER
is a common enough need to justify including in this shorthand form.

Has anybody stopped to look at the SQL standard for this?  In-line
trigger definitions are actually what they intend, IIRC.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Has anybody stopped to look at the SQL standard for this?  In-line
 trigger definitions are actually what they intend, IIRC.

In which language?  Do we need to include PL/PSM to be compliant, and
use that by default?  In that case we might want to force people to
spell out LANGUAGE plpgsql when we don't provide for PSM yet, so that we
avoid some backwards compatibility problems down the road.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Tom Lane t...@sss.pgh.pa.us writes:
 Has anybody stopped to look at the SQL standard for this?  In-line
 trigger definitions are actually what they intend, IIRC.

 In which language?  Do we need to include PL/PSM to be compliant, and
 use that by default?

Darn if I know.  But let's make sure we don't paint ourselves into a
corner such that we couldn't support the standard's syntax sometime
in the future.

 In that case we might want to force people to
 spell out LANGUAGE plpgsql when we don't provide for PSM yet, so that we
 avoid some backwards compatibility problems down the road.

I suspect that we can avoid that as long as the command is based around
a string literal for the function body.  OTOH, CREATE FUNCTION has never
had a default for LANGUAGE, and we don't get many complaints about that,
so maybe insisting that LANGUAGE be supplied for an in-line trigger
isn't unreasonable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 16-bit page checksums for 9.2

2012-02-17 Thread Simon Riggs
On Thu, Feb 16, 2012 at 1:53 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Feb 16, 2012 at 6:16 AM, Simon Riggs si...@2ndquadrant.com wrote:
 v8 attached

 It's hard to believe that this version has been tested terribly
 thoroughly, because it doesn't compile.

I'm just back home from a few days off grid.

It's possible it doesn't compile against current HEAD, though it
certainly does compile and work against my last git pull.

I will look into your comments in detail tomorrow morning. Thank you
for looking at the patch.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers with DO functionality

2012-02-17 Thread Jaime Casanova
On Fri, Feb 17, 2012 at 4:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Has anybody stopped to look at the SQL standard for this?  In-line
 trigger definitions are actually what they intend, IIRC.


this is what i found there

trigger definition ::=
  CREATE TRIGGER trigger name trigger action time trigger event
      ON table name [ REFERENCING transition table or variable list ]
      triggered action

triggered action ::=
  [ FOR EACH { ROW | STATEMENT } ]
      [ WHEN left paren search condition right paren ]
      triggered SQL statement

triggered SQL statement ::=
    SQL procedure statement
  | BEGIN ATOMIC { SQL procedure statement semicolon }... END

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Notes about fixing regexes and UTF-8 (yet again)

2012-02-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Feb 17, 2012 at 10:19 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Before going much further with this, we should probably do some timings
 of 64K calls of iswupper and friends, just to see how bad a dumb
 implementation will be.

 Can't hurt.

The answer, on a reasonably new desktop machine (2.0GHz Xeon E5503)
running Fedora 16 in en_US.utf8 locale, is that 64K iterations of
pg_wc_isalpha or sibling functions requires a shade under 2ms.
So this definitely justifies caching the values to avoid computing
them more than once per session, but I'm not convinced there are
grounds for trying harder than that.

BTW, I am also a bit surprised to find out that this locale considers
48342 of those characters to satisfy isalpha().  Seems like a heck of
a lot.  But anyway we can forget my idea of trying to save work by
incorporating a-priori assumptions about which Unicode codepoints are
which --- it'll be faster to just iterate through them all, at least
for that case.  Maybe we should hard-wire some cases like digits, not
sure.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] elog and MemoryContextSwitchTo

2012-02-17 Thread Gaetano Mendola

Hi all,

Is the following code well formed?

oldContext = MemoryContextSwitchTo(newContext);

if (something_bad) {
  elog(ERROR, ...);
}
...
MemoryContextSwitchTo(oldContext);


or do I have to ripristinate the oldContext before to issue the elog ?



Regards
Gaetano Mendola


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] elog and MemoryContextSwitchTo

2012-02-17 Thread Tom Lane
Gaetano Mendola mend...@gmail.com writes:
 or do I have to ripristinate the oldContext before to issue the elog ?

No.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Josh Berkus
On 2/17/12 12:04 PM, Robert Haas wrote:
 The argument isn't about whether the user made the right design
 choices; it's about whether he should be forced to insert an explicit
 type cast to get the query to do what it is unambiguously intended to
 do. 

I don't find INTEGER LIKE '1%' to be unambiguous.

Prior to this discussion, if I had run across such a piece of code, I
couldn't have told you what it would do in MySQL without testing.

What *does* it do in MySQL?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] MySQL search query is not executing in Postgres DB

2012-02-17 Thread Don Baccus

On Feb 17, 2012, at 4:12 PM, Josh Berkus wrote:

 On 2/17/12 12:04 PM, Robert Haas wrote:
 The argument isn't about whether the user made the right design
 choices; it's about whether he should be forced to insert an explicit
 type cast to get the query to do what it is unambiguously intended to
 do. 
 
 I don't find INTEGER LIKE '1%' to be unambiguous.
 
 Prior to this discussion, if I had run across such a piece of code, I
 couldn't have told you what it would do in MySQL without testing.

If someone showed it to me without mention MySQL I'd say:

oh, it's an error.

 
 What *does* it do in MySQL?

And knowing it's MySQL … oh, probably *not* an error, but like you … I'd be 
mystified.

Should 01 like '0%' match?


Don Baccus
http://donb.photo.net
http://birdnotes.net
http://openacs.org







-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Notes about fixing regexes and UTF-8 (yet again)

2012-02-17 Thread Tom Lane
I wrote:
 The answer, on a reasonably new desktop machine (2.0GHz Xeon E5503)
 running Fedora 16 in en_US.utf8 locale, is that 64K iterations of
 pg_wc_isalpha or sibling functions requires a shade under 2ms.
 So this definitely justifies caching the values to avoid computing
 them more than once per session, but I'm not convinced there are
 grounds for trying harder than that.

And here's a poorly-tested draft patch for that.

regards, tom lane


diff --git a/src/backend/regex/regc_cvec.c b/src/backend/regex/regc_cvec.c
index fb6f06b5243f50bfad2cefa5c016d4e842791a3d..98f3c597678b492dd59afcd956e5cdfecdba4f86 100644
*** a/src/backend/regex/regc_cvec.c
--- b/src/backend/regex/regc_cvec.c
*** static void
*** 77,82 
--- 77,83 
  addchr(struct cvec * cv,		/* character vector */
  	   chr c)	/* character to add */
  {
+ 	assert(cv-nchrs  cv-chrspace);
  	cv-chrs[cv-nchrs++] = (chr) c;
  }
  
diff --git a/src/backend/regex/regc_locale.c b/src/backend/regex/regc_locale.c
index 6cf27958b1545a61fba01e76dc4d37aca32789dc..44ce582bdad1a7d830d4122cada45a39c188981c 100644
*** a/src/backend/regex/regc_locale.c
--- b/src/backend/regex/regc_locale.c
*** static const struct cname
*** 351,356 
--- 351,366 
  
  
  /*
+  * We do not use the hard-wired Unicode classification tables that Tcl does.
+  * This is because (a) we need to deal with other encodings besides Unicode,
+  * and (b) we want to track the behavior of the libc locale routines as
+  * closely as possible.  For example, it wouldn't be unreasonable for a
+  * locale to not consider every Unicode letter as a letter.  So we build
+  * character classification cvecs by asking libc, even for Unicode.
+  */
+ 
+ 
+ /*
   * element - map collating-element name to celt
   */
  static celt
*** cclass(struct vars * v,			/* context */
*** 498,503 
--- 508,514 
  	   int cases)/* case-independent? */
  {
  	size_t		len;
+ 	const struct cvec *ccv = NULL;
  	struct cvec *cv = NULL;
  	const char * const *namePtr;
  	int			i,
*** cclass(struct vars * v,			/* context */
*** 549,626 
  
  	/*
  	 * Now compute the character class contents.
- 	 *
- 	 * For the moment, assume that only char codes  256 can be in these
- 	 * classes.
  	 */
  
  	switch ((enum classes) index)
  	{
  		case CC_PRINT:
! 			cv = getcvec(v, UCHAR_MAX, 0);
! 			if (cv)
! 			{
! for (i = 0; i = UCHAR_MAX; i++)
! {
! 	if (pg_wc_isprint((chr) i))
! 		addchr(cv, (chr) i);
! }
! 			}
  			break;
  		case CC_ALNUM:
! 			cv = getcvec(v, UCHAR_MAX, 0);
! 			if (cv)
! 			{
! for (i = 0; i = UCHAR_MAX; i++)
! {
! 	if (pg_wc_isalnum((chr) i))
! 		addchr(cv, (chr) i);
! }
! 			}
  			break;
  		case CC_ALPHA:
! 			cv = getcvec(v, UCHAR_MAX, 0);
! 			if (cv)
! 			{
! for (i = 0; i = UCHAR_MAX; i++)
! {
! 	if (pg_wc_isalpha((chr) i))
! 		addchr(cv, (chr) i);
! }
! 			}
  			break;
  		case CC_ASCII:
  			cv = getcvec(v, 0, 1);
  			if (cv)
  addrange(cv, 0, 0x7f);
  			break;
  		case CC_BLANK:
  			cv = getcvec(v, 2, 0);
  			addchr(cv, '\t');
  			addchr(cv, ' ');
  			break;
  		case CC_CNTRL:
  			cv = getcvec(v, 0, 2);
  			addrange(cv, 0x0, 0x1f);
  			addrange(cv, 0x7f, 0x9f);
  			break;
  		case CC_DIGIT:
! 			cv = getcvec(v, 0, 1);
! 			if (cv)
! addrange(cv, (chr) '0', (chr) '9');
  			break;
  		case CC_PUNCT:
! 			cv = getcvec(v, UCHAR_MAX, 0);
! 			if (cv)
! 			{
! for (i = 0; i = UCHAR_MAX; i++)
! {
! 	if (pg_wc_ispunct((chr) i))
! 		addchr(cv, (chr) i);
! }
! 			}
  			break;
  		case CC_XDIGIT:
  			cv = getcvec(v, 0, 3);
  			if (cv)
  			{
--- 560,608 
  
  	/*
  	 * Now compute the character class contents.
  	 */
  
  	switch ((enum classes) index)
  	{
  		case CC_PRINT:
! 			ccv = pg_ctype_get_cache(pg_wc_isprint);
  			break;
  		case CC_ALNUM:
! 			ccv = pg_ctype_get_cache(pg_wc_isalnum);
  			break;
  		case CC_ALPHA:
! 			ccv = pg_ctype_get_cache(pg_wc_isalpha);
  			break;
  		case CC_ASCII:
+ 			/* hard-wired meaning */
  			cv = getcvec(v, 0, 1);
  			if (cv)
  addrange(cv, 0, 0x7f);
  			break;
  		case CC_BLANK:
+ 			/* hard-wired meaning */
  			cv = getcvec(v, 2, 0);
  			addchr(cv, '\t');
  			addchr(cv, ' ');
  			break;
  		case CC_CNTRL:
+ 			/* hard-wired meaning */
  			cv = getcvec(v, 0, 2);
  			addrange(cv, 0x0, 0x1f);
  			addrange(cv, 0x7f, 0x9f);
  			break;
  		case CC_DIGIT:
! 			ccv = pg_ctype_get_cache(pg_wc_isdigit);
  			break;
  		case CC_PUNCT:
! 			ccv = pg_ctype_get_cache(pg_wc_ispunct);
  			break;
  		case CC_XDIGIT:
+ 			/*
+ 			 * It's not clear how to define this in non-western locales, and
+ 			 * even less clear that there's any particular use in trying.
+ 			 * So just hard-wire the meaning.
+ 			 */
  			cv = getcvec(v, 0, 3);
  			if (cv)
  			{
*** cclass(struct vars * v,			/* context */
*** 630,679 
  			}
  			break;
  		case