Re: [HACKERS] Question about Encoding a Custom Type

2008-06-16 Thread Martijn van Oosterhout
On Sun, Jun 15, 2008 at 10:07:43PM -0700, David E. Wheeler wrote:
> Howdy,
>
> Possibly showing my ignorance here, but as I'm working on updating
> citext to be locale-aware and to work on 8.3, I've run into this
> peculiarity:

The only odd thing I see is the use of PG_ARGS to pass the arguments to
citextcmp. But I can't see why it would break either. Can you attach a
debugger and see where it goes wrong?

As to the comment about freeing stuff, it's usually nice if btree
comparison functions free memory because that way index rebuilds on
large tables don't run you out of memory.
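
To illustrate the point about freeing memory, here is a minimal standalone sketch of a comparison routine that releases its temporary copies before returning. It is not the citext code: plain malloc/free stand in for PostgreSQL's palloc/pfree, and the names lower_dup and ci_compare are hypothetical.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a lower-cased, NUL-terminated copy of the
 * first len bytes of src, or NULL if allocation fails. */
char *lower_dup(const char *src, size_t len)
{
    char *dst = malloc(len + 1);
    if (dst == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        dst[i] = (char) tolower((unsigned char) src[i]);
    dst[len] = '\0';
    return dst;
}

/* Case-insensitive comparison that frees its temporaries before returning.
 * Returns <0, 0 or >0; as a simplification it returns 0 if allocation fails. */
int ci_compare(const char *a, size_t alen, const char *b, size_t blen)
{
    char *la = lower_dup(a, alen);
    char *lb = lower_dup(b, blen);
    int result = 0;

    if (la != NULL && lb != NULL) {
        size_t n = alen < blen ? alen : blen;
        result = memcmp(la, lb, n);
        if (result == 0 && alen != blen)
            result = alen < blen ? -1 : 1;
    }
    /* Without these two calls, every comparison would leak two buffers,
     * which adds up quickly when sorting or indexing millions of rows. */
    free(la);
    free(lb);
    return result;
}
```

Calling ci_compare("aardvark", 8, "AARDVARK", 8) treats the two values as equal and leaves nothing allocated behind.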

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [HACKERS] Question about Encoding a Custom Type

2008-06-16 Thread David E. Wheeler

On Jun 16, 2008, at 02:52, Martijn van Oosterhout wrote:

> The only odd thing I see is the use of PG_ARGS to pass the arguments to
> citextcmp. But I can't see why it would break either. Can you attach a
> debugger and see where it goes wrong?


Yes, I can do that, although I'm pretty new to C (let alone gdb), so  
I'm not sure exactly how to go about it. I'll try to get on IRC later  
today to see if anyone can help me along.



> As to the comment about freeing stuff, it's usually nice if btree
> comparison functions free memory because that way index rebuilds on
> large tables don't run you out of memory.


Thanks. I'll add that to my list.

Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Question about Encoding a Custom Type

2008-06-16 Thread David E. Wheeler

On Jun 16, 2008, at 09:24, David E. Wheeler wrote:


> On Jun 16, 2008, at 02:52, Martijn van Oosterhout wrote:
>
>> The only odd thing I see is the use of PG_ARGS to pass the arguments to
>> citextcmp. But I can't see why it would break either. Can you attach a
>> debugger and see where it goes wrong?
>
> Yes, I can do that, although I'm pretty new to C (let alone gdb), so
> I'm not sure exactly how to go about it. I'll try to get on IRC
> later today to see if anyone can help me along.


What's even weirder is that it can not work and then suddenly work:

try=# select citext_smaller( 'aardvark'::citext, 'AARDVARK'::citext );
ERROR:  invalid byte sequence for encoding UTF8: 0xe02483
HINT:  This error can also happen if the byte sequence does not match  
the encoding expected by the server, which is controlled by  
client_encoding.

try=# select citext_smaller( 'aardvark'::citext, 'AARDVARK'::citext );
 citext_smaller
----------------
 AARDVARK
(1 row)

WTF? Logging onto IRC now…

  https://svn.kineticode.com/citext/trunk/

Best,

David




Re: [HACKERS] Question about Encoding a Custom Type

2008-06-16 Thread Tom Lane
David E. Wheeler [EMAIL PROTECTED] writes:
> What's even weirder is that it can not work and then suddenly work:

Smells like uninitialized-memory problems to me.  Perhaps you are
miscalculating the length of the input data?

Are you testing in an --enable-cassert build?  The memory clobber
stuff can help to make it more obvious where such problems lurk.
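
As a rough illustration of why clobbering helps, the toy arena below overwrites released memory with a fixed byte pattern, so a stale pointer reads obvious garbage every time instead of data that happens to still look valid. This is only a sketch and not PostgreSQL's memory-context code: Arena, arena_alloc, arena_reset and the 0x7F fill value are assumptions made for the example, and alignment is ignored.

```c
#include <stddef.h>
#include <string.h>

#define ARENA_SIZE   256
#define CLOBBER_BYTE 0x7F   /* assumed fill pattern for this sketch */

/* Hypothetical fixed-size arena: allocations are carved off the front. */
typedef struct {
    unsigned char buf[ARENA_SIZE];
    size_t        used;
} Arena;

/* Hand out n bytes from the arena, or NULL if there is not enough room. */
void *arena_alloc(Arena *a, size_t n)
{
    if (n > ARENA_SIZE - a->used)
        return NULL;
    void *p = a->buf + a->used;
    a->used += n;
    return p;
}

/* Release everything and overwrite the old contents, so any pointer that
 * outlives the reset sees the clobber pattern rather than its old data. */
void arena_reset(Arena *a)
{
    memset(a->buf, CLOBBER_BYTE, a->used);
    a->used = 0;
}
```

With the clobbering in place, a bug that depends on leftover bytes fails the same way on every run, which makes it far easier to track down than one that only shows up intermittently.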

regards, tom lane



Re: [HACKERS] Question about Encoding a Custom Type

2008-06-16 Thread David E. Wheeler

On Jun 16, 2008, at 13:06, Tom Lane wrote:


> David E. Wheeler [EMAIL PROTECTED] writes:
>
>> What's even weirder is that it can not work and then suddenly work:
>
> Smells like uninitialized-memory problems to me.  Perhaps you are
> miscalculating the length of the input data?


Entirely possible. Here are the two functions in which I calculate size:

char * cilower(text * arg) {
    // Do I need to free anything here?
    char * str = VARDATA_ANY( arg );
#ifdef USE_WIDE_UPPER_LOWER
    // Have wstring_lower() do the work.
    return wstring_lower( str );
#else
    // Copy the string and process it.
    int    index, len;
    char * result;

    index  = 0;
    len    = VARSIZE(arg) - VARHDRSZ;
    result = (char *) palloc( strlen( str ) + 1 );

    for (index = 0; index <= len; index++) {
        result[index] = tolower((unsigned char) str[index] );
    }
    return result;
#endif   /* USE_WIDE_UPPER_LOWER */
}

int citextcmp (PG_FUNCTION_ARGS) {
    // Could we do away with the varlena struct here?
    text * left  = PG_GETARG_TEXT_P(0);
    text * right = PG_GETARG_TEXT_P(1);
    char * lstr  = cilower( left );
    char * rstr  = cilower( right );
    int    llen  = VARSIZE_ANY_EXHDR(left);
    int    rlen  = VARSIZE_ANY_EXHDR(right);
    return varstr_cmp(lstr, llen, rstr, rlen);
}


> Are you testing in an --enable-cassert build?  The memory clobber
> stuff can help to make it more obvious where such problems lurk.


I've just recompiled with --enable-cassert and --enable-debug, but got  
no more information when I triggered the error, neither in psql nor in  
the log. :-(


Thanks,

David




Re: [HACKERS] Question about Encoding a Custom Type

2008-06-16 Thread Martijn van Oosterhout
On Mon, Jun 16, 2008 at 01:29:33PM -0500, David E. Wheeler wrote:
>> Smells like uninitialized-memory problems to me.  Perhaps you are
>> miscalculating the length of the input data?
>
> Entirely possible. Here are the two functions in which I calculate size:

Actually, real dumb question, but: aren't you assuming that text* values
are NUL-terminated? Because they're not...
 
> char * cilower(text * arg) {
>     // Do I need to free anything here?
>     char * str = VARDATA_ANY( arg );

str here is not null terminated. You need text_to_cstring or something
similar.
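
A minimal standalone sketch of the idea, for illustration only: the copy is driven by the stored length and the terminator is added explicitly, so strlen() is never called on the raw data. The Blob struct is a hypothetical stand-in for a varlena-style value, and blob_to_lower_cstring is an invented name, not a PostgreSQL function.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical stand-in for a varlena-style value: a byte count plus
 * data that is NOT NUL-terminated. */
typedef struct {
    size_t      len;
    const char *data;
} Blob;

/* Build a NUL-terminated, lower-cased copy using only the stored length.
 * The caller owns the result and must free() it. Returns NULL on failure. */
char *blob_to_lower_cstring(const Blob *b)
{
    char *out = malloc(b->len + 1);     /* +1 for the terminator */
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < b->len; i++) /* strictly less than len */
        out[i] = (char) tolower((unsigned char) b->data[i]);
    out[b->len] = '\0';                 /* terminate explicitly */
    return out;
}
```

Whatever bytes happen to follow the value in memory are never read, so the result no longer depends on what was left in the buffer.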

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [HACKERS] Question about Encoding a Custom Type

2008-06-16 Thread David E. Wheeler

On Jun 16, 2008, at 13:41, Martijn van Oosterhout wrote:


> Actually, real dumb question, but: aren't you assuming that text* values
> are NUL-terminated? Because they're not...
>
>> char * cilower(text * arg) {
>>     // Do I need to free anything here?
>>     char * str = VARDATA_ANY( arg );
>
> str here is not null terminated. You need text_to_cstring or something
> similar.


Ah! That makes sense. I changed it to this:

#define GET_TEXT_STR(textp) DatumGetCString( \
DirectFunctionCall1( textout, PointerGetDatum( textp ) ) \
)

char * cilower(text * arg) {
// Do I need to free anything here?
char * str  = GET_TEXT_STR( arg );
...

And now I don't get that error anymore. W00t! Many thanks.

Now I have just one more bizarre error: PostgreSQL thinks that a  
citext column is not in an aggregate even when it is:


try=# CREATE AGGREGATE array_accum (anyelement) (
try(# sfunc = array_append,
try(# stype = anyarray,
try(# initcond = '{}'
try(# );
try=# CREATE TEMP TABLE srt ( name CITEXT );
try=#
try=# INSERT INTO srt (name)
try-# VALUES ('aardvark'),
try-#        ('AAA'),
try-#        ('â');
try=# select array_accum(name) from srt order by name;
ERROR:  column srt.name must appear in the GROUP BY clause or be  
used in an aggregate function


Um, what? Again, I'm sure I'm just missing something really stupid.  
What might cause this?


Many thanks all,

David


Re: [HACKERS] Question about Encoding a Custom Type

2008-06-16 Thread Tom Lane
David E. Wheeler [EMAIL PROTECTED] writes:
> Now I have just one more bizarre error: PostgreSQL thinks that a
> citext column is not in an aggregate even when it is:
> try=# select array_accum(name) from srt order by name;
> ERROR:  column srt.name must appear in the GROUP BY clause or be
> used in an aggregate function
>
> Um, what?

It's complaining about the use in ORDER BY.

regards, tom lane



Re: [HACKERS] Question about Encoding a Custom Type

2008-06-16 Thread David E. Wheeler

On Jun 16, 2008, at 14:38, Tom Lane wrote:


> It's complaining about the use in ORDER BY.


Okay, so stupid question: How can I get an array of the values in a  
given order? I guess this works:


select array_accum(b) from ( select name from srt order by name ) AS A(b);


Thanks,

David




Re: [HACKERS] Question about Encoding a Custom Type

2008-06-16 Thread David Fetter
On Mon, Jun 16, 2008 at 02:45:57PM -0500, David Wheeler wrote:
> On Jun 16, 2008, at 14:38, Tom Lane wrote:
>
>> It's complaining about the use in ORDER BY.
>
> Okay, so stupid question: How can I get an array of the values in a
> given order? I guess this works:
>
> select array_accum(b) from ( select name from srt order by name ) AS A(b);

SELECT ARRAY(SELECT name FROM srt ORDER BY name); -- also works.

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Question about Encoding a Custom Type

2008-06-16 Thread David E. Wheeler

On Jun 16, 2008, at 16:48, David Fetter wrote:


>> select array_accum(b) from ( select name from srt order by name ) AS A(b);
>
> SELECT ARRAY(SELECT name FROM srt ORDER BY name); -- also works.


Wow, somehow I'd missed that syntax over the years. Thanks David!

Best,

David
