Re: [SQL] regexp_replace and UTF8

2009-02-04 Thread Jasen Betts
On 2009-02-02, Bart Degryse  wrote:
>
> --=__PartF6DE34E1.0__=
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: quoted-printable
>
> Thanks for the ideas!
> The function Jasen suggests works partially.
> There are also entities like
> ‘
> ’
> “
> ”
> …

these work fine here.

jasen=# select htmlent('‘ ’ “ ” …');
  htmlent  
  ---
   ‘ ’ “ ” …
   (1 row)
   
> The output of the htmlent function for these looks like |

I see a pipe symbol '|' is that what you wrote?

> Was that what you meant with 
>>"characters outside of the LATIN-1 space
>> are not handled but apparently this is what you want." ?

I was under a mistaken impression of the encoding of HTML numeric
entities.

> Because in that case, they should be handled too.
> How should that be done.
> Thanks,

it works here because server_encoding is UTF8;

to check do this.

  show server_encoding;show client_encoding;
  
it looks like you's need to convert the numbers to utt-8 in a bytea
type and then use convert to translate then to your preferred encoding
(appears to be win125x)

postgresql8.3 appears to provide no way to generate UTF-8 in a bytea
type so if you can't use a utf-8 encoding for your database you'll
have to write your own.








 Jasen Betts  2009-01-31 12:47 >>>
> On 2009-01-30, Bart Degryse  wrote:
>>
>> --=3D__Part8EA648F8.0__=3D
>> Content-Type: text/plain; charset=3DUTF-8
>> Content-Transfer-Encoding: quoted-printable
>>
>> Hi,
>> I have a text field with data like this: 'de patiënt niet'
>> (without the quotes).
>> I would like to convert this string to look like this: 'de pati=C3=ABnt
>> niet'
>> Basically what I need to do (I think) is
>> - get rid of the &, # and ;
>> - convert the number to hex
>> - make a UTF8 from that (thus: \xEB)
> that is not UTF8.
>
> the UTF8 representation for '=C3=AB' in  SQL is e'\xC3\xAB' or chr(235)
>
> your input appears to be encoded in LATIN-1 (or possibly 8859-13)
>
>
> I think you'll need to write a function.
>
> here, where I have database encoding UTF8 this appears to work as you
> desire.
>
> CREATE or replace FUNCTION htmlent(inp text) returns text as
> $f$
> DECLARE
>   str text;
> BEGIN
> str=3D regexp_replace(quote_literal( inp)
> ,$$&#(\d+);$$
> ,$$'||chr(\1)||'$$=20
> ,'g');
> execute 'select '||str into str;
> return str;
> END
> $f$
> LANGUAGE PLPGSQL;
>
> select htmlent('de patiënt niet');
>
> probably the above should be expanded to handle named entities=20
> like '&' too.
>
> characters outside of the LATIN-1 space are not handled=20
> but aparently this is what you want.
>
>
> --=20
> Sent via pgsql-sql mailing list ([email protected])
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql
>
> --=__PartF6DE34E1.0__=
> Content-Type: text/html; charset=UTF-8
> Content-Transfer-Encoding: quoted-printable
>
>
>
>
>
>Thanks for the ideas!
>The function Jasen suggests works partially.
>There are also entities like
>‘
>
>’
>“
>”
>…
>The output of the htmlent function for these looks like | 
>Was that what you meant with "characters outside of the LATIN-1 space =
> are not handled
>but apparently this is what you want." ?
>Because in that case, they should be handled too.
>How should that be done.
>Thanks,
>Bart >>> Jasen Betts  gt; 2009-01-31 12:47 >>>On 2009-01-30, Bart Degryse  [email protected]> wrote:>> --=3D__Part8EA648F8.0__=3D R>> Content-Type: text/plain; charset=3DUTF-8> Content-Transfer-E=
> ncoding: quoted-printable>> Hi,> I have a text field =
> with data like this: 'de patiënt niet'> (without the =
> quotes).> I would like to convert this string to look like this: =
> 'de pati=C3=ABnt> niet'> Basically what I need to do (I =
> think) is> - get rid of the &, # and ;> - convert the =
> number to hex> - make a UTF8 from that (thus: \xEB)that is not =
> UTF8.the UTF8 representation for '=C3=AB' in  SQL is =
> e'\xC3\xAB' or chr(235)your input appears to be encoded in LATIN-1 =
> (or possibly 8859-13)I think you'll need to write a function. R>here, where I have database encoding UTF8 this appears to work as =
> youdesire.CREATE or replace FUNCTION htmlent(inp text) returns =
> text as$f$DECLARE  str text;BEGINstr=3D regexp_rep=
> lace(quote_literal( inp)=
> ;    =
> ,$$&#(\d+);$$    ,$$'||chr(\1)||'$$   =
>   ,'g');execute 'select '||str into str;return str;END=
> $f$LANGUAGE PLPGSQL;select htmlent('de patiënt =
> niet');probably the above should be expanded to handle named =
> entities like '&' too.characters outside of the =
> LATIN-1 space are not handled but aparently this is what you want.<=
> BR>-- Sent via pgsql-sql mailing list ([email protected])>To make changes to your subscription:http://www.postgresql.=
> org/mailpref/pgsql-sql">http://www.postgresql.org/mailpref/pgsql-sql>
>
> --=__PartF6DE34E1.0__=--
>


-- 
Sent via pgsql-sql mailing list ([email protected]

[SQL] postgre2postgre

2009-02-04 Thread F.

Hello,
I am trying to migrate from postgresql-8.1.11-1.el5_1.1 (i386) to
postgresql-8.3.4-1.fc10.x86_64. But I can not.

Database uses ltree and tsearch and the problem seems to be this.

I am using,
pg_dump in first computer and psql in second computer to execute script.

First error:
psql:informatica.sql:24: ERROR:  no se encuentra la función «gtsvector_in» en 
el archivo «/usr/lib64/pgsql/tsearch2.so»

Anyone know any way to migrate?





-- 




--
Publicidad http://www.pas-world.com


-- 
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] postgre2postgre

2009-02-04 Thread Devrim GÜNDÜZ
On Wed, 2009-02-04 at 22:57 +0100, F. wrote:
> First error:
> psql:informatica.sql:24: ERROR:  no se encuentra la función
> «gtsvector_in» en el archivo «/usr/lib64/pgsql/tsearch2.so»
> 
> Anyone know any way to migrate?

Tsearch2 was integrated in core as of 8.3. You will need to read this:

http://www.postgresql.org/docs/8.3/static/textsearch-migration.html

Regards,
-- 
Devrim GÜNDÜZ, RHCE
devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
   http://www.gunduz.org


signature.asc
Description: This is a digitally signed message part


Re: [SQL] postgre2postgre

2009-02-04 Thread Chris

F. wrote:

Hello,
I am trying to migrate from postgresql-8.1.11-1.el5_1.1 (i386) to
postgresql-8.3.4-1.fc10.x86_64. But I can not.

Database uses ltree and tsearch and the problem seems to be this.

I am using,
pg_dump in first computer and psql in second computer to execute script.

First error:
psql:informatica.sql:24: ERROR:  no se encuentra la función «gtsvector_in» en 
el archivo «/usr/lib64/pgsql/tsearch2.so»

Anyone know any way to migrate?


tsearch2 became a built in module, there is doco on the website about 
how to handle this:


http://www.postgresql.org/docs/8.3/static/tsearch2.html

--
Postgresql & php tutorials
http://www.designmagick.com/


--
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


[SQL] current_date vs 'now'

2009-02-04 Thread Jamie Tufnell
Hi,

I was doing EXPLAIN ANALYZE on a query where I compare against
current_date and noticed the following:

   Filter: (date <= ('now'::text)::date)

I knew about now() but did not know about 'now' and have since learnt
of 'today', 'tomorrow', etc.  Great!

So, I changed my condition to <= 'now' to see if there would be any
improvement, which changed the filter line to:

   Filter: (date <= '2009-02-05'::date)

and without fail this query is 2ms quicker every time.  I alternated
between the two and ran them both several times to make sure it wasn't
just a cache difference.

AFAIK current_date is standard and 'now' and 'today', etc are not...
so that's one reason to continuing using current_date.  However, I
wonder why 'today' and current_date don't generate the same query
plan?

Jamie

-- 
Sent via pgsql-sql mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql