Re: [GENERAL] Searching for bare letters

2011-10-02 Thread Reuven M. Lerner

Hi, everyone.  Uwe wrote:


What kind of client are the users using?  I assume you will have some kind
of user interface. For me this is a typical job for a user interface. The
number of letters with equivalents in different languages are extremely
limited, so a simple matching routine in the user interface should give you a
way to issue the proper query.


The user interface will be via a Web application.  But we need to store 
the data with the European characters, such as ñ, so that we can display 
them appropriately.  So much as I like your suggestion, we need to do 
the opposite of what you're saying -- namely, take a bare letter, and 
then search for letters with accents and such on them.


I am beginning to think that storing two versions of each name, one bare 
and the other not, might be the easiest way to go.   But hey, I'm open 
to more suggestions.


Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for bare letters

2011-10-02 Thread hamann . w
Reuven M. Lerner wrote:

 Hi, everyone.  I'm working on a project on PostgreSQL 9.0 (soon
   to be upgraded to 9.1, given that we haven't yet launched).  The
   project will involve numerous text fields containing English,
   Spanish, and Portuguese.  Some of those text fields will be
   searchable by the user.  That's easy enough to do; for our
   purposes, I was planning to use some combination of LIKE searches;
   the database is small enough that this doesn't take very much
   time, and we don't expect the number of searchable records (or
   columns within those records) to be all that large.

 The thing is, the people running the site want searches to work
   on what I'm calling (for lack of a better term) "bare letters".
   That is, if the user searches for "n", then the search should also
   match Spanish words containing "ñ".  I'm told by Spanish-speaking
   members of the team that this is how they would expect searches to
   work.  However, when I just did a quick test using a UTF-8 encoded
   9.0 database, I found that PostgreSQL didn't see the two
   characters as identical.  (I must say, this is the behavior that I
   would have expected, had the Spanish-speaking team member not said
   anything on the subject.)

 So my question is whether I can somehow wrangle PostgreSQL into
   thinking that "n" and "ñ" are the same character for search
   purposes, or if I need to do something else -- use regexps, keep a
   "naked", searchable version of each column alongside the native
   one, or something else entirely -- to get this to work.

 Any ideas?

 Thanks,

 Reuven

I had the same problem with German (there are ä ö ü).
I ended up with a normalized version of the database (for many purposes, this
could just be an extra column) plus preprocessing the input.
There is one difficulty with German searches: these letters are commonly
transliterated into ue etc., as in Muenchen. So, depending on culture, some
people would expect a bare u search term to match, and others the ue. So
preprocessing the query means replacing a bare u (not followed by e) with a
u(e)? regex.
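
A rough sketch of what I mean, in SQL (the table "places" and column "name"
here are only illustrative, not a real schema):

ALTER TABLE places ADD COLUMN name_norm text;
UPDATE places
   SET name_norm = lower(replace(translate(name, 'äöüÄÖÜ', 'aouAOU'), 'ß', 'ss'));

-- a bare "u" in the search term becomes the regex u(e)?, so both
-- "München" (normalized to munchen) and "Muenchen" match:
SELECT name FROM places WHERE name_norm ~ 'mu(e)?nchen';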

BTW: if your search form does not explicitly tell the browser to use utf8 to 
encode the search field,
you might expect a small proportion of iso-latin1 requests

Regards
Wolfgang




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SQL Help - Finding Next Lowest Value of Current Row Value

2011-10-02 Thread Thomas Kellerer

Jeff Adams wrote on 01.10.2011 23:30:

Greetings,

I have a large table (~19 million records). Records contain a field
identifying a vessel and a field containing a time (epoch). Using the
current row's vessel and time values, I need to be able to find the next
lowest time value for the vessel and use it to compute how much time has
elapsed between the records. I have tried a scalar subquery in the SELECT,
which works, but it runs quite slowly. Is there an alternative approach that
might perform better for this type of query? Any information would be
greatly appreciated. Thanks...

Jeff


Something like:


SELECT vessel,
       time_column,
       time_column - lag(time_column) over (partition by vessel order by time_column) as diff
FROM your_table

Not sure how well that performs, though.
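
If it is still slow on ~19 million rows, an index matching the window's
partitioning and ordering will probably help (an untested guess, and the
index name is made up):

CREATE INDEX your_table_vessel_time_idx ON your_table (vessel, time_column);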




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for bare letters

2011-10-02 Thread Uwe Schroeder
 Hi, everyone.  Uwe wrote:
  What kind of client are the users using?  I assume you will have some
  kind of user interface. For me this is a typical job for a user
  interface. The number of letters with equivalents in different
  languages are extremely limited, so a simple matching routine in the
  user interface should give you a way to issue the proper query.
 
 The user interface will be via a Web application.  But we need to store
 the data with the European characters, such as ñ, so that we can display
 them appropriately.  So much as I like your suggestion, we need to do
 the opposite of what you're saying -- namely, take a bare letter, and
 then search for letters with accents and such on them.
 
 I am beginning to think that storing two versions of each name, one bare
 and the other not, might be the easiest way to go.   But hey, I'm open
 to more suggestions.
 
 Reuven


That still doesn't prevent you from using a matching algorithm. Here's a simple
example (to my understanding of the problem):
You have texts stored in the db containing both an "n" and an "ñ". Now a client
enters "n" on the website. What you want to do is look for both variations, so
"n" translates into "n" or "ñ".
There you have it. In the routine that receives the request you have a
matching method that matches on "n" (or any of the few other characters with
equivalents) and the routine will issue a query with "xx like '%n%' or xx
like '%ñ%'" (personally I would use ilike, since that eliminates the case
problem).
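
In SQL terms, a minimal sketch of that (the table "people" and column "name"
are hypothetical):

SELECT * FROM people WHERE name ILIKE '%n%' OR name ILIKE '%ñ%';

-- or, equivalently, expand the bare letter into a bracket expression
-- in a case-insensitive regex:
SELECT * FROM people WHERE name ~* '[nñ]';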

Since you're referring to a name, I sure don't know the specifics of the
problem or data layout, but from what I know I think you can tackle this with a
rather primitive match-and-translate kind of algorithm.

One thing I'd not do: store duplicate versions. There's always a way to deal 
with data the way it is. In my opinion storing different versions of the same 
data just bloats a database in favor of a smarter way to deal with the initial 
data.

Uwe



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for bare letters

2011-10-02 Thread Oleg Bartunov
I don't see the problem - you can have a dictionary which does all the work of
recognizing bare letters and outputs several versions. Have you seen the unaccent

dictionary?
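
For example, a minimal sketch (assuming the contrib module is installed; the
table and column names are made up):

CREATE EXTENSION unaccent;   -- 9.1; on 9.0, run contrib's unaccent.sql instead
SELECT unaccent('Peña');     -- returns 'Pena'
SELECT * FROM people
 WHERE unaccent(name) ILIKE '%' || unaccent('pena') || '%';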

Oleg
On Sun, 2 Oct 2011, Uwe Schroeder wrote:


Hi, everyone.  Uwe wrote:

What kind of client are the users using?  I assume you will have some
kind of user interface. For me this is a typical job for a user
interface. The number of letters with equivalents in different
languages are extremely limited, so a simple matching routine in the
user interface should give you a way to issue the proper query.


The user interface will be via a Web application.  But we need to store
the data with the European characters, such as ñ, so that we can display
them appropriately.  So much as I like your suggestion, we need to do
the opposite of what you're saying -- namely, take a bare letter, and
then search for letters with accents and such on them.

I am beginning to think that storing two versions of each name, one bare
and the other not, might be the easiest way to go.   But hey, I'm open
to more suggestions.

Reuven



That still doesn't hinder you from using a matching algorithm. Here a simple
example (to my understanding of the problem)
You have texts stored in the db both containing a n and a ñ. Now a client
enters n on the website. What you want to do is look for both variations, so
n translates into n or ñ.
There you have it. In the routine that receives the request you have a
matching method that matches on n (or any of the few other characters with
equivalents) and the routine will issue a query with a xx like %n% or xx
like %ñ% (personally I would use ilike, since that eliminates the case
problem).

Since you're referring to a name, I sure don't know the specifics of the
problem or data layout, but by what I know I think you can tackle this with a
rather primitive match - translate to kind of algorithm.

One thing I'd not do: store duplicate versions. There's always a way to deal
with data the way it is. In my opinion storing different versions of the same
data just bloats a database in favor of a smarter way to deal with the initial
data.

Uwe






Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Updating 9.0.4 -- 9.1.1: How best to ???

2011-10-02 Thread r d
Hi,

I would like to know which if any is a recommended install and upgrade
method.

My situation is summarized as:

I currently have Postgresql 9.0.4 installed via the Fedora Core FC15 yum
repo, and 9.1.x is not there (yet).

I decided to install the PGDG repo (pgdg-fedora91-9.1-5.noarch) to make it
update. It does not.

The FC15 packages are called postgresql_* and not postgresql90_*, the
PGDG packages are called postgresql91_*.


Since both repositories apparently get in each other's way, my question is,
what is the best (recommended) way to update or install postgres:

- Dump my FC15 installation and replace it with one coming from the PGDG
repo?
- Wait for the 9.1 packages to appear in the Fedora Core repo?

I prefer installing from a repo, as there is currently no reason for me to
install from source.

Thanks in advance,

RD


Re: [GENERAL] Why PGSQL has no developments in the .NET area?

2011-10-02 Thread Andy Colson

On 10/01/2011 10:32 PM, Rohit Coder wrote:

PgSQL has just one old NPGSQL driver for .NET, which is itself sluggish. The 
ODBC driver works better as compared to NPGSQL, but I suspect the ODBC driver 
is not the right choice for ORM framework of .NET.

I want to know whether there is any efficient .NET provider, and whether PGSQL 
is compatible with the .NET Entity Framework.

Regards,
Rohit.


why: probably because no .net user has stepped up and written one.  Not enough 
interest maybe?

But really, why would you need to?  I assume .net is not so crippled that it 
can't link to a .dll, correct?  Why not just use the native libpq.dll?  The 
interface is well documented and simple to use.  And you won't get any faster.

-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PL/Python

2011-10-02 Thread Sim Zacks


  
  
On 09/30/2011 05:10 AM, Joe Abbate wrote:

  Although there are no discussions or examples in the documentation, I've
determined that PL/Python supports Python new style classes like class
Test(object), and import of standard modules.

Now, in order for to_yaml/to_map to do its work, it needs to import 15
modules, e.g.,

from pyrseas.dbobject.language import LanguageDict

I'd be interested to know if anyone has tried something similar (in
terms of scope) and if so, how they tackled it.

The other function, diff_yaml() or diff_map(), is a bit trickier because
it requires reading in a YAML spec file, like the one above, and then
comparing it to the internal version of to_yaml/to_map, in order to
output SQL DDL statements.  The command line tool could read the spec
file locally and send it off as one big text argument.  Again, I'm
interested in similar experiences if any.


If I understand plpython correctly, it uses the python installed on
the machine. In other words, it isn't plpythonu itself that supports (or
doesn't support) new style classes; it depends on what version of python
is installed.
In the same way, if you have libraries installed on your machine,
plpythonu has access to them as well. So if someone installed
pyrseas, he would be able to use all its functions from within his own
function.

IMO, there is no need for an extension here, all you need to do is
have an sql file containing your 2 plpythonu functions that can be
run into the database.
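
For example, something along these lines (only a sketch to show the shape of
such a file; the function name and body are placeholders, not the real Pyrseas
code, and the import assumes PyYAML is installed on the server):

-- pyrseas_functions.sql: run it into the database with
--   psql -d mydb -f pyrseas_functions.sql
CREATE OR REPLACE FUNCTION my_to_yaml(dbname text) RETURNS text AS $$
    # plain Python here; any module installed on the server is importable
    import yaml
    return yaml.dump({'database': dbname})
$$ LANGUAGE plpythonu;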

Sim 
  



Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread pasman pasmański
It's simple to remove strange chars with regexp_replace.
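
For example (only a sketch; 'foo' and 'bar' stand in for the real table and
column, and the pattern is the U+200E byte sequence found earlier in this
thread):

UPDATE foo SET bar = regexp_replace(bar, E'\xe2\x80\x8e', '', 'g');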

2011/10/1, Leif Biberg Kristensen l...@solumslekt.org:
 On Saturday 1. October 2011 21.29.45 Andrew Sullivan wrote:
 I see you found it, but note that it's _not_ a spurious UTF-8
 character: it's a right-to-left mark, and is a perfectly ok UTF-8 code
 point.

 Andrew,
 thank you for your reply. Yes I know that this is a perfectly legal UTF-8
 character. It crept into my database as a result of a copy-and-paste job
 from
 a web site. The point is that it doesn't have a counterpart in ISO-8859-1 to
 which I regularly have to export the data.

 The offending character came from this URL:
 http://www.soge.kviteseid.no/individual.php?pid=I2914&ged=Kviteseid.GED&tab=0

 and the text that I copied and pasted from the page looks like this in the
 source code:

 Aslaug Steinarsdotter Fjågesund&nbsp;&nbsp;&lrm;(I2914)&lrm;

 I'm going to write to the webmaster of the site and ask why that character,
 represented in the HTML as the &lrm; entity, has to appear in a Norwegian
 web
 site which never should have to display text in anything but left-to-right
 order.

 If you need a subset of the UTF-8 character set, you want to make sure
 you have some sort of constraint in your application or your database
 that prevents insertion of anything at all in UTF-8.  This is a need
 people often forget when working in an internationalized setting,
 because there's a lot of crap that comes from the client side in a
 UTF-8 setting that might not come in other settings (like LATIN1).

 I don't want any constraint of that sort. I'm perfectly happy with UTF-8.
 And
 now that I've found out how to spot problematic characters that will crash
 my
 export script, it's really not an issue anymore. The character didn't print
 in either psql or my PHP frontend, so I just removed the problematic text
 and re-entered it by hand. Problem solved.

 But thank you for the idea, I think that I will strip out at least any &lrm;
 entities from text entered into the database.

 By the way, is there a setting in psql that will output unprintable
 characters
 as question marks or something?

 regards, Leif.

 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general



-- 

pasman

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread Leif Biberg Kristensen
On Sunday 2. October 2011 15.53.50 pasman pasmański wrote:
 Its simple to remove strange chars  with regex_replace.

True, but first you have to know how to represent a «strange char» in 
Postgresql :P

It isn't all that obvious, and it's difficult to search for the solution. I 
tried a lot of different search terms in Google, and none of them turned up 
anything near what I needed.

regards, Leif

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread Cédric Villemain
2011/10/2 Leif Biberg Kristensen l...@solumslekt.org:
 On Sunday 2. October 2011 15.53.50 pasman pasmański wrote:
 Its simple to remove strange chars  with regex_replace.

 True, but first you have to know how to represent a «strange char» in
 Postgresql :P

 It isn't all that obvious, and it's difficult to search for the solution. I
 tried a lot of different search terms in Google, and none of them turned up
 anything near what I needed.

you may have missed this one:
http://tapoueh.org/blog/2010/02/23-getting-out-of-sql_ascii-part-2.html



 regards, Leif

 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general




-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PL/Python

2011-10-02 Thread Joe Abbate
Hi Sim,

On 10/02/2011 08:02 AM, Sim Zacks wrote:
 If I understand plpython correctly, it uses the python installed on the
 machine. In other words, plpythonu doesn't support the new style
 classes, it depends on what version of python is installed.

Well, Python has had new style classes since 2.2 (December 2001).  PG
8.2 release notes say it supports Python 2.5, and the 9.0 notes show
support was added for Python 3.  Unless someone is running Python 2.1 or
earlier, it seems new style classes are available.

 In the same way, if you have libraries installed on your machine,
 plpythonu has access to them as well. So if someone installed pyrseas,
 he would be able to use all the functions from within his function.

Yes, that seems to be the case, assuming the path to the library is
visible in (or added to) PYTHONPATH.

 IMO, there is no need for an extension here, all you need to do is have
 an sql file containing your 2 plpythonu functions that can be run into
 the database.

Maybe I'm misunderstanding something, but isn't such a sql file an
extension, or at least 95% of the way there?  Pyrseas is already distributed
via PGXN, but like some other PGXN extensions (dbi-link?), it doesn't
actually create functions in the database.  Its two utilities run
entirely as external programs.  If the Pyrseas functions were added via
an .sql file to a database, EXTENSION or not, they would be available
for use by non-Pyrseas programs, e.g., pgAdmin could call diff_map() to
compare database objects, Perl scripts or even a plain psql script could
call to_yaml().  And these would not depend on psycopg2, which currently
*is* a Pyrseas dependency (it would still be necessary for access to the
command line utilities).

Joe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread Leif Biberg Kristensen
On Sunday 2. October 2011 16.34.27 Cédric Villemain wrote:
 you may have miss this one :
 http://tapoueh.org/blog/2010/02/23-getting-out-of-sql_ascii-part-2.html

That's an, uh, interesting article, but as far as I can see, it doesn't say
anything about how to find a perfectly legal three-byte UTF-8 character that 
doesn't have a counterpart in LATIN1, given that all I know about it is its 
hexadecimal value.

I know how to do it now, and I consider the problem solved. Hopefully, this 
thread may help others who stumble upon the same issue.

regards, Leif

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for bare letters

2011-10-02 Thread Reuven M. Lerner

Hi, Oleg.  You wrote:


I don't see the problem - you can have a dictionary, which does all work
on recognizing bare letters and output several versions. Have you seen
unaccent
dictionary ?


This seems to be the direction that everyone is suggesting, and I'm 
quite grateful for that.  (I really hadn't ever needed to deal with such 
issues in the past, having worked mostly with English and Hebrew, which 
don't have such accent marks.)


As for the unaccent dictionary, I hadn't heard of it before, but just 
saw it now in contrib, and it looks like it might fit perfectly.  I'll 
take a look; thanks for the suggestion.


Reuven

--
Reuven M. Lerner -- Web development, consulting, and training
Mobile: +972-54-496-8405 * US phone: 847-230-9795
Skype/AIM: reuvenlerner

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread Raymond O'Donnell
On 02/10/2011 15:55, Leif Biberg Kristensen wrote:
 On Sunday 2. October 2011 16.34.27 Cédric Villemain wrote:
 you may have miss this one :
 http://tapoueh.org/blog/2010/02/23-getting-out-of-sql_ascii-part-2.html
 
 That's an, uh, interesting article, but as far as I can see, it doesn't tell 
 anything about how to find a perfectly legal three-byte UTF-8 character that 
 doesn't have a counterpart in LATIN1, given that all I know about it is its 
 hexadecimal value.
 
 I know how to do it now, and I consider the problem solved. Hopefully, this 
 thread may help others who stumbles upon the same issue.

I may have missed it upthread, but if you haven't already, would you
consider writing up your solution for the benefit of the archives?

Ray.


-- 
Raymond O'Donnell :: Galway :: Ireland
r...@iol.ie

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to find freak UTF-8 character?

2011-10-02 Thread Leif Biberg Kristensen
On Sunday 2. October 2011 17.54.52 Raymond O'Donnell wrote:
 I may have missed it upthread, but if you haven't already would you
 consider writing up your solution for the benefit of the archives?

I did, in my own first reply to the original message:

SELECT * FROM foo WHERE bar LIKE E'%\xe2\x80\x8e%';

The trick is obviously to escape each byte in the sequence.
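
Another way to write the same thing, if you know the code point rather than
the bytes -- if I read the docs right, chr() takes a Unicode code point in a
UTF-8 database, and U+200E is 8206 decimal:

SELECT * FROM foo WHERE bar LIKE '%' || chr(8206) || '%';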

Maybe I'll write a blog post about it. It appears to be weakly documented, or 
at least very hard to find. Or maybe it's just me being dense.

regards, Leif.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Help message of PostgreSQL-win-x64 is UTF-8 encoded which is malformed

2011-10-02 Thread LiuYan
I downloaded both postgresql-9.1.1-1-windows-x64-binaries.zip and
postgresql-9.1.1-1-windows-binaries.zip, running on Simplified Chinese
Windows XP Professional x64.

The output of `psql --help` (x64) is UTF-8 encoded, so it is displayed as
malformed characters in the command prompt window.
psql --help
psql鏄疨ostgreSQL 鐨勪氦浜掑紡瀹㈡埛绔伐鍏枫€?浣跨敤鏂规硶:

The output of `psql --help` (win32) is GBK encoded, which is displayed
normally.
psql --help
psql是PostgreSQL 的交互式客户端工具。

Is it a bug of PostgreSQL win-x64?

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Help-message-of-PostgreSQL-win-x64-is-UTF-8-encoded-which-is-malformed-tp4861320p4861320.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Updating 9.0.4 -- 9.1.1: How best to ???

2011-10-02 Thread Devrim GÜNDÜZ

Hi,

On Sun, 2011-10-02 at 12:41 +0200, r d wrote:
 
 I decided to install the PGDG repo (pgdg-fedora91-9.1-5.noarch) to
 make it update. It does not.

Fedora/Red Hat distro RPMs are only binary compatible since last year. I
broke package compatibility by enabling multiple version installation.

 The FC15 packages are called postgresql_* and not postgresql90_*,
 the PGDG packages are called postgresql91_*.

See above.

 Since both repositories apparently get into each others way, my
 question is, what is the best (recommended) way to update or install
 postgres:
 
 - Dump my FC15 installation and replace it with one coming from the
 PGDG repo?

This might be one of the options. However, as compared to Fedora RPMs,
we don't offer an upgrade path. Fedora RPMs have a

service (or systemctl... whatever) postgresql upgrade

option. So, an upgrade path could be:

* Replace current packages with PGDG packages. Please note that data
directory of 9.0 will be under /var/lib/pgsql/9.0, not /var/lib/pgsql.

* Make sure that db is working.

* Now, install 9.1 in parallel.

* Perform upgrade using pg_upgrade or pg_dump.

* Start 9.1
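
Roughly, for the pg_upgrade route (only a sketch -- service names and paths
follow the PGDG layout and may differ on your box; run pg_upgrade as the
postgres user, with both clusters stopped):

service postgresql-9.0 stop
service postgresql-9.1 stop
/usr/pgsql-9.1/bin/pg_upgrade \
    -b /usr/pgsql-9.0/bin -B /usr/pgsql-9.1/bin \
    -d /var/lib/pgsql/9.0/data -D /var/lib/pgsql/9.1/data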

 - Wait for the 9.1 packages to appear in the Fedora Core repo? 

Fedora will release 9.1 in F-16. Fedora 15 won't have 9.1.

Regards,

-- 
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz




Re: [GENERAL] Why PGSQL has no developments in the .NET area?

2011-10-02 Thread Mike Christensen
 PgSQL has just one old NPGSQL driver for .NET, which is itself sluggish.
 The ODBC driver works better as compared to NPGSQL, but I suspect the ODBC
 driver is not the right choice for ORM framework of .NET.

 I want to know whether there is any efficient .NET provider and is PGSQL
 compatible with the .NET entity framework.

 Regards,
 Rohit.

 Won't bite on the why of it all, but is this
 http://www.devart.com/dotconnect/postgresql/ what you're after?

We use the DevArt drivers (their Oracle drivers anyway) at work and
they're pretty good.  They have solid tech support (helpful and quick
to respond) and they seem to write quality stuff.  I'd assume their PG
drivers are just as good.  However, this stuff isn't cheap.  You pay
per dev license and it's like a few hundred bucks per dev.

I use Npgsql for my own website (www.kitchenpc.com) and I've had zero
problems with these drivers.  They're 100% managed code, and they seem
to be quick even though I haven't done any in-depth speed comparisons.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for bare letters

2011-10-02 Thread Mike Christensen
 I don't see the problem - you can have a dictionary, which does all work
 on recognizing bare letters and output several versions. Have you seen
 unaccent
 dictionary ?

 This seems to be the direction that everyone is suggesting, and I'm quite
 grateful for that.  (I really hadn't ever needed to deal with such issues in
 the past, having worked mostly with English and Hebrew, which don't have
 such accent marks.)

 As for the unaccent dictionary, I hadn't heard of it before, but just saw it
 now in contrib, and it looks like it might fit perfectly.  I'll take a look;
 thanks for the suggestion.

I wrote this code for something similar I was doing, feel free to rip
it off or copy the regular expressions:


input = Regex.Replace(input, @"[\xC0-\xC5\xE0-\xE5]", "a");  //Replace with a
input = Regex.Replace(input, @"[\xC8-\xCB\xE8-\xEB]", "e");  //Replace with e
input = Regex.Replace(input, @"[\xCC-\xCF\xEC-\xEF]", "i");  //Replace with i
input = Regex.Replace(input, @"[\xD1\xF1]", "n");            //Replace with n
input = Regex.Replace(input, @"[\xD2-\xD6\xF2-\xF6]", "o");  //Replace with o
input = Regex.Replace(input, @"[\xD9-\xDC\xF9-\xFC]", "u");  //Replace with u
input = Regex.Replace(input, @"[\xDD\xDF\xFF]", "y");        //Replace with y

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [Solved] Generic logging system for pre-hstore using plperl triggers

2011-10-02 Thread Diego Augusto Molina
2011/9/28, Merlin Moncure mmonc...@gmail.com:

 I disagree.  unnest() and array_agg() (or, even better, array()
 constructor syntax) are an absolute joy to work with and thinking in a
 more functional way, which is usually the key to making things run
 quickly.  Also both functions are trivial to emulate in userland for
 compatibility.  Arrays of composites IIRC only go back to 8.3 so  that
 would be a true stopper for any solution in that vein.

Ok, tastes are tastes: I hate to make two or three more levels of
subqueries. Regarding arrays of composites, that would be perfectly
solved if we use no composite at all! Instead of a field with an array
of a composite of three intrinsics, use three fields, each of an
intrinsic type. See your proposal:
 create type audit_field_t as (field text, old_value text, new_value text);

Instead, in the audit table you may use:

..., field smallint[], before text[], after text[],...

Note the smallint in field: that means I really want to keep the
reference to the field table. That is for the two reasons I had
mentioned earlier (to reduce space: 2 bytes of type smallint against the
variable size of type text; and to keep track of the names being used,
too). You can also set up something like this if you like dimensions:

..., field smallint[], values text[][],...

Implying that the first dimension is the before value and the second
one is for the after value. Any of these prevents us from using
composites and makes the box a little wider and simpler. Even further,
I would like to keep the logging on demand:

..., field smallint[], is_pk boolean[], { before text[], after text[]
| values text[][] },...

You know what are the braces and pipe for...
So, at the end, we have the entire audet table inside the audit
table, as a series of arrays. We get a really compact table with only
enough data to fully log the changes which triggered the event. No
less, no more.
At this point we know querying this table will be much slower and
rotation will have to be done more frequently. If we dump/restore the
table somewhere else we will still be able to split the table into the
original two, make indexes, cluster them, and query as
desired. But this can get so complicated that maybe I should implement
a function doing all this. In any event, we are getting less
responsiveness because of this. But a couple of minutes more may not be a
problem for most cases. I'm just trying to summarize.
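
As a concrete sketch of the layout described above (column names are only
illustrative, not the ones in the attached code):

CREATE TABLE audit (
    id        bigserial  PRIMARY KEY,
    logged_at timestamp  NOT NULL DEFAULT now(),
    tablename text       NOT NULL,
    username  text       NOT NULL DEFAULT current_user,
    field     smallint[] NOT NULL,  -- ids pointing into the field-name table
    is_pk     boolean[]  NOT NULL,
    before    text[],
    after     text[]
);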

As a rule of thumb, you may need to run a cron job every night or so
to check whether 'select count(*) from audit' is bigger than X and then rotate
the table (or maybe every X days/weeks/etc.). The smaller the X, the
better the responsiveness _in_ some cases: if we know an interval in time
we will just have to dump/restore those logs. In other cases this
would not be of much help: if you need to track a tuple to the very
beginning of time, you'll have a lot of work to do
dumping/restoring (and so forth... remember to split the table,
indexing...). Still, rotation seems to be a good practice, and you can
include in the cron job the dump/restore part into another server and
then delete the old table. That would save a lot of space in your
production environment.
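
A rotation along those lines could be as simple as this (names again
illustrative):

BEGIN;
ALTER TABLE audit RENAME TO audit_2011_10;
CREATE TABLE audit (LIKE audit_2011_10 INCLUDING DEFAULTS INCLUDING INDEXES);
COMMIT;
-- then dump/restore audit_2011_10 to the archive server and drop it here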

 As for the rest of it, I'd be looking to try and come up with an all
 sql implementation.  Also you should give an honest comparison between
 what you've come up with vs. this:
 http://pgfoundry.org/projects/tablelog/.

 merlin


An all SQL implementation? Didn't we agree that's not possible in
pg <= 8.4? Then what do you mean by that?

About project tablelog, I didn't really try it, but I read its
documentation and it seems not appropriate at all for my case. First of
all, its purpose seems to be to log everything in a table to be able
to restore it later as of any time in the past. My purpose is to log
in order to run analysis. Also, it needs to create one table per logged table,
consisting of the same structure as the logged table (without
constraints) plus three, four or five columns for control (depending
on usage, four or five recommended). I have a lot of tables to log
(hundreds!) with small changes to each of them; that means
duplicating the number of tables for a few changes. Speaking of
compactness... It also logs everything, not only changed values.
It is written in C, so I assume it runs much, much faster (especially
needed for highly transactional DBs). But it's not proven to be binary
safe (which I don't remember what that means). Bugs: nothing known.

So, if you need to be able to restore your table as of any time, use
tablelog. If you need to run analysis on who did what, use my option.


Finally attaching the code!

Cheers.

-- 
Diego Augusto Molina
diegoaugustomol...@gmail.com

ES: Por favor, evite adjuntar documentos de Microsoft Office. Serán
desestimados.
EN: Please, avoid attaching Microsoft Office documents. They shall be discarded.
LINK: http://www.gnu.org/philosophy/no-word-attachments.html

		/* Created by Diego Augusto Molina in 2011 for Tucuman Government, 

[GENERAL] pg_upgrade 8.4 - 9.1 failures

2011-10-02 Thread Joseph S
I've tried the upgrade a few times, but I always get one type of failure
or another.

Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions   ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok
Checking for reg* system oid user data typesok
Checking for contrib/isn with bigint-passing mismatch   ok
Checking for large objects  warning

| Your installation contains large objects.
| The new database has an additional large object
| permission table.  After upgrading, you will be
| given a command to populate the pg_largeobject
| permission table with default permissions.

Creating catalog dump   ok

There were problems executing /usr/local/pgsql_84/bin/pg_ctl -w -l
/dev/null -D data84 stop >> /dev/null 2>&1
Failure, exiting

=

Performing Consistency Checks
-
Checking current, bin, and data directories ok
Checking cluster versions   ok
Checking database user is a superuser   ok
Checking for prepared transactions  ok
Checking for reg* system oid user data typesok
Checking for contrib/isn with bigint-passing mismatch   ok
Checking for large objects  warning

| Your installation contains large objects.
| The new database has an additional large object
| permission table.  After upgrading, you will be
| given a command to populate the pg_largeobject
| permission table with default permissions.

Creating catalog dump   ok
Checking for prepared transactions  ok
Checking for presence of required libraries ok

| If pg_upgrade fails after this point, you must
| re-initdb the new cluster before continuing.
| You will also need to remove the .old suffix
| from /home/postgres/data84/global/pg_control.old.

Performing Upgrade
--
Adding .old suffix to old global/pg_control   ok
Analyzing all rows in the new cluster   ok
Freezing all rows on the new clusterok
Deleting new commit clogs   ok
Copying old commit clogs to new server  ok
Setting next transaction id for new cluster ok
Resetting WAL archives  ok
Setting frozenxid counters in new cluster   ok
Creating databases in the new cluster   ok
Adding support functions to new cluster ok
Restoring database schema to new clusterok
Removing support functions from new cluster ok
Restoring user relation files

Mismatch of relation names: database dbname, old rel
pg_toast.pg_toast_1280475, new rel pg_toast.pg_toast_1202320
Failure, exiting

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Searching for bare letters

2011-10-02 Thread Eduardo Morras

At 01:25 02/10/2011, Reuven M. Lerner wrote:

Hi, everyone.  I'm working on a project on 
PostgreSQL 9.0 (soon to be upgraded to 9.1, 
given that we haven't yet launched).  The 
project will involve numerous text fields 
containing English, Spanish, and 
Portuguese.  Some of those text fields will be 
searchable by the user.  That's easy enough to 
do; for our purposes, I was planning to use some 
combination of LIKE searches; the database is 
small enough that this doesn't take very much 
time, and we don't expect the number of 
searchable records (or columns within those records) to be all that large.


The thing is, the people running the site want 
searches to work on what I'm calling (for lack 
of a better term) bare letters.  That is, if 
the user searches for n, then the search 
should also match Spanish words containing 
ñ.  I'm told by Spanish-speaking members of 
the team that this is how they would expect 
searches to work.  However, when I just did a 
quick test using a UTF-8 encoded 9.0 database, I 
found that PostgreSQL didn't  see the two 
characters as identical.  (I must say, this is 
the behavior that I would have expected, had the 
Spanish-speaking team member not said anything on the subject.)


So my question is whether I can somehow wrangle 
PostgreSQL into thinking that n and ñ are 
the same character for search purposes, or if I 
need to do something else -- use regexps, keep a 
naked, searchable version of each column 
alongside the native one, or something else entirely -- to get this to work.


Any ideas?


You can use phonetic hashing for that. There
are multiple algorithms; some of them can be tuned for specific languages.


See this documentation:

http://en.wikipedia.org/wiki/Phonetic_algorithm for a general description,

http://en.wikipedia.org/wiki/Soundex is the first one developed, very old,

http://en.wikipedia.org/wiki/Metaphone is a 
family of several modern algorithms.


Remember that they are hashing algorithms: some
words can collide because they have the same pronunciation but are written differently.
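
In PostgreSQL, the contrib module fuzzystrmatch ships soundex() and
metaphone(); a quick sketch (the table and column are made up, and whether
these algorithms behave sensibly for Spanish or Portuguese names is something
you would have to test):

CREATE EXTENSION fuzzystrmatch;   -- 9.1; on 9.0, run contrib's fuzzystrmatch.sql
SELECT name FROM people
 WHERE soundex(name) = soundex('pena');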


I remember that the DataparkSearch engine uses them
with dictionaries. You can check it too.


http://www.dataparksearch.org/


Thanks,

Reuven


HTH 




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_upgrade 8.4 - 9.1 failures

2011-10-02 Thread Peter Eisentraut
On sön, 2011-10-02 at 15:45 -0400, Joseph S wrote:
 Mismatch of relation names: database dbname, old rel
 pg_toast.pg_toast_1280475, new rel pg_toast.pg_toast_1202320
 Failure, exiting

This will be fixed in 9.1.2 (or get the code already from git).


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general