date:20071112

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Pavel Stehule

On 13/11/2007, Simon Riggs <[EMAIL PROTECTED]> wrote:
> On Mon, 2007-11-12 at 23:03 -0500, Bruce Momjian wrote:
> > Simon Riggs wrote:
> > > On Mon, 2007-11-12 at 11:56 -0500, Tom Lane wrote:
> > > > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > > > So we end up with a normal sounding function that is overloaded to
> > > > > provide all of the various goodies.
> > > >
> > > > As best I can tell, @@ does exactly this already.  This is just a
> > > > different spelling of the same capability, and I don't actually
> > > > find it better.  Why is "text_search(x,y)" better than "x @@ y"?
> > > > We don't recommend that people write "texteq(x,y)" instead of
> > > > "x = y".
> > >
> > > Most people don't understand those differences. x = y means "make sure
> > > they are the same" to most people. They don't see what you (and I) see:
> > > function and operator interchangeability. So text_search() is better
> > > than @@ and = is better than texteq(). Life ain't neat...
> > >
> > > Right now, Full Text Search SQL looks like complete gibberish and it
> > > dissuades many people from using what is an awesome set of features. I
> > > just want to add a little sugar to help people get started.
> >
> > I realized this when editing the documentation but not clearly.  I
> > noticed that:
> >
> >   
> > http://momjian.us/main/writings/pgsql/sgml/textsearch-intro.html#TEXTSEARCH-MATCHING
> >
> >   tsvector @@ tsquery
> >   tsquery  @@ tsvector
> >   text @@ tsquery
> >   text @@ text
> >
> >   The first two of these we saw already. The form text @@ tsquery  is
> >   equivalent to to_tsvector(x) @@ y. The form text @@ text  is 
> > equivalent
> >   to to_tsvector(x) @@ plainto_tsquery(y).
> >
> > was quite odd, especially the "text @@ text" case, and in fact it makes
> > casting almost required unless you can remember which one is a query and
> > which is a vector (hint, the vector is first).  What really adds to the
> > confusion is that the operator is two _identical_ characters, meaning
> > the operator is symetric, and it behave symetric if you cast one side,
> > but as vector @@ query if you don't.
>
> I'm thinking we can have an inlinable function
>
> contains(text, text) returns int
>
> Return values limited to just 0 or 1 or NULL, as with SQL/MM.
> It's close to SQL/MM, but not exact.
>
> contains(sourceText, searchText) is a macro for
>
> case to_tsvector(default_text_search_config, sourceText) @@
> to_tsquery(default_text_search_config, searchText)
> when true then 1
> when false then 0
> else null
> end
>

it's look well.

Pavel

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Simon Riggs

On Mon, 2007-11-12 at 23:03 -0500, Bruce Momjian wrote:
> Simon Riggs wrote:
> > On Mon, 2007-11-12 at 11:56 -0500, Tom Lane wrote:
> > > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > > So we end up with a normal sounding function that is overloaded to
> > > > provide all of the various goodies.
> > > 
> > > As best I can tell, @@ does exactly this already.  This is just a
> > > different spelling of the same capability, and I don't actually
> > > find it better.  Why is "text_search(x,y)" better than "x @@ y"?
> > > We don't recommend that people write "texteq(x,y)" instead of
> > > "x = y".
> > 
> > Most people don't understand those differences. x = y means "make sure
> > they are the same" to most people. They don't see what you (and I) see:
> > function and operator interchangeability. So text_search() is better
> > than @@ and = is better than texteq(). Life ain't neat...
> > 
> > Right now, Full Text Search SQL looks like complete gibberish and it
> > dissuades many people from using what is an awesome set of features. I
> > just want to add a little sugar to help people get started.
> 
> I realized this when editing the documentation but not clearly.  I
> noticed that:
> 
>   
> http://momjian.us/main/writings/pgsql/sgml/textsearch-intro.html#TEXTSEARCH-MATCHING
> 
>   tsvector @@ tsquery
>   tsquery  @@ tsvector
>   text @@ tsquery
>   text @@ text
> 
>   The first two of these we saw already. The form text @@ tsquery  is
>   equivalent to to_tsvector(x) @@ y. The form text @@ text  is equivalent
>   to to_tsvector(x) @@ plainto_tsquery(y).
> 
> was quite odd, especially the "text @@ text" case, and in fact it makes
> casting almost required unless you can remember which one is a query and
> which is a vector (hint, the vector is first).  What really adds to the
> confusion is that the operator is two _identical_ characters, meaning
> the operator is symetric, and it behave symetric if you cast one side,
> but as vector @@ query if you don't.

I'm thinking we can have an inlinable function

contains(text, text) returns int 

Return values limited to just 0 or 1 or NULL, as with SQL/MM.
It's close to SQL/MM, but not exact.

contains(sourceText, searchText) is a macro for

case to_tsvector(default_text_search_config, sourceText) @@
to_tsquery(default_text_search_config, searchText)
when true then 1
when false then 0
else null
end

that allows us to write indexable queries like this

WHERE contains(sourceText, searchText) > 0

where we must still have built the index on a constant config.
Not checked that still works yet, maybe not, in which case something
slightly more complex to make sure its still indexable. This is the
difficult part.

So changes are:
- add SQL function
- simplify first 2 pages of docs using this function

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Christopher Browne

Quoth [EMAIL PROTECTED] (Tom Lane):
> Devrim =?ISO-8859-1?Q?G=DCND=DCZ?= <[EMAIL PROTECTED]> writes:
>> So, IMHO, saying "trust your OS + PostgreSQL" is not a 100% perfect
>> approach for the people who are asking to keep their objects on RAM,
>> even though I know that there is nothing we can say right now.
>
> Well, nothing is a 100% solution.  But my opinion is that people who
> think they are smarter than an LRU caching algorithm are typically
> mistaken.  If the table is all that heavily used, it will stay in memory
> just fine.  If it's not sufficiently heavily used to stay in memory
> according to an LRU algorithm, maybe the memory space really should be
> spent on something else.
>
> Now there are certainly cases where a standard caching algorithm falls
> down --- the main one I can think of offhand is where you would like to
> give one class of queries higher priority than another, and so memory
> space should preferentially go to tables that are needed by the first
> class.  But if that's your problem, "pin these tables in memory" is
> still an awfully crude solution to the problem.  I'd be inclined to
> think instead about a scheme that lets references made by
> higher-priority queries bump buffers' use-counts by more than 1,
> or some other way of making the priority considerations visible to an
> automatic cache management algorithm.

Something I found *really* interesting was that whenever we pushed any
"high traffic" systems onto PostgreSQL 8.1, I kept seeing measurable
performance improvements taking place every day for a week.

Evidently, it took that long for cache to *truly* settle down.

Given that, and given that we've gotten a couple of good steps *more*
sophisticated than mere LRU, I'm fairly willing to go pretty far down
the "trust the shared memory cache" road.

The scenario described certainly warrants doing some benchmarking; it
warrants analyzing the state of the internal buffers over a period of
time to see what is actually in them.

If, after a reasonable period of time (that includes some variations
in system load), a reasonable portion (or perhaps the entirety) of the
Essential Table has consistently resided in buffers, then that should
be pretty decent evidence that cacheing is working the way it should.
-- 
output = ("cbbrowne" "@" "gmail.com")
http://linuxdatabases.info/info/slony.html
A Plateau is the highest form of flattery.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Christopher Browne

In an attempt to throw the authorities off his trail, [EMAIL PROTECTED] 
("Joshua D. Drake") transmitted:
> Andrew Dunstan wrote:
>>> I give this a +/- 1. Yes extremely heavy websites can do this
>>> *but* they require extremely expensive hardware to do so.
>>>
>> I expect extremely heavy websites to require extremely expensive
>> equipment regardless of the software they use. Cost was not the
>> issue raised by the OP.
>
> Cost is always an issue, even if implicit. If the person is so hung
> up on the idea of pushing things into ram there is a pretty good
> possibility they have priced out the 50 and 100 spindle devices
> needed to get the same type of performance.

I dunno; I had a chat about cacheing strategies today where it became
clear to me that when we migrate to 8.3, we'll need to re-examine
things because there has been *so* much change since some of our
present policy was created back in the 7.2 days.

(Pointedly, one of the reasons to want a separate cache DB was to cut
down on XID consumption by read-only processes, and that reason
evaporates in 8.3.)

I have seen enough naive analyses done that I wouldn't be inclined to
assume much of anything.

People can get mighty self-assured about things that they have heard,
whether those things have validity or not.  Few things can get as
badly wrong as bad assumptions made about performance...
-- 
(reverse (concatenate 'string "moc.liamg" "@" "enworbbc"))
http://linuxdatabases.info/info/x.html
"When we write programs that "learn", it turns out that we do and they
don't." -- Alan Perlis

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Luke Lonergan

Vacuum is a better thing to run, much less CPU usage.

- Luke

Msg is shrt cuz m on ma treo

 -Original Message-
From:   Greg Smith [mailto:[EMAIL PROTECTED]
Sent:   Monday, November 12, 2007 11:59 PM Eastern Standard Time
To: Alex Drobychev
Cc: pgsql-hackers@postgresql.org
Subject:Re: [HACKERS] How to keep a table in memory?

On Mon, 12 Nov 2007, Alex Drobychev wrote:

> Or any other ideas for "pinning" a table in memory?

If the table you're worried about is only 20MB, have you considered just 
running something regularly that touches the whole thing?  This may be the 
only time I've ever considered running "select count(*) from x" as a 
productive move.  That would waste some CPU, but it would help those pages 
"win the eviction war" as you say.

You definately should follow-up on the suggestion given to look at the 
pg_buffercache contrib module to get a better idea what's going on under 
the LRU hood.  In fact, you may want to install a tweak that's standard in 
8.3 to show the usage counts in order to better get a feel for what's 
going on; the appendix on my article at 
http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm goes 
into this a bit, with the documentation to pg_buffercache having the rest 
of what you'd need.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Greg Smith


On Mon, 12 Nov 2007, Alex Drobychev wrote:


Or any other ideas for "pinning" a table in memory?


If the table you're worried about is only 20MB, have you considered just 
running something regularly that touches the whole thing?  This may be the 
only time I've ever considered running "select count(*) from x" as a 
productive move.  That would waste some CPU, but it would help those pages 
"win the eviction war" as you say.


You definately should follow-up on the suggestion given to look at the 
pg_buffercache contrib module to get a better idea what's going on under 
the LRU hood.  In fact, you may want to install a tweak that's standard in 
8.3 to show the usage counts in order to better get a feel for what's 
going on; the appendix on my article at 
http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm goes 
into this a bit, with the documentation to pg_buffercache having the rest 
of what you'd need.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 6: explain analyze is your friend

[HACKERS] Re: [COMMITTERS] pgsql: Adjust script to be consistent (thanks Tom for the fix).

2007-11-12 Thread Bruce Momjian

Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> Pray tell, why do you think it's a good idea to not have version labels
> >> in the contrib scripts?
> 
> > If we want them, they should be in all of them.
> 
> +1 for putting the label in all of them --- these files end up "loose"
> in user installations, so having a clear version identity on them seems
> like a Real Good Idea.
> 
> > I am worried slightly
> > that a version label will make the SET search_path harder to find, though.
> 
> The documentation already says "first command", and there is already
> one comment in front of that.  I don't think having two comments in
> front of it will deter anyone smart enough to be installing contrib
> modules into nonstandard schemas.

Agreed.  Done.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Bruce Momjian

Simon Riggs wrote:
> On Mon, 2007-11-12 at 11:56 -0500, Tom Lane wrote:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > So we end up with a normal sounding function that is overloaded to
> > > provide all of the various goodies.
> > 
> > As best I can tell, @@ does exactly this already.  This is just a
> > different spelling of the same capability, and I don't actually
> > find it better.  Why is "text_search(x,y)" better than "x @@ y"?
> > We don't recommend that people write "texteq(x,y)" instead of
> > "x = y".
> 
> Most people don't understand those differences. x = y means "make sure
> they are the same" to most people. They don't see what you (and I) see:
> function and operator interchangeability. So text_search() is better
> than @@ and = is better than texteq(). Life ain't neat...
> 
> Right now, Full Text Search SQL looks like complete gibberish and it
> dissuades many people from using what is an awesome set of features. I
> just want to add a little sugar to help people get started.

I realized this when editing the documentation but not clearly.  I
noticed that:

http://momjian.us/main/writings/pgsql/sgml/textsearch-intro.html#TEXTSEARCH-MATCHING

tsvector @@ tsquery
tsquery  @@ tsvector
text @@ tsquery
text @@ text

The first two of these we saw already. The form text @@ tsquery  is
equivalent to to_tsvector(x) @@ y. The form text @@ text  is equivalent
to to_tsvector(x) @@ plainto_tsquery(y).

was quite odd, especially the "text @@ text" case, and in fact it makes
casting almost required unless you can remember which one is a query and
which is a vector (hint, the vector is first).  What really adds to the
confusion is that the operator is two _identical_ characters, meaning
the operator is symetric, and it behave symetric if you cast one side,
but as vector @@ query if you don't.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Tom Lane

Devrim =?ISO-8859-1?Q?G=DCND=DCZ?= <[EMAIL PROTECTED]> writes:
> So, IMHO, saying "trust your OS + PostgreSQL" is not a 100% perfect
> approach for the people who are asking to keep their objects on RAM,
> even though I know that there is nothing we can say right now.

Well, nothing is a 100% solution.  But my opinion is that people who
think they are smarter than an LRU caching algorithm are typically
mistaken.  If the table is all that heavily used, it will stay in memory
just fine.  If it's not sufficiently heavily used to stay in memory
according to an LRU algorithm, maybe the memory space really should be
spent on something else.

Now there are certainly cases where a standard caching algorithm falls
down --- the main one I can think of offhand is where you would like to
give one class of queries higher priority than another, and so memory
space should preferentially go to tables that are needed by the first
class.  But if that's your problem, "pin these tables in memory" is
still an awfully crude solution to the problem.  I'd be inclined to
think instead about a scheme that lets references made by
higher-priority queries bump buffers' use-counts by more than 1,
or some other way of making the priority considerations visible to an
automatic cache management algorithm.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Devrim GÜNDÜZ

Hi,

On Mon, 2007-11-12 at 09:12 +, Heikki Linnakangas wrote:

> Just leave it to the cache management algorithms in Postgres and
> Linux.  If it really is frequently accessed, it should stay in
> Postgres shared buffers.

How is "frequently accessed" determined by PostgreSQL?

I mean... You know, OS caches either inodes, or pages. Page caches are
pretty ignorable, since it means the data is already in virtual memory.
So, we have inode caching, and IIRC it results in i/o requests from the
disk -- and sure, it uses i/o scheduler of the kernel (like the all of
the applications running on that machine -- including a basic login
session). *If* the data hadn't been deleted, it returns from i/o
scheduler.

So there is no 100% guarantee that the table is in the memory. If we
could use the ram (some (or a :) ) database(s) can do that IIRC),  we
will avoid i/o scheduler, which will really speed up the process. (Ok,
AFAIK, you can "pin" your objects to memory with Oracle).

... and one more thing with ramfs: Since there is a fs on ramfs, it
passes through VFS -- and goes through kernel schedulers again.

So, IMHO, saying "trust your OS + PostgreSQL" is not a 100% perfect
approach for the people who are asking to keep their objects on RAM,
even though I know that there is nothing we can say right now.

Regards,
-- 
Devrim GÜNDÜZ , RHCE
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/

signature.asc
Description: This is a digitally signed message part

Re: [HACKERS] [COMMITTERS] pgsql: Adjust script to be consistent (thanks Tom for the fix).

2007-11-12 Thread Tom Lane

Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> Pray tell, why do you think it's a good idea to not have version labels
>> in the contrib scripts?

> If we want them, they should be in all of them.

+1 for putting the label in all of them --- these files end up "loose"
in user installations, so having a clear version identity on them seems
like a Real Good Idea.

> I am worried slightly
> that a version label will make the SET search_path harder to find, though.

The documentation already says "first command", and there is already
one comment in front of that.  I don't think having two comments in
front of it will deter anyone smart enough to be installing contrib
modules into nonstandard schemas.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Joshua D. Drake


Andrew Dunstan wrote:


I give this a +/- 1. Yes extremely heavy websites can do this *but* 
they require extremely expensive hardware to do so.





I expect extremely heavy websites to require extremely expensive 
equipment regardless of the software they use. Cost was not the issue 
raised by the OP.


Cost is always an issue, even if implicit. If the person is so hung up 
on the idea of pushing things into ram there is a pretty good 
possibility they have priced out the 50 and 100 spindle devices needed 
to get the same type of performance.


Sincerely,

Joshua D. Drake




cheers

andrew





---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Joshua D. Drake


Devrim GÜNDÜZ wrote:

Hi,

On Mon, 2007-11-12 at 21:15 -0500, Robert Treat wrote:

2. you should investigate one or more of: pg_memcache, solid state
disk.

you might also consider creating a tablespace on tmpfs or ramfs or
something like pramfs


IIRC, ramfs are not that good for database use: If you want to extend
its size, you have to stop the database instance -- which is not
considered good.


Well, depending on the size you could push the table to another table 
space, drop the old table space, resize the ramfs, and reverse the 
previous :)


Joshua D. Drake



Regards,



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Andrew Dunstan




Joshua D. Drake wrote:



FYI, Postgres is know to be used successfully on some *extremely* heavy
websites, without using tables pinned in memory.



+1


I give this a +/- 1. Yes extremely heavy websites can do this *but* 
they require extremely expensive hardware to do so.





I expect extremely heavy websites to require extremely expensive 
equipment regardless of the software they use. Cost was not the issue 
raised by the OP.


cheers

andrew





---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Devrim GÜNDÜZ

Hi,

On Mon, 2007-11-12 at 21:15 -0500, Robert Treat wrote:
> > 2. you should investigate one or more of: pg_memcache, solid state
> > disk.
> 
> you might also consider creating a tablespace on tmpfs or ramfs or
> something like pramfs

IIRC, ramfs are not that good for database use: If you want to extend
its size, you have to stop the database instance -- which is not
considered good.

Regards,
-- 
Devrim GÜNDÜZ , RHCE
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/


signature.asc
Description: This is a digitally signed message part

[HACKERS] Re: [COMMITTERS] pgsql: Adjust script to be consistent (thanks Tom for the fix).

2007-11-12 Thread Bruce Momjian

Tom Lane wrote:
> [EMAIL PROTECTED] (Bruce Momjian) writes:
> > pgsql/contrib/uuid-ossp:
> > uuid-ossp.sql.in (r1.4 -> r1.5)
> > 
> > (http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/uuid-ossp/uuid-ossp.sql.in?r1=1.4&r2=1.5)
> 
> Pray tell, why do you think it's a good idea to not have version labels
> in the contrib scripts?

If we want them, they should be in all of them.  I am worried slightly
that a version label will make the SET search_path harder to find, though.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Joshua D. Drake


Robert Treat wrote:

On Monday 12 November 2007 18:31, Andrew Dunstan wrote:

1. when someone replies to your post at the bottom, please don't put
your reply at the top. It makes everything totally unreadable.



+1


2. you should investigate one or more of: pg_memcache, solid state disk.



you might also consider creating a tablespace on tmpfs or ramfs or something 
like pramfs



FYI, Postgres is know to be used successfully on some *extremely* heavy
websites, without using tables pinned in memory.



+1


I give this a +/- 1. Yes extremely heavy websites can do this *but* they 
require extremely expensive hardware to do so.


Joshua D. Drake



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [HACKERS] [COMMITTERS] pgsql: Adjust script to be consistent (thanks Tom for the fix).

2007-11-12 Thread Tom Lane

[EMAIL PROTECTED] (Bruce Momjian) writes:
> pgsql/contrib/uuid-ossp:
> uuid-ossp.sql.in (r1.4 -> r1.5)
> 
> (http://developer.postgresql.org/cvsweb.cgi/pgsql/contrib/uuid-ossp/uuid-ossp.sql.in?r1=1.4&r2=1.5)

Pray tell, why do you think it's a good idea to not have version labels
in the contrib scripts?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Robert Treat

On Monday 12 November 2007 18:31, Andrew Dunstan wrote:
> 1. when someone replies to your post at the bottom, please don't put
> your reply at the top. It makes everything totally unreadable.
>

+1

> 2. you should investigate one or more of: pg_memcache, solid state disk.
>

you might also consider creating a tablespace on tmpfs or ramfs or something 
like pramfs

> FYI, Postgres is know to be used successfully on some *extremely* heavy
> websites, without using tables pinned in memory.
>

+1

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Problem to configure pg8.3b2 w/ ossp-uuid-support on OS X

2007-11-12 Thread Tom Lane

J=?ISO-8859-1?B?9g==?=rg Beyer <[EMAIL PROTECTED]> writes:
> To be precise, for  uuid  I have:
>   /usr/bin/uuidgen  [the CL tool, which should be of no interest here]
>   /usr/include/uuid/uuid.h

> and that's the same as you have. Nothing else uuid-related in /usr.
> And yes, AFAICT, uuid-functionalities (a) are part of libc, (b) appear to be
> DCE compatible. 

> I have installed _ossp-uuid_ with --prefix=usr/local, so my question is not,
> why pg's configure script ignores OS X's uuid stuff. I'm looking for a way
> to let configure recognize the ossp-uuid library.

OK, I tried the same experiment of installing OSSP uuid 1.6.0 from
source, with no options except --prefix=/usr/local (which I think is
the default anyway).  What I find is that it installs
/usr/local/include/uuid.h
/usr/local/lib/libuuid.dylib(.so to non-Mac'ers)
So our configure/build process is indeed broken, in that it assumes uuid
is always installed as libossp-uuid.so.  I've committed a fix for that.

Note that the original thought about needing --with-includes and/or
--with-libraries is a red herring: at least on OS X with gcc, these
directories are searched anyway.  After fixing the library-name
bug, I was able to build fine with just configure --with-ossp-uuid.

The other part of the problem is that you're seeing a configure warning
that uuid.h doesn't compile.  This is only cosmetic, because it works
fine when you actually go to build the contrib module.  The reason for
the warning is that uuid.h tries to forestall a conflict with built-in
definitions of uuid_t by doing this:

/* workaround conflicts with system headers */
#define uuid_t   __vendor_uuid_t
#define uuid_create  __vendor_uuid_create
#define uuid_compare __vendor_uuid_compare
#include 
#include 
#undef  uuid_t
#undef  uuid_create
#undef  uuid_compare

But that only works if  and/or  haven't already
been included in the current compilation ... and in configure's test
program,  *has* been included.  Darwin puts a non-compatible
definition of typedef uuid_t in , so it goes boom.

Now we can live with this as long as  isn't included
automatically by c.h, which seems to be true except on some old Sun
platforms that probably don't define uuid_t anyway.  But it's clearly
trouble waiting to happen, and even more so for other users of libuuid.
IMHO this is a bug in libuuid: they need to find some
include-order-independent way of avoiding their conflict with system
headers.  I don't care enough about it to file a bug report, but maybe
someone else does.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Andrew Dunstan

Alex Drobychev wrote:

Hi Heikki,

Thanks for the response!

I understand that relying on cache management would be the easiest 
solution. However, I had a similar issue with other RDBMS (MSSQL, to 
be specific) in the past and observed a lot of disk activity until the 
table was pinned in memory (fortunately MSSQL has 'dbcc pintable' for 
that).

Basically, this is all about a high-traffic website, where virtually 
_all_ data in the DB get accessed frequently - so it's not obvious 
which DB pages are going to win the eviction war. However, the overall 
cost of access is different for different tables - for the table in 
question it very well may ~20 disk seeks per webpage view, so very 
high cache hit rate (ideally 100%) has to be assured.

So - will the 'mlock' hack work? Or any other ideas for "pinning" a 
table in memory?

- Alex

*/Heikki Linnakangas <[EMAIL PROTECTED]>/* wrote:

adrobj wrote:
> I have a pretty small table (~20MB) that is accessed very
frequently and
> randomly, so I want to make sure it's 100% in memory all the
time. There is
> a lot of other staff that's also gets accessed frequently, so I
don't want
> to just hope that Linux file cache would do the right thing for me.
>
> Is there any way to do that?
>
> One idea I have in my head is to start a process that does
mmap() and
> mlock() with the table file. Will it work? If so, are there any
potential
> problems?

Just leave it to the cache management algorithms in Postgres and
Linux.
If it really is frequently accessed, it should stay in Postgres
shared
buffers.

You can use the pg_buffercache contrib module to see what's in cache.

1. when someone replies to your post at the bottom, please don't put 
your reply at the top. It makes everything totally unreadable.

2. you should investigate one or more of: pg_memcache, solid state disk.

FYI, Postgres is know to be used successfully on some *extremely* heavy 
websites, without using tables pinned in memory.

cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Alex Drobychev

Hi Heikki,

  Thanks for the response!

  I understand that relying on cache management would be the easiest solution. 
However, I had a similar issue with other RDBMS (MSSQL, to be specific) in the 
past and observed a lot of disk activity until the table was pinned in memory 
(fortunately MSSQL has 'dbcc pintable' for that).

  Basically, this is all about a high-traffic website, where virtually _all_ 
data in the DB get accessed frequently - so it's not obvious which DB pages are 
going to win the eviction war. However, the overall cost of access is different 
for different tables - for the table in question it very well may ~20 disk 
seeks per webpage view, so very high cache hit rate (ideally 100%) has to be 
assured.

  So - will the 'mlock' hack work? Or any other ideas for "pinning" a table in 
memory?

  - Alex

Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
  adrobj wrote:
> I have a pretty small table (~20MB) that is accessed very frequently and
> randomly, so I want to make sure it's 100% in memory all the time. There is
> a lot of other staff that's also gets accessed frequently, so I don't want
> to just hope that Linux file cache would do the right thing for me.
> 
> Is there any way to do that?
> 
> One idea I have in my head is to start a process that does mmap() and
> mlock() with the table file. Will it work? If so, are there any potential
> problems?

Just leave it to the cache management algorithms in Postgres and Linux. 
If it really is frequently accessed, it should stay in Postgres shared 
buffers.

You can use the pg_buffercache contrib module to see what's in cache.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

-
Never miss a thing.   Make Yahoo your homepage.

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Pavel Stehule

On 12/11/2007, Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> Pavel Stehule escribió:
> > Hello
> >
> > look to standard, please. SQL/MM has part - full text.
>
> Huh, what version of the standard is this?  My copy (the typical 2003
> draft) doesn't have SQL/MM AFAICS.
>
>

I found

http://jtc1sc32.org/doc/N0751-0800/32N0771T.pdf
http://www.sigmod.org/record/issues/0112/standards.pdf
http://dbs.uni-leipzig.de/file/kap5.pdf

Pavel


> --
> Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
> "La espina, desde que nace, ya pincha" (Proverbio africano)
>

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Alvaro Herrera

Pavel Stehule escribió:
> Hello
> 
> look to standard, please. SQL/MM has part - full text.

Huh, what version of the standard is this?  My copy (the typical 2003
draft) doesn't have SQL/MM AFAICS.


-- 
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
"La espina, desde que nace, ya pincha" (Proverbio africano)

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Simon Riggs

On Mon, 2007-11-12 at 21:59 +0100, Pavel Stehule wrote:

> SELECT  docno
> FROM information
> WHERE document.CONTAINS
>  ('STEMMED FORM OF "standard"
>  IN SAME PARAGRAPH AS
>  SOUNDS LIKE "sequel"') = 1
> 
> it's little bit baroque, It's sample of method.

Seems thats the way Oracle does it too.

The SQLServer syntax is 

WHERE contains(text_column, search_query)

which seems marginally better.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Martijn van Oosterhout

On Mon, Nov 12, 2007 at 03:44:18PM -0500, Aidan Van Dyk wrote:
> Can LIKE be easily overloaded in the parser?  So:
>   text LIKE text
> works in it's current form, and
>   tsvector LIKE tsquery
> also works like the @@?  Or have I gotten all the ts* types all mixed up
> again...

AIUI LIKE is mashed into an operator at parse time, so yes, if you
create the operator with the right name it will just work.

Or not (I havn't tested it).

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution 
> inevitable.
>  -- John F Kennedy


signature.asc
Description: Digital signature

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Joshua D. Drake


Simon Riggs wrote:

On Mon, 2007-11-12 at 20:17 +, Heikki Linnakangas wrote:

Granted, @@ is a bit awkward until you get used to it. "x LIKE y" would 
read out better, but unfortunately that's already taken ;-).


Remember, I'm not suggesting we get rid of @@


In any case, it's way too late.


I'm suggesting we add a couple of simple SQL functions that will help
text search docs be more easily understood.

It's beta and its valid to respond to usability issues just as we would
respond to code bugs. Otherwise, why have beta? Late, but not too late. 


SQLServer, Oracle and MySQL all use functions, not operators. My
observation would be that we have the hardest and most difficult to
understand full text search capability. The Contains() function seems
like a better name than I gave earlier also.

I love what we've done; I just want more people be able to use it.


Hmmm, my choices are:

SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector @@ 'cat & 
rat'::tsquery;


Or:

SELECT ts_match('a fat cat sat on a mat and ate a fat rat','cat & rat');

This seems a little too much like the "duh" department to ignore. A set 
of SQL functions would certainly be appropriate here.


Sincerely,

Joshua D. Drake






---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Pavel Stehule

Hello

look to standard, please. SQL/MM has part - full text.

SELECT  docno
FROM information
WHERE document.CONTAINS
 ('STEMMED FORM OF "standard"
 IN SAME PARAGRAPH AS
 SOUNDS LIKE "sequel"') = 1

it's little bit baroque, It's sample of method.

So,it can be:

SELECT ..
   FROM x.contains(y);

It's well readable and elegant too.

Regards
Pavel Stehule

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Simon Riggs

On Mon, 2007-11-12 at 20:17 +, Heikki Linnakangas wrote:

> Granted, @@ is a bit awkward until you get used to it. "x LIKE y" would 
> read out better, but unfortunately that's already taken ;-).

Remember, I'm not suggesting we get rid of @@

> In any case, it's way too late.

I'm suggesting we add a couple of simple SQL functions that will help
text search docs be more easily understood.

It's beta and its valid to respond to usability issues just as we would
respond to code bugs. Otherwise, why have beta? Late, but not too late. 

SQLServer, Oracle and MySQL all use functions, not operators. My
observation would be that we have the hardest and most difficult to
understand full text search capability. The Contains() function seems
like a better name than I gave earlier also.

I love what we've done; I just want more people be able to use it.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Aidan Van Dyk

* Heikki Linnakangas <[EMAIL PROTECTED]> [071112 15:18]:
> Simon Riggs wrote:

> >Right now, Full Text Search SQL looks like complete gibberish and it
> >dissuades many people from using what is an awesome set of features. I
> >just want to add a little sugar to help people get started.

> Granted, @@ is a bit awkward until you get used to it. "x LIKE y" would 
> read out better, but unfortunately that's already taken ;-).

Can LIKE be easily overloaded in the parser?  So:
text LIKE text
works in it's current form, and
tsvector LIKE tsquery
also works like the @@?  Or have I gotten all the ts* types all mixed up
again...

But it doesn't buy anything except avoiding the "@@" that people seem to not
grok easily, and it might actually cause more grief, because of people
confusing the 2 forms of LIKE.

a.

-- 
Aidan Van Dyk Create like a god,
[EMAIL PROTECTED]   command like a king,
http://www.highrise.ca/   work like a slave.

signature.asc
Description: Digital signature

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Alvaro Herrera

Heikki Linnakangas wrote:

> Granted, @@ is a bit awkward until you get used to it. "x LIKE y" would 
> read out better, but unfortunately that's already taken ;-).

Actually LIKE does not make much sense when you have 'hay & needle'.
Probably MATCHES would be a better term ... but then, MySQL defines a
strange thing called MATCH/AGAINST; so apparently you use "MATCH (column
list) AGAINST (pattern spec)"

None of this is standard though ...

-- 
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
"Everybody understands Mickey Mouse. Few understand Hermann Hesse.
Hardly anybody understands Einstein. And nobody understands Emperor Norton."

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Proposal: Select ... AS OF Savepoint

2007-11-12 Thread Andrew Dunstan




Alvaro Herrera wrote:

Gregory Stark escribió:
  

"Decibel!" <[EMAIL PROTECTED]> writes:



On Nov 2, 2007, at 11:29 AM, Tom Lane wrote:
  

[ splorfff... ]  The grammar support alone will cost ten times that.


When next we meet, expect me to ask you how that's pronounced. ;)
  

I think it can only be properly pronounced with a mug of coffee



Where is the content of the mug expected to be, by the time you reach
the ellipsis?

  


exiting the nose?


cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] [GENERAL] plperl and regexps with accented characters - incompatible?

2007-11-12 Thread Andrew Dunstan




Andrew Dunstan wrote:



Ugh, in testing I see some nastiness here without any explicit 
require. It looks like there's an implicit require if the text 
contains certain chars. I'll see what I can do to fix the bug, 
although I'm not sure if it's possible.





Looks like it's going to be very hard, unless someone has some brilliant 
insight I'm missing :-(


Maybe we need to consult the perl coders.

cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Andrew Sullivan

On Mon, Nov 12, 2007 at 08:09:48PM +, Simon Riggs wrote:
> 
> @@ would still exist, so no problems. These additions are for new users,
> not old ones.

Given that this is all sugar on top of tsearch anyway, why not put it in
pgfoundry as the tsearch_sugar project?  Then packagers could include a
standard set of such sugar if they wanted.

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Heikki Linnakangas


Simon Riggs wrote:

On Mon, 2007-11-12 at 11:56 -0500, Tom Lane wrote:

Simon Riggs <[EMAIL PROTECTED]> writes:

So we end up with a normal sounding function that is overloaded to
provide all of the various goodies.

As best I can tell, @@ does exactly this already.  This is just a
different spelling of the same capability, and I don't actually
find it better.  Why is "text_search(x,y)" better than "x @@ y"?
We don't recommend that people write "texteq(x,y)" instead of
"x = y".


Most people don't understand those differences. x = y means "make sure
they are the same" to most people. They don't see what you (and I) see:
function and operator interchangeability. So text_search() is better
than @@ and = is better than texteq(). Life ain't neat...

Right now, Full Text Search SQL looks like complete gibberish and it
dissuades many people from using what is an awesome set of features. I
just want to add a little sugar to help people get started.


Granted, @@ is a bit awkward until you get used to it. "x LIKE y" would 
read out better, but unfortunately that's already taken ;-).


In any case, it's way too late.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Simon Riggs

On Mon, 2007-11-12 at 11:56 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > So we end up with a normal sounding function that is overloaded to
> > provide all of the various goodies.
> 
> As best I can tell, @@ does exactly this already.  This is just a
> different spelling of the same capability, and I don't actually
> find it better.  Why is "text_search(x,y)" better than "x @@ y"?
> We don't recommend that people write "texteq(x,y)" instead of
> "x = y".

Most people don't understand those differences. x = y means "make sure
they are the same" to most people. They don't see what you (and I) see:
function and operator interchangeability. So text_search() is better
than @@ and = is better than texteq(). Life ain't neat...

Right now, Full Text Search SQL looks like complete gibberish and it
dissuades many people from using what is an awesome set of features. I
just want to add a little sugar to help people get started.

> > Sound good?
> 
> It's not an improvement

That is the very point of debate

> it's not compatible with what existing tsearch2
> users are accustomed to

@@ would still exist, so no problems. These additions are for new users,
not old ones.

> it's several months too late...

True. I wish I'd thought of it before. I've waded through the syntax
without thinking how to make it more easily readable and explainable.
Damn.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Proposal: Select ... AS OF Savepoint

2007-11-12 Thread Kevin Grittner

>>> On Mon, Nov 12, 2007 at  1:16 PM, in message
<[EMAIL PROTECTED]>, Alvaro Herrera
<[EMAIL PROTECTED]> wrote: 
> Gregory Stark escribió:
>> "Decibel!" <[EMAIL PROTECTED]> writes:
>> 
>> > On Nov 2, 2007, at 11:29 AM, Tom Lane wrote:
>> >> [ splorfff... ]  The grammar support alone will cost ten times
that.
>> >
>> > When next we meet, expect me to ask you how that's pronounced. ;)
>> 
>> I think it can only be properly pronounced with a mug of coffee
> 
> Where is the content of the mug expected to be, by the time you
reach
> the ellipsis?
 
I believe that about one mouthful of the contents of the mug are
expected to be distributed across  the desktop, keyboard, and
monitor at that point.
 


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Proposal: Select ... AS OF Savepoint

2007-11-12 Thread Alvaro Herrera

Gregory Stark escribió:
> "Decibel!" <[EMAIL PROTECTED]> writes:
> 
> > On Nov 2, 2007, at 11:29 AM, Tom Lane wrote:
> >> [ splorfff... ]  The grammar support alone will cost ten times that.
> >
> > When next we meet, expect me to ask you how that's pronounced. ;)
> 
> I think it can only be properly pronounced with a mug of coffee

Where is the content of the mug expected to be, by the time you reach
the ellipsis?

-- 
Alvaro Herrera   Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"I love the Postgres community. It's all about doing things _properly_. :-)"
(David Garamond)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> Simon Riggs <[EMAIL PROTECTED]> writes:
>> So we end up with a normal sounding function that is overloaded to
>> provide all of the various goodies.
>
> As best I can tell, @@ does exactly this already.  This is just a
> different spelling of the same capability, and I don't actually
> find it better.  Why is "text_search(x,y)" better than "x @@ y"?
> We don't recommend that people write "texteq(x,y)" instead of
> "x = y".

I agree, I find it odd to suggest that a function would be more natural than
an operator. The main reason the non-core version of tsearch felt so much like
an add-on was precisely that it had to use functions to interface with
objects. That Postgres supports creating new operators is a strength which
allows a lot more extensibility.

And yet I agree that there's something awkward about the tsearch syntax. I'm
not sure where the core of it comes from though, but I don't think it comes
from the use of operators.

Part of it is that "@@" isn't a familiar operator. I'm not even sure what to
read it as. "Matches"? "Satisfies"?

Perhaps we should think (at some point in the future) about some way of
allowing alphabetic characters in operator names. Then you could write
something like: 

  col ~satisfies~ '1 & 2'

(That exact syntax wouldn't work without removing ~ from the characters in
normal operators so something with more finesse would be needed.)

The other part of tsearch that seems somewhat awkward is just the very concept
and syntax of tsqueries. But that seems pretty integral to the functionality
and I don't see any way to avoid it. It's not entirely unlike the idea of
regexps which I'm sure would seem unnatural if we were just meeting them with
no background.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] [GENERAL] plperl and regexps with accented characters - incompatible?

2007-11-12 Thread Andrew Dunstan




Andrew Dunstan wrote:




Greg Sabino Mullane wrote:


Yes, we might want to consider making utf8 come pre-loaded for 
plperl. There is no direct or easy way to do it (we don't have 
finer-grained control than the 'require' opcode), but we could 
probably dial back restrictions, 'use' it, and then reset the Safe 
container to its defaults. Not sure what other problems that may 
cause, however. CCing to hackers for discussion there.



  


UTF8 is automatically on for strings passed to plperl if the db 
encoding is UTF8. That includes the source text. Please be more 
precise about what you want.


BTW, the perl docs say this about the utf8 pragma:

  Do not use this pragma for anything else than telling Perl that 
your

  script is written in UTF-8.

There should be no need to do that - we will have done it for you. So 
any attempt to use the utf8 pragma in plperl code is probably broken 
anyway.





Ugh, in testing I see some nastiness here without any explicit require. 
It looks like there's an implicit require if the text contains certain 
chars. I'll see what I can do to fix the bug, although I'm not sure if 
it's possible.


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Dave Cramer



On 12-Nov-07, at 11:33 AM, Tom Dunstan wrote:

On Nov 12, 2007 4:08 PM, Alvaro Herrera <[EMAIL PROTECTED]>  
wrote:
What should the driver report then ? I believe the backend code  
considers 8

to be the major version, and 0123 to be the minor versions ?


No, 8.1 is the major version.  In 8.2.5, 8.2 is the major, 5 is the
minor version.


Which is nice in theory, except that the JDBC API doesn't give us the
option of a non-int major version number. We could fudge it with 80,
81 etc, but that's pretty ugly. You can imagine some database client
out there reporting that you're connected to a postgresql 82.5
database, rather than using the getDatabaseProductVersion() method
which is intended for that sort of thing.

For the most part, getting the combination of the major and minor
numbers as currently implemented should be enough for anything using
the driver, as we normally don't care about the difference between
8.2.1 and 8.2.2 in application code (heaven help mysql :)). It only
came up in this case because the minor number (as reported by the JDBC
driver) wasn't passed through.

I just looked at the code and AFAICT we can just ask the driver for  
both major and minor to get something along the lines of


8,0 or 8,2 for major, minor respectively.

Dave

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-11-12 Thread Magnus Hagander

Trevor Talbot wrote:
> On 11/12/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
>> On Mon, Nov 12, 2007 at 04:00:04AM -0800, Trevor Talbot wrote:
>>> On 11/12/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
 On Sat, Nov 10, 2007 at 03:17:13PM -0800, Trevor Talbot wrote:
> As for desktop heap, only 65KB of the service heap was allocated, or
> about 80 bytes per connection.  No danger of hitting limits in the
> kernel memory pools either.
 As Dave said, it could be that the server version uses a lot less heap per
 process, which would be another good reason to use server rather than XP to
 run postgresql. But might there also be other differences, such as some
 third party (or non-core microsoft) product installed?
>>> The XP SP2 machine I tried 8.2.5 on was chewing up about 3.1KB per
>>> process, and it's not running anything invasive (AV or otherwise).
>> Then I think we can claim that Server is just better than Workstation in
>> this regard. Maybe we should put that in the FAQ?
> 
> I think it's safe to claim 2003 is better than XP, but I'm not sure
> that's enough to generalize into server vs workstation yet.  It
> implies 2000 Server would be better than 2000 Pro, which might not be
> true.  I'm also wondering whether 64bit XP behaves differently, since
> IIRC it's based on the 2003 kernel.  Then there's Vista...

Valid points, of course. Specifically, it'd be interesting to know where
Vista stands, and possibly 2008 server. I don't care that much about
2000, really.

I don't have installations of either one, though.. :-(

//Magnus

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Tom Lane

Simon Riggs <[EMAIL PROTECTED]> writes:
> So we end up with a normal sounding function that is overloaded to
> provide all of the various goodies.

As best I can tell, @@ does exactly this already.  This is just a
different spelling of the same capability, and I don't actually
find it better.  Why is "text_search(x,y)" better than "x @@ y"?
We don't recommend that people write "texteq(x,y)" instead of
"x = y".

> Sound good?

It's not an improvement, it's not compatible with what existing tsearch2
users are accustomed to, and it's several months too late...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] [GENERAL] plperl and regexps with accented characters - incompatible?

2007-11-12 Thread Andrew Dunstan





Greg Sabino Mullane wrote:


Yes, we might want to consider making utf8 come pre-loaded for plperl. There 
is no direct or easy way to do it (we don't have finer-grained control than 
the 'require' opcode), but we could probably dial back restrictions, 
'use' it, and then reset the Safe container to its defaults. Not sure what 
other problems that may cause, however. CCing to hackers for discussion 
there.



  


UTF8 is automatically on for strings passed to plperl if the db encoding 
is UTF8. That includes the source text. Please be more precise about 
what you want.


BTW, the perl docs say this about the utf8 pragma:

  Do not use this pragma for anything else than telling Perl that your
  script is written in UTF-8.

There should be no need to do that - we will have done it for you. So 
any attempt to use the utf8 pragma in plperl code is probably broken anyway.


cheers

andrew





---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [HACKERS] [GENERAL] plperl and regexps with accented characters - incompatible?

2007-11-12 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160

hubert depesz lubaczewski writes:
...
> return (shift =~ /[a-z0-9_-]+/i) || 0;
...
> 'require' trapped by operation mask at line 15.
>
> it looks strange - what "require"?

As you guessed, it's trying to do load the utf8 pragma, and failing 
as 'require' (and 'use') are not allowed by default: plperl uses the 
Safe module to disallow things like 'require Module;'. Unfortunately, the 
only way around it on your end is to use plperlu - something I recommend 
anyway (for other reasons).

> also - perhaps loading of this particular module should be allowed even in
> plperl? otherwise it requires me to use plperlu for even the simple task of
> regexp matching.

Yes, we might want to consider making utf8 come pre-loaded for plperl. There 
is no direct or easy way to do it (we don't have finer-grained control than 
the 'require' opcode), but we could probably dial back restrictions, 
'use' it, and then reset the Safe container to its defaults. Not sure what 
other problems that may cause, however. CCing to hackers for discussion 
there.

- --
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200711121139
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iD8DBQFHOIJPvJuQZxSWSsgRA10hAJ996hZYM8KiuziJb/R2QX0HY754bwCg+xZN
kePHNNZbLtRXj6ko8j51waw=
=fw0v
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Tom Dunstan

On Nov 12, 2007 4:08 PM, Alvaro Herrera <[EMAIL PROTECTED]> wrote:
> > What should the driver report then ? I believe the backend code considers 8
> > to be the major version, and 0123 to be the minor versions ?
>
> No, 8.1 is the major version.  In 8.2.5, 8.2 is the major, 5 is the
> minor version.

Which is nice in theory, except that the JDBC API doesn't give us the
option of a non-int major version number. We could fudge it with 80,
81 etc, but that's pretty ugly. You can imagine some database client
out there reporting that you're connected to a postgresql 82.5
database, rather than using the getDatabaseProductVersion() method
which is intended for that sort of thing.

For the most part, getting the combination of the major and minor
numbers as currently implemented should be enough for anything using
the driver, as we normally don't care about the difference between
8.2.1 and 8.2.2 in application code (heaven help mysql :)). It only
came up in this case because the minor number (as reported by the JDBC
driver) wasn't passed through.

Cheers

Tom

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Heikki Linnakangas


Simon Riggs wrote:

Something Tom Dunstan just mentioned has made me ask the question "Why
does our full text search feature look so strange?". It's the
operator-laden syntax that causes the problem.

By any stretch, this query is difficult for most people to understand:

SELECT * FROM text_table
WHERE to_tsvector('fat cats ate fat rats') @@ to_tsquery('fat & rat');

Wouldn't it be much simpler to just have a function, so we can write
this query like this?

SELECT * FROM text_table
WHERE text_search('haystack needle haystack', 'needle');

We then explain to people that while the above is nice, it will presume
that both the function inputs are Text, which isn't any good for complex
searches, indexing and dictionaries etc.., so then we move to:

SELECT * FROM text_table
WHERE text_search('haystack needle haystack'::tsvector,
'needle'::tsquery);

or perhaps

SELECT * FROM text_table
WHERE full_text_search('haystack needle haystack', 'needle & hay');

which would automatically do the conversions to tsvector and tsquery for
us. No more tedious casting, easy to read.


There's a text @@ text operator, so you can write just:

SELECT * FROM tstable where data @@ 'needle';

No need to cast.

Unfortunately, that form can't use a GIN index, I think. But that's 
another issue, which I don't think your proposal would fix...


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Alvaro Herrera

Dave Cramer escribió:
>
> On 12-Nov-07, at 10:10 AM, Simon Riggs wrote:
>
>> On Mon, 2007-11-12 at 14:35 +, Tom Dunstan wrote:
>>> Nice try :), but as I read the javadoc for DialectFactory it seems to
>>> suggest that hibernate gets the major number from our JDBC driver,
>>> which dutifully reports it as 8.
>>
>> We can extend that so it uses getMinorVersion() also.
>>
>> Personally, I think our JDBC driver is wrong, but thats another issue.
>>
> What should the driver report then ? I believe the backend code considers 8 
> to be the major version, and 0123 to be the minor versions ?

No, 8.1 is the major version.  In 8.2.5, 8.2 is the major, 5 is the
minor version.

-- 
Alvaro Herrera  Developer, http://www.PostgreSQL.org/
"No hay ausente sin culpa ni presente sin disculpa" (Prov. francés)

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] minimal update

2007-11-12 Thread Andrew Dunstan




Decibel! wrote:

On Nov 2, 2007, at 10:49 AM, Andrew Dunstan wrote:

   update tname set foo = bar ... where foo is null or foo <> bar ...


FYI, you should be able to do WHERE foo IS DISTINCT FROM bar instead.


True, that's a bit nicer. It's still more than somewhat ugly and fragile 
if there a lot of foos and the bars are complex expressions.


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [HACKERS] minimal update

2007-11-12 Thread Decibel!


On Nov 2, 2007, at 10:49 AM, Andrew Dunstan wrote:

   update tname set foo = bar ... where foo is null or foo <> bar ...


FYI, you should be able to do WHERE foo IS DISTINCT FROM bar instead.
--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828




smime.p7s
Description: S/MIME cryptographic signature

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread David Fetter

On Mon, Nov 12, 2007 at 03:48:20PM +, Simon Riggs wrote:
> On Mon, 2007-11-12 at 16:28 +0100, Martijn van Oosterhout wrote:
> > On Mon, Nov 12, 2007 at 03:00:36PM +, Simon Riggs wrote:
> > > Something Tom Dunstan just mentioned has made me ask the question "Why
> > > does our full text search feature look so strange?". It's the
> > > operator-laden syntax that causes the problem.
> > > 
> > > By any stretch, this query is difficult for most people to understand:
> > > 
> > > SELECT * FROM text_table
> > > WHERE to_tsvector('fat cats ate fat rats') @@ to_tsquery('fat & rat');
> > > 
> > > Wouldn't it be much simpler to just have a function, so we can write
> > > this query like this?
> > > 
> > > SELECT * FROM text_table
> > > WHERE text_search('haystack needle haystack', 'needle');
> > 
> > Can't you do this with an SQL function that gets expanded inline?
> 
> Yep, we can. Good thinking. So the change is fairly trivial.
> 
> What do you think of the proposal to make text search work this way
> and to document this more easily readable form?

+1 for adding this.

Cheers,
David.
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Simon Riggs

On Mon, 2007-11-12 at 16:28 +0100, Martijn van Oosterhout wrote:
> On Mon, Nov 12, 2007 at 03:00:36PM +, Simon Riggs wrote:
> > Something Tom Dunstan just mentioned has made me ask the question "Why
> > does our full text search feature look so strange?". It's the
> > operator-laden syntax that causes the problem.
> > 
> > By any stretch, this query is difficult for most people to understand:
> > 
> > SELECT * FROM text_table
> > WHERE to_tsvector('fat cats ate fat rats') @@ to_tsquery('fat & rat');
> > 
> > Wouldn't it be much simpler to just have a function, so we can write
> > this query like this?
> > 
> > SELECT * FROM text_table
> > WHERE text_search('haystack needle haystack', 'needle');
> 
> Can't you do this with an SQL function that gets expanded inline?

Yep, we can. Good thinking. So the change is fairly trivial.

What do you think of the proposal to make text search work this way and
to document this more easily readable form?

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Proposal: Select ... AS OF Savepoint

2007-11-12 Thread Gregory Stark

"Decibel!" <[EMAIL PROTECTED]> writes:

> On Nov 2, 2007, at 11:29 AM, Tom Lane wrote:
>> [ splorfff... ]  The grammar support alone will cost ten times that.
>
> When next we meet, expect me to ask you how that's pronounced. ;)

I think it can only be properly pronounced with a mug of coffee

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Martijn van Oosterhout

On Mon, Nov 12, 2007 at 03:00:36PM +, Simon Riggs wrote:
> Something Tom Dunstan just mentioned has made me ask the question "Why
> does our full text search feature look so strange?". It's the
> operator-laden syntax that causes the problem.
> 
> By any stretch, this query is difficult for most people to understand:
> 
> SELECT * FROM text_table
> WHERE to_tsvector('fat cats ate fat rats') @@ to_tsquery('fat & rat');
> 
> Wouldn't it be much simpler to just have a function, so we can write
> this query like this?
> 
> SELECT * FROM text_table
> WHERE text_search('haystack needle haystack', 'needle');

Can't you do this with an SQL function that gets expanded inline?

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Those who make peaceful revolution impossible will make violent revolution 
> inevitable.
>  -- John F Kennedy


signature.asc
Description: Digital signature

Re: [HACKERS] plpgsql keywords are hidden reserved words

2007-11-12 Thread Decibel!


On Nov 5, 2007, at 11:58 AM, John DeSoi wrote:
Is there any feasibility to the idea of allowing pl/pgsql variables  
and parameters to be prefixed with a special character like '$'?  
I'm constantly adding prefixes like 'v_' because of conflicts with  
table or column names. It would be nice to have something like  
"declare $myvar integer;" so it would be very easy to distinguish  
variable and parameter names from structure names or reserved words.


+528,382 ;)

At least in 8.3, IIRC you can safely use the name of the function to  
refer to variables, but it would be very nice if you could just do  
$blah in embedded SQL statements.


While we're talking about plpgsql... is there a TODO to allow RAISE  
to take a variable instead of just a fixed string? Yes, I can always  
do RAISE '%', variable, but then I lose % expansion.

--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828




smime.p7s
Description: S/MIME cryptographic signature

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Dave Cramer



On 12-Nov-07, at 10:10 AM, Simon Riggs wrote:


On Mon, 2007-11-12 at 14:35 +, Tom Dunstan wrote:

Nice try :), but as I read the javadoc for DialectFactory it seems to
suggest that hibernate gets the major number from our JDBC driver,
which dutifully reports it as 8.


We can extend that so it uses getMinorVersion() also.

Personally, I think our JDBC driver is wrong, but thats another issue.

What should the driver report then ? I believe the backend code  
considers 8 to be the major version, and 0123 to be the minor versions ?


Dave

--
 Simon Riggs
 2ndQuadrant  http://www.2ndQuadrant.com


---(end of  
broadcast)---

TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Proposal: Select ... AS OF Savepoint

2007-11-12 Thread Decibel!


On Nov 2, 2007, at 11:29 AM, Tom Lane wrote:

[ splorfff... ]  The grammar support alone will cost ten times that.


When next we meet, expect me to ask you how that's pronounced. ;)
--
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828




smime.p7s
Description: S/MIME cryptographic signature

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Simon Riggs

On Mon, 2007-11-12 at 14:35 +, Tom Dunstan wrote:
> Nice try :), but as I read the javadoc for DialectFactory it seems to
> suggest that hibernate gets the major number from our JDBC driver,
> which dutifully reports it as 8. 

We can extend that so it uses getMinorVersion() also.

Personally, I think our JDBC driver is wrong, but thats another issue.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Simon Riggs

On Mon, 2007-11-12 at 13:30 +, Tom Dunstan wrote:
> On Nov 12, 2007 10:55 AM, Simon Riggs <[EMAIL PROTECTED]> wrote:
> > I've posted files to pgsql-patches, as well as to Diego directly.
> 
> I dropped them into a Hibernate 3.2.5.ga source tree and ran the
> hibernate tests with the 8.3 dialect against pgsql HEAD and got a few
> errors. Diego, I assume that the hibernate tests are in a state where
> we expect them to all pass? I didn't bother trying the original
> dialect that hibernate shipped with, so I'm not sure if it passes or
> not. Given that these seem like an improvement, I'll assume not.

It's possible I caused some, though the largest single change was the
reordering, which was necessary to check off everything.

I was assuming your CLOB/BLOB changes would go in too. 

> > There are 3 files
> > PostgreSQL8Dialect.java which implements enough for 8.0 and 8.1
> > PostgreSQL82Dialect.java which extends PostgreSQL8Dialect
> > PostgreSQL83Dialect.java which extends PostgreSQL8Dialect
> 
> Given that our releases are generally a feature superset of previous
> ones, should we just make PostgreSQL83Dialect extend
> PostgreSQL82Dialect? I note that atm they are identical. Or does that
> offend anyone's delicate OO sensibilities?

I'm easy either way. That's the way I started, FWIW, I just foresaw this
long list of dependencies and switched back to the two level structure.

> > - GUID support is possible, but really opens up the debate about how
> > extensibility features should be handled.
> 
> Yeah. Should the GUID be mapped to e.g. java.util.UUID? Or just
> string? etc. I had some thoughts about enums, but if someone's using
> the annotation stuff (either JPA or hibernate specific) then they
> already have a mechanism to map between a Java enum and a string, so
> the only thing that wouldn't work would be DDL generation, since
> hibernate wouldn't understand the necessaary CREATE TYPE commands.

The Dialect says "command to select GUID from underlying database". No
real reason to get one from there.

Hibernate doesn't support a specific GUID type since
getSelectGUIDString() returns String, so I guess DB support for GUIDs is
irrelevant.

So OK, java.util.UUID sounds OK so far, anyone else?

> > - For now, I think we should document the procedure for adding a local
> > site Dialect which implements additional functions, with GUID as an
> > example
> 
> Oh, were you just referring to making GUID functions available? Yeah
> that shouldn't be too hard, but again I wonder if we should look at an
> automatic way to generate those function declarations. Given that the
> dialect can't read the database when it's instantiated, perhaps the
> way to go would be to ship a resource file containing the expected
> functions and have the dialect parse that before calling the
> registration functions. There would then be a process that a user
> could run against their own database to regenerate that file, and
> they'd just need to drop it into their classpath for it to be picked
> up.

I like that.

> All of this should work for functions, but operators are a whole
> different story. I strongly suspect that someone is not going to be
> able to use e.g. @@ in a HQL query. Are there ways to do tsearch type
> queries just using functions and more standard operators?

H...

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

[HACKERS] Simplifying Text Search

2007-11-12 Thread Simon Riggs

Something Tom Dunstan just mentioned has made me ask the question "Why
does our full text search feature look so strange?". It's the
operator-laden syntax that causes the problem.

By any stretch, this query is difficult for most people to understand:

SELECT * FROM text_table
WHERE to_tsvector('fat cats ate fat rats') @@ to_tsquery('fat & rat');

Wouldn't it be much simpler to just have a function, so we can write
this query like this?

SELECT * FROM text_table
WHERE text_search('haystack needle haystack', 'needle');

We then explain to people that while the above is nice, it will presume
that both the function inputs are Text, which isn't any good for complex
searches, indexing and dictionaries etc.., so then we move to:

SELECT * FROM text_table
WHERE text_search('haystack needle haystack'::tsvector,
'needle'::tsquery);

or perhaps

SELECT * FROM text_table
WHERE full_text_search('haystack needle haystack', 'needle & hay');

which would automatically do the conversions to tsvector and tsquery for
us. No more tedious casting, easy to read.

[*text_search() functions would return bool]

So we end up with a normal sounding function that is overloaded to
provide all of the various goodies. We can include the text_search(text,
text) version of the function in the normal chapter on functions, with a
pointer to the more complex stuff elsewhere.

Sound good?

We can then explain everything without having to use @@ operators. They
can then be introduced as an option.

The side benefit of this is that we can then allow our wonderful new
functionality to be more easily usable by things like Hibernate. We just
tell them we have this new function and thats all they need to know.

I know that under the covers the @@ operator is necessary because we
hang various pieces of optimizer information from it. Each function
signature gets an operator with matching signature, so there's a 1:1
correspondence in most use cases. So to make this all hang together,
there'd need to be a some smarts that says: if there is only one
operator on a function then use the operator's optimizer information
when you see just the function. That information can be assessed at DDL
time, so we can keep accurate track of operator counts in pgproc. 

An alternative approach might be to make the first operator created on a
function the "primary" operator. All other operators would then be
secondary operators, so that adding operators would not change the
inference mechanism.

I've not got sufficient knowledge to say how hard the
function-to-operator inference is, but it would be dang useful in making
text search and many other programs readable and easy to interface to.
In the end that is going to mean wider usage of that functionality, with
more people feeling like they can dip their toes into the water.

I must confess I have insufficient time to do this myself right now, not
least me discovering exactly how. I'm spending time on this now because
I'm the one that has to explain this stuff to people and things like
this can make a huge difference in their understanding and eventual
uptake.

Thoughts?

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Andrew Dunstan




Tom Dunstan wrote:

On Nov 12, 2007 2:13 PM, Andrew Dunstan <[EMAIL PROTECTED]> wrote:
  

Oh, that's nice. Unfortunately, though. it only seems to support major
version number differentiation as an int. Apparently the idea that you
might have a version number like 8.3 didn't occur to whoever wrote it,
although to be fair it looks like the only implementation that
actually uses it is Oracle, where that assumption probably holds.
Probably wouldn't be that hard to hack to our purposes though...
  

800, 801 ...



Nice try :), but as I read the javadoc for DialectFactory it seems to
suggest that hibernate gets the major number from our JDBC driver,
which dutifully reports it as 8. I doubt that we're suggesting hacking
the JDBC driver to lie just to get around this wrinkle when the
obvious solution is to submit a patch to hibernate that makes it pass
both major and minor numbers through, and the Oracle code could
happily ignore the latter.


  


OK.

It's probably time to take this discussion off -hackers, I think.

cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Tom Dunstan

On Nov 12, 2007 2:13 PM, Andrew Dunstan <[EMAIL PROTECTED]> wrote:
> > Oh, that's nice. Unfortunately, though. it only seems to support major
> > version number differentiation as an int. Apparently the idea that you
> > might have a version number like 8.3 didn't occur to whoever wrote it,
> > although to be fair it looks like the only implementation that
> > actually uses it is Oracle, where that assumption probably holds.
> > Probably wouldn't be that hard to hack to our purposes though...
>
> 800, 801 ...

Nice try :), but as I read the javadoc for DialectFactory it seems to
suggest that hibernate gets the major number from our JDBC driver,
which dutifully reports it as 8. I doubt that we're suggesting hacking
the JDBC driver to lie just to get around this wrinkle when the
obvious solution is to submit a patch to hibernate that makes it pass
both major and minor numbers through, and the Oracle code could
happily ignore the latter.

Cheers

Tom

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Clarification reqeusted for "select * from a huge table"

2007-11-12 Thread Richard Huxton


Trevor Talbot wrote:

On 11/12/07, Richard Huxton <[EMAIL PROTECTED]> wrote:

Gokulakannan Somasundaram wrote:



I also noticed that it doesn't crash with psql, but it takes a
long time to show the first set of records. It takes a long time, even
to quit after i pressed 'q'.
   With oracle SQLPlus, it is quite instantaneous.



Imagine, you need a large batch operation. In oracle we can fire the
SQL and we can be sure that the client won't crash, but with postgres
we have a region of uncertainity.

Well, if your client doesn't know if it can handle 1 million rows, maybe
it shouldn't ask for them?


Isn't that exactly his point?  He's talking about the default behavior
of clients designed for postgres, one of which is psql.


Psql isn't the client - you are.

--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Andrew Dunstan




Tom Dunstan wrote:

On Nov 12, 2007 1:08 PM, Simon Riggs <[EMAIL PROTECTED]> wrote:

  

If we do this, then it looks like we can hack this file also
http://anonsvn.jboss.org/repos/hibernate/core/trunk/core/src/main/java/org/hibernate/dialect/DialectFactory.java



Oh, that's nice. Unfortunately, though. it only seems to support major
version number differentiation as an int. Apparently the idea that you
might have a version number like 8.3 didn't occur to whoever wrote it,
although to be fair it looks like the only implementation that
actually uses it is Oracle, where that assumption probably holds.
Probably wouldn't be that hard to hack to our purposes though...


  


800, 801 ...

cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Clarification reqeusted for "select * from a huge table"

2007-11-12 Thread Gokulakannan Somasundaram

On Nov 12, 2007 6:48 PM, Bernd Helmle <[EMAIL PROTECTED]> wrote:
> --On Montag, November 12, 2007 18:10:12 +0530 Gokulakannan Somasundaram
> <[EMAIL PROTECTED]> wrote:
>
> >> http://jdbc.postgresql.org/documentation/head/query.html#query-with-curs
> >> or
> >>
> >> --
> >>Heikki Linnakangas
> >>EnterpriseDB   http://www.enterprisedb.com
> >>
> >
> > Thanks Heikki. That answered my question. Can you tell me, what is the
> > similar setting for psql?
>
> See the FETCH_COUNT psql variable in Version 8.2 and above.
>
> --
>   Thanks
>
> Bernd
>
Thanks a lot Brend. I apologize for posting this query in Hackers.



-- 
Thanks,
Gokul.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Tom Dunstan

On Nov 12, 2007 1:08 PM, Simon Riggs <[EMAIL PROTECTED]> wrote:

> If we do this, then it looks like we can hack this file also
> http://anonsvn.jboss.org/repos/hibernate/core/trunk/core/src/main/java/org/hibernate/dialect/DialectFactory.java

Oh, that's nice. Unfortunately, though. it only seems to support major
version number differentiation as an int. Apparently the idea that you
might have a version number like 8.3 didn't occur to whoever wrote it,
although to be fair it looks like the only implementation that
actually uses it is Oracle, where that assumption probably holds.
Probably wouldn't be that hard to hack to our purposes though...

Cheers

Tom

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Tom Dunstan

> All of this should work for functions, but operators are a whole
> different story. I strongly suspect that someone is not going to be
> able to use e.g. @@ in a HQL query. Are there ways to do tsearch type
> queries just using functions and more standard operators?

Of course, if someone's using tsearch then they've already thrown
database agnosticism out the window, so they could always just knock
up a native SQL query directly. But it can get quite fiddly if there
are a lot of fields coming back in the result set - that's why it
would be nice if hibernate could handle these cases itself.

Cheers

Tom

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Tom Dunstan

[oops, sent with non-subscribed from: address first time]

On Nov 12, 2007 10:55 AM, Simon Riggs <[EMAIL PROTECTED]> wrote:
> I've posted files to pgsql-patches, as well as to Diego directly.

I dropped them into a Hibernate 3.2.5.ga source tree and ran the
hibernate tests with the 8.3 dialect against pgsql HEAD and got a few
errors. Diego, I assume that the hibernate tests are in a state where
we expect them to all pass? I didn't bother trying the original
dialect that hibernate shipped with, so I'm not sure if it passes or
not. Given that these seem like an improvement, I'll assume not.

> There are 3 files
> PostgreSQL8Dialect.java which implements enough for 8.0 and 8.1
> PostgreSQL82Dialect.java which extends PostgreSQL8Dialect
> PostgreSQL83Dialect.java which extends PostgreSQL8Dialect

Given that our releases are generally a feature superset of previous
ones, should we just make PostgreSQL83Dialect extend
PostgreSQL82Dialect? I note that atm they are identical. Or does that
offend anyone's delicate OO sensibilities?

> We can then push out a new file every release.

Yes, I like the general approach.

> - GUID support is possible, but really opens up the debate about how
> extensibility features should be handled.

Yeah. Should the GUID be mapped to e.g. java.util.UUID? Or just
string? etc. I had some thoughts about enums, but if someone's using
the annotation stuff (either JPA or hibernate specific) then they
already have a mechanism to map between a Java enum and a string, so
the only thing that wouldn't work would be DDL generation, since
hibernate wouldn't understand the necessaary CREATE TYPE commands.

> - For now, I think we should document the procedure for adding a local
> site Dialect which implements additional functions, with GUID as an
> example

Oh, were you just referring to making GUID functions available? Yeah
that shouldn't be too hard, but again I wonder if we should look at an
automatic way to generate those function declarations. Given that the
dialect can't read the database when it's instantiated, perhaps the
way to go would be to ship a resource file containing the expected
functions and have the dialect parse that before calling the
registration functions. There would then be a process that a user
could run against their own database to regenerate that file, and
they'd just need to drop it into their classpath for it to be picked
up.

All of this should work for functions, but operators are a whole
different story. I strongly suspect that someone is not going to be
able to use e.g. @@ in a HQL query. Are there ways to do tsearch type
queries just using functions and more standard operators?

Cheers

Tom

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Tom Dunstan

On Nov 12, 2007 10:55 AM, Simon Riggs <[EMAIL PROTECTED]> wrote:
> I've posted files to pgsql-patches, as well as to Diego directly.

I dropped them into a Hibernate 3.2.5.ga source tree and ran the
hibernate tests with the 8.3 dialect against pgsql HEAD and got a few
errors. Diego, I assume that the hibernate tests are in a state where
we expect them to all pass? I didn't bother trying the original
dialect that hibernate shipped with, so I'm not sure if it passes or
not. Given that these seem like an improvement, I'll assume not.

> There are 3 files
> PostgreSQL8Dialect.java which implements enough for 8.0 and 8.1
> PostgreSQL82Dialect.java which extends PostgreSQL8Dialect
> PostgreSQL83Dialect.java which extends PostgreSQL8Dialect

Given that our releases are generally a feature superset of previous
ones, should we just make PostgreSQL83Dialect extend
PostgreSQL82Dialect? I note that atm they are identical. Or does that
offend anyone's delicate OO sensibilities?

> We can then push out a new file every release.

Yes, I like the general approach.

> - GUID support is possible, but really opens up the debate about how
> extensibility features should be handled.

Yeah. Should the GUID be mapped to e.g. java.util.UUID? Or just
string? etc. I had some thoughts about enums, but if someone's using
the annotation stuff (either JPA or hibernate specific) then they
already have a mechanism to map between a Java enum and a string, so
the only thing that wouldn't work would be DDL generation, since
hibernate wouldn't understand the necessaary CREATE TYPE commands.

> - For now, I think we should document the procedure for adding a local
> site Dialect which implements additional functions, with GUID as an
> example

Oh, were you just referring to making GUID functions available? Yeah
that shouldn't be too hard, but again I wonder if we should look at an
automatic way to generate those function declarations. Given that the
dialect can't read the database when it's instantiated, perhaps the
way to go would be to ship a resource file containing the expected
functions and have the dialect parse that before calling the
registration functions. There would then be a process that a user
could run against their own database to regenerate that file, and
they'd just need to drop it into their classpath for it to be picked
up.

All of this should work for functions, but operators are a whole
different story. I strongly suspect that someone is not going to be
able to use e.g. @@ in a HQL query. Are there ways to do tsearch type
queries just using functions and more standard operators?

Cheers

Tom

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Peter Eisentraut

Simon Riggs wrote:
> On Sun, 2007-11-11 at 17:11 +, Tom Dunstan wrote:
> > The way to fix both that and the differing available functions would
> > probably be to have a subclass of the dialect for each server version.
> > MySQL seems to have about 5 :)
>
> I think a static dialect for each server version is the way to go.

How would this handle extensions such as PostGIS, Tsearch, XML, etc.?

Certainly, the registerFunction() calls can be automated.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-11-12 Thread Trevor Talbot

On 11/12/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
> On Mon, Nov 12, 2007 at 04:00:04AM -0800, Trevor Talbot wrote:
> > On 11/12/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
> > > On Sat, Nov 10, 2007 at 03:17:13PM -0800, Trevor Talbot wrote:
> >
> > > > As for desktop heap, only 65KB of the service heap was allocated, or
> > > > about 80 bytes per connection.  No danger of hitting limits in the
> > > > kernel memory pools either.
> > >
> > > As Dave said, it could be that the server version uses a lot less heap per
> > > process, which would be another good reason to use server rather than XP 
> > > to
> > > run postgresql. But might there also be other differences, such as some
> > > third party (or non-core microsoft) product installed?
> >
> > The XP SP2 machine I tried 8.2.5 on was chewing up about 3.1KB per
> > process, and it's not running anything invasive (AV or otherwise).
>
> Then I think we can claim that Server is just better than Workstation in
> this regard. Maybe we should put that in the FAQ?

I think it's safe to claim 2003 is better than XP, but I'm not sure
that's enough to generalize into server vs workstation yet.  It
implies 2000 Server would be better than 2000 Pro, which might not be
true.  I'm also wondering whether 64bit XP behaves differently, since
IIRC it's based on the 2003 kernel.  Then there's Vista...

Unfortunately I don't have access to any of these versions to test
with at the moment.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Clarification reqeusted for "select * from a huge table"

2007-11-12 Thread Bernd Helmle

--On Montag, November 12, 2007 18:10:12 +0530 Gokulakannan Somasundaram 
<[EMAIL PROTECTED]> wrote:



http://jdbc.postgresql.org/documentation/head/query.html#query-with-curs
or

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com



Thanks Heikki. That answered my question. Can you tell me, what is the
similar setting for psql?


See the FETCH_COUNT psql variable in Version 8.2 and above.

--
 Thanks

   Bernd

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [HACKERS] Clarification reqeusted for "select * from a huge table"

2007-11-12 Thread Trevor Talbot

On 11/12/07, Richard Huxton <[EMAIL PROTECTED]> wrote:
> Gokulakannan Somasundaram wrote:

> >>> I also noticed that it doesn't crash with psql, but it takes a
> >>> long time to show the first set of records. It takes a long time, even
> >>> to quit after i pressed 'q'.
> >>>With oracle SQLPlus, it is quite instantaneous.

> > Imagine, you need a large batch operation. In oracle we can fire the
> > SQL and we can be sure that the client won't crash, but with postgres
> > we have a region of uncertainity.
>
> Well, if your client doesn't know if it can handle 1 million rows, maybe
> it shouldn't ask for them?

Isn't that exactly his point?  He's talking about the default behavior
of clients designed for postgres, one of which is psql.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Simon Riggs

On Mon, 2007-11-12 at 10:55 +, Simon Riggs wrote:
> On Sun, 2007-11-11 at 16:48 -0200, Diego Pires Plentz wrote:
> 
> > > > - You have supportsRowValueConstructorSyntax commented out. It does, if
> > > > you have a recent enough version, or do you mean something else?
> > >
> > > The way to fix both that and the differing available functions would
> > > probably be to have a subclass of the dialect for each server version.
> > > MySQL seems to have about 5 :)
> > > http://www.hibernate.org/hib_docs/v3/api/org/hibernate/dialect/package-summary.html.
> > 
> > I'm thinking the same thing. We could let PostgreSQLDialect to do full
> > support to Postgre 7.x and extend it to support the new
> > features/functions in Postgre 8.x. Btw, to do that, one thing that we
> > must do is identify what functions are new/still avaiable in 8.x. That
> > approach is good too, because we can get different behaviors in each
> > version of the database.
> 
> I've posted files to pgsql-patches, as well as to Diego directly.
> 
> There are 3 files
> PostgreSQL8Dialect.java which implements enough for 8.0 and 8.1
> PostgreSQL82Dialect.java which extends PostgreSQL8Dialect
> PostgreSQL83Dialect.java which extends PostgreSQL8Dialect
> 
> PostgreSQL8Dialect is not provided as a patch because the extensions
> have all been re-ordered to match the underlying sequence and grouping
> in the base Dialect file. Checking it should be much easier now.
> 
> I've not checked 7.x compatibility
> 
> We can then push out a new file every release.
> 
> Notes:
> - Not sure when getCascadeConstraintsString() gets called, so left it
> unset
> 
> - Not added any keywords. Some Dialects add a few keywords, but there
> doesn't seem to be any pattern to it. Any advice?
> 
> - GUID support is possible, but really opens up the debate about how
> extensibility features should be handled. 
> 
> - For now, I think we should document the procedure for adding a local
> site Dialect which implements additional functions, with GUID as an
> example

If we do this, then it looks like we can hack this file also
http://anonsvn.jboss.org/repos/hibernate/core/trunk/core/src/main/java/org/hibernate/dialect/DialectFactory.java

so that Hibernate can pick up the version dynamically.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-11-12 Thread Magnus Hagander

On Mon, Nov 12, 2007 at 04:00:04AM -0800, Trevor Talbot wrote:
> On 11/12/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
> > On Sat, Nov 10, 2007 at 03:17:13PM -0800, Trevor Talbot wrote:
> 
> > > As for desktop heap, only 65KB of the service heap was allocated, or
> > > about 80 bytes per connection.  No danger of hitting limits in the
> > > kernel memory pools either.
> >
> > As Dave said, it could be that the server version uses a lot less heap per
> > process, which would be another good reason to use server rather than XP to
> > run postgresql. But might there also be other differences, such as some
> > third party (or non-core microsoft) product installed?
> 
> The XP SP2 machine I tried 8.2.5 on was chewing up about 3.1KB per
> process, and it's not running anything invasive (AV or otherwise).

Then I think we can claim that Server is just better than Workstation in
this regard. Maybe we should put that in the FAQ?


> I've been trying to find out exactly what's in the desktop heap, but I
> haven't had much luck so far.  Apparently Microsoft changed the
> implementation after Win2000, and didn't bother teaching the public
> debugging tools about it.  The details just don't seem to exist
> anymore :(

Yeah, there are very little docs at all about the desktop heap AFAICT.

//Magnus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] Clarification reqeusted for "select * from a huge table"

2007-11-12 Thread Richard Huxton

Gokulakannan Somasundaram wrote:

I also noticed that it doesn't crash with psql, but it takes a
long time to show the first set of records. It takes a long time, even
to quit after i pressed 'q'.
   With oracle SQLPlus, it is quite instantaneous.

Again, you're measuring different things. What is the time to the *last*
row?

I made this point, because people usually fire select * from table
query in the psql prompt to get a feel of the table.

Well, I don't. If nothing else, it's difficult to fit a whole row across 
the screen with many tables.

> Ofcourse they can

fire select * from table limit 10; But i just feel its more key
strokes and it would be better, if it is interactive. I accept that it
is a debatable point and people can prefer otherwise to have more key
strokes.

If you want few keystrokes to see what columns a table has, how about
\d table

Imagine, you need a large batch operation. In oracle we can fire the
SQL and we can be sure that the client won't crash, but with postgres
we have a region of uncertainity.

Well, if your client doesn't know if it can handle 1 million rows, maybe 
it shouldn't ask for them?

> There are some JDBC hints like

setFetchSize(), which actually affects Oracle's behaviour. But it
doesn't seem to do anything with postgres. But JDBC has declared these
commands as hint commands and has provided a warning to users, about
the fact that it may get ignored

I see you've checked and seen it does work. I'm not a java man, but I do 
know there are ways to control this. Perhaps try the -jdbc mailing list.

In any case, I think we've established it's nothing for the hackers list.

--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [HACKERS] Clarification reqeusted for "select * from a huge table"

2007-11-12 Thread Gokulakannan Somasundaram

> we have a region of uncertainity. There are some JDBC hints like
> setFetchSize(), which actually affects Oracle's behaviour. But it
> doesn't seem to do anything with postgres. But JDBC has declared these
> commands as hint commands and has provided a warning to users, about
> the fact that it may get ignored
This is a wrong statement. I would like to take it back...


-- 
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Clarification reqeusted for "select * from a huge table"

2007-11-12 Thread Gokulakannan Somasundaram

On Nov 12, 2007 6:01 PM, Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
> Gokulakannan Somasundaram wrote:
> > I had a chance to test one of the real world cases with Oracle and
> > PostgreSQL. Create a Table with 10 million rows (i worked on a 1GB RAM
> > machine) both in oracle and Postgresql. Just write a JDBC program for
> > a 'select *' on that table. With PostgreSQL as backend, java crashes
> > saying that it has met 'Out Of Memory'. With Oracle it doesn't.
> > Postgres tried to send all the results back to the client at one
> > shot, whereas in Oracle it works like a Cursor. Is this issue already
> > well known among hackers community? If known, why is it designed this
> > way?
>
> http://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor
>
> --
>Heikki Linnakangas
>EnterpriseDB   http://www.enterprisedb.com
>

Thanks Heikki. That answered my question. Can you tell me, what is the
similar setting for psql?

-- 
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Clarification reqeusted for "select * from a huge table"

2007-11-12 Thread Gokulakannan Somasundaram

On Nov 12, 2007 5:25 PM, Richard Huxton <[EMAIL PROTECTED]> wrote:
> Gokulakannan Somasundaram wrote:
> > Hi,
> > I had a chance to test one of the real world cases with Oracle and
> > PostgreSQL. Create a Table with 10 million rows (i worked on a 1GB RAM
> > machine) both in oracle and Postgresql. Just write a JDBC program for
> > a 'select *' on that table. With PostgreSQL as backend, java crashes
> > saying that it has met 'Out Of Memory'. With Oracle it doesn't.
> > Postgres tried to send all the results back to the client at one
> > shot, whereas in Oracle it works like a Cursor. Is this issue already
> > well known among hackers community? If known, why is it designed this
> > way?
>
> It returns everything because you've asked for it. If you wanted
> something that looks like a cursor, PG assumes you'll request a cursor.
>
> As to why, there are two reasons:
> 1. It's always been that way and changing it now would irritate most of
> the existing user-base.
JDBC API, as you might know has a rs.next() and rs.prev() to scan
backwards and forwards. The API looks more like a cursor.Currently,
all the results for Postgres is returned to the client during the
executeQuery(). This is more like the SAX / DOM argument. I just feel
JDBC APIs provide a feeling a SAX.

> 2. Repeat your test with 5,10,50,100 clients all running different big
> queries and see which puts a greater load on the server. PG favours
> supporting lots of clients by pushing the load onto them.
When we say Postgres pushes the load to the client, its a huge memory
overhead on the  client part. In oracle's scenario, the overhead is
just maintaining the state(say from which block the scan should be
continued). My point is that there are some operations, which are not
possible with postgres, whereas it is possible by an another database.
It would be better,, if we can support it. (There are some places
where DOM parsing is not possible and we prefer SAX)


>
> > I also noticed that it doesn't crash with psql, but it takes a
> > long time to show the first set of records. It takes a long time, even
> > to quit after i pressed 'q'.
> >With oracle SQLPlus, it is quite instantaneous.
>
> Again, you're measuring different things. What is the time to the *last*
> row?

I made this point, because people usually fire select * from table
query in the psql prompt to get a feel of the table. Ofcourse they can
fire select * from table limit 10; But i just feel its more key
strokes and it would be better, if it is interactive. I accept that it
is a debatable point and people can prefer otherwise to have more key
strokes.

Imagine, you need a large batch operation. In oracle we can fire the
SQL and we can be sure that the client won't crash, but with postgres
we have a region of uncertainity. There are some JDBC hints like
setFetchSize(), which actually affects Oracle's behaviour. But it
doesn't seem to do anything with postgres. But JDBC has declared these
commands as hint commands and has provided a warning to users, about
the fact that it may get ignored



-- 
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Clarification reqeusted for "select * from a huge table"

2007-11-12 Thread Heikki Linnakangas


Gokulakannan Somasundaram wrote:

I had a chance to test one of the real world cases with Oracle and
PostgreSQL. Create a Table with 10 million rows (i worked on a 1GB RAM
machine) both in oracle and Postgresql. Just write a JDBC program for
a 'select *' on that table. With PostgreSQL as backend, java crashes
saying that it has met 'Out Of Memory'. With Oracle it doesn't.
Postgres tried to send all the results back to the client at one
shot, whereas in Oracle it works like a Cursor. Is this issue already
well known among hackers community? If known, why is it designed this
way?


http://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-11-12 Thread Trevor Talbot

On 11/12/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
> On Sat, Nov 10, 2007 at 03:17:13PM -0800, Trevor Talbot wrote:

> > As for desktop heap, only 65KB of the service heap was allocated, or
> > about 80 bytes per connection.  No danger of hitting limits in the
> > kernel memory pools either.
>
> As Dave said, it could be that the server version uses a lot less heap per
> process, which would be another good reason to use server rather than XP to
> run postgresql. But might there also be other differences, such as some
> third party (or non-core microsoft) product installed?

The XP SP2 machine I tried 8.2.5 on was chewing up about 3.1KB per
process, and it's not running anything invasive (AV or otherwise).

I've been trying to find out exactly what's in the desktop heap, but I
haven't had much luck so far.  Apparently Microsoft changed the
implementation after Win2000, and didn't bother teaching the public
debugging tools about it.  The details just don't seem to exist
anymore :(

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Clarification reqeusted for "select * from a huge table"

2007-11-12 Thread Richard Huxton


Gokulakannan Somasundaram wrote:

Hi,
I had a chance to test one of the real world cases with Oracle and
PostgreSQL. Create a Table with 10 million rows (i worked on a 1GB RAM
machine) both in oracle and Postgresql. Just write a JDBC program for
a 'select *' on that table. With PostgreSQL as backend, java crashes
saying that it has met 'Out Of Memory'. With Oracle it doesn't.
Postgres tried to send all the results back to the client at one
shot, whereas in Oracle it works like a Cursor. Is this issue already
well known among hackers community? If known, why is it designed this
way?


It returns everything because you've asked for it. If you wanted 
something that looks like a cursor, PG assumes you'll request a cursor.


As to why, there are two reasons:
1. It's always been that way and changing it now would irritate most of 
the existing user-base.
2. Repeat your test with 5,10,50,100 clients all running different big 
queries and see which puts a greater load on the server. PG favours 
supporting lots of clients by pushing the load onto them.



I also noticed that it doesn't crash with psql, but it takes a
long time to show the first set of records. It takes a long time, even
to quit after i pressed 'q'.
   With oracle SQLPlus, it is quite instantaneous.


Again, you're measuring different things. What is the time to the *last* 
row?


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Simon Riggs

On Sun, 2007-11-11 at 16:48 -0200, Diego Pires Plentz wrote:

> > > - You have supportsRowValueConstructorSyntax commented out. It does, if
> > > you have a recent enough version, or do you mean something else?
> >
> > The way to fix both that and the differing available functions would
> > probably be to have a subclass of the dialect for each server version.
> > MySQL seems to have about 5 :)
> > http://www.hibernate.org/hib_docs/v3/api/org/hibernate/dialect/package-summary.html.
> 
> I'm thinking the same thing. We could let PostgreSQLDialect to do full
> support to Postgre 7.x and extend it to support the new
> features/functions in Postgre 8.x. Btw, to do that, one thing that we
> must do is identify what functions are new/still avaiable in 8.x. That
> approach is good too, because we can get different behaviors in each
> version of the database.

I've posted files to pgsql-patches, as well as to Diego directly.

There are 3 files
PostgreSQL8Dialect.java which implements enough for 8.0 and 8.1
PostgreSQL82Dialect.java which extends PostgreSQL8Dialect
PostgreSQL83Dialect.java which extends PostgreSQL8Dialect

PostgreSQL8Dialect is not provided as a patch because the extensions
have all been re-ordered to match the underlying sequence and grouping
in the base Dialect file. Checking it should be much easier now.

I've not checked 7.x compatibility

We can then push out a new file every release.

Notes:
- Not sure when getCascadeConstraintsString() gets called, so left it
unset

- Not added any keywords. Some Dialects add a few keywords, but there
doesn't seem to be any pattern to it. Any advice?

- GUID support is possible, but really opens up the debate about how
extensibility features should be handled. 

- For now, I think we should document the procedure for adding a local
site Dialect which implements additional functions, with GUID as an
example

Comments?

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 6: explain analyze is your friend

[HACKERS] Clarification reqeusted for "select * from a huge table"

2007-11-12 Thread Gokulakannan Somasundaram

Hi,
I had a chance to test one of the real world cases with Oracle and
PostgreSQL. Create a Table with 10 million rows (i worked on a 1GB RAM
machine) both in oracle and Postgresql. Just write a JDBC program for
a 'select *' on that table. With PostgreSQL as backend, java crashes
saying that it has met 'Out Of Memory'. With Oracle it doesn't.
Postgres tried to send all the results back to the client at one
shot, whereas in Oracle it works like a Cursor. Is this issue already
well known among hackers community? If known, why is it designed this
way?

I also noticed that it doesn't crash with psql, but it takes a
long time to show the first set of records. It takes a long time, even
to quit after i pressed 'q'.
   With oracle SQLPlus, it is quite instantaneous.

-- 
Thanks,
Gokul.
CertoSQL Project,
Allied Solution Group.
(www.alliedgroups.com)

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-11-12 Thread Dave Page

Magnus Hagander wrote:
> I'm certainly not convinved about that either, but we should make a test on
> a VM at some point.
> 
> Sophos AV has plugins into for example the explorer (I assume - most AV
> does, haven't used Sophos specifically myself), which may be done with
> extra DLLs loading along with user32.dll (runtime linked) or something like
> that. I just want to be sure we exclude that possibility.

Yeah, iirc it does. I don't have time for that at the moment, but I can
fire you a copy of my test code if you do.

/D

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-11-12 Thread Magnus Hagander

On Mon, Nov 12, 2007 at 10:01:09AM +, Dave Page wrote:
> Magnus Hagander wrote:
> >> As for desktop heap, only 65KB of the service heap was allocated, or
> >> about 80 bytes per connection.  No danger of hitting limits in the
> >> kernel memory pools either.
> > 
> > As Dave said, it could be that the server version uses a lot less heap per
> > process, which would be another good reason to use server rather than XP to
> > run postgresql. But might there also be other differences, such as some
> > third party (or non-core microsoft) product installed? 
> > 
> > Dave, on your XP test, was that on a clean XP with nothing like AV or any
> > 3rd party stuff on it?
> 
> No, it was on my XP laptop which runs Sophos AV. I'm not convinced it's
> AV related though - in my test code I proved pretty conclusively that
> just initialising user32.dll ate the desktop heap.

I'm certainly not convinved about that either, but we should make a test on
a VM at some point.

Sophos AV has plugins into for example the explorer (I assume - most AV
does, haven't used Sophos specifically myself), which may be done with
extra DLLs loading along with user32.dll (runtime linked) or something like
that. I just want to be sure we exclude that possibility.

//Magnus

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Problem to configure pg8.3b2 w/ ossp-uuid-support on OS X

2007-11-12 Thread Jörg Beyer

Am 11.11.2007 17:36 Uhr schrieb Tom Lane (<[EMAIL PROTECTED]>):

> J=?ISO-8859-1?B?9g==?=rg Beyer <[EMAIL PROTECTED]> writes:
>> -- Binaries and libraries installed _by the OS itself_ reside in /usr (e.g.
>> uuid, libxml2, libxslt)
> 
> What I see on my OSX machines is /usr/include/uuid/uuid.h and no sign of
> anything uuid-related in /usr/lib.  (Apparently the functions declared
> by uuid/uuid.h reside directly in libc.)  This seems to be a
> DCE-compatible library not the OSSP one.  AFAICT our configure script
> completely ignores the presence of uuid/uuid.h.
> 
> So I'm wondering what exactly you've got in /usr and where it came from.
> 
> regards, tom lane

Ah, O.K., sorry, I tried not to be too verbose.
'/usr' should read as 'the respective subdirectories in /usr'.

To be precise, for  uuid  I have:
  /usr/bin/uuidgen  [the CL tool, which should be of no interest here]
  /usr/include/uuid/uuid.h

and that's the same as you have. Nothing else uuid-related in /usr.
And yes, AFAICT, uuid-functionalities (a) are part of libc, (b) appear to be
DCE compatible. 

I have installed _ossp-uuid_ with --prefix=usr/local, so my question is not,
why pg's configure script ignores OS X's uuid stuff. I'm looking for a way
to let configure recognize the ossp-uuid library.

Regards 

Jörg Beyer

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-11-12 Thread Dave Page

Magnus Hagander wrote:
>> As for desktop heap, only 65KB of the service heap was allocated, or
>> about 80 bytes per connection.  No danger of hitting limits in the
>> kernel memory pools either.
> 
> As Dave said, it could be that the server version uses a lot less heap per
> process, which would be another good reason to use server rather than XP to
> run postgresql. But might there also be other differences, such as some
> third party (or non-core microsoft) product installed? 
> 
> Dave, on your XP test, was that on a clean XP with nothing like AV or any
> 3rd party stuff on it?

No, it was on my XP laptop which runs Sophos AV. I'm not convinced it's
AV related though - in my test code I proved pretty conclusively that
just initialising user32.dll ate the desktop heap.

/D

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-11-12 Thread Magnus Hagander

On Sat, Nov 10, 2007 at 03:17:13PM -0800, Trevor Talbot wrote:
> On 10/26/07, I wrote:
> > On 10/26/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
> >
> > > Can you try the attached patch? See how many backends you can get up to.
> > >
> > > This patch changes from using a single thread for each backend started to
> > > using the builtin threadpool functionality. It also replaces the 
> > > pid/handle
> > > arrays with an i/o completion port. The net result is also, imho, much 
> > > more
> > > readable code :-)
> >
> > The patch looks good; I'm not set up to build yet, but I should be
> > able to test it sometime in the next week.
> 
> Sorry about the long delay; I retested with the 8.3-beta2 installer,
> still Win2003 SP2 32bit.
> 
> I stopped the test at 824 connections because I was about to run out
> of memory (1.25GB RAM + 3.75GB swap), but postmaster VM space usage
> was only 191MB.

Great.
I'm thinking this change may be big enough to actually backport to 8.2 -
what to others feel about that?

Assuming it is, I still think we should wait at least until we've run 8.3
RC for a while - probably until 8.3 has been actually released and run for
a while, to make sure we have a *lot* of testing of it before we consider
backpatching.

> As for desktop heap, only 65KB of the service heap was allocated, or
> about 80 bytes per connection.  No danger of hitting limits in the
> kernel memory pools either.

As Dave said, it could be that the server version uses a lot less heap per
process, which would be another good reason to use server rather than XP to
run postgresql. But might there also be other differences, such as some
third party (or non-core microsoft) product installed? 

Dave, on your XP test, was that on a clean XP with nothing like AV or any
3rd party stuff on it?

> Available RAM seems like a pretty reasonable limit to me ;)

Yeah, not much we can do about that one :-)

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] proposal casting from XML[] to int[], numeric[], text[]

2007-11-12 Thread Pavel Stehule

> >
> > You're right, that's my mistake, sorry. So, having casting rules seems
> > to be the only option..
> >
>
>
> We can already cast as text[], and so we can do this:
>
> andrew=# select
> xpath('//foo/text()','12')::text[]::int[];
>  xpath
> ---
>  {1,2}
> (1 row)
>
>
> So why do we desperately need anything extra at all?
>

I was blind. My problem was with function index over xml array, that
isn't indexable.

I didn't find multiple casting.

Regards
Pavel

> cheers
>
> andrew
>

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] How to keep a table in memory?

2007-11-12 Thread Heikki Linnakangas


adrobj wrote:

I have a pretty small table (~20MB) that is accessed very frequently and
randomly, so I want to make sure it's 100% in memory all the time. There is
a lot of other staff that's also gets accessed frequently, so I don't want
to just hope that Linux file cache would do the right thing for me.

Is there any way to do that?

One idea I have in my head is to start a process that does mmap() and
mlock() with the table file. Will it work? If so, are there any potential
problems?


Just leave it to the cache management algorithms in Postgres and Linux. 
If it really is frequently accessed, it should stay in Postgres shared 
buffers.


You can use the pg_buffercache contrib module to see what's in cache.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [HACKERS] [hibernate-team] PostgreSQLDialect

2007-11-12 Thread Simon Riggs

On Sun, 2007-11-11 at 23:38 -0500, Andrew Dunstan wrote:

> Moreover, Postgres is extensible, so ideally Hibernate should look at 
> providing a way of querying a database server to get a list of supported 
> function signatures.
> 
> Not sure how you could handle user defined types automatically, though. 
> Probably not.

The Hibernate Dialect is extensible, so it seems we can do it the other
way around. Generate a Hibernate dialect for a particular database
installation, then use it from Hibernate as if it was a static/manual
configuration.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] proposal casting from XML[] to int[], numeric[], text[]

2007-11-12 Thread Andrew Dunstan




Nikolay Samokhvalov wrote:

On Nov 12, 2007 12:59 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
  

I'm not clear on what you're proposing.  There is no such thing as an
opclass with no operators (or at least, not a useful one), so this seems
mutually contradictory.

regards, tom lane




You're right, that's my mistake, sorry. So, having casting rules seems
to be the only option..
  



We can already cast as text[], and so we can do this:

andrew=# select 
xpath('//foo/text()','12')::text[]::int[];

xpath
---
{1,2}
(1 row)


So why do we desperately need anything extra at all?

cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

[HACKERS] How to keep a table in memory?

2007-11-12 Thread adrobj


I have a pretty small table (~20MB) that is accessed very frequently and
randomly, so I want to make sure it's 100% in memory all the time. There is
a lot of other staff that's also gets accessed frequently, so I don't want
to just hope that Linux file cache would do the right thing for me.

Is there any way to do that?

One idea I have in my head is to start a process that does mmap() and
mlock() with the table file. Will it work? If so, are there any potential
problems?
-- 
View this message in context: 
http://www.nabble.com/How-to-keep-a-table-in-memory--tf4789293.html#a13700771
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

94 matches

Mail list logo