Re: [HACKERS] Reducing stats collection overhead

2007-07-31 Thread Arjen van der Meijden

On 31-7-2007 5:07 Alvaro Herrera wrote:

Arjen van der Meijden wrote:
Afaik Tom hadn't finished his patch when I was testing things, so I don't 
know. But we're in the process of benchmarking a new system (dual quad-core 
Xeon) and we'll have a look at how it performs in the postgres 8.2dev we 
used before, the stable 8.2.4 and a fresh HEAD-checkout (which we'll call 
8.3dev). I'll let you guys (or at least Tom) know how they compare in our 
benchmark.


So, ahem, did it work? :-)


The machine turned out to have a faulty mainboard, so we had to 
concentrate on first figuring out why it was unstable and then whether 
the replacement mainboard did make it stable in a long durability 
test Of course that behaviour only appeared with mysql and not with 
postgresql, so we had to run our mysql-version of the benchmark a few 
hundred times, rather than testing various versions, untill the machine 
had to go in production.


So we haven't tested postgresql 8.3dev on that machine, sorry.

Best regards,

Arjen





On 18-5-2007 15:12 Alvaro Herrera wrote:

Tom Lane wrote:

Arjen van der Meijden told me that according to the tweakers.net
benchmark, HEAD is noticeably slower than 8.2.4, and I soon confirmed
here that for small SELECT queries issued as separate transactions,
there's a significant difference.  I think much of the difference stems
from the fact that we now have stats_row_level ON by default, and so
every transaction sends a stats message that wasn't there by default
in 8.2.  When you're doing a few thousand transactions per second
(not hard for small read-only queries) that adds up.

So, did this patch make the performance problem go away?





---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] Machine available for community use

2007-07-31 Thread Devrim GÜNDÜZ
Hi,

On Tue, 2007-07-31 at 01:54 -0400, Tom Lane wrote:
  Really? Are the compiler options, etc, public?
 
 Certainly.  If you doubt it, try comparing pg_config output for the
 RHEL and CentOS packages. 

As I wrote before, I used PGDG packages for both -- What I'm suspecting
is the other packages like kernel, etc. 

  BTW, they were stock 4.3 -- no updates, etc.
 
 RHEL 4.3 was obsoleted more than a year ago, so I'd like to think that
 nobody finds no update comparisons to be very relevant today ...

I was referring to 4.3 isos of both distros, with no updates by that
time.

Regards,
-- 
Devrim GÜNDÜZ
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, ODBCng - http://www.commandprompt.com/




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] default_text_search_config and expression indexes

2007-07-31 Thread Oleg Bartunov

On Mon, 30 Jul 2007, Bruce Momjian wrote:


Oleg Bartunov wrote:

OK, here is what I am thinking.  If we make default_text_search_config
super-user-only, then the user can't do SET (using zero_damaged_pages
as a superuser-only example):

test= set zero_damaged_pages = on;
ERROR:  permission denied to set parameter zero_damaged_pages

test= alter user guest set zero_damaged_pages = on;
ERROR:  permission denied to set parameter zero_damaged_pages

but the super-user can set it in postgresql.conf, or:

test=# alter user guest set zero_damaged_pages = on;
ALTER ROLE

or

test=# alter database vendor3 set zero_damaged_pages = on;
ALTER ROLE

meaning while it will be super-user-only, the administrator can set the
default for specific databases and users.  Is that the best approach?

A user can still over-ride the default by specifying the configuration
in the function call.


This is ok, but it will not work in hosting environment and still
doesn't prevent errors.


Agreed.  super-user-only now seems strange to me because it isn't a
security issue, but rather an attempt to avoid people causing errors.

The fundamental issue is that if you do a query using tsvector and
tsquery everything will work find because default_text_search_config
will be the same for both queries.  The problem is if do an expression
index lookup that doesn't specify the configuration name and your
default_text_search_config doesn't match the index, or you INSERT or
UPDATE into an expression index with a mismatched
default_text_search_config.

If we do make default_text_search_config super-user-only it prevents a
database owner from doing ALTER DATABASE db1 SET
default_text_search_config = 'english', which seems like a pretty big
limitation because I think per-database default_text_search_config makes
the most sense.

And, again, if you specify the configuration in the expression index you
have to specify it in the WHERE clause and then
default_text_search_config is pretty useless.


agree. Notice, this is very limited usage case.



If we required the configuration to always be specified, you could still
store multiple configurations in the same column by having a secondary
column hold the configuration name:


I don't understand this. Let's don't discuss indexes at all, since indexes
doesn't know about configuratons at all



CREATE INDEX i on x USING gist (to_tsvector(config_col, body));


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Machine available for community use

2007-07-31 Thread Dawid Kuroczko
On 7/31/07, Devrim GÜNDÜZ [EMAIL PROTECTED] wrote:
 Hi,

 On Mon, 2007-07-30 at 19:14 -0700, Joshua D. Drake wrote:
   and RHEL performed much better than CentOS.
 
  Not to be unkind, but I doubt that on an identical configuration.

 Since I don't have the permission to distribute the benchmark results, I
 will be happy to spend time for re-running these tests if someone
 provides me an identical machine.

 Each test took 1-2 days -- I will insist that CentOS performs poorer
 than RHEL.

Would it be possibe to include Unbreakable Linux in such test?
Out of curiosity of course. :-)

   Regards,
  Dawid

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [GENERAL] ascii() for utf8

2007-07-31 Thread Alvaro Herrera
Decibel! wrote:
 Moving to -hackers.

 On Jul 27, 2007, at 1:22 PM, Stuart wrote:
 Does Postgresql have a function like ascii() that will
 return the unicode codepoint value for a utf8 character?
 (And symmetrically same for question chr() of course).

 I suspect that this is just a matter of no one scratching the itch. I 
 suspect a patch would be accepted, or you could possibly put something on 
 pgFoundry.

Nay; there were some discussions about this not long ago, and I think
one conclusion you could draw from them is that many people want these
functions in the backend.

 I'd set it up so that ascii() and chr() act according to the 
 appropriate locale setting (I'm not sure which one would be appropriate).

I don't see why any of them would react to the locale, but they surely
must honor client encoding.

-- 
Alvaro Herrera   http://www.PlanetPostgreSQL.org/
I dream about dreams about dreams, sang the nightingale
under the pale moon (Sandman)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Quick idea for reducing VACUUM contention

2007-07-31 Thread Decibel!

On Jul 30, 2007, at 8:00 PM, Alvaro Herrera wrote:

ITAGAKI Takahiro wrote:

Alvaro Herrera [EMAIL PROTECTED] wrote:
I think we might need additional freezing-xmax operations to  
avoid
XID-wraparound in the first path of vacuum, though it hardly  
occurs.


I'm not sure I follow.  Can you elaborate?  Do you mean storing a
separate relfrozenxmax for each table or something like that?


We need to work around wraparound of xmax in dead tuples. If we  
miss to

vacuum them and XID is wrapped, we cannot remove them until the next
XID-wraparound, because we treat them to be deleted in the *future*.


Oh, but this should not be a problem, because a tuple is either frozen
or removed completely -- xmax cannot precede xmin.


What if it's frozen, then deleted, and then we wrap on xmax? Wouldn't  
that make the tuple re-appear?

--
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Quick idea for reducing VACUUM contention

2007-07-31 Thread Alvaro Herrera
Decibel! wrote:
 On Jul 30, 2007, at 8:00 PM, Alvaro Herrera wrote:
 ITAGAKI Takahiro wrote:
 Alvaro Herrera [EMAIL PROTECTED] wrote:
 I think we might need additional freezing-xmax operations to avoid
 XID-wraparound in the first path of vacuum, though it hardly occurs.

 I'm not sure I follow.  Can you elaborate?  Do you mean storing a
 separate relfrozenxmax for each table or something like that?

 We need to work around wraparound of xmax in dead tuples. If we miss to
 vacuum them and XID is wrapped, we cannot remove them until the next
 XID-wraparound, because we treat them to be deleted in the *future*.

 Oh, but this should not be a problem, because a tuple is either frozen
 or removed completely -- xmax cannot precede xmin.

 What if it's frozen, then deleted, and then we wrap on xmax? Wouldn't that 
 make the tuple re-appear?

That cannot happen, because the next vacuum will remove the tuple if the
Xmax is committed.  If the deleting transaction aborts, then vacuum will
set Xmax to Invalid (see heap_freeze_tuple in heapam.c).

One potential problem you would see is if the deleting transaction marks
it deleted and then not commit for 2 billion transactions, thus vacuum
is not able to remove it because it shows up as delete-in-progress.
However there are plenty other problems you would hit in that case
(autovacuum starting to misbehave being the first you would probably
notice).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Reducing Transaction Start/End Contention

2007-07-31 Thread Alvaro Herrera
Simon Riggs wrote:

 1. Increase size of Clog-specific BLCKSZ

 2. Perform ExtendClog() as a background activity

 (1) and (2) can be patched fairly easily for 8.3. I have a prototype
 patch for (1) on the shelf already from 6 months ago.

Hmm, I think (1) may be 8.3 material but all the rest are complex enough
that being left for 8.4 is called for.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Quick idea for reducing VACUUM contention

2007-07-31 Thread Decibel!

On Jul 30, 2007, at 1:47 PM, Alvaro Herrera wrote:

Jim Nasby wrote:

On Jul 27, 2007, at 1:49 AM, Alvaro Herrera wrote:

ITAGAKI Takahiro wrote:
It would be cool if we could do something like sweep a range of  
pages,
initiate IO for those that are not in shared buffers, and while  
that is
running, lock and clean up the ones that are in shared buffers,  
skipping

those that are not lockable right away; when that's done, go back to
those buffers that were gotten from I/O and clean those up.  And  
retry


Would that be substantially easier than just creating a bgreader?


I'm not sure about easier, but I'm not sure that the bgreader can  
do the

same job.  ISTM that the bgreader would be mostly in charge of reading
in advance of backends, whereas what I'm proposing is mostly about
finding the best spot for locking.  It might turn out to be more  
trouble
than it's worth though, for sure.  And in any case I'm not in a  
hurry to

implement it.


I was referring specifically to the read in what's not already in  
shared buffers part of Itagaki-san's message... that seems to be  
something best suited for a bgreader.

--
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Machine available for community use

2007-07-31 Thread Greg Smith

On Mon, 30 Jul 2007, Devrim G?ND?Z wrote:


I have performed a test using OSDL test suite a few months ago on a
system that has:
* 8 x86_64 CPUs @ 3200.263...
and RHEL [4.3] performed much better than CentOS [4.3]


RHEL 4 update 3 included some reworking of the x86_64 kernel, like adding 
the kernel-largesmp for many CPU systems.  I would not be surprised to 
find that the first CentOS release based on that may not have achieved a 
perfect rebuild because of all that, and since you didn't do any updates 
from the initial ISO images you were basically running the CentOS beta for 
that feature set.


I think it's accurate to say sometimes CentOS releases have bugs that 
make them perform worse than the RHEL they're derived from, and would not 
dispute your results accordingly.  I've seen fuzzy periods where CentOS 
had a release out to match a new RHEL version, but it wasn't quite right 
until after CentOS released an update or two.  There can be some lag 
there, particularly in the period after a new major release.  Right now, 
for example, I still don't completely trust the CentOS build based on the 
recent RHEL 5, and have been following the developer mailing lists to get 
a feel for when things have settled down.  It is one of the risks that 
goes along with using CentOS, and removing it by using a genuine RHEL 
certainly has value.


At the same time, I've done a fair amount of benchmarking work on machines 
that switched from RHEL-CentOS where performance was completely 
identical.  I'd need to see a lot more than one test result suggesting 
otherwise before I'd believe that CentOS is slower in general than the 
RHEL it's derived from.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] [PATCHES] allow CSV quote in NULL

2007-07-31 Thread Tom Lane
Stephen Frost [EMAIL PROTECTED] writes:
 Other, unrelated, options being or not being there doesn't really have
 any bearing on this though.  I'm not inventing new syntax here.  I'm
 just removing a restriction on what the user can do that doesn't need
 to exist.

I don't think you're just removing a restriction.  What you're doing
is exposing a whole lot of strange and arguably broken corner cases.
If we accept this patch I think we'll be fielding bug reports as a
result for years to come.  I *especially* dislike the part about
allowing the delimiter character in the null string --- that will allow
people to complain about the order in which decisions are made.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] stats_block_level

2007-07-31 Thread Simon Riggs
On Tue, 2007-07-31 at 12:33 -0400, Alvaro Herrera wrote:
 Tom Lane wrote:
  Alvaro Herrera [EMAIL PROTECTED] writes:
   I agree.  Let's remove stats_start_collector and merge the other two
   into a single setting.  Anything more than that is overkill.
  
  So what are we going to call the one surviving GUC variable?
 
 collect_stats

In the patch recently submitted, I opted for stats_collection.

Methinks it should be: stats_something, so that people find it in the
same place as stats_query_string, which is still there.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] allow CSV quote in NULL

2007-07-31 Thread Andrew Dunstan



Tom Lane wrote:

Stephen Frost [EMAIL PROTECTED] writes:
  

Other, unrelated, options being or not being there doesn't really have
any bearing on this though.  I'm not inventing new syntax here.  I'm
just removing a restriction on what the user can do that doesn't need
to exist.



I don't think you're just removing a restriction.  What you're doing
is exposing a whole lot of strange and arguably broken corner cases.
If we accept this patch I think we'll be fielding bug reports as a
result for years to come.  I *especially* dislike the part about
allowing the delimiter character in the null string --- that will allow
people to complain about the order in which decisions are made.


  


Yeah, if you allow the delimiter in the null string, what do you do if 
it's not quoted? I can't imagine what the real world case for that could 
possibly be.


Even if there's an arguable case for allowing the quote char in a null 
string (and as I indicated upthread I really think the problem being 
addressed here could be solved in a far better fashion) there is surely 
no good case for allowing the delimiter. Oh, and if we did allow the 
quote char we should surely only allow it on input - just because other 
programs produce absurd output there is not reason we should.


cheers

andrew



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reducing Transaction Start/End Contention

2007-07-31 Thread Stefan Kaltenbrunner
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
 Simon Riggs wrote:
 1. Increase size of Clog-specific BLCKSZ
 
 2. Perform ExtendClog() as a background activity
 
 (1) and (2) can be patched fairly easily for 8.3. I have a prototype
 patch for (1) on the shelf already from 6 months ago.
 
 Hmm, I think (1) may be 8.3 material but all the rest are complex enough
 that being left for 8.4 is called for.
 
 NONE of this is 8.3 material.  Full stop.  Try to keep your eyes on the
 ball people --- 8.3 is already months past feature freeze.

yeah - we have still 12(!) open items on the PatchStatus board:

http://developer.postgresql.org/index.php/Todo:PatchStatus

and at least half of them are in need of reviewer capacity(and some of
them there for nearly half a year).


Stefan

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] stats_block_level

2007-07-31 Thread Alvaro Herrera
Simon Riggs wrote:
 On Tue, 2007-07-31 at 12:33 -0400, Alvaro Herrera wrote:
  Tom Lane wrote:
   Alvaro Herrera [EMAIL PROTECTED] writes:
I agree.  Let's remove stats_start_collector and merge the other two
into a single setting.  Anything more than that is overkill.
   
   So what are we going to call the one surviving GUC variable?
  
  collect_stats
 
 In the patch recently submitted, I opted for stats_collection.

I think we tend to give emphasis to the verb rather than the noun, e.g.
redirect_stderr, log_connections.


FWIW I just noticed we have a variable named krb_caseins_users which I
think is not such a great name for it.  Prolly best to change it now
while it's still in the oven.

 Methinks it should be: stats_something, so that people find it in the
 same place as stats_query_string, which is still there.

Hum, but the order in postgresql.conf is arbitrary, right?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Machine available for community use

2007-07-31 Thread Gavin M. Roy
It's actually in Texas, and we have no intention to put a time limit
on its availability. I think the availability will be there as long as
there is use and we're in the Texas data center, which I don't see
ending any time soon.

On 7/31/07, Josh Berkus [EMAIL PROTECTED] wrote:
 Gavin,

 I'm actually in the middle of assembling a general performance test lab for
 the PostgreSQL hackers, using equipment donated by Sun, Hi5, and (hopefully)
 Unisys and Intel.  While your machine would obviously stay in Pennsylvania,
 it would be cool if we could somehow arrange a unified authentication 
 booking system.

 I'm pretty sure I can even raise money to get one created.

 How long will this system remain available to us?

 --
 Josh Berkus
 PostgreSQL @ Sun
 San Francisco


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] stats_block_level

2007-07-31 Thread Simon Riggs
On Tue, 2007-07-31 at 13:06 -0400, Alvaro Herrera wrote:
 Simon Riggs wrote:
  On Tue, 2007-07-31 at 12:33 -0400, Alvaro Herrera wrote:
   Tom Lane wrote:
Alvaro Herrera [EMAIL PROTECTED] writes:
 I agree.  Let's remove stats_start_collector and merge the other two
 into a single setting.  Anything more than that is overkill.

So what are we going to call the one surviving GUC variable?
   
   collect_stats
  
  In the patch recently submitted, I opted for stats_collection.
 
 I think we tend to give emphasis to the verb rather than the noun, e.g.
 redirect_stderr, log_connections.
 
 
 FWIW I just noticed we have a variable named krb_caseins_users which I
 think is not such a great name for it.  Prolly best to change it now
 while it's still in the oven.
 
  Methinks it should be: stats_something, so that people find it in the
  same place as stats_query_string, which is still there.
 
 Hum, but the order in postgresql.conf is arbitrary, right?

Yes, though the order in 'show all' is alphabetical.

However, I agree with your comment on verb first, so lets do
collect_stats.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Machine available for community use

2007-07-31 Thread Josh Berkus
Folks,

Hey, this is looking like a serious case of Bike Shedding.  That is, a dozen 
people are arguing about what color to paint the bike shed instead of getting 
it built.[1]

Given that there are much more substantial issues: what performance software 
to install and how to install it, how to set up authentication and 
time-sharing for running tests, whether we can set up automated perf testing,  
getting money so some of our unfunded performance developers can work on it, 
etc., is the which Linux distro question worth spending our time on?

[1] http://www.bikeshed.com/

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] stats_block_level

2007-07-31 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Simon Riggs wrote:
 Methinks it should be: stats_something, so that people find it in the
 same place as stats_query_string, which is still there.

 Hum, but the order in postgresql.conf is arbitrary, right?

I concur with Simon that it should have some relationship to
stats_query_string.  However, stats_collection doesn't appeal to me
because that sounds like it would subsume stats_query_string (it seems
like a master control toggle, as stats_start_collector used to be).
Maybe something like stats_count_events?

Or we could get radical and rename both of them...

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [PATCHES] allow CSV quote in NULL

2007-07-31 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 ... Oh, and if we did allow the 
 quote char we should surely only allow it on input - just because other 
 programs produce absurd output there is not reason we should.

Yeah.  The *real* problem with the patch as proposed is that it allows a
COPY OUT to emit a file that cannot be reloaded correctly, even given
the same options used to prepare it.  I think that the restrictions were
put there more to prevent that scenario than to restrict COPY IN.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] stats_block_level

2007-07-31 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 FWIW I just noticed we have a variable named krb_caseins_users which I
 think is not such a great name for it.  Prolly best to change it now
 while it's still in the oven.

You're two releases too late for that one :-(

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Machine available for community use

2007-07-31 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 Hey, this is looking like a serious case of Bike Shedding.  That is, a 
 dozen 
 people are arguing about what color to paint the bike shed instead of getting 
 it built.[1]

FWIW, it's looking like Red Hat will donate a RHEL/RHN subscription if
we want one, though I don't have final approval quite yet.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] stats_block_level

2007-07-31 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  FWIW I just noticed we have a variable named krb_caseins_users which I
  think is not such a great name for it.  Prolly best to change it now
  while it's still in the oven.
 
 You're two releases too late for that one :-(

Doh, I thought it was new in the GSSAPI code.  Sorry.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] stats_block_level

2007-07-31 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Simon Riggs wrote:
  Methinks it should be: stats_something, so that people find it in the
  same place as stats_query_string, which is still there.
 
  Hum, but the order in postgresql.conf is arbitrary, right?
 
 I concur with Simon that it should have some relationship to
 stats_query_string.  However, stats_collection doesn't appeal to me
 because that sounds like it would subsume stats_query_string (it seems
 like a master control toggle, as stats_start_collector used to be).
 Maybe something like stats_count_events?
 
 Or we could get radical and rename both of them...

Well, it is a bit misleading to have the query_string stuff be named
stats when it's not actually collected by pgstats at all.  Maybe
rename it to collect_query_string.  With the other name being
collect_stats, they would show up together in SHOW ALL.

I am not sure about using plural/singular though: why isn't it
stats_query_strings instead?  (We do have log_connections etc).

-- 
Alvaro Herrera  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
Porque francamente, si para saber manejarse a uno mismo hubiera que
rendir examen... ¿Quién es el machito que tendría carnet?  (Mafalda)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] default_text_search_config and expression indexes

2007-07-31 Thread Oleg Bartunov

On Mon, 30 Jul 2007, Bruce Momjian wrote:


Bruce Momjian wrote:

We have to decide if we want a GUC default_text_search_config, and if so
when can it be changed.

Right now there are three ways to create a tsvector (or tsquery)

::tsvector
to_tsvector(value)
to_tsvector(config, value)

(ignoring plainto_tsvector)

Only the last one specifies the configuration. The others use the
configuration specified by default_text_search_config.  (We had an
previous discussion on what the default value of
default_text_search_config should be, and it was decided it should be
set via initdb based on a flag or the locale.)

Now, because most people use a single configuration, they can just set
default_text_search_config and there is no need to specify the
configuration name.

However, expression indexes cause a problem here:


http://momjian.us/expire/fulltext/HTML/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX

We recommend that users create an expression index on the column they
want to do a full text search on, e.g.

CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector(body));

However, the big problem is that the expressions used in expression
indexes should not change their output based on the value of a GUC
variable (because it would corrupt the index), but in the case above,
default_text_search_config controls what configuration is used, and
hence the output of to_tsvector is changed if default_text_search_config
changes.

We have a few possible options:

1) Document the problem and do nothing else.
2) Make default_text_search_config a postgresql.conf-only
   setting, thereby making it impossible to change by non-super
   users, or make it a super-user-only setting.
3) Remove default_text_search_config and require the
   configuration to be specified in each function call.

If we remove default_text_search_config, it would also make ::tsvector
casting useless as well.


OK, I just found a case that I think is going to make #3 a requirement
(remove default_text_search_config).

How is a CREATE INDEX ... to_tsvector(col) going to restore from a
pg_dump?  I see no way of guaranteeing that the
default_text_search_config is correct on the restore, and in fact I
don't think we have any way of knowing the default_text_search_config
used for the index.

And if we have to require the configuration name in CREATE INDEX, it has
to be used in WHERE, so we might as well just remove the default
capability and always require the configuration name.


this is very rare use case for text searching 
1. expression index without configuration name

2. default_text_search_config can be changed by somebody

If somebody really need it, then he should be adviced to use configuration 
name, else we don't guarantee that somebody could change 
default_text_search_config  variable and this could lead to 
incorrect dump/restore.


I don't think we should remove default_text_search_config because of 
this rare case.


Regards,
Oleg

PS.

Bruce, I'm in the mountains the Northern Caucasia and internet is
a bit  unreliable :(

_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] stats_block_level

2007-07-31 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Or we could get radical and rename both of them...

 Well, it is a bit misleading to have the query_string stuff be named
 stats when it's not actually collected by pgstats at all.  Maybe
 rename it to collect_query_string.  With the other name being
 collect_stats, they would show up together in SHOW ALL.

query_string is pretty misleading these days too, since pg_stat_activity
includes a lot more than the bare query string.

If we were doing this on a blank slate I would suggest track_stats
and track_activities, but that might be too different from what
people are used to.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] default_text_search_config and expression indexes

2007-07-31 Thread Bruce Momjian
Oleg Bartunov wrote:
  If we remove default_text_search_config, it would also make ::tsvector
  casting useless as well.
 
  OK, I just found a case that I think is going to make #3 a requirement
  (remove default_text_search_config).
 
  How is a CREATE INDEX ... to_tsvector(col) going to restore from a
  pg_dump?  I see no way of guaranteeing that the
  default_text_search_config is correct on the restore, and in fact I
  don't think we have any way of knowing the default_text_search_config
  used for the index.
 
  And if we have to require the configuration name in CREATE INDEX, it has
  to be used in WHERE, so we might as well just remove the default
  capability and always require the configuration name.
 
 this is very rare use case for text searching 
 1. expression index without configuration name
 2. default_text_search_config can be changed by somebody

If you are going to be using the configuration name with the create
expression index, you have to use it in the WHERE clause (or the index
doesn't work), and I assume that is 90% of the text search uses.  I
don't see it as rare at all.

 If somebody really need it, then he should be adviced to use configuration 
 name, else we don't guarantee that somebody could change 
 default_text_search_config  variable and this could lead to 
 incorrect dump/restore.
 
 I don't think we should remove default_text_search_config because of 
 this rare case.

I still feel the default_text_search_config has to be removed.  We have
tried all sorts of ways to make it work but having it not be 100%
reliable for pg_dump/restore means it might as well be in /contrib and
unsupported.  If we have it in core, it has to work 100%.  We can't have
tons of examples that don't specify the configuration name and then
expect every create expression index and WHERE clause to use it. 
default_text_search_config _can_ work, but it seems so easy to break and
so easy to get wrong that I think it must be removed.

If we are going to keep it, I need someone to explain why my comments
above are wrong.  If I am right, someone has to remove
default_text_search_config from the patch.   I can do the documentation.

 Bruce, I'm in the mountains the Northern Caucasia and internet is
 a bit  unreliable :(

Thanks.  I noticed  a lag in your reply.  Hope you are having a good
time.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Machine available for community use

2007-07-31 Thread Greg Smith

On Tue, 31 Jul 2007, Josh Berkus wrote:

That is, a dozen people are arguing about what color to paint the bike 
shed instead of getting it built.


Until there's an OS installed on it and it's on a network, the machine 
essentially doesn't exist--so there was no way to work on the 
building--and there was a clearly a gap between what Gavin was planning to 
do and what the aggregate hacker community wanted.  If there's a 
bike-shedding analogy here, the argument has been about what type of 
foundation to build the shed on.  The design of the shed itself may be 
much more complicated than that part, but if you put it someplace that's 
not level you may not ever get what you wanted no matter how much work you 
put into it later.  That's why I thought it was important to at least talk 
through the Linux distribution topic, so everyone was aware of the 
trade-offs involved.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] allow CSV quote in NULL

2007-07-31 Thread Stephen Frost
* Tom Lane ([EMAIL PROTECTED]) wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  ... Oh, and if we did allow the 
  quote char we should surely only allow it on input - just because other 
  programs produce absurd output there is not reason we should.
 
 Yeah.  The *real* problem with the patch as proposed is that it allows a
 COPY OUT to emit a file that cannot be reloaded correctly, even given
 the same options used to prepare it.  I think that the restrictions were
 put there more to prevent that scenario than to restrict COPY IN.

erp.  My apologies, I hadn't ever intended for this to be used with COPY
OUT.  For some reason I had thought my changes were isolated to the COPY
CSV IN path.  I'd be happy to adjust the patch to only accept the
quote-in-null syntax when doing a COPY CSV IN.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] stats_block_level

2007-07-31 Thread Peter Eisentraut
Alvaro Herrera wrote:
 Well, it is a bit misleading to have the query_string stuff be named
 stats when it's not actually collected by pgstats at all.

By now, the statistics collector is unnoticeable to most users, since 
it's always on and you never have to do anything about it.  The fact 
that not all things called statistics are managed by it should be 
pretty irrelevant.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] stats_block_level

2007-07-31 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Or we could get radical and rename both of them...

 Well, it is a bit misleading to have the query_string stuff be named
 stats when it's not actually collected by pgstats at all.  Maybe
 rename it to collect_query_string.  With the other name being
 collect_stats, they would show up together in SHOW ALL.

 query_string is pretty misleading these days too, since pg_stat_activity
 includes a lot more than the bare query string.

FWIW I find having both the stats collector and the stats that analyze
generates (ie, stats target) confusing.

Really stats doesn't describe what information it's gathering, just that
it's gathering some kind of information. Perhaps we should think of a term
that describes what kind of information that is. collect_io_stats or
collect_event_stats or something like that? Or even something without the
word stats at all. But I can't think of anything good without it.

 If we were doing this on a blank slate I would suggest track_stats
 and track_activities, but that might be too different from what
 people are used to.

I like track_events or track_activity though perhaps people might get them
confused with trace...

Sigh... and I swore I wouldn't get involved in any more name games...

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] feature suggestion

2007-07-31 Thread Rafael Azevedo
People,

I'd like to suggest you guys to implement a new feature.
Actually is an alias for a existent feature.

Unstead of having to type all the insert syntax, using (column) names, you
could do the same as MySQL does.
for example:

INSERT INTO Table SET
Field1 = 'text',
Field2 = 'text';

So it would make it easier and faster to develop applications using Postgre.

Thanks.
Rafael


Re: [HACKERS] stats_block_level

2007-07-31 Thread Bruce Momjian
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Simon Riggs wrote:
  Methinks it should be: stats_something, so that people find it in the
  same place as stats_query_string, which is still there.
 
  Hum, but the order in postgresql.conf is arbitrary, right?
 
 I concur with Simon that it should have some relationship to
 stats_query_string.  However, stats_collection doesn't appeal to me
 because that sounds like it would subsume stats_query_string (it seems
 like a master control toggle, as stats_start_collector used to be).
 Maybe something like stats_count_events?

stats_enable_counters, or just stats_counters?

We should prefix it with stats.  I understand the verb issue, but
putting the same prefix for the same module is more important ---
effectively it is stats.collection.  Someday we might even use dots so
we can have multiple levels of detail, e.g. stats.block.accumulate or
something like that.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] feature suggestion

2007-07-31 Thread Bruce Momjian
Rafael Azevedo wrote:
 People,
 
 I'd like to suggest you guys to implement a new feature.
 Actually is an alias for a existent feature.
 
 Unstead of having to type all the insert syntax, using (column) names, you
 could do the same as MySQL does.
 for example:
 
 INSERT INTO Table SET
 Field1 = 'text',
 Field2 = 'text';
 
 So it would make it easier and faster to develop applications using Postgre.

There is an SQL standard way to do this and we have no intention of
extending that.  If you want standard code, use the standard syntax.
MySQL, if it supports this, is just encouraging you to write
non-standard SQL.  You can complain to them.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] default_text_search_config and expression indexes

2007-07-31 Thread Oleg Bartunov

On Tue, 31 Jul 2007, Bruce Momjian wrote:


And if we have to require the configuration name in CREATE INDEX, it has
to be used in WHERE, so we might as well just remove the default
capability and always require the configuration name.


this is very rare use case for text searching
1. expression index without configuration name
2. default_text_search_config can be changed by somebody


If you are going to be using the configuration name with the create
expression index, you have to use it in the WHERE clause (or the index
doesn't work), and I assume that is 90% of the text search uses.  I
don't see it as rare at all.


What is a basis of your assumption ? In my opinion, it's very limited
use of text search, because it doesn't supports ranking. For 4-5 years
of tsearch2 usage I never used it and I never seem in mailing lists.
This is very user-oriented feature and we could probably ask 
-general people for their opinion.



If somebody really need it, then he should be adviced to use configuration
name, else we don't guarantee that somebody could change
default_text_search_config  variable and this could lead to
incorrect dump/restore.

I don't think we should remove default_text_search_config because of
this rare case.


I still feel the default_text_search_config has to be removed.  We have
tried all sorts of ways to make it work but having it not be 100%
reliable for pg_dump/restore means it might as well be in /contrib and
unsupported.  If we have it in core, it has to work 100%.  We can't have
tons of examples that don't specify the configuration name and then
expect every create expression index and WHERE clause to use it.
default_text_search_config _can_ work, but it seems so easy to break and
so easy to get wrong that I think it must be removed.


I'd better say we don't support text searching using expression index
than remove default_text_search_config. Anyway, I don't feel myself
responisble for such important problem. We need more feedback from 
users.




If we are going to keep it, I need someone to explain why my comments
above are wrong.  If I am right, someone has to remove
default_text_search_config from the patch.   I can do the documentation.


I'm in conference and then will be busy writing my applications and
earning money, Teodor is in vacation. I don't want to do 
hasty conclusion, since we're very tired to change our patch from 
one solution to another. We need consensus of developers and users.

I'm almost exhausted and have  no time  to continue this discussion.

Would you be so kind to write separate post about this problem and
call -hackers and -general for feedback. Let's experienced users
show their needs. We said everything and has nothing to add.


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] default_text_search_config and expression indexes

2007-07-31 Thread Bruce Momjian
Oleg Bartunov wrote:
 On Tue, 31 Jul 2007, Bruce Momjian wrote:
 
  And if we have to require the configuration name in CREATE INDEX, it has
  to be used in WHERE, so we might as well just remove the default
  capability and always require the configuration name.
 
  this is very rare use case for text searching
  1. expression index without configuration name
  2. default_text_search_config can be changed by somebody
 
  If you are going to be using the configuration name with the create
  expression index, you have to use it in the WHERE clause (or the index
  doesn't work), and I assume that is 90% of the text search uses.  I
  don't see it as rare at all.
 
 What is a basis of your assumption ? In my opinion, it's very limited
 use of text search, because it doesn't supports ranking. For 4-5 years
 of tsearch2 usage I never used it and I never seem in mailing lists.
 This is very user-oriented feature and we could probably ask 
 -general people for their opinion.

I doubt 'general' is going to understand the details of merging this
into the backend.  I assume we have enough people on hackers to decide
this.

Are you saying the majority of users have a separate column with a
trigger?  Does the trigger specify the configuation?  I don't see that
as a parameter argument to tsvector_update_trigger().  If you reload a
pg_dump, what does it use for the configuration?

Why is a separate column better than the index?  Just ranking?

The reason the expression index is nice is this feature has to be easy
to use for people who are new to full text and even PostgreSQL.  Right
now /contrib is fine for experts to use, but we want a larger user base
for this feature.

  If somebody really need it, then he should be adviced to use configuration
  name, else we don't guarantee that somebody could change
  default_text_search_config  variable and this could lead to
  incorrect dump/restore.
 
  I don't think we should remove default_text_search_config because of
  this rare case.
 
  I still feel the default_text_search_config has to be removed.  We have
  tried all sorts of ways to make it work but having it not be 100%
  reliable for pg_dump/restore means it might as well be in /contrib and
  unsupported.  If we have it in core, it has to work 100%.  We can't have
  tons of examples that don't specify the configuration name and then
  expect every create expression index and WHERE clause to use it.
  default_text_search_config _can_ work, but it seems so easy to break and
  so easy to get wrong that I think it must be removed.
 
 I'd better say we don't support text searching using expression index
 than remove default_text_search_config. Anyway, I don't feel myself
 responisble for such important problem. We need more feedback from 
 users.

Well, I am waiting for other hackers to get involved, but if they don't,
I have to evaluate it myself on the email lists.

  If we are going to keep it, I need someone to explain why my comments
  above are wrong.  If I am right, someone has to remove
  default_text_search_config from the patch.   I can do the documentation.
 
 I'm in conference and then will be busy writing my applications and
 earning money, Teodor is in vacation. I don't want to do 
 hasty conclusion, since we're very tired to change our patch from 
 one solution to another. We need consensus of developers and users.
 I'm almost exhausted and have  no time  to continue this discussion.
 
 Would you be so kind to write separate post about this problem and
 call -hackers and -general for feedback. Let's experienced users
 show their needs. We said everything and has nothing to add.

If you have no time to continue discussion and perhaps update the patch,
we can consider this patch dead for 8.3 and we can hold it for 8.4
because I can guarantee you this is going to need more discussion and
patch modification before it gets into CVS.

This patch is being treated fairly and exactly the same as every other
patch.

Should we hold the patch for 8.4?

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] feature suggestion

2007-07-31 Thread Gregory Stark
Rafael Azevedo [EMAIL PROTECTED] writes:

 Unstead of having to type all the insert syntax, using (column) names, you
 could do the same as MySQL does.
 for example:

 INSERT INTO Table SET
 Field1 = 'text',
 Field2 = 'text';

 So it would make it easier and faster to develop applications using Postgre.

I'm a bit mystified here. What exactly about this syntax is easier or faster?
You still have to list all the column names. It looks like it would require
just as much typing as the regular syntax, no?

Or is it that you get to reuse the same string you use for doing an update?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] feature suggestion

2007-07-31 Thread Bruce Momjian
Gregory Stark wrote:
 Rafael Azevedo [EMAIL PROTECTED] writes:
 
  Unstead of having to type all the insert syntax, using (column) names, you
  could do the same as MySQL does.
  for example:
 
  INSERT INTO Table SET
  Field1 = 'text',
  Field2 = 'text';
 
  So it would make it easier and faster to develop applications using Postgre.
 
 I'm a bit mystified here. What exactly about this syntax is easier or faster?
 You still have to list all the column names. It looks like it would require
 just as much typing as the regular syntax, no?
 
 Or is it that you get to reuse the same string you use for doing an update?

As far as I can see, the _feature_ is matching MySQL optional
non-standard syntax.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Machine available for community use

2007-07-31 Thread Josh Berkus
Tom,

 FWIW, it's looking like Red Hat will donate a RHEL/RHN subscription if
 we want one, though I don't have final approval quite yet.

Great.  Any chance of a machine?  Can RH exert some leverage with Dell?

We could use up to 8 servers for performance testing, so I'm asking 
everyone.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] feature suggestion

2007-07-31 Thread Rafael Azevedo
Imagine that you have about 30 fields.
Ok, then your first SQL is done.
Now, you just have to add 10 more fields.
Its very easy to get lost. If we have this implementation, you could just
add

Field31 = 'text',
Field32 = 'text'
...
wherever you want.

This is just a PLUS. I just don't see any problem by doing this.
Even knowing that this is not Standard SQL-Syntax, I just see this as a
benefit feature.

Another reason is that we have more people migrating from MySQL to Postgre
than any other database server. People don't migrate to Postgre from Oracle.
Hardly from MS SQL Server.

It just makes easier to migrate users from other db servers.
And this is sure not hard to implement.

Today its easier to migrate to PostgreSQL from MySQL than from PostgreSQL to
MySQL.

In few words I have given you more reason to add this feature than not to
add it.

Think about it.



2007/7/31, Rafael Azevedo [EMAIL PROTECTED]:

 Yes it is. And it makes easier to migrate from MySQL servers to
 PostgreSQL.
 Today its easier to migrate to MySQL from PostgreSQL than from PostgreSQL
 to MySQL.


 2007/7/31, Bruce Momjian [EMAIL PROTECTED]:
 
  Gregory Stark wrote:
   Rafael Azevedo [EMAIL PROTECTED]  writes:
  
Unstead of having to type all the insert syntax, using (column)
  names, you
could do the same as MySQL does.
for example:
   
INSERT INTO Table SET
Field1 = 'text',
Field2 = 'text';
   
So it would make it easier and faster to develop applications using
  Postgre.
  
   I'm a bit mystified here. What exactly about this syntax is easier or
  faster?
   You still have to list all the column names. It looks like it would
  require
   just as much typing as the regular syntax, no?
  
   Or is it that you get to reuse the same string you use for doing an
  update?
 
  As far as I can see, the _feature_ is matching MySQL optional
  non-standard syntax.
 
  --
  Bruce Momjian  [EMAIL PROTECTED]   http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com
 
  + If your life is a hard drive, Christ can be your backup. +
 



Re: [HACKERS] feature suggestion

2007-07-31 Thread Josh Berkus
Rafael,

 This is just a PLUS. I just don't see any problem by doing this.
 Even knowing that this is not Standard SQL-Syntax, I just see this as a
 benefit feature.

Our project has a policy of upholding the SQL standard whereever possible.  
For that reason, we don't approve non-standard syntax just for reasons of 
accessibility.  Any non-standard syntax we approve needs to add 
significant extra functionality to the DBMS, not just convenience, and 
certainly not because MySQL does it.

Standards are important is one of the themes of PostgreSQL which 
differentiates us from MySQL.

 Another reason is that we have more people migrating from MySQL to
 Postgre than any other database server. People don't migrate to Postgre
 from Oracle. Hardly from MS SQL Server.

You're mistaken.  I think we get more migrations from Oracle than from 
MySQL.  And quite a few from DB2 and Informix.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] GIT patch

2007-07-31 Thread Alvaro Herrera
Hi,

I've started reading the GIT patch to see if I can help with the review.
First thing I notice is that there are several things that seems left
over; for example the comments in pg_proc for the new functions are
incomplete.

More subtle: in _bt_findinsertloc, the test for
modifiedpage = _bt_groupify() may reset the bit set by the
_bt_vacuum_one_page.  Surely it should look like

modifiedpage |= _bt_groupify() 

There's also a couple of spots that were not merged cleanly, but since
they were inside comments, the compiler did not complain and so were not
fixed.

I'm also finding a certain lack of code commentary that makes the
reviewing a bit harder.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] feature suggestion

2007-07-31 Thread Merlin Moncure
On 8/1/07, Rafael Azevedo [EMAIL PROTECTED] wrote:
 Imagine that you have about 30 fields.
 Ok, then your first SQL is done.
 Now, you just have to add 10 more fields.
 Its very easy to get lost. If we have this implementation, you could just
 add

 Field31 = 'text',
 Field32 = 'text'

I have to admit this syntax has a lot of advantages over the
insert...values statement, especially in dynamic sql situations.  That
being said, more and more I just write queries insert..select which
would be an awkward fit.  mysql compatibility is usually pretty poor
justification of a feature (they have a million ways to do everything)
and things have to stand on general merit.

It is really quite unfortunate the way certain aspects of the sql
standard evolved (indirectly causing these types of issues) but that
is a topic for another day :)

merlin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] feature suggestion

2007-07-31 Thread Rafael Azevedo
Well. Ok.
Then I'll just do it myself.
Just thought it would be good for thousands of users.
As I said, it was just a suggestion.
I surely aint the only one who ever thought about it.

Thanks anyway.


2007/7/31, Josh Berkus [EMAIL PROTECTED]:

 Rafael,

  This is just a PLUS. I just don't see any problem by doing this.
  Even knowing that this is not Standard SQL-Syntax, I just see this as a
  benefit feature.

 Our project has a policy of upholding the SQL standard whereever possible.
 For that reason, we don't approve non-standard syntax just for reasons of
 accessibility.  Any non-standard syntax we approve needs to add
 significant extra functionality to the DBMS, not just convenience, and
 certainly not because MySQL does it.

 Standards are important is one of the themes of PostgreSQL which
 differentiates us from MySQL.

  Another reason is that we have more people migrating from MySQL to
  Postgre than any other database server. People don't migrate to Postgre
  from Oracle. Hardly from MS SQL Server.

 You're mistaken.  I think we get more migrations from Oracle than from
 MySQL.  And quite a few from DB2 and Informix.

 --
 --Josh

 Josh Berkus
 PostgreSQL @ Sun
 San Francisco




-- 
Atenciosamente,

Rafael Azevedo
.: Diretor
:: WEBPRO SOLUÇÕES DIGITAIS
:: Telefone: 51 3266.3446
:: Celular: 51 9243.9893
:: http://www.webpro.com.br
:: Email: [EMAIL PROTECTED]
:::
:: Conheça o MAILMAN, Solução em E-mail Marketing
:: http://www.mailman.com.br/


Re: [HACKERS] default_text_search_config and expression indexes

2007-07-31 Thread Oleg Bartunov

On Tue, 31 Jul 2007, Bruce Momjian wrote:


Oleg Bartunov wrote:

On Tue, 31 Jul 2007, Bruce Momjian wrote:


And if we have to require the configuration name in CREATE INDEX, it has
to be used in WHERE, so we might as well just remove the default
capability and always require the configuration name.


this is very rare use case for text searching
1. expression index without configuration name
2. default_text_search_config can be changed by somebody


If you are going to be using the configuration name with the create
expression index, you have to use it in the WHERE clause (or the index
doesn't work), and I assume that is 90% of the text search uses.  I
don't see it as rare at all.


What is a basis of your assumption ? In my opinion, it's very limited
use of text search, because it doesn't supports ranking. For 4-5 years
of tsearch2 usage I never used it and I never seem in mailing lists.
This is very user-oriented feature and we could probably ask
-general people for their opinion.


I doubt 'general' is going to understand the details of merging this
into the backend.  I assume we have enough people on hackers to decide
this.


I mean not technical details, but use case. Does they need expressional
index without ranking but sacrifice ability to use default configuration
in other cases too ? My prediction is that  people doesn't ever thought about 
this possibility until we said them about.




Are you saying the majority of users have a separate column with a
trigger?  Does the trigger specify the configuation?  I don't see that
as a parameter argument to tsvector_update_trigger().  If you reload a
pg_dump, what does it use for the configuration?



yes, separate column with custom trigger works fine. It's up to you how
to keep your data actual and it's up to you how to write trigger. 
Our tsvector_update_trigger() is a tsvector_update_trigger_example() !



Why is a separate column better than the index?  Just ranking?


ranking + composite documents. I already mentioned, that this could be
rather expensive. Also, having separate column allow people various
ways to say what is a document and even change it.



The reason the expression index is nice is this feature has to be easy
to use for people who are new to full text and even PostgreSQL.  Right
now /contrib is fine for experts to use, but we want a larger user base
for this feature.


I agree here. This was one of the main reason of our work for 8.3.
Probably, we shold think in another direction - not to curtail tsearch2
and confuse rather big existing users, but to add an ability to save somehow
configuration used for creating of *document*
either implicitly (in expression index, or just gin(text_column)), or
explicitly (separate column). There is no problem with index itself !





I'd better say we don't support text searching using expression index
than remove default_text_search_config. Anyway, I don't feel myself
responisble for such important problem. We need more feedback from
users.


Well, I am waiting for other hackers to get involved, but if they don't,
I have to evaluate it myself on the email lists.


If we are going to keep it, I need someone to explain why my comments
above are wrong.  If I am right, someone has to remove
default_text_search_config from the patch.   I can do the documentation.


I'm in conference and then will be busy writing my applications and
earning money, Teodor is in vacation. I don't want to do
hasty conclusion, since we're very tired to change our patch from
one solution to another. We need consensus of developers and users.
I'm almost exhausted and have  no time  to continue this discussion.

Would you be so kind to write separate post about this problem and
call -hackers and -general for feedback. Let's experienced users
show their needs. We said everything and has nothing to add.


If you have no time to continue discussion and perhaps update the patch,
we can consider this patch dead for 8.3 and we can hold it for 8.4
because I can guarantee you this is going to need more discussion and
patch modification before it gets into CVS.

This patch is being treated fairly and exactly the same as every other
patch.


why do you say this ? I didn't complain about this.



Should we hold the patch for 8.4?


If we're not agree to say in docs, that implicit usage of text search 
configuration in CREATE INDEX command doesn't supported. Could we leave

default_text_search_config for super-users, at least ?

Anyway, let's wait what other people say.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] GIT patch

2007-07-31 Thread Alvaro Herrera

Another thing that would be useful would be to separate the changes to
add pgstat counters and view columns, since they are relatively minor
and could be committed separately (or not at all for 8.3, even).

-- 
Alvaro Herrera  Developer, http://www.PostgreSQL.org/
inflex really, I see PHP as like a strange amalgamation of C, Perl, Shell
crab inflex: you know that amalgam means mixture with mercury,
   more or less, right?
crab i.e., deadly poison

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [GENERAL] ascii() for utf8

2007-07-31 Thread Stuart McGraw

From: Alvaro Herrera
 Decibel! wrote:
  Moving to -hackers.
 
  On Jul 27, 2007, at 1:22 PM, Stuart wrote:
  Does Postgresql have a function like ascii() that will
  return the unicode codepoint value for a utf8 character?
  (And symmetrically same for question chr() of course).
 
  I suspect that this is just a matter of no one scratching the itch. I 
  suspect a patch would be accepted, or you could possibly put something on 
  pgFoundry.
 
 Nay; there were some discussions about this not long ago, and I think
 one conclusion you could draw from them is that many people want these
 functions in the backend.

That would certainly be my preference.  I will be distributing an 
application, the database part of which may (not sure yet) require 
this function, to multiple platforms including Windows and (though 
I have never done it) am anticipating it will be significantly harder 
if I have to worry about the recipient compiling an external function 
or making sure a dll goes in the right place, gets updated, etc.

  I'd set it up so that ascii() and chr() act according to the 
  appropriate locale setting (I'm not sure which one would be appropriate).
 
 I don't see why any of them would react to the locale, but they surely
 must honor client encoding.

Wouldn't this be the database encoding?  (I have been using 
strictly utf-8 and admit I am pretty fuzzy on encoding issues.)

If one had written an external function, how much more effort 
would it be to make it acceptable for inclusion in the backend? 


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] GIT patch

2007-07-31 Thread Alvaro Herrera

Oh, and the new function in bitmapset.c could use with some explanation
of what it is.

-- 
Alvaro Herrera  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
Es filósofo el que disfruta con los enigmas (G. Coli)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq