[HACKERS] PATCH: CITEXT 2.0 v4

2008-07-16 Thread David E. Wheeler

Howdy,

I've attached a new patch with the latest revisions of the citext
contrib module patch. The changes include:


* Using strlen() to pass string lengths to the comparison function,
  since lowercasing the value can change the length. Per Tom Lane.
* Made citextcmp consistently return int32, per Tom Lane.
* Made the hash index function return the proper value, per Tom Lane.
* Removed the COMMENTs and GRANTs from citext.sql.in.
* Added a cast function from bpchar to citext, as suggested by Tom Lane.
* Set the storage type for CITEXT to extended, to ensure that it will
  be toastable. Per Tom Lane.
* Fixed the COMMUTATOR of =.
* Changed the cast from citext to bpchar from implicit to assignment.
  This eliminates ambiguous function resolutions.
* Eliminated superfluous functions, per Tom Lane.
* Removed unnecessary `OPERATOR()` calls in NEGATORs and the like.
* Added binary in/out functions. Per Tom Lane.
* Added an explicit shell type to make the output a bit quieter.
* Converted tests to pure SQL and omitted multibyte tests (though a
  few remain commented-out).
* Reorganized and expanded the documentation a bit.

This version is far better than I started with, and I'm very grateful  
for the feedback.


Now, I have a few remaining questions to ask, mostly just to get your  
opinions:


* The README for citext 1.0 on pgFoundry says:

I had to make a decision on casting between types for regular expressions and
decided that if any parameter is of citext type then case insensitive applies.
For example applying regular expressions with a varchar and a citext will
produce a case-insensitive result.

Having thought about this afterwards I realised that since we have the option
to use case-insensitive results with regular expressions I should have left the
behaviour exactly as text and then you have the best of both worlds... oh well
not hard to change for any of you perfectionists!


I followed the original and made all the regex and LIKE comparisons
case-insensitive. But maybe I should not have? Especially since the
regular expression functions (e.g., regexp_replace()) and a few
non-regex functions (e.g., replace()) still don't behave case-insensitively?


* If the answer is no, how can I make those functions behave
case-insensitively? (See the TODO tests.)


* Should there be any other casts? To and from name, perhaps?

Thanks!

David


citext4.patch.gz
Description: GNU Zip compressed data



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] postmaster.pid not visible

2008-07-16 Thread cinu
Hi All, 

I installed PostgreSQL 8.3.1 on my SUSE Linux machine. It went fine without
any problems: I was able to create and access the database, and even to
start, restart, and check the status of the service.
Since it is my local machine and people connect to the database on it
remotely, I keep the machine up and running.

Today I came in and checked, and it told me that the postgres service was
not running, so I went and looked for the postmaster.pid file; it was not in
the data folder. Yet I was able to get to the psql prompt and execute
standard SQL statements, and people were still able to connect remotely and
access the database on my machine.

The only difficulty I was facing was that I was unable to restart or stop
the service.

So with the help of the ps -ef | grep postgres command I was able to trace
the pid and then manually kill it with the kill -9 command; after this I
was able to restart, stop, or check the status of the service.

Can anyone throw light on why postmaster.pid was not visible? The other
interesting thing I observed was that the postgres service was still running
on port 5432; this was visible from the /tmp location.

Also, I would like to know if there is any other alternative with which I
can restart the service and retain the postmaster.pid file.

Thanks in advance
Regards
Cinu



Re: [HACKERS] gsoc, store hash index tuple with hash code only

2008-07-16 Thread Xiao Meng
I've fixed the patch just now. It works and passes the regression test ;-)
Here is the new patch. I'll keep the hash codes in order and use
binary search in a later version soon.


diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 6a5c000..1a8dc75 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -129,7 +129,11 @@ hashbuildCallback(Relation index,
IndexTuple  itup;

/* form an index tuple and point it at the heap tuple */
+#ifdef HASHVALUE_ONLY
+itup = _hash_form_tuple(index, values,isnull);
+#else
itup = index_form_tuple(RelationGetDescr(index), values, isnull);
+#endif
itup->t_tid = htup->t_self;

/* Hash indexes don't index nulls, see notes in hashinsert */
@@ -171,7 +175,12 @@ hashinsert(PG_FUNCTION_ARGS)
IndexTuple  itup;

/* generate an index tuple */
+#ifdef HASHVALUE_ONLY
+itup = _hash_form_tuple(rel, values, isnull);
+#else
itup = index_form_tuple(RelationGetDescr(rel), values, isnull);
+#endif
+
itup->t_tid = *ht_ctid;

/*
@@ -212,7 +221,11 @@ hashgettuple(PG_FUNCTION_ARGS)
boolres;

/* Hash indexes are never lossy (at the moment anyway) */
-   scan->xs_recheck = false;
+#ifdef HASHVALUE_ONLY
+   scan->xs_recheck = true;
+#else
+   scan->xs_recheck = false;
+#endif

/*
 * We hold pin but not lock on current buffer while outside the hash AM.
diff --git a/src/backend/access/hash/hashinsert.c
b/src/backend/access/hash/hashinsert.c
index 3eb226a..086 100644
--- a/src/backend/access/hash/hashinsert.c
+++ b/src/backend/access/hash/hashinsert.c
@@ -52,9 +52,15 @@ _hash_doinsert(Relation rel, IndexTuple itup)
 */
if (rel->rd_rel->relnatts != 1)
elog(ERROR, "hash indexes support only one index key");
+#ifdef HASHVALUE_ONLY
+   datum = index_getattr(itup, 1, _create_hash_desc(), &isnull);
+   Assert(!isnull);
+hashkey = DatumGetUInt32(datum);
+#else
datum = index_getattr(itup, 1, RelationGetDescr(rel), &isnull);
Assert(!isnull);
hashkey = _hash_datum2hashkey(rel, datum);
+#endif

/* compute item size too */
itemsz = IndexTupleDSize(*itup);
diff --git a/src/backend/access/hash/hashpage.c
b/src/backend/access/hash/hashpage.c
index b0b5874..bba64c4 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -785,7 +785,12 @@ _hash_splitbucket(Relation rel,
OffsetNumber omaxoffnum;
Pageopage;
Pagenpage;
-   TupleDesc   itupdesc = RelationGetDescr(rel);
+   TupleDesc   itupdesc;
+#ifdef HASHVALUE_ONLY
+itupdesc = _create_hash_desc();
+#else
+itupdesc = RelationGetDescr(rel);
+#endif

/*
 * It should be okay to simultaneously write-lock pages from each 
bucket,
@@ -854,9 +859,13 @@ _hash_splitbucket(Relation rel,
itup = (IndexTuple) PageGetItem(opage, PageGetItemId(opage, 
ooffnum));
datum = index_getattr(itup, 1, itupdesc, &null);
Assert(!null);
-
+#ifdef HASHVALUE_ONLY
+   bucket = _hash_hashkey2bucket(DatumGetUInt32(datum),
+ 
maxbucket, highmask, lowmask);
+#else
bucket = _hash_hashkey2bucket(_hash_datum2hashkey(rel, datum),
  
maxbucket, highmask, lowmask);
+#endif

if (bucket == nbucket)
{
diff --git a/src/backend/access/hash/hashsearch.c
b/src/backend/access/hash/hashsearch.c
index 258526b..5211e67 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -178,6 +178,7 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
hashkey = _hash_datum2hashkey_type(rel, cur->sk_argument,
   cur->sk_subtype);

+so->hashso_sk_hash = hashkey;
/*
 * Acquire shared split lock so we can compute the target bucket safely
 * (see README).
diff --git a/src/backend/access/hash/hashutil.c
b/src/backend/access/hash/hashutil.c
index 41e2eef..81c6829 100644
--- a/src/backend/access/hash/hashutil.c
+++ b/src/backend/access/hash/hashutil.c
@@ -20,7 +20,7 @@
 #include "executor/execdebug.h"
 #include "storage/bufmgr.h"
 #include "utils/lsyscache.h"
-
+#include "catalog/pg_type.h"

 /*
  * _hash_checkqual -- does the index tuple satisfy the scan conditions?
@@ -28,16 +28,31 @@
 bool
 _hash_checkqual(IndexScanDesc scan, IndexTuple itup)
 {
-   TupleDesc   tupdesc = RelationGetDescr(scan->indexRelation);
+   TupleDesc   tupdesc;
ScanKey key = scan-keyData;
int scanKeySize = scan-numberOfKeys;
+Datum  datum;
+bool   isNull;
+HashScanOpaque   

Re: [HACKERS] gsoc, store hash index tuple with hash code only

2008-07-16 Thread Xiao Meng
A problem here is that _create_hash_desc is called many times to
create a TupleDesc with an int32 attribute.
I've tried to implement the function like this:
TupleDesc _create_hash_desc()
{
static bool firstcall = true;
static TupleDesc tupdesc;
if(firstcall){
tupdesc = CreateTemplateTupleDesc(1, false);
TupleDescInitEntry(tupdesc, 1, "hashcode", INT4OID, -1, 0);
}
firstcall = false;
return tupdesc;
}

but it failed because tupdesc is freed later, IMHO.
Any advice?

-- 
Best Regards,
Xiao Meng

DKERC, Harbin Institute of Technology, China
Gtalk: [EMAIL PROTECTED]
MSN: [EMAIL PROTECTED]
http://xiaomeng.yo2.cn

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCHES] pg_dump lock timeout

2008-07-16 Thread daveg
On Thu, Jul 03, 2008 at 05:55:01AM -0700, daveg wrote:
 On Thu, Jul 03, 2008 at 11:15:10AM +0300, Marko Kreen wrote:
  On 5/11/08, daveg [EMAIL PROTECTED] wrote:
Attached is a patch to add a commandline option to pg_dump to limit how 
   long
pg_dump will wait for locks during startup.
  
  My quick review:
  
  - It does not seem important enough to waste a short option on.
Having only long option should be enough.
 
 Agreed. I'll change it.
  
  - It would be more polite to do SET LOCAL instead SET.
(Eg. it makes safer to use pg_dump through pooler.)
 
 Also agreed. Thanks.

On second glance, pg_dump sets lots of variables without using SET LOCAL.
I think fixing that must be the subject of a separate patch as fixing just
one of many will only cause confusion.

  - The statement_timeout is set back with statement_timeout = default
Maybe it would be better to do = 0 here?  Although such decision
would go outside the scope of the patch, I see no sense having
any other statement_timeout for actual dumping.
 
 I'd prefer to leave whatever policy is otherwise in place alone. I can see
 use cases for either having or not having a timeout for pg_dump, but it does
 seem  outside the scope of this patch.

As it happens, another patch has set the policy to statement_timeout = 0,
so I will follow that.

I'm sending in the revised patch today.

-dg


-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] temp table problem

2008-07-16 Thread Abbas
Hi,
I have come across a problem. When you try to access a temp table
created via SPI_EXEC, you get a table not found error.

  SPI_EXEC("CREATE TEMP TABLE my_temp_table(first_name text, last_name
text)", UTILITY);
  SPI_EXEC("REVOKE ALL ON TABLE my_temp_table FROM PUBLIC", UTILITY);

The second statement generates a table not found error, although the
first statement was successful.

After initdb the system has no temp namespace to hold
temp objects and hence the search path does not contain
any temp namespace either.
On the first call to create a temp table the system creates
a temp namespace. At this point the system calls recomputeNamespacePath,
thinking that it will update the search path and include the temp namespace
in it, but that does not happen because of the override search path stack.
Hence subsequent calls to, say, insert into the temp table fail.

Any suggestions on how to tackle this problem?

Regards
Abbas
EnterpriseDB   http://www.enterprisedb.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] autovacuum crash due to null pointer

2008-07-16 Thread Tom Lane
There's a fairly interesting crash here:
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=jaguar&dt=2008-07-16%2003:00:02
The buildfarm was nice enough to provide a stack trace at the bottom of
the page, which shows clearly that autovac tried to pfree a null
pointer.

What I think happened was that the table that was selected to be
autovacuumed got dropped during the setup steps, leading get_rel_name()
to return NULL at line 2167.  vacuum() itself would have fallen out
silently ...  however, had it errored out, the attempts at error
reporting in the PG_CATCH block would have crashed.

I see that we already noticed and fixed this type of problem in
autovac_report_activity(), but do_autovacuum() didn't get the word.
Is there anyplace else in there with the same issue?  For that matter,
why is autovac_report_activity repeating the lookups already done
at the outer level?

One other point is that the postmaster log just says

TRAP: FailedAssertion("!(pointer != ((void *) 0))", File: "mcxt.c", Line: 580)
[487d6715.3a87:2] LOG:  server process (PID 16885) was terminated by signal 6: 
Aborted

Could we get that to say "autovacuum worker" instead of "server"?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Overhauling GUCS

2008-07-16 Thread Bruce Momjian

Added to TODO:

o Add external tool to auto-tune some postgresql.conf parameters

 http://archives.postgresql.org/pgsql-hackers/2008-06/msg0.php


---

Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  * Can we build a configuration wizard to tell newbies what settings
  they need to tweak?
 
  That would trump all the other suggestions conclusively. Anyone good at 
  expert systems?
 
 How far could we get with the answers to just three questions:
 
 * How many concurrent queries do you expect to have?
 
 * How much RAM space are you willing to let Postgres use?
 
 * How much overhead disk space are you willing to let Postgres use?
 
 concurrent queries drives max_connections, obviously, and RAM space
 would drive shared_buffers and effective_cache_size, and both of them
 would be needed to size work_mem.  The third one is a bit weird but
 I don't see any other good way to set the checkpoint parameters.
 
 If those aren't enough questions, what else must we ask?  Or maybe they
 aren't the right questions at all --- maybe we should ask "is this a
 dedicated machine or not" and try to extrapolate everything else from
 what we (hopefully) can find out about the hardware.
 
   regards, tom lane
 

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] ExecuteTruncate quirk: expects a unique list of relations

2008-07-16 Thread Bruce Momjian
Nikhils wrote:
 Hi,
 
 Consider this simple case:
 
 postgres=# TRUNCATE foo, foo;
 ERROR:  cannot TRUNCATE "foo" because it is being used by active queries in
 this session
 
 The above occurs because the ExecuteTruncate() function invokes
 truncate_check_rel() in a loop. Since the same table name appears twice, the
 rd_refcnt for table foo is bumped up to 2, causing the above failure.
 
 We might want to add a step to ExecuteTruncate(), or whatever calls it, to
 make the list unique.

Fixed with attached, applied patch.  I didn't see any other cases that
need fixing;  LOCK foo, foo already works fine.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/commands/tablecmds.c
===
RCS file: /cvsroot/pgsql/src/backend/commands/tablecmds.c,v
retrieving revision 1.259
diff -c -c -r1.259 tablecmds.c
*** src/backend/commands/tablecmds.c	19 Jun 2008 00:46:04 -	1.259
--- src/backend/commands/tablecmds.c	16 Jul 2008 16:35:28 -
***
*** 762,767 
--- 762,770 
  	ResultRelInfo *resultRelInfo;
  	ListCell   *cell;
  
+ 	/* make list unique */
+ 	stmt->relations = list_union(NIL, stmt->relations);
+ 
  	/*
  	 * Open, exclusive-lock, and check all the explicitly-specified relations
  	 */



Re: [HACKERS] PATCH: CITEXT 2.0 v4

2008-07-16 Thread David E. Wheeler

On Jul 15, 2008, at 22:23, David E. Wheeler wrote:


* The README for citext 1.0 on pgFoundry says:

I had to make a decision on casting between types for regular expressions and
decided that if any parameter is of citext type then case insensitive applies.
For example applying regular expressions with a varchar and a citext will
produce a case-insensitive result.

Having thought about this afterwards I realised that since we have the option
to use case-insensitive results with regular expressions I should have left the
behaviour exactly as text and then you have the best of both worlds... oh well
not hard to change for any of you perfectionists!


I followed the original and made all the regex and LIKE comparisons
case-insensitive. But maybe I should not have? Especially since the
regular expression functions (e.g., regexp_replace()) and a few
non-regex functions (e.g., replace()) still don't behave case-insensitively?


I was thinking about this a bit last night and wanted to fill things  
out a bit.


As a programmer, I find Donald Fraser's hindsight to be more  
appealing, because at least that way I have the option to do matching  
against CITEXT strings case-sensitively when I want to.


OTOH, if what we want is to have CITEXT work more like a
case-insensitive collation, then the expectation from the matching
operators and functions might be different. Does anyone have any idea
whether regex and LIKE matching against a string in a case-insensitive
collation would match case-insensitively or not? If so, then maybe the
regex and LIKE ops and funcs *should* match case-insensitively? If
not, or if only for some collations, then I would think not.


Either way, I know of no way, currently, to allow functions like
replace(), split_part(), strpos(), and translate() to match
case-insensitively, even if we wanted to. Anyone have any ideas?


* If the answer is no, how can I make those functions behave
case-insensitively? (See the TODO tests.)


* Should there be any other casts? To and from name, perhaps?


Thanks,

David




Re: [HACKERS] PATCH: CITEXT 2.0 v4

2008-07-16 Thread Robert Treat
On Wednesday 16 July 2008 13:54:25 David E. Wheeler wrote:
 On Jul 15, 2008, at 22:23, David E. Wheeler wrote:
  * The README for citext 1.0 on pgFoundry says:
  I had to make a decision on casting between types for regular
  expressions and
  decided that if any parameter is of citext type then case
  insensitive applies.
  For example applying regular expressions with a varchar and a
  citext will
  produce a case-insensitive result.
 
  Having thought about this afterwards I realised that since we have
  the option
  to use case-insensitive results with regular expressions I should
  have left the
  behaviour exactly as text and then you have the best of both
  worlds... oh well
  not hard to change for any of you perfectionists!
 
  I followed the original and made all the regex and LIKE comparisons
  case-insensitive. But maybe I should not have? Especially since the
  regular expression functions (e.g., regexp_replace()) and a few
  non-regex functions (e.g., replace()) still don't behave case-insensitively?

 I was thinking about this a bit last night and wanted to fill things
 out a bit.

 As a programmer, I find Donald Fraser's hindsight to be more
 appealing, because at least that way I have the option to do matching
 against CITEXT strings case-sensitively when I want to.

 OTOH, if what we want is to have CITEXT work more like a
 case-insensitive collation, then the expectation from the matching
 operators and functions might be different. Does anyone have any idea
 whether regex and LIKE matching against a string in a case-insensitive
 collation would match case-insensitively or not? If so, then maybe the
 regex and LIKE ops and funcs *should* match case-insensitively? If
 not, or if only for some collations, then I would think not.

 Either way, I know of no way, currently, to allow functions like
 replace(), split_part(), strpos(), and translate() to match
 case-insensitively, even if we wanted to. Anyone have any ideas?

  * If the answer is no, how can I make those functions behave
  case-insensitively? (See the TODO tests.)
 
  * Should there be any other casts? To and from name, perhaps?


AIUI, you're proposing the following:

select 'x'::citext = 'X'::citext;
 ?column?
--
 t
(1 row)

select 'x'::citext ~ 'X'::citext;
 ?column?
--
 f
(1 row)

I understand the desire for flexibility, but the above seems weird to me. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] PATCH: CITEXT 2.0 v4

2008-07-16 Thread David E. Wheeler

On Jul 16, 2008, at 11:20, Robert Treat wrote:


I was thinking about this a bit last night and wanted to fill things
out a bit.

As a programmer, I find Donald Fraser's hindsight to be more
appealing, because at least that way I have the option to do matching
against CITEXT strings case-sensitively when I want to.

OTOH, if what we want is to have CITEXT work more like a
case-insensitive collation, then the expectation from the matching
operators and functions might be different. Does anyone have any idea
whether regex and LIKE matching against a string in a case-insensitive
collation would match case-insensitively or not? If so, then maybe the
regex and LIKE ops and funcs *should* match case-insensitively? If
not, or if only for some collations, then I would think not.

Either way, I know of no way, currently, to allow functions like
replace(), split_part(), strpos(), and translate() to match
case-insensitively, even if we wanted to. Anyone have any ideas?


* If the answer is no, how can I make those functions behave
case-insensitively? (See the TODO tests.)

* Should there be any other casts? To and from name, perhaps?




AIUI, you're proposing the following:

select 'x'::citext = 'X'::citext;
?column?
--
t
(1 row)

select 'x'::citext ~ 'X'::citext;
?column?
--
f
(1 row)

I understand the desire for flexibility, but the above seems weird to me.


That's what Donald Fraser suggested, and I see some value in that, but  
wanted to get some other opinions. And you're right, that does seem a  
bit weird.


The trouble is that, right now:

template1=# select regexp_replace( 'fxx'::citext, 'X'::citext, 'o');
 regexp_replace

 fxx
(1 row)

So there's an inconsistency there. I don't know how to make that work  
case-insensitively.


Best,

David



Re: [HACKERS] Postgres-R source code release

2008-07-16 Thread Markus Wanner

Hi,

David Fetter wrote:

Would you mind if I were to make a git branch for it on
http://git.postgresql.org/ ?


I've set up a git-daemon with the Postgres-R patch here:

git://postgres-r.org/repo

Since it's a distributed VCS, you should be able to mirror that to 
git.postgresql.org somehow (if you figure out how, please tell me!).


Please note that I'm still struggling with git and I cannot promise to 
keep using it.


Regards

Markus




[HACKERS] Postgres-R: primary key patches

2008-07-16 Thread Markus Wanner

Hi,

as you might know, Postgres-R relies on primary keys to address tuples 
of a table. It cannot replicate tables without a primary key.


Primary keys currently aren't really used within the executor, so I had 
to extend and modify Postgres here and there to get the required 
information. To ease reviewing I have split out these modifications and 
present them here as two separate little patches.


The first one, get_pkey_index_oid.diff, changes the function 
relationHasPrimaryKey into GetPrimaryKeyIndexOid, which now returns an 
index oid instead of just a boolean. It works pretty much the same 
otherwise. (In the current Postgres-R code, I've duplicated that code to 
src/backend/replication/recovery.c)


And secondly, the add_pkey_info.diff patch adds a boolean field 
ii_Primary to the IndexInfo struct and ri_PrimaryKey to the 
ResultRelInfo struct, which is an index into the indexInfoArray.


I think these are relatively trivial modifications which could be 
helpful for other purposes as well, so I suggest applying them to 
mainline whenever appropriate (read: choose the appropriate commit fest).


This also raises the more general question of how to start collaborating 
on Postgres-R. I realize that it's a pretty huge project. However, I'm 
unsure how to ease reviewing for others, so if you have any ideas or 
questions, please don't hesitate to ask.


Regards

Markus


*** src/backend/commands/indexcmds.c	61a8b3774b682554e8670624583ab4cf4b9dbdb9
--- src/backend/commands/indexcmds.c	dc6fc2a3fbce90748bcf4cd7a60ea2ea887bc97f
*** static Oid GetIndexOpClass(List *opclass
*** 64,70 
    bool isconstraint);
  static Oid GetIndexOpClass(List *opclass, Oid attrType,
  char *accessMethodName, Oid accessMethodId);
- static bool relationHasPrimaryKey(Relation rel);
  
  
  /*
--- 64,69 
*** DefineIndex(RangeVar *heapRelation,
*** 324,330 
  		 * it's no problem either.
  		 */
		if (is_alter_table &&
! 			relationHasPrimaryKey(rel))
  		{
  			ereport(ERROR,
  	(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
--- 323,329 
  		 * it's no problem either.
  		 */
		if (is_alter_table &&
! 			(GetPrimaryKeyIndexOid(rel) != InvalidOid))
  		{
  			ereport(ERROR,
  	(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
*** ChooseRelationName(const char *name1, co
*** 1216,1229 
  }
  
  /*
!  * relationHasPrimaryKey -
   *
!  *	See whether an existing relation has a primary key.
   */
! static bool
! relationHasPrimaryKey(Relation rel)
  {
! 	bool		result = false;
  	List	   *indexoidlist;
  	ListCell   *indexoidscan;
  
--- 1215,1229 
  }
  
  /*
!  * GetPrimaryKeyIndexOid
   *
!  * Returns the oid of the primary key index of the relation, if any,
!  * otherwise InvalidOid is returned.
   */
! Oid
! GetPrimaryKeyIndexOid(Relation rel)
  {
! 	Oid			result = InvalidOid;
  	List	   *indexoidlist;
  	ListCell   *indexoidscan;
  
*** relationHasPrimaryKey(Relation rel)
*** 1244,1257 
  	0, 0, 0);
  		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
elog(ERROR, "cache lookup failed for index %u", indexoid);
! 		result = ((Form_pg_index) GETSTRUCT(indexTuple))->indisprimary;
  		ReleaseSysCache(indexTuple);
! 		if (result)
  			break;
  	}
  
  	list_free(indexoidlist);
- 
  	return result;
  }
  
--- 1244,1260 
  	0, 0, 0);
  		if (!HeapTupleIsValid(indexTuple))		/* should not happen */
elog(ERROR, "cache lookup failed for index %u", indexoid);
! 
! 		if (((Form_pg_index) GETSTRUCT(indexTuple))->indisprimary)
! 			result = indexoid;
! 
  		ReleaseSysCache(indexTuple);
! 
! 		if (result != InvalidOid)
  			break;
  	}
  
  	list_free(indexoidlist);
  	return result;
  }
  

*** src/include/commands/defrem.h	e2384af33d917bff68234bbe407ea16e3ec43123
--- src/include/commands/defrem.h	58bb763402c9bef8ead035a3524505ad8fe58de5
***
*** 15,22 
  #define DEFREM_H
  
  #include "nodes/parsenodes.h"
  
- 
  /* commands/indexcmds.c */
  extern void DefineIndex(RangeVar *heapRelation,
  			char *indexRelationName,
--- 15,22 
  #define DEFREM_H
  
  #include "nodes/parsenodes.h"
+ #include "utils/relcache.h"
  
  /* commands/indexcmds.c */
  extern void DefineIndex(RangeVar *heapRelation,
  			char *indexRelationName,
*** extern Oid	GetDefaultOpClass(Oid type_id
*** 43,48 
--- 43,49 
  extern char *ChooseRelationName(const char *name1, const char *name2,
     const char *label, Oid namespace);
  extern Oid	GetDefaultOpClass(Oid type_id, Oid am_id);
+ extern Oid	GetPrimaryKeyIndexOid(Relation rel);
  
  /* commands/functioncmds.c */
  extern void CreateFunction(CreateFunctionStmt *stmt);

*** src/backend/catalog/index.c	c360fcfd1002ffa557c1a376d3e74c9c2a0924db

Re: [HACKERS] Postgres-R: current state of development

2008-07-16 Thread Dimitri Fontaine


Hi,

First, thanks a lot for opening Postgres-R. I hope -core will find in  
your code as many good ideas and as much code as possible :)


On Jul 15, 2008, at 18:48, Markus Wanner wrote:
A pretty general framework for helper processes is provided. I think  
this framework could be used for parallel querying or data loading  
as well. The helper processes are ordinary backends which process a  
single transaction at a time. But they don't have a client  
connection, instead they communicate with a manager via a messaging  
module based on shared memory and signals. Within Postgres-R, those  
helper backends are mostly called 'remote backends', which is a  
somewhat misleading name. It's just a short name for a helper  
backend which processes a remote transaction.


Could this framework help with the current TODO item of having a  
concurrent pg_restore?


The ideas I remember on this topic were to add the capability for  
pg_restore to create all indexes of any given table in parallel, so as  
to benefit from the concurrent seqscan improvements of 8.3.


There was also the idea to have pg_restore handle the ALTER TABLE  
statements in parallel with the other data copying taking place, this  
part maybe requiring more dependency information than currently  
available.


And there was some parallel pg_dump idea floating around too, in order  
to give PostgreSQL the capability to saturate high-end hardware at  
pg_dump time, as far as I understood this part of the mails.


Of course, now that an open-source framework for parallel queries in  
PostgreSQL is available, can we skip asking whether having the  
executor benefit from it for general-purpose queries would be doable?


Regards,
--
dim





[HACKERS] avoid recasting text to tsvector when calculating selectivity

2008-07-16 Thread Jan Urbański
I'm about to write an oprrest function for the @@ operator. Currently @@ 
handles multiple cases, like tsvector @@ tsquery, text @@ tsquery, 
tsquery @@ tsvector etc. The text @@ text case, for instance, is handled 
by calling to_tsvector and plainto_tsquery on the input arguments.


For a @@ restriction function, I need to have a tsquery and a tsvector, 
so in the text @@ text situation I'd end up calling plainto_tsquery 
during planning, and it would consequently get called again during 
execution. Also, I'd need a not-so-elegant if-elsif-elsif sequence at 
the beginning of the function. Is this OK/unavoidable/easily avoided?


Cheers,
Jan
--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin



Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline

2008-07-16 Thread Oleg Bartunov

Sushant,

first, please provide simple test queries which demonstrate correct behaviour
in the corner cases. This will help reviewers to test your patch and
help you to make sure your new version is ok. For example:

=# select ts_headline('1 2 3 4 5 1 2 3 1','1&3'::tsquery);
                     ts_headline
------------------------------------------------------
 <b>1</b> 2 <b>3</b> 4 5 <b>1</b> 2 <b>3</b> <b>1</b>
This select breaks your code:

=# select ts_headline('1 2 3 4 5 1 2 3 1','1&3'::tsquery,'maxfragments=2');
 ts_headline
--------------
  ...  2 ...

and so on 


Oleg
On Tue, 15 Jul 2008, Sushant Sinha wrote:


Attached a new patch that:

1. fixes previous bug
2. better handles the case when cover size is greater than MaxWords.
Basically it divides a cover greater than MaxWords into fragments of
MaxWords, resizes each such fragment so that each end of the fragment
contains a query word, and then evaluates the best fragments based on the
number of query words in each fragment. In case of a tie it picks the smaller
fragment. This allows more query words to be shown with multiple fragments
in case a single cover is larger than MaxWords.

The resizing of a fragment such that each end is a query word provides room
for stretching both sides of the fragment. This (hopefully) better presents
the context in which query words appear in the document. If a cover is
smaller than MaxWords then the cover is treated as a fragment.
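The splitting-and-trimming described above can be sketched roughly like
this (illustrative C only, not the patch's actual code; the Fragment
struct and the name best_fragment are made up):

```c
#include <assert.h>

/* Sketch of the approach: split a cover into windows of at most
 * max_words, shrink each window so both ends land on query words,
 * score each window by its number of query words, and keep the best
 * one, preferring the shorter fragment on ties. */
typedef struct { int start, end, score; } Fragment;

static Fragment
best_fragment(const int *is_query_word, int nwords, int max_words)
{
    Fragment best = { -1, -1, -1 };
    int start;

    for (start = 0; start < nwords; start += max_words)
    {
        int s = start;
        int e = start + max_words - 1;
        int score = 0, i;

        if (e > nwords - 1)
            e = nwords - 1;

        /* resize: pull both ends in until they sit on a query word */
        while (s <= e && !is_query_word[s])
            s++;
        while (e >= s && !is_query_word[e])
            e--;
        if (s > e)
            continue;            /* no query word in this window */

        for (i = s; i <= e; i++)
            score += is_query_word[i];

        /* more query words wins; on a tie the smaller fragment wins */
        if (score > best.score ||
            (score == best.score && e - s < best.end - best.start))
        {
            best.start = s;
            best.end = e;
            best.score = score;
        }
    }
    return best;
}
```

For the text '1 2 3 4 5 1 2 3 1' and query 1 & 3 with MaxWords = 5, the
second window (positions 5..8) contains three query words and wins over
the first window's two.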

Let me know if you have any more suggestions or anything is not clear.

I have not yet added the regression tests. The regression test suite seemed
to be only ensuring that the function works. How many tests should I be
adding? Is there any other place that I need to add different test cases for
the function?

-Sushant.


Nice. But it will be good to resolve following issues:

1) Patch contains mistakes, I didn't investigate or carefully read it. Get
http://www.sai.msu.su/~megera/postgres/fts/apod.dump.gz
and load it into a db.

Queries
# select ts_headline(body, plainto_tsquery('black hole'), 'MaxFragments=1')
from apod where to_tsvector(body) @@ plainto_tsquery('black hole');

and

# select ts_headline(body, plainto_tsquery('black hole'), 'MaxFragments=1')
from apod;

crash postgresql :(

2) pls, include in your patch documentation and regression tests.



Another change that I was thinking:

Right now if cover size > max_words then I just cut the trailing words.
Instead I was thinking that we should split the cover into more
fragments such that each fragment contains a few query words. Then each
fragment will not contain all query words but will show more occurrences
of query words in the headline. I would like to know what your opinion
on this is.



Agreed.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
  WWW:
http://www.sigaev.ru/





Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline

2008-07-16 Thread Sushant Sinha
I will add test queries and their results for the corner cases in a
separate file. I guess the only thing I am confused about is what should
be the behavior of headline generation when Query items have words of
size less than ShortWord. I guess the answer is to ignore ShortWord
parameter but let me know if the answer is any different.

-Sushant.
 




Re: [HACKERS] [GENERAL] Fragments in tsearch2 headline

2008-07-16 Thread Oleg Bartunov

On Wed, 16 Jul 2008, Sushant Sinha wrote:


I will add test queries and their results for the corner cases in a
separate file. I guess the only thing I am confused about is what should
be the behavior of headline generation when Query items have words of
size less than ShortWord. I guess the answer is to ignore ShortWord
parameter but let me know if the answer is any different.



ShortWord applies to the headline text; it doesn't affect words in the query,
so you can't discard them from the query.
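The distinction can be shown with a toy sketch (an assumed
simplification, not ts_headline's actual logic; keep_in_headline is a
made-up name):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* ShortWord filters words of the headline *text*, while words that
 * match the query must always survive, whatever their length. */
static bool
keep_in_headline(const char *word, bool matches_query, int short_word)
{
    if (matches_query)
        return true;            /* never discard a query match */
    return strlen(word) >= (size_t) short_word;
}
```

So a two-letter query term still appears in the headline even with
ShortWord=3, while a two-letter non-query word may be dropped.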




Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] .psqlrc output for \pset commands

2008-07-16 Thread Bruce Momjian
Gregory Stark wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
 
  In my .psqlrc I have:
 
  \pset format wrapped
 
  and this outputs this on psql startup:
 
  $ psql test
  -- Output format is wrapped.
  psql (8.4devel)
  Type "help" for help.
 
  Is this desirable?  \set QUIET at the top of .psqlrc fixes it, but I am
  wondering if we should be automatically doing quiet while .psqlrc is
  processed.
 
 I was wondering about this myself, but I'm still not used to the new banner.
 It seems kind of... curt. Perhaps it should just be a single line instead of
 two lines both around 20 characters...
 
 Anyways the thing that struck me as odd was the messages appearing *before*
 the header. It seems to me the header should print followed by .psqlrc output
 followed by normal output.

Do you like this better?

$ psql test
psql (8.4devel)
Type "help" for help.
Output format is wrapped.

test=

The attached patch accomplishes this.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/bin/psql/startup.c
===
RCS file: /cvsroot/pgsql/src/bin/psql/startup.c,v
retrieving revision 1.149
diff -c -c -r1.149 startup.c
*** src/bin/psql/startup.c	1 Jul 2008 00:08:18 -	1.149
--- src/bin/psql/startup.c	17 Jul 2008 00:44:22 -
***
*** 281,292 ****
  	 */
  	else
  	{
  		if (!options.no_psqlrc)
  			process_psqlrc(argv[0]);
! 
! 		connection_warnings();
  		if (!pset.quiet && !pset.notty)
! 			printf(_("Type \"help\" for help.\n\n"));
  		if (!pset.notty)
  			initializeInput(options.no_readline ? 0 : 1);
  		if (options.action_string)		/* -f - was used */
--- 281,294 ----
  	 */
  	else
  	{
+ 		connection_warnings();
+ 		if (!pset.quiet && !pset.notty)
+ 			printf(_("Type \"help\" for help.\n"));
  		if (!options.no_psqlrc)
  			process_psqlrc(argv[0]);
! 		/* output newline here because .psqlrc might output something */
  		if (!pset.quiet && !pset.notty)
! 			printf("\n");
  		if (!pset.notty)
  			initializeInput(options.no_readline ? 0 : 1);
  		if (options.action_string)		/* -f - was used */



Re: [HACKERS] Change lock requirements for adding a trigger

2008-07-16 Thread Bruce Momjian

Added to TODO:

* Reduce locking requirements for creating a trigger

  http://archives.postgresql.org/pgsql-hackers/2008-06/msg00635.php


---

Simon Riggs wrote:
 
 On Wed, 2008-06-04 at 16:33 -0400, Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
   We have
   * relhasindex (bool) set by CREATE INDEX but not unset by DROP INDEX
   * relhasrules (bool)
   * reltriggers (int2) set by CREATE and DROP, since it's an integer
  
  Right.
  
   If CREATE INDEX can take a Share lock and can update pg_class, why would
   it not be theoretically possible for CREATE TRIGGER? 
  
  It's (probably) theoretically possible, if we replace reltriggers with a
  bool that acts more like relhasindex, ie it's a hint to go look in
  pg_triggers.  
 
 Looking at this area of locking, I've noticed that the locks held by
 CREATE TRIGGER are more of a problem than might be apparent. 
 
 * Locks held by CREATE TRIGGER are an issue for trigger-based
 replication systems, where triggers are frequently added and removed to
 various tables.
 
 * ALTER TABLE .. ADD FOREIGN KEY holds an AccessExclusiveLock on
 *both* referencing and referenced tables. It does this because we must
 add triggers to both tables. So reducing the lock strength required by
 CREATE TRIGGER would also allow a reduction in lock strength for adding
 FKs.
 
 So useful steps will be to
 
 * refactor pg_class code so that CREATE TRIGGER uses an identical
 approach to CREATE INDEX
 
 * reduce lock strength for CREATE TRIGGER and ALTER TABLE ... ADD
 FOREIGN KEY so that it takes a ShareLock during
 ATAddForeignKeyConstraint()
 
 * look at how we can reduce lock strength for other ALTER TABLE
 subcommands. Not sure how yet.
 
 -- 
  Simon Riggs   www.2ndQuadrant.com
  PostgreSQL Training, Services and Support
 
 

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



[HACKERS] small bug in hlCover

2008-07-16 Thread Sushant Sinha
I think there is a slight bug in hlCover function in wparser_def.c

If there is only one query item and that is the first word in the text,
then hlCover does not return any cover. This is evident in this example,
where ts_headline only generates min_words:

testdb=# select ts_headline('1 2 3 4 5 6 7 8 9 10','1'::tsquery,
'MinWords=5');
   ts_headline
------------------
 <b>1</b> 2 3 4 5
(1 row)

The problem is that *q is initialized to 0, which is a legitimate value
for a cover start. So I have attached a patch that fixes it; after applying
the patch, here is the result:

testdb=# select ts_headline('1 2 3 4 5 6 7 8 9 10','1'::tsquery,
'MinWords=5');
         ts_headline
------------------------------
 <b>1</b> 2 3 4 5 6 7 8 9 10
(1 row)
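The sentinel problem can be reproduced in miniature (an assumed
simplification; hlCover's real code works on QueryItems, not an int
array, and find_cover_start is a made-up name):

```c
#include <assert.h>
#include <stdbool.h>

/* With *q initialized to 0, a query word at position 0 is
 * indistinguishable from "no cover found", so the cover is dropped.
 * Initializing to -1 and testing *q < 0 fixes it. */
static bool
find_cover_start(const int *words, int nwords, int query,
                 bool buggy, int *q)
{
    int i;

    *q = buggy ? 0 : -1;        /* the patch changes this init to -1 */
    for (i = 0; i < nwords; i++)
    {
        if (words[i] == query)
        {
            *q = i;
            break;
        }
    }
    /* buggy version mirrors "if (*q == 0) return false" */
    return buggy ? (*q != 0) : (*q >= 0);
}
```

With the buggy sentinel, a match at position 0 reports "no cover"; with
-1 as the sentinel, it is reported correctly.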

-Sushant.
Index: src/backend/tsearch/wparser_def.c
===
RCS file: /home/postgres/devel/pgsql-cvs/pgsql/src/backend/tsearch/wparser_def.c,v
retrieving revision 1.15
diff -c -r1.15 wparser_def.c
*** src/backend/tsearch/wparser_def.c	17 Jun 2008 16:09:06 -	1.15
--- src/backend/tsearch/wparser_def.c	17 Jul 2008 02:45:34 -
***
*** 1621,1627 
  	QueryItem  *item = GETQUERY(query);
  	int			pos = *p;
  
! 	*q = 0;
  	*p = 0x7fff;
  
  	for (j = 0; j < query->size; j++)
--- 1621,1627 
  	QueryItem  *item = GETQUERY(query);
  	int			pos = *p;
  
! 	*q = -1;
  	*p = 0x7fff;
  
  	for (j = 0; j < query->size; j++)
***
*** 1643,1649 
  		item++;
  	}
  
! 	if (*q == 0)
  		return false;
  
  	item = GETQUERY(query);
--- 1643,1649 
  		item++;
  	}
  
! 	if (*q < 0)
  		return false;
  
  	item = GETQUERY(query);



Re: [HACKERS] avoid recasting text to tsvector when calculating selectivity

2008-07-16 Thread Tom Lane
Jan Urbański [EMAIL PROTECTED] writes:
 I'm about to write an oprrest function for the @@ operator. Currently @@ 
 handles multiple cases, like tsvector @@ tsquery, text @@ tsquery, 
 tsquery @@ tsvector etc. The text @@ text case is for instance handled 
 by calling to_tsvector and plainto_tsquery on the input arguments.

 For a @@ restriction function, I need to have a tsquery and a tsvector, 
 so in the text @@ text situation I'd end up calling plainto_tsquery 
 during planning, which would consequently get called again during 
 execution. Also, I'd need a not-so-elegant if-elsif-elsif sequence at 
 the beginning of the function. Is this OK/unavoidable/easily avoided?

I'm not following your point here.  Sure, there are multiple flavors of
@@, but why shouldn't they each have their own oprrest function?

regards, tom lane



Re: [HACKERS] [COMMITTERS] pgsql: Allow TRUNCATE foo, foo to succeed, per report from Nikhils.

2008-07-16 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Wed, 2008-07-16 at 17:59 -0400, Neil Conway wrote:
 On Wed, 2008-07-16 at 21:39 +0100, Simon Riggs wrote:
 So why do we need 
 TRUNCATE foo, foo;
 
 For the sake of completeness? Having TRUNCATE foo, foo fail would be
 rather inconsistent.

 Inconsistent with what exactly?

Well, it's certainly surprising that it fails entirely.  And if we
actually wanted to reject the case, it should be drawing an apropos
error message.  The fact is that this failure is just an implementation
issue.

 Our users will be surprised to find this was at the top of our list

If it had taken more than five lines of code to fix, I might agree with
you.  But we don't stop fixing bugs just because commitfest is on,
especially not trivial ones.

regards, tom lane



Re: [HACKERS] Postgres-R source code release

2008-07-16 Thread David Fetter
On Wed, Jul 16, 2008 at 09:35:28PM +0200, Markus Schiltknecht wrote:
 Hi,

 David Fetter wrote:
 Would you mind if I were to make a git branch for it on
 http://git.postgresql.org/ ?

 I've set up a git-daemon with the Postgres-R patch here:

 git://postgres-r.org/repo

 Since it's a distributed VCS, you should be able to mirror that to  
 git.postgresql.org somehow (if you figure out how, please tell me!).

I've merged the latest Postgres in.  Care to see whether it runs?
http://git.postgresql.org/?p=~davidfetter/pgr/.git;a=summary

 Please note that I'm still struggling with git and I cannot promise
 to  keep using it.

I'm struggling, too, but the cheapness of experimenting is making it
easier and easier :)

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
