Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Gregory Stark
Kaare Rasmussen [EMAIL PROTECTED] writes:

 Hi 

 The database is initialized with utf8, so in order for LIKE to use the index
 on a text field, I used text_pattern_ops when I created it. So far so good.

 It's in the documentation, but there's no explanation of why this index will
 only work for LIKE searches. How come I have to have two different indexes
 if I want to give Postgres the ability to choose an index scan over a seq
 scan on LIKE and non-LIKE searches?

Because in non-C locales (which you're almost certainly using if you're using
UTF8) the ordering which the normal text operators use can be quite complex.
Just as an example, most locales treat spaces as entirely insignificant, so
no range can reliably match a prefix LIKE pattern. The text_pattern_ops class
uses a simple character-by-character ordering which is useful for LIKE but not
for regular < and > comparisons. They're just two different orderings.
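To illustrate the two orderings (a minimal sketch; the table, column, and
index names are hypothetical, and the internal rewrite shown in the comments
is approximate):

```sql
-- An index usable by ordinary <, >, and ORDER BY in the database's collation:
CREATE INDEX a_b_idx ON a (b);

-- An index in simple character-by-character order, usable by prefix LIKE:
CREATE INDEX a_b_pattern_idx ON a (b text_pattern_ops);

-- The planner can turn a left-anchored LIKE into a range scan on the
-- pattern-ops index, roughly:
--   WHERE b LIKE 'abc%'
-- becomes
--   WHERE b ~>=~ 'abc' AND b ~<~ 'abd'
SELECT * FROM a WHERE b LIKE 'abc%';
```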

 Also, when I tried to create the index as a partial one (avoiding the 95% of
 entries with empty strings), PostgreSQL chooses to use a seq scan. This
 sounds counter-intuitive to me.

 CREATE INDEX new_index ON a (b text_pattern_ops) WHERE b <> '';
 This is 8.2.6.

Hm, for a simple = or <> I think it doesn't matter which operator class you
use. For < or > it would produce different answers. Postgres isn't clever
enough to notice that this is equivalent though, so I think you would have to
do something like (untested):

CREATE INDEX new_index ON a (b text_pattern_ops) WHERE b ~<>~ '';

That uses the same operator that the LIKE clause will use for the index range.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Hm, for a simple = or <> I think it doesn't matter which operator class you
 use. For < or > it would produce different answers. Postgres isn't clever
 enough to notice that this is equivalent though, so I think you would have to
 do something like (untested):

 CREATE INDEX new_index ON a (b text_pattern_ops) WHERE b ~<>~ '';

 That uses the same operator that the LIKE clause will use for the index range.

 I'm intending to get rid of ~=~ and ~<>~ for 8.4; there's no longer any
 reason why those slots in the pattern_ops classes can't be filled by the
 plain = and <> operators.  (There *was* a reason when they were first
 invented --- but now that texteq will only return true for exact bitwise
 match, I think it's OK to assume these are equivalent.)

The only question is whether we'll keep that forever. I thought it was a good
idea at the time but I'm starting to wonder about the implications for
multi-key indexes.

 In the meantime, though, I think the only way that Kaare's query can use
 that index is if he writes
   WHERE b LIKE 'whatever' AND b <> '';
 (with whatever spelling of <> the index predicate has).  There is not
 anything in the predicate proving machinery that knows enough about LIKE
 to be able to show that b LIKE 'whatever' implies b <> ''.

I was thinking that the inequalities that the LIKE index scan introduces would
imply the inequality. I take it we generate those inequalities too late in the
planning process to use them for other planning? 


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Hm, for a simple = or <> I think it doesn't matter which operator class you
 use. For < or > it would produce different answers. Postgres isn't clever
 enough to notice that this is equivalent though, so I think you would have to
 do something like (untested):

 CREATE INDEX new_index ON a (b text_pattern_ops) WHERE b ~<>~ '';

 That uses the same operator that the LIKE clause will use for the index range.

I'm intending to get rid of ~=~ and ~<>~ for 8.4; there's no longer any
reason why those slots in the pattern_ops classes can't be filled by the
plain = and <> operators.  (There *was* a reason when they were first
invented --- but now that texteq will only return true for exact bitwise
match, I think it's OK to assume these are equivalent.)

In the meantime, though, I think the only way that Kaare's query can use
that index is if he writes
WHERE b LIKE 'whatever' AND b <> '';
(with whatever spelling of <> the index predicate has).  There is not
anything in the predicate proving machinery that knows enough about LIKE
to be able to show that b LIKE 'whatever' implies b <> ''.
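Putting the workaround together as one untested sketch (table, column, and
index names are hypothetical):

```sql
-- Partial pattern-ops index that skips the ~95% empty strings:
CREATE INDEX a_b_pattern_idx ON a (b text_pattern_ops) WHERE b <> '';

-- The query must repeat the index predicate explicitly, because the
-- planner cannot prove that "b LIKE 'whatever%'" implies "b <> ''":
SELECT * FROM a WHERE b LIKE 'whatever%' AND b <> '';
```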

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 I'm intending to get rid of ~=~ and ~<>~ for 8.4; there's no longer any
 reason why those slots in the pattern_ops classes can't be filled by the
 plain = and <> operators.  (There *was* a reason when they were first
 invented --- but now that texteq will only return true for exact bitwise
 match, I think it's OK to assume these are equivalent.)

 The only question is whether we'll keep that forever. I thought it was a good
 idea at the time but I'm starting to wonder about the implications for
 multi-key indexes.

How so?  If you think this change is a bad idea you'd better speak up
PDQ.

 In the meantime, though, I think the only way that Kaare's query can use
 that index is if he writes
 WHERE b LIKE 'whatever' AND b <> '';
 (with whatever spelling of <> the index predicate has).  There is not
 anything in the predicate proving machinery that knows enough about LIKE
 to be able to show that b LIKE 'whatever' implies b <> ''.

 I was thinking that the inequalities that the LIKE index scan introduces would
 imply the inequality. I take it we generate those inequalities too late in the
 planning process to use them for other planning? 

Hmm, good point [ experiments... ]  Yeah, it seems we don't reconsider
partial indexes after those clauses have been generated.  Not sure how
expensive it'd be to change that.

regards, tom lane



Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 I'm intending to get rid of ~=~ and ~<>~ for 8.4; there's no longer any
 reason why those slots in the pattern_ops classes can't be filled by the
 plain = and <> operators.  (There *was* a reason when they were first
 invented --- but now that texteq will only return true for exact bitwise
 match, I think it's OK to assume these are equivalent.)

 The only question is whether we'll keep that forever. I thought it was a good
 idea at the time but I'm starting to wonder about the implications for
 multi-key indexes.

 How so?  If you think this change is a bad idea you'd better speak up
 PDQ.

Well, I think it's fine for 'foo ' != 'foo' even if they sort similarly.

But I'm not sure it makes sense for ('foo ','a') to sort after ('foo','b') if
the locale says that 'foo ' should compare equal to 'foo' and 'a' before
'b'.

I'm starting to think that equality and "sorts as the same" are not the same
operator at all. On the other hand it may not be worth the complexity that
might bring.

 I was thinking that the inequalities that the LIKE index scan introduces
 would imply the inequality. I take it we generate those inequalities too
 late in the planning process to use them for other planning?

 Hmm, good point [ experiments... ]  Yeah, it seems we don't reconsider
 partial indexes after those clauses have been generated.  Not sure how
 expensive it'd be to change that.

Perhaps we should always generate those inequalities even if there's no index
that can use them. Then calculate the regexp selectivity based only on the
regexp after the prefix.

That might also help constraint exclusion. Maybe merge joins?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 How so?  If you think this change is a bad idea you'd better speak up
 PDQ.

 Well, I think it's fine for 'foo ' != 'foo' even if they sort similarly.

 But I'm not sure it makes sense for ('foo ','a') to sort after ('foo','b') if
 the locale says that 'foo ' should compare equal to 'foo' and 'a' before
 'b'.

I don't think we can concern ourselves with that; it would require
allowing different columns of an index to interact, which would be
impossibly messy.  What's more, it'd destroy the property that a btree
index is sorted by its leading column(s) as well as by all its columns.

 Perhaps we should always generate those inequalities even if there's no index
 that can use them.

Hmmm ... we intentionally don't do that, but the constraint exclusion
code might be a sufficient reason to reconsider.

regards, tom lane



Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 How so?  If you think this change is a bad idea you'd better speak up
 PDQ.

 Well, I think it's fine for 'foo ' != 'foo' even if they sort similarly.

 But I'm not sure it makes sense for ('foo ','a') to sort after ('foo','b') if
 the locale says that 'foo ' should compare equal to 'foo' and 'a' before
 'b'.

 I don't think we can concern ourselves with that; it would require
 allowing different columns of an index to interact, which would be
 impossibly messy.  What's more, it'd destroy the property that a btree
 index is sorted by its leading column(s) as well as by all its columns.

Well, I was thinking we might have to separate the equals operators from the
btree opclass. Equals would be a stricter property than "sorts as the same".

It would be mighty strange to have values which compare unequal but are
neither < nor > though. Or which compare equal but also compare < or >.

It might be a little less surprising if we invent a new operator === for
"actually the same" and have == report whether two objects sort as equals. But
I'm not sure our experience with Turkish doesn't show that that will still
surprise people.

It may be more right in an abstract ideal world -- the reality is that text
collation is annoyingly complex. But this may be a case where we can get away
with just eliding this hassle.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!



Re: [HACKERS] Questions about indexes with text_pattern_ops

2008-02-25 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 It may be more right in an abstract ideal world -- the reality is that text
 collation is annoyingly complex. But this may be a case where we can get away
 with just eliding this hassle.

If anyone actually complains about it, I think we can point to the SQL
spec, which unambiguously says that a multicolumn sort key is considered
one column at a time.
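The spec's column-at-a-time rule is easy to see with a two-row example (a
minimal sketch; the VALUES list is made up):

```sql
-- Multicolumn sort keys are compared one column at a time: the first
-- column fully decides the order unless the values compare equal, so
-- ('foo', 'b') sorts before ('foo ', 'a') whenever 'foo' < 'foo ' in
-- the collation, regardless of the second column.
SELECT * FROM (VALUES ('foo ', 'a'), ('foo', 'b')) AS v(x, y)
ORDER BY x, y;
```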

regards, tom lane



Re: [HACKERS] Questions about indexes?

2003-02-17 Thread Daniel Kalchev
Ryan Bradetich said:
  the table would look like:
  1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
  1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has an invalid shell.
  1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has expired password.
  2 | Mon Feb 17 00:34:24 MST 2003 | f101 | file /foo has improper owner.
  etc...
  
  So I do not need the anomaly to be part of the index, I only need it to 
  
  I agree with you, that I would not normally add the anomaly to the
  index, except for the unique row requirement.  Thinking about it now,
  maybe I should guarantee unique rows via a check constraint...
  
  Thanks for making me think about this in a different way!

(sorry this is a bit long)

Ryan,

I use somewhat similarly structured data (archived records of various events) 
and when the database was setup (back when this baby was called postgres95), I 
too used indexes on all possible fields.

My database consists of an 'operations' table, which holds data for the last x
days (for example), and several tables with archived records (per month or
per year - see later). The operations table can have frequent updates, which
add new data. Data is never modified, but lookups are made often. The archived
tables are generated once and forever from the operations table (possibly
merging in the future, but I haven't yet made up my mind on this) - then
access is read-only, although sufficiently frequent.

What I have found over the many years of operating this database on different
PostgreSQL versions and hardware is that indexes have considerable cost. :)
So does the need to not miss anything from the operations table (that is,
collect data from many places and have it all in there).

I ended up with only a few indexes on the operations table, because the
processes that fill it up do minimal lookups to see if data is already in the
table, and if not do inserts. Then at regular intervals the table is cleaned
up - that is, a process to remove the duplicates is run. This unfortunately
costs OIDs, but I found no other reasonable way to do the fast inserts.
Perhaps the best way is to create the table without OIDs (but wouldn't this
still waste OIDs?), use COPY, and then clean afterwards?

The archived tables are generated, then cleaned up. Then, as Tom suggested 
indexes are put on the archived tables, only for the fields that are used in 
queries. Once the table is created, there is no way duplicated data will 
exist, as it will not be inserted into. Therefore no need for UNIQUE index 
enforcement.

If you need to have one large 'history' table, then perhaps you will just have 
to do (slow :) selects for each record before each insert, or just insert the 
data and then run the cleanup process.
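The duplicate-removal pass described above can be sketched with the classic
OID trick (untested; the column names follow Ryan's schema, the table name is
hypothetical, and it presumes the table keeps its OIDs):

```sql
-- Keep only the row with the lowest OID in each group of duplicates.
-- This is why the approach "costs OIDs": every insert consumes one,
-- even for rows later deleted as duplicates.  NOT IN over a big table
-- is slow, but the cleanup only runs at intervals.
DELETE FROM operations
WHERE oid NOT IN (
    SELECT min(oid)
    FROM operations
    GROUP BY host_id, "timestamp", category, anomaly
);
```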

Daniel





Re: [HACKERS] Questions about indexes?

2003-02-17 Thread Christopher Kings-Lynne
 I ended up with only a few indexes on the operations table, because the
 processes that fill it up do minimal lookups to see if data is already in the
 table, and if not do inserts. Then at regular intervals the table is cleaned
 up - that is, a process to remove the duplicates is run. This unfortunately
 costs OIDs, but I found no other reasonable way to do the fast inserts.
 Perhaps the best way is to create the table without OIDs (but wouldn't this
 still waste OIDs?), use COPY, and then clean afterwards?

No, WITHOUT OIDS is implemented specifically to not waste OIDs.

Chris





Re: [HACKERS] Questions about indexes?

2003-02-17 Thread Tom Lane
Ryan Bradetich [EMAIL PROTECTED] writes:
 the table would look like:
 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has an invalid shell.

Ah, I see your point now.  (Thinks: what about separating the anomaly
column into an identifier and a complaint column:

1 | Mon Feb 17 00:34:24 MST 2003 | p101 | x    | user has an invalid shell.
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | y    | user has an invalid shell.
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | y    | user has expired password.
2 | Mon Feb 17 00:34:24 MST 2003 | f101 | /foo | file has improper owner.

No, that doesn't quite work either, unless you are willing to make the
categories more specific.  At which point the category and the anomaly
text become equivalent.  Actually I'm wondering why you bother with the
category at all; isn't it implied by the anomaly text?)

 I agree with you, that I would not normally add the anomaly to the
 index, except for the unique row requirement.  Thinking about it now,
 maybe I should guarantee unique rows via a check constraint...

A check constraint won't be efficient either, at least not without a
supporting index.  Possibly you could index just the host and timestamp
columns, which would not be unique but it would cut the number of rows
the constraint would need to examine to something manageable.

But I'm still thinking that enforcing uniqueness is a waste of time.
What exactly is so harmful about it if
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
appears twice?  How likely is that anyway (especially if you don't
truncate the timestamp precision)?

regards, tom lane




Re: [HACKERS] Questions about indexes?

2003-02-17 Thread Curt Sampson
On Mon, 16 Feb 2003, Ryan Bradetich wrote:

 I am not sure why all the data is duplicated in the index ...

Well, you have to have the full key in the index, or how would you know,
when you look at a particular index item, if it actually matches what
you're searching for?

MS SQL server does have an interesting option that would help you a lot
in this case: clustered indexes. A table may have a single clustered
index, and each leaf node of the index stores not just the key but
actually the entire row. Thus, in a case like yours, you'd store the row
only once, not twice.

Without thinking too hard about it (my usual mode of operation on this
list :-)) this could probably be implemented in postgresql. But I don't
think it would be entirely trivial, and your case is unusual enough
that I very much doubt whether it would be worth implementing to fix
that alone. It would also offer the advantage that any lookup using the
clustered index would save fetching the heap page after that as well,
but it's hard to say if the savings would be worth the work.

 Since my only requirement is that the rows be unique, I have developed a
 custom MD5 function in C, and created an index on the MD5 hash of the
 concatenation of all the fields.

Well, that won't guarantee uniqueness, since it's perfectly possible
to have two different rows hash to the same value. (If that weren't
possible, your hash would have to contain as much information as the row
itself, and your space savings wouldn't be nearly so dramatic.)

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC




Re: [HACKERS] Questions about indexes?

2003-02-17 Thread Kevin Brown
Curt Sampson wrote:
 On Mon, 16 Feb 2003, Ryan Bradetich wrote:
  Since my only requirement is that the rows be unique, I have developed a
  custom MD5 function in C, and created an index on the MD5 hash of the
  concatenation of all the fields.
 
 Well, that won't guarantee uniqueness, since it's perfectly possible
 to have two different rows hash to the same value. (If that weren't
 possible, your hash would have to contain as much information as the row
 itself, and your space savings wouldn't be nearly so dramatic.)

That's true, but even if he has 4 billion rows it drops the
probability of a duplicate down to something like one in 4 billion, so
it's probably a safe enough bet.  His application doesn't require
absolute uniqueness, fortunately, so md5 works well enough in this
case.

Otherwise md5 wouldn't be a terribly good hash...


-- 
Kevin Brown   [EMAIL PROTECTED]




[HACKERS] Questions about indexes?

2003-02-16 Thread Ryan Bradetich
Hello postgres hackers,

Been a while since I have participated on this list ... but I have a new
itch to scratch

Although the table schema is immaterial, I will provide it so we have a
common framework for this discussion:


host_id    integer   (not null)
timestamp  datetime  (not null)
category   text      (not null)  [<= 5 chars]
anomaly    text      (not null)  [<= 1024 chars]

This table is used to store archived data, so each row in the table must
be unique.  Currently I am using a primary key across each column to
enforce this uniqueness.  This table currently has ~86 million rows and
is 16+ GB in size.  This primary key index is also 16+ GB in size,
because it appears all the data is duplicated in the index.  (I have
only done some preliminary looking at the database file with strings,
etc ... so this assumption is purly based on these observations).

I am not sure why all the data is duplicated in the index ... but I bet
it has to do with performance, since it would save a lookup in the main
table.  Are there any benchmarks or papers related to this topic I should
locate and read?  I am curious about this because it seems the only
advantage gained is searching the index for the specified values.
Once the entry is found, the full entry needs to be pulled from the main
table anyhow since the index does not contain all the data.  Also with
the increased size, it seems additional pressure would be put on the
shared memory caches (no idea how this really works, just guessing! :))


Since my only requirement is that the rows be unique, I have developed a
custom MD5 function in C, and created an index on the MD5 hash of the
concatenation of all the fields.  This has reduced the disk space usage
considerably, as shown below against my test database of ~6 million rows
at 1+ GB.

All this data is based off the test database running 7.3.2:

Type              Size
----------------  ----------
Database table    1188642816
All-columns pkey  1510252544
MD5 column pkey    370999296

Just using the MD5 hash data instead of all the columns is a considerable
diskspace win, going from 1.5 GB to 370 MB.
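A sketch of the approach (untested; the table and index names are made up,
and it uses an expression index with the built-in md5(), both of which are
newer than the 7.2/7.3 releases in this thread - Ryan's custom C function
served the same role there):

```sql
-- Enforce (near-)uniqueness on a 32-character hash instead of the
-- full ~1 KB row, shrinking the index considerably.  A delimiter
-- between fields avoids cross-boundary collisions such as
-- ('ab','c') vs ('a','bc').
CREATE UNIQUE INDEX archive_md5_key ON archive (
    md5(host_id::text || '|' || "timestamp"::text || '|' ||
        category || '|' || anomaly)
);
```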

Has anyone else solved this problem?  Has anyone else looked into
something like this and would mind sharing, so I do not have to re-invent the
wheel? :)  Also (assuming there are no papers / benchmarks proving data
in the index is a good idea), how difficult would it be to implement an
index type that extracts the data from the main table?


Thanks for reading.  I will be happy to field any question that I can,
or read any papers, research, etc that relates to this topic.

- Ryan

P.S. the production database is running 7.2.4 if that makes a
difference.

-- 
Ryan Bradetich [EMAIL PROTECTED]





Re: [HACKERS] Questions about indexes?

2003-02-16 Thread Tom Lane
Ryan Bradetich [EMAIL PROTECTED] writes:
 Although the table schema is immaterial, I will provide it so we have a
 common framework for this discussion:

   host_id    integer   (not null)
   timestamp  datetime  (not null)
   category   text      (not null)  [<= 5 chars]
   anomaly    text      (not null)  [<= 1024 chars]

 This table is used to store archived data, so each row in the table must
 be unique.  Currently I am using a primary key across each column to
 enforce this uniqueness.

It's not real clear to me why you bother enforcing a constraint that the
complete row be unique.  Wouldn't a useful constraint be that the first
three columns be unique?  Even if that's not correct, what's wrong with
tolerating a few duplicates?  You can't tell me it's to save on storage
;-)

 I am not sure why all the data is duplicated in the index ... but I bet
 it has to do with performance since it would save a lookup in the main
 table.

An index that can't prevent looking into the main table wouldn't be
worth anything AFAICS ...

regards, tom lane




Re: [HACKERS] Questions about indexes?

2003-02-16 Thread Tom Lane
Ryan Bradetich [EMAIL PROTECTED] writes:
 On Sun, 2003-02-16 at 23:34, Tom Lane wrote:
 It's not real clear to me why you bother enforcing a constraint that the
 complete row be unique.  Wouldn't a useful constraint be that the first
 three columns be unique?

 The table holds system policy compliance data.  The category is
 basically the policy, and the anomaly is the detailed text explaining
 why the system is out of compliance.  So the anomaly data is important
 (and often the reason why the key is unique).

Well, sure the anomaly is important: it's the payload, the reason why
you bother to have the table in the first place.  But that doesn't mean
it's part of the key.  Generally the key would be the info you use to
look up a particular anomaly text.  In this example, it's not clear to
me why you'd need/want two different anomaly texts entered for the same
host_id and the same category at the same instant of time.  ISTM there's
something inadequate about your category column if you need that.

regards, tom lane




Re: [HACKERS] Questions about indexes?

2003-02-16 Thread Ryan Bradetich
On Mon, 2003-02-17 at 00:15, Tom Lane wrote:
 Ryan Bradetich [EMAIL PROTECTED] writes:
  On Sun, 2003-02-16 at 23:34, Tom Lane wrote:
  It's not real clear to me why you bother enforcing a constraint that the
  complete row be unique.  Wouldn't a useful constraint be that the first
  three columns be unique?
 
  The table holds system policy compliance data.  The category is
  basically the policy, and the anomaly is the detailed text explaining
  why the system is out of compliance.  So the anomaly data is important
  (and often the reason why the key is unique).
 
 Well, sure the anomaly is important: it's the payload, the reason why
 you bother to have the table in the first place.  But that doesn't mean
 it's part of the key.  Generally the key would be the info you use to
 look up a particular anomaly text.  In this example, it's not clear to
 me why you'd need/want two different anomaly texts entered for the same
 host_id and the same category at the same instant of time.  ISTM there's
 something inadequate about your category column if you need that.

Ok, I understand what you are asking now :)

Let me make up a contrived example to show how the table is used.

host_id 1 = hosta.somewhere.com
host_id 2 = hostb.somewhere.com

The categories are coded like so (made-up examples):
cat p101 = /etc/passwd check
cat f101 = filesystem check.

the table would look like:
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has an invalid shell.
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has expired password.
2 | Mon Feb 17 00:34:24 MST 2003 | f101 | file /foo has improper owner.
etc...

So I do not need the anomaly to be part of the index, I only need it to 

I agree with you, that I would not normally add the anomaly to the
index, except for the unique row requirement.  Thinking about it now,
maybe I should guarantee unique rows via a check constraint...

Thanks for making me think about this in a different way!

- Ryan

-- 
Ryan Bradetich [EMAIL PROTECTED]

