RE: [Zope-dev] Hey Chris, question for you

2001-06-28 Thread Toby Dickenson

 I think it has changed for FieldIndexes.

Yes, from UnKeywordIndex.py

newKeywords = getattr(obj, self.id, ())

 You can now make the distinction between "doesn't have that attribute"
 and "attribute is one of [None, '', [], ()]" within a Field Index.

Reviewing UnKeywordIndex.py, I don't see what an object should do to mean
"I don't have that attribute, so don't include me in this FieldIndex". Any
suggestions?
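
For reference, a minimal sketch of why the quoted `getattr(obj, self.id, ())` line cannot tell the two cases apart, and how a private sentinel default could. The classes, the `_MISSING` name and both helper functions are hypothetical illustrations and are not taken from UnKeywordIndex.py.

```python
# Hypothetical stand-ins for cataloged objects; this is not Zope code.
_MISSING = object()  # unique sentinel: no real attribute value is this object


class WithEmptyKeywords:
    keywords = ()


class WithoutKeywords:
    pass


def read_with_tuple_default(obj, attr_id):
    # Same shape as the quoted line: a missing attribute looks like ().
    return getattr(obj, attr_id, ())


def read_with_sentinel(obj, attr_id):
    # Returns (has_attribute, value) so the caller can skip indexing
    # objects that do not carry the attribute at all.
    value = getattr(obj, attr_id, _MISSING)
    if value is _MISSING:
        return (False, None)
    return (True, value)
```

With the tuple default both objects yield `()`, so the index cannot know which one lacked the attribute; with the sentinel the first element of the result carries that information.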

Thanks for your time,

___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )



Re: [Zope-dev] Hey Chris, question for you

2001-06-27 Thread Casey Duncan

Chris McDonough wrote:
 
 Hi Casey,
 
 Changes were recently made to Field/Keyword Indexes so that they will
 store empty items.  An equivalent change could be made to TextIndexes...
 we'd need to think about that a bit.
 
 But for your purposes, you might want to start out attempting to write
 your operator implementation using Field and Keyword indexes...
 
 - C
 
 Michel Pelletier wrote:
 
 
  Hmm the reason for the current behavior was optimization by saving space
  not indexing empty values.  The problem with your latter approach is that
  all objects in the catalog may include objects that don't have a title
  attribute at all.
 
  I'm not against indexing empty values though.
 
  -Michel
 

My implementation does not modify the behavior of the indexes in any
way, and I would like to keep it that way if possible. I have been able
to (thus far) pull this off without compromises, which was my hope in
the beginning.

I guess the question here is given the query:

spam != 'eggs'

Should objects be returned that do not have an attribute spam at all?
For the behavior to be intuitive, I would say yes, but that is just my
opinion. I also thought of an optimization that could eventually be
included if this behavior is adopted. For example, take the following
query expression:

title == 'foo' and spam != 'eggs'

As implemented, my query engine does the following:

1. Find items where title matches 'foo' (exact behavior depends on
index type)
2. Find items where spam matches 'eggs'
3. Take the difference of all items in the index spam and the result of
#2
4. Return the intersection of #3 and #1

To be intuitive (I use that term loosely) I think it should be:

1. Find items where title matches 'foo'
2. Find items where spam matches 'eggs'
3. Take the difference of all items in the catalog and the result of #2
4. Return the intersection of #3 and #1

Which can be optimized as:

1. Find items where title matches 'foo'
2. Find items where spam matches 'eggs'
3. Return the difference of #1 and #2

If an or is used in place of the and, then the optimization doesn't
apply though.
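
The three plans above can be compared with ordinary set algebra. This is only a sketch: plain Python sets stand in for the catalog's result sets, and the sample data and the `ids_in_index` helper are made up for illustration.

```python
# Plain Python sets stand in for the catalog's integer result sets.
catalog_ids = {1, 2, 3, 4, 5}                    # every document in the catalog
title_index = {'foo': {1, 2, 3}, 'bar': {4, 5}}  # value -> document ids
spam_index = {'eggs': {1, 4}, 'ham': {2}}        # documents 3 and 5 have no spam


def ids_in_index(index):
    """Return every document id that appears anywhere in the index."""
    result = set()
    for ids in index.values():
        result |= ids
    return result


title_is_foo = title_index.get('foo', set())
spam_is_eggs = spam_index.get('eggs', set())

# Plan 1: complement taken against the spam index only.
as_implemented = title_is_foo & (ids_in_index(spam_index) - spam_is_eggs)

# Plan 2: complement taken against the whole catalog.
intuitive = title_is_foo & (catalog_ids - spam_is_eggs)

# Plan 3: the shortcut, valid because A & (U - B) == A - B when A is within U.
optimized = title_is_foo - spam_is_eggs

# With "or" the complement against the catalog has to be computed in full.
with_or = title_is_foo | (catalog_ids - spam_is_eggs)
```

Here document 3 has the title 'foo' but no spam attribute, so it is dropped by the first plan and kept by the second and third, which always agree with each other.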

One other thing:

I noticed (with a colleague) that passing a list of values to a
FieldIndex and a TextIndex results in nearly opposite behavior. The
FieldIndex does a union on the results of querying against each item in
the list whereas the TextIndex does an intersection. This seemed highly
inconsistent to me, another thread perhaps...
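
A toy model of the two behaviors described above, for comparison. The dictionaries and both query functions are hypothetical; they mimic only the union versus intersection semantics and are not the actual FieldIndex or TextIndex code.

```python
# Toy index: value (or word) -> set of document ids.
toy_index = {'red': {1, 2}, 'blue': {2, 3}}


def query_union(index, values):
    """FieldIndex-like: a document matches if it matches ANY listed value."""
    result = set()
    for value in values:
        result |= index.get(value, set())
    return result


def query_intersection(index, values):
    """TextIndex-like: a document matches only if it matches ALL values."""
    result = None
    for value in values:
        ids = index.get(value, set())
        result = set(ids) if result is None else result & ids
    return result if result is not None else set()
```

Passing `['red', 'blue']` to the first function selects documents 1, 2 and 3, while the second selects only document 2, which is the inconsistency in question.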

-- 
| Casey Duncan
| Kaivo, Inc.
| [EMAIL PROTECTED]
`--




Re: [Zope-dev] Hey Chris, question for you

2001-06-27 Thread Toby Dickenson

On Tue, 26 Jun 2001 15:42:40 -0700 (PDT), Michel Pelletier
[EMAIL PROTECTED] wrote:

Hmm the reason for the current behavior was optimization by saving space
not indexing empty values.

I was always very pleased with that characteristic, but I had not
realised it was a design goal. 

I thought I observed that characteristic had changed in a recent Zope
release... hmmm, I'll take a look.


Toby Dickenson
[EMAIL PROTECTED]




Re: [Zope-dev] Hey Chris, question for you

2001-06-27 Thread Chris McDonough

I think it has changed for FieldIndexes.  You can now make the distinction
between "doesn't have that attribute" and "attribute is one of [None, '', [],
()]" within a Field Index.  You do this in an almost natural way, the major
exception being that you need to wrap a blank string ('') in a sequence in
the query (e.g. title=['']) due to hysterical behavior.

I'm not sure about Text Indexes.




Re: [Zope-dev] Hey Chris, question for you

2001-06-26 Thread Chris McDonough

 Chris:

 I am working on getting a decent query language for ZCatalog/Catalog and

Very cool...

 I have been able to make good progress, however I am running into a bit
 of an issue that I thought you might know something about:

 In order to implement a != query operator, I am trying to do the
 following:

Tricky.

 From the index, return the result set that match the value (easy)
 Subtract that from the set of all items in the index (not so easy)

 I see that there is the difference method available from IIBTree,
 however I seem to be unable to use it on the entire index (Which is an
 OOBTree and not really a set I guess). Here is a snippet of my code
 which doesn't work:

 if op == '!=' or op[:3] == 'not':
     w, rs = difference(index._index, rs) # XXX Not a warm fuzzy...

 (where rs is the index result set that matches the value and index is
 the Catalog index OOBTree)

 What can I supply for the first argument to get a set of all items in
 the index, or is there any easier and better approach to this whole
 issue?

Well.. I assume that _index is the forward data structure of a FieldIndex.
In this case, you could get the info you want (a list of all document ids in
the index) from _unindex.keys(), as _index and _unindex are mirror images of
each other that need to be kept in sync... I think what comes back is a
BTreeItems object.  I think this is usable in conjunction with the resultset
IISet (also a list of document ids) via the difference function... I haven't
tried it, though...
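
A rough sketch of the suggestion above, under stated assumptions: plain dicts and sets stand in for the BTree structures, and `ToyFieldIndex` with its methods is a hypothetical class, not the real FieldIndex. It only shows the idea of using the keys of the reverse mapping as "all document ids in the index".

```python
class ToyFieldIndex:
    """Toy model of an index with mirrored forward and reverse mappings."""

    def __init__(self):
        self._index = {}    # forward: value -> set of document ids
        self._unindex = {}  # reverse: document id -> value

    def index_object(self, doc_id, value):
        # Assumes each document is indexed once; re-indexing is not handled.
        self._index.setdefault(value, set()).add(doc_id)
        self._unindex[doc_id] = value

    def equals(self, value):
        """Document ids whose indexed value equals `value`."""
        return set(self._index.get(value, ()))

    def not_equals(self, value):
        """All document ids in the index minus those matching `value`."""
        all_ids = set(self._unindex.keys())
        return all_ids - self.equals(value)


idx = ToyFieldIndex()
idx.index_object(1, 'eggs')
idx.index_object(2, 'ham')
idx.index_object(3, 'eggs')
```

Because the reverse mapping is already keyed by document id, no walk over the forward mapping's items is needed to build the full set.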

 BTW: I realize I could step though _index.items() and create an IISet
 but that seems awful inefficient...

Yeah, that'd be terrible.

This is a tricky operator.  I can't really wrap my head around using it in
conjunction with parens.  Then again, maybe you wouldn't...


HTH,

- C






Re: [Zope-dev] Hey Chris, question for you

2001-06-26 Thread Casey Duncan

Chris McDonough wrote:
 
  Chris:
 
  I am working on getting a decent query language for ZCatalog/Catalog and
 
 Very cool...
 
  I have been able to make good progress, however I am running into a bit
  of an issue that I thought you might know something about:
 
  In order to implement a != query operator, I am trying to do the
  following:
 
 Tricky.
 

Ok, I was able to get it to work by instantiating an IISet around
_unindex.keys() and passing that to difference (Thanks!), however, I
notice an interesting side effect. Let's say you have a TextIndex on
title and you do the following query:

title != 'foo*'

Which to me means: all cataloged objects whose title does not match the
substring 'foo*'

However, this is not what you get exactly, instead you get:

all cataloged objects that have a non-empty title that does not match
the substring 'foo*'

Because from what I am seeing, objects with empty (or no) titles are not
included in the index *at all*. So the set of all objects does not
include ones without titles. I could fix this by making "all objects"
mean "all objects in the catalog" (via catalog.data.keys()) instead of
"all objects in the index", but I wanted to see if anyone had
additional thoughts about this.
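
The side effect can be reproduced with a toy model. Everything here is an assumption for illustration: `documents` plays the role of the catalog's record of all objects, the index skips empty or missing titles as described, and `startswith` approximates the 'foo*' wildcard.

```python
# Toy catalog: document id -> attributes.
documents = {
    1: {'title': 'foobar'},
    2: {'title': 'spam'},
    3: {'title': ''},   # empty title
    4: {},              # no title attribute at all
}

# Toy index that skips empty or missing titles.
title_index = {}
for doc_id, doc in documents.items():
    title = doc.get('title')
    if title:
        title_index[doc_id] = title

# Documents whose title matches 'foo*'.
matches = {doc_id for doc_id, title in title_index.items()
           if title.startswith('foo')}

# "All objects" taken from the index versus from the catalog.
not_foo_from_index = set(title_index) - matches
not_foo_from_catalog = set(documents) - matches
```

Against the index only document 2 comes back; against the catalog, documents 3 and 4 are returned as well, including the one that has no title attribute at all.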

-- 
| Casey Duncan
| Kaivo, Inc.
| [EMAIL PROTECTED]
`--




Re: [Zope-dev] Hey Chris, question for you

2001-06-26 Thread Michel Pelletier

On Tue, 26 Jun 2001, Casey Duncan wrote:

 Ok, I was able to get it to work by instantiating an IISet around
 _unindex.keys() and passing that to difference (Thanks!), however, I
 notice an interesting side effect. Let's say you have a TextIndex on
 title and you do the following query:

 title != 'foo*'

 Which to me means: all cataloged objects whose title does not match the
 substring 'foo*'

 However, this is not what you get exactly, instead you get:

 all cataloged objects that have a non-empty title that does not match
 the substring 'foo*'

 Because from what I am seeing, objects with empty (or no) titles are not
 included in the index *at all*. So the set of all objects does not
 include ones without titles. I could fix this by making "all objects"
 mean "all objects in the catalog" (via catalog.data.keys()) instead of
 "all objects in the index", but I wanted to see if anyone had
 additional thoughts about this.

Hmm the reason for the current behavior was optimization by saving space
not indexing empty values.  The problem with your latter approach is that
all objects in the catalog may include objects that don't have a title
attribute at all.

I'm not against indexing empty values though.

-Michel





Re: [Zope-dev] Hey Chris, question for you

2001-06-26 Thread Chris McDonough

Hi Casey,

Changes were recently made to Field/Keyword Indexes so that they will
store empty items.  An equivalent change could be made to TextIndexes...
we'd need to think about that a bit.

But for your purposes, you might want to start out attempting to write
your operator implementation using Field and Keyword indexes...

- C

