date:20100713

[ZODB-Dev] KeyError in BTrees.check

2010-07-13 Thread Suresh V.

While trying to run manage_cleanup on my BTreeFolder2, I get a KeyError 
from the classify function in Module BTrees.check:

def classify(obj):
 return _type2kind[type(obj)]

I see that obj is None at this point.

Anyway I can patch the code to run around this?

Thanks.

Suresh

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev

Re: [ZODB-Dev] KeyError in BTrees.check

2010-07-13 Thread Hanno Schlichting

On Tue, Jul 13, 2010 at 11:23 AM, Suresh V. suresh...@yahoo.com wrote:
 While trying to run manage_cleanup on my BTreeFolder2, I get a KeyError
 from the classify function in Module BTrees.check:

 def classify(obj):
     return _type2kind[type(obj)]

 I see that obj is None at this point.

 Anyway I can patch the code to run around this?

This should be solved in BTreeFolder2 I guess. It shouldn't pass in
something that is None into the lower level function.

But I thought the manage_cleanup methods where only used years ago,
while BTree's had some internal book keeping problems. Do you still
get BTree corruption in any recent ZODB3 / Zope 2 combination? The
cleanup code is from six years back before BTreeFolder2 was even part
of a Zope2 release, I was close to removing it the other day.

Hanno
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev

Re: [ZODB-Dev] Indexing and dates/times

2010-07-13 Thread Jim Fulton

On Tue, Jul 13, 2010 at 4:35 AM, Pedro Ferreira
jose.pedro.ferre...@cern.ch wrote:
 Hello,

 I am currently trying to devise a way to index and retrieve some
 millions of objects according to their modification date/time. One of
 the problems I'm facing is that of index granularity: I'd like to
 provide to the second granularity,


 will there ever be more than item with the same key?


 Exactly, that's the problem.

Typically, to model something like this, you's have a BTree who's
values are sets.  If single items are common and you were willing to
work a bit harder, you could have BTrees whos values could be either a
set or a scalar.

 but for that I need some structure
 that lets me do that. So, the options I see are:
  - A timestamp-based


 What do you mean by timestamp


 Well, it could be a UNIX timestamp.

It could be lots of things. I was asking what you meant.

If you used a unix time stamp, you could ise one of the Ix flavors of BTree.



 BTree index - looks highly inefficient, as there
 will be many entries with only one element (probably almost all of
 them),


 I have no idea what you mean by this.


 That's the problem you've already mentioned above.

So, the issue is that you have multiple items with the same
key. This is simply handled by using sets as values ion a BTree.
There are existing index implementations that do this.


 So, in a relational DB i would do something like:

 SELECT * FROM table WHERE timestamp = X AND timestamp = Y

 Since I cannot do this with ZODB,

I don't know what this is. Range seaches? SQL? BTrees and various
index implementations based on the,m support range searches.  of
course, ZODB doesn't support SQL.

 I'd have to have a BTree, indexed by
 timestamp... however, as you said, if I want to the second granularity, I
 will rarely have two items with the same key (which makes it pretty
 useless).

I don't know why it is useless, but it is easily handled.

 So, I was wondering if there is some data structure I can use for this, as
 this seems to be a pretty common use case.

That's why the various indexing(/catalog) schemes already support it.

Jim

--
Jim Fulton
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev

Re: [ZODB-Dev] Indexing and dates/times

2010-07-13 Thread Benji York

On Tue, Jul 13, 2010 at 4:35 AM, Pedro Ferreira
jose.pedro.ferre...@cern.ch wrote:
 Hello,
 I am currently trying to devise a way to index and retrieve some
 millions of objects according to their modification date/time.

 So, in a relational DB i would do something like:

 SELECT * FROM table WHERE timestamp = X AND timestamp = Y

 Since I cannot do this with ZODB, I'd have to have a BTree, indexed by
 timestamp... however, as you said, if I want to the second
 granularity, I will rarely have two items with the same key (which makes
 it pretty useless).

If you use the timestamp as the key and you want to retrieve all values
between two timestamps (inclusive), you can do

my_btree.values(min=start, max=end)
-- 
Benji York
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev

Re: [ZODB-Dev] Indexing and dates/times

2010-07-13 Thread Pedro Ferreira


 So, the issue is that you have multiple items with the same
 key. This is simply handled by using sets as values ion a BTree.
 There are existing index implementations that do this.


Hmm... no, in fact the problem is that most of the time I will have only 
one value per index entry.So, in a relational DB i would do something like:
 SELECT * FROM table WHERE timestamp= X AND timestamp= Y

 Since I cannot do this with ZODB,
  
 I don't know what this is. Range seaches? SQL? BTrees and various
 index implementations based on the,m support range searches.  of
 course, ZODB doesn't support SQL.


Yes, I know ZODB doesn't support SQL, I was just trying to demonstrate 
my use case.
What I meant was that in relational DBs a query like the one above can 
be performed over an arbitrary table, without the need for having an 
extra data structure for indexing.

 I'd have to have a BTree, indexed by
 timestamp... however, as you said, if I want to the second granularity, I
 will rarely have two items with the same key (which makes it pretty
 useless).
  
 I don't know why it is useless, but it is easily handled.



It's not useless. I'm sorry, I have used the wrong word. I meant that a 
range query will normally involve the union of a higher number of sets 
as the granularity gets smaller and smaller. If there is only one item 
per index entry, the union operation will take longer... I assumed that 
the more BTree entries we have, the more buckets we will have to fetch 
from the DB, for a given range query. But I am probably wrong...

 So, I was wondering if there is some data structure I can use for this, as
 this seems to be a pretty common use case.
  
 That's why the various indexing(/catalog) schemes already support it.


So, if I need an index where items can be queried by date, and range 
queries can be performed efficiently, an IOBTree will do the job? As I 
mention above, my only concern is the number of sets that will have to 
be joined.

Thanks, once again,

Pedro

-- 
José Pedro Ferreira

Indico Team

IT-UDS-AVC

513-R-0042
CERN, Geneva, Switzerland

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev

Re: [ZODB-Dev] Indexing and dates/times

2010-07-13 Thread Pedro Ferreira


 If you use the timestamp as the key and you want to retrieve all values
 between two timestamps (inclusive), you can do

  my_btree.values(min=start, max=end)


Yes, but, as I mentioned in my answer to Jim's mail, my concern is the 
performance of this range operation for a very large range. If you 
tell me it will be OK, I won't hesitate :).

Regards,

Pedro

-- 
José Pedro Ferreira

Indico Team

IT-UDS-AVC

513-R-0042
CERN, Geneva, Switzerland

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev

Re: [ZODB-Dev] Indexing and dates/times

2010-07-13 Thread Jim Fulton

On Tue, Jul 13, 2010 at 7:51 AM, Pedro Ferreira
jose.pedro.ferre...@cern.ch wrote:
...
 Hmm... no, in fact the problem is that most of the time I will have only one
 value per index entry.So, in a relational DB i would do something like:

As I mentioned in some text you didn't quote, you could use a strategy
of storing a scalar unless there are dups.  If dups are very rare,
this *might* be a win.

...

 I'm sorry, I have used the wrong word. I meant that a
 range query will normally involve the union of a higher number of sets as
 the granularity gets smaller and smaller. If there is only one item per
 index entry, the union operation will take longer... I assumed that the more
 BTree entries we have, the more buckets we will have to fetch from the DB,
 for a given range query. But I am probably wrong...

The BTrees package has a pretty efficient strategy for merging
multiple sets.


 So, I was wondering if there is some data structure I can use for this,
 as
 this seems to be a pretty common use case.


 That's why the various indexing(/catalog) schemes already support it.


 So, if I need an index where items can be queried by date, and range queries
 can be performed efficiently, an IOBTree will do the job?

An IOBTree could be the *basis* of a solution.

 As I mention
 above, my only concern is the number of sets that will have to be joined.

Depending on how rare dups are, I'd either use a BTree who's values
are scalers or sets. I might start with just using sets as the values.

Typically the strategy is to assign each object you want to index an
integer id.  Then your index is of the form:

  {key - {docids}}

Then you can do a lookup with something like:

  result = BTrees.IOBTree.multiunion(index.values(min,max))

For more code examples, you should look at the zope.index and
zope.catalog packages.

Jim

--
Jim Fulton
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev

[ZODB-Dev] a doubt of the pool_size of zodb connection

2010-07-13 Thread firewood firewood

Hello, in ZODB.DB, I saw this:

There's no limit on the number of connections this can keep track of,
but a warning is logged if there are more than pool_size active
connections, and a critical problem if more than twice pool_size.

I think it mean that the number of connections should be less than
pool_size, isn`t is? but when I
use ZODB, it often has number of active connections more than pool_size, is
that right? I want to
know why the default pool_size is set to 8, I think it is so little, because
I often have active connections
much more than 8
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev

Re: [ZODB-Dev] KeyError in BTrees.check

2010-07-13 Thread Suresh V.

Hanno Schlichting wrote:
 On Tue, Jul 13, 2010 at 11:23 AM, Suresh V. suresh...@yahoo.com wrote:
 While trying to run manage_cleanup on my BTreeFolder2, I get a KeyError
 from the classify function in Module BTrees.check:

 def classify(obj):
 return _type2kind[type(obj)]

 I see that obj is None at this point.

 Anyway I can patch the code to run around this?
 
 This should be solved in BTreeFolder2 I guess. It shouldn't pass in
 something that is None into the lower level function.
 
 But I thought the manage_cleanup methods where only used years ago,
 while BTree's had some internal book keeping problems. Do you still
 get BTree corruption in any recent ZODB3 / Zope 2 combination? The
 cleanup code is from six years back before BTreeFolder2 was even part
 of a Zope2 release, I was close to removing it the other day.

Problem was due to broken object. Seems to affect sibling objects or 
even objects higher up in the hierarchy (can't navigate to them in the 
ZMI). Not sure if anything can be done to fix it though.

Suresh

___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev

[ZODB-Dev] KeyError in BTrees.check

Re: [ZODB-Dev] KeyError in BTrees.check

Re: [ZODB-Dev] Indexing and dates/times

Re: [ZODB-Dev] Indexing and dates/times

Re: [ZODB-Dev] Indexing and dates/times

Re: [ZODB-Dev] Indexing and dates/times

Re: [ZODB-Dev] Indexing and dates/times

[ZODB-Dev] a doubt of the pool_size of zodb connection

Re: [ZODB-Dev] KeyError in BTrees.check

9 matches

Site Navigation

Mail list logo

Footer information