Re: [ZODB-Dev] Awkward PersistentList read Performance

2013-08-13 Thread Joerg Baach
Hi Jim, Malthe,

Thanks a lot for your advice, it is really appreciated. I think I get what
you mean, but I will read up a bit to fully understand what's going on, and
to write accordingly (I need the best read throughput I can get, and don't
need to bother much with the writes).

Cheers,

  Joerg
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Awkward PersistentList read Performance

2013-08-13 Thread Jim Fulton
On Tue, Aug 13, 2013 at 9:40 AM, Joerg Baach wrote:
> Hi *,
>
> I was trying to measure the impact of using different kinds of objects to
> store data in ZODB (disk, ram, time).
>
> What's really awkward is the measurement for reading content from
> PersistentLists (that are stored in an IOBTree):
>
> case a
> ==
> g.edges = IOBTree()
> for j in range(1,100):
>     edge = PersistentList([j,1,2,{}])
>     g.edges[j] = edge
>
> x = list(g.edges.values())
> y = [e[3] for e in x]   #this takes 30 seconds
>
> case b
> ==
> g.edges = IOBTree()
> for j in range(1,100):
>     edge = [j,1,2,{}]
>     g.edges[j] = edge
>
> x = list(g.edges.values())
> y = [e[3] for e in x]   #this takes 0.09 seconds
>
> So, can it really be that using a PersistentList is 300 times slower?

Yes.  This would be true of *any* persistent object. In the first
case, you're creating 100+B database objects, where B is ~20.
In the second case, you're creating B persistent objects.

Depending on what you do between cases A and B, you may also
have to load 100+B vs B objects.

> Am
> I doing something completely wrong,

It depends on your application.  Generally, one uses a BTree to avoid
loading a large collection into memory.  Iterating over the whole
thing defeats that.

Deciding whether to use a few large database objects or many small
ones is a tradeoff between efficiency of access and efficiency of
update, depending on access patterns.
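The update side of that tradeoff can be illustrated with a rough sketch. It does not use ZODB at all: it assumes that the size of a plain `pickle` dump is a usable proxy for how much data has to be rewritten when one edge changes, and the names `inline_rewrite` and `separate_rewrite` are made up for this example.

```python
import pickle

# The same edges as in the question: keys 1..99, each value a small list.
edges = {j: [j, 1, 2, {}] for j in range(1, 100)}

# Inline layout (plain lists): the edges are part of their container's
# record, so changing a single edge means rewriting the whole container.
inline_rewrite = len(pickle.dumps(edges))

# Separate layout (one record per edge): changing a single edge only
# rewrites that edge's own small record.
separate_rewrite = len(pickle.dumps(edges[42]))

print("bytes rewritten, inline layout:  ", inline_rewrite)
print("bytes rewritten, separate layout:", separate_rewrite)
```

Many small objects make a single update cheap, while a few large objects make bulk reads cheap. Real ZODB record sizes will differ from these numbers, since BTrees split their contents across several buckets.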

> or am I missing something?

Possibly

> I am using ZODB3-3.10.5. The whole setup (incl. results) is at
> https://github.com/jhb/zodbtime

tl;dr

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton


Re: [ZODB-Dev] Awkward PersistentList read Performance

2013-08-13 Thread Malthe Borch
On 13 August 2013 15:40, Joerg Baach wrote:
> So, can it really be that using a PersistentList is 300 times slower? Am
> I doing something completely wrong, or am I missing something?

In your test setup, you commit the transaction and close the
connection. This means that when you iterate through the edges, not
only do you load buckets, but for each item in the bucket, you load an
additional persistent object, namely the `PersistentList`.

This doesn't happen when you persist a Python list. In that case, the
list is simply persisted right into the bucket data. This makes the
buckets take up more space, but you need fewer reads from disk, which
makes the whole thing go a lot faster.

Note that once you've loaded all the `PersistentList` objects,
iterating again should go much faster because it's now all in memory.
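The difference in the number of loads can be sketched without ZODB, using only the standard library. The toy model below relies on pickle's persistent-reference hooks (`persistent_id` / `persistent_load`), which, as far as I understand, is the same pickle feature ZODB builds on. Everything else is an assumption made for illustration: `Ref`, `CountingStore` and `count_loads` are hypothetical names, a single dict stands in for the BTree buckets, and references are followed eagerly while unpickling, whereas ZODB loads an object lazily on first access.

```python
import io
import pickle


class Ref:
    """Hypothetical stand-in for a persistent object with its own record."""

    def __init__(self, oid):
        self.oid = oid


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # A Ref is written as a reference (its oid), not as inline state.
        if isinstance(obj, Ref):
            return obj.oid
        return None


class RefUnpickler(pickle.Unpickler):
    def __init__(self, file, store):
        super().__init__(file)
        self.store = store

    def persistent_load(self, oid):
        # Following a reference costs one more record load.
        return self.store.load(oid)


class CountingStore:
    """Toy record store: oid -> pickled bytes, counting every load."""

    def __init__(self):
        self.records = {}
        self.loads = 0

    def save(self, oid, state):
        buf = io.BytesIO()
        RefPickler(buf).dump(state)
        self.records[oid] = buf.getvalue()

    def load(self, oid):
        self.loads += 1
        return RefUnpickler(io.BytesIO(self.records[oid]), self).load()


def count_loads(n, separate_records):
    """Store n edges in one bucket, read them all back, return the load count."""
    store = CountingStore()
    bucket = {}
    for j in range(1, n + 1):
        edge = [j, 1, 2, {}]
        if separate_records:
            # Like PersistentList: the edge gets a record of its own.
            oid = "edge-%d" % j
            store.save(oid, edge)
            bucket[j] = Ref(oid)
        else:
            # Like a plain list: the edge is stored inside the bucket record.
            bucket[j] = edge
    store.save("bucket", bucket)

    loaded = store.load("bucket")
    values = [edge[3] for edge in loaded.values()]
    assert len(values) == n
    return store.loads


loads_separate = count_loads(99, separate_records=True)
loads_inline = count_loads(99, separate_records=False)
print(loads_separate, loads_inline)
```

In this model the separate-record layout needs one load for the bucket plus one per edge, while the inline layout needs only the bucket load. That mirrors the "100+B versus B objects" count given earlier in the thread.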

\malthe


[ZODB-Dev] Awkward PersistentList read Performance

2013-08-13 Thread Joerg Baach
Hi *,

I was trying to measure the impact of using different kinds of objects to
store data in ZODB (disk, ram, time).

What's really awkward is the measurement for reading content from
PersistentLists (that are stored in an IOBTree):

case a
==
g.edges = IOBTree()
for j in range(1,100):
    edge = PersistentList([j,1,2,{}])
    g.edges[j] = edge

x = list(g.edges.values())
y = [e[3] for e in x]   #this takes 30 seconds

case b
==
g.edges = IOBTree()
for j in range(1,100):
    edge = [j,1,2,{}]
    g.edges[j] = edge

x = list(g.edges.values())
y = [e[3] for e in x]   #this takes 0.09 seconds

So, can it really be that using a PersistentList is 300 times slower? Am
I doing something completely wrong, or am I missing something?

I am using ZODB3-3.10.5. The whole setup (incl. results) is at
https://github.com/jhb/zodbtime

Cheers,

  Joerg
