Re: [ZODB-Dev] Sharing (persisted) strings between threads

2010-12-08 Thread Shane Hathaway
On 12/08/2010 02:26 PM, Dylan Jay wrote:
> I was working with a high volume site recently where 70% of requests
> called a back end API that could take up to 4sec. For better or worse
> this was in zope. The best solution to scaling this was to increase
> the number of threads since this process was now IO bound. You do run
> out of memory when you do this so this solution would have been
> helpful. If a shared cache between processes were possible, such as
> using memcached, that would be even better :)

That is already possible: just use RelStorage with memcached and set 
your ZODB cache size to zero or something small.  However, large ZODB 
applications typically depend on a large number of persistent objects to 
respond to even the simplest requests, so you would have to optimize the 
application to load as few objects as possible.

Shane
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Sharing (persisted) strings between threads

2010-12-08 Thread Dylan Jay
Dylan Jay
Technical solution manager
PretaWeb 99552830

On 08/12/2010, at 11:28 PM, Jim Fulton  wrote:

> On Wed, Dec 8, 2010 at 5:06 AM, Malthe Borch  wrote:
>> Currently, when a thread loads a non-ghost into its object cache, it's
>> straight from being unpickled. That means that if two threads load the
>> exact same object, any (immutable) string contained in the object
>> state will be allocated in duplicate (or in general, once per
>> active thread).
>>
>> If instead, all unpickled strings were made canonical via a weak
>> dictionary, there would be only one copy in memory, no matter the
>> thread count, e.g.:
>>
>>  string = weak_string_map.setdefault(string, string)
>>
>> If the returned string was a different (canonical) copy, the duplicate
>> would immediately be ready for garbage collection.
>>
>> This is a real win in memory savings. Using Plone, I experimented with
>> the approach by using the Python pickle implementation and interning
>> all byte strings (using ``intern``) directly in the unpickle routine
>> to the same effect:
>>
>>    def load_binstring(self):
>>        len = mloads('i' + self.read(4))
>>        string = self.read(len)
>>        interned = intern(string)    # (sic)
>>        self.append(interned)
>>
>> With 20 active threads, each having rendered the Plone 4 front page,
>> this approach reduced the memory usage by 70 MB.
>
> Out of a total of what?
>
> Note that if a process is CPU bound (as most dynamic Python apps
> should be), then there is little or no benefit in having multiple
> threads, due to the (damn) GIL.

I was working with a high volume site recently where 70% of requests
called a back end API that could take up to 4sec. For better or worse
this was in zope. The best solution to scaling this was to increase
the number of threads since this process was now IO bound. You do run
out of memory when you do this so this solution would have been
helpful. If a shared cache between processes were possible, such as
using memcached, that would be even better :)
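
To illustrate why extra threads help an IO-bound process, here is a
minimal Python 3 sketch. The `call_backend` function and its 0.05 s
delay are hypothetical stand-ins for the slow back end API (the real
calls took up to 4 sec); `time.sleep` releases the GIL in the same way
a blocking network read would.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_backend(request_id, delay=0.05):
    """Hypothetical stand-in for a slow back end API call."""
    time.sleep(delay)  # blocking wait; the GIL is released meanwhile
    return request_id


def serve(request_ids, threads):
    """Handle all requests with `threads` workers.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(call_backend, request_ids))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, one_thread = serve(range(8), threads=1)
    _, eight_threads = serve(range(8), threads=8)
    print("1 thread:  %.2fs" % one_thread)
    print("8 threads: %.2fs" % eight_threads)
```

With one worker the waits add up; with eight they overlap. The memory
cost of each extra thread's object cache is not modelled here.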

>
> If your app only renders pages based on data read from a ZODB, and
> it's not CPU bound with a single thread, then your database config is
> probably wrong.
>
>> Note that unicode
>> strings aren't internable (but the alternative technique of using a
>> weak mapping should work fine).
>
> Except that you can't create weakrefs to strings or unicode.
>
> Also, while interning is fine for an experiment, it's wasteful for
> strings that are rarely needed.
>
> Sharing immutable data between threads is very appealing
> intellectually. I've certainly thought about it a lot. In practice,
> I doubt the benefit will be worth the extra overhead (let alone the
> effort :).
>
> Jim
>
> --
> Jim Fulton


Re: [ZODB-Dev] Sharing (persisted) strings between threads

2010-12-08 Thread Malthe Borch
On 8 December 2010 15:29, Jim Fulton  wrote:
> The hash table retains a reference to the strings in it.  The
> references aren't weak afaik.

The reference is removed by the custom string dealloc function when
the ref count falls to 0. Interned strings do not linger on.

> What are you thinking of as applications with non-trivial strings?

In content management apps for instance, you have many non-trivial
strings with content, be it shorter descriptions or text bodies.
That's a substantial amount of text if allocated in duplicate or more.

Actually, I think *text* is the only kind of data that ought to take
up significant memory. Binary data ought to be blobbed out of the
database and the rest is just site structure and object overhead.

> The only one I can think of is template source.  That might be better
> served by either storing the source compressed or even storing it in a
> separate object that doesn't need to be in memory except when editing
> or compiling.

It might make sense to keep a global cache for ZODB-persisted
templates. But I'm not sure if there are so many that it's a burden
directly. This is an app-specific matter though.

I'll try to get some raw numbers out of currently running apps to make
the conversation more substantial.

\malthe


Re: [ZODB-Dev] Sharing (persisted) strings between threads

2010-12-08 Thread Jim Fulton
On Wed, Dec 8, 2010 at 7:45 AM, Malthe Borch  wrote:
> On 8 December 2010 13:28, Jim Fulton  wrote:
>>> With 20 active threads, each having rendered the Plone 4 front page,
>>> this approach reduced the memory usage by 70 MB.
>>
>> Out of a total of what?
>
> In my case out of 430 MB non-shared for the process.
>
>> Note that if a process is CPU bound (as most dynamic Python apps
>> should be), then there is little or no benefit in having multiple
>> threads, due to the (damn) GIL.
>
> The case I'm thinking of is when one thread is being used in a write
> transaction, while another is doing a read.

I doubt that write transactions block enough to make a difference.

>
> If the database is bigger than the allowed memory usage, then I guess
> threads can also ensure that requests for in-memory objects can be
> served while some threads are blocked due to swapping and/or reading
> pickles from disk.

It's not the database size that matters, but the working set.  We have
an application that is somewhat pathological in that its working set
is much larger than the amount of memory it's given and yet we're
still substantially CPU bound.  Data can be loaded from a ZEO cache
pretty quickly.

As Hanno said, the recommendation for a single thread assumes that you
have multiple processors.

>
>> Except that you can't create weakrefs to strings or unicode.
>
> I see. Maybe another scheme could be devised.

Yeah, maybe.  For example, you could subclass string or unicode.  This
will add significant per-string overhead that could swamp the benefits
you hope to achieve.

>
>> Also, while interning is fine for an experiment, it's wasteful for
>> strings that are rarely needed.
>
> How so? As far as I can see, interning is still subject to reference
> counting. The only real difference is that a hash table is maintained
> (fairly minimal memory use + probable computation of string hash).

The hash table retains a reference to the strings in it.  The
references aren't weak afaik.

>
>> Sharing immutable data between threads is very appealing
>> intellectually. I've certainly thought about it a lot. In practice,
>> I doubt the benefit will be worth the extra overhead (let alone the
>> effort :).
>
> I think if the case can be made for threading, then it's worth
> pursuing.

Knock yourself out. :)

> Alternatively, applications might put all non-trivial
> strings into blobs, but I don't know if there's a non-trivial overhead
> with that approach.

What are you thinking of as applications with non-trivial strings?

The only one I can think of is template source.  That might be better
served by either storing the source compressed or even storing it in a separate
object that doesn't need to be in memory except when editing or compiling.

Jim

-- 
Jim Fulton


Re: [ZODB-Dev] Sharing (persisted) strings between threads

2010-12-08 Thread Malthe Borch
On 8 December 2010 14:03, Hanno Schlichting  wrote:
> On Wed, Dec 8, 2010 at 11:06 AM, Malthe Borch  wrote:
>> With 20 active threads, each having rendered the Plone 4 front page,
>> this approach reduced the memory usage by 70 MB.
>
> Did you measure throughput of the system? In the benchmarks I've seen,
> thread counts of 3 or above perform worse than one or two
> threads. At least with the GIL implementation up to Python 2.6 you get
> much worse performance the more threads you have on multicore systems.
> There are good explanations of this behavior by David Beazley at
> http://www.dabeaz.com/blog.html.

I should have mentioned: I only chose 20 threads such that I might see
a noticeable difference in memory usage.

> In default Plone 4 we have two threads per instance. If you have more
> than a single ZEO instance you should reduce the thread number to one.
> We also set a default Python checkinterval of 1000 (instructions),
> which prevents thread switching for long stretches of time to counter
> the GIL in the two thread case.

I agree. As I mentioned to Jim, there's a classic case for running two
threads. However, there's also a case for running more if you expect
to be swapping data in and out at regular intervals.

> So while sharing data between threads might sound interesting, it's
> not of much help in Python.

I think it's worthwhile even sharing immutable data between two
threads, if it's a relatively straight-forward procedure. I think my
initial investigation shows that it is.

\malthe


Re: [ZODB-Dev] Sharing (persisted) strings between threads

2010-12-08 Thread Hanno Schlichting
On Wed, Dec 8, 2010 at 11:06 AM, Malthe Borch  wrote:
> With 20 active threads, each having rendered the Plone 4 front page,
> this approach reduced the memory usage by 70 MB.

Did you measure throughput of the system? In the benchmarks I've seen,
thread counts of 3 or above perform worse than one or two
threads. At least with the GIL implementation up to Python 2.6 you get
much worse performance the more threads you have on multicore systems.
There are good explanations of this behavior by David Beazley at
http://www.dabeaz.com/blog.html.

In default Plone 4 we have two threads per instance. If you have more
than a single ZEO instance you should reduce the thread number to one.
We also set a default Python checkinterval of 1000 (instructions),
which prevents thread switching for long stretches of time to counter
the GIL in the two thread case.
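
For readers on Python 3: the instruction-count check interval
(`sys.setcheckinterval`) was replaced in Python 3.2 by a time-based
switch interval. A minimal sketch of the corresponding knob follows;
the helper name and the 0.05 s value are illustrative assumptions, not
a Plone default.

```python
import sys


def lengthen_switch_interval(seconds=0.05):
    """Ask the interpreter to switch threads less often.

    Returns the previous interval so the caller can restore it.
    """
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    return previous


if __name__ == "__main__":
    old = lengthen_switch_interval()
    print("switch interval: %.3fs -> %.3fs" % (old, sys.getswitchinterval()))
```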

So while sharing data between threads might sound interesting, it's
not of much help in Python.

Hanno


Re: [ZODB-Dev] Sharing (persisted) strings between threads

2010-12-08 Thread Malthe Borch
On 8 December 2010 13:28, Jim Fulton  wrote:
>> With 20 active threads, each having rendered the Plone 4 front page,
>> this approach reduced the memory usage by 70 MB.
>
> Out of a total of what?

In my case out of 430 MB non-shared for the process.

> Note that if a process is CPU bound (as most dynamic Python apps
> should be), then there is little or no benefit in having multiple
> threads, due to the (damn) GIL.

The case I'm thinking of is when one thread is being used in a write
transaction, while another is doing a read.

If the database is bigger than the allowed memory usage, then I guess
threads can also ensure that requests for in-memory objects can be
served while some threads are blocked due to swapping and/or reading
pickles from disk.

> Except that you can't create weakrefs to strings or unicode.

I see. Maybe another scheme could be devised.

> Also, while interning is fine for an experiment, it's wasteful for
> strings that are rarely needed.

How so? As far as I can see, interning is still subject to reference
counting. The only real difference is that a hash table is maintained
(fairly minimal memory use + probable computation of string hash).

> Sharing immutable data between threads is very appealing
> intellectually. I've certainly thought about it a lot. In practice,
> I doubt the benefit will be worth the extra overhead (let alone the
> effort :).

I think if the case can be made for threading, then it's worth
pursuing. Alternatively, applications might put all non-trivial
strings into blobs, but I don't know if there's a non-trivial overhead
with that approach.

\malthe


Re: [ZODB-Dev] Sharing (persisted) strings between threads

2010-12-08 Thread Jim Fulton
On Wed, Dec 8, 2010 at 5:06 AM, Malthe Borch  wrote:
> Currently, when a thread loads a non-ghost into its object cache, it's
> straight from being unpickled. That means that if two threads load the
> exact same object, any (immutable) string contained in the object
> state will be allocated in duplicate (or in general, once per
> active thread).
>
> If instead, all unpickled strings were made canonical via a weak
> dictionary, there would be only one copy in memory, no matter the
> thread count, e.g.:
>
>  string = weak_string_map.setdefault(string, string)
>
> If the returned string was a different (canonical) copy, the duplicate
> would immediately be ready for garbage collection.
>
> This is a real win in memory savings. Using Plone, I experimented with
> the approach by using the Python pickle implementation and interning
> all byte strings (using ``intern``) directly in the unpickle routine
> to the same effect:
>
>    def load_binstring(self):
>        len = mloads('i' + self.read(4))
>        string = self.read(len)
>        interned = intern(string)    # (sic)
>        self.append(interned)
>
> With 20 active threads, each having rendered the Plone 4 front page,
> this approach reduced the memory usage by 70 MB.

Out of a total of what?

Note that if a process is CPU bound (as most dynamic Python apps
should be), then there is little or no benefit in having multiple
threads, due to the (damn) GIL.

If your app only renders pages based on data read from a ZODB, and
it's not CPU bound with a single thread, then your database config is
probably wrong.

> Note that unicode
> strings aren't internable (but the alternative technique of using a
> weak mapping should work fine).

Except that you can't create weakrefs to strings or unicode.

Also, while interning is fine for an experiment, it's wasteful for
strings that are rarely needed.

Sharing immutable data between threads is very appealing
intellectually. I've certainly thought about it a lot. In practice,
I doubt the benefit will be worth the extra overhead (let alone the
effort :).

Jim

--
Jim Fulton


[ZODB-Dev] Sharing (persisted) strings between threads

2010-12-08 Thread Malthe Borch
Currently, when a thread loads a non-ghost into its object cache, it's
straight from being unpickled. That means that if two threads load the
exact same object, any (immutable) string contained in the object
state will be allocated in duplicate (or in general, once per
active thread).

If instead, all unpickled strings were made canonical via a weak
dictionary, there would be only one copy in memory, no matter the
thread count, e.g.:

  string = weak_string_map.setdefault(string, string)

If the returned string was a different (canonical) copy, the duplicate
would immediately be ready for garbage collection.

This is a real win in memory savings. Using Plone, I experimented with
the approach by using the Python pickle implementation and interning
all byte strings (using ``intern``) directly in the unpickle routine
to the same effect:

    def load_binstring(self):
        len = mloads('i' + self.read(4))
        string = self.read(len)
        interned = intern(string)    # (sic)
        self.append(interned)

With 20 active threads, each having rendered the Plone 4 front page,
this approach reduced the memory usage by 70 MB. Note that unicode
strings aren't internable (but the alternative technique of using a
weak mapping should work fine).
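
For comparison, here is a rough Python 3 sketch of the same interning
idea. It is an assumption-laden variant, not the experiment described
above: instead of patching the unpickle routine it post-processes the
unpickled state, it uses `sys.intern` (which in Python 3 accepts text
`str` objects), and the helper names `intern_strings` and `load_state`
are hypothetical.

```python
import pickle
import sys


def intern_strings(state):
    """Return `state` with every plain str replaced by its interned copy.

    Only exact dict, list, tuple and str values are handled; anything
    else is returned unchanged.
    """
    kind = type(state)
    if kind is str:
        return sys.intern(state)
    if kind is dict:
        return {intern_strings(k): intern_strings(v) for k, v in state.items()}
    if kind is list:
        return [intern_strings(item) for item in state]
    if kind is tuple:
        return tuple(intern_strings(item) for item in state)
    return state


def load_state(data):
    """Unpickle `data` and share its strings with earlier loads."""
    return intern_strings(pickle.loads(data))
```

Two loads of the same pickle then reference the same string objects
rather than separate copies, at the cost of an extra pass over the
state on every load.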

In a long-running operation, dirty objects should be invalidated after
the transaction, to prevent future data redundancy.

For an implementation one needs to have a hook to use a special
reconstructor function for strings. Currently there is a technical
impediment in that BTrees and Persistent objects have their own
internal way to save strings. In my experiments, the ``persistent_id``
function was not called for string objects (which is a different
behavior than the regular cPickle.Pickler.dump has).

\malthe