Re: [ZODB-Dev] Progress report: porting persistent, BTrees to Python3

2012-12-17 Thread Chris McDonough
On Sun, 2012-12-16 at 22:10 +0100, Godefroid Chapelle wrote:
 On 15/12/12 01:52, Tres Seaver wrote:
  I fixed the remaining issues in persistent and released 4.0.5 today:  its
  tests properly exercise the C extensions under Python 3.2 / 3.3.
 
 I want to express my thanks to you, Tres, for taking care of that work!
 
 This port of ZODB to Python 3 is really a crucial step for the ZTK 
 ecosystem, after the work already done on zope.interface and zope.component.
 
 Further, I'd like to also thank Jim for his work on porting buildout.
 
 When this is finished, porting the rest of the ZTK should be much 
 easier, which hopefully implies that more of us will be able to 
 participate.

Hear, hear!

- C


___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] shared cache when no write?

2012-12-17 Thread Dylan Jay

On 14/12/2012, at 8:32 AM, Jim Fulton j...@zope.com wrote:

 On Thu, Dec 13, 2012 at 4:18 PM, Dylan Jay d...@pretaweb.com wrote:
 ...
 I'd never considered that the cache was attached to the db connection rather
 than the thread. I just reread
 http://docs.zope.org/zope2/zope2book/MaintainingZope.html and it says
 exactly that.
 So what you're saying is I'd tune db connections down to memory size on an
 instance dedicated to I/O-bound work and then increase the threads. Whenever a
 thread requests a db connection and there isn't one available it will block.
 So I just optimize my app to release the db connection when not needed.
 In fact I could tune all my Zopes this way, since a Zope with 10 threads and
 2 connections is going to end up queuing requests the same as 2 threads and
 10 connections?
 
 Something like that. It's a little more complicated than that because
 Zope 2 is managing connections for you, it would be easy to run afoul
 of that.  This is a case where something that usually makes your life
 easier, makes it harder. :)
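To make the threads-versus-connections question concrete, here is a minimal sketch using only the Python standard library. `ConnectionPool` and `run_requests` are hypothetical names invented for this illustration; they are a toy model and not the ZODB or Zope 2 API. The only point is that the number of requests holding a connection at once is capped by the smaller of the two settings.

```python
import threading
import time


class ConnectionPool:
    """Toy stand-in for a database connection pool (NOT the ZODB API)."""

    def __init__(self, size):
        self._slots = threading.BoundedSemaphore(size)
        self._lock = threading.Lock()
        self.in_use = 0
        self.peak = 0

    def open(self):
        self._slots.acquire()  # blocks while every connection is taken
        with self._lock:
            self.in_use += 1
            self.peak = max(self.peak, self.in_use)

    def close(self):
        with self._lock:
            self.in_use -= 1
        self._slots.release()


def run_requests(n_threads, n_connections, per_thread=3):
    """Run per_thread fake requests on each thread; return peak connections held."""
    pool = ConnectionPool(n_connections)

    def worker():
        for _ in range(per_thread):
            pool.open()
            try:
                time.sleep(0.001)  # pretend to work while holding a connection
            finally:
                pool.close()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool.peak
```

Both `run_requests(10, 2)` and `run_requests(2, 10)` can never report more than two connections in use at once, which is the queuing equivalence asked about above. The memory cost differs, though, since each connection carries its own cache.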

True. With Plone you have many modules sharing the connection, all expecting 
it to be the same connection, so closing the connection halfway through isn't 
possible. If it were closed and another connection opened, the other modules 
that are outside of your control might have references to stale data.

 
 What I'd do is use a separate database other than the one Zope 2 is
 using.  Then you can manage connections yourself without conflicting
 with what the publisher is doing.  Then, when you want to use the database,
 you just open the database, being careful to close it when you're
 going to block.  The downside being that you'll have separate
 transactions.
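A rough sketch of that pattern, again with a toy model and no real ZODB calls: `ToyDatabase`, `slow_external_call` and `handle_request` are hypothetical names made up for this example. The connection is handed back before the blocking call and a new one is opened afterwards, so the work is split across two separate transactions.

```python
from contextlib import contextmanager


class ToyDatabase:
    """Toy database with a fixed number of connections (NOT the ZODB API)."""

    def __init__(self, pool_size):
        self.data = {}
        self.available = pool_size

    @contextmanager
    def connection(self):
        if self.available == 0:
            raise RuntimeError("all connections are in use")
        self.available -= 1
        try:
            yield self.data
        finally:
            self.available += 1  # "close": hand the connection back


def slow_external_call(db, key):
    """Stands in for a blocking request; records how many connections are idle."""
    return {"key": key, "idle_during_call": db.available}


def handle_request(db, key):
    # First transaction: read what we need, then close the connection.
    with db.connection() as data:
        prefs = data.get(key, "none")

    # No connection is held while we block on the external system.
    result = slow_external_call(db, key)

    # Second, separate transaction: store the combined result.
    with db.connection() as data:
        data[key + ":result"] = (prefs, result["key"])
    return result["idle_during_call"]
```

With a pool of one connection, `handle_request` still reports one idle connection during the external call, so another thread could use it in the meantime.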
 
 This should be easier to achieve and changes the application less than the
 erp5 background task solution mentioned.
 
 It would probably be a good idea to learn more about how erp does this.
 The erp approach sounds like a variation on what I suggested.

It's not always possible, as sometimes you need to feed the result back to the 
user immediately. 
Let's take another example. Take a Plone site with a page that lets you upload 
an MP3 file; it guesses the song, then combines that with your preference data to 
return other songs you might like. The song guessing is done by an external 
service and the preference data is stored in the same ZODB as Plone. 
To do it the ERP background task way you'd deliver back a page with some 
JavaScript on it that polls the server to see if the song had been processed 
yet. This isn't always desirable, especially if you have to avoid JavaScript.

Maybe another possibility is to do it the way ZODB handles streaming blobs. The 
blob streaming happens after the db connection is closed. Perhaps there could be 
a way to register a callback in Zope for processing to happen after the db 
connection is closed but before the response is returned. At that point I could 
make an external connection and combine the resulting data to modify the response 
object, perhaps in an async thread like blobs use. If I really wanted to write 
or read more data I could request a new thread and db connection at that point.
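The callback idea can be sketched as follows. This is purely hypothetical: `ToyPublisher`, `song_view` and the `register_after_close` hook are invented for illustration and do not exist in Zope. The sketch only shows the intended ordering, where callbacks run after the connection is closed and before the response goes out.

```python
class ToyPublisher:
    """Hypothetical publisher that runs callbacks after closing the connection."""

    def __init__(self):
        self.events = []

    def publish(self, view):
        callbacks = []
        self.events.append("open connection")
        response = view(callbacks.append)  # view may register callbacks
        self.events.append("close connection")
        for callback in callbacks:
            self.events.append("run callback")
            response = callback(response)  # no db connection is held here
        self.events.append("return response")
        return response


def song_view(register_after_close):
    prefs = "likes jazz"  # pretend this was read from the database

    def add_guess(response):
        guess = "external guess"  # pretend this was a slow external call
        return response + " / " + guess

    register_after_close(add_guess)
    return "prefs: " + prefs
```

Calling `ToyPublisher().publish(song_view)` combines the database read with the external result in one response, without holding a connection during the external call.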


 
 I can see from the previous post, as there is no checkout semantics
 in zodb,
 
 I don't know what checkout semantics means.

As in, the ZODB protocol doesn't have a call you have to make before you write 
to an object. You just write to the object and afterwards flag it as changed (if 
needed). So there isn't a way to block at the point of writing. Malthe's 
database had an explicit checkout action, so presumably you weren't allowed to 
mutate anything until you checked it out. That's not something you can introduce 
into ZODB.
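The difference between the two styles can be shown with two toy classes. Neither is the `persistent` API nor Malthe's database: `TrackedObject`, `CheckoutObject` and the `changed` attribute are hypothetical stand-ins. The first allows writes at any time and only records them afterwards; the second refuses writes until an explicit checkout, which is the one place a system could block.

```python
class TrackedObject:
    """Write-then-flag: any attribute assignment marks the object as changed."""

    def __init__(self):
        object.__setattr__(self, "changed", False)
        object.__setattr__(self, "tags", [])

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        if name != "changed":
            object.__setattr__(self, "changed", True)


class CheckoutObject:
    """Checkout-then-write: assignment fails until checkout() is called."""

    def __init__(self):
        object.__setattr__(self, "_checked_out", False)

    def checkout(self):
        # A real system could block or lock here, before any write happens.
        object.__setattr__(self, "_checked_out", True)

    def __setattr__(self, name, value):
        if not self._checked_out:
            raise RuntimeError("check the object out before writing")
        object.__setattr__(self, name, value)


tracked = TrackedObject()
tracked.tags.append("jazz")  # in-place mutation is not noticed...
noticed_automatically = tracked.changed
tracked.changed = True       # ...so the flag has to be set by hand
```

In the first style nothing ever intercepts the write itself, which is why there is no natural point to wait at.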

 
 you are free to write anytime so there is no sane way to block at the point
 someone wants to write to an object, so it wouldn't work.
 
 ZODB provides a very simple concurrency model by giving each
 connection (and in common practice, each thread) its own view of the
 database. If you break that, then you're injecting concurrency issues
 into the app or in some pretty magical layer.
 
 You perhaps could have a single read only db connection which is
 shared?
 
 But even if the database data was only read, objects have other state
 that may be mutated.  You'd have to inspect every class to make sure
 it's thread safe. That's too scary for me.
 
 Jim
 
 --
 Jim Fulton
 http://www.linkedin.com/in/jimfulton
 Jerky is better than bacon! http://zo.pe/Kqm



Re: [ZODB-Dev] shared cache when no write?

2012-12-17 Thread Leonardo Rochael Almeida
Hi,

On Mon, Dec 17, 2012 at 10:03 PM, Dylan Jay d...@pretaweb.com wrote:

 On 14/12/2012, at 8:32 AM, Jim Fulton j...@zope.com wrote:

 This should be easier to achieve and changes the application less than the
 erp5 background task solution mentioned.

 It would probably be a good idea to learn more about how erp does this.
 The erp approach sounds like a variation on what I suggested.

Indeed, it's clear from all the proposed solutions (including DJ's
reconnect after transaction end but before returning to the user) that
you can't have, at the same time, a single ZODB transaction AND
immediate user feedback, when depending on an external system.

There's not much more to the ERP5 technique than what I already
explained earlier. It boils down to:

 * take user input
 * store it as received with as little processing as possible
 * trigger background activities (as few as possible) for anything
that requires looking beyond the object the user is currently
manipulating and its immediate vicinity (especially object
reindexing).
 * return info to the user as fast as possible, including any info
telling him to check back later if necessary.
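The steps above can be sketched with a queue and a worker thread. This is a minimal illustration under stated assumptions, not ERP5's actual activity machinery: `store`, `activities`, `handle_upload` and `start_worker` are hypothetical names, and a plain dict stands in for the database.

```python
import queue
import threading

store = {}                  # stand-in for the database
activities = queue.Queue()  # pending background activities


def handle_upload(doc_id, raw):
    """Store the input as received, queue the slow part, answer immediately."""
    store[doc_id] = {"raw": raw, "status": "pending"}
    activities.put(doc_id)
    return "Received %s; check back later." % doc_id


def _worker():
    while True:
        doc_id = activities.get()
        if doc_id is None:  # sentinel: stop the worker
            activities.task_done()
            break
        doc = store[doc_id]
        # The "expensive" step, here a trivial word index.
        doc["index"] = sorted(set(doc["raw"].lower().split()))
        doc["status"] = "done"
        activities.task_done()


def start_worker():
    thread = threading.Thread(target=_worker, daemon=True)
    thread.start()
    return thread
```

The request handler returns as soon as the input is stored; the status field is what a later page view would check.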

 It's not always possible, as sometimes you need to feed the result back to the 
 user immediately.
 Let's take another example. Take a Plone site with a page that lets you upload 
 an MP3 file; it guesses the song, then combines that with your preference 
 data to return other songs you might like. The song guessing is done by an 
 external service and the preference data is stored in the same ZODB as Plone.
 To do it the ERP background task way you'd deliver back a page with some 
 JavaScript on it that polls the server to see if the song had been processed 
 yet. This isn't always desirable, especially if you have to avoid JavaScript.

Avoiding JavaScript is possible with the same approach GitHub uses
when forking a repo: a meta http-equiv refresh page with a message like
"We're processing your request. This page will update itself when we're
done. You may refresh it on your own if it makes you feel like you're in
control."
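A small sketch of such a page, generated from Python. The `processing_page` function and the `/status/42` URL are made up for this example; the only real mechanism used is the standard HTML `<meta http-equiv="refresh">` element, which asks the browser to reload after a delay without any JavaScript.

```python
from html import escape

PAGE = """<html>
<head><meta http-equiv="refresh" content="{seconds}; url={url}"></head>
<body>
<p>We're processing your request. This page will update itself when
we're done.</p>
</body>
</html>"""


def processing_page(status_url, seconds=5):
    """Return an interim page that re-requests status_url after a delay."""
    return PAGE.format(seconds=int(seconds), url=escape(status_url, quote=True))


page = processing_page("/status/42", seconds=3)
```

The server keeps returning this page from the status URL until the background work is done, then returns the real result instead.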

Providing user feedback is usually less tricky than coping with system
restrictions. As long as the user is seeing something happening, and
the system feels like it's evolving towards a solution, instead of
seeming stuck, users tend to be satisfied.

In your example, the user already waits quite a bit for his file
upload to finish. Having him wait on the external system to handle the
data could be a bit too much; better to return some info to him and show
the rest later.

 [...]

Cheers,

Leo


Re: [ZODB-Dev] shared cache when no write?

2012-12-17 Thread Dylan Jay

On 18/12/2012, at 2:15 PM, Leonardo Rochael Almeida leoroch...@gmail.com 
wrote:

 Avoiding JavaScript is possible with the same approach GitHub uses
 when forking a repo: a meta http-equiv refresh page with a message like
 "We're processing your request. This page will update itself when we're
 done. You may refresh it on your own if it makes you feel like you're in
 control."
 
 Providing user feedback is usually less tricky than coping with system
 restrictions. As long as the user is seeing something happening, and
 the system feels like it's evolving towards a solution, instead of
 seeming stuck, users tend to be satisfied.
 
 In your example, the user already waits quite a bit for his file
 upload to finish. Having him wait on the external system to handle the
 data could be a bit too much; better to return some info to him and show
 the rest later.

True, you could do it that way for certain types of requests. The real-life 
situation I was involved with had a backend response time of 1-3 
seconds: long enough to cause scalability issues on the server by running out 
of connections, but not so long that the customers were prepared to have a UI 
that autorefreshed or used Ajax, especially since plenty of other technologies 
don't have this limitation (or, as Jim pointed out, they do have this limitation 
but it isn't as bad). Also, if you are proxying another external application, it 
would be a lot of work to rework each page to be asynchronous.
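As a back-of-the-envelope check on why 1-3 seconds is enough to hurt, the arithmetic can be written down directly. The connection count of 4 and the rate of 10 requests per second below are assumed numbers chosen only for illustration; the 1-3 second hold time is the figure from the situation described above.

```python
import math


def max_requests_per_second(connections, seconds_held):
    """Upper bound on throughput when each request holds a connection."""
    return connections / seconds_held


def connections_needed(requests_per_second, seconds_held):
    """Connections required to keep up with a given arrival rate."""
    return math.ceil(requests_per_second * seconds_held)


# Assumed: 4 connections, each held for the whole backend call.
best_case = max_requests_per_second(4, 1.0)   # backend answers in 1 s
worst_case = max_requests_per_second(4, 3.0)  # backend answers in 3 s

# Assumed: 10 requests per second arriving, 3 s backend calls.
needed = connections_needed(10, 3.0)
```

Holding the connection only for the database part of the request shrinks `seconds_held`, which is what the suggestions earlier in the thread aim at.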


 