On Thu, Jan 13, 2011 at 09:03:10AM -0500, C. Michael Pilato wrote: > On 01/13/2011 08:25 AM, Stefan Sperling wrote: > > On Wed, Jan 12, 2011 at 09:46:47AM -0500, C. Michael Pilato wrote: > >> Nothing to add to the immediate discussion at this time. Just wanted to > >> note that we have essentially the same problems to deal with in the BDB > >> backend. The solution we've taken there is "don't drive public API > >> callbacks from inside BDB transactions". It's not always the best for > >> performance, but the FS API is such a high profile API for third-party > >> consumers (very unlike the WC API in that respect, I'd imagine) that > >> simplicity of the API "rules" was valued. > >> > > > > This is about callbacks in the libsvn_client(!) and libsvn_wc APIs. > > Those are widely used by third parties so we should really make > > the right decision. > > > > Do we trust third parties to heed restrictions we place on callbacks? > > There is no way can enforce the restrictions. > > Really? Our BDB code detects re-entry from within a transaction and errors > out. :-) It does this by keeping a simple boolean state variable that is > set whenever the code enters a BDB transaction, and cleared when the > transaction is committed or aborted. If that variable is set, and some > other code comes along to create a new (nested) transaction, and error is > raised. > > Could we do something similar in our SQL logic?
This is a nice idea. I'll think about it. > > Or we could decide not to run callbacks while sqlite transactions > > are in progress, and pay some performance overhead. We'd have to > > run multiple queries and provide data to the callback in chunks > > (of, say, directory-sized units). I'm not sure if this would provide > > adequate performance, but I haven't taken any measurements either (yet). > > A recent change of this kind I made to the node walker didn't make > > to make any notable difference. > > So, yeah, we really need more information to answer questions of > performance, it seems. In the meantime, we need to answer questions of > correctness, maintainability, and ease-of-use. There is no question that calling callbacks while sqlite transactions are running provides best performance. Instead of crawling a tree on disk and running a query for each node (or a set of nodes), we can simply "walk" the db and pass information to the callback. This only works for read-only operations like proplist and status, of course, but those are the ones that need to perform really well so they can efficiently be used by GUIs, for instance. > > We may need both approaches anyway, if we decide to use the second > > approach in backwards-compat code. > > I don't see how we'd ever need both approaches. Can you explain? If we decide that the new approach violates our backwards compat rules, we'd have to provide a separate implementation to continue supporting semantics of old APIs. Stefan