On 31 January 2016 at 14:03, Stefan Fuhrmann <stef...@apache.org> wrote:
> Hi there,
>
> When the server needs to transmit the list of changed paths
> in a revision, its memory usage is O(#changes), i.e. practically
> unbound. The problems are:
>
> * FS and repos API require a full collection of all changes
>   Most consumers simply scan that data once. So, they can
>   just as well work on a one change at a time basis as a
>   callback.
>
> * The data types are different, so we end up with two copies.
>   A revised svn_fs_path_change_t with identical
>   svn_repo_path_change3_t can fix this. Minor adjustments
>   to the element data types further improve efficiency.
>
> I've played with a revised version of svn_fs_paths_changed
> and svn_repos_get_logs to use callbacks for the individual
> path changes. Effectively, the FS layer can now pump the
> info through the repos / authz filter into the RA layer.

Here is some feedback.

Making FS API streamy is a good goal. But as far I remember repos
layer still has to store all changed path in hashtable for
authorization purposes.

Also I think we should not use callbacks to deliver data from the FS
layer. Currently FS API is passive and I think it should remain the
same: FS API users may invoke FS function from callback and this will
require FS implementation to be fully reentrant (to avoid problems
like fixed in r1718167 [1])

Instead of callbacks we should create svn_fs_changed_paths_t and some
kind of iterator of it. As a nice benefit FS API user may request only
interesting information by providing NULL for some arguments (like
WC-NG API).

I also noticed that svn_fs_nodeid_t member is removed from svn_fs_path_change3_t
 in r1727822. Are you sure about this change? Some FSFS bugs are not a
reason to drop this kind of information. We should fix nodeid in
changed path and maybe eventually replace it with svn_fs_node_t.

[1] http://svn.haxx.se/dev/archive-2015-12/0006.shtml

-- 
Ivan Zhakov

Reply via email to