On 31 January 2016 at 14:03, Stefan Fuhrmann <stef...@apache.org> wrote: > Hi there, > > When the server needs to transmit the list of changed paths > in a revision, its memory usage is O(#changes), i.e. practically > unbound. The problems are: > > * FS and repos API require a full collection of all changes > Most consumers simply scan that data once. So, they can > just as well work on a one change at a time basis as a > callback. > > * The data types are different, so we end up with two copies. > A revised svn_fs_path_change_t with identical > svn_repo_path_change3_t can fix this. Minor adjustments > to the element data types further improve efficiency. > > I've played with a revised version of svn_fs_paths_changed > and svn_repos_get_logs to use callbacks for the individual > path changes. Effectively, the FS layer can now pump the > info through the repos / authz filter into the RA layer.
Here is some feedback. Making FS API streamy is a good goal. But as far I remember repos layer still has to store all changed path in hashtable for authorization purposes. Also I think we should not use callbacks to deliver data from the FS layer. Currently FS API is passive and I think it should remain the same: FS API users may invoke FS function from callback and this will require FS implementation to be fully reentrant (to avoid problems like fixed in r1718167 [1]) Instead of callbacks we should create svn_fs_changed_paths_t and some kind of iterator of it. As a nice benefit FS API user may request only interesting information by providing NULL for some arguments (like WC-NG API). I also noticed that svn_fs_nodeid_t member is removed from svn_fs_path_change3_t in r1727822. Are you sure about this change? Some FSFS bugs are not a reason to drop this kind of information. We should fix nodeid in changed path and maybe eventually replace it with svn_fs_node_t. [1] http://svn.haxx.se/dev/archive-2015-12/0006.shtml -- Ivan Zhakov