On Fri, Jun 11, 2010 at 20:28, Jelmer Vernooij <[email protected]> wrote:
> On Fri, Jun 11, 2010 at 07:26:31PM -0400, David Borowitz wrote: > > I'm a bit torn on this issue. On the one hand, I agree that there are > some > > things like excludes that we can provide well-defined APIs for rather > than > > depending on specific filenames. On the other hand, there are two cases > that > > I think merit keeping get_named_file around: > > 1. Allow users to read and parse files on their own that Dulwich does not > > (yet) have support for. (Presumably we would encourage such users to > > contribute their parsing code back to Dulwich and/or transition to the > > official API once one exists.) > I don't have any objections against making it possible for people to do > this but I'd rather keep it optional, not a requirement for the correct > or full functioning. What do you mean by the correct or full functioning? I was referring to the fact that there are a lot of miscellaneous files that can already live under .git (*_HEAD, RENAMED-REF, branches/*, hooks/*, logs/*). If users of dulwich want to use these files before we've developed APIs for them, I think we should let them. Yes, it wouldn't work for non-disk-based Repos, but at least it would work in the most common case. This is just about the transitional period, until we provide first-class support for those features. > > 2. It's necessary for the implementation of dumb HTTP serving. > Is there any reason for this to go through the Repo API rather than through > the standard os module ? > Well, the os module wouldn't work in the case when a repo doesn't have a controldir. I suppose it would be possible to implement all of the services in web.py:HttpGitApplication using functions that don't rely on get_named_file. It would be a bit of work, however. > It doesn't seem right to make up these files on the fly for repository > implementations that do not use the standard git disk representation. The problem is that the dumb HTTP protocol itself relies on the standard disk representation--files named HEAD, info/refs, packed-refs, objects/pack/pack-*.pack, and so on. So if we get rid of get_named_file in the manner I described above, you'd in fact end up with *more* on-the-fly file generation. Actually, I take that back. Any non-disk-based implementation has to do all sorts of on-the-fly file generation, to implement dumb HTTP serving, period. It's just a question of whether it goes in get_named_file or some higher-level API. So I guess I'm for making more higher-level functions. I'll try to take a crack at that in web.py in the next week. > I'm still slightly for using '/' to be consistent with refnames and URL > > representations in the HTTP server. Note that both of these points are > > orthogonal to the choice of path separator. > I'm not opposed to making that change but I would prefer to make that > change > at the same time as any other changes we make to get_named_file. > > Cheers, > > Jelmer > > > > > On Wed, Jun 9, 2010 at 17:46, Jelmer Vernooij <[email protected]> wrote: > > > > > On Wed, Jun 09, 2010 at 02:48:18PM -0500, Augie Fackler wrote: > > > > On Mon, Jun 7, 2010 at 5:32 PM, Jelmer Vernooij <[email protected]> > > > wrote: > > > > > On Mon, 2010-06-07 at 15:59 -0500, Augie Fackler wrote: > > > > >> On May 30, 2010, at 9:18 AM, Jelmer Vernooij wrote: > > > > >> > On Mon, May 24, 2010 at 11:08:56AM -0700, David Borowitz wrote: > > > > >> > > This is quite the grab-bag of changes :). Available as usual > at my > > > > >> > > github > > > > >> > > fork: > > > > >> > > http://github.com/dborowitz/dulwich > > > > >> > > > > > > >> > > I think Augie has requested that we do code reviews in some > place > > > > >> > > public, > > > > >> > > like this mailing list. If others agree I'm happy to oblige by > > > > >> > > mailing > > > > >> > > patches. > > > > >> > That'd be great - mailed patches are a bit easier to comment on > for > > > > >> > me, and this > > > > >> > mailing list seems like the most appropriate place. > > > > >> > > > > > >> > > b9677d0 Add tests for sorted_tree_items and C implementation. > > > > >> > > 186fb3a Fix memory leak in C implementation of > sorted_tree_items. > > > > >> > > 0994d03 Clean up file headers. > > > > >> > > 4a42aad Fix copyright in dul-web. > > > > >> > I've merged these, thanks. > > > > >> > > > > > >> > > e146b26 Move named file initilization to BaseRepo. > > > > >> > Thanks, this makes sense. > > > > >> > > > > > >> > We should've really started out by naming BaseRepo Repo and > naming > > > > >> > Repo > > > > >> > DiskRepo or something, but it seems hard to change that now > without > > > > >> > breaking > > > > >> > everybody's existing code. :-( > > > > >> > > > > > >> > > 9f08973 Use / as separator for named files on all platforms. > > > > >> > I'm not sure about this change - other than the use of > info/excludes > > > > >> > we don't > > > > >> > actually appear to use / in calls to {get,put}_named_file so it > > > > >> > seems like a > > > > >> > silly thing to break the API for. > > > > >> Is it an API break then? It also means that you can always request > > > > >> 'info/excludes' without respect to the platform you're on. > > > > > It changes the behaviour of this particular call on some platforms > > > > > (specifying info\excludes on Windows will break), so users will > have to > > > > > change their callsites. > > > > > > > > > >> > Alternatively, perhaps we can make get_named_file > private/protected? > > > > >> > I'd be > > > > >> > less hesitant about making this sort of change if we do that. > > > > > > > > > >> -1, it feels like clients of Repo will need to load files within > .git > > > > >> and get_named_file feels like the correct call for this. > > > > > Do they need to do that directly though, and for what sort of > > > > > functionality? I'd rather see higher-level function calls on Repo, > in > > > > > particular because that makes it easier to e.g. implement the Repo > API > > > > > on top of other databases with a different on-disk format. > > > > > > > > It seems reasonable that we'll need access to (for example) > > > > info/excludes so we can construct the appropriate matcher for > > > > Mercurial, Bazaar or whatever else. It seems kind of odd to my mind > to > > > > have a function per special file on the repo object. > > > I think it's reasonable to have a separate function, the fact that the > > > exclude info can be found in info/excludes is a representation issue. > > > > > > I'd rather have a function that returns a list of exclude patterns for > > > which we can put in a stub that e.g. returns [] or uses some other > magic. > > > The alternative is to have a get_named_file() implementation that > returns > > > custom file contents (the disk format rather than the content) based on > the > > > filename that gets passed in. The implementation for a bzr repo would > > > look something like this: > > > > > > def get_named_file(self, name): > > > if name == "info/excludes": > > > return StringIO() > > > elif name == "description": > > > return "A Bazaar repository" > > > elif name =="config": > > > # Construct a ConfigObj, serialize it and return it as a > > > # file-like object > > > elif name.startswith("refs/heads/"): > > > # Some magic involving a RefsContainer and reserializing > > > else: > > > ... > > > > > > Making the disk format part of the API is something I'd like to avoid > as > > > much > > > as possible as it makes it hard (and slow) to implement the API on top > of > > > different disk representations (not just bzr or mercurial but also on > top > > > of > > > say a highly scalable database system). I don't think having a function > for > > > getting at the excludes would make the API that much more complicated, > > > there > > > aren't much similar files like this. > > > > > > Cheers, > > > > > > Jelmer > > > > > -- >
_______________________________________________ Mailing list: https://launchpad.net/~dulwich-users Post to : [email protected] Unsubscribe : https://launchpad.net/~dulwich-users More help : https://help.launchpad.net/ListHelp

