On Fri, Nov 10, 2017 at 2:48 PM, Augie Fackler <r...@durin42.com> wrote:
>
> On Nov 10, 2017, at 17:46, Gregory Szorc <gregory.sz...@gmail.com> wrote:
>
> On Fri, Nov 10, 2017 at 2:24 PM, Augie Fackler <r...@durin42.com> wrote:
>
>> On Mon, Nov 06, 2017 at 09:46:18AM -0800, Gregory Szorc wrote:
>> > On Mon, Nov 6, 2017 at 4:34 AM, Pulkit Goyal <7895pul...@gmail.com> wrote:
>> >
>> > > Hey,
>> > >
>> > > I am working on porting functionalities from hgremotenames extension
>> > > to core. The hgremotenames extensions pull the information about
>> > > remote branches and remote bookmarks and store them to provide a
>> > > better workflow.
>> > >
>> > > The current storage format which hgremotenames has is having a file
>> > > `.hg/remotenames` in which each line is of the format `node nametype
>> > > name`, where
>> > > - `node` refers to node where the remotename was last seen
>> > > - `nametype` refers whether it's a bookmark or branch
>> > > - `name` consists of name of the remote and name of the remote
>> > > bookmark/branch
>> > >
>> > > At sprint, Ryan suggested to split the file according to bookmark and
>> > > branches so that we can read and write more easily which makes sense.
>> > >
>> > > While working on the patches, I found out that if the name of the
>> > > remote contains a '/', then the current storage format is not good and
>> > > we can fail to parse things correctly.
>> > >
>> > > Do you guys have any better ideas on how we can store remotenames?
>> > >
>> >
>> > I have somewhat strong feels that we should strive to use append-only file
>> > formats as much as possible. This will enable us to more easily implement a
>> > "time machine" feature that allows the full state of the repository at a
>> > previous point in time to be seen. It also makes transactions lighter, as
>> > we don't need to perform a full file backup: you merely record the offset
>> > of files being appended to.
>> >
>> > A problem with naive append-only files is you need to scan the entire file
>> > to obtain the most recent state. But this can be rectified with either
>> > periodic "snapshots" in the data stream (like how revlogs periodically
>> > store a fulltext) or via separate cache files holding snapshot(s). The tags
>> > and branches caches kinda work this way: they effectively prevent full
>> > scans or expensive reads of changelog and/or manifest-based data.
>> >
>> > A revlog may actually not be a bad solution to this problem space. A bit
>> > heavyweight. But the solution exists and should be relatively easy to
>> > integrate.
>>
>> I think for now we should not let the perfect (generalized undo) be
>> the enemy of the good (keeping track of where labels were on
>> remotes). For now, I'd be fine with using some sort of null-separated
>> file format. It's gross, but it should let Pulkit land the feature,
>> and since it's largely advisory data it's not a crisis if we improve
>> it later.
>>
>
> I agree. A generalized undo format is a lot of work and is definitely
> scope bloat for remote names.
>
> But my point about append-only data structures still stands. You basically
> get the Git equivalent of the "reflog" for free if you e.g. store this data
> in a revlog.
>
>
> That sounds relevant for journal, less so for remotenames.
>

Fair enough.

>
> Would it make sense to try and come to an agreement on a single format we
> can use for node->label storage? It's come up in the bookmarks binary part
> patches as well...
>

Perhaps. But that could quickly devolve into such topics as:

* Unified vs per-domain storage
* Label specific versus generic node metadata storage
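
For illustration only, here is a minimal Python sketch of how the two ideas from this thread (a null-separated record format and an append-only file whose transactions only record an offset) could fit together. It is not what hgremotenames or core does. The record layout `node\0remote\0name\n` and all function names are hypothetical, and it assumes that NUL and newline never occur in a remote name or in a bookmark/branch name.

```python
import os

# Assumption: NUL never appears in a remote, bookmark, or branch name,
# so a remote called "team/upstream" no longer confuses the parser.
SEP = b"\0"


def encode_record(node, remote, name):
    """Serialize one entry as node\\0remote\\0name\\n (hypothetical layout)."""
    fields = [f.encode("utf-8") for f in (node, remote, name)]
    for field in fields:
        if SEP in field or b"\n" in field:
            raise ValueError("field contains a reserved byte")
    return SEP.join(fields) + b"\n"


def append_record(path, node, remote, name):
    """Append one entry and return the file size before the append.

    The returned offset is all a transaction has to remember in order
    to undo the write later.
    """
    with open(path, "ab") as fh:
        offset = fh.seek(0, os.SEEK_END)
        fh.write(encode_record(node, remote, name))
    return offset


def rollback(path, offset):
    """Undo appends by truncating the file back to a recorded offset."""
    with open(path, "r+b") as fh:
        fh.truncate(offset)


def read_latest(path):
    """Naive full scan: the last record for each (remote, name) wins."""
    latest = {}
    with open(path, "rb") as fh:
        for line in fh:
            node, remote, name = line.rstrip(b"\n").split(SEP)
            latest[(remote.decode("utf-8"), name.decode("utf-8"))] = (
                node.decode("utf-8")
            )
    return latest
```

`read_latest` is the full scan that the snapshot or cache-file idea would avoid. Older records stay in the file until a rollback truncates them, which is what would make a reflog-like history possible.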
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel