Sorry for sending this so late, but I'm catching up with darcs-related mail.

On Tue, Sep 14, 2010 at 02:12:13PM +0200, Petr Rockai wrote:
> Instead, I could offer a list of key->info mapping as part of the
> machine format at the start or end (probably end). I imagine it could
> look like:
>
> <the annotation>
>
> <hash1>
> <patch info>
>
> <hash2>
> <patch info>
>
> ...
>
> I did not include before because I only had feedback from Lele who said
> it is redundant in his case (since he already maintains a <hash> ->
> <info> map internally). I don't think it'd be too costly to add the map
> for anyone (if the infos are not interesting, you can simply cut off at
> the first empty line).

That depends on the tool. Darcsweb relies on annotate --xml output to
show the annotate page, and if it had only the hash ids, that would mean
extra darcs invocations to get authorship information.

That is because darcsweb does not rely on any database or persistent
state. It's meant to be a light, easy-to-install, read-only CGI
application.

I can imagine that, for example, a short-lived graphical annotate
browser (like git gui blame) could have similar requirements.

I've read the discussion and I think most of the formats look great
(both machine and readable), but it'd be nice if the machine-readable
ones could export at least as much information as the current --xml,
for the reasons stated above.

Maybe a short --machine, and an optional --machine=long or something
like that, where the latter would behave as you propose, with an initial
hash -> info map.

As for the format itself, as long as it is unambiguous and reasonably
easy to parse, I'm OK with it.

Here are some things in the XML output that caused trouble for darcsweb
in the past, and that could perhaps be avoided or improved in the new
format:

 - Encoding of code: in particular non-UTF-8 files, or files with a mix
   of encodings.
 - Non-printable characters in code: things like ^L are common; if you
   are escaping some of them, please make the escaping easy to handle.
 - Date formats: please use a normalized date format (ISO would be IMHO
   a nice choice), and avoid timezone names if possible, using [+-]XXXX
   instead. Timezone names are very problematic to parse.
 - Encoding of the author's name: remember that people may put weird
   characters in their name, and it should be handled properly.
 - Names and email addresses: if you are putting names and email
   addresses together, please escape < and > in names, so finding out
   the email address is easier.
 - Binary files: while this has not been a problem, it would be very
   nice to learn from darcs which files it considers binary.

Also, if you are going to deprecate --xml, please make sure there is a
way to reliably detect the availability of the new output in a
backwards-compatible way. That way tools can try the new format first
and, if that fails, fall back to the old one.

One simple possibility is to make sure that current/old darcs exits
with a non-zero code when invoked with the new flag, and that this code
differs from the one darcs --machine (or whatever it ends up being
called) uses to signal an error.

Thanks a lot,
		Alberto
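
P.S. To make the fallback idea concrete, here is a rough Python sketch
of what a tool like darcsweb could do. It is only an illustration under
assumptions: the --machine flag name is just the placeholder used above,
NEW_FORMAT_ERROR is a made-up exit code (darcs reserves no such value
today), and annotate() with its injectable run parameter is a
hypothetical helper, not darcsweb code. Output is kept as bytes because
annotated files may not be UTF-8.

```python
import subprocess

# Hypothetical: the exit code a future darcs would reserve for "the new
# flag is understood, but annotate itself failed". Not a real darcs value.
NEW_FORMAT_ERROR = 3


def annotate(path, run=subprocess.run, new_flag="--machine"):
    """Return (format_name, raw_output) for path, preferring the new format.

    run is injectable so the logic can be exercised without darcs installed.
    """
    new = run(["darcs", "annotate", new_flag, path], capture_output=True)
    if new.returncode == 0:
        return "machine", new.stdout
    if new.returncode == NEW_FORMAT_ERROR:
        # The flag is supported and the failure is genuine: do not retry.
        raise RuntimeError(new.stderr.decode(errors="replace"))

    # Any other non-zero code: assume an older darcs rejected the flag,
    # and fall back to the existing XML output.
    old = run(["darcs", "annotate", "--xml", path], capture_output=True)
    if old.returncode != 0:
        raise RuntimeError(old.stderr.decode(errors="replace"))
    return "xml", old.stdout
```

With a scheme like this, a tool pays for one extra darcs invocation only
when talking to an old darcs, and it never mistakes a real annotate
failure for a missing flag.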