Ricardo Wurmus <ricardo.wur...@mdc-berlin.de> skribis: > So I rewrote the CRAN importer to do this: > > * download the DESCRIPTION file for a given package > * break it up into a simple alist > * transform the alist into a package expression > > This is much simpler than the sxml hackery we did before and the code > can be reused to write an importer for Bioconductor, a popular, > versioned R package repository for bioinformatics packages.[1]
Sounds great! > From f2455d19461d50c775360aafb60c104281f83483 Mon Sep 17 00:00:00 2001 > From: Ricardo Wurmus <ricardo.wur...@mdc-berlin.de> > Date: Thu, 3 Dec 2015 16:12:09 +0100 > Subject: [PATCH] import: cran: Parse DESCRIPTION instead of HTML. > > * guix/import/cran.scm (description->alist, safe-car, listify, > beautify-description, description->package): New procedures. > (table-datum, downloads->url, nodes->text, cran-sxml->sexp): Remove > proceduces. > (latest-release): Use parsed DESCRIPTION instead of SXML. > * tests/cran.scm: Rewrite to match importer. > --- > guix/import/cran.scm | 270 > +++++++++++++++++++++++++-------------------------- > tests/cran.scm | 189 +++++++++++++++--------------------- > 2 files changed, 214 insertions(+), 245 deletions(-) Bonus points for making it shorter. :-) > +(define (safe-car maybe-pair) Does it have airbags? > + (if (or (null? maybe-pair) > + (not maybe-pair)) > + #f > + (car maybe-pair))) Seriously, I think this is the wrong way to deal with that. > +(define (package->cran-name package) > + "Given a Guix PACKAGE value, return the name of the R package on CRAN." > + (let* ((source-url (and=> (package-source package) > + (compose safe-car origin-uri))) I would change it to: (match (package-source package) ((? origin? origin) (match (origin-uri origin) ((url rest ...) (let ((end (string-rindex url #\_)) …) …)) (_ #f))) (_ #f)) This is more verbose, but it makes the intent clearer IMO, and avoids “wrong-type-arg #f” errors that would arise with the proposed code. > + (let ((url (string-append %cran-url name "/DESCRIPTION"))) > + (call-with-temporary-output-file > + (lambda (temp port) > + (and (url-fetch url temp) > + (call-with-input-file temp > + (compose description->alist read-string))))))) I think it’s best to use ‘http-fetch’ or ‘http-fetch/cached’ from (guix http-client), which does not necessitate the creation of a temporary file. > +(define (beautify-description description) > + "Improve the package DESCRIPTION by turning a beginning sentence fragment > +into a proper sentence and by using two spaces between sentences." Excellent. :-) > +URL: http://gnu.org/s/my-example > +Description: This is a long description > +spanning multiple lines: and it could confuse the parser that > +there is a colon : on the lines. > + And: this line continues the description. > +biocViews: 0 > +SystemRequirements: Cairo (>= 0) > +Depends: A C++11 compiler. Version 4.6.* of g++ (as > + currently in Rtools) is insufficient; versions 4.8.*, 4.9.* or > + later will be fine. Too bad that this format is almost, but not quite, recutils. The rest LGTM! Can you send an updated patch? Thank you! Ludo’.