Hi Simon,

On 12/06/2012 05:59 PM, Simon Urbanek wrote:
On Dec 6, 2012, at 8:36 PM, Hervé Pagès wrote:
On 12/06/2012 04:53 PM, William Dunlap wrote:

Why not just use some tag that R doesn't already use, say "Comment:", instead
of a #? If you allow # in position one of a line to mean a comment then people
may expect # to be used as a comment anywhere on a line.

I would stick to whatever the DCF spec says, if there is such a thing.
If the spec says # in position 1 means a comment then I think read.dcf()
should do that. Then the function can be used to read any DCF file,
not just DESCRIPTION files.

DCF itself doesn't define the meaning of # -- it only defines that no field
name is allowed to start with #. In fact the same document says that lines
starting with # are not permitted in general DCF files -- they are only
permitted in Debian's source package control files. That leaves the status of #
as comments somewhat confusing. My interpretation would be that generic DCF
doesn't allow # but specific formats derived from DCF may choose to interpret
it that way. In either case the current behavior of read.dcf() definitely
satisfies the DCF definition.

Not if the definition says that no field name is allowed to start
with #:
> read.dcf("toto.dcf")
     #Package Version
[1,] "toto"   "0.0.0"

As both Brian and Bill pointed out, the proper way to do that is to define a
data field with data/value as the comment.

which maybe works OK for inserting comments in DESCRIPTION files,
but not so well for inserting inter-record comments in DCF files with
multiple records.
In Bioconductor we maintain a big DCF file that we use to automatically
re-generate a collection of annotation packages at each release. The
file looks like:

# Annotation packages for Human

Package: hcg110.db
Version: 2.8.0
PkgTemplate: NCBICHIP.DB

Package: hgfocus.db
Version: 2.8.0
PkgTemplate: NCBICHIP.DB

# Annotation packages for Mouse

Package: mgu74a.db
Version: 2.8.0
PkgTemplate: NCBICHIP.DB

Package: mgu74av2.db
Version: 2.8.0
PkgTemplate: NCBICHIP.DB

The problem if you put those comments in key/value pairs is that
it contaminates the output of read.dcf() with fake records:
> read.dcf("toto.dcf")
     Note                            Package       Version PkgTemplate
[1,] "Annotation packages for Human" NA            NA      NA
[2,] NA                              "hcg110.db"   "2.8.0" "NCBICHIP.DB"
[3,] NA                              "hgfocus.db"  "2.8.0" "NCBICHIP.DB"
[4,] "Annotation packages for Mouse" NA            NA      NA
[5,] NA                              "mgu74a.db"   "2.8.0" "NCBICHIP.DB"
[6,] NA                              "mgu74av2.db" "2.8.0" "NCBICHIP.DB"
The file really has 4 records of data and it'd be good to be able to add
inter-record comments without altering the number of records.
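
To make the record-count problem concrete, here is a minimal sketch that reproduces the "fake records" and then filters them out after the fact. It assumes the comments are stored in a "Note" field (the name shown in the output above) and that every real record has a Package field; the temporary file stands in for toto.dcf.

```r
# Stand-in for a DCF file whose comments are stored in a "Note" field.
dcf_lines <- c(
    "Note: Annotation packages for Human",
    "",
    "Package: hcg110.db",
    "Version: 2.8.0",
    "PkgTemplate: NCBICHIP.DB",
    "",
    "Package: hgfocus.db",
    "Version: 2.8.0",
    "PkgTemplate: NCBICHIP.DB",
    "",
    "Note: Annotation packages for Mouse",
    "",
    "Package: mgu74a.db",
    "Version: 2.8.0",
    "PkgTemplate: NCBICHIP.DB",
    "",
    "Package: mgu74av2.db",
    "Version: 2.8.0",
    "PkgTemplate: NCBICHIP.DB"
)
dcf_file <- tempfile(fileext = ".dcf")
writeLines(dcf_lines, dcf_file)

all_records <- read.dcf(dcf_file)

# Keep only the rows that describe a package, then drop the Note column.
is_real <- !is.na(all_records[, "Package"])
real_records <- all_records[is_real, colnames(all_records) != "Note",
                            drop = FALSE]

nrow(all_records)   # 6
nrow(real_records)  # 4

unlink(dcf_file)
```

This works, but every consumer of the file has to know which records are only notes, which is the burden a comment-aware reader avoids.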
This is the reason why we use a "comment aware" version of read.dcf().
I can see why maybe you wouldn't like having people start using # to
insert comment lines in their DESCRIPTION file and I agree that it
should probably be discouraged. So maybe support for # comments could
be made optional in read.dcf() thru an extra arg, and would be disabled
by default?
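
As a rough illustration of what such an opt-in could look like from the caller's side, here is a sketch of a wrapper around read.dcf(). The function name read_dcf_opt and the argument name skip.comments are hypothetical and not part of base R; only lines whose first character is # are dropped, and nothing changes unless the caller asks for it.

```r
# Hypothetical wrapper (not base R): comment stripping is off by default.
read_dcf_opt <- function(file, ..., skip.comments = FALSE)
{
    if (!skip.comments)
        return(read.dcf(file, ...))
    lines <- readLines(file)
    # Drop only lines whose first character is "#".
    keep_it <- substr(lines, 1L, 1L) != "#"
    con <- textConnection(lines[keep_it])
    on.exit(close(con))
    read.dcf(con, ...)
}

# Usage on a small file with inter-record comments.
demo_file <- tempfile(fileext = ".dcf")
writeLines(c("# Human",
             "Package: hcg110.db",
             "Version: 2.8.0",
             "",
             "# Mouse",
             "Package: mgu74a.db",
             "Version: 2.8.0"),
           demo_file)
records <- read_dcf_opt(demo_file, skip.comments = TRUE)
nrow(records)  # 2
unlink(demo_file)
```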
Thanks,
H.
Cheers,
Simon
Cheers,
H.
(It may also mess up some dcf parsing code that I've written - it checks that
lines after tagged lines are either empty, the start of a new description, or
start with a space, a continuation of the previous line.)
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
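
The kind of line check described in the parenthetical above might look roughly like the following. This is only a guess at the idea, not Bill's actual code: the helper name find_unexpected_dcf_lines is hypothetical, and for simplicity it classifies every line rather than only the lines that follow a tagged line.

```r
# Hypothetical helper: returns the indices of lines that are none of
# blank, tagged ("Field: value"), or continuation (leading space or tab).
find_unexpected_dcf_lines <- function(lines)
{
    is_blank <- grepl("^[[:space:]]*$", lines)
    is_tagged <- grepl("^[^[:space:]:]+:", lines)
    is_continuation <- grepl("^[ \t]", lines)
    which(!(is_blank | is_tagged | is_continuation))
}

lines <- c("Package: toto",
           "Description: A long description",
           "    continued on a second line.",
           "# a comment line",
           "",
           "Package: titi")
unexpected <- find_unexpected_dcf_lines(lines)
unexpected  # 4
```

A parser built around a check like this would reject the # line, which is why allowing comments could break existing code.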
-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Hervé Pagès
Sent: Thursday, December 06, 2012 3:47 PM
To: Duncan Murdoch
Cc: christophe.genol...@u-paris10.fr; r-devel@r-project.org; Christophe Genolini
Subject: Re: [Rd] Comments in the DESCRIPTION file
On 12/06/2012 03:41 PM, Hervé Pagès wrote:
Hi,
Wouldn't be hard to patch read.dcf() though.
FWIW here's the "comment aware" version of read.dcf() I've been using
for years:
.removeCommentLines <- function(infile=stdin(), outfile=stdout())
{
    if (is.character(infile)) {
        infile <- file(infile, "r")
        on.exit(close(infile))
    }
    if (is.character(outfile)) {
        outfile <- file(outfile, "w")
        ## add=TRUE: close only the connections that were opened here
        on.exit(close(outfile), add=TRUE)
    }
    while (TRUE) {
        lines <- readLines(infile, n=25000L)
        if (length(lines) == 0L)
            return(invisible(NULL))
        ## drop lines whose first character is "#"
        keep_it <- substr(lines, 1L, 1L) != "#"
        writeLines(lines[keep_it], outfile)
    }
}

read.dcf2 <- function(file, ...)
{
    clean_file <- file.path(tempdir(), "clean.dcf")

mmh, would certainly be better to just use tempfile() here.
H.

    .removeCommentLines(file, clean_file)
    on.exit(file.remove(clean_file))
    read.dcf(clean_file, ...)
}
Cheers,
H.
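
For reference, a variant along the lines of the tempfile() remark could be sketched as below. The name read_dcf_tmp is hypothetical, and to keep the example self-contained it filters the whole file in one pass instead of calling the chunked .removeCommentLines() helper, so it assumes the file fits comfortably in memory.

```r
# Hypothetical variant of read.dcf2() that uses tempfile(), so concurrent
# calls in the same session cannot clobber each other's clean file.
read_dcf_tmp <- function(file, ...)
{
    clean_file <- tempfile(fileext = ".dcf")
    # Register the cleanup first, so the file is removed even on error.
    on.exit(unlink(clean_file))
    lines <- readLines(file)
    writeLines(lines[substr(lines, 1L, 1L) != "#"], clean_file)
    read.dcf(clean_file, ...)
}

# Usage
src_file <- tempfile(fileext = ".dcf")
writeLines(c("# Annotation packages for Human",
             "Package: hcg110.db",
             "Version: 2.8.0",
             "PkgTemplate: NCBICHIP.DB"),
           src_file)
pkgs <- read_dcf_tmp(src_file)
pkgs[1, "Package"]
unlink(src_file)
```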
On 11/07/2012 01:53 AM, Duncan Murdoch wrote:
On 12-11-07 4:26 AM, Christophe Genolini wrote:
Hi all,
Is it possible to add comments in the DESCRIPTION file?
The read.dcf function is used to read the DESCRIPTION file, and it
doesn't support comments. (The current Debian control format
description does appear to support comments with leading # markers, but
R's read.dcf function doesn't support these.)
You could probably get away with something like
#: this is a comment
since unrecognized fields are ignored, but I think this fact is
undocumented so I would say it's safer to assume that comments are not
supported.
Duncan Murdoch
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319