[R] Stricter read.table?

2010-12-10 Thread Stavros Macrakis
read.table gives idiosyncratic results when the input is formatted
strangely, for example:

read.table(textConnection("a'b\nc'd\n"),header=FALSE,fill=TRUE,sep="",quote="'")
  = c'd a'b c'd

read.table(textConnection("a'b\nc'd\nf'\n'\n"),header=FALSE,fill=TRUE,sep="",quote="'")
  = f'  \na b   c'd f'  \n

Though read.table doesn't specify the syntax of its input precisely, these
results don't seem particularly useful or consistent.

Is there a stricter version of read.table (perhaps in a package) that gives
errors or warnings if it finds quotation marks in the middle of fields or
encounters other such peculiar situations?

Thanks,

 -s


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Stricter read.table?

2010-12-10 Thread Ben Bolker
Stavros Macrakis macrakis at alum.mit.edu writes:

 
> read.table gives idiosyncratic results when the input is formatted
> strangely, for example:
>
> read.table(textConnection("a'b\nc'd\n"),header=FALSE,
>   fill=TRUE,sep="",quote="'")
>   = c'd a'b c'd
>
> read.table(textConnection("a'b\nc'd\nf'\n'\n"),
>   header=FALSE,fill=TRUE,
>   sep="",quote="'")
>   = f'  \na b   c'd f'  \n
>
> Though read.table doesn't specify the syntax of its input precisely, these
> results don't seem particularly useful or consistent.
>
> Is there a stricter version of read.table (perhaps in a package) that gives
> errors or warnings if it finds quotation marks in the middle of fields or
> encounters other such peculiar situations?

  I dissected this behavior a bit more here:

https://stat.ethz.ch/pipermail/r-devel/2010-November/059016.html

(it is due to an inconsistency between the way that scan() and
readLines() handle lines with unterminated quotes, IIRC)

and Martin Maechler said
https://stat.ethz.ch/pipermail/r-devel/2010-November/059107.html
I think it can be defended to file as a bug, but it is tricky to pinpoint
exactly what the issue is.
   I don't know of a stricter version of read.table(), but if you had
the time and inclination to pick through the code and (i) provide a
careful definition of desired behavior and (ii) supply patches, you could
do your little bit to make R better. (If I posted a bug report would you
annotate it with a discussion of desired behavior?)
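
As a starting point for such a definition, here is a minimal sketch of a
pre-check that could run before read.table(). It is not an existing
function or package: find_quote_problems() and read_table_strict() are
hypothetical names, and the rules (a line must contain an even number of
quote characters, and a quote may not have field characters on both sides)
are only one assumed notion of "strict" for whitespace-separated input
with a single quote character.

```r
# Hypothetical helper (not part of base R): report lines whose quoting
# looks suspicious under two assumed rules:
#   1. the number of quote characters on a line must be even;
#   2. a quote must not have non-blank, non-quote characters on both sides.
find_quote_problems <- function(lines, quote = "'") {
  is_field_char <- function(ch) !(ch %in% c(" ", "\t", quote))
  problems <- character(0)
  for (i in seq_along(lines)) {
    chars <- strsplit(lines[i], "", fixed = TRUE)[[1]]
    pos <- which(chars == quote)
    if (length(pos) %% 2 == 1) {
      problems <- c(problems, sprintf("line %d: unbalanced quote", i))
    }
    # Only quotes that are not the first or last character can be mid-field.
    inner <- pos[pos > 1 & pos < length(chars)]
    mid <- inner[is_field_char(chars[inner - 1]) &
                 is_field_char(chars[inner + 1])]
    if (length(mid) > 0) {
      problems <- c(problems,
                    sprintf("line %d: quote in the middle of a field", i))
    }
  }
  problems
}

# Hypothetical wrapper: stop with a message instead of parsing doubtful input.
# 'lines' is a character vector with one element per input line.
read_table_strict <- function(lines, quote = "'", ...) {
  problems <- find_quote_problems(lines, quote)
  if (length(problems) > 0) {
    stop(paste(c("suspicious quoting in input:", problems), collapse = "\n"))
  }
  con <- textConnection(lines)
  on.exit(close(con))
  read.table(con, quote = quote, ...)
}
```

With this sketch, read_table_strict(c("a'b", "c'd"), header=FALSE) stops
with an error listing both lines, while input such as c("a b", "'c d' e")
is passed through to read.table() unchanged. Multi-line quoted fields are
rejected by rule 1, which may or may not be the desired behavior.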
