Re: [Rd] Marking a ticket as a (potential) regression in bug tracker?

2020-11-26 Thread Scott Kostyshak
On Fri, Jun 12, 2020 at 10:17:11AM -0400, Scott Kostyshak wrote:
> 
> Is there a way to mark a ticket as a potential regression in the bug
> tracker? I think the following issue is a regression:
> 
>   https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17684
> 
> I've just tested (2020-06-12 r78687) and what I believe to be a
> regression is still there. I don't think the bug has bitten many people,
> so I don't think it is critical, but often it is helpful to mark bugs as
> regressions in trackers.

If there's no current way to mark something as a regression, would there
be support for adding a way?

Best,

Scott


-- 
Scott Kostyshak (he/him)
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Marking a ticket as a (potential) regression in bug tracker?

2020-06-12 Thread Scott Kostyshak
Is there a way to mark a ticket as a potential regression in the bug
tracker? I think the following issue is a regression:

  https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17684

I've just tested (2020-06-12 r78687) and what I believe to be a
regression is still there. I don't think the bug has bitten many people,
so I don't think it is critical, but often it is helpful to mark bugs as
regressions in trackers.

Thanks,

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/



Re: [Rd] [patch] add sanity checks to quantile()

2020-01-04 Thread Scott Kostyshak
On Sat, Jan 04, 2020 at 06:32:15PM -0500, Duncan Murdoch wrote:
> 
> On 04/01/2020 4:35 p.m., Scott Kostyshak wrote:
> > On Fri, May 31, 2019 at 01:28:55AM -0400, Scott Kostyshak wrote:
> > > The attached patch adds some sanity checks to the "type" argument of
> ...
> > Bump. For this type of patch proposal, is it better to use the
> > bug tracker?
> 
> For almost any patch proposal it is.  Certainly if you don't get action
> (or at least discussion) within a few days, any other proposal will be
> forgotten.
> 
> Duncan Murdoch

That makes sense. Thanks for the quick reply and advice. Here is the
ticket:

  https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17683

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/



Re: [Rd] [patch] add sanity checks to quantile()

2020-01-04 Thread Scott Kostyshak
On Fri, May 31, 2019 at 01:28:55AM -0400, Scott Kostyshak wrote:
> The attached patch adds some sanity checks to the "type" argument of
> quantile(). Output from the following commands shows the change of
> behavior with the current patch:
> 
>   vec <- 1:10
>   quantile(vec, type = c(1, 2))
>   quantile(vec, type = 10)
>   quantile(vec, type = "aaa")
>   quantile(vec, type = NA_real_)
>   quantile(vec, type = 4.3)
>   quantile(vec, type = -1)
> 
> Current behavior (i.e., without the patch):
> 
>   > vec <- 1:10
>   > quantile(vec, type = c(1, 2))
>   Error in switch(type, (nppm > j), ((nppm > j) + 1)/2, (nppm != j) | ((j%%2L) ==  : 
>     EXPR must be a length 1 vector
>   In addition: Warning messages:
>   1: In if (type == 7) { :
>     the condition has length > 1 and only the first element will be used
>   2: In if (type <= 3) { :
>     the condition has length > 1 and only the first element will be used
>   3: In if (type == 3) n * probs - 0.5 else n * probs :
>     the condition has length > 1 and only the first element will be used
>   > quantile(vec, type = 10)
>   Error in quantile.default(vec, type = 10) : object 'a' not found
>   > quantile(vec, type = "aaa")
>   Error in type - 3 : non-numeric argument to binary operator
>   > quantile(vec, type = NA_real_)
>   Error in if (type == 7) { : missing value where TRUE/FALSE needed
>   > quantile(vec, type = 4.3)
>     0%  25%  50%  75% 100% 
>    1.0  2.5  5.0  7.5 10.0 
>   > quantile(vec, type = -1)
>     0%  25%  50%  75% 100% 
>      1    2    5    7   10 
> 
> 
> Behavior with the patch:
> 
>   > vec <- 1:10
>   > quantile(vec, type = c(1, 2))
>   Error in quantile.default(vec, type = c(1, 2)) : 
>     'type' must be of length 1
>   > quantile(vec, type = 10)
>   Error in quantile.default(vec, type = 10) : 
>     'type' must be an integer between 1 and 9
>   > quantile(vec, type = "aaa")
>   Error in quantile.default(vec, type = "aaa") : 
>     'type' must be an integer between 1 and 9
>   > quantile(vec, type = NA_real_)
>   Error in quantile.default(vec, type = NA_real_) : 
>     'type' must be an integer between 1 and 9
>   > quantile(vec, type = 4.3)
>   Error in quantile.default(vec, type = 4.3) : 
>     'type' must be an integer between 1 and 9
>   > quantile(vec, type = -1)
>   Error in quantile.default(vec, type = -1) : 
>     'type' must be an integer between 1 and 9
> 
> 
> Note that with the patch, quantile() gives an error in some cases where
> the current code does not. Specifically, the following two calls to
> quantile() do not give an error without the patch:
> 
>   quantile(vec, type = 4.3)
>   quantile(vec, type = -1)
> 
> Thus, this patch could cause current code to give an error. If it is
> desired, I could change the patch such that it only gives an error when
> current R gives an error (i.e., the only benefit of the patch would be
> better error messages), or I can change the patch to give a warning in
> these cases.
> 
> Scott
> 
> 
> -- 
> Scott Kostyshak
> Assistant Professor of Economics
> University of Florida
> https://people.clas.ufl.edu/skostyshak/
> 

Bump. For this type of patch proposal, is it better to use the
bug tracker?

Thanks,

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/



[Rd] [patch] add sanity checks to quantile()

2019-05-30 Thread Scott Kostyshak
The attached patch adds some sanity checks to the "type" argument of
quantile(). Output from the following commands shows the change of
behavior with the current patch:

  vec <- 1:10
  quantile(vec, type = c(1, 2))
  quantile(vec, type = 10)
  quantile(vec, type = "aaa")
  quantile(vec, type = NA_real_)
  quantile(vec, type = 4.3)
  quantile(vec, type = -1)

Current behavior (i.e., without the patch):

  > vec <- 1:10
  > quantile(vec, type = c(1, 2))
  Error in switch(type, (nppm > j), ((nppm > j) + 1)/2, (nppm != j) | ((j%%2L) ==  : 
    EXPR must be a length 1 vector
  In addition: Warning messages:
  1: In if (type == 7) { :
    the condition has length > 1 and only the first element will be used
  2: In if (type <= 3) { :
    the condition has length > 1 and only the first element will be used
  3: In if (type == 3) n * probs - 0.5 else n * probs :
    the condition has length > 1 and only the first element will be used
  > quantile(vec, type = 10)
  Error in quantile.default(vec, type = 10) : object 'a' not found
  > quantile(vec, type = "aaa")
  Error in type - 3 : non-numeric argument to binary operator
  > quantile(vec, type = NA_real_)
  Error in if (type == 7) { : missing value where TRUE/FALSE needed
  > quantile(vec, type = 4.3)
    0%  25%  50%  75% 100% 
   1.0  2.5  5.0  7.5 10.0 
  > quantile(vec, type = -1)
    0%  25%  50%  75% 100% 
     1    2    5    7   10 


Behavior with the patch:

  > vec <- 1:10
  > quantile(vec, type = c(1, 2))
  Error in quantile.default(vec, type = c(1, 2)) : 
    'type' must be of length 1
  > quantile(vec, type = 10)
  Error in quantile.default(vec, type = 10) : 
    'type' must be an integer between 1 and 9
  > quantile(vec, type = "aaa")
  Error in quantile.default(vec, type = "aaa") : 
    'type' must be an integer between 1 and 9
  > quantile(vec, type = NA_real_)
  Error in quantile.default(vec, type = NA_real_) : 
    'type' must be an integer between 1 and 9
  > quantile(vec, type = 4.3)
  Error in quantile.default(vec, type = 4.3) : 
    'type' must be an integer between 1 and 9
  > quantile(vec, type = -1)
  Error in quantile.default(vec, type = -1) : 
    'type' must be an integer between 1 and 9


Note that with the patch, quantile() gives an error in some cases where
the current code does not. Specifically, the following two calls to
quantile() do not give an error without the patch:

  quantile(vec, type = 4.3)
  quantile(vec, type = -1)

Thus, this patch could cause current code to give an error. If it is
desired, I could change the patch such that it only gives an error when
current R gives an error (i.e., the only benefit of the patch would be
better error messages), or I can change the patch to give a warning in
these cases.
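
To make the warning option concrete, here is a minimal sketch. The name check_quantile_type() and its "strict" argument are hypothetical and used only for illustration; the attached patch adds its checks inline in quantile.default() and has no such switch. With strict = FALSE, a numeric scalar outside 1:9 produces a warning instead of an error. Which of those values current R actually tolerates is not fully characterized here, so treat the split as an assumption.

```r
# Hypothetical helper, for illustration only; the patch itself adds the
# equivalent checks inline at the top of quantile.default().
check_quantile_type <- function(type, strict = TRUE) {
    if (length(type) != 1L)
        stop("'type' must be of length 1")
    if (is.na(type) || !is.numeric(type))
        stop("'type' must be an integer between 1 and 9")
    if (!any(type == 1:9)) {
        msg <- "'type' must be an integer between 1 and 9"
        # strict = FALSE is the assumed "warning only" variant
        if (strict) stop(msg) else warning(msg)
    }
    invisible(type)
}

check_quantile_type(7)                      # passes silently
try(check_quantile_type(4.3))               # error, as with the patch
check_quantile_type(4.3, strict = FALSE)    # warning only
```

The strict default mirrors the behavior shown under "Behavior with the patch" above.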

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/

Index: src/library/stats/R/quantile.R
===
--- src/library/stats/R/quantile.R	(revision 76528)
+++ src/library/stats/R/quantile.R	(working copy)
@@ -25,6 +25,12 @@
 function(x, probs = seq(0, 1, 0.25), na.rm = FALSE, names = TRUE,
  type = 7, ...)
 {
+    if (length(type) != 1L) {
+        stop("'type' must be of length 1")
+    }
+    if (is.na(type) || !is.numeric(type) || !any(type == 1:9)) {
+        stop("'type' must be an integer between 1 and 9")
+    }
 if(is.factor(x)) {
 	if(is.ordered(x)) {
 	   if(!any(type == c(1L, 3L)))


Re: [Rd] source(echo = TRUE) with a iso-8859-1 encoded file gives an error

2018-05-04 Thread Scott Kostyshak
Thanks for your reply, Ista, and your advice. I will re-post to r-help.

Best,

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/

On Tue, May 01, 2018 at 07:15:30PM +, Ista Zahn wrote:
> Hi Scott,
> 
> This question is appropriate for the r-help mailing list, but probably
> off-topic here on r-devel.
> 
> Best,
> Ista
> 
> On Tue, May 1, 2018 at 2:57 PM, Scott Kostyshak <skostys...@ufl.edu> wrote:
> > I have very little knowledge about file encodings and would like to
> > learn more.
> >
> > I've read the following pages to learn more:
> >
> >   http://stat.ethz.ch/R-manual/R-devel/library/base/html/Encoding.html
> >   https://stackoverflow.com/questions/4806823/how-to-detect-the-right-encoding-for-read-csv
> >   https://developer.r-project.org/Encodings_and_R.html
> >
> > The last one, in particular, has been very helpful. I would be
> > interested in any further references that you suggest.
> >
> > I attach a file that reproduces the issue I would like to learn more
> > about. I do not know if the file encoding will be correctly preserved
> > through email, so I also provide the file (temporarily) on Dropbox here:
> >
> >   https://www.dropbox.com/s/3lbgebk7b5uaia7/encoding_export_issue.R?dl=0
> >
> > The file gives an error when using "source()" with the
> > argument echo = TRUE:
> >
> >   > source("encoding_export_issue.R", echo = TRUE)
> >   Error in nchar(dep, "c") : invalid multibyte string, element 1
> >   In addition: Warning message:
> >   In grepl("^[[:blank:]]*$", dep[1L]) :
> > input string 1 is invalid in this locale
> >
> > The problem comes from the "á" character in the .R file. The file
> > appears to be encoded as "iso-8859-1":
> >
> >   $ file --mime-encoding encoding_export_issue.R
> >   encoding_export_issue.R: iso-8859-1
> >
> > Note that for me:
> >
> >   > getOption("encoding")
> >   [1] "native.enc"
> >
> > so "native.enc" is used for the "encoding" argument of source().
> >
> > The following two calls succeed:
> >
> >   > source("encoding_export_issue.R", echo = TRUE, encoding = "unknown")
> >   > source("encoding_export_issue.R", echo = TRUE, encoding = "iso-8859-1")
> >
> > Is this file a valid "iso-8859-1" encoded file?  Why does source() fail
> > when encoding is set to "native.enc"? Is it because my locale is set to
> > UTF-8 (see the info on my system at the bottom of this email)?
> >
> > I'm guessing it would be a bad idea to put
> >
> >   options(encoding = "unknown")
> >
> > in my .Rprofile, because it is difficult to always correctly guess the
> > encoding of files? Is there a reason why setting it to "unknown" would
> > lead to more problems than leaving it set to "native.enc"?
> >
> > I've reproduced the above behavior on R-devel (r74677) and 3.4.3. Below
> > is my session info and locale info for my system with the 3.4.3 version:
> >
> >> sessionInfo()
> > R version 3.4.3 (2017-11-30)
> > Platform: x86_64-pc-linux-gnu (64-bit)
> > Running under: Ubuntu 16.04.3 LTS
> >
> > Matrix products: default
> > BLAS: /usr/lib/libblas/libblas.so.3.6.0
> > LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
> >
> > locale:
> >  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> >  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> &

[Rd] source(echo = TRUE) with a iso-8859-1 encoded file gives an error

2018-05-01 Thread Scott Kostyshak
I have very little knowledge about file encodings and would like to
learn more.

I've read the following pages to learn more:

  http://stat.ethz.ch/R-manual/R-devel/library/base/html/Encoding.html
  https://stackoverflow.com/questions/4806823/how-to-detect-the-right-encoding-for-read-csv
  https://developer.r-project.org/Encodings_and_R.html

The last one, in particular, has been very helpful. I would be
interested in any further references that you suggest.

I attach a file that reproduces the issue I would like to learn more
about. I do not know if the file encoding will be correctly preserved
through email, so I also provide the file (temporarily) on Dropbox here:

  https://www.dropbox.com/s/3lbgebk7b5uaia7/encoding_export_issue.R?dl=0

The file gives an error when using "source()" with the
argument echo = TRUE:

  > source("encoding_export_issue.R", echo = TRUE)
  Error in nchar(dep, "c") : invalid multibyte string, element 1
  In addition: Warning message:
  In grepl("^[[:blank:]]*$", dep[1L]) :
    input string 1 is invalid in this locale

The problem comes from the "á" character in the .R file. The file
appears to be encoded as "iso-8859-1":

  $ file --mime-encoding encoding_export_issue.R 
  encoding_export_issue.R: iso-8859-1

Note that for me:

  > getOption("encoding")
  [1] "native.enc"

so "native.enc" is used for the "encoding" argument of source().

The following two calls succeed:

  > source("encoding_export_issue.R", echo = TRUE, encoding = "unknown")
  > source("encoding_export_issue.R", echo = TRUE, encoding = "iso-8859-1")

Is this file a valid "iso-8859-1" encoded file?  Why does source() fail
when encoding is set to "native.enc"? Is it because my locale is set to
UTF-8 (see the info on my system at the bottom of this email)?

I'm guessing it would be a bad idea to put

  options(encoding = "unknown")

in my .Rprofile, because it is difficult to always correctly guess the
encoding of files? Is there a reason why setting it to "unknown" would
lead to more problems than leaving it set to "native.enc"?
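
A small sketch of why the UTF-8 locale is a plausible suspect. It is not a diagnosis of source() itself, and it assumes the only non-ASCII byte in the file is the á in the comment line, stored as the single byte 0xE1 as ISO-8859-1 would. The byte vector below is built by hand rather than read from the attached file.

```r
# Bytes of the comment line "# Chávez" as ISO-8859-1 would store it:
# the á is the single byte 0xE1.
latin1_bytes <- c(charToRaw("# Ch"), as.raw(0xe1), charToRaw("vez"))
raw_txt <- rawToChar(latin1_bytes)

# 0xE1 followed by an ASCII byte is not a valid UTF-8 sequence,
# so code that treats the text as UTF-8 has reason to complain.
validUTF8(raw_txt)                       # FALSE

# Declaring the source encoding and converting yields valid UTF-8.
utf8_txt <- iconv(raw_txt, from = "latin1", to = "UTF-8")
validUTF8(utf8_txt)                      # TRUE
utf8ToInt(utf8_txt)[5]                   # 225, the code point of á
```

This is consistent with the call that passes encoding = "iso-8859-1" succeeding, since there the bytes are converted before being treated as text in the session's encoding.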

I've reproduced the above behavior on R-devel (r74677) and 3.4.3. Below
is my session info and locale info for my system with the 3.4.3 version:

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.3

> Sys.getlocale()
[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C"

Thanks for your time,

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/

# Chávez
quantile_type <- 4



[Rd] Mention the case of logical(0) in ?stopifnot

2018-03-31 Thread Scott Kostyshak
I wonder if it would be helpful to mention in ?stopifnot that
stopifnot(logical(0)) does not give an error (for background on why this
is the case, see [1]). For example, ?all explicitly mentions the
following:

  That all(logical(0)) is true is a useful convention

and includes an example:

  all(logical(0))  # true, as all zero of the elements are true.

I think it would be nice to give examples in ?stopifnot of calls that
are not ideal uses of the function, such as the poorly written
stopifnot() call that I recently wrote:

  x <- 1:5
  # does not give an error
  stopifnot(ncol(x) == 2)
  # gives an error
  stopifnot(identical(ncol(x), 2L))

Or this code from [2]:

  li <- list()
  li$item <- 1
  # Does not give an error, because
  # "item" is misspelled and "NULL == 0" returns logical(0)
  stopifnot(li$tem == 0)

I think that a useful way to teach users how to use a function is to
teach them how not to use it.
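
The two examples above share one mechanism, which the following sketch spells out. The helper stop_unless_true() is a hypothetical name used only for illustration (it is not part of base R); it shows one stricter alternative built on isTRUE().

```r
x <- 1:5
cond <- ncol(x) == 2      # ncol() of a plain vector is NULL, so this is logical(0)
length(cond)              # 0
all(cond)                 # TRUE, which is why stopifnot(cond) passes silently
stopifnot(cond)           # no error

# Hypothetical stricter helper: passes only for a single, non-NA TRUE.
stop_unless_true <- function(cond) {
    if (!isTRUE(cond))
        stop(paste(deparse(substitute(cond)), collapse = " "), " is not TRUE")
    invisible(TRUE)
}

try(stop_unless_true(ncol(x) == 2))   # error: the zero-length result is caught
stop_unless_true(length(x) == 5L)     # passes
```

The identical() form shown earlier reaches the same result without a helper; the sketch only makes the zero-length case visible.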

Would a patch for the documentation along these lines be considered?

By the way, there are some regression tests in base R that rely on the
behavior of stopifnot(logical(0)), where the logical(0) results from
`==`. I can make a list of these tests if someone thinks it would be a
good idea to double-check them and possibly improve them (e.g., convert
them to use identical() instead of `==`). I'm guessing it's not worth
the time.

Scott


[1] https://stat.ethz.ch/pipermail/r-help/2015-December/434610.html
[2] https://stackoverflow.com/questions/33670060/how-to-have-stopifnot-return-an-error-when-called-on-a-missing-null-element


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/



Re: [Rd] Using response variable in interaction as explanatory variable in glm crashes R

2017-10-10 Thread Scott Kostyshak
On Mon, Oct 09, 2017 at 03:52:43PM +, Martin Maechler wrote:
> >>>>> Jan van der Laan <rh...@eoos.dds.nl>
> >>>>> on Fri, 6 Oct 2017 12:13:39 +0200 writes:
> 
> > It is actually model.matrix that crashes, not glm. Same
> > crash occurs with e.g. lm.
> 
> > model.matrix(dob_mon ~ dob_day*dob_mon, data = tab)
> 
> > also crashes R.
> 
> Yes, segmentation fault.
> 
> It only happens when these are *logical* variables, not, e.g., when
> transformed to integer.
> 
> The C code in src/library/stats/src/model.c  tries to eliminate
> occurrences of the LHS of the formula from the RHS when building
> the model matrix and it does work fine in the integer case.
> 
> Part of the culprit code may be this (from line 717),
> with the  isLogical(.) which in our case, shifts the pointer by
> 1  in the call to firstfactor() :
> 
>   int adj = isLogical(var_i)?1:0;
>   // avoid overflow of jstart * nn PR#15578
>   firstfactor([jstart * nn], n, jnext - jstart,
>   REAL(contrast), nrows(contrast),
>   ncols(contrast), INTEGER(var_i)+adj);
> 
> then in firstfactor(), we see the segfault (when running R with
> '-d gdb') :
> 
> > model.matrix(dob_mon ~ dob_day*dob_mon, data = tab)
> 
>   Program received signal SIGSEGV, Segmentation fault.
>   0x7fffeafa76b5 in firstfactor (ncx=0, v=0x5c3b37c, ncc=1, nrc=2, c=0x5c90008,
>       nrx=8, x=0x5cbf150) at ../../../../../R/src/library/stats/src/model.c:252
>   252             else xj[i] = cj[v[i]-1];
>   Missing separate debuginfos, .
>   (gdb) list
>   247         for (int j = 0; j < ncc; j++) {
>   248             xj = &x[j * (R_xlen_t)nrx];
>   249             cj = &c[j * (R_xlen_t)nrc];
>   250             for (int i = 0; i < nrx; i++)
>   251                 if(v[i] == NA_INTEGER) xj[i] = NA_REAL;
>   252                 else xj[i] = cj[v[i]-1];
>   253         }
>   254     }
>   255
> 
> and indeed in the debugger,  i=7  and  v[i] is "outside", v[]
> being of length 7, hence indexed 0:6.

Dear Martin,

I just wanted to thank you for providing details on your approach to
debugging. Often I see bug fixes and I wonder "how the heck did they
figure that out?" so I am very excited when I see details like these on
the process (and not just the end result), so that I can learn.

Best,

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/



[Rd] specifying name in the error message "promise already under evaluation"

2017-09-17 Thread Scott Kostyshak
Consider the following R code:

abc <- function(x, y = y) {
  x + y
}

abc(x = 3)

which gives the following error:

promise already under evaluation: recursive default argument
reference or earlier problems?

If you google that error, you will find that it usually refers to the
situation given in the example above, although I'm sure the error is
more general and could be triggered in other situations.

I'm trying to think about how to improve the error for the most common
situation that triggers it. One simple way would be to give the name of
the promise. For example, I think that the following would already be an
improvement:

promise "y" already under evaluation: recursive default argument
reference or earlier problems?
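
For reference, a minimal sketch that reproduces the error programmatically and shows one common way around it. The names abc2 and y_default are hypothetical, chosen only for this illustration.

```r
abc <- function(x, y = y) {
  x + y
}

# The default for 'y' refers to the argument 'y' itself, so forcing it
# re-enters its own promise and R signals an error.
res <- tryCatch(abc(x = 3), error = function(e) e)
inherits(res, "error")     # TRUE
conditionMessage(res)

# One way around it: give the argument a name different from the variable
# used in its default, so the lookup finds the outer 'y' instead.
y <- 10
abc2 <- function(x, y_default = y) {
  x + y_default
}
abc2(x = 3)                # 13
```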

Any thoughts?

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/



Re: [Rd] [patch] ?confint: "assumes asymptotic normality"

2017-07-20 Thread Scott Kostyshak
On Thu, Jul 20, 2017 at 04:21:04PM +0200, Martin Maechler wrote:
> >>>>> Scott Kostyshak <skostys...@ufl.edu>
> >>>>> on Thu, 20 Jul 2017 03:28:37 -0400 writes:
> 
> > From ?confint:
> > "Computes confidence intervals" and "The default method assumes
> > asymptotic normality"
> 
> > For me, a "confidence interval" implies an exact confidence interval in
> > formal statistics (I concede that when speaking, the term is often used
> > more loosely). And of course, even if a test statistic is asymptotically
> > normal (so the assumption is satisfied), the finite-sample distribution might
> > not be normal and thus an exact confidence interval would not be
> > computed.
> 
> > Attached is a patch that simply changes "asymptotic normality" to
> > "normality" in confint.Rd. This encourages the user of the function to
> > think about whether their asymptotically normal statistic is "normal
> > enough" in a finite sample to get something reliable from confint().
> 
> > Alternatively, we could instead change "Computes confidence intervals"
> > to "Computes asymptotic confidence intervals".
> 
> > I hope I'm not being too pedantic here.
> 
> well, it's just at the 97.5% border line of "too pedantic"  ...

:)

> ;-)
> 
> I think you are right with your first proposal to drop
> "asymptotic" here.  After all, there's the explicit 'fac <- qnorm(a)'.

Note that I received a private email that my message was indeed too
pedantic and expressed disagreement with the proposal. I'm not sure if
they intended it to be private so I will respond in private and see if
they feel like bringing the discussion on the list. Or perhaps this
minor (and perhaps controversial?) issue is not worth any additional
time.

> One could consider to make  'qnorm' an argument of the
> default method to allow more general distributional assumptions,
> but it may be wiser to have useRs write their own
> confint.() method, notably for cases where
> diag(vcov(object)) is an efficiency waste...

Thanks for your comments,

Scott

> Martin
> 
> 
> > Scott
> 
> 
> > -- 
> > Scott Kostyshak
> > Assistant Professor of Economics
> > University of Florida
> > https://people.clas.ufl.edu/skostyshak/
> 
> 
> > --
> > Index: src/library/stats/man/confint.Rd
> > ===
> > --- src/library/stats/man/confint.Rd	(revision 72930)
> > +++ src/library/stats/man/confint.Rd	(working copy)
> > @@ -31,7 +31,7 @@
> > }
> > \details{
> > \code{confint} is a generic function.  The default method assumes
> > -  asymptotic normality, and needs suitable \code{\link{coef}} and
> > +  normality, and needs suitable \code{\link{coef}} and
> > \code{\link{vcov}} methods to be available.  The default method can be
> > called directly for comparison with other methods.
>  
> 
> > --



[Rd] [patch] ?confint: "assumes asymptotic normality"

2017-07-20 Thread Scott Kostyshak
From ?confint:

"Computes confidence intervals" and "The default method assumes
asymptotic normality"

For me, a "confidence interval" implies an exact confidence interval in
formal statistics (I concede that when speaking, the term is often used
more loosely). And of course, even if a test statistic is asymptotically
normal (so the assumption is satisfied), the finite-sample distribution might
not be normal and thus an exact confidence interval would not be
computed.

Attached is a patch that simply changes "asymptotic normality" to
"normality" in confint.Rd. This encourages the user of the function to
think about whether their asymptotically normal statistic is "normal
enough" in a finite sample to get something reliable from confint().

Alternatively, we could instead change "Computes confidence intervals"
to "Computes asymptotic confidence intervals".
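
To make the distinction concrete, here is a sketch of the normal-quantile computation as I read the default method, checked against confint.default() on a model fitted to the built-in cars data. The variable names are my own, and the comparison with the t-based intervals from the lm method is included only to show that the two differ in a finite sample.

```r
fit <- lm(dist ~ speed, data = cars)   # 'cars' ships with R (datasets)

level <- 0.95
a <- (1 - level) / 2
a <- c(a, 1 - a)
fac <- qnorm(a)                        # normal quantiles

est <- coef(fit)
se <- sqrt(diag(vcov(fit)))
wald <- est + se %o% fac               # one row per coefficient: lower, upper

# Agrees with the default method when it is called directly
all.equal(unname(wald), unname(confint.default(fit)))

# confint() dispatches to the lm method, which uses t quantiles instead,
# so its intervals are wider for this sample size
t_width <- apply(confint(fit), 1, diff)
z_width <- apply(wald, 1, diff)
all(t_width > z_width)
```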

I hope I'm not being too pedantic here.

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/

Index: src/library/stats/man/confint.Rd
===
--- src/library/stats/man/confint.Rd(revision 72930)
+++ src/library/stats/man/confint.Rd(working copy)
@@ -31,7 +31,7 @@
 }
 \details{
   \code{confint} is a generic function.  The default method assumes
-  asymptotic normality, and needs suitable \code{\link{coef}} and
+  normality, and needs suitable \code{\link{coef}} and
   \code{\link{vcov}} methods to be available.  The default method can be
   called directly for comparison with other methods.
 

Re: [Rd] Patch for R-exts.texi

2017-07-08 Thread Scott Kostyshak
On Sat, Jul 08, 2017 at 06:18:25PM +0200, Martin Maechler wrote:
> >>>>> Scott Kostyshak <skostys...@ufl.edu>
> >>>>> on Mon, 3 Jul 2017 02:09:47 -0400 writes:
> 
> > Attached is a patch for R-exts.texi against r72880.  Here
> > are some of the changes I made:
> 
> > - Fix a broken link:
> >   https://developer.apple.com/library/mac/#documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/Introduction/Introduction.html
> >   ->
> >   https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/index.html
> 
> > - Changed a few http to https (and checked that the
> > connections are indeed secure, as judged by Chromium and
> > Firefox).
> 
> > - A couple of grammar fixes and "sounds more natural to
> > me" changes.
> 
> > - "x84_64" -> x86_64
> 
> > - One change of "which" -> "that"
> 
> > - The link to Luke's uiowa.edu page involves two changes,
> > removing the duplicate URL and changing the protocol to
> > https.
> 
> > Thanks for your time,
> > Scott
> 
> > -- 
> > Scott Kostyshak Assistant Professor of Economics
> > University of Florida
> > https://people.clas.ufl.edu/skostyshak/
> 
> > [DELETED ATTACHMENT external: R-exts.texi.diff, plain
> > text]
> 
> Thank you very much, Scott!
> 
> This is a clear improvement
>  ((even though some of the style changes may be debatable - but only by native
>English/American (;-) speakers, not me. ...))
> 
> Hence I've committed it (R-devel, svn rev 72900).

Thanks for putting it in, Martin! I do my best to not impose American
English, but sometimes I just don't realize. I actually have adopted
several non-American rules because I find them more logical. For
example, I like to put punctuation outside of quotes, such as "this is a
quote", where for some reason in American English it is preferred to put
it as "this is a quote."

Thanks for taking the time to review the patch and commit it.

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/



[Rd] Patch for R-exts.texi

2017-07-03 Thread Scott Kostyshak
Attached is a patch for R-exts.texi against r72880.

Here are some of the changes I made:

- Fix a broken link:

https://developer.apple.com/library/mac/#documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/Introduction/Introduction.html
->

https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/index.html

- Changed a few http to https (and checked that the connections are
  indeed secure, as judged by Chromium and Firefox).

- A couple of grammar fixes and "sounds more natural to me" changes.

- "x84_64" -> x86_64

- One change of "which" -> "that"

- The link to Luke's uiowa.edu page involves two changes, removing the
  duplicate URL and changing the protocol to https.

Thanks for your time,

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/

Index: doc/manual/R-exts.texi
===
--- doc/manual/R-exts.texi  (revision 72880)
+++ doc/manual/R-exts.texi  (working copy)
@@ -1457,7 +1457,7 @@
 
 @noindent
 then download the sources from
-@uref{http://sourceforge.net/@/projects/@/tcllib/@/files/@/BWidget/} and
+@uref{https://sourceforge.net/@/projects/@/tcllib/@/files/@/BWidget/} and
 at the command line run something like
 
 @example
@@ -1494,7 +1494,7 @@
 
 @noindent
 and not a version starting
-@samp{http://cran.r-project.org/web/packages/@var{pkgname}}.
+@samp{https://cran.r-project.org/web/packages/@var{pkgname}}.
 
 @node Configure and cleanup, Checking and building packages, Package structure, Creating R packages
 @section Configure and cleanup
@@ -2117,7 +2117,7 @@
 word, so computations done on OpenMP threads will not make use of
 extended-precision arithmetic which is the default for the main process.
 @c mingw64-public, 2015-02-02.
-@c http://stackoverflow.com/questions/2553725/is-the-fpu-control-word-setting-per-thread-or-per-process
+@c https://stackoverflow.com/questions/2553725/is-the-fpu-control-word-setting-per-thread-or-per-process
 
 Calling any of the @R{} API from threaded code is `for experts only':
 they will need to read the source code to determine if it is
@@ -7645,7 +7645,7 @@
 which is a GUI version), @command{Shark} (in version of @code{Xcode}
 up to those for Snow Leopard), and @command{Instruments} (part of
 @code{Xcode}, see
-@uref{https://developer.apple.com/library/mac/#documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/Introduction/Introduction.html}).
+@uref{https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/index.html}).
 
 
 @node Debugging, System and foreign language interfaces, Tidying and profiling R code, Top
@@ -8295,8 +8295,8 @@
 to be installed separately, and for checking C++ you may also need
 @pkg{libubsan}.} of @command{gcc} and @command{clang} on common Linux
 and macOS platforms.  See
-@uref{http://clang.llvm.org/@/docs/@/UsersManual.html#controlling-code-generation},
-@uref{http://clang.llvm.org/@/docs/@/AddressSanitizer.html} and
+@uref{https://clang.llvm.org/@/docs/@/UsersManual.html#controlling-code-generation},
+@uref{https://clang.llvm.org/@/docs/@/AddressSanitizer.html} and
 @uref{https://code.google.com/@/p/@/address-sanitizer/}.
 
 More thorough checks of C++ code are done if the C++ library has been
@@ -8455,7 +8455,7 @@
 
 Finer control of what is checked can be achieved by other options: for
 @command{clang} see
-@uref{http://clang.llvm.org/@/docs/@/UsersManual.html#controlling-code-generation}.@footnote{or
+@uref{https://clang.llvm.org/@/docs/@/UsersManual.html#controlling-code-generation}.@footnote{or
 the user manual for your version of @command{clang}, e.g.@: (the paths
 have differed for some versions)
 @uref{http://llvm.org/@/releases/@/4.0.0/@/tools/@/clang/@/docs/@/UsersManual.html}.}
@@ -8560,13 +8560,13 @@
 Recent versions of @command{clang} on @cputype{x86_64} Linux have
 `ThreadSanitizer' (@uref{https://code.google.com/@/p/@/thread-sanitizer/}),
 a `data race detector for C/C++ programs', and `MemorySanitizer'
-(@uref{http://clang.llvm.org/@/docs/@/MemorySanitizer.html},
+(@uref{https://clang.llvm.org/@/docs/@/MemorySanitizer.html},
 @uref{https://code.google.com/@/p/@/memory-sanitizer/@/wiki/@/MemorySanitizer})
 for the detection of uninitialized memory.  Both are based on and
 provide similar functionality to tools in @command{valgrind}.
 
 @command{clang} has a `Static Analyser' which can be run on the source
-files during compilation: see @uref{http://clang-analyzer.llvm.org/}.
+files during compilation: see @uref{https://clang-analyzer.llvm.org/}.
 
 @node Using `Dr. Memory', Fortran array bounds checking, Other analyses with `clang', Checking memory access
 @subsection Using `Dr. Memory'
@@ -9429,7 +9429,7 @@
 @uref{https://www.r-project.org/@/doc/@/Rnews/Rnews_2001-3.pdf}).
 
 Once routines are re

Re: [Rd] Cursor not behaving properly

2014-11-19 Thread Scott Kostyshak
On Tue, Nov 18, 2014 at 9:50 PM, Scott Kostyshak <skost...@princeton.edu> wrote:
> On Mon, Nov 10, 2014 at 10:52 AM, Kaiyin Zhong (Victor Chung)
> <kindlych...@gmail.com> wrote:
>> I found a strange bug in R recently (version 3.1.2):
>>
>> As you can see from the screenshots attached, when the cursor passes the
>> right edge of the console, instead of start on a new line, it goes back to
>> the beginning of the same line, and overwrites everything after it.
>>
>> This happens every time the size of the terminal is changed, for example,
>> if you fit the terminal to the right half of the screen, start an R
>> session, exec some commands, maximize the terminal, and type a long command
>> into the session, then you will find the bug reproduced.
>>
>> I am on Ubuntu 14.04, and I have tested this in konsole, guake and
>> gnome-terminal.
>
> I can reproduce this, also on Ubuntu 14.04, with gnome-terminal and
> xterm. If you don't get any response here, please file a bug report at
> bugs.r-project.org.

For archival purposes, the OP reported the bug here:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16077

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Cursor not behaving properly

2014-11-18 Thread Scott Kostyshak
On Mon, Nov 10, 2014 at 10:52 AM, Kaiyin Zhong (Victor Chung)
<kindlych...@gmail.com> wrote:
> I found a strange bug in R recently (version 3.1.2):
>
> As you can see from the screenshots attached, when the cursor passes the
> right edge of the console, instead of start on a new line, it goes back to
> the beginning of the same line, and overwrites everything after it.
>
> This happens every time the size of the terminal is changed, for example,
> if you fit the terminal to the right half of the screen, start an R
> session, exec some commands, maximize the terminal, and type a long command
> into the session, then you will find the bug reproduced.
>
> I am on Ubuntu 14.04, and I have tested this in konsole, guake and
> gnome-terminal.

I can reproduce this, also on Ubuntu 14.04, with gnome-terminal and
xterm. If you don't get any response here, please file a bug report at
bugs.r-project.org.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Turn warnings or notes into errors on CMD check ?

2014-10-11 Thread Scott Kostyshak
Hi,

I am using a local patch to have CMD check exit with error if there is
a note or warning. Am I missing an already existing way to do this?

If not, is there any interest in having an option or environment
variable for this upstream? I would be interested in making a patch.
If so, should it be an option or an environment variable? Any
suggestions for the name? Should this be two options, or one option
where 1 means turn only warnings into errors and 2 means turn both
warnings and notes into errors?
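
To make the proposal concrete, here is a rough sketch of the behavior I
have in mind, written as a post-processing step in R rather than as a
change to CMD check itself. It assumes the check has already been run
and that problem lines in 00check.log end in "... WARNING" or
"... NOTE". The function name check_log_status, its level argument and
the log path are made up for illustration.

```r
# Hypothetical helper (not part of R): scan a 00check.log and decide on
# an exit status.
# level = 1: fail on warnings only; level = 2: fail on warnings and notes.
check_log_status <- function(log_file, level = 1L) {
    lines <- readLines(log_file, warn = FALSE)
    n_warn <- sum(grepl("\\.\\.\\. WARNING$", lines))
    n_note <- sum(grepl("\\.\\.\\. NOTE$", lines))
    failed <- n_warn > 0L || (level >= 2L && n_note > 0L)
    if (failed) 1L else 0L
}

# Usage in a non-interactive script (the path is an assumption):
# q(status = check_log_status("mypkg.Rcheck/00check.log", level = 2L))
```

A single integer level mirrors the "one option with 1 and 2" variant
above; two separate logical options would work just as well.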

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [patch] Rscript off-by-one error in output

2014-09-20 Thread Scott Kostyshak
On Wed, Jul 9, 2014 at 7:26 PM, Scott Kostyshak <skost...@princeton.edu> wrote:
> Rscript eats up the last argument when reporting the command it runs:
>
> $ Rscript --verbose /tmp/test.R one two three
> running
>   '/usr/local/lib/R-devel/lib/R/bin/R --slave --no-restore
> --file=/tmp/test.R --args one two'
>
> With the patch below, I get the following:
>
> $ Rscript --verbose /tmp/test.R one two three
> running
>   '/usr/local/lib/R-devel/lib/R/bin/R --slave --no-restore
> --file=/tmp/test.R --args one two three'
>
>
> Index: src/unix/Rscript.c
> ===
> --- src/unix/Rscript.c  (revision 66100)
> +++ src/unix/Rscript.c  (working copy)
> @@ -249,7 +249,7 @@
>  #endif
>      if(verbose) {
>         fprintf(stderr, "running\n  '%s", cmd);
> -       for(i = 1; i < ac-1; i++) fprintf(stderr, " %s", av[i]);
> +       for(i = 1; i < ac; i++) fprintf(stderr, " %s", av[i]);
>         fprintf(stderr, "'\n\n");
>      }
>  #ifndef _WIN32
>
>
> Scott
>
>
> > sessionInfo()
> R Under development (unstable) (2014-07-08 r66100)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
>
> --
> Scott Kostyshak
> Economics PhD Candidate
> Princeton University

For archival purposes, this was fixed at r66644.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] last user argument missing from Rscript --verbose

2014-09-19 Thread Scott Kostyshak
On Fri, Sep 19, 2014 at 8:12 AM, Martin Maechler
maech...@stat.math.ethz.ch wrote:
 Harris A Jaffee h...@jhu.edu
 on Thu, 18 Sep 2014 19:32:29 +0200 writes:

 (using  HTML, please don't )

 The loop that echoes the arguments almost always stops too soon.  It
 apparently does that to avoid
 echoing the --args (that had been inserted) when there are no user
 arguments.  However, when there
 are user arguments, the next element of the 'av' array is the last
 argument and usually not --args,
 although it can be.
 ?Rscript is a little sketchy:
  `--verbose' gives details of what `Rscript' is doing.  Also passed
   on to R.
 What is passed to R is correct, but the diagnostic is not:
  $ Rscript --verbose /dev/null 1 2
   running
   '/path_to_R --slave --no-restore --file=/dev/null --args 1'
 Fixed (only tested on Mac):
  $ Rscript --verbose /dev/null 1 2
   running
   '/Library/Frameworks/R.framework/Versions/3.1/Resources/bin/R --slave
 --no-restore --file=/dev/null --args 1 2'

 You are right about the problem, also reproducible on Linux.
 You mention a 'fix'.
 It looks to me that is just

 -   for(i = 1; i < ac-1; i++) fprintf(stderr, " %s", av[i]);
 +   for(i = 1; i < ac; i++) fprintf(stderr, " %s", av[i]);

 in unix/Rscript.c, right ?

Yes, I suggested the same patch here:
http://r.789695.n4.nabble.com/patch-Rscript-off-by-one-error-in-output-td4693780.html

Scott

 BTW: If one use  -e 'commandArgs()'  instead of   /dev/null one
 sees that Rscript's lying about the last argument is not
 helpful anyway :

   Rscript --verbose -e 'commandArgs()'

   running
 '/usr/local64.sfs/app/R/R-3.1.1-inst/bin/R --slave --no-restore -e 
 commandArgs()'

   [1] "/usr/local64.sfs/app/R/R-3.1.1-inst/bin/exec/R"
   [2] "--slave"
   [3] "--no-restore"
   [4] "-e"
   [5] "commandArgs()"
   [6] "--args"

 because the '--args' appears anyway and indeed *is* passed to 'R'...

 A better fix would rather suppress that; but I will commit the
 above change.


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Looking for new maintainer of orphans R2HTML SemiPar cghseg hexbin lgtdl monreg muhaz operators pamr

2014-09-07 Thread Scott Kostyshak
On Sun, Sep 7, 2014 at 7:03 PM, Uwe Ligges
lig...@statistik.tu-dortmund.de wrote:


 On 08.09.2014 01:01, Gregory R. Warnes wrote:

 And I’ll pick up hexbin.


 Err, that one has been adopted a month ago already.

 open are:

 SemiPar cghseg monreg

I will take monreg. Coincidentally my recent research is related.

Best,

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University



 Best,
 Uwe Ligges




 -Greg

 On Sep 7, 2014, at 12:17 PM, Romain Francois rom...@r-enthusiasts.com
 wrote:

 I'll pick up operators.

 Le 7 sept. 2014 à 18:03, Uwe Ligges lig...@statistik.tu-dortmund.de a
 écrit :



 On 05.09.2014 20:25, Greg Snow wrote:

 Uwe,

 Have all of these packages found new maintainers? if not, which ones
 are still looking to be adopted?


 Thanks for asking, the ones still looking to be adopted are:
 SemiPar cghseg monreg operators

 Best,
 Uwe Ligges



 thanks,

 On Fri, Aug 8, 2014 at 10:41 AM, Uwe Ligges uwe.lig...@r-project.org
 wrote:

 Dear maintainers and R-devel,

 Several orphaned CRAN packages are about to be archived due to
 outstanding
 QC problems, but have CRAN and BioC packages depending on them which
 would
 be broken by the archival (and hence need archiving alongside).
 Therefore we are looking for new maintainers taking over
 maintainership for
 one or more of the following packages:

 R2HTML SemiPar cghseg hexbin lgtdl monreg muhaz operators pamr

 Package maintainers whose packages depend on one of these may be
 natural
 candidates to become new maintainers.
 Hence this messages is addressed to all these maintainers via BCC and
 to
 R-devel.

 See

   http://CRAN.R-project.org/package=R2HTML
   http://CRAN.R-project.org/package=SemiPar
   http://CRAN.R-project.org/package=cghseg
   http://CRAN.R-project.org/package=hexbin
   http://CRAN.R-project.org/package=lgtdl
   http://CRAN.R-project.org/package=monreg
   http://CRAN.R-project.org/package=muhaz
   http://CRAN.R-project.org/package=operators
   http://CRAN.R-project.org/package=pamr

 for information on the QC issues and the reverse dependencies.

 Best wishes,
 Uwe Ligges
 (for the CRAN team)

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [patch] Add support for editor function in edit.default

2014-08-23 Thread Scott Kostyshak
On Tue, May 20, 2014 at 5:55 AM, Scott Kostyshak skost...@princeton.edu wrote:
 Regarding the following extract of ?options:
  ‘editor’: a non-empty string, or a function that is called with a
   file path as argument.

 edit.default currently calls the function with three arguments: name,
 file, and title. For example, running the following

To be clear with what I view as problematic, note in the above that
the documentation says the function is called with a file path as an
argument, suggesting one argument; but in practice it is called with
three arguments.

 vimCmd <- 'vim -c "set ft=r"'
 vimEdit <- function(file_) system(paste(vimCmd, file_))
 options(editor = vimEdit)
 myls <- edit(ls)

 gives Error in editor(name, file, title) : unused arguments (file, title).

 The attached patch changes edit.default to call the editor function
 with just the file path. There is at least one inconsistent behavior
 that this patch causes in its current form. It does not obey the
 following (from ?edit):
  Calling ‘edit()’, with no arguments, will result in the temporary
 file being reopened for further editing.

 I see two ways to address this: (1) add a getEdFile() function to
 utils/edit.R that calls a function getEd() defined in edit.c that
 returns DefaultFileName; or (2) this patch could be rewritten in C in
 a new function in edit.c.

 Is there any interest in this patch?
 If not, would there be interest in an update of the docs, either
 ?options (stating the possibility that if 'editor' is a function, it
 might be called with 'name', 'file', and 'title' arguments) or ?edit
  ?

Any interest in this patch? If not, would a patch for the
documentation be considered?

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] [patch] Rscript off-by-one error in output

2014-07-09 Thread Scott Kostyshak
Rscript eats up the last argument when reporting the command it runs:

$ Rscript --verbose /tmp/test.R one two three
running
  '/usr/local/lib/R-devel/lib/R/bin/R --slave --no-restore
--file=/tmp/test.R --args one two'

With the patch below, I get the following:

$ Rscript --verbose /tmp/test.R one two three
running
  '/usr/local/lib/R-devel/lib/R/bin/R --slave --no-restore
--file=/tmp/test.R --args one two three'


Index: src/unix/Rscript.c
===
--- src/unix/Rscript.c  (revision 66100)
+++ src/unix/Rscript.c  (working copy)
@@ -249,7 +249,7 @@
 #endif
     if(verbose) {
        fprintf(stderr, "running\n  '%s", cmd);
-       for(i = 1; i < ac-1; i++) fprintf(stderr, " %s", av[i]);
+       for(i = 1; i < ac; i++) fprintf(stderr, " %s", av[i]);
        fprintf(stderr, "'\n\n");
     }
 #ifndef _WIN32
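
For anyone who wants to see the off-by-one without rebuilding R, the
two loop bounds can be mimicked in R. This is only an illustration: the
vector av below is a stand-in I made up for the argument vector that
Rscript assembles, with its first element playing the role of cmd.

```r
# av stands in for the argument vector that Rscript assembles; the first
# element plays the role of cmd (assumed layout, for illustration only).
av <- c("R", "--slave", "--no-restore", "--file=/tmp/test.R",
        "--args", "one", "two", "three")
ac <- length(av)

# The C loop `for(i = 1; i < ac-1; i++)` visits 0-based av[1] .. av[ac-2],
# which are elements 2 .. (ac - 1) in R's 1-based indexing.
echo_old <- paste(av[2:(ac - 1)], collapse = " ")

# The C loop `for(i = 1; i < ac; i++)` visits 0-based av[1] .. av[ac-1],
# which are elements 2 .. ac in R.
echo_new <- paste(av[2:ac], collapse = " ")

cat("old:", echo_old, "\n")
cat("new:", echo_new, "\n")
```

The old bound drops the final element, which is exactly the missing
last user argument in the --verbose output.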


Scott


> sessionInfo()
R Under development (unstable) (2014-07-08 r66100)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] [patch] Fix n arg in mclapply call to ngettext

2014-06-29 Thread Scott Kostyshak
Regarding the following code,

warning(sprintf(ngettext(has.errors,
  "scheduled core %s encountered error in user code, all values of the job will be affected",
  "scheduled cores %s encountered errors in user code, all values of the jobs will be affected"),
                paste(has.errors, collapse = ", ")),
  domain = NA)

has.errors is a vector whose elements are the cores that have encountered
errors. The plural message thus appears if the first element of has.errors is
greater than one and is singular otherwise. What we want is for the plural
message to be given if more than one core encountered errors. Changing the n
arg of ngettext from has.errors to length(has.errors) leads to the correct
messages.
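
A stripped-down illustration of the intended behavior, outside of
mclapply. The helper core_label and the shortened "core"/"cores"
strings are placeholders of mine, and the sketch assumes no translation
catalog is in play for these strings.

```r
# Choose the singular or plural label from the NUMBER of failing cores,
# not from the core numbers themselves.
core_label <- function(has.errors) {
    ngettext(length(has.errors), "core", "cores")
}

core_label(4L)          # one failing core, whatever its number
core_label(c(1L, 2L))   # two failing cores
```

Passing length(has.errors) makes the choice independent of which core
numbers happen to be in the vector.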

Attached is a patch.

More details for completeness:

I've reproduced this on 3.1.0 and r66050.

Below is an example that sometimes leads to bad output (depending on
the order in which the cores finish).
library(parallel)
options(mc.cores = 4)
abc <- mclapply(2:5, FUN = function(x) stopifnot(x >= 4))
# Warning message:
# In mclapply(2:5, FUN = function(x) { :
#   scheduled core 1, 2 encountered error in user code, all values of the job will be affected

# If a core with number greater than 1 has the only error, then an
# incorrect message is shown:
library(parallel)
options(mc.cores = 4)
abc <- mclapply(2:5, FUN = function(x) stopifnot(x <= 4))
# Warning message:
# In mclapply(2:5, FUN = function(x) { :
#  scheduled cores 4 encountered errors in user code, all values of the jobs will be affected

> sessionInfo()
R Under development (unstable) (2014-06-29 r66050)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University
Index: src/library/parallel/R/unix/mclapply.R
===
--- src/library/parallel/R/unix/mclapply.R  (revision 66050)
+++ src/library/parallel/R/unix/mclapply.R  (working copy)
@@ -172,7 +172,7 @@
         if (length(has.errors) == cores)
             warning("all scheduled cores encountered errors in user code")
         else
-            warning(sprintf(ngettext(has.errors,
+            warning(sprintf(ngettext(length(has.errors),
                                      "scheduled core %s encountered error in user code, all values of the job will be affected",
                                      "scheduled cores %s encountered errors in user code, all values of the jobs will be affected"),
                             paste(has.errors, collapse = ", ")),
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] r65998 build error. share/Rd/macros/*: No such file or directory

2014-06-22 Thread Scott Kostyshak
As of r65998 I'm getting
/usr/bin/install: cannot stat
‘/home/scott/rbuilds/r-devel/repo/share/Rd/macros/*’: No such file or
directory

Commenting out the newly added

@for f in $(srcdir)/Rd/macros/*; do \
  $(INSTALL_DATA) $${f} $(DESTDIR)$(rsharedir)/Rd/macros; \
done

in share/Makefile.in
fixes compilation for me.

I'm on Ubuntu 13.10. My configure output is here:
https://www.dropbox.com/s/srwa1mbzesvvq5v/configure
my make output is here:
https://www.dropbox.com/s/q7ylkw00re7riaf/make
and my config.log is here:
https://www.dropbox.com/s/0w09zhds9q6253n/config.log

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] r65998 build error. share/Rd/macros/*: No such file or directory

2014-06-22 Thread Scott Kostyshak
On Sun, Jun 22, 2014 at 11:16 AM, Duncan Murdoch
<murdoch.dun...@gmail.com> wrote:
> On 22/06/2014, 5:07 PM, Scott Kostyshak wrote:
>> As of r65998 I'm getting
>> /usr/bin/install: cannot stat
>> ‘/home/scott/rbuilds/r-devel/repo/share/Rd/macros/*’: No such file or
>> directory
>>
>> Commenting out the newly added
>>
>> @for f in $(srcdir)/Rd/macros/*; do \
>>   $(INSTALL_DATA) $${f} $(DESTDIR)$(rsharedir)/Rd/macros; \
>> done
>>
>> in share/Makefile.in
>> fixes compilation for me.
>>
>> I'm on Ubuntu 13.10. My configure output is here:
>> https://www.dropbox.com/s/srwa1mbzesvvq5v/configure
>> my make output is here:
>> https://www.dropbox.com/s/q7ylkw00re7riaf/make
>> and my config.log is here:
>> https://www.dropbox.com/s/0w09zhds9q6253n/config.log
>
> Just a missed commit, now fixed.

Thanks Duncan,

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Encourage exit with nonzero error status in ?last.dump

2014-06-14 Thread Scott Kostyshak
On Fri, Jun 13, 2014 at 5:32 AM, Martin Maechler
maech...@stat.math.ethz.ch wrote:
 Scott Kostyshak skost...@princeton.edu
 on Fri, 13 Jun 2014 02:04:36 -0400 writes:

  The following example in ?dump.frames options(error =
  quote({dump.frames(to.file = TRUE); q()}))

  is useful for teaching the user how to save a frame dump
  when R encounters an error during non-interactive
  sessions. This command however causes an additional change
  that on encountering an error R exits with a 0 error
  status. Although it's just an example, it's an important
  one as it's referenced in the 'Details' section of the
  help file. I think it would be better to encourage exiting
  with a nonzero error status:

  options(error = quote({dump.frames(to.file = TRUE); q(status = 1)}))

 You are right.
 Thank you for the suggestion: it will be in next
 release.

 Martin Maechler,
 ETH Zurich

Thanks, Martin.


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Encourage exit with nonzero error status in ?last.dump

2014-06-13 Thread Scott Kostyshak
The following example in ?dump.frames

options(error = quote({dump.frames(to.file = TRUE); q()}))

is useful for teaching the user how to save a frame dump when R
encounters an error during non-interactive sessions. This command
however causes an additional change that on encountering an error R
exits with a 0 error status. Although it's just an example, it's an
important one as it's referenced in the 'Details' section of the help
file. I think it would be better to encourage exiting with a nonzero
error status:

options(error = quote({dump.frames(to.file = TRUE); q(status = 1)}))
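
For completeness, a small sketch of the full round trip as I understand
it from ?dump.frames: the option is set in the batch script, and the
dump is inspected later in an interactive session. The file name
last.dump.rda follows from the default dump name; the interactive part
is left commented out because it only makes sense at a prompt.

```r
# In the script that is run non-interactively (e.g. via Rscript):
# on error, save the frames to last.dump.rda and exit with status 1.
options(error = quote({dump.frames(to.file = TRUE); q(status = 1)}))

# Later, in an interactive session started in the same directory:
# load("last.dump.rda")
# debugger(last.dump)
```

With the nonzero status, a calling shell script or Makefile can tell
that the R run failed.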

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] [patch] Add support for editor function in edit.default

2014-05-20 Thread Scott Kostyshak
Regarding the following extract of ?options:
 ‘editor’: a non-empty string, or a function that is called with a
  file path as argument.

edit.default currently calls the function with three arguments: name,
file, and title. For example, running the following

vimCmd <- 'vim -c "set ft=r"'
vimEdit <- function(file_) system(paste(vimCmd, file_))
options(editor = vimEdit)
myls <- edit(ls)

gives Error in editor(name, file, title) : unused arguments (file, title).

The attached patch changes edit.default to call the editor function
with just the file path. There is at least one inconsistent behavior
that this patch causes in its current form. It does not obey the
following (from ?edit):
 Calling ‘edit()’, with no arguments, will result in the temporary
file being reopened for further editing.

I see two ways to address this: (1) add a getEdFile() function to
utils/edit.R that calls a function getEd() defined in edit.c that
returns DefaultFileName; or (2) this patch could be rewritten in C in
a new function in edit.c.

Is there any interest in this patch?
If not, would there be interest in an update of the docs, either
?options (stating the possibility that if 'editor' is a function, it
might be called with 'name', 'file', and 'title' arguments) or ?edit
 ?

Scott


> sessionInfo()
R Under development (unstable) (2014-05-20 r65677)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


--
Scott Kostyshak
Economics PhD Candidate
Princeton University
Index: src/library/utils/R/edit.R
===
--- src/library/utils/R/edit.R	(revision 65677)
+++ src/library/utils/R/edit.R	(working copy)
@@ -53,7 +53,13 @@
                   editor = getOption("editor"), ...)
 {
     if (is.null(title)) title <- deparse(substitute(name))
-    if (is.function(editor)) invisible(editor(name, file, title))
+    if (is.function(editor)) {
+        if (file == "") file <- tempfile()
+        objDep <- if (is.null(name)) "" else deparse(name)
+        writeLines(objDep, con = file)
+        editor(file)
+        eval(parse(file))
+    }
     else .External2(C_edit, name, file, title, editor)
 }
 
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Name partially matched in data frame

2014-04-30 Thread Scott Kostyshak
On Wed, Apr 30, 2014 at 3:33 PM, Scott Kostyshak skost...@princeton.edu wrote:
 Hi Dennis,

 On Wed, Apr 30, 2014 at 3:03 PM, Fisher Dennis fis...@plessthan.com wrote:
 R 3.1.0
 OS X

 Colleagues,

 I recently updated to 3.1.0 and I have encountered
 Warning messages: ...  Name partially matched in data frame
 when I do something like:
 DATAFRAME$colname
 where colname is actually something longer than that (but unambiguous).

 I have much appreciated the partial matching capabilities because it fits 
 with my workflow.  I often receive updated data months after the initial 
 code is written.  In order to keep track of what I did in the past, I 
 provide lengthy (unambiguous) names for columns, then abbreviate the names 
 as I call them.  This behavior has been termed “lazy” in various 
 correspondence on this mailing list but it works for me and probably works 
 for others.

 Why not store that information elsewhere? e.g. in an attribute?

 I realize that the new message is only a warning but it is a minor nuisance. 
  Would it be possible to add an
 option(partialMatch=TRUE)   ## default is FALSE
 or something similar to suppress that behavior?  That should keep both camps 
 happy.

 There is currently no option to control that behavior and (although I
 do understand your use case) I personally hope one is not implemented.
 The reason is that you might put that option in your .Rprofile and
 when you share your code with me I get errors that columns aren't
 found.

Let me change this to I would get warnings, which would make me worried.

 You can of course redefine the `$`:

 > dataf <- data.frame(longColumn = 5)
 > dataf$long
 [1] 5
 Warning message:
 In `$.data.frame`(dataf, long) : Name partially matched in data frame

 > `$.data.frame` <-
 + function (x, name)
 + {
 +     a <- x[[name]]
 +     if (!is.null(a))
 +         return(a)
 +     a <- x[[name, exact = FALSE]]
 +     return(a)
 + }

 > dataf$long
 [1] 5


 I hope you don't do that though.

 Another option is to use the more verbose dataf[["long", exact = FALSE]].

 Scott


 --
 Scott Kostyshak
 Economics PhD Candidate
 Princeton University

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Name partially matched in data frame

2014-04-30 Thread Scott Kostyshak
Hi Dennis,

On Wed, Apr 30, 2014 at 3:03 PM, Fisher Dennis fis...@plessthan.com wrote:
 R 3.1.0
 OS X

 Colleagues,

 I recently updated to 3.1.0 and I have encountered
 Warning messages: ...  Name partially matched in data frame
 when I do something like:
 DATAFRAME$colname
 where colname is actually something longer than that (but unambiguous).

 I have much appreciated the partial matching capabilities because it fits 
 with my workflow.  I often receive updated data months after the initial code 
 is written.  In order to keep track of what I did in the past, I provide 
 lengthy (unambiguous) names for columns, then abbreviate the names as I call 
 them.  This behavior has been termed “lazy” in various correspondence on this 
 mailing list but it works for me and probably works for others.

Why not store that information elsewhere? e.g. in an attribute?

 I realize that the new message is only a warning but it is a minor nuisance.  
 Would it be possible to add an
 option(partialMatch=TRUE)   ## default is FALSE
 or something similar to suppress that behavior?  That should keep both camps 
 happy.

There is currently no option to control that behavior and (although I
do understand your use case) I personally hope one is not implemented.
The reason is that you might put that option in your .Rprofile and
when you share your code with me I get errors that columns aren't
found.

You can of course redefine the `$`:

> dataf <- data.frame(longColumn = 5)
> dataf$long
[1] 5
Warning message:
In `$.data.frame`(dataf, long) : Name partially matched in data frame

> `$.data.frame` <-
+ function (x, name)
+ {
+     a <- x[[name]]
+     if (!is.null(a))
+         return(a)
+     a <- x[[name, exact = FALSE]]
+     return(a)
+ }

> dataf$long
[1] 5


I hope you don't do that though.

Another option is to use the more verbose dataf[["long", exact = FALSE]].
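
If you want to keep abbreviating without touching `$`, something along
these lines might fit your workflow. The helper full_name is a name I
made up for this sketch; it resolves an abbreviation once with pmatch()
and then uses exact matching.

```r
dataf <- data.frame(longColumn = 5)

# Explicit partial matching via [[ ]]; no change to `$` is needed.
dataf[["long", exact = FALSE]]

# Hypothetical helper: resolve an abbreviation to the full column name
# once, then use exact matching everywhere else.
full_name <- function(df, abbrev) {
    idx <- pmatch(abbrev, names(df))
    if (is.na(idx)) stop("no unique column matches '", abbrev, "'")
    names(df)[idx]
}

dataf[[full_name(dataf, "long")]]
```

The helper stops when the abbreviation is ambiguous or matches nothing,
so a later change in the column names does not go unnoticed.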

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] duplication regression (?)

2014-04-14 Thread Scott Kostyshak
Below is an example of output that changed as a result of r64970. I
did not see any NEWS item suggesting this change is expected.

Note that the example is contrived and I don't have a use case for it.
I stumbled across it when playing with recent changes in R relating to
duplication. Does the example use undefined syntax?

-
fn1 <- function(mylist) {
    fn1a <- function() mylist[[c(1,1)]][[1]] <<- 9
    fn1a()
    return(NULL)
}

fn2 <- function(myarg) fn1(myarg)

test_list <- list(list(list(1)))
print(test_list[[c(1,1,1)]])
fn2(test_list)
print(test_list[[c(1,1,1)]])
-

Before r64970 the output is
[1] 1
[1] 1

After r64970 the output is
[1] 1
[1] 9

> sessionInfo()
R Under development (unstable) (2014-04-10 r65396)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University



Re: [Rd] [PATCH] suggestions for R-lang manual

2014-03-04 Thread Scott Kostyshak
On Mon, Mar 3, 2014 at 7:48 AM, Martin Maechler
maech...@stat.math.ethz.ch wrote:
 Scott Kostyshak skost...@princeton.edu
 on Thu, 27 Feb 2014 16:43:02 -0500 writes:

  On Thu, Nov 21, 2013 at 1:17 AM, Scott Kostyshak 
 skost...@princeton.edu wrote:
  Attached is a patch with suggestions for the R-lang manual at r64277.
 
  Below are a few comments (some are implemented in the patch):
 
  In the section "Objects", there is a table introduced by "The
  following table describes the possible values returned by typeof". One
  of the results is "any". Can "any" be returned by typeof() ?

 ANYSXP  is a valid internal type on the C level, and
 src/main/util.c  will make  typeof(ob) return "any"
 if you can get your hands on an R level object of that type.
 I'd guess you can only get it currently by using .Call() and
 using your own C code, .. but at least that way it must be possible.

Interesting to know.

  Regarding the Recycling rules section,
 
  -One exception is that when adding vectors to matrices, a warning is not
  -given if the lengths are incompatible.
  -@c Is that a bug?
  -
 
  was this a bug that was fixed?

 I did not investigate in details, but yes, I vaguely remember we
 had fixed that.  So indeed, it's fine you omitted the para in
 your patch.

  I see the following behavior:
 
  > myvec <- 1:3
  > mymat <- matrix(1:12, ncol=2)
  > myvec <- 1:5
  > myvec + mymat
       [,1] [,2]
  [1,]    2    9
  [2,]    4   11
  [3,]    6   13
  [4,]    8   15
  [5,]   10   12
  [6,]    7   14
  Warning message:
  In myvec + mymat :
    longer object length is not a multiple of shorter object length
 
 
  Regarding
 
  -The arguments in the call to the generic are rematched with the
  -arguments for the method using the standard argument matching mechanism.
  -The first argument, i.e.@: the object, will have been evaluated.
  -
 
  this information is duplicated. See a few paragraphs up "When the
  method is invoked it is called..."

  Scott

 Thank you, Scott.
 Indeed, I've finally carefully looked at the patch, and applied
 it - for R-devel, to become R 3.1.0 in April.

Thanks, Martin!

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

 Martin


  --
  Scott Kostyshak
  Economics PhD Candidate
  Princeton University

  The patch still applies cleanly (one offset) on r65090.

  Best,
  Scott



Re: [Rd] [PATCH] suggestions for R-lang manual

2014-02-27 Thread Scott Kostyshak
On Thu, Nov 21, 2013 at 1:17 AM, Scott Kostyshak skost...@princeton.edu wrote:
 Attached is a patch with suggestions for the R-lang manual at r64277.

 Below are a few comments (some are implemented in the patch):

 In the section "Objects", there is a table introduced by "The
 following table describes the possible values returned by typeof". One
 of the results is "any". Can "any" be returned by typeof() ?

 Regarding the Recycling rules section,

 -One exception is that when adding vectors to matrices, a warning is not
 -given if the lengths are incompatible.
 -@c Is that a bug?
 -

 was this a bug that was fixed? I see the following behavior:

 > myvec <- 1:3
 > mymat <- matrix(1:12, ncol=2)
 > myvec <- 1:5
 > myvec + mymat
      [,1] [,2]
 [1,]    2    9
 [2,]    4   11
 [3,]    6   13
 [4,]    8   15
 [5,]   10   12
 [6,]    7   14
 Warning message:
 In myvec + mymat :
   longer object length is not a multiple of shorter object length


 Regarding

 -The arguments in the call to the generic are rematched with the
 -arguments for the method using the standard argument matching mechanism.
 -The first argument, i.e.@: the object, will have been evaluated.
 -

 this information is duplicated. See a few paragraphs up "When the
 method is invoked it is called..."

 Scott


 --
 Scott Kostyshak
 Economics PhD Candidate
 Princeton University

The patch still applies cleanly (one offset) on r65090.

Best,

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University



Re: [Rd] help page of warnings()

2013-12-29 Thread Scott Kostyshak
On Sat, Dec 28, 2013 at 11:19 PM, Elad Zippory elad.zipp...@gmail.com wrote:
 Hi Scott,

 Thank you for your detailed response. (btw, the reason why I didn't link the
 Stack Overflow question is because I deleted it after I sent the e-mail).

Hi Elad,

Please keep the conversation on the list unless there is a reason for
it to be private, in which case please say so. This way everyone can
participate (and more importantly can correct my errors).

 The rationale behind my proposal was because I was surprised to learn that
 rm(list=ls()) does not clear the warning list. The reason why I was
 surprised is because it is not clear from the help page (if you are at a
 level that requires you to read the help page of such a base function, the
 warning that I quoted does not fully warn the 'user', who is not a
 'developer', what is going on. Environments in R are not trivial knowledge
 that can be raised too concisely).

In some cases environments can be thought of like lists. As for how
name look-up goes, yes it takes some studying to learn about that.

 The reason why it mattered is because I am writing a program to be run on
 our HPC, and I want it to abort when there is a warning so I can attend to
 it right away. No point to discover after expensive usage that some warning
 should be investigated, casting doubt on several days of computation. It is
 also useful when writing recursive code, to abort immediately when the
 warning list is populated as it is very hard to understand what went wrong,
 and especially, where...

This is a great programming strategy. You might be interested in one
of my favorite recommendations: treat warnings like errors.

options(warn = 2) # asks R to treat warnings as errors. See ?options

As far as knowing more precisely where something went wrong ("where" not
in the sense of which line of code, but of which function), consider
using the traceback function. Or, in addition to the above options
command, you might like:

options(error = recover) # asks R to enter the debugger when there is an error

and because warnings are now errors, it also enters the debugger for
warnings. This way you can poke around where the warning occurred.
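
To make the first suggestion concrete, here is a small sketch of what options(warn = 2) does in a script. The function risky_step is hypothetical and only stands in for a long computation that might warn.

```r
# Hypothetical stand-in for an expensive step that emits a warning.
risky_step <- function(x) {
    if (x < 0) warning("negative input")
    sqrt(abs(x))
}

old <- options(warn = 2)   # promote warnings to errors
msg <- tryCatch(risky_step(-1),
                error = function(e) conditionMessage(e))
options(old)               # restore the previous setting

# 'msg' holds the error message instead of a numeric result.
print(msg)
```

In a batch job you would normally leave out the tryCatch and let the error stop the run, which is the abort-on-first-warning behavior you described.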

 So, those were my motivations. Again, if I would know that I need a fresh R
 session, I would get it. I don't like 'touching' what I don't understand. I
 just wish I knew I needed to do so without wasting a day trying to debug a
 warning, where all my actions to debug it were 'virtual'.

I still don't see a need to manually access last.warning for the
situation you described.

 Again, thank you for your detailed response, I hope that the case I am
 making is clearer now.

Thank you for giving more details on what you're trying to accomplish.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

 Best regards,
 Elad Zippory
 Ph.D student
 Politics, NYU

 On Sat, Dec 28, 2013 at 9:19 PM, Scott Kostyshak skost...@princeton.edu
 wrote:

 On Sat, Dec 28, 2013 at 6:06 PM, Elad Zippory elad.zipp...@gmail.com
 wrote:
  Hi,
 
  I raised this issue at stackoverflow and it was suggested to raise it
  here:
 
  From the current help page, it is unclear that warnings() does not clear
  after rm(list=ls()). Currently the page states that:
 
  Warning: It is undocumented where last.warning is stored nor that it is
  visible, and this is subject to change. Prior to R 2.4.0 it was stored in
  the workspace, but no longer.
 
  Yet, I suggest that, if to keep the current behavior or until the behavior
  is changed, at least write explicitly in the help file something like
  "clearing the global environment will not clear the warning list. To do so
  use assign("last.warning", NULL, envir = baseenv())"
 
  Thank you,
  Elad Zippory

 Hi Elad,

 I'm not a decision maker around here but I'm curious about your
 suggestion. I always find it helpful to try to understand how people
 use R and how they expect R to work.

 From what I understand, you agree that there's no contradiction of
 behavior in terms of how R is documented to work and you agree that
 rm(list=ls()) should indeed not clear the warnings list. First, let me
 give my observation that I think the policy of writing R documentation
 is to give sufficient information for what a function does. When there
 is something surprising or there are performance issues to keep in
 mind, occasionally the R documentation appropriately mentions what a
 function does not do.

 I think you are interested in making more of a "let's make it easier
 on the user" argument so let me try to address that. I think it's easy
 to learn how to find the last.warning object. This would only require
 a user to read the first line of ?warnings and then to know about the
 getAnywhere function. That's it.

 In fact, I think that's too easy. I would personally be in favor of
 making it _more_ difficult for a beginning user to modify
 last.warning. I've never had to do such a thing and I would be
 suspicious of beginning

[Rd] [PATCH] suggestions for R-lang manual

2013-11-20 Thread Scott Kostyshak
Attached is a patch with suggestions for the R-lang manual at r64277.

Below are a few comments (some are implemented in the patch):

In the section "Objects", there is a table introduced by "The
following table describes the possible values returned by typeof". One
of the results is "any". Can "any" be returned by typeof() ?

Regarding the Recycling rules section,

-One exception is that when adding vectors to matrices, a warning is not
-given if the lengths are incompatible.
-@c Is that a bug?
-

was this a bug that was fixed? I see the following behavior:

> myvec <- 1:3
> mymat <- matrix(1:12, ncol=2)
> myvec <- 1:5
> myvec + mymat
     [,1] [,2]
[1,]    2    9
[2,]    4   11
[3,]    6   13
[4,]    8   15
[5,]   10   12
[6,]    7   14
Warning message:
In myvec + mymat :
  longer object length is not a multiple of shorter object length


Regarding

-The arguments in the call to the generic are rematched with the
-arguments for the method using the standard argument matching mechanism.
-The first argument, i.e.@: the object, will have been evaluated.
-

this information is duplicated. See a few paragraphs up "When the
method is invoked it is called..."

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University
Index: trunk/doc/manual/R-lang.texi
===================================================================
--- trunk/doc/manual/R-lang.texi	(revision 64277)
+++ trunk/doc/manual/R-lang.texi	(working copy)
@@ -1064,7 +1064,7 @@
 @cindex function
 @cindex function arguments
 Function calls can have @emph{tagged} (or @emph{named}) arguments, as in
-@code{plot(x, y, pch = 3)} arguments without tags are known as
+@code{plot(x, y, pch = 3)}.  Arguments without tags are known as
 @emph{positional} since the function must distinguish their meaning from
 their sequential positions among the arguments of the call, e.g., that
 @code{x} denotes the abscissa variable and @code{y} the ordinate.  The
@@ -1308,10 +1308,10 @@
 ignored.  If @var{value1} has any type other than a logical or a numeric
 vector an error is signalled.
 
-If/else statements can be used to avoid numeric problems such as taking
-the logarithm of a negative number.  Because if/else statements are the
-same as other statements you can assign the value of them.  The two
-examples below are equivalent.
+@code{if}/@code{else} statements can be used to avoid numeric problems
+such as taking the logarithm of a negative number.  Because
+@code{if}/@code{else} statements are the same as other statements you
+can assign the value of them.  The two examples below are equivalent.
 
 @example
  if( any(x <= 0) ) y <- log(1+x) else y <- log(x)
@@ -1327,7 +1327,7 @@
 compound statement wrapped in braces, putting the @code{else} on the
 same line as the closing brace that marks the end of the statement.
 
-If/else statements can be nested.
+@code{if}/@code{else} statements can be nested.
 
 @example
 if ( @var{statement1} ) @{
@@ -1342,7 +1342,7 @@
 
 One of the even numbered statements will be evaluated and the resulting
 value returned.  If the optional @code{else} clause is omitted and all
-the odd numbered @var{statement}'s evaluate to @code{FALSE} no statement
+the odd numbered @var{statement}s evaluate to @code{FALSE} no statement
 will be evaluated and @code{NULL} is returned.
 
 The odd numbered @var{statement}s are evaluated, in order, until one
@@ -1378,7 +1378,7 @@
 of the loop (if there is one) is then executed.  No statement below
 @code{next} in the current loop is evaluated.
 
-The value returned by a loop statement statement is always @code{NULL}
+The value returned by a loop statement is always @code{NULL}
 and is returned invisibly.
 
 @node repeat, while, Looping, Control structures
@@ -1451,7 +1451,7 @@
 where the elements of @var{list} may be named.  First, @var{statement}
 is evaluated and the result, @var{value}, obtained.  If @var{value} is a
 number between 1 and the length of @var{list} then the corresponding
-element @var{list} is evaluated and the result returned.  If @var{value}
+element of @var{list} is evaluated and the result returned.  If @var{value}
 is too large or too small @code{NULL} is returned.
 
 @example
@@ -1530,10 +1530,6 @@
 As from @R{} 1.4.0, any arithmetic operation involving a zero-length
 vector has a zero-length result.
 
-One exception is that when adding vectors to matrices, a warning is not
-given if the lengths are incompatible.
-@c Is that a bug?
-
 @node Propagation of names, Dimensional attributes, Recycling rules, Elementary arithmetic operations
 @subsection Propagation of names
 @cindex name
@@ -1842,7 +1838,7 @@
 matching.
 
 The most important example of a class method for @code{[} is that used
-for data frames.  It is not be described in detail here (see the help
+for data frames.  It is not described in detail here (see the help
 page for @code{[.data.frame}, but in broad terms, if two indices are
 supplied (even if one is empty) it creates matrix-like indexing for a
 structure

Re: [Rd] [PATCH] minor suggestions for R-ints manual

2013-11-06 Thread Scott Kostyshak
On Tue, Nov 5, 2013 at 11:43 AM, Martin Maechler
maech...@stat.math.ethz.ch wrote:
 Scott Kostyshak skost...@princeton.edu
 on Sat, 12 Oct 2013 17:50:52 -0400 writes:

  Attached is a patch with minor suggestions for the R-ints
  manual at r64048. The most substantial change is the
  following:

   The top layer comprises the graphics subsystems. Although there is
  -provision for 24 subsystems, after 6 years only two exist, `base' and
  +provision for 24 subsystems, since 2001 only two exist, `base' and
   `grid'.

  Is the year 2001 correct? I base it on the date of the
  commit that introduced the "6 years" string and on the
  date of grid 0.1.

  Scott

 I've used "about 2001" and otherwise basically applied your patch
 (after checking it).

 Thank you very much for your contribution!

Thank *you* Martin!

Scott


 Martin Maechler


  --
  Scott Kostyshak Economics PhD Candidate Princeton
  University x[DELETED ATTACHMENT external: R-ints.diff.txt, plain text]


--
Scott Kostyshak
Economics PhD Candidate
Princeton University



[Rd] [PATCH] minor suggestions for R-ints manual

2013-10-12 Thread Scott Kostyshak
Attached is a patch with minor suggestions for the R-ints manual at
r64048. The most substantial change is the following:

 The top layer comprises the graphics subsystems. Although there is
-provision for 24 subsystems, after 6 years only two exist, `base' and
+provision for 24 subsystems, since 2001 only two exist, `base' and
 `grid'.

Is the year 2001 correct? I base it on the date of the commit that
introduced the "6 years" string and on the date of grid 0.1.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University
Index: trunk/doc/manual/R-ints.texi
===================================================================
--- trunk/doc/manual/R-ints.texi	(revision 64048)
+++ trunk/doc/manual/R-ints.texi	(working copy)
@@ -462,7 +462,7 @@
 (which are 32 bits on all @R{} platforms).
 
 @item REALSXP
-@code{length}, @code{truelength} followed by a block of C @code{double}s
+@code{length}, @code{truelength} followed by a block of C @code{double}s.
 
 @item CPLXSXP
 @code{length}, @code{truelength} followed by a block of C99 @code{double
@@ -1330,7 +1330,7 @@
 The relationship between the pairs is similar: @code{warning} tries to
 fathom out a suitable call, and then calls @code{warningcall} with that
 call as the first argument if it succeeds, and with @code{call =
-R_NilValue} it is does not.  When @code{warningcall} is called, it
+R_NilValue} if it does not.  When @code{warningcall} is called, it
 includes the deparsed call in its printout unless @code{call =
 R_NilValue}.
 
@@ -2289,12 +2289,12 @@
 @file{src/main/names.c}: primitives have @samp{Y = 0} in the @samp{eval}
 field.
 
-There needs to an a @samp{\alias} entry in a help file in the @pkg{base}
+There needs to be a @samp{\alias} entry in a help file in the @pkg{base}
 package, and the primitive needs to be added to one of the lists at the
 start of this section.
 
 Some primitives are regarded as language elements (the current ones are
-listed above).  These need to be in added to two lists of exceptions,
+listed above).  These need to be added to two lists of exceptions,
 @code{langElts} in @code{undoc()} (in file
 @file{src/library/tools/R/QC.R}) and @code{lang_elements} in
 @file{tests/primitives.R}.
@@ -2778,7 +2778,7 @@
 
 
 The top layer comprises the graphics subsystems. Although there is
-provision for 24 subsystems, after 6 years only two exist, `base' and
+provision for 24 subsystems, since 2001 only two exist, `base' and
 `grid'.  The base subsystem is registered with the engine when @R{} is
 initialized, and unregistered (via @code{KillAllDevices}) when an @R{}
 session is shut down.  The grid subsystem is registered in its
@@ -3797,7 +3797,7 @@
 interactively.
 Default: true.
 @item _R_CHECK_VIGNETTES_NLINES_
-Maximum number of lines to show of the bottom of the output when reporting
+Maximum number of lines to show at the bottom of the output when reporting
 errors in running vignettes.
 Default: 10.
 @item _R_CHECK_CODOC_S4_METHODS_
@@ -4258,7 +4258,7 @@
 @file{Renviron} file.  This used to record @samp{false} if no command
 was found, but it nowadays records the name for looking up on the path
 at run time.  The latter can be important for binary distributions: one
-does not want to be tied to, for example, TeXLive 2007.
+does not want to be tied to, for example, TeX Live 2007.
 
 
 @node Current and future directions, Function and variable index, Use of TeX dialects, Top
@@ -4408,7 +4408,7 @@
 are supported provided that each of the dimensions is no more than
 2^31-1.  However, not all applications can be supported.
 
-The main problem is linear algebra, on done by FORTRAN code compiled
+The main problem is linear algebra done by FORTRAN code compiled
 with 32-bit @code{INTEGER}.  Although not guaranteed, it seems that all
 the compilers currently used with @R{} on a 64-bit platform allow
 matrices each of whose dimensions is less than 2^31 but with more than
@@ -4416,7 +4416,7 @@
 support software (such as @acronym{BLAS} and @acronym{LAPACK}) also
 work.
 
-There are exceptions: for example some complex @acronym{LAPACK})
+There are exceptions: for example some complex @acronym{LAPACK}
 auxiliary routines do use a single @code{INTEGER} index and hence
 overflow silently and segfault or give incorrect results.  One example
 is @code{svd()} on a complex matrix.


Re: [Rd] Possible POSIXlt / wday glitch bugs.r-project.org status

2013-10-04 Thread Scott Kostyshak
On Fri, Oct 4, 2013 at 6:11 AM, Imanuel Costigan i.costi...@me.com wrote:
 Wanted to raise two questions:

 1. Is bugs.r-project.org down? I haven't been able to reach it for two or 
 three days:

Yes. Quote from Duncan:

... the server is currently down. The volunteer who runs the server is
currently away from his office, so I expect it won't get fixed until he
gets back in a few days.

https://stat.ethz.ch/pipermail/r-help/2013-October/360958.html

Scott


 ```
 ping bugs.r-project.org
 PING rbugs.research.att.com (207.140.168.137): 56 data bytes
 Request timeout for icmp_seq 0
 Request timeout for icmp_seq 1
 Request timeout for icmp_seq 2
 Request timeout for icmp_seq 3
 Request timeout for icmp_seq 4
 Request timeout for icmp_seq 5
 Request timeout for icmp_seq 6
 ```

 2. Is wday element of POSIXlt meant to be timezone invariant? You would 
 expect the wday element to be invariant to the timezone of a date. That is, 
 the same date/time instant of 5th October 2013 in both Australia/Sydney and 
 UTC should be a Saturday (i.e. wday = 6). And indeed that is the case with 1 
 min past midnight on 5 October 2013:

 ```
 library(lubridate)
 d_utc <- ymd_hms(2013100501, tz='UTC')
 d_local <- ymd_hms(2013100501, tz='Australia/Sydney')
 as.POSIXlt(x=d_utc, tz=tz(d_utc))$wday # 6
 as.POSIXlt(x=d_local, tz=tz(d_local))$wday # 6
 ```

 But this isn't always the case. For example,

 ```
 d_utc <- ymd_hms(2038100201, tz='UTC')
 d_local <- ymd_hms(2038100201, tz='Australia/Sydney')
 as.POSIXlt(x=d_utc, tz=tz(d_utc))$wday # 6
 as.POSIXlt(x=d_local, tz=tz(d_local))$wday # 5
 ```

 Is this expected behaviour? I would have expected a properly encoded 
 date/time of 2 Oct 2038 to be a Saturday irrespective of its time zone.
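
The same comparison can be sketched with base R only, which helps separate lubridate from the question. The timestamp string and variable names below are assumptions made for illustration, taking one minute past midnight on 2 Oct 2038; the result for Australia/Sydney depends on the R version and the time zone database in use.

```r
# Hypothetical base-R version of the check; no packages needed.
stamp <- "2038-10-02 00:01:00"

# Build the instant in each zone, then break it down in that same zone.
wday_utc   <- as.POSIXlt(as.POSIXct(stamp, tz = "UTC"))$wday
wday_local <- as.POSIXlt(as.POSIXct(stamp, tz = "Australia/Sydney"))$wday

# wday counts from 0 (Sunday), so Saturday is 6.
print(c(utc = wday_utc, sydney = wday_local))
```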

 Obligatory system dump:

 ```
 sessionInfo()
 R version 3.0.1 (2013-05-16)
 Platform: x86_64-apple-darwin12.4.0 (64-bit)

 locale:
 [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] lubridate_1.3.0 testthat_0.7.1  devtools_1.3

 loaded via a namespace (and not attached):
  [1] colorspace_1.2-4   dichromat_2.0-0    digest_0.6.3       evaluate_0.5.1
  [5] ggplot2_0.9.3.1    grid_3.0.1         gtable_0.1.2       httr_0.2
  [9] labeling_0.2       MASS_7.3-29        memoise_0.1        munsell_0.4.2
 [13] parallel_3.0.1     plyr_1.8           proto_0.3-10       RColorBrewer_1.0-5
 [17] RCurl_1.95-4.1     reshape2_1.2.2     scales_0.2.3       stringr_0.6.2
 [21] tools_3.0.1        whisker_0.3-2

 ```

 Using R compiled by homebrew [1]. But also experiencing the same bug using R 
 installed on Windows 7 from the CRAN binaries.

 For those interested, I've also noted this on the `lubridate` Github issues 
 page [2], even though this doesn't appear to be a lubridate issue.

 Thanks for any help.

 [1] http://brew.sh
 [2] https://github.com/hadley/lubridate/issues/209



--
Scott Kostyshak
Economics PhD Candidate
Princeton University



[Rd] [PATCH] file.access returns success for NA

2013-10-03 Thread Scott Kostyshak
Currently on R I get the following:

> file.access(c("doesNotExist", NA))
doesNotExist         <NA>
          -1            0

where 0 means success. Is the 0 correct? I was expecting either NA or -1.

?file.access does not mention how NA values should be handled. The
subsection 3.3.4 NA handling from the R Language Definition manual
suggest to me that file.access should return NA if given NA. I
interpret it in this way because if an element in the input vector is
NA, that means that there is a filename that exists but is not known.
Thus, I thought that file.access should return NA because it is not
known whether the file corresponding to the missing filename exists.
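
To illustrate the behavior I was expecting, here is a sketch of a user-level wrapper. The name file_access_na is hypothetical and not part of base R; it simply overwrites the result for missing file names with NA.

```r
# Hypothetical wrapper, not part of base R: report NA for missing file names.
file_access_na <- function(names, mode = 0) {
    res <- file.access(names, mode)
    res[is.na(names)] <- NA_integer_
    res
}

missing_file <- tempfile()   # a path that does not exist yet
res <- file_access_na(c(missing_file, NA))
print(unname(res))
```

This leaves the results for non-missing names as file.access returns them and only changes the entries whose file name is unknown.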

Perhaps file.access acts in this way to maintain compatibility with
the S-PLUS function ‘access’ (which I currently do not have a way of
testing to see how it handles NAs) ? If this is the case, would a
patch for ?file.access be considered?

Below is a patch that changes the return of an NA to NA.

Index: trunk/src/main/platform.c
===================================================================
--- trunk/src/main/platform.c (revision 64011)
+++ trunk/src/main/platform.c (working copy)
@@ -1299,7 +1299,7 @@
  access(R_ExpandFileName(translateChar(STRING_ELT(fn, i))),
modemask);
 #endif
- } else INTEGER(ans)[i] = FALSE;
+ } else INTEGER(ans)[i] = NA_INTEGER;
 UNPROTECT(1);
 return ans;
 }

Comments?

Scott

> sessionInfo()
R Under development (unstable) (2013-09-27 r64011)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base



--
Scott Kostyshak
Economics PhD Candidate
Princeton University



Re: [Rd] tools::md5sum(directory) behavior different on Windows vs. Unix

2013-09-29 Thread Scott Kostyshak
On Mon, Sep 9, 2013 at 3:00 AM, Scott Kostyshak skost...@princeton.edu wrote:
 tools::md5sum gives a warning if it receives a directory as an
 argument on Unix but not on Windows.

 From what I understand, this happens because in Windows a directory is
 not treated as a file so fopen returns NULL. Then, NA is returned
 without a warning. On Unix, a directory is treated as a file so fopen
 does not return NULL so md5 is run and fails, leading to a warning.

 This is a good opportunity for me to understand further (in addition
 to [1] and the many places where OS special cases are mentioned) in
 which cases R tries to behave the same on Windows as on Unix and in
 which cases it allows for differences (in this case, a warning vs. no
 warning). For example, it would be straightforward to create a patch
 that would lead to the same behavior in this case. tools::md5sum could
 either issue a warning for each argument that is a directory or it
 could issue no warning (consistent with file.info). Would either patch
 be considered?

Attached is a patch that gives a warning if an element in the file
argument is not a regular file (e.g. is a directory or does not
exist). In my opinion the advantages of this patch are:

(1) the same warnings are generated on all platforms in the case where
one of the elements is a folder.
(2) a warning is also given if a file does not exist.
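
For anyone who wants to try the idea without rebuilding R, here is a rough user-level sketch of the same logic. The name md5sum_checked is hypothetical; it wraps the existing tools::md5sum rather than replacing it, so it is not identical to the attached patch.

```r
# Hypothetical wrapper around tools::md5sum: warn about anything that is
# not a regular file and return NA for it, on every platform.
md5sum_checked <- function(files) {
    is_regular <- utils::file_test("-f", files)
    if (!all(is_regular))
        warning("The following are not regular files: ",
                paste(shQuote(files[!is_regular]), collapse = " "))
    out <- rep(NA_character_, length(files))
    names(out) <- files
    # Only regular files are passed on to the real md5sum.
    out[is_regular] <- tools::md5sum(files[is_regular])
    out
}
```

Calling md5sum_checked(c(some_file, tempdir())) should then give a checksum for the file, NA for the directory, and one warning naming the directory.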

Comments?

Scott


 Or is this difference encouraged because the concept of a file is
 different on Unix than on Windows?

 Scott

 [1] 
 http://cran.r-project.org/bin/windows/base/rw-FAQ.html#What-should-I-expect-to-behave-differently-from-the-Unix-version


 --
 Scott Kostyshak
 Economics PhD Candidate
 Princeton University
Index: trunk/src/library/tools/R/md5.R
===================================================================
--- trunk/src/library/tools/R/md5.R (revision 64011)
+++ trunk/src/library/tools/R/md5.R (working copy)
@@ -17,7 +17,18 @@
 #  http://www.r-project.org/Licenses/
 
 md5sum <- function(files)
-    structure(.Call(Rmd5, files), names=files)
+{
+    reg_ <- file_test("-f", files)
+    regFiles <- files[reg_]
+    notReg <- files[!reg_]
+    if(!all(reg_))
+        warning("The following are not regular files: ",
+                paste(shQuote(notReg), collapse = " "))
+    names(files) <- files
+    files[!reg_] <- NA
+    files[reg_] <- .Call(Rmd5, regFiles)
+    files
+}
 
 .installMD5sums - function(pkgDir, outDir = pkgDir)
 {
Index: trunk/src/library/tools/man/md5sum.Rd
===================================================================
--- trunk/src/library/tools/man/md5sum.Rd   (revision 64011)
+++ trunk/src/library/tools/man/md5sum.Rd   (working copy)
@@ -18,7 +18,8 @@
 \value{
   A character vector of the same length as \code{files}, with names
   equal to \code{files}. The elements
-  will be \code{NA} for non-existent or unreadable files, otherwise
+  will be \code{NA} for non-existent or unreadable files (in which case
+  a warning will be generated), otherwise
   a 32-character string of hexadecimal digits.
 
   On Windows all files are read in binary mode (as the \code{md5sum}


[Rd] tools::md5sum(directory) behavior different on Windows vs. Unix

2013-09-09 Thread Scott Kostyshak
tools::md5sum gives a warning if it receives a directory as an
argument on Unix but not on Windows.

From what I understand, this happens because in Windows a directory is
not treated as a file so fopen returns NULL. Then, NA is returned
without a warning. On Unix, a directory is treated as a file so fopen
does not return NULL so md5 is run and fails, leading to a warning.

This is a good opportunity for me to understand further (in addition
to [1] and the many places where OS special cases are mentioned) in
which cases R tries to behave the same on Windows as on Unix and in
which cases it allows for differences (in this case, a warning vs. no
warning). For example, it would be straightforward to create a patch
that would lead to the same behavior in this case. tools::md5sum could
either issue a warning for each argument that is a directory or it
could issue no warning (consistent with file.info). Would either patch
be considered?

Or is this difference encouraged because the concept of a file is
different on Unix than on Windows?

Scott

[1] 
http://cran.r-project.org/bin/windows/base/rw-FAQ.html#What-should-I-expect-to-behave-differently-from-the-Unix-version


--
Scott Kostyshak
Economics PhD Candidate
Princeton University



Re: [Rd] Comments requested on changedFiles function

2013-09-08 Thread Scott Kostyshak
On Sun, Sep 8, 2013 at 10:55 AM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:
 On 13-09-06 11:07 PM, Karl Millar wrote:

 On Fri, Sep 6, 2013 at 7:03 PM, Duncan Murdoch murdoch.dun...@gmail.com
 wrote:

 On 13-09-06 9:21 PM, Karl Millar wrote:


 Hi Duncan,

 I like the interface of this version a lot better, but there's still a
 bunch of implementation details that need fixing:

 * As previously mentioned, there are important cases where the mtime
 values change in ways that this code doesn't detect.
 * If the timestamp file (which is usually in the temp directory) gets
 deleted (which can happen after a moderate amount of time of
 inactivity on some systems), then the file_test('-nt', ...) will
 always return false, even if the file has changed.



 If that happened without user intervention, I think it would break other
 things in R -- the temp directory is supposed to last for the whole
 session.
 But I should be checking anyway.


 Yes, it does break other things in R -- my experience has been that
 the help system seems to be the one that is impacted the most by this.
   FWIW, I've never seen the entire R temp directory deleted, just
 individual files and subdirectories in it, but even that probably
 depends on how the machine is configured.  I suspect only a few users
 ever notice this, but my R use is probably somewhat anomalous and I
 think it only happens to R sessions that I haven't used for a few
 days.


 I use Windows and never see this; deleting temp files is up to me, not to
 the system.  But my understanding was the *nix systems should only clean up
 /tmp on restart, and I don't think an R session will survive a restart.

 However, you have convinced me that the use of the timestamp file is not
 beneficial enough to be the default.  I'll leave it as an option, but add
 warnings that it might be unreliable.



 * If files get added or deleted between the two calls to list.files in
 fileSnapshot, it will fail with an error.



 Yours won't work if path contains more than one directory.  This is
 probably
 a reasonable restriction, but it's inconsistent with list.files, so I'd
 like
 to avoid it if I can find a way.


 I'm currently unsure what the behaviour when comparing snapshots with
 multiple directories should be.

 Presumably we should have the property that (horribly abusing notation
 for succinctness):
compareSnapshots(c(a1, a2),  c(a1, a2))
 is the same as concatenating (in some form)
compareSnapshots(a1, a1) and compareSnapshots(a2, a2)
 and there's a bunch of ways we could concatenate -- we could return a
 list of results, or a single result where each of the 'added, deleted,
 modified' fields are a list, or where we concatenate the 'added,
 deleted, modified' fields together into three simple vectors.
 Concatenating the vectors together like this is appealing, but unless
 you're using the full names, it doesn't include the information of
 which directory the changes are in, and using the full names doesn't
 work in the case where you're comparing different sets of directories,
 e.g. compareSnapshots(c(a1, a2), c(b1, b2)), where there is no
 sensible choice for a full name.  The list options don't have this
 problem, but are harder to work with, particularly for the common case
 where there's only a single directory.  You'd also have to be somewhat
 careful with filenames that occur in both directories.

 Maybe I'm just being dense, but I don't see a way to do this that's
 clear, easy to use and wouldn't confuse users at the moment.


 The way I've done this is to require full.names when multiple dirs are on
 the path.  I've reduced it to one list.files() call per dir, by iterating
 over the path variable and using your approach of calling it with full.names
 = FALSE, then adding the dir if necessary.
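
The per-directory listing described above could look roughly like the
following sketch. The function name list_per_dir and its error message are
assumptions for illustration; the code in testpkg may differ.

```r
# Hypothetical helper, not the code in testpkg: list each directory
# separately and prepend the directory only when full names are requested.
list_per_dir <- function(path, full.names = length(path) > 1, ...) {
  if (length(path) > 1 && !full.names)
    stop("full.names must be TRUE when 'path' has more than one directory")
  unlist(lapply(path, function(d) {
    files <- list.files(d, full.names = FALSE, ...)
    if (full.names) file.path(d, files) else files
  }))
}

a1 <- tempfile("a1"); dir.create(a1)
a2 <- tempfile("a2"); dir.create(a2)
file.create(file.path(a1, "notes.txt"), file.path(a2, "notes.txt"))

both   <- list_per_dir(c(a1, a2))  # one entry per directory, kept distinct
single <- list_per_dir(a1)         # bare file name in the single-directory case
```

Requiring full names for several directories keeps a file name that occurs
in both directories from collapsing into one row name.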

 I haven't adopted your change that forces comparison of only size and mtime
 from file.info.  I don't see a big cost in storing whatever file.info
 returns (which is system dependent; on Windows I don't see the user and
 group related columns; on Unix I don't see the exe column).
 Users might want to detect changes to anything there, and I shouldn't make
 it harder for them.

 I've also kept the special-casing of md5sum; it really needs to be wrapped
 in suppressWarnings() (on Unix only).  And I've kept the options to specify
 what changedFiles checks among the file.info columns; I can see that you
 might want a snapshot with everything, but sometimes only want to be told
 about changes in a subset of the attributes.

 I've uploaded
 http://www.stats.uwo.ca/faculty/murdoch/temp/testpkg_1.1.tar.gz if anyone
 is interested.
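
For readers without the test package, here is a short usage sketch. It
assumes the interface of fileSnapshot() and changedFiles() as later shipped
in the utils package (R 3.1.0 and later), which may not match testpkg 1.1 in
every detail.

```r
# Sketch using utils::fileSnapshot() and utils::changedFiles() from released R.
d <- tempfile("snapdemo")
dir.create(d)
writeBin(1, file.path(d, "file1"))
writeBin(2, file.path(d, "file2"))

before <- utils::fileSnapshot(d, md5sum = TRUE)   # baseline snapshot

writeBin(3, file.path(d, "file2"))                # change the contents of file2
file.create(file.path(d, "file3"))                # add a new file

# With 'after' missing, a fresh snapshot is taken and compared to 'before'.
delta <- utils::changedFiles(before)
delta$added
delta$changed
```

The md5sum column makes the change to file2 detectable even when its size
and modification time happen to look unchanged.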

Works well.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University



Re: [Rd] Comments requested on changedFiles function

2013-09-06 Thread Scott Kostyshak
On Fri, Sep 6, 2013 at 3:46 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
 On 06/09/2013 2:20 PM, Duncan Murdoch wrote:

 I have now put the code into a temporary package for testing; if anyone
 is interested, for a few days it will be downloadable from

 fisher.stats.uwo.ca/faculty/murdoch/temp/testpkg_1.0.tar.gz


 Sorry, error in the URL.  It should be

 http://www.stats.uwo.ca/faculty/murdoch/temp/testpkg_1.0.tar.gz

Works well. A couple of things I noticed:

(1)
md5sum is being called on directories, which causes warnings. (If this
is not viewed as undesirable, please ignore the rest of this comment.)
Should this be the responsibility of the user (by passing arguments to
list.files)? In the example, changing
fileSnapshot(dir, file.info=TRUE, md5sum=TRUE)
to
fileSnapshot(dir, file.info=TRUE, md5sum=TRUE, include.dirs=FALSE,
recursive=TRUE)

gets rid of the warnings. But perhaps the user just wants to exclude
directories for the md5sum calculations. This can't be controlled from
fileSnapshot.

Or, should the "if (md5sum)" chunk subset fullnames using file_test
or file.info to exclude directories (and then fill in the directories
with NA)?
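
A rough sketch of that second option, assuming only tools::md5sum and
utils::file_test. The helper name md5_files_only is hypothetical and is not
taken from the package.

```r
# Hypothetical helper: checksum regular files only and leave NA for
# directories, so md5sum() is never called on a directory.
md5_files_only <- function(fullnames) {
  is_file <- utils::file_test("-f", fullnames)
  out <- rep(NA_character_, length(fullnames))
  names(out) <- fullnames
  out[is_file] <- tools::md5sum(fullnames[is_file])
  out
}

d <- tempfile("md5demo")
dir.create(d)
f <- file.path(d, "file1")
writeBin(1, f)
s <- file.path(d, "sub")
dir.create(s)

res <- md5_files_only(c(f, s))   # checksum for file1, NA for the directory
```

This keeps one entry per listed path, so the result can still be bound to
the file.info data frame row by row.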

(2)
If I run example(changedFiles) several times, sometimes I get:

chngdF> changedFiles(snapshot)
File changes:
      mtime md5sum
file2  TRUE   TRUE

and other times I get:

chngdF> changedFiles(snapshot)
File changes:
      md5sum
file2   TRUE

I wonder why.

Scott

> sessionInfo()
R Under development (unstable) (2013-08-31 r63780)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] testpkg_1.0

loaded via a namespace (and not attached):
[1] tools_3.1.0



--
Scott Kostyshak
Economics PhD Candidate
Princeton University



Re: [Rd] Comments requested on changedFiles function

2013-09-06 Thread Scott Kostyshak
On Fri, Sep 6, 2013 at 7:40 PM, Scott Kostyshak skost...@princeton.edu wrote:
 On Fri, Sep 6, 2013 at 3:46 PM, Duncan Murdoch murdoch.dun...@gmail.com 
 wrote:
 On 06/09/2013 2:20 PM, Duncan Murdoch wrote:

 I have now put the code into a temporary package for testing; if anyone
 is interested, for a few days it will be downloadable from

 fisher.stats.uwo.ca/faculty/murdoch/temp/testpkg_1.0.tar.gz


 Sorry, error in the URL.  It should be

 http://www.stats.uwo.ca/faculty/murdoch/temp/testpkg_1.0.tar.gz

 Works well. A couple of things I noticed:

 (1)
 md5sum is being called on directories, which causes warnings. (If this
 is not viewed as undesirable, please ignore the rest of this comment.)
 Should this be the responsibility of the user (by passing arguments to
 list.files)? In the example, changing
 fileSnapshot(dir, file.info=TRUE, md5sum=TRUE)
 to
 fileSnapshot(dir, file.info=TRUE, md5sum=TRUE, include.dirs=FALSE,
 recursive=TRUE)

 gets rid of the warnings. But perhaps the user just wants to exclude
 directories for the md5sum calculations. This can't be controlled from
 fileSnapshot.

 Or, should the "if (md5sum)" chunk subset fullnames using file_test
 or file.info to exclude directories (and then fill in the directories
 with NA)?

 (2)
 If I run example(changedFiles) several times, sometimes I get:

 chngdF> changedFiles(snapshot)
 File changes:
       mtime md5sum
 file2  TRUE   TRUE

 and other times I get:

 chngdF> changedFiles(snapshot)
 File changes:
       md5sum
 file2   TRUE

 I wonder why.

Putting the following in-between snapshot and writeBin in the example
leads to consistent output:

# allow for mtime to change
Sys.sleep(.1)
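
My guess at why the pause matters, as a sketch: if the second write lands
within the file system's timestamp granularity, mtime can look unchanged.
The 0.1 second pause is an assumption and may be too short on file systems
that record whole seconds only.

```r
# Sketch: compare modification times around a rewrite of the same file.
f <- tempfile("mtime-demo")
writeBin(1, f)
before <- file.info(f)$mtime

Sys.sleep(0.1)   # assumed to exceed the file system's mtime granularity
writeBin(2, f)
after <- file.info(f)$mtime

mtime_changed <- after > before   # may still be FALSE on coarse file systems
```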

Scott


 Scott

 > sessionInfo()
 R Under development (unstable) (2013-08-31 r63780)
 Platform: x86_64-unknown-linux-gnu (64-bit)

 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] testpkg_1.0

 loaded via a namespace (and not attached):
 [1] tools_3.1.0



 --
 Scott Kostyshak
 Economics PhD Candidate
 Princeton University


--
Scott Kostyshak
Economics PhD Candidate
Princeton University



Re: [Rd] Comments requested on changedFiles function

2013-09-05 Thread Scott Kostyshak
On Thu, Sep 5, 2013 at 6:48 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
 On 13-09-04 11:36 PM, Scott Kostyshak wrote:

 On Wed, Sep 4, 2013 at 1:53 PM, Duncan Murdoch murdoch.dun...@gmail.com
 wrote:

 In a number of places internal to R, we need to know which files have
 changed (e.g. after building a vignette).  I've just written a general
 purpose function changedFiles that I'll probably commit to R-devel.
 Comments on the design (or bug reports) would be appreciated.

 The source for the function and the Rd page for it are inline below.


 This looks like a useful function. Thanks for writing it. I have only
 one (picky) comment below.

 - changedFiles.R:
 changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
                          file.info = NULL, md5sum = FALSE,
                          full.names = FALSE, ...) {
     dosnapshot <- function(args) {
         fullnames <- do.call(list.files, c(full.names = TRUE, args))
         names <- do.call(list.files, c(full.names = full.names, args))
         if (isTRUE(file.info) ||
             (is.character(file.info) && length(file.info))) {
             info <- file.info(fullnames)
             rownames(info) <- names
             if (isTRUE(file.info))
                 file.info <- c("size", "isdir", "mode", "mtime")
         } else
             info <- data.frame(row.names=names)
         if (md5sum)
             info <- data.frame(info, md5sum = tools::md5sum(fullnames))
         list(info = info, timestamp = timestamp, file.info = file.info,
              md5sum = md5sum, full.names = full.names, args = args)
     }
     if (missing(snapshot) || !inherits(snapshot, "changedFilesSnapshot")) {
         if (length(timestamp) == 1)
             file.create(timestamp)
         if (missing(snapshot)) snapshot <- "."
         pre <- dosnapshot(list(path = snapshot, ...))
         pre$pre <- pre$info
         pre$info <- NULL
         pre$wd <- getwd()
         class(pre) <- "changedFilesSnapshot"
         return(pre)
     }

     if (missing(timestamp)) timestamp <- snapshot$timestamp
     if (missing(file.info) || isTRUE(file.info))
         file.info <- snapshot$file.info
     if (identical(file.info, FALSE)) file.info <- NULL
     if (missing(md5sum)) md5sum <- snapshot$md5sum
     if (missing(full.names)) full.names <- snapshot$full.names

     pre <- snapshot$pre
     savewd <- getwd()
     on.exit(setwd(savewd))
     setwd(snapshot$wd)

     args <- snapshot$args
     newargs <- list(...)
     args[names(newargs)] <- newargs
     post <- dosnapshot(args)$info
     prenames <- rownames(pre)
     postnames <- rownames(post)

     added <- setdiff(postnames, prenames)
     deleted <- setdiff(prenames, postnames)
     common <- intersect(prenames, postnames)

     if (length(file.info)) {
         preinfo <- pre[common, file.info]
         postinfo <- post[common, file.info]
         changes <- preinfo != postinfo
     }
     else changes <- matrix(logical(0), nrow = length(common), ncol = 0,
                            dimnames = list(common, character(0)))
     if (length(timestamp))
         changes <- cbind(changes,
                          Newer = file_test("-nt", common, timestamp))
     if (md5sum) {
         premd5 <- pre[common, "md5sum"]
         postmd5 <- post[common, "md5sum"]
         changes <- cbind(changes, md5sum = premd5 != postmd5)
     }
     changes1 <- changes[rowSums(changes, na.rm = TRUE) > 0, , drop = FALSE]
     changed <- rownames(changes1)
     structure(list(added = added, deleted = deleted, changed = changed,
                    unchanged = setdiff(common, changed), changes = changes),
               class = "changedFiles")
 }

 print.changedFilesSnapshot <- function(x, ...) {
     cat("changedFiles snapshot:\n timestamp = \"", x$timestamp,
         "\"\n file.info = ",
         if (length(x$file.info))
             paste(paste0('"', x$file.info, '"'), collapse=","),
         "\n md5sum = ", x$md5sum,
         "\n args = ", deparse(x$args, control = NULL), "\n", sep="")
     x
 }

 print.changedFiles <- function(x, ...) {
     if (length(x$added))
         cat("Files added:\n", paste0("  ", x$added, collapse="\n"),
             "\n", sep="")
     if (length(x$deleted))
         cat("Files deleted:\n", paste0("  ", x$deleted, collapse="\n"),
             "\n", sep="")
     changes <- x$changes
     changes <- changes[rowSums(changes, na.rm = TRUE) > 0, , drop=FALSE]
     changes <- changes[, colSums(changes, na.rm = TRUE) > 0, drop=FALSE]
     if (nrow(changes)) {
         cat("Files changed:\n")
         print(changes)
     }
     x
 }
 --

 --- changedFiles.Rd:
 \name{changedFiles}
 \alias{changedFiles}
 \alias{print.changedFiles}
 \alias{print.changedFilesSnapshot}
 \title{
 Detect which files have changed
 }
 \description{
 On the first call, \code{changedFiles} takes a snapshot of a selection of
 files.  In subsequent
 calls, it takes another snapshot, and returns an object containing data
 on
 the
 differences between the two snapshots.  The snapshots need not be the
 same
 directory;
 this could be used to compare two directories.
 }
 \usage{
 changedFiles(snapshot, timestamp = tempfile

Re: [Rd] Comments requested on changedFiles function

2013-09-04 Thread Scott Kostyshak
 of a file to write at the time the initial snapshot
 is taken.  In subsequent calls, modification times of files will be compared
 to
 this file, and newer files will be reported as changed.  Set to \code{NULL}
 to skip this test.
 }
   \item{file.info}{
 A vector of columns from the result of the \code{file.info} function, or a
 logical value.  If
 \code{TRUE}, columns \code{c("size", "isdir", "mode", "mtime")} will be
 used.  Set to
 \code{FALSE} or \code{NULL} to skip this test.  See the Details.
 }
   \item{md5sum}{
 A logical value indicating whether MD5 summaries should be taken as part of
 the snapshot.
 }
   \item{full.names}{
 A logical value indicating whether full names (as in
 \code{\link{list.files}}) should be
 recorded.
 }
   \item{\dots}{
 Additional parameters to pass to \code{\link{list.files}} to control the set
 of files
 in the snapshots.
 }
 }
 \details{
 This function works in two modes.  If the \code{snapshot} argument is
 missing or is
 not of S3 class \code{changedFilesSnapshot}, it is used as the \code{path}
 argument
 to \code{\link{list.files}} to obtain a list of files.  If it is of class
 \code{changedFilesSnapshot}, then it is taken to be the baseline file
 and a new snapshot is taken and compared with it.  In the latter case,
 missing
 arguments default to match those from the initial snapshot.

 If the \code{timestamp} argument is length 1, a file with that name is
 created
 in the current directory during the initial snapshot, and
 \code{\link{file_test}}
 is used to compare the age of all files to it during subsequent calls.

 If the \code{file.info} argument is \code{TRUE} or it contains a non-empty
 character vector, the indicated columns from the result of a call to
 \code{\link{file.info}} will be recorded and compared.

 If \code{md5sum} is \code{TRUE}, the \code{tools::\link{md5sum}} function
 will be called to record the 32 byte MD5 checksum for each file, and these
 values
 will be compared.
 }
 \value{
 In the initial snapshot phase, an object of class
 \code{changedFilesSnapshot} is returned.  This
 is a list containing the fields
 \item{pre}{a dataframe whose rownames are the filenames, and whose columns
 contain the
 requested snapshot data}
 \item{timestamp, file.info, md5sum, full.names}{a record of the arguments in
 the initial call}
 \item{args}{other arguments passed via \code{...} to
 \code{\link{list.files}}.}

 In the comparison phase, an object of class \code{changedFiles}. This is a
 list containing
 \item{added, deleted, changed, unchanged}{character vectors of filenames
 from the before
 and after snapshots, with obvious meanings}
 \item{changes}{a logical matrix with a row for each common file, and a
 column for each
 comparison test.  \code{TRUE} indicates a change in that test.}

 \code{\link{print}} methods are defined for each of these types. The
 \code{\link{print}} method for \code{changedFilesSnapshot} objects
 displays the arguments used to produce it, while the one for
 \code{changedFiles} displays the \code{added}, \code{deleted}
 and \code{changed} fields if non-empty, and a submatrix of the
 \code{changes}
 matrix containing all of the \code{TRUE} values.
 }
 \author{
 Duncan Murdoch
 }
 \seealso{
 \code{\link{file.info}}, \code{\link{file_test}}, \code{\link{md5sum}}.
 }
 \examples{
 # Create some files in a temporary directory
 dir <- tempfile()
 dir.create(dir)

Should a different name than 'dir' be used since 'dir' is a base function?

Further, if someone is not very familiar with R (or just not in R
mode at the time of reading), they might think that 'dir.create' is
calling the create member of the object named 'dir' that you just
made.

Scott

 writeBin(1, file.path(dir, "file1"))
 writeBin(2, file.path(dir, "file2"))
 dir.create(file.path(dir, "dir"))

 # Take a snapshot
 snapshot <- changedFiles(dir, file.info=TRUE, md5sum=TRUE)

 # Change one of the files
 writeBin(3, file.path(dir, "file2"))

 # Display the detected changes
 changedFiles(snapshot)
 changedFiles(snapshot)$changes
 }
 \keyword{utilities}
 \keyword{file}



--
Scott Kostyshak
Economics PhD Candidate
Princeton University



[Rd] [PATCH] remove a duplicate tk function definition (and alphabetize)

2013-09-01 Thread Scott Kostyshak
'tkcoords' is defined twice (in the same way) in src/library/tcltk/R/Tk.R.

Attached is a patch against r63780 that removes the duplicate
definition and alphabetizes the functions.
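
For anyone who wants to look for similar duplicates, here is a small sketch
of one way to do it. This is not how the duplicate was found; the helper
name dup_assignments is hypothetical, and it only looks at top-level
assignments to plain names.

```r
# Hypothetical helper: report names that are assigned more than once at the
# top level of an R source file.
dup_assignments <- function(file) {
  exprs <- as.list(parse(file = file, keep.source = FALSE))
  is_assign <- vapply(exprs, function(e) {
    is.call(e) &&
      as.character(e[[1]])[1] %in% c("<-", "=") &&
      is.name(e[[2]])
  }, logical(1))
  targets <- vapply(exprs[is_assign], function(e) as.character(e[[2]]),
                    character(1))
  unique(targets[duplicated(targets)])
}

# Small self-contained demonstration on a temporary file.
src <- tempfile(fileext = ".R")
writeLines(c("a <- function() 1",
             "b <- 2",
             "a <- function() 1"), src)
dups <- dup_assignments(src)
```

Running the helper on src/library/tcltk/R/Tk.R from a checkout would be the
obvious use, but I have not included output from that here.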

I've read that minor patches such as this should be sent to r-devel [1].

Scott

[1] http://permalink.gmane.org/gmane.comp.lang.r.devel/33987


--
Scott Kostyshak
Economics PhD Candidate
Princeton University
Index: src/library/tcltk/R/Tk.R
===================================================================
--- src/library/tcltk/R/Tk.R    (revision 63780)
+++ src/library/tcltk/R/Tk.R    (working copy)
@@ -493,12 +493,11 @@
 tkbbox           <- function(widget, ...) tcl(widget, "bbox", ...)
 tkcanvasx        <- function(widget, ...) tcl(widget, "canvasx", ...)
 tkcanvasy        <- function(widget, ...) tcl(widget, "canvasy", ...)
+tkcget           <- function(widget, ...) tcl(widget, "cget", ...)
 tkcompare        <- function(widget, ...) tcl(widget, "compare", ...)
 tkconfigure      <- function(widget, ...) tcl(widget, "configure", ...)
 tkcoords         <- function(widget, ...) tcl(widget, "coords", ...)
 tkcreate         <- function(widget, ...) tcl(widget, "create", ...)
-tkcget           <- function(widget, ...) tcl(widget, "cget", ...)
-tkcoords         <- function(widget, ...) tcl(widget, "coords", ...)
 tkcurselection   <- function(widget, ...) tcl(widget, "curselection", ...)
 tkdchars         <- function(widget, ...) tcl(widget, "dchars", ...)
 tkdebug          <- function(widget, ...) tcl(widget, "debug", ...)
@@ -508,8 +507,8 @@
 tkdlineinfo      <- function(widget, ...) tcl(widget, "dlineinfo", ...)
 tkdtag           <- function(widget, ...) tcl(widget, "dtag", ...)
 tkdump           <- function(widget, ...) tcl(widget, "dump", ...)
+tkentrycget      <- function(widget, ...) tcl(widget, "entrycget", ...)
 tkentryconfigure <- function(widget, ...) tcl(widget, "entryconfigure", ...)
-tkentrycget      <- function(widget, ...) tcl(widget, "entrycget", ...)
 tkfind           <- function(widget, ...) tcl(widget, "find", ...)
 tkflash          <- function(widget, ...) tcl(widget, "flash", ...)
 tkfraction       <- function(widget, ...) tcl(widget, "fraction", ...)
@@ -535,11 +534,11 @@
 tkmark.unset     <- function(widget, ...) tcl(widget, "mark", "unset", ...)
 tkmove           <- function(widget, ...) tcl(widget, "move", ...)
 tknearest        <- function(widget, ...) tcl(widget, "nearest", ...)
+tkpostcascade    <- function(widget, ...) tcl(widget, "postcascade", ...)
 tkpost           <- function(widget, ...) tcl(widget, "post", ...)
-tkpostcascade    <- function(widget, ...) tcl(widget, "postcascade", ...)
 tkpostscript     <- function(widget, ...) tcl(widget, "postscript", ...)
+tkscan.dragto    <- function(widget, ...) tcl(widget, "scan", "dragto", ...)
 tkscan.mark      <- function(widget, ...) tcl(widget, "scan", "mark", ...)
-tkscan.dragto    <- function(widget, ...) tcl(widget, "scan", "dragto", ...)
 tksearch         <- function(widget, ...) tcl(widget, "search", ...)
 tksee            <- function(widget, ...) tcl(widget, "see", ...)
 tkselect         <- function(widget, ...) tcl(widget, "select", ...)
@@ -563,7 +562,6 @@
     tcl(widget, "selection", "to", ...)
 tkset            <- function(widget, ...) tcl(widget, "set", ...)
 tksize           <- function(widget, ...) tcl(widget, "size", ...)
-tktoggle         <- function(widget, ...) tcl(widget, "toggle", ...)
 tktag.add        <- function(widget, ...) tcl(widget, "tag", "add", ...)
 tktag.bind       <- function(widget, ...) tcl(widget, "tag", "bind", ...)
 tktag.cget       <- function(widget, ...) tcl(widget, "tag", "cget", ...)
@@ -576,6 +574,7 @@
 tktag.raise      <- function(widget, ...) tcl(widget, "tag", "raise", ...)
 tktag.ranges     <- function(widget, ...) tcl(widget, "tag", "ranges", ...)
 tktag.remove     <- function(widget, ...) tcl(widget, "tag", "remove", ...)
+tktoggle         <- function(widget, ...) tcl(widget, "toggle", ...)
 tktype           <- function(widget, ...) tcl(widget, "type", ...)
 tkunpost         <- function(widget, ...) tcl(widget, "unpost", ...)
 tkwindow.cget    <- function(widget, ...) tcl(widget, "window", "cget", ...)


[Rd] tk + browser() can leave R unresponsive

2013-08-03 Thread Scott Kostyshak
I don't know if this is a bug. I can reproduce the following on Ubuntu
12.04.2 and 13.04 64-bit with R version 3.0.1 and with r63479. There
is no difference if R is patched with the fix for PR#15407 or not,
although without the fix there are more ways to trigger this.

I can reproduce with the following:

1. Open R in gnome-terminal or xterm

2. Run 'library(tcltk)'

3. Run 'trace(tk_select.list, edit = TRUE)'
and put browser() at the beginning of the onOK body (e.g. in Vim run
:g/onOK/put ='browser()'). That is, transform

onOK <- function() {
    res <- 1L + as.integer(tkcurselection(box))
    cat("res is: ", res)
    ans.select_list <<- choices[res]
    tkgrab.release(dlg)
    tkdestroy(dlg)
}

to:

onOK <- function() {
    browser()
    res <- 1L + as.integer(tkcurselection(box))
    cat("res is: ", res)
    ans.select_list <<- choices[res]
    tkgrab.release(dlg)
    tkdestroy(dlg)
}

4. Run 'install.packages()'

5. Double-click on a package

R becomes unresponsive and I have to kill it.

> sessionInfo()
R Under development (unstable) (2013-08-02 r63479)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University



Re: [Rd] tk + browser() can leave R unresponsive

2013-08-03 Thread Scott Kostyshak
On Sat, Aug 3, 2013 at 5:56 AM, Scott Kostyshak skost...@princeton.edu wrote:
 I don't know if this is a bug. I can reproduce the following on Ubuntu
 12.04.2 and 13.04 64-bit with R version 3.0.1 and with r63479. There
 is no difference if R is patched with the fix for PR#15407 or not,
 although without the fix there are more ways to trigger this.

 I can reproduce with the following:

 1. Open R in gnome-terminal or xterm

 2. Run 'library(tcltk)'

 3. Run 'trace(tk_select.list, edit = TRUE)'
 and put browser() at the beginning of the onOK body (e.g. in Vim run
 :g/onOK/put ='browser()'). That is, transform

 onOK <- function() {
     res <- 1L + as.integer(tkcurselection(box))
     cat("res is: ", res)
     ans.select_list <<- choices[res]
     tkgrab.release(dlg)
     tkdestroy(dlg)
 }

 to:

 onOK <- function() {
     browser()
     res <- 1L + as.integer(tkcurselection(box))
     cat("res is: ", res)
     ans.select_list <<- choices[res]
     tkgrab.release(dlg)
     tkdestroy(dlg)
 }

 4. Run 'install.packages()'

 5. Double-click on a package

 R becomes unresponsive and I have to kill it.

 > sessionInfo()
 R Under development (unstable) (2013-08-02 r63479)
 Platform: x86_64-unknown-linux-gnu (64-bit)

 locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 Scott


 --
 Scott Kostyshak
 Economics PhD Candidate
 Princeton University

This might be related to PR#14730. I will add this info there unless
someone suggests otherwise.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University



Re: [Rd] R CMD check --outdir=path gives unknown option '--outdir'

2013-07-19 Thread Scott Kostyshak
On Thu, Apr 4, 2013 at 2:06 PM, Henrik Bengtsson h...@biostat.ucsf.edu wrote:
 For 'R CMD check', it appears that option '--outdir' is not recognized
 and generates warning unknown option '--outdir'. R CMD check --help
 says:

 Usage: R CMD check [options] pkgs
 [...]
 Options:
 [...]
   -o, --outdir=DIR  directory used for logfiles, R output, etc.
 (default is 'pkg.Rcheck' in current directory,
 where 'pkg' is the name of the package checked)

 Example:

 mkdir foo

 # Check output is written to foo/
 R CMD check -o foo pkg_0.1.tar.gz

 # Option is ignored and check output is written to bar.Rcheck/
 R CMD check --outdir=foo pkg_0.1.tar.gz
 Warning: unknown option '--outdir=foo'

 # Also tried with:
 R CMD check --outdir foo pkg_0.1.tar.gz
 Warning: unknown option '--outdir'

 R CMD check -outdir=foo pkg_0.1.tar.gz
 Warning: unknown option '-outdir=foo'

 R CMD check -outdir foo pkg_0.1.tar.gz
 Warning: unknown option '-outdir'

 I get this with:

 sessionInfo()
 R version 3.0.0 (2013-04-03)
 Platform: x86_64-w64-mingw32/x64 (64-bit)

 sessionInfo()
 R Under development (unstable) (2013-04-02 r62479)
 Platform: x86_64-w64-mingw32/x64 (64-bit)

 sessionInfo()
 R version 2.15.3 (2013-03-01)
 Platform: x86_64-unknown-linux-gnu (64-bit)

I see the same behavior on 3.0.1 (pre-compiled binaries on Ubuntu
12.04 and 13.04).

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

 Should I report this to http://bugs.r-project.org/?

Did you? If not, please do (or tell me to if you don't have time). I
see nothing in News.Rd on trunk.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University



Re: [Rd] R CMD check --outdir=path gives unknown option '--outdir'

2013-07-19 Thread Scott Kostyshak
On Fri, Jul 19, 2013 at 3:04 AM, Prof Brian Ripley
rip...@stats.ox.ac.uk wrote:
 So please follow the posting guide at
 http://www.r-project.org/posting-guide.html, to wit

 'If you are using an old version of R and think it does not work properly,
 upgrade to the latest version and try that, before posting. If possible, try
 the current R-patched or R-devel version of R (see the FAQ for details), to
 see if the problem has already been addressed.'

OK.

 It has been.

Thanks,

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University



[Rd] Posting Guide: changed link and other comment

2013-07-19 Thread Scott Kostyshak
I have two comments regarding the Posting Guide:

(1) The link in the following sentence did not work for me:

Take care when you quote other people's comments to respect their
rights, e.g., as summarized here[a].

[a] http://www.jiscmail.ac.uk/help/policy/copyright.htm

Has it been changed to the following?
  http://www.jiscmail.ac.uk/policyandsecurity/copyrightissues.html

(2) Regarding the following extract

  `If you feel insulted by some response to a post of yours, don't
make any hasty response in return - you're as likely as not to regret
it.'

wouldn't someone who is `as likely as not to regret it' be indifferent
between sending a hasty response and not sending a hasty response? The
intent is perfectly clear but perhaps `you're _more_ likely than not'
is a more probabilistically correct expression?

Thanks for the helpful document -- it is useful reading for this list
as well as more generally.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University
