[Rd] random network disconnects
I'm experiencing a weird issue, and wondering if anyone has seen this, and better yet has a solution.

At work we are getting lots of 'permission denied' or 'network not found' errors (and so forth) when reading and writing between our machines and a file server. This happens randomly, so the following function works around the problem for 'cat' commands:

catSafer <- function(..., ReTries = 20, ThrowError = TRUE) {
  for (catsi in seq_len(ReTries)) {
    res <- try(cat(...))
    if (!inherits(res, "try-error")) break
  }
  if (inherits(res, "try-error")) {
    if (ThrowError) {
      stop("file connection failed")
    } else {
      warning("file connection failed")
    }
  }
}

People have done network traces and such, but so far nothing has been seen.

Thanks,
Pat

-- Patrick Burns pbu...@pburns.seanet.com http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming')

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
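[Editor's note: the retry idea in catSafer generalizes to any expression. The following is a hypothetical sketch (withRetries is not from the post, and the error message mirrors the original); it re-evaluates an arbitrary expression until it succeeds or the retries run out.

```r
## Evaluate 'expr' up to 'retries' times, returning the first successful
## result; signal an error (or warning) only if every attempt fails.
## 'withRetries' is a hypothetical generalization of catSafer.
withRetries <- function(expr, retries = 20, throwError = TRUE) {
  expr <- substitute(expr)          # capture the expression unevaluated
  for (i in seq_len(retries)) {
    res <- try(eval.parent(expr), silent = TRUE)
    if (!inherits(res, "try-error")) return(res)
  }
  if (throwError) {
    stop("file connection failed")
  } else {
    warning("file connection failed")
    invisible(res)
  }
}
```

For example, `withRetries(cat("x\n", file = "Z:/server/log.txt"))` (path hypothetical) would retry the write just as catSafer does, but the same wrapper also covers readLines, file.copy, and so on.]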
Re: [Rd] Is this a bug in `[`?
This is Circle 8.1.13 of the R Inferno.

On 05/08/2018 06:57, Rui Barradas wrote:
Thanks. This is exactly the doubt I had.
Rui Barradas

Às 05:26 de 05/08/2018, Kenny Bell escreveu:
This should more clearly illustrate the issue:

c(1, 2, 3, 4)[-seq_len(4)]
#> numeric(0)
c(1, 2, 3, 4)[-seq_len(3)]
#> [1] 4
c(1, 2, 3, 4)[-seq_len(2)]
#> [1] 3 4
c(1, 2, 3, 4)[-seq_len(1)]
#> [1] 2 3 4
c(1, 2, 3, 4)[-seq_len(0)]
#> numeric(0)

Created on 2018-08-05 by the reprex package (v0.2.0.9000).

On Sun, Aug 5, 2018 at 3:58 AM Rui Barradas <mailto:ruipbarra...@sapo.pt> wrote:
Às 15:51 de 04/08/2018, Iñaki Úcar escreveu:
> El sáb., 4 ago. 2018 a las 15:32, Rui Barradas (mailto:ruipbarra...@sapo.pt) escribió:
>>
>> Hello,
>>
>> Maybe I am not understanding how negative indexing works, but
>>
>> 1) This is right.
>>
>> (1:10)[-1]
>> #[1] 2 3 4 5 6 7 8 9 10
>>
>> 2) Are these right? They are at least surprising to me.
>>
>> (1:10)[-0]
>> #integer(0)
>>
>> (1:10)[-seq_len(0)]
>> #integer(0)
>>
>> It was the last example that made me ask; seq_len(0) would avoid an
>> if/else or something similar.
>
> I think it's ok, because there is no negative zero integer, so -0 is 0.

Ok, this makes sense, I should have thought about that.

> 1.0/-0L  # Inf
> 1.0/-0.0 # -Inf
>
> And the same can be said for integer(0), which is the result of
> seq_len(0): there is no negative empty integer.

I'm not completely convinced about this one, though. I would expect -seq_len(n) to remove the first n elements from the vector; therefore, when n == 0, it would remove none. And integer(0) is not the same as 0.

(1:10)[-0] == (1:10)[0] == integer(0)  # empty
(1:10)[-seq_len(0)] == (1:10)[-integer(0)]

And I have just reminded myself to run

identical(-integer(0), integer(0))

It returns TRUE, so my intuition is wrong and R is right. End of story.
Thanks for the help,
Rui Barradas

> Iñaki
>
>> Thanks in advance,
>>
>> Rui Barradas

-- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming')

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
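[Editor's note: the conclusions of this thread can be checked directly. A minimal sketch (dropFirst is a hypothetical helper, not from the thread):

```r
x <- 1:10

## -0 is just 0: integers have no signed zero, so nothing is selected
stopifnot(identical(x[-0], integer(0)))

## negating an empty vector is a no-op, so -seq_len(0) also selects nothing
stopifnot(identical(-integer(0), integer(0)))
stopifnot(identical(x[-seq_len(0)], integer(0)))

## a "drop the first n elements" that is safe for n = 0
dropFirst <- function(x, n) x[seq_along(x) > n]
stopifnot(identical(dropFirst(x, 0), x))      # n = 0 removes none
stopifnot(identical(dropFirst(x, 3), 4:10))   # n = 3 removes the first three
```

The logical-comparison form avoids the empty-negative-index trap entirely, at the cost of building an index vector.]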
[Rd] no objects apparent in built package
Here's a weird problem that I hope someone can give me some hints for.

Actions: Build a package of all R functions -- no compiled code. No indication of anything being wrong. 'require' the newly built package. As far as the session is concerned, there is nothing in the package.

This is being done in RStudio on Windows with R version 3.0.2. The package used to work; it stopped working after a very minor change to one function. The zip files that don't work are the same size as the one that does work.

Thanks for any suggestions.

Pat

-- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming')

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] suggested addition to 'install.packages' help file
I suggest that there be an additional sentence in the explanation for the 'repos' argument in the help file for 'install.packages': If the repository is on a local drive, then the string should begin with \code{file:}, e.g., \code{"file:J:/Rrepos"}. Perhaps I'm missing some subtlety, but it makes things work in my case. Pat -- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] wishlist: decreasing argument to is.unsorted
I've just realized that it could be handy to have a 'decreasing' argument in 'is.unsorted'. And I'm cheekily hoping someone else will implement it. It is easy enough to work around (with 'rev'), but would be less hassle with an argument. The case I have in mind uses 'is.unsorted' in 'stopifnot'. Pat -- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
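[Editor's note: in the meantime, the 'rev' workaround Pat mentions fits in a one-line helper. A sketch (isDecreasing is a hypothetical name):

```r
## TRUE when x is in (weakly) decreasing order; reversing the vector lets
## is.unsorted(), which only knows increasing order, do the judging.
## Extra arguments (na.rm, strictly) pass through to is.unsorted().
isDecreasing <- function(x, ...) !is.unsorted(rev(x), ...)

stopifnot(isDecreasing(c(5, 3, 1)))        # usable directly in stopifnot()
stopifnot(!isDecreasing(c(1, 3, 2)))
stopifnot(isDecreasing(c(3, 3, 1)))                    # ties allowed...
stopifnot(!isDecreasing(c(3, 3, 1), strictly = TRUE))  # ...unless strict
```

A built-in 'decreasing' argument would still be less hassle, as the post says, but this keeps the 'stopifnot' use case to one call.]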
Re: [Rd] as.name and namespaces
Here is an example problem:

> mycall <- expression(lm(Y ~ x))[[1]]
> mycall
lm(Y ~ x)
> newname <- "stats::lm"
> desiredResult
stats::lm(Y ~ x)

I've solved the problem in the kludgy way of deparsing, fixing the string and then parsing. I like Duncan's third method, but it seems like it assumes the solution. Moving functions around is unappetizing for my use -- this is for testing, and keeping things as faithful to real use as possible is a good thing.

Pat

On 23/04/2013 21:18, Duncan Murdoch wrote:
On 13-04-23 3:51 PM, Patrick Burns wrote:
Okay, that's a good reason why it shouldn't. Why it should is that I want to substitute the first element of a call to be a function including the namespace.

Three ways:

1. Assign the function from the namespace locally, then call the local one.
2. Import the function in your NAMESPACE (if you know the name in advance).
3. Construct an expression involving ::, and substitute that in. For example:

substitute(foo(x), list(foo = quote(baz::bar)))

Duncan Murdoch

Pat

On 23/04/2013 18:32, peter dalgaard wrote:
On Apr 23, 2013, at 19:23, Patrick Burns wrote:

'as.name' doesn't recognize a name with its namespace extension as a name:

as.name("lm")
lm
as.name("stats::lm")
`stats::lm`
as.name("stats:::lm")
`stats:::lm`

Is there a reason why it shouldn't?

Any reason why it should? :: and ::: are operators. foo$bar is not the same as `foo$bar` either.

-- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming')

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
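[Editor's note: since a call is list-like, Duncan's third method can also be written as a direct assignment to the call's head, with no deparse/parse round trip. A sketch:

```r
## the first element of a call is the function being called; replace it
## with the parse tree for `stats::lm`, i.e. a call to the `::` operator
mycall <- quote(lm(Y ~ x))
mycall[[1]] <- quote(stats::lm)
mycall
## stats::lm(Y ~ x)

## substitute() builds the same call without indexing into it
sub <- substitute(foo(Y ~ x), list(foo = quote(stats::lm)))
stopifnot(identical(mycall, sub))
```

Because the replacement is `quote(stats::lm)` rather than a string, the result is a real call to the namespaced function, not a backtick-quoted name.]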
Re: [Rd] as.name and namespaces
Okay, that's a good reason why it shouldn't. Why it should is that I want to substitute the first element of a call to be a function including the namespace. Pat On 23/04/2013 18:32, peter dalgaard wrote: On Apr 23, 2013, at 19:23 , Patrick Burns wrote: 'as.name' doesn't recognize a name with its namespace extension as a name: as.name("lm") lm as.name("stats::lm") `stats::lm` as.name("stats:::lm") `stats:::lm` Is there a reason why it shouldn't? Any reason why it should? :: and ::: are operators. foo$bar is not the same as `foo$bar` either. -- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] as.name and namespaces
'as.name' doesn't recognize a name with its namespace extension as a name:

> as.name("lm")
lm
> as.name("stats::lm")
`stats::lm`
> as.name("stats:::lm")
`stats:::lm`

Is there a reason why it shouldn't?

Pat

-- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming')

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] ifelse can't return a list? Please explain (R-2.15.3)
When you what you hope for turns out to be wrong, then have a look at 'The R Inferno'. http://www.burns-stat.com/documents/books/the-r-inferno/ It does talk about 'ifelse'. Pat On 25/03/2013 02:21, Paul Johnson wrote: I hope you are doing well. For me, this was an unexpected problem. I've hoped for quite a few wrong things today, but I'm only asking you about this one. Why does ifelse(1, list(a, b, c), list(x, y, z)) return a list with only a, not list(a, b, c) as I hoped. I wish it would either cause an error or return the whole list, not just the first thing. Working example: x <- 1 y <- 2 z <- 3 a <- 4 b <- 5 c <- 6 list(x,y,z) [[1]] [1] 1 [[2]] [1] 2 [[3]] [1] 3 list(a,b,c) [[1]] [1] 4 [[2]] [1] 5 [[3]] [1] 6 ifelse(1, list(a,b,c), list(x,y,z)) [[1]] [1] 4 ifelse(0, list(a,b,c), list(x,y,z)) [[1]] [1] 1 sessionInfo() R version 2.15.3 (2013-03-01) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] rockchalk_1.5.5.10 car_2.0-16 nnet_7.3-5 MASS_7.3-23 loaded via a namespace (and not attached): [1] compiler_2.15.3 tools_2.15.3 I realize I can code around this, but I'm just curious about why ifelse hates me so much :( if (1) myvar <- list(a, b, c) else myvar <- list(x, y, z) myvar [[1]] [1] 4 [[2]] [1] 5 [[3]] [1] 6 myvar <- if (1) list(a, b, c) else list(x, y, z) myvar [[1]] [1] 4 [[2]] [1] 5 [[3]] [1] 6 -- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
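[Editor's note: the behaviour follows from ifelse()'s contract: the result takes its length (and attributes) from the test argument, and elements of yes and no are picked element-wise. A length-one test therefore yields a length-one result, whatever the branches contain. A sketch:

```r
## element-wise selection: one output element per element of the test
stopifnot(all(ifelse(c(TRUE, FALSE, TRUE), 1:3, 101:103) == c(1, 102, 3)))

## the test has length 1, so only the first element of the chosen branch
## survives -- exactly what the original poster observed
res <- ifelse(1, list(4, 5, 6), list(1, 2, 3))
stopifnot(is.list(res), length(res) == 1, res[[1]] == 4)

## for choosing between whole objects, use if () ... else ... instead
myvar <- if (TRUE) list(4, 5, 6) else list(1, 2, 3)
stopifnot(length(myvar) == 3)
```

The rule of thumb: ifelse() is for vectorized, element-wise choices; `if () ... else ...` is for choosing one of two whole objects.]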
[Rd] fortune?
Brian Ripley: If things are not readily available in R it is always good to pause and reflect if there might be a good reason. In the R-help thread: How to get the t-stat for arima()? Pat -- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Fortune?
I think the rule is that you can do anything as long as you don't complain. If you want to complain, you must follow the instructions. -- Jari Oksanen in Re: [Rd] Keeping up to date with R-devel -- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] double bracket stripping names
Is it on purpose that `[[` strips the names when used on an atomic vector?

> c(a=1, b=2)[1]
a
1
> c(a=1, b=2)[[1]]
[1] 1

> sessionInfo()
R Under development (unstable) (2013-02-11 r61902)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

-- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming')

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
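[Editor's note: whatever the rationale, the two operators behave consistently: `[` keeps the name and returns a length-one vector, while `[[` extracts the single bare element. A quick check:

```r
x <- c(a = 1, b = 2)

## single bracket: a length-one *named* vector
stopifnot(identical(x[1], c(a = 1)))

## double bracket: the bare element, name stripped
stopifnot(identical(x[[1]], 1))
stopifnot(is.null(names(x[[1]])))
```

So when the name matters, single-bracket indexing (or `names(x)[1]` alongside `x[[1]]`) is the way to keep it.]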
[Rd] get and exists are not vectorized
Here is the current behavior (in 2.15.2 and 3.0.0):

> exists(c('notLikely', 'exists'))
[1] FALSE
> exists(c('exists', 'notLikely'))
[1] TRUE
> get(c('notLikely', 'exists'))
Error in get(c("notLikely", "exists")) : object 'notLikely' not found
> get(c('exists', 'notLikely'))
function (x, where = -1, envir = if (missing(frame)) as.environment(where)
    else sys.frame(frame), frame, mode = "any", inherits = TRUE)
.Internal(exists(x, envir, mode, inherits))

Both 'exists' and 'get' silently ignore all but the first element. My view is that 'get' should do what it currently does except it should warn about ignoring subsequent elements if there are any. I don't see a reason why 'exists' shouldn't be vectorized.

Am I missing something?

Pat

-- Patrick Burns pbu...@pburns.seanet.com twitter: @burnsstat @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming')

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
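[Editor's note: until/unless base changes, vectorized versions are one-liners. A sketch (existsAll is a hypothetical name; mget() already acts as a vectorized get()):

```r
## check several names at once; each element of nm is tested separately
existsAll <- function(nm) vapply(nm, exists, logical(1))

stopifnot(identical(unname(existsAll(c("exists", "notLikely"))),
                    c(TRUE, FALSE)))

## mget() is the vectorized cousin of get(), with control over misses
vals <- mget(c("exists", "notLikely"), envir = globalenv(),
             ifnotfound = list(NULL), inherits = TRUE)
stopifnot(is.function(vals[["exists"]]), is.null(vals[["notLikely"]]))
```

The 'ifnotfound' argument gives mget() the warn-or-placeholder behavior the post asks of get(), without an error on the first missing name.]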
[Rd] no carriage returns in BATCH output from 2.15.0
It seems like I must be missing something since I haven't been able to find mention of this. Under Windows 7 I'm not getting carriage returns in the output of BATCH files using 2.15.0 (both 64-bit and 32-bit). They are in the startup messages, but not for the real output. Is this on purpose? Pat -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Overwrite S3 method from base package
The key question is: > So, is there any way to overwrite droplevels.factor and > droplevels.data.frame from the base package with my functions? There is "can" and there is "should". I don't know the answer to "can". In regard to "should", I think it is a very very bad idea. The result that someone gets depends on whether or not the new package is attached in the session. That is a recipe for hours (or days) of trying to figure out mysterious behavior. Much better, I think, is to create a new generic. Pat On 13/02/2012 11:59, Thaler,Thorn,LAUSANNE,Applied Mathematics wrote: Dear all, I am developing a package, which bundles my most frequently used functions. One of those is a modified version of droplevels from the base package (basically, it preserves any contrast function which was used to create the factor, contrast matrices are not kept, for they could be wrong if a level is dropped). In my NAMESPACE file I've the following directives, which should export my methods: S3method(droplevels, factor) S3method(droplevels, data.frame) However, when I load my package and I try to use those functions, the dispatching mechanism calls the functions droplevels.factor and droplevels.data.frame from the _base package_. So, is there any way to overwrite droplevels.factor and droplevels.data.frame from the base package with my functions? Or do I have to create a generic function on my own? Thanks for your help. Kind Regards, Thorn __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
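[Editor's note: a sketch of the "new generic" route Pat recommends. Names are hypothetical, and the methods here just delegate to base droplevels rather than reimplementing the contrast-preserving version described in the post:

```r
## a generic with its own name: dispatch never depends on whether the
## package masks base, nor on package attach order
dropLevels2 <- function(x, ...) UseMethod("dropLevels2")

## the real methods would hold the contrast-preserving logic;
## these stubs simply fall through to base::droplevels
dropLevels2.factor     <- function(x, ...) droplevels(x, ...)
dropLevels2.data.frame <- function(x, ...) droplevels(x, ...)

f <- factor(c("a", "b"))[1]   # level "b" is now unused
stopifnot(nlevels(f) == 2, nlevels(dropLevels2(f)) == 1)
```

In a package NAMESPACE this becomes `export(dropLevels2)` plus `S3method(dropLevels2, factor)` and `S3method(dropLevels2, data.frame)`, and there is no clash with base to worry about.]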
Re: [Rd] Task views
I feel compelled to rebuff Barry's attack on the name "Task Views". I think it is a fine description (I was not involved in originating it), though of course I'd be open to better suggestions. "Look at all these things you can do with R" is also nice but lacks a certain amount of brevity.

Pat

On 12/02/2012 10:43, Barry Rowlingson wrote:
On Sat, Feb 11, 2012 at 5:55 PM, Patrick Burns wrote:

Now it could be that people are not trying very hard to solve their own problems, but to be fair it is a pretty gruelling process to find the Task Views. May I suggest that there be a "Task Views" item on the left sidebar of the R website in the Documentation section?

I'd go further, and suggest that the list of Task Views appears on the home page of www.r-project.org under the heading "Look at all these things you can do with R". (Maybe to replace the 8-year-old clustering graphic -- or maybe someone could do something in ggplot2 that looks nice and shiny?)

"Task Views" (stupid name, whose idea was that?) are an absolute GEM and shouldn't be slotted between 'What's New?' and "Search" on CRAN mirror sites. The CRAN Task Views page doesn't even say what "Task Views" are. Here's some text that might help:

"Task Views are short documents outlining the functionality of R in a given field or methodology. Since most of R's power comes from add-on packages downloaded from CRAN, Task Views tend to concentrate on summarising the packages that are relevant. If you ever find yourself thinking 'how do I do X in R?' then the list of Task Views should be your first stop."

Barry

-- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno')

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Task views (was: Re: [R] Schwefel Function Optimization)
On 11/02/2012 08:25, Hans W Borchers wrote: Vartanian, Ara indiana.edu> writes: All, I am looking for ... Why is it necessary over and over again to point to the Optimization Task View? ... Now it could be that people are not trying very hard to solve their own problems, but to be fair it is a pretty gruelling process to find the Task Views. May I suggest that there be a "Task Views" item on the left sidebar of the R website in the Documentation section? -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] strange behavior from cex="*"
Someone ambitious could find problems like this using random input testing like I talked about at useR last summer.

http://www.burns-stat.com/pages/Present/random_input_test_annotated.pdf

Testing graphics would be more labor intensive than the testing I do, but you could think of it as a video game.

On 17/11/2011 00:29, Duncan Murdoch wrote:
On 11-11-16 5:26 PM, Ben Bolker wrote:
> On 11-11-16 05:18 PM, peter dalgaard wrote:
>>
>> On Nov 16, 2011, at 22:38, Ben Bolker wrote:
>>
>>> Someone inquired on StackOverflow about apparently non-deterministic
>>> graphics behaviour in R. I noticed that they were using cex="*" and
>>> discovered some potentially weird behavior.
>>
>> It can be reproduced much more simply (well, not the hang, but bad enough):
>>
>> In a plain R application console (OSX Snow Leopard),
>>
>> for (i in 1:100) plot(1:10, cex="*")
>>
>> will _sometimes_ show big circles, indicating random data being picked up.
>>
>> The "cex" is by definition numeric, so you can't expect to be able to
>> pass a character string, but the code should check.
>
> Looks (?) like the check could go in FixupCex (which already tests for
> isReal, isInteger, and isLogical) in src/main/plot.c, unless there is a
> wish to catch it earlier/in R code.

Yes, that's where the check was missed. I'll fix it. The other parameters appear to have been checked properly.

> It's mildly surprising to me that people can continue to find odd
> cases like this after more than 10 years (and imagine how many
> cumulative hours of R use ...). [I'm assuming that this hole has been
> present for a long time: I don't have the patience to do the SVN
> archaeology to find out how long.]

So now you can prove me wrong about the other parameters...

Duncan Murdoch

>>> On repeated runs of the same code I can get different PNGs. If I set
>>> the number of runs high enough, I seem to be able to get R to hang.
>>> If I do a single run plotting to an interactive graphics window I
>>> can get the point sizes to jump around as I resize the window (someone
>>> reported being able to reproduce that behaviour in the Windows GUI
>>> as well).
>>>
>>> This is clearly a user error, but non-deterministic behaviour (and
>>> hanging) are a little disturbing.
>>>
>>> I haven't had a chance yet to try to dig in and see what's happening
>>> but thought I would report to see if anyone else could reproduce/figure
>>> it out.
>>>
>>> Ben Bolker
>>>
>>> ## n <- 100  ## hangs R
>>>
>>> n <- 33
>>>
>>> fn <- paste("tmp", seq(n), "png", sep = ".")
>>> for (i in seq(n)) {
>>>   png(fn[i])
>>>   plot(1:10, 1:10, cex = "*")
>>>   dev.off()
>>> }
>>>
>>> ff <- subset(file.info(fn), select = size)
>>> ff <- ff[!duplicated(ff$size), , drop = FALSE]
>>> table(ff$size)
>>> require(png)
>>> pngs <- lapply(rownames(ff), readPNG)
>>>
>>> png.to.img <- function(x) matrix(rgb(x[,,1], x[,,2], x[,,3]),
>>>   nrow = dim(x)[1], ncol = dim(x)[2])
>>>
>>> imgs <- lapply(pngs, png.to.img)
>>>
>>> par(mfrow = c(2,2))
>>> lapply(imgs, function(x) {
>>>   plot(0:1, 0:1, type = "n", ann = FALSE, axes = FALSE)
>>>   rasterImage(x, 0, 0, 1, 1)
>>> })
>>>
>>>> sessionInfo()
>>> R Under development (unstable) (2011-10-06 r57181)
>>> Platform: i686-pc-linux-gnu (32-bit)
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] glmmADMB_0.6.5 MASS_7.3-14 png_0.1-3
>>>
>>> loaded via a namespace (and not attached):
>>> [1] grid_2.15.0 lattice_0.19-33 nlme_3.1-102 tools_2.15.0

-- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno')

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
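[Editor's note: a minimal sketch of the random input testing idea, with invented names (fuzz and its interface are hypothetical, not from the linked slides): draw random arguments, call the function, and keep the inputs that trigger errors.

```r
## call 'fun' n times with arguments drawn from generator 'gen' and
## return the list of inputs that made it throw an error
fuzz <- function(fun, gen, n = 100) {
  bad <- list()
  for (i in seq_len(n)) {
    arg <- gen()
    res <- tryCatch(fun(arg), error = function(e) e)
    if (inherits(res, "error")) bad[[length(bad) + 1]] <- arg
  }
  bad
}

## example: log() accepts numbers and logicals but rejects character
set.seed(42)
gen <- function() sample(list(10, -1, "oops", TRUE), 1)[[1]]
failures <- fuzz(log, gen, n = 50)
stopifnot(all(vapply(failures, is.character, logical(1))))
```

A graphics version would replace the tryCatch with drawing to a device and eyeballing (or diffing) the output, which is the "video game" part.]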
Re: [Rd] Efficiency of factor objects
Perhaps 'data.table' would be a package on CRAN that would be acceptable. On 05/11/2011 16:45, Jeffrey Ryan wrote: Or better still, extend R via the mechanisms in place. Something akin to a fast factor package. Any change to R causes downstream issues in (hundreds of?) millions of lines of deployed code. It almost seems hard to fathom that a package for this doesn't already exist. Have you searched CRAN? Jeff On Sat, Nov 5, 2011 at 11:30 AM, Milan Bouchet-Valat wrote: Le vendredi 04 novembre 2011 à 19:19 -0400, Stavros Macrakis a écrit : R factors are the natural way to represent factors -- and should be efficient since they use small integers. But in fact, for many (but not all) operations, R factors are considerably slower than integers, or even character strings. This appears to be because whenever a factor vector is subsetted, the entire levels vector is copied. Is it so common for a factor to have so many levels? One can probably argue that, in that case, using a numeric or character vector is preferred - factors are no longer the "natural way" of representing this kind of data. Adding code to fix a completely theoretical problem is generally not a good idea. I think you'd have to come up with a real use case to hope convincing the developers a change is needed. There are probably many more interesting areas where speedups can be gained than that. Regards __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] median and data frames
If Martin's proposal is accepted, does that mean that the median method for data frames would be something like: function (x, ...) { stop(paste("you probably mean to use the command: sapply(", deparse(substitute(x)), ", median)", sep="")) } Pat On 29/04/2011 15:25, Martin Maechler wrote: Paul Johnson on Thu, 28 Apr 2011 00:20:27 -0500 writes: > On Wed, Apr 27, 2011 at 12:44 PM, Patrick Burns >wrote: >> Here are some data frames: >> >> df3.2<- data.frame(1:3, 7:9) >> df4.2<- data.frame(1:4, 7:10) >> df3.3<- data.frame(1:3, 7:9, 10:12) >> df4.3<- data.frame(1:4, 7:10, 10:13) >> df3.4<- data.frame(1:3, 7:9, 10:12, 15:17) >> df4.4<- data.frame(1:4, 7:10, 10:13, 15:18) >> >> Now here are some commands and their answers: >>> median(df4.4) >> [1] 8.5 11.5 >>> median(df3.2[c(1,2,3),]) >> [1] 2 8 >>> median(df3.2[c(1,3,2),]) >> [1] 2 NA >> Warning message: >> In mean.default(X[[2L]], ...) : >>argument is not numeric or logical: returning NA >> >> >> >> The sessionInfo is below, but it looks >> to me like the present behavior started >> in 2.10.0. >> >> Sometimes it gets the right answer. I'd >> be grateful to hear how it does that -- I >> can't figure it out. >> > Hello, Pat. > Nice poetry there! I think I have an actual answer, as opposed to the > usual crap I spew. > I would agree if you said median.data.frame ought to be written to > work columnwise, similar to mean.data.frame. > apply and sapply always give the correct answer >> apply(df3.3, 2, median) > X1.3 X7.9 X10.12 > 2 8 11 [...] exactly > mean.data.frame is now implemented as > mean.data.frame<- function(x, ...) sapply(x, mean, ...) exactly. My personal oppinion is that mean.data.frame() should never have been written. People should know, or learn, to use apply functions for such a task. The unfortunate fact that mean.data.frame() exists makes people think that median.data.frame() should too, and then var.data.frame() sd.data.frame() mad.data.frame() min.data.frame() max.data.frame() ... ... 
all just in order *not* to have to know sapply().

No, rather not. My vote is for deprecating mean.data.frame().

Martin

-- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno')

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] median and data frames
Here are some data frames:

df3.2 <- data.frame(1:3, 7:9)
df4.2 <- data.frame(1:4, 7:10)
df3.3 <- data.frame(1:3, 7:9, 10:12)
df4.3 <- data.frame(1:4, 7:10, 10:13)
df3.4 <- data.frame(1:3, 7:9, 10:12, 15:17)
df4.4 <- data.frame(1:4, 7:10, 10:13, 15:18)

Now here are some commands and their answers:

> median(df3.2)
[1] 2 8
> median(df4.2)
[1] 2.5 8.5
> median(df3.3)
NA
1 7
2 8
3 9
> median(df4.3)
NA
1 7
2 8
3 9
4 10
> median(df3.4)
[1] 8 11
> median(df4.4)
[1] 8.5 11.5
> median(df3.2[c(1,2,3),])
[1] 2 8
> median(df3.2[c(1,3,2),])
[1] 2 NA
Warning message:
In mean.default(X[[2L]], ...) :
  argument is not numeric or logical: returning NA

The sessionInfo is below, but it looks to me like the present behavior started in 2.10.0.

Sometimes it gets the right answer. I'd be grateful to hear how it does that -- I can't figure it out.

Under the current regime we can get numbers that are correct, partially correct, or sort of random (given the intention). I claim that much better behavior would be to always get exactly one of the following:

* a numeric answer (that is consistently correct)
* an error

I would think a method in analogy to 'mean.data.frame' would be a logical choice. But I'm presuming there might be an argument against that, or 'median.data.frame' would already exist.
> sessionInfo() R version 2.13.0 (2011-04-13) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United Kingdom.1252 [2] LC_CTYPE=English_United Kingdom.1252 [3] LC_MONETARY=English_United Kingdom.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United Kingdom.1252 attached base packages: [1] graphics grDevices utils datasets stats methods base other attached packages: [1] xts_0.8-0 zoo_1.6-5 loaded via a namespace (and not attached): [1] grid_2.13.0 lattice_0.19-23 tools_2.13.0 -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
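[Editor's note: whatever base ends up doing, the consistently correct answer is always available by applying median column-by-column, as the thread discusses for mean.data.frame. A sketch:

```r
df3.2 <- data.frame(1:3, 7:9)
df3.3 <- data.frame(1:3, 7:9, 10:12)

## one median per column, regardless of row order
m <- vapply(df3.3, median, numeric(1))
stopifnot(all(m == c(2, 8, 11)))

## reordering the rows no longer changes the answer
stopifnot(all(vapply(df3.2[c(1, 3, 2), ], median, numeric(1)) == c(2, 8)))
```

Using vapply rather than sapply pins the result to one number per column, so a misbehaving column raises an error instead of silently reshaping the output.]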
Re: [Rd] R vs. C
Claudia, I think we agree. Having the examples run in the tests is a good thing, I think. They might strengthen the tests some (especially if there are no other tests). But mainly if examples don't work, then it's hard to have much faith in the code. On 18/01/2011 11:36, Claudia Beleites wrote: On 01/18/2011 10:53 AM, Patrick Burns wrote: I'm not at all a fan of thinking of the examples as being tests. Examples should clarify the thinking of potential users. Tests should clarify the space in which the code is correct. These two goals are generally at odds. Patrick, I completely agree with you that - Tests should not clutter the documentation and go to their proper place. - Examples are there for the user's benefit - and must be written accordingly. - Often, test should cover far more situations than good examples. Yet it seems to me that (part of the) examples are justly considered a (small) subset of the tests: As a potential user, I reqest two things from good examples that have an implicit testing message/side effect: - I like the examples to roughly outline the space in which the code works: they should tell me what I'm supposed to do. - Depending on the function's purpose, I like to see a demonstration of the correctness for some example calculation. (I don't want to see all further tests - I can look them up if I feel the need) The fact that the very same line of example code serves a testing (side) purpose doesn't mean that it should be copied into the tests, does it? Thus, I think of the "public" part (the "preface") of the tests living in the examples. My 2 ct, Best regards, Claudia On 17/01/2011 22:15, Spencer Graves wrote: Hi, Paul: The "Writing R Extensions" manual says that *.R code in a "tests" directory is run during "R CMD check". I suspect that many R programmers do this routinely. I probably should do that also. However, for me, it's simpler to have everything in the "examples" section of *.Rd files. 
I think the examples with independently developed answers provides useful documentation. Spencer On 1/17/2011 1:52 PM, Paul Gilbert wrote: Spencer Would it not be easier to include this kind of test in a small file in the tests/ directory? Paul -Original Message- From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Spencer Graves Sent: January 17, 2011 3:58 PM To: Dominick Samperi Cc: Patrick Leyshock; r-devel@r-project.org; Dirk Eddelbuettel Subject: Re: [Rd] R vs. C For me, a major strength of R is the package development process. I've found this so valuable that I created a Wikipedia entry by that name and made additions to a Wikipedia entry on "software repository", noting that this process encourages good software development practices that I have not seen standardized for other languages. I encourage people to review this material and make additions or corrections as they like (or sent me suggestions for me to make appropriate changes). While R has other capabilities for unit and regression testing, I often include unit tests in the "examples" section of documentation files. To keep from cluttering the examples with unnecessary material, I often include something like the following: A1<- myfunc() # to test myfunc A0<- ("manual generation of the correct answer for A1") \dontshow{stopifnot(} # so the user doesn't see "stopifnot(" all.equal(A1, A0) # compare myfunc output with the correct answer \dontshow{)} # close paren on "stopifnot(". This may not be as good in some ways as a full suite of unit tests, which could be provided separately. However, this has the distinct advantage of including unit tests with the documentation in a way that should help users understand "myfunc". (Unit tests too detailed to show users could be completely enclosed in "\dontshow". 
Spencer On 1/17/2011 11:38 AM, Dominick Samperi wrote: On Mon, Jan 17, 2011 at 2:08 PM, Spencer Graves< spencer.gra...@structuremonitoring.com> wrote: Another point I have not yet seen mentioned: If your code is painfully slow, that can often be fixed without leaving R by experimenting with different ways of doing the same thing -- often after profiling your code to find the slowest part as described in chapter 3 of "Writing R Extensions". If I'm given code already written in C (or some other language), unless it's really simple, I may link to it rather than recode it in R. However, the problems with portability, maintainability, transparency to others who may not be very facile with C, etc., all suggest that it's well worth some effort experimenting with alternate ways of doing the same thing in R before jumping to C or something else. Hope this helps. Spencer On 1/17/2011 10:57 AM, David Henderson wrote: I t
Re: [Rd] R vs. C
sp that makes it easy to mix and match functions (using classes and generic functions), many of which are written in C (or C++ or Fortran) for performance reasons. Like any object-based system there can be a lot of object copying, and like any functional programming system, there can be a lot of function calls, resulting in poor performance for some applications. If you can vectorize your R code then you have effectively found a way to benefit from somebody else's C code, thus saving yourself some time. For operations other than pure vector calculations you will have to do the C/C++ programming yourself (or call a library that somebody else has written). Dominick - Original Message From: Dirk Eddelbuettel To: Patrick Leyshock Cc: r-devel@r-project.org Sent: Mon, January 17, 2011 10:13:36 AM Subject: Re: [Rd] R vs. C On 17 January 2011 at 09:13, Patrick Leyshock wrote: | A question, please about development of R packages: | | Are there any guidelines or best practices for deciding when and why to | implement an operation in R, vs. implementing it in C? The "Writing R | Extensions" recommends "working in interpreted R code . . . this is normally | the best option." But we do write C-functions and access them in R - the | question is, when/why is this justified, and when/why is it NOT justified? | | While I have identified helpful documents on R coding standards, I have not | seen notes/discussions on when/why to implement in R, vs. when to implement | in C. The (still fairly recent) book 'Software for Data Analysis: Programming with R' by John Chambers (Springer, 2008) has a lot to say about this. John also gave a talk in November which stressed 'multilanguage' approaches; see e.g. http://blog.revolutionanalytics.com/2010/11/john-chambers-on-r-and-multilingualism.html In short, it all depends, and it is unlikely that you will get a coherent answer that is valid for all circumstances. 
We all love R for how expressive and powerful it is, yet there are times when something else is called for. Exactly when that time is depends on a great many things and you have not mentioned a single metric in your question. So I'd start with John's book. Hope this helps, Dirk __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
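Dominick's point earlier in the thread, that vectorizing R code is a way of benefiting from somebody else's C, can be made concrete. A small sketch (the squared-sum task is made up for illustration): the loop iterates once per element at the R level, while the vectorized form lets `^` and sum() iterate in compiled code.

```r
x <- as.numeric(1:1e5)

## Interpreted loop: one R-level iteration per element.
loop_sum_sq <- function(x) {
  total <- 0
  for (xi in x) total <- total + xi^2
  total
}

## Vectorized: the element-wise '^' and the reduction in sum()
## both run in C, so this is typically much faster.
vec_sum_sq <- function(x) sum(x^2)

stopifnot(all.equal(loop_sum_sq(x), vec_sum_sq(x)))
```

When no such vectorization exists for the operation, that is exactly the situation where writing the C yourself starts to pay.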
Re: [Rd] R vs. C
Everyone has their own utility function. Mine is whether the boredom of waiting for the pure R function to finish is going to outweigh the boredom of writing the C code. Another issue is that adding C code increases the hassle of users who might want the code to run on different architectures. On 17/01/2011 17:13, Patrick Leyshock wrote: A question, please about development of R packages: Are there any guidelines or best practices for deciding when and why to implement an operation in R, vs. implementing it in C? The "Writing R Extensions" recommends "working in interpreted R code . . . this is normally the best option." But we do write C-functions and access them in R - the question is, when/why is this justified, and when/why is it NOT justified? While I have identified helpful documents on R coding standards, I have not seen notes/discussions on when/why to implement in R, vs. when to implement in C. Thanks, Patrick On Sun, Jan 16, 2011 at 3:00 AM, wrote: Send R-devel mailing list submissions to r-devel@r-project.org To subscribe or unsubscribe via the World Wide Web, visit https://stat.ethz.ch/mailman/listinfo/r-devel or, via email, send a message with subject or body 'help' to r-devel-requ...@r-project.org You can reach the person managing the list at r-devel-ow...@r-project.org When replying, please edit your Subject line so it is more specific than "Re: Contents of R-devel digest..." Today's Topics: 1.
RPostgreSQL 0.1.7 for Windows 64 causes R.2.12.1 Win64 crash (Xiaobo Gu) -- Message: 1 Date: Sat, 15 Jan 2011 10:34:55 +0800 From: Xiaobo Gu To: r-devel@r-project.org Subject: [Rd] RPostgreSQL 0.1.7 for Windows 64 causes R.2.12.1 Win64 crash Message-ID: Content-Type: text/plain; charset=ISO-8859-1 Hi, I built the binary package file of RPostgreSQL 0.1.7 for Windows 2003 Server R2 64 bit SP2; the software environments are as follows: R 2.12.1 for Win64 RTools212 for Win64 DBI 0.2.5 RPostgreSQL 0.1.7 Postgresql related binaries shipped with postgresql-9.0.2-1-windows_x64.exe from EnterpriseDB The package can be loaded, and the driver can be created, but the dbConnect function crashes the whole RGui: driver <- dbDriver("PostgreSQL") con <- dbConnect(driver, dbname="demo", host="192.168.8.1", user="postgres", password="postgres", port=5432) -- __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel End of R-devel Digest, Vol 95, Issue 14 -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] minor problem in strsplit help file
The 'extended' argument to 'strsplit' has been removed, but it is still mentioned in the argument items in the help file for 'fixed' and 'perl'. -- Patrick Burns pbu...@pburns.seanet.com twitter: @portfolioprobe http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] symbol and symbols help files
I think it makes sense to have 'symbol' in the See Also of 'symbols' and vice versa. -- Patrick Burns pbu...@pburns.seanet.com http://www.portfolioprobe.com/blog http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] arr.ind argument to which.min and which.max
On 06/07/2010 10:53, Martin Maechler wrote: [ ... ] Wouldn't it make more sense to call arrayInd(which.min(mat), dim(mat)) instead of which.min(mat, arr.ind = TRUE) in the spirit of modularity, maintainability, ... ? Honestly, in my first reply I had forgotten about my own arrayInd() modularization Yes. Then I guess the suggested change is to put 'arrayInd' in the See Also and Examples for 'which.min' for dunderheads like me that don't think of it themselves. >>> If the order of the if condition were reversed, then >>> possibly the slight reduction in speed of 'which.min' >>> and 'which.max' would be more than made up for in the >>> slight increase in speed of 'which'. thanks for the hint, but "increase in speed of 'which'" -- really, can you measure that? I doubt it. (I'll reverse the order anyway) If we are interested in speed increase, we should add an option to *not* work with dimnames at all (*) and if we have programmer time left, we could take it .Internal() and get a real boost... not now though. (*) I'm doing that for now, *and* I would like to change the default behavior of arrayInd(), but of course *not* the default behavior of which(), to *not* attach dimnames to the result, by default. I.e., I'm proposing to add 'useNames = FALSE' as argument to arrayInd() but have which() call arrayInd(..., useNames=TRUE). This is a back-compatibility change in arrayInd() -- which has existed only since 2.11.0 anyway, so would seem ok, to me. Opinions ? I find it hard to believe that would cause too much trauma. Pat -- Martin -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
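Martin's arrayInd() suggestion in action, as a short sketch with a made-up 2x2 matrix: which.min() returns the 1-d, column-major position, and arrayInd() remaps it to the array index that an arr.ind = TRUE argument would have delivered.

```r
mat <- matrix(c(3, 1, 2, 4), nrow = 2)

## 1-d position of the minimum, as which.min() returns it ...
pos <- which.min(mat)            # 2, counting column-major

## ... remapped to an array (row, column) index.
ij <- arrayInd(pos, dim(mat))    # row 2, column 1
```

This is the modularity argument: the remapping lives in one place and works for the result of which.min(), which.max(), or anything else producing 1-d positions.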
Re: [Rd] arr.ind argument to which.min and which.max
On 05/07/2010 10:56, Martin Maechler wrote: "PatB" == Patrick Burns on Sun, 04 Jul 2010 09:43:44 +0100 writes: PatB> Is there a reason that 'which.min' and PatB> 'which.max' don't have an 'arr.ind' PatB> argument? well, help(which.min) tells you that they really were aimed at doing their job *fast* for vectors. Of course you are right and a generalization to arrays might be convenient at times. PatB> The context in which I wanted that was PatB> a grid search optimization, which seems PatB> like it would be reasonably common to me. well, as the author of these two functions, I can only say "patches are welcome!" and I think should be pretty simple, right ? You just have to do very simple remapping of the 1d index 'i' back to the array index, i.e., the same operation you need to transform seconds into days:hours:minutes:seconds {{and yes, we old-timers may recall that APL had an operator (I think "T-bar") to do that ...} I think the exercise is just to copy the definition of 'which' and add four characters. If the order of the if condition were reversed, then possibly the slight reduction in speed of 'which.min' and 'which.max' would be more than made up for in the slight increase in speed of 'which'. Pat Martin Maechler, ETH Zurich PatB> -- PatB> Patrick Burns PatB> pbu...@pburns.seanet.com PatB> http://www.burns-stat.com PatB> (home of 'Some hints for the R beginner' PatB> and 'The R Inferno') -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] arr.ind argument to which.min and which.max
Is there a reason that 'which.min' and 'which.max' don't have an 'arr.ind' argument? The context in which I wanted that was a grid search optimization, which seems like it would be reasonably common to me. -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] proposed change to 'sample'
There is a weakness in the 'sample' function that is highlighted in the help file. The 'x' argument can be either the vector from which to sample, or the maximum value of the sequence from which to sample. This can be ambiguous if the length of 'x' is one. I propose adding an argument that allows the user (programmer) to avoid that ambiguity:

function (x, size, replace = FALSE, prob = NULL,
          max = length(x) == 1L && is.numeric(x) && x >= 1)
{
    if (max) {
        if (missing(size)) size <- x
        .Internal(sample(x, size, replace, prob))
    } else {
        if (missing(size)) size <- length(x)
        x[.Internal(sample(length(x), size, replace, prob))]
    }
}

This just takes the condition of the first 'if' to be the default value of the new 'max' argument. So in the "surprise" section of the examples in the 'sample' help file

sample(x[x > 9])

and

sample(x[x > 9], max=FALSE)

have different behaviours. By the way, I'm certainly not convinced that 'max' is the best name for the argument. -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
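The ambiguity the proposal targets is easy to reproduce; a short sketch of the "surprise" from the help file:

```r
x <- 1:10

## Two elements survive the subset: sample() permutes them, length 2.
length(sample(x[x > 8]))   # 2

## Only the single element 10 survives: sample() silently switches to
## sampling from 1:10, so the result has length 10 rather than 1.
length(sample(x[x > 9]))   # 10
```

The proposed max=FALSE argument would pin down the first interpretation, permuting the one-element vector, regardless of how many elements the subset happens to leave.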
Re: [Rd] Package development process?
I agree with Hadley, and add that trying to have an example be both an example and a test may not be good for the example aspect either. Examples should make people who are ignorant of the function twig as to how the function works. Creating good examples is hard. Problems that really test the software are unlikely to serve as a good example. Good examples are unlikely to seriously test the code. (But you do want the examples to run, it is seriously bad advertising when they don't.) Pat On 16/06/2010 15:53, Hadley Wickham wrote: What about the encouragement to add unit tests, if only disguised as examples? Examples are not unit tests. Examples are a convenient way of testing some aspects of the package, but serve a rather different purpose to tests. The R community does not emphasize testing nearly as much as other communities. For example, Ruby has a very strong testing culture including at least 10 different unit testing frameworks. I've found the unit tests to be a powerful tool to help improve and maintain the quality of packages to which I contribute. To this end, Sundar and I added a column "Autochecks" to the table of "Selected Repositories" in the Wikipedia article on "Software repository" (http://en.wikipedia.org/wiki/Software_repository), and we describe it briefly in the text introducing that table. Unfortunately your description highlights an inadequacy of R and poor software development procedures on the part of many R package developers (including me!). For exactly the reason you discuss, it's never a good idea for your package to depend on the most current version of another package. If you do, changes to that package might break yours. Most other package management systems allow you to specify the version of a dependency so that this doesn't happen. You can do this with R, but because it's hard to have multiple versions of the same package installed at the same time, it's not as useful.
Hadley -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] print(big+small*1i) -> big + 0i
I'm not sure it's awful either, but it is surprising -- at least to my eye. In particular, it is using the same amount of real estate as it would to print the "right" value. Pat On 25/03/2010 01:14, William Dunlap wrote: Should both parts of a complex number be printed to the same precision? The imaginary part of 0 looks a bit odd when log10(real/imag)>=~ getOption("digits"), but I'm not sure it is awful. Some people might expect the same number of significant digits in the two parts. 1e7+4i [1] 1000+0i 1e7+5i [1] 1000+0i 1e10 + 1000i [1] 1e+10+0e+00i getOption("digits") [1] 7 options(digits=4) 1e4+4i [1] 1+0i 1e7+1000i [1] 1000+0i version _ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status Under development (unstable) major 2 minor 11.0 year 2010 month 03 day07 svn rev51225 language R version.string R version 2.11.0 Under development (unstable) (2010-03-07 r51225) Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Why is there no c.factor?
The argument I have in 'The R Inferno' is that how you want to combine factors may differ from someone else's desires. There are lots of tricky questions: What about ordered factors? What if the ordered levels are different in different objects? ... Pat On 04/02/2010 15:53, Hadley Wickham wrote: Hi all, Is there are reason that there is no c.factor method? Analogous to c.Date, I'd expect something like the following to be useful: c.factor<- function(...) { factors<- list(...) levels<- unique(unlist(lapply(factors, levels))) char<- unlist(lapply(factors, as.character)) factor(char, levels = levels) } c(factor("a"), factor("b"), factor(c("c", "b","a")), factor("d")) # [1] a b c b a d # Levels: a b c d Hadley -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'The R Inferno' and 'A Guide for the Unwilling S User') __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
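Hadley's sketch from the quoted message, in runnable form. It is renamed c_factor here so that no S3 method is attached to c() while trying it out; as Pat notes, it makes one particular choice (union of levels, in first-seen order of the level sets) and ignores ordered factors entirely:

```r
c_factor <- function(...) {
  factors <- list(...)
  levs <- unique(unlist(lapply(factors, levels)))
  chars <- unlist(lapply(factors, as.character))
  factor(chars, levels = levs)
}

res <- c_factor(factor("a"), factor("b"),
                factor(c("c", "b", "a")), factor("d"))
res
# [1] a b c b a d
# Levels: a b c d
```

Combining two ordered factors with different level orderings through this function would silently drop the ordering, which is one concrete form of the design question raised above.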
[Rd] available.packages and 2.10.0
I'm confused about how to change a repository so that it doesn't hit the bug in 'available.packages' in 2.10.0 that was fixed in 2.10.1. I presume it involves adding fields to the PACKAGES file. Patrick Burns patr...@burns-stat.com +44 (0)20 8525 0696 http://www.burns-stat.com (home of "The R Inferno" and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] n=1 default for random number generators
Duncan Murdoch wrote: On 11/16/2009 11:00 AM, Richard Cotton wrote: One tiny thing that would be very nice to have is a default value of n=1 in the random number generator functions, enabling, e.g., runif() instead of runif(1). This won't break anyone's existing code and ought to be relatively straightforward to do. Is there anyone in the core team who would be willing to include this change? I doubt it. Even if you put together the patch (and I didn't see an offer to do so), merging it into the trunk code would take more work than you'd save in several years of typing the "1". Duncan Murdoch In the spirit of sour grapes, the proposed default might discourage some users from vectorizing their thinking. Patrick Burns patr...@burns-stat.com +44 (0)20 8525 0696 http://www.burns-stat.com (home of "The R Inferno" and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Wish: more explicit error message with missing argument in c
Consider: > c(,2) Error: argument is missing, with no default That there is no traceback is unfortunate but understandable. If such a mistake were made like this, there wouldn't be much problem. But the mistake is likely to be made in more complicated settings: > rbind(c(,2), c(3,4)) Error in rbind(c(, 2), c(3, 4)) : argument is missing, with no default > traceback() 1: rbind(c(, 2), c(3, 4)) So naive (and not-so-naive) users will be led to think the problem is with 'rbind' and not with 'c'. If the message could be: Error: argument to 'c' is missing, with no default Then that would be satisfactory, I think. Patrick Burns patr...@burns-stat.com +44 (0)20 8525 0696 http://www.burns-stat.com (home of "The R Inferno" and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
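The failure can also be captured programmatically, which shows what a user has to work with; the exact wording of the message varies across R versions, so nothing about it is assumed here beyond its existence:

```r
## Capture the error instead of stopping. Whether the condition points
## at rbind() or at c() depends on the R version; either way the user
## gets no direct indication that the stray comma inside c() is to blame.
msg <- tryCatch(rbind(c(, 2), c(3, 4)),
                error = function(e) conditionMessage(e))
msg
```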
Re: [Rd] how to document stuff most users don't want to see
Under the system of development we now have, I agree with Seth's assertion. But if there were people dedicated to documentation, then I think something like what I described could be workable. Pat Seth Falcon wrote: Writing good documentation is hard. I can appreciate the desire to find technological solutions that improve documentation. However, the benefit of a help system that allows for varying degrees of verbosity is very likely to be overshadowed by the additional complexity imposed on the help system. Users would need to learn how to tune the help system. Developers would need to learn and follow the system of variable verbosity. This time would be better spent by developers simply improving the documentation and by users by simply reading the improved documentation. My $0.02. + seth __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] how to document stuff most users don't want to see
I think the problem is more subtle than Spencer implies. It is good to have as much documentation as possible. However, if a help file is long and complex, then people get intimidated and don't read it at all. It would be nice to have a feature so that help files can be displayed with different levels of detail. A sophisticated version of this scheme might even assume different levels of knowledge of the user so that the least detailed level might be longer (but easier) than a more detailed level. Patrick Burns patr...@burns-stat.com +44 (0)20 8525 0696 http://www.burns-stat.com (home of "The R Inferno" and "A Guide for the Unwilling S User") spencerg wrote: There are many arguments in many functions that are rarely used. I prefer to see it all documented in the help pages. If they are not documented in the help pages (and sometimes even if they are), a user who wants them can invent other ways to get similar information with much greater effort, and do so for years only to eventually find a much easier way buried in the documentation. Example: I was frustrated for years that "nls" would refuse to produce output if it did not converge. I often used "optim" instead of "nls" for that reason. However, the list returned by "optim" does not have the nice methods that one can use with an "nls" object. Eventually, I found the "warnOnly" option documented under "nls.control", which has made "nls" easier for me to use. Spencer Graves William Dunlap wrote: There are several help files in the R sources that describe concepts and not particular R objects. E.g., help(Methods), help(Syntax), and help(regex). They don't have a docType entry and their alias entries do not refer to functions. Perhaps your debugging documentation could go into a similar *.Rd file. Does check balk at such help files in a package? Should it? Should there be a special docType for such help files? 
Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -----Original Message----- From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Charles Geyer Sent: Monday, October 05, 2009 10:51 AM To: r-devel@r-project.org Subject: [Rd] how to document stuff most users don't want to see The functions metrop and temper in the mcmc package have a debug = FALSE argument that when TRUE adds a lot of debugging information to the returned list. This is absolutely necessary to test the functions, because one generally knows nothing about the simulated distribution except what one learns from MCMC samples. Hence you must expose all details of the simulation to have any hope of checking that it is doing what it is supposed to do. However, this information is of interest mostly (perhaps solely) to developers. So I didn't document it in the Rd files for these functions. But it has occurred to me that people might be interested in how these functions are validated, and I would like to document the debug output somewhere, but I don't want to clutter up the documentation that ordinary users see. That suggests a separate help page for debugging. Looking at "Writing R Extensions" it doesn't seem like there is a type of Rd file for this purpose. I suppose it could be added in (fairly long) sections titled "Debug Output" in metrop.Rd and temper.Rd or it could be put in a package help page (although that's not what that kind of page is really for). Any other possibilities to consider? -- Charles Geyer Professor, School of Statistics University of Minnesota char...@stat.umn.edu __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] question
One idea of program design is that users should be protected against themselves. It is my experience that users, especially novices, tend to over-split items rather than over-clump items. The fact that items are returned by the same function call would argue to me that there is a connection between the items. Patrick Burns patr...@burns-stat.com +44 (0)20 8525 0696 http://www.burns-stat.com (home of "The R Inferno" and "A Guide for the Unwilling S User") ivo welch wrote: hi gabor: this would be difficult to do. I don't think you want to read my programs. it would give you an appreciation of what ugly horror programs end users can write in the beautiful R language ;-). clearly, one can work around the lack of such a feature. multiple-return values are syntax sugar. but maybe it helps to explain how I got to my own view. I had to send an R program to someone who had never used it before. without knowing R, he could literally read the entire program. the only thing that stumped him was the multiple return values. In my program, he saw f= function() { return(list(a=myvector1, b=myvector2)) } result=f() a= result$a b= result$b rm(result) I had picked this method up over the years reading r-help. of course, I had 10 return values, not two, each return value with its own long name. I think it would have been a whole lot nicer if I could have written FOR HIM simply f= function() { return(myvector1,myvector2); } (a,b)= f() again, it's syntax sugar. I would find such syntax a whole lot more appealing. and I often write functions that pass back a main program, but also some debug or other information. maybe I am the only one... regards, /iaw On Sat, Mar 7, 2009 at 9:28 AM, Gabor Grothendieck wrote: Why? Can you demonstrate any situations where it's useful? Despite having my own facility for this I've found that over the years I have never used it.
On Sat, Mar 7, 2009 at 9:23 AM, wrote: Gentlemen---these are all very clever workarounds, but please forgive me for voicing my own opinion: IMHO, returning multiple values in a statistical language should really be part of the language itself. there should be a standard syntax of some sort, whatever it may be, that everyone should be able to use and which easily transfers from one local computer to another. It should not rely on clever hacks in the .Rprofile that are different from user to user, and which leave a reader of end user R code baffled at first by all the magic that is going on. Even the R tutorials for beginners should show a multiple-value return example right at the point where function calls and return values are first explained. I really do not understand why the earlier implementation of "multiple-value returns" was deprecated. then again, I am a naive end user, not a computer language expert. I probably would not even understand the nuances of syntax ambiguities that may have arisen. (this is my shortcoming.) regards, /iaw On Mar 7, 2009 4:34am, Wacek Kusnierczyk wrote: mark.braving...@csiro.au wrote: The syntax for returning multiple arguments does not strike me as particularly appealing. would it not be possible to allow syntax like: f= function() { return( rnorm(10), rnorm(20) ) } (a,d$b) = f() FWIW, my own solution is to define a "multi-assign operator":

'%<-%' <- function(a, b) {
    # a must be of the form '{thing1;thing2;...}'
    a <- as.list(substitute(a))[-1]
    e <- parent.frame()
    stopifnot(length(b) == length(a))
    for (i in seq_along(a)) eval(call('<-', a[[i]], b[[i]]), e)
    NULL
}

you might want to have the check less stringent, so that the rhs may consist of more values than the lhs has variables. or even skip the check and assign NULL to a[i] for i > length(b). another idea is to allow %<-% to be used with just one variable on the lhs.
here's a modified version:

'%<-%' <- function(a, b) {
    a <- as.list(substitute(a))
    if (length(a) > 1) a <- a[-1]
    if (length(a) > length(b)) b <- c(b, vector('list', length(a) - length(b)))
    e <- parent.frame()
    for (i in seq_along(a)) eval(call('<-', a[[i]], b[[i]]), e)
    NULL
}

{a; b} %<-% list(1, 2)  # a = 1; b = 2
a %<-% 3                # a = 3
{a; b} %<-% 5           # a = 5; b = NULL

vQ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] length of POSIXlt object (PR#13482)
'The R Inferno' page 93 and page 99. Patrick Burns patr...@burns-stat.com +44 (0)20 8525 0696 http://www.burns-stat.com (home of "The R Inferno" and "A Guide for the Unwilling S User") twoutop...@gmail.com wrote: The length() of a POSIXlt object is given as 9 regardless of the actual length. For example:

make.date.time <- function (year=c(2006,2006), month=c(8,8), day=2:5, hour=13, minute=45)
{
    # convert year, etc., into POSIXlt object
    d = as.character(make.date(year, month, day))
    t = paste(hour, minute, sep=":")
    as.POSIXlt(paste(d, t))
}

t = make.date.time()
t
[1] "2006-08-02 13:45:00" "2006-08-03 13:45:00" "2006-08-04 13:45:00"
[4] "2006-08-05 13:45:00"
length(t)
[1] 9
t[1]
[1] "2006-08-02 13:45:00"
length(t[1])
[1] 9

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
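The 9 in the report comes from the list representation underneath a POSIXlt object: parallel component vectors sec, min, hour, mday, mon, year, wday, yday, isdst (newer R versions may carry additional zone/gmtoff components, and also provide a length() method that returns the number of date-times). A sketch of a version-stable way to count the actual date-times:

```r
lt <- as.POSIXlt(c("2006-08-02 13:45", "2006-08-03 13:45",
                   "2006-08-04 13:45", "2006-08-05 13:45"))

## The class hides a list of parallel component vectors:
names(unclass(lt))

## Counting the entries of any one component gives the number of
## date-times, regardless of how length() treats the object.
n_times <- length(unclass(lt)$sec)   # 4
```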
Re: [Rd] x <- 1:2; dim(x) <- 2? A vector or not?
Henrik Bengtsson wrote: Hi. On Mon, Jan 12, 2009 at 11:58 PM, Prof Brian Ripley wrote: What you have is a one-dimensional array: they crop up in R most often from table() in my experience. f <- table(rpois(100, 4)) str(f) 'table' int [, 1:10] 2 6 18 21 13 16 13 4 3 4 - attr(*, "dimnames")=List of 1 ..$ : chr [1:10] "0" "1" "2" "3" ... and yes, f is an atomic vector and yes, str()'s notation is confusing here but if it did [1:10] you would not know it was an array. I recall discussing this with Martin Maechler (str's author) last century, and I've just checked that R 2.0.0 did the same. The place in which one-dimensional arrays differ from normal vectors is how names are handled: notice that my example has dimnames not names, and ?names says For a one-dimensional array the 'names' attribute really is 'dimnames[[1]]'. Thanks for this explanation. One could then argue that [1:10,] is somewhat better than [,1:10], but that is just polish. Perhaps it could be: [1:10(,)] That is weird enough that it should not lead people to believe that it is a matrix. But might prompt them a bit in that direction. Patrick Burns patr...@burns-stat.com +44 (0)20 8525 0696 http://www.burns-stat.com (home of "The R Inferno" and "A Guide for the Unwilling S User") /Henrik I think these days we have enough internal glue in place that an end user would not notice the difference (but those working at C level with R objects may need to know). On Mon, 12 Jan 2009, Henrik Bengtsson wrote: Ran into the following intermediate case in an external package (w/ recent R v2.8.1 patched and R v2.9.0 devel): x <- 1:2 dim(x) <- 2 dim(x) [1] 2 x [1] 1 2 str(x) int [, 1:2] 1 2 nrow(x) [1] 2 ncol(x) [1] NA is.vector(x) [1] FALSE is.matrix(x) [1] FALSE is.array(x) [1] TRUE x[1] [1] 1 x[,1] Error in x[, 1] : incorrect number of dimensions x[1,] Error in x[1, ] : incorrect number of dimensions Is str() treating single-dimension arrays incorrectly? What does it mean to have a single dimension this way?
Should it equal a vector? I am aware of "is.vector returns FALSE if x has any attributes except names". /Henrik __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
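A condensed, runnable version of the thread's observations about one-dimensional arrays, as a sketch:

```r
x <- 1:2
dim(x) <- 2    # x is now a one-dimensional array

stopifnot(
  is.array(x), !is.matrix(x), !is.vector(x),  # array, but neither matrix nor "vector"
  nrow(x) == 2, is.na(ncol(x))                # dim has no second element
)

x[1]    # single-bracket indexing still works
## ... but matrix-style indexing does not:
err <- tryCatch(x[1, ], error = function(e) TRUE)
```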
Re: [Rd] [R] odd behavior when multiplying data frame by an element
I think it might be up for discussion whether or not there is a bug in here somewhere. > mat10 <- matrix(1:6, 3) > mat10 * 1:12 Error: dims [product 6] do not match the length of object [12] > data.frame(mat10) * 1:12 X1 X2 1 1 16 2 4 25 3 9 36 > version _ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status Under development (unstable) major 2 minor 9.0 year 2008 month 11 day 22 svn rev 47006 language R version.string R version 2.9.0 Under development (unstable) (2008-11-22 r47006) It seems dangerous to me that there is not at least a warning in the data frame case. I think I'd prefer an error like the matrix case. Patrick Burns patr...@burns-stat.com +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") markle...@verizon.net wrote: could someone explain what is happening below ? I was trying to solve a related question on the list and then , as I was solving it, I was getting strange answers and then I noticed below. It's obviously not a bug but I don't get it. Thanks. m <- data.frame(class = c("birds", "planes"), feather = c(1,3), jet = c(2,4)) d1 <- data.frame(jet = c(10), feather = c(20)) print(m) m[,-1]*2 # THIS IS FINE m[,"jet"]*2 # THIS IS FINE print(d1["jet"]) # THIS IS FINE m[,"jet"]*d1["jet"] # THIS IS ODD __ r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
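A condensed reproduction of the contrast above; whether the over-long-vector case for a data frame is silent, warns, or errors may depend on the R version, so only the well-defined facts are asserted: the matrix refuses, and the column-major recycling for a conforming vector.

```r
mat10 <- matrix(1:6, 3)

## The matrix refuses a vector longer than itself:
m_err <- tryCatch(mat10 * 1:12, error = function(e) TRUE)

## The data frame recycles element-wise in column-major order:
res <- data.frame(mat10) * 1:6
res
#   X1 X2
# 1  1 16
# 2  4 25
# 3  9 36
```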
[Rd] minor typo in assignOps help file
There is a minor typo in the help file for assignOps: There is no space in "operator<-" in the second sentence of the second paragraph of the Details section. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] 'by' with one-dimensional array
I've played a bit with the problem that Jeff Laake reported on R-help: # create data: jl <- data.frame(x=rep(1, 3), y=tapply(1:9, rep(c('A','B','C'), each=3), sum)) jl2 <- jl jl2$y <- as.numeric(jl2$y) # do the test: > tapply(jl$y, jl$x, length) 1 3 > tapply(jl2$y, jl2$x, length) 1 3 > by(jl2$y, jl2$x, length) jl2$x: 1 [1] 3 > by(jl$y, jl$x, length) INDICES: 1 [1] 1 The result of 'by' on the 1-dimensional array is giving the correct answer to a question that I don't think many of us thought we were asking. Once upon a time 'by' gave 3 as the answer in both situations. 'by.default' used to be a one-liner, but now decides what to do based on 'length(dim(data))'. This specific problem goes away if the line: if (length(dim(data))) is replaced by: if(length(dim(data)) > 1) But I don't know what other mischief such a change would make. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
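A user-level workaround, for anyone hitting this before any fix to 'by.default': strip the 1-d dim attribute before calling 'by'. A small sketch:

```r
# Workaround sketch: c() drops the dim attribute that tapply() attaches,
# so 'by' sees a plain vector and counts all three elements again.
jl <- data.frame(x = rep(1, 3),
                 y = tapply(1:9, rep(c("A", "B", "C"), each = 3), sum))
unlist(by(c(jl$y), jl$x, length))
```

The `c()` wrapper is harmless when the column is already a plain vector, so it is a safe habit in code that may receive tapply output.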
Re: [Rd] Numerical optimisation and "non-feasible" regions
If I understand your proposal correctly, then it probably isn't a good idea. A derivative-based optimization algorithm is going to get upset whenever it sees negative infinity. Genetic algorithms, simulated annealing (and I think Nelder-Mead) will be okay when they see infinity but if all infeasible solutions have value negative infinity, then you are not giving the algorithm a clue about what direction to go. Pat Mathieu Ribatet wrote: Dear Patrick (and other), Well I used the Sylvester's criteria (which is equivalent) to test for this. But unfortunately, this is not the only issue! Well, to sum up quickly, it's more or less like geostatistics. Consequently, I have several unfeasible regions (covariance, margins and others). The problem seems that the unfeasible regions may be large and sometimes lead to optimization issues - even when the starting values are well defined. This is the reason why I wonder if setting by myself a $-\infty$ in the composite likelihood function is appropriate here. However, you might be right in setting a tolerance value 'eps' instead of the theoretical bound eigen values > 0. Thanks for your tips, Best, Mathieu Patrick Burns a écrit : If the positive definiteness of the covariance is the only issue, then you could base a penalty on: eps - smallest.eigen.value if the smallest eigen value is smaller than eps. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") Mathieu Ribatet wrote: Thanks Ben for your tips. I'm not sure it'll be so easy to do (as the non-feasible regions depend on the model parameters), but I'm sure it's worth giving a try. Thanks !!! Best, Mathieu Ben Bolker a écrit : Mathieu Ribatet epfl.ch> writes: Dear list, I'm currently writing a C code to compute the (composite) likelihood - well this is done but not really robust. The C code is wrapped in an R one which call the optimizer routine - optim or nlm. 
However, the fitting procedure is far from being robust as the parameter space depends on the parameter - I have a covariance matrix that should be a valid one for example. One reasonably straightforward hack to deal with this is to add a penalty that is (e.g.) a quadratic function of the distance from the feasible region, if that is reasonably straightforward to compute -- that way your function will get gently pushed back toward the feasible region. Ben Bolker __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
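Ben's quadratic-penalty idea and the eigenvalue bound discussed earlier in the thread can be combined in a few lines. This is only an illustrative sketch with invented names (`negLogLik` and `buildCov` stand in for the user's own functions):

```r
# Illustrative sketch: penalize the objective by a quadratic function of
# how far the smallest eigenvalue falls below 'eps', rather than returning
# -Inf (or +Inf for a minimizer) at infeasible covariance matrices.
penalizedNegLogLik <- function(par, negLogLik, buildCov,
                               eps = 1e-6, weight = 1e6) {
  covMat <- buildCov(par)
  smallest <- min(eigen(covMat, symmetric = TRUE,
                        only.values = TRUE)$values)
  penalty <- if (smallest < eps) weight * (eps - smallest)^2 else 0
  negLogLik(par) + penalty
}
```

Unlike an infinite value, the penalty grows smoothly with the depth of the constraint violation, so the optimizer gets a direction back toward the feasible region.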
Re: [Rd] Numerical optimisation and "non-feasible" regions
If the positive definiteness of the covariance is the only issue, then you could base a penalty on: eps - smallest.eigen.value if the smallest eigen value is smaller than eps. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") Mathieu Ribatet wrote: Thanks Ben for your tips. I'm not sure it'll be so easy to do (as the non-feasible regions depend on the model parameters), but I'm sure it's worth giving a try. Thanks !!! Best, Mathieu Ben Bolker a écrit : Mathieu Ribatet epfl.ch> writes: Dear list, I'm currently writing a C code to compute the (composite) likelihood - well this is done but not really robust. The C code is wrapped in an R one which call the optimizer routine - optim or nlm. However, the fitting procedure is far from being robust as the parameter space depends on the parameter - I have a covariance matrix that should be a valid one for example. One reasonably straightforward hack to deal with this is to add a penalty that is (e.g.) a quadratic function of the distance from the feasible region, if that is reasonably straightforward to compute -- that way your function will get gently pushed back toward the feasible region. Ben Bolker __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Is text(..., adj) upside down? (Or am I?)
Basically the only thing in the thread that was clear to me was Brian's phrasing. So I'd suggest basing any changes on that. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") S Ellison wrote: Yup; you're all right - it IS consistent (and I'd even checked the x-adj and it did what I expected!!). It's just that ?text is talking about the position of the 'anchor' point in the text region rather than the subsequent location of the centre of the text. Anyway; if anyone is considering a minor tweak to ?text, would it be clearer if it said "Values of 0, 0.5, and 1 specify text towards right/top, middle and left/bottom of x,y, respectively." ? (or, of course, "Values of 0, 0.5, and 1 specify x,y at left/bottom, middle and right/top of text, respectively.") Steve Ellison Lab of the Government Chemist UK *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] multiple names to assign
'assign' does not give a warning if 'x' has length greater than 1 -- it just uses the first element: assign(c('a1', 'a2'), 1:2) One way of thinking about this is that people using 'assign' get what they deserve. The other is that it is used seldom enough that adding a warning isn't going to slow things down appreciably. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
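If one actually wanted vectorized assignment, a small wrapper makes the intent explicit. A sketch (the name `assignAll` is invented here):

```r
# Hypothetical wrapper: assign several names to several values, which is
# what a user writing assign(c('a1', 'a2'), 1:2) may have expected.
assignAll <- function(names, values, envir = parent.frame()) {
  stopifnot(length(names) == length(values))
  for (i in seq_along(names))
    assign(names[i], values[[i]], envir = envir)
  invisible(names)
}

assignAll(c("a1", "a2"), 1:2)
# both a1 and a2 now exist in the caller's environment
```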
[Rd] names<- bug or feature?
The two statements below with 'names' are conceptually the same. The second performs the desired operation while the first does not. > xx <- list(A=c(a=1, b=2)) > names(xx$A[1]) <- "changed" > xx $A a b 1 2 > names(xx$A)[1] <- "changed" > xx $A changed b 1 2 This is observed in 2.4.0 on Linux as well as 2.7.0 and 2.8.0 on Windows XP. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
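The reason the first form has no effect: R expands a replacement call into an extract-modify-write-back sequence, and the write-back discards names attached to the value. The expansion, spelled out by hand:

```r
# Roughly what R does for names(xx$A[1]) <- "changed":
xx <- list(A = c(a = 1, b = 2))

tmp <- xx$A[1]                    # a *copy* of the first element
tmp <- `names<-`(tmp, "changed")  # the copy gets the new name...
xx$A[1] <- tmp                    # ...which subassignment then discards
names(xx$A)                       # still "a" "b"

names(xx$A)[1] <- "changed"       # second form: replaces the names vector
names(xx$A)                       # "changed" "b"
```

So the two statements are not conceptually the same after all: the first modifies a temporary copy of the data, the second modifies the names vector of the original.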
Re: [Rd] RFC: What should ?foo do?
Peter Dalgaard wrote: > Duncan Murdoch wrote: >> I haven't done it, but I suspect we could introduce special behaviour for ??foo very easily. We could even have a whole hierarchy: >> ?foo, ??foo, ???foo, ????foo, ... > Heh, that's rather nice, actually. In words, that could read > ?foo: tell me about foo! > ??foo: what can you tell me about foo? > ???foo: what can you tell me about things like foo? > ????foo: I don't know what I'm looking for but it might be something related to foo? I quite like this. It seems very intuitive to me -- just match the number of question marks to the level of my frustration. Pat > You do have to be careful about messing with ?, though. I think many people, including me, would pretty quickly go nuts if ?par suddenly didn't work the way we're used to. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
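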
Re: [Rd] NA warnings for r() {aka "patch for random.c"}
Martin Maechler wrote: > [ ... ] > >But actually, part of the changed behavior may be considered >undesirable: > > rnorm(2, mean = NA) > >which gives two NaN's would now produce a warning, >where I could argue that > 'arithmetic with NAs should give NAs without a warning' >since > 1:2 + NA >also gives NAs without a warning. > >So we could argue that a warning should *only* be produced in a >case where the parameters of the distribution are not NA. > >What do others (particularly R-core !) think? > > I think the answer depends on the probability that the user realizes that the parameter is NA. Obviously the user should know if 'rnorm' is used directly. If it is used inside a function, then the user probably doesn't know. Not warning in such a case means that tracking down the ultimate cause of the problem could be harder. However, it seems to me that the function calling 'rnorm' is really the one responsible for warning the user. Pat [ ... ] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] ar on a constant series
Apparently it is quite rare to be as stupid as me and want to predict a constant time series with 'ar'. > ar(rep(4, 60)) Error in if (order > 0) coefs[order, 1:order] else numeric(0) : missing value where TRUE/FALSE needed In addition: Warning message: In if (order > 0) coefs[order, 1:order] else numeric(0) : the condition has length > 1 and only the first element will be used A fix for the multivariate case looks quite daunting, but the univariate case can be improved (in my opinion at least) with not much effort. The definition of 'order' in 'ar.yw.default' can be changed from: order <- if (aic) (0:order.max)[xaic == 0] else order.max to: if(any(var.pred <= 0)) { order <- which(var.pred <= 0)[1] - 1 } else { order <- if (aic) (0:order.max)[xaic == 0] else order.max } I presume something similar can be done in 'ar.burg', but I haven't looked. 'ar.ols' and 'ar.mle' look problematic for fixes. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] help files for load and related functions
Patches to the help files sound like a good idea. However, it isn't something I'm likely to get to immediately. I'm hoping that some other nice person will volunteer. Pat Duncan Murdoch wrote: > On 12/17/2007 6:00 AM, Patrick Burns wrote: > >> I recently had a discussion with a user about loading >> and attaching in R. I was surprised that the help files >> don't provide a very clear picture. >> >> From my point of view 'load' and 'attach' are very >> similar operations, the difference being that 'attach' >> creates a new database on the search list while 'load' >> puts all the objects into the global environment. >> >> The help file for 'load' is inexplicit that this is what >> happens. The 'load' and 'attach' help files neither refer >> to the other in their See Also. >> >> Furthermore, the 'library' help file talks about "loading" >> packages. I would suggest that it should use "attaching" >> as that is the analogous operation. >> >> None of these three help files (nor that of 'save') has a >> Side Effects section. Personally I think that all help files >> should have a Side Effects section (to make it clear to >> new users what side effects are and that they are not a >> good thing for most functions to have). I can understand >> there could be another point of view on that. However, I >> definitely think that there should be a Side Effects section >> in the help files of functions whose whole point is a side >> effect. > > > I think you make good points. Care to submit patches? The source for > those man pages are in > > https://svn.R-project.org/R/trunk/src/library/base/man/attach.Rd > > https://svn.R-project.org/R/trunk/src/library/base/man/library.Rd > > https://svn.R-project.org/R/trunk/src/library/base/man/load.Rd > > https://svn.R-project.org/R/trunk/src/library/base/man/save.Rd > > If you send them to me before Thursday or after Jan 2, I'll take a > look. (If you send them to me during the Xmas break there's a good > chance they'll get lost.) 
> > Duncan Murdoch > > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] help files for load and related functions
I recently had a discussion with a user about loading and attaching in R. I was surprised that the help files don't provide a very clear picture. From my point of view 'load' and 'attach' are very similar operations, the difference being that 'attach' creates a new database on the search list while 'load' puts all the objects into the global environment. The help file for 'load' is inexplicit that this is what happens. The 'load' and 'attach' help files neither refer to the other in their See Also. Furthermore, the 'library' help file talks about "loading" packages. I would suggest that it should use "attaching" as that is the analogous operation. None of these three help files (nor that of 'save') has a Side Effects section. Personally I think that all help files should have a Side Effects section (to make it clear to new users what side effects are and that they are not a good thing for most functions to have). I can understand there could be another point of view on that. However, I definitely think that there should be a Side Effects section in the help files of functions whose whole point is a side effect. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] sd(NA)
I like the 2.6.x behaviour better. Consider:

x <- array(1:30, c(10,3))
x[,1] <- NA
x[-1,2] <- NA
x[1,3] <- NA
sd(x, na.rm=TRUE)
# 2.7.0
Error in var(x, na.rm = na.rm) : no complete element pairs
# 2.6.x
[1] NA NA 2.738613

The reason to put 'na.rm=TRUE' into the call is to avoid getting an error due to missing values. (And, yes, in finance it is entirely possible to have a matrix with all NAs in a column.) I think the way out is to allow there to be a conceptual difference between computing a value with no data, and computing a value on all NAs after removing NAs. The first is clearly impossible. The second has some actual value, but we don't have enough information to have an estimate of the value. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User")

Prof Brian Ripley wrote:
> On Sun, 2 Dec 2007, Wolfgang Huber wrote:
>> Dear Prof. Ripley
>> I noted a change in the behaviour of "cov", which is very reasonable:
>> ## R version 2.7.0 Under development (unstable) (2007-11-30 r43565)
>>> cov(as.numeric(NA), as.numeric(NA), use="complete.obs")
>> Error in cov(as.numeric(NA), as.numeric(NA), use = "complete.obs") :
>>   no complete element pairs
>> whereas earlier behavior was, for example:
>> ## R version 2.6.0 Patched (2007-10-23 r43258)
>>> cov(as.numeric(NA), as.numeric(NA), use="complete.obs")
>> [1] NA
>> I wanted to ask whether the effect this has on "sd" is desired:
>> ## R version 2.7.0 Under development (unstable) (2007-11-30 r43565)
>>> sd(NA, na.rm=TRUE)
>> Error in var(x, na.rm = na.rm) : no complete element pairs
>> ## R version 2.6.0 Patched (2007-10-23 r43258)
>>> sd(NA, na.rm=TRUE)
>> [1] NA
>
> That is a bug fix: see the NEWS entry. The previous behaviour of
>> sd(numeric(0))
> Error in var(x, na.rm = na.rm) : 'x' is empty
>> sd(NA_real_, na.rm=TRUE)
> [1] NA
> was not as documented:
>   This function computes the standard deviation of the values in 'x'.
>   If 'na.rm' is 'TRUE' then missing values are removed before
>   computation proceeds.
> so somehow an empty vector had a sd() if computed one way, and not if computed another.

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
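A user-level compromise, for anyone wanting the 2.6.x-style answer back without patching 'sd' itself (the wrapper name `sdOrNA` is invented here):

```r
# Hedged sketch: return NA for an all-NA column instead of erroring,
# preserving the distinction between "no data at all" and "all data
# missing after removal".
sdOrNA <- function(x, na.rm = TRUE) {
  one <- function(v) if (all(is.na(v))) NA_real_ else sd(v, na.rm = na.rm)
  if (is.matrix(x)) apply(x, 2, one) else one(x)
}

x <- array(1:30, c(10, 3))
x[, 1] <- NA
x[-1, 2] <- NA
x[1, 3] <- NA
sdOrNA(x)   # NA for the all-NA column rather than an error
```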
Re: [Rd] meaning of "trim" in mean()
If the sentence in question were amended to:

Values of trim outside that range ...

then I think it would rule out the misinterpretation of the sentence. Pat

Prof Brian Ripley wrote:
> There is only one _range_ mentioned, (0, 0.5). I don't see how you can construe 'that range' to be a reference to anything other than (0, 0.5). And why do you suppose the description for argument 'trim' is referring to 'values' of a different argument? It is telling you what happens for values of trim < 0 or > 0.5: that is not information that it is appropriate to excise.
> On Thu, 25 Oct 2007, Peter Dalgaard wrote:
>> Liaw, Andy wrote:
>>> (I see this in both R-patched r43124 and R-devel r43233.) In the Argument section of ?mean:
>>> trim: the fraction (0 to 0.5) of observations to be trimmed from each end of x before the mean is computed. Values outside that range are taken as the nearest endpoint.
>>> Then in the Value section:
>>> If trim is non-zero, a symmetrically trimmed mean is computed with a fraction of trim observations deleted from each end before the mean is computed.
>>> The description in "trim" to me sounds like Winsorizing, rather than trimming. Should that be edited?
>> I think so:
>>> x <- sort(rnorm(10))
>>> mean(x, trim=.1)
>> [1] -0.6387413
>>> mean(x[2:9])
>> [1] -0.6387413
>>> mean(x[c(2,2:9,9)]) # Winsorizing
>> [1] -0.6204222
>> So yes, it is trimming, not Winsorizing, and the last sentence in the description of "trim" is misleading and should be, well..., trimmed.

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Debug with set.seed()
I think you will find that 'set.seed' does give you the same random state each time. That you don't get the same result each time implies a bug in your code. Looking for uninitialized variables would probably be a good start. Using valgrind under Linux would most likely find the error quickly. Also, using '<<-' is generally a bad idea. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") Tong Wang wrote: >HI, all >I am debugging an R code with dynamically loaded function in it. It seems > set.seed(n) does not give me the same >random draws every time. Could somebody tell me what I should do to get the >same outcome verytime ? >I am not sure what infomation is relevent to this question , the following > is a really scatchy description of the code I wrote.I am using the R-2.4.1 > built under Cygwin. OS is WinXP. > >Thanks a lot for any help. > >Myfunction<-function(...) { >... >drawsomething<-function(){ out<- .C( "drawit"...) > var <<- out$var > # assign the outputs} >... >for( i in 1 : iter) { . >drawsomething() >.. } > > reture( var.stor ) #return stored result >} > >__ >R-devel@r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-devel > > > > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
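For the record, a quick check that 'set.seed' does restore the full random state:

```r
# set.seed() resets R's RNG state, so two sequences drawn after the same
# seed are bit-for-bit identical; non-reproducibility must therefore come
# from elsewhere (e.g. uninitialized memory in compiled code).
set.seed(42)
a <- rnorm(5)
set.seed(42)
b <- rnorm(5)
identical(a, b)   # TRUE
```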
Re: [Rd] Trailing message on R CMD BATCH
I rather like it. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") Brian Ripley wrote: >Unix versions of R CMD BATCH have reported proc.time() unless the script >ends in q(). E.g. if the input is 'search()' the output is > > > >>invisible(options(echo = TRUE)) >>search() >> >> >[1] ".GlobalEnv""package:stats" "package:graphics" >[4] "package:grDevices" "package:utils" "package:datasets" >[7] "package:methods" "Autoloads" "package:base" > > >>proc.time() >> >> >[1] 1.053 0.067 1.109 0.000 0.000 > > > >This was undocumented, and not shared by the Windows version. > >Is it useful? >Do people want it retained? > > > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] stack imbalance in contour
I'm not sure if this has much significance or not -- but it sounds rather ominous. It doesn't appear to be new as it happens with 2.0.0 in Linux (but the formatting of the warning messages has improved). > contour(array(3e10, c(10,10), list(1:10, 11:20))) Warning: stack imbalance in 'contour', 20 then 24 Warning: stack imbalance in '.Internal', 19 then 23 Warning: stack imbalance in '{', 17 then 21 Warning message: all z values are equal > sessionInfo() R version 2.4.0 RC (2006-09-27 r39543) i386-pc-mingw32 locale: LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252 attached base packages: [1] "methods" "stats" "graphics" "grDevices" "utils" "datasets" [7] "base" Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] proposal for lower.tri and upper.tri value argument
Gabor came close to the situation I had yesterday that prompted me to write a local version of 'lower.tri'. It was approximately: x[sub, sub][lower.tri(x[sub,sub])] Pat Gabor Grothendieck wrote: > On 8/6/06, Prof Brian Ripley <[EMAIL PROTECTED]> wrote: > >> Is there a case to be made for this? If so, where is it? >> >> (I don't find x[lower.tri(x)] harder to write than lower.tri(x, >> value=TRUE), and wonder why you do? > > > The reasons are > > 1. x might be the result of an expression. Without value= > one must store the result of that expression in a variable, x, first: > > x <- outer(1:6, 1:6, "+") > x[lower.tri(x)] > > but with the proposed value= argument one could just use function > composition: > > lower.tri(outer(1:6, 1:6, "+"), value = TRUE) > > 2. the whole object approach of R encourages working with the objects > themselves rather than indexes and value= is consistent with that. > > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] proposal for lower.tri and upper.tri value argument
I propose that a 'value' argument be added to 'lower.tri' and 'upper.tri'. This is analogous to the 'value' argument of 'grep'. Something like the following should work: > upper.tri function (x, diag = FALSE, value = FALSE) { x <- as.matrix(x) if (diag) ans <- row(x) <= col(x) else ans <- row(x) < col(x) if(value) x[ans] else ans } Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
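The proposal in runnable form, renamed to `upper.tri2` here to avoid masking the base function, and checked against the existing two-step idiom from Gabor's example:

```r
# The proposed definition with a 'value' argument, analogous to grep's.
upper.tri2 <- function(x, diag = FALSE, value = FALSE) {
  x <- as.matrix(x)
  ans <- if (diag) row(x) <= col(x) else row(x) < col(x)
  if (value) x[ans] else ans
}

m <- outer(1:4, 1:4, "+")
identical(upper.tri2(m, value = TRUE), m[upper.tri(m)])   # TRUE
```

With `value = TRUE` the intermediate variable disappears, which is the point of the proposal.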
Re: [Rd] [R] rownames, colnames, and date and time
I haven't been following all of this thread, but it reminds me of a bug that was in S-PLUS not too long ago where dimnames could sometimes be numeric. This caused some problems that were very hard to track down because there were no visual clues of what was really wrong. I've been pleased not to encounter that in R and hope it continues. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") Prof Brian Ripley wrote: >Looking at the code it occurs to me that there is another case you have >not considered, namely dimnames(). > >rownames<- and colnames<- are just wrappers for dimnames<-, so consistency >does mean that all three should behave the same. > >For arrays (including matrices), dimnames<- is primitive. It coerces >factors to character, and says in the C code > > /* if (isObject(val1)) dispatch on as.character.foo, but we don't >have the context at this point to do so */ > >so someone considered this before now. > >For data frames, dimnames<-.data.frame is used. That calls row.names<- >and names<-, and the first has a data.frame method. Only the row.names<- >method is documented to coerce its value to character, and I think it _is_ >all quite consistent. The basic rule is that all these functions coerce >for data frames, and none do for arrays. > >However, there was a problematic assumption in the row.names<-.data.frame >and dimnames<-.data.frame methods, which tested the length of 'value' >before coercion. That sounds reasonable, but in unusual cases such as >POSIXlt, coercion changes the length, and I have swapped the lines around. > >What you expected was that dimnames<-() would coerce to character, >although I can find no support for that expectation in the documentation. >If it were not a primitive function that would be easy to achieve, but as >it is, it would need an expert in the internal code to change. 
There is >also the risk of inconsistency, since as the comment says, the C code is >used in places where the context is not known. I think this is probably >best left alone. > > >On Wed, 29 Mar 2006, Prof Brian Ripley wrote: > > > >>Yet again, this is the wrong list for suggesting changes to R. Please do use >>R-devel for that purpose (and I have moved this). >> >>If this bothers you (it all works as documented, so why not use it as >>documented?), please supply a suitable patch to the current R-devel sources >>and it will be considered. >> >>And BTW, row.names is the canonical accessor function for data frames, >>and its 'value' argument is documented differently from that for rownames for >>an array. Cf: >> >>Details: >> >>The extractor functions try to do something sensible for any >>matrix-like object 'x'. If the object has 'dimnames' the first >>component is used as the row names, and the second component (if >>any) is used for the col names. For a data frame, 'rownames' and >>'colnames' are equivalent to 'row.names' and 'names' respectively. >> >>Note: >> >>'row.names' is similar to 'rownames' for arrays, and it has a >>method that calls 'rownames' for an array argument. >> >>I am not sure why R decided to add rownames for the same purpose as >>row.names: eventually they were made equivalent. >> >> >>On Tue, 21 Mar 2006, Erich Neuwirth wrote: >> >> >> >>>I noticed something surprising (in R 2.2.1 on WinXP) >>>According to the documentation, rownames and colnames are character >>>vectors. >>>Assigning a vector of class POSIXct or POSIXlt as rownames or colnames >>>therefore is not strictly according to the rules. >>>In some cases, R performs a reasonable typecast, but in some other cases >>>where the same typecast also would be possible, it does not. >>> >>>Assigning a vector of class POSIXct to the rownames or names of a >>>dataframe creates a reasonable string representation of the dates (and >>>possibly times). 
>>>Assigning such a vector to the rownames or colnames of a matrix produces >>>rownames or colnames consisting of the integer representation of the >>>date-time value. >>>Trying to assign a vector of class POSIXlt in all cases >>>(dataframes and matrices, rownames, colnames, names) >>>produces an error. >>> >>>Demonstration code is given below. >>> >>>This is some
[Rd] all.equal buglet(s)
In the details section for 'all.equal' (in the paragraph on complex values) it says 'all.numeric.numeric'. I presume that should be 'all.equal.numeric'. When two integer vectors differ, it is possible to get overflow: > set.seed(1) > r1 <- .Random.seed > set.seed(2) > r2 <- .Random.seed > all.equal(r1, r2) [1] "Mean relative difference: NA" Warning message: NAs produced by integer overflow in: target - current A small change to 'all.equal.numeric' would fix that if it is felt to be worthwhile. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
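The small change could be along these lines (a sketch, not the actual patch): coerce to double before differencing, so the subtraction cannot overflow.

```r
# Sketch of the fix: integer subtraction can overflow, but double
# subtraction of integer-valued inputs cannot (they are far below 2^53
# in magnitude), so coerce first.
meanRelDiff <- function(target, current) {
  target <- as.numeric(target)
  current <- as.numeric(current)
  mean(abs(target - current)) / mean(abs(target))
}

set.seed(1); r1 <- .Random.seed
set.seed(2); r2 <- .Random.seed
meanRelDiff(r1, r2)   # finite, with no overflow warning
```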
[Rd] [Fwd: Re: [R] a strange problem with integrate()]
When I saw the subject of the original message on R-help, I was 95% confident that I knew the answer (before I had seen the question). This made me think that perhaps for some functions there should be a 'Troubleshooting' section in the help file. The current help file for 'integrate' does say, as Sundar points out, what the requirements are. However, I think more people would solve the problem more quickly on their own if there were a troubleshooting section. Most functions aren't abused in predictable ways, but a few are. Another that springs immediately to mind is 'read.table'. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") Original Message Subject:Re: [R] a strange problem with integrate() Date: Wed, 01 Mar 2006 11:44:33 -0600 From: Sundar Dorai-Raj <[EMAIL PROTECTED]> Organization: PDF Solutions, Inc. To: vito muggeo <[EMAIL PROTECTED]> CC: r-help@stat.math.ethz.ch References: <[EMAIL PROTECTED]> vito muggeo wrote: > Dear all, > I am stuck on the following problem with integrate(). I have been out of > luck using RSiteSearch().. > > My function is > > g2<-function(b,theta,xi,yi,sigma2){ >xi<-cbind(1,xi) >eta<-drop(xi%*%theta) >num<-exp((eta + rep(b,length(eta)))*yi) >den<- 1 + exp(eta + rep(b,length(eta))) >result=(num/den)*exp((-b^2)/sigma2)/sqrt(2*pi*sigma2) >r=prod(result) >return(r) >} > > And I am interested in evaluating the simple integral, but: > > > integrate(g2,-2,2,theta=c(-2,3),xi=c(1,2,5,6),yi=c(1,0,1,1),sigma2=1) > Error in integrate(g2, -2, 2, theta = c(-2, 3), xi = c(1, 2, 5, 6), yi = > c(1, : > evaluation of function gave a result of wrong length > > > > I have checked the integrand function > > > valori<-seq(-2,2,l=30) > > risvalori<-NULL > > for(k in valori) > risvalori[length(risvalori)+1]<-g2(k,theta=c(-2,3),xi=c(1,2,5,6),yi=c(1,0,1,1),sigma2=1) > > plot(valori, risvalori) > > And the integral exists.. > > Please, any comment is coming? 
> many thanks, > vito

Please (re-)read ?integrate:

f: an R function taking a numeric first argument and returning a numeric vector of the same length. Returning a non-finite element will generate an error.

Note the "returning a numeric vector of the *same* length." Your function returns "prod(r)" which is not the same length as "b". Some style issues (and I state these as diplomatically as is possible in e-mail): a. Don't mix "<-" with "=" for assignment in your scripts. b. Use more spaces and consistent indenting. Here's what my code looks like:

g2 <- function(b, theta, xi, yi, sigma2) {
  xi <- cbind(1, xi)
  eta <- drop(xi %*% theta)
  num <- exp((eta + rep(b, length(eta))) * yi)
  den <- 1 + exp(eta + rep(b, length(eta)))
  result <- (num/den) * exp((-b^2)/sigma2)/sqrt(2 * pi * sigma2)
  r <- prod(result)
  r
}

After reformatting your code I saw your problem immediately without executing a single line. HTH, --sundar __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
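The thread's conclusion in runnable form: the usual remedy for "evaluation of function gave a result of wrong length" is Vectorize(), which maps a scalar-valued integrand over the vector of abscissae that integrate() supplies (g2 reproduced from Vito's post):

```r
# The integrand returns a scalar (prod), so wrap it with Vectorize() to
# map it over a vector of 'b' values before handing it to integrate().
g2 <- function(b, theta, xi, yi, sigma2) {
  xi  <- cbind(1, xi)
  eta <- drop(xi %*% theta)
  num <- exp((eta + rep(b, length(eta))) * yi)
  den <- 1 + exp(eta + rep(b, length(eta)))
  result <- (num / den) * exp((-b^2) / sigma2) / sqrt(2 * pi * sigma2)
  prod(result)
}

g2v <- Vectorize(g2, vectorize.args = "b")
integrate(g2v, -2, 2, theta = c(-2, 3), xi = c(1, 2, 5, 6),
          yi = c(1, 0, 1, 1), sigma2 = 1)
```

This is exactly the kind of predictable misuse a 'Troubleshooting' section in ?integrate could head off.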
[Rd] weights in nls
It would probably be more polite to give a warning in 'nls' that the 'weights' argument is ignored. Something like the following should do:

if(!missing(weights)) warning("weights are not currently implemented")

> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status   Under development (unstable)
major    2
minor    3.0
year     2005
month    12
day      07
svn rev  36656
language R

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
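The pattern above generalizes to any accepted-but-unimplemented argument: warn only when the caller actually supplies it. A small self-contained sketch (the function 'fit' here is hypothetical, not nls itself):

```r
# Hypothetical model-fitting stub: accept 'weights' in the signature so
# existing calls don't error, but warn the caller that it is ignored.
fit <- function(formula, data, weights) {
  if (!missing(weights))
    warning("'weights' is not currently implemented and will be ignored")
  # ... actual fitting would happen here ...
  invisible(NULL)
}

d <- data.frame(x = 1:3, y = 1:3)
fit(y ~ x, d)                 # silent: no weights supplied
fit(y ~ x, d, weights = 1:3)  # warns that weights are ignored
```

Note the test is !missing(weights): with missing(weights) alone the warning would fire on every call that does *not* use weights, which is the opposite of the intent.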
Re: [Rd] x[1,], x[1,,], x[1,,,], ...
You can look at the definition of 'corner' in the public domain area of the Burns Statistics website. It uses 'do.call' on '[' to achieve (sort of) what you want.

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Henrik Bengtsson wrote:
> Hi,
>
> is there a function in R already doing what I try to do below:
>
> # Let 'x' be an array with *any* number of dimensions (>=1).
> x <- array(1:24, dim=c(2,2,3,2))
> ...
> x <- array(1:24, dim=c(4,3,2))
>
> i <- 2:3
>
> ndim <- length(dim(x))
> if (ndim == 1)
>   y <- x[i]
> else if (ndim == 2)
>   y <- x[i,]
> else if (ndim == 3)
>   y <- x[i,,]
> else ...
>
> and so on. My current solution is
>
> ndim <- length(dim(x))
> args <- rep(",", ndim)
> args[1] <- "i"
> args <- paste(args, collapse="")
> code <- paste("x[", args, "]", sep="")
> expr <- parse(text=code)
> y <- eval(expr)
>
> Is there another way I can do this in R that I have overlooked?
>
> /Henrik
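The do.call-on-'[' idea mentioned above can be sketched without parse()/eval() by passing the empty (missing) argument for the dimensions that are not being subset. The helper name 'index.first' is mine, not from the thread:

```r
# Subset the first dimension of an array with any number of dimensions.
# bquote() with no argument yields the empty symbol, i.e. a missing
# argument, so do.call("[", ...) builds x[i, , ] for a 3-d array, etc.
index.first <- function(x, i) {
  ndim  <- length(dim(x))
  blank <- rep(list(bquote()), ndim - 1)
  do.call(`[`, c(list(x, i), blank))
}

x <- array(1:24, dim = c(4, 3, 2))
identical(index.first(x, 2:3), x[2:3, , ])  # TRUE
```

This avoids constructing and parsing code as text, and it works for any number of dimensions without the if/else ladder.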
Re: [Rd] Typo(s) in proc.time.Rd and comment about ?proc.time (PR#8091)
I'm not so sure that it is a typo or ungrammatical as much as an awkward way of stating it. Maybe something along the lines of: The resolution of the times will be system-specific; it is common for the resolution to be on the order of 1/100 second, ... Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") [EMAIL PROTECTED] wrote: >I just downloaded the file > >ftp://ftp.stat.math.ethz.ch/Software/R/R-devel.tar.gz > >and within proc.time.Rd, the second paragraph of the \value >section contains a typo: > > The resolution of the times will be system-specific; it is common for > them to be recorded to of the order of 1/100 second, and elapsed [...] > ^ > >I'd say replacing "to of" with just "of" would grammatically >fix the sentence. > >Second, the \note{} section for Unix-like machines reads: > > It is possible to compile \R without support for \code{proc.time}, > when the function will throw an error. > >I believe this is ungrammatical and suggest replacing >"when the function will throw an error" with "in which >case the function will throw an error". > >Finally, my comment about ?proc.time is that if across >platforms the returned value is in seconds, then it might >be helpful to readers if this were noted explicitly in the >first paragraph of \value{}. > >My suggestion is in brackets: > > A numeric vector of length 5, containing the user, system, and > total elapsed times [in seconds] for the currently running R > process [...] > >Thank you, > >Stephen > >:: >Stephen Weigand >Division of Biostatistics >Mayo Clinic Rochester >Phone (507) 266-1650, fax 284-9542 > >__ >R-devel@r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-devel > > > > > __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] minor typo in Writing R Extensions
On page 10 (section 1.1.4) of Writing R Extensions version 2.1.1 the following two phrases appear: define these function in a file after the packages is The 's' from 'functions' fell down the page and attached itself to 'package'. Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [R] sequence()
I definitely agree that 'sequence' is not the right name for this functionality. The functionality is occasionally useful -- I've been asked for it several times. But I do wonder if it is basic enough that it should be in 'base'.

The function could be rewritten to create the proper length of the answer at the outset since that is known. It could then be used as an example (somewhere) of how not to fragment memory.

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Prof Brian Ripley wrote:
> R-help is not the list for R development questions: you didn't want help
> did you? --> moved to R-devel.
>
> I do wonder why
>
> > sequence(c(0,-1))
> [1] 1 0 1 0 -1
>
> is considered useful.
>
> Given that the definition seems flawed and I could not find any uses of
> this function in any package my reaction was to suggest the function be
> deprecated on the way to removal. (I also do not understand why anyone
> would expect sequence() to do that and not one of the many things which
> seq() does.)
>
> We certainly do not want to replace a function that works as described at
> a reasonable speed by one that does not work as described, however fast.
>
> `Accuracy first, speed second'.
>
> On Fri, 22 Jul 2005, Robin Hankin wrote:
>
>> Function sequence() repeatedly concatenates
>> its output, and this is slow.
>>
>> It is possible to improve on the performance of sequence by
>> defining
>>
>>   myseq <- function(x){unlist(sapply(x,function(i){1:i}))}
>
> I don't think you want sapply here, but lapply. Try
>
> > myseq(c(2,2))
>      [,1] [,2]
> [1,]    1    1
> [2,]    2    2
>
> sic!
>> The following session compares the performance of
>> myseq(), and sequence(), at least on my G5:
>>
>> > identical(sequence(1:50),myseq(1:50))
>> [1] TRUE
>> > system.time(ignore <- sequence(1:800))
>> [1] 1.16 0.88 2.07 0.00 0.00
>> > system.time(ignore <- sequence(1:800))
>> [1] 1.14 0.84 1.99 0.00 0.00
>> > system.time(ignore <- myseq(1:800))
>> [1] 0.02 0.02 0.04 0.00 0.00
>> > system.time(ignore <- myseq(1:800))
>> [1] 0.03 0.00 0.03 0.00 0.00
>>
>> (the time differential is even more marked for longer arguments).
>
> and much less for more realistic shorter arguments.
>
>> Is there any reason why we couldn't use this definition instead?
>
> The fact that it sometimes gives the wrong answer, for one.
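Patrick's suggestion -- allocate the full result once rather than repeatedly concatenating -- can be sketched as follows. 'seq_fast' is a name I made up, it assumes nonnegative integer lengths, and 'myseq2' is the lapply() form of the fix Ripley points to:

```r
# Fix for myseq: lapply (not sapply) avoids simplification to a matrix
# when all the run lengths happen to be equal.
myseq2 <- function(x) unlist(lapply(x, seq_len))

# Preallocating version: build 1:sum(nvec) once, then subtract the
# starting offset of each run. No repeated concatenation.
seq_fast <- function(nvec) {
  nvec    <- as.integer(nvec)
  offsets <- cumsum(c(0L, nvec[-length(nvec)]))  # start offset of each run
  seq_len(sum(nvec)) - rep(offsets, nvec)
}

seq_fast(c(3, 2))                                      # 1 2 3 1 2
identical(seq_fast(c(2, 0, 3)), sequence(c(2, 0, 3)))  # TRUE
```

Unlike the original loop, this does a fixed number of vectorized operations, so memory is allocated once -- which is the "don't fragment memory" point above. It deliberately does not reproduce sequence()'s behavior for negative inputs such as c(0,-1), which Ripley questions in any case.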
[Rd] help.search of precedence is empty
Doing help.search('precedence') comes up empty. A fix would be to have the title: Operator Syntax and Precedence instead of Operator Syntax Patrick Burns [EMAIL PROTECTED] +44 (0)20 8525 0696 http://www.burns-stat.com (home of S Poetry and "A Guide for the Unwilling S User") __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] eigen of a real pd symmetric matrix gives NaNs in $vector (PR#7987)
I would presume this is another manifestation of what I reported (reproduced below) on 2003-12-01.

[EMAIL PROTECTED] wrote:
> Full_Name: cajo ter Braak
> Version: 2.1.1
> OS: Windows
> Submission from: (NULL) (137.224.10.105)
>
> # I would like to attach the matrix C in the Rdata file; it is 50x50 and comes
> # from a geostatistical problem (spherical covariogram)
>
> > rm(list=ls(all=TRUE))
> > load(file= "test.eigen.Rdata")
> > ls()
> [1] "C"  "eW"
>
> > sym.check = max(abs(C - t(C)))  # should be 0 for symmetry
> > sym.check
> [1] 0
>
> > eW <- eigen(C, symmetric = TRUE)
> > l_eW <- eW$values
> > print(eW$values)
>  [1] 4.5730646 4.5730646 3.3066738 3.3066738 3.3066738 3.3066738 2.3935268
>  [8] 2.3935268 1.9367508 1.9367508 1.9347787 1.9347787 1.4276845 1.4276845
> [15] 1.4276845 1.4276845 0.9858318 0.9858318 0.9858318 0.9858318 0.9123115
> [22] 0.9123115 0.7945283 0.7945283 0.7880493 0.7880493 0.6047920 0.6047920
> [29] 0.6047920 0.6047920 0.5689609 0.5689609 0.5681210 0.5681210 0.5440676
> [36] 0.5440676 0.5440676 0.5440676 0.5224040 0.5224040 0.5139844 0.5139844
> [43] 0.5077485 0.5077485 0.5008249 0.5008249 0.5008249 0.5008249 0.4960220
> [50] 0.4960220
>
> > #print(eW$vector)
>
> > #library(MASS)
> > #n = nrow(C)
> > #y = runif(n)
> > #lm1 = lm.gls(y~1, W = C)
> > #summary(lm1)
>
> > eW <- eigen(C, symmetric = TRUE)
> > l_eW <- eW$values
> > # the thirteenth eigenvector contains NaN
> > print(eW$values)
>  [1] 4.5730646 4.5730646 3.3066738 3.3066738 3.3066738 3.3066738 2.3935268
>  [8] 2.3935268 1.9367508 1.9367508 1.9347787 1.9347787 1.4276845 1.4276845
> [15] 1.4276845 1.4276845 0.9858318 0.9858318 0.9858318 0.9858318 0.9123115
> [22] 0.9123115 0.7945283 0.7945283 0.7880493 0.7880493 0.6047920 0.6047920
> [29] 0.6047920 0.6047920 0.5689609 0.5689609 0.5681210 0.5681210 0.5440676
> [36] 0.5440676 0.5440676 0.5440676 0.5224040 0.5224040 0.5139844 0.5139844
> [43] 0.5077485 0.5077485 0.5008249 0.5008249 0.5008249 0.5008249 0.4960220
> [50] 0.4960220
>
> > print(eW$vector[,13])
>  [1]   0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
> [20] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
> [39] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

This is undoubtedly a bug, but I doubt that it is really down to R.

Synopsis: a certain matrix causes eigen(symmetric=TRUE) to produce NaN's in some of the returned eigenvectors. This happens using SuSE 8.2 Professional and the precompiled R rpm (happens in both 1.8.1 and 1.7.1). I don't see it under Windows.

To reproduce the bug: The matrix (75 by 75, about 45K) is in

http://www.burns-stat.com/pages/Flotsam/eigenbugmatrix.RData

Load that into R.

sum(is.na(eigen(eigenbugmatrix, symmetric=TRUE)$vectors))

is non-zero (600 in my experience) when the bug is exhibited and is zero when the bug is not. It is quite sensitive to the numbers. The bug is still there with some scaling of the matrix (from about divide by 2 to multiply by 8). The bug disappears if the matrix is dumped and sourced back in again.

The only clue that I can offer is that it is vectors 43:50 (I think) that are NaN's and the matrix is logically of rank 50. That is, it is a covariance matrix on 75 variables using 50 observations.

Hopefully, someone has the experience and tenacity to figure out what is going on here.

Patrick Burns
Burns Statistics
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
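For anyone probing this class of problem, a sketch of the sanity checks involved. The matrix below is randomly generated to mirror the "covariance of 75 variables from 50 observations" shape in the report; it stands in for the real eigenbugmatrix, which is not reproduced here and is needed to actually trigger the bug:

```r
# Build a rank-deficient 75x75 covariance matrix (rank at most 50),
# mirroring the shape described in the report.
set.seed(1)
X <- matrix(rnorm(50 * 75), nrow = 50, ncol = 75)
C <- cov(X)

eW <- eigen(C, symmetric = TRUE)

# The reported symptom: NaNs appearing in the eigenvectors.
sum(is.na(eW$vectors))   # 0 when the bug is not exhibited

# A correct decomposition should reconstruct C to rounding error.
max(abs(C - eW$vectors %*% diag(eW$values) %*% t(eW$vectors)))
```

On a correct LAPACK build the NaN count is zero and the reconstruction error is on the order of machine precision; a non-zero count or a large residual would reproduce the symptom described above.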