Re: [R] Number of Cores limited to two in CRAN

2023-07-02 Thread Henrik Bengtsson
Short answer: You don't want to override that limit in your R package.
Don't do it.

Long answer: You'll find the reason for this in the 'CRAN Repository
Policy' (https://cran.r-project.org/web/packages/policies.html).
Specifically, the following passage:

"Checking the package should take as little CPU time as possible, as
the CRAN check farm is a very limited resource and there are thousands
of packages. Long-running tests and vignette code can be made optional
for checking, but do ensure that the checks that are left do exercise
all the features of the package.

**If running a package uses multiple threads/cores it must never use
more than two simultaneously: the check farm is a shared resource and
will typically be running many checks simultaneously.**

Examples should run for no more than a few seconds each: they are
intended to exemplify to the would-be user how to use the functions in
the package."

Basically, you can use two cores to demonstrate or validate (e.g. in
package tests) that your code *can* run in parallel, but you must not
use more than that to demonstrate that your code can "run super fast".
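
A minimal sketch of how a package's examples or tests might stay within the two-core limit. The helper name choose_workers() is hypothetical. The environment variable '_R_CHECK_LIMIT_CORES_' is, as far as I know, the one that parallel's internal .check_ncores() consults, and the sketch assumes it is set during 'R CMD check --as-cran'.

```r
# Hypothetical helper: cap the number of workers at two whenever the
# '_R_CHECK_LIMIT_CORES_' environment variable signals a check run.
choose_workers <- function(requested = 2L) {
  chk <- tolower(Sys.getenv("_R_CHECK_LIMIT_CORES_", ""))
  limited <- nzchar(chk) && chk != "false"
  if (limited) min(as.integer(requested), 2L) else as.integer(requested)
}

# Usage in an example or a test (not run here):
#   cl <- parallel::makeCluster(choose_workers(2L))
#   on.exit(parallel::stopCluster(cl))
n_workers <- choose_workers(16L)
```

The simplest option is still to hard-code two workers in examples and tests, and to leave the choice of a larger number to the end user.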

Even-longer answer: See my blog post 'Please Avoid detectCores() in
your R Packages' (https://www.jottr.org/2022/12/05/avoid-detectcores/)
from 2022-12-05 for even more reasons.

/Henrik

On Sun, Jul 2, 2023 at 9:55 AM Ravi Varadhan via R-help
 wrote:
>
> This is the specific error message from R CMD check --as-cran
>
>
> Error in .check_ncores(length(names)) : 16 simultaneous processes spawned
>   Calls: prepost -> makeCluster -> makePSOCKcluster -> .check_ncores
>   Execution halted
>
>
> Thanks,
> Ravi
>
> 
> From: Ravi Varadhan
> Sent: Saturday, July 1, 2023 1:15 PM
> To: R-Help 
> Subject: Number of Cores limited to two in CRAN
>
> Hi,
> I am developing a package where I would like to utilize multiple cores for 
> parallel computing.  However, I get an error message when I run R CMD check 
> --as-cran.
>
> I read that CRAN limits the number of cores to 2.  Is this correct? Is there 
> any way to overcome this limitation?
>
> Thank you,
> Ravi
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] suprising behaviour of tryCatch()

2023-05-18 Thread Henrik Bengtsson
... or just put the R expression inside curly brackets, e.g.

tryCatch({
  sexsnp[i] = fisher.test(table(data[,3], data[,i+38]))$p
}, error=function(e) print(NA))

Exercise: Compare

> list(a = 2)
$a
[1] 2

with

> list({ a = 2 })
[[1]]
[1] 2

and

> list(b = { a = 2 })
$b
[1] 2

BTW, note how the latter two assigned a <- 2 to the global environment.
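
The original call fails because R parses 'name = value' inside a function call as a named argument, and 'sexsnp[i]' is not a valid argument name. A minimal sketch of an alternative pattern, which assigns the value returned by tryCatch() instead of assigning inside it; the data and the simulated failure are made up for illustration:

```r
# Made-up stand-in for the per-column test: fails on purpose for i == 2
p_values <- rep(NA_real_, 3)
for (i in seq_along(p_values)) {
  p_values[i] <- tryCatch({
    if (i == 2) stop("simulated failure")
    i / 10
  }, error = function(e) NA_real_)  # the handler's value becomes the result
}
p_values
```

With this pattern the error handler returns NA rather than printing it, so the failed element is recorded explicitly.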

/Henrik

On Thu, May 18, 2023 at 8:22 AM Berwin A Turlach
 wrote:
>
> G'day Federico,
>
> On Wed, 17 May 2023 10:42:17 +
> "Calboli Federico (LUKE)"  wrote:
>
> > sexsnp = rep(NA, 1750)
> > for(i in 1:1750){tryCatch(sexsnp[i] = fisher.test(table(data[,3],
> > data[,i + 38]))$p, error = function(e) print(NA))} Error: unexpected
> > '=' in "for(i in 1:1750){tryCatch(sexsnp[i] ="
>
> Try:
>
> R> for(i in 1:1750){tryCatch(eval(expression("sexsnp[i] = 
> fisher.test(table(data[,3], data[,i+38]))$p")), error=function(e)print(NA))}
>
> or
>
> R> for(i in 1:1750){tryCatch(bquote("sexsnp[i] = fisher.test(table(data[,3], 
> data[,i+38]))$p"), error=function(e) print(NA))}
>
> or
>
> R> for(i in 1:1750){tryCatch(.("sexsnp[i] = fisher.test(table(data[,3], 
> data[,i+38]))$p"), error=function(e) print(NA))}
>
> If you want to use the '='.
>
> Cheers,
>
> Berwin


Re: [R] netstat in R in linux...

2022-12-06 Thread Henrik Bengtsson
> By the by, what advantages does port4me have as compared to netstat?

As I said in my previous email, it doesn't require external tools, so
it's more likely to work out of the box for more people. But that
wasn't the main reason for this package. For the full motivation
behind port4me, see the vignette
<https://cran.r-project.org/web/packages/port4me/vignettes/port4me-overview.html>.

/Henrik

On Tue, Dec 6, 2022 at 11:00 AM akshay kulkarni  wrote:
>
> Dear Henrik
> It is working, thanks a lot! I had actually previously tried
> 'sudo yum install netstat' rather than 'sudo yum install net-tools'.
>
> By the by, what advantages does port4me have as compared to netstat?
>
> Thanking you,
> Yours sincerely,
> AKSHAY M KULKARNI
>
> ____
> From: Henrik Bengtsson 
> Sent: Wednesday, December 7, 2022 12:19 AM
> To: akshay kulkarni 
> Cc: R help Mailing list 
> Subject: Re: [R] netstat in R in linux...
>
> Okay,
>
> that means that the Linux machine you run this on does not have
> the 'netstat' software installed.  That is something that needs to be
> installed outside of R.  For example, if it's Ubuntu, I think 'sudo
> apt install net-tools' will do.
>
> (Disclaimer: I'm the author)
> A cross-platform alternative to netstat::free_port(), is
> port4me::port4me(), which is also available from CRAN
> (https://cran.r-project.org/package=port4me). It requires no external
> tools, but R (>= 4.0.0).
>
> /Henrik
>
> On Tue, Dec 6, 2022 at 10:41 AM akshay kulkarni  wrote:
> >
> > Dear Henrik,
> > The error is:
> >
> > > library(netstat)
> > > free_port()
> > sh: netstat: command not found
> > Error in system("netstat -n -a", intern = TRUE) :
> >   error in running command
> >
> > Thanking you,
> > Yours sincerely
> > AKSHAY M KULKARNI
> > 
> > From: Henrik Bengtsson 
> > Sent: Tuesday, December 6, 2022 11:53 PM
> > To: akshay kulkarni 
> > Cc: R help Mailing list 
> > Subject: Re: [R] netstat in R in linux...
> >
> > What's the error?!?
> >
> > /Henrik
> >
> > On Tue, Dec 6, 2022 at 10:19 AM akshay kulkarni  
> > wrote:
> > >
> > > dear members,
> > >
> > > > I am using free_port() from the netstat package in R. It works on Windows
> > > > but throws an error on Linux. Any help, please?
> > > >
> > > > Thanking you,
> > > Yours sincerely
> > > AKSHAY M KULKARNI
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide 
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] netstat in R in linux...

2022-12-06 Thread Henrik Bengtsson
Okay,

that means that the Linux machine you run this on does not have
the 'netstat' software installed.  That is something that needs to be
installed outside of R.  For example, if it's Ubuntu, I think 'sudo
apt install net-tools' will do.

(Disclaimer: I'm the author)
A cross-platform alternative to netstat::free_port(), is
port4me::port4me(), which is also available from CRAN
(https://cran.r-project.org/package=port4me). It requires no external
tools, but R (>= 4.0.0).
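
A small sketch of how one might check for the external tool before calling netstat::free_port(). It only uses base R's Sys.which(); the message text is illustrative.

```r
# Check whether the external 'netstat' command is on the PATH before
# relying on netstat::free_port(), which shells out to it.
has_netstat <- unname(nzchar(Sys.which("netstat")))

if (!has_netstat) {
  message("'netstat' not found; install it outside of R, ",
          "or use a tool-free alternative such as port4me::port4me().")
}
```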

/Henrik

On Tue, Dec 6, 2022 at 10:41 AM akshay kulkarni  wrote:
>
> Dear Henrik,
> The error is:
>
> > library(netstat)
> > free_port()
> sh: netstat: command not found
> Error in system("netstat -n -a", intern = TRUE) :
>   error in running command
>
> Thanking you,
> Yours sincerely
> AKSHAY M KULKARNI
> 
> From: Henrik Bengtsson 
> Sent: Tuesday, December 6, 2022 11:53 PM
> To: akshay kulkarni 
> Cc: R help Mailing list 
> Subject: Re: [R] netstat in R in linux...
>
> What's the error?!?
>
> /Henrik
>
> On Tue, Dec 6, 2022 at 10:19 AM akshay kulkarni  wrote:
> >
> > dear members,
> >
> > I am using free_port() from the netstat package in R. It works on Windows
> > but throws an error on Linux. Any help, please?
> >
> > Thanking you,
> > Yours sincerely
> > AKSHAY M KULKARNI


Re: [R] netstat in R in linux...

2022-12-06 Thread Henrik Bengtsson
What's the error?!?

/Henrik

On Tue, Dec 6, 2022 at 10:19 AM akshay kulkarni  wrote:
>
> dear members,
>
> I am using free_port() from the netstat package in R. It works on Windows
> but throws an error on Linux. Any help, please?
>
> Thanking you,
> Yours sincerely
> AKSHAY M KULKARNI


Re: [R] When using require(), why do I get the error message "Error in if (!loaded) { : the condition has length > 1" ?

2022-10-24 Thread Henrik Bengtsson
You need to pass character.only = TRUE to require() whenever you
specify the package using a character variable.

I agree, the error message is confusing.
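
A minimal sketch of the fix. Without character.only = TRUE, require() works on the unevaluated expression it was given; converting an expression such as pkgs[1] to character yields three strings rather than one, which appears to be what trips the length-one condition in the error message. The vector name 'pkgs' is just for illustration.

```r
pkgs <- c("stats", "this_pac_does_not_exist")

# What the unevaluated expression looks like once turned into strings
captured <- as.character(quote(pkgs[1]))

# With character.only = TRUE the value of the variable is used instead
ok <- vapply(pkgs, function(p) {
  suppressWarnings(require(p, character.only = TRUE, quietly = TRUE))
}, logical(1))
ok
```

If the goal is only to test whether a package is installed, requireNamespace(p, quietly = TRUE) takes a character string directly and does not attach the package.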

/Henrik

On Mon, Oct 24, 2022 at 9:26 AM Kelly Thompson  wrote:
>
> # Below, when using require(), why do I get the error message "Error
> in if (!loaded) { : the condition has length > 1" ?
>
> # This is my reproducible code:
>
> #create a vector with the names of the packages I want to use
> packages_i_want_to_use <- c('base', 'this_pac_does_not_exist')
>
> # Here I get error messages:
> require( packages_i_want_to_use[1] )
> #Error in if (!loaded) { : the condition has length > 1
>
> require( packages_i_want_to_use[2] )
> #Error in if (!loaded) { : the condition has length > 1
>
> # Here I get what I expect:
> require('base')
>
> require('this_pac_does_not_exist')
> #Loading required package: this_pac_does_not_exist
> #Warning message:
> #In library(package, lib.loc = lib.loc, character.only = TRUE,
> logical.return = TRUE,  :
> #  there is no package called ‘this_pac_does_not_exist’


Re: [R] Deprecating download method='wininet' in R on Windows causes trouble with corporate proxy

2022-09-29 Thread Henrik Bengtsson
Is R centrally installed?  If so, environment variables 'HTTP_PROXY',
'HTTPS_PROXY', and 'HTTPS_PROXY_USER' could be set for all users by
setting them in the R_HOME/etc/Renviron.site file.  R_HOME is the
folder where R is installed.  You can find this file from within R by
calling:

> file.path(R.home("etc"), "Renviron.site")
[1] "C:/PROGRA~1/R/R-42~1.1/etc/Renviron.site"

If not centrally installed, I don't know anything better than users
setting them in their personal ~/.Renviron file;

> normalizePath("~/.Renviron")
[1] "C:\\Users\\alice\\Documents\\.Renviron"

For example,

> cat(file = "~/.Renviron", append = TRUE,
+     "HTTP_PROXY=http://proxy-host:3128/",
+     "HTTPS_PROXY=https://proxy-host:3128/",
+     "HTTPS_PROXY_USER=dummy", sep = "\n")

At least this avoids having to configure them in MS Windows settings,
which is tedious to document and explain.
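
A sketch for trying the settings out without touching a real ~/.Renviron: write them to a temporary file and load it into the running session with readRenviron(). The host name and port are the placeholders used above, not real values.

```r
# Write the proxy settings to a throwaway file instead of ~/.Renviron
renviron <- tempfile(fileext = ".Renviron")
writeLines(c(
  "HTTP_PROXY=http://proxy-host:3128/",
  "HTTPS_PROXY=https://proxy-host:3128/",
  "HTTPS_PROXY_USER=dummy"
), renviron)

# Load the file into the running session
loaded <- readRenviron(renviron)

# Confirm that the variables are now visible to R
proxy_vars <- Sys.getenv(c("HTTP_PROXY", "HTTPS_PROXY", "HTTPS_PROXY_USER"))
proxy_vars
```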

My $.02

Henrik

On Thu, Sep 29, 2022 at 3:48 PM Selke, Gisbert W.
 wrote:
>
> Method="wininet" is deprecated and scheduled to go away, the standard method 
> is now libcurl. This causes trouble for all R users in our shop, because we 
> are sitting behind a corporate proxy, which uses Kerberos authentication. 
> (We're all on Windows.)
>
> Using wininet, this used to work without problems and without additional 
> effort; it currently still does with explicit method="wininet" (which, by the 
> way, precludes use of the handy menu command "Update packages", which will 
> use the default method, i.e., libcurl as of now.)
>
> For the future, when wininet will be gone for good, the only option we have 
> is to resort to first setting environment variables HTTP_PROXY and 
> HTTPS_PROXY and then tricking the proxy out of using Kerberos, setting 
> HTTPS_PROXY_USER to a dummy string.
> This is certainly doable for R users with enough knowledge of the 
> technicalities of internet access, but our average R user will just be lost. 
> As has been pointed out elsewhere 
> (https://github.com/rstudio/rstudio/issues/10163#issuecomment-1154071514) , 
> this will create a lot of blood, sweat and tears (and swears), and it is a 
> moderate nightmare to maintain consistently and up-to-date for many users.
>
> My first question is: Since we are probably not the only institution in this 
> situation, has anyone come up with a robust and maintainable solution other 
> than our approach described above?
>
> Failing that: would it be possible at all to change the use that the R core 
> makes of libcurl in such a way that it would automagically Do The Right Thing 
> (tm)? In principle, this should be possible; after all, wininet did the 
> trick, and ordinary browsers can handle this situation. (Disclaimer: I know 
> nothing about the R internals so cannot say whether I am being overly naïve 
> here.)
>
> Any help appreciated.
>
> (I'm new to this list, so if this has been discussed here before, I apologize 
> and would be grateful for a pointer to do my reading.)


Re: [R] error with more 100 forked processes

2022-04-08 Thread Henrik Bengtsson
The reason you hit the limit already at around 100 workers could be
that you already have other connections open, e.g. file connections,
capture.output(), etc.

If you want to use *forked* processing with more than 125 workers
using bare-bone R, you can use parallel::mclapply() and friends,
because they don't use socket connections to communicate between the
main process and the workers.
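
A minimal sketch of the mclapply() route, using two cores and a toy function for illustration. Forking is not available on MS Windows, so the sketch falls back to one core there.

```r
library(parallel)

# Forked processing is only available on Unix-like systems
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Each element is processed in a forked child; no cluster object and
# no socket connections are involved.
squares <- mclapply(1:4, function(i) i^2, mc.cores = n_cores)
unlist(squares)
```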

If you don't need *forked* processing per se, there are other
alternatives, as already pointed out above.

As the author of the future framework (https://www.futureverse.org/),
I obviously suggest you try that one. It's on CRAN and installs out of
the box on all OSes. You get several alternatives for parallel
backends. For *forked* processing, call plan(multicore) at the top of
your script, and it'll parallelize via the parallel::mclapply()
framework internally, so you won't have the connection limitation to
worry about(*). You can also use plan(future.callr::callr) to
parallelize via the callr package, which doesn't have the connection
limitation either. Your code will be the same regardless of which you
end up using.  For the front end, there's
future.apply::future_lapply() et al. (parallel versions of the base
lapply functions), furrr::future_map() et al. (parallel versions of
purrr's map functions), and foreach w/ doFuture if you like the
y <- foreach(...) %dopar% { ... } style.

(*) But there are other issues with forked processing, e.g. it might
not be compatible with multi-threaded code used by some packages. This
is a problem independent of futures per se.
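
A minimal sketch of what "the same code regardless of backend" could look like. It assumes the future.apply package is installed and does nothing otherwise. The sequential backend is used here only as a safe stand-in; on a Unix-like system, one would swap in future::multicore for forked processing.

```r
if (requireNamespace("future.apply", quietly = TRUE)) {
  # Choose the backend once; the lapply-style code below stays the same.
  old_plan <- future::plan(future::sequential)

  y <- future.apply::future_lapply(1:4, function(i) i^2)

  # Restore whatever backend was active before
  future::plan(old_plan)
}
```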

Hope this helps

Henrik

On Fri, Apr 8, 2022 at 2:19 PM Ivan Krylov  wrote:
>
> On Fri, 8 Apr 2022 22:02:25 +0200
> Guido Kraemer via R-help  wrote:
>
> >  > cl <- makeForkCluster(128)
> > Error in UseMethod("sendData") :
> >no applicable method for 'sendData' applied to an object of class
> > "NULL"
>
> In order to communicate with the workers, R creates connection objects.
> Unfortunately, the memory for connection objects in R has a
> statically-defined limit of 128. (A few connections are used by
> default, and a few more will likely be used by user code during the
> actual program run.)
>
> Try increasing the limit in #define NCONNECTIONS in
> src/main/connections.c and re-compiling R.
>
> See also: https://github.com/HenrikBengtsson/Wishlist-for-R/issues/28
> According to Henrik Bengtsson, R should work well even with as many
> as 16381 possible connections, but then you may run into OS limits on
> file descriptors.
>
>
> --
> Best regards,
> Ivan


Re: [R] checkpointing

2021-12-14 Thread Henrik Bengtsson
On Tue, Dec 14, 2021 at 1:17 AM Andy Jacobson  wrote:
>
> Those are good points, Duncan. I am experimenting with a nice checkpointing 
> tool called DMTCP. It operates on the system level but is quite OS-dependent. 
> It can be found at http://dmtcp.sourceforge.net/index.html.
>
> Still, it would be nice to be able to checkpoint calls within R to 
> potentially long-running processes like optim().

Teasing idea. Imagine if we could come up with some de-facto standard
API for this and that such a framework could be called automatically
by R. Something similar to how user interrupts are checked (e.g.
R_CheckUserInterrupt()) on a regular basis by the R engine and
throughout the R code. That could help with troubleshooting and debugging,
e.g. sending the checkpoint to someone else or going backwards in
time.
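
Until something like that exists, checkpointing has to be done by hand at the R level. A minimal sketch, assuming the whole state of the computation fits in an R list; the function name and the toy computation (a running sum) are hypothetical:

```r
# Hypothetical helper: sum 1..n_total, saving the state after every step
# so that an interrupted run can resume from the last checkpoint.
run_with_checkpoints <- function(n_total, checkpoint_file) {
  state <- if (file.exists(checkpoint_file)) {
    readRDS(checkpoint_file)
  } else {
    list(i = 0L, total = 0)
  }
  while (state$i < n_total) {
    state$i <- state$i + 1L
    state$total <- state$total + state$i
    saveRDS(state, checkpoint_file)
  }
  state
}

checkpoint_file <- tempfile(fileext = ".rds")
partial <- run_with_checkpoints(3L, checkpoint_file)  # "interrupted" after 3 steps
resumed <- run_with_checkpoints(5L, checkpoint_file)  # picks up at step 4
```

This only works when the state lives entirely in R variables; it cannot capture what happens inside a single long-running call such as optim().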

Pasting in the below since I failed to hit Reply *All* the other day,
and it was only Richard who got it:

A few weeks ago, I played around with DMTCP (Distributed MultiThreaded
CheckPointing) for Linux (https://github.com/dmtcp/dmtcp).  I'm
sharing in case someone is interested in investigating this further.
Also, somewhere on the DMTCP wiki, they asked for testing with R by
more experienced users.

"DMTCP is a tool to transparently checkpoint the state of multiple
simultaneous applications, including multi-threaded and distributed
applications. It operates directly on the user binary executable,
without any Linux kernel modules or other kernel modifications."

They seem to be able to run this with HPC jobs, open files, Linux
containers, and even MPI, and so on.  I've only tested it very quickly
with interactive R and it seems to work.  Obviously more testing needs
to be done to identify when it doesn't work.  For example, I'd have a
hard time believing it would work out of the box with local parallel
PSOCK workers.  They mention "plug-ins", so maybe there's a way to add
support for specific use cases one by one.

Different academic HPC environments appear to use it, e.g.

* https://docs.nersc.gov/development/checkpoint-restart/dmtcp/
* http://wiki.orc.gmu.edu/mkdocs/Creating_Checkpoints_%28DMTCP%29/
* https://wiki.york.ac.uk/display/RCS/VK21%29+Checkpointing+with+DMTCP

That's all I have time for now,

Henrik

>
> -Andy
>
> On 12/13/21 11:51 AM, Duncan Murdoch wrote:
> > On 13/12/2021 12:58 p.m., Greg Minshall wrote:
> >> Jeff,
> >>
> >>> This sounds like an OS feature, not an R feature... certainly not a
> >>> portable R feature.
> >>
> >> i'm not arguing for it, but this seems to me like something that could
> >> be a language feature.
> >>
> >
> > R functions can call libraries written in other languages, and can start 
> > processes, etc.  R doesn't know everything going on in every function call, 
> > and would have a lot of trouble saving it.
> >
> > If you added some limitations, e.g. a process that periodically has its 
> > entire state stored in R variables, then it would be a lot easier.
> >
> > Duncan Murdoch
>
> --
> Andy Jacobson
> a...@yovo.org


Re: [R] I'd like to request that my R CRAN package is not tested on Solaris OS

2021-10-22 Thread Henrik Bengtsson
I agree with others that this suggests there is a hidden bug in the
code.  In addition to running with Valgrind, R-hub's

> rhub::check(platform="linux-x86_64-rocker-gcc-san")

will compile the native code with the Address Sanitizer (ASan) and the
UndefinedBehaviorSanitizer (UBSan).  Those have helped me in the past
to track down mistakes, and even spot things I was not aware of.  And
it gives peace of mind as a developer when these tools and the Valgrind
checks all report OK.

The R-hub service is cross-platform and requires no local setup.

/Henrik

On Fri, Oct 22, 2021 at 7:41 AM Bill Dunlap  wrote:
>
> I agree with Stefan.  Try using valgrind (on Linux) to check for memory
> misuse:
>
> R --debugger=valgrind --debugger-args="--leak-check=full
> --track-origins=yes"
> ...
> > yourTests()
> > q("no")
>
> -Bill
>
>
> On Fri, Oct 22, 2021 at 7:30 AM Stefan Evert 
> wrote:
>
> > Just to add my personal cent to this:  I've had similar issues with an R
> > package some time ago, which kept crashing somewhat unpredictably in the
> > Solaris tests.
> >
> > Debugging was hard because it only happened on Solaris, but in the end it
> > turned out to be due to serious bugs in the code that only happened to
> > surface in the Solaris tests.   I would think that it's likely to be the
> > same for your package, so the segfaults shouldn't be accepted too readily
> > as a platform quirk.
> >
> > Best
> > SE
> >
> >
> > > On 22 Oct 2021, at 15:47, Marc Schwartz via R-help 
> > wrote:
> > >
> > >
> > > 1. The CRAN repository policy here:
> > >
> > >  https://cran.r-project.org/web/packages/policies.html
> > >
> > > notes:
> > >
> > > "Package authors should make all reasonable efforts to provide
> > cross-platform portable code. Packages will not normally be accepted that
> > do not run on at least two of the major R platforms. Cases for Windows-only
> > packages will be considered, but CRAN may not be the most appropriate place
> > to host them."
> > >
> > > That would seem to imply that, with reasonable justification, one may be
> > able to make a request of the CRAN maintainers to exclude at least one of
> > the OS platforms from testing. A request that would be at the discretion of
> > the CRAN maintainers and Solaris, in light of the low market prevalence,
> > may be a more common exclusion as you have noted below.


Re: [R] small object but huge RData file exported

2021-10-20 Thread Henrik Bengtsson
Example illustrating what Duncan says:

> make_formula <- function() { large <- rnorm(1e6); x ~ y }
> formula <- make_formula()

# "Apparent" size of object
> object.size(formula)
728 bytes

# Actual serialization size
> length(serialize(formula, connection = NULL))
[1] 8000203

# A better size estimate
> lobstr::obj_size(formula)
8,000,888 B
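
A sketch of one possible workaround, building on the example above: point the formula's environment at the global environment before saving, so that the large local variable is no longer dragged along. Whether that is safe depends on whether later code needs variables from the original environment.

```r
make_formula <- function() { large <- rnorm(1e6); x ~ y }
f <- make_formula()

# Serialized size while the formula still references the function's
# local environment (which holds 'large')
size_with_env <- length(serialize(f, connection = NULL))

# Replace the environment; the global environment is stored by reference
environment(f) <- globalenv()
size_without_env <- length(serialize(f, connection = NULL))

c(with_env = size_with_env, without_env = size_without_env)
```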

/Henrik

On Wed, Oct 20, 2021 at 12:57 PM Duncan Murdoch
 wrote:
>
> On 20/10/2021 9:20 a.m., Jinsong Zhao wrote:
> > On 2021/10/20 21:05, Duncan Murdoch wrote:
> >> On 20/10/2021 8:57 a.m., Jinsong Zhao wrote:
> >>> Hi there,
> >>>
> >>> I have a RData file that is obtained by save.image() with size about
> >>> 74.0 MB (77,608,222 bytes).
> >>>
> >>> When loaded into R, I measured the size of each object with object.size():
> >>>
> >>>   > object.size(combn.rda.m)
> >>> 105448 bytes
> >>>   > object.size(cross)
> >>> 102064 bytes
> >>>   > object.size(denitr.1)
> >>> 25032 bytes
> >>>   > object.size(rda.denitr.1)
> >>> 600280 bytes
> >>>   > object.size(xh)
> >>> 7792 bytes
> >>>   > object.size(xh.x)
> >>> 6064 bytes
> >>>   > object.size(xh.x.1)
> >>> 24144 bytes
> >>>   > object.size(xh.x.2)
> >>> 24144 bytes
> >>>   > object.size(xh.x.3)
> >>> 24144 bytes
> >>>   > object.size(xh.y)
> >>> 2384 bytes
> >>>
> >>> These are all small objects.
> >>>
> >>> If I delete the largest one "rda.denitr.1", and save.image("xx.RData").
> >>> It has the size of 22.6 KB (23,244 bytes). All seem OK.
> >>>
> >>> However, when I save(rda.denitr.1, file = "yy.RData"), then it has the
> >>> size of 73.9 MB (77,574,869 bytes).
> >>>
> >>> I don't know why...
> >>>
> >>> Any hint?
> >>
> >> As the docs for object.size() say, "Exactly which parts of the memory
> >> allocation should be attributed to which object is not clear-cut."  In
> >> particular, if a function or formula has an associated environment, it
> >> isn't included, but it is sometimes saved in the image.
> >>
> >> So I'd suspect rda.denitr.1 contains something that references an
> >> environment, and it's an environment that would be saved.  (I forget the
> >> exact rules, but I think that means it's not the global environment and
> >> it's not a package environment.)
> >>
> >> Duncan Murdoch
> >
> >
> > The rda.denitr.1 is only a list with length 2:
> > rda.denitr.1[[1]] is a vector with length 10;
> > rda.denitr.1[[2]] is a list of length 10. rda.denitr.1[[2]][[1]]
> > to rda.denitr.1[[2]][[10]] are small RDA objects generated by rda() from
> > vegan package.
> >
> > If I
> >   > a <- rda.denitr.1[[2]][[1]]
> >   > object.size(a)
> > 59896 bytes
> >   > save(a, file = "abc.RData")
> > It also has a large size of 73.9 MB (77,536,611 bytes)
> >
> > Jinsong
> >
>
> The rda() function uses formulas.  If it saves the formula in the
> result, then it references the environment of that formula, typically
> the environment where the formula was created.
>
> Duncan Murdoch


Re: [R] Package installation help: Stuck at "** byte-compile and prepare package for lazy loading"

2021-09-29 Thread Henrik Bengtsson
I just tried on an up-to-date CentOS 7 with R 4.1.1 built from source
using gcc 8.3.1 (from SCL devtoolset-8; so not the default gcc 4.8.5),
and it works there.  In case it's of any help, here's the output when
installing to the user's personal package library:

> chooseCRANmirror(ind = 1)
> install.packages("forensim")
Installing package into
‘/c4/home/henrik/R/x86_64-pc-linux-gnu-library/4.1-CBI-gcc8’
(as ‘lib’ is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/forensim_4.3.tar.gz'
Content type 'application/x-gzip' length 84232 bytes (82 KB)
==
downloaded 82 KB

* installing *source* package ‘forensim’ ...
** package ‘forensim’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
gcc -I"/software/c4/cbi/software/R-4.1.1-gcc8/lib64/R/include"
-DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c auxilary.c -o
auxilary.o
gcc -I"/software/c4/cbi/software/R-4.1.1-gcc8/lib64/R/include"
-DNDEBUG   -I/usr/local/include   -fpic  -g -O2  -c recursFinal.c -o
recursFinal.o
gcc -shared -L/software/c4/cbi/software/R-4.1.1-gcc8/lib64/R/lib
-L/usr/local/lib64 -o forensim.so auxilary.o recursFinal.o
-L/software/c4/cbi/software/R-4.1.1-gcc8/lib64/R/lib -lR
installing to 
/c4/home/henrik/R/x86_64-pc-linux-gnu-library/4.1-CBI-gcc8/00LOCK-forensim/00new/forensim/libs
** R
** data
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (forensim)

The downloaded source packages are in
‘/scratch/henrik/RtmpYFlQyS/downloaded_packages’

You could also try to install it via 'R CMD INSTALL' and try with
different options disabled to maybe narrow in on what's going on.
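
A rough sketch of that idea driven from R, so that a hanging install can be given a time limit. Both helper names are hypothetical. It relies on the 'timeout' argument of system2() and on the '--no-byte-compile' option of 'R CMD INSTALL'; whether the timeout reliably stops every child process on every platform is an assumption I have not verified.

```r
# Hypothetical helper: build the argument vector for 'R CMD INSTALL',
# optionally skipping the byte-compile step.
install_args <- function(tarball, byte_compile = TRUE) {
  c("CMD", "INSTALL", if (!byte_compile) "--no-byte-compile", shQuote(tarball))
}

# Hypothetical helper: run the install with a time limit (in seconds)
install_with_timeout <- function(tarball, timeout = 600, byte_compile = TRUE) {
  r_bin <- file.path(R.home("bin"), "R")
  system2(r_bin, install_args(tarball, byte_compile), timeout = timeout)
}

# Not run here:
#   install_with_timeout("forensim_4.3.tar.gz", byte_compile = FALSE)
```

Looping such a call over a list of tarballs would let a bulk install move on to the next package when one of them hangs.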

My $.02

/Henrik

On Wed, Sep 29, 2021 at 6:37 PM Brodie, Kent via R-help
 wrote:
>
> Hey everyone!  So, I've been asked by one of our researchers to install "all" 
> cran packages on one of our servers.Yeah, it's a bit much (and clearly, 
> not everything will install correctly due to various missing tidbits), but it 
> will go a long way to having to constantly respond to install requests of 
> various packages.  OK, fine.
>
> Anyway, I have tried this, and things proceed nicely for several hours until 
> the process gets completely and absolutely stuck.   I have tried several 
> things, for example trying newer versions of R, and even repeating the 
> process on a CentOS 8 server instead of where my stuff is now (CentOS 7).
>
> While re-trying one of my newer attempts at this, I decided to focus on the 
> very first place where it hangs.   It dies on package "forensim".  It's a
> slightly older package, and I don't see anything particularly special about
> it.  The text below is where it hangs.  The install "R" process doing
> this is whizzing at 100%,  but no progress, no output, no errors.  Nothing in 
> the system logs.
>
> On new "R" installs, with either operating system (CentOS 7, CentOS 8) and 
> even different versions of "R" (up to including 4.1.1), I get the same result 
> when just attempting to install this ONE package.  (And my guess, there's
> more packages out there that may bite me the same way).
>
> I am seeking any recommendations on how I can get past this?   A debug 
> option?  A timeout of sorts so things will  move along to the next package 
> when attempting to install a ton, or...?   I'm of course willing to try 
> anything.   It's stupidly frustrating.  I'd be OK if it errored out and
> moved on.   But it...  hangs.
>
> Here's the latter part of the install attempt of this one package. This 
> latest attempt has been stuck on that last line now for 5 hours and counting.
>
> * DONE (tkrplot)
> Making 'packages.html' ... done
> * installing *source* package 'forensim' ...
> ** package 'forensim' successfully unpacked and MD5 sums checked
> ** using staged installation
> ** libs
> gcc -m64 -I"/usr/include/R" -DNDEBUG   -I/usr/local/include   -fpic  -O2 -g 
> -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 
> -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong 
> -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 
> -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic 
> -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection  -c 
> auxilary.c -o auxilary.o
> gcc -m64 -I"/usr/include/R" -DNDEBUG   -I/usr/local/include   -fpic  -O2 -g 
> -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 
> -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong 
> -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 
> -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic 
> 

Re: [R] Getting different results with set.seed()

2021-08-19 Thread Henrik Bengtsson
You need to use %dorng% from the doRNG package instead of %dopar% when
parallelizing with foreach::foreach() to get reproducible random
numbers.  See also
https://www.jottr.org/2020/09/22/push-for-statical-sound-rng/.
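
The same idea can be sketched with only the base 'parallel' package, in case it helps to see what "seeding the workers" means.  This is not the doRNG implementation, just an illustration under the assumption of a fixed number of workers and static scheduling; the function name draw() is made up.

```r
library(parallel)

# Illustrative only: draw four random numbers on a two-worker cluster,
# seeding the workers' L'Ecuyer RNG streams from a single seed.
draw <- function(seed) {
  cl <- makeCluster(2)
  on.exit(stopCluster(cl))
  clusterSetRNGStream(cl, iseed = seed)
  parSapply(cl, 1:4, function(i) rnorm(1))
}

first  <- draw(123)
second <- draw(123)
same   <- identical(first, second)
```

Calling set.seed() only in the main R session does not reach the workers, which is why the seed has to be pushed to them explicitly here.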

/Henrik

On Thu, Aug 19, 2021 at 1:38 PM Eric Berger  wrote:
>
> In that case, another interesting test would be to check whether the
> problem exists when you don't use doParallel().
>
>
> On Thu, Aug 19, 2021 at 2:28 PM Shah Alam  wrote:
> >
> > Dear All,
> >
> > Thanks a lot for your valuable suggestions. I am going to implement one by
> > one.
> >
> > Jan:
> >
> > Yes, I am using the "doParallel" package for parallelization. I will let
> > you know the results after implementing all the given suggestions.
> >
> > Best regards,
> > Shah
> >
> >
> >
> > On Thu, 19 Aug 2021 at 11:57, Jan van der Laan  wrote:
> >
> > >
> > >
> > > What you could also try is check if the self coded functions use the
> > > random generator when defining them:
> > >
> > > starting_seed <- .Random.seed
> > >
> > > Step 1. Self-coded functions (these functions generate random numbers as
> > > well)
> > >
> > > # check if functions have modified the seed:
> > > all.equal(starting_seed, .Random.seed)
> > >
> > > Step 2: set.seed (123)
> > >
> > >
> > >
> > > What has also happened to me is that some of the functions I called had
> > > their own random number generator independent of that of R. For example
> > > using one in C/C++.
> > >
> > > Do your functions do stuff in parallel? For example using the parallel
> > > or snow package? In that case you also have to set the seed in the
> > > parallel workers.
> > >
> > > Best,
> > > Jan
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On 19-08-2021 11:25, PIKAL Petr wrote:
> > > > Hi
> > > >
> > > > Did you try different order?
> > > >
> > > > Step 2: set.seed (123)
> > > >
> > > > Step 1. Self-coded functions (these functions generate random numbers as
> > > well)
> > > >
> > > > Step 3: Call those functions.
> > > >
> > > > Step 4: model results.
> > > >
> > > > Cheers
> > > > Petr.
> > > >
> > > > And BTW, do not use HTML formating, it could cause problems in text only
> > > list.
> > > >
> > > >
> > > > From: Shah Alam 
> > > > Sent: Thursday, August 19, 2021 10:10 AM
> > > > To: PIKAL Petr 
> > > > Cc: r-help mailing list 
> > > > Subject: Re: [R] Getting different results with set.seed()
> > > >
> > > > Dear Petr,
> > > >
> > > > It is more than 2000 lines of code with a lot of functions and data
> > > inputs. I
> > > > am not sure whether it would be useful to upload it. However, you are
> > > > absolutely right. I used
> > > >
> > > > Step 1. Self-coded functions (these functions generate random numbers as
> > > well)
> > > >
> > > > Step 2: set.seed (123)
> > > >
> > > > Step 3: Call those functions.
> > > >
> > > > Step 4: model results.
> > > >
> > > > I close the R session and run the code from step 1. I get different
> > > results
> > > > for the same set of values for parameters.
> > > >
> > > > Best regards,
> > > > Shah
> > > >
> > > >
> > > >
> > > >
> > > > On Thu, 19 Aug 2021 at 09:56, PIKAL Petr 
> > > > wrote:
> > > > Hi
> > > >
> > > > Please provide at least your code preferably with some data to reproduce
> > > > this behaviour. I wonder if anybody could help you without such
> > > information.
> > > >
> > > > My wild guess is that you used
> > > >
> > > > set.seed(1234)
> > > >
> > > > some code
> > > >
> > > > the code used again
> > > >
> > > > in which case you have to expect different results.
> > > >
> > > > Cheers
> > > > Petr
> > > >
> > > >> -Original Message-
> > > >> From: R-help  On Behalf Of Shah
> > > Alam
> > > >> Sent: Thursday, August 19, 2021 9:46 AM
> > > >> To: r-help mailing list 
> > > >> Subject: [R] Getting different results with set.seed()
> > > >>
> > > >> Dear All,
> > > >>
> > > >> I was using set.seed to reproduce the same results for the discrete
> > > event
> > > >> simulation model. I have 12 unknown parameters for optimization (just a
> > > >> little background). I got a good fit of parameter combinations. 
> > > >> However,
> > > >> when I use those parameters combinations again in the model. I am
> > > getting
> > > >> different results.
> > > >>
> > > >> Is there any problem with the set.seed. I assume the set.seed should
> > > >> produce the same results.
> > > >>
> > > >> I used set.seed(1234).
> > > >>
> > > >> Best regards,
> > > >> Shah
> > > >>
> > > >>[[alternative HTML version deleted]]
> > > >>
> > > >> __
> > > >> mailto:R-help@r-project.org mailing list -- To UNSUBSCRIBE and more,
> > > see
> > > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > > >> PLEASE do read the posting guide http://www.R-project.org/posting-
> > > >> guide.html
> > > >> and provide commented, minimal, self-contained, reproducible code.
> > > >>
> > > >> 

Re: [R] R does not start from (Debian) linux command line - error with doWithOneRestart() - segmentation fault

2021-04-08 Thread Henrik Bengtsson
Ashim, as Martin says, there's something really weird going on with
your core R installation.  This is definitely not expected, and I
don't think I've ever seen this reported before.  Here are
some questions/comments that might help you move forward and for
others to pitch in:

1. How did you install R?

2. What happens if you call:

$ Rscript --vanilla -e "1+2"

3. What happens if you call:

$ Rscript --vanilla --default-packages=base -e "1+2"

4. What happens if you call:

$ Rscript --vanilla --default-packages=base,methods -e "1+2"

If any of 2-4 gives a different result, that's a first clue.

/Henrik

On Wed, Apr 7, 2021 at 1:17 AM Ashim Kapoor  wrote:
>
> On Wed, Apr 7, 2021 at 12:51 PM Martin Maechler
>  wrote:
> >
> > > Ashim Kapoor
> > > on Wed, 7 Apr 2021 10:35:14 +0530 writes:
> >
> > > Dear R experts,
> >
> > > Here is my problem :
> >
> > > R startup FAILS with an error message. The error message
> > > is more meaningful when I do invoke R via sudo OR as
> > > root. I attach the startup messages when I invoke R as :
> >
> > > 1. as non root user 2. with sudo 3. as Root user.
> >
> > > The error messages ( mentioned in snippets below ) are
> > > more meaningful to me in the above mentioned order.
> >
> > Thank you, Ashim.
> >
> > Yes, the messages point to something really bad.
> >
> > OTOH ("On the other hand"), what we *can* see is that you try to
> > start R version 3.6.3.
> >
>
> My apologies for not being clear.
>
> I would like to clarify :
>
> 1. The error message from the incantation `R` is not very informative.
>
> 2. The error message from the incantation `sudo R` says (among other
> things which don't seem significant to me):
> Error in doWithOneRestart(return(` \\x82\\x0ccPV`), restart) :
>   not a proper file name
>
> 3. The error message from invoking R as a root user says:
>
> Error: not a proper file name
> Fatal error: unable to initialize the JIT
>
> I found this on the internet :
> https://stackoverflow.com/questions/19512165/error-in-dowithonerestart
>
> The error messages and the above link point to issues with JAVA, but
> it does not say how to fix them.
>
> That is why I thought : there is an .xlsx file which has non english
> characters which is messing with
> Java.
>
> Query : is there a way to do JAVA garbage collection when R is not starting ?
>
> > While that is not extremely old, it may well be older than
> > several other pieces of software (or even hardware) that you are
> > running with.
> >
> > I very very  *strongly* recommend to use an R version 4.0.x ... and why not
> > use the latest  4.0.5 ?
> >
>
> We upgraded to 4.0.5 but it did not make a difference. We restarted
> the computer and that fixed the error.
> I think the JAVA garbage collection kicked in when we reset the
> computer and that is why it was fixed on rebooting. Not sure though.
>
> > Then, it may also be caused by a mismatch of system libraries
> > and your oldish version of R. ... but there I'd strongly
> > recommend consulting with other Debian users, notably as there
> > is a dedicated  mailing list  R-SIG-Debian --> do subscribe
> > there, and ask -- with more details on how you got your R: Is it
> > the default R on your Debian, which version of Debian,  etc.
> >
> > Last but not least, Dirk Eddelbuettel, the maintainer of the
> > official R Debian package maintains a nice web page -- part of
> > the official CRAN web pages, but unfortunately a bit hidden
> > nowadays, (not the least because CRAN still uses frames (würg!!)):
> >
> >   https://cloud.r-project.org/bin/linux/debian/
> >
> > A very nice and useful page,  much underrated and underused,
> > probably.
> >
>
> Thank you for this.
>
> > Best regards,
> >
> > Martin Maechler
> > ETH Zurich  and  R Core Team
> >
> >
> > > When I google around for the error message, it looks like
> > > there is an .xlsx file which has non english characters
> > > which is messing with Java.
> >
> > > I do not know how to fix this. I tried :-
> >
> > > R --vanilla
> >
> > > so that it would not use any startup scripts but that also
> > > does not work.
> >
> > > - snip
> > > 
> > 
> >
> > > When I try to start R from the command line :
> >
> > > ~$ R
> >
> > >  *** caught segfault *** address (nil), cause 'unknown'
> >
> > > Traceback: 1: NextMethod(.Generic) 2:
> > > Ops.numeric_version(R_version_built_under, "3.0.0") 3:
> > > testRversion(pkgInfo, package, pkgpath) 4:
> > > library(package, lib.loc = lib.loc, character.only = TRUE,
> > > logical.return = TRUE, warn.conflicts = warn.conflicts,
> > > quietly = quietly, mask.ok = mask.ok, exclude = exclude,
> > > include.only = include.only, attach.required =
> > > attach.required) 5: doTryCatch(return(expr), name,
> > > parentenv, 

Re: [R] Warning messages while parallel computing

2021-03-04 Thread Henrik Bengtsson
Test with:

clusterCall(cl, function() { suppressWarnings(source("xx.R")) })

If the warnings disappear, then the warnings are produced on the
workers from source():ing the file.

/Henrik

On Thu, Mar 4, 2021 at 10:20 AM Bill Dunlap  wrote:
>
> The warnings come from the garbage collector, which may be called from
> almost anywhere.  It is possible that the file that is sourced causes
> the problem, but if you don't call parallel::stopCluster before
> removing the cluster object you will get those warnings.
>
> > cl <- parallel::makeCluster(3, type="PSOCK")
> > invisible(gc())
> > rm(cl)
> > invisible(gc())
> Warning messages:
> 1: In .Internal(gc(verbose, reset, full)) :
>   closing unused connection 6 (<-Bill-T490:11216)
> 2: In .Internal(gc(verbose, reset, full)) :
>   closing unused connection 5 (<-Bill-T490:11216)
> 3: In .Internal(gc(verbose, reset, full)) :
>   closing unused connection 4 (<-Bill-T490:11216)
> >
> >
> > cl <- parallel::makeCluster(3, type="PSOCK")
> > invisible(gc())
> > parallel::stopCluster(cl)
> > invisible(gc())
> > rm(cl)
> > invisible(gc())
> >
>
> The fact that he got 8 warnings when using a cluster of size 8 makes
> me suspect that omitting stopCluster is the problem.
>
> -Bill
>
> On Thu, Mar 4, 2021 at 10:12 AM Henrik Bengtsson
>  wrote:
> >
> > I don't think 'parallel' is to blame in this case. Those warnings:
> >
> > Warning in for (i in seq_len(Ne + echo)) { :
> >   closing unused connection 19
> >
> > come from base::source()
> > [https://github.com/wch/r-source/blob/9caddc1eaad1f480283f1e98af34a328699d1869/src/library/base/R/source.R#L166-L244].
> >
> > Unless there's a bug in source() that leaves connections open, which
> > is unlikely, I think there's something in the 'xx.R' script that opens
> > a connection but doesn't close it.  Possibly multiple times.  A good
> > check is to see if the same warnings are produced when calling
> > source("xx.R") sequentially in a for() loop or an lapply() call.
> >
> > Hope this helps,
> >
> > Henrik
> >
> > On Thu, Mar 4, 2021 at 9:58 AM Bill Dunlap  wrote:
> > >
> > > To avoid the warnings from gc(), call parallel::stopCluster(cl) before
> > > removing or overwriting cl.
> > >
> > > -Bill
> > >
> > > On Thu, Mar 4, 2021 at 1:52 AM Shah Alam  wrote:
> > > >
> > > > Hello everyone,
> > > >
> > > > I am using the "parallel" R package for parallel computation.
> > > >
> > > > Code:
> > > >
> > > > >   # set number of cores
> > > > >   cl <- makeCluster(8, type = "PSOCK")  # Mac/Linux need to set as "FORK"
> > > > >
> > > > >   # pass functions and objects to the cluster environment and set seed
> > > > >   # all the items exported need to stay in the global environment!!
> > > > >   clusterCall(cl, function() { source("xx.R") })
> > > > >   clusterExport(cl, list("a", "b", "c", "d", "5"))
> > > > >   clusterSetRNGStream(cl, 1)
> > > >
> > > > While parallel processing, I receive the following warning signs.  Do I
> > > > need to ignore these signs or do they potentially slow the whole 
> > > > process?
> > > >
> > > >  *  Warning signs:*
> > > > Warning in for (i in seq_len(Ne + echo)) { :
> > > >   closing unused connection 19
> > > > Warning in for (i in seq_len(Ne + echo)) { :
> > > >   closing unused connection 18
> > > > Warning in for (i in seq_len(Ne + echo)) { :
> > > >   closing unused connection 17
> > > > Warning in for (i in seq_len(Ne + echo)) { :
> > > >   closing unused connection 16
> > > > Warning in for (i in seq_len(Ne + echo)) { :
> > > >   closing unused connection 15
> > > > Warning in for (i in seq_len(Ne + echo)) { :
> > > >   closing unused connection 14
> > > > Warning in for (i in seq_len(Ne + echo)) { :
> > > >   closing unused connection 13
> > > > Warning in for (i in seq_len(Ne + echo)) { :
> > > >   closing unused connection 12
> > > >
> > > > Best regards,
> > > > Shah Alam
> > > >
> > > > [[alternative HTML version deleted]]
> > > >


Re: [R] Warning messages while parallel computing

2021-03-04 Thread Henrik Bengtsson
I don't think 'parallel' is to blame in this case. Those warnings:

Warning in for (i in seq_len(Ne + echo)) { :
  closing unused connection 19

come from base::source()
[https://github.com/wch/r-source/blob/9caddc1eaad1f480283f1e98af34a328699d1869/src/library/base/R/source.R#L166-L244].

Unless there's a bug in source() that leaves connections open, which
is unlikely, I think there's something in the 'xx.R' script that opens
a connection but doesn't close it.  Possibly multiple times.  A good
check is to see if the same warnings are produced when calling
source("xx.R") sequentially in a for() loop or an lapply() call.
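
A minimal sketch of that sequential check, assuming a stand-in script since 'xx.R' itself was not posted.  The temporary script below deliberately opens a connection and never closes it; the number of open connections is compared before and after sourcing it three times.

```r
# Stand-in for 'xx.R' (hypothetical content): opens a file connection
# and never closes it.
script <- tempfile(fileext = ".R")
writeLines(c(
  "con <- file(tempfile(), open = 'w')",
  "writeLines('some output', con)"
), script)

before <- nrow(showConnections())

# Source the script sequentially; keep the environments around so the
# leaked connections stay referenced and are not closed by the garbage
# collector while we count them.
envs <- lapply(1:3, function(i) {
  e <- new.env()
  source(script, local = e)
  e
})

leaked <- nrow(showConnections()) - before

# Clean up the connections the script should have closed itself
for (e in envs) close(e$con)
```

If 'leaked' is greater than zero for the real script, the fix is to close the connection inside the script, e.g. with close(con).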

Hope this helps,

Henrik

On Thu, Mar 4, 2021 at 9:58 AM Bill Dunlap  wrote:
>
> To avoid the warnings from gc(), call parallel::stopCluster(cl) before
> removing or overwriting cl.
>
> -Bill
>
> On Thu, Mar 4, 2021 at 1:52 AM Shah Alam  wrote:
> >
> > Hello everyone,
> >
> > I am using the "parallel" R package for parallel computation.
> >
> > Code:
> >
> > >   # set number of cores
> > >   cl <- makeCluster(8, type = "PSOCK")  # Mac/Linux need to set as "FORK"
> > >
> > >   # pass functions and objects to the cluster environment and set seed
> > >   # all the items exported need to stay in the global environment!!
> > >   clusterCall(cl, function() { source("xx.R") })
> > >   clusterExport(cl, list("a", "b", "c", "d", "5"))
> > >   clusterSetRNGStream(cl, 1)
> >
> > While parallel processing, I receive the following warning signs.  Do I
> > need to ignore these signs or do they potentially slow the whole process?
> >
> >  *  Warning signs:*
> > Warning in for (i in seq_len(Ne + echo)) { :
> >   closing unused connection 19
> > Warning in for (i in seq_len(Ne + echo)) { :
> >   closing unused connection 18
> > Warning in for (i in seq_len(Ne + echo)) { :
> >   closing unused connection 17
> > Warning in for (i in seq_len(Ne + echo)) { :
> >   closing unused connection 16
> > Warning in for (i in seq_len(Ne + echo)) { :
> >   closing unused connection 15
> > Warning in for (i in seq_len(Ne + echo)) { :
> >   closing unused connection 14
> > Warning in for (i in seq_len(Ne + echo)) { :
> >   closing unused connection 13
> > Warning in for (i in seq_len(Ne + echo)) { :
> >   closing unused connection 12
> >
> > Best regards,
> > Shah Alam
> >
> > [[alternative HTML version deleted]]
> >


Re: [R] parallel: socket connection behind a NAT router

2021-01-18 Thread Henrik Bengtsson
On Mon, Jan 18, 2021 at 9:42 PM Jiefei Wang  wrote:
>
> Thanks for introducing this interesting package to me! it is great to know a 
> new powerful tool, but it seems like this method does not work in my 
> environment. ` parallelly::makeClusterPSOCK` will hang until timeout.
>
> I checked the verbose output and it looks like the parallelly package also 
> depends on `parallel:::.slaveRSOCK` on the remote instance to build the 
> connection. This explains why it failed: the local machine does not have a
> public IP and the remote does not know how to build the connection.

It's correct that the worker does attempt to connect back to the
parent R process that runs on your local machine.  However, it does
*not* do so via your local machine's public IP address; instead, it
connects to a port on its own machine - a port that was set up by
SSH.  More specifically, when parallelly::makeClusterPSOCK() connects
to the remote machine over SSH, it also sets up a so-called reverse SSH
tunnel between a certain port on your local machine and a certain port
on your remote machine.  This is what happens:

> cl <- parallelly::makeClusterPSOCK("machine1.example.org", verbose=TRUE)
[local output] Workers: [n = 1] 'machine1.example.org'
[local output] Base port: 11019
...
[local output] Starting worker #1 on 'machine1.example.org':
'/usr/bin/ssh' -R 11068:localhost:11068 machine1.example.org
"'Rscript' --default-packages=datasets,utils,grDevices,graphics,stats,methods
-e 'workRSOCK <- tryCatch(parallel:::.slaveRSOCK, error=function(e)
parallel:::.workRSOCK); workRSOCK()' MASTER=localhost PORT=11068
OUT=/dev/null TIMEOUT=2592000 XDR=FALSE"
[local output] - Exit code of system() call: 0
[local output] Waiting for worker #1 on 'machine1.example.org' to
connect back  '/usr/bin/ssh' -R 11019:localhost:11019
machine1.example.org "'Rscript'
--default-packages=datasets,utils,grDevices,graphics,stats,methods -e
'workRSOCK <- tryCatch(parallel:::.slaveRSOCK, error=function(e)
parallel:::.workRSOCK); workRSOCK()' MASTER=localhost PORT=11019
OUT=/dev/null TIMEOUT=2592000 XDR=FALSE"

All the magic is in the SSH option '-R 11068:localhost:11068',
which allows the parent R process on your local machine to
communicate with the remote worker R process on its own port 11068,
and vice versa, the worker R process will communicate with the parent
R process as if it was running on MASTER=localhost PORT=11068.
Basically, for all that the worker R process knows, the parent R
process runs on the same machine as itself.
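
To make the port bookkeeping concrete, here is a small sketch of how such a reverse-tunnel specification can be put together.  The helpers tunnel_spec(), ssh_args() and worker_env() are hypothetical and are not parallelly's internals; they only mirror the pieces visible in the verbose output above.

```r
# Hypothetical helpers, for illustration only.

# 'port:localhost:port' as passed to 'ssh -R': the remote port is
# forwarded back to the same port on the local machine.
tunnel_spec <- function(port) {
  port <- as.integer(port)
  sprintf("%d:localhost:%d", port, port)
}

# Arguments for the ssh call that logs in to 'host' with the tunnel set up
ssh_args <- function(host, port) {
  c("-R", tunnel_spec(port), host)
}

# What the worker is told: the parent appears to live on its own localhost
worker_env <- function(port) {
  c(MASTER = "localhost", PORT = as.character(as.integer(port)))
}

args <- ssh_args("machine1.example.org", 11068)
env  <- worker_env(11068)
```

The point of the sketch is that the worker only ever sees 'localhost' and a port number, so it never needs to know the public address of the machine running the parent R process.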

You haven't said what operating system you're running on your local
machine, but if it's MS Windows, know that the 'ssh' client that comes
with Windows 10 has some bugs in its reverse tunneling.  See
?parallelly::makeClusterPSOCK for lots of details.  You also haven't
said what OS the cloud workers run, but I assume it's Linux.

So, based on my guesses about your setup, the above "should work" for you.
For troubleshooting, you can also set argument outfile=NULL.  Then
you'll also see output from the worker R process.  There are
additional troubleshooting suggestions in Section 'Failing to set up
remote workers' of ?parallelly::makeClusterPSOCK that will help you
figure out what the problem is.

>
> I see in README the package states it works with "remote clusters without 
> knowing public IP". I think this might be where the confusion is, it may mean 
> the remote machine does not have a public IP, but the server machine does. 
> I'm in the opposite situation, the server does not have a public IP, but the 
> remote does. I'm not sure if this package can handle my case, but it looks 
> very powerful and I appreciate your help!

Thanks. I've updated the text to "remote clusters without knowing
[local] public IP".

/Henrik

>
> Best,
> Jiefei
>
>
>
>
>
> On Tue, Jan 19, 2021 at 1:22 AM Henrik Bengtsson  
> wrote:
>>
>> If you have SSH access to the workers, then
>>
>> workers <- c("machine1.example.org", "machine2.example.org")
>> cl <- parallelly::makeClusterPSOCK(workers)
>>
>> should do it.  It does this without admin rights and port forwarding.
>> See also the README in https://cran.r-project.org/package=parallelly.
>>
>> /Henrik
>>
>> On Mon, Jan 18, 2021 at 6:45 AM Jiefei Wang  wrote:
>> >
>> > Hi all,
>> >
>> > I have a few cloud instances and I want to use them to do parallel
>> > computing. I would like to create a socket cluster on my local machine to
>> > control the remote instances. Here is my network setup:
>> >
>> > local machine -- NAT -- Internet -- cloud instances
>> >
>> > In the parallel package, the server needs to call `makeCluster()` and
>> > listens to the c

Re: [R] parallel: socket connection behind a NAT router

2021-01-18 Thread Henrik Bengtsson
If you have SSH access to the workers, then

workers <- c("machine1.example.org", "machine2.example.org")
cl <- parallelly::makeClusterPSOCK(workers)

should do it.  It does this without admin rights and port forwarding.
See also the README in https://cran.r-project.org/package=parallelly.

/Henrik

On Mon, Jan 18, 2021 at 6:45 AM Jiefei Wang  wrote:
>
> Hi all,
>
> I have a few cloud instances and I want to use them to do parallel
> computing. I would like to create a socket cluster on my local machine to
> control the remote instances. Here is my network setup:
>
> local machine -- NAT -- Internet -- cloud instances
>
> In the parallel package, the server needs to call `makeCluster()` and
> listens to the connection from the workers. In my case, the server is the
> local machine and the workers are the cloud instances. However, since the
> local machine is hidden behind the NAT, it does not have a public address
> and the worker cannot connect to it. Therefore, `makeCluster()` will never
> be able to see the connection from the workers and hang forever.
>
> One solution for letting an external machine access a device inside
> the NAT is to use port forwarding. However, this would not work for my case
> as the NAT is set by the network provider(not my home router) so I do not
> have access to the router. As the cloud instances have public addresses,
> I wonder if there is any way to build the cluster by letting the server
> connect to the cloud? I have checked `?parallel::makeCluster` and
> `?snow::makeSOCKcluster` but I found no result. The only promising solution
> I can see now is to use TCP hole punching, but it is quite complicated and
> may not work for every case. Since building a connection from local to the
> remote is super easy, I would like to know if there exists any simple
> solution. I have searched it on Google for a week but find no answer. I'll
> appreciate it if you can provide me any suggestions!
>
> Best,
> Jiefei
>
> [[alternative HTML version deleted]]
>


Re: [R] vanilla session in R Gui or RStudio

2020-10-22 Thread Henrik Bengtsson
This can happen if you save an object whose environment is part of a package, e.g.

$ R --quiet --vanilla
> fcn <- Matrix::Matrix
> environment(fcn)
<environment: namespace:Matrix>
> quit("yes")  # saves the workspace in an .RData file

# Loading the .RData file at startup triggers 'Matrix' to be loaded
$ R --quiet --no-init-file -e "loadedNamespaces()"
> loadedNamespaces()
 [1] "compiler"  "Matrix"    "graphics"  "utils"     "grDevices" "stats"
 [7] "datasets"  "grid"      "methods"   "base"      "lattice"
>
>

Also, if you have saved S4 objects (e.g. x <- Matrix::Matrix(0, 3,
2)), they will trigger their corresponding packages to be loaded when
"used" (e.g. print():ed) but not before.

Not saying it explains all of OPs packages - just wanted to say the
content of .RData may trigger packages being loaded.
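
Here is a self-contained sketch of the same effect that does not touch your real .RData file.  It assumes the 'splines' package (shipped with R, not loaded by default) as a stand-in for 'Matrix', and uses temporary files; the variable names are made up.

```r
# Save a function whose environment is a package namespace
rdata <- tempfile(fileext = ".RData")
fcn <- splines::bs
save(fcn, file = rdata)

# Script to run in a fresh, vanilla R process: is 'splines' loaded
# before and after load()?
script <- tempfile(fileext = ".R")
writeLines(c(
  'rdata <- commandArgs(trailingOnly = TRUE)[1]',
  'before <- "splines" %in% loadedNamespaces()',
  'load(rdata)',
  'after <- "splines" %in% loadedNamespaces()',
  'writeLines(paste(before, after))'
), script)

rscript <- file.path(R.home("bin"), "Rscript")
out <- system2(rscript, c("--vanilla", shQuote(script), shQuote(rdata)),
               stdout = TRUE)
```

The first value in 'out' tells whether the namespace was loaded in the fresh session before load(), the second whether it was loaded afterwards.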

/Henrik

On Thu, Oct 22, 2020 at 7:54 PM Jeff Newmiller  wrote:
>
> Can you be more specific about what conditions cause R to automatically load 
> a package when a .RData file is loaded? My experience has actually been the 
> opposite.
>
> On October 22, 2020 6:13:11 PM PDT, Henrik Bengtsson 
>  wrote:
> >As Jeff says, it might be that you have a ~/.Rprofile file with
> >instructions to load packages when R starts.  It could also be that
> >you have a .RData file, which is saved if you answer yes to:
> >
> >> Save workspace image? [y/n/c]: y
> >
> >when you quit R.   If this file exists, then R loads it and all the
> >objects you had when you saved it. If there are objects associated
> >with packages, then that will cause those packages to be loaded when R
> >starts.  To avoid this, you need to move or delete the .RData file.
> >
> >You can use:
> >
> >> startup::startup(debug = TRUE, dryrun = TRUE)
> >
> >to get detailed information on what happens when R starts, e.g. if you
> >have a .Rprofile file and an .RData file.  That might help you to
> >track down what's going on.  The 'startup' package is on CRAN.
> >
> >I don't know of an easy way to restart RGui or RStudio Console in
> >vanilla mode, similarly how you can start R at the terminal with 'R
> >--vanilla'.
> >
> >/Henrik
> >
> >On Thu, Oct 22, 2020 at 4:14 PM Jeff Newmiller
> > wrote:
> >>
> >> Have you looked into your .Rprofile file? Loading packages is not
> >something R normally does without your telling it to do so, but many
> >people forget that they have done so.
> >>
> >> On October 22, 2020 3:47:04 PM PDT, Michael L Friendly
> > wrote:
> >> >[env: Windows, R 3.6.6]
> >> >
> >> >When I start R from the R Gui icon or from RStudio, I get a large
> >> >number of packages loaded via a namespace. Not entirely clear where
> >> >these come from.
> >> >
> >> >As a result, I often run into problems updating packages because
> >> >something is already loaded.  How can I start a new GUI session with
> >> >minimal packages loaded?
> >> >
> >> >> sessionInfo()
> >> >R version 3.6.3 (2020-02-29)
> >> >Platform: x86_64-w64-mingw32/x64 (64-bit)
> >> >Running under: Windows 7 x64 (build 7601) Service Pack 1
> >> >
> >> >Matrix products: default
> >> >
> >> >locale:
> >> >[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
> >> >States.1252
> >> >[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> >> >
> >> >[5] LC_TIME=English_United States.1252
> >> >
> >> >attached base packages:
> >> >[1] stats graphics  grDevices utils datasets  methods   base
> >> >
> >> >
> >> >loaded via a namespace (and not attached):
> >> >[1] statmod_1.4.34   xfun_0.18        tidyselect_1.1.0 reshape2_1.4.4   purrr_0.3.4      mitools_2.4
> >> >[7] splines_3.6.3    lattice_0.20-41  coefplot_1.2.6   carData_3.0-4    colorspace_1.4-1 vctrs_0.3.4
> >> >[13] generics_0.0.2   htmltools_0.5.0  yaml_2.2.1       survival_3.2-7   rlang_0.4.7      pillar_1.4.6
> >> >[19] nloptr_1.2.2.2   glue_1.4.2       DBI_1.1.0        lifecycle_0.2.0  plyr_1.8.6       stringr_1.4.0
> >> >[25] effects_4.2-0    munsell_0.5.0    gtable_0.3.0     evaluate_0.14    knitr_1.30       fansi_0.4.1
> >> >[31] Rcpp_1.0.5       scales_1.1.1     useful_1.2.6     fs_1.4.2         lme4_1.1-23

Re: [R] vanilla session in R Gui or RStudio

2020-10-22 Thread Henrik Bengtsson
As Jeff says, it might be that you have a ~/.Rprofile file with
instructions to load packages when R starts.  It could also be that
you have a .RData file, which is saved if you answer yes to:

> Save workspace image? [y/n/c]: y

when you quit R.   If this file exists, then R loads it and all the
objects you had when you saved it. If there are objects associated
with packages, then that will cause those packages to be loaded when R
starts.  To avoid this, you need to move or delete the .RData file.

You can use:

> startup::startup(debug = TRUE, dryrun = TRUE)

to get detailed information on what happens when R starts, e.g. if you
have a .Rprofile file and an .RData file.  That might help you to
track down what's going on.  The 'startup' package is on CRAN.
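
To check by hand for the two usual suspects, something like the
following may help.  It is only a sketch: it looks in the home
directory and the current working directory, which are the common
locations but not necessarily the only ones R consults on your setup,
and the variable names are just for illustration.

```r
# Candidate startup files (assumed locations: home and working directory)
startup_files <- c(
  file.path(path.expand("~"), c(".Rprofile", ".RData")),
  file.path(getwd(), c(".Rprofile", ".RData"))
)
startup_files <- unique(normalizePath(startup_files, mustWork = FALSE))

# TRUE for every file that exists and may therefore affect startup
found <- file.exists(startup_files)
names(found) <- startup_files
print(found)

# Namespaces currently loaded; in a clean session this list is short
print(loadedNamespaces())
```

Any file reported as TRUE is worth opening (.Rprofile) or moving out
of the way (.RData) before restarting R.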

I don't know of an easy way to restart RGui or the RStudio Console in
vanilla mode, similar to how you can start R at the terminal with 'R
--vanilla'.

/Henrik

On Thu, Oct 22, 2020 at 4:14 PM Jeff Newmiller  wrote:
>
> Have you looked into your .Rprofile file? Loading packages is not something R 
> normally does without your telling it to do so, but many people forget that 
> they have done so.
>
> On October 22, 2020 3:47:04 PM PDT, Michael L Friendly  
> wrote:
> >[env: Windows, R 3.6.3]
> >
> >When I start R from the R Gui icon or from RStudio, I get a large
> >number of packages loaded via a namespace. Not entirely clear where
> >these come from.
> >
> >As a result, I often run into problems updating packages because
> >something is already loaded.  How can start a new gui session with
> >minimal packages loaded?
> >
> >> sessionInfo()
> >R version 3.6.3 (2020-02-29)
> >Platform: x86_64-w64-mingw32/x64 (64-bit)
> >Running under: Windows 7 x64 (build 7601) Service Pack 1
> >
> >Matrix products: default
> >
> >locale:
> >[1] LC_COLLATE=English_United States.1252
> >[2] LC_CTYPE=English_United States.1252
> >[3] LC_MONETARY=English_United States.1252
> >[4] LC_NUMERIC=C
> >[5] LC_TIME=English_United States.1252
> >
> >attached base packages:
> >[1] stats graphics  grDevices utils datasets  methods   base
> >
> >
> >loaded via a namespace (and not attached):
> >[1] statmod_1.4.34 xfun_0.18 tidyselect_1.1.0 reshape2_1.4.4 purrr_0.3.4 mitools_2.4
> >[7] splines_3.6.3 lattice_0.20-41 coefplot_1.2.6 carData_3.0-4 colorspace_1.4-1 vctrs_0.3.4
> >[13] generics_0.0.2 htmltools_0.5.0 yaml_2.2.1 survival_3.2-7 rlang_0.4.7 pillar_1.4.6
> >[19] nloptr_1.2.2.2 glue_1.4.2 DBI_1.1.0 lifecycle_0.2.0 plyr_1.8.6 stringr_1.4.0
> >[25] effects_4.2-0 munsell_0.5.0 gtable_0.3.0 evaluate_0.14 knitr_1.30 fansi_0.4.1
> >[31] Rcpp_1.0.5 scales_1.1.1 useful_1.2.6 fs_1.4.2 lme4_1.1-23 packrat_0.5.0
> >[37] ggplot2_3.3.2 digest_0.6.25 stringi_1.4.6 insight_0.9.6 dplyr_1.0.2 survey_4.0
> >[43] grid_3.6.3 cli_2.1.0 tools_3.6.3 magrittr_1.5 tibble_3.0.4 crayon_1.3.4
> >[49] pkgconfig_2.0.3 ellipsis_0.3.1 MASS_7.3-53 Matrix_1.2-18 reprex_0.3.0 assertthat_0.2.1
> >[55] minqa_1.2.4 rmarkdown_2.4 rstudioapi_0.11 R6_2.4.1 boot_1.3-25 nnet_7.3-14
> >[61] nlme_3.1-149 compiler_3.6.3
> >>
> >
> >Michael Friendly Email: friendly AT yorku DOT ca
> >Professor, Psychology Dept. & Former Chair, ASA Statistical Graphics
> >Section
> >York University  Voice: 416 736-2100 x66249
> >4700 Keele StreetWeb: http://www.datavis.ca | @datavisFriendly
> >Toronto, ONT  M3J 1P3 CANADA
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sent from my phone. Please excuse my brevity.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] tempdir() does not respect TMPDIR

2020-08-30 Thread Henrik Bengtsson
Sorry, I should retract my claim that it's too late to set TMPDIR in
.Renviron.  It does indeed work on Linux and R 4.0.2, e.g.

$ cd
$ mkdir test
$ cd test
$ echo "TMPDIR=$PWD" > ./.Renviron
$ cat ./.Renviron
TMPDIR=/home/hb/test
$ Rscript --no-init-file -e "tempdir()"
[1] "/home/hb/test/RtmpyH47tc"

Hmm... either this has changed "recently" or I've had it wrong all
along.  Either way, I need to revise the vignette in my 'startup'
package.

Sorry for the misleading comment.

So, back to your comment that it does *not* work, that is,
~/.Renviron is not read, when you double-click on an .RData file.  I
just tried with R 4.0.2 in a Windows 10 VM and I think I can reproduce
what you're describing.

The problem seems to be that when one launches Rgui by
double-clicking an .RData file, Rgui will only read ./.Renviron, that
is, the .Renviron file that is located in the same folder as the
.RData file.  It will never load ~/.Renviron (e.g.
C:/Users/alice/Documents/.Renviron) unless the .RData file is in that
folder too.
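
A quick way to see which of the two files exists, and what the session
ended up with, is sketched below.  Running it once after starting Rgui
from its icon and once after double-clicking an .RData file should
show whether TMPDIR was picked up.  The variable names are just for
illustration.

```r
# The two .Renviron candidates discussed above
renviron_files <- c(
  working_dir = file.path(getwd(), ".Renviron"),
  home_dir    = file.path(path.expand("~"), ".Renviron")
)
present <- file.exists(renviron_files)
names(present) <- names(renviron_files)
print(present)

# What the session actually ended up with
print(Sys.getenv("TMPDIR", unset = NA_character_))
print(tempdir())
```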

This looks odd to me but it could be that I made another mistake in my
conclusions above.  I'll let someone else with a less mushy brain take
over from here.

/Henrik

On Sat, Aug 29, 2020 at 4:31 PM Jinsong Zhao  wrote:
>
> I read the help page, I don't understand it very well, since I set the
> environmental variable TMPDIR in .Renviron. What confused me is when
> double clicking the *.RData to launch R, the tempdir() does not respect
> the environmental variable TMPDIR, but launch R by double clicking Rgui
> icon does.
>
> Best,
> Jinsong
>
> On 2020/8/30 0:36, Henrik Bengtsson wrote:
> > It is too late to set TMPDIR in .Renviron.  It is one of the
> > environment variables that has to be set prior to launching R.  From
> > help("tempfile", package = "base"):
> >
> > The environment variables TMPDIR, TMP and TEMP are checked in turn and
> > the first found which points to a writable directory is used: if none
> > succeeds ‘/tmp’ is used. The path should not contain spaces. **Note
> > that setting any of these environment variables in the R session has
> > no effect on tempdir(): the per-session temporary directory is created
> > before the interpreter is started.**
> >
> > /Henrik
> >
> > On Sat, Aug 29, 2020 at 6:40 AM Jinsong Zhao  wrote:
> >>
> >> Hi there,
> >>
> >> When I started R by double clicking on Rgui icon (I am on Windows), the
> >> tempdir() returned the tmpdir in the directory I set in .Renviron. If I
> >> started R by double clicking on a *.RData file, the tempdir() return the
> >> tmpdir in the directory setting by Windows system. I don't know whether
> >> it's designed.
> >>
> >>   > sessionInfo()
> >> R version 4.0.2 (2020-06-22)
> >> Platform: x86_64-w64-mingw32/x64 (64-bit)
> >> Running under: Windows 10 x64 (build 18363)
> >> ...
> >>
> >> Best,
> >> Jinsong
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide 
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] serialize does not work as expected

2020-08-29 Thread Henrik Bengtsson
Does serialize(..., version = 2L) do what you want?
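
For context, a small sketch of what that looks like in practice.  My
understanding is that serialization format version 2 predates the
compact sequence representation and therefore writes such vectors out
as ordinary ones; treat that as an assumption and compare the two
outputs on your own R version.

```r
x <- 1:3            # may be stored internally as a compact integer sequence
y <- c(1L, 2L, 3L)  # ordinary integer vector with the same values

# ASCII serialization makes the difference between the formats visible
s3 <- serialize(x, connection = NULL, ascii = TRUE, version = 3L)
s2 <- serialize(x, connection = NULL, ascii = TRUE, version = 2L)

cat(rawToChar(s3))
cat(rawToChar(s2))

# Whatever the on-disk form, the values round-trip unchanged
stopifnot(identical(unserialize(s2), y))
stopifnot(identical(unserialize(s3), y))
```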

/Henrik

On Sat, Aug 29, 2020 at 10:10 AM Sigbert Klinke
 wrote:
>
> Hi,
>
> is there in R a way to "normalize" a vector from
> compact_intseq/compact_realseq to a "normal" vector?
>
> Sigbert
>
> Am 29.08.20 um 18:13 schrieb Duncan Murdoch:
> > Element 1
> > A
> > 3
> > 262146
> > 197888
> > 5
> > UTF-8
> > 238
> > 2
> > 1
> > 262153
> > 14
> > compact_intseq
> > 2
> > 1
> > 262153
> > 4
> > base
> > 2
> > 13
> > 1
> > 13
> > 254
> > 14
> > 3
> > 3
> > 1
> > 1
> > 254
> >
> > Element 2
> > A
> > 3
> > 262146
> > 197888
> > 5
> > UTF-8
> > 238
> > 2
> > 1
> > 262153
> > 15
> > compact_realseq
> > 2
> > 1
> > 262153
> > 4
> > base
> > 2
> > 13
> > 1
> > 14
> > 254
> > 14
> > 3
> > 3
> > 1
> > 1
> > 254
> >
> > Element 3
> > A
> > 3
> > 262146
> > 197888
> > 5
> > UTF-8
> > 14
> > 3
> > 1
> > 2
> > 3
>
>
> --
> https://hu.berlin/sk
> https://hu.berlin/mmstat3
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] tempdir() does not respect TMPDIR

2020-08-29 Thread Henrik Bengtsson
It is too late to set TMPDIR in .Renviron.  It is one of the
environment variables that has to be set prior to launching R.  From
help("tempfile", package = "base"):

The environment variables TMPDIR, TMP and TEMP are checked in turn and
the first found which points to a writable directory is used: if none
succeeds ‘/tmp’ is used. The path should not contain spaces. **Note
that setting any of these environment variables in the R session has
no effect on tempdir(): the per-session temporary directory is created
before the interpreter is started.**
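
The emphasized part of that help text is easy to confirm from within a
running session.  A minimal sketch (the "elsewhere" folder name is
made up and never created):

```r
# Remember the current setting so it can be restored afterwards
old <- Sys.getenv("TMPDIR", unset = NA_character_)
before <- tempdir()

# Point TMPDIR somewhere else from within the running session
Sys.setenv(TMPDIR = file.path(before, "elsewhere"))
after <- tempdir()

# The per-session temporary directory has not moved
print(identical(before, after))

# Restore the environment variable
if (is.na(old)) Sys.unsetenv("TMPDIR") else Sys.setenv(TMPDIR = old)
```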

/Henrik

On Sat, Aug 29, 2020 at 6:40 AM Jinsong Zhao  wrote:
>
> Hi there,
>
> When I started R by double clicking on Rgui icon (I am on Windows), the
> tempdir() returned the tmpdir in the directory I set in .Renviron. If I
> started R by double clicking on a *.RData file, the tempdir() return the
> tmpdir in the directory setting by Windows system. I don't know whether
> it's designed.
>
>  > sessionInfo()
> R version 4.0.2 (2020-06-22)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 18363)
> ...
>
> Best,
> Jinsong
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] & and |

2020-08-19 Thread Henrik Bengtsson
A version of Eric's answer is to use grepl(), which returns a logical vector:

mydata[grepl("ConfoMap", mydata) & grepl("GuineaPigs", mydata)]

with the OR analogue:

mydata[grepl("ConfoMap", mydata) | grepl("GuineaPigs", mydata)]
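
Spelled out with the vector from the original question, as a sketch
(the intermediate variable names are mine, and fixed = TRUE is added
only because the patterns are plain strings):

```r
mydata <- c(
  "SSFA-ConfoMap_GuineaPigs_NMPfilled.csv",
  "SSFA-ConfoMap_Lithics_NMPfilled.csv",
  "SSFA-ConfoMap_Sheeps_NMPfilled.csv",
  "SSFA-Toothfrax_GuineaPigs.xlsx",
  "SSFA-Toothfrax_Lithics.xlsx",
  "SSFA-Toothfrax_Sheeps.xlsx"
)

# One logical vector per pattern, same length as 'mydata'
has_confomap   <- grepl("ConfoMap", mydata, fixed = TRUE)
has_guineapigs <- grepl("GuineaPigs", mydata, fixed = TRUE)

# Element-wise AND / OR on the logical vectors, then subset
both   <- mydata[has_confomap & has_guineapigs]
either <- mydata[has_confomap | has_guineapigs]

print(both)
print(either)
```

Here 'both' holds the single ConfoMap/GuineaPigs file and 'either'
holds the three ConfoMap files plus the Toothfrax GuineaPigs file.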

/Henrik

On Wed, Aug 19, 2020 at 8:24 AM Ivan Calandra  wrote:
>
> Thank you Eric, I didn't think about intersect().
>
> Now I'm trying to do that in tidyverse with pipes, and I think that's
> too much for me for now!
>
> Ivan
>
> --
> Dr. Ivan Calandra
> TraCEr, laboratory for Traceology and Controlled Experiments
> MONREPOS Archaeological Research Centre and
> Museum for Human Behavioural Evolution
> Schloss Monrepos
> 56567 Neuwied, Germany
> +49 (0) 2631 9772-243
> https://www.researchgate.net/profile/Ivan_Calandra
>
> On 19/08/2020 17:17, Eric Berger wrote:
> > mydata[ intersect( grep("ConfoMap", mydata), grep("GuineaPigs",
> > mydata)  ) ]
> >
> >
> >
> > On Wed, Aug 19, 2020 at 6:13 PM Bert Gunter wrote:
> >
> > "&" is not a regex metacharacter.
> > See ?regexp
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming
> > along and
> > sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Wed, Aug 19, 2020 at 7:53 AM Ivan Calandra wrote:
> >
> > > Dear useRs,
> > >
> > > I feel really stupid, but I cannot understand why "&" doesn't
> > work as I
> > > expect, while "|" does.
> > >
> > > I have the following vector:
> > > mydata <- c("SSFA-ConfoMap_GuineaPigs_NMPfilled.csv",
> > > "SSFA-ConfoMap_Lithics_NMPfilled.csv",
> > > "SSFA-ConfoMap_Sheeps_NMPfilled.csv",
> > "SSFA-Toothfrax_GuineaPigs.xlsx",
> > > "SSFA-Toothfrax_Lithics.xlsx", "SSFA-Toothfrax_Sheeps.xlsx")
> > > and I want to find the values that include both "ConfoMap" and
> > > "GuineaPigs".
> > >
> > > If I do:
> > > grep("ConfoMap&GuineaPigs", mydata, value=TRUE)
> > > it returns an empty vector, character(0).
> > >
> > > But if I do:
> > > grep("ConfoMap|GuineaPigs", mydata, value=TRUE)
> > > it returns all the elements that include either "ConfoMap" or
> > > "GuineaPigs", as I would expect.
> > >
> > > So what is wrong with my "&" construct? How can I return the
> > elements
> > > that include both parts?
> > >
> > > Thank you for your help!
> > > Ivan
> > >
> > > --
> > > Dr. Ivan Calandra
> > > TraCEr, laboratory for Traceology and Controlled Experiments
> > > MONREPOS Archaeological Research Centre and
> > > Museum for Human Behavioural Evolution
> > > Schloss Monrepos
> > > 56567 Neuwied, Germany
> > > +49 (0) 2631 9772-243
> > > https://www.researchgate.net/profile/Ivan_Calandra
> > >
> > > __
> > > R-help@r-project.org  mailing list
> > -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org  mailing list --
> > To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] [External] Re: Playing a music file in R

2020-07-23 Thread Henrik Bengtsson
FWIW, see also the 'midi' R package
(https://github.com/moodymudskipper/midi).  It's not on CRAN.  /Henrik
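
On the question quoted below of where the player lives on a given
machine: one option is to let R search the PATH instead of hard-coding
a location.  This is only a sketch; the list of candidate player names
is my assumption, and tuneR is used only if it is installed.

```r
# Command-line players to look for (assumed names; adjust to your system)
candidates <- c("play", "mpv", "mplayer", "afplay")

# Sys.which() returns "" for programs that are not on the PATH
paths <- Sys.which(candidates)
found <- paths[nzchar(paths)]
print(found)

# Register the first player found with tuneR, if both are available
if (length(found) > 0L && requireNamespace("tuneR", quietly = TRUE)) {
  tuneR::setWavPlayer(unname(found[1]))
}
```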

On Thu, Jul 23, 2020 at 12:19 PM Rasmus Liland  wrote:
>
> On 2020-07-23 14:19 -0400, Richard M. Heiberger wrote:
> > On Thu, Jul 23, 2020 at 7:21 AM bretschr  wrote:
> > >
> > > Dear Vahid,
> > >
> > >
> > > Re:
> > >
> > > > I have a question regarding the
> > > > following code:
> > > > setWavPlayer("/Library/Frameworks/R.framework/Versions/3.6/Resources/bin/play")
> > > ># path depends on your R version etc.
> > > >
> > > > How can I find the corresponding
> > > > path on my laptop?
> > >
> > >
> > > This line ...
> > >
> > > setWavPlayer("/Library/Frameworks/R.framework/Versions/3.6/Resources/bin/play")
> > >
> > > ... is the path on my MacBook Air
> > > running Mac OS Mojave.  The
> > > structure of R on similar computers
> > > and OS versions will be the same,
> > > but I don't know if you use Windows
> > > or Linux.  Maybe an R-user working
> > > with Windows or Linux can help.
> >
> > The R command
> > system.file()
> >
> > tells you where the currently running
> > version of R is located on your
> > machine and your operating system.
> >
> > On the Macintosh it shows
> > > system.file()
> > [1] "/Library/Frameworks/R.framework/Resources/library/base"
> >
> > You don't need to worry about the R
> > version number, as R knows where it is
> > and gives you the location of the
> > currently running version.  You can
> > now open a system directory file
> > (Finder on Mac, WindowsExplorer on
> > Windows, etc) and navigate up and down
> > to what you are looking for.
> >
> > Or, within R, you can navigate with ..
> > and more directory statements, thus
> > the example from the original email
> >
> > setWavPlayer("/Library/Frameworks/R.framework/Versions/3.6/Resources/bin/play")
> >   # path depends on your R version etc.
> >
> > would be written
> > tmp <- system.file("../../bin/play")
> > and the result is the correct location
> > (if it is there).
> > You will get an empty string if it is
> > not there, but this will get you
> > started on where to look.  Then you
> > write
> >
> > setWavPlayer(tmp)   # path depends on your R version etc.
>
>
> Hello, I found you can use mpv (and
> mplayer) as well, instead of sox[1].
> There are many ways to install it[1]
> including various ways on mac.
>
> > system("whereis mpv")
> mpv: /usr/bin/mpv /etc/mpv /usr/include/mpv /usr/share/mpv 
> /usr/share/man/man1/mpv.1.gz
> > tuneR::setWavPlayer("/usr/bin/mpv")
> > w <- tuneR::readWave(filename="rf_07.wav", from=0, to=1e6)
> > w
>
> Wave Object
> Number of Samples:  101
> Duration (seconds): 22.68
> Samplingrate (Hertz):   44100
> Channels (Mono/Stereo): Stereo
> PCM (integer format):   TRUE
> Bit (8/16/24/32/64):16
>
> > tuneR::play(w)
>  (+) Audio --aid=1 (pcm_s16le 2ch 44100Hz)
> AO: [pulse] 44100Hz stereo 2ch s16
> A: 00:00:22 / 00:00:22 (98%)
>
>
> Exiting... (End of file)
> >
> > tuneR::setWavPlayer("/usr/bin/play")
> > tuneR::play(w)
>
> /tmp/RtmpadTOu6/tuneRtemp.wav:
>
>  File Size: 4.00M Bit Rate: 1.41M
>   Encoding: Signed PCM
>   Channels: 2 @ 16-bit
> Samplerate: 44100Hz
> Replaygain: off
>   Duration: 00:00:22.68
>
> In:100%  00:00:22.68 [00:00:00.00] Out:1.00M [ -|- ] Hd:4.2 
> Clip:0
> Done.
> >
>
>
> Best,
> Rasmus
>
> [1] http://sox.sourceforge.net/
> [1] https://mpv.io/installation/
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] doParallel cores HPC

2020-06-26 Thread Henrik Bengtsson
> On the other hand, [...], you can spawn 20 workers on each of the 10 hosts:
>
> makePSOCKcluster(names = rep(c('Host01', ..., 'Host10'), each = 20))

Unfortunately, this will most likely not work because it will require
200 open connections - one for each worker - but R limits you to 125
(see ?base::connections):

> A maximum of 128 connections can be allocated (not necessarily open) at any 
> one time. Three of these are pre-allocated (see stdout). The OS will impose 
> limits on the numbers of connections of various types, but these are usually 
> larger than 125.

Depending on the system, you might be able to increase this by
rebuilding R from source after editing a hard-coded constant.  I've
verified that this worked on a local Ubuntu 16.04 system.  See
https://github.com/HenrikBengtsson/Wishlist-for-R/issues/28 for
details about this problem.
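
To make the arithmetic concrete, here is a sketch that sizes the
worker list before calling makePSOCKcluster().  The host names are
placeholders and the 128 - 3 figure is taken from the help page quoted
above; other open connections in your session reduce it further.

```r
hosts <- sprintf("Host%02d", 1:10)   # placeholder host names
wanted_per_host <- 20L

# Connections available to workers according to ?base::connections
max_usable <- 128L - 3L

n_wanted <- length(hosts) * wanted_per_host
print(n_wanted > max_usable)

# Largest equal number of workers per host that stays within the limit
per_host <- min(wanted_per_host, max_usable %/% length(hosts))
workers <- rep(hosts, each = per_host)
print(length(workers))

# cl <- parallel::makePSOCKcluster(names = workers)  # needs real hosts
```

With 10 hosts this gives 12 workers per host, 120 in total, rather
than the 200 asked for.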

/Henrik

On Fri, Jun 26, 2020 at 12:11 AM Ivan Krylov  wrote:
>
> On Thu, 25 Jun 2020 00:29:42 +
> "Silva, Eder David Borges da"  wrote:
>
> > I have the HPC, with 10 nodes, and each node with 20 cores in UNIX OS.
>
> > cl <- makePSOCKcluster(names=c('Host01', ... , 'Host10'))
>
> > This code is the best way for use all machine power?
>
> The code as written will create one worker _process_ on each of the
> hosts. What happens next depends on the code to be running and the way
> R is installed.
>
> The code may or may not be written to take advantage of multi-core CPUs
> (e.g. using OpenMP). In particular, if R is linked with a
> multi-threaded BLAS (such as OpenBLAS or MKL) and uses matrix algebra
> during the computation, it may spawn multiple _threads_ to utilise the
> CPU better. Whether it succeeds depends on multiple factors, including
> the size of the task. On occasion I noticed OpenBLAS threads spending
> most of their time in sched_yield() system call, making the kernel do a
> lot of unnecessary work, and set the environment variable
> OPENBLAS_NUM_THREADS=1 to use only one thread instead.
>
> On the other hand, if the computation is purely single-threaded (or you
> disabled the multi-threaded behaviour of OpenMP or BLAS for some
> reason), you can spawn 20 workers on each of the 10 hosts:
>
> makePSOCKcluster(names = rep(c('Host01', ..., 'Host10'), each = 20))
>
> You can also try to combine the two approaches by limiting the number
> of working threads to a sensible value which results in the threads
> spending most of the time computing things (instead of waiting for more
> work busy-looping on sched_yield()), then spawning as many processes as
> required to utilise all of the cores.
>
> --
> Best regards,
> Ivan
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] Rtools required

2020-04-29 Thread Henrik Bengtsson
Be careful not to overwrite an existing ~/.Renviron file; it's safer to
use something like:

cat('PATH="${RTOOLS40_HOME}\\usr\\bin;${PATH}"\n', file="~/.Renviron",
append=TRUE)
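
The effect of append=TRUE can be tried safely on a scratch file first.
In this sketch the temporary file and the EXISTING_VAR line are made
up for the demonstration; only the PATH line comes from the Rtools
instructions quoted below.

```r
# Stand-in for ~/.Renviron with some pre-existing content
renviron <- tempfile(fileext = ".Renviron")
writeLines("EXISTING_VAR=1", renviron)

# Append the Rtools PATH line instead of overwriting the file
cat('PATH="${RTOOLS40_HOME}\\usr\\bin;${PATH}"\n',
    file = renviron, append = TRUE)

# Both the old and the new line are present
lines <- readLines(renviron)
print(lines)
```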

/Henrik




On Wed, Apr 29, 2020, 15:33 Fox, John  wrote:

> Dear Steven,
>
> It's possible that Windows will hide .Renviron, but it's generally a good
> idea, in my opinion, in Folder Options > View to click "Show hidden files"
> and uncheck "hide extensions". Then .Renviron should show up (once you've
> created it).
>
> Best,
>  John
>
> > -Original Message-
> > From: Bert Gunter 
> > Sent: Wednesday, April 29, 2020 5:50 PM
> > To: Steven 
> > Cc: Fox, John ; R-help Mailing List  > project.org>
> > Subject: Re: [R] Rtools required
> >
> > Type
> > ?.Renviron
> > ?R.home
> > ?"environment variables"
> >
> > at the R prompt to get what I think should be the info you need (or at
> > least useful info).
> >
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> and
> > sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> > On Wed, Apr 29, 2020 at 2:37 PM Steven  wrote:
> > >
> > > Thanks John. Where is file .Renviron located? It must be a hidden file.
> > > I cannot find it.
> > >
> > > On 2020/4/28 下午 08:29, Fox, John wrote:
> > > > Dear Steven,
> > > >
> > > > Did you follow the instruction on the Rtools webpage to add
> > > >
> > > >   PATH="${RTOOLS40_HOME}\usr\bin;${PATH}"
> > > >
> > > > to your .Renviron file?
> > > >
> > > > I hope this helps,
> > > >   John
> > > >
> > > >-
> > > >John Fox, Professor Emeritus
> > > >McMaster University
> > > >Hamilton, Ontario, Canada
> > > > >Web: http://socserv.mcmaster.ca/jfox
> > > >
> > > >> On Apr 28, 2020, at 4:38 AM, Steven  wrote:
> > > >>
> > > >> Dear All
> > > >>
> > > > >> I updated to R-4.0.0 and also installed the latest Rtools 4.0 (to
> > > > >> what is now the new default folder c:\rtools40). While compiling a
> > > > >> package (binary) I received the following warning message saying
> > > > >> Rtools is required. Any clues? Thanks.
> > > >>
> > > >> Steven Yen
> > > >>
> > > >> WARNING: Rtools is required to build R packages but is not
> > > >> currently installed. Please download and install the appropriate
> > > >> version of Rtools before proceeding:
> > > >> https://cran.rstudio.com/bin/windows/Rtools/
> > > >>
> > > >>
> > > >>  [[alternative HTML version deleted]]
> > > >>
> > > >> __
> > > >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > > >> PLEASE do read the posting guide
> > > >> http://www.R-project.org/posting-guide.html
> > > >> and provide commented, minimal, self-contained, reproducible code.
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Please correct my iterative merge sort code. The lack of recursion in the code is the main condition.

2019-12-16 Thread Henrik Bengtsson
Folks on this list, this is my personal opinion, but please refrain from
answering this person's requests. It's pretty clear by now that they are
misusing your good intentions of trying to help people interested in R to
get their homework-like "questions" answered. There are no indications that
this person has even attempted to solve the "problems" themselves. This
smells of bad intent to me.

/Henrik

On Mon, Dec 16, 2019, 07:40 Александр Дубровский 
wrote:

> mrg <- function(A,B){
>   R <- c()
>   while(length(A)>0 && length(B)>0){
>     if(A[1]<B[1]){
>       R <- c(R,A[1])
>       A <- A[-1]
>     } else{
>       R <- c(R,B[1])
>       B <- B[-1]
>     }
>   }
>   return(c(R,A,B))
> }
> msort <- function(A){
>   if(length(A)<2){
> return(A)
>   } else{
> R <- c()
> W <- c()
> x <- 8
>   for(i in 1:length(A)){
> if(i%%2==0){
>   R <- c(R,mrg(A[(i-1)],A[i]))
> }
>   }
>  }
>   if((length(A)%%2)==1){
> R <- c(R,A[length(A)])
>   }
>   for(i in 1:length(R)){
> if(i%%4==0){
>   j <- i
>   W <- c(W,mrg(R[(j-3):(j-2)],R[(j-1):j]))
> }
>   }
>   if((length(R)%%4)==3){
> W <- c(W,mrg(R[(j+1):(j+2)],R[(j+3)]))
>   }
>   if((length(R)%%4)<3 && (length(R)%%4)!=0){
> W <- c(W,R[(j+1):length(R)])
>   }
>   R <- W
>   W <- c()
>   while(x for(i in 1:length(R)){
>   if(i%%x==0){
> j <- i
> W <- c(W,mrg(R[(j-(x-1)):(j-(x%/%2))],R[(j-(x%/%2)+1):j]))
>   }
> }
> if((length(R)%%x)>(x%/%2)){
>   W <- c(W,mrg(R[(j+1):(j+(x%/%2))],R[(j+(x%/%2)+1):length(R)]))
> }
> if((length(R)%%x)<=(x%/%2) && (length(R)%%x)!=0){
>   W <- c(W,R[(j+1):length(R)])
> }
> x <- x*2
> R <- W
> W <- c()
>   }
> R <- mrg(R[1:j],R[(j+1):length(R)])
>   return(R)
> }
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] cannot open file '--no-restore.matrix'

2019-11-23 Thread Henrik Bengtsson
Looking at https://github.com/eleporcu/TWMR/blob/master/README.txt,
it looks like you should pass a single argument when you call the MR.R
script and it should be the name of a gene, e.g. 'ENSG0154803'.
You are passing two arguments and they look like filenames.

Update the script, because it should really use commandArgs(TRUE), to:

cmd_args <- commandArgs(TRUE)
if (length(cmd_args) == 0L) stop("No arguments specified.")
print(cmd_args)
gene<-cmd_args[length(cmd_args)]   ## Last argument is the 'gene'
Ngwas<-239087
N_eQTLs<-32000
out<-c("gene","alpha","SE","P","Nsnps","Ngene")

file<-paste(gene,"matrix",sep=".")
if (!file.exists(file)) stop("File not found: ", file)
filecluster<-read.table(file,header=T,sep=" ",dec=".")
...

Then call the script from the command line as:

$ Rscript MR.R ENSG0154803

That should do it
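
To see why the earlier attempts ended up looking for 'NA.matrix', here
is a small sketch that can be run interactively.  The helper name
get_gene() is made up for the illustration; the arguments are passed
in by hand to simulate what commandArgs(TRUE) would return.

```r
# Return the last trailing argument, which the script treats as the gene
get_gene <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) == 0L) stop("No arguments specified.")
  args[length(args)]
}

# Simulates: Rscript MR.R ENSG0154803
gene <- get_gene("ENSG0154803")
file <- paste(gene, "matrix", sep = ".")
print(file)

# Simulates the failing call: two arguments, but the script used the third.
# Indexing past the end gives NA, which paste() turns into "NA".
two_args <- c("ENSG0154803.ld", "ENSG0154803.matrix")
bad_file <- paste(two_args[3], "matrix", sep = ".")
print(bad_file)
```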

/Henrik


On Sat, Nov 23, 2019 at 12:24 PM Ana Marija  wrote:
>
> it is confusing because in documentation they say this is how you run
> the script:
> https://github.com/eleporcu/TWMR
>
> I tried changing this on the script:
>
> cmd_args=commandArgs(TRUE)
> print(cmd_args)
> gene<-cmd_args[3]
> Ngwas<-239087
> N_eQTLs<-32000
> out<-c("gene","alpha","SE","P","Nsnps","Ngene")
>
> file<-paste(gene,"matrix",sep=".")
> if (!file.exists(file)) stop("File not found: ", file)
> filecluster<-read.table(file,header=T,sep=" ",dec=".")
> #file<-paste(gene,"matrix",sep=".")
> #filecluster<-read.table(file,header=T,sep=".",dec=" ")
> beta<-as.matrix(filecluster[,2:(length(filecluster[1,])-1)])
>
> and I run it:
> Rscript< MR.R --no-save ENSG0154803.ld ENSG0154803.matrix
> Error: unexpected numeric constant in "1.000 0.089"
> Execution halted
>
>
> I also tried this:
> Rscript --no-save MR.R ENSG0154803.ld ENSG0154803.matrix
> [1] "ENSG0154803.ld" "ENSG0154803.matrix"
> Error: File not found: NA.matrix
> Execution halted
>
>
> Rscript --no-save MR.R ENSG0154803.ld ENSG0154803.matrix
> [1] "ENSG0154803.ld" "ENSG0154803.matrix"
> Error: File not found: NA.matrix
> Execution halted
>
> On Sat, Nov 23, 2019 at 2:08 PM Henrik Bengtsson
>  wrote:
> >
> > Maybe it would help to add:
> >
> > file<-paste(gene,"matrix",sep=".")
> > if (!file.exists(file)) stop("File not found: ", file)
> > filecluster<-read.table(file,header=T,sep=" ",dec=".")
> >
> > /Henrik
> >
> > On Sat, Nov 23, 2019 at 11:55 AM Duncan Murdoch
> >  wrote:
> > >
> > > On 23/11/2019 1:21 p.m., Ana Marija wrote:
> > > > Hi Duncan,
> > > >
> > > > thanks, I just did,
> > > >   Rscript --no-save MR.R ENSG0154803.ld ENSG0154803.matrix
> > > > [1] "ENSG0154803.ld" "ENSG0154803.matrix"
> > > > Error in file(file, "rt") : cannot open the connection
> > > > Calls: read.table -> file
> > > > In addition: Warning message:
> > > > In file(file, "rt") :
> > > >cannot open file 'NA.matrix': No such file or directory
> > > > Execution halted
> > > >
> > > Your script works with the third element in the list of arguments, and
> > > there are only two.
> > >
> > > Duncan Murdoch
> > >
> > > >
> > > > Please advise
> > > >
> > > > On Sat, Nov 23, 2019 at 12:13 PM Duncan Murdoch
> > > >  wrote:
> > > >>
> > > >> On 23/11/2019 11:05 a.m., Ana Marija wrote:
> > > >>> Hi Ben,
> > > >>>
> > > >>> I am not sure what you mean when you say to print, is it this?
> > > >>>
> > > >>>> cmd_args=commandArgs(TRUE)
> > > >>>> print(cmd_args)
> > > >>> character(0)
> > > >>>> cmd_args=commandArgs()
> > > >>>> print(cmd_args)
> > > >>> [1] "/software/linux-el7-x86_64/compilers/r-3.6.1/lib64/R/bin/exec/R"
> > > >>>
> > > >>> I changed in the first line of this script:
> > > >>> https://github.com/eleporcu/TWMR/blob/master/MR.R
> > > >>>
> > > >>>

Re: [R] cannot open file '--no-restore.matrix'

2019-11-23 Thread Henrik Bengtsson
Maybe it would help to add:

file<-paste(gene,"matrix",sep=".")
if (!file.exists(file)) stop("File not found: ", file)
filecluster<-read.table(file,header=T,sep=" ",dec=".")
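
A slightly fuller version of the same check, as a sketch only: matrix_file_for() is a made-up helper name, not something in MR.R, and it assumes the gene name is the first argument after the script name (which is what commandArgs(trailingOnly = TRUE) returns).

```r
# Sketch only; matrix_file_for() is a hypothetical helper, not part of MR.R.
# Assumes the gene name is the first user-supplied argument.
matrix_file_for <- function(args = commandArgs(trailingOnly = TRUE)) {
  # Stop early if no usable argument was given
  if (length(args) < 1L || is.na(args[1]) || !nzchar(args[1])) {
    stop("Usage: Rscript MR.R <gene>", call. = FALSE)
  }
  file <- paste(args[1], "matrix", sep = ".")
  # Fail with a clear message instead of read.table()'s connection error
  if (!file.exists(file)) stop("File not found: ", file, call. = FALSE)
  file
}
```

It could then be used as read.table(matrix_file_for(), header=TRUE, sep=" ", dec=".").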

/Henrik

On Sat, Nov 23, 2019 at 11:55 AM Duncan Murdoch
 wrote:
>
> On 23/11/2019 1:21 p.m., Ana Marija wrote:
> > Hi Duncan,
> >
> > thanks, I just did,
> >   Rscript --no-save MR.R ENSG0154803.ld ENSG0154803.matrix
> > [1] "ENSG0154803.ld" "ENSG0154803.matrix"
> > Error in file(file, "rt") : cannot open the connection
> > Calls: read.table -> file
> > In addition: Warning message:
> > In file(file, "rt") :
> >cannot open file 'NA.matrix': No such file or directory
> > Execution halted
> >
> Your script works with the third element in the list of arguments, and
> there are only two.
>
> Duncan Murdoch
>
> >
> > Please advise
> >
> > On Sat, Nov 23, 2019 at 12:13 PM Duncan Murdoch
> >  wrote:
> >>
> >> On 23/11/2019 11:05 a.m., Ana Marija wrote:
> >>> Hi Ben,
> >>>
> >>> I am not sure what you mean when you say to print, is it this?
> >>>
> >>>> cmd_args=commandArgs(TRUE)
> >>>> print(cmd_args)
> >>> character(0)
> >>>> cmd_args=commandArgs()
> >>>> print(cmd_args)
> >>> [1] "/software/linux-el7-x86_64/compilers/r-3.6.1/lib64/R/bin/exec/R"
> >>>
> >>> I changed in the first line of this script:
> >>> https://github.com/eleporcu/TWMR/blob/master/MR.R
> >>>
> >>> cmd_args=commandArgs() to be cmd_args=commandArgs(TRUE)
> >>>
> >>> but again I get the same error:
> >>>
> >>> Rscript --no-save MR.R ENSG0154803.ld ENSG0154803.matrix
> >>> Error in file(file, "rt") : cannot open the connection
> >>> Calls: read.table -> file
> >>> In addition: Warning message:
> >>> In file(file, "rt") :
> >>> cannot open file 'NA.matrix': No such file or directory
> >>> Execution halted
> >>
> >> You didn't put the print(cmd_args) into the script.
> >>
> >> Duncan Murdoch
> >>>
> >>>
> >>> Please advise,
> >>> Ana
> >>>
> >>> On Sat, Nov 23, 2019 at 9:44 AM Duncan Murdoch  
> >>> wrote:
> 
> >>>> On 23/11/2019 10:26 a.m., Ana Marija wrote:
> >>>>> HI Ben,
> >>>>>
> >>>>> I tried it but it doesn't work:
> >>>>>
> >>>>> Rscript --no-save MR.R ENSG0154803.ld ENSG0154803.matrix
> >>>>> Error in file(file, "rt") : cannot open the connection
> >>>>> Calls: read.table -> file
> >>>>> In addition: Warning message:
> >>>>> In file(file, "rt") :
> >>>>>  cannot open file '--no-restore.matrix': No such file or directory
> >>>>> Execution halted
> >>>>>
> >>>>
> >>>> You should print the cmd_args variable that is set on the first line of
> >>>> that script.  When I run a script that prints it using your command
> >>>> line, this is what it looks like:
> >>>>
> >>>> $ Rscript --no-save MR.R ENSG0154803.ld ENSG0154803.matrix
> >>>> [1] "/Library/Frameworks/R.framework/Resources/bin/exec/R"
> >>>> [2] "--slave"
> >>>> [3] "--no-restore"
> >>>> [4] "--no-save"
> >>>> [5] "--file=MR.R"
> >>>> [6] "--args"
> >>>> [7] "ENSG0154803.ld"
> >>>> [8] "ENSG0154803.matrix"
> >>>>
> >>>> The next line
> >>>>
> >>>> gene <- cmd_args[3]
> >>>>
> >>>> is obviously wrong for my system, because it would set gene to
> >>>> "--no-restore".  Your results will probably be somewhat different, but
> >>>> it might be clear what you should use instead of the third element.
> >>>>
> >>>> By the way, changing the first line
> >>>>
> >>>> cmd_args=commandArgs()
> >>>>
> >>>> to
> >>>>
> >>>> cmd_args <- commandArgs(TRUE)
> >>>>
> >>>> makes a lot of sense in most cases.  I haven't read your whole script so
> >>>> I don't know if it makes sense for you.
> >>>>
> >>>> Duncan Murdoch
> >>>>
> >>>>
> >>>>> Please advise,
> >>>>> Ana
> >>>>>
> >>>>> On Sat, Nov 23, 2019 at 4:16 AM Ben Tupper  wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I think you want this order...
> >>>>>>
> >>>>>> Rscript [options for R] script_file.R argument_1 argument_2 ...
> >>>>>>
> >>>>>> So, like this ...
> >>>>>>
> >>>>>> Rscript --no-save MR.R ENSG0154803.ld ENSG0154803.matrix
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Ben
> >>>>>>
> >>>>>> On Fri, Nov 22, 2019 at 8:59 PM Ana Marija 
> >>>>>>  wrote:
> >>>>>>>
> >>>>>>> HI Ben,
> >>>>>>>
> >>>>>>> thank you so much , I did this:
> >>>>>>>
> >>>>>>> Rscript --no-save ENSG0154803.ld ENSG0154803.matrix  MR.R
> >>>>>>> Error: unexpected numeric constant in "1.000 0.089"
> >>>>>>> Execution halted
> >>>>>>>
> >>>>>>> I made ENSG0154803.ld with:
> >>>>>>> library(MASS)
> >>>>>>> write.matrix(ENSG0154803.ld, file="ENSG0154803.ld")
> >>>>>>>
> >>>>>>> and it looks like this:
> >>>>>>>
> >>>>>>> 1.000 0.089 0.006 0.038 0.012 0.014 0.003 0.001 0.005 0.015 0.013
> >>>>>>> 0.000 0.000 0.000 0.001 0.003 0.000
> >>>>>>> 0.089 1.000 0.002 0.007 0.005 0.001 0.004 0.005 0.000 0.003 0.014
> >>>>>>> 0.001 0.012 0.005 0.000 0.004 0.004
> >>>>>>> 0.006 0.002 1.000 0.004 0.008 0.029 0.040 0.001 0.001 0.006 0.013
> >>>>>>> 0.054 0.006 0.002 0.010 0.001 0.000
> >>>>>>> 0.038 0.007 0.004 1.000 0.460 0.044 

Re: [R] .libPaths() can not deal non-latin characters?

2019-10-21 Thread Henrik Bengtsson
The folder must exist. If not, .libPaths() *silently* ignores it. Could
that be it?
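
A quick way to rule that out, as a minimal sketch: add_library() is a made-up helper name, and it only covers the "folder must exist" case, not anything specific to non-latin characters.

```r
# Sketch only; add_library() is a hypothetical helper name.
add_library <- function(lib) {
  # .libPaths() silently drops folders that do not exist, so create it first
  if (!dir.exists(lib)) dir.create(lib, recursive = TRUE)
  .libPaths(c(lib, .libPaths()))
  # Report whether the folder actually ended up on the library path
  normalizePath(lib, winslash = "/") %in%
    normalizePath(.libPaths(), winslash = "/")
}
```

If this returns FALSE for a folder that does exist, the problem is likely somewhere else (encoding, permissions).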

Henrik



On Mon, Oct 21, 2019, 02:32 Jinsong Zhao  wrote:

> Hi there,
>
> I have a computer run Win10 with user names in Chinese. I installed R on
> it. It can run normally. When I installed a package, for example, ada, then
> the library would be installed into
> "C:/Users/中文/Documents/R/win-library/3.6", where "中文" is my user name.
>
> > library(ada)
> Error in library(ada) : there is no package called ‘ada’
>
> > .libPaths()
> [1] "C:/Program Files/R/R-3.6.1/library"
>
> > .libPaths(c("C:/Users/中文/Documents/R/win-library/3.6", .libPaths()))
> > .libPaths()
> [1] "C:/Program Files/R/R-3.6.1/library"
>
> you will find that .libPaths() does not accept the path with Chinese (I
> think non-latin characters may not be accepted).
>
> I also tried to install the package to other directory with Chinese
> character, and then set the .libPaths, and failed.
>
> Is it the features?
>
> Any hints? Thanks in advance.
>
> Best,
> Jinsong
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Error with install.packages using R v 3.5.1 and 3.5.2

2019-01-16 Thread Henrik Bengtsson
Immediately after you get that error:

Error in if (any(diff)) { : missing value where TRUE/FALSE needed

what does

> traceback()

output?  (I suspect this error occurs in tools:::checkMD5sums() used
to assert that the package files are correctly downloaded).  Also,
going forward, let's try with a single package installed, e.g.
install.packages("glue").   Does that also give an error?
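
For reference, this kind of error generally means that an NA reached an if() condition. A generic illustration (not the actual code inside install.packages(), and the variable name 'diff' is only borrowed from the error message):

```r
# Generic illustration only; not the actual code inside install.packages().
diff <- c(FALSE, NA)          # a comparison where one value was missing
any(diff)                     # NA: no TRUE seen, but the NA could hide one
res <- tryCatch(
  if (any(diff)) "files differ" else "files match",
  error = function(e) "error"
)
res                           # "error": if() refuses an NA condition
# A guarded test sidesteps the error by treating NA as FALSE
isTRUE(any(diff))
```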

/Henrik


On Wed, Jan 16, 2019 at 11:10 AM Jeff Newmiller
 wrote:
>
> I don't know specifically where that error comes from... but I can think of 
> two possible directions to go:
>
> 1) If you have ever run R as Administrator then you may need to delete your 
> personal library (‘~/My Documents/R/win-library/3.5’) and reload all packages 
> NOT using Run As Administrator. Any files created by R using those security 
> credentials may impede the function of R when run without those credentials.
>
> 2) There have previously been reports that this error arises from the 
> installed.packages function that is invoked by install.packages. This could 
> be related to (1) above or be unrelated. You might confirm on this discussion 
> thread whether this function runs okay for you.
>
> For future reference, use CRAN packages for examples on this mailing list to 
> clarify that the problem is relevant here.
>
> On January 16, 2019 9:48:12 AM PST, Emily Wan  
> wrote:
> >Hi Jeff -
> >I do not think the issue is Bioconductor (which is why I had posted the
> >inquiry on this forum - but as an aside, I do have the latest version
> >of
> >Bioconductor (3.8)).   As an example, when I attempt to use the generic
> >install.packages() function, I receive the same error message. I have
> >included an example below (along with the sessionInfo):
> >
> >> install.packages('stringr')
> >Installing package into ‘~/My Documents/R/win-library/3.5’
> >(as ‘lib’ is unspecified)
> >--- Please select a CRAN mirror for use in this session ---
> >also installing the dependencies ‘glue’, ‘magrittr’
> >trying URL '
> >https://cran.revolutionanalytics.com/bin/windows/contrib/3.5/glue_1.3.0.zip'
> >Content type 'application/zip' length 108591 bytes (106 KB)
> >downloaded 106 KB
> >trying URL '
> >https://cran.revolutionanalytics.com/bin/windows/contrib/3.5/magrittr_1.5.zip
> >'
> >Content type 'application/zip' length 155452 bytes (151 KB)
> >downloaded 151 KB
> >trying URL '
> >https://cran.revolutionanalytics.com/bin/windows/contrib/3.5/stringr_1.3.1.zip
> >'
> >Content type 'application/zip' length 194247 bytes (189 KB)
> >downloaded 189 KB
> >Error in if (any(diff)) { : missing value where TRUE/FALSE needed
> >> sessionInfo()
> >R version 3.5.2 (2018-12-20)
> >Platform: x86_64-w64-mingw32/x64 (64-bit)
> >Running under: Windows Server 2012 R2 x64 (build 9600)
> >Matrix products: default
> >locale:
> >[1] LC_COLLATE=English_United States.1252
> >[2] LC_CTYPE=English_United States.1252
> >[3] LC_MONETARY=English_United States.1252
> >[4] LC_NUMERIC=C
> >[5] LC_TIME=English_United States.1252
> >attached base packages:
> >[1] stats graphics  grDevices utils datasets  methods   base
> >loaded via a namespace (and not attached):
> >[1] compiler_3.5.2 tools_3.5.2
> >>
> >Please let me know what additional information is needed - many thanks.
> >
> >On Tue, Jan 15, 2019 at 4:51 PM Jeff Newmiller
> >
> >wrote:
> >
> >> Please ask questions about Bioconductor on the Bioconductor forum
> >[1].
> >>
> >> Chances are that you need to re-install Bioconductor because packages
> >are
> >> installed in two-digit version-specific libraries... e.g. R 3.4 and R
> >3.5
> >> do not share packages.
> >>
> >> [1] https://support.bioconductor.org
> >>
> >> On January 15, 2019 11:51:16 AM PST, Emily Wan
> >
> >> wrote:
> >> >Hi -
> >> >I am working with R on a Window Server 2012 R2 - I had originally
> >> >installed
> >> >R (v3.5.1) in September/October 2018 and have used multiple packages
> >> >without incident. However, last week, when attempting to install
> >> >additional
> >> >packages (using install.packages() or Bioconductor's
> >> >BiocManager::install()
> >> >wrapper), I kept on receiving the following error message:
> >> >
> >> >Error in if (any(diff)) { : missing value where TRUE/FALSE needed
> >> >
> >> >I have searched the prior threads on this topic (including the issue
> >> >reported with R v3.4.0 which required a patch), rebooted my server,
> >and
> >> >actually uninstalled R v3.5.1 and upgraded to v3.5.2 but am still
> >> >receiving
> >> >the same error message when I attempt to install *any* package.
> >> >Please let me know what additional details I can provide to assist
> >with
> >> >troubleshooting. Thank you.
> >>
> >> --
> >> Sent from my phone. Please excuse my brevity.
> >>
>
> --
> Sent from my phone. Please excuse my brevity.
>

Re: [R] g++ error causes non-zero exit status for package installation

2019-01-05 Thread Henrik Bengtsson
On Sat, Jan 5, 2019 at 9:41 AM Dirk Eddelbuettel  wrote:
>
>
> On 5 January 2019 at 09:14, William Dunlap via R-help wrote:
> | You would get these errors ("R: file or directory not found, version: file
> | or directory not found...") if you had a ~/.Rprofile file containing the
> | line 'cat(version$version.string, sep="\n").
>
> Well spotted -- very much so. That is bound to break use within src/Makevars
> and alike. If you must do something in ~/.Rprofile either make it silent, or
> make it conditional based on if (interactive()) { ...that_code_here... }

Interesting problem.  One way to work around this startup issue with:

  PKG_LIBS = `$(R_HOME)/bin/Rscript -e "Rcpp:::LdFlags()"`

used by minqa:src/Makevars
(https://r-forge.r-project.org/scm/viewvc.php/pkg/minqa/src/Makevars?view=markup=optimizer),
could be to use something like:

  PKG_LIBS = `Rscript -e "cat('LDFLAGS:\n')" -e "Rcpp:::LdFlags()" | grep -A 999 "LDFLAGS:" | grep -v "LDFLAGS:"`

and analogously for src/Makevars.win.

But in the bigger picture, maybe there's room for an R/Rscript option
to silence all R startup stdout and/or stderr output?  For example,

  Rscript --quiet-startup -e "Rcpp:::LdFlags()"

/Henrik

PS. It's probably better to output to stderr in .Rprofile, e.g. by
always using message() instead of cat(). However, that cannot be
assumed to always be the case.
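
Combining Dirk's if (interactive()) advice with message(), a ~/.Rprofile could look something like the sketch below. startup_banner() is a made-up name used only for illustration.

```r
# Sketch for ~/.Rprofile; startup_banner() is a hypothetical helper name.
startup_banner <- function(is_interactive = interactive()) {
  if (is_interactive) {
    # message() writes to stderr, so stdout stays clean for tools that
    # capture it, e.g. backticked Rscript calls in src/Makevars
    message(R.version.string)
  }
  invisible(is_interactive)
}
startup_banner()
```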

>
> Dirk
>
> --
> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>


Re: [R] Access function as text from package by name

2018-09-27 Thread Henrik Bengtsson
deparse(graphics::box)
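
If the name is only available as a string, as in the question, one base-R sketch is to split it on '::' or ':::' and look the object up in the package namespace. fun_source() is a made-up name, and note that it treats '::' and ':::' the same, i.e. it does not check whether the object is exported.

```r
# Sketch only; fun_source() is a hypothetical helper name.
fun_source <- function(name) {
  # ":::?" matches either '::' or ':::' between package and object name
  parts <- strsplit(name, ":::?")[[1]]
  stopifnot(length(parts) == 2L)
  # Look in the package namespace so non-exported objects are found too
  fun <- get(parts[2], envir = asNamespace(parts[1]), inherits = FALSE)
  deparse(fun)
}

identical(fun_source("graphics::box"), deparse(graphics::box))
```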

/Henrik
On Thu, Sep 27, 2018 at 3:30 AM Sigbert Klinke
 wrote:
>
> Hi,
>
> I want to have a function, e.g. graphics::box, as text.
> Currently I'am using
>
> deparse(eval(parse(text='graphics::box')))
>
> It is important that '::' and ':::' can be used in the name.
>
> Is there a simpler way?
>
> Thanks
>
> Sigbert
>
> --
> https://hu.berlin/sk
> https://hu.berlin/mmstat3
>


Re: [R] makeCluster() hangs infinitely

2018-09-17 Thread Henrik Bengtsson
On Mon, Sep 17, 2018 at 12:56 PM Zhihao Huang  wrote:
>
> Hi Henrik,
>
> Thanks for the suggestions! I tried your approach, and obtained the following 
> output, which is pretty similar to the previous ones.
>
> > cl <- future::makeClusterPSOCK(1, outfile = NULL, verbose = TRUE)
> Workers: [n = 1] ‘localhost’
> Base port: 11214
> Creating node 1 of 1 ...
> - setting up node
> Starting worker #1 on ‘localhost’: 
> '/Library/Frameworks/R.framework/Resources/bin/Rscript' 
> --default-packages=datasets,utils,grDevices,graphics,stats,methods -e 
> 'parallel:::.slaveRSOCK()' MASTER=localhost PORT=11214 OUT= TIMEOUT=2592000 
> XDR=TRUE
> Waiting for worker #1 on ‘localhost’ to connect back
> starting worker pid=13731 on localhost:11214 at 15:48:41.991
>
> I guess this is a connection problem. I am not sure what these numbers mean. 
> Do you have any further idea on this? I very much appreciate it!

Yes, it looks similar with the important difference of displaying:

"starting worker pid=13731 on localhost:11214 at 15:48:41.991"

That tells us that the background worker (separate R session running
parallel:::.slaveRSOCK()) was successfully launched, which is good.

BTW, you should see something similar with:

cl <- parallel::makeCluster(1, outfile = NULL)

which helps others help you (in case they say "oh, it might be a
problem with the future package - as the maintainer").

Yes, it looks like a "connection problem" - this could be a firewall
issue or similar.  I'm not on macOS, so I cannot help you there, but
maybe others can pitch in.
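
Once the connection question is sorted out, a quick round-trip test could look like the sketch below. check_local_worker() is a made-up name; it uses outfile = "" (documented in ?parallel::makeCluster as "no redirection") rather than the NULL used above, and it will of course hang in the same way as long as the worker cannot connect back.

```r
# Sketch only; check_local_worker() is a hypothetical helper name.
check_local_worker <- function() {
  # outfile = "" leaves worker output unredirected, so startup messages show
  cl <- parallel::makeCluster(1L, outfile = "")
  on.exit(parallel::stopCluster(cl))
  # Ask the worker for its process id; a reply proves the round trip works
  parallel::clusterEvalQ(cl, Sys.getpid())[[1]]
}
```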

/Henrik

>
> Thanks,
> Zhihao
> --
> Zhihao (Daniel) Huang
> Graduate Student
> Department of Statistics,
> University of Michigan, Ann Arbor
> Email: zhhhw...@umich.edu
>
>
>
> On Mon, Sep 17, 2018 at 12:38 AM Henrik Bengtsson 
>  wrote:
>>
>> Hi,
>>
>> did you see my answer on StackOverflow? Specifically, if you set
>> argument 'outfile = NULL' to either of those two functions, you'll get
>> a little bit more information that *might* provide some clues.
>>
>> /Henrik
>>
>>
>> On Sun, Sep 16, 2018 at 5:38 PM Zhihao Huang  wrote:
>> >
>> > Hi all,
>> >
>> > The function makeCluster() of parallel does not work on my laptop. It hangs
>> > infinitely.
>> >
>> > *1. Problem Summary:*
>> >
>> > > # Loading parallel packages
>> >
>> > > library(parallel)
>> >
>> > > cl <- makeCluster(2) # It hangs at this line of code.
>> > It hangs at the second line of the code.
>> >
>> > *2. Potential Reason*
>> > I also tried to see the details of what it does internally by using the
>> > following code.
>> >
>> > > library(future)
>> >
>> > > cl <- future::makeClusterPSOCK(1L, verbose = TRUE) # It hangs at this
>> > line of code.
>> > And it returns the following descriptions and hangs.
>> >
>> > *Workers: [n = 1] ‘localhost’*
>> >
>> > *Base port: 11214*
>> >
>> > *Creating node 1 of 1 ...*
>> >
>> > *- setting up node*
>> >
>> > *Starting worker #1 on ‘localhost’:
>> > '/Library/Frameworks/R.framework/Resources/bin/Rscript'
>> > --default-packages=datasets,utils,grDevices,graphics,stats,methods -e
>> > 'parallel:::.slaveRSOCK()' MASTER=localhost PORT=11214 OUT=/dev/null
>> > TIMEOUT=2592000 XDR=TRUE*
>> >
>> > *Waiting for worker #1 on ‘localhost’ to connect back*
>> > So the problem is that the "worker #1 on 'local host'" never connects back,
>> > and that's why it hangs forever. I have no idea what causes this.
>> >
>> > *3. my sessionInfo():*
>> >
>> > R version 3.5.1 (2018-07-02)
>> >
>> > Platform: x86_64-apple-darwin15.6.0 (64-bit)
>> >
>> > Running under: macOS High Sierra 10.13.6
>> >
>> >
>> > Matrix products: default
>> >
>> > BLAS:
>> > /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
>> >
>> > LAPACK:
>> > /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
>> >
>> >
>> > locale:
>> >
>> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>> >
>> >
>> > attached base packages:
>> >
>> > [1] stats graphics  grDevices utils datasets  methods   base
>> >
>> >
>> > loaded via a namespace (and not att

Re: [R] makeCluster() hangs infinitely

2018-09-16 Thread Henrik Bengtsson
Hi,

did you see my answer on StackOverflow? Specifically, if you set
argument 'outfile = NULL' to either of those two functions, you'll get
a little bit more information that *might* provide some clues.

/Henrik


On Sun, Sep 16, 2018 at 5:38 PM Zhihao Huang  wrote:
>
> Hi all,
>
> The function makeCluster() of parallel does not work on my laptop. It hangs
> infinitely.
>
> *1. Problem Summary:*
>
> > # Loading parallel packages
>
> > library(parallel)
>
> > cl <- makeCluster(2) # It hangs at this line of code.
> It hangs at the second line of the code.
>
> *2. Potential Reason*
> I also tried to see the details of what it does internally by using the
> following code.
>
> > library(future)
>
> > cl <- future::makeClusterPSOCK(1L, verbose = TRUE) # It hangs at this
> line of code.
> And it returns the following descriptions and hangs.
>
> *Workers: [n = 1] ‘localhost’*
>
> *Base port: 11214*
>
> *Creating node 1 of 1 ...*
>
> *- setting up node*
>
> *Starting worker #1 on ‘localhost’:
> '/Library/Frameworks/R.framework/Resources/bin/Rscript'
> --default-packages=datasets,utils,grDevices,graphics,stats,methods -e
> 'parallel:::.slaveRSOCK()' MASTER=localhost PORT=11214 OUT=/dev/null
> TIMEOUT=2592000 XDR=TRUE*
>
> *Waiting for worker #1 on ‘localhost’ to connect back*
> So the problem is that the "worker #1 on 'local host'" never connects back,
> and that's why it hangs forever. I have no idea what causes this.
>
> *3. my sessionInfo():*
>
> R version 3.5.1 (2018-07-02)
>
> Platform: x86_64-apple-darwin15.6.0 (64-bit)
>
> Running under: macOS High Sierra 10.13.6
>
>
> Matrix products: default
>
> BLAS:
> /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
>
> LAPACK:
> /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
>
>
> locale:
>
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
>
> attached base packages:
>
> [1] stats graphics  grDevices utils datasets  methods   base
>
>
> loaded via a namespace (and not attached):
>
> [1] compiler_3.5.1
>
> I spent hours searching for the solutions but failed. It looks like some
> other people met similar problem here. Also, I posted this question online
> here a week ago.
>
> Any suggestion would be appreciated. Thanks a lot!
>
> Thanks,
> Zhihao
> --
> Zhihao (Daniel) Huang
> Graduate Student
> Department of Statistics,
> University of Michigan, Ann Arbor
> Email: zhhhw...@umich.edu
>
> --
> 黄 之昊
> Zhihao Huang
>
> Graduate Student
> Department of Statistics,
> University of Michigan, Ann Arbor
> Email: zhhhw...@umich.edu
>


Re: [R] sink() output to another directory

2018-09-13 Thread Henrik Bengtsson
On Thu, Sep 13, 2018 at 7:12 PM Rich Shepard  wrote:
>
> On Thu, 13 Sep 2018, Henrik Bengtsson wrote:
>
> >> sink('stat-summaries/estacada-se-precip.txt')
> >> print(summary(estacada_se_wx))
> >> sink()
> >>
> >> while accepting:
> >>
> >> pdf('../images/rainfall-estacada-se.pdf')
> >>   
> >> plot(rain_est_se)
> >> dev.off()
> >>
> >>Changing the sink() file to
> >> './stat-summaries/estacada-se-precip.txt'
> >>
> >> generates the same error
> >
> > "same error" as what? (ambiguity is the reason for not being able to
> > help you - all the replies in this thread this far are correct and on
> > the spot)
> >
> > BTW, not that it should matter, what is your operating system and version 
> > of R?
>
> Henrik,
>
>As I wrote in earlier messages:
>
> sink('stat-summaries/estacada-wnw-precip.txt')
> print(summary(estacada_se_wx))
> sink()
>
> results in
>
> 24: sink('stat-summaries/estacada-wnw-precip.txt')
> 25: print(/
> ^
> Does not matter if I use single or double quotes.
>
>The message that print() doesn't like the forward slash results when I
> specify 'stat-summaries/estacada-wnw-precip.txt' or
> './stat-summaries/estacada-wnw-precip.txt'.

Since it is impossible to get that error message (which is a syntax
error, i.e. the R parser does not accept the code as written and it
never gets to the point where the R engine even runs your code) for
the code you are showing, I strongly suspect that you didn't source()
the same file that you were editing (the one that contains the
three-line code you are displaying above).
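
To illustrate the point about the parser, here is a small self-contained example (the two code lines are made up): the whole input is parsed before anything is evaluated, so a syntax error on a later line means the earlier lines never run either.

```r
# Illustration: a syntax error is raised by the parser before anything runs.
code <- c(
  "ran <- TRUE",   # would set 'ran' if the code were evaluated line by line
  "print(/"        # invalid R, similar to the reported error
)
ran <- FALSE
res <- tryCatch(
  eval(parse(text = code)),
  error = function(e) "parse error"
)
res   # "parse error"
ran   # still FALSE: the first line was never evaluated
```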

I see from one of your later messages that you've got it to work now.

/Henrik

>
>Running R-3.5.1 on Slackware-14.2.
>
> Rich
>


Re: [R] sink() output to another directory

2018-09-13 Thread Henrik Bengtsson
On Thu, Sep 13, 2018 at 6:05 PM Rich Shepard  wrote:
>
> On Thu, 13 Sep 2018, MacQueen, Don wrote:
>
> > In my experience, any path that can be used at the shell prompt in a
> > unix-alike can be used anywhere that R wants a file name.
>
> Don,
>
>That's been my experiences, too.
>
> > Hopefully, that helps...
>
>That's why I don't understand why the plot() function accepts the
> different directory while the sink() function (here) doesn't.
>
>I showed R rejecting:
>
> sink('stat-summaries/estacada-se-precip.txt')
> print(summary(estacada_se_wx))
> sink()
>
> while accepting:
>
> pdf('../images/rainfall-estacada-se.pdf')
>   
> plot(rain_est_se)
> dev.off()
>
>Changing the sink() file to
> './stat-summaries/estacada-se-precip.txt'
>
> generates the same error

"same error" as what? (ambiguity is the reason for not being able to
help you - all the replies in this thread this far are correct and on
the spot)

BTW, not that it should matter, what is your operating system and version of R?

/Henrik

> while I regularly use this syntax to copy files or
> specify the relative path to an executable file.
>
> Regards,
>
> Rich
>


Re: [R] sink() output to another directory

2018-09-13 Thread Henrik Bengtsson
On Thu, Sep 13, 2018 at 3:33 PM Rich Shepard  wrote:
>
>Neither ?sink nor ?capture.output indicates how the output file can be
> specified to be in a directory other than the cwd.
>
>When the cwd is ../analyses/ and I want the output to be in
> ../analyses/stat-summaries/ how do I write this?
>
>sink('example-output.txt')
>print(summary(df))
>sink()
>
> writes output to the current directory. My attempts to prefix the file name
> with ./ or just / don't sit well with R.

Hi, welcome to R-help. Please help the helper(s) to help you by being
as explicit as possible about what you've tried (i.e. cut'n'paste your
code), provide error messages (cut'n'paste) you get, if any, and/or
what you mean by "don't sit well with R" (that can mean many different
things).
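
For what it's worth, a relative path should work in sink() the same way it does elsewhere in R, as long as the folder exists. A minimal sketch, where write_summary() is a made-up helper name:

```r
# Sketch only; write_summary() is a hypothetical helper name.
write_summary <- function(x, file) {
  # sink() cannot create missing folders, so make sure the target exists
  dir.create(dirname(file), recursive = TRUE, showWarnings = FALSE)
  sink(file)
  on.exit(sink())   # always restore normal output, even on error
  print(summary(x))
  invisible(file)
}
```

With the cwd at ../analyses/ this would be called as write_summary(df, "stat-summaries/example-output.txt").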

/Henrik

> What is the proper syntax?
>
> TIA,
>
> Rich
>


Re: [R] Mysterious seg fault --- SOLVED

2018-08-13 Thread Henrik Bengtsson
On Mon, Aug 13, 2018 at 3:51 AM Rolf Turner  wrote:
>
>
> OK everybody!  You can relax.  :-) I managed to spot the loony.  After
> mucking around with valgrind, and before trying gdb, I had one more look
> at my code and *finally* saw the stupid thing that I had been doing.
>
> In the call to .Fortran() I had a line
>
>  nphi=as.integer(nphi),
>
> but "nphi" was nowhere defined (!!!) in the R code.  The name "nphi"
> appeared as an argument in the Fortran subroutine in question, but was
> nowhere actually *used*!!!

Didn't R CMD check pick this up, that is, didn't it report that 'nphi'
is a "global" variable?
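
As far as I know, that check is based on the codetools package, so something similar can be tried by hand. A sketch, where f() and the "foo" routine name are made-up stand-ins for the real code:

```r
# Sketch only; 'f' and the "foo" routine name are hypothetical stand-ins.
f <- function(x) {
  .Fortran("foo", x = as.double(x), nphi = as.integer(nphi))
}
# 'nphi' is neither an argument nor a local variable of f()
globals <- codetools::findGlobals(f, merge = FALSE)
"nphi" %in% globals$variables
```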

/Henrik

>
> It seems that passing a non-existent value as an argument to a Fortran
> subroutine can *sometimes* confuse it.  Understandably.
>
> I think that this "nphi" was a left-over from an earlier version of the
> code.  I must have changed the code so that nphi was no longer needed,
> but then forgot to remove it from some places.  Psigh!  I hate myself
> sometimes.
>
> Anyhow, thanks to all those who took the time and made the effort to try
> to help me.
>
> cheers,
>
> Rolf
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>


Re: [R] subsetting ls() as per class...

2018-07-28 Thread Henrik Bengtsson
The ll() function of R.oo returns a data.frame with various attributes that
you can subset on, e.g.

> subset(R.oo::ll(), data.class %in% c("zoo", "xts"))
       member data.class dimension objectSize
2          fz        zoo        10       1344
4  sample.xts        xts  c(180,4)      10128
5           x        zoo         5        528
6          x1        zoo         5        880
7          x2        zoo         5        496
9           y        zoo         5       1040
11          z        zoo    c(5,3)       1184
12         z0        zoo         0        448
13         z2        zoo    c(4,3)        904
14        z20        zoo    c(4,0)        616
15         z3        zoo         8        528
16         z4        zoo         5        592
17         z5        zoo         5        792
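
Without extra packages, the same selection can be sketched in base R. objects_of_class() is a made-up helper name; for the names ending in ".NS", ls(pattern = "\\.NS$") should be enough on its own.

```r
# Sketch only; objects_of_class() is a hypothetical helper name.
objects_of_class <- function(classes, envir = globalenv()) {
  nms <- ls(envir = envir)
  # TRUE for each object that inherits from any of the given classes
  keep <- vapply(
    nms,
    function(nm) inherits(get(nm, envir = envir), classes),
    logical(1)
  )
  nms[keep]
}

# Names of all "zoo"/"xts" objects in the global environment
doomed <- objects_of_class(c("zoo", "xts"))
# rm(list = doomed)   # uncomment to actually remove them
```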

Henrik

On Sat, Jul 28, 2018, 08:22 Jeff Newmiller  wrote:

> You can extract the names into a character vector with ls and then use
> grep(..., value=TRUE) to select which ones you want to remove, and then
> pass that list to rm.
>
> However, due to the way R handles memory you are unlikely to see much
> savings by doing this. I would recommend focusing on creating a script or
> series of scripts that can allow you to re-create your analysis, and then
> restarting R whenever you are ready to reduce memory usage. This will have
> the side benefit of leaving you with a verified-complete record of how your
> analysis was done.
>
> On July 27, 2018 10:58:36 PM PDT, akshay kulkarni 
> wrote:
> >dear memebers,
> >I am using R in AWS linux instance for my research. I want to remove
> >certain objects from the global environment  to reduce my EBS cost..for
> >example, I want to remove all objects of class "xts", "zoo". Is there
> >any way to automate this, instead of removing the objects one by one?
> >
> >Basically, I want to subset  ls() according to class, and then remove
> >that subset by using rm function.
> >
> >I got to know about mget in SO, but that is not working in my case
> >
> >Also, all the above objects end with ".NS".  I came to know that you
> >can remove objects starting with a certain pattern; is there any way to
> >remove objects ending in a certain pattern?
> >
> >very many thanks for your time and effort...
> >yours sincerely,
> >AKSHAY M KULKARNI
> >
>
> --
> Sent from my phone. Please excuse my brevity.
>
>



Re: [R] Unable to Change Library Paths in R 3.5.1 in Windows 10

2018-07-26 Thread Henrik Bengtsson
Some more info:

1. The library folder have to exist; if not, then R will silently ignore it.

2. Try calling your .libPaths(my_new_folder) setup in an interactive R
session.  Then, in the same session, look at .libPaths().  The first
element should be your new folder.  If not, make sure it exists, e.g.
file_test("-d", my_new_folder).

3. If (1)-(2) are correct, then it might be that you saved your
.Rprofile in the wrong location (or as .Rprofile.txt, which happens in
some Windows editors, e.g. Notepad - if so, put the file name within
quotation marks when saving).  You can verify you've saved it in the
correct place by file_test("-f", "~/.Rprofile").  To find the full
location, do normalizePath("~/.Rprofile").

4. If you saved .Rprofile in the correct place, it could be that there
is a missing newline on the last line.  If that is the case, then R
silently ignores that line.
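
Checks (1)-(4) above can be run in one go.  The following is only a
minimal sketch: the folder name and the helper ends_with_newline() are
made up for illustration and are not part of R or of the startup package.

```r
# Hypothetical folder name; replace with your own library folder
my_new_folder <- file.path(tempdir(), "R-library")

# (1) The folder must exist, otherwise .libPaths() silently drops it
dir.create(my_new_folder, recursive = TRUE, showWarnings = FALSE)
stopifnot(utils::file_test("-d", my_new_folder))

# (2) Prepend it and confirm that it is now the first element
old_paths <- .libPaths()
.libPaths(c(my_new_folder, old_paths))
print(.libPaths()[1])

# (3) Check whether ~/.Rprofile exists and where it resolves to
has_rprofile <- utils::file_test("-f", "~/.Rprofile")
print(normalizePath("~/.Rprofile", mustWork = FALSE))

# (4) Illustrative helper: TRUE if the last byte of the file is a newline,
#     NA if the file is missing or empty
ends_with_newline <- function(path) {
  path <- path.expand(path)
  size <- file.size(path)
  if (is.na(size) || size == 0) return(NA)
  con <- file(path, open = "rb")
  on.exit(close(con))
  seek(con, where = size - 1)
  identical(readBin(con, what = "raw", n = 1L), as.raw(0x0a))
}
if (has_rprofile) print(ends_with_newline("~/.Rprofile"))
```

If ends_with_newline() returns FALSE for your .Rprofile, open the file
and add an empty line at the end.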

You might find the startup package helpful (disclaimer: I'm the
author).  For (4), you can run startup::check() and it'll tell you about
and fix potential issues like this one, e.g.

> startup::check()
Backed up R startup file: ‘~/.Rprofile’ (29 bytes) ->
‘~/.Rprofile.bak.20180726-122923’ (29 bytes)
Warning message:
In check_rprofile_eof(all = all, fix = fix, backup = backup, debug = debug) :
  SYNTAX ISSUE FIXED: Added missing newline to the end of file
~/.Rprofile, which otherwise would cause R to silently ignore the file
in the startup process.
>

/Henrik

On Wed, Jul 25, 2018 at 11:58 PM Bert Gunter  wrote:
>
> Permissions settings on your target directory (which is a Windows not an R
> issue)??
>
> -- Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Wed, Jul 25, 2018 at 1:17 PM, Jack Pincus 
> wrote:
>
> > I just installed R 3.5.1 on a new Windows 10 computer.  R tries to set a
> > personal library to C:/Users/jackp/OneDrive/Documents/R/win-lib/3.5.  I
> > want to store R packages on my local hard drive, not OneDrive.  I tried
> > placing the line of code:.libPaths(c(.libPaths(),
> > "C:/Users/jackp/Documents/R/win-lib/3.5")) at the top of Rprofile located
> > in C:/Program Files/R/R3.5.1/library/base?R but it does not recognize
> > libraries in my personal library.  Any suggestions how to fix this
> > problem.  Also, is there a reason that R tries to default to OneDrive in
> > Windows 10.  I also had a OneDrive folder in Windows 8.1 but could set R to
> > recognize a personal library on C:/Users/jackp/Documents.
> > Thanks in advance,
> > Jack
> >
>



Re: [R] [bug] spdep package?

2018-07-23 Thread Henrik Bengtsson
This is intended/expected because the spdep package *depends* on the
spData package (see https://cran.r-project.org/web/packages/spdep/),
which means that the maintainer of spdep intends spData to also be
*attached* whenever spdep is attached.  If they had only
imported it, then spData would only be *loaded* (but not attached),
and you would not get 'spData' on your search() path and therefore not
see 'x' either.

Example:

## Loading spData
> loadNamespace("spData")
<environment: namespace:spData>

> loadedNamespaces()
[1] "compiler"  "graphics"  "utils"     "grDevices" "stats"     "datasets"
[7] "methods"   "spData"    "base"

## The search path used to find objects
> search()
[1] ".GlobalEnv"        "package:stats"     "package:graphics"
[4] "package:grDevices" "package:utils"     "package:datasets"
[7] "package:methods"   "Autoloads"         "package:base"

## So, 'x' is not found via the search path
> x
Error: object 'x' not found

## But is still there
> spData::x
 [1]   0  30  60  90 120 150 180 210 240 270 300 330 360 390 420 450


## Attaching spData, which also happens when you do library(spdep)
> library(spData)
To access larger datasets in this package, install the spDataLarge
package with: `install.packages('spDataLarge',
repos='https://nowosad.github.io/drat/', type='source'))

> loadedNamespaces()
[1] "compiler"  "graphics"  "utils"     "grDevices" "stats"     "datasets"
[7] "methods"   "spData"    "base"

## Now, spData is on the search path
> search()
 [1] ".GlobalEnv"        "package:spData"    "package:stats"
 [4] "package:graphics"  "package:grDevices" "package:utils"
 [7] "package:datasets"  "package:methods"   "Autoloads"
[10] "package:base"

> x
 [1]   0  30  60  90 120 150 180 210 240 270 300 330 360 390 420 450

> find("x")
[1] "package:spData"

/Henrik
On Mon, Jul 23, 2018 at 2:01 PM Jeremie Juste  wrote:
>
>
> Hello,
>
> Thanks for the info. I still think these variables should not be loaded
> when library(spdep) is called.
>
> But I'll handle it following your suggestion.
>
> Thanks,
>
> Jeremie
>
>
>
>
>
>
> > It turns out that 'x' comes from the spData package and lives
> > inside that package (part of its namespace).
> >
> >> spData::x
> >  [1]   0  30  60  90 120 150 180 210 240 270 300 330 360 390 420 450
> >
> > This is conceptually no different from other objects in a package
> > namespace, although we are more used to seeing functions and not data
> > objects.  Another well-known example of this is:
> >
> >> base::pi
> > [1] 3.141593
> >
> > So, this 'x' is *not* in your global workspace and you cannot remove
> > it without unloading the package.
> >
> > /Henrik
>
>
> >>
> >>
> >> I found a dangerous issue in the library spdep. I get variables x and y
> >> that cannot be removed by rm() and I don't know how they show up. Can
> >> anyone reproduce this?
> >>
> >> ~$ R --vanilla
> >> > rm(list=ls())
> >> > library(spdep)
> >> > x
> >> [1]   0  30  60  90 120 150 180 210 240 270 300 330 360 390 420 450
> >> > rm(list=ls())
> >> > x
> >> [1]   0  30  60  90 120 150 180 210 240 270 300 330 360 390 420 450
> >>
> >>
> >>
> >> > Sys.info()
> >>
> >> sysname    "Linux"
> >> release    "4.9.0-6-amd64"
> >> version    "#1 SMP Debian 4.9.88-1+deb9u1 (2018-05-07)"
> >> nodename   "freegnu"
> >> machine    "x86_64"
> >>
> >>
> >> > Session
> >>
> >>
> >> > sessionInfo()
> >>
> >> R version 3.4.1 (2017-06-30)
> >> Platform: x86_64-pc-linux-gnu (64-bit)
> >> Running under: Debian GNU/Linux 9 (stretch)
> >>
> >> Matrix products: default
> >> BLAS: /usr/local/lib/R/lib/libRblas.so
> >> LAPACK: /usr/local/lib/R/lib/libRlapack.so
> >>
> >> locale:
> >>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> >>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> >>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
> >>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> >>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >>
> >> attached base packages:
> >> [1] stats     graphics  grDevices utils     datasets  methods   base
> >>
> >> loaded via a namespace (and not attached):
> >> [1] compiler_3.4.1
> >>



Re: [R] [bug] spdep package?

2018-07-23 Thread Henrik Bengtsson
It turns out that 'x' comes from the spData package and lives
inside that package (part of its namespace).

> spData::x
 [1]   0  30  60  90 120 150 180 210 240 270 300 330 360 390 420 450

This is conceptually no different from other objects in a package
namespace, although we are more used to seeing functions and not data
objects.  Another well-known example of this is:

> base::pi
[1] 3.141593

So, this 'x' is *not* in your global workspace and you cannot remove
it without unloading the package.
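
The same behaviour can be seen with base::pi, without needing spdep.  A
minimal sketch, assuming a fresh session in which no global variable
named 'pi' has been created:

```r
# 'pi' is found via the search path, not in the global environment
where <- find("pi")
print(where)
in_global <- exists("pi", envir = globalenv(), inherits = FALSE)
print(in_global)

# rm() on the global environment therefore has nothing to remove;
# it signals a warning, which is captured here as a string
res <- tryCatch(
  rm(list = "pi", envir = globalenv()),
  warning = function(w) conditionMessage(w)
)
print(res)

# ... and 'pi' is still available afterwards
print(pi)
```

This is why rm(list = ls()) leaves 'x' in place: ls() lists only the
global environment, and 'x' lives in the attached spData package.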

/Henrik

On Mon, Jul 23, 2018 at 12:30 PM Jeremie Juste  wrote:
>
>
>
> Hello,
>
>
> I found a dangerous issue in the library spdep. I get variables x and y
> that cannot be removed by rm() and I don't know how they show up. Can
> anyone reproduce this?
>
> ~$ R --vanilla
> > rm(list=ls())
> > library(spdep)
> > x
> [1]   0  30  60  90 120 150 180 210 240 270 300 330 360 390 420 450
> > rm(list=ls())
> > x
> [1]   0  30  60  90 120 150 180 210 240 270 300 330 360 390 420 450
>
>
>
> > Sys.info()
>
> sysname    "Linux"
> release    "4.9.0-6-amd64"
> version    "#1 SMP Debian 4.9.88-1+deb9u1 (2018-05-07)"
> nodename   "freegnu"
> machine    "x86_64"
>
>
> > Session
>
>
> > sessionInfo()
>
> R version 3.4.1 (2017-06-30)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Debian GNU/Linux 9 (stretch)
>
> Matrix products: default
> BLAS: /usr/local/lib/R/lib/libRblas.so
> LAPACK: /usr/local/lib/R/lib/libRlapack.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_3.4.1
>



Re: [R] Package 'data.table' in version R-3.5.0 not successfully being installed

2018-04-26 Thread Henrik Bengtsson
If you're installing packages to the default location in your home
account and you didn't remove those library folders, you still have
your R 3.4 package installs there, e.g.

> dir(dirname(.libPaths()[1]), full.names = TRUE)
[1] "/home/hb/R/x86_64-pc-linux-gnu-library/3.4"
[2] "/home/hb/R/x86_64-pc-linux-gnu-library/3.5"
[3] "/home/hb/R/x86_64-pc-linux-gnu-library/3.6"

/Henrik

On Thu, Apr 26, 2018 at 11:41 AM, Akhilesh Singh
 wrote:
> You are right. I do take backups. But, this time I was too sure that
> nothing will go wrong. But, this was over-confidence. I need to take more
> care in future. Thanks anyway.
>
> With regards,
>
> Dr. A.K. Singh
>
> On Thu 26 Apr, 2018, 11:49 PM Duncan Murdoch, 
> wrote:
>
>> On 26/04/2018 1:54 PM, Akhilesh Singh wrote:
>> > My thanks to Dr. John Fox and Dr. Duncan Murdoch. But, I have upgraded
>> > all my R-3.4.3 libraries to R-3.5.0, and I have not backed-up copies of
>> > old version. So, I would give a try each to the solutions suggested by
>> > John Fox and Duncan Murdoch.
>>
>> Here is some unsolicited advice:  I would strongly recommend that you
>> make it a higher priority to have backups available.  In my experience
>> computer hardware is becoming quite reliable, but software isn't, and
>> the person next to the keyboard isn't either.  (My last desperate need
>> for a backup was due to a hardware failure 2 years ago, but it wasn't
>> the manufacturer's fault:  my laptop accidentally drowned.)
>>
>> Backups can save you a lot of grief in the event of a mistake, or a
>> software or hardware failure.  But even in the case of routine events
>> like software updates that don't go as planned, they can save time.
>>
>> Duncan Murdoch
>>
>>
>> >
>> > With regards,
>> >
>> > Dr. A.K. Singh
>> >
>> > On Thu 26 Apr, 2018, 9:44 PM Duncan Murdoch, > > > wrote:
>> >
>> > On 26/04/2018 10:33 AM, Fox, John wrote:
>> >  > Dear A.K. Singh,
>> >  >
>> >  > As you discovered, the data.table package has an error under R
>> > 3.5.0 that prevents CRAN from distributing a Windows binary for the
>> > package. The reason that you weren't able to install the package
>> > from source is apparently that you haven't installed the R
>> > package-building tools for Windows. See
>> > .
>> >  >
>> >  > Because a number of users of my Rcmdr and car packages have
>> > contacted me with a similar issue, as a temporary work-around I've
>> > placed a Windows binary for the data.table package on my website at
>> > <https://socialsciences.mcmaster.ca/jfox/.Pickup/data.table_1.10.4-3.zip>.
>> > You should be able to install the package from there via the command
>> >  >
>> >  >
>> >   install.packages("https://socialsciences.mcmaster.ca/jfox/.Pickup/data.table_1.10.4-3.zip",
>> >     repos=NULL, type="win.binary")
>> >  >
>> >  > I expect that this problem will go away when the maintainer of
>> > the data.table package fixes the error.
>> >
>> > You can see the errors in the package on this web page:
>> >
>> > https://cloud.r-project.org/web/checks/check_results_data.table.html
>> >
>> > Currently it is failing self-tests on all platforms except r-oldrel,
>> > which is the previous release of R.  I'd recommend backing out of R
>> > 3.5.0 and going to R 3.4.4 if that's a possibility for you.
>> >
>> > Yet another possibility is to use a version of data.table from
>> Github,
>> > which is newer than the version on CRAN and may have fixed the
>> errors,
>> > but that would require an installation from source, which not every
>> > Windows user is comfortable with.
>> >
>> > Duncan Murdoch
>> >
>> >

Re: [R] Information about compatibility R

2018-04-11 Thread Henrik Bengtsson
I guess I misread your question. /Henrik

On Wed, Apr 11, 2018 at 12:15 PM, Marcos Fiorini
 wrote:
> Hi Albrecht, thanks for your return.
>
> I'm  using the SUSE Linux Enterprise Server 11 (x86_64) PATCHLEVEL = 4 and 
> not the openSUSE. Do you know if exist a compatibility list of R and Linux 
> version? In the site
> https://cran.r-project.org is informed only about the R version 3.4.4 for 
> SLE12 but not about the previous versions. I can't upgrade for SLE12 and I 
> would like know what is the last version that I can use of R in my SUSE 
> version.
>
> Best regards.
>
> Marcos Fiorini
>
>



Re: [R] Information about compatibility R

2018-04-11 Thread Henrik Bengtsson
Hi,

your question is nearly impossible to answer without actually testing
your existing pipelines.  But from my nearly 20 years of using R, it's
clear to me that the R Core Team strives to keep things backward
compatible as far as they can.  The NEWS file:

  https://cran.r-project.org/doc/manuals/r-devel/NEWS.html

contains useful information on what's been updated in each release -
you can inspect that one from your own perspective.  There are also
12,000 packages on CRAN that are tested on the old-release, the current
release, and the development version of R.  Many of these packages
rarely get updated, which indicates that they work across a large
number of R versions (and platforms).

FYI, R 3.5.0 is being released on April 23, 2018.  You should aim to
install that version.  Also, on Linux it's not too hard to install
multiple R versions in parallel when you install from source.  Then
you can go back and forth between versions whenever you'd like.

/Henrik



On Wed, Apr 11, 2018 at 8:31 AM, Marcos Fiorini  wrote:
> Hi everyone.
>
> I need an information about compatibility of R and Suse Linux. I have 
> installed R 3.2.1 in Suse Linux 11 SP4 and I would like know if it’s 
> compatible the upgrade for the new version 3.4.4 in my Suse release? This 
> information’s about version R and linux version have in site or other place?
>
>
>
> Thanks.
>
>
>
> Marcos Fiorini
>
>



Re: [R] Possible Improvement to sapply

2018-03-13 Thread Henrik Bengtsson
FYI, in R devel (to become 3.5.0), there's isFALSE() which will cut
some corners compared to identical():

> microbenchmark::microbenchmark(identical(FALSE, FALSE), isFALSE(FALSE))
Unit: nanoseconds
                    expr min   lq    mean median     uq   max neval
 identical(FALSE, FALSE) 984 1138 1694.13 1218.0 1337.5 13584   100
          isFALSE(FALSE) 713  761 1133.53  809.5  871.5 18619   100

> microbenchmark::microbenchmark(identical(TRUE, FALSE), isFALSE(TRUE))
Unit: nanoseconds
                   expr  min     lq    mean median   uq   max neval
 identical(TRUE, FALSE) 1009 1103.5 2228.20 1170.5 1357 14346   100
          isFALSE(TRUE)  718  760.0 1298.98  798.0  898 17782   100

> microbenchmark::microbenchmark(identical("array", FALSE), isFALSE("array"))
Unit: nanoseconds
                      expr min     lq    mean median     uq  max neval
 identical("array", FALSE) 975 1058.5 1257.95 1119.5 1250.0 9299   100
          isFALSE("array") 409  433.5  658.76  446.0  476.5 9383   100

That could probably also be used in sapply().  The difference is that
isFALSE() is a bit more liberal than identical(x, FALSE), e.g.

> isFALSE(c(a = FALSE))
[1] TRUE
> identical(c(a = FALSE), FALSE)
[1] FALSE
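
As a side note on the proposal quoted below to test 'simplify' directly:
sapply() also accepts simplify = "array", so the test has to cope with a
non-logical value.  A minimal sketch of the three variants, with 'answer'
standing in for the lapply() result (isFALSE() requires R >= 3.5.0):

```r
simplify <- "array"          # a valid value for sapply(simplify = )
answer   <- list(1L, 2L)     # stand-in for the lapply() result

# Current test in sapply(): anything but FALSE takes the simplify branch
current <- !identical(simplify, FALSE) && length(answer)

# Same decision using isFALSE()
with_isFALSE <- !isFALSE(simplify) && length(answer)

# Using 'simplify' directly fails, because "array" is not a logical value
direct <- tryCatch(
  simplify && length(answer),
  error = function(e) "error"
)

print(c(current = current, with_isFALSE = with_isFALSE))
print(direct)
```

So identical() or isFALSE() is what lets the string value through; a
plain 'simplify && ...' would only work for TRUE/FALSE input.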

Assuming the latter is not an issue, there are 69 places in base R
where isFALSE() could be used:

$ grep -E "identical[(][^,]+,[ ]*FALSE[)]" -r --include="*.R" | grep -F "/R/" | wc
     69     326    5472

and another 59 where isTRUE() can be used:

$ grep -E "identical[(][^,]+,[ ]*TRUE[)]" -r --include="*.R" | grep -F "/R/" | wc
     59     307    5021

/Henrik

On Tue, Mar 13, 2018 at 9:21 AM, Doran, Harold  wrote:
> Quite possibly, and I’ll look into that. Aside from the work I was doing, 
> however, I wonder if there is a way such that sapply could avoid the overhead 
> of having to call the identical function to determine the conditional path.
>
>
>
> From: William Dunlap [mailto:wdun...@tibco.com]
> Sent: Tuesday, March 13, 2018 12:14 PM
> To: Doran, Harold 
> Cc: Martin Morgan ; r-help@r-project.org
> Subject: Re: [R] Possible Improvement to sapply
>
> Could your code use vapply instead of sapply?  vapply forces you to declare 
> the type and dimensions
> of FUN's output and stops if any call to FUN does not match the declaration.  
> It can use much less
> memory and time than sapply because it fills in the output array as it goes 
> instead of calling lapply()
> and seeing how it could be simplified.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Tue, Mar 13, 2018 at 7:06 AM, Doran, Harold 
> > wrote:
> Martin
>
> In terms of context of the actual problem, sapply is called millions of times 
> because the work involves scoring individual students who took a test. A 
> score for student A is generated and then student B and such and there are 
> millions of students. The psychometric process of scoring students is complex 
> and our code makes use of sapply many times for each student.
>
> The toy example used length just to illustrate, our actual code doesn't do 
> that. But your point is well taken, there may be a very good counterexample 
> why my proposal doesn't achieve the goal is a generalizable way.
>
>
>
> -Original Message-
> From: Martin Morgan 
> [mailto:martin.mor...@roswellpark.org]
> Sent: Tuesday, March 13, 2018 9:43 AM
> To: Doran, Harold >; 
> 'r-help@r-project.org' 
> >
> Subject: Re: [R] Possible Improvement to sapply
>
>
>
> On 03/13/2018 09:23 AM, Doran, Harold wrote:
>> While working with sapply, the documentation states that the simplify
>> argument will yield a vector, matrix etc "when possible". I was
>> curious how the code actually defined "as possible" and see this
>> within the function
>>
>> if (!identical(simplify, FALSE) && length(answer))
>>
>> This seems superfluous to me, in particular this part:
>>
>> !identical(simplify, FALSE)
>>
>> The preceding code could be reduced to
>>
>> if (simplify && length(answer))
>>
>> and it would not need to execute the call to identical in order to trigger 
>> the conditional execution, which is known from the user's simplify = TRUE or 
>> FALSE inputs. I *think* the extra call to identical is just unnecessary 
>> overhead in this instance.
>>
>> Take for example, the following toy example code and benchmark results and a 
>> small modification to sapply:
>>
>> myList <- list(a = rnorm(100), b = rnorm(100))
>>
>> answer <- lapply(X = myList, FUN = length)
>> simplify = TRUE
>>
>> library(microbenchmark)
>>
>> mySapply <- function (X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE){
>>   FUN <- match.fun(FUN)
>>  answer <- lapply(X = X, FUN = FUN, ...)
>>  if (USE.NAMES && is.character(X) && is.null(names(answer)))
>>  

Re: [R] Random Seed Location

2018-03-04 Thread Henrik Bengtsson
On Sun, Mar 4, 2018 at 3:23 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
> On 04/03/2018 5:54 PM, Henrik Bengtsson wrote:
>>
>> The following helps identify when .GlobalEnv$.Random.seed has changed:
>>
>> rng_tracker <- local({
>>last <- .GlobalEnv$.Random.seed
>>function(...) {
>>  curr <- .GlobalEnv$.Random.seed
>>  if (!identical(curr, last)) {
>>warning(".Random.seed changed")
>>last <<- curr
>>  }
>>  TRUE
>>}
>> })
>>
>> addTaskCallback(rng_tracker, name = "RNG tracker")
>>
>>
>> EXAMPLE:
>>
>>> sample.int(1L)
>>
>> [1] 1
>> Warning: .Random.seed changed
>>
>> This will help you find for instance:
>>
>> ## Loading ggplot2 does not affect the RNG seed
>>>
>>> loadNamespace("ggplot2")
>>
>> 
>>
>> ## But attaching it does
>>>
>>> library("ggplot2")
>>
>> Warning: .Random.seed changed
>>
>> which reveals:
>>
>>> ggplot2:::.onAttach
>>
>> function (...)
>> {
>>  if (!interactive() || stats::runif(1) > 0.1)
>>  return()
>>  tips <- c("Need help? Try the ggplot2 mailing list:
>> http://groups.google.com/group/ggplot2.",
>>  "Find out what's changed in ggplot2 at
>> http://github.com/tidyverse/ggplot2/releases.",
>>  "Use suppressPackageStartupMessages() to eliminate package
>> startup messages.",
>>  "Stackoverflow is a great place to get help:
>> http://stackoverflow.com/tags/ggplot2.",
>>  "Need help getting started? Try the cookbook for R:
>> http://www.cookbook-r.com/Graphs/",
>>  "Want to understand how all the pieces fit together? Buy the
>> ggplot2 book: http://ggplot2.org/book/")
>>  tip <- sample(tips, 1)
>>  packageStartupMessage(paste(strwrap(tip), collapse = "\n"))
>> }
>> 
>>
>> There are probably many case of this in different R packages.
>>
>>
>> R WISH:
>>
>> There could be a
>>
>> preserveRandomSeed({
>>tip <- sample(tips, 1)
>> })
>>
>> function in R for these type of random needs where true random
>> properties are non-critical.  This type of
>> "draw-a-random-number-and-reset-the-seed" is for instance used in
>> parallel:::initDefaultClusterOptions() which is called when the
>> 'parallel' package is loaded:
>>
>> seed <- .GlobalEnv$.Random.seed
>> ran1 <- sample.int(.Machine$integer.max - 1L, 1L) / .Machine$integer.max
>> port <- 11000 + 1000 * ((ran1 + unclass(Sys.time()) / 300) %% 1)
>> if(is.null(seed)) ## there was none, initially
>> rm( ".Random.seed", envir = .GlobalEnv, inherits = FALSE)
>> else # reset
>> assign(".Random.seed", seed, envir = .GlobalEnv, inherits = FALSE)
>
>
> An issue is that .Random.seed doesn't contain the full state of the RNG
> system, so restoring it doesn't necessarily lead to an identical sequence of
> output.  The only way to guarantee the sequence will repeat is to call
> set.seed(n), and that only leads to a tiny fraction of possible states.
>
> Expanding .Random.seed so that it does contain the full state would be a
> good idea, and that would make your preserveRandomSeed really easy to write.
>
> Here's a demo that .Random.seed is not enough:
>
>> set.seed(123, normal.kind = "Box-Muller")
>> rnorm(1)
> [1] -0.1613431
>> save <- .Random.seed
>> rnorm(1)
> [1] 0.6706031
>> .Random.seed <- save
>> rnorm(1)
> [1] -0.4194403
>
> If .Random.seed were the complete state, the 2nd and 3rd rnorm results would
> be the same.

Ah... good point - I forgot about that "oddity", which is documented
in help(".Random.seed"):

".Random.seed saves the seed set for the uniform random-number
generator, at least for the system generators. It does not necessarily
save the state of other generators, and in particular does not save
the state of the Box–Muller normal generator. If you want to reproduce
work later, call set.seed (preferably with explicit values for kind
and normal.kind) rather than set .Random.seed."

So, this is only the case for some of the RNG kinds.  Is the reason for
this limitation that it is not possible for R (not even the R internals)
to get hold of the state of some of the generators?  In other words, is
it unlikely to ever be fixed?

Since there is no bijective function to infer `x` such that
`set.seed(x)` resets

Re: [R] Random Seed Location

2018-03-04 Thread Henrik Bengtsson
On Sun, Mar 4, 2018 at 10:18 AM, Paul Gilbert  wrote:
> On Mon, Feb 26, 2018 at 3:25 PM, Gary Black 
> wrote:
>
> (Sorry to be a bit slow responding.)
>
> You have not supplied a complete example, which would be good in this case
> because what you are suggesting could be a serious bug in R or a package.
> Serious journals require reproducibility these days. For example, JSS is
> very clear on this point.
>
> To your question
>> My question simply is:  should the location of the set.seed command
>> matter,
>> provided that it is applied before any commands which involve randomness
>> (such as partitioning)?
>
> the answer is no, it should not matter. But the proviso is important.
>
> You can determine where things are messing up using something like
>
>  set.seed(654321)
>  zk <- RNGkind()# [1] "Mersenne-Twister" "Inversion"
>  zk
>  z <- runif(2)
>  z
>  set.seed(654321)
>
>  #  install.packages(c('caret', 'ggplot2', 'e1071'))
>  library(caret)
>  all(runif(2)  == z)   # should be true but it is not always
>
>  set.seed(654321)
>  library(ggplot2)
>  all(runif(2)  == z)   # should be true
>
>  set.seed(654321)
>  library(e1071)
>  all(runif(2)  == z)   # should be true
>
>  all(RNGkind() == zk)  # should be true
>
> On my computer package caret seems to sometimes, but not always, do
> something that advances or changes the RNG. So you will need to set the seed
> after that package is loaded if you want reproducibility.
>
> As Bill Dunlap points out, parallel can introduce much more complicated
> issues. If you are in fact using parallel then we really need a new thread
> with a better subject line, and the discussion will get much messier.
>
> The short answer is that, yes you should be able to get reproducible results
> with parallel computing. If you cannot then you are almost certainly doing
> something wrong. To publish you really must have reproducible results.
>
> In the example that Bill gave, I think the problem is that set.seed() only
> resets the seed in the main thread, the nodes continue to operate with
> unreset RNG. To demonstrate this to yourself you can do
>
>  library(parallel)
>  cl <- parallel::makeCluster(3)
>  parallel::clusterCall(cl, function()set.seed(100))
>  parallel::clusterCall(cl, function()RNGkind())
>  parallel::clusterCall(cl, function()runif(2)) # similar result from all
> nodes
># [1] 0.3077661 0.2576725
>
> However, do *NOT* do that in real work. You will be getting the same RNG
> stream from each node. If you are using random numbers and parallel you need
> to read a lot more, and probably consider a variant of the "L'Ecuyer"
> generator or something designed for parallel computing.
>
> One special point I will mention because it does not seem to be widely
> appreciated: the number of nodes affects the random stream, so recording the
> number of compute nodes along with the RNG and seed information is important
> for reproducible results. This has the unfortunate consequence that an
> experiment cannot be reproduced on a larger cluster. (If anyone knows
> differently I would very much like to hear.)

[Disclaimer: I'm the author] future.apply::future_lapply(X, ...,
future.seed) etc. can produce identical RNG results regardless of how
'X' is chunked up.  For example,

library(future.apply)

task <- function(i) {
  c(i = i, random = sample.int(10, size = 1), pid = Sys.getpid())
}

y <- list()

plan(multiprocess, workers = 1L)
y[[1]] <- future_sapply(1:10, FUN = task, future.seed = 42)

plan(multiprocess, workers = 2L)
y[[2]] <- future_sapply(1:10, FUN = task, future.seed = 42)

plan(multiprocess, workers = 3L)
y[[3]] <- future_sapply(1:10, FUN = task, future.seed = 42)

gives the exact same random output:

> y

[[1]]
        [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
i          1     2     3     4     5     6     7     8     9    10
random     5    10     1     8     7     9     3     5    10     4
pid    31933 31933 31933 31933 31933 31933 31933 31933 31933 31933

[[2]]
        [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
i          1     2     3     4     5     6     7     8     9    10
random     5    10     1     8     7     9     3     5    10     4
pid    32141 32141 32141 32141 32141 32142 32142 32142 32142 32142

[[3]]
        [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
i          1     2     3     4     5     6     7     8     9    10
random     5    10     1     8     7     9     3     5    10     4
pid    32199 32199 32199 32200 32200 32200 32200 32201 32201 32201

To base the RNG on the current RNG seed (== .GlobalEnv$.Random.seed),
one can use 'future.seed = TRUE'.  For performance reasons, I chose
the default to be 'future.seed = FALSE', because there can be a
substantial overhead in setting up reproducible L'Ecuyer
subRNG-streams for all elements in 'X'.
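
The idea of one RNG stream per element can be sketched with base R's
'parallel' package alone.  This is only an illustration of the principle
under simplified assumptions (everything runs sequentially and the
"chunks" are just different call orders); it is not a description of how
future.apply is implemented.  Note that it switches the session's RNG
kind to "L'Ecuyer-CMRG".

```r
RNGkind("L'Ecuyer-CMRG")   # changes the RNG kind for this session
set.seed(42)

# Pre-generate one independent L'Ecuyer-CMRG stream per element
n <- 10L
seeds <- vector("list", n)
s <- .Random.seed
for (i in seq_len(n)) {
  seeds[[i]] <- s
  s <- parallel::nextRNGStream(s)
}

# Each task installs the stream belonging to its element before drawing
task <- function(i) {
  assign(".Random.seed", seeds[[i]], envir = globalenv())
  sample.int(10, size = 1)
}

# Process the elements as one chunk, as two chunks, and in reverse order
y1 <- vapply(1:10, task, integer(1))
y2 <- c(vapply(1:5, task, integer(1)), vapply(6:10, task, integer(1)))
y3 <- rev(vapply(10:1, task, integer(1)))

print(identical(y1, y2))
print(identical(y1, y3))
```

Because the random numbers for element 'i' depend only on seeds[[i]], it
does not matter which worker processes it or in which order.  Creating
the streams up front is the overhead mentioned above.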

I think the snowFT package by Sevcikova & Rossini also 

Re: [R] Random Seed Location

2018-03-04 Thread Henrik Bengtsson
The following helps identify when .GlobalEnv$.Random.seed has changed:

rng_tracker <- local({
  last <- .GlobalEnv$.Random.seed
  function(...) {
    curr <- .GlobalEnv$.Random.seed
    if (!identical(curr, last)) {
      warning(".Random.seed changed")
      last <<- curr
    }
    TRUE
  }
})

addTaskCallback(rng_tracker, name = "RNG tracker")


EXAMPLE:

> sample.int(1L)
[1] 1
Warning: .Random.seed changed

This will help you find for instance:

## Loading ggplot2 does not affect the RNG seed
> loadNamespace("ggplot2")


## But attaching it does
> library("ggplot2")
Warning: .Random.seed changed

which reveals:

> ggplot2:::.onAttach
function (...)
{
    if (!interactive() || stats::runif(1) > 0.1)
        return()
    tips <- c("Need help? Try the ggplot2 mailing list: http://groups.google.com/group/ggplot2.",
        "Find out what's changed in ggplot2 at http://github.com/tidyverse/ggplot2/releases.",
        "Use suppressPackageStartupMessages() to eliminate package startup messages.",
        "Stackoverflow is a great place to get help: http://stackoverflow.com/tags/ggplot2.",
        "Need help getting started? Try the cookbook for R: http://www.cookbook-r.com/Graphs/",
        "Want to understand how all the pieces fit together? Buy the ggplot2 book: http://ggplot2.org/book/")
    tip <- sample(tips, 1)
    packageStartupMessage(paste(strwrap(tip), collapse = "\n"))
}


There are probably many cases of this in different R packages.
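
The task callback only fires at the interactive top level.  For use in
scripts, the same comparison can be wrapped around a single expression.
This is just a sketch; the name changes_rng() is made up for this
example:

```r
# Hypothetical helper: TRUE if evaluating 'expr' changes .Random.seed
changes_rng <- function(expr) {
  genv <- globalenv()
  before <- genv$.Random.seed   # NULL if no seed exists yet
  force(expr)                   # evaluate the expression exactly once
  after <- genv$.Random.seed
  !identical(before, after)
}

changes_rng(runif(1))   # draws a random number, so the seed moves
changes_rng(1 + 1)      # no random numbers involved
```

For example, changes_rng(library("somepkg")) can be used to check
whether attaching a package touches the RNG.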


R WISH:

There could be a

preserveRandomSeed({
  tip <- sample(tips, 1)
})

function in R for these types of random needs where true random
properties are non-critical.  This type of
"draw-a-random-number-and-reset-the-seed" is for instance used in
parallel:::initDefaultClusterOptions() which is called when the
'parallel' package is loaded:

seed <- .GlobalEnv$.Random.seed
ran1 <- sample.int(.Machine$integer.max - 1L, 1L) / .Machine$integer.max
port <- 11000 + 1000 * ((ran1 + unclass(Sys.time()) / 300) %% 1)
if(is.null(seed)) ## there was none, initially
rm( ".Random.seed", envir = .GlobalEnv, inherits = FALSE)
else # reset
assign(".Random.seed", seed, envir = .GlobalEnv, inherits = FALSE)
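
A minimal sketch of what such a function could look like, following
the same save-and-restore pattern.  This is only my assumption of a
possible implementation; no such function exists in base R:

```r
# Hypothetical: evaluate 'expr', then put .Random.seed back the way it was
preserveRandomSeed <- function(expr) {
  genv <- globalenv()
  old_seed <- genv$.Random.seed   # NULL when no seed exists yet
  on.exit({
    if (is.null(old_seed)) {
      # There was none initially, so remove whatever 'expr' created
      if (exists(".Random.seed", envir = genv, inherits = FALSE)) {
        rm(list = ".Random.seed", envir = genv)
      }
    } else {
      assign(".Random.seed", old_seed, envir = genv)
    }
  })
  expr   # lazy evaluation: the expression runs here, in the caller's frame
}

tips <- c("tip A", "tip B", "tip C")
set.seed(42)
before <- globalenv()$.Random.seed
preserveRandomSeed({
  tip <- sample(tips, 1)
})
identical(before, globalenv()$.Random.seed)
```

Because 'expr' is evaluated in the calling environment, the assignment
to 'tip' is visible to the caller afterwards.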

/Henrik

On Sun, Mar 4, 2018 at 1:40 PM, Gary Black  wrote:
> Thank you, everybody, who replied!  I appreciate your valuable advice!  I 
> will move the location of the set.seed() command to after all packages have 
> been installed and loaded.
>
> Best regards,
> Gary
>
> Sent from my iPad
>
>> On Mar 4, 2018, at 12:18 PM, Paul Gilbert  wrote:
>>
>> On Mon, Feb 26, 2018 at 3:25 PM, Gary Black 
>> wrote:
>>
>> (Sorry to be a bit slow responding.)
>>
>> You have not supplied a complete example, which would be good in this case 
>> because what you are suggesting could be a serious bug in R or a package. 
>> Serious journals require reproducibility these days. For example, JSS is 
>> very clear on this point.
>>
>> To your question
>> > My question simply is:  should the location of the set.seed command matter,
>> > provided that it is applied before any commands which involve randomness
>> > (such as partitioning)?
>>
>> the answer is no, it should not matter. But the proviso is important.
>>
>> You can determine where things are messing up using something like
>>
>> set.seed(654321)
>> zk <- RNGkind()# [1] "Mersenne-Twister" "Inversion"
>> zk
>> z <- runif(2)
>> z
>> set.seed(654321)
>>
>> #  install.packages(c('caret', 'ggplot2', 'e1071'))
>> library(caret)
>> all(runif(2)  == z)   # should be true but it is not always
>>
>> set.seed(654321)
>> library(ggplot2)
>> all(runif(2)  == z)   # should be true
>>
>> set.seed(654321)
>> library(e1071)
>> all(runif(2)  == z)   # should be true
>>
>> all(RNGkind() == zk)  # should be true
>>
>> On my computer package caret seems to sometimes, but not always, do 
>> something that advances or changes the RNG. So you will need to set the seed 
>> after that package is loaded if you want reproducibility.
>>
>> As Bill Dunlap points out, parallel can introduce much more complicated 
>> issues. If you are in fact using parallel then we really need a new thread 
>> with a better subject line, and the discussion will get much messier.
>>
>> The short answer is that, yes you should be able to get reproducible results 
>> with parallel computing. If you cannot then you are almost certainly doing 
>> something wrong. To publish you really must have reproducible results.
>>
>> In the example that Bill gave, I think the problem is that set.seed() only 
>> resets the seed in the main thread, the nodes continue to operate with 
>> unreset RNG. To demonstrate this to yourself you can do
>>
>> library(parallel)
>> cl <- parallel::makeCluster(3)
>> parallel::clusterCall(cl, function()set.seed(100))
>> parallel::clusterCall(cl, function()RNGkind())
>> parallel::clusterCall(cl, function()runif(2)) # similar result from all nodes
>>   

Re: [R] change location of temporary files

2018-02-23 Thread Henrik Bengtsson
On Fri, Feb 23, 2018 at 1:55 PM, William Dunlap via R-help
 wrote:
> Does setting the environment variable TMPDIR, before starting R,
> to a directory on a bigger file system help?  On Linux I get
>
>   % mkdir /tmp/RTMP-BILL
>   % env TMPDIR=/tmp/RTMP-BILL R --quiet --vanilla
>   > tempdir()
>   [1] "/tmp/RTMP-BILL/Rtmppgowz4"
>   > tempfile()
>   [1] "/tmp/RTMP-BILL/Rtmppgowz4/file7ce36ec5cb1e"
>
> I don't know if there is an R-specific environment variable or startup
> flag for this.

Yes, TMPDIR needs to be set *prior* to launching R in order for R to
acknowledge it.  Although most environment variables can be set in
platform-independent Renviron files which are processed early during
the R startup, TMPDIR is one of the exceptions - it is simply too late
to set it there because R needs it very early on.  So, it needs
to be set as in Bill's example above, or similarly:

export TMPDIR=/tmp/RTMP-BILL
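
The effect can be checked from within R.  The sketch below assumes a
Unix-like system with Rscript available; the folder name 'alt-tmp' is
made up.  It shows that changing TMPDIR in a running session has no
effect on tempdir(), whereas an R process launched afterwards picks it
up:

```r
old_tmpdir <- Sys.getenv("TMPDIR", unset = NA)

# Hypothetical alternative location; here just a folder we know exists
new_root <- file.path(tempdir(), "alt-tmp")
dir.create(new_root, showWarnings = FALSE)

before <- tempdir()
Sys.setenv(TMPDIR = new_root)
after <- tempdir()
identical(before, after)   # the running session keeps its temp directory

# A freshly started R process inherits TMPDIR and uses it
rscript <- file.path(R.home("bin"), "Rscript")
child_tmp <- system2(rscript,
                     c("--vanilla", "-e", shQuote("cat(tempdir())")),
                     stdout = TRUE)
child_tmp

# Undo the change to the environment variable
if (is.na(old_tmpdir)) Sys.unsetenv("TMPDIR") else Sys.setenv(TMPDIR = old_tmpdir)
```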

/Henrik

>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Fri, Feb 23, 2018 at 10:52 AM, Kumar Mainali  wrote:
>
>> I would like to change where R stores the temporary files to my external
>> hard drive in my iMac. This is important because the temporary files R
>> creates are huge and I do not have enough available space in my internal
>> HD. Again, this is for macOS.
>>
>> This change has to be temporary. Later I need to use the usual location for
>> temp files in the internal HD.
>>
>> Thanks in advance,
>> Kumar Mainali
>>
>> --
>> Postdoctoral Associate
>> Department of Biology
>> University of Maryland, College Park
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/
>> posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Take the maximum of every 12 columns

2018-02-20 Thread Henrik Bengtsson
It looks like OP uses a data.frame, so in order to use matrixStats
(I'm the author) one would have to pay the price to coerce to a matrix
before using matrixStats::rowMaxs().  However, if the
original data could equally well live in a matrix, then matrixStats
should be computationally efficient for this task.  (I've seen cases
where an original matrix was turned into a data.frame just because
that is what is commonly used elsewhere and because the user may not
pay attention to the differences between matrices and data.frame.)

If the original data would be a matrix 'X', then one can do the
following with matrixStats:

Y <- sapply(seq(from = 0, to = 2880 - 12, by = 12), FUN = function(offset) {
   rowMaxs(X, cols = offset + 1:12)
})

which avoids internal temporary copies required when using regular
subsetting, e.g.

Y <- sapply(seq(from = 0, to = 2880 - 12, by = 12), FUN = function(offset) {
   rowMaxs(X[, offset + 1:12])
})

Subsetting data frames by columns is already efficient, so the same
argument does not apply there.
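
For the data frame case, a base R sketch along the lines of the pmax()
suggestion elsewhere in this thread could look like the following.  The
function name group_row_max() is made up for this example, and it
assumes that the number of columns is a multiple of 'by':

```r
# Hypothetical helper: row-wise maximum over consecutive blocks of 'by' columns
group_row_max <- function(df, by = 12L) {
  stopifnot(is.data.frame(df), ncol(df) %% by == 0L)
  starts <- seq(from = 1L, to = ncol(df), by = by)
  sapply(starts, function(s) {
    # Selecting columns of a data frame does not copy the column data
    cols <- as.list(df[s:(s + by - 1L)])
    # pmax() takes one vector per argument; drop names to avoid argument clashes
    do.call(pmax, unname(cols))
  })
}

# Small example: 3 rows, 4 columns, blocks of 2 columns
df <- as.data.frame(matrix(1:12, nrow = 3))
y <- group_row_max(df, by = 2L)
y
```

With 2880 monthly columns one would call group_row_max(df, by = 12L) to
get one column per year.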

/Henrik

On Tue, Feb 20, 2018 at 10:00 AM, Ista Zahn  wrote:
> On Tue, Feb 20, 2018 at 11:58 AM, Bert Gunter 
> wrote:
>
>> Ista, et. al: efficiency?
>> (Note: I needed to correct my previous post: do.call() is required for
>> pmax() over the data frame)
>>
>> > x <- data.frame(matrix(runif(12e6), ncol=12))
>>
>> > system.time(r1 <- do.call(pmax,x))
>>user  system elapsed
>>   0.049   0.000   0.049
>>
>> > identical(r1,r2)
>> [1] FALSE
>> > system.time(r2 <- apply(x,1,max))
>>user  system elapsed
>>   2.162   0.045   2.207
>>
>> ## 150 times slower!
>>
>> > identical(r1,r2)
>> [1] TRUE
>>
>> pmax() is there for a reason.
>> Or is there something I am missing?
>>
>
>
> Personal preference I think. I prefer the consistency of apply. If speed
> is an issue matrixStats is both consistent and fast:
>
> library(matrixStats)
> x <- matrix(runif(12e6), ncol=12)
>
> system.time(r1 <- do.call(pmax,as.data.frame(x)))
>   ##  user  system elapsed
>   ## 0.109   0.000   0.109
> system.time(r2 <- apply(x,1,max))
>   ##  user  system elapsed
>   ## 1.292   0.024   1.321
> system.time(r3 <- rowMaxs(x))
>   ##  user  system elapsed
>   ## 0.044   0.000   0.044
>
> pmax is a fine alternative for max special case.
>
> Best,
> Ista
>
>
>
>>
>> Cheers,
>> Bert
>>
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along and
>> sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>> On Tue, Feb 20, 2018 at 8:16 AM, Miluji Sb  wrote:
>>
>>> This is what I was looking for. Thank you everyone!
>>>
>>> Sincerely,
>>>
>>> Milu
>>>
>>>
>>>
>>> On Tue, Feb 20, 2018 at 5:10 PM, Ista Zahn  wrote:
>>>
>>>> Hi Milu,
>>>>
>>>> byapply(df, 12, function(x) apply(x, 1, max))
>>>>
>>>> You might also be interested in the matrixStats package.
>>>>
>>>> Best,
>>>> Ista
>>>>
>>>> On Tue, Feb 20, 2018 at 9:55 AM, Miluji Sb  wrote:
>>>> > Dear all,
>>>> >
>>>> > I have monthly data in wide format, I am only providing data (at the bottom
>>>> > of the email) for the first 24 columns but I have 2880 columns in total.
>>>> >
>>>> > I would like to take max of every 12 columns. I have taken the mean of
>>>> > every 12 columns with the following code:
>>>> >
>>>> > byapply <- function(x, by, fun, ...)
>>>> > {
>>>> >   # Create index list
>>>> >   if (length(by) == 1)
>>>> >   {
>>>> >     nc <- ncol(x)
>>>> >     split.index <- rep(1:ceiling(nc / by), each = by, length.out = nc)
>>>> >   } else # 'by' is a vector of groups
>>>> >   {
>>>> >     nc <- length(by)
>>>> >     split.index <- by
>>>> >   }
>>>> >   index.list <- split(seq(from = 1, to = nc), split.index)
>>>> >
>>>> >   # Pass index list to fun using sapply() and return object
>>>> >   sapply(index.list, function(i)
>>>> >   {
>>>> >     do.call(fun, list(x[, i], ...))
>>>> >   })
>>>> > }
>>>> >
>>>> > ## Compute annual means
>>>> > y <- byapply(df, 12, rowMeans)
>>>> >
>>>> > How can I switch rowMeans with a command that takes the maximum? I am a bit
>>>> > baffled. Any help will be appreciated. Thank you.
>>>> >
>>>> > Sincerely,
>>>> >
>>>> > Milu
>>>> >
>>>> > ###
>>>> > dput(droplevels(head(x, 5)))
>>>> > structure(list(X0 = c(295.812103271484, 297.672424316406, 299.006805419922,
>>>> > 297.631500244141, 298.372741699219), X1 = c(295.361328125, 297.345092773438,
>>>> > 298.067504882812, 297.285339355469, 298.275268554688), X2 = c(294.279602050781,
>>>> > 296.401550292969, 296.777984619141, 296.089111328125, 

Re: [R] PSOCK cluster and renice

2018-02-11 Thread Henrik Bengtsson
As a follow-up, future 1.7.0 was just released on CRAN, allowing you
to specify 'renice' as expected.  Example (skip 'dryrun = TRUE' for
actual usage):

> cl <- future::makeClusterPSOCK(2L, renice = 19, dryrun = TRUE)

--
Manually start worker #1 on 'localhost' with:
  nice --adjustment=19 '/usr/lib/R/bin/Rscript'
--default-packages=datasets,utils,grDevices,graphics,stats,methods -e
'parallel:::.slaveRSOCK()' MASTER=localhost PORT=11414 OUT=/dev/null
TIMEOUT=2592000 XDR=TRUE
--
Manually start worker #2 on 'localhost' with:
  nice --adjustment=19 '/usr/lib/R/bin/Rscript'
--default-packages=datasets,utils,grDevices,graphics,stats,methods -e
'parallel:::.slaveRSOCK()' MASTER=localhost PORT=11414 OUT=/dev/null
TIMEOUT=2592000 XDR=TRUE

/Henrik

On Sun, Dec 3, 2017 at 9:06 PM, Andreas Leha
<andreas.l...@med.uni-goettingen.de> wrote:
> Hi Henrik,
>
> Thanks for the detailed and fast reply!
>
> My guess would be that the confusion comes from the different use of nice and 
> renice.
>
> The workaround you provided works fine!  Thanks a lot.
>
> Best,
> Andreas
>
>
>
> Henrik Bengtsson <henrik.bengts...@gmail.com> writes:
>
>> Looks like a bug to me due to wrong assumptions about 'nice'
>> arguments, but could be because a "non-standard" 'nice' is used.  If
>> we do:
>>
>>> trace(system, tracer = quote(print(command)))
>> Tracing function "system" in package "base"
>>
>> we see that the system call used is:
>>
>>> cl <- parallel::makePSOCKcluster(2L, renice = 19)
>> Tracing system(cmd, wait = FALSE) on entry
>> [1] "nice +19 '/usr/lib/R/bin/Rscript'
>> --default-packages=datasets,utils,grDevices,graphics,stats,methods -e
>> 'parallel:::.slaveRSOCK()' MASTER=localhost PORT=11146 OUT=/dev/null
>> TIMEOUT=2592000 XDR=TRUE"
>> nice: ‘+19’: No such file or directory
>> ^C
>>
>> The code that prepends that 'nice +19' is in parallel:::newPSOCKnode:
>>
>> if (!is.na(renice) && renice)
>> cmd <- sprintf("nice +%d %s", as.integer(renice), cmd)
>>
>> I don't know where that originates from and on what platform it was
>> tests/validated.  On Ubuntu 16.04, CentOS 6.6, and CentOS 7.4, I have
>> 'nice' from "GNU coreutils" and they all complain about using '+',
>> e.g.
>>
>> $ nice +19 date
>> nice: +19: No such file or directory
>>
>> but '-' works:
>>
>> $ nice -19 date
>> Sun Dec  3 20:01:31 PST 2017
>>
>> Neither 'nice --help' nor 'man help' mention the use of a +n option.
>>
>>
>> WORKAROUND:  As a workaround, you can use:
>>
>> cl <- future::makeClusterPSOCK(2L, rscript = c("nice",
>> "--adjustment=10", file.path(R.home("bin"), "Rscript")))
>>
>> which is backward compatible with parallel::makePSOCKcluster() but
>> provides you with more detailed control.  Try adding verbose = TRUE to
>> see what the exact call looks like.
>>
>> /Henrik
>>
>>
>> On Sun, Dec 3, 2017 at 7:35 PM, Andreas Leha
>> <andreas.l...@med.uni-goettingen.de> wrote:
>>> Hi all,
>>>
>>> Is it possible to use the 'renice' option together with parallel
>>> clusters of type 'PSOCK'?  The help page for parallel::makeCluster is
>>> not specific about which options are supported on which types and I am
>>> getting the following message when passing renice = 19 :
>>>
>>>> cl <- parallel::makeCluster(2, renice = 19)
>>> nice: ‘+19’: No such file or directory
>>>
>>> Kind regards,
>>> Andreas
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] Error while working with png output on linux server

2018-02-01 Thread Henrik Bengtsson
You could try with png2() in the R.devices package, which is just a
convenient wrapper around the bitmap() device which can also produce PNGs.
It's not perfect but it might get you going.

Henrik

On Feb 1, 2018 08:24, "Sariya, Sanjeev"  wrote:

> Thanks for your reply. I searched for the error on Google before resorting
> to the R forum (help group).
> I tried Sys.env(...) too, but didn't resolve the error I get. Hence I am
> looking for solution.
>
>
>
> --
>
>
> -Original Message-
> From: Thierry Onkelinx [mailto:thierry.onkel...@inbo.be]
> Sent: Thursday, February 01, 2018 10:57 AM
> To: Sariya, Sanjeev 
> Cc: Jeff Newmiller ; r-help@r-project.org
> Subject: Re: [R] Error while working with png output on linux server
>
> Dear Sanjeev,
>
> It seems that your system supports neither X11 devices nor cairo devices.
> See http://lmgtfy.com/?q=R+unable+to+open+connection+to+X11
> for possible solutions.
>
> Best regards,
>
> ir. Thierry Onkelinx
> Statisticus / Statistician
>
> Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN
> BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie &
> Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkel...@inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be
>
> 
> ///
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of
> anecdote is not data. ~ Roger Brinner The combination of some data and an
> aching desire for an answer does not ensure that a reasonable answer can be
> extracted from a given body of data. ~ John Tukey
> 
> ///
>
>
>
>
> 2018-02-01 16:18 GMT+01:00 Sariya, Sanjeev :
> > Thanks for pointing to FAQ: I tried with cairo (shared in commands),
> unfortunately didn't work.
> >
> > --
> > Sanjeev Sariya
> >
> >
> > -Original Message-
> > From: Jeff Newmiller [mailto:jdnew...@dcn.davis.ca.us]
> > Sent: Thursday, February 01, 2018 10:12 AM
> > To: r-help@r-project.org; Sariya, Sanjeev ;
> > r-help@r-project.org
> > Subject: Re: [R] Error while working with png output on linux server
> >
> > FAQ 7.19?
> >
> > Also, read the Posting Guide, in particular about posting using plain
> text.
> > --
> > Sent from my phone. Please excuse my brevity.
> >
> > On February 1, 2018 6:50:42 AM PST, "Sariya, Sanjeev" <
> ss5...@cumc.columbia.edu> wrote:
> >>I'm working on linux server:
> >>Linux  4.9.0-4-amd64 #1 SMP Debian 4.9.51-1 (2017-09-28) x86_64
> >>GNU/Linux
> >>
> >>I get error while creating png files. I'm sharing my commands and
> >>error while I run those commands:
> >>
> >>>png("abc", type="cairo")
> >>Error in .External2(C_X11, paste0("png::", filename), g$width,
> >>g$height,  :
> >>  unable to start device PNG
> >>In addition: Warning message:
> >>In png("abc", type = "cairo") : unable to open connection to X11
> >>display ''
> >>
> >>> png("apoeqqplot.png", res=600)
> >>Error in .External2(C_X11, paste0("png::", filename), g$width,
> >>g$height,  :
> >>  unable to start device PNG
> >>In addition: Warning message:
> >>In png("apoeqqplot.png", res = 600) :
> >>  unable to open connection to X11 display ''
> >>
> >>dev.off()
> >>
> >>R version 3.4.2 (2017-09-28)
> >>Platform: x86_64-pc-linux-gnu (64-bit) Running under: Debian GNU/Linux
> >>9 (stretch)
> >>
> >>Matrix products: default
> >>BLAS: /mnt/mfs/cluster/bin/R-3.4/lib/libRblas.so
> >>LAPACK: /mnt/mfs/cluster/bin/R-3.4/lib/libRlapack.so
> >>
> >>locale:
> >>[1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
> >>[3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
> >>[5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
> >>[7] LC_PAPER=en_US.UTF-8   LC_NAME=C
> >>[9] LC_ADDRESS=C   LC_TELEPHONE=C
> >>[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >>
> >>attached base packages:
> >>[1] stats graphics  grDevices utils datasets  methods   base
> >>
> >>other attached packages:
> >>[1] CMplot_3.3.1
> >>
> >>loaded via a namespace (and not attached):
> >>[1] compiler_3.4.2 tools_3.4.2
> >>
> >>
> >>How do I fix this?
> >>
> >>
> >>   [[alternative HTML version deleted]]
> >>
> >>__
> >>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide
> >>http://www.R-project.org/posting-guide.html
> >>and provide commented, minimal, self-contained, reproducible code.
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > 

Re: [R] Newbie wants to compare 2 huge RDSs row by row.

2018-01-28 Thread Henrik Bengtsson
The diffobj package (https://cran.r-project.org/package=diffobj) is
really helpful here.  It provides "diff" functions diffPrint(),
diffStr(), and diffChr() to compare two objects 'x' and 'y' and provide
neat colorized summary output.

Example:

> iris2 <- iris
> iris2[122:125,4] <- iris2[122:125,4] + 0.1

> diffobj::diffPrint(iris2, iris)
< iris2
> iris
@@ 121,8 / 121,8 @@
~     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
  120          6.0         2.2          5.0         1.5  virginica
  121          6.9         3.2          5.7         2.3  virginica
< 122          5.6         2.8          4.9         2.1  virginica
> 122          5.6         2.8          4.9         2.0  virginica
< 123          7.7         2.8          6.7         2.1  virginica
> 123          7.7         2.8          6.7         2.0  virginica
< 124          6.3         2.7          4.9         1.9  virginica
> 124          6.3         2.7          4.9         1.8  virginica
< 125          6.7         3.3          5.7         2.2  virginica
> 125          6.7         3.3          5.7         2.1  virginica
  126          7.2         3.2          6.0         1.8  virginica
  127          6.2         2.8          4.8         1.8  virginica

What's not shown here is that the colored output (supported by many
terminals these days) also highlights exactly which elements in those
rows differ.
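
With several hundred thousand rows it may help to first narrow things
down to the row numbers that differ and only then diff those rows.  A
base R sketch, assuming both data frames have the same dimensions and
column order (the name differing_rows() is made up for this example):

```r
# Hypothetical helper: row numbers where two same-shaped data frames differ
differing_rows <- function(x, y) {
  stopifnot(is.data.frame(x), is.data.frame(y), identical(dim(x), dim(y)))
  col_differs <- function(a, b) {
    # Compare factors by label so that differing level sets do not error
    if (is.factor(a)) a <- as.character(a)
    if (is.factor(b)) b <- as.character(b)
    d <- a != b
    # Treat NA vs NA as equal and NA vs non-NA as different
    na <- is.na(d)
    d[na] <- (is.na(a) != is.na(b))[na]
    d
  }
  which(Reduce(`|`, Map(col_differs, x, y)))
}

x1 <- data.frame(X = 1:5, Y = rep(c("A", "B"), c(3, 2)),
                 stringsAsFactors = FALSE)
x2 <- data.frame(X = c(1, 2, -3, -4, 5), Y = rep(c("A", "B"), c(2, 3)),
                 stringsAsFactors = FALSE)
rows <- differing_rows(x1, x2)
rows   # rows 3 and 4 differ
```

The result can then be passed on, e.g. diffobj::diffPrint(x1[rows, ],
x2[rows, ]).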

/Henrik

On Sun, Jan 28, 2018 at 12:17 AM, Ulrik Stervbo  wrote:
> The anti_join from the package dplyr might also be handy.
>
> install.packages("dplyr")
> library(dplyr)
> anti_join (x1, x2)
>
> You can get help on the different functions by ?function.name(), so
> ?anti_join() will bring you help - and examples - on the anti_join
> function.
>
> It might be worth testing your approach on a small subset of the data. That
> makes it easier for you to follow what happens and evaluate the outcome.
>
> HTH
> Ulrik
>
> Marsh Hardy ARA/RISK  schrieb am So., 28. Jan. 2018, 04:14:
>
>> Cool, looks like that'd do it, almost as if converting an entire record to
>> a character string and comparing strings.
>>
>>   --  M. B. Hardy, statistician
>> work: Applied Research Associates, S. E. Div.
>>   8537 Six Forks Rd., # 6000 / Raleigh, NC 27615
>> 
>> -2963
>>   (919) 582-3329, fax: 582-3301
>> home: 1020 W. South St. / Raleigh, NC 27603
>> 
>> -2162
>>   (919) 834-1245
>> 
>> From: William Dunlap [wdun...@tibco.com]
>> Sent: Saturday, January 27, 2018 4:57 PM
>> To: Marsh Hardy ARA/RISK
>> Cc: Ulrik Stervbo; Eric Berger; r-help@r-project.org
>> Subject: Re: [R] Newbie wants to compare 2 huge RDSs row by row.
>>
>> If your two objects have class "data.frame" (look at class(objectName))
>> and they
>> both have the same number of columns and the same order of columns and the
>> column types match closely enough (use all.equal(x1, x2) for that), then
>> you can try
>>  which( rowSums( x1 != x2 ) > 0)
>> E.g.,
>> > x1 <- data.frame(X=1:5, Y=rep(c("A","B"),c(3,2)))
>> > x2 <- data.frame(X=c(1,2,-3,-4,5), Y=rep(c("A","B"),c(2,3)))
>> > x1
>>   X Y
>> 1 1 A
>> 2 2 A
>> 3 3 A
>> 4 4 B
>> 5 5 B
>> > x2
>>X Y
>> 1  1 A
>> 2  2 A
>> 3 -3 B
>> 4 -4 B
>> 5  5 B
>> > which( rowSums( x1 != x2 ) > 0)
>> [1] 3 4
>>
>> If you want to allow small numeric differences but exactly character
>> matches
>> you will have to get a bit fancier.  Splitting the data.frames into
>> character and
>> numeric parts and comparing each works well.
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Sat, Jan 27, 2018 at 1:18 PM, Marsh Hardy ARA/RISK > > wrote:
>> Hi Guys, I apologize for my rank & utter newness at R.
>>
>> I used summary() and found about 95 variables, both character and numeric,
>> all with "Length:368842" I assume is the # of records.
>>
>> I'd like to know the record number (row #?) of any record where the data
>> doesn't match in the 2 files of what should be the same output.
>>
>> Thanks in advance, M.
>>
>> //
>> 
>> From: Ulrik Stervbo [ulrik.ster...@gmail.com> ulrik.ster...@gmail.com>]
>> Sent: Saturday, January 27, 2018 10:00 AM
>> To: Eric Berger
>> Cc: Marsh Hardy ARA/RISK; r-help@r-project.org> >
>> Subject: Re: [R] Newbie wants to compare 2 huge RDSs row by row.
>>
>> Also, it will be easier to provide helpful information if you'd describe
>> what in your data you want to compare and what you hope to get out of the
>> comparison.
>>
>> Best wishes,
>> Ulrik
>>
>> Eric Berger > ericjber...@gmail.com>> schrieb am Sa., 27.
>> Jan. 2018, 08:18:
>> Hi Marsh,
>> An RDS is not a data 

Re: [R] Merging RData files

2018-01-16 Thread Henrik Bengtsson
To expand on what Bert suggests.  Use:

loadToEnv <- function(file, ..., envir = new.env()) {
  base::load(file = file, envir = envir, ...)
  ## load() returns the names of the loaded objects; return the environment
  envir
}

envA <- loadToEnv("a.RData")
envB <- loadToEnv("b.RData")

and then access the objects in environments envA and envB using
environment access methods, e.g. ls(envir = envA), envA[[name]],
envA$foo, as.list(envA) [careful if the objects are large], ...
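
To actually consolidate the files into one, a sketch could look like
the following.  The function name combine_rdata() is made up, and
prefixing each object with the name of the file it came from is just
one possible way of avoiding name clashes:

```r
# Hypothetical helper: merge several .RData files into one, prefixing each
# object with the base name of its source file to avoid overwriting
combine_rdata <- function(files, out) {
  combined <- new.env()
  for (f in files) {
    env <- new.env()
    load(f, envir = env)
    tag <- tools::file_path_sans_ext(basename(f))
    for (name in ls(env, all.names = TRUE)) {
      assign(paste(tag, name, sep = "."), get(name, envir = env),
             envir = combined)
    }
  }
  save(list = ls(combined, all.names = TRUE), envir = combined, file = out)
  invisible(ls(combined, all.names = TRUE))
}
```

For example, combine_rdata(c("a.RData", "b.RData"), "both.RData") would
store an object 'res' found in both files as 'a.res' and 'b.res'.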

/H

On Tue, Jan 16, 2018 at 7:30 AM, Bert Gunter  wrote:
> ?load
>
> Read this carefully. Pay attention to its instructions re: overwriting
> existing objects.
>
> Cheers,
> Bert
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Tue, Jan 16, 2018 at 12:43 AM, Steven Yen  wrote:
>
>> I ran two separate hours-long projects. Results of each were saved to
>> two separate .RData files.
>> Content of each includes, among others, the following:
>>
>> mese  t p sig
>> pc21.age0.640 0.219  2.918 0.004 ***
>> pc21.agesq  0.000 0.000NaN   NaN
>> pc21.inc0.903 0.103  8.752 0.000 ***
>> pc21.incsq  0.000 0.000NaN   NaN
>> pc21.sei10  0.451 0.145  3.122 0.002 ***
>> pc21.sblkprot  -4.334 3.387  1.280 0.201
>> ...
>>
>> Question: How can I combine/consolidate the two .RData files into one?
>> Thank you.
>>
>>
>>
>>
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/
>> posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] R minor version

2018-01-12 Thread Henrik Bengtsson
This can be controlled by environment variables R_LIBS, R_LIBS_SITE,
and R_LIBS_USER, which support "conversion specifiers".  See
help(".libPaths") or aliases help("R_LIBS") etc.  You can set these in
any of the Renviron files (~/.Renviron, /path/to/R/etc/Renviron,
/path/to/R/etc/Renviron.site).

It sounds like you're trying to set up a site-wide package library to
be shared among users.  If so, add a line:

R_LIBS_SITE=/some/where/else/R/site-%p-library/%v

and make sure that folder exists, otherwise it's silently dropped from
.libPaths().
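
If the path is needed inside R, for instance to create the folder
before installing into it, the same expansion can be built by hand.
This sketch assumes that %p corresponds to R.version$platform and %v to
the 'x.y' part of the R version; the name site_lib_path() is made up:

```r
# Hypothetical helper: build <root>/site-<platform>-library/<x.y>
site_lib_path <- function(root) {
  # R.version$minor is e.g. "4.3"; keep only the part before the first dot
  minor <- strsplit(R.version$minor, ".", fixed = TRUE)[[1L]][1L]
  xy <- paste(R.version$major, minor, sep = ".")
  file.path(root, sprintf("site-%s-library", R.version$platform), xy)
}

path <- site_lib_path("/some/where/else/R")
path
## dir.create(path, recursive = TRUE)  # as root, before installing packages
```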

/Henrik


On Fri, Jan 12, 2018 at 8:42 AM, William Dunlap via R-help
 wrote:
>> .expand_R_libs_env_var("poof/%p/%v")
> [1] "poof/x86_64-w64-mingw32/3.4"
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Fri, Jan 12, 2018 at 4:49 AM, Loris Bennett 
> wrote:
>
>> Hi,
>>
>> When I install a package as a non-root user, it gets saved in a path
>> such as
>>
>>   ~/R/x86_64-unknown-linux-gnu-library/3.4
>>
>> for any R version 3.4.x.  Thus, if I want to install the packages
>> somewhere else, as root say, I might do
>>
>>   install.packages("somepackage","/some/where/else/R/site-library/3.4")
>>
>> In this case I would then want to construct the appropriate path in,
>> say, /etc/Rprofile to allow the packages to be found.
>>
>> However, using R.Version() gives me
>>
>>   > R.Version()$minor
>>   [1] "4.3"
>>
>> I can obviously extract the "real" minor version from the string, but
>> shouldn't there be a more straightforward way to obtain the part of the
>> version that is used in .libPaths() by default?
>>
>> Or am I misunderstanding something?
>>
>> Cheers,
>>
>> Loris
>>
>> --
>> Dr. Loris Bennett (Mr.)
>> ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/
>> posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to programmatically save a web-page using R (mimicking Command+S)

2018-01-06 Thread Henrik Bengtsson
The 'webshot' package (on CRAN) can do this.

Henrik

On Jan 6, 2018 05:27, "Christofer Bogaso" 
wrote:

> Hi,
>
> I would appreciate if someone can give me a pointer on how to save a
> webpage programmatically using R.
>
> For example, let say I have this webpage open in my browser:
>
> http://www.bseindia.com/stock-share-price/dabur-india-ltd/dabur/500096/
>
> When manually I save this page, I just press Command+S (using Mac) and
> then this page get saved in hard-disk
>
> Now I want R to mimic this same job that I do using Command-S
>
> So far I have tried with readLines() however the output content is
> different than what I could achieve using Command+S
>
> Any help will be highly appreciated.
>
> Thanks for your time.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]



Re: [R] PSOCK cluster and renice

2017-12-03 Thread Henrik Bengtsson
Looks like a bug to me due to wrong assumptions about 'nice'
arguments, but could be because a "non-standard" 'nice' is used.  If
we do:

> trace(system, tracer = quote(print(command)))
Tracing function "system" in package "base"

we see that the system call used is:

> cl <- parallel::makePSOCKcluster(2L, renice = 19)
Tracing system(cmd, wait = FALSE) on entry
[1] "nice +19 '/usr/lib/R/bin/Rscript'
--default-packages=datasets,utils,grDevices,graphics,stats,methods -e
'parallel:::.slaveRSOCK()' MASTER=localhost PORT=11146 OUT=/dev/null
TIMEOUT=2592000 XDR=TRUE"
nice: ‘+19’: No such file or directory
^C

The code that prepends that 'nice +19' is in parallel:::newPSOCKnode:

if (!is.na(renice) && renice)
cmd <- sprintf("nice +%d %s", as.integer(renice), cmd)

I don't know where that originates from and on what platform it was
tested/validated.  On Ubuntu 16.04, CentOS 6.6, and CentOS 7.4, I have
'nice' from "GNU coreutils" and they all complain about using '+',
e.g.

$ nice +19 date
nice: +19: No such file or directory

but '-' works:

$ nice -19 date
Sun Dec  3 20:01:31 PST 2017

Neither 'nice --help' nor 'man nice' mentions the use of a +n option.


WORKAROUND:  As a workaround, you can use:

cl <- future::makeClusterPSOCK(2L, rscript = c("nice",
"--adjustment=10", file.path(R.home("bin"), "Rscript")))

which is backward compatible with parallel::makePSOCKcluster() but
provides you with more detailed control.  Try adding verbose = TRUE to
see what the exact call looks like.
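
To make the difference concrete, here is a minimal sketch of how the
command string is assembled.  The first form mirrors the sprintf() call
quoted above.  The second form, using 'nice --adjustment=', is only an
assumption about what a call accepted by GNU coreutils could look like;
it is not taken from the R sources.  The 'cmd' string is a shortened
placeholder for the real Rscript call.

```r
# Placeholder for the worker launch command (shortened for illustration)
cmd <- "'/usr/lib/R/bin/Rscript' -e 'parallel:::.slaveRSOCK()'"
renice <- 19L

# Form quoted from parallel:::newPSOCKnode: GNU 'nice' treats '+19' as a command
broken <- sprintf("nice +%d %s", as.integer(renice), cmd)

# Assumed alternative: pass the niceness via the long option instead
fixed <- sprintf("nice --adjustment=%d %s", as.integer(renice), cmd)

print(broken)
print(fixed)
```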

/Henrik


On Sun, Dec 3, 2017 at 7:35 PM, Andreas Leha
 wrote:
> Hi all,
>
> Is it possible to use the 'renice' option together with parallel
> clusters of type 'PSOCK'?  The help page for parallel::makeCluster is
> not specific about which options are supported on which types and I am
> getting the following message when passing renice = 19 :
>
>> cl <- parallel::makeCluster(2, renice = 19)
> nice: ‘+19’: No such file or directory
>
> Kind regards,
> Andreas

Re: [R] valid package repositories

2017-10-02 Thread Henrik Bengtsson
Here's my view on this:

CRAN = Comprehensive R Archive Network.  The "Archive" part is very
important - it "promises" the research community that R packages that
have ever been published on CRAN, and all the versions of each
package, will be available also in the future.  It requires quite a
bit for a package/code to disappear from CRAN, e.g. a package contains
code/data that is not allowed to be shared (due to licenses and
copyrights).  Not even the original developer/maintainer can remove a
package that has already been released on CRAN.  What we see at times
is that a package is "archived" on CRAN (i.e. no longer available via
install.packages()), but the old package versions are still
distributed.  That CRAN protects us this way is extremely valuable to
the research community, open science, and reproducible research.
Bioconductor has a similar philosophy.

However convenient GitHub / GitLab / ... is for development etc, it
certainly does not provide scientific archiving - in that sense it is
no different than sharing packages on Dropbox, Google Drive, etc.

/Henrik


On Mon, Oct 2, 2017 at 10:25 AM, Jeff Newmiller
 wrote:
> I tend to regard GitHub as a bit of wild west... anyone can upload anything 
> there, working or not. CRAN packages at least have to compile so there is 
> some additional verification in being there.
>
> GitHub does have the advantage that you can easily download it and run an 
> example if the authors have set up such scaffolding... which is better than 
> "it ran once on that laptop that died". However, there is a distinct extra 
> level of sophistication involved in getting researchers to make those 
> examples or test cases beyond their mainline code, and nothing about GitHub 
> requires that such features be present in uploaded code.
> --
> Sent from my phone. Please excuse my brevity.
>
> On October 2, 2017 7:47:35 AM PDT, Federico Calboli 
>  wrote:
>>Hi All,
>>
>>I noticed that it is quite common to find in papers mentions to ‘R
>>libraries’ developed for the algorithms/models/code/whatever that is
>>being described by the paper, so that third parties will be able to use
>>said method for themselves.  On further enquiries these libraries are
>>not actually available on CRAN, but need to be requested from the devs.
>>
>>
>>That is in itself does not seem a big issue, were it not for the fact
>>most of the time I am in such situation the code is very specific for
>>the environment of the developer, and does not actually work on any
>>machine I try to run it on (something that is painfully true for code
>>calling C/C++/Fortran).  A second pattern I seem to have noticed is
>>that, despite said libraries being advertised for general use in a
>>*published* paper, when I raise the issue the library is not actually
>>formally published and it does not actually work like a CRAN published
>>library would, I get a vague ‘the person who actually did the work left
>>and nobody can maintain the code/fix stuff/finish the job’.
>>
>>As a referee I am trying to weed out what I see as malpractice: the
>>promise that third parties outside the developers might actually use
>>the code because it has been packaged as a R library, a claim that
>>seems to boost publishing chances.
>>
>>Thus my question: when can I consider a library to be properly
>>published and really publicly available?  CRAN and BioConductor are
>>clearly gold standards.  What about Github?  I am currently using the
>>rule ‘not on CRAN == outright rejection’.  If Github is as good as CRAN
>>I will include it on my list of ‘the code is available in a functional
>>state as claimed’.
>>
>>Finally, please note the scope of my query:  I am not looking at those
>>cases where a colleague gives me half finished code that might be
>>useful but I need to sort out.  I am looking at formal claims ‘we have
>>developed a method to do X and said method is available to the public
>>as a R library’.  If that is the claim I expect it to be true.
>>
>>Best
>>
>>F
>>
>>
>>
>>
>>--
>>Federico Calboli
>>LBEG - Laboratory of Biodiversity and Evolutionary Genomics
>>Charles Deberiotstraat 32 box 2439
>>3000 Leuven
>>+32 16 32 87 67
>>
>>
>>
>>
>>

Re: [R] R_LIBS_USER not in libPaths

2017-09-16 Thread Henrik Bengtsson
I'm not sure I follow what the problem is. Are you trying to
set R_LIBS_USER but R does not acknowledge it, or do you observe something
in R that you didn't expect to be there and you are trying to figure out
why that is / where that happens?
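
To help tell these two cases apart, a small diagnostic sketch like the
one below can be run in the affected session.  It assumes R_LIBS_USER
names a single directory, and it relies on the documented behaviour
that only directories that exist when R starts are added to the library
path.

```r
# What R thinks the user library is, and what is actually on the library path
user_lib  <- Sys.getenv("R_LIBS_USER")
lib_paths <- .libPaths()

# A user library that does not exist on disk is not added to .libPaths()
exists_on_disk <- nzchar(user_lib) && dir.exists(path.expand(user_lib))
in_lib_paths   <- exists_on_disk &&
  normalizePath(user_lib) %in% normalizePath(lib_paths)

cat("R_LIBS_USER   :", user_lib, "\n")
cat("exists on disk:", exists_on_disk, "\n")
cat("in .libPaths():", in_lib_paths, "\n")
```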

Henrik

On Sep 16, 2017 07:10, "Rene J Suarez-Soto"  wrote:

> I have a computer where R_LIBS_USER is not found in libPaths. This is for
> Windows (x64). I ran R from the command line, RGui and RStudio and I get
> the same results. I also ran R --vanilla and I still get the discrepancy.
>
> The only thing I found interesting was that I also ran SET from the command
> line and the "R related variables" (e.g.,  R_HOME; R_LIBS_USER) are not
> there. Therefore these variables are being set when I start R. I have not
> been able to track where does R obtain the value for these.
>
> Aside from looking at
> http://stat.ethz.ch/R-manual/R-patched/library/base/html/Startup.html I am
> not sure I have much more information that I have found useful.
>
> Thanks
>
> R


Re: [R] getOption() versus Sys.getenv

2017-08-25 Thread Henrik Bengtsson
There's also the alternative to use both, e.g. by having a system
environment variable set a corresponding R option, which then can be
overridden using options().  For instance, the R option mc.cores,
which is used by the parallel package, is set to the (integer) value
of system environment variable MC_CORES, iff set. Conceptually, when
the parallel package is loaded, the following takes place:

if (is.null(getOption("mc.cores"))) {
  cores <- as.integer(Sys.getenv("MC_CORES"))
  if (!is.na(cores)) options(mc.cores = cores)
}

Example:

$ Rscript -e "library(parallel); getOption('mc.cores')"
NULL

$ MC_CORES=2 Rscript -e "library(parallel); getOption('mc.cores')"
[1] 2

$ MC_CORES=2 Rscript -e "options(mc.cores = 4); library(parallel); getOption('mc.cores')"
[1] 4
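
For a package of your own, the same idea could be sketched as a small
lookup helper.  Everything named here is hypothetical: the function
get_setting(), the option 'mypkg.db_path' and the environment variable
MYPKG_DB_PATH are made up for illustration.  Unlike the parallel
package, which copies the value once when it is loaded, this sketch
checks both places each time it is called.

```r
# Hypothetical helper: prefer the R option, fall back to the environment
# variable, and finally to a default.
get_setting <- function(option, envvar, default = NULL) {
  value <- getOption(option)
  if (!is.null(value)) return(value)
  value <- Sys.getenv(envvar, unset = NA_character_)
  if (!is.na(value) && nzchar(value)) return(value)
  default
}

# Only the environment variable is set: it is used
Sys.setenv(MYPKG_DB_PATH = "/data/from-env.sqlite")
from_env <- get_setting("mypkg.db_path", "MYPKG_DB_PATH")

# The R option is set as well: it overrides the environment variable
options(mypkg.db_path = "/data/from-option.sqlite")
from_option <- get_setting("mypkg.db_path", "MYPKG_DB_PATH")
```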

/Henrik


On Fri, Aug 25, 2017 at 10:33 AM, Duncan Murdoch
 wrote:
> On 25/08/2017 1:19 PM, Sam Albers wrote:
>>
>> Hi there,
>>
>> I am trying to distinguish between getOption() and Sys.getenv(). My
>> understanding is that these are both used to set values for variables.
>> getOption is set something like this: option("var" = "A"). This can be
>> placed in an .Rprofile or at the top of script. They are called like this
>> getOption("var").
>>
>> Environmental variables are set in the .Renviron file like this: "var" =
>> "A" and called like this: Sys.getenv("var"). I've seen mention in the httr
>> package documentation that credentials for APIs should be stored in this
>> way.
>>
>> So my question is how does one decide which path is most appropriate? For
>> example I am working on a package that has to query a database in almost
>> every function call. I want to provide users an ability to skip having to
>> specify that path in every function call. So in this case should I
>> recommend users store the path as an option or as an environmental
>> variable? If I am storing credentials in an .Renviron file then maybe I
>> should store the path there as well?
>>
>> More generally the question is can anyone recommend some good
>> discussion/documentation on this topic?
>
>
>
> The environment is set outside of R; it's really part of the operating
> system that runs R.  So use Sys.getenv() if you want the user to be able to
> set something before starting R.  Use Sys.setenv() only if your R program is
> going to use system() (or related function) to run another process, and you
> want to communicate with it.
>
> The options live entirely within a given session.
>
> The .Renviron and .Rprofile files hide this difference, but they aren't the
> only ways to set these things, they're just convenient ways to set them at
> the start of a session.
>
> Duncan Murdoch


Re: [R] about reading files in order

2017-06-29 Thread Henrik Bengtsson
You can use:

> files <- list.files(path = "folder01")
> files <- gtools::mixedsort(files)

to order the files in a "human-friendly" order rather than
lexicographic order (which sort() provides).

FYI 1; it's preferred to use file.path("folder01", lists[i]) rather
than paste('folder01',lists[i],sep='/').

FYI 2; if you use list.files(path = "folder01", full.names = TRUE),
you get the full paths rather than just the file names, i.e. you don't
have to use file.path().
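
Putting the pieces together, here is a self-contained sketch that reads
one value per file, in numeric order, into a one-column matrix.  To
avoid depending on gtools, it orders the files by the number extracted
from the file name, which is a base-R stand-in for mixedsort() and
assumes names of the form file.<number>.txt.  The folder and its files
are created in a temporary directory purely for the demonstration.

```r
# Demo data: file.1.txt ... file.12.txt, each holding a single value
dir <- file.path(tempdir(), "folder01")
dir.create(dir, showWarnings = FALSE)
for (i in 1:12) {
  writeLines(as.character(i * 10), file.path(dir, sprintf("file.%d.txt", i)))
}

# full.names = TRUE returns complete paths, so no file.path() is needed later
files <- list.files(path = dir, pattern = "[.]txt$", full.names = TRUE)

# Order by the numeric part of the name rather than lexicographically
idx <- as.integer(sub("^file[.]([0-9]+)[.]txt$", "\\1", basename(files)))
files <- files[order(idx)]

# Read the single value in each file and collect them in a matrix
values <- vapply(files, function(f) scan(f, quiet = TRUE), numeric(1),
                 USE.NAMES = FALSE)
files.matrix <- matrix(values, ncol = 1)
```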

/Henrik

On Thu, Jun 29, 2017 at 12:04 PM, lily li  wrote:
> Hi R users,
> I have a question about opening the txt files and putting them into a
> matrix. The txt files are in the folder01, while they have the name
> file.1.txt, file.2.txt, file.3.txt, etc. There are about 200 such text
> files. Each txt file contains one value inside. When I tried to use the
> code below, I found that the txt files are not in order, from 1, 2, 3, to
> 200. Rather, they are in the order 1, 10, 100, 101, etc. How to change it
> so that they are in order? Thanks for your help.
>
> temp <- list.files('folder01',pattern="*.txt")
> name.list <-lapply(paste('folder01',temp,sep='/'),read.table,head=F)
> library(data.table)
> files.matrix <-rbindlist(name.list)
>
> Also, when use the code below, how to complete it so that the values of the
> files are stored in a matrix?
> lists = list.files('folder01')
> for (i in 1:length(lists)){
>   file <- read.table(paste('folder01',lists[i],sep='/'),head=F)
>   print(file)
> }


Re: [R] odfWeave - A loop of the "same" data

2017-06-01 Thread Henrik Bengtsson
This is what the R.rsp package (https://cran.r-project.org/package=R.rsp;
I'm the author) and its RSP markup are good at and were designed to handle.
We're using it lots in report generation where we iterate over elements,
e.g. over the 24 chromosomes.  See Section 2.3 in
https://cran.r-project.org/web/packages/R.rsp/vignettes/Dynamic_document_creation_using_RSP.pdf.
RSP is independent of input format - all it requires is that it's
text-based - so you can use RSP-embedded LaTeX, HTML, Markdown, ...,
and even RSP-embedded Sweave, knitr, Rmarkdown (where it then
effectively works as a pre-processor to those formats).

Hope this helps

Henrik



On Thu, Jun 1, 2017 at 9:35 AM, Charles C. Berry  wrote:
> On Thu, 1 Jun 2017, POLWART, Calum (COUNTY DURHAM AND DARLINGTON NHS
> FOUNDATION TRUST) via R-help wrote:
>
>> Before I go and do this another way - can I check if anyone has a way of
>> looping through data in odfWeave (or possibly sweave) to do a repeating
>> analysis on subsets of data?
>>
>> For simplicity lets use mtcars dataset in R to explain.  Dataset looks
>> like this:
>>
>>> mtcars
>>
>>                mpg cyl disp  hp drat   wt ...
>> Mazda RX4     21.0   6  160 110 3.90 2.62 ...
>> Mazda RX4 Wag 21.0   6  160 110 3.90 2.88 ...
>> Datsun 710    22.8   4  108  93 3.85 2.32 ...
>>   
>>
>> Say I wanted to have a 'catalogue' style report from mtcars, where on each
>> page I would perhaps have the Rowname as a heading and then plot a graph of
>> mpg highlighting that specific car
>>
>> Then add a page break and *do the same for the next car*.  I can manually
>> do this of course, but it is effectively a loop something like this:
>>
>> for (n in seq_along(mtcars$mpg)) {
>> barplot (mtcars$mpg, col=c(rep(1,n-1),2,rep(1,length(mtcars$mpg)-n)))
>> }
>>
>> There is a odfWeave page break function so I can do that sort of thing (I
>> think).  But I don't think I can output more than one image can I? In
>> reality I will want several images and a table per "catalogue" page.
>>
>> At the moment I think I need to create a master odt document, and create
>> individual catalogue pages.  And merge them into one document - but that
>> feels clunky (unless I can script the merge!)
>>
>> Anyone got a better way?
>
>
>
> For a complex template inside a loop, I'd probably do as Jeff suggests and
> use a knitr child document for ease of developing and debugging the
> template.
>
> But for the simple case you describe I'd use a brew script to
> unroll the loop.
>
> You would write your input file as usual, but put a brew script in the
> right place, then run brew on the input file to produce an
> intermediate file that unrolls the loop, then weave the intermediate
> file to get your desired result.  Here is a simple example of such you can
> run in an R session (assuming the brew package is installed) and see the
> results printed out.
>
> --8<---cut here---start->8---
>
> brew::brew(text="
>
> Everything before the loop
>
> <% for (i in 1:10) { %>
> Print the value of i
> <% print(i) %> or better yet
> \\Sexpr{<%= i %>}
> <% } %>
>
> everything after
>
> ")
>
> --8<---cut here---end--->8---
>
> The double backslash is needed in the literal string used here.  If
> you put that script in a file using an editor, you would just use a
> single backslash.
>
> HTH,
>
> Chuck


Re: [R] Maximum length for a -e argument to Rscript?

2017-04-21 Thread Henrik Bengtsson
That Rscript stalls sounds like a bug, but I'm not sure whether it's R
or the terminal that needs to be fixed:

$ Rscript -e "$long_expression"
WARNING: '-e 
{x~+~<-~+~c(-1.31171,~+~-0.686165,~+~1.62771,~+~0.320195,~+~-0.322011,~+~1.66518,~+~-0.271971,~+~-0.665367,~+~0.516482,~+~-0.716343,~+~-0.317471,~+~0.068046,~+~-0.100371,~+~-1.15907,~+~0.263329,~+~-0.936049,~+~-0.852444,~+~0.358817,~+~-0.233959,~+~0.209891,~+~-0.831575,~+~-0.952987,~+~-0.0420206,~+~-1.78527,~+~-0.280584,~+~-0.62353,~+~1.42597,~+~0.127994,~+~0.0751232,~+~0.896835,~+~-0.319488,~+~0.897876,~+~0.18457,~+~0.779571,~+~-0.0543194,~+~0.226722,~+~-0.769983,~+~-0.723463,~+~0.144386,~+~-0.468544,~+~-0.349417,~+~0.336786,~+~0.749212,~+~-1.62397,~+~0.683075,~+~-0.746449,~+~0.300921,~+~-0.365468,~+~0.548271,~+~1.13169,~+~-1.34042,~+~-0.0740572,~+~1.34986,~+~0.531771,~+~-0.147157,~+~0.824894,~+~-1.05816,~+~1.58867,~+~-0.885764,~+~1.11912,~+~0.361512,~+~1.77985,~+~0.585099,~+~-1.205,~+~2.44134,~+~-0.331372,~+~-0.346322,~+~0.0535267,~+~-1.75089,~+~0.0773243,~+~-1.07846,~+~-1.29632,~+~1.0622,~+~1.34867,~+~0.199777,~+~0.197516,~+~0.574185,~+~1.06555,~+~-0.885166,~+~-0.788576,~+~-1.46061,~+~-1.5402
^C^C
^C
^C^C^C^C^C^C
^\Quit (core dumped)

On my default Ubuntu 16.04 terminal, R 3.3.3 hangs and does not
respond to user interrupts (SIGINT), but it does respond to Ctrl-\
(SIGQUIT).

A workaround is to pass the expression via standard input to R, e.g.

$ echo "$long_expression" | R --no-save
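
Another way to sidestep the command-line length limit, not discussed in
this thread, is to write the long expression to a script file and run
that file with Rscript.  The sketch below does this from within R; the
generated vector is made-up demo data, written one value per line so
that no single line in the file gets long.

```r
# Build a long expression (well beyond 10 kB) from made-up demo values
set.seed(1)
values <- round(rnorm(5000), 6)
long_expression <- sprintf("x <- c(\n%s\n)\ncat(length(x))",
                           paste(values, collapse = ",\n"))

# Write it to a temporary script instead of passing it via -e
script <- tempfile(fileext = ".R")
writeLines(long_expression, script)

# Run the script with the Rscript that belongs to the current R installation
rscript <- file.path(R.home("bin"), "Rscript")
out <- system2(rscript, args = shQuote(script), stdout = TRUE)
print(out)
```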

/Henrik

On Fri, Apr 21, 2017 at 11:07 AM, Ben Tupper  wrote:
> Hi,
>
> I suspect you are over the 10kb limit for the expression.  See
>
> https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Invoking-R-from-the-command-line
>
> Cheers,
> Ben
>
>> On Apr 21, 2017, at 3:44 AM, Ben Haller  wrote:
>>
>>  Hi!  I’m attempting to use Rscript to do some automated plotting.  It is 
>> working well, except that I seem to be running into a maximum line length 
>> issue, and I’m wondering if it is a bug on your end.  Here’s an example of 
>> the command I’m trying to run:
>>
>> /usr/local/bin/Rscript -e '{x <- c(-1.31171, -0.686165, 1.62771, 0.320195, 
>> -0.322011, 1.66518, -0.271971, -0.665367, 0.516482, -0.716343, -0.317471, 
>> 0.068046, -0.100371, -1.15907, 0.263329, -0.936049, -0.852444, 0.358817, 
>> -0.233959, 0.209891, -0.831575, -0.952987, -0.0420206, -1.78527, -0.280584, 
>> -0.62353, 1.42597, 0.127994, 0.0751232, 0.896835, -0.319488, 0.897876, 
>> 0.18457, 0.779571, -0.0543194, 0.226722, -0.769983, -0.723463, 0.144386, 
>> -0.468544, -0.349417, 0.336786, 0.749212, -1.62397, 0.683075, -0.746449, 
>> 0.300921, -0.365468, 0.548271, 1.13169, -1.34042, -0.0740572, 1.34986, 
>> 0.531771, -0.147157, 0.824894, -1.05816, 1.58867, -0.885764, 1.11912, 
>> 0.361512, 1.77985, 0.585099, -1.205, 2.44134, -0.331372, -0.346322, 
>> 0.0535267, -1.75089, 0.0773243, -1.07846, -1.29632, 1.0622, 1.34867, 
>> 0.199777, 0.197516, 0.574185, 1.06555, -0.885166, -0.788576, -1.46061, 
>> -1.54026, 0.690576, -0.88821, 0.343747, -0.100751, -0.865596, -0.128504, 
>> 0.222334, -1.18932, -0.555258, -0.557368, 0.219272, 0.298858, 0.848022, 
>> 0.142608, 1.10082, -0.348039, 0.0566489, 0.662136, 0.50451, -0.909399, 
>> 1.02446, 1.40592, -0.114786, -1.10718, 2.02549, 0.0818607, -1.037, 1.18961, 
>> -0.204, 2.83165, -0.959653, -0.393082, -0.463351, 0.914054, 1.14472, 
>> -1.32927, 1.25416, 0.372267, 0.410832, 1.04187, 1.22288, 1.27131, 0.0949385, 
>> 0.194053, -0.226184, -0.502155, -1.36834, -0.000591861, -0.565903, 1.14099, 
>> 1.67811, 0.331709, -0.756879, 0.889596, 0.718098, 0.740242, -0.861818, 
>> 0.0332746, 1.01745, 0.584536, -1.14245, -0.85, -1.34237, 0.660603, 
>> 1.16048, -0.898828, 0.965746, -1.16953, -2.33417, 0.591078, -0.364892, 
>> 0.0719267, -1.21676, 1.12646, 1.37534, 0.0712832, 1.22889, -0.0110024, 
>> 0.248213, -1.12013, -0.525197, -0.352156, -0.317182, -0.89552, 1.53422, 
>> -1.36777, 1.52099, 1.18789, -3.15251, 1.24008, -0.564289, -0.515629, 
>> -0.0920464, 2.94027, 0.895481, -0.157643, -0.347874, -0.290823, -0.771436, 
>> 1.29285, 0.216689, -1.86856, 2.24075, 0.888635, 0.430417, -0.585856, 
>> 1.13119, -0.243977, 0.544491, 0.921995, 0.815365, 1.2584, -1.29347, 
>> 0.0574579, 0.990557, -1.58657, -0.264373, 0.865893, 0.599298, -0.417531, 
>> 0.132897, 1.88597, 1.33112, -0.880904, 0.0762161, 0.0567852, 0.593295, 
>> -0.632135, 0.885625, 0.365863, -0.17885, 0.420185, -0.508275, 0.974357, 
>> 0.628085, 0.710578, 1.72447, 1.38488, 1.01301, 1.30015, 0.260501, 0.808981, 
>> 0.440228, 0.416344, -1.66273, -0.397224, -0.512086, -0.175854, -0.663143, 
>> 0.369712, -1.01654, 0.660465, 0.124851, -1.51101, -0.95725, 2.09893, 
>> 1.26819, 1.08086, 0.493204, 0.79073, 1.49191, 0.563689, 0.414473, 2.27361, 
>> 0.871923, 0.193703, -0.185039, -0.312859, -1.42267, -2.11561, 0.311996, 
>> -0.0906527, 1.19139, 1.57502, 1.10587, 0.416333, 2.35374, -1.0531, 
>> 0.0450512, 0.979958, 0.398269, 0.0897618, -0.855305, -1.59337, -0.084904, 
>> 0.245872, 1.27115, 1.3512, 

Re: [R] Setting .Rprofile for RStudio on a Windows 7 x64bit

2017-04-19 Thread Henrik Bengtsson
I'd be really surprised if not basically all text editors could be
used here; it's more that some have quirks that make it less obvious.
For instance, with plain old Notepad that comes with all Windows distros
you can put (double) quotes around the file name in the Save As...
dialog to prevent Notepad from adding its own default *.txt extension
(when the extension is missing).  BTW, since you're using RStudio, you
can also use that to create your .Rprofile file; RStudio's Save As...
does not mess with the filename - it saves the file as you specify.
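
The file can also be created from within R itself, which avoids the
editor question altogether.  The sketch below writes to a temporary
directory so that it is safe to run as-is; for real use, the target
would be "~/.Rprofile" instead.  The prompt setting is the one
mentioned elsewhere in this thread.

```r
# Demo location; replace with "~/.Rprofile" to create the real startup file
profile <- file.path(tempdir(), ".Rprofile")

# Write a one-line startup file that changes the prompt to "R> "
writeLines('options(prompt = "R> ")', profile)

# Confirm that the file exists under exactly that name, with no extension added
created  <- file.exists(profile)
contents <- readLines(profile)
print(basename(profile))
print(contents)
```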

Bruce, what I think would have helped us help you early on is if
you had given a bit more detail on what you did / tried.  There was a
lot of "does not work" early on, which gives few clues.  It wasn't
clear where you saved the files.  Maybe you weren't sure yourself, but
even knowing that would have helped us help you.  It was also not clear
how much experience you had with R, which caused confusion - knowing
"new to R, with non professional programming skills" would probably
have cut some corners.  As a helper, here and on other forums, it's
often hard to guess this, and providing a proper reply is then hard -
like trying to give a scientific presentation when you don't know who's
in the audience.  This happens all the time and most people quickly
pick up what the expectations are and then get a smoother ride going
forward.

/Henrik

PS. At the risk of adding confusion: unless you already do so, you
should focus on only using/editing your
C:/Users/BruceRatner/Documents/.Rprofile (also referred to as
~/.Rprofile) file.  That one does not require any Administrative
rights to edit and has the advantage of working also when you update R
later.   The C:/PROGRA~1/R/R-33~1.3/etc/Rprofile.site file requires
Admin to edit and will have to be recreated/reedited whenever you
update R (e.g. R 3.4.0 will install in a different directory).  A
regular R user should never have to edit the latter.  It's mostly used
by system admins who wish to set common startup settings for multiple
users in one central file.  The *.site part of Rprofile.site suggests
site-wide settings.  So, use ~/.Rprofile to control your R startup
settings.


On Tue, Apr 18, 2017 at 8:26 AM, Bruce Ratner PhD  wrote:
> Dear John:
> My pleasure to respond to your request.
> Problem: Cannot get the .Rprofile file to take affect in either R (or 
> RStudio).
> As to "what" can be in put into a  .Rprofile file is abound, many examples in 
> the manuals, blogs, links, and books.
>
> The "how to" write the file was the real issue, not clearly covered in any 
> material I could find or purchase.
>
> I read that any notepad-type app can be used to create the .Rprofile file:
> 1. with or without a txt/R extension, and/or
> 2. with or without Administrator permission.
>
> Not being a professional programmer/developer, I did not know about text 
> editors that can create files with no extension, which was the problem at 
> hand.
>
> After many back and forth drilling down by R-helpers with trouble shooting 
> queries, it became clear that I was not using a developer's text editor.
>
> Solution: I found an editor online, EditPad Pro 7 (for Windows), with which I 
> created my .Rprofile file.
>
> The result was complete success, and gratitude to all R-helpers who stuck by 
> me,
> understanding I am new to R, with non professional programming skills. As a 
> statistician (or if you prefer data scientist) for twenty plus years, clearly 
> I must know how to program, but not at the pro level or pro understanding.
>
> John, I hope this write up is satisfactory, if not please let let me know, as 
> I will rewrite until you are happy with it.
>
> It is a nice surprise to hear your wanting to archive the problem-solution, 
> which almost did me in, and which created ill feelings among several 
> R-helpers towards me.
>
> Regards,
> Bruce
>
> __
> Bruce Ratner PhD
> The Significant Statistician™
> (516) 791-3544
> Statistical Predictive Analytics -- www.DMSTAT1.com
> Machine-Learning Data Mining -- www.GenIQ.net
>
>
>
>> On Apr 18, 2017, at 9:52 AM, Sparks, John James  wrote:
>>
>> Bruce,
>>
>> Do you think that you could post the final solution to the problem?  That
>> way it would be stored with this thread and the next person who has the
>> same problem would be able to locate the FINAL solution.
>>
>> --JJS
>>
>>
>>> On Mon, April 17, 2017 12:47 pm, BR_email wrote:
>>> TO _ALL_:
>>> THANK YOU. THANK YOU. THANK YOU.
>>> After hours, and hours, and hours, and ... , and hours: Success.
>>> To all who helped, thanks.
>>> My quest was minor, but major for me, as I learn from the path of one,
>>> whether big or small begets another.
>>>
>>> I never look down at anyone, except to help him/her up.
>>>
>>> With gratitude,
>>> Bruce
>>>
>>> Bruce Ratner, Ph.D.
>>> The Significant Statistician™
>>> (516) 791-3544
>>> Statistical Predictive Analtyics -- www.DMSTAT1.com
>>> 

Re: [R] Setting .Rprofile for RStudio on a Windows 7 x64bit

2017-04-17 Thread Henrik Bengtsson
Did you try any of the troubleshooting I suggested? If you do that, I'm
99.99% certain it'll help you to resolve this.

Henrik


On Apr 17, 2017 03:07, "Bruce Ratner PhD"  wrote:

David:
When I launch Rstudio the effects of the Rprofile do not show, e.g., I want
the prompt to be "R> " instead of the default "> ". The former doesn't show.
Bruce

__
Bruce Ratner PhD
The Significant Statistician™
(516) 791-3544
Statistical Predictive Analytics -- www.DMSTAT1.com
Machine-Learning Data Mining -- www.GenIQ.net



> On Apr 16, 2017, at 7:34 PM, David Winsemius wrote:
>
>
>> On Apr 16, 2017, at 3:43 PM, BR_email  wrote:
>>
>> Peter:
>> Thanks for reply and suggestion.
>> Sorry, I am not sure how to assess.
>> The doc is too technical for me to understand.
>> I found multiple instructions online and in R and RStudio books.
>> I'm doing what it says, but no success.
>
> What is "it" and what is "lack of success"?
>
>> The instructions are simple as a-b-c, but some setting within the
>> Windows system must be the culprit.
>
> Although the RStudio page immediately below was done with a Mac, I
> suspect there are similar selection panels and dialogs on the Windows
> version of RStudio.
>
> https://support.rstudio.com/hc/en-us/articles/200549016#general
>
> When I look at the Windows installation advice I see near the top: "When
> installing on a 64-bit version of Windows the options will include 32- or
> 64-bit versions of R (and the default is to install both)." So is it
> possible that RStudio is looking at a different version of R than you
> believe it should be, perhaps at the 32 bit R versus the 64 bit one? The
> result at the beginning of this thread makes me think you got the 32-bit
> one connected to RStudio.
>
> And I say again: I believe problems in configuring RStudio are off-topic
> for Rhelp and you should have been searching or posting question either to
> the RStudio support or StackOverflow. Looking at the responses to the
> queries above and the ones found below, it appears to me that there are
> RStudio-specific issues that go beyond what is in the `help(Startup)` or
> equivalent `help(.Rprofile)` page.  I gave an instance of an SO search
> upthread and I offer another SO search:
>
> http://stackoverflow.com/search?tab=votes=%5brstudio%5d%20environment%20variables%20windows
>
> I thought that this one below had potentially useful information, but I
> am not a Windows user (and you have not shown an inclination in offering a
> complete description of your efforts at following that advice. At any rate
> it would have been more appropriate to respond to the SO answers that were
> ineffective or to post a question there with full description of your
> efforts and content of your .Rprofile file and your current environment
> variable settings.)
>
> http://stackoverflow.com/search?tab=votes=%5brstudio%5d%20rprofile%20windows
>
> --
> David
>
>>
>> Regards,
>> Bruce
>>
>> Bruce Ratner, Ph.D.
>> The Significant Statistician™
>> (516) 791-3544
>> Statistical Predictive Analtyics -- www.DMSTAT1.com
>> Machine-Learning Data Mining and Modeling -- www.GenIQ.net
>>
>> peter dalgaard wrote:
>>> Um, tried help(.Rprofile) lately?
>>>
>>> -pd
>>>
 On 17 Apr 2017, at 00:08 , Rolf Turner  wrote:


> On 17/04/17 08:46, John C Frain wrote:
>
> Bruce
>
> The official documentation for these startup files can be obtained
with
> the command
>
> Help(Startup)

 Minor point of order, Mr. Chairman.  That should be:

   help(Startup)

 There is (as far as I know) no such function as "Help()".  It is
important to remember that R is case sensitive.

 Another point that is worthy of thought is "How in God's name would
any beginner know or find out about the usage help(Startup)?"  Unless they
were explicitly told about it, in the manner which you just demonstrated.
The usage gets a mention in "An Introduction to R" --- but I had to search
for it.

 To me the word "startup" is not terribly intuitive.  I would tend to
search for "starting" rather than "startup", I think, but I'm not sure what
the average beginner would search for.  A search of "An Introduction to R"
for "starting" gets seven or eight hits, one of which is relevant.  So it
all takes patience and persistence.

 Also note that "An Introduction to R" mostly uses the word "startup"
(lower case "s") and only uses "Startup" twice.  Note also that

   help(startup)

 fails.  You have to get that initial "S" right.

 This isn't a criticism of the documentation.  I'm just pointing out
that there are problems, mostly insoluble.  Until some clever Johnny gets
on with developing that mind_read() function referred to in fortune(182).

 cheers,

 Rolf Turner

 --
 Technical Editor ANZJS
 Department of Statistics
 University of Auckland
 Phone: +64-9-373-7599 ext. 88276

Re: [R] Setting .Rprofile for RStudio on a Windows 7 x64bit

2017-04-15 Thread Henrik Bengtsson
Hi.

First, there should be no difference in where and how R and RStudio
locate the R startup file.

Second, if there is an .Rprofile in the working directory (i.e.
./.Rprofile), then that file will have higher priority than the file
located in ~/.Rprofile.  You can use the following R calls, also on
Windows, to check if you have either of these two files:

> file <- normalizePath("./.Rprofile")
> file
> file.exists(file)

> file <- normalizePath("~/.Rprofile")
> file
> file.exists(file)

In my case, my working directory is C:/Users/hb/Documents/Projects/, I
have a ~/.Rprofile file, but not a .Rprofile in the working directory.
So, I get:

> file <- normalizePath("./.Rprofile")
> file
[1] "C:\\Users\\hb\\Documents\\Projects\\.Rprofile"
> file.exists(file)
[1] FALSE

> file <- normalizePath("~/.Rprofile")
> file
[1] "C:\\Users\\hb\\Documents\\.Rprofile"
> file.exists(file)
[1] TRUE

This tells me that my startup file that R tries to load / source
during startup is "C:\\Users\\hb\\Documents\\.Rprofile" and that's the
one I should edit.
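
The same check can be wrapped in a small function.  This is only a sketch: find_user_profile() is a made-up name (it is not part of base R), and it assumes the search order described in ?Startup, i.e. R_PROFILE_USER first, then the working directory, then the home directory:

```r
# Hypothetical helper, not part of base R.  Returns the path of the user
# profile that would be found, or NA if there is none.
find_user_profile <- function(wd = getwd(), home = path.expand("~"),
                              env = Sys.getenv("R_PROFILE_USER")) {
  # An explicitly set R_PROFILE_USER takes precedence over both locations
  if (nzchar(env)) return(path.expand(env))
  # Otherwise the working directory is searched before the home directory
  candidates <- file.path(c(wd, home), ".Rprofile")
  found <- candidates[file.exists(candidates)]
  if (length(found) > 0L) found[1L] else NA_character_
}
```

Calling find_user_profile() without arguments should then name the file to edit, or give NA if neither file exists.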

BTW, normalizePath("~/.Rprofile") and
file.path(Sys.getenv("HOME"), ".Rprofile") should point to the same
file, except that normalizePath() turns all slashes into backslashes
on Windows; the former is just a neater version to use:

> normalizePath("~/.Rprofile")
[1] "C:\\Users\\hb\\Documents\\.Rprofile"

> file.path(Sys.getenv("HOME"), ".Rprofile")
[1] "C:/Users/hb/Documents/.Rprofile"

> normalizePath(file.path(Sys.getenv("HOME"), ".Rprofile"))
[1] "C:\\Users\\hb\\Documents\\.Rprofile"

(all of the above reference the same file).

So, if file.exists(normalizePath("~/.Rprofile")) gives FALSE, then you
don't have that file.  If you think you've edited that, then it might
be that you hit the peculiar Windows property where it hides the
filename extension from you in the Explorer.  It might be that you
instead have created / edited the file:

normalizePath("~/.Rprofile.txt")

That often happens when one uses Notepad and saves the file as
.Rprofile - Notepad simply adds a *.txt filename extension unless you
save it with quotation marks in the Save-As panel.
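
A quick way to spot such a lookalike from within R is to list everything in the home directory whose name starts with ".Rprofile".  A minimal sketch; list_profile_lookalikes() is a made-up name:

```r
# Hypothetical helper: list files in 'dir' whose names start with
# ".Rprofile", e.g. ".Rprofile" itself or an accidental ".Rprofile.txt".
list_profile_lookalikes <- function(dir = path.expand("~")) {
  # all.files = TRUE is needed because the names start with a dot
  list.files(dir, pattern = "^\\.Rprofile", all.files = TRUE)
}
```

If this returns ".Rprofile.txt" but not ".Rprofile", renaming the file with file.rename() should be enough.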

Now, if you indeed have the file:

normalizePath("~/.Rprofile")

then there is one last annoyance in R that you might have hit.  If the
last line in that file does not end with a newline, then the file will
be silently ignored by R when R starts.  There won't be a warning - not
even a message.  That is true for all OSes.  It's a "feature" that
should really be fixed, because I keep seeing it trick beginners
and advanced R users all the time.  The easiest way to check whether
this is your problem is to use readLines() to read in the content;
readLines() will give a warning if the last line doesn't end with a
newline, e.g.

> readLines(normalizePath("~/.Rprofile"))
[1] "options(prompt=\"R> \")" "set.seed(12345)"
Warning message:
In readLines(normalizePath("~/.Rprofile")) :
  incomplete final line found on 'C:\Users\hb\Documents\.Rprofile'

If you don't see the warning message, you should be fine.
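
If you prefer a TRUE/FALSE answer over watching for a warning, the last byte of the file can be inspected directly.  A minimal sketch; ends_with_newline() is a made-up name:

```r
# Hypothetical helper: TRUE if the file's last byte is a newline
ends_with_newline <- function(path) {
  size <- file.size(path)
  if (is.na(size)) stop("No such file: ", path)
  if (size == 0) return(TRUE)  # an empty file has no incomplete line
  bytes <- readBin(path, what = "raw", n = size)
  bytes[length(bytes)] == as.raw(10L)  # 10 is the byte value of '\n'
}
```

It would be called as, say, ends_with_newline(path.expand("~/.Rprofile")).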

Finally, an easy way to setup a ~/.Rprofile startup file is to do it
from within R, e.g.

> cat('options(prompt="R> ")\n', file = "~/.Rprofile")
> cat('set.seed(12345)\n', file = "~/.Rprofile", append = TRUE)

The '\n' at the end of each string represents a newline character, so
make sure you don't forget those.

Hope this helps

Henrik



On Sat, Apr 15, 2017 at 1:10 PM, David Winsemius  wrote:
>
>> On Apr 15, 2017, at 12:46 PM, Boris Steipe  wrote:
>>
>> As with R, do with RStudio: Read The Beautiful Manual, and peruse The 
>> Google. For example, searching Google with the two (admittedly hard to 
>> guess) cryptograms:
>>  "RStudio Rprofile"
>>
>> will present more than a dozen most enlightening links to fulfil your desire.
>>
>> Perhaps the following link works better for you though:
>>  https://www.bing.com/search?q=rstudio+rprofile
>
> Another promising search strategy would be SO with "[rstudio]" in the tags:
>
> http://stackoverflow.com/search?q=%5Brstudio%5D+rprofile+windows
>
> --
> david.
>>
>> B.
>>
>>
>>> On Apr 15, 2017, at 3:14 PM, BR_email  wrote:
>>>
>>> Bill:
>>> Thanks for reply.
>>> Sorry, I do not understand it.
>>> For example, where do I put "file.path(getwd(), ".Rprofile")" ?
>>>
>>> Bruce
>>>
>>>
>>> William Dunlap wrote:
 I think the site-specific R profile should be, using R syntax
   file.path(R.home("etc"), "Rprofile.site") # no dot before the capital R
 The personal R profile will be
   file.path(Sys.getenv("HOME"), ".Rprofile") # there is a dot before 
 capital R
 but if a local R profile,
   file.path(getwd(), ".Rprofile") # there is a dot before capital R
 exists it will be used and the one in HOME will not be.  (getwd() should
 be the startup directory.)


 Bill Dunlap
 TIBCO Software
 wdunlap tibco.com


 On Sat, Apr 15, 2017 at 9:06 AM, BR_email  wrote:

[R] The R-help community list was started on this day 20 years ago

2017-04-01 Thread Henrik Bengtsson
Today, it has been 20 years since Martin Mächler started the R-help
community list (https://stat.ethz.ch/pipermail/r-help/). The first
post was written by Ross Ihaka on 1997-04-01:

Subject: R-alpha: R-testers: pmin heisenbug
From: Ross Ihaka 
When: Tue Apr 1 10:35:48 CEST 1997
Archive: https://stat.ethz.ch/pipermail/r-help/1997-April/001488.html

This is a post about R's memory model. We're talking R v0.50 beta. I
think that the paragraph at the end provides a nice anecdote on the
importance of not being overwhelmed by the problems ahead:

   "(The consumption of one cell per string is perhaps the major
memory problem in R - we didn't design it with large problems in mind.
It is probably fixable, but it will mean a lot of work)."

We all know the story; an endless number of hours has been put in by
many contributors throughout the years, making The R Project and its
community the great experience it is today.

Thank you!

Henrik

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Setting up a .Rprofile file

2017-03-24 Thread Henrik Bengtsson
R for Windows is a bit peculiar in where it locates your .Rprofile file,
or rather in what it considers to be your home directory.  If you look
from within R, the file you do want to create / edit is:

> f <- normalizePath("~/.Rprofile", mustWork = FALSE)
> f
[1] "C:\\Users\\joe\\Documents\\.Rprofile"

For instance, to change what your R prompt looks like, you can add this
from within R:

> cat('options(prompt = "> R ")\n', file = f, append = TRUE)
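
The three settings from the original question could be written in one go with writeLines(), which ends every line with a newline.  This is a sketch only: the tempfile() target is a stand-in for 'f' above so that nothing real is overwritten, and the directory C:/R_WorkDir is taken from the question and assumed to exist when R starts:

```r
# Stand-in target so nothing real is overwritten; in practice use
# f <- normalizePath("~/.Rprofile", mustWork = FALSE)
f <- tempfile(fileext = ".Rprofile")

settings <- c(
  'setwd("C:/R_WorkDir")',     # directory from the question; must exist
  'set.seed(12345)',
  'options(prompt = "> R ")'   # note the '=' that the question left out
)

# writeLines() terminates each element, including the last, with a newline
writeLines(settings, con = f)

# Sanity check: the file parses as three R expressions
length(parse(file = f))
```

Unlike cat(..., append = TRUE), this replaces whatever was in the file before, so it is best used on a fresh file.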

Hope this helps

Henrik

On Fri, Mar 24, 2017 at 2:36 AM, Bruce Ratner PhD  wrote:
> Henrico:
> Thanks for quick reply.
> However, one last question:
> If I want to change working directory, and put setwd() in the Rprofile file, 
> logically R will not know where the be work directory is, correct?
>
> So, should I install R in my preferred working directory?
>
> Thanks again, in advance.
> Bruce
>
>
> __
> Bruce Ratner PhD
> The Significant Statistician™
> (516) 791-3544
> Statistical Predictive Analytics -- www.DMSTAT1.com
> Machine-Learning Data Mining -- www.GenIQ.net
>
>
>
>> On Mar 24, 2017, at 3:48 AM, Enrico Schumann  wrote:
>>
>> On Thu, 23 Mar 2017, Bruce Ratner PhD writes:
>>
>>> Hi R'ers:
>>> I would like to setting up a .Rprofile file with
>>> setwd("C:/R_WorkDir")
>>> set.seed(12345)
>>> options (prompt "> R ")
>>>
>>> ---
>>> Can you help providing the code or instructive link,
>>> I've find many links, but I can't figure it out?
>>>
>>> Thanks.
>>> Bruce
>>>
>>
>> Quoting from ?Startup:
>>
>> ,
>> | [...] unless ‘--no-init-file’ was given, R searches
>> | for a user profile, a file of R code.  The path of
>> | this file can be specified by the ‘R_PROFILE_USER’
>> | environment variable (and tilde expansion will be
>> | performed).  If this is unset, a file called
>> | ‘.Rprofile’ is searched for in the current directory
>> | or in the user's home directory (in that order).  The
>> | user profile file is sourced into the workspace.
>> `
>>
>> --
>> Enrico Schumann
>> Lucerne, Switzerland
>> http://enricoschumann.net
>>
>


Re: [R] How to do double (nested) parSapply in R?

2017-02-02 Thread Henrik Bengtsson
Quick comment: sapply() / parSapply() can behave "unexpectedly". To
troubleshoot this, use parLapply() instead to see if you at least get the
individual results you think you should get.
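
To illustrate what "unexpectedly" can mean: sapply() simplifies its result whenever it can, and parSapply() simplifies in the same way.  The sketch below uses the serial functions so that it runs without a cluster; parLapply(cl, ...) and parSapply(cl, ...) take the same arguments after 'cl'.  The toy function is made up:

```r
# Each call returns a 2-by-2 matrix
f <- function(i) matrix(i, nrow = 2, ncol = 2)

res_list <- lapply(1:3, f)   # a list of three 2-by-2 matrices
res_simp <- sapply(1:3, f)   # one 4-by-3 matrix; each matrix became a column

str(res_list)
dim(res_simp)
```

With the list version it is much easier to see whether each individual result has the shape you expect before anything is combined.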

Henrik

On Feb 2, 2017 08:03, "Art U"  wrote:

> Hello,
>
> I have a data orig.raw that contains 8 predictors and 1 outcome variable.
> I'm trying to run simulation (bootstrap) and create, let's say, 10
> confidence intervals for coefficients estimated by LASSO. I did it with
> regular function replicate, but now I want to do it by using parallel
> programming. Here is my code:
>
> cl <- makeCluster(detectCores())
> clusterEvalQ(cl,library(glmnet))
> clusterEvalQ(cl,library(MASS))
> clusterExport(cl,c("orig.raw"))
>
> pp=parSapply(cl,1:10,function(i,data=orig.raw,...){
>   library(parallel)
>   cl <- makeCluster(detectCores())
>   clusterEvalQ(cl,library(glmnet))
>   clusterEvalQ(cl,library(MASS))
>   clusterExport(cl,c("orig.raw"))
>
>   repl=parSapply(cl=cl,1:10,function(i,data=orig.raw,...){
> s1=data[sample(nrow(data),size=500,replace=TRUE),]
> cross=cv.glmnet(s1[,1:8],s1[,9])
> penalty <- cross$lambda.min
> fit=glmnet(s1[,1:8],s1[,9],alpha=1,lambda=penalty)
> coe=coef(fit)
> m=as(coe, "matrix")
> return(m)
>   })
>
> stopCluster(cl)
>
>   mr=t(matrix(repl,nrow = 9,ncol=10))
>   means=colMeans(mr)
>   std=apply(mr, 2, sd)
>   lb=means-1.96*std;
>   ub=means+1.96*std;
>   ind=t(as.numeric({beta>lb & beta<ub}))
>   return(ind)})
> stopCluster(cl)
>
> And here is the error I'm getting
>
> Error in checkForRemoteErrors(val) : 8 nodes produced errors; first
> error: comparison (6) is possible only for atomic and list types
>
> If I run only function repl - it works and I get the matrix that contains
> coefficients from 10 runs.
>
> Can you please help me to solve the problem?
> Regards,
> Ariel
>
>



Re: [R] [FORGED] file.exists() on device files

2017-01-11 Thread Henrik Bengtsson
On Wed, Jan 11, 2017 at 3:56 PM, Rolf Turner  wrote:
> On 11/01/17 23:12, Benjamin Tyner wrote:
>>
>> Hi,
>>
>> On my linux machine (Ubuntu, and also tested on RHEL), I am curious to
>> know what might be causing file.exists (and also normalizePath) to not
>> see the final device file here:
>>
>>> list.files("/dev/fd", full.names = TRUE)
>>[1] "/dev/fd/0" "/dev/fd/1" "/dev/fd/2" "/dev/fd/3"
>>> file.exists(list.files("/dev/fd", full.names = TRUE))
>>[1]  TRUE  TRUE  TRUE FALSE
>>> normalizePath(list.files("/dev/fd", full.names = TRUE))
>>[1] "/dev/pts/2" "/dev/pts/2" "/dev/pts/2" "/dev/fd/3"
>>Warning message:
>>In normalizePath(list.files("/dev/fd", full.names = TRUE)) :
>>  path[4]="/dev/fd/3": No such file or directory
>
>
> (a) Exactly the same thing happens to me (I am also running Ubuntu 16.04).
>
> (b) Things get a bit confused because /dev/fd is actually a symbolic link:
>
>> $ ls -l /dev/fd
>> lrwxrwxrwx 1 root root 13 Jan  6 20:16 /dev/fd -> /proc/self/fd/
>
>
> (c) But then doing
>
>>> file.exists(list.files("/proc/self/fd", full.names = TRUE))
>
>
> Gives the same result as before:
>
>> [1]  TRUE  TRUE  TRUE FALSE
>
>
> (d) It turns out that the four "files" in /proc/self/fd are again
> symbolic links:
>
>> $ ls -l /proc/self/fd
>> total 0
>> lrwx------ 1 rolf rolf 64 Jan 12 12:32 0 -> /dev/pts/3
>> lrwx------ 1 rolf rolf 64 Jan 12 12:32 1 -> /dev/pts/3
>> lrwx------ 1 rolf rolf 64 Jan 12 12:32 2 -> /dev/pts/3
>> lr-x------ 1 rolf rolf 64 Jan 12 12:32 3 -> /proc/7150/fd/
>
>
> (e) But now do it again!!!
>
>> $ ls -l /proc/self/fd
>> total 0
>> lrwx------ 1 rolf rolf 64 Jan 12 12:32 0 -> /dev/pts/3
>> lrwx------ 1 rolf rolf 64 Jan 12 12:32 1 -> /dev/pts/3
>> lrwx------ 1 rolf rolf 64 Jan 12 12:32 2 -> /dev/pts/3
>> lr-x------ 1 rolf rolf 64 Jan 12 12:32 3 -> /proc/7154/fd/
>
>
> Different number; 7154 rather than 7150.
>
> (f) The name "/proc" would seem to imply that this has something to do with
> processes; the directories "7150", "7154" etc. are being created and removed
> on the fly, as a result of some process (presumably the "ls"
> process) starting and finishing.

FYI, /proc is there because Unix has something called the proc
filesystem (procfs; https://en.wikipedia.org/wiki/Procfs), "a special
filesystem in Unix-like operating systems that presents information
about processes and other system information in a hierarchical
file-like structure".  For instance, you can query the uptime of the
machine by reading from /proc/uptime:

$ cat /proc/uptime
332826.96 661438.10

$ cat /proc/uptime
332871.40 661568.50


You can get all IDs (PIDs) of all processes currently running:

$ ls /proc/ | grep -E '^[0-9]+$'

and for each process there are multiple attributes mapped as
files, e.g. if I start R as:

$ R --args -e "message('hello there')"

then I can query that process as:

$ pid=$(pidof R)
$ echo $pid
26323

$ cat /proc/26323/cmdline
/usr/lib/R/bin/exec/R--args-emessage('hello there')
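
The arguments look glued together because the kernel separates them with NUL bytes, which the terminal does not show.  A rough R sketch of reading the same file (Linux only; split_nul() and read_cmdline() are made-up names):

```r
# Split a raw vector into strings at NUL (0x00) bytes
split_nul <- function(bytes) {
  if (length(bytes) == 0L) return(character(0))
  nul <- as.raw(0L)
  # Make sure the last field is terminated as well
  if (bytes[length(bytes)] != nul) bytes <- c(bytes, nul)
  ends <- which(bytes == nul)
  starts <- c(1L, ends[-length(ends)] + 1L)
  mapply(function(s, e) rawToChar(bytes[seq_len(e - s) + s - 1L]),
         starts, ends, USE.NAMES = FALSE)
}

# Read the command line of a process from procfs; NULL if unavailable
read_cmdline <- function(pid = Sys.getpid()) {
  path <- file.path("/proc", pid, "cmdline")
  if (!file.exists(path)) return(NULL)
  # procfs files report size 0, so read up to a fixed upper bound instead
  split_nul(readBin(path, what = "raw", n = 65536L))
}
```

Calling read_cmdline() from within R should then give one element per command-line argument of the running R process.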

Unix is neat

/Henrik

>
> I have no insight into what is being effected here, or what is really going
> on "deep down", but the foregoing is some sort of "explanation".
> By the time file.exists() is invoked, the ls process called by list.files()
> has finished and the associated directory (e.g. "7150", "7154", ...) has
> ceased to be.
>
> What you do with this "explanation" is up to you.  My advice would be to
> forget about it and go to the pub! :-)
>
> cheers,
>
> Rolf Turner
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>
>



Re: [R] getTimeLimit?

2017-01-03 Thread Henrik Bengtsson
On Tue, Jan 3, 2017 at 2:43 PM, William Dunlap via R-help
 wrote:
> I am interested in measuring the time it takes to run an expression.
>  system.time(expr) does this but I would like to have it report just 'more
> than xxx seconds', where xxx is a argument to the timing function, when it
> takes a long time.  This is useful automating the process of seeing how
> processing time grows with the size of a dataset.
>
> My latest attempt is the following function, which adds a 'censored=TRUE'
> attribute when the cpu or elapsed time exceeds some limit:
>
> system.time2 <- function (expr, gcFirst = TRUE, cpu = Inf, elapsed = Inf)
> {
> setTimeLimit(cpu = cpu, elapsed = elapsed, transient = TRUE)
> censored <- NULL
> time <- system.time(gcFirst = gcFirst, tryCatch(expr, error =
> function(e) if (grepl("reached (CPU|elapsed) time limit",
> conditionMessage(e)))
> censored <<- conditionMessage(e)
> else stop(e)))
> attr(time, "censored") <- censored
> time
> }
>
> It would be used as
>
>> system.time(times <- lapply(10^(1:7), function(n)system.time2(for(i in
> 1:n)lgamma(1:i), elapsed=10) ))
>    user  system elapsed
>   33.55    0.25   33.82
>> vapply(times, function(t)t[["elapsed"]], 0)
> [1]  0.02  0.00  0.03  3.08 10.02 10.14 10.18
>> # following gives which times are valid
>> vapply(times, function(t)is.null(attr(t,"censored")), NA)
> [1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE
>
> I have two questions.
> * Is this a reasonable way to compute such a censored time?
> * Is there a getTimeLimit()-like function?

I also wanted such a function, but I don't think it exists.
Internally in R the timeout is set in global variables 'cpuLimitValue'
and 'elapsedLimitValue'.  Grepping the source for it doesn't reveal
any external access to it, e.g.

$ grep -F "cpuLimitValue" -r --include="*.h" src/
src/include/Defn.h:extern0 double cpuLimitValue INI_as(-1.0);

$ grep -F "cpuLimitValue" -r --include="*.c" src/
src/main/sysutils.c:cpuLimit = (cpuLimitValue > 0) ? data[0] +
data[1] + cpuLimitValue : -1.0;
src/main/sysutils.c:cpuLimit = (cpuLimitValue > 0) ? data[0] +
data[1] + data[3] + data[4] + cpuLimitValue : -1.0;
src/main/sysutils.c:double cpu, elapsed, old_cpu = cpuLimitValue,
src/main/sysutils.c:if (R_FINITE(cpu) && cpu > 0) cpuLimitValue =
cpu; else cpuLimitValue = -1;
src/main/sysutils.c: cpuLimitValue = old_cpu;

Similar for 'elapsedLimitValue'.

>
> Also, I think it would be nice if the error thrown when timing out had a
> special class so I didn't have to rely on grepping the error message, but
> that is true of lots of errors.

FYI, R.utils::withTimeout() greps the error message (for any language;
https://github.com/HenrikBengtsson/R.utils/blob/2.5.0/R/withTimeout.R#L113-L114)
this way and returns an error of class TimeoutException.
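
Until there is a dedicated condition class, the timeout can be re-signalled under a class of one's own.  The following is a base-R sketch of that idea, not the actual R.utils code; with_time_limit() and the class name "timeout_error" are made up, and the grepl() pattern only matches the English error message:

```r
with_time_limit <- function(expr, elapsed = Inf) {
  # Always lift the limit again on exit, even if 'expr' fails
  on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
  setTimeLimit(elapsed = elapsed, transient = TRUE)
  tryCatch(expr, error = function(e) {
    msg <- conditionMessage(e)
    if (grepl("reached (CPU|elapsed) time limit", msg)) {
      # Re-signal as a classed condition so callers need not grep messages
      stop(structure(
        class = c("timeout_error", "error", "condition"),
        list(message = msg, call = conditionCall(e))
      ))
    }
    stop(e)
  })
}

# Usage: callers can now catch the timeout by class
res <- tryCatch(
  with_time_limit(for (i in 1:100) Sys.sleep(0.05), elapsed = 0.2),
  timeout_error = function(e) "timed out"
)
```

Because 'expr' is evaluated lazily inside tryCatch(), the limit is already in place when the expression starts running.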

FYI 2, there is a 'Working group for standard error (condition)
classes' proposal on the RConsortium "wishlist", cf.
https://github.com/RConsortium/wishlist/issues/6.

/Henrik

>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>



Re: [R] Memory problem

2016-11-22 Thread Henrik Bengtsson
On Windows 32-bit I think (it's been a while) you can push it to 3 GB but
to go beyond you need to run R  on 64-bit Windows (same rule for all
software not just R). I'm pretty sure this is already documented in the R
documentation.
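
A quick way to see which kind of R build is running is to look at the pointer size.  A small sketch; note that it describes the R build, not the Windows installation:

```r
# 4-byte pointers mean a 32-bit build of R, 8-byte pointers a 64-bit build
address_bits <- 8L * .Machine$sizeof.pointer
address_bits
```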

Henrik

On Nov 22, 2016 19:49, "Ista Zahn"  wrote:

Not conveniently. Memory is cheap, you should buy more.

Best,
Ista

On Nov 22, 2016 12:19 PM, "Partha Sinha"  wrote:

>  I am using R 3.3.2 on win 7, 32 bit with 2gb Ram. Is it possible to use
> more than 2 Gb data set ?
>
> Regards
> Partha
>
>



Re: [R] Is this foreach behaviour correct?

2016-11-13 Thread Henrik Bengtsson
On Nov 13, 2016 13:54, "Henrik Bengtsson" <henrik.bengts...@gmail.com>
wrote:
>
> It looks like a bug.  I don't think c.Date() is ever called, because:
>
> > trace(c.Date, tracer = quote(message("c.Date() called")))
> Tracing function "c.Date" in package "base"
> [1] "c.Date"
>
> Tracing works:
>
> > c(as.Date(10000L), as.Date(10001L))
> Tracing c.Date(as.Date(10000L), as.Date(10001L)) on entry
> c.Date() called
> [1] "1997-05-19" "1997-05-20"
>
> but c.Date() is not called here:
>
> > x <- foreach(i=10000:10100, .combine = function(...) c(...)) %do% { as.Date(i) }
> > str(x)
>  num [1:101] 10000 10001 10002 10003 10004 ...

Cut'n'paste error above. It is

x <- foreach(i=10000:10100, .combine = "c") %do% { as.Date(i) }

that doesn't call c.Date(). Same if you try with .combine = c.

>
>
> The following hack works:
>
> > library("foreach")
> > library("zoo")
> > x <- foreach(i=10000:10100, .combine = function(...) c(...)) %do% { as.Date(i) }
> > str(x)
>  Date[1:101], format: "1997-05-19" "1997-05-20" "1997-05-21" "1997-05-22" ...
>
> Alternatively, one can use append() which works like c() if no other
> arguments are specified:
>
> > x <- foreach(i=10000:10100, .combine = append) %do% { as.Date(i) }
> > str(x)
>  Date[1:101], format: "1997-05-19" "1997-05-20" "1997-05-21" "1997-05-22" ...
>
> It looks like foreach is treating the .combine = c case specially and
> somehow fails to properly dispatch c() on the object (or something).
>
> /Henrik
>
>
> On Sun, Nov 13, 2016 at 7:14 AM, James Hirschorn
> <james.hirsch...@hotmail.com> wrote:
> > I'm still not clear about whether this is a bug in foreach. Should
c.Date be invoked by foreach with .combine='c'?
> >
> > On 11/06/2016 07:02 PM, William Dunlap wrote:
> > Note that in the OP's example c.Date is never invoked.  c.Date is
called if .combine
> > calls c rather than if .combine is c:
> >
> >> library(zoo)
> >> trace(c.Date, quote(print(sys.call())))
> > Tracing function "c.Date" in package "base"
> > [1] "c.Date"
> >> foreach(i=10000:10003, .combine=c) %do% { as.Date(i) }
> > [1] 10000 10001 10002 10003
> >> foreach(i=10000:10003, .combine=function(...)c(...)) %do% { as.Date(i) }
> > Tracing c.Date(...) on entry
> > eval(expr, envir, enclos)
> > Tracing c.Date(...) on entry
> > eval(expr, envir, enclos)
> > Tracing c.Date(...) on entry
> > eval(expr, envir, enclos)
> > [1] "1997-05-19" "1997-05-20" "1997-05-21" "1997-05-22"
> >
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com<http://tibco.com>
> >
> > On Sun, Nov 6, 2016 at 2:20 PM, Duncan Murdoch <murdoch.dun...@gmail.com
<mailto:murdoch.dun...@gmail.com>> wrote:
> > On 06/11/2016 5:02 PM, Jim Lemon wrote:
> > hi James,
> > I think you have to have a starting date ("origin") for as.Date to
> > convert numbers to dates.
> >
> > That's true with the function in the base package, but the zoo package
also has an as.Date() function, which defaults the origin to "1970-01-01".
If James is using zoo his code would be okay.  If he's not, he would have
got an error, so I think he must have been.
> >
> > Duncan Murdoch
> >
> >
> >
> > Jim
> >
> > On Sun, Nov 6, 2016 at 12:10 PM, James Hirschorn
> > <james.hirsch...@hotmail.com<mailto:james.hirsch...@hotmail.com>> wrote:
> > This seemed odd so I wanted to check:
> >
> >  > x <- foreach(i=10000:10100, .combine='c') %do% { as.Date(i) }
> >
> > yields a numeric vector for x:
> >
> >  > class(x)
> > [1] "numeric"
> >
> > Should it not be a vector of Date?
> >

Re: [R] Is this foreach behaviour correct?

2016-11-13 Thread Henrik Bengtsson
It looks like a bug.  I don't think c.Date() is ever called, because:

> trace(c.Date, tracer = quote(message("c.Date() called")))
Tracing function "c.Date" in package "base"
[1] "c.Date"

Tracing works:

> c(as.Date(10000L), as.Date(10001L))
Tracing c.Date(as.Date(10000L), as.Date(10001L)) on entry
c.Date() called
[1] "1997-05-19" "1997-05-20"

but c.Date() is not called here:

> x <- foreach(i=10000:10100, .combine = "c") %do% { as.Date(i) }
> str(x)
 num [1:101] 10000 10001 10002 10003 10004 ...


The following hack works:

> library("foreach")
> library("zoo")
> x <- foreach(i=10000:10100, .combine = function(...) c(...)) %do% { as.Date(i) }
> str(x)
 Date[1:101], format: "1997-05-19" "1997-05-20" "1997-05-21" "1997-05-22" ...

Alternatively, one can use append() which works like c() if no other
arguments are specified:

> x <- foreach(i=10000:10100, .combine = append) %do% { as.Date(i) }
> str(x)
 Date[1:101], format: "1997-05-19" "1997-05-20" "1997-05-21" "1997-05-22" ...

It looks like foreach is treating the .combine = c case specially and
somehow fails to properly dispatch c() on the object (or something).
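
What foreach does internally is not shown here, so the following is only a base-R illustration of the general effect: the Date class survives when c() is dispatched on the Date objects, but it is lost when the underlying numbers are combined without dispatch, as unlist() does.  The origin is given explicitly so that base::as.Date() accepts the numbers:

```r
dates <- lapply(10000:10002, as.Date, origin = "1970-01-01")

# c() dispatches to c.Date(), so the result is still a Date vector
x1 <- do.call(c, dates)
class(x1)

# unlist() combines the underlying numbers and drops the class attribute
x2 <- unlist(dates)
class(x2)
```

If a combine step ever hands back plain numbers like x2, the class can be restored afterwards with as.Date(x2, origin = "1970-01-01").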

/Henrik


On Sun, Nov 13, 2016 at 7:14 AM, James Hirschorn
 wrote:
> I'm still not clear about whether this is a bug in foreach. Should c.Date be 
> invoked by foreach with .combine='c'?
>
> On 11/06/2016 07:02 PM, William Dunlap wrote:
> Note that in the OP's example c.Date is never invoked.  c.Date is called if 
> .combine
> calls c rather than if .combine is c:
>
>> library(zoo)
>> trace(c.Date, quote(print(sys.call())))
> Tracing function "c.Date" in package "base"
> [1] "c.Date"
>> foreach(i=10000:10003, .combine=c) %do% { as.Date(i) }
> [1] 10000 10001 10002 10003
>> foreach(i=10000:10003, .combine=function(...)c(...)) %do% { as.Date(i) }
> Tracing c.Date(...) on entry
> eval(expr, envir, enclos)
> Tracing c.Date(...) on entry
> eval(expr, envir, enclos)
> Tracing c.Date(...) on entry
> eval(expr, envir, enclos)
> [1] "1997-05-19" "1997-05-20" "1997-05-21" "1997-05-22"
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Sun, Nov 6, 2016 at 2:20 PM, Duncan Murdoch 
> > wrote:
> On 06/11/2016 5:02 PM, Jim Lemon wrote:
> hi James,
> I think you have to have a starting date ("origin") for as.Date to
> convert numbers to dates.
>
> That's true with the function in the base package, but the zoo package also 
> has an as.Date() function, which defaults the origin to "1970-01-01".  If 
> James is using zoo his code would be okay.  If he's not, he would have got an 
> error, so I think he must have been.
>
> Duncan Murdoch
>
>
>
> Jim
>
> On Sun, Nov 6, 2016 at 12:10 PM, James Hirschorn
> > wrote:
> This seemed odd so I wanted to check:
>
>  > x <- foreach(i=10000:10100, .combine='c') %do% { as.Date(i) }
>
> yields a numeric vector for x:
>
>  > class(x)
> [1] "numeric"
>
> Should it not be a vector of Date?
>



Re: [R] remove a "corrupted file" after using download.file() with R on Windows 7

2016-09-29 Thread Henrik Bengtsson
1. It could be that a virus checker locks the file.

2. There are Windows software tools that identify which process locks
a particular file, e.g. LockHunter (http://lockhunter.com/).  Those
should help you figure out what's going on.

3. R.utils::downloadFile() tries its best to download files
atomically, i.e. it either gives you a fully downloaded file or none
at all.  In your case, you might still end up with a temporary corrupt
file, but at least it will have a filename that is different from the
one you asked for.
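
The general idea of an atomic download can be sketched in base R as follows.  This is not how R.utils::downloadFile() is implemented; download_atomically() and the ".tmp" suffix are made up for illustration:

```r
download_atomically <- function(url, destfile, ...) {
  tmpfile <- paste0(destfile, ".tmp")
  # Treat both errors and warnings from download.file() as failures
  ok <- tryCatch(
    download.file(url, tmpfile, mode = "wb", quiet = TRUE, ...) == 0L,
    error = function(e) FALSE,
    warning = function(w) FALSE
  )
  # Only a complete download is renamed to the requested name
  if (ok && file.rename(tmpfile, destfile)) return(TRUE)
  # Best-effort cleanup; this may still fail if another process locks the file
  if (file.exists(tmpfile)) unlink(tmpfile)
  FALSE
}
```

The point of the design is that 'destfile' only ever exists as a complete file; whatever goes wrong is confined to the temporary name.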

Hope this helps

/Henrik

On Wed, Sep 28, 2016 at 9:32 PM, Fabien Tarrade
 wrote:
> Hi there,
>
> Sometime download.file() failed to download the file and I would like to
> remove the correspond file.
> The issue is that I am not able to do it and Windows complain that the file
> is use by another application.
> I try to closeAllConnections(), or unlink() before removing the file but
> without sucess.
>
> Any idea how I should proceed &
>
> Please find the code below
>
>  # consider warning as an error
>   options(warn=2)
>
>   # try to download the file
>   tryCatch({
> download.file(url,path_file,mode="wb",quiet=quiet)
> return(0)
>   },error = function(e){
> if(verbose){
>   print(e)
>   print(e$message)
> }
> # close file when it failed
> if (file.exists(path_file)){
>   closeAllConnections()
>   #unlink(path_file, recursive=TRUE)
>   #file.create(path_file,overwrite=TRUE,showWarning=TRUE)
>   #system(paste0('open "', path_file, '"'))
>   file.remove(path_file,overwrite=TRUE,showWarning=TRUE)
> }
> return(1)
> }
> )
>
> Thanks a lot
> Cheers
> Fabien
>
> --
> Dr Fabien Tarrade
>
> Quantitative Analyst/Developer - Data Scientist
>
> Senior data analyst specialised in the modelling, processing and statistical
> treatment of data.
> PhD in Physics, 10 years of experience as researcher at the forefront of
> international scientific research.
> Fascinated by finance and data modelling.
>
> Geneva, Switzerland
>
> Email : cont...@fabien-tarrade.eu 
> Phone : www.fabien-tarrade.eu 
> Phone : +33 (0)6 14 78 70 90
>
>



Re: [R] Replacing value with "1"

2016-09-23 Thread Henrik Bengtsson
which(df == 1, arr.ind=TRUE) is useful here:

> df <- matrix(c(0,NA,0,0,0,1,1,1,0,0,1,0,0,0,NA), nrow=3)
> df
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    1    0    0
[2,]   NA    0    1    1    0
[3,]    0    1    0    0   NA

> ## Identify (row,col) indices for 1:s
> idxs <- which(df == 1, arr.ind=TRUE)
> idxs
     row col
[1,]   3   2
[2,]   1   3
[3,]   2   3
[4,]   2   4

> ## Drop any in the last column
> idxs <- idxs[idxs[,"col"] < ncol(df), , drop=FALSE]
> idxs
     row col
[1,]   3   2
[2,]   1   3
[3,]   2   3
[4,]   2   4

> idxs[,"col"] <- idxs[,"col"] + 1L
> idxs
     row col
[1,]   3   3
[2,]   1   4
[3,]   2   4
[4,]   2   5

> df[idxs] <- 1
> df
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    1    1    0
[2,]   NA    0    1    1    1
[3,]    0    1    1    0   NA
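
The steps above can be wrapped into a small function, which may be
handy for the full 1565-by-132 matrix.  This is just a sketch; the name
propagate_ones() is made up here.  Note that a cell following a 1 is
set to 1 even if it was NA.

```r
# Hypothetical wrapper around the which(..., arr.ind=TRUE) approach:
# for every 1, set the next value in the same row to 1.
propagate_ones <- function(x) {
  idxs <- which(x == 1, arr.ind = TRUE)                    # (row, col) of all 1s; NAs are skipped
  idxs <- idxs[idxs[, "col"] < ncol(x), , drop = FALSE]    # nothing to the right of the last column
  idxs[, "col"] <- idxs[, "col"] + 1L                      # move one step to the right
  x[idxs] <- 1                                             # index by a two-column matrix
  x
}

df <- matrix(c(0,NA,0,0,0,1,1,1,0,0,1,0,0,0,NA), nrow = 3)
res <- propagate_ones(df)
print(res)
```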

/Henrik

On Thu, Sep 22, 2016 at 8:13 PM, Jim Lemon  wrote:
> Hi Saba,
> Try this:
>
> df<-matrix(c(0,NA,0,0,0,1,1,1,0,0,1,0,0,0,NA),nrow=3)
> dimdf<-dim(df)
> df1<-df==1
> df[cbind(rep(FALSE,dimdf[1]),df1[,-dimdf[2]])]<-1
>
> Jim
>
>
>
> On Fri, Sep 23, 2016 at 12:27 PM, Saba Sehrish via R-help
>  wrote:
>> Hi
>>
>> I have a matrix that contains 1565 rows and 132 columns. All the 
>> observations are either "0" or "1". Now I want to keep all the observations 
>> same but just one change, i.e. whenever there is "1", the very next value in 
>> the same row should become "1". Please see below as a sample:
>>
>>>df
>>
>>  0 0 1 0 0
>> NA 0 1 1 0
>>  0 1 0 0 NA
>>
>> What I want is:
>>
>>
>>  0 0 1 1 0
>> NA 0 1 1 1
>>  0 1 1 0 NA
>>
>>
>>
>> I shall be thankful for the reply.
>>
>>
>> Saba
>>
>



Re: [R] Accelerating binRead

2016-09-18 Thread Henrik Bengtsson
I second Mike's proposal - it works, e.g.
https://github.com/HenrikBengtsson/affxparser/blob/5bf1a9162904c56d59c4735a8d7eb427e4f085e4/R/readCcg.R#L535-L583

Here's an outline. Say each row consists of the tuple (iiii=4-byte
integer, ffff=4-byte float, ss=2-byte integer) so that the
byte-by-byte content of your file looks like this:

  iiiiffffss
  iiiiffffss
  iiiiffffss
  ...
  iiiiffffss

Then read this as raw bytes (file_size can also be a very large
number in case it's unknown):

  raw <- readBin(con, what="raw", n=file_size)

Turn into a (4+4+2)-by-K raw matrix:

  raw <- matrix(raw, nrow=4+4+2)

so that your raw bytes have the following layout:

  iii ... i
  iii ... i
  iii ... i
  iii ... i
  fff ... f
  fff ... f
  fff ... f
  fff ... f
  sss ... s
  sss ... s

Then extract the three submatrices of interest:

  iiii <- raw[1:4,]
  ffff <- raw[5:8,]
  ss <- raw[9:10,]

Here you can discard raw, i.e. rm(list="raw").

Since R stores matrices in a column-by-column order internally, your
bytes are already in the proper order.  Finally, re-read these with
appropriate readBin() settings.  Note that n must be specified, because
readBin() reads only one value by default, e.g.

  K <- ncol(iiii)
  i <- readBin(iiii, what="integer", size=4L, n=K)
  f <- readBin(ffff, what="double", size=4L, n=K)
  s <- readBin(ss, what="integer", size=2L, n=K)

Put into a K-by-3 data.frame:

  data <- data.frame(i=i, f=f, s=s)
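
To make the outline concrete, here is a self-contained round trip that
first writes a small test file with the same (4-byte integer, 4-byte
float, 2-byte integer) record layout and then reads it back in one go.
The test data, the temporary file and the little-endian byte order are
assumptions made for this sketch; for a file written by Java's
DataOutputStream you would pass endian="big" instead.

```r
# Write K records of (4-byte integer, 4-byte float, 2-byte integer)
K <- 5L
path <- tempfile(fileext = ".bin")
con <- file(path, open = "wb")
for (k in seq_len(K)) {
  writeBin(k,       con, size = 4L, endian = "little")  # integer
  writeBin(k + 0.5, con, size = 4L, endian = "little")  # float
  writeBin(10L * k, con, size = 2L, endian = "little")  # short
}
close(con)

# Read the whole file as raw bytes with a single readBin() call
con <- file(path, open = "rb")
raw <- readBin(con, what = "raw", n = file.size(path))
close(con)

# One column per record, one row per byte of the record
raw <- matrix(raw, nrow = 4L + 4L + 2L)
K_read <- ncol(raw)

# Re-interpret each block of rows; n is needed since the default is n = 1
i <- readBin(raw[1:4, ],  what = "integer", size = 4L, n = K_read, endian = "little")
f <- readBin(raw[5:8, ],  what = "double",  size = 4L, n = K_read, endian = "little")
s <- readBin(raw[9:10, ], what = "integer", size = 2L, n = K_read, endian = "little")

data <- data.frame(i = i, f = f, s = s)
print(data)
unlink(path)
```

The gain comes from replacing 3*K small readBin() calls on the file
connection by one big read plus three vectorized conversions in memory.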

/Henrik

On Sun, Sep 18, 2016 at 8:02 AM, Philippe de Rochambeau  wrote:
> I would gladly examine your example, Mike.
> Cheers,
> Philippe
>
>> Le 18 sept. 2016 à 16:05, Michael Sumner  a écrit :
>>
>>
>>
>>> On Sun, 18 Sep 2016, 19:04 Philippe de Rochambeau  wrote:
>>> Please find below code that attempts to read ints, longs and floats from a 
>>> binary file (which is a simplification of my original program).
>>> Please disregard the R inefficiencies, such as using rbind, for now.
>>> I’ve also included Java code to generate the binary file.
>>> The output shows that, at one point, anInt becomes undefined. 
>>> Unfortunately, I couldn’t find the correct R function to determine whether 
>>> inInt is undefined or not, as is.null, is.nan, and is.infinite don’t work.
>>> Any help would be much appreciated.
>>> Many thanks in advance.
>>> Philippe
>>>
>>> ———
>>> [1] "anInt = 1"
>>> [1] "is.null  FALSE"
>>> [1] "is.nan  FALSE"
>>> [1] "is.infinite  FALSE"
>>> [1] "aLong = 2"
>>> [1] "aFloat = 3.0007209778"
>>> [1] "--"
>>> [1] "anInt = 2"
>>> [1] "is.null  FALSE"
>>> [1] "is.nan  FALSE"
>>> [1] "is.infinite  FALSE"
>>> [1] "aLong = 22"
>>> [1] "aFloat = 13.4644002914429"
>>> [1] "--"
>>> [1] "anInt = 3"
>>> [1] "is.null  FALSE"
>>> [1] "is.nan  FALSE"
>>> [1] "is.infinite  FALSE"
>>> [1] "aLong = 55"
>>> [1] "aFloat = 45.007873535"
>>> [1] "--"
>>> [1] "anInt = "
>>> [1] "is.null  FALSE"
>>> [1] "is.nan  "
>>> [1] "is.infinite  "
>>> [1] "aLong = "
>>> [1] "aFloat = "
>>> [1] "--"
>>>  [,1]  [,2]  [,3]
>>> [1,] 1 2 3.
>>> [2,] 2 2213.4644
>>> [3,] 3 5545.
>>> [4,] Integer,0 Integer,0 Numeric,0
>>> >
>>>
>>> ---
>>>
>>>
>>> —
>>>
>>> readFile <- function(inputPath) {
>>>   URL <- file(inputPath, "rb")
>>>   PLT <- matrix(nrow=0, ncol=3)
>>>   counte <- 0
>>>   max <- 4
>>>   while (counte < max) {
>>> anInt <- readBin(con=URL, what=integer(), size=4, n=1, endian="big")
>>> print(paste("anInt =", anInt))
>>> #if (! (anInt == 0)) { print(paste("empty int")); break }
>>> print(paste("is.null ", is.null(anInt)))
>>> print(paste("is.nan ", is.nan(anInt)))
>>> print(paste("is.infinite ", is.infinite(anInt)))
>>> aLong <- readBin(URL, integer(), size=8, n=1, endian="big")
>>> print(paste("aLong =", aLong))
>>> aFloat <- readBin(URL, numeric(), size=4, n=1, endian="big")
>>> print(paste("aFloat =", aFloat))
>>> print("--")
>>> PLT <- rbind(PLT, list(anInt, aLong, aFloat))
>>> counte <- counte + 1
>>>   } # end while
>>>   close(URL)
>>>   PLT
>>> }
>>> fichier <- "/Users/philippe/Desktop/datatests/data0.bin"
>>> PLT2 <- readFile(fichier)
>>> print(PLT2)
>>> —
>>>
>>> import java.io.*;
>>>
>>> public class Main {
>>>
>>> Main() {
>>> writeData();
>>> }
>>>
>>> public static void main(String[] args) {
>>> new Main();
>>> }
>>>
>>> public void writeData() {
>>>
>>> final String path = 
>>> "/Users/philippe/Desktop/datatests/data0.bin";
>>>
>>> DataOutputStream dos;
>>> try {
>>> dos = new DataOutputStream(new 
>>> BufferedOutputStream(new FileOutputStream(path)));
>>> // big endian write! ("high byte first") , see 
>>> https://docs.oracle.com/javase/7/docs/api/java/io/DataOutputStream.html
>>> 

Re: [R] Maximum # of DLLs reached, or, how to clean up after yourself?

2016-09-14 Thread Henrik Bengtsson
As Jeff says, I think the common use case is to run/rerun in fresh R sessions.

But, yes, if you'd like to have each script clean up after itself,
then you need to record pkgs0 <- loadedNamespaces() to see what
packages are loaded (not just attached) when the script starts, and
then, at the end, unload the ones added since, i.e. pkgsDiff <-
setdiff(loadedNamespaces(), pkgs0).   However, it's not as simple as
calling unloadNamespace(pkgsDiff), because they need to be unloaded in
an order that is compatible with the package dependencies.   One way
is to use a while (length(pkgsDiff) > 0) loop that calls
try(unloadNamespace(pkg)) on each remaining package until all are
unloaded.   At the end, run R.utils::gcDLLs() too (now on CRAN).

unloadNamespace("foo") should result in the same as
detach("package:foo", unload=TRUE) [anyone correct me if I'm wrong].
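
A minimal sketch of the snapshot-and-unload loop described above.  The
helper name unload_added_namespaces() and its max_rounds argument are
made up for illustration; the cap on the number of passes is an added
assumption so that the loop cannot run forever if some namespace
refuses to unload.

```r
# Hypothetical helper: unload all namespaces loaded since 'pkgs0' was taken.
# Unloading is retried in passes, because a namespace can only be unloaded
# once nothing else that is loaded imports it.
unload_added_namespaces <- function(pkgs0, max_rounds = 10L) {
  pkgsDiff <- setdiff(loadedNamespaces(), pkgs0)
  rounds <- 0L
  while (length(pkgsDiff) > 0L && rounds < max_rounds) {
    for (pkg in pkgsDiff) {
      try(unloadNamespace(pkg), silent = TRUE)
    }
    pkgsDiff <- setdiff(loadedNamespaces(), pkgs0)
    rounds <- rounds + 1L
  }
  invisible(pkgsDiff)  # namespaces that could not be unloaded, if any
}

# At the top of a script:
pkgs0 <- loadedNamespaces()

# ... body of the script, e.g. library(package1); library(package2) ...

# At the end of the script:
remaining <- unload_added_namespaces(pkgs0)
# Optionally follow up with R.utils::gcDLLs() to drop stray DLLs.
```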

Hope this helps

Henrik

On Wed, Sep 14, 2016 at 6:41 AM, Jeff Newmiller
<jdnew...@dcn.davis.ca.us> wrote:
> I never detach packages. I rarely load more than 6 or 7 packages directly 
> before restarting R. I frequently re-run my scripts in new R sessions to 
> confirm reproducibility.
> --
> Sent from my phone. Please excuse my brevity.
>
> On September 14, 2016 1:49:55 AM PDT, Alexander Shenkin <ashen...@ufl.edu> 
> wrote:
>>Hi Henrik,
>>
>>Thanks for your reply.  I didn't realize that floating DLLs were an
>>issue (good to know).  My query is actually a bit more basic.  I'm
>>actually wondering how folks manage their loading and unloading of
>>packages when calling scripts within scripts.
>>
>>Quick example:
>>Script1:
>>   library(package1)
>>   source("script2.r")
>>   # do stuff reliant on package1
>>   detach("package:package1", unload=TRUE)
>>
>>Script2:
>>   library(package1)
>>   library(package2)
>>   # do stuff reliant on package1 and package2
>>   detach("package:package1", unload=TRUE)
>>   detach("package:package2", unload=TRUE)
>>
>>Script2 breaks Script1 by unloading package1 (though unloading package2
>>
>>is ok).  I will have to test whether each package is loaded in Script2
>>before loading it, and use that list when unloading at the end of the
>>Script2.  *Unless there's a better way to do it* (which is my question
>>-
>>is there?).  I'm possibly just pushing the whole procedural scripting
>>thing too far, but I also think that this likely isn't uncommon in R.
>>
>>Any thoughts greatly appreciated!
>>
>>Thanks,
>>Allie
>>
>>On 9/13/2016 7:23 PM, Henrik Bengtsson wrote:
>>> In R.utils (>= 2.4.0), which I hope to submit to CRAN today or
>>> tomorrow, you can simply call:
>>>
>>>R.utils::gcDLLs()
>>>
>>> It will look at base::getLoadedDLLs() and its content and compare to
>>> loadedNamespaces() and unregister any "stray" DLLs that remain after
>>> corresponding packages have been unloaded.
>>>
>>> Until the new version is on CRAN, you can install it via
>>>
>>>
>>source("http://callr.org/install#HenrikBengtsson/R.utils@develop")
>>>
>>> or alternatively just source() the source file:
>>>
>>>
>>source("https://raw.githubusercontent.com/HenrikBengtsson/R.utils/develop/R/gcDLLs.R")
>>>
>>>
>>> DISCUSSION:
>>> (this might be better suited for R-devel; feel free to move it there)
>>>
>>> As far as I understand the problem, running into this error / limit
>>is
>>> _not_ the fault of the user.  Instead, I'd argue that it is the
>>> responsibility of package developers to make sure to unregister any
>>> registered DLLs of theirs when the package is unloaded.  A developer
>>> can do this by adding the following to their package:
>>>
>>> .onUnload <- function(libpath) {
>>>   library.dynam.unload(utils::packageName(), libpath)
>>> }
>>>
>>> That should be all - then the DLL will be unloaded as soon as the
>>> package is unloaded.
>>>
>>> I would like to suggest that 'R CMD check' would include a check that
>>> asserts when a package is unloaded it does not leave any registered
>>> DLLs behind, e.g.
>>>
>>> * checking whether the namespace can be unloaded cleanly ... WARNING
>>>   Unloading the namespace does not unload DLL
>>> * checking loading without being on the library search path ... OK
>>>
>>> For further details on my thoughts on this, see
>>> https://github.com/HenrikBengtsson

Re: [R] Maximum # of DLLs reached, or, how to clean up after yourself?

2016-09-13 Thread Henrik Bengtsson
In R.utils (>= 2.4.0), which I hope to submit to CRAN today or
tomorrow, you can simply call:

   R.utils::gcDLLs()

It will look at base::getLoadedDLLs() and its content and compare to
loadedNamespaces() and unregister any "stray" DLLs that remain after
corresponding packages have been unloaded.

Until the new version is on CRAN, you can install it via

source("http://callr.org/install#HenrikBengtsson/R.utils@develop")

or alternatively just source() the source file:


source("https://raw.githubusercontent.com/HenrikBengtsson/R.utils/develop/R/gcDLLs.R")


DISCUSSION:
(this might be better suited for R-devel; feel free to move it there)

As far as I understand the problem, running into this error / limit is
_not_ the fault of the user.  Instead, I'd argue that it is the
responsibility of package developers to make sure to unregister any
registered DLLs of theirs when the package is unloaded.  A developer
can do this by adding the following to their package:

.onUnload <- function(libpath) {
  library.dynam.unload(utils::packageName(), libpath)
}

That should be all - then the DLL will be unloaded as soon as the
package is unloaded.
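
To get a feeling for how close a session is to the limit, you can look
at what is registered right now.  The snippet below is only a rough
heuristic and not how gcDLLs() is implemented: it assumes that a
package's DLL is named after the package, which need not be the case,
and entries belonging to R itself may show up among the candidates.

```r
# Names of all DLLs currently registered in this R session
dll_names <- names(getLoadedDLLs())
cat("Number of registered DLLs:", length(dll_names), "\n")

# DLLs whose name does not match any loaded namespace; these are only
# candidates for "stray" DLLs and need to be checked by hand.
candidates <- setdiff(dll_names, loadedNamespaces())
print(candidates)
```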

I would like to suggest that 'R CMD check' would include a check that
asserts when a package is unloaded it does not leave any registered
DLLs behind, e.g.

* checking whether the namespace can be unloaded cleanly ... WARNING
  Unloading the namespace does not unload DLL
* checking loading without being on the library search path ... OK

For further details on my thoughts on this, see
https://github.com/HenrikBengtsson/Wishlist-for-R/issues/29.

Hope this helps

Henrik

On Tue, Sep 13, 2016 at 6:05 AM, Alexander Shenkin  wrote:
> Hello all,
>
> I have a number of analyses that call bunches of sub-scripts, and in the
> end, I get the "maximal number of DLLs reached" error.  This has been asked
> before (e.g.
> http://stackoverflow.com/questions/36974206/r-maximal-number-of-dlls-reached),
> and the general answer is, "just clean up after yourself".
>
> Assuming there are no plans to raise this 100-DLL limit in the near future,
> my question becomes, what is best practice for cleaning up (detaching)
> loaded packages in scripts, when those scripts are sometimes called from
> other scripts?  One can detach all packages at the end of a script that were
> loaded at the beginning of the script.  However, if a package is required in
> a calling script, one should really make sure it hadn't been loaded prior to
> sub-script invocation before detaching it.
>
> I could write a custom function that pushes and pops package names from a
> global list, in order to keep track, but maybe there's a better way out
> there...
>
> Thanks for any thoughts.
>
> Allie
>



Re: [R] save R object into a remote directory

2016-09-06 Thread Henrik Bengtsson
On Tue, Sep 6, 2016 at 2:00 AM, Rainer M Krug  wrote:
> Please reply to the mailing list to keep the conversation available for
> everybody. I Send this mail to the mailing list as well.
>
> Simone Tenan  writes:
>
>> Thanks Rainer,
>> it's very kind of you.
>> Unfortunately, I cannot save the (large) R object where the current R 
>> session is running. There is not enough memory and when I tried I got the 
>> error "error
>> writing to connection".
>
> OK - essential piece of information.

Assuming the remote machine you're logging into is running Linux:

Before anything else, did you try to save somewhere other than your
/home/ folder/account? Try /tmp/.  That probably / hopefully doesn't
count toward your account quota (and it gets wiped at every reboot).
There's also /var/tmp/ (a bit more persistent storage - survives
reboots).

/Henrik

PS. In R, the tempdir() function often points to a directory under
/tmp/, e.g.  tempdir() => "/tmp/RtmpWZPsip".  However, that will be
wiped by R itself when you quit the R session, so if you use tempdir()
you need to scp that file before you quit R.  It's probably better to
avoid tempdir() in your case.
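
For instance, something along these lines, assuming a Linux server
where /tmp is writable.  The object and the file name are placeholders
made up for the example:

```r
# Placeholder for the large object you want to transfer
big_object <- list(values = rnorm(1000))

# Save under /tmp instead of the home directory
target <- file.path("/tmp", "big_object.RData")
save(big_object, file = target)
stopifnot(file.exists(target))

# Then, from a terminal on the laptop (not in R), copy it over with scp:
#   scp user@host1:/tmp/big_object.RData .
```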

>
> In this case, your best bet is trying to mount a folder from your
> client on the server and then save there.
>
> Contact your administrator to find out if this is possible or (s)he has
> other suggestions.
>
>> For that reason I was trying to save the object off of the remote server.
>>
>> Best,
>
> Cheers,
>
> Rainer
>
>> Simone
>>
>> On 6 September 2016 at 10:30, Rainer M Krug  wrote:
>>
>>  Simone Tenan  writes:
>>
>>  > Hi all,
>>  > I'm using R remotely via ssh connection in linux. I need to save a large R
>>  > object from the remote server to my laptop. How can I specify the path in
>>  > the save() function?
>>
>>  You are working on the remote machine and there is no way that you can
>>  specify "out of the box" a save to client machine (I stand to be corrected).
>>
>>  You could mount a directory on the client on the server, but I would
>>  suggest to
>>
>>  1) save the object on the server (the where R is running on)
>>  2) use scp from the client to copy the file from the server to the
>>  client (the one you are typing on).
>>
>>  scp user@host1:file1 ./TheNameOfTheLocalFile
>>
>>  Cheers,
>>
>>  Rainer
>>
>>  >
>>  > Thanks much for your help,
>>  > Simone
>>  >
>>  > [[alternative HTML version deleted]]
>>  >
>>
>>  --
>>  Rainer M. Krug
>>  email: Rainerkrugsde
>>  PGP: 0x0F52F982
>>
>>
>
> --
> Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, 
> UCT), Dipl. Phys. (Germany)
>
> Centre of Excellence for Invasion Biology
> Stellenbosch University
> South Africa
>
> Tel :   +33 - (0)9 53 10 27 44
> Cell:   +33 - (0)6 85 62 59 98
> Fax :   +33 - (0)9 58 10 27 44
>
> Fax (D):+49 - (0)3 21 21 25 22 44
>
> email:  rai...@krugs.de
>
> Skype:  RMkrug
>
> PGP: 0x0F52F982
>



Re: [R] paste0 in file path

2016-08-31 Thread Henrik Bengtsson
Also, the recommended way to build file paths is to use file.path(), i.e.

   file.path("C:", "temp", filename)

rather than

   paste0("C:/temp/", filename)


BTW, R provides tempdir() that gives you the temporary directory that
R prefers to use on your OS.  So, you might want to consider using:

   file.path(tempdir(), filename)

It is specific and temporary to the R session running though, so if
you want to store things across R sessions, tempdir() is not what you
want to use.
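
Applied to a loop over eight tables like the one in the original
question, the file names could be built as below.  This is a sketch
only; out_dir is set to tempdir() here just so the example runs
anywhere, and you would point it to your own output folder.

```r
# Output folder; replace with e.g. file.path("C:", "temp")
out_dir <- tempdir()

# file.path() and sprintf() are vectorized, so all eight names at once:
pathnames <- file.path(out_dir, sprintf("q0%dr.xlsx", 1:8))
print(basename(pathnames))

# Inside a loop, the i-th file name is then simply pathnames[i]
for (i in seq_along(pathnames)) {
  pathname <- pathnames[i]
  # ... write the i-th table to 'pathname' ...
}
```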

/Henrik


On Wed, Aug 31, 2016 at 8:54 AM, Uwe Ligges
 wrote:
>
>
> On 31.08.2016 17:50, Leslie Rutkowski wrote:
>>
>> Hi,
>>
>> I'm trying to reshape and output 8 simple tables into excel files. This is
>> the code I'm using
>>
>>   for (i in 1:8) {
>>   count <- table(mydata$ctry, mydata[,paste0("q0",i,"r")])
>>   dat <- as.data.frame(q01count)
>>
>>   wide <- reshape(dat,
>>   timevar="Var2",
>>   idvar="Var1",
>>   direction="wide")
>>write.xlsx(wide, file=paste0(i, 'C:/temp/q0',i,'r.xlsx'))
>
>
>   ^^
> remove the i?
>
> Best,
> Uwe Ligges
>
>
>
>>   }
>>
>> All goes well until the write.xlsx, which produces the error
>>
>> Error in .jnew("java/io/FileOutputStream", jFile) :
>>   java.io.FileNotFoundException: 1C:\temp\q01r.xlsx (The filename,
>> directory name, or volume label syntax is incorrect)
>>
>> Among other things, I'm puzzled about why a "1" is getting tacked on to
>> the
>> file path.
>>
>> Any hints?
>>
>> Thanks,
>> Leslie
>>
>> [[alternative HTML version deleted]]
>>
>>
>



Re: [R] Antwort: Re: Antwort: Re: Re: sink(): Cannot open file

2016-05-11 Thread Henrik Bengtsson
Sounds like it would be helpful to find out exactly which process is
holding on to the file in order to figure out what's going on. From a
quick look, it seems that

  http://superuser.com/questions/117902/find-out-which-process-is-locking-a-file-or-folder-in-windows

gives some useful info on how to track down the process that locks the file.
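
It may also be worth ruling out the R session itself as the process
holding the file.  In the code quoted below, the connection zz is never
closed before unlink() is called, and an open connection can be enough
for Windows to consider the file in use.  A sketch of the full open /
sink / restore / close sequence, using a temporary file instead of
C:/Temp/all.Rout so that it runs anywhere:

```r
path <- tempfile(fileext = ".Rout")

## capture output and messages to a file
zz <- file(path, open = "wt")
sink(zz)                      # divert standard output
sink(zz, type = "message")    # divert messages, warnings and errors
try(log("a"))

## back to the console: undo the message sink first, then the output sink
sink(type = "message")
sink()

## release the file before inspecting or removing it
close(zz)
lines <- readLines(path)
print(lines)
unlink(path)
```

Whether this is what happens in your case I cannot say for sure, but if
the file can be removed after close(zz), no external process is
involved.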

/Henrik

On Wed, May 11, 2016 at 9:47 AM,   wrote:
> Duncan,
>
> thanks for the hint.
>
> I have done it correctly in R fashion
>
> ## capture all the output to a file.
> zz <- file("C:/Temp/all.Rout", open = "wt")
> sink(zz)
> sink(zz, type = "message")
> try(log("a"))
> ## back to the console
> sink(type = "message")
> sink()
> unlink("C:/Temp/all.Rout")
>
> But the error persists.
>
> Kind regards
>
> Georg
>
>
>
>
> Von:Duncan Murdoch 
> An: John Sorkin , drjimle...@gmail.com,
> g.maub...@weinwolf.de,
> Kopie:  r-help@r-project.org
> Datum:  10.05.2016 19:03
> Betreff:Re: [R] Antwort: Re: Re: sink(): Cannot open file
>
>
>
> On 10/05/2016 11:15 AM, John Sorkin wrote:
>> George,
>> I do not know what operating system you are working with, but when I use
> sink() under windows, I need to specify a valid path which I don't see in
> your code. I might, for example specify:
>>
>> sink("c:\myfile.txt")
>
> Note that the backslash should be doubled (so it isn't interpreted as an
> escape for the "m" that follows it), or replaced with a forward slash.
>
> Duncan Murdoch
>
>>   R code goes here
>> sink()
>>
>> with the expectation that I would create a file myfile.txt that would
> contain the output of my R program.
>>
>> John
>>
>>
>> John David Sorkin M.D., Ph.D.
>> Professor of Medicine
>> Chief, Biostatistics and Informatics
>> University of Maryland School of Medicine Division of Gerontology and
> Geriatric Medicine
>> Baltimore VA Medical Center
>> 10 North Greene Street
>> GRECC (BT/18/GR)
>> Baltimore, MD 21201-1524
>> (Phone) 410-605-7119
>> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>> >>>  05/10/16 11:10 AM >>>
>> Hi Jim,
>>
>> I tried:
>>
>> sink("all.Rout")
>> try(log("a"))
>> sink()
>>
>> The program executes without warning or error. The file "all.Rout" is
>> being created. Nothing is written to it. The file is accessible
>> right after the execution of the program by notepad.exe.
>>
>> The program
>>
>> zz <- file("all.Rout", open = "wt")
>> sink(zz, type = "message")
>> try(log("a"))
>> sink()
>> close(zz)
>> unlink(zz)
>>
>> creates the file, does not write anything to it and is not accessible
>> after program execution in R with notepad.exe.
>>
>> Any ideas what happens behind the scenes?
>>
>> Kind regards
>>
>> Georg
>>
>>
>>
>>
>> Von: Jim Lemon 
>> An: g.maub...@weinwolf.de,
>> Kopie: r-help mailing list 
>> Datum: 10.05.2016 13:16
>> Betreff: Re: Re: [R] sink(): Cannot open file
>>
>>
>>
>> Have you tried:
>>
>> sink("all.Rout")
>> try(log("a"))
>> sink()
>>
>> Jim
>>
>> On Tue, May 10, 2016 at 9:05 PM,  wrote:
>> > Hi Jim,
>> >
>> > thanks for your reply.
>> >
>> > ad 1)
>> > "all.Rout" was created in the correct directory. It exists properly
> with
>> > correct file properties on Windows, e.g. creation date and time and
> file
>> > size information.
>> >
>> > ad 2)
>> > I can not access the file with Notepad.exe directly after it was
> created
>> > by R. The error message is (translated):
>> >
>> > "Cannot access file "all.Rout". The file is opened by another
> process."
>> >
>> > ad 3)
>> > If I close R completely the file access is released. Then I can read
> the
>> > file using Notepad.exe. The contents is:
>> >
>> > Error in log("a") : non-numeric argument to mathematical function
>> >
>> > I tried
>> >
>> > close(zz)
>> >
>> > but the error persists.
>> >
>> > To me it looks like R is still accessing the file and not releasing
> the
>> > connection for other programs. close(zz) should have solved the
> problem
>> > but unfortunately it doesn't.
>> >
>> > What else could I try?
>> >
>> > Kind regards
>> >
>> > Georg
>> >
>> >
>> >
>> >
>> > Von: Jim Lemon 
>> > An: g.maub...@weinwolf.de,
>> > Kopie: r-help mailing list 
>> > Datum: 10.05.2016 12:50
>> > Betreff: Re: [R] sink(): Cannot open file
>> >
>> >
>> >
>> > Hi Georg,
>> > I don't suppose that you have:
>> >
>> > 1) checked that the file "all.Rout" exists somewhere?
>> >
>> > 2) if so, looked at the file with Notepad, perhaps?
>> >
>> > 3) let us in on the secret by pasting the contents of "all.Rout" into
>> > your message if it is not too big?
>> >
>> > At a guess, trying:
>> >
>> > close(zz)
>> >
>> > might get you there.
>> >
>> > Jim
>> >
>> > On Tue, May 10, 2016 at 5:25 PM,  wrote:
>> >> Hi All,
>> >>
>> >> I would like to route the output to a file using sink(). When using
> the
>> >> example from the 

Re: [R] Matrix: How create a _row-oriented_ sparse Matrix (=dgRMatrix)?

2016-04-20 Thread Henrik Bengtsson
On Wed, Apr 20, 2016 at 1:25 AM, Martin Maechler
<maech...@stat.math.ethz.ch> wrote:
>>>>>> Henrik Bengtsson <henrik.bengts...@gmail.com>
>>>>>> on Tue, 19 Apr 2016 14:04:11 -0700 writes:
>
> > Using the Matrix package, how can I create a row-oriented sparse
> > Matrix from scratch populated with some data?  By default a
> > column-oriented one is created and I'm aware of the note that the
> > package is optimized for column-oriented ones, but I'm only interested
> > in using it for holding my sparse row-oriented data and doing basic
> > subsetting by rows (even using drop=FALSE).
>
> > Here is what I get when I set up a column-oriented sparse Matrix:
>
> >> Cc <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
> >> Cc[1:3,1] <- 1
>
> A general ("teaching") remark :
> The above use of Matrix() is seen in many places, and is fine
> for small matrices and the case where you only use the `[<-`
> method very few times (as above).
> Also using  Matrix()  is nice when being introduced to using the
> Matrix package.
>
> However, for efficiency in non-small cases, do use
>
>sparseMatrix()
>
> directly to construct sparse matrices.
>
>
> >> Cc
> > 5 x 5 sparse Matrix of class "dgCMatrix"
>
> > [1,] 1 . . . .
> > [2,] 1 . . . .
> > [3,] 1 . . . .
> > [4,] . . . . .
> > [5,] . . . . .
> >> str(Cc)
> > Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
> > ..@ i   : int [1:3] 0 1 2
> > ..@ p   : int [1:6] 0 3 3 3 3 3
> > ..@ Dim : int [1:2] 5 5
> > ..@ Dimnames:List of 2
> > .. ..$ : NULL
> > .. ..$ : NULL
> > ..@ x   : num [1:3] 1 1 1
> > ..@ factors : list()
>
> > When I try to do the analogue for a row-oriented matrix, I get a
> > "dgTMatrix", whereas I would expect a "dgRMatrix":
>
> >> Cr <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
> >> Cr <- as(Cr, "dsRMatrix")
> >> Cr[1,1:3] <- 1
> >> Cr
> > 5 x 5 sparse Matrix of class "dgTMatrix"
>
> > [1,] 1 1 1 . .
> > [2,] . . . . .
> > [3,] . . . . .
> > [4,] . . . . .
> > [5,] . . . . .
>
> The reason for the above behavior has been
>
> a) efficiency.  All the subassignment ( `[<-` ) methods for
>"RsparseMatrix" objects (of which "dsRMatrix" is a special case)
>are implemented via  TsparseMatrix.
> b) because of the general attitude that Csparse (and Tsparse to
>some extent) are well supported in Matrix,
>and e.g. further operations on Rsparse matrices would *again*
>go via T* or C* sparse ones, I had decided to keep things Tsparse.

Thanks, understanding these design decisions is helpful.
Particularly, since I consider myself a rookie when it comes to the
Matrix package.

>
> [...]
>
> > Trying with explicit coercion does not work:
>
> >> as(Cc, "dgRMatrix")
> > Error in as(Cc, "dgRMatrix") :
> > no method or default for coercing "dgCMatrix" to "dgRMatrix"
>
> >> as(Cr, "dgRMatrix")
> > Error in as(Cr, "dgRMatrix") :
> > no method or default for coercing "dgTMatrix" to "dgRMatrix"
>
> The general philosophy in 'Matrix' with all the class
> hierarchies and the many specific classes has been to allow and
> foster coercing to abstract super classes,
> i.e, to  "dMatrix"  or "generalMatrix", "triangularMatrix", or
> then "denseMatrix", "sparseMatrix", "CsparseMatrix" or
> "RsparseMatrix", etc
>
> So in the above  as(*, "RsparseMatrix")   should work always.

Thanks for pointing this out (and confirming as I since discovered the
virtual RsparseMatrix class in the help).

>
>
> As a summary, in other words,  for what you want,
>
>as(sparseMatrix(.), "RsparseMatrix")
>
> should give you what you want reliably and efficiently.

Perfect.
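
For the archives, here is how I read that recipe applied to my toy
example.  This is a sketch assuming the Matrix package is installed;
the values are the same three ones in the first row as in my original
post.

```r
library(Matrix)

# Build the sparse matrix directly from (i, j, x) triplets ...
Cc <- sparseMatrix(i = c(1, 1, 1), j = 1:3, x = rep(1, 3), dims = c(5, 5))

# ... and coerce to the virtual row-oriented class
Cr <- as(Cc, "RsparseMatrix")
print(class(Cr))

# Row subsetting without dropping dimensions
row1 <- Cr[1, , drop = FALSE]
print(dim(row1))
```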

>
>
> > Am I doing something wrong here?  Or is this what it means that the package is
> > optimized for the column-oriented representation and I shouldn't
> > really work with row-oriented ones?  I'm really only interested in
> > access to efficient Cr[row,,drop=FALSE] subsetting (and a small memory
> > footprint).
>
> { though you could equivalently use   Cc[,row, drop=FALSE]
>   with a CsparseMatrix Cc := t(Cr),
>   couldn't you ?
> }

Yes, I actually went ahead and did that, but since the code I'm writing
supports both plain matrix:es and sparse Matrix:es, and the underlying
model operates row-by-row, I figured the code would be more consistent
if I could use row-orientation everywhere.  Not a big deal.

Thanks Martin

Henrik

>
>
> Martin Maechler  (maintainer of 'Matrix')
> ETH Zurich
>



[R] Matrix: How create a _row-oriented_ sparse Matrix (=dgRMatrix)?

2016-04-19 Thread Henrik Bengtsson
Using the Matrix package, how can I create a row-oriented sparse
Matrix from scratch populated with some data?  By default a
column-oriented one is created and I'm aware of the note that the
package is optimized for column-oriented ones, but I'm only interested
in using it for holding my sparse row-oriented data and doing basic
subsetting by rows (even using drop=FALSE).

Here is what I get when I set up a column-oriented sparse Matrix:

> Cc <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
> Cc[1:3,1] <- 1
> Cc
5 x 5 sparse Matrix of class "dgCMatrix"

[1,] 1 . . . .
[2,] 1 . . . .
[3,] 1 . . . .
[4,] . . . . .
[5,] . . . . .
> str(Cc)
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  ..@ i   : int [1:3] 0 1 2
  ..@ p   : int [1:6] 0 3 3 3 3 3
  ..@ Dim : int [1:2] 5 5
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x   : num [1:3] 1 1 1
  ..@ factors : list()

When I try to do the analogue for a row-oriented matrix, I get a
"dgTMatrix", whereas I would expect a "dgRMatrix":

> Cr <- Matrix(0, nrow=5, ncol=5, sparse=TRUE)
> Cr <- as(Cr, "dsRMatrix")
> Cr[1,1:3] <- 1
> Cr
5 x 5 sparse Matrix of class "dgTMatrix"

[1,] 1 1 1 . .
[2,] . . . . .
[3,] . . . . .
[4,] . . . . .
[5,] . . . . .
> str(Cr)
Formal class 'dgTMatrix' [package "Matrix"] with 6 slots
  ..@ i   : int [1:3] 0 0 0
  ..@ j   : int [1:3] 0 1 2
  ..@ Dim : int [1:2] 5 5
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x   : num [1:3] 1 1 1
  ..@ factors : list()


Trying with explicit coercion does not work:

> as(Cc, "dgRMatrix")
Error in as(Cc, "dgRMatrix") :
  no method or default for coercing "dgCMatrix" to "dgRMatrix"

> as(Cr, "dgRMatrix")
Error in as(Cr, "dgRMatrix") :
  no method or default for coercing "dgTMatrix" to "dgRMatrix"


Am I doing something wrong here?  Or is this what it means for the
package to be optimized for the column-oriented representation, i.e.
that I shouldn't really work with row-oriented ones?  I'm really only
interested in access to efficient Cr[row,,drop=FALSE] subsetting (and a
small memory footprint).
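
For reference, a minimal sketch of two routes to a row-oriented sparse
matrix.  It assumes a Matrix version where coercion to the virtual class
"RsparseMatrix" is available; the slot values in the second route are
made up to match the 5 x 5 example above.  Treat it as something to
check against the installed Matrix version, not as a definitive recipe.

```r
library(Matrix)

# Route 1 (assumed available): coerce a column-oriented matrix to the
# virtual row-oriented class and let Matrix pick the concrete class.
Cc <- Matrix(0, nrow = 5, ncol = 5, sparse = TRUE)
Cc[1, 1:3] <- 1
Cr1 <- as(Cc, "RsparseMatrix")

# Route 2: build the compressed row representation from its slots.
# 'j' holds 0-based column indices, 'p' the 0-based row pointers.
Cr2 <- new("dgRMatrix",
           j = 0:2,
           p = c(0L, 3L, 3L, 3L, 3L, 3L),
           x = c(1, 1, 1),
           Dim = c(5L, 5L))

# Row subsetting that keeps the two-dimensional structure
Cr2[1, , drop = FALSE]
```

Whether row subsetting on the result is actually faster than on a
"dgCMatrix" is not shown here and would need to be benchmarked.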

Thanks,

Henrik



Re: [R] Reshaping an array - how does it work in R

2016-03-19 Thread Henrik Bengtsson
On Fri, Mar 18, 2016 at 8:28 PM, Roy Mendelssohn - NOAA Federal
<roy.mendelss...@noaa.gov> wrote:
> Hi Henrik:
>
> I want to do what in oceanography is called an EOF, which is just a PCA 
> analysis. Unless I am missing something, in R I need to flatten my 3-D matrix 
> into a 2-D data matrix. I can fit the entire 30GB matrix into memory, and I 
> believe I have enough memory to do the PCA by constraining the number of 
> components returned .  What I don’t think I have enough memory for is an 
> operation that makes a copy of the matrix.
>
> As I said, in theory I know how to do the flattening, it's a simple command, 
> but in practice I don’t have enough memory.  So I spent the afternoon 
> rewriting my code to read in parts of the data at a time and then putting 
> those in the appropriate places of a matrix already flattened of appropriate 
> size.  In case someone is wondering, on the 3D grid the matrix is 
> [1001,1001,3650].  So I create an empty matrix size [1001*1001,3650], and 
> read in a slice of the lat-lon grid, and map those into the appropriate 
> places in the flattened matrix.  By reading in appropriately sized chunks  my 
> memory usage is not pushed too far.

Sounds good.  There's another small caveat. Make sure to specify the
'data' argument for matrix() when allocating an "empty" matrix, e.g.

X <- matrix(NA_real_, nrow=1001*1001, ncol=3650)

This will give you a double matrix with all missing values.  If you use
the default

X <- matrix(nrow=1001*1001, ncol=3650)

you'll get a logical matrix, which will introduce a copy as soon as
you assign a double value (e.g. X[1,1] <- 3.14). The latter is a
complete waste of memory/time. See
http://www.jottr.org/2014/06/matrixNA-wrong-way.html for details.
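
The difference is easy to check on a small matrix.  The sketch below
uses made-up dimensions (1000 x 100) so that it runs instantly; the
object names are only for illustration.

```r
# Default 'data' is NA, which is a logical value
X_lgl <- matrix(nrow = 1000, ncol = 100)
typeof(X_lgl)

# Explicit double missing value gives a double matrix from the start
X_dbl <- matrix(NA_real_, nrow = 1000, ncol = 100)
typeof(X_dbl)

# Assigning a double into the logical matrix changes its type, which
# means R has to allocate a new double matrix behind the scenes
X_lgl[1, 1] <- 3.14
typeof(X_lgl)

# A logical element takes 4 bytes and a double 8 bytes
object.size(X_dbl)
```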

/Henrik

>
> -Roy
>
>
>> On Mar 18, 2016, at 7:37 PM, Henrik Bengtsson <henrik.bengts...@gmail.com> 
>> wrote:
>>
>> On Fri, Mar 18, 2016 at 3:15 PM, Roy Mendelssohn - NOAA Federal
>> <roy.mendelss...@noaa.gov> wrote:
>>> Thanks.  That is what I needed to know.  I don’t want to play around with 
>>> some of the other suggestions, as I don’t totally understand what they do, 
>>> and don’t want to risk messing up something and not be aware of it.
>>>
>>> There is a way to read in the data chunks at a time and reshape it and put, 
>>> it into the (reshaped) larger array, harder to program but probably worth 
>>> the pain to be certain of what I am doing.
>>
>> I recommend this approach; whenever I work with reasonably large data
>> (that may become even larger in the future), I try to implement a
>> constant-memory version of the algorithm, which often comes down to
>> processing data in chunks.  The simplest version of this is to read
>> all data into memory and then subset, but if you can read data in
>> chunks that is even better.
>>
>> Though, I'm curious as to what matrix operations you wish to perform.
>> Because if you wish to do regular summation, then base::.rowSums() and
>> base::.colSums() allow you to override the default dimensions on the
>> fly without inducing extra copies, e.g.
>>
>>> X <- array(1:24, dim=c(2,3,4))
>>> .rowSums(X, m=6, n=4)
>> [1] 40 44 48 52 56 60
>>> rowSums(matrix(X, nrow=6, ncol=4))
>> [1] 40 44 48 52 56 60
>>
>> For other types of calculations, you might want to look at
>> matrixStats.  It has partial(*) support for overriding the default
>> dimensions in a similar fashion.  For instance,
>>
>>> library("matrixStats")
>>> rowVars(X, dim.=c(6,4))
>>
>> The above effectively calculates rowVars(matrix(X, nrow=6, ncol=4))
>> without making copies.
>>
>> (*) By partial I mean that this is a feature that hasn't been pushed
>> through to all of matrixStats functions, cf.
>> https://github.com/HenrikBengtsson/matrixStats/issues/83.
>>
>> Cheers,
>>
>> Henrik
>> (author of matrixStats)
>>
>>>
>>> I had a feeling a copy was made, just wanted to make certain of it.
>>>
>>> Thanks again,
>>>
>>> -Roy
>>>
>>>> On Mar 18, 2016, at 2:56 PM, Dénes Tóth <toth.de...@ttk.mta.hu> wrote:
>>>>
>>>> Hi Roy,
>>>>
>>>> R (usually) makes a copy if the dimensionality of an array is modified, 
>>>> even if you use this syntax:
>>>> x <- array(1:24, c(2, 3, 4))
>>>> dim(x) <- c(6, 4)
>>>>
>>>> See also ?tracemem, ?data.table::address, ?pryr::address and other tools 
>>>> to trace if an internal copy is done.
>>>>
>>

Re: [R] Reshaping an array - how does it work in R

2016-03-18 Thread Henrik Bengtsson
On Fri, Mar 18, 2016 at 3:15 PM, Roy Mendelssohn - NOAA Federal
 wrote:
> Thanks.  That is what I needed to know.  I don’t want to play around with 
> some of the other suggestions, as I don’t totally understand what they do, 
> and don’t want to risk messing up something and not be aware of it.
>
> There is a way to read in the data chunks at a time and reshape it and put, 
> it into the (reshaped) larger array, harder to program but probably worth the 
> pain to be certain of what I am doing.

I recommend this approach; whenever I work with reasonably large data
(that may become even larger in the future), I try to implement a
constant-memory version of the algorithm, which often comes down to
processing data in chunks.  The simplest version of this is to read
all data into memory and then subset, but if you can read data in
chunks that is even better.
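
A minimal sketch of the chunked approach, under stated assumptions:
read_slice() is a hypothetical stand-in for whatever reads a block of
time steps from the netCDF files, and the dimensions are tiny made-up
values.  The point is only that the flattened matrix is allocated once
and filled block by block.

```r
# Hypothetical reader: returns a [n_lat, n_lon, length(t_idx)] array.
# Here every cell simply holds its time index so the result is checkable.
read_slice <- function(t_idx, n_lat, n_lon) {
  array(rep(as.double(t_idx), each = n_lat * n_lon),
        dim = c(n_lat, n_lon, length(t_idx)))
}

n_lat <- 4L; n_lon <- 3L; n_time <- 10L
chunk_size <- 4L   # number of time steps read per iteration

# Allocate the flattened [n_lat*n_lon, n_time] matrix once, as double
X <- matrix(NA_real_, nrow = n_lat * n_lon, ncol = n_time)

for (s in seq(1L, n_time, by = chunk_size)) {
  t_idx <- s:min(s + chunk_size - 1L, n_time)
  chunk <- read_slice(t_idx, n_lat, n_lon)
  # Arrays are stored column-major, so each [n_lat, n_lon] slice
  # flattens into exactly one column of X.
  X[, t_idx] <- chunk
}
```

Only one chunk is held in memory in addition to X at any time, so peak
memory usage is governed by chunk_size.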

Though, I'm curious as to what matrix operations you wish to perform.
Because if you wish to do regular summation, then base::.rowSums() and
base::.colSums() allow you to override the default dimensions on the
fly without inducing extra copies, e.g.

> X <- array(1:24, dim=c(2,3,4))
> .rowSums(X, m=6, n=4)
[1] 40 44 48 52 56 60
> rowSums(matrix(X, nrow=6, ncol=4))
[1] 40 44 48 52 56 60

For other types of calculations, you might want to look at
matrixStats.  It has partial(*) support for overriding the default
dimensions in a similar fashion.  For instance,

> library("matrixStats")
> rowVars(X, dim.=c(6,4))

The above effectively calculates rowVars(matrix(X, nrow=6, ncol=4))
without making copies.

(*) By partial I mean that this is a feature that hasn't been pushed
through to all of matrixStats functions, cf.
https://github.com/HenrikBengtsson/matrixStats/issues/83.

Cheers,

Henrik
(author of matrixStats)

>
> I had a feeling a copy was made, just wanted to make certain of it.
>
> Thanks again,
>
> -Roy
>
>> On Mar 18, 2016, at 2:56 PM, Dénes Tóth  wrote:
>>
>> Hi Roy,
>>
>> R (usually) makes a copy if the dimensionality of an array is modified, even 
>> if you use this syntax:
>> x <- array(1:24, c(2, 3, 4))
>> dim(x) <- c(6, 4)
>>
>> See also ?tracemem, ?data.table::address, ?pryr::address and other tools to 
>> trace if an internal copy is done.
>>
>> Workaround: use data.table::setattr or bit::setattr to modify the dimensions 
>> in place (i.e., without making a copy). Risk: if you modify an object by 
>> reference, all other objects which point to the same memory address will be 
>> modified silently, too.
>>
>> HTH,
>>  Denes
>>
>>
>>
>> On 03/18/2016 10:28 PM, Roy Mendelssohn - NOAA Federal wrote:
>>> Hi All:
>>>
>>> I am working with a very large array.  if noLat is the number of latitudes, 
>>> noLon the number of longitudes and noTime the number of  time periods, the 
>>> array is of the form:
>>>
>>> myData[noLat, noLon, noTime].
>>>
>>> It is read in this way because that is how it is stored in a (series) of 
>>> netcdf files.  For the analysis I need to do, I need instead the array:
>>>
>>> myData[noLat*noLon, noTime].  Normally this would be easy:
>>>
>>> myData<- array(myData,dim=c(noLat*noLon,noTime))
>>>
>>> My question is how does this command work in R - does it make a copy of the 
>>> existing array, with different indices for the dimensions, or does it just 
>>> redo the indices and leave the given array as is?  The reason for this 
>>> question is my array is 30GB in memory, and I don’t have enough space to 
>>> have a copy of the array in memory.  If the latter I will have to figure 
>>> out a work around to bring in only part of the data at a time and put it 
>>> into the proper locations.
>>>
>>> Thanks,
>>>
>>> -Roy
>>>
>>>
>>>
>>> **
>>> "The contents of this message do not reflect any position of the U.S. 
>>> Government or NOAA."
>>> **
>>> Roy Mendelssohn
>>> Supervisory Operations Research Analyst
>>> NOAA/NMFS
>>> Environmental Research Division
>>> Southwest Fisheries Science Center
>>> ***Note new address and phone***
>>> 110 Shaffer Road
>>> Santa Cruz, CA 95060
>>> Phone: (831)-420-3666
>>> Fax: (831) 420-3980
>>> e-mail: roy.mendelss...@noaa.gov www: http://www.pfeg.noaa.gov/
>>>
>>> "Old age and treachery will overcome youth and skill."
>>> "From those who have been given much, much will be expected"
>>> "the arc of the moral universe is long, but it bends toward justice" -MLK 
>>> Jr.
>>>
>>>
>
> **
> "The contents of this message do not reflect any position of the U.S. 
> Government or NOAA."
> **
> Roy Mendelssohn
> Supervisory Operations Research Analyst
> NOAA/NMFS
> 

Re: [R] Create macro_var in R

2016-02-03 Thread Henrik Bengtsson
Don't know what `population` is, but a simple assignment

MVAR <- population

may provide what you need.  Note, no c().  For example,

> foo <- rnorm
> foo(3)
[1] -0.08093862 -0.87827617  1.52826914
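
If `population` is instead the name of a column in a data frame (only a
guess, since the original post does not say), the name can be kept as a
string and used for indexing.  A minimal sketch with a made-up data
frame `df`:

```r
# Made-up data for illustration
df <- data.frame(population = c(10, 20, 30), area = c(1, 2, 3))

# Keep the column name in a variable, quotes and all
MVAR <- "population"

# Use [[ ]] to look up the column by the name stored in MVAR
df[[MVAR]]
mean(df[[MVAR]])
```

Here the quotes are not a problem; `[[` expects a string.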

/Henrik

On Wed, Feb 3, 2016 at 9:41 AM, Amoy Yang via R-help
 wrote:
>  There is a %LET statement in SAS: %let MVAR=population; Thus, MVAR can be 
> used through entire program.
> In R, I tried MAVR<-c("population"). The problem is that MAVR comes with 
> double quote "" that I don't need. But MVAR<-c(population) did NOT work 
> out. Any way that double quote can be removed as done in SAS when creating 
> macro_var?
> Thanks in advance for helps!
> Amoy
>
>


Re: [R] Error because of large dimension

2016-01-24 Thread Henrik Bengtsson
FYI, the matrix you tried to allocate would hold
(3195*1290*495*35*35*35*15) * 3 = 3.936248e+15 values.  Each value
would occupy 8 bytes of memory (for the double data type).  In other
words, in order to keep this data matrix in memory you would require a
computer with at least 3.148998e+16 bytes of RAM, i.e. 29327331 GiB =
28640 TiB = 28 PiB.  Storing such a large matrix even on file is not
possible.

In other words, you need to figure out how to approach your original
problem in a different way.
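
The numbers above can be reproduced directly in R.  The variable names
are only for illustration:

```r
n_rows   <- 3195 * 1290 * 495 * 35 * 35 * 35 * 15
n_values <- n_rows * 3          # three columns
bytes    <- n_values * 8        # 8 bytes per double

n_values                        # about 3.936248e+15
bytes / 2^30                    # size in GiB
bytes / 2^40                    # size in TiB
bytes / 2^50                    # size in PiB

# The requested number of rows does not fit in an R integer
n_rows > .Machine$integer.max
```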

/Henrik

On Sun, Jan 24, 2016 at 8:46 AM, li li  wrote:
> Hi all,
>   I am doing some calculation with very large dimension. I need to create a
> matrix with three columns and a very large number of rows
> (3195*1290*495*35*35*35*15=1.312083e+15) in order to allocate
> calculation result from a for loop.
> R does not allow me to create such a matrix because of the large dimension
> (see below). Is there a way to go around this?
>   Thanks very much!!
>  Hanna
>
>
>> matrix(0, 3195*1290*495*35*35*35*15, 3)
> Error in matrix(0, 3195 * 1290 * 495 * 35 * 35 * 35 * 15, 3) :
>   invalid 'nrow' value (too large or NA)
> In addition: Warning message:
> In matrix(0, 3195 * 1290 * 495 * 35 * 35 * 35 * 15, 3) :
>   NAs introduced by coercion
>>
>


Re: [R] close specific graphics device

2015-12-16 Thread Henrik Bengtsson
The R.devices package provides functions for this.  For instance, you
can open several devices with different labels and then close them in
whatever order you'd like:

> library("R.devices")
> devSet("foo")
> plot(1:10)
> devSet("bar")
> plot(10:1)
> devSet("foo")
> points(10:1)
> devSet("bar")
> devOff("foo")
> devOff("bar")

Alternatively, you can specify the 'label' argument when you use devNew(), e.g.

> devNew("x11", label="foo")
> plot(1:10)
> devNew("png", filename="myplot.png", label="bar")
> plot(10:1)
> devOff("foo")
> devOff("bar")

The R.devices package also allows you to open a device with any index
number in [2,63], e.g.

> devSet(43)

regardless of whether devices 2 to 42 already exist or not.

Also, if you don't already know, R.devices provides devEval(), e.g.

devEval("png", name="myplot", {
  plot(10:1)
})

which guarantees that the device is closed afterward, i.e. no more
forgetting to use dev.off().  Also, filename extensions etc are
automatically taken care of.  You can plot to multiple image types at
the same time, e.g.

devEval(c("png", "pdf", "eps"), name="myplot", aspectRatio=2/3, {
  plot(10:1)
})

There are also "quick" functions such as:

toPNG("myplot", aspectRatio=2/3, {
  plot(10:1)
})

Hope this helps

Henrik

PS. R.devices 2.13.2 is rolling out on CRAN right now - make sure to
use that version if you're doing _unbalanced_ opening/closing with
labels as in my first example.

On Tue, Dec 15, 2015 at 2:38 PM, Jim Lemon  wrote:
> Hi Dan,
> The range of device numbers seems to be 1-63. There doesn't appear to be a
> means of explicitly setting the device number when calling dev.new, and
> devices are numbered sequentially when they are opened. This means that
> even if you did know that the device number was, say, 4 it would be
> possible to close that device and open another device with the number 4.
>
> I suppose it would be possible to write wrapper functions for this, but I
> have to leave at the moment, so perhaps tomorrow.
>
> Jim
>
> On Wed, Dec 16, 2015 at 7:51 AM, Dalthorp, Daniel 
> wrote:
>
>> dev.off(which) can be used to close a specific graphics device where
>> "which" is the index of the device, but is there a way to assign a custom
>> number (or name) to a windows device so that specific window can be later
>> closed via dev.off (or some other method) if it is open?
>>
>> The following does NOT work. The target device is not open when its dev.off
>> is called, and another window that later got assigned the original index
>> associated with the target device is closed instead.
>>
>> plot(0,0,type='n') # target window to close
>> text(0,0,"close me")
>> targetindex<-dev.cur()
>>
>> # unbeknownst to the programmer, user closes device by clicking the red "X"
>> or...
>> dev.off()
>>
>> # user draws a new graph that he wants to keep open
>> plot(1,1,type='n')
>> text(1,1,"do not close me")
>>
>> # now it's time for the program to close the original graphics device (if
>> it still happens to be open)
>> dev.off(targetindex)
>>
>> # the wrong device has been closed because the original window had closed
>> and the index associated with original graph is now associated with
>> something else
>>
>> 
>>
>> I'm looking for something like:
>>
>> dev.off(which = "original figure") or dev.off(which = n), where n is a
>> custom index (like 1) that will not be later assigned to a different
>> device [unless explicitly assigned that index].
>>
>> Any help would be greatly appreciated.
>>
>> Thanks!
>>
>>
>>
>> --
>> Dan Dalthorp, PhD
>> USGS Forest and Rangeland Ecosystem Science Center
>> Forest Sciences Lab, Rm 189
>> 3200 SW Jefferson Way
>> Corvallis, OR 97331
>> ph: 541-750-0953
>> ddalth...@usgs.gov
>>


Re: [R] stopifnot with logical(0)

2015-12-12 Thread Henrik Bengtsson
On Sat, Dec 12, 2015 at 6:08 AM, Hadley Wickham <h.wick...@gmail.com> wrote:
> On Sat, Dec 12, 2015 at 3:54 AM, Martin Maechler
> <maech...@stat.math.ethz.ch> wrote:
>>>>>>> Henrik Bengtsson <henrik.bengts...@gmail.com>
>>>>>>> on Fri, 11 Dec 2015 08:20:55 -0800 writes:
>>
>> > On Fri, Dec 11, 2015 at 8:10 AM, David Winsemius 
>> <dwinsem...@comcast.net> wrote:
>> >>
>> >>> On Dec 11, 2015, at 5:38 AM, Dario Beraldi <dario.bera...@gmail.com> 
>> wrote:
>> >>>
>> >>> Hi All,
>> >>>
>> >>> I'd like to understand the reason why stopifnot(logical(0) == x) 
>> doesn't
>> >>> (never?) throw an exception, at least in these cases:
>> >>
>> >> The usual way to test for a length-0 logical object is to use 
>> length():
>> >>
>> >> x <- logical(0)
>> >>
>> >> stopifnot( !length(x) & mode(x)=="logical" )
>>
>> > I found
>>
>> > stopifnot(!length(x), mode(x) == "logical")
>>
>> > more helpful when troubleshooting, because it will tell you whether
>> > it's !length(x) or mode(x) == "logical" that is FALSE.  It's as if you
>> > wrote:
>>
>> > stopifnot(!length(x))
>> > stopifnot(mode(x) == "logical")
>>
>> > /Henrik
>>
>> Yes, indeed, thank you Henrik  --- and Jeff Newmiller who's nice
>> humorous reply added other relevant points.
>>
>> As author of stopifnot(), I do agree with Dario's  "gut feeling"
>> that stopifnot()  "somehow ought to do the right thing"
>> in cases such as
>>
>>     stopifnot(dim(x) == c(3,4))
>>
>> which is really a subtle version of his cases
>> {But the gut feeling is wrong, as I argue from now on}.
>
> Personally, I think the problem there is that people forget that == is
> vectorised, and for a non-vectorised equality check you really should
> use identical:
>
> stopifnot(identical(dim(x), c(3,4)))

Kids, this is one of the rare cases where you should not listen to Hadley
;)  Because,

> x <- matrix(1:12, nrow=3, ncol=4)
> dim(x)
[1] 3 4
> identical(dim(x), c(3,4))
[1] FALSE

Why?  Because:

> storage.mode(dim(x))
[1] "integer"
> storage.mode(c(3,4))
[1] "double"

My rule of thumb is that identical() is awesome, but you really have
to know the inner bits and pieces (*).  When in doubt, use
all.equal(), e.g.

> all.equal(dim(x), c(3,4))
[1] TRUE

Related to Hadley's point is that using all(x == y) is risky because
R recycles the shorter of the two vectors if one is longer than the
other, e.g.

> all(dim(x) == c(3,4))
[1] TRUE
> all(dim(x) == c(3,4,3,4))
[1] TRUE
> all(dim(x) == c(3,4,3,4,3,4))
[1] TRUE

so one really needs to check the lengths as well, e.g.

> all(length(dim(x)) == length(c(3,4)), dim(x) == c(3,4))
[1] TRUE


(*) ADVANCED: I would say it's risky to use:

> identical(dim(x), c(3L,4L))
[1] TRUE

because, who knows, in a future version of R we might see
matrices/arrays that support dimensions longer than
.Machine$integer.max, in which case dimensions may be stored as
doubles.  This is what we already have for very long vectors today,
cf. help("length").
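
The checks above can be wrapped in a small helper.  This is only a
sketch; same_dim() is a made-up name, not a function from base R or any
package.  It compares lengths first, so recycling cannot hide a
mismatch, and it does not care whether the dimensions are stored as
integers or doubles.

```r
# Hypothetical helper: TRUE only if x has exactly the expected dimensions
same_dim <- function(x, expected) {
  d <- dim(x)
  length(d) == length(expected) && all(d == expected)
}

x <- matrix(1:12, nrow = 3, ncol = 4)

same_dim(x, c(3, 4))          # double vs integer storage is fine
same_dim(x, c(3, 4, 3, 4))    # length mismatch is caught
same_dim(1:12, c(3, 4))       # plain vectors have dim() NULL

# all.equal() returns a character vector describing the difference when
# things are not equal, so wrap it in isTRUE() inside stopifnot()
stopifnot(isTRUE(all.equal(dim(x), c(3, 4))))
```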

Henrik

>
> Hadley
>
> --
> http://had.co.nz/



Re: [R] stopifnot with logical(0)

2015-12-11 Thread Henrik Bengtsson
On Fri, Dec 11, 2015 at 8:10 AM, David Winsemius  wrote:
>
>> On Dec 11, 2015, at 5:38 AM, Dario Beraldi  wrote:
>>
>> Hi All,
>>
>> I'd like to understand the reason why stopifnot(logical(0) == x) doesn't
>> (never?) throw an exception, at least in these cases:
>
> The usual way to test for a length-0 logical object is to use length():
>
> x <- logical(0)
>
> stopifnot( !length(x) & mode(x)=="logical" )

I found

stopifnot(!length(x), mode(x) == "logical")

more helpful when troubleshooting, because it will tell you whether
it's !length(x) or mode(x) == "logical" that is FALSE.  It's as if you
wrote:

stopifnot(!length(x))
stopifnot(mode(x) == "logical")
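
As to why the original tests pass silently, here is a small sketch.
Comparing against a zero-length vector gives a zero-length result, and
all() of nothing is TRUE, so there is nothing for stopifnot() to
complain about.  Adding an explicit length check makes the empty case
fail.

```r
x <- logical(0)

cmp <- (x == TRUE)
length(cmp)       # the comparison is itself zero-length
all(cmp)          # all() of an empty vector is TRUE

stopifnot(cmp)    # therefore this does not signal an error

# With an explicit length check the empty case is caught
res <- tryCatch({
  stopifnot(length(x) == 1L, x == TRUE)
  "passed"
}, error = function(e) "failed")
res
```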

/Henrik

>
>
>>
>> stopifnot(logical(0) == 1)
>> stopifnot(logical(0) == TRUE)
>> stopifnot(logical(0) == FALSE)
>>
>> My understanding is that logical(0) is an empty set, so I would expect the
>> above tests to fail.
>>
>> (I got bitten by this in a piece of code where "x" happened to be
>> logical(0) and stopifnot didn't catch it)
>>
>> Thanks!
>> Dario
> --
>
> David Winsemius
> Alameda, CA, USA
>


Re: [R] Ordering Filenames stored in list or vector

2015-12-04 Thread Henrik Bengtsson
> filenames <- c("Q_Read_prist#1...@1.xls", "Q_Read_prist#1...@10.xls", 
> "Q_Read_prist#1...@2.xls")
> filenames <- gtools::mixedsort(filenames, numeric.type="decimal")
> filenames
[1] "Q_Read_prist#1...@1.xls"  "Q_Read_prist#1...@2.xls"  
"Q_Read_prist#1...@10.xls"

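A base-R alternative, as a minimal sketch: extract the number after the
"@" and order by it.  The filenames below are made up (the real ones are
obscured in the archive), and the regular expression assumes that every
name ends in "@<digits>.xls".

```r
# Made-up filenames following the "<prefix>@<number>.xls" pattern
filenames <- c("run@1.xls", "run@10.xls", "run@2.xls")

# Pull out the digits between "@" and ".xls" and convert to integer
idx <- as.integer(sub("^.*@([0-9]+)\\.xls$", "\\1", filenames))

# Reorder the names by that number
sorted <- filenames[order(idx)]
sorted
```

Use order(idx, decreasing = TRUE) for descending order.
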
/Henrik

On Fri, Dec 4, 2015 at 7:53 AM, Boris Steipe  wrote:
> The thread below has a number of solutions. I personally like the one with 
> sprintf().
>https://stat.ethz.ch/pipermail/r-help/2010-July/246059.html
>
>
> B.
>
> On Dec 4, 2015, at 5:51 AM, BARLAS Marios 247554  wrote:
>
>> Hello everyone,
>>
>> I am an R rookie and I'm learning as I program.
>>
>> I am working on a script to process a large amount of data: I read a pattern 
>> of filenames in the folder I want and import their data
>>
>> filenames = list.files(path, pattern="*Q_Read_prist*")
>>
>> myfiles = lapply(filenames, function(x) read.xlsx2(file=x, sheetName="Data", 
>> header=TRUE, FILENAMEVAR=x))
>>
>> The problem is that R recognizes the files in a 'non human' order.
>>
>> Q_Read_prist#1...@1.xls   Q_Read_prist#1...@1.xls
>> Q_Read_prist#1...@10.xls Q_Read_prist#1...@10.xls
>> Q_Read_prist#1...@11.xls Q_Read_prist#1...@11.xls
>> Q_Read_prist#1...@12.xls Q_Read_prist#1...@12.xls
>> Q_Read_prist#1...@13.xls Q_Read_prist#1...@13.xls
>> Q_Read_prist#1...@14.xls Q_Read_prist#1...@14.xls
>> Q_Read_prist#1...@15.xls Q_Read_prist#1...@15.xls
>> Q_Read_prist#1...@16.xls Q_Read_prist#1...@16.xls
>> Q_Read_prist#1...@17.xls Q_Read_prist#1...@17.xls
>> Q_Read_prist#1...@18.xls Q_Read_prist#1...@18.xls
>> Q_Read_prist#1...@19.xls Q_Read_prist#1...@19.xls
>> Q_Read_prist#1...@2.xls   Q_Read_prist#1...@2.xls
>> Q_Read_prist#1...@3.xls   Q_Read_prist#1...@3.xls
>> Q_Read_prist#1...@4.xls   Q_Read_prist#1...@4.xls
>> Q_Read_prist#1...@5.xls   Q_Read_prist#1...@5.xls
>> Q_Read_prist#1...@6.xls   Q_Read_prist#1...@6.xls
>> Q_Read_prist#1...@7.xls   Q_Read_prist#1...@7.xls
>> Q_Read_prist#1...@8.xls   Q_Read_prist#1...@8.xls
>> Q_Read_prist#1...@9.xls   Q_Read_prist#1...@9.xls
>>
>> I tried to order them using order or sort but it doesn't seem to work. I have 
>> had the same issue in matlab but there I have a function to re-define the 
>> order in a "correct" way.
>>
>> Anyone knows of a smart way to sort these guys from 1 to 19 ascending or 
>> descending?
>>
>> Thanks in advance,
>> Mario
>>


Re: [R] SWEAVE - a gentle introduction

2015-11-17 Thread Henrik Bengtsson
When choosing source format, it's probably helpful to know that if you
work with a Markdown-based format (e.g. Rmarkdown) you'll be able to
generate HTML and/or PDF documents, whereas if you work
with LaTeX-based formats (e.g. Sweave/knitr) you will only be able to
output PDF documents (at least without great effort).

One major advantage with HTML documents/reports is that they aren't
constrained by page breaks, e.g. you don't have to worry about
generating long tables that run across two or more pages. With LaTeX
that is often a great pain.  These days even mathematical equations
render quite well in HTML.  I tend to use HTML output for everyday
analysis reports and PDF occasionally for more final artifacts such as
supplementary notes where I want to have full control of layout,
equations, figure sizes and bibliographies.

If you plan to write package vignettes using one of the above formats,
choice of vignette format does not matter these days.  They're all
equally easy to use and incorporate in packages.

Cheers,

Henrik

On Tue, Nov 17, 2015 at 11:09 AM, Duncan Murdoch
 wrote:
> On 17/11/2015 10:42 AM, Marc Schwartz wrote:
>>
>>
>>> On Nov 17, 2015, at 9:21 AM, John Sorkin 
>>> wrote:
>>>
>>> I am looking for a gentle introduction to SWEAVE, and would appreciate
>>> recommendations.
>>> I have an R program that I want to run and have the output and plots in
>>> one document. I believe this can be accomplished with SWEAVE. Unfortunately
>>> I don't know HTML, but am willing to learn. . . as I said I need a gentle
>>> introduction to SWEAVE.
>>> Thank you,
>>> John
>>>
>>
>>
>> John,
>>
>> A couple of initial comments.
>>
>> First, you will likely get some recommendations to also consider using
>> Knitr:
>>
>>http://yihui.name/knitr/
>>
>> which I do not use myself (I use Sweave), but to be fair, is worth
>> considering as an alternative.
>
>
> He did, and I'd agree with them.  I've switched to knitr for all new
> projects and some old ones.  knitr should be thought of as Sweave version 2.
>
> Duncan Murdoch
>
>
>>
>> Second, to create stand alone documents, as opposed to web based content,
>> you will likely want the output to be in TeX/LaTeX via Sweave, which can
>> then become PDF based documents via the post processing of the TeX/LaTeX
>> source. That is what I do for all of my analytic deliverables. You can also
>> use LaTeX classes like 'Beamer' to create Powerpoint-like slides for
>> presentation.
>>
>> Fritz' web site for Sweave is here:
>>
>>http://www.statistik.lmu.de/~leisch/Sweave/
>>
>> and there are some links to supporting materials there with very basic
>> examples.
>>
>> Another resource is:
>>
>>https://beckmw.files.wordpress.com/2014/02/sweave_intro1.pdf
>>
>> and if you Google for Sweave Introductions and Tutorials, there are a
>> myriad of others.
>>
>> In conjunction with Sweave itself, there are a variety of supporting
>> packages on CRAN that have related functionality (e.g. formatted LaTeX
>> output) that are worth knowing about and are included in the Reproducible
>> Research task view:
>>
>>https://cran.r-project.org/web/views/ReproducibleResearch.html
>>
>> Regards,
>>
>> Marc Schwartz
>>


Re: [R] conditionally disable evaluation of chunks in Rmarkdown...

2015-11-10 Thread Henrik Bengtsson
On Tue, Nov 10, 2015 at 9:00 AM, Yihui Xie  wrote:
> The short answer is you cannot. Inline R code is always evaluated.
> When it is not evaluated, I doubt if your output still makes sense,
> e.g. "The value of x is `r x`." becomes "The value of x is ." That
> sounds odd to me.
>
> If you want to disable the evaluation of inline code anyway, you may use
> a custom function to do it. e.g.
>
> cond_eval = function(x) {
>   if (isTRUE(knitr::opts_chunk$get('eval'))) x
> }
>
> Then `r cond_eval(x)` instead of `r x`.
>
> Regards,
> Yihui
> --
> Yihui Xie 
> Web: http://yihui.name
>
>
> On Tue, Nov 10, 2015 at 4:40 AM, Witold E Wolski  wrote:
>> I do have an Rmd where I would like to conditionally evaluate the second 
>> part.
>>
>> So far I am working with :
>>
>> ```{r}
>> if(length(specLibrary@ionlibrary) ==0){
>>   library(knitr)
>>   opts_chunk$set(eval=FALSE, message=FALSE, echo=FALSE)
>> }
>> ```
>>
>> Which disables the evaluation of subsequent chunks.
>>
>> However my RMD file contains also these kind of snippets : `r `
>>
>> How do I disable them?

Just an FYI, and maybe or maybe not an option for you: this is one of the
use cases where RSP (https://cran.r-project.org/package=R.rsp) is
handy, because it does not require code snippets (aka "code
chunks" as originally defined by weave/tangle literate programming) to
contain complete expressions.  With RSP-embedded documents, you can do
things such as

<% if (length(specLibrary@ionlibrary) > 0) { %>

[... code and text blocks to conditionally include ...]

<% } # if (length(specLibrary@ionlibrary) > 0) %>

or include from a separate file, e.g.

<% if (length(specLibrary@ionlibrary) > 0) { %> <%@include file="extras.md.rsp"%> <% } %>

You can also use loops over a mixture of code and text blocks etc.
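
As a rough illustration of mixing control flow with text, here is a
minimal sketch using R.rsp::rstring(), which processes an RSP string and
returns the result.  The template text and the variable 'n' are made up
for this example ('n' stands in for length(specLibrary@ionlibrary)), and
the block only does something if R.rsp is installed.

```r
# Hypothetical stand-in for length(specLibrary@ionlibrary)
n <- 3

# RSP template: the 'if' and 'for' blocks open in one code snippet and
# close in another, with plain text and inline expressions in between
tmpl <- paste0(
  "Report start.\n",
  "<% if (n > 0) { %>",
  "Found <%= n %> library entries.\n",
  "<% } %>",
  "<% for (i in seq_len(2)) { %>",
  "Section <%= i %>\n",
  "<% } %>"
)

if (requireNamespace("R.rsp", quietly = TRUE)) {
  # 'n' is looked up in the calling environment
  out <- R.rsp::rstring(tmpl)
  cat(out)
}
```

Setting n <- 0 should drop the "Found ..." line while leaving the rest
of the text in place.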

Depending on when 'specLibrary@ionlibrary' gets assigned, you could
preprocess your R Markdown file with RSP, but for this to work out of
the box you basically need to know the value
length(specLibrary@ionlibrary) before your R Markdown code is
evaluated, i.e. before you compile the Rmd file.  Your build pipeline
would then look something like:

rmd <- R.rsp::rfile("report.rmd.rsp")
rmarkdown::render(rmd)

/Henrik
(author of R.rsp)

>>
>> regards
>>
>>
>>
>> --
>> Witold Eryk Wolski


Re: [R] Changing text file to .r format

2015-11-06 Thread Henrik Bengtsson
My guess is that you're on Windows. If so, make sure to change settings in
Explorer to always show filename extensions, because now I think it drops
.txt when it lists/displays your files (this is one of the most annoying
features in Windows).

Also, some basic editors add a .txt extension when you save a file the first
time. You can try putting quotation marks around the filename to force it
not to, e.g. "RHtestsV4.r".
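
If the file on disk really is named RHtestsV4.r.txt, it can also be
renamed from within R.  Note that sub() only returns a modified string;
it is file.rename() that changes the file.  A minimal sketch, using a
throwaway file in a temporary directory (the directory and the file
content are made up for the example):

```r
# Work in a temporary directory so that no real files are touched
dir <- tempfile("rename-demo-")
dir.create(dir)
old <- file.path(dir, "RHtestsV4.r.txt")
writeLines("message('hello from RHtestsV4')", old)

# Drop only a trailing ".txt" from the name ...
new <- sub("\\.txt$", "", old)

# ... and then actually rename the file on disk
ok <- file.rename(old, new)

basename(new)
file.exists(new)

# The renamed file can now be sourced
source(new)
```

For the real file, 'old' would be the path to RHtestsV4.r.txt in your
working directory.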

Henrik
On Nov 6, 2015 15:23, "Chattopadhyay, Somsubhra"  wrote:

Dear all,

I am a beginner in R and want to ask a simple question. I have a code file
in text format which I need to change to .r format only. For example now it
is RHtestsV4.r.txt which needs to be changed to just RHtestsV4.r. I tried
this

sub("^([^.]*).*", "\\1", 'RHtestsV4.r.txt')
[1] "RHtestsV4"

But this didn't seem to work as again when I try to call the function using
source("RHtestsV4.r")
The error message is

Error in file(filename, "r", encoding = encoding) :
  cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
  cannot open file 'RHtestsV4.r': No such file or directory

I think it is due to the format of the file. Please help me to convert the
file to .r format.

Thanks
Som
--
Somsubhra Chattopadhyay
Graduate Research Assistant
Biosystem and Agricultural Engineering Department
University of Kentucky, Lexington, KY 40546
Email: schatto...@uky.edu
Cell: 9198026951



Re: [R] Last msg not sent to the list

2015-10-14 Thread Henrik Bengtsson
In addition,

if you go to https://stat.ethz.ch/mailman/listinfo/r-help (which is in
the footer of every R-help message), you'll find a link to 'R-help
Archives' (https://stat.ethz.ch/pipermail/r-help/).  On that latter
page, you'll see all messages that have been sent out to the list.
That will allow you to make sure your message went out.

/Henrik

On Wed, Oct 14, 2015 at 3:54 AM, Ivan Calandra
 wrote:
> Maram,
>
> I have received both of your e-mails on this topic, so they made it to the
> list.
> There is the option " Receive your own posts to the list?" on the membership
> configuration website (https://stat.ethz.ch/mailman/options/r-help/). If it
> is checked to "no", that would explain why you didn't receive your own
> posts.
>
> As to why nobody answered, no idea. Try again?
>
> HTH,
> Ivan
>
> --
> Ivan Calandra, PhD
> University of Reims Champagne-Ardenne
> GEGENAA - EA 3795
> CREA - 2 esplanade Roland Garros
> 51100 Reims, France
> +33(0)3 26 77 36 89
> ivan.calan...@univ-reims.fr
> https://www.researchgate.net/profile/Ivan_Calandra
>
> Le 14/10/15 12:38, marammagdysa...@gmail.com a écrit :
>
>> Dear All,
>>
>> My last mail entitled: "using the apply() family to evaluate nested
>> functions with common arguments" to the r-help list didn't reach me though
>> I've sent it 2 days ago. I've included my suggested code and asked about
>> some details to make it work. In addition, I haven't received any feedback
>> from the r-help that may be that mail had something wrong or needs some
>> approval or ... , as the ones I used to receive when I first applied to the
>> list.
>> Any idea why is that?!
>>   Thanks.
>> Maram Salem
>>
>> Sent from my iPhone

Re: [R] why must a named colClasses in read.table be in correct order

2015-07-08 Thread Henrik Bengtsson
Thanks for insisting; I was wrong and I'm happy to see that there is
indeed code intended for named 'colClasses', which even goes back to
2004.  But as you report, the names only work when
length(colClasses) < cols (which also explains why I thought it was not
supported).  I'm not sure if that _strictly less than_ test is
intentional or a mistake, but I would propose the following patch:

[HB-X201]{hb}: svn diff src\library\utils\R\readtable.R
Index: src/library/utils/R/readtable.R
===
--- src/library/utils/R/readtable.R (revision 68642)
+++ src/library/utils/R/readtable.R (working copy)
@@ -139,7 +139,7 @@
     if (rlabp) col.names <- c("row.names", col.names)

     nmColClasses <- names(colClasses)
-    if(length(colClasses) < cols)
+    if(length(colClasses) <= cols)
         if(is.null(nmColClasses)) {
             colClasses <- rep_len(colClasses, cols)
         } else {


Your example works with this patch.  I've made it source():able so you
can try it out (if you cannot source() https://, then download the
file and source it locally):

source("https://gist.githubusercontent.com/HenrikBengtsson/ed1eeb41a1b4d6c43b47/raw/ebe58f76e518dd014423bea466a5c93d2efd3c99/readtable-fix.R")

kkk <- c("a\tb",
         "3.14\tx")

colClasses <- c(a="numeric", b="character")
data <- read.table(textConnection(kkk),
                   sep="\t",
                   header = TRUE,
                   colClasses = colClasses)
str(data)
### 'data.frame':   1 obs. of  2 variables:
### $ a: num 3.14
### $ b: chr "x"

## Does not work with utils::read.table(), but with patch
data <- read.table(textConnection(kkk),
                   sep="\t",
                   header = TRUE,
                   colClasses = rev(colClasses))
str(data)
### 'data.frame':   1 obs. of  2 variables:
### $ a: num 3.14
### $ b: chr "x"

Let's hope that the above is a (10-year-old) typo, and changing a '<' to
a '<=' adds support for named 'colClasses', which is a really useful
functionality.
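
Without relying on the patch, one workaround is to reorder the named
classes to match the header before calling read.table().  A minimal
sketch using only base R; the variable names 'hdr' and 'ordered' are
just for illustration, and it assumes every column has an entry in the
named vector:

```r
kkk <- c("a\tb",
         "3.14\tx")

## Named classes, deliberately not in column order
cclasses <- c(b = "character", a = "numeric")

## Read the header (plus one row) just to learn the column order
hdr <- names(read.table(text = kkk, sep = "\t", header = TRUE, nrows = 1))

## Reorder by name, then drop the names so only position matters
ordered <- unname(cclasses[hdr])

data <- read.table(text = kkk,
                   sep = "\t",
                   header = TRUE,
                   colClasses = ordered)
str(data)
```

This costs an extra pass over the start of the input, but it does not
depend on how a given R version treats names(colClasses).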

/Henrik

On Wed, Jul 8, 2015 at 6:42 PM, Andreas Leha
andreas.l...@med.uni-goettingen.de wrote:
> Hi Henrik,
>
> Thanks for your reply.
>
> I am not (yet) convinced, though.  The help page for read.table
> mentions named colClasses and if I specify colClasses for not all
> columns, the names are taken into account:
>
> --8<---cut here---start--->8---
> kkk <- c("a\tb",
>          "3.14\tx")
> str(read.table(textConnection(kkk),
>                sep="\t",
>                header = TRUE))
>
> str(read.table(textConnection(kkk),
>                sep="\t",
>                header = TRUE,
>                colClasses=c(b="character")))
> --8<---cut here---end--->8---
>
> What am I missing?
>
> Best,
> Andreas
>
>
>
> On 09/07/2015 02:21, Henrik Bengtsson wrote:
>> read.table() does not make use of names(colClasses) - only its values.
>> Because of this, ordering is critical, as you noted. It shouldn't be
>> too hard to add support for a named `colClasses` argument of
>> utils::read.table(), but someone needs to convince the R core team
>> that this is a good idea.
>>
>> As an alternative, see R.filesets::readDataFrame() for a
>> read.table()-like function that matches names(colClasses) to column
>> names, if they exist.
>>
>> /Henrik
>> (author of R.filesets)
>>
>> On Wed, Jul 8, 2015 at 5:41 PM, Andreas Leha
>> andreas.l...@med.uni-goettingen.de wrote:
>>> Hi all,
>>>
>>> Apparently, the colClasses argument to read.table needs to be in the
>>> order of the columns *even when it is named*.  Why is that?  And where
>>> would I find it in the documentation?
>>>
>>> Here is a MWE:
>>>
>>> --8<---cut here---start--->8---
>>> kkk <- c("a\tb",
>>>          "3.14\tx")
>>> read.table(textConnection(kkk),
>>>            sep="\t",
>>>            header = TRUE)
>>>
>>> cclasses=c(b="character",
>>>            a="numeric")
>>>
>>> read.table(textConnection(kkk),
>>>            sep="\t",
>>>            header = TRUE,
>>>            colClasses = cclasses)  ## <--- error
>>>
>>> read.table(textConnection(kkk),
>>>            sep="\t",
>>>            header = TRUE,
>>>            colClasses = cclasses[order(names(cclasses))])
>>> --8<---cut here---end--->8---
>>>
>>>
>>> Thanks,
>>> Andreas
