Re: [R-pkg-devel] using portable simd instructions

2024-03-27 Thread Serguei Sokol

On 27/03/2024 at 14:54, jesse koops wrote:

I tried that but I found the interface awkward and there was really no
performance bonus. It was in the early phase of experimentation and I
didn't save it, so it could very well be that I got the compiler
settings wrong and simd was not used. But if that was the case, there
would still be the problem of using the correct compiler settings
cross platform.

When I compile the example from the cited page with the "authorized" flag "-std":

   g++ -std=c++20 stdx_simd.cpp -o tmp.exe

then I do:

   objdump -d tmp.exe > tmp.asm

I do find SIMD instructions in the assembler code, e.g.:

   grep paddd tmp.asm

14a8:   66 0f fe c1 paddd  %xmm1,%xmm0
8f7b:   66 0f fe c1 paddd  %xmm1,%xmm0



On Wed, 27 Mar 2024 at 14:44, Serguei Sokol wrote:


On 26/03/2024 at 15:51, Tomas Kalibera wrote:


On 3/26/24 10:53, jesse koops wrote:

Hello R-package-devel,

I recently got inspired by the rcppsimdjson package to try out SIMD
registers. It works fantastically on my computer, but I struggle to find
information on how to make it portable. It doesn't help in this case
that R and Rcpp make including C++ code so easy that I have never had
to learn about cmake and compiler flags. I would appreciate any help,
including of the type: "go read the instructions at ...".

I use RcppArmadillo and Rcpp. I currently include the following header:

#include <immintrin.h>

The functions in immintrin that I use are:

_mm256_loadu_pd
_mm256_set1_pd
_mm256_mul_pd
_mm256_fmadd_pd
_mm256_storeu_pd

and I define up to four __m256d registers. From information found
online (not sure where anymore) I constructed the following makevars
file:

CXX_STD = CXX14

PKG_CPPFLAGS = -I../inst/include -mfma -msse4.2 -mavx

PKG_CXXFLAGS = $(SHLIB_OPENMP_CXXFLAGS)
PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)

(I also use OpenMP; that has always worked fine, I just included all
the lines for completeness.) R CMD check gives me two notes:

─  using R version 4.3.2 (2023-10-31 ucrt)
─  using platform: x86_64-w64-mingw32 (64-bit)
─  R was compiled by
 gcc.exe (GCC) 12.3.0
 GNU Fortran (GCC) 12.3.0

❯ checking compilation flags used ... NOTE
Compilation used the following non-portable flag(s):
  '-mavx' '-mfma' '-msse4.2'

❯ checking C++ specification ... NOTE
  Specified C++14: please drop specification unless essential

But as far as I understand, the flags are necessary, at least in GCC.
How can I make this portable and CRAN-acceptable?


I think the best way to get portability is to use a higher-level library
that has already done the low-level work of maintaining multiple
versions of the code (for multiple instruction sets) and choosing the one
appropriate for the current CPU. It could be, say, LAPACK, BLAS or OpenMP,
depending on the problem at hand.

Talking about libraries, maybe
https://en.cppreference.com/w/cpp/experimental/simd will do the job?

Best,
Serguei.

   In some cases, code can be rewritten

so that the compiler can vectorize it better, using the level of
vectorized instructions that have been enabled.
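
As a minimal illustration of that advice (a sketch only, assuming Rcpp is available; the function name axpy is purely illustrative): a plain element-wise loop like the one below is simple enough for the compiler's auto-vectorizer, so no -mavx/-mfma flags are needed in Makevars, and the compiler emits whatever vector instructions the build flags already allow.

Rcpp::cppFunction('
NumericVector axpy(NumericVector x, NumericVector y, double a) {
    int n = x.size();
    NumericVector out(n);
    for (int i = 0; i < n; ++i)
        out[i] = a * x[i] + y[i];   // multiply-add pattern, auto-vectorizable
    return out;
}')
axpy(as.numeric(1:4), rep(1, 4), 2)
# [1] 3 5 7 9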

Unconditionally using GCC-specific or architecture-specific options in
packages would certainly not be portable. Even on Windows, R is now used
also with clang and on aarch64, so one should not assume a concrete
compiler and architecture.

Please note also that GCC on Windows has a bug due to which AVX2
instructions cannot be used reliably - the compiler doesn't always
properly align local variables on the stack when emitting these. See
[1,2] for more information.

Best
Tomas

[1] https://stat.ethz.ch/pipermail/r-sig-windows/2024q1/000113.html
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412



kind regards,
Jesse





__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] using portable simd instructions

2024-03-27 Thread Serguei Sokol

On 26/03/2024 at 15:51, Tomas Kalibera wrote:


On 3/26/24 10:53, jesse koops wrote:

Hello R-package-devel,

I recently got inspired by the rcppsimdjson package to try out SIMD
registers. It works fantastically on my computer, but I struggle to find
information on how to make it portable. It doesn't help in this case
that R and Rcpp make including C++ code so easy that I have never had
to learn about cmake and compiler flags. I would appreciate any help,
including of the type: "go read the instructions at ...".

I use RcppArmadillo and Rcpp. I currently include the following header:

#include <immintrin.h>

The functions in immintrin that I use are:

_mm256_loadu_pd
_mm256_set1_pd
_mm256_mul_pd
_mm256_fmadd_pd
_mm256_storeu_pd

and I define up to four __m256d registers. From information found
online (not sure where anymore) I constructed the following makevars
file:

CXX_STD = CXX14

PKG_CPPFLAGS = -I../inst/include -mfma -msse4.2 -mavx

PKG_CXXFLAGS = $(SHLIB_OPENMP_CXXFLAGS)
PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)

(I also use OpenMP; that has always worked fine, I just included all
the lines for completeness.) R CMD check gives me two notes:

─  using R version 4.3.2 (2023-10-31 ucrt)
─  using platform: x86_64-w64-mingw32 (64-bit)
─  R was compiled by
    gcc.exe (GCC) 12.3.0
    GNU Fortran (GCC) 12.3.0

❯ checking compilation flags used ... NOTE
   Compilation used the following non-portable flag(s):
 '-mavx' '-mfma' '-msse4.2'

❯ checking C++ specification ... NOTE
 Specified C++14: please drop specification unless essential

But as far as I understand, the flags are necessary, at least in GCC.
How can I make this portable and CRAN-acceptable?


I think the best way to get portability is to use a higher-level library
that has already done the low-level work of maintaining multiple
versions of the code (for multiple instruction sets) and choosing the one
appropriate for the current CPU. It could be, say, LAPACK, BLAS or OpenMP,
depending on the problem at hand.
Talking about libraries, maybe
https://en.cppreference.com/w/cpp/experimental/simd will do the job?


Best,
Serguei.

 In some cases, code can be rewritten
so that the compiler can vectorize it better, using the level of 
vectorized instructions that have been enabled.


Unconditionally using GCC-specific or architecture-specific options in 
packages would certainly not be portable. Even on Windows, R is now used 
also with clang and on aarch64, so one should not assume a concrete 
compiler and architecture.


Please note also that GCC on Windows has a bug due to which AVX2 
instructions cannot be used reliably - the compiler doesn't always 
properly align local variables on the stack when emitting these. See 
[1,2] for more information.


Best
Tomas

[1] https://stat.ethz.ch/pipermail/r-sig-windows/2024q1/000113.html
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412



kind regards,
Jesse



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] [External] [External] RcmdrPlugin.HH_1.1-48.tar.gz

2024-03-07 Thread Serguei Sokol

On 07/03/2024 at 11:08, Duncan Murdoch wrote:

On 07/03/2024 4:16 a.m., Ivan Krylov wrote:

On Wed, 6 Mar 2024 13:46:55 -0500
Duncan Murdoch  wrote:


is this just a more or less harmless error, thinking that
the dot needs escaping


I think it's this one. You are absolutely right that the dot doesn't
need escaping in either TRE (which is what's used inside exportPattern)
or PCRE. In PCRE, this regular expression would have worked as intended:

# We do match backslashes by mistake.
grepl('[\\.]', '\\')
# [1] TRUE

# In PCRE, this wouldn't have been a mistake.
grepl('[\\.]', c('\\', '.'), perl = TRUE)
# [1] FALSE TRUE



Thanks, I didn't realize that escaping in PCRE was optional.
Escaping is optional only inside brackets []. Outside them it becomes
mandatory if we want to match just "." and not any character:


grepl('.', c('\\', '.'), perl = TRUE)
#[1] TRUE TRUE
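
For completeness: inside the brackets a plain, unescaped dot is already literal in both engines:

grepl('[.]', c('\\', '.'))
# [1] FALSE  TRUE   (TRE)
grepl('[.]', c('\\', '.'), perl = TRUE)
# [1] FALSE  TRUE   (PCRE)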

Best,
Serguei.




So the default exportPattern line could be

   exportPattern("^[^.]")

and it would work even if things were changed so that PCRE was used 
instead of TRE.


Duncan Murdoch

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] CRAN uses an old version of clang

2024-02-09 Thread Serguei Sokol
Not really answering the question, but another way could be considered
if your code is in Rcpp and the calls to the Bessel and gamma functions
are not very frequent. These functions are available in base R and as
such are callable via Function():


> Rcpp::evalCpp('Function("besselK")(1., 0.2)')
[1] 0.42722
> Rcpp::evalCpp('Function("gamma")(4)') # 3!
[1] 6

https://teuder.github.io/rcpp4everyone_en/230_R_function.html#function
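
If staying in compiled code is preferred, another option (a sketch only; the wrapper name bessel_k_cpp is purely illustrative) is to call R's own C-level Rmath entry point Rf_bessel_k, the very function hinted at in the commented-out line of the log below, instead of std::cyl_bessel_k or Boost:

Rcpp::cppFunction(includes = "#include <Rmath.h>", code = '
double bessel_k_cpp(double x, double nu) {
    // third argument 1.0 = unscaled, as besselK(x, nu) in R
    return Rf_bessel_k(x, nu, 1.0);
}')
bessel_k_cpp(1, 0.2)
# [1] 0.42722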


Best,
Serguei.

On 09/02/2024 at 15:59, Marcin Jurek wrote:

Dear community,

I recently submitted an update to my package. Its previous version relied on
Boost for Bessel and gamma functions, but a colleague pointed out to me that
they are included in the standard library beginning with the C++17
standard.

I don't have access to a Mac so I tested my package on Rhub and on my local
Linux machine and everything ran fine. However, it seems like CRAN is using an old
version of Clang (14.0.3 vs 16 being the newest one), and it complained about
these Bessel functions. I'm pasting the installation log below. I wonder if
this is something I could hope to explain in cran-comments and have my
package accepted as is?

I could also revert to using Boost although I only need it for these
special functions and things are much cleaner without it. In addition, one
of the main reasons for this update was related to some warnings Boost
started throwing.

Really appreciate the help!

* installing *source* package ‘GPvecchia’ ...
** package ‘GPvecchia’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
using C++ compiler: ‘Apple clang version 14.0.3 (clang-1403.0.22.14.1)’
using C++17
using SDK: ‘MacOSX11.3.sdk’
clang++ -arch x86_64 -std=gnu++17
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/Rcpp/include'
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/RcppArmadillo/include'
-I/opt/R/x86_64/include -fPIC  -falign-functions=64 -Wall -g -O2
-c Esqe.cpp -o Esqe.o
clang++ -arch x86_64 -std=gnu++17
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/Rcpp/include'
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/RcppArmadillo/include'
-I/opt/R/x86_64/include -fPIC  -falign-functions=64 -Wall -g -O2
-c Matern.cpp -o Matern.o
clang++ -arch x86_64 -std=gnu++17
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/Rcpp/include'
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/RcppArmadillo/include'
-I/opt/R/x86_64/include -fPIC  -falign-functions=64 -Wall -g -O2
-c MaxMin.cpp -o MaxMin.o
clang++ -arch x86_64 -std=gnu++17
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/Rcpp/include'
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/RcppArmadillo/include'
-I/opt/R/x86_64/include -fPIC  -falign-functions=64 -Wall -g -O2
-c RcppExports.cpp -o RcppExports.o
clang++ -arch x86_64 -std=gnu++17
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/Rcpp/include'
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/RcppArmadillo/include'
-I/opt/R/x86_64/include -fPIC  -falign-functions=64 -Wall -g -O2
-c U_NZentries.cpp -o U_NZentries.o
clang++ -arch x86_64 -std=gnu++17
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/Rcpp/include'
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/RcppArmadillo/include'
-I/opt/R/x86_64/include -fPIC  -falign-functions=64 -Wall -g -O2
-c dist.cpp -o dist.o
clang++ -arch x86_64 -std=gnu++17
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/Rcpp/include'
-I'/Volumes/Builds/packages/big-sur-x86_64/Rlib/4.3/RcppArmadillo/include'
-I/opt/R/x86_64/include -fPIC  -falign-functions=64 -Wall -g -O2
-c fastTree.cpp -o fastTree.o
Matern.cpp:80:68: error: no member named 'cyl_bessel_k' in namespace 'std'
 covmat(j1,j2) = normcon*pow( scaledist, covparms(2)
)*std::cyl_bessel_k(covparms(2),scaledist);
//Rf_bessel_k(scaledist,covparms(2),1.0);
   ~^
1 error generated.
make: *** [Matern.o] Error 1
make: *** Waiting for unfinished jobs
ERROR: compilation failed for package ‘GPvecchia’
* removing 
‘/Volumes/Builds/packages/big-sur-x86_64/results/4.3/GPvecchia.Rcheck/GPvecchia’

[[alternative HTML version deleted]]


__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Native pipe in package examples

2024-01-26 Thread Serguei Sokol

On 26/01/2024 at 10:44, Serguei Sokol wrote:

On 26/01/2024 at 10:31, Serguei Sokol wrote:

On 25/01/2024 at 19:04, Berwin A Turlach wrote:

On Thu, 25 Jan 2024 09:44:26 -0800
Henrik Bengtsson  wrote:


On Thu, Jan 25, 2024 at 9:23 AM Berwin A Turlach
 wrote:


G'day Duncon,


Uups, apologies for the misspelling of your name Duncan. Fingers were
too fast. :)

[...]

But you could always code your example (not tested :-) ) along lines
similar to:

if( with(version, all(as.numeric(c(major, minor)) >= c(4, 1))) ){
   ## code that uses native pipe
}else{
   cat("You have to upgrade to R >= 4.1.0 to run this example\n")
}


That will unfortunately not work in this case, because |> is part of
the new *syntax* that was introduced in R 4.1.0.  Older versions of R
simply don't understand how to *parse* those two symbols next to
each other, e.g.

{R 4.1.0}> parse(text = "1:3 |> sum()")
expression(1:3 |> sum())

{R 4.0.5}> parse(text = "1:3 |> sum()")
Error in parse(text = "1:3 |> sum()") : <text>:1:6: unexpected '>'
1: 1:3 |>
  ^

In order for R to execute some code, it needs to be able to parse it
first. Only then, it can execute it.  So, here, we're not even getting
past the parsing phase.


Well, notwithstanding 'fortune(181)', you could code it as:

if( with(version, all(as.numeric(c(major, minor)) >= c(4, 1))) ){
    cat(eval(parse(text="1:3 |> sum()")), "\n")
}else{
   cat("You have to upgrade to R >= 4.1.0 to run this example\n")
}

Nitpicking a little bit: this test won't work for v5.0, as minor
"0" is less than "1". There are more canonical ways to test the
version and send a message (or a 'warning()'):


if (getVersion() >= "4.1") {

Oops, it won't work for v10.0. Better would be:

if (utils::compareVersion(getVersion(), "4.1.0") >= 0) {
Sorry for the annoyance (not a good day for sending messages); obviously it
should be 'getRversion()'.
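
And since getRversion() returns a numeric_version object, the comparison with a version string is component-wise and numeric, so the v10.0 concern goes away as well:

getRversion() >= "4.1.0"
# [1] TRUE   (on any R >= 4.1.0)
numeric_version("10.0.0") >= "4.1.0"
# [1] TRUE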


Best,
Serguei.



Best,
Serguei.


cat(eval(parse(text="1:3 |> sum()")), "\n")
} else {
   message("You have to upgrade to R >= 4.1.0 to run this example")
}

Best,
Serguei.




__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Native pipe in package examples

2024-01-26 Thread Serguei Sokol

On 26/01/2024 at 10:31, Serguei Sokol wrote:

On 25/01/2024 at 19:04, Berwin A Turlach wrote:

On Thu, 25 Jan 2024 09:44:26 -0800
Henrik Bengtsson  wrote:


On Thu, Jan 25, 2024 at 9:23 AM Berwin A Turlach
 wrote:


G'day Duncon,


Uups, apologies for the misspelling of your name Duncan. Fingers were
too fast. :)

[...]

But you could always code your example (not tested :-) ) along lines
similar to:

if( with(version, all(as.numeric(c(major, minor)) >= c(4, 1))) ){
   ## code that uses native pipe
}else{
   cat("You have to upgrade to R >= 4.1.0 to run this example\n")
}


That will unfortunately not work in this case, because |> is part of
the new *syntax* that was introduced in R 4.1.0.  Older versions of R
simply don't understand how to *parse* those two symbols next to
each other, e.g.

{R 4.1.0}> parse(text = "1:3 |> sum()")
expression(1:3 |> sum())

{R 4.0.5}> parse(text = "1:3 |> sum()")
Error in parse(text = "1:3 |> sum()") : <text>:1:6: unexpected '>'
1: 1:3 |>
  ^

In order for R to execute some code, it needs to be able to parse it
first. Only then, it can execute it.  So, here, we're not even getting
past the parsing phase.


Well, notwithstanding 'fortune(181)', you could code it as:

if( with(version, all(as.numeric(c(major, minor)) >= c(4, 1))) ){
    cat(eval(parse(text="1:3 |> sum()")), "\n")
}else{
   cat("You have to upgrade to R >= 4.1.0 to run this example\n")
}

Nitpicking a little bit: this test won't work for v5.0, as minor "0"
is less than "1". There are more canonical ways to test the version
and send a message (or a 'warning()'):


if (getVersion() >= "4.1") {

Oops, it won't work for v10.0. Better would be:

if (utils::compareVersion(getVersion(), "4.1.0") >= 0) {

Best,
Serguei.


cat(eval(parse(text="1:3 |> sum()")), "\n")
} else {
   message("You have to upgrade to R >= 4.1.0 to run this example")
}

Best,
Serguei.


__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Native pipe in package examples

2024-01-26 Thread Serguei Sokol

On 25/01/2024 at 19:04, Berwin A Turlach wrote:

On Thu, 25 Jan 2024 09:44:26 -0800
Henrik Bengtsson  wrote:


On Thu, Jan 25, 2024 at 9:23 AM Berwin A Turlach
 wrote:


G'day Duncon,


Uups, apologies for the misspelling of your name Duncan.  Fingers were
too fast. :)

[...]

But you could always code your example (not tested :-) ) along lines
similar to:

if( with(version, all(as.numeric(c(major, minor)) >= c(4, 1))) ){
   ## code that uses native pipe
}else{
   cat("You have to upgrade to R >= 4.1.0 to run this example\n")
}


That will unfortunately not work in this case, because |> is part of
the new *syntax* that was introduced in R 4.1.0.  Older versions of R
simply don't understand how to *parse* those two symbols next to
each other, e.g.

{R 4.1.0}> parse(text = "1:3 |> sum()")
expression(1:3 |> sum())

{R 4.0.5}> parse(text = "1:3 |> sum()")
Error in parse(text = "1:3 |> sum()") : <text>:1:6: unexpected '>'
1: 1:3 |>
  ^

In order for R to execute some code, it needs to be able to parse it
first. Only then, it can execute it.  So, here, we're not even getting
past the parsing phase.


Well, notwithstanding 'fortune(181)', you could code it as:

if( with(version, all(as.numeric(c(major, minor)) >= c(4, 1))) ){
cat(eval(parse(text="1:3 |> sum()")), "\n")
}else{
   cat("You have to upgrade to R >= 4.1.0 to run this example\n")
}

Nitpicking a little bit: this test won't work for v5.0, as minor "0"
is less than "1". There are more canonical ways to test the version
and send a message (or a 'warning()'):


if (getVersion() >= "4.1") {
   cat(eval(parse(text="1:3 |> sum()")), "\n")
} else {
   message("You have to upgrade to R >= 4.1.0 to run this example")
}

Best,
Serguei.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] How to debug segfault when running build -> document in Rstudio that includes TMB module

2024-01-24 Thread Serguei Sokol

On 24/01/2024 at 04:22, Carl Schwarz wrote:

I'm trying to update my SPAS package to respond to a CRAN check. Before
starting the changes, I tried to rebuild my package, but now get a segfault
when I try to do a devtools::document() or devtools::check(args =
c('--as-cran')). See below for output from the Rstudio "Build" window.

I've
- reinstalled ALL packages
- reinstalled R (4.3.2 on MacOS Intel Chip)
- reinstalled Rstudio

When I try a rebuild/document on a sister package it runs fine, so I
suspect that the problem is related to using a TMB module that is part of
the SPAS package written in Cpp.

How do I start to "debug" this to identify the problem?

Why not simply run devtools::document() from 'R -d gdb' ?

Best,
Serguei.



Thanks
Carl Schwarz




==> devtools::document(roclets = c('rd', 'collate', 'namespace', 'vignette'))
ℹ Updating SPAS documentationℹ Loading SPAS

  *** caught segfault ***
address 0x54e40, cause 'memory not mapped'

Traceback:
  1: dyn.load(dll_copy_file)
  2: library.dynam2(path, lib)
  3: load_dll(path)
  4: pkgload::load_all(path, helpers = FALSE, attach_testthat = FALSE)
  5: load_code(base_path)
  6: roxygen2::roxygenise(pkg$path, roclets)
  7: devtools::document(roclets = c("rd", "collate", "namespace",
"vignette"))
  8: withCallingHandlers(expr, packageStartupMessage = function(c)
tryInvokeRestart("muffleMessage"))
  9: suppressPackageStartupMessages({oldLC <-
Sys.getlocale(category = "LC_COLLATE")Sys.setlocale(category =
"LC_COLLATE", locale = "C")on.exit(Sys.setlocale(category =
"LC_COLLATE", locale = oldLC))devtools::document(roclets = c("rd",
"collate", "namespace", "vignette"))})
An irrecoverable exception occurred. R is aborting now ...

Exited with status 139.

[[alternative HTML version deleted]]


__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] [External] Re: zapsmall(x) for scalar x

2023-12-18 Thread Serguei Sokol via R-devel

On 18/12/2023 at 11:24, Martin Maechler wrote:

Serguei Sokol via R-devel
 on Mon, 18 Dec 2023 10:29:02 +0100 writes:

 > On 17/12/2023 at 18:26, Barry Rowlingson wrote:
 >> I think what's been missed is that zapsmall works relative to the 
absolute
 >> largest value in the vector. Hence if there's only one
 >> item in the vector, it is the largest, so its not zapped. The function's
 >> raison d'etre isn't to replace absolutely small values,
 >> but small values relative to the largest. Hence a vector of similar tiny
 >> values doesn't get zapped.
 >>
 >> Maybe the line in the docs:
 >>
 >> " (compared with the maximal absolute value)"
 >>
 >> needs to read:
 >>
 >> " (compared with the maximal absolute value in the vector)"

 > I agree that this change in the doc would clarify the situation but
 > would not resolve proposed corner cases.

 > I think that an additional argument 'mx' (absolute max value of
 > reference) would do. Consider:

 > zapsmall2 <-
 > function (x, digits = getOption("digits"), mx=max(abs(x), na.rm=TRUE))
 > {
 >     if (length(digits) == 0L)
 >     stop("invalid 'digits'")
 >     if (all(ina <- is.na(x)))
 >     return(x)
 >     round(x, digits = if (mx > 0) max(0L, digits -
 > as.numeric(log10(mx))) else digits)
 > }

 > then zapsmall2() without explicit 'mx' behaves identically to actual
 > zapsmall() and for a scalar or a vector of identical value, user can
 > manually fix the scale of what should be considered as small:

 >> zapsmall2(y)
 > [1] 2.220446e-16
 >> zapsmall2(y, mx=1)
 > [1] 0
 >> zapsmall2(c(y, y), mx=1)
 > [1] 0 0
 >> zapsmall2(c(y, NA))
 > [1] 2.220446e-16   NA
 >> zapsmall2(c(y, NA), mx=1)
 > [1]  0 NA

 > Obviously, the name 'zapsmall2' was chosen just for this explanation.
 > The original name 'zapsmall' could be reused as a full backward
 > compatibility is preserved.

 > Best,
 > Serguei.

Thank you, Serguei, Duncan, Barry et al.

Generally :
   Yes, zapsmall was meant and is used for zapping *relatively*
   small numbers.  In the other cases,  directly  round()ing is
   what you should use.

Specifically to Serguei's proposal of allowing the "max" value
to be user specified (in which case it is not really a true
max() anymore):

I've spent quite a few hours on this problem in May 2022, to
make it even more flexible, e.g. allowing the use of a 99%
percentile instead of the max(), or allowing +Inf to be excluded
from the "mx"; but -- compared to your zapsmall2() --
also allowing a reproducible automatic choice:


zapsmall <- function(x, digits = getOption("digits"),
                     mFUN = function(x, ina) max(abs(x[!ina])),
                     min.d = 0L)
{
    if (length(digits) == 0L)
        stop("invalid 'digits'")
    if (all(ina <- is.na(x)))
        return(x)
    mx <- mFUN(x, ina)
    round(x, digits = if (mx > 0) max(min.d, digits - as.numeric(log10(mx))) else digits)
}

with an optional 'min.d', as I had (vaguely remember having) found
at the time that '0' is also not always "the only correct" choice.
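
A small usage sketch of the mFUN idea above (the values are purely illustrative):

x <- c(1e-12, 2e-12, 1, 1e6)
zapsmall(x)                                                        # relative to max(abs(x)) = 1e6
zapsmall(x, mFUN = function(x, ina) quantile(abs(x[!ina]), 0.99))  # a percentile instead of the max
zapsmall(x, mFUN = function(x, ina) 1)                             # fix the reference scale to 1 (absolute zapping)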

Do you have a case or two where min.d could be useful?

Serguei.



Somehow I never got to propose/discuss the above,
but it seems a good time to do so now.

Martin



 >> barry
 >>
 >>
 >> On Sun, Dec 17, 2023 at 2:17 PM Duncan Murdoch 

 >> wrote:
 >>
 >>> This email originated outside the University. Check before clicking 
links
 >>> or attachments.
 >>>
 >>> I'm really confused.  Steve's example wasn't a scalar x, it was a
 >>> vector.  Your zapsmall() proposal wouldn't zap it to zero, and I don't
 >>> see why summary() would if it was using your proposal.
 >>>
 >>> Duncan Murdoch
 >>>
 >>> On 17/12/2023 8:43 a.m., Gregory R. Warnes wrote:
 >>>> Isn’t that the correct outcome?  The user can change the number of
 >>> digits if they want to see small values…
 >>>>
 >>>> --
 >>>> Change your thoughts and you change the world.
 >>>> --Dr. Norman Vincent Peale
 >>>>
 >>>>> On Dec 17, 2023, at 12:11 AM, Steve Martin 
 >>> wrote:
 >>>>> Zapping a vector of small numbers to zero would cause problems when
 >>>>> printing the results of summary(). For example, if



Re: [R-pkg-devel] Check warning around sprintf: Compiled code should not call entry points which might terminate R nor write to stdout/stderr instead of to the console, nor use Fortran I/O nor system

2023-11-20 Thread Serguei Sokol

On 19/11/2023 at 02:07, Iris Simmons wrote:

Yes, the reason for the error is the use of sprintf. You can instead use
snprintf where n is the maximum number of bytes to write, including the
terminating nul character. For example:

char msg[8191];
snprintf(msg, 8191, "criteria: error (%d) -> %s\n", inErr, errStr);

This line should be

snprintf(msg, 8190, "criteria: error (%d) -> %s\n", inErr, errStr);

i.e. one less than the size of 'msg', leaving room for the terminating 0-byte.
Otherwise, a recent version of gcc emits a warning that is caught by CRAN.

Best,
Serguei.



Rf_error(msg);

or however large you made the error string.


On Sat, Nov 18, 2023, 20:01 Iago Giné-Vázquez 
wrote:


Dear all,

I am updating a CRAN-archived R package, so it can get back to CRAN. But
there is a warning produced in Linux OS that I am not sure to understand
and I do not know how to solve, even after looking at ‘Writing portable
packages’ in the ‘Writing R Extensions’ manual and after searching in the
web. The warning is


* checking compiled code ... WARNING
File ‘ccckc/libs/ccckc.so’:
Found ‘sprintf’, possibly from ‘sprintf’ (C)
Object: ‘criteria.o’

Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console, nor use Fortran I/O
nor system RNGs nor [v]sprintf.
See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual.


The package contains both C and Fortran code and in the criteria.c there
is only a sprintf use, as follows:

sprintf(msg,"criteria: error (%d) -> %s\n", inErr, errStr);
Rf_error(msg);
Might the reason for the warning be the following line in the ‘Writing R Extensions’
manual?


Use of sprintf and vsprintf is regarded as a potential security risk and
warned about on some platforms. [82](https://cran.r-project.org/doc/manuals/R-exts.html#FOOT82)
R CMD check reports if any calls are found.

If that is the reason, is there any alternative to the use of sprintf?
Anyway, what can I do?

Thanks you in advance for your time.

Kind regards,
Iago

Sent with [Proton Mail](https://proton.me/) secure email.
 [[alternative HTML version deleted]]



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Virtual C++ functions

2023-11-15 Thread Serguei Sokol

On 15/11/2023 at 10:37, Michael Meyer via R-package-devel wrote:

Greetings,
Suppose I wanted to develop a package with C++ code that contains virtual
functions which the package user should define. It's assumed that evaluation is
expensive, so we do not want to define these in R and then call these
R functions from C++.
Hm, virtual C++ functions are defined at compilation time. Their binding
is done at runtime, but at compilation time they must already be defined.
So how could a package user (who has already installed, and therefore
compiled, your package) define them? Moreover, in R?
Or maybe you mean that a user imports your C++ code and defines his own
derived class, overriding your virtual functions, in his own C++ code?
Or, another option: by "virtual function" you mean what is usually called
a "callback function", e.g. a function searching for the roots of any
user-defined function passed to it as a parameter. This latter is a
callback function.
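
For that last interpretation, one common Rcpp pattern (a sketch only; all names here -- cost_fun_t, my_cost, make_cost, run_many -- are purely illustrative) is to let the user compile his own C++ function and hand it over as an external pointer (XPtr) to a plain function pointer, so the package calls it at C++ speed without evaluating any R code in the hot loop:

Rcpp::sourceCpp(code = '
#include <Rcpp.h>
using namespace Rcpp;

typedef double (*cost_fun_t)(double);

// what the *user* would compile once:
double my_cost(double x) { return x * x + 1.0; }

// [[Rcpp::export]]
XPtr<cost_fun_t> make_cost() {
    return XPtr<cost_fun_t>(new cost_fun_t(&my_cost));
}

// what the *package* would export:
// [[Rcpp::export]]
double run_many(XPtr<cost_fun_t> f, NumericVector x) {
    cost_fun_t fun = *f;
    double s = 0.0;
    for (int i = 0; i < x.size(); ++i)
        s += fun(x[i]);            // no call back into R here
    return s;
}
')
run_many(make_cost(), c(1, 2, 3))
# [1] 17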


Could you clarify what you mean by "virtual function" and "the package user
should define"?


Best,
Serguei.


Is this a reasonable idea with a standard solution? Are there packages that do
this?
Thanks in advance for all answers,

Michael Meyer
[[alternative HTML version deleted]]



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Rmarkdown fails if (quote) r (space) is used

2023-11-03 Thread Serguei Sokol

On 03/11/2023 at 15:54, J C Nash wrote:

I've spent a couple of hours with an Rmarkdown document where I
was describing some spherical coordinates made up of a radius r and
some angles. I wanted to fix the radius at 1.

In my Rmarkdown text I wrote

    Thus we have `r = 1` ...
To avoid confusion between inline code and fixed-font typesetting,
could it be


   Thus we have ` r = 1` ...

(with a space after an opening quote)?

Best,
Serguei.



This caused failure to render with "unexpected =". I was using Rstudio
at first and didn't see the error msg.

If I use "radius R" and `R = 1`, things are fine, or `r=1` with no space,
but the particular "(quote) r (space)" seems to trigger code block 
processing.


Perhaps this note can save others some wasted time.

I had thought (obviously incorrectly) that one needed ```{r something}
to start the code chunk.

JN



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Link to MKL instead of RBLAS on CRAN

2023-09-27 Thread Serguei Sokol

On 27/09/2023 at 14:11, Sameh Abdulah wrote:

Hi,

Is it possible to link with MKL instead of RBLAS when submitting my package to 
CRAN?
Usually, it's not the business of a package to choose a BLAS to link to.
Many options exist for this on the user's side. For example, when
installing R you can see
https://cran.r-project.org/doc/manuals/r-release/R-admin.html#BLAS or
https://www.intel.com/content/www/us/en/developer/articles/technical/using-onemkl-with-r.html
Another option for "standard" R which can work is simply to symlink
Rblas.so to libopenblas.so or whatever BLAS you want. For MKL it can be
a little trickier, as it requires some additional libraries; it is up to
you to make them findable at run time.


Best,
Serguei.


Do CRAN support other BLAS libraries?

Best,
--Sameh



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] [EXTERNAL] Re: Warning: a function declaration without a prototype is deprecated in all versions of C

2023-09-26 Thread Serguei Sokol

On 26/09/2023 at 10:50, Iñaki Ucar wrote:

On Tue, 26 Sept 2023 at 10:29, Sameh Abdulah  wrote:


Thanks for replying!

The main problem is that this warning comes from a C library that I am relying on; I
have no control to fix the warning there. So I am still getting this warning
from R when building my package.


We don't have a way of knowing for sure, because you didn't provide a
link to the package in question, so I'm just guessing here. From your
description, it seems that you vendor OpenBLAS in your package,
And if you do, it's probably not the best way to proceed. R already
relies heavily on BLAS, so it is already available, and maybe it is
sufficient in your case to add the following line to Makevars to link
to the local BLAS and leave the choice of vendor to the end user
(OpenBLAS, ATLAS, MKL, ...):


PKG_LIBS = $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)

You can see a demo package of using BLAS/LAPACK in R e.g. at 
https://github.com/cjgeyer/mat


Best,
Serguei.


and
that's why you get a warning. Then you *do* have control to fix that:
just patch your copy appropriately. This is what e.g. the BH package
(and others) do. And in this case it would be nice to send the fix
upstream too.

Iñaki



Best,
--Sameh

From: R-package-devel  on behalf of Jeff 
Newmiller via R-package-devel 
Date: Tuesday, September 26, 2023 at 11:19 AM
To: r-package-devel@r-project.org 
Subject: [EXTERNAL] Re: [R-pkg-devel] Warning: a function declaration without a 
prototype is deprecated in all versions of C
This error arises because you are not declaring the function properly before 
you call it... likely because you have not included the appropriate header file 
or because you have typoed the function call.

If you provide a link to your package someone may point you more precisely to 
your error, but this is a pretty basic C language question rather than an R 
package question so it isn't technically on topic here.

On September 26, 2023 12:58:25 AM PDT, Sameh Abdulah 
 wrote:

Dear Colleagues,


I've encountered a warning in my package that states:

'warning: a function declaration without a prototype is deprecated in all 
versions of C [-Wstrict-prototypes].'

This warning originates from one of the libraries I depend on, specifically 
OpenBLAS. So, I have no control to fix it inside OpenBLAS.

I'm not sure how to resolve this issue.


Best regards,
--Sameh"




--
Sent from my phone. Please excuse my brevity.

__
R-package-devel@r-project.org mailing list
https://urldefense.com/v3/__https://stat.ethz.ch/mailman/listinfo/r-package-devel__;!!Nmw4Hv0!yyvWt-qHY4RQENDvneARJTJbYchLTruMwyEmREYaEtV52oUiLbgVqWM1wxJW-ijKGJeNgHq1HWtnHCQ1_CqTIRN9gfJfWX3c1A7nVQ$








__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] Strange behaviour of do.call()

2023-09-19 Thread Serguei Sokol via R-devel

On 19/09/2023 at 16:44, Duncan Murdoch wrote:
The knitr::kable() function does some internal setup, including 
determining the target format, and then calls an internal function using


  do.call(paste("kable", format, sep = "_"), list(x = x,
    caption = caption, escape = escape, ...))

I was interested in setting the `vlign` argument to 
knitr:::kable_latex, using this code:


  knitr::kable(head(mtcars), format="latex", align = "c", vlign="")

If I debug knitr::kable, I can see that `vlign = ""` is part of 
list(...).  However, if I debug knitr:::kable_latex, I get weird results:


  > debug(knitr:::kable_latex)
  > knitr::kable(head(mtcars), format="latex", align = "c", vlign="")

If I do this in my R v4.3.1 on linux, I get:

> debug(knitr:::kable_latex)
> knitr::kable(head(mtcars), format="latex", align = "c", vlign="")
Error in kable_latex(x = c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710",  :
  unused argument (vlign = "")

Looking at args(knitr:::kable_latex), I see two similar arguments,
'valign' and 'vline', but no 'vlign'.

Could it be just a typo in your code?
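
This can be checked directly:

"vlign" %in% names(formals(knitr:::kable_latex))
# [1] FALSE   (hence the "unused argument (vlign = ...)" error above)
c("valign", "vline") %in% names(formals(knitr:::kable_latex))
# [1] TRUE TRUE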


debugging in: kable_latex(x = c("Mazda RX4", "Mazda RX4 Wag", "Datsun 
710",

  "Hornet 4 Drive", "Hornet Sportabout", "Valiant", "21.0", "21.0",
  "22.8", "21.4", "18.7", "18.1", "6", "6", "4", "6", "8", "6",
  "160", "160", "108", "258", "360", "225", "110", "110", "93",
  "110", "175", "105", "3.90", "3.90", "3.85", "3.08", "3.15",
  "2.76", "2.620", "2.875", "2.320", "3.215", "3.440", "3.460",
  "16.46", "17.02", "18.61", "19.44", "17.02", "20.22", "0", "0",
  "1", "1", "0", "1", "1", "1", "1", "0", "0", "0", "4", "4", "4",
  "3", "3", "3", "4", "4", "1", "1", "2", "1"), caption = NULL,
  escape = TRUE, vlign = "")
debug: {

  [rest of function display omitted]

I see here that vlign = "" is being shown as an argument. However,
when I print vlign, sometimes I get "object not found", and sometimes
I get


  Browse[2]> vline
  debug: [1] "|"

Here again, was 'vline' typed on purpose instead of 'vlign'?

Best,
Serguei.



(which is what the default value would be).  In the latter case, I 
also see


  Browse[2]> list(...)
  $vlign
  [1] ""

i.e. vlign remains part of the ... list, it wasn't bound to the 
argument named vlign.


I can't spot anything particularly strange in the way knitr is 
handling this; can anyone else?  My sessionInfo() is below.


Duncan Murdoch

> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.6.9

Matrix products: default
BLAS: 
/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib 

LAPACK: 
/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; 
 LAPACK version 3.11.0


locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

time zone: America/Toronto
tzcode source: internal

attached base packages:
[1] stats graphics  grDevices utils datasets  methods base

loaded via a namespace (and not attached):
[1] compiler_4.3.1 tools_4.3.1    knitr_1.44 xfun_0.40

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: so...@insa-toulouse.fr
https://www.toulouse-biotechnology-institute.fr/en/plateformes-plateaux/cellule-mathematiques/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [R-pkg-devel] Check package without suggests

2023-07-18 Thread Serguei Sokol

Is it possible that you have complicated the task unnecessarily?
Normally, you can just do

if (requireNamespace("<suggested_pkg>", quietly = TRUE)) {
 # do the tests involving <suggested_pkg>
}

Wasn't that enough?
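
For instance, for a test that needs a suggested package (the package name MASS is purely illustrative):

if (requireNamespace("MASS", quietly = TRUE)) {
  fit <- MASS::rlm(stack.loss ~ ., data = stackloss)
  stopifnot(inherits(fit, "rlm"))
} else {
  message("MASS not installed -- skipping the tests that need it")
}

(With testthat, testthat::skip_if_not_installed("MASS") serves the same purpose more idiomatically.)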

Best,
Serguei.


On 18/07/2023 at 16:37, John Harrold wrote:

Howdy Folks,

I recently had a package start failing because I wasn't checking properly in
my tests to make sure my suggested packages were installed before running
tests. I think this is something new running on CRAN, where packages are
tested with only the packages specified as Imports in the DESCRIPTION file
installed. It took me a bit of back and forth to get all of these
issues worked out.  I was wondering if anyone has a good way to run R CMD
check with only the imports installed?  A GitHub action, or a
specific platform on R-hub?

Thank you,

John
:wq

[[alternative HTML version deleted]]



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] dir.exists returns FALSE on Symlink directory

2023-06-26 Thread Serguei Sokol via R-devel

On 26/06/2023 at 17:17, Serguei Sokol wrote:

On 26/06/2023 at 16:26, Dipterix Wang wrote:

I hope I'm not asking a stupid question...
Many think that there is no such thing as a "stupid question". However,
this one looks more appropriate for the r-help list, doesn't it?


  If I symlink a directory, is the symlink considered a directory in R?
If so, why does `dir.exists` return FALSE on it?


I understand a symlink is essentially a file, but functionally a
symlink to a directory is no different from the directory itself; then
again, a directory is also essentially a file.


Is there any way that allows me to check whether a symlink points to a
directory in base R, like `dir.exists(..., symlink_ok = TRUE)`?

What about:

dir.exists(Sys.readlink("your_link"))

Or even better:

file_test("-d", "your_real_dir_or_symlink_to_dir")

which returns TRUE for both a real dir and a symlink to a dir.
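
A quick self-contained check (assuming file.symlink() is supported on the platform):

real <- tempfile("realdir"); dir.create(real)
lnk  <- tempfile("symlink"); file.symlink(real, lnk)
Sys.readlink(lnk)                  # the target path
dir.exists(Sys.readlink(lnk))      # [1] TRUE
file_test("-d", lnk)               # [1] TRUE
file_test("-d", real)              # [1] TRUE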

Best,
Serguei.



?

Best,
Serguei.



Thanks,
Dipterix
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel






--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: so...@insa-toulouse.fr
https://www.toulouse-biotechnology-institute.fr/en/plateformes-plateaux/cellule-mathematiques/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] dir.exists returns FALSE on Symlink directory

2023-06-26 Thread Serguei Sokol via R-devel

On 26/06/2023 at 16:26, Dipterix Wang wrote:

I hope I'm not asking a stupid question...
Many think that there is no such thing as a "stupid question". However,
this one looks more appropriate for the r-help list, doesn't it?



  If I symlink a directory, is the symlink considered a directory in R? If so, why
does `dir.exists` return FALSE on it?

I understand a symlink is essentially a file, but functionally a symlink to a
directory is no different from the directory itself; then again, a directory is also
essentially a file.

Is there any way that allows me to check whether a symlink points to a directory in
base R, like `dir.exists(..., symlink_ok = TRUE)`?

What about :

dir.exists(Sys.readlink("your_link"))

?

Best,
Serguei.



Thanks,
Dipterix
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: so...@insa-toulouse.fr
https://www.toulouse-biotechnology-institute.fr/en/plateformes-plateaux/cellule-mathematiques/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] codetools wrongly complains about lazy evaluation in S4 methods

2023-06-07 Thread Serguei Sokol via R-devel

On 03/06/2023 at 17:50, Mikael Jagan wrote:
In a package, I define a method for not-yet-generic function 'qr.X' 
like so:


    > setOldClass("qr")
    > setMethod("qr.X", signature(qr = "qr"), function(qr, complete, 
ncol) NULL)


The formals of the newly generic 'qr.X' are inherited from the 
non-generic

function in the base namespace.  Notably, the inherited default value of
formal argument 'ncol' relies on lazy evaluation:

    > formals(qr.X)[["ncol"]]
    if (complete) nrow(R) else min(dim(R))

where 'R' must be defined in the body of any method that might 
evaluate 'ncol'.
To my surprise, tools:::.check_code_usage_in_package() complains about 
the

undefined symbol:

    qr.X: no visible binding for global variable 'R'
    qr.X,qr: no visible binding for global variable 'R'
    Undefined global functions or variables:
  R
I think this issue is similar to the complaints about undefined
variables in expressions involving non-standard evaluation, e.g. column
names of a data frame used as unquoted symbols. One of the workarounds
is simply to declare them somewhere in your code. In your case, it
could be something as simple as:


  R = NULL
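
or, equivalently, the documented declaration mechanism for such names, placed at the package top level (it should silence the same "no visible binding" NOTE):

  if (getRversion() >= "2.15.1") utils::globalVariables("R")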

Best,
Serguei.



I claim that it should _not_ complain, given that lazy evaluation is
really a feature of the language _and_ given that it already does not
complain about

the formals of functions that are not S4 methods.

Having said that, it is not obvious to me what in codetools would need 
to change

here.  Any ideas?

I've attached a script that creates and installs a test package and 
reproduces
the check output by calling tools:::.check_code_usage_in_package().  
Hope it

gets through.

Mikael



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [R-pkg-devel] Problems with devtools::build() in R

2023-05-16 Thread Serguei Sokol

On 16/05/2023 at 18:07, Jarrett Phillips wrote:

Hi All,

I'm trying to generate a `tar.gz` file on a Mac for R package submission to
CRAN but am having issues.

I'm using `devtools`, specifically `build()` and `install()`.

My package relies on compiled code via `Rcpp/RcppArmadillo`.

 build("HACSim_OO")
 ── R CMD build
─
 ✔  checking for file ‘/Users/jarrettphillips/Desktop/HAC
simulation/HACSim_OO/DESCRIPTION’ ...
 ─  preparing ‘HACSim’:
 ✔  checking DESCRIPTION meta-information ...
 ─  cleaning src
 ─  installing the package to process help pages
  ---
 ─  installing *source* package ‘HACSim’ ...
** using staged installation
** libs
clang++ -arch arm64 -std=gnu++11 -
I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
  
-I'/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rcpp/include'
-I'/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/RcppArmadillo/include'
-I/opt/R/arm64/include-fPIC  -falign-functions=64 -Wall -g -O2  -Wall
-pedantic -fdiagnostics-color=always -c RcppExports.cpp -o RcppExports.o
clang++ -arch arm64 -std=gnu++11
-I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG
  
-I'/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rcpp/include'
-I'/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/RcppArmadillo/include'
-I/opt/R/arm64/include-fPIC  -falign-functions=64 -Wall -g -O2  -Wall
-pedantic -fdiagnostics-color=always -c accumulate.cpp -o accumulate.o
clang++ -arch arm64 -std=gnu++11 -dynamiclib
-Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module
-multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib
-L/opt/R/arm64/lib -o HACSim.so RcppExports.o accumulate.o
-L/Library/Frameworks/R.framework/Resources/lib -lRlapack
-L/Library/Frameworks/R.framework/Resources/lib -lRblas
-L/opt/R/arm64/gfortran/lib/gcc/aarch64-apple-darwin20.6.0/12.0.1
-L/opt/R/arm64/gfortran/lib -lgfortran -lemutls_w -lquadmath
-F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework
-Wl,CoreFoundation
ld: warning: directory not found for option
'-L/opt/R/arm64/gfortran/lib/gcc/aarch64-apple-darwin20.6.0/12.0.1'
ld: warning: directory not found for option
'-L/opt/R/arm64/gfortran/lib'
ld: library not found for -lgfortran
clang: error: linker command failed with exit code 1 (use -v to see
invocation)
make: *** [HACSim.so] Error 1
ERROR: compilation failed for package ‘HACSim’
 ─  removing
‘/private/var/folders/r4/xm5blbcd2tn06tjv00lm1c78gn/T/RtmpN4uaYR/Rinstdf4219594de/HACSim’
  ---
 ERROR: package installation failed
 Error in `(function (command = NULL, args = character(),
error_on_status = TRUE, …`:
 ! System command 'R' failed
  ---
  Exit status: 1
  stdout & stderr: 
  ---
 Type .Last.error to see the more details.

`clang` is installed (since I am able to run the code within my package)
and I've verified by typing `gcc` in the Mac Terminal. I've also installed
`Homebrew` and `gfortran`, verifying via typing in the Terminal.

Any idea on what's going on and how to fix the issue(s)?

Try to add in /src/Makevars:

PKG_LIBS=$(FLIBS)

Best,
Serguei.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] save.image Non-responsive to Interrupt

2023-05-04 Thread Serguei Sokol via R-devel
ossibly
expensive) ui check again.

Basic example: https://github.com/r-devel/r-svn/pull/125/files





{{ saving the whole user workspace is not "valid" in that sense
in my view.  I tell all my (non-beginner) Rstudio-using
students they should turn *off* the automatic saving and
loading at session end / beginning; and for reproducibility
only saveRDS() [or save()] *explicitly* a few precious
objects ..
}}

Again, we don't want to punish people who know what they are
doing, just because other R users manage to hang their R session
by too little thinking ...

Your patch adds 15 such interrupt checking calls which may
really be too much -- I'm not claiming they are: with our
recursive objects it's surely not very easy to determine the
"minimally necessary" such calls.

In addition, we may still consider adding an extra optional
argument, say   `check.interrupt = TRUE`
which we may default to TRUE when  save.image() is called
but e.g., not when serialize() is called..

Martin

 > --
 > Best regards,
 > Ivan
 > x[DELETED ATTACHMENT external: Rd_IvanKrylov_interrupt-serialize.patch, 
text/x-patch]
 > __
 > R-devel@r-project.org mailing list
 > https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: so...@insa-toulouse.fr
http://www.toulouse-biotechnology-institute.fr/en/technology_platforms/mathematics-cell.html

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] xyTable(x,y) versus table(x,y) with NAs

2023-04-25 Thread Serguei Sokol via R-devel

On 25/04/2023 at 17:39, Bill Dunlap wrote:

x <- c(1, 1, 2, 2,  2, 3)
y <- c(1, 2, 1, 3, NA, 3)

str(xyTable(x,y))

List of 3
  $ x : num [1:6] 1 1 2 2 NA 3
  $ y : num [1:6] 1 2 1 3 NA 3
  $ number: int [1:6] 1 1 1 NA NA 1


How many (2,3)s do we have?  At least one, the third entry, but the fourth
entry, (2,NA), is possibly a (2,3) so we don't know and make the count NA.
I suspect this is not the intended logic, but a byproduct of finding value
changes in a sorted vector with the idiom x[-1] != x[-length(x)].  Also, the
following does follow that logic:


x <- c(1, 1, 2, 2,  5, 6)
y <- c(2, 2, 2, 4, NA, 3)
str(xyTable(x,y))

List of 3
  $ x : num [1:5] 1 2 2 5 6
  $ y : num [1:5] 2 2 4 NA 3
  $ number: int [1:5] 2 1 1 1 1

Not really. If we take

  x <- c(1, 1, 2, 2,  5, 6, 5, 5)
  y <- c(2, 2, 2, 4, NA, 3, 3, 4)

we get

  str(xyTable(x,y))

List of 3
 $ x : num [1:7] 1 2 2 5 5 NA 6
 $ y : num [1:7] 2 2 4 3 4 NA 3
 $ number: int [1:7] 2 1 1 1 NA NA 1

How many (5, 3) do we have? At least 1, but (5, NA) is possibly (5, 3), so we
should get NA; yet we get 1.
How many (5, 4) do we have? At least 1, but (5, NA) is possibly (5, 4), and here
we do get NA. So the restored logic is not consistent.
Not to mention that a (NA, NA) pair appeared and no (5, NA) pair was
produced.


Best,
Serguei.






table() does not use this logic, as one NA in a vector would make all the
counts NA.  Should xyTable have a way to handle NAs the way table() does?

-Bill

On Tue, Apr 25, 2023 at 1:26 AM Viechtbauer, Wolfgang (NP) <
wolfgang.viechtba...@maastrichtuniversity.nl> wrote:


Hi all,

Posted this many years ago (
https://stat.ethz.ch/pipermail/r-devel/2017-December/075224.html), but
either this slipped under the radar or my feeble mind is unable to
understand what xyTable() is doing here and nobody bothered to correct me.
I now stumbled again across this issue.

x <- c(1, 1, 2, 2,  2, 3)
y <- c(1, 2, 1, 3, NA, 3)
table(x, y, useNA="always")
xyTable(x, y)

Why does xyTable() report that there are NA instances of (2,3)? I could
understand the logic that the NA could be anything, including a 3, so the
$number value for (2,3) is therefore unknown, but then the same should
apply so (2,1), but here $number is 1, so the logic is then inconsistent.

I stared at the xyTable code for a while and I suspect this is coming from
order() using na.last=TRUE by default, but in any case, to me the behavior
above is surprising.

Best,
Wolfgang



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] xyTable(x,y) versus table(x,y) with NAs

2023-04-25 Thread Serguei Sokol via R-devel

I correct myself. Obviously, the line

first[is.na(first) | isFALSE(first)] <- FALSE

should read

first[is.na(first)] <- FALSE

Serguei.

On 25/04/2023 at 11:30, Serguei Sokol wrote:

On 25/04/2023 at 10:24, Viechtbauer, Wolfgang (NP) wrote:

Hi all,

Posted this many years ago 
(https://stat.ethz.ch/pipermail/r-devel/2017-December/075224.html), 
but either this slipped under the radar or my feeble mind is unable 
to understand what xyTable() is doing here and nobody bothered to 
correct me. I now stumbled again across this issue.


x <- c(1, 1, 2, 2,  2, 3)
y <- c(1, 2, 1, 3, NA, 3)
table(x, y, useNA="always")
xyTable(x, y)

Why does xyTable() report that there are NA instances of (2,3)? I 
could understand the logic that the NA could be anything, including a 
3, so the $number value for (2,3) is therefore unknown, but then the 
same should apply so (2,1), but here $number is 1, so the logic is 
then inconsistent.


I stared at the xyTable code for a while and I suspect this is coming 
from order() using na.last=TRUE by default, but in any case, to me 
the behavior above is surprising.
Not really. The variable 'first' in xyTable() is supposed to detect the
positions of the first values in sequences of repeated pairs. Then it is
used to retain only their indexes in a vector of the form 1:n. Finally, by
taking diff(), the number of repeated pairs is obtained. However, as
'first' will contain one NA for your example, the diff() call will
produce two NAs by taking the difference with the preceding and following
numbers. Hence the result.


Here is a slightly modified version of xyTable to handle NA too.

xyTableNA <- function (x, y = NULL, digits)
{
    x <- xy.coords(x, y, setLab = FALSE)
    y <- signif(x$y, digits = digits)
    x <- signif(x$x, digits = digits)
    n <- length(x)
    number <- if (n > 0) {
    orderxy <- order(x, y)
    x <- x[orderxy]
    y <- y[orderxy]
    first <- c(TRUE, (x[-1L] != x[-n]) | (y[-1L] != y[-n]))
    firstNA <- c(TRUE, xor(is.na(x[-1L]), is.na(x[-n])) | 
xor(is.na(y[-1L]), is.na(y[-n])))

    first[firstNA] <- TRUE
    first[is.na(first) | isFALSE(first)] <- FALSE
    x <- x[first]
    y <- y[first]
    diff(c((1L:n)[first], n + 1L))
    }
    else integer()
    list(x = x, y = y, number = number)
}

Best,
Serguei.



Best,
Wolfgang

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel






--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: so...@insa-toulouse.fr
http://www.toulouse-biotechnology-institute.fr/en/technology_platforms/mathematics-cell.html

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] xyTable(x,y) versus table(x,y) with NAs

2023-04-25 Thread Serguei Sokol via R-devel

Le 25/04/2023 à 10:24, Viechtbauer, Wolfgang (NP) a écrit :

Hi all,

Posted this many years ago 
(https://stat.ethz.ch/pipermail/r-devel/2017-December/075224.html), but either 
this slipped under the radar or my feeble mind is unable to understand what 
xyTable() is doing here and nobody bothered to correct me. I now stumbled again 
across this issue.

x <- c(1, 1, 2, 2,  2, 3)
y <- c(1, 2, 1, 3, NA, 3)
table(x, y, useNA="always")
xyTable(x, y)

Why does xyTable() report that there are NA instances of (2,3)? I could 
understand the logic that the NA could be anything, including a 3, so the 
$number value for (2,3) is therefore unknown, but then the same should apply to 
(2,1), but here $number is 1, so the logic is then inconsistent.

I stared at the xyTable code for a while and I suspect this is coming from 
order() using na.last=TRUE by default, but in any case, to me the behavior 
above is surprising.
Not really. The variable 'first' in xyTable() is supposed to detect the 
positions of the first values in runs of repeated pairs. It is then used 
to retain only their indexes in a vector of the form 1:n. Finally, by taking 
diff(), the number of repeated pairs is obtained. However, as 'first' 
contains one NA for your example, the diff() call produces two NAs, 
taking the difference with the preceding and the following number. Hence 
the result.
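
To illustrate the mechanism (not the actual xyTable() source; the 'first'
vector below is the one I get by hand for the sorted pairs of this example,
with the (2, NA) pair coming last):

  first <- c(TRUE, TRUE, TRUE, TRUE, NA, TRUE)   # 'first' for the sorted pairs
  n <- length(first)
  (1:n)[first]                   # 1 2 3 4 NA 6 : logical indexing with NA keeps an NA
  diff(c((1:n)[first], n + 1L))  # 1 1 1 NA NA 1 : one NA becomes two NAs in $number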


Here is a slightly modified version of the xyTable code to handle NA too.

xyTableNA <- function (x, y = NULL, digits)
{
    x <- xy.coords(x, y, setLab = FALSE)
    y <- signif(x$y, digits = digits)
    x <- signif(x$x, digits = digits)
    n <- length(x)
    number <- if (n > 0) {
    orderxy <- order(x, y)
    x <- x[orderxy]
    y <- y[orderxy]
    first <- c(TRUE, (x[-1L] != x[-n]) | (y[-1L] != y[-n]))
    firstNA <- c(TRUE, xor(is.na(x[-1L]), is.na(x[-n])) | 
xor(is.na(y[-1L]), is.na(y[-n])))

    first[firstNA] <- TRUE
    first[is.na(first) | isFALSE(first)] <- FALSE
    x <- x[first]
    y <- y[first]
    diff(c((1L:n)[first], n + 1L))
    }
    else integer()
    list(x = x, y = y, number = number)
}

Best,
Serguei.



Best,
Wolfgang

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] Augment base::replace(x, list, value) to allow list= to be a predicate?

2023-03-06 Thread Serguei Sokol via R-devel

Le 04/03/2023 à 01:21, Pavel Krivitsky a écrit :

Dear All,

Currently, list= in base::replace(x, list, value) has to be an index
vector. For me, at least, the most common use case is for list= to be
some simple property of elements of x, e.g.,

x <- c(1,2,NA,3)
replace(x, is.na(x), 0)

Particularly when using R pipes, which don't allow multiple
substitutions,

Right, but the anonymous function syntax can work around this:

x |> (\(x) replace(x, is.na(x), 0))()



  it would simplify many of such cases if list= could be a
function that returns an index, e.g.,

replace <- function (x, list, values, ...) {
   # Here, list() refers to the argument, not the built-in.
   if(is.function(list)) list <- list(x, ...)
   x[list] <- values
   x
}
Before modifying base R, we should examine existing possibilities for 
achieving the same goal.
In this particular case, if the previous solution (the anonymous 
function) is not satisfactory, a thin one-line wrapper can do the job:


freplace <- function (x, list, values, ...) replace(x, 
if(is.function(list)) list <- list(x, ...) else list, values)




Then, the following is possible:

c(1,2,NA,3) |> replace(is.na, 0)

this becomes

c(1,2,NA,3) |> freplace(is.na, 0)

and looks quite acceptable to me.

Best,
Serguei.



Any thoughts?
Pavel
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: so...@insa-toulouse.fr
http://www.toulouse-biotechnology-institute.fr/en/technology_platforms/mathematics-cell.html

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] uniroot violates bounds?

2023-02-20 Thread Serguei Sokol via R-devel

Le 18/02/2023 à 21:44, J C Nash a écrit :

I wrote first cut at unirootR for Martin M and he revised and put in
Rmpfr.

The following extends Ben's example, but adds the unirootR with trace
output.

c1 <- 4469.822
c2 <- 572.3413
f <- function(x) { c1/x - c2/(1-x) }; uniroot(f, c(1e-6, 1))
uniroot(f, c(1e-6, 1))
library(Rmpfr)
unirootR(f, c(1e-6, 1), extendInt="no", trace=1)

This gives more detail on the iterations, and it looks like the Inf is 
the

issue. But certainly we could do more to avoid "gotchas" like this. If
someone is interested in some back and forth, I'd be happy to give it a
try, but I think progress would be better with more than one contributor.

For me, the following fix does the job:

--- nlm.R.old    2018-09-25 10:44:49.0 +0200
+++ nlm.R    2023-02-20 10:46:39.893542531 +0100
@@ -143,14 +143,14 @@

 if(check.conv) {
 val <- tryCatch(.External2(C_zeroin2, function(arg) f(arg, ...),
-                   lower, upper, f.lower, f.upper,
+                   lower, upper, f.low., f.upp.,
                tol, as.integer(maxiter)),
         warning = function(w)w)
 if(inherits(val, "warning"))
     stop("convergence problem in zero finding: ", 
conditionMessage(val))

 } else {
 val <- .External2(C_zeroin2, function(arg) f(arg, ...),
-              lower, upper, f.lower, f.upper,
+              lower, upper, f.low., f.upp.,
           tol, as.integer(maxiter))
 }
 iter <- as.integer(val[2L])


Best,
Serguei.



Best,

John Nash

On 2023-02-18 12:28, Ben Bolker wrote:

c1 <- 4469.822
c2 <- 572.3413
f <- function(x) { c1/x - c2/(1-x) }; uniroot(f, c(1e-6, 1))
uniroot(f, c(1e-6, 1))


    provides a root at -6.00e-05, which is outside of the specified 
bounds.  The default value of the "extendInt" argument to uniroot() 
is "no", as far as I can see ...


$root
[1] -6.003516e-05

$f.root
[1] -74453981

$iter
[1] 1

$init.it
[1] NA

$estim.prec
[1] 6.103516e-05


   I suspect this fails because f(1) (value at the upper bound) is 
infinite, although setting interval to c(0.01, 1) does work/give a 
sensible answer ...  (works for a lower bound of 1e-4, fails for 1e-5 
...)


   Setting the upper bound < 1 appears to avoid the problem.

  For what it's worth, the result has an "init.it" component, but the 
only thing the documentation says about it is " component ‘init.it’ 
was added in R 3.1.0".


   And, I think (?) that the 'trace' argument only produces any 
output if the 'extendInt' option is enabled?


   Inspired by 
https://stackoverflow.com/questions/75494696/solving-a-system-of-non-linear-equations-with-only-one-unknown/75494955#75494955


   cheers
    Ben Bolker

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] Combinations and Permutations

2023-01-13 Thread Serguei Sokol via R-devel

Le 13/01/2023 à 09:00, Dario Strbenac via R-devel a écrit :

Good day,

In utils, there is a function named combn. It would seem complementary for 
utils to also offer permutations because of how closely mathematically related 
they are to each other. Could permutations be added to save on a package 
dependency if developing a package?
If you need a function returning a matrix with all permutations of a 
vector x in its columns, a simple recursive one-liner can be sufficient, 
no need for a whole package dependency for this:


   perm=function(x) {n=length(x); f=factorial(n); if (n>1) 
structure(vapply(seq_along(x), function(i) rbind(x[i], perm(x[-i])), 
x[rep(1L, f)]), dim=c(n, f)) else x}


It works for all kinds of vectors (integer, numeric, character, ...):

   perm(1:3)
   perm(pi*1:3)
   perm(letters[1:3])

Obviously, particular attention should be paid to the size of x 
(referred to here as n), as the number of columns in the returned matrix 
grows as n!, e.g. 8! = 40320. So does the CPU time.
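
For instance, just to visualize the growth:

  setNames(factorial(1:10), paste0(1:10, "!"))   # 1 2 6 24 ... 3628800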


Hoping it helps,
Serguei.



--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] rhub vs. CRAN fedora-*-devel, using armadillo & slapack

2023-01-10 Thread Serguei Sokol via R-devel
Looks like there is a kind of misunderstanding...

Le 10/01/2023 à 17:27, RICHET Yann a écrit :
> Thank you for your answer.
> In facts, 10 threads are asked by armadillo for some LinAlg, which backs to 
> two threads as warned. But I cannot imagine this costs so much time just for 
> that...
An excessive thread count is a problem per se. I did not say that it was 
responsible for the excessive test time.
It's up to you to configure armadillo to use no more than 2 threads when 
run on CRAN. Maybe setting the environment variable OPENBLAS_NUM_THREADS=2 
could help.

>
> A deeper analysis of time spent seems to point that a large time was mainly 
> spent on testthat and Rcpp dependencies compilation...
Normally, compilation time is not counted against the quota of 15 min 
dedicated to tests.
I would just focus on reducing the test time, e.g. by running tests on 
smaller cases or conditionally skipping time-consuming tests when on CRAN.
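
For the skipping part, a minimal sketch (assuming testthat is used; the file
name, test name and body are hypothetical):

  ## e.g. in tests/testthat/test-kriging-big.R
  library(testthat)
  test_that("full-size kriging fit", {
    skip_on_cran()        # heavy test: run it everywhere except on CRAN
    # ... time-consuming computation on the full problem ...
    expect_true(TRUE)     # placeholder for the real expectations
  })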

Serguei.
>   But other recent packages depending on these also are not spending so much 
> time.
>
> CRAN team, as I failed to reproduce the issue with rhub dockers,
> - is there any reason that could explain that fedora-*-devel is so slow for 
> this package or compilation of Rcpp/testthat ?
> - is our slapack a possible source of... anything ?
> - are we the only package which embeds "standard armadillo" and tries to deal 
> with simple precision lapack issues on fedora ?
> - is there any chance that I can get a deeper log of what happened ?
>
> Best regards,
>
> Yann
>
> -Message d'origine-
> De : Serguei Sokol  
> Envoyé : mardi 10 janvier 2023 11:41
> À : RICHET Yann;R-devel@r-project.org
> Cc : Pascal Havé
> Objet : Re: [Rd] rhub vs. CRAN fedora-*-devel, using armadillo & slapack
>
> Le 10/01/2023 à 11:37, Serguei Sokol a écrit :
>> Le 10/01/2023 à 10:44, RICHET Yann a écrit :
>>> Dear R-devel people,
>>>
>>> We are working to submit a package which is mainly a binding over a
>>> C++ lib (https://github.com/libKriging) using armadillo.
>>> It is _not_ a standard RcppArmadillo package, because we also had to
>>> provide a python binding... so there is a huge layer of cmake &
>>> scripting to make it work with a standard armadillo (but using same
>>> version that RcppArmadillo).
>>> It seems now working with almost all CRAN targets, but a problem
>>> remains with fedora (clang & gcc) devel.
>>>
>>> Our issue is that the rhub docker is not well sync with the CRAN
>>> fedora-clang-devel config:
>>> - failing on CRAN (without clear reason):
>>> https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-fedora-c
>>> lang/rlibkriging-00check.html
>> I see  2 candidates for  reasons of failing on CRAN:
>>   1. test time is 30 min while 15 min are allowed;
>>   2. your code try to get 30 threads
> Oops, it was 10 threads not 30 but it does not change the nature of a 
> possible issue.
>
>> while CRAN limits them to 2;
>>
>> Try to make your tests shorter ( < 15 min) on 2 threads. May be it
>> will help.
>>
>> Best,
>> Serguei.
>>
>>> - passing on rhub:
>>> https://builder.r-hub.io/status/rlibkriging_0.7-3.tar.gz-20f7dc756272
>>> 6497af7c678ab41f4dea
>>>
>>> So we cannot investigate and fix the problem.
>>>
>>> Note that we did quite strange things with the fedora platforms:
>>> include explicitely slapack to provide simple precision for our
>>> (vanilla) armadillo...
>>>
>>> Do you have any idea, or even known problem in mind, that could be
>>> related to this ?
>>>
>>> Best regards,
>>>
>>> --
>>> Dr. Yann Richet
>>> Institute for Radiological Protection and Nuclear Safety
>>> (https://www.irsn.fr),
>>>     Department of Characterization of Natural Unexpected Events and
>>> Sites Office : +33 1 58 35 88 84
>>>
>>> __
>>> R-devel@r-project.org  mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rhub vs. CRAN fedora-*-devel, using armadillo & slapack

2023-01-10 Thread Serguei Sokol via R-devel

Le 10/01/2023 à 11:37, Serguei Sokol a écrit :

Le 10/01/2023 à 10:44, RICHET Yann a écrit :

Dear R-devel people,

We are working to submit a package which is mainly a binding over a 
C++ lib (https://github.com/libKriging) using armadillo.
It is _not_ a standard RcppArmadillo package, because we also had to 
provide a python binding... so there is a huge layer of cmake & 
scripting to make it work with a standard armadillo (but using same 
version that RcppArmadillo).
It seems now working with almost all CRAN targets, but a problem 
remains with fedora (clang & gcc) devel.


Our issue is that the rhub docker is not well sync with the CRAN 
fedora-clang-devel config:
- failing on CRAN (without clear reason): 
https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-fedora-clang/rlibkriging-00check.html

I see 2 candidate reasons for failing on CRAN:
 1. the test time is 30 min while 15 min are allowed;
 2. your code tries to get 30 threads
Oops, it was 10 threads, not 30, but that does not change the nature of the 
possible issue.



while CRAN limits them to 2;

Try to make your tests shorter (< 15 min) on 2 threads. Maybe it 
will help.


Best,
Serguei.

- passing on rhub: 
https://builder.r-hub.io/status/rlibkriging_0.7-3.tar.gz-20f7dc7562726497af7c678ab41f4dea


So we cannot investigate and fix the problem.

Note that we did quite strange things with the fedora platforms: 
include explicitely slapack to provide simple precision for our 
(vanilla) armadillo...


Do you have any idea, or even known problem in mind, that could be 
related to this ?


Best regards,

--
Dr. Yann Richet
Institute for Radiological Protection and Nuclear Safety 
(https://www.irsn.fr),

   Department of Characterization of Natural Unexpected Events and Sites
Office : +33 1 58 35 88 84

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel






--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: so...@insa-toulouse.fr
http://www.toulouse-biotechnology-institute.fr/en/technology_platforms/mathematics-cell.html

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rhub vs. CRAN fedora-*-devel, using armadillo & slapack

2023-01-10 Thread Serguei Sokol via R-devel

Le 10/01/2023 à 10:44, RICHET Yann a écrit :

Dear R-devel people,

We are working to submit a package which is mainly a binding over a C++ lib 
(https://github.com/libKriging) using armadillo.
It is _not_ a standard RcppArmadillo package, because we also had to provide a 
python binding... so there is a huge layer of cmake & scripting to make it work 
with a standard armadillo (but using same version that RcppArmadillo).
It seems now working with almost all CRAN targets, but a problem remains with 
fedora (clang & gcc) devel.

Our issue is that the rhub docker is not well sync with the CRAN 
fedora-clang-devel config:
- failing on CRAN (without clear reason): 
https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-fedora-clang/rlibkriging-00check.html

I see 2 candidate reasons for failing on CRAN:
 1. the test time is 30 min while 15 min are allowed;
 2. your code tries to get 30 threads while CRAN limits them to 2.

Try to make your tests shorter (< 15 min) on 2 threads. Maybe it will 
help.


Best,
Serguei.


- passing on rhub: 
https://builder.r-hub.io/status/rlibkriging_0.7-3.tar.gz-20f7dc7562726497af7c678ab41f4dea

So we cannot investigate and fix the problem.

Note that we did quite strange things with the fedora platforms: include 
explicitely slapack to provide simple precision for our (vanilla) armadillo...

Do you have any idea, or even known problem in mind, that could be related to 
this ?

Best regards,

--
Dr. Yann Richet
Institute for Radiological Protection and Nuclear Safety (https://www.irsn.fr),
   Department of Characterization of Natural Unexpected Events and Sites
Office : +33 1 58 35 88 84

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: so...@insa-toulouse.fr
http://www.toulouse-biotechnology-institute.fr/en/technology_platforms/mathematics-cell.html

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [R] I do not want that R CMD build removes temp directory

2022-12-19 Thread Serguei Sokol via R-devel

Le 19/12/2022 à 10:52, Witold E Wolski a écrit :

Dear Uwe,

Unfortunately there isn't much of an output. This is all what I have:

$ R CMD INSTALL --log prolfqua
Warning: unknown option '--log'
* installing to library 'C:/Users/witoldwolski/AppData/Local/R/win-library/4.2'
* installing *source* package 'prolfqua' ...
** using staged installation
** R
** data
** inst
** byte-compile and prepare package for lazy loading
ERROR: lazy loading failed for package 'prolfqua'
* removing 'C:/Users/witoldwolski/AppData/Local/R/win-library/4.2/prolfqua'

Also with --no-test-load option the install is failing :

$ R CMD INSTALL --clean --no-test-load prolfqua

* installing to library 'C:/Users/witoldwolski/AppData/Local/R/win-library/4.2'
* installing *source* package 'prolfqua' ...
** using staged installation
** R
** data
** inst
** byte-compile and prepare package for lazy loading
ERROR: lazy loading failed for package 'prolfqua'
* removing 'C:/Users/witoldwolski/AppData/Local/R/win-library/4.2/prolfqua'

And including "--no-clean-on-error" also does not help because the
installation directory is empty.
You don't show the full command you ran. If you also included "--clean", 
as in the previous command, maybe it just does what is asked, i.e. cleans 
the temp dir?
I don't know which option prevails, "--clean" or "--no-clean-on-error", if 
both are given simultaneously.


Best,
Serguei.




Tested the install, on macos M1, linux ARM64, linux x86, Windows 64,
and it works everywhere except
Parallels Windows 64 on ARM M1.

R version 4.2.2 (2022-10-31 ucrt) -- "Innocent and Trusting"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

best regards
Witek



On Fri, 16 Dec 2022 at 11:24, Uwe Ligges
 wrote:



On 15.12.2022 21:47, Witold E Wolski wrote:

Thank you Simon,

It seems not to be related to the R package but rather to the OS,
(just got the same error when installing the shiny R package from
CRAN).
I am on an M1 mac running Windows ARM in Parallels. Installed a
x86_64-w64 R version.



"** byte-compile and prepare package for lazy loading
ERROR: lazy loading failed for package 'shiny'
* removing 'C:/Users/witoldwolski/AppData/Local/R/win-library/4.2/shiny'
Warning in install.packages :"

Can we please have the full output?

Best,
Uwe Ligges




On Thu, 15 Dec 2022 at 19:09, Simon Urbanek  wrote:

Yes:

$ R CMD INSTALL --help | grep error
--no-clean-on-error   do not remove installed package on error

But probably more commonly used way is to install the package from its unpacked 
directory as that avoids the use of temporary directories in the first place.

In you case you can also use --no-test-load and the non-functional package will 
still be installed so you can inspect it.

Cheers,
Simon

PS: please don't cross-post



On Dec 16, 2022, at 7:01 AM, Witold E Wolski  wrote:

I am getting a package build error, and can not figure out the problem.
The error is
"
ERROR: lazy loading failed for package 'prolfqua'
* removing 'C:/Users/
"
However since R CMD build removes the temp directory and does not give
any other errors how can I find out what the build problem is?

Is there a way to disable the temp directory removal?

Best Regards
Witek
--
Witold Eryk Wolski

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel






--
Witold Eryk Wolski

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: so...@insa-toulouse.fr
http://www.toulouse-biotechnology-institute.fr/en/technology_platforms/mathematics-cell.html

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] gettext(msgid, domain="R") doesn't work for some 'msgid':s

2021-11-05 Thread Serguei Sokol

Le 05/11/2021 à 15:51, Henrik Bengtsson a écrit :

I'm trying to reuse some of the translations available in base R by using:

   gettext(msgid, domain="R")

This works great for most 'msgid's, e.g.

$ LANGUAGE=de Rscript -e 'gettext("cannot get working directory", domain="R")'
[1] "kann das Arbeitsverzeichnis nicht ermitteln"

However, it does not work for all.  For instance,

$ LANGUAGE=de Rscript -e 'gettext("Execution halted\n", domain="R")'
[1] "Execution halted\n"

This despite that 'msgid' existing in:

$ grep -C 2 -F 'Execution halted\n' src/library/base/po/de.po

#: src/main/main.c:342
msgid "Execution halted\n"
msgstr "Ausführung angehalten\n"

It could be that the trailing newline causes problems, because the
same happens also for:

$ LANGUAGE=de Rscript --vanilla -e 'gettext("error during cleanup\n",
domain="R")'
[1] "error during cleanup\n"

It happens also to:

$ LANGUAGE=de Rscript -e 'gettext("During startup - ", domain="R")'
[1] "During startup - "


#: src/main/main.c:1078
msgid "During startup - "
msgstr "Beim Start - "

which has no "\n" at the end.

Just a testimony, in the hope it helps.

Best,
Serguei.



Is this meant to work, and if so, how do I get it to work, or is it a bug?

Thanks,

Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: so...@insa-toulouse.fr
http://www.toulouse-biotechnology-institute.fr/en/technology_platforms/mathematics-cell.html

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] dgTMatrix Segmentation Fault

2021-06-09 Thread Serguei Sokol

Le 08/06/2021 à 18:32, Martin Maechler a écrit :

Dario Strbenac
 on Tue, 8 Jun 2021 09:00:04 + writes:

 > Good day, Indeed, changing the logical test is a
 > workaround to the problem. However, a segmentation fault
 > means that the software tried to access an invalid memory
 > location, so I think the original problem should be
 > addressed in Matrix package, regardless.

Hmm, you maybe right or not ..

Note we have the situation you (via R) ask your computer
(i.e. the OS system memory allocation routines) to provide
memory.

In a reasonable setup, the OS routine returns, saying
"I cannot provide the memory you asked for",
and the R function stop() s. .. no segfault, all is fine.

The problem that on some platforms that does not work, is a
relatively deep problem  and also has happened in base R in some
cases on some platforms (possibly never on Linux based ones
(Ubuntu, Debian, Fedora, CentOS..),  but maybe I'm too
optimistic there as well.

Note: I now also tried on our oldish Windows (Terminal) Server,
and it also just gave errors that it could not allocate so much
memory but did not produce a seg.fault.


Currently, I don't see what we should improve in the Matrix
package here.
Is it possible (pure hypothesis) that when such a big piece of memory is 
available, some int32 counter goes out of bounds?
Here, we have almost 1e10 non-zero elements. This number is greater 
than 2**31-1 (the int32 limit) and even greater than the uint32 limit 
(2**32-1).

Just a thought.

Best,
Serguei.



Martin Maechler
(co-maintainer of 'Matrix')

 > --
 > Dario Strbenac University of Sydney Camperdown NSW 2050
 > Australia

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [R-pkg-devel] Error in CHECK caused by dev.off()

2020-07-22 Thread Serguei Sokol

Le 22/07/2020 à 14:36, Helmut Schütz a écrit :

Dear all,

I have two variables, foo and bar. The first is TRUE if a png should be 
created and the second is TRUE if an already existing one should be 
overwritten.

At the end of the plot I had
if (foo | (foo & bar)) dev.off()
This worked as expected in all versions of my package built in R up to 
v3.6.3. However, when I CHECK the package in v4.0.2 I get:

 > grDevices::dev.off()
Error in grDevices::dev.off() :
   cannot shut down device 1 (the null device)
Execution halted

I tried:
if (foo | (foo & bar)) {
   dev <- dev.list()
   if (!is.null(dev)) {
     if (dev == 2) invisible(dev.off())
   }
}
without success (same error).

Even the more general
if (foo | (foo & bar)) {
   graphics.off()
}
did not work.

The plot is called only in an example of one man-page -- though embedded 
in \donttest{}.
Even if I set both foo and bar to FALSE (i.e., the respective part of 
the code should not be executed at all), I get the same error.
Hmm... I see 2 possibilities for still getting an error while the 
concerned part of the code is not supposed to be run:

 - either you are running a non-updated version of your package;
 - or the error comes from some other place in the code.
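
As a side note, a defensive way to close a device only when one is really
open (a sketch, whatever the actual cause of your error turns out to be):

  if (dev.cur() > 1L) dev.off()  # dev.cur() is 1L when only the null device is "open"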

Sorry but without a minimal reproducible example I cannot help more.
Best,
Serguei.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] Speed-up/Cache loadNamespace()

2020-07-20 Thread Serguei Sokol

Le 20/07/2020 à 10:15, Abby Spurdle a écrit :

It's possible to run R (or a c parent process) as a background process
via a named pipe, and then write script files to the named pipe.
However, the details depend on what shell you use.

The last time I tried (which was a long time ago), I created a small c
program to run R, read from the named pipe from within c, then wrote
it's contents to R's standard in.

It might be possible to do it without the c program.
Haven't checked.

For testing purposes, you can do:

- in a shell 1:
 mkfifo rpipe
 exec 3>rpipe # without this trick, Rscript will end after the first 
"echo" hereafter or at the end of your first script.


- in a shell 2:
 Rscript rpipe

- in a shell 3:
 echo "print('hello')" > rpipe
 echo "print('hello again')" > rpipe

Then in the shell 2, you will see the output:
[1] "hello"
[1] "hello again"
etc.

If your R scripts contain "stop()" or "q('yes')" or any other error, it 
will end the Rscript process. A kind of watch-dog can be set up for 
automatic relaunching if needed. Another way to stop the Rscript process 
is to kill the "exec 3>rpipe" one. You can find its PID with "fuser rpipe".


Best,
Serguei.




On Mon, Jul 20, 2020 at 3:50 AM Mario Annau  wrote:

Dear all,

in our current setting we have our packages stored on a (rather slow)
network drive and need to invoke short R scripts (using RScript) in a
timely manner. Most of the script's runtime is spent with package loading
using library() (or loadNamespace to be precise).

Is there a way to cache the package namespaces as listed in
loadedNamespaces() and load them into memory before the script is executed?

My first simplistic attempt was to serialize the environment output
from loadNamespace() to a file and load it before the script is started.
However, loading the object automatically also loads all the referenced
namespaces (from the slow network share) which is undesirable for this use
case.

Cheers,
Mario


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: so...@insa-toulouse.fr
http://www.toulouse-biotechnology-institute.fr/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] dimnames incoherence?

2020-02-19 Thread Serguei Sokol

Hi,

I was bitten by a little incoherence in dimnames assignment, or maybe I 
missed some point.
Here is the case. If I assign row names via dimnames(a)[[1]] when 
nrow(a) = 1, an error is thrown. But if I do the same when nrow(a) > 1, 
it's OK. Is one of these cases working unexpectedly? Both? Neither?


a=as.matrix(1)
dimnames(a)[[1]]="a" # error: 'dimnames' must be a list

aa=as.matrix(1:2)
dimnames(aa)[[1]]=c("a", "b") # OK

In the second case, dimnames(aa) is not a list either (just like in the 
first case), but it works.

I would expect that either both work or neither does.
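
For what it's worth, the difference seems to come from what
dimnames(a)[[1]] <- value passes on to `dimnames<-` when dimnames(a) is NULL:
assigning to [[1]] of NULL produces a list only when the value has length > 1.
A small sketch:

  x <- NULL; x[[1]] <- "a";         class(x)  # "character" -> `dimnames<-` then errors
  y <- NULL; y[[1]] <- c("a", "b"); class(y)  # "list"      -> `dimnames<-` accepts it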

Your thoughts are welcome.
Best,
Serguei.

PS the same apply for dimnames(a)[[2]]<-.

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Mageia 7

Matrix products: default
BLAS/LAPACK: /home/opt/OpenBLAS/lib/libopenblas_sandybridge-r0.3.6.so

locale:
 [1] LC_CTYPE=fr_FR.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=fr_FR.UTF-8    LC_COLLATE=fr_FR.UTF-8
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8
 [7] LC_PAPER=fr_FR.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats graphics  grDevices utils datasets methods
[8] base

other attached packages:
[1] multbxxc_1.0.1    rmumps_5.2.1-11
[3] arrApply_2.1  RcppArmadillo_0.9.800.4.0
[5] Rcpp_1.0.3    slam_0.1-47
[7] nnls_1.4

loaded via a namespace (and not attached):
[1] compiler_3.6.1   tools_3.6.1  codetools_0.2-16

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] NA in doc for options(matprod="default")

2020-02-17 Thread Serguei Sokol

Le 17/02/2020 à 17:50, Tomas Kalibera a écrit :

On 2/17/20 5:36 PM, Serguei Sokol wrote:

Hi,

A colleague of mine has spotted me a passage of the doc ?option 
talking about Inf and NaN check in 'matprod=default' section:

https://stat.ethz.ch/R-manual/R-devel/library/base/html/options.html

I am wondering if NA should be mentioned too as the check seems to 
include this "value" too. NA being different from Inf and NaN it is 
worth mentioning, isn't it?


Yes, NA is handled, too. NA is one of NaN values for the purpose of 
this text

Thanks for the clarification. It was not clear to me from the text itself.


(and it is also implemented that way, see ?NaN).

 Indeed, the text of ?NaN says "... systems typically have
 many different NaN values.  One of these is used for the numeric
 missing value ‘NA’, and ‘is.nan’ is false for that value."
However, R can return both NA and NaN symbols, e.g.

> mean(c(1, NA))
[1] NA
> mean(c(1, NaN))
[1] NaN

which does not help in understanding their relationship. That's why I 
continue to think that it would be clearer to mention NA explicitly in 
options(matprod="default"). It could be phrased like "... ensure correct 
propagation of Inf and NaN (including NA) ..."


Best,
Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] NA in doc for options(matprod="default")

2020-02-17 Thread Serguei Sokol

Hi,

A colleague of mine has spotted me a passage of the doc ?option talking 
about Inf and NaN check in 'matprod=default' section:

https://stat.ethz.ch/R-manual/R-devel/library/base/html/options.html

I am wondering if NA should be mentioned too, as the check seems to 
include this "value" as well. NA being different from Inf and NaN, it is 
worth mentioning, isn't it?


Best,
Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] possible bug in win R-devel in check/test environment

2020-01-15 Thread Serguei Sokol

Hi Dirk,

Thanks for sharing your thoughts on the subject. I have a few notes on 
them below.


Le 14/01/2020 à 15:59, Dirk Eddelbuettel a écrit :

Hi Serguei,

Nice analysis!

On 14 January 2020 at 11:00, Serguei Sokol wrote:
| During my recent r2sundials development, I've came across a strange test
| failing during 'R CMD check' exclusively on win R-devel which I could
| reproduce with a minimal example that I present here.
| The toy packages testarma1 [1] and testarma2 [2] are minimal
| modifications of a skeleton package produced by
| RcppArmadillo.package.skeleton().
| They are almost identical. The first one fails to passe its tests on win
| R-devel [3] while the second one is OK [4]. The reason of test failing
| in testarma1 boils down to not finding a package during tests (here
| RcppArmadillo) although it is well present in LinkingTo field of the
| DESCRIPTION file (the mechanism of the error is detailed in [5]). To
| make the tests pass, I had to add RcppArmadillo and r2sundials to
| 'Suggests:' field too (as can be seen in testarma2)
|
| In my understanding, the presence of a package name in the LinkingTo
| field should be sufficient for finding it during the test phase.

I thought so too. But thinking about it a little more it clears up a little.
It remains unclear to me why the tests of testarma1 fail on win 
R-devel and run OK on Linux R-devel ( 
https://builder.r-hub.io/status/testarma1_1.0.tar.gz-37bca609ce3b49149daa2f97d035098c 
).
If the reasoning you describe is really the intended one, it should work 
the same way on all platforms, should it not?




A bit more context: One can be more fine-grained on Depends. And Debian does
that, and R sometimes followed Debian's model of declaring dependencies. One
element we are missing here is to distinguish between _build-time_ needs (we
call that Build-Depends: in Debian) and _run_time_ needs.  We currently only
have the latter as Depends:, which for example pains a million dplyr users on
Windows who have to download 120mb worth of our BH package because it is used
to _build_ the binary zipfile, but not thereafter.  That is a wart.

+1



Now, _LinkingTo_ always implies build-dependecies or else it would croak at
that stage.

Currently WRE states about LinkingTo:
"Specifying a package in ‘LinkingTo’ suffices if these are C++ headers 
containing source code or static linking is done at installation: the 
packages do not need to be (and usually should not be) listed in the 
‘Depends’ or ‘Imports’ fields. This includes CRAN package BH and almost 
all users of RcppArmadillo and RcppEigen."


No mention of build-time or test-time is made. Moreover, regarding your 
advice to add packages to 'Imports', this phrase explicitly advises 
against it: "and usually should not be ...".


If this is the real intention of the R developers, a little phrase in WRE 
like the following one could clarify things:
"Note that packages listed in the fields Depends, Imports and Suggests are 
visible during the test stage of the 'R CMD check' command while those in 
LinkingTo are not."




And I had assumed that this would cover all run-time but ...

| Instead, one have to add it to 'Suggests:' field too.

... tests are indeed treated differently and this may just be a different
code path.

If you have something in Suggests: and test for it, you should condition the
test.
Is it documented somewhere in such or similar words? (I mean official R 
documentation.)



  I have argued that part a few times but mostly to no avail so I too now
mostly give up and _unconditionally_ install Suggests to support tests when I
run bulk tests for reverse dependencies.  But it is still wrong.
I am not so resolved to call it "wrong". After all, why not? The main 
thing is to have a widely accepted consensus about it.
The packages underlying the tests, like testthat, RUnit and the like, are 
explicitly required to be listed in Suggests. So if the packages in 
Suggests are to be considered optional, including those ones, you don't 
even have a chance to check the presence of packages like RcppArmadillo, 
as the code containing this check cannot be run without testthat, RUnit 
and so on.




So here the ball is in your court. Your tests for r2sundials should probably
condition on RcppArmadillo being present and skip tests requiring it if it is
not present.
In this case, this is not an option for me. I do want the tests to be 
run, not skipped.



   Or, if you don't like that, make it an Imports: too.
I confirm that putting the packages in Imports makes the tests run OK on win 
R-devel too (cf. the imports branch of testarma1 and the check log at 
https://win-builder.r-project.org/GQaZBdmn2U1x/00check.log ).
But I prefer to leave them in Suggests if none of their functions are used in 
the body of the package (hence no real import is required).


Best,
Serguei.



Hope this helps.

Cheers, Dirk

| Am I wrong or this behavior is unexpected?
|
| Best,
| Serguei.
|
| [1] https://github.com

Re: [R-pkg-devel] seeking help regarding the valgrind error

2020-01-14 Thread Serguei Sokol

Hi Yang,

Le 14/01/2020 à 17:29, Yang Feng a écrit :

Hi All,

Happy new year! I just joined this mailing list and would like to post my
first question.

I received a message earlier this month regarding an error in my R package
RAMP https://cran.r-project.org/web/packages/RAMP/index.html
I need to update the package by the end of this month to prevent it from
removed from CRAN.

The detailed email message is as follows. Does anyone know how to fix this?
Yep. The default value of the gamma parameter in RAMP() is NULL. So when 
you do, in cd.general.R:


para.in = c(epsilon, max.iter, lambda, gamma)

and later on in .Fortran() call

..., paraIn = as.double(para.in) ...

you obtain a vector of length 3 while it is expected to be of length 4 
in cd_general_lin.f:


double precision, dimension(4) :: paraIn

So when you read the 4th element on line 21:

gamma = paraIn(4)

you are caught by valgrind.
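
A one-line illustration of the length issue (the numeric values are just
placeholders for epsilon, max.iter and lambda):

  gamma <- NULL
  length(c(1e-4, 100, 0.5, gamma))  # 3, not 4: NULL silently disappears in c()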

Fixing that is left as an exercise ;)

Best,
Serguei.



Also, how to reproduce this kind of error on my local mac?

Many thanks!


Checking with valgrind shows:

Still

  > fit1 = RAMP(x, y)
==4663== Invalid read of size 8
==4663==at 0x48A0A51: cd_general_lin_
(/tmp/RAMP.Rcheck/00_pkg_src/RAMP/src/cd_general_lin.f:21)
==4663==by 0x49DDE6: do_dotCode (svn/R-devel/src/main/dotcode.c:1799)
==4663==by 0x4D181C: bcEval (svn/R-devel/src/main/eval.c:7054)
==4663==by 0x4E8197: Rf_eval (svn/R-devel/src/main/eval.c:688)
==4663==by 0x4E9D56: R_execClosure (svn/R-devel/src/main/eval.c:1853)
==4663==by 0x4EAB33: Rf_applyClosure (svn/R-devel/src/main/eval.c:1779)
==4663==by 0x4DB64D: bcEval (svn/R-devel/src/main/eval.c:7022)
==4663==by 0x4E8197: Rf_eval (svn/R-devel/src/main/eval.c:688)
==4663==by 0x4E9D56: R_execClosure (svn/R-devel/src/main/eval.c:1853)
==4663==by 0x4EAB33: Rf_applyClosure (svn/R-devel/src/main/eval.c:1779)
==4663==by 0x4E8363: Rf_eval (svn/R-devel/src/main/eval.c:811)
==4663==by 0x4ECD01: do_set (svn/R-devel/src/main/eval.c:2920)
==4663==  Address 0x1616c100 is 7,600 bytes inside a block of size 7,960
alloc'd
==4663==at 0x483880B: malloc
(/builddir/build/BUILD/valgrind-3.15.0/coregrind/m_replacemalloc/vg_replace_malloc.c:309)
==4663==by 0x5223E0: GetNewPage (svn/R-devel/src/main/memory.c:946)

Please fix and resubmit.

Best wishes,
Yang


__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel





[Rd] possible bug in win R-devel in check/test environment

2020-01-14 Thread Serguei Sokol

Hi,

During my recent r2sundials development, I came across a strange test 
failure during 'R CMD check' exclusively on win R-devel, which I could 
reproduce with a minimal example that I present here.
The toy packages testarma1 [1] and testarma2 [2] are minimal 
modifications of a skeleton package produced by 
RcppArmadillo.package.skeleton().
They are almost identical. The first one fails to pass its tests on win 
R-devel [3] while the second one is OK [4]. The reason for the test failure 
in testarma1 boils down to a package (here RcppArmadillo) not being found 
during the tests, although it is present in the LinkingTo field of the 
DESCRIPTION file (the mechanism of the error is detailed in [5]). To 
make the tests pass, I had to add RcppArmadillo and r2sundials to the 
'Suggests:' field too (as can be seen in testarma2).


In my understanding, the presence of a package name in the LinkingTo 
field should be sufficient for it to be found during the test phase. 
Instead, one has to add it to the 'Suggests:' field too.


Am I wrong, or is this behavior unexpected?

Best,
Serguei.

[1] https://github.com/sgsokol/testarma1 


[2] https://github.com/sgsokol/testarma2
[3] https://win-builder.r-project.org/v0nBoFleT48y/00check.log
[4] https://win-builder.r-project.org/TMKbnEBncFNc/00check.log
[5] https://github.com/RcppCore/Rcpp/issues/1026



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [R-pkg-devel] suggestion: conda for third-party software

2020-01-08 Thread Serguei Sokol

Le 08/01/2020 à 08:50, Ivan Krylov a écrit :

On Tue, 7 Jan 2020 15:49:45 +0100
Serguei Sokol  wrote:


Currently, many R packages include TPS as part of them thus bloating
their sizes and often duplicating files on a given system.  And even
when TPS is not included in an R package but is just installed on a
system, it is not so obvious to get the right path to it. Sometimes
pkg-config helps but it is not always present.


I agree that making a package depend on a third-party library means
finding oneself in a bit of a pickle. A really popular library like
cURL could be "just" depended upon (for the price of some problems when
building on Windows). A really small (e.g. 3 source files) and rarely
updated (just once last year) library like liborigin could "just" be
bundled (but the package maintainer would have to constantly watch out
for new versions of the library). Finding that the bundled version of a
network-facing library in an R package (e.g. libuv in httpuv) is several
minor versions out of date is always a bit scary, even if it turns out
that no major security flaws have been found in that version (just a few
low-probability resource leaks, one unlikely NULL pointer dereference
and some portability problems). The road to dependency hell is paved
with intentions of code reuse.


So, the new feature would be to let R package developers to write in
DESCRIPTION/SystemRequirements field something like
'conda:boost-cpp>=1.71' where 'boost-cpp' is an example of a conda
package and '>=1.71' is an optional version requirement.


While I appreciate the effort behind Anaconda, I would hate to see it
being *required* to depend on third-party binaries compiled by a
fourth-party (am I counting my parties right?) when there's already a
copy installed and available via means the user trusts more (e.g. via
GNU/Linux distro package, or Homebrew on macOS, or just a copy sitting
in /usr/local installed manually from source). In this regard, a
separate field like "Config/conda" suggested by Kevin Ushey sounds like
a good idea: if one wants to use Anaconda, the field is there. If one
doesn't, one can just ignore it and provide the necessary dependencies
in a different way.
The same would apply to my proposition: if you want to, you use 
conda:something; if not, you do as before. But anyway, I am not campaigning 
for a 'conda:' tag in SystemRequirements. Kevin's Config/conda 
solution seems to be sufficient for this issue. I just was not aware 
that it was already there.


Best,
Serguei.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


[R-pkg-devel] suggestion: conda for third-party software

2020-01-07 Thread Serguei Sokol

Best wishes for 2020!

I would like to suggest a new feature for R package management. Its aim 
is to enable package developers and end-users to rely on conda ( 
https://docs.conda.io/en/latest/ ) for managing third-party software 
(TPS) on major platforms: linux64, win64 and osx64. Currently, many R 
packages include the TPS as part of themselves, thus bloating their sizes 
and often duplicating files on a given system. And even when the TPS is not 
included in an R package but is simply installed on the system, it is not 
so obvious to get the right path to it. Sometimes pkg-config helps, but it 
is not always present.


So, the new feature would be to let R package developers write in the 
DESCRIPTION/SystemRequirements field something like 
'conda:boost-cpp>=1.71', where 'boost-cpp' is an example of a conda 
package and '>=1.71' is an optional version requirement. Having this 
could allow install.packages() to install the TPS on a testing CRAN machine 
or on an end-user's one. (There is just one line to execute in a shell: 
'conda install' followed by the package name; it will install the package 
itself as well as all its dependencies.)


To my mind, this feature would have the following advantages:
 - on-disk size economy, as the same TPS does not have to be included in 
the R package itself and can be shared with other language wrappers, e.g. 
Python;
 - easy flag configuration in Makevars, as the paths to the TPS will be 
known in advance;
 - CRAN machines could test packages relying on a wide panel of TPS 
without bothering with their manual installation;
 - TPS installation can become transparent for the end-user on major 
platforms.


Note that even though R itself is part of conda ( 
https://anaconda.org/conda-forge/r-base ), it is not mandatory to use 
conda's R version for this feature. Here, conda is just meant to 
facilitate access to TPS. However, a minimal requirement is obviously to 
have conda itself installed.


Does it look reasonable? appealing?
Best,
Serguei.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] r2sundials submission failure

2019-12-16 Thread Serguei Sokol

Hi Satya,

Le 16/12/2019 à 14:52, Satyaprakash Nayak a écrit :

Hi Serguei

I apply a similar approach to include the Sundials source code as a part 
of my package (sundialr, on CRAN) which provides an interface for Cvode 
and Cvodes. I didn’t have an issue with the files sizes while submitting 
to CRAN

Good to hear :)


and you can take a look to see if the code would be helpful to you.

https://github.com/sn248/sundialr
I see that you include v4.0.1 of sundials while I included a more recent 
release 5.0.0:

https://github.com/sgsokol/r2sundials

But I think, in the end, the library size must be close to each other.



Although, it would be great if sundials is installed on CRAN machines 
and static libraries for Windows are provided in rtools 4.0
It could be a good thing in general but the end user on other platforms 
will have to install it anyway. Moreover, I see you kept the default 
size for index type:


SUNDIALS_INDEX_TYPE int64_t

while I defined it as int32_t.
It shows that it's hard to satisfy all the tastes in the wild with only 
one library distribution ;)


Best,
Serguei.



Satya


On Mon, 16 Dec 2019 at 03:54, Serguei Sokol <mailto:serguei.so...@gmail.com>> wrote:


Le 15/12/2019 à 17:59, Uwe Ligges a écrit :
 > Have you tried to write to CRAN@... and ask if thirs party
software can
 > be installed on CRAN?
No, I didn't. I presumed that the CRAN team had other things to do than to
install any software that package authors could need. But if this is an
option, I will try this approach next time. For this one, I integrated a
subset of the sundials software in the package and it turned out to be far
less than 20 MB (which is the size I had on my system for the whole set of
the cvodes library and its dependencies). The tarball is only 270 KB and the
compiled libraries on some systems can be up to 7 MB and far less on
others (e.g. on my linux, compiled without the "-g" flag, the whole size of
$RHOME/library/r2sundials is under 2 MB). Right now, it is pending
manual inspection. I hope that the library size will not be a problem.

Anyway, I appreciate the hint. Thanks.
Best,
Serguei.

 >
 > Best,
 > Uwe Ligges
 >
 > On 11.12.2019 10:39, Serguei Sokol wrote:
 >> Hi,
 >>
 >> I have tried to submit my new package
 >> https://github.com/sgsokol/r2sundials to CRAN but submission
seems to
 >> be dismissed.
 >> The package needs a third part software
 >> https://computing.llnl.gov/projects/sundials/cvodes so it cannot be
 >> built on CRAN automatically. I explained this (and how the
package was
 >> tested by myself) in the submitter's comment (cf. hereafter) and
 >> second time in the reply to all (as was requested) to automatic
 >> message from CRAN announcing the building failure but to no
avail. The
 >> submission was done on November 25, more than two weeks later I
still
 >> don't have any response and the package is no more in incoming/
dir on
 >> cran ftp site. I conclude that this submission is dismissed.
 >>
 >> My question is: what can be reasonably done to make a package like
 >> this (i.e. depending on third part software not available on
CRAN) to
 >> be accepted? Or may be the current policy: all new package must
 >> automatically build. Period. In my case it can imply ~20 MB
additional
 >> space (source code + libs).
 >>
 >> Thanks in advance for any hint.
 >> Serguei.
 >>
 >> Submitter's comment: Dear CRAN team,
 >>
 >> I submit package r2sundials which
 >>    depends on a third part software
 >>    from
 >> https://computing.llnl.gov/projects/sundials/cvodes
 >> This
 >>    is the reason for which it will not automatically
 >>    build on your test systems. But on my side, I could
 >>    successfully run 'R CMD check --as-cran' on Linux
 >>    (R-3.6.1, gcc8), MacOS Catalina (R-3.6.1, Apple
 >>    clang-1100.0.33.12) and Windows 10 (R-devel r77430,
 >>    gcc-4.9.3). If you wish, I can send you reports from
 >>    these runs. Even if it is improbable, but if you
 >>    decide to run such checks manually by your self,
 >>    installation instructions are available
 >>    on
 >> https://github.com/sgsokol/r2sundials
 >>
 >> Moreover,
 >>    winbuilder signals:
 >> Possibly mis-spelled words in
 >>    DESCRIPTION:
 >>    CVODES (3:34)
 >>    Hindmarsh (9:803)
 >>    Rcpp (3:8, 9:285)
 >>    al (9:816)
 >>    cvodes (9:47)
 >>    et (9:813)
  

Re: [Rd] Build failure on powerpc64

2019-12-13 Thread Serguei Sokol

Le 13/12/2019 à 17:06, Tom Callaway a écrit :

arithmetic.c:
static LDOUBLE q_1_eps = 1 / LDBL_EPSILON;
Just a thought: can it be that it's the "1" which is at the origin of the 
compiler complaint?

In that case, would the syntax "1.L" be sufficient to keep it calm?

Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[R-pkg-devel] r2sundials submission failure

2019-12-11 Thread Serguei Sokol

Hi,

I have tried to submit my new package 
https://github.com/sgsokol/r2sundials to CRAN but the submission seems to 
have been dismissed.
The package needs third-party software, 
https://computing.llnl.gov/projects/sundials/cvodes, so it cannot be 
built on CRAN automatically. I explained this (and how the package was 
tested by myself) in the submitter's comment (cf. hereafter) and a second 
time in the reply-to-all (as was requested) to the automatic message from 
CRAN announcing the build failure, but to no avail. The submission was 
done on November 25; more than two weeks later I still don't have any 
response and the package is no longer in the incoming/ dir on the CRAN ftp 
site. I conclude that this submission has been dismissed.


My question is: what can reasonably be done to get a package like this 
(i.e. one depending on third-party software not available on CRAN) 
accepted? Or maybe the current policy is: all new packages must build 
automatically. Period. In my case that can imply ~20 MB of additional 
space (source code + libs).


Thanks in advance for any hint.
Serguei.

Submitter's comment: Dear CRAN team,

I submit package r2sundials which
  depends on a third part software
  from
https://computing.llnl.gov/projects/sundials/cvodes
This
  is the reason for which it will not automatically
  build on your test systems. But on my side, I could
  successfully run 'R CMD check --as-cran' on Linux
  (R-3.6.1, gcc8), MacOS Catalina (R-3.6.1, Apple
  clang-1100.0.33.12) and Windows 10 (R-devel r77430,
  gcc-4.9.3). If you wish, I can send you reports from
  these runs. Even if it is improbable, but if you
  decide to run such checks manually by your self,
  installation instructions are available
  on
https://github.com/sgsokol/r2sundials

Moreover,
  winbuilder signals:
Possibly mis-spelled words in
  DESCRIPTION:
  CVODES (3:34)
  Hindmarsh (9:803)
  
  Rcpp (3:8, 9:285)

  al (9:816)
  cvodes (9:47)
  
  et (9:813)

  rmumps (9:501)
These all are false
  positives designating software, R packages and
  bibliographic reference.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] No answer from win-builder R-devel. Is it down?

2019-11-26 Thread Serguei Sokol

Le 26/11/2019 à 11:31, NURIA PEREZ ZANON a écrit :

Dear,

Before submitting my package to CRAN, yesterday I sent it (25th
November) to win-builder R-devel version.

Almost 19 hours later, I haven't received any email from win-builder
with the link to the binaries and the log files. (The email address
specified in the 'Maintainer' field is correct).

Checking the index R-devel in the ftp server, it seems that packages
stoped being tested around 11/25/19, 1:44:00 PM (the oldest package in
the index).

By trying to resubmit the package, I get the error
ERROR: Access to the path
'C:\Inetpub\ftproot\R-devel\CSTools_2.0.0.tar.gz' is denied.

because the package is already in the queue.

Does anyone know how to proceed? Should I contact somebody or this is a
frequent and well-known issue?
We are used to getting a response from winbuilder in around 30 minutes, but 
these days it seems to be congested. Recently I had to wait 3 days 
before my tarball was processed.


So my advice: be patient.
Serguei.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] Why is matrix product slower when matrix has very small values?

2019-11-20 Thread Serguei Sokol

Le 20/11/2019 à 09:56, Hilmar Berger a écrit :

Hi Florian,

just a guess, but couldn't it be that the multiplication of very small 
values leads to FP underflow exceptions which have to be handled by 
BLAS in a less efficient way than "normal" multiplications handled by 
SIMD instructions ?
Another guess is that you are caught by what are called "denormal 
numbers": https://en.wikipedia.org/wiki/Denormal_number.
Arithmetic operations on them are handled differently and are slower than 
those on "normal" numbers.


Best,
Serguei.



Best regards,
Hilmar

On 19/11/2019 15:09, Florian Gerber wrote:

Hi,

I experience surprisingly large timing differences for the
multiplication of matrices of the same dimension. An example is given
below. How can this be explained?
I posted the question on Stackoverflow:
https://stackoverflow.com/questions/58886111/r-why-is-matrix-product-slower-when-matrix-has-very-small-values 


Somebody could reproduce the behavior but I did not get any useful
explanations yet.

Many thanks for hints!
Florian

## disable openMP
library(RhpcBLASctl); blas_set_num_threads(1); omp_set_num_threads(1)

A <- exp(-as.matrix(dist(expand.grid(1:60, 1:60
summary(c(A))
# Min.  1st Qu.   Median Mean  3rd Qu. Max.
# 0.00 0.00 0.00 0.001738 0.00 1.00

B <- exp(-as.matrix(dist(expand.grid(1:60, 1:60)))*10)
summary(c(B))
#  Min.   1st Qu.    Median  Mean   3rd Qu.  Max.
# 0.000 0.000 0.000 0.0002778 0.000 1.000

identical(dim(A), dim(B))
## [1] TRUE

system.time(A %*% A)
#    user  system elapsed
#   2.387   0.001   2.389
system.time(B %*% B)
#    user  system elapsed
#  21.285   0.020  21.310

sessionInfo()
# R version 3.6.1 (2019-07-05)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Linux Mint 19.2

# Matrix products: default
# BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
# LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is missingness always passed on?

2019-10-01 Thread Serguei Sokol

Le 01/10/2019 à 10:58, Serguei Sokol a écrit :

Le 30/09/2019 à 16:17, Duncan Murdoch a écrit :


There's a StackOverflow question 
https://stackoverflow.com/q/22024082/2554330 that references this 
text from ?missing:


"Currently missing can only be used in the immediate body of the 
function that defines the argument, not in the body of a nested 
function or a local call. This may change in the future."


Someone pointed out (in https://stackoverflow.com/a/58169498/2554330) 
that this isn't true in the examples they've tried:  missingness does 
get passed along.  This example shows it (this is slightly different 
than the SO example):


f1 <- function(x, y, z){
  if(missing(x))
    cat("f1: x is missing\n")
  if(missing(y))
    cat("f1: y is missing\n")
}

f2 <- function(x, y, z){
  if(missing(z))
    cat("f2: z is missing\n")
  f1(x, y)
}

f2()

which produces

f2: z is missing
f1: x is missing
f1: y is missing

Is the documentation out of date?  That quote appears to have been 
written in 2002.
Er, as far as I understand the cited doc, it correctly describes what 
happens in your example: missing() is not working in a local call 
(here f1(x, y)).
In fact, what missing() in f1 reports is still the situation 
of the f2() call (i.e. the immediate body of the function). See


f2(y=1)

produces

f2: z is missing
f1: x is missing

(the line about y missing disappeared from the f1(x, y) call, which is what 
needed to be demonstrated).
Re-er, it seems that I was a little bit too fast in my conclusion. If we 
modify f2 to be


f2 <- function(x, y, z){
  if(missing(z))
    cat("f2: z is missing\n")
  f1(x=1, y)
}

then f2() call gives

f2: z is missing
f1: y is missing

i.e. missing() in the f1(x=1, y) call reports its own situation, not 
that of f2(). And the missingness of y seems to be inherited from the f2() call.
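
As a complementary check (my own addition), once y is supplied to f2(), f1()
no longer reports it as missing, which is consistent with its missingness
being inherited from the f2() call:

f2(y = 2)

produces

f2: z is missing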

Sorry to be hasty.

Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] What is the best way to loop over an ALTREP vector?

2019-09-24 Thread Serguei Sokol

Le 24/09/2019 à 07:48, Gabriel Becker a écrit :

Also, a small nitpick, R's internal mean function doesn't hit Dataptr, it
hits either INTEGER_ELT (which really should probably be a
ITERATE_BY_REGION) or ITERATE_BY_REGION.

Even if it is not the main point of this thread, I was wondering if 
mean() could take advantage of sum() (which handles ALTREP in an 
efficient way) and be defined as mean(x) = sum(x)/length(x)? Currently, 
sum(1:1e14) is almost instantaneous while mean(1:1e14) takes very long.
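
As a rough illustration (my own sketch; exact timings of course depend on the
machine and R version), with a shorter ALTREP sequence:

x <- 1:1e9
system.time(sum(x))             # ALTREP closed form: essentially instantaneous
system.time(mean(x))            # iterates over all elements: noticeably slower
system.time(sum(x) / length(x)) # the proposed shortcut: as fast as sum()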


Best Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Fw: Calling a LAPACK subroutine from R

2019-09-13 Thread Serguei Sokol

On 12/09/2019 11:07, Berend Hasselman wrote:

On 12 Sep 2019, at 10:36, Serguei Sokol  wrote:

On 11/09/2019 21:38, Berend Hasselman wrote:

The Lapack library is loaded automatically by R itself when it needs it  for 
doing some calculation.
You can force it to do that with a (dummy) solve for example.
Put this at start of your script:


# dummy code to get LAPACK library loaded
X1 <- diag(2,2)
x1 <- rep(2,2)
# X1;x1
z <- solve(X1,x1)


another way is to use directly dyn.load():

lapack.path <- paste0(file.path(R.home(), ifelse(.Platform$OS.type == "windows",
  file.path("bin", .Platform$r_arch, "Rlapack"), file.path("lib", 
"libRlapack"))),
  .Platform$dynlib.ext)
dyn.load(lapack.path)

This will not work on macOS.
The extension for dynamic libraries is .dylib.
So you would need

lapack.path <- paste0(file.path(R.home(), ifelse(.Platform$OS.type == "windows",
  file.path("bin", .Platform$r_arch, "Rlapack"), file.path("lib", 
"libRlapack"))),
  ".dylib")

See the help for .Platform and dyn.load for the details for macOS.
Indeed. I was surprised to discover that .Platform$dynlib.ext is set to 
".so" on macOS, not to ".dylib". Thank you for pointing me to the special 
note about it in ?.Platform.

Is there an R predefined variable set to ".dylib" on macOS?
Meanwhile, the code for Rlapack path detection becomes a little bit 
more complicated:


dynlib.ext <- ifelse(Sys.info()[["sysname"]] == "Darwin", ".dylib",
                     .Platform$dynlib.ext)
lapack.path <- paste0(file.path(R.home(),
                                ifelse(.Platform$OS.type == "windows",
                                       file.path("bin", .Platform$r_arch, "Rlapack"),
                                       file.path("lib", "libRlapack"))),
                      dynlib.ext)
dyn.load(lapack.path)
is.loaded("dgemv") # must be TRUE

Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [R-pkg-devel] please help understand an error in openMP statements

2019-09-13 Thread Serguei Sokol

On 12/09/2019 23:12, Marcin Jurek wrote:

Hello everyone, I'm submitting a package to CRAN which I tested locally, on
Travis CI, R-hub and win builder. It worked no problem in all these
environments. However, after submission, I keep getting the error described
here:


https://win-builder.r-project.org/incoming_pretest/GPvecchia_0.1.0_20190912_201702/Debian/00install.out

One obvious error is pretty well described by the compiler:

U_NZentries.cpp:258:19: error: ‘covparms’ not specified in enclosing 
‘parallel’
  258 |  covmat= MaternFun(dist,covparms) + diagmat(nug) ; // summation 
from arma

  |  ~^~~
U_NZentries.cpp:230:11: error: enclosing ‘parallel’
  230 |   #pragma omp parallel for 
shared(locs,revNNarray,revCondOnLatent,nuggets, nnp,m,Lentries,COV) 
private(k,M,dist,onevec,covmat,nug,n0,inds,revCon_row,inds00,succ,attempt) 
default(none) schedule(static)


It simply states that you should add 'covparms' either to the private() or to 
the shared() list on line 230.

I have no idea about the other messages complaining about 'none'.

Best Serguei.





I'm not a pro when it comes to openMP but since all previous tests
completed successfully, I thought things were alright. Could you help me
understand what's wrong and how to fix it? Source code of the package can
be found at http://github.com/katzfuss-group/GPvecchia Thanks a lot!


Marcin

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] Fw: Calling a LAPACK subroutine from R

2019-09-12 Thread Serguei Sokol

On 11/09/2019 21:38, Berend Hasselman wrote:

The Lapack library is loaded automatically by R itself when it needs it  for 
doing some calculation.
You can force it to do that with a (dummy) solve for example.
Put this at start of your script:


# dummy code to get LAPACK library loaded
X1 <- diag(2,2)
x1 <- rep(2,2)
# X1;x1
z <- solve(X1,x1)


another way is to use directly dyn.load():

lapack.path <- paste0(file.path(R.home(),
                                ifelse(.Platform$OS.type == "windows",
                                       file.path("bin", .Platform$r_arch, "Rlapack"),
                                       file.path("lib", "libRlapack"))),
                      .Platform$dynlib.ext)
dyn.load(lapack.path)

followed by your code.

Best,
Serguei.



followed by the rest of your script.
You will get a warning (I do) that  "passing a character vector  to .Fortran is not 
portable".
On other systems this may give fatal errors. This is quick and very dirty. 
Don't do it.

I believe there is a better and much safer way to achieve what you want.
Here goes.

Create a folder (directory) src in the directory where your script resides.
Create a wrapper for "dpbtrf" in a file xdpbtrf.f that takes an integer 
instead of a character argument:


c intermediate for dpbtrf

   SUBROUTINE xDPBTRF( kUPLO, N, KD, AB, LDAB, INFO )

c  .. Scalar Arguments ..
   integer kUPLO
   INTEGER INFO, KD, LDAB, N

c  .. Array Arguments ..
   DOUBLE PRECISION   AB( LDAB, * )

   character UPLO
c convert integer argument to character
   if(kUPLO .eq. 1 ) then
   UPLO = 'L'
   else
   UPLO = 'U'
   endif

   call dpbtrf(UPLO,N,KD,AB,LDAB,INFO)
   return
   end



Instead of a character argument UPLO it takes an integer argument kUPLO.
The meaning should be obvious from the code.

Now create a shell script in the folder of your script to generate a dynamic 
library to be loaded in your script:


# Build a binary dynamic library for accessing Lapack dpbtrf

# syntax checking
  
SONAME=xdpbtrf.so


echo Strict syntax checking
echo --
gfortran -c -fsyntax-only -fimplicit-none -Wall src/*.f || exit 1

LAPACK=$(R CMD config LAPACK_LIBS)
R CMD SHLIB --output=${SONAME} src/*.f ${LAPACK} || exit 1


To load the dynamic library xdpbtrf.so  change your script into this


dyn.load("xdpbtrf.so")
n <- 4L
phi <- 0.64
AB <- matrix(0, 2, n)
AB[1, ] <- c(1, rep(1 + phi^2, n-2), 1)
AB[2, -n] <- -phi
round(AB, 3)

AB.ch <- .Fortran("xdpbtrf", kUPLO=1L, N = as.integer(n),
 KD = 1L, AB = AB, LDAB = 2L, INFO = 
as.integer(0))$AB
AB.ch



and you are good to go.

You should always do something  as described above when you need to pass 
character arguments to Fortran code.

All of this was tested and run on macOS using the CRAN version of R.

Berend Hasselman


On 11 Sep 2019, at 15:47, Giovanni Petris  wrote:

Sorry for cross-posting, but I realized my question might be more appropriate 
for r-devel...

Thank you,
Giovanni


From: R-help  on behalf of Giovanni Petris 

Sent: Tuesday, September 10, 2019 16:44
To: r-h...@r-project.org
Subject: [R] Calling a LAPACK subroutine from R

Hello R-helpers!

I am trying to call a LAPACK subroutine directly from my R code using 
.Fortran(), but R cannot find the symbol name. How can I register/load the 
appropriate library?


### AR(1) Precision matrix
n <- 4L
phi <- 0.64
AB <- matrix(0, 2, n)
AB[1, ] <- c(1, rep(1 + phi^2, n-2), 1)
AB[2, -n] <- -phi
round(AB, 3)

      [,1]  [,2]  [,3] [,4]
[1,]  1.00  1.41  1.41    1
[2,] -0.64 -0.64 -0.64    0

### Cholesky factor
AB.ch <- .Fortran("dpbtrf", UPLO = 'L', N = as.integer(n),

+  KD = 1L, AB = AB, LDAB = 2L, INFO = as.integer(0))$AB
Error in .Fortran("dpbtrf", UPLO = "L", N = as.integer(n), KD = 1L, AB = AB,  :
  Fortran symbol name "dpbtrf" not in load table

sessionInfo()

R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin18.5.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS/LAPACK: /usr/local/Cellar/openblas/0.3.6_1/lib/libopenblasp-r0.3.6.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.6.0 tools_3.6.0

Thank you in advance for your help!

Best,
Giovanni Petris



--
Giovanni Petris, PhD
Professor
Director of Statistics
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701


__
r-h...@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 

Re: [R-pkg-devel] set pkg_config for 3rd party software

2019-09-05 Thread Serguei Sokol

On 05/09/2019 12:26, Martin Maechler wrote:

Sameh M Abdulah
 on Fri, 30 Aug 2019 18:50:55 + writes:


 > Hi,
 > I recently asked some questions about my R package which were well 
responded by Dirk.

 > I have another question related to pkg_config path,

 > I am using this command to add the installation path to the 
PKG_CONFIG_PATH   so that all cmake commands can get the required libraries from 
this path,

 > 
Sys.setenv(PKG_CONFIG_PATH=paste(Sys.getenv("PKG_CONFIG_PATH"),paste(.libPaths(),"exageostat/lib/pkgconfig",sep='/',collapse=':'),sep=':'))

 > Is there a simple way to set this path without explicitly calling this 
line before installing my package? OR is there any other path that I can use so 
that all software CMake commands can easily find the required libraries?

 > --Sameh

Not an answer, but a  #METOO   with a hopefull very related
question, also on using 'pkg-config' (Note: "-", not "_" here)
for package configuration.

I'm maintainer of CRAN package Rmpfr (for arbitrary precise arithmetic..),
 https://cran.r-project.org/package=Rmpfr
development & source on R-forge
  http://rmpfr.r-forge.r-project.org/
and  https://r-forge.r-project.org/scm/viewvc.php/pkg/?root=rmpfr
which "down there"  is principally an interface to the GNU MPFR
C library (which also needs the GNU  GMP C library).

I do have a  Rmpfr/configure.ac from which to produce
Rmpfr/configure which then ensures that both libraries (MPFR and GMP)
are found and are "working".
The 'configure' script then (supposedly, but not on Windows?) takes
either src/Makevars.in  (non-Windows)
or src/Makevars.win (Windows)
to produce  src/Makevars
which then is used during compilation of the C sources of my
package.

I have a small marginal remark about this.
Makevars.win is not the Windows equivalent of Makevars.in but of plain 
Makevars (see
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Package-subdirectories).
On Windows, you can use configure.win, which in turn can use 
Makevars.win.in to produce Makevars.win. The latter will be used by make 
on that platform (see
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Configure-and-cleanup).


Best,
Serguei.



Notably it will contain   '-lmpfr -lgmp'  among the LDFLAGS in
any case.

Now back to the 'pkg-config' : The compiler *also* needs correct

-I  (the path used  by ' #include <...> ' statement)

and for linking a correct  -L.

Now, my main OS,  Linux Fedora (as all other decent Linux distributions)
does provide MPFR and GMP libraries (and include headers) as OS
packages, installed in  /usr/lib/ (actually /lib64 nowadays)
and /usr/include respectively.

However, for some reasons I don't know the *version* of the MPFR
library that the OS provides is outdate (to my taste), and I'd
really want a newer version of MPFR,  which I easily install in
a version of /usr/local/. *and* I also make sure that

 pkg-config --libs mpfr
and pkg-config --cflags mpfr

list the corresponding LDFLAGS  and CFLAGS

(the first giving

   -L/usr/local.../mpfr/4.0.1/lib -lmpfr -lgmp

  the 2nd

   -I/usr/local.../mpfr/4.0.1/include
)

Now what is the officially / best way to have either 'configure'
or  Makevars.{in,win}  use the 'pkg-config' information
*dynamically*, i.e.,
if I update my MPFR from 4.0.1 to 4.0.2  the newer 4.0.2 is found ?

My current setup would not even work on some platforms to really
end up using my local version of MPFR instead of the system-wide
OS default (using /lib64 and /usr/include/ and then
which even with quite new Fedora 30 is still MPFR 3.1.6 .. much
too old for some of the things I want).

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] third part software dependency

2019-08-30 Thread Serguei Sokol

On 30/08/2019 15:58, Dirk Eddelbuettel wrote:


On 30 August 2019 at 07:24, Dirk Eddelbuettel wrote:
|
| On 30 August 2019 at 11:10, Serguei Sokol wrote:
| | I am preparing a new package r2sundials for submission to cran. It
| | depends on third part software
| | (https://computing.llnl.gov/projects/sundials). This will be my first
| | submission of the kind so I am wondering how it is supposed to be tested
| | so that I can reliably check the box "tested on R-devel" during
| | submission process?
| | I suppose that sundials is not installed on windev. And even if it was,
| | I need a particular option in this software (index size set to 32 bits)
| | which is probably not activated in the hypothetical installation. So I
| | cannot use win-builder.r-project.org.
| | Am I supposed to install current r-devel version and test my package on
| | it locally?
|
| The builder.r-hub.io service is a good alternative. It offers twelve
| different platforms, including a few r-devel ones.
|
| That said, it won't have sundials either so you may have to pull sundials in
| during configure or via Make dependencies or ...

An alternative is of course to use a Docker image. I have long provided two
different containers within the Rocker Project that have r-devel (to be
invoked as RD).  You can pretty easily fire up the container, install
sundials and then save it again locally.  Ask me off-line about how if you
need help.

Or, of course, do as I and many others do and just keep a local r-devel build
in /usr/local 

Thanks Dirk. This was my fallback option and I think this is what I'll do.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] third part software dependency

2019-08-30 Thread Serguei Sokol

Hi Ralf,

On 30/08/2019 16:04, Ralf Stubner wrote:

On Fri, Aug 30, 2019 at 11:10 AM Serguei Sokol  wrote:

I am preparing a new package r2sundials for submission to cran. It
depends on third part software
(https://computing.llnl.gov/projects/sundials).


Are you aware of the sundialr package?
https://cran.r-project.org/package=sundialr
Yes, I am. I explain why I created a new package in the Readme.md visible 
here: https://github.com/sgsokol/r2sundials





This will be my first
submission of the kind so I am wondering how it is supposed to be tested
so that I can reliably check the box "tested on R-devel" during
submission process?
I suppose that sundials is not installed on windev. And even if it was,
I need a particular option in this software (index size set to 32 bits)
which is probably not activated in the hypothetical installation. So I
cannot use win-builder.r-project.org.
Am I supposed to install current r-devel version and test my package on
it locally?


Besides Dirk's suggestions, you can also use a CI service like Travis.
there you can install additional dependencies before testing the
package itself.
Thanks for the suggestion, but I don't use Travis. Until now I haven't needed 
it. Locally, I installed sundials/cvodes and my package. That's enough 
for the tests.




However, the question remains how CRAN will test the package without
having sundials installed.

Right.

Best,
Serguei.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z

2019-06-24 Thread Serguei Sokol

On 22/06/2019 00:58, jing hua zhao wrote:

Hi Peter, Rui, Chrstophe and Gabriel,

Thanks for your inputs --  the use of qnorm(., log=TRUE) is a good point
Another approach could be simply to note that a function defined as 
f(p) = exp(-z(p)^2/2) is regular around p=0 with f(0)=0.
It has roughly the shape of p*(2-p) for p in [0; 1]. So we can 
calculate, let's say, f(10^-10) with sufficient precision using Rmpfr and 
then use a linear approximation for p in [0, 10^-10]. After that a 
simple inverse gives us e^(z*z/2).
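
A minimal sketch of this idea (my own illustration, not from the original
post; at p0 = 1e-10 plain double precision already suffices for f(p0), Rmpfr
would only be needed to refine it, and the linear approximation slowly loses
accuracy as p gets many orders of magnitude below p0):

p0 <- 1e-10
f0 <- exp(-qnorm(p0/2)^2/2)  # f(p0), still representable as a double
f <- function(p) f0*p/p0     # linear approximation on [0, p0]
1/f(1e-12)                   # approximates e^(z*z/2) for p = 1e-12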


Serguei.


  in line with pnorm with which we devised log(p)  as

log(2) + pnorm(-abs(z), lower.tail = TRUE, log.p = TRUE)

that could do really really well for large z compared to Rmpfr. Maybe I am 
asking too much since

z <-2

Rmpfr::format(2*pnorm(mpfr(-abs(z),100),lower.tail=TRUE,log.p=FALSE))

[1] "1.660579603192917090365313727164e-86858901"

already gives a rarely seen small p value. I gather I also need a 
multiple-precision exp() and their sum, since exp(z^2/2) is also a Bayes 
factor, so I get log(x_i)/sum_i log(x_i) instead. To this point, I am obliged 
to clarify, see 
https://statgen.github.io/gwas-credible-sets/method/locuszoom-credible-sets.pdf.

I agree many feel geneticists go too far with small p values, which I would 
have difficulty arguing against; on the other hand it is also expected to see 
these in a non-genetic context. For instance the Framingham study, established 
in 1948, just got $34m for six years on phenotype-wide association, which 
would be interesting to see.

Best wishes,


Jing Hua



From: peter dalgaard 
Sent: 21 June 2019 16:24
To: jing hua zhao
Cc: Rui Barradas; r-devel@r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z

You may want to look into using the log option to qnorm

e.g., in round figures:


log(1e-300)

[1] -690.7755

qnorm(-691, log=TRUE)

[1] -37.05315

exp(37^2/2)

[1] 1.881797e+297

exp(-37^2/2)

[1] 5.314068e-298

Notice that floating point representation cuts out at 1e+/-308 or so. If you 
want to go outside that range, you may need explicit manipulation of the log 
values. qnorm() itself seems quite happy with much smaller values:


qnorm(-5000, log=TRUE)

[1] -99.94475

-pd


On 21 Jun 2019, at 17:11 , jing hua zhao  wrote:

Dear Rui,

Thanks for your quick reply -- this allows me to see the bottom of this. I was 
hoping we could have a handle of those p in genmoics such as 1e-300 or smaller.

Best wishes,


Jing Hua


From: Rui Barradas 
Sent: 21 June 2019 15:03
To: jing hua zhao; r-devel@r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z

Hello,

Well, try it:

p <- .Machine$double.eps^seq(0.5, 1, by = 0.05)
z <- qnorm(p/2)

pnorm(z)
# [1] 7.450581e-09 1.22e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16
p/2
# [1] 7.450581e-09 1.22e-09 2.026908e-10 3.343152e-11 5.514145e-12
# [6] 9.094947e-13 1.500107e-13 2.474254e-14 4.080996e-15 6.731134e-16
#[11] 1.110223e-16

exp(z*z/2)
# [1] 9.184907e+06 5.301421e+07 3.073154e+08 1.787931e+09 1.043417e+10
# [6] 6.105491e+10 3.580873e+11 2.104460e+12 1.239008e+13 7.306423e+13
#[11] 4.314798e+14


p is the smallest possible such that 1 + p != 1 and I couldn't find
anything to worry about.


R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0

locale:
  [1] LC_CTYPE=pt_PT.UTF-8   LC_NUMERIC=C
  [3] LC_TIME=pt_PT.UTF-8LC_COLLATE=pt_PT.UTF-8
  [5] LC_MONETARY=pt_PT.UTF-8LC_MESSAGES=pt_PT.UTF-8
  [7] LC_PAPER=pt_PT.UTF-8   LC_NAME=C
  [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods
[7] base

other attached packages:

[many packages loaded]


Hope this helps,

Rui Barradas

Às 15:24 de 21/06/19, jing hua zhao escreveu:

Dear R-developers,

I am keen to calculate exp(z*z/2) with z=qnorm(p/2) and p is very small. I 
wonder if anyone has experience with this?

Thanks very much in advance,


Jing Hua

   [[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com










[[alternative HTML version deleted]]

__

Re: [R-pkg-devel] .Rd, LaTeX and Unicode

2019-06-19 Thread Serguei Sokol

On 18/06/2019 17:10, Georgi Boshnakov wrote:


Since April 2018 'utf8' is the default input encoding in LaTeX, see
http://anorien.csc.warwick.ac.uk/mirrors/CTAN/macros/latex/doc/ltnews.pdf and 
they added some symbols in December.
Interesting ... but still not sufficient. I have a fairly recent latex 
system:

$ latex --version
pdfTeX 3.14159265-2.6-1.40.19 (TeX Live 2018/Mageia)

but unfortunately utf8 alone (and even including
\usepackage[mathletters]{ucs}) cannot compile utf8 math expressions.

I have also tried a full scale test on a tex file obtained with
$ R CMD Rd2pdf --no-clean pkgname
where I replaced

\usepackage[utf8]{inputenc}

by

\usepackage[mathletters]{ucs}
\usepackage[utf8x]{inputenc}

and it did not compile either. In addition, I had to replace every 
occurrence of


\inputencoding{utf8}

by

\inputencoding{utf8x}

after which pdflatex worked like a charm.

Serguei.



-Original Message-
From: R-package-devel [mailto:r-package-devel-boun...@r-project.org] On Behalf 
Of Martin Maechler
Sent: 18 June 2019 15:01
To: serguei.so...@gmail.com; Hugh Parsonage
Cc: r-package-devel@r-project.org
Subject: Re: [R-pkg-devel] .Rd, LaTeX and Unicode


Hugh Parsonage
 on Tue, 18 Jun 2019 20:03:41 +1000 writes:


 > utf8x is deprecated
 > 
https://tex.stackexchange.com/questions/13067/utf8x-vs-utf8-inputenc#13070

Hmm... interestingly, I've tried quite a few versions of the
above which started in 2011, but had been updated in April 2016 :
https://tex.stackexchange.com/a/203804/7228
from where it seems that

\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}

should be sufficient.  Further, note that from
   https://tex.stackexchange.com/a/238135/7228
the {ucs} package should no longer be needed since ca. 2013,
hence your \usepackage[mathletters]{ucs}  would not be needed either.

HOWEVER:  After losing at least half an hour now, trying many
variants I found that the only version that works correctly for
me (with a teTeX / TeXlive version of 2018) is the version
Serguei Sokol proposes (below), including the use of the 'utf8x'
option *and* the 'ucs' package ...

which is pretty surprising after having read the
tex.statexchange threads ...

 > On Tue, 18 Jun 2019 at 7:52 pm, Serguei Sokol 
 > wrote:

 >> Hi,
 >>
 >> I am preparing a package where I would like to use UTF characters in .Rd
 >> files. When the LaTeX comes to play, I got well known errors e.g.:
 >> ! Package inputenc Error: Unicode character ∂ (U+2202)
 >> (inputenc)not set up for use with LaTeX.
 >>
 >> It is coherent with what is said on this page
 >> https://developer.r-project.org/Encodings_and_R.html :
 >> "Since LaTeX cannot handle Unicode we would have to convert the encoding
 >> of latex help files or use Lambda (and tell it they were in UTF-8)."

That whole document has been very important and crucial, written
by Prof Brian Ripley  who had worked a *LOT* to bring unicode to R,
-- but it has been written 2004-2005  and indeed, I think it is
probably fair to say that the above sentence no longer applies
to current LaTeX engines (including "simple" pdflatex)... though really,
I'm not the expert here, but I think it's a good point in time
to reconsider how much UTF8 should be allowed/supported in *.Rd files.

One problem: This is (slightly) the wrong mailing list; this would have
been a perfect topic for 'R-devel' (discussing about new
features etc for R) instead
( but we'd rather keep it here for now.)

Martin Maechler
ETH Zurich and R Core Team



 >> But LaTeX can support UTF8 as shown with this small example:

  \documentclass{article}
  \usepackage[mathletters]{ucs}
  \usepackage[utf8x]{inputenc}
  
  \begin{document}

  The vorticity ω is defined as $ω = ∇ × u$.
  \end{document}

 >> I can compile it with my LaTeX without problem. May be you too?
 >> So my suggestion would be to place these two lines somewhere in LaTeX
 >> header generated by R doc system:
 >> \usepackage[mathletters]{ucs}
 >> \usepackage[utf8x]{inputenc}
 >>
 >> Note "utf8x" and not just "utf8" which is crucial for this example.
 >> With a hope that it would fix unicode errors from LaTeX.
 >>
 >> Best,
 >> Serguei.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


[R-pkg-devel] .Rd, LaTeX and Unicode

2019-06-18 Thread Serguei Sokol

Hi,

I am preparing a package where I would like to use UTF characters in .Rd 
files. When the LaTeX comes to play, I got well known errors e.g.:

! Package inputenc Error: Unicode character ∂ (U+2202)
(inputenc)    not set up for use with LaTeX.

It is coherent with what is said on this page 
https://developer.r-project.org/Encodings_and_R.html :
"Since LaTeX cannot handle Unicode we would have to convert the encoding 
of latex help files or use Lambda (and tell it they were in UTF-8)."


But LaTeX can support UTF8 as shown with this small example:

\documentclass{article}
\usepackage[mathletters]{ucs}
\usepackage[utf8x]{inputenc}

\begin{document}
    The vorticity ω is defined as $ω = ∇ × u$.
\end{document}

I can compile it with my LaTeX without problem. Maybe you can too?
So my suggestion would be to place these two lines somewhere in LaTeX 
header generated by R doc system:

\usepackage[mathletters]{ucs}
\usepackage[utf8x]{inputenc}

Note "utf8x" and not just "utf8" which is crucial for this example.
With a hope that it would fix unicode errors from LaTeX.

Best,
Serguei.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] [R] Open a file which name contains a tilde

2019-06-14 Thread Serguei Sokol

On 14/06/2019 14:43, Frank Schwidom wrote:

Hi John,

First, the unix and linux filesystem allows the use of any nonzero character in 
its filesystem filenames
Well, even if it's not the central point of the discussion, let's make this 
assertion more correct. It depends on the file system. E.g. JFS 
(https://en.wikipedia.org/wiki/JFS_%28file_system%29) abides by this 
rule while the very popular ext4 (https://en.wikipedia.org/wiki/Ext4) does 
not. In addition to the NUL character, the latter forbids '/' as well.


Best,
Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [R-pkg-devel] try() in R CMD check --as-cran

2019-06-07 Thread Serguei Sokol

On 07/06/2019 16:47, Duncan Murdoch wrote:

On 07/06/2019 9:46 a.m., J C Nash wrote:

Should try() not stop those checks from forcing an error?


try(stop("msg"))  will print the error message, but won't stop 
execution.  Presumably the printed message is what is causing you 
problems.  If you want to suppress that, use


try(stop("msg"), silent = TRUE)
Out of curiosity, I tried, but to no avail. Moreover, I have tried to trigger 
a similar error in a different place by adding the following code to John's 
example in man/fchk.Rd:


#... the same before
} # very simple, but ...

print(Sys.getenv("_R_CHECK_LENGTH_1_LOGIC2_", unset="unset"))
a=1:3 || 1:3
cat("a=", a, "\n")
y<-1:10
#... the same after.

and the check failed again. This part of the code works without any error 
signaling, while fchk()'s situation continues to flag an error. From 
fchk-Ex.Rout, we can see:

...
+ } # very simple, but ...
>
> print(Sys.getenv("_R_CHECK_LENGTH_1_LOGIC2_", unset="unset"))
[1] "package:_R_CHECK_PACKAGE_NAME_,abort,verbose"
> a=1:3 || 1:3
> cat("a=", a, "\n")
a= TRUE
> y<-1:10
...
Function evaluation returns a vector not a scalar
 --- FAILURE REPORT --
 --- failure: length > 1 in coercion to logical ---
...

While in regular R session:
> Sys.unsetenv("_R_CHECK_LENGTH_1_LOGIC2_")
> (a=1:3 || 1:3)
[1] TRUE
> Sys.setenv("_R_CHECK_LENGTH_1_LOGIC2_"="")
> (a=1:3 || 1:3)
Error in 1:3 || 1:3 : 'length(x) = 3 > 1' in coercion to 'logical(1)'

So, to my mind, 'a=1:3 || 1:3' should be considered as an error in a 
check with "--as-cran" but for some reason is not.


Both fchk with modified example and corresponding fchk-Ex.Rout are enclosed.

Best,
Serguei.



Duncan Murdoch



I recognize that this is the failure -- it is indeed the check I'm 
trying to

catch -- but I don't want tests of such checks to fail my package.

JN

On 2019-06-07 9:31 a.m., Sebastian Meyer wrote:

The failure stated in the R CMD check failure report is:


  --- failure: length > 1 in coercion to logical ---


This comes from --as-cran performing useful extra checks via setting the
environment variable _R_CHECK_LENGTH_1_LOGIC2_, which means:

check if either argument of the binary operators && and || has 
length greater than one.


(see https://cran.r-project.org/doc/manuals/r-release/R-ints.html#Tools)

The failure report also states the source of the failure:


  --- call from context ---
fchk(x, benbad, trace = 3, y)
  --- call from argument ---
is.infinite(fval) || is.na(fval)


The problem is that both is.infinite(fval) and is.na(fval) return
vectors of length 10 in your test case:


  --- value of length: 10 type: logical ---
  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE


The || operator works on length 1 Booleans. Since fval can be of length
greater than 1 at that point, the proper condition seems to be:

any(is.infinite(fval)) || any(is.na(fval))

Best regards,

Sebastian


Am 07.06.19 um 14:53 schrieb J C Nash:
Sorry reply not quicker. For some reason I'm not getting anything in 
the thread I started!
I found the responses in the archives. Perhaps cc: nas...@uottawa.ca 
please.


I have prepared a tiny (2.8K) package at
http://web.ncf.ca/nashjc/jfiles/fchk_2019-6.5.tar.gz

R CMD check --> OK

R CMD check --as-cran --> 1 ERROR, 1 NOTE

The error is in an example:


benbad<-function(x, y){
    # y may be provided with different structures
    f<-(x-y)^2
} # very simple, but ...

y<-1:10
x<-c(1)
cat("test benbad() with y=1:10, x=c(1)\n")
tryfc01 <- try(fc01<-fchk(x, benbad, trace=3, y))
print(tryfc01)
print(fc01)


There's quite a lot of output, but it doesn't make much sense to me, as
it refers to code that I didn't write.

The function fchk is attempting to check if functions provided for
optimization do not violate some conditions e.g., character rather than
numeric etc.

JN


On 2019-06-07 8:44 a.m., J C Nash wrote:

Uwe Ligges ||gge@ @end|ng |rom @t@t|@t|k@tu-dortmund@de
Fri Jun 7 11:44:37 CEST 2019


Right, what problem are you talking about? Can you tell us which check
it is and what it actually complained about.
There is no check that looks at the sizes of x and y in exypressions
such as
(x - y)^2.
as far as I know.

Best,
Uwe

On 07.06.2019 10:33, Berry Boessenkool wrote:


Not entirely sure if this is what you're looking for:
https://github.com/wch/r-source/blob/trunk/src/library/tools/R/check.R 


It does contain --as-cran a few times and there's the change-history:
https://github.com/wch/r-source/commits/trunk/src/library/tools/R/check.R 



Regards,
Berry



From: R-package-devel r-project.org> on behalf of J C Nash 

Sent: Thursday, June 6, 2019 15:03
To: List r-package-devel
Subject: [R-pkg-devel] try() in R CMD check --as-cran

After making a 

Re: [R-pkg-devel] try() in R CMD check --as-cran

2019-06-07 Thread Serguei Sokol

On 07/06/2019 16:05, Jeff Newmiller wrote:

any(is.infinite(fval)) || any(is.na(fval))


a little typo here: it should be '|', not '||', right ?


Since `any` collapses the vectors to length 1 either will work, but I would 
prefer `||`.
You are right, I missed the second 'any()' at first glance. I read 
it as 'any(v1 || v2)'. My bad.




On June 7, 2019 6:51:29 AM PDT, Serguei Sokol  wrote:

On 07/06/2019 15:31, Sebastian Meyer wrote:

The failure stated in the R CMD check failure report is:


   --- failure: length > 1 in coercion to logical ---


This comes from --as-cran performing useful extra checks via setting

the

environment variable _R_CHECK_LENGTH_1_LOGIC2_, which means:


check if either argument of the binary operators && and || has

length greater than one.


(see

https://cran.r-project.org/doc/manuals/r-release/R-ints.html#Tools)


The failure report also states the source of the failure:


   --- call from context ---
fchk(x, benbad, trace = 3, y)
   --- call from argument ---
is.infinite(fval) || is.na(fval)


The problem is that both is.infinite(fval) and is.na(fval) return
vectors of length 10 in your test case:


   --- value of length: 10 type: logical ---
   [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE


The || operator works on length 1 Booleans. Since fval can be of

length

greater than 1 at that point, the proper condition seems to be:

any(is.infinite(fval)) || any(is.na(fval))

a little typo here: it should be '|', not '||', right ?

Best,
Serguei.


Am 07.06.19 um 14:53 schrieb J C Nash:

Sorry reply not quicker. For some reason I'm not getting anything in

the thread I started!

I found the responses in the archives. Perhaps cc: nas...@uottawa.ca

please.


I have prepared a tiny (2.8K) package at
http://web.ncf.ca/nashjc/jfiles/fchk_2019-6.5.tar.gz

R CMD check --> OK

R CMD check --as-cran --> 1 ERROR, 1 NOTE

The error is in an example:


benbad<-function(x, y){
 # y may be provided with different structures
 f<-(x-y)^2
} # very simple, but ...

y<-1:10
x<-c(1)
cat("test benbad() with y=1:10, x=c(1)\n")
tryfc01 <- try(fc01<-fchk(x, benbad, trace=3, y))
print(tryfc01)
print(fc01)


There's quite a lot of output, but it doesn't make much sense to me,

as

it refers to code that I didn't write.

The function fchk is attempting to check if functions provided for
optimization do not violate some conditions e.g., character rather

than

numeric etc.

JN


On 2019-06-07 8:44 a.m., J C Nash wrote:

Uwe Ligges ||gge@ @end|ng |rom @t@t|@t|k@tu-dortmund@de
Fri Jun 7 11:44:37 CEST 2019


Right, what problem are you talking about? Can you tell us which

check

it is and what it actually complained about.
There is no check that looks at the sizes of x and y in

exypressions

such as
(x - y)^2.
as far as I know.

Best,
Uwe

On 07.06.2019 10:33, Berry Boessenkool wrote:


Not entirely sure if this is what you're looking for:


https://github.com/wch/r-source/blob/trunk/src/library/tools/R/check.R

It does contain --as-cran a few times and there's the

change-history:



https://github.com/wch/r-source/commits/trunk/src/library/tools/R/check.R


Regards,
Berry



From: R-package-devel 
r-project.org> on behalf of J C Nash 

Sent: Thursday, June 6, 2019 15:03
To: List r-package-devel
Subject: [R-pkg-devel] try() in R CMD check --as-cran

After making a small fix to my optimx package, I ran my usual R

CMD check --as-cran.


To my surprise, I got two ERRORs unrelated to the change. The

errors popped up in

a routine designed to check the call to the user objective

function. In particular,

one check is that the size of vectors is the same in expressions

like (x - y)^2.

This works fine with R CMD check, but the --as-cran seems to have

changed and it

pops an error, even when the call is inside try(). The irony that

the routine in

question is intended to avoid problems like this is not lost on

me.


I'm working on a small reproducible example, but it's not small

enough yet.

In the meantime, I'm looking for the source codes of the scripts

for "R CMD check" and

"R CMD check --as-cran" so I can work out why there is this

difference, which seems

to be recent.

Can someone send/post a link? I plan to figure this out and

provide feedback,

as I suspect it is going to affect others. However, it may be a

few days or even

weeks if past experience is a guide.

JN

__
R-package-devel using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

[[alternative HTML version deleted]]

__
R-package-devel usin

Re: [R-pkg-devel] try() in R CMD check --as-cran

2019-06-07 Thread Serguei Sokol

On 07/06/2019 15:31, Sebastian Meyer wrote:

The failure stated in the R CMD check failure report is:


  --- failure: length > 1 in coercion to logical ---


This comes from --as-cran performing useful extra checks via setting the
environment variable _R_CHECK_LENGTH_1_LOGIC2_, which means:


check if either argument of the binary operators && and || has length greater 
than one.


(see https://cran.r-project.org/doc/manuals/r-release/R-ints.html#Tools)

The failure report also states the source of the failure:


  --- call from context ---
fchk(x, benbad, trace = 3, y)
  --- call from argument ---
is.infinite(fval) || is.na(fval)


The problem is that both is.infinite(fval) and is.na(fval) return
vectors of length 10 in your test case:


  --- value of length: 10 type: logical ---
  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE


The || operator works on length 1 Booleans. Since fval can be of length
greater than 1 at that point, the proper condition seems to be:

any(is.infinite(fval)) || any(is.na(fval))

a little typo here: it should be '|', not '||', right ?

Best,
Serguei.


Am 07.06.19 um 14:53 schrieb J C Nash:

Sorry reply not quicker. For some reason I'm not getting anything in the thread 
I started!
I found the responses in the archives. Perhaps cc: nas...@uottawa.ca please.

I have prepared a tiny (2.8K) package at
http://web.ncf.ca/nashjc/jfiles/fchk_2019-6.5.tar.gz

R CMD check --> OK

R CMD check --as-cran --> 1 ERROR, 1 NOTE

The error is in an example:


benbad<-function(x, y){
# y may be provided with different structures
f<-(x-y)^2
} # very simple, but ...

y<-1:10
x<-c(1)
cat("test benbad() with y=1:10, x=c(1)\n")
tryfc01 <- try(fc01<-fchk(x, benbad, trace=3, y))
print(tryfc01)
print(fc01)


There's quite a lot of output, but it doesn't make much sense to me, as
it refers to code that I didn't write.

The function fchk is attempting to check if functions provided for
optimization do not violate some conditions e.g., character rather than
numeric etc.

JN


On 2019-06-07 8:44 a.m., J C Nash wrote:

Uwe Ligges ||gge@ @end|ng |rom @t@t|@t|k@tu-dortmund@de
Fri Jun 7 11:44:37 CEST 2019


Right, what problem are you talking about? Can you tell us which check
it is and what it actually complained about.
There is no check that looks at the sizes of x and y in exypressions
such as
(x - y)^2.
as far as I know.

Best,
Uwe

On 07.06.2019 10:33, Berry Boessenkool wrote:


Not entirely sure if this is what you're looking for:
https://github.com/wch/r-source/blob/trunk/src/library/tools/R/check.R
It does contain --as-cran a few times and there's the change-history:
https://github.com/wch/r-source/commits/trunk/src/library/tools/R/check.R

Regards,
Berry



From: R-package-devel  on behalf of J C 
Nash 
Sent: Thursday, June 6, 2019 15:03
To: List r-package-devel
Subject: [R-pkg-devel] try() in R CMD check --as-cran

After making a small fix to my optimx package, I ran my usual R CMD check 
--as-cran.

To my surprise, I got two ERRORs unrelated to the change. The errors popped up 
in
a routine designed to check the call to the user objective function. In 
particular,
one check is that the size of vectors is the same in expressions like (x - y)^2.
This works fine with R CMD check, but the --as-cran seems to have changed and it
pops an error, even when the call is inside try(). The irony that the routine in
question is intended to avoid problems like this is not lost on me.

I'm working on a small reproducible example, but it's not small enough yet.
In the meantime, I'm looking for the source codes of the scripts for "R CMD 
check" and
"R CMD check --as-cran" so I can work out why there is this difference, which 
seems
to be recent.

Can someone send/post a link? I plan to figure this out and provide feedback,
as I suspect it is going to affect others. However, it may be a few days or even
weeks if past experience is a guide.

JN

__
R-package-devel using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

[[alternative HTML version deleted]]

__
R-package-devel using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel





__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] try() in R CMD check --as-cran

2019-06-07 Thread Serguei Sokol

On 07/06/2019 14:53, J C Nash wrote:

Sorry reply not quicker. For some reason I'm not getting anything in the thread 
I started!
I found the responses in the archives. Perhaps cc: nas...@uottawa.ca please.

I have prepared a tiny (2.8K) package at
http://web.ncf.ca/nashjc/jfiles/fchk_2019-6.5.tar.gz

R CMD check --> OK

R CMD check --as-cran --> 1 ERROR, 1 NOTE

The error is in an example:


benbad<-function(x, y){
# y may be provided with different structures
f<-(x-y)^2
} # very simple, but ...

y<-1:10
x<-c(1)

The faulty line in fchk() seems to be

if (is.infinite(fval) || is.na(fval)) { ... }

As fval is now allowed to be a vector, it conflicts with the '||' operator 
which, as you know, expects scalar logical values. If you replace this 
line with something like:


if (any(is.infinite(fval) | is.na(fval))) { ... }
the error is gone, even with the --as-cran option used. In my opinion (even 
if nobody asks for it ;) ), it is fortunate that this option is picky enough 
about this kind of misuse of logical operators.


Best,
Serguei.


cat("test benbad() with y=1:10, x=c(1)\n")
tryfc01 <- try(fc01<-fchk(x, benbad, trace=3, y))
print(tryfc01)
print(fc01)


There's quite a lot of output, but it doesn't make much sense to me, as
it refers to code that I didn't write.

The function fchk is attempting to check if functions provided for
optimization do not violate some conditions e.g., character rather than
numeric etc.

JN


On 2019-06-07 8:44 a.m., J C Nash wrote:

Uwe Ligges ||gge@ @end|ng |rom @t@t|@t|k@tu-dortmund@de
Fri Jun 7 11:44:37 CEST 2019


Right, what problem are you talking about? Can you tell us which check
it is and what it actually complained about.
There is no check that looks at the sizes of x and y in exypressions
such as
(x - y)^2.
as far as I know.

Best,
Uwe

On 07.06.2019 10:33, Berry Boessenkool wrote:


Not entirely sure if this is what you're looking for:
https://github.com/wch/r-source/blob/trunk/src/library/tools/R/check.R
It does contain --as-cran a few times and there's the change-history:
https://github.com/wch/r-source/commits/trunk/src/library/tools/R/check.R

Regards,
Berry



From: R-package-devel  on behalf of J C 
Nash 
Sent: Thursday, June 6, 2019 15:03
To: List r-package-devel
Subject: [R-pkg-devel] try() in R CMD check --as-cran

After making a small fix to my optimx package, I ran my usual R CMD check 
--as-cran.

To my surprise, I got two ERRORs unrelated to the change. The errors popped up 
in
a routine designed to check the call to the user objective function. In 
particular,
one check is that the size of vectors is the same in expressions like (x - y)^2.
This works fine with R CMD check, but the --as-cran seems to have changed and it
pops an error, even when the call is inside try(). The irony that the routine in
question is intended to avoid problems like this is not lost on me.

I'm working on a small reproducible example, but it's not small enough yet.
In the meantime, I'm looking for the source codes of the scripts for "R CMD 
check" and
"R CMD check --as-cran" so I can work out why there is this difference, which 
seems
to be recent.

Can someone send/post a link? I plan to figure this out and provide feedback,
as I suspect it is going to affect others. However, it may be a few days or even
weeks if past experience is a guide.

JN

__
R-package-devel using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

[[alternative HTML version deleted]]

__
R-package-devel using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel





__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

2019-05-06 Thread Serguei Sokol

On 06/05/2019 18:21, Thomas Petzoldt wrote:
It seems that it's an old bug that was found in some other packages, 
but at that time not optim:


https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15958
I think that the bug description is a little bit misleading. The bug is 
not that "<<-" produces a reference instead of a copy (that's 
normal) but that some C or Fortran code modifies a variable "in 
place" without checking whether there are references to it.


Serguei (just splitting hairs)

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

2019-05-03 Thread Serguei Sokol

On 03/05/2019 10:31, Serguei Sokol wrote:

On 02/05/2019 21:35, Florian Gerber wrote:

Dear all,

when using optim() for a function that uses the parent environment, I
see the following unexpected behavior:

makeFn <- function(){
 xx <- ret <- NA
 fn <- function(x){
    if(!is.na(xx) && x==xx){
    cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
    return(ret)
    }
    xx <<- x; ret <<- sum(x^2)
    cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
    ret
 }
 fn
}
fn <- makeFn()
optim(par=10, fn=fn, method="L-BFGS-B")
# x=10, ret=100 (calculate)
# x=10.001, ret=100.02 (calculate)
# x=9.999, ret=100.02 (memory)
# $par
# [1] 10
#
# $value
# [1] 100
# (...)

I would expect that optim() does more than 3 function evaluations and
that the optimization converges to 0.

Same problem with optim(par=10, fn=fn, method="BFGS").

Any ideas?
I don't have an answer but maybe an insight. For some mysterious 
reason xx is getting changed when it should not. Consider:
> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in 
x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx 
<<- x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})

> optim(par=10, fn=fn, method="L-BFGS-B")
1 in x,xx,ret= 10 NA NA
out x,xx,ret= 10 10 100
2 in x,xx,ret= 10.001 10 100
out x,xx,ret= 10.001 10.001 100.02
3 in x,xx,ret= 9.999 9.999 100.02
$par
[1] 10

$value
[1] 100

$counts
function gradient
   1    1

$convergence
[1] 0

$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"

At the third call, xx has value 9.999 while it should have kept the 
value 10.001.


A little follow-up: if you untie the link between xx and x by replacing 
the expression "xx <<- x" by "xx <<- x+0" it works as expected:
> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in 
x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- 
x+0; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})

> optim(par=10, fn=fn, method="L-BFGS-B")
1 in x,xx,ret= 10 NA NA
out x,xx,ret= 10 10 100
2 in x,xx,ret= 10.001 10 100
out x,xx,ret= 10.001 10.001 100.02
3 in x,xx,ret= 9.999 10.001 100.02
out x,xx,ret= 9.999 9.999 99.98
4 in x,xx,ret= 9 9.999 99.98
out x,xx,ret= 9 9 81
5 in x,xx,ret= 9.001 9 81
out x,xx,ret= 9.001 9.001 81.018
6 in x,xx,ret= 8.999 9.001 81.018
out x,xx,ret= 8.999 8.999 80.982
7 in x,xx,ret= 1.776357e-11 8.999 80.982
out x,xx,ret= 1.776357e-11 1.776357e-11 3.155444e-22
8 in x,xx,ret= 0.001 1.776357e-11 3.155444e-22
out x,xx,ret= 0.001 0.001 1e-06
9 in x,xx,ret= -0.001 0.001 1e-06
out x,xx,ret= -0.001 -0.001 1e-06
10 in x,xx,ret= -1.334475e-23 -0.001 1e-06
out x,xx,ret= -1.334475e-23 -1.334475e-23 1.780823e-46
11 in x,xx,ret= 0.001 -1.334475e-23 1.780823e-46
out x,xx,ret= 0.001 0.001 1e-06
12 in x,xx,ret= -0.001 0.001 1e-06
out x,xx,ret= -0.001 -0.001 1e-06
$par
[1] -1.334475e-23

$value
[1] 1.780823e-46

$counts
function gradient
   4    4

$convergence
[1] 0

$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"

Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R optim(method="L-BFGS-B"): unexpected behavior when working with parent environments

2019-05-03 Thread Serguei Sokol

On 02/05/2019 21:35, Florian Gerber wrote:

Dear all,

when using optim() for a function that uses the parent environment, I
see the following unexpected behavior:

makeFn <- function(){
     xx <- ret <- NA
     fn <- function(x){
    if(!is.na(xx) && x==xx){
    cat("x=", xx, ", ret=", ret, " (memory)", fill=TRUE, sep="")
    return(ret)
    }
    xx <<- x; ret <<- sum(x^2)
    cat("x=", xx, ", ret=", ret, " (calculate)", fill=TRUE, sep="")
    ret
     }
     fn
}
fn <- makeFn()
optim(par=10, fn=fn, method="L-BFGS-B")
# x=10, ret=100 (calculate)
# x=10.001, ret=100.02 (calculate)
# x=9.999, ret=100.02 (memory)
# $par
# [1] 10
#
# $value
# [1] 100
# (...)

I would expect that optim() does more than 3 function evaluations and
that the optimization converges to 0.

Same problem with optim(par=10, fn=fn, method="BFGS").

Any ideas?
I don't have an answer but maybe an insight. For some mysterious reason 
xx is getting changed when it should not. Consider:
> fn=local({n=0; xx=ret=NA; function(x) {n <<- n+1; cat(n, "in 
x,xx,ret=", x, xx, ret, "\n"); if (!is.na(xx) && x==xx) ret else {xx <<- 
x; ret <<- x**2; cat("out x,xx,ret=", x, xx, ret, "\n"); ret}}})

> optim(par=10, fn=fn, method="L-BFGS-B")
1 in x,xx,ret= 10 NA NA
out x,xx,ret= 10 10 100
2 in x,xx,ret= 10.001 10 100
out x,xx,ret= 10.001 10.001 100.02
3 in x,xx,ret= 9.999 9.999 100.02
$par
[1] 10

$value
[1] 100

$counts
function gradient
   1    1

$convergence
[1] 0

$message
[1] "CONVERGENCE: NORM OF PROJECTED GRADIENT <= PGTOL"

At the third call, xx has value 9.999 while it should have kept the 
value 10.001.


Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Intermittent crashes with inset `[<-` command

2019-02-27 Thread Serguei Sokol

On 26/02/2019 05:18, Brian Montgomery via R-devel wrote:

The following code crashes after about 300 iterations on my x86_64-w64-mingw32 
machine on R 3.5.2 --vanilla.
Others have duplicated this (see 
https://github.com/tidyverse/magrittr/issues/190 if necessary), but I don't 
know how machine/OS-dependent it may be.

It crashes on my Mageia 6 (an RPM-based Linux distribution) too:
 184 185 186 187
 *** caught segfault ***
address 0x70002, cause 'memory not mapped'

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

The crash can happen at different moments, sometimes after i=187 like in 
the example above, sometimes after i=915. The error is not always a 
segfault. It can also be


915 Error in `[<-`(x, y == "a", x[y == "b"]) : replacement has length zero

or

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 Error in 
`[<-`(x, y == "a", x[y == "b"]) :
  types (de raw a integer) incompatibles dans l'ajustement 
d'affectation de type


(sorry, this crash was in a French locale)

Hoping this helps.
Serguei.

>  sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Mageia 6

Matrix products: default
BLAS/LAPACK: /home/opt/OpenBLAS/lib/libopenblas_sandybridge-r0.3.3.so

locale:
[1] C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods base

loaded via a namespace (and not attached):
[1] compiler_3.5.2


If it doesn't crash for you, please try increasing the length of the x vector.

Substituting the commented-out line for the one below it works correctly 
(prints out 1:1000 and ends normally) every time.

x <- 1:20
y <- rep(letters[1:5], length(x) / 5L)
for (i in 1:1000) {
   # x[y == 'a'] <- x[y == 'b']
   x <- `[<-`(x, y == 'a', x[y == 'b'])
   cat(i, '')
}
cat('\n')

The point of using this syntax is to make it work better with pipes, but the 
errors occur without pipes or magrittr.

Thank you for your help!

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Bug report: Function ppois(0:20, lambda=0.9) does not generate a non-decreasing result.

2018-12-05 Thread Serguei Sokol

Le 04/12/2018 à 19:12, Martin Maechler a écrit :

Serguei Sokol
 on Tue, 4 Dec 2018 11:46:32 +0100 writes:

 > Le 04/12/2018 à 11:27, Iñaki Ucar a écrit :
 >> On Tue, 4 Dec 2018 at 11:12,  wrote:
 >>> function ppois calculates the CDF of the Poisson
distribution; it should generate a non-decreasing result, but what I got is:
 >>>
 >>>> any(diff(ppois(0:19,lambda=0.9))<0)
 >>> [1] TRUE
...
 > any(diff(exp(ppois(0:19, lambda=0.9, log.p=TRUE))) < 0)
 > #[1] FALSE

 > But may be there is another, more economic way?

Well, log probabilites *are* very economic for many such p*()
functions.
I have no doubt about it. My "economic way" was about getting ppois()
*non-decreasing* more cheaply than with the exp-log.p trick.
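
A minimal sketch of one such cheaper option (my illustration; it simply
enforces monotonicity on the already computed values, assuming that is
acceptable for the application):

p <- cummax(ppois(0:19, lambda = 0.9))
any(diff(p) < 0)
# [1] FALSE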


Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Bug report: Function ppois(0:20, lambda=0.9) does not generate a non-decreasing result.

2018-12-04 Thread Serguei Sokol

Le 04/12/2018 à 11:27, Iñaki Ucar a écrit :

On Tue, 4 Dec 2018 at 11:12,  wrote:

function ppois calculates the CDF of the Poisson distribution; it
should generate a non-decreasing result, but what I got is:


any(diff(ppois(0:19,lambda=0.9))<0)

[1] TRUE

Actually,


ppois(19,lambda=0.9)
[1] TRUE

Which could not be TRUE.

This is just another manifestation of

0.1 * 3 > 0.3
#> [1] TRUE

This discussion returns to this list from time to time. TLDR; this is
not an R issue, but an unavoidable floating point issue.
Well, here the request may be interpreted not as "do it without rounding
error", which is indeed unavoidable, but rather as "please cope with rounding
errors in a way that returns a consistent result for ppois()". You have
indicated one way to do so (I have just added exp() to the chain):


any(diff(exp(ppois(0:19, lambda=0.9, log.p=TRUE))) < 0)
#[1] FALSE

But may be there is another, more economic way?

Serguei.


  Solution:
work with log-probabilities instead.

any(diff(ppois(0:40, lambda=0.9, log.p=TRUE))<0)
#> [1] FALSE

Iñaki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Serguei Sokol
Ingenieur de recherche INRA

Cellule mathématiques
LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 62 25 01 27
email: so...@insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Subsetting row in single column matrix drops names in resulting vector

2018-11-27 Thread Serguei Sokol
The argument that multi-[column|row] and one-[column|row] matrices should
be treated in the same way with respect to the names kept in the result
sounds good to me. I withdraw my remark.


Serguei.

Le 27/11/2018 à 15:48, Radford Neal a écrit :

The behaviour of a[1,] is unchanged, for backwards compatibility
reasons.  But in pqR one can explicitly mark an argument as
missing using "_".  When an array subscript is missing in this way,
the names will not be dropped in this context even if there is
only one of them.  So a[1,_] will do what you want:

> a = matrix(1:2, nrow = 2, dimnames = list(c("row1", "row2"), c("col1")))
> a[1, ]
[1] 1
> a[1,_]
col1
   1

To my mind, it's rather counterintuitive as


a[2,_]

col1
 1
so a[1,_] and a[2,_] have the same name. To make it intuitive (at least
for me ;) ) it should rather return names "row1" and "row2" respectively.

Best,
Serguei.


The aim in designing these features should be to make it easier to
write reliable software, which doesn't unexpectedly fail in edge
cases.

Here, the fact that a is a matrix presumably means that the program is
designed to work for more than one column - in fact, it's likely that
the programmer was mostly thinking of the case where there is more
than one column, and perhaps only testing that case.  But of course
there is usually no reason why one column (or even zero columns) is
impossible.  We want the program to still work in such cases.

When there is more than one column, a[1,] and a[1,_] both produce a
vector with the _column_ names attached, and this is certainly not
going to change (nor should it, unless one wants to change the whole
semantics of matrices so that rows and columns are treated
non-symmetrically, and even then attaching the same row name to all
the elements would be rather strange...).

After v <- a[1,_], the program may well have an expression like v[nc]
where nc is a column name.  We want this to still work if there
happens to be only one column.  That will happen only if a[1,_]
attaches a column name, not a row name, when a has only one column.

Radford Neal




--
Serguei Sokol
Ingenieur de recherche INRA

Cellule mathématiques
LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 62 25 01 27
email: so...@insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Subsetting row in single column matrix drops names in resulting vector

2018-11-27 Thread Serguei Sokol

Le 27/11/2018 à 01:50, Radford Neal a écrit :

Dmitriy Selivanov (selivanov.dmit...@gmail.com) wrote:


Consider following example:

a = matrix(1:2, nrow = 2, dimnames = list(c("row1", "row2"), c("col1")))
a[1, ]
# 1

It returns *unnamed* vector `1` where I would expect named vector. In fact
it returns named vector when number of columns is > 1.
Same issue applicable to single row matrix. Is it a bug? looks very
counterintuitive.

This and related issues are addressed in pqR, in the new
release of 2018-11-18.  (See pqR-project.org, and my blog
post at radfordneal.wordpress.com)

The behaviour of a[1,] is unchanged, for backwards compatibility
reasons.  But in pqR one can explicitly mark an argument as
missing using "_".  When an array subscript is missing in this way,
the names will not be dropped in this context even if there is
only one of them.  So a[1,_] will do what you want:

   > a = matrix(1:2, nrow = 2, dimnames = list(c("row1", "row2"), c("col1")))
   > a[1, ]
   [1] 1
   > a[1,_]
   col1
  1

To my mind, it's rather counterintuitive as


a[2,_]

col1
   1
so a[1,_] and a[2,_] have the same name. To make it intuitive (at least for me 
;) )
it should rather return names "row1" and "row2" respectively.

Best,
Serguei.
 



Furthermore, pqR will not drop names when the subscript is a
1D array (ie, has a length-1 dim attribute) even if it is only
one long.  In pqR, sequences that are 1D arrays are easily created
using the .. operator.  So the following works as intended when ..
is used, but not when the old : operator is used:

   > a = matrix(1:4, nrow=2, dimnames=list(c("row1","row2"),c("col1","col2")))
   > n = 2
   > a[1,1:n]
   col1 col2
      1    3
   > a[1,1..n]
   col1 col2
      1    3
   > n = 1
   > a[1,1:n]
   [1] 1
   > a[1,1..n]
   col1
      1

You can read more about this in my blog post at

https://radfordneal.wordpress.com/2016/06/25/fixing-rs-design-flaws-in-a-new-version-of-pqr/

That was written when most of these features where introduced,
though getting your specific example right relies on another
change introduced in the most recent version.

 Radford Neal

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Serguei Sokol
Ingenieur de recherche INRA

Cellule mathématiques
LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 62 25 01 27
email: so...@insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Subsetting row in single column matrix drops names in resulting vector

2018-11-22 Thread Serguei Sokol

Le 22/11/2018 à 14:47, Emil Bode a écrit :

The problem is that the drop is only applied (or not) after the subsetting, so 
what R does is:
- Getting the subset, which means a 1 x 1 matrix.
- Only then it either returns that as is (when drop=FALSE), or removes ALL 
dimensions of extent 1, regardless of whether these are rows or columns (or 
higher dimensions).
And it can't keep any names, because what name should be returned? The name 
'row1' is just as valid as 'col1'.
If that is the only reason not to return any name in this case, I could
make a suggestion.
Let it return the name corresponding to the index in the subsetting request,
i.e. for the one-column matrix example it would give


names(a[1,])
#"row1"
names(a[2,])
#"row2"

as the indexes 1 and 2 above correspond to rows.

Just my 0.02€
Serguei.



I guess if we could design everything anew, a solution would be to be able to 
specify something like a[1,,drop='row'], or a[1,,drop=1] to drop the rows but 
keep columns, and get a vector being equal to 'row1' (which in this case just 
has length-1, and names 'col1')
That's not how it's designed, but you could use 'adrop()' from the 'abind'
package:
abind::adrop(a[1,,drop=FALSE], drop=1) first subsets, then drops the
row-dimension, so it gives what you're looking for.
Hope this solves your problem.

Best regards,
Emil Bode
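
For completeness, a minimal check of that workaround (my addition,
assuming the abind package is installed):

library(abind)
a <- matrix(1:2, nrow = 2, dimnames = list(c("row1", "row2"), c("col1")))
adrop(a[1, , drop = FALSE], drop = 1)
# col1
#    1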
  


On 21/11/2018, 17:58, "R-devel on behalf of Dmitriy Selivanov" 
 wrote:

 Hi Rui. Thanks for the answer, I'm aware of the drop = FALSE option. Unfortunately
 it doesn't resolve the issue - I'm expecting to get a vector, not a matrix.
 
 ср, 21 нояб. 2018 г. в 20:54, Rui Barradas :
 
 > Hello,

 >
 > Use drop = FALSE.
 >
 > a[1, , drop = FALSE]
 > #      col1
 > # row1    1
 >
 >
 > Hope this helps,
 >
 > Rui Barradas
 >
 > Às 16:51 de 21/11/2018, Dmitriy Selivanov escreveu:
 > > Hello here. I'm struggling to understand R's subsetting behavior in
 > couple
 > > of edge cases - subsetting row in a single column matrix and subsetting
 > > column in a single row matrix. I've read R's docs several times and
 > haven't
 > > found answer.
 > >
 > > Consider following example:
 > >
 > > a = matrix(1:2, nrow = 2, dimnames = list(c("row1", "row2"), 
c("col1")))
 > > a[1, ]
 > > # 1
 > >
 > > It returns *unnamed* vector `1` where I would expect named vector. In
 > fact
 > > it returns named vector when number of columns is > 1.
 > > Same issue applicable to single row matrix. Is it a bug? looks very
 > > counterintuitive.
 > >
 > >
 >
 
 
 --

 Regards
 Dmitriy Selivanov
 
 	[[alternative HTML version deleted]]
 
 __

 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel
 



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Serguei Sokol
Ingenieur de recherche INRA

Cellule mathématiques
LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 62 25 01 27
email: so...@insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] compairing doubles

2018-08-31 Thread Serguei Sokol

Le 31/08/2018 à 16:25, Mark van der Loo a écrit :

Ah, my bad, you're right of course.

sum(abs(diff(diff(sort(x))))) < eps

for some reasonable eps then, would do as a oneliner, or

all(abs(diff(diff(sort(x)))) < eps)

or

max(abs(diff(diff(sort(x))))) < eps

Or with only four function calls:
diff(range(diff(sort(x)))) < eps

Serguei.
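
A quick check on the vector from the quoted report below (my
illustration; eps = 1e-8 is an arbitrary choice):

x <- c(1, 1.1, 1.2, 1.3)   # fails the exact '==' test in the report
diff(range(diff(sort(x)))) < 1e-8
# [1] TRUE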



-Mark

Op vr 31 aug. 2018 om 16:14 schreef Iñaki Ucar :


El vie., 31 ago. 2018 a las 16:00, Mark van der Loo
() escribió:

how about

is_evenly_spaced <- function(x,...) all.equal(diff(sort(x)),...)

This doesn't work, because

1. all.equal does *not* return FALSE. Use of isTRUE or identical(.,
TRUE) is required if you want a boolean.
2. all.equal compares two objects, not elements in a vector.

Iñaki


(use ellipsis to set tolerance if necessary)


Op vr 31 aug. 2018 om 15:46 schreef Emil Bode :

Agreed that it's rounding error, and all.equal would be the way to go.
I wouldn't call it a bug, it's simply part of working with floating
point numbers, any language has the same issue.

And while we're at it, I think the function can be a lot shorter:
.is_continous_evenly_spaced <- function(n){
   length(n) > 1 && isTRUE(all.equal(n[order(n)],
     seq(from = min(n), to = max(n), length.out = length(n))))
}

Cheers, Emil

 El vie., 31 ago. 2018 a las 15:10, Felix Ernst
 () escribió:
 >
 > Dear all,
 >
 > I am a bit unsure whether this qualifies as a bug, but it is
definitely strange behaviour. That's why I wanted to discuss it.

 >
 > With the following function, I want to test for evenly spaced
numbers, starting from anywhere.

 >
 > .is_continous_evenly_spaced <- function(n){
 >   if(length(n) < 2) return(FALSE)
 >   n <- n[order(n)]
 >   n <- n - min(n)
 >   step <- n[2] - n[1]
 >   test <- seq(from = min(n), to = max(n), by = step)
 >   if(length(n) == length(test) &&
 >  all(n == test)){
 > return(TRUE)
 >   }
 >   return(FALSE)
 > }
 >
 > > .is_continous_evenly_spaced(c(1,2,3,4))
 > [1] TRUE
 > > .is_continous_evenly_spaced(c(1,3,4,5))
 > [1] FALSE
 > > .is_continous_evenly_spaced(c(1,1.1,1.2,1.3))
 > [1] FALSE
 >
 > I expect the result for 1 and 2, but not for 3. Upon

Investigation it turns out, that n == test is TRUE for every pair, but not
for the pair of 0.2.

 >
 > The types reported are always double, however n[2] == 0.1 reports

FALSE as well.

 >
 > The whole problem is solved by switching from all(n == test) to

all(as.character(n) == as.character(test)). However that is weird, isn’t it?

 >
 > Does this work as intended? Thanks for any help, advise and

suggestions in advance.

 I guess this has something to do with how the sequence is built and
 the inherent error of floating point arithmetic. In fact, if you
 return test minus n, you'll get:

 [1] 0.000000e+00 0.000000e+00 2.220446e-16 0.000000e+00

 and the error gets bigger when you continue the sequence; e.g., this
 is for c(1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7):

 [1] 0.000000e+00 0.000000e+00 2.220446e-16 2.220446e-16 4.440892e-16
 [6] 4.440892e-16 4.440892e-16 0.000000e+00

 So, independently of this is considered a bug or not, instead of

 length(n) == length(test) && all(n == test)

 I would use the following condition:

 isTRUE(all.equal(n, test))

 Iñaki

 >
 > Best regards,
 > Felix
 >
 >
 > [[alternative HTML version deleted]]
 >
 > __
 > R-devel@r-project.org mailing list
 > https://stat.ethz.ch/mailman/listinfo/r-devel



 --
 Iñaki Ucar

 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Iñaki Ucar


[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Serguei Sokol
Ingenieur de recherche INRA

Cellule mathématiques
LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 62 25 01 27
email: so...@insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] access an element with empty name

2018-05-14 Thread Serguei Sokol

Le 14/05/2018 à 15:55, Kurt Hornik a écrit :

Serguei Sokol writes:

Hi,
I came across a case where I cannot access a list element by its empty name.
Minimal example can be constructed as
      x=list("A", 1)
      names(x)=c("a", "")
      x[["a"]]
      #[1]  "A"
      x[[""]]
      #NULL
      x$`a`
      #[1]  "A"
      x$``
      # Error: attempt to use zero-length variable name
      # but we can still access the second element by its index
      x[[2]]
      #[1] 1
To my mind, it should be perfectly legal to access an element by an
empty name as we can have for example
      match("", names(x))
      #[1] 2
Hence a traditional question: is it a bug or feature?

A feature according to the docs: ? Extract says

 Neither empty (‘""’) nor ‘NA’ indices match any names, not even
 empty nor missing names.



Thanks Kurt, I missed that one.
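
For completeness, a small workaround sketch (my addition): match() does
see empty names, so the element can still be reached indirectly:

x <- list("A", 1)
names(x) <- c("a", "")
x[[match("", names(x))]]
# [1] 1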

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] access an element with empty name

2018-05-14 Thread Serguei Sokol

Hi,

I came across a case where I cannot access a list element by its empty name.
Minimal example can be constructed as

    x=list("A", 1)
    names(x)=c("a", "")
    x[["a"]]
    #[1]  "A"
    x[[""]]
    #NULL
    x$`a`
    #[1]  "A"
    x$``
    # Error: attempt to use zero-length variable name
    # but we can still access the second element by its index
    x[[2]]
    #[1] 1

To my mind, it should be perfectly legal to access an element by an 
empty name as we can have for example

    match("", names(x))
    #[1] 2
Hence a traditional question: is it a bug or feature?

Best,
Serguei.


> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Mageia 6

Matrix products: default
BLAS/LAPACK: /home/opt/OpenBLAS/lib/libopenblas_sandybridge-r0.3.0.dev.so

locale:
[1] C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods base

loaded via a namespace (and not attached):
[1] compiler_3.5.0

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] mean(x) for ALTREP

2018-04-26 Thread Serguei Sokol

Hi,

By looking at a doc about ALTREP 
https://svn.r-project.org/R/branches/ALTREP/ALTREP.html (by the way 
congratulations for that and for R-3.5.0 in general), I was a little bit 
surprised by the following example:


> x <- 1:1e10
> system.time(print(mean(x)))
[1] 5e+09
   user  system elapsed
 38.520   0.008  38.531

Taking 38.520 s to calculate the mean of an arithmetic sequence
seemed a lot to me. It probably means that the calculation is done by
running a for loop over the elements, while for an arithmetic sequence the
mean can simply be calculated as (b+e)/2, where b and e are the first
and last values respectively. Is it planned to take advantage of ALTREP for
functions like mean(), sum(), min(), max() and some others to avoid
running a for loop wherever possible? It seems so natural to me, but
perhaps some implementation detail preventing this escapes me.
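
For the example above, the closed form is just (plain arithmetic,
nothing ALTREP-specific):

n <- 1e10
(1 + n) / 2
# [1] 5e+09   -- the value mean(x) printed above, without touching 1e10 elements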


Best,
Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R Bug: write.table for matrix of more than 2, 147, 483, 648 elements

2018-04-19 Thread Serguei Sokol

Le 19/04/2018 à 12:15, Tomas Kalibera a écrit :

On 04/19/2018 11:47 AM, Serguei Sokol wrote:





replace
    tmp = EncodeElement2(x, i + j*nr, quote_col[j], qmethod,
                    , sdec);
by
    tmp = EncodeElement2(VECTOR_ELT(x, (R_xlen_t)i + j*nr), 0, 
quote_col[j], qmethod,

                    , sdec);


Unfortunately we can't do that, x is a matrix of an atomic vector 
type. VECTOR_ELT is taking elements of a generic vector, so it cannot 
be applied to "x". But even if we extracted a single element from "x" 
(e.g. via a type-switch etc), we would not be able to pass it to 
EncodeElement0 which expects a full atomic vector (that is, including 
its header). Instead we would have to call functions like 
EncodeInteger, EncodeReal0, etc on the individual elements. Which is 
then the same as changing EncodeElement0 or implementing a new version 
of it. This does not seem that hard to fix, just is not as trivial as 
changing the cast..


Thanks Tomas for this detailed explanation.

I would also like to signal a problem with the list. It must be
compromised in some way because besides Tomas's response I've got five
or six (so far) dating spam messages, all of them coming from two addresses:
Kristina Oliynik <kristinaoliynik604...@kw.taluss.com> and Samantha
Smith <samanthasmith317...@kw.fefty.com>.


Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R Bug: write.table for matrix of more than 2, 147, 483, 648 elements

2018-04-19 Thread Serguei Sokol

Le 19/04/2018 à 09:30, Tomas Kalibera a écrit :

On 04/19/2018 02:06 AM, Duncan Murdoch wrote:

On 18/04/2018 5:08 PM, Tousey, Colton wrote:

Hello,

I want to report a bug in R that is limiting my capabilities to 
export a matrix with write.csv or write.table with over 
2,147,483,648 elements (C's int limit). I found this bug already 
reported about before: 
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17182. However, 
there appears to be no solution or fixes in upcoming R version 
releases.


The error message is coming from the writetable part of the utils 
package in the io.c source 
code(https://svn.r-project.org/R/trunk/src/library/utils/src/io.c):

/* quick integrity check */
 if(XLENGTH(x) != (R_len_t)nr * nc)
     error(_("corrupt matrix -- dims not not match length"));


The issue is that nr*nc is an integer and the size of my matrix, 2.8 
billion elements, exceeds C's limit, so the check forces the code to 
fail.


Yes, looks like a typo:  R_len_t is an int, and that's how nr was 
declared.  It should be R_xlen_t, which is bigger on machines that 
support big vectors.


I haven't tested the change; there may be something else in that 
function that assumes short vectors.
Indeed, I think the function won't work for long vectors because of 
EncodeElement2 and EncodeElement0. EncodeElement2/0 would have to be 
changed, including their signatures


That would be a definitive fix, but before such deep rewriting is
undertaken, maybe the following small fix (in addition to "(R_xlen_t)nr *
nc") would be sufficient for cases where nr and nc are in the int range but
their product can reach the long vector limit:


replace
    tmp = EncodeElement2(x, i + j*nr, quote_col[j], qmethod,
                    , sdec);
by
    tmp = EncodeElement2(VECTOR_ELT(x, (R_xlen_t)i + j*nr), 0, 
quote_col[j], qmethod,

                    , sdec);

Serguei

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] cat(fill=N)

2018-03-16 Thread Serguei Sokol

Le 16/03/2018 à 17:10, David Hugh-Jones a écrit :

Hi all,

I expect I'm getting something wrong, but

cat("foo bar baz foo bar baz foo bar baz", fill = 10)

should be broken into lines of width 10, whereas I get:


cat("foo bar baz foo bar baz foo bar baz", fill = 10)

foo bar baz foo bar baz foo bar baz

On the other side, if I do
> cat(strsplit("foo bar baz foo bar baz foo bar baz", " ")[[1]], fill = 10)
I get the expected result:

foo bar
baz foo
bar baz
foo bar
baz

Which suggests that cat() doesn't break elements of the submitted character vector
but puts enough of them on each line to fill the requested width.

Serguei.



This is on R 3.4.3, but I don't see mentions of it fixed in 3.4.4 or
r-devel NEWS.

Cheers,
David

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Inconsistent rank in qr()

2018-01-23 Thread Serguei Sokol

Le 23/01/2018 à 08:47, Martin Maechler a écrit :

Serguei Sokol <so...@insa-toulouse.fr>
 on Mon, 22 Jan 2018 17:57:47 +0100 writes:

 > Le 22/01/2018 à 17:40, Keith O'Hara a écrit :
 >> This behavior is noted in the qr documentation, no?
 >>
 >> rank - the rank of x as computed by the decomposition(*): always full 
rank in the LAPACK case.
 > For me a "full rank matrix" is a matrix whose rank is indeed
min(nrow(A), ncol(A)),
 > but here the meaning of "always full rank" is somewhat confusing.
Does it mean
 > that only full rank matrices may be submitted to qr() when LAPACK=TRUE?
 > Maybe there is a jargon in which "full rank" is a synonym of min(nrow(A),
ncol(A)) for any matrix,
 > but the fix to stick with the commonly accepted rank definition (i.e. the
number of linearly independent
 > columns of A) is so easy. Why exclude the LAPACK case from it (even if
properly documented)?

Because 99.5% of caller to qr()  never look at '$rank',
so why should we compute it every time qr() is called?

1. Because R already does it for LINPACK, so it would be consistent to do so for
LAPACK too.
2. Because R claims that it is part of the returned qr object.
3. Because its calculation is a negligible fraction of the QR decomposition itself.



==> Matrix :: rankMatrix() does use "qr" as one of its several methods.

--

As wiser people than me have said (I'm paraphrasing, don't find a nice 
citation):

   While the rank of a matrix is a very well defined concept in
   mathematics (theory), its practical computation on a finite
   precision computer is much more challenging.

True. It does indeed depend on round-off errors during the QR calculation and on
the tolerance setting, but setting it simply to min(nrow(A), ncol(A)) and still
calling it the rank, or "full rank", is by far the most misleading choice to my mind.

Once again, if we are already calculating it for LINPACK, let's do it in a
consistent way for LAPACK too. I can propose a patch if you wish.

Serguei.



The ?rankMatrix  help page (package Matrix, part of your R)
https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/rankMatrix.html
starts with the following 'Description'

__ Compute ‘the’ matrix rank, a well-defined functional in theory(*), somewhat 
ambigous in practice. We provide several methods, the default corresponding to 
Matlab's definition.

__ (*) The rank of a n x m matrix A, rk(A) is the maximal number of linearly 
independent columns (or rows); hence rk(A) <= min(n,m).


 >>> On Jan 22, 2018, at 11:21 AM, Serguei Sokol <so...@insa-toulouse.fr> 
wrote:
 >>>
 >>> Hi,
 >>>
 >>> I have noticed different rank values calculated by qr() depending on
 >>> LAPACK parameter. When it is FALSE (default) a true rank is estimated 
and returned.
 >>> Unfortunately, when LAPACK is set to TRUE, the min(nrow(A), ncol(A)) 
is returned
 >>> which is only occasionally a true rank.
 >>>
 >>> Would not it be more consistent to replace the rank in the latter case 
by something
 >>> based on the following pseudo code ?
 >>>
 >>> d=abs(diag(qr))
 >>> rank=sum(d >= d[1]*tol)
 >>>
 >>> Here, we rely on the fact column pivoting is activated in the called 
lapack routine (dgeqp3)
 >>> and diagonal term in qr matrix are put in decreasing order (according 
to their absolute values).
 >>>
 >>> Serguei.
 >>>
     >>> How to reproduce:
 >>>
 >>> a=diag(2)
 >>> a[2,2]=0
 >>> qaf=qr(a, LAPACK=FALSE)
 >>> qaf$rank # shows 1. OK it's the true rank value
 >>> qat=qr(a, LAPACK=TRUE)
 >>> qat$rank #shows 2. Bad, it's not the expected value.
 >>>

 > --
 > Serguei Sokol
 > Ingenieur de recherche INRA

 > Cellule mathématique
 > LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
 > 135 Avenue de Rangueil
 > 31077 Toulouse Cedex 04

 > tel: +33 5 6155 9849
 > email: so...@insa-toulouse.fr
 > http://www.lisbp.fr



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Inconsistent rank in qr()

2018-01-22 Thread Serguei Sokol

Le 22/01/2018 à 17:40, Keith O'Hara a écrit :

This behavior is noted in the qr documentation, no?

rank - the rank of x as computed by the decomposition(*): always full rank in 
the LAPACK case.

For me a "full rank matrix" is a matrix whose rank is indeed
min(nrow(A), ncol(A)),
but here the meaning of "always full rank" is somewhat confusing. Does it
mean
that only full rank matrices may be submitted to qr() when LAPACK=TRUE?
Maybe there is a jargon in which "full rank" is a synonym of min(nrow(A),
ncol(A)) for any matrix,
but the fix to stick with the commonly accepted rank definition (i.e. the number of
linearly independent
columns of A) is so easy. Why exclude the LAPACK case from it (even if properly
documented)?






On Jan 22, 2018, at 11:21 AM, Serguei Sokol <so...@insa-toulouse.fr> wrote:

Hi,

I have noticed different rank values calculated by qr() depending on
LAPACK parameter. When it is FALSE (default) a true rank is estimated and 
returned.
Unfortunately, when LAPACK is set to TRUE, the min(nrow(A), ncol(A)) is returned
which is only occasionally a true rank.

Would not it be more consistent to replace the rank in the latter case by 
something
based on the following pseudo code ?

d=abs(diag(qr))
rank=sum(d >= d[1]*tol)

Here, we rely on the fact column pivoting is activated in the called lapack 
routine (dgeqp3)
and diagonal term in qr matrix are put in decreasing order (according to their 
absolute values).

Serguei.

How to reproduce:

a=diag(2)
a[2,2]=0
qaf=qr(a, LAPACK=FALSE)
qaf$rank # shows 1. OK it's the true rank value
qat=qr(a, LAPACK=TRUE)
qat$rank #shows 2. Bad, it's not the expected value.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Serguei Sokol
Ingenieur de recherche INRA

Cellule mathématique
LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 6155 9849
email: so...@insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Inconsistent rank in qr()

2018-01-22 Thread Serguei Sokol

Hi,

I have noticed different rank values calculated by qr() depending on
LAPACK parameter. When it is FALSE (default) a true rank is estimated and 
returned.
Unfortunately, when LAPACK is set to TRUE, the min(nrow(A), ncol(A)) is returned
which is only occasionally a true rank.

Would not it be more consistent to replace the rank in the latter case by 
something
based on the following pseudo code ?

d=abs(diag(qr))
rank=sum(d >= d[1]*tol)

Here, we rely on the fact column pivoting is activated in the called lapack 
routine (dgeqp3)
and diagonal term in qr matrix are put in decreasing order (according to their 
absolute values).

Serguei.

How to reproduce:

a=diag(2)
a[2,2]=0
qaf=qr(a, LAPACK=FALSE)
qaf$rank # shows 1. OK it's the true rank value
qat=qr(a, LAPACK=TRUE)
qat$rank #shows 2. Bad, it's not the expected value.
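
A rough sketch of what that pseudo code amounts to here (tol = 1e-7 is an
arbitrary choice; qat$qr stores the compact QR, so diag(qat$qr) is the
diagonal of R):

a <- diag(2); a[2, 2] <- 0
qat <- qr(a, LAPACK = TRUE)
d <- abs(diag(qat$qr))     # decreasing thanks to column pivoting
tol <- 1e-7
sum(d >= d[1] * tol)
# [1] 1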

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] wrong matrix dimension in sparseQR

2018-01-19 Thread Serguei Sokol

I have found explanation in the comments to original cs_qr() matlab function
http://users.encs.concordia.ca/~krzyzak/R%20Code-Communications%20in%20Statistics%20and%20Simulation%202014/Zubeh%F6r/SuiteSparse/CSparse3/MATLAB/CSparse/cs_qr.m

"If A is structurally rank deficient, additional empty
 rows may have been added to V and R."

To my mind, it could be useful to mention this in the manual of Matrix::qr
and/or sparseQR-class too, because in the case of an augmented nrow
we can no longer do A[p+1,] as shown in the example in ?"sparseQR-class".

My 2 cents.
Serguei.

Le 18/01/2018 à 16:13, Serguei Sokol a écrit :

Hi,

I came across a case when the dimensions of matrices returned by qr()
operated on a sparse matrix does not coincide with the initial matrix.

If A is structurally rank deficient, additional empty
%   rows may have been added to V and R.


Here is a snippet that should produce an example (one of many that I could
provide):

 m=205
 n=199
 set.seed(7);
 a=matrix(rnorm(m*n), m, n)
 a[sample(seq(m*n), m*(n-4))]=0
 a=as(a, "Matrix")
 qa=qr(a);
 stopifnot(nrow(qa@R) == m)
 # On my box I have nrow(qa@R) == 207 while it should be 205 (as m is)

Note that for m=203 and n=197, the same code produces the right (i.e. coinciding)
dimensions.

Have I missed something?

Serguei.

> R.version
platform   x86_64-pc-linux-gnu
arch   x86_64
os linux-gnu
system x86_64, linux-gnu
status
major  3
minor  4.3
year   2017
month  11
day    30
svn rev    73796
language   R
version.string R version 3.4.3 (2017-11-30)
nickname   Kite-Eating Tree




--
Serguei Sokol
Ingenieur de recherche INRA

Cellule mathématique
LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 6155 9849
email: so...@insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] wrong matrix dimension in sparseQR

2018-01-18 Thread Serguei Sokol

Hi,

I came across a case when the dimensions of matrices returned by qr()
operated on a sparse matrix does not coincide with the initial matrix.

Here is a snippet that should produce an example (one of many that I could
provide):

 m=205
 n=199
 set.seed(7);
 a=matrix(rnorm(m*n), m, n)
 a[sample(seq(m*n), m*(n-4))]=0
 a=as(a, "Matrix")
 qa=qr(a);
 stopifnot(nrow(qa@R) == m)
 # On my box I have nrow(qa@R) == 207 while it should be 205 (as m is)

Note that for m=203 and n=197, the same code produces the right (i.e. coinciding)
dimensions.

Have I missed something?

Serguei.

> R.version
platform   x86_64-pc-linux-gnu
arch   x86_64
os linux-gnu
system x86_64, linux-gnu
status
major  3
minor  4.3
year   2017
month  11
day    30
svn rev    73796
language   R
version.string R version 3.4.3 (2017-11-30)
nickname   Kite-Eating Tree

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Extreme bunching of random values from runif with Mersenne-Twister seed

2017-11-06 Thread Serguei Sokol

Le 05/11/2017 à 15:17, Duncan Murdoch a écrit :

On 04/11/2017 10:20 PM, Daniel Nordlund wrote:

Tirthankar,

"random number generators" do not produce random numbers.  Any given
generator produces a fixed sequence of numbers that appear to meet
various tests of randomness.  By picking a seed you enter that sequence
in a particular place and subsequent numbers in the sequence appear to
be unrelated.  There are no guarantees that if YOU pick a SET of seeds
they won't produce a set of values that are of a similar magnitude.

You can likely solve your problem by following Radford Neal's advice of
not using the the first number from each seed.  However, you don't need
to use anything more than the second number.  So, you can modify your
function as follows:

function(x) {
    set.seed(x, kind = "default")
    y = runif(2, 17, 26)
    return(y[2])
  }

Hope this is helpful,


That's assuming that the chosen seeds are unrelated to the function output, which seems unlikely on the face of it.  You can certainly choose a set of seeds 
that give high values on the second draw just as easily as you can choose seeds that give high draws on the first draw.

To confirm this statement, I did

s2_25=s[sapply(s, function(i) {set.seed(i); runif(2, 17, 26)[2] > 25})]
length(s2_25) # 48990

For the record, we had
length(s25) # 48631 out of 439166

which is a very similar count.
So if we take the second or even the 10th pseudo-random value we can
land just as easily (or as hard) on a seed set giving some narrow range of outputs.

Serguei.



The interesting thing about this problem is that Tirthankar doesn't believe that the seed selection process is aware of the function output.  I would say that 
it must be, and he should be investigating how that happens if he is worried about the output, he shouldn't be worrying about R's RNG.


Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Extreme bunching of random values from runif with Mersenne-Twister seed

2017-11-03 Thread Serguei Sokol
584144", "86584272", "86620568",
 > "86724613", "86756002", "86768593", "86772411",
 > "86781516", "86794389", "86805854", "86814600",
 > "86835092", "86874179", "86876466", "86901193",
 > "86987847", "86988080")

 >  random_values = sapply(seeds, function(x) {
 >   set.seed(x)
 >   y = runif(1, 17, 26)
 >   return(y)
 > })

Why do you do that?

1) You should set the seed *once*, not multiple times in one simulation.


This code is written like this since this seed is set every time the
function (API) is called for call-level replicability. It doesn't make a
lot of sense in an MRE, but this is a critical component of the larger
function. We do acknowledge that for any one of the seeds in the vector
`seeds` the vector of draws appears to have the uniform distribution.



2) Assuming that your strings are correctly translated to integers
and the same on all platforms, independent of locales (!) etc,
you are again not following the simple instruction on the help page:

  ‘set.seed’ uses a single integer argument to set as many seeds as
  are required.  It is intended as a simple way to get quite
  different seeds by specifying small integer arguments, and also as
  .
  .

Note:   ** small ** integer
Why do you assume   86901193  to be a small integer ?


Because 86901193/2^32 = 0.02. What is a "small integer"?



 > This gives values that are **extremely** bunched together.

 >> summary(random_values)
 >    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 >   25.13   25.36   25.66   25.58   25.83   25.94

 > This behaviour of `runif` goes away when we use `kind =
 > "Knuth-TAOCP-2002"`, and we get values that appear to be
 > much more evenly spread out.

 > random_values = sapply(seeds, function(x) {
 > set.seed(x, kind = "Knuth-TAOCP-2002") y = runif(1, 17,
 > 26) return(y) })

 > *Output omitted.*

 > ---

 > **The most interesting thing here is that this does not
 > happen on Windows -- only happens on Ubuntu**
 > (`sessionInfo` output for Ubuntu & Windows below).

 > # Windows output: #

 >> seeds = c(
 > + "86548915", "86551615", "86566163", "86577411",
 > "86584144", + "86584272", "86620568", "86724613",
 > "86756002", "86768593", "86772411", + "86781516",
 > "86794389", "86805854", "86814600", "86835092",
 > "86874179", + "86876466", "86901193", "86987847",
 > "86988080")
 >>
 >> random_values = sapply(seeds, function(x) {
 > + set.seed(x) + y = runif(1, 17, 26) + return(y) + })
 >>
 >> summary(random_values)
 >    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 >   17.32   20.14   23.00   22.17   24.07   25.90

 > Can someone help understand what is going on?

 > Ubuntu
 > --

 > R version 3.4.0 (2017-04-21)
 > Platform: x86_64-pc-linux-gnu (64-bit)
 > Running under: Ubuntu 16.04.2 LTS

You have not learned to get a current version of R.
===> You should not write to R-devel (sorry if this may sound harsh ..)


We do spend a while on certain versions of R since upgrading our systems in
production is not something we are able to do frequently & this version is
only 6 months old. However, addressing your concern, upgrading to R 3.4.2
leaves the output unchanged.



- - - - -
After doing all this, your problem may still be just
because you are using much too large integers for the 'seed'
argument of set.seed()


Note that multiplying the reported set of seeds by 10, results in expected
output, so not clear if there is a sweet spot that bugs out the
Mersenne-Twister algorithm:

seeds = c(86548915L, 86551615L, 86566163L, 86577411L, 86584144L, 86584272L,
   86620568L, 86724613L, 86756002L, 86768593L, 86772411L, 86781516L,
   86794389L, 86805854L, 86814600L, 86835092L, 86874179L, 86876466L,
   86901193L, 86987847L, 86988080L)*10

random_values = sapply(seeds, function(x) {
   set.seed(x)
   y = runif(1, 17, 26)
   return(y)
})

summary(random_values)




I really really strongly believe you should have used R-help
instead of R-devel.

Best,
Martin Maechler


If you continue to believe with the inputs given in this reply that this
should be on R-help, we will switch over.

Your continued help would be appreciated in understanding the issue.

T

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Serguei Sokol
Ingenieur de recherche INRA

Cellule mathématique
LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 6155 9849
email: so...@insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [New Patch] Fix disk corruption when writing

2017-07-10 Thread Serguei Sokol

Le 10/07/2017 à 13:13, Duncan Murdoch a écrit :

On 10/07/2017 5:34 AM, Serguei Sokol wrote:

Le 10/07/2017 à 11:19, Duncan Murdoch a écrit :

On 10/07/2017 4:54 AM, Serguei Sokol wrote:

Le 08/07/2017 à 00:54, Duncan Murdoch a écrit :

I have now committed changes to R-devel (rev 72898) that seem to catch large 
and small errors.  They only give a warning if the error happens when the
connection is closed, because that can happen asynchronously

For this asynchronous behavior, wouldn't it be more useful to have
the name of the file that failed to close? If many files were open
during a session and not closed explicitly (yes, bad practice, but it
can happen), the warning message doesn't help to understand
which of the files was corrupted, e.g.:
 > fc=file("/dev/full", "w")
 > write.csv("a", file=fc)
 > q("yes")
Warning message:
In close.connection(getConnection(set[i])) :
   Problem closing connection:  No space left on device

Having only "set[i]" for indication is not very informative, is it?


To debug your failure to close fc, reproduce the conditions before the warning 
was issued, and call showConnections().

It can help in some cases but not in all.
First, reproducing the exact conditions of the failure is not always possible: it
could happen after a long calculation, and the environment that caused
the failure may have changed in the meantime. And second, having the list of
connections still does not say which one (or several) failed, as
we have only "set[i]", not even the connection number (which in turn
may not be the same between the first failure and an attempt to reproduce
it).

Is adding con->description to the warning message problematic in any sens ?


Yes, we don't know if it is still valid after the connection has been closed.  It's just a pointer, whose target is allocated when the connection is created, 
and deallocated when it is closed. Using it after closing could lead to a seg fault. 

If you mean "free(con->description);" which is in con_close1() at 
connections.c:3536
it occurs after calling checkClose(). Then logically, con-description is still 
valid
during generation of warning message.

Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [New Patch] Fix disk corruption when writing

2017-07-10 Thread Serguei Sokol

Le 10/07/2017 à 11:19, Duncan Murdoch a écrit :

On 10/07/2017 4:54 AM, Serguei Sokol wrote:

Le 08/07/2017 à 00:54, Duncan Murdoch a écrit :

I have now committed changes to R-devel (rev 72898) that seem to catch large 
and small errors.  They only give a warning if the error happens when the
connection is closed, because that can happen asynchronously

For this asynchronous behavior, wouldn't it be more useful to have
the name of the file that failed to close? If many files were open
during a session and not closed explicitly (yes, bad practice, but it
can happen), the warning message doesn't help to understand
which of the files was corrupted, e.g.:
 > fc=file("/dev/full", "w")
 > write.csv("a", file=fc)
 > q("yes")
Warning message:
In close.connection(getConnection(set[i])) :
   Problem closing connection:  No space left on device

Having only "set[i]" for indication is not very informative, is it?


To debug your failure to close fc, reproduce the conditions before the warning 
was issued, and call showConnections().

It can help in some cases but not in all.
First, reproducing the exact conditions of the failure is not always possible: it
could happen after a long calculation, and the environment that caused
the failure may have changed in the meantime. And second, having the list of
connections still does not say which one (or several) failed, as
we have only "set[i]", not even the connection number (which in turn
may not be the same between the first failure and an attempt to reproduce
it).

Is adding con->description to the warning message problematic in any sens ?

Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [New Patch] Fix disk corruption when writing

2017-07-10 Thread Serguei Sokol

Le 08/07/2017 à 00:54, Duncan Murdoch a écrit :
I have now committed changes to R-devel (rev 72898) that seem to catch large and small errors.  They only give a warning if the error happens when the 
connection is closed, because that can happen asynchronously

For this asynchronous behavior, wouldn't it be more useful to have
the name of the file that failed to close? If many files were open
during a session and not closed explicitly (yes, bad practice, but it
can happen), the warning message doesn't help to understand
which of the files was corrupted, e.g.:
> fc=file("/dev/full", "w")
> write.csv("a", file=fc)
> q("yes")
Warning message:
In close.connection(getConnection(set[i])) :
  Problem closing connection:  No space left on device

Having only "set[i]" for indication is not very informative, is it?
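
As a purely user-level illustration (not part of any patch; safe_close is
a made-up helper), one can keep the description and check what close()
returns:

safe_close <- function(con) {
  desc <- summary(con)$description          # e.g. "/dev/full"
  status <- close(con)                      # file() connections return -1 on failure
  if (!is.null(status) && status < 0)
    warning("problem closing '", desc, "'")
  invisible(status)
}

fc <- file("/dev/full", "w")
write.csv("a", file = fc)
safe_close(fc)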

Serguei.


: I didn't want to mess up some later unrelated computation that triggered 
garbage collection.

I will wait a while before porting these to R-patched, because there may still 
be some problems to clean up.

Duncan Murdoch



On 07/07/2017 11:42 AM, Duncan Murdoch wrote:

On 07/07/2017 11:13 AM, Serguei Sokol wrote:

Le 07/07/2017 à 16:52, Duncan Murdoch a écrit :

On 07/07/2017 9:54 AM, Serguei Sokol wrote:

Le 07/07/2017 à 01:09, Duncan Murdoch a écrit :

On 06/07/2017 6:44 PM, Sokol Serguei wrote:

Duncan Murdoch has written at Thu, 6 Jul 2017 13:58:10 -0400

On 06/07/2017 5:21 AM, Serguei Sokol wrote:

I propose the following patch against the current
R-devel/src/main/connection.c (cf. attached file).
It gives (on my linux box):
 > fc=file("/dev/full", "w")
 > write.csv("a", file=fc)
Error in writeLines(paste(col.names, collapse = sep), file, sep = eol) :
   system call failure on writing
 > close(fc)

Serguei.


I suspect that flushing on every write will slow things down too much.

That's quite plausible.



I think a better approach is to catch the error in the Rconn_printf
calls (as R-devel currently does), and also catch errors on
con->close(con).  This one requires more changes to the source, so it
may be a day or two before I commit.

I have tested it on file_close() and it works (cf. attached patch):

fc=file("/dev/full", "w")
write.csv("a", file=fc)
close(fc)

Error in close.connection(fc) : closing file failed



One thing I have to look into:  is anyone making use of the fact that
the R-level close.connection() function can return -1 to signal an
error?  Base R ignores that, which is why we need to do something, but
if packages are using it, things need to be changed carefully.  I
can't just change it to raise an error instead.

As you can see in the patch, no need to change close.connection() function
but we have to add a test of con->status to all *_close() functions
(gzfile_close() and co.)


You missed my point.  Currently the R close() function may return -1 to signal that there was an error closing.  We can't change that to an error if 
packages

are using it.

Maybe I missed it, but in the end I too was saying that we don't have to do so.
Anyhow, the situation of writing to a full disk cannot be passed over in silence.
IMHO, triggering an error would be the most appropriate in this situation, but if
for legacy or any other reason we cannot do so, let it at least emit a warning.
Here are a few tests with a new small patch:
 > fc=file("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
[1] -1
Warning message:
In close.connection(fc) :
   closing '/dev/full' failed: No space left on device
 > fc=gzfile("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
NULL
Warning message:
In close.connection(fc) :
   closing '/dev/full' failed: No space left on device
 > fc=xzfile("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
NULL
Warning message:
In close.connection(fc) :
   closing '/dev/full' failed: No space left on device
 > fc=bzfile("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
NULL
Warning message:
In close.connection(fc) :
   closing '/dev/full' failed: No space left on device

Note that if we test only status < 0 (without errno) then there are too many 
warnings
on seemingly "innocent" file closings.


Could you give an example of how to get status < 0 on a valid closing?

If you remove "&& errno" and leave only "if (status < 0)" in the previous patch
then during make I have many warnings, e.g. :
Warning messages:
1: In close.connection(con) :
   closing '/home/sokol/dev/R/R-devel/library/compiler/Meta/nsInfo.rds' failed: 
Success
2: In close.connection(con) :
   closing '/home/sokol/dev/R/R-devel/library/compiler/Meta/package.rds' 
failed: Success
3: In close(con) :
   closing '/home/sokol/dev/R/R-devel/library/compiler/R/compiler.rdx' f

Re: [Rd] [New Patch] Fix disk corruption when writing

2017-07-07 Thread Serguei Sokol

Le 07/07/2017 à 16:52, Duncan Murdoch a écrit :

On 07/07/2017 9:54 AM, Serguei Sokol wrote:

Le 07/07/2017 à 01:09, Duncan Murdoch a écrit :

On 06/07/2017 6:44 PM, Sokol Serguei wrote:

Duncan Murdoch has written at  Thu, 6 Jul 2017 13:58:10 -0400

On 06/07/2017 5:21 AM, Serguei Sokol wrote:

I propose the following patch against the current
R-devel/src/main/connection.c (cf. attached file).
It gives (on my linux box):
 > fc=file("/dev/full", "w")
 > write.csv("a", file=fc)
Error in writeLines(paste(col.names, collapse = sep), file, sep = eol) :
   system call failure on writing
 > close(fc)

Serguei.


I suspect that flushing on every write will slow things down too much.

That's quite plausible.



I think a better approach is to catch the error in the Rconn_printf
calls (as R-devel currently does), and also catch errors on
con->close(con).  This one requires more changes to the source, so it
may be a day or two before I commit.

I have tested it on file_close() and it works (cf. attached patch):

fc=file("/dev/full", "w")
write.csv("a", file=fc)
close(fc)

Error in close.connection(fc) : closing file failed



One thing I have to look into:  is anyone making use of the fact that
the R-level close.connection() function can return -1 to signal an
error?  Base R ignores that, which is why we need to do something, but
if packages are using it, things need to be changed carefully.  I
can't just change it to raise an error instead.

As you can see in the patch, no need to change close.connection() function
but we have to add a test of con->status to all *_close() functions
(gzfile_close() and co.)


You missed my point.  Currently the R close() function may return -1 to signal 
that there was an error closing.  We can't change that to an error if packages
are using it.

Maybe I missed it, but in the end I too was saying that we don't have to do so.
Anyhow, the situation of writing to a full disk cannot be passed over in silence.
IMHO, triggering an error would be the most appropriate in this situation, but if
for legacy or any other reason we cannot do so, let it at least emit a warning.
Here are a few tests with a new small patch:
 > fc=file("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
[1] -1
Warning message:
In close.connection(fc) :
   closing '/dev/full' failed: No space left on device
 > fc=gzfile("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
NULL
Warning message:
In close.connection(fc) :
   closing '/dev/full' failed: No space left on device
 > fc=xzfile("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
NULL
Warning message:
In close.connection(fc) :
   closing '/dev/full' failed: No space left on device
 > fc=bzfile("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
NULL
Warning message:
In close.connection(fc) :
   closing '/dev/full' failed: No space left on device

Note that if we test only status < 0 (without errno) then there are too many 
warnings
on seemingly "innocent" file closings.


Could you give an example of how to get status < 0 on a valid closing? 

If you remove "&& errno" and leave only "if (status < 0)" in the previous patch
then during make I have many warnings, e.g. :
Warning messages:
1: In close.connection(con) :
  closing '/home/sokol/dev/R/R-devel/library/compiler/Meta/nsInfo.rds' failed: 
Success
2: In close.connection(con) :
  closing '/home/sokol/dev/R/R-devel/library/compiler/Meta/package.rds' failed: 
Success
3: In close(con) :
  closing '/home/sokol/dev/R/R-devel/library/compiler/R/compiler.rdx' failed: 
Success
4: In close.connection(con) :
  closing '/home/sokol/dev/R/R-devel/library/tools/Meta/nsInfo.rds' failed: 
Success
5: In close.connection(con) :
  closing '/home/sokol/dev/R/R-devel/library/tools/Meta/package.rds' failed: 
Success
6: In close(con) :
  closing '/home/sokol/dev/R/R-devel/library/tools/R/tools.rdx' failed: Success
7: In close(con) :
  closing '/home/sokol/dev/R/R-devel/library/tools/R/sysdata.rdx' failed: 
Success
8: In close.connection(con) :
  closing '../../library/parallel/Meta/Rd.rds' failed: Success
9: In close.connection(con) :
  closing '../../library/parallel/help/aliases.rds' failed: Success
10: In close.connection(file) :
  closing '../../library/parallel/DESCRIPTION' failed: Success

Note "Succes" as the reason of "failure".

And if I use thus compiled R, at startup I get:

Warning message:
In close(con) :
  closing '/home/sokol/dev/R/R-devel/library/base/R/base.rdx' failed: Success

R Under development (unstable) (2017-06-01 r72753) -- "Unsuffered Consequences"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRA

Re: [Rd] [New Patch] Fix disk corruption when writing

2017-07-07 Thread Serguei Sokol

Le 07/07/2017 à 01:09, Duncan Murdoch a écrit :

On 06/07/2017 6:44 PM, Sokol Serguei wrote:

Duncan Murdoch has written at  Thu, 6 Jul 2017 13:58:10 -0400

On 06/07/2017 5:21 AM, Serguei Sokol wrote:

I propose the following patch against the current
R-devel/src/main/connection.c (cf. attached file).
It gives (on my linux box):
 > fc=file("/dev/full", "w")
 > write.csv("a", file=fc)
Error in writeLines(paste(col.names, collapse = sep), file, sep = eol) :
   system call failure on writing
 > close(fc)

Serguei.


I suspect that flushing on every write will slow things down too much.

That's quite plausible.



I think a better approach is to catch the error in the Rconn_printf
calls (as R-devel currently does), and also catch errors on
con->close(con).  This one requires more changes to the source, so it
may be a day or two before I commit.

I have tested it on file_close() and it works (cf. attached patch):

fc=file("/dev/full", "w")
write.csv("a", file=fc)
close(fc)

Error in close.connection(fc) : closing file failed



One thing I have to look into:  is anyone making use of the fact that
the R-level close.connection() function can return -1 to signal an
error?  Base R ignores that, which is why we need to do something, but
if packages are using it, things need to be changed carefully.  I
can't just change it to raise an error instead.

As you can see in the patch, no need to change close.connection() function
but we have to add a test of con->status to all *_close() functions
(gzfile_close() and co.)


You missed my point.  Currently the R close() function may return -1 to signal that there was an error closing.  We can't change that to an error if packages 
are using it.

Maybe I missed it, but in the end I too was saying that we don't have to do so.
Anyhow, the situation of writing to a full disk cannot be passed over in silence.
IMHO, triggering an error would be the most appropriate in this situation, but if
for legacy or any other reason we cannot do so, let it at least emit a warning.
Here are a few tests with a new small patch:
> fc=file("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
[1] -1
Warning message:
In close.connection(fc) :
  closing '/dev/full' failed: No space left on device
> fc=gzfile("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
NULL
Warning message:
In close.connection(fc) :
  closing '/dev/full' failed: No space left on device
> fc=xzfile("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
NULL
Warning message:
In close.connection(fc) :
  closing '/dev/full' failed: No space left on device
> fc=bzfile("/dev/full", "w"); write.csv("a", file=fc); (res=close(fc))
NULL
Warning message:
In close.connection(fc) :
  closing '/dev/full' failed: No space left on device

Note that if we test only status < 0 (without errno) then there are too many 
warnings
on seemingly "innocent" file closings.

Serguei.



Le 05/07/2017 à 15:33, Serguei Sokol a écrit :

Le 05/07/2017 à 14:46, Serguei Sokol a écrit :

Le 05/07/2017 à 13:09, Duncan Murdoch a écrit :

On 05/07/2017 5:26 AM, January W. wrote:

I tried the newest patch, but it does not seem to work for me (on
Linux). Despite the check in Rconn_printf, the write.csv happily
writes
to /dev/full and does not report an error. When I added a
printf("%d\n",
res); to the Rconn_printf() definition, I see only positive values
returned by the vfprintf call.



That's likely because you aren't writing enough to actually
trigger a write to disk during the write.  Writes are buffered,
and the error doesn't happen
until the buffer is written.

I can confirm this behavior with vfprintf(). Small and medium-sized writes
to /dev/full don't trigger an error, but 1 MB does.

But if fprintf() is used, it returns a negative value from the very
first byte written.

I correct myself. In my test, fprintf() returned -1 for another
reason (the connection was already closed at that moment).
However, if vfprintf(...) is followed by res = fflush(con) then res is -1
if we try to write to /dev/full. Maybe we have to use this to trigger
an error message in R.

Serguei.


  The regression test I put in had this problem; I'm working on
MacOS and Windows, so I never got to actually try it before
committing.

Unfortunately, it doesn't look possible to catch the final flush
of the buffer when the connection is closed, so small writes won't
trigger any error.

It's also possible that whatever system you're on doesn't signal
an error when the write fails.

Duncan Murdoch


Cheers,

j.


On 4 July 2017 at 21:37, Duncan Murdoch <murdoch.dun...@gmail.com
<mailto:murdoch.dun...@gmail.com>> wrote:

On 04/07/2017 11:50 AM, Jean-Sébastien Bevilacqua wrote:

Hello,
You can find here a patch to fix disk corruption.
  

Re: [Rd] [New Patch] Fix disk corruption when writing

2017-07-06 Thread Serguei Sokol

I propose the following patch against the current R-devel/src/main/connections.c
(cf. attached file).
It gives (on my Linux box):
> fc=file("/dev/full", "w")
> write.csv("a", file=fc)
Error in writeLines(paste(col.names, collapse = sep), file, sep = eol) :
  system call failure on writing
> close(fc)

Serguei.

Le 05/07/2017 à 15:33, Serguei Sokol a écrit :

Le 05/07/2017 à 14:46, Serguei Sokol a écrit :

Le 05/07/2017 à 13:09, Duncan Murdoch a écrit :

On 05/07/2017 5:26 AM, January W. wrote:

I tried the newest patch, but it does not seem to work for me (on
Linux). Despite the check in Rconn_printf, the write.csv happily writes
to /dev/full and does not report an error. When I added a printf("%d\n",
res); to the Rconn_printf() definition, I see only positive values
returned by the vfprintf call.



That's likely because you aren't writing enough to actually trigger a write to disk during the write.  Writes are buffered, and the error doesn't happen 
until the buffer is written.

I can confirm this behavior with vfprintf(). Small and medium-sized writes
to /dev/full don't trigger an error, but a 1 MB one does.

But if fprintf() is used, it returns a negative value from the very first byte 
written.

I correct myself: in my test, fprintf() returned -1 for another reason
(the connection was already closed at that moment).
However, if vfprintf(...) is followed by res = fflush(con), then res is -1
when we try to write to /dev/full. Maybe we should use this to trigger
an error message in R.

Serguei.


  The regression test I put in had this problem; I'm working on MacOS and 
Windows, so I never got to actually try it before committing.

Unfortunately, it doesn't look possible to catch the final flush of the buffer 
when the connection is closed, so small writes won't trigger any error.

It's also possible that whatever system you're on doesn't signal an error when 
the write fails.

Duncan Murdoch


Cheers,

j.


On 4 July 2017 at 21:37, Duncan Murdoch <murdoch.dun...@gmail.com
<mailto:murdoch.dun...@gmail.com>> wrote:

On 04/07/2017 11:50 AM, Jean-Sébastien Bevilacqua wrote:

Hello,
You can find here a patch to fix disk corruption.
When your disk is full, the write function exits without error
but the file is truncated.

https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17243
<https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17243>


Thanks.  I didn't see that when it came through (or did and forgot).
I'll probably move the error check to a lower level (in the
Rconn_printf function), if tests show that works.

Duncan Murdoch


__
R-devel@r-project.org <mailto:R-devel@r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
<https://stat.ethz.ch/mailman/listinfo/r-devel>




--
 January Weiner --


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel







--
Serguei Sokol
Ingenieur de recherche INRA
Metabolisme Integre et Dynamique des Systemes Metaboliques (MetaSys)

LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 6155 9276
fax: +33 5 6704 8825
email: so...@insa-toulouse.fr
http://metasys.insa-toulouse.fr
http://www.lisbp.fr

--- connections.c.orig	2017-07-05 12:07:36.514818879 +0200
+++ connections.c	2017-07-06 11:15:34.911618744 +0200
@@ -3711,16 +3711,19 @@
 return(nbuf);
 }
 
-
 int Rconn_printf(Rconnection con, const char *format, ...)
 {
 int res;
 va_list(ap);
-
 va_start(ap, format);
 /* Parentheses added for FC4 with gcc4 and -D_FORTIFY_SOURCE=2 */
 res = (con->vfprintf)(con, format, ap);
 va_end(ap);
+if (res < 0)
+	error(_("system call failure on writing"));
+res=(con->fflush)(con);
+if (res < 0)
+	error(_("system call failure on writing"));
 return res;
 }
 
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [New Patch] Fix disk corruption when writing

2017-07-05 Thread Serguei Sokol

Le 05/07/2017 à 14:46, Serguei Sokol a écrit :

Le 05/07/2017 à 13:09, Duncan Murdoch a écrit :

On 05/07/2017 5:26 AM, January W. wrote:

I tried the newest patch, but it does not seem to work for me (on
Linux). Despite the check in Rconn_printf, the write.csv happily writes
to /dev/full and does not report an error. When I added a printf("%d\n",
res); to the Rconn_printf() definition, I see only positive values
returned by the vfprintf call.



That's likely because you aren't writing enough to actually trigger a write to disk during the write.  Writes are buffered, and the error doesn't happen 
until the buffer is written.

I can confirm this behavior with vfprintf(). Small and medium-sized writes
to /dev/full don't trigger an error, but a 1 MB one does.

But if fprintf() is used, it returns a negative value from the very first byte 
written.

I correct myself: in my test, fprintf() returned -1 for another reason
(the connection was already closed at that moment).
However, if vfprintf(...) is followed by res = fflush(con), then res is -1
when we try to write to /dev/full. Maybe we should use this to trigger
an error message in R.

Serguei.


  The regression test I put in had this problem; I'm working on MacOS and 
Windows, so I never got to actually try it before committing.

Unfortunately, it doesn't look possible to catch the final flush of the buffer 
when the connection is closed, so small writes won't trigger any error.

It's also possible that whatever system you're on doesn't signal an error when 
the write fails.

Duncan Murdoch


Cheers,

j.


On 4 July 2017 at 21:37, Duncan Murdoch <murdoch.dun...@gmail.com
<mailto:murdoch.dun...@gmail.com>> wrote:

On 04/07/2017 11:50 AM, Jean-Sébastien Bevilacqua wrote:

Hello,
You can find here a patch to fix disk corruption.
When your disk is full, the write function exits without error
but the file is truncated.

https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17243
<https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17243>


Thanks.  I didn't see that when it came through (or did and forgot).
I'll probably move the error check to a lower level (in the
Rconn_printf function), if tests show that works.

Duncan Murdoch


__
R-devel@r-project.org <mailto:R-devel@r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
<https://stat.ethz.ch/mailman/listinfo/r-devel>




--
 January Weiner --


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel





--
Serguei Sokol
Ingenieur de recherche INRA
Metabolisme Integre et Dynamique des Systemes Metaboliques (MetaSys)

LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 6155 9276
fax: +33 5 6704 8825
email: so...@insa-toulouse.fr
http://metasys.insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [New Patch] Fix disk corruption when writing

2017-07-05 Thread Serguei Sokol

Le 05/07/2017 à 13:09, Duncan Murdoch a écrit :

On 05/07/2017 5:26 AM, January W. wrote:

I tried the newest patch, but it does not seem to work for me (on
Linux). Despite the check in Rconn_printf, the write.csv happily writes
to /dev/full and does not report an error. When I added a printf("%d\n",
res); to the Rconn_printf() definition, I see only positive values
returned by the vfprintf call.



That's likely because you aren't writing enough to actually trigger a write to disk during the write.  Writes are buffered, and the error doesn't happen until 
the buffer is written.

I can confirm this behavior with vfprintf(). Small and medium-sized writes
to /dev/full don't trigger an error, but a 1 MB one does.

But if fprintf() is used, it returns a negative value from the very first byte 
written.

Serguei.


  The regression test I put in had this problem; I'm working on MacOS and 
Windows, so I never got to actually try it before committing.

Unfortunately, it doesn't look possible to catch the final flush of the buffer 
when the connection is closed, so small writes won't trigger any error.

It's also possible that whatever system you're on doesn't signal an error when 
the write fails.

Duncan Murdoch


Cheers,

j.


On 4 July 2017 at 21:37, Duncan Murdoch <murdoch.dun...@gmail.com
<mailto:murdoch.dun...@gmail.com>> wrote:

On 04/07/2017 11:50 AM, Jean-Sébastien Bevilacqua wrote:

Hello,
You can find here a patch to fix disk corruption.
When your disk is full, the write function exits without error
but the file is truncated.

https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17243
<https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17243>


Thanks.  I didn't see that when it came through (or did and forgot).
I'll probably move the error check to a lower level (in the
Rconn_printf function), if tests show that works.

Duncan Murdoch


__
R-devel@r-project.org <mailto:R-devel@r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
<https://stat.ethz.ch/mailman/listinfo/r-devel>




--
 January Weiner --


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Serguei Sokol
Ingenieur de recherche INRA
Metabolisme Integre et Dynamique des Systemes Metaboliques (MetaSys)

LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 6155 9276
fax: +33 5 6704 8825
email: so...@insa-toulouse.fr
http://metasys.insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R history: Why 'L; in suffix character ‘L’ for integer constants?

2017-06-16 Thread Serguei Sokol

Le 16/06/2017 à 17:54, Henrik Bengtsson a écrit :

I'm just curious (no complaints), what was the reason for choosing the
letter 'L' as a suffix for integer constants?  Does it stand for
something (literal?), is it because it visually stands out, ..., or no
specific reason at all?

My guess is that it is inherited from the C "long integer" type (as opposed to
"short integer" or plain "integer"):
https://en.wikipedia.org/wiki/C_data_types
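
For example (the C parallel is my reading of it, not something stated in the R docs):

  typeof(1L)   # "integer" -- the L suffix asks for an integer constant
  typeof(1)    # "double"  -- the default numeric type in R
  # compare C, where 1L denotes a long int constant and plain 1 an int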

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines

2017-06-14 Thread Serguei Sokol

Le 14/06/2017 à 12:58, Andreas Kersting a écrit :

Hi,

I would really like to have a way to split long string literals across multiple 
lines in R.

...
An alternative approach could be to have something like

("aaa "
"bbb")

This is C-style and, if the core team decides to implement it,
it could be useful and intuitive.
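
For comparison, the workaround available today is explicit concatenation, e.g.:

  long_msg <- paste0("aaa ",
                     "bbb")
  identical(long_msg, "aaa bbb")   # TRUE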

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] stats::line() does not produce correct Tukey line when n mod 6 is 2 or 3

2017-06-01 Thread Serguei Sokol

Le 31/05/2017 à 22:00, Martin Maechler a écrit :

Serguei Sokol <so...@insa-toulouse.fr>
 on Wed, 31 May 2017 18:46:34 +0200 writes:

 > Le 31/05/2017 à 17:30, Serguei Sokol a écrit :
 >>
 >> More thorough reading revealed that I had overlooked this phrase in the
 >> line's doc: "left and right /thirds/ of the data" (emphasis is mine).
 > Oops. I had read the first ref returned by google and it happened to be
 > tibco's doc, not R's. The layout is very similar, hence my mistake.
 > The latter does not mention "thirds" but ...
 > Anyway, here is a new patch for line() which still gives a result slightly
 > different from MMline(). The slope is the same but not the intercept.
 > What are the exact terms for the intercept calculation that should be
 > implemented?

 > Serguei.

Sorry Serguei, I have a new version of line.c since yesterday,
and will not be disturbed anymore.

Note that I *did* give the literature, and it seems most
discussants don't have paper books in physical libraries anymore;
in this case, interestingly, you need one of those I think -
almost everything I found online did not have the exact details.

Fortunately, you keep good old habits regarding paper books ;)


Peter Dalgaard definitely was right that Tukey did not use
quantiles at all, and notably did *not* define the three groups
via {i; x_i <= x_L} and {i; x_i >= x_R}, which (as I think
you noticed) may make the groups quite unbalanced in case of duplicated x's.
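
A small illustration of that imbalance with made-up data (not from this thread):

  x <- c(1, 1, 1, 1, 2, 3, 4, 5, 6)
  sum(x <= quantile(x, 1/3))   # 4 points land in the "left third" instead of floor(9/3) = 3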

But then, for now I had decided to fix the bug (namely computing
the x-medians wrongly, as you diagnosed correctly(!) -- but your
first 2 patches only partly fixed it) *and* go at least one step in
the direction of Tukey's original, namely by allowing iteration via a new
'iter' argument.

Hm, I did not use iterations. A newly introduced indx is used to keep
the index permutation when x is sorted.


I have also updated the help page to document what  line()  has
been computing all these years {apart from the bug which
typically shows for non-equidistant x[]}.

You mean "non equally sized"? (bis ;) )


We could also consider eventually adding a new 'method ='
argument to line(), one version of which would continue to
compute the current solution,

If the current solution is considered plainly wrong, why continue
to implement it? Unless by "current version" you mean your implementation
equivalent to my patch2, which fixes the group sizes.


  another would compute the one
corresponding to Velleman & Hoaglin (1981)'s FORTRAN
implementation (which had to be corrected for some infinite-loop
cases!)... not in the near future though

What would be the benefit of this FORTRAN version? Faster? More accurate?


Given all the discussion here, I think I should commit what I
currently have ASAP.

+1.

Serguei.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] stats::line() does not produce correct Tukey line when n mod 6 is 2 or 3

2017-05-31 Thread Serguei Sokol

Le 31/05/2017 à 16:39, Joris Meys a écrit :

Seriously, if a method gives a wrong result, it's wrong.

I did not understand why you and others were using the term "wrong"
for something that I considered to be just a "different" implementation.
More thorough reading revealed that I had overlooked this phrase in the
line() doc: "left and right /thirds/ of the data" (emphasis is mine).

Should I be exiled to the Excel department for this sin? That's tough ;)
Serguei.


line() does NOT implement the algorithm of Tukey, not even after the patch.
We're not discussing Excel here, are we?

The method of Tukey is rather clear, and it is NOT using the default quantile definition from the quantile function. Actually, it doesn't even use quantiles 
to define the groups. It just says that the groups should be more or less equally spaced. As the method of Tukey relies on the medians of the subgroups, it 
would make sense to pick a method that is approximately unbiased with regard to the median. That would be type 8 imho.


To get the size of the outer groups, Tukey would've been more than happy
with a:

> floor(length(dfr$time) / 3)
[1] 6

There you have the size of your left and right groups, and now we can discuss
which median type should be used for the robust fitting.
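
As a rough sketch of that three-group idea (my own illustration of the textbook
resistant-line recipe, not the code that ended up in line.c):

  resistant_line <- function(x, y) {
    o <- order(x); x <- x[o]; y <- y[o]
    k <- floor(length(x) / 3)                  # size of the outer groups
    left  <- seq_len(k)
    right <- seq.int(length(x) - k + 1, length(x))
    slope <- (median(y[right]) - median(y[left])) /
             (median(x[right]) - median(x[left]))
    c(intercept = median(y - slope * x), slope = slope)
  }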

But I honestly cannot understand why anyone in their right mind would defend a
method that is clearly wrong while not working at Microsoft's spreadsheet
department.


Cheers
Joris

On Wed, May 31, 2017 at 4:03 PM, Serguei Sokol <so...@insa-toulouse.fr 
<mailto:so...@insa-toulouse.fr>> wrote:

Le 31/05/2017 à 15:40, Joris Meys a écrit :

OTOH,

> sapply(1:9, function(i){
+   sum(dfr$time <= quantile(dfr$time, 1./3., type = i))
+ })
[1] 8 8 6 6 6 6 8 6 6

Only the default (type = 7) and the first two types give the result
line() gives now. I think there are plenty of reasons why any of the other
6 types might be better suited to Tukey's method.

So to my mind, changing the definition of line() to give sensible output
that is in accordance with the theory does not imply any inconsistency with
the quantile definition in R. At least not with 6 out of the 9
different ones ;-)

Nice shot.
But OTOE (on the other end ;)
> sapply(1:9, function(i){
+   sum(dfr$time >= quantile(dfr$time, 2./3., type = i))
+ })
[1] 8 8 8 8 6 6 8 6 6

Here "8" gains 5 votes against 4 for "6". There were two defector methods
that changed the point number and should be discarded. Which leaves us
with the score 3:4, still in favor of "6" but the default method should 
prevail
in my sens.

Serguei.




--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel :  +32 (0)9 264 61 79
joris.m...@ugent.be
-------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



--
Serguei Sokol
Ingenieur de recherche INRA
Metabolisme Integre et Dynamique des Systemes Metaboliques (MetaSys)

LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 6155 9276
fax: +33 5 6704 8825
email: so...@insa-toulouse.fr
http://metasys.insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
