Re: [Rd] Output mis-encoded on Windows w/ RGui 3.5.1 in strange case

2018-07-17 Thread Tomas Kalibera
Hi Kevin, the extra bytes you are seeing are escapes for UTF-8 strings used in input to RGui console. Recently ascii strings are converted to UTF-8 so you would get these escapes for ascii strings now as well. RGui understands these escapes and converts from UTF-8 to wide characters before

Re: [Rd] base::mean not consistent about NA/NaN

2018-07-18 Thread Tomas Kalibera
Yes, the performance overhead of fixing this at R level would be too large and it would complicate the code significantly. The result of binary operations involving NA and NaN is hardware dependent (the propagation of NaN payload) - on some hardware, it actually works the way we would like -

Re: [Rd] Output mis-encoded on Windows w/ RGui 3.5.1 in strange case

2018-07-18 Thread Tomas Kalibera
eed to invoke save() after printing that empty list, > 3) Then, attempts to call encodeString() will produce the strange output. > > For what it's worth, it may be related to a behavior I'm seeing where > the first name printed for an R list is quoted with backticks even > when not

Re: [Rd] Memory leakage from large lists

2018-07-17 Thread Tomas Kalibera
On 07/17/2018 12:56 PM, Joshua Ulrich wrote: This looks like a case of FAQ 7.42: https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-is-R-apparently-not-releasing-memory_003f Yes. A true memory leak in R would mean that repeated execution of the same code (e.g. creation and deletion of the list)

Re: [Rd] Output mis-encoded on Windows w/ RGui 3.5.1 in strange case

2018-07-18 Thread Tomas Kalibera
Fixed in R-devel and R-patched, Tomas On 07/18/2018 12:03 PM, Tomas Kalibera wrote: > Thanks, I can now reproduce and it is a bug that is easy to fix, I > will do so shortly. > > Fyi it can be reproduced simply by running these two lines in Rgui: > > list() > encodeStr

Re: [Rd] Detecting whether a process exists or not by its PID?

2018-08-31 Thread Tomas Kalibera
On 08/31/2018 03:13 PM, Gábor Csárdi wrote: On Fri, Aug 31, 2018 at 2:51 PM Tomas Kalibera wrote: [...] kill(sig=0) is specified by POSIX but indeed as you say there is a race condition due to PID-reuse. In principle, detecting that a worker process is still alive cannot be done correctly
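
A minimal R sketch of the signal-0 probe mentioned above, assuming a Unix-alike where tools::pskill() passes the signal on to kill(). The helper name pid_exists is made up for illustration and is not part of base R. Because of PID reuse, a TRUE result only says that some process currently holds that PID, not that it is the original worker.

```r
# Hypothetical helper: probe a PID with signal 0. POSIX kill() then only
# performs the existence/permission check and delivers no signal.
pid_exists <- function(pid) {
  if (.Platform$OS.type != "unix") return(NA)  # this sketch covers Unix-alikes only
  isTRUE(tools::pskill(pid, signal = 0L))
}

# The current R process certainly exists, so it makes a safe smoke test.
self_alive <- pid_exists(Sys.getpid())
```

A process owned by another user may be reported as missing, since the permission check fails in that case.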

Re: [Rd] True length - length(unclass(x)) - without having to call unclass()?

2018-09-05 Thread Tomas Kalibera
On 08/24/2018 07:55 PM, Henrik Bengtsson wrote: Is there a low-level function that returns the length of an object 'x' - the length that for instance .subset(x) and .subset2(x) see? An obvious candidate would be to use: .length <- function(x) length(unclass(x)) However, I'm concerned that
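
To make the question concrete, here is a small sketch with a toy S3 class (the class name "foo" and its length method are invented for illustration). It shows how the dispatched length() and the proposed .length() can disagree.

```r
# A toy S3 class whose length() method hides the underlying vector length.
x <- structure(list(1, 2, 3), class = "foo")
length.foo <- function(x) 1L

# The candidate from the message: drop the class, then ask for the length.
.length <- function(x) length(unclass(x))

dispatched <- length(x)   # dispatches to length.foo
underlying <- .length(x)  # length of the underlying list
```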

Re: [Rd] True length - length(unclass(x)) - without having to call unclass()?

2018-09-10 Thread Tomas Kalibera
rimitives. But a partial solution could be implemented at some point with ALTREP wrappers when one could without copying create a wrapper object with a modified class attribute. Tomas Iñaki El mié., 5 sept. 2018 a las 10:09, Tomas Kalibera () escribió: On 08/24/2018 07:55 PM, Henrik Bengt

Re: [Rd] Detecting whether a process exists or not by its PID?

2018-08-31 Thread Tomas Kalibera
On 08/31/2018 01:18 AM, Henrik Bengtsson wrote: Hi, I'd like to test whether a (localhost) PSOCK cluster node is still running or not by its PID, e.g. it may have crashed / core dumped. I'm ok with getting false-positive results when *another* process with the same PID has since started.

Re: [Rd] [parallel] fixes load balancing of parLapplyLB

2018-03-13 Thread Tomas Kalibera
us know ;) Best Regards On 02/26/2018 04:01 PM, Tomas Kalibera wrote: Dear Christian and Henrik, thank you for spotting the problem and suggestions for a fix. We'll probably add a chunk.size argument to parLapplyLB and parLapply to follow OpenMP terminology, which has already been

Re: [Rd] clusterApply arguments

2018-03-15 Thread Tomas Kalibera
On 03/15/2018 05:25 PM, Henrik Bengtsson wrote: On Thu, Mar 15, 2018 at 3:39 AM, wrote: Thank you for your answer! I agree with you except for the 3 (Error) example and I realize now I should have started with that in the explanation. From my point of view

Re: [Rd] [Bug report] Chinese characters are not handled correctly in Rterm for Windows

2018-04-05 Thread Tomas Kalibera
Thank you for the report and initial debugging. I am not sure what is going wrong, we may have to rely on your help to debug this (I do not have a system to reproduce on). User-targeted advice would be to use RGui (Rgui.exe). Does the problem also exist in R-devel?

Re: [Rd] utils::unzip ignores overwrite argument, effectively

2018-04-04 Thread Tomas Kalibera
Thanks, fixed in R-devel. Tomas On 12/20/2017 02:38 PM, Gábor Csárdi wrote: It does give a warning, but then it overwrites the files, anyway. Reproducible example below. This is R 3.4.3, but it does not seem to be fixed in R-devel:

Re: [Rd] file.copy(from=Directory, to=File) oddity

2018-04-09 Thread Tomas Kalibera
Thanks for the report, fixed in R-devel. Now we get a warning when copying a directory over a non-directory file is attempted. The target (non-directory) file is left alone. Tomas On 09/08/2017 06:54 PM, William Dunlap via R-devel wrote: When I mistakenly use file.copy() with a directory for

Re: [Rd] potential file.copy() or documentation bug when copy.date = TRUE

2018-04-06 Thread Tomas Kalibera
Thanks for the report, fixed in R-devel. Best, Tomas On 04/05/2018 05:01 PM, Gábor Csárdi wrote: This is a recent R-devel. file.copy() is not vectorized if multiple destinations succeed: cat("foo1\n", file = "foo1") cat("foo2\n", file = "foo2") unlink(c("copy1", "copy2"), recursive = TRUE)

Re: [Rd] [bug report] Cyrillic letter "я" interrupts script execution via R source function

2018-04-09 Thread Tomas Kalibera
Hi Vladimir, thanks for your report - this was really a bug, now fixed in R-devel and to appear in 3.5.0. Apart from the bug, having source files in UTF-8 and reading them into R on Windows is perfectly fine, you just need to specify that they are in UTF-8. You also need to make sure R is

Re: [Rd] [bug report] Cyrillic letter "я" interrupts script execution via R source function

2018-04-09 Thread Tomas Kalibera
Hi Patrick, thanks for your comments on the bug, just to clarify - one could reproduce the bug simply using file() and readLines(). The parser saw a real end of file as (incorrectly) communicated to it by lower level connections code - there is no design issue related in the parser (nor

Re: [Rd] R Bug: write.table for matrix of more than 2, 147, 483, 648 elements

2018-04-19 Thread Tomas Kalibera
On 04/19/2018 02:06 AM, Duncan Murdoch wrote: On 18/04/2018 5:08 PM, Tousey, Colton wrote: Hello, I want to report a bug in R that is limiting my capabilities to export a matrix with write.csv or write.table with over 2,147,483,648 elements (C's int limit). I found this bug already reported

Re: [Rd] R Bug: write.table for matrix of more than 2, 147, 483, 648 elements

2018-04-19 Thread Tomas Kalibera
On 04/19/2018 11:47 AM, Serguei Sokol wrote: Le 19/04/2018 à 09:30, Tomas Kalibera a écrit : On 04/19/2018 02:06 AM, Duncan Murdoch wrote: On 18/04/2018 5:08 PM, Tousey, Colton wrote: Hello, I want to report a bug in R that is limiting my capabilities to export a matrix with write.csv

Re: [Rd] R CMD build then check fails on R-devel due to serialization version

2018-04-24 Thread Tomas Kalibera
E_VERSION=2 and R_DEFAULT_SERIALIZE_VERSION=2). These frameworks could also be tested before the change by running with R_DEFAULT_SAVE_VERSION=3 and R_DEFAULT_SERIALIZE_VERSION=3. Best Tomas On 01/13/2018 01:35 AM, Tomas Kalibera wrote: To reduce difficulties for people relying on auto

Re: [Rd] Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

2018-03-27 Thread Tomas Kalibera
On 03/27/2018 09:51 AM, Iñaki Úcar wrote: 2018-03-27 6:02 GMT+02:00 : This has nothing to do with printing or dispatch per se. It is the result of an internal register (R_ReturnedValue) being protected. It gets rewritten whenever there is a jump, e.g. by an explicit

Re: [Rd] Typo in src/extra/tzone/registryTZ.c

2018-03-27 Thread Tomas Kalibera
Thanks! Fixed in R-devel, Tomas On 03/26/2018 03:22 PM, Korpela Mikko (MML) wrote: I stumbled upon a typo in a time zone name: Irtutsk should be Irkutsk. A patch is attached. I also checked that this is the only bug of its kind in this file, i.e., all the other Olson time zones occurring in the

Re: [Rd] Objects not gc'ed due to caching (?) in R's S3 dispatch mechanism

2018-03-27 Thread Tomas Kalibera
On 03/27/2018 11:53 AM, Iñaki Úcar wrote: 2018-03-27 11:11 GMT+02:00 Tomas Kalibera <tomas.kalib...@gmail.com>: On 03/27/2018 09:51 AM, Iñaki Úcar wrote: 2018-03-27 6:02 GMT+02:00 <luke-tier...@uiowa.edu>: This has nothing to do with printing or dispatch per se. It i

Re: [Rd] as.pairlist does not convert call objects

2018-03-29 Thread Tomas Kalibera
On 03/28/2018 08:31 AM, Jialin Ma wrote: Dear all, It seems that as.pairlist does not convert call objects, producing results like the following: is.pairlist(as.pairlist(quote(x + y))) [1] FALSE Should this behavior be expected? The documentation says that the behavior of as.pairlist is

Re: [Rd] Possible `substr` bug in UTF-8 Corner Case

2018-03-29 Thread Tomas Kalibera
Thanks, fixed in R-devel (by checking validity of UTF-8 strings for substr/substring). Tomas On 03/29/2018 03:53 AM, brodie gaslam via R-devel wrote: I think there is a memory bug in `substr` that is triggered by a UTF-8 corner case: an incomplete UTF-8 byte sequence at the end of a string. 

Re: [Rd] Discrepancy: R sum() VS C or Fortran sum

2018-03-16 Thread Tomas Kalibera
R uses long double type for the accumulator (on platforms where it is available). This is also mentioned in ?sum: "Where possible extended-precision accumulators are used, typically well supported with C99 and newer, but possibly platform-dependent." Tomas On 03/16/2018 06:08 PM, Pierre
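
A rough way to see the effect of the extended-precision accumulator from R itself is to compare sum() with a naive loop that rounds to double at every step. This is only a sketch: the size of the difference is platform-dependent, and on builds without long double support both results may agree.

```r
# Ten tiny increments that each vanish when added to 1 in double precision.
x <- c(1, rep(1e-16, 10))

# Naive accumulation at R level: every partial sum is rounded to a double.
naive <- 0
for (v in x) naive <- naive + v

# Built-in sum(): may retain the small terms where long double is available.
builtin <- sum(x)

naive == 1       # TRUE, since 1 + 1e-16 rounds back to 1 in IEEE double
builtin - naive  # platform-dependent; non-negative
```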

Re: [Rd] clusterApply arguments

2018-03-16 Thread Tomas Kalibera
Fixed in R-devel (74418). Tomas On 03/15/2018 08:57 PM, Tomas Kalibera wrote: On 03/15/2018 05:25 PM, Henrik Bengtsson wrote: On Thu, Mar 15, 2018 at 3:39 AM, <florianschwendin...@gmx.at> wrote: Thank you for your answer! I agree with you except for the 3 (Error) example and I realize

Re: [Rd] BUG: tools::pskill() returns incorrect values or non-initated garbage values [PATCH]

2018-03-19 Thread Tomas Kalibera
Thanks for spotting this, fixed in R-devel (including the Windows version). Tomas On 03/18/2018 09:53 PM, Henrik Bengtsson wrote: For the record, I've just filed the following bug report with a patch to https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17395: tools::pskill() returns either

Re: [Rd] Inconsistency, may be bug in read.delim ?

2018-03-21 Thread Tomas Kalibera
On 03/19/2018 02:23 PM, Detlef Steuer wrote: Dear friends, I stumbled into behaviour of read.delim which I would consider a bug or at least an inconsistency that should be improved upon. Recently we had to work with data that used "", two double quotes, as symbol to start and end character

Re: [Rd] [parallel] fixes load balancing of parLapplyLB

2018-02-26 Thread Tomas Kalibera
Dear Christian and Henrik, thank you for spotting the problem and suggestions for a fix. We'll probably add a chunk.size argument to parLapplyLB and parLapply to follow OpenMP terminology, which has already been an inspiration for the present code (parLapply already implements static

Re: [Rd] Possible typo in the C source code of R

2018-02-26 Thread Tomas Kalibera
Thank you, Martin, for spotting this, it is clearly a bug, originally a conformance check was intended here and time series were defined using integers, so exact comparison would have made sense. Now time series are defined using doubles and exact comparison could be too strict with rounding

Re: [Rd] Bug in RScript.exe for 3.5.0

2018-04-26 Thread Tomas Kalibera
ctory C:\>"C:\Program Files\R\R-devel\bin\x64\Rscript.exe" --vanilla "C:\foo bar.R" What do you get when you multiply 6 * 9? C:\> -Original Message- From: Tomas Kalibera [mailto:tomas.kalib...@gmail.com] Sent: Thursday, April 26, 2018 8:35 AM To: Kerry Jackson

Re: [Rd] Bug in RScript.exe for 3.5.0

2018-04-26 Thread Tomas Kalibera
source code. Thanks, Kerry. There are binary builds for daily snapshots of R-devel (development/unstable version of R) at https://cran.r-project.org/bin/windows/base/rdevel.html At this time the build should already have the fix. Best Tomas -Original Message- From: Tomas Kalibera

Re: [Rd] Bug in RScript.exe for 3.5.0

2018-04-25 Thread Tomas Kalibera
Thanks for the report. A quick workaround before this gets fixed is to add an extra first argument that has no space in it, e.g. Rscript --vanilla "foo bar.R" The problem exists on all systems, not just Windows. Best Tomas On 04/25/2018 09:55 PM, Kerry Jackson wrote: Hi R Developers, I have

Re: [Rd] incomplete results from as.character.srcref() in some cases involving quote()

2018-06-20 Thread Tomas Kalibera
wholeSrcref attribute is documented in ?parse to be the source reference corresponding to the already parsed text. The implementation in the parser matches the documentation - the code stops at the last byte/character of the expression, that is, on the closing brace - which is the "already

Re: [Rd] Automatic Compression by Save Causes Check Warning

2018-06-20 Thread Tomas Kalibera
Dear Dario, this question may be more suitable for R-pkg-devel or perhaps R-help list, if you have subsequent questions, you might get better advice there. In short, save() does no automated selection of a compression algorithm - it just uses the one specified, by default "gzip". For
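
As a sketch of what "uses the one specified" means in practice, the following saves the same toy object twice and asks tools::checkRdaFiles() which compression each file ended up with. The object and file names are placeholders.

```r
x <- rep(letters, 1000)  # highly compressible toy data

f_default <- tempfile(fileext = ".rda")
f_xz      <- tempfile(fileext = ".rda")

save(x, file = f_default)              # no algorithm given: default applies
save(x, file = f_xz, compress = "xz")  # algorithm requested explicitly

# Report the compression actually found in each file.
info <- tools::checkRdaFiles(c(f_default, f_xz))
info$compress
```

tools::resaveRdaFiles() can be used to re-compress existing files if a check complains about their size.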

Re: [Rd] Random behavior of mclapply

2018-10-18 Thread Tomas Kalibera
Hi Thibault, mclapply has been designed to signal an error in two ways. User code errors are returned as special objects (of class "try-error") in the respective element of the result list. All other errors (including a process killed) are returned as NULL in the respective elements of the
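
The first of the two signalling paths can be sketched as follows, assuming a Unix-alike (forking is not available on Windows). The failing function is a made-up example; with the default prescheduling, every element handled by the failing job is reported as a "try-error".

```r
library(parallel)

# Forking is unavailable on Windows, so only run the sketch on Unix-alikes.
if (.Platform$OS.type == "unix") {
  res <- suppressWarnings(
    mclapply(1:3, function(i) if (i == 2) stop("boom") else i, mc.cores = 2)
  )
  # Elements from a job whose user code failed carry class "try-error".
  failed <- vapply(res, inherits, logical(1), what = "try-error")
}
```

A caller would typically also test for NULL elements, which per the message above indicate the other kind of failure.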

Re: [Rd] unexpected memory.limit on windows in embedded R

2018-10-17 Thread Tomas Kalibera
Dear Livio, thank you for the report. Fixed in R-devel, 75450. Best Tomas On 10/01/2018 06:29 PM, Livio Bertacco wrote: Dear All, I'm linking R from another application and embedding it as described in the R-exts manual, i.e. with initialization done via Rf_initEmbeddedR. While everything

Re: [Rd] Bug in file.access on Windows when using network shares

2018-10-24 Thread Tomas Kalibera
ted. Since _waccess exists as a system call within Windows and > has the same syntax (other than wide characters) as access in POSIX, > why not use it? > > BW > > Nick > > On 3 July 2018 at 09:21, Tomas Kalibera <tomas.kalib...@gmail.com> wrote:

Re: [Rd] error unserializing ascii format (v2 or v3)

2018-11-12 Thread Tomas Kalibera
Thanks, fixed in R-devel and R-patched. The problem was only in unserializing of raw vectors serialized in ASCII, the format is not affected. Old serialized ASCII files created by R 3.5 or earlier can be read in R-patched and re-saved in binary format, which can in turn be read in R 3.5 and
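
A quick round-trip check for the case described, as a sketch: serialize a raw vector in ASCII format and read it back. On a build that contains the fix the two vectors should be identical; on an affected build the unserialize step is where the problem would show.

```r
x <- as.raw(0:255)  # every possible byte value

# Serialize to an in-memory raw vector using the ASCII format, then read back.
bytes <- serialize(x, connection = NULL, ascii = TRUE)
y <- unserialize(bytes)

identical(x, y)
```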

Re: [Rd] redundant "`" symbol in the name of list (R version 3.5.0 & 3.5.1)

2018-11-13 Thread Tomas Kalibera
Thanks for the report, but please note this bug has already been reported (PR#17447). It is a Windows/RGui specific bug and has already been fixed in R-devel. I will see if it could also be ported to R-patched. Best, Tomas On 11/12/18 5:35 PM, ya wei wrote: Dear R devel Team, There might be

Re: [Rd] redundant "`" symbol in the name of list (R version 3.5.0 & 3.5.1)

2018-11-13 Thread Tomas Kalibera
Now also fixed in R-patched. Best, Tomas On 11/13/18 11:12 AM, Tomas Kalibera wrote: Thanks for the report, but please note this bug has already been reported (PR#17447). It is a Windows/RGui specific bug and has already been fixed in R-devel. I will see if it could also be ported to R

Re: [Rd] Little memory leak fix

2018-11-08 Thread Tomas Kalibera
Thanks! Fixed in R-devel. Best Tomas On 10/29/18 12:34 AM, David CARLIER wrote: > Hi dear list, > > Here a little memory leak fix proposal. > > Kind regards. > Thanks. > > __ > R-devel@r-project.org mailing list >

Re: [Rd] segfault issue with parallel::mclapply and download.file() on Mac OS X

2018-10-04 Thread Tomas Kalibera
Thanks for the report, but unfortunately I cannot reproduce on my system (either macOS or Linux, from the command line) to debug. Did you run this in the command line version of R? I would not be surprised to see such a crash if executed from a multi-threaded application, say from some

Re: [Rd] unexpected behavior of unzip with list=T and unzip=/usr/bin/unzip

2018-10-09 Thread Tomas Kalibera
Hi Paul, thanks for the report. Fixed in R-devel 75417. Best Tomas On 07/04/2018 10:08 PM, Paul Schrimpf wrote: Hello, I encountered some unexpected behavior of unzip when using info-zip's unzip instead of R's internal program. Specifically, unzip("file.zip", list=TRUE, unzip=/usr/bin/unzip)

Re: [Rd] Query the pointer protection stack size (--max-ppsize=N) from within R?

2018-10-01 Thread Tomas Kalibera
I don't think this is possible. If you would like e.g. stricter checking for --max-ppsize argument value, please try to come up with some example that triggers such case. Tomas On 09/29/2018 06:41 PM, Henrik Bengtsson wrote: Hi, simply for troubleshooting purposes (e.g. making sure that

Re: [Rd] Problem with parseData

2018-10-02 Thread Tomas Kalibera
The fix is now in R-devel, 75386. I have not ported to R-patched, because the fix breaks two packages which are working around this bug (and to my knowledge without having reported it before). So thanks again for the report! Best Tomas On 08/16/2018 10:06 AM, Tomas Kalibera wrote: Dear

Re: [Rd] Rscript -e does not accept newlines under Linux?

2018-10-08 Thread Tomas Kalibera
I've checked in an experimental fix for this (75413). The newline was lost in the shell script wrapper for R, it is now being escaped similarly to space. To pass multiple commands to Rscript, one can also use "-e" multiple times. Tomas On 09/17/2018 01:09 PM, Duncan Murdoch wrote: On
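
The "-e multiple times" alternative can be tried from within R as sketched below, assuming Rscript is found under R.home("bin") and that a POSIX-style shell handles the quoting. The expressions themselves are arbitrary examples.

```r
rscript <- file.path(R.home("bin"), "Rscript")

# Each -e supplies one expression; they are evaluated in order in one session,
# so the second expression can see the variable assigned by the first.
out <- system2(
  rscript,
  c("-e", shQuote("x <- 20"), "-e", shQuote("print(x + 1)")),
  stdout = TRUE
)
```

This avoids embedding a newline in a single -e argument altogether.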

Re: [Rd] True length - length(unclass(x)) - without having to call unclass()?

2018-09-03 Thread Tomas Kalibera
Please don't do this to get the underlying vector length (or to achieve anything else). Setting/deleting attributes of an R object without checking the reference count violates R semantics, which in turn can have unpredictable results on R programs (essentially undebuggable segfaults now or

Re: [Rd] Get Logical processor count correctly whether NUMA is enabled or disabled

2018-09-03 Thread Tomas Kalibera
have 65 processors, a loop with 64 > iterations seem to complete as expected, but using all 65 processors to loop > over 65 iterations didn't seem to complete. I stopped it after ~5mins. The > same happens with the cluster started with any number between 65 and 88. It > seems to me l

Re: [Rd] True length - length(unclass(x)) - without having to call unclass()?

2018-09-03 Thread Tomas Kalibera
On 09/03/2018 03:59 PM, Dénes Tóth wrote: Hi Tomas, On 09/03/2018 11:49 AM, Tomas Kalibera wrote: Please don't do this to get the underlying vector length (or to achieve anything else). Setting/deleting attributes of an R object without checking the reference count violates R semantics, which

Re: [Rd] R-admin typo

2018-09-19 Thread Tomas Kalibera
Thanks, Tomas On 09/19/2018 06:15 AM, Colin Gillespie wrote: Hi, Section 3.2 of the R-admin manual https://cran.ma.imperial.ac.uk/doc/manuals/r-release/R-admin.html#Testing-a-Windows-Installation could be improved. The particular sentence is The Rtools are not needed to run these tests. but

Re: [Rd] Objectsize function visiting every element for alt-rep strings

2019-01-23 Thread Tomas Kalibera
On 1/22/19 6:17 PM, Kevin Ushey wrote: I think that object.size() is most commonly used to answer the question, "what R objects are consuming the most memory currently in my R session?" and for that reason I think returning the size of the internal representations of objects (for e.g. ALTREP

Re: [Rd] [patch] Documentation for list.files when no matches found

2019-01-07 Thread Tomas Kalibera
Thanks for the report, fixed in documentation in R-devel. Best Tomas On 1/7/19 3:03 AM, Jonathan Carroll wrote: Apologies in advance if this is already known but a search of the r-devel archive did not immediately turn up any mentions. list.files() (and thus dir()) returns character(0) when
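
For reference, the no-match case can be reproduced with a fresh empty directory, as in this small sketch (the directory prefix and pattern are arbitrary):

```r
d <- tempfile("empty-dir-")
dir.create(d)

# Nothing in the new directory can match the pattern.
matches <- list.files(d, pattern = "\\.csv$")

length(matches) == 0L
```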

Re: [Rd] Bug report: R.home() cause package Rcpp failed executing sourceCpp, similar bug are labeled "BUG 16660" since 2016 and here I could provide a solution that tested in my laptop.

2019-01-02 Thread Tomas Kalibera
To resolve this issue quickly on your side, I would recommend installing R on the C: drive which should have the short file names enabled by default. Then, R.home() will return a path name without a space (short file names do not include a space). If you for some reason need to install R

Re: [Rd] Inconsistent returned values of normalizePath(NA_character_) on Windows and *nix

2019-01-11 Thread Tomas Kalibera
Thanks for the report, fixed in R-devel (one gets NA_character_ as a result and the path is treated as non-existent, so with a warning or error when requested via mustWork argument). Best, Tomas On 12/7/18 7:10 PM, Yihui Xie wrote: Hi, I just noticed normalizePath(NA_character_) returns
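
A short sketch of the behavior after the fix, assuming an R version that includes it. mustWork = FALSE is used so that neither a warning nor an error is requested.

```r
# NA input is passed through as NA rather than being turned into a path.
p <- normalizePath(NA_character_, mustWork = FALSE)

is.na(p)
```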

Re: [Rd] Suggestion: use mustWork = TRUE as the default for system.file

2018-09-16 Thread Tomas Kalibera
Hello Irene, we can only change the documented behavior when there is a very strong reason to do so, because it indeed can break existing code. A lot of existing code would depend on the current behavior, using e.g. nzchar() to check the output of system.file(). Changing the behavior that is
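
The existing idiom and the opt-in alternative look roughly like this; the file and directory names are deliberately non-existent placeholders.

```r
# Default behavior: a missing file yields the empty string, no error.
p <- system.file("no-such-dir", "no-such-file.txt", package = "base")
found <- nzchar(p)

# Opting in to an error for the same lookup.
err <- tryCatch(
  system.file("no-such-dir", "no-such-file.txt", package = "base",
              mustWork = TRUE),
  error = function(e) conditionMessage(e)
)
```

Code that already checks nzchar() keeps working unchanged, which is the compatibility concern raised above.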

Re: [Rd] SUGGESTION: Proposal to mitigate problem with stray processes left behind by parallel::makeCluster()

2019-03-27 Thread Tomas Kalibera
The problem causing the stray worker processes when the master fails to open a server socket to listen to connections from workers is not related to timeout in socketConnection(), because socketConnection() will fail right away. It is caused by a bug in checking the setup timeout (PR

Re: [Rd] Deep Replicable Bug With AMD Threadripper MultiCore

2019-04-05 Thread Tomas Kalibera
In addition you can also try to use a PSOCK cluster (see makeCluster, parLapply) to avoid the problem - it should help if the problem is somehow related to forking in mclapply(). The problem you are seeing may be in base R, in data.table, or in interaction between the two (mclapply() from
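
A minimal sketch of the PSOCK alternative, assuming two local workers can be started on the machine. The squared-number task is just a stand-in for the real workload.

```r
library(parallel)

cl <- makeCluster(2)  # PSOCK is the default cluster type; no forking involved

# Always stop the workers, even if the computation fails.
res <- tryCatch(
  parLapply(cl, 1:4, function(i) i^2),
  finally = stopCluster(cl)
)
```

Unlike with mclapply(), data and packages needed by the workers have to be sent or loaded explicitly, e.g. with clusterExport() or clusterEvalQ().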

Re: [Rd] Use of C++ in Packages

2019-04-01 Thread Tomas Kalibera
On 3/30/19 8:59 AM, Romain Francois wrote: tl;dr: we need better C++ tools and documentation. We collectively know more now with the rise of tools like rchk and improved documentation such as Tomas’s post. That’s a start, but it appears that there still is a lot of knowledge that would

Re: [Rd] topenv of emptyenv

2019-04-01 Thread Tomas Kalibera
On 3/23/19 3:26 PM, Konrad Rudolph wrote: I was surprised just now to find out that `topenv(emptyenv())` equals … `.GlobalEnv`, not `emptyenv()`. From my understanding of the description of `topenv`, it should walk up the chain of enclosing environments (as if by calling `e = parent.env(e)`
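
The observation can be checked directly, as sketched here. Note that a manual walk via parent.env() cannot start from emptyenv() at all, since the empty environment has no parent.

```r
e <- topenv(emptyenv())

# No top level environment is found on the (empty) chain, so the
# global environment comes back.
same_as_global <- identical(e, globalenv())

# By contrast, asking for the parent of emptyenv() is an error.
walk <- tryCatch(parent.env(emptyenv()), error = function(err) "error")
```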

Re: [Rd] Bug: time complexity of substring is quadratic as string size and number of substrings increases

2019-02-28 Thread Tomas Kalibera
An optimized version of substring/substr is now in R-devel (76172). Best, Tomas On 2/22/19 8:16 PM, Tomas Kalibera wrote: On 2/20/19 7:55 PM, Toby Hocking wrote: Update: I have observed that stringi::stri_sub is linear time complexity, and it computes the same thing as base::substring

Re: [Rd] code for sum function

2019-02-20 Thread Tomas Kalibera
can I > find documentation on this? > > Cheers, Rampal > > On Mon, Feb 18, 2019, 15:38 Tomas Kalibera <tomas.kalib...@gmail.com> wrote: > > See do_summary() in summary.c, rsum() for doubles. R uses long double > type as accumulator on systems where available. > >

Re: [Rd] patch for gregexpr(perl=TRUE)

2019-02-20 Thread Tomas Kalibera
Thanks, in R-devel 76138, I confirm it speeds up gregexpr() with pcre in Bill Dunlap's example from https://stat.ethz.ch/pipermail/r-devel/2017-January/073577.html (RegExprPCRE column) The performance problem of StrSplitPCRE does not seem to be due to strlen(). Best Tomas On 2/19/19 9:37 PM,

Re: [Rd] Bug: time complexity of substring is quadratic as string size and number of substrings increases

2019-02-22 Thread Tomas Kalibera
ther easy optimizations that will not complicate the code, including avoiding the strlen() call (taking advantage of pre-computed length of R character object). Best Tomas On Wed, Feb 20, 2019 at 11:16 AM Toby Hocking wrote: Hi all, (and especially hi to Tomas Kalibera who accepted my p

Re: [Rd] pcre problems

2019-02-25 Thread Tomas Kalibera
On 2/25/19 6:25 AM, robin hankin wrote: Hi there, ubuntu 18.04.2, trying to compile R-devel 3.6.0, svn 76155. I am having difficulty compiling R. I think I have pcre installed correctly: You can use apt-get build-dep r-base to install binary Ubuntu packages needed to build R from source,

Re: [Rd] pcre problems

2019-03-01 Thread Tomas Kalibera
configuration details. root@limpet:/etc/apt# hankin.ro...@gmail.com hankin.ro...@gmail.com On Fri, Mar 1, 2019 at 9:19 PM Tomas Kalibera wrote: On 3/1/19 9:03 AM, robin hankin wrote: OK thanks Tomas, but I get OK~ sudo apt-get build-dep r-base Reading package lists... Done E: Unable

Re: [Rd] pcre problems

2019-02-28 Thread Tomas Kalibera
~ compilation terminated. configure:42208: $? = 1 configure: failed program was: | /* confdefs.h */ | #define PACKAGE_NAME "R" | #define PACKAGE_TARNAME "R" | #define PACKAGE_VERSION "3.6.0" | #define PACKAGE_STRING "R 3.6.0" | #define PACKAGE_BUGREPOR h

Re: [Rd] pcre problems

2019-03-01 Thread Tomas Kalibera
enable them in /etc/apt/sources.list, uncomment all lines starting with deb-src. Best Tomas hankin.ro...@gmail.com On Fri, Mar 1, 2019 at 8:47 PM Tomas Kalibera wrote: On 3/1/19 7:10 AM, robin hankin wrote: thanks for this guys. I only compiled pcre myself as a last resort, because

Re: [Rd] Surprising results from INTEGER_GET_REGION with ALTREP object

2019-03-01 Thread Tomas Kalibera
On 3/1/19 1:52 PM, Ralf Stubner wrote: > Dear Listmembers, > > wanting to learn more about ALTREP I wrote the following function to > extract a subsequence from an integer vector: > > #include > > SEXP integer_get_region(SEXP _x, SEXP _i, SEXP _n) { >int i = INTEGER(_i)[0]; >int n =

Re: [Rd] package installation needs the file utility on Unix

2019-03-08 Thread Tomas Kalibera
Well, this only applies to source installs of packages that have some files with the special extension, so on systems where a compiler toolchain needs to be installed, so the image cannot be really tiny, anyway. But ok, I've made stage install use "file" only when it is available. When it

Re: [Rd] Bug Report: read.table with UTF-8 encoded file imports infinity symbol as Integer 8

2019-02-08 Thread Tomas Kalibera
I can reproduce this behavior on my Windows 10 system in RGui (cp1252): when I paste the Unicode infinity symbol into the console, it is treated as number 8. This is caused by Windows "best fit" default behavior in conversion of unicode characters to characters in the current native encoding:

Re: [Rd] Bug Report: read.table with UTF-8 encoded file imports infinity symbol as Integer 8

2019-02-08 Thread Tomas Kalibera
I can reproduce with read.table(encoding="UTF-8") in RGui on Windows 10, reading a file containing the two UTF-8 characters. The table is read correctly into R as documented (both characters are represented in UTF-8 and marked as such), but, the conversion of Infinity to 8 and of Zhe to

Re: [Rd] Trying to compile R 3.5.2 - 32 bit R - on Windows 10 64 bit - with ICU support

2019-02-18 Thread Tomas Kalibera
On 2/16/19 11:12 AM, Andre Mikulec wrote: Hi, I am trying to compile R with ICU support. I am following https://cran.r-project.org/doc/manuals/R-admin.html#Building-from-source I have downloaded and extracted https://www.stats.ox.ac.uk/pub/Rtools/goodies/ICU_531.zip to

Re: [Rd] Encoding issues

2019-02-18 Thread Tomas Kalibera
On 2/18/19 4:36 PM, Iñaki Ucar wrote: > Hi, > > We found a (to our eyes) strange behaviour that might be a bug. First > a little bit of context. The 'units' package allows us to set the unit > using both SE or NSE. E.g., these both work in the same way: > > units::set_units(1:10, "μm") > #> Units:

Re: [Rd] code for sum function

2019-02-18 Thread Tomas Kalibera
See do_summary() in summary.c, rsum() for doubles. R uses long double type as accumulator on systems where available. Best, Tomas On 2/14/19 2:08 PM, Rampal Etienne wrote: Hello, I am trying to write FORTRAN code to do the same as some R code I have. I get (small) differences when using the

Re: [Rd] Bug or undocumented behavior in normalizePath() with file system links on windows

2019-01-29 Thread Tomas Kalibera
I think this is caused by insufficient permissions on "C:/Programme" junction, and the behavior of normalizePath is as documented. I can get the same error with "C:/Documents and Settings", which is on my machine a junction into "C:/Users". The path cannot be normalized using

Re: [Rd] R 3.5.3 and 3.6.0 alpha Windows bug: UTF-8 characters in code are simplified to wrong ones

2019-04-11 Thread Tomas Kalibera
On 4/10/19 6:32 PM, Jeroen Ooms wrote: On Wed, Apr 10, 2019 at 5:45 PM Duncan Murdoch wrote: On 10/04/2019 10:29 a.m., Yihui Xie wrote: Since it is "technically easy" to disable the best fit conversion and the best fit is rarely good, how about providing an option for code/package authors to

Re: [Rd] R 3.5.3 and 3.6.0 alpha Windows bug: UTF-8 characters in code are simplified to wrong ones

2019-04-11 Thread Tomas Kalibera
On 4/10/19 6:13 PM, Tomáš Bořil wrote: An optional parameter to source() function which would translate all UTF-8 characters in string literals to their "\U" codes sounds as a great idea (and I hope it would fix 99.9% of problems I have - because that is the way I overcome these problems
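
What the escape-based workaround looks like for a single character, as a sketch: the source text stays pure ASCII, so no conversion through the native encoding can alter it, and the resulting string is marked as UTF-8.

```r
s <- "\u0159"  # LATIN SMALL LETTER R WITH CARON, written as an escape

Encoding(s)    # "UTF-8"
utf8ToInt(s)   # 345, i.e. 0x159
```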

Re: [Rd] R 3.5.3 and 3.6.0 alpha Windows bug: UTF-8 characters in code are simplified to wrong ones

2019-04-11 Thread Tomas Kalibera
it does not throw an error” is generally not a good idea - it is dangerous. Users / coders should know that there is something wrong with their strings and some characters are “eaten alive”. Tomas čt 11. 4. 2019 v 8:26 odesílatel Tomas Kalibera napsal: On 4/10/19 6:32 PM, Jeroen Ooms wrote

Re: [Rd] R 3.5.3 and 3.6.0 alpha Windows bug: UTF-8 characters in code are simplified to wrong ones

2019-04-10 Thread Tomas Kalibera
On 4/10/19 1:14 PM, Jeroen Ooms wrote: On Wed, Apr 10, 2019 at 12:19 PM Tomáš Bořil wrote: Minimalistic example: Let's type "ř" (LATIN SMALL LETTER R WITH CARON) in RGui console: "ř" [1] "r" Although the script is in UTF-8, the characters are replaced by "simplified" substitutes

Re: [Rd] Parsing code with newlines

2019-04-10 Thread Tomas Kalibera
On 4/5/19 8:14 AM, Mikhail Titov wrote: Hello! This is my first post here. I came across the very same problem. It can be reproduced within modified tests/Embedding/RParseEval.c Please check https://www.r-project.org/posting-guide.html and update your post if you still need to get help here

Re: [Rd] R 3.5.3 and 3.6.0 alpha Windows bug: UTF-8 characters in code are simplified to wrong ones

2019-04-10 Thread Tomas Kalibera
On 4/10/19 10:22 AM, Tomáš Bořil wrote: > Hello, > > There is a long-lasting problem with processing UTF-8 source code in R > on Windows OS. As Windows does not have "UTF-8" locale and R passes > source code through OS before executing it, some characters are > "simplified" by the OS before

Re: [Rd] SUGGESTION: Settings to disable forked processing in R, e.g. parallel::mclapply()

2019-04-15 Thread Tomas Kalibera
On 4/13/19 12:05 PM, Iñaki Ucar wrote: On Sat, 13 Apr 2019 at 03:51, Kevin Ushey wrote: I think it's worth saying that mclapply() works as documented Mostly, yes. But it says nothing about fork's copy-on-write and memory overcommitment, and that this means that it may work nicely or fail

Re: [Rd] SUGGESTION: Settings to disable forked processing in R, e.g. parallel::mclapply()

2019-04-15 Thread Tomas Kalibera
On 4/15/19 11:02 AM, Iñaki Ucar wrote: On Mon, 15 Apr 2019 at 08:44, Tomas Kalibera wrote: On 4/13/19 12:05 PM, Iñaki Ucar wrote: On Sat, 13 Apr 2019 at 03:51, Kevin Ushey wrote: I think it's worth saying that mclapply() works as documented Mostly, yes. But it says nothing about fork's

Re: [Rd] Exit status of Rscript when setting options(error=utils::recover)

2019-03-15 Thread Tomas Kalibera
Please refer to the documentation (?stop, ?recover, ?dump.frames). In non-interactive use, recover() works as dump.frames(). dump.frames() is documented not to quit R, and the examples show how to quit the R session with a given status automatically after dump.frames(). So in line with the

Re: [Rd] MacOS parallel::makeCluster fails

2019-06-05 Thread Tomas Kalibera
Hi Dominik, from the output, the master process could not "listen" on the port where it expects a connection from the worker. We need to find out why. I'd recommend first to create a minimal reproducible example (and one that does not use future, only parallel, and a minimal number of

Re: [Rd] Setting LC_CTYPE=en_US.UTF-8 failed

2019-06-06 Thread Tomas Kalibera
On 6/5/19 3:49 AM, Steven Penny wrote: Using this in my "~/.profile":    export LC_ALL=en_US.UTF-8 Yields this:    $ Rscript -e 'print(9)'    During startup - Warning message:    Setting LC_CTYPE=en_US.UTF-8 failed    [1] 9 This is confusing as the exact same environment works fine with

Re: [Rd] Setting LC_CTYPE=en_US.UTF-8 failed

2019-06-06 Thread Tomas Kalibera
On 6/6/19 3:24 PM, Duncan Murdoch wrote: On 06/06/2019 7:28 a.m., Duncan Murdoch wrote: On 06/06/2019 6:22 a.m., Tomas Kalibera wrote: On 6/5/19 3:49 AM, Steven Penny wrote: Using this in my "~/.profile":      export LC_ALL=en_US.UTF-8 Yields this:      $ Rscript -e 'print(9)'  

Re: [Rd] Possible bug when finding shared libraries during staged installation

2019-05-27 Thread Tomas Kalibera
> for looking into this! > > On Fri, May 24, 2019 at 5:58 AM Tomas Kalibera > <mailto:tomas.kalib...@gmail.com>> wrote: > > On 5/24/19 2:52 PM, Martin Maechler wrote: > >>>>>> Kara Woo > >>>>>>      on Thu, 23 May 2019 14

Re: [Rd] Staged installation fail on some file systems

2019-05-09 Thread Tomas Kalibera
On 5/7/19 6:18 PM, Henrik Bengtsson wrote: On Tue, May 7, 2019 at 9:05 AM Tomas Kalibera wrote: Thanks for the report. According to my reading, this use of "mv" is ok and the renameat2() call which the invocation of "mv" leads to is also ok and allowed by POSIX in this co

Re: [Rd] read.table() fails with https in R 3.6 but not in R 3.5

2019-05-13 Thread Tomas Kalibera
On 5/6/19 2:27 PM, Stephen Berman wrote: On Mon, 6 May 2019 11:12:25 +0200 Ralf Stubner wrote: On 04.05.19 19:04, Stephen Berman wrote: In versions of R prior to 3.6.0 the following invocation succeeds, returning the data frame shown:

Re: [Rd] Possible bug when finding shared libraries during staged installation

2019-05-24 Thread Tomas Kalibera
On 5/24/19 2:52 PM, Martin Maechler wrote: Kara Woo on Thu, 23 May 2019 14:24:26 -0700 writes: > Hi all, > With the new staged installation, it seems that R CMD INSTALL sometimes > fails on macOS due to these lines [1] when sapply() returns a list. The > x13binary

Re: [Rd] Race condition on parallel package's mcexit and rmChild

2019-05-20 Thread Tomas Kalibera
This issue has already been addressed in 76462 (R-devel) and also ported to R-patched. In fact rmChild() is used in mccollect(wait=FALSE). Best Tomas On 5/19/19 11:39 AM, Sun Yijiang wrote: I've been hacking with parallel package for some time and built a parallel processing framework with

Re: [Rd] Staged installation fail on some file systems

2019-05-07 Thread Tomas Kalibera
Thanks for the report.  According to my reading, this use of "mv" is ok and the renameat2() call which the invocation of "mv" leads to is also ok and allowed by POSIX in this context. It could only fail with EEXIST if the target directory (path/pkg) was not empty. So far I've not been able to

Re: [Rd] R problems with lapack with gfortran

2019-05-06 Thread Tomas Kalibera
On 5/4/19 6:49 PM, Steve Kargl wrote: On Sat, May 04, 2019 at 06:42:47PM +0200, Thomas König wrote: - figure out Fortran2003 specification for C/Fortran interoperability -- this _sounds_ like the right solution, but I don't think many understand how to use it and what is implied (in particular,

Re: [Rd] mccollect with NULL in R 3.6

2019-05-02 Thread Tomas Kalibera
On 5/1/19 12:25 AM, Gergely Daróczi wrote: Dear All, I'm running into issues with calling mccollect on a list containing NULL using R 3.6 (this used to work in 3.5.3): jobs <- lapply( list(NULL, 'foobar'), function(x) mcparallel(identity(x))) mccollect(jobs, wait = FALSE, timeout =

Re: [Rd] mccollect with NULL in R 3.6

2019-05-03 Thread Tomas Kalibera
On 5/3/19 3:04 PM, Gergely Daróczi wrote: On Thu, May 2, 2019 at 7:24 PM Tomas Kalibera wrote: On 5/1/19 12:25 AM, Gergely Daróczi wrote: Dear All, I'm running into issues with calling mccollect on a list containing NULL using R 3.6 (this used to work in 3.5.3): jobs <- lapply( l

Re: [Rd] configure script issue with -flto with recent gcc and system ar/ranlib

2019-04-26 Thread Tomas Kalibera
On 4/25/19 6:11 PM, Thomas König wrote: Hi Tomas, On 4/23/19 2:59 PM, Thomas König wrote: Hi, there can be an issue with recent gcc where the system-installed "ar" and "ranlib" commands cannot handle LTO binaries.  On compilation, this manifests itself with error messages claiming that they

Re: [Rd] Use of C++ in Packages

2019-04-25 Thread Tomas Kalibera
and it can also reveal other problems. Tomas > > Regards > > Hugh > > On Mon, Apr 1, 2019 at 6:23 PM Tomas Kalibera > <mailto:tomas.kalib...@gmail.com>> wrote: > > On 3/30/19 8:59 AM, Romain Francois wrote: > > tl;dr: we need better C++ tools and do
