Re: [Rd] Crash/bug when calling match on latin1 strings

2021-10-11 Thread Martin Maechler
> Rui Barradas 
> on Mon, 11 Oct 2021 07:41:51 +0100 writes:

> Hello,

> R 4.1.1 on Ubuntu 20.04.

> I can reproduce this error, though not ~90% of the time; only the 1st time I
> run the script.
> If I run other (terminal) commands before rerunning the R script, it
> sometimes segfaults again, but once again far less often than 90% of the time.


> rui@rui:~/tmp$ R -q -f rhelp.R
>> sessionInfo()
> R version 4.1.1 (2021-08-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 20.04.3 LTS

> Matrix products: default
> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

> locale:
>  [1] LC_CTYPE=pt_PT.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=pt_PT.UTF-8        LC_COLLATE=pt_PT.UTF-8
>  [5] LC_MONETARY=pt_PT.UTF-8    LC_MESSAGES=pt_PT.UTF-8
>  [7] LC_PAPER=pt_PT.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C

> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base

> loaded via a namespace (and not attached):
> [1] compiler_4.1.1
>> 
>> # A bunch of words in UTF8; replace *'s
>> words <- readLines("h://pastebin.c**/raw/MFCQfhpY", encoding = "UTF-8")
>> words2 <- iconv(words, "utf-8", "latin1")
>> gctorture(TRUE)
>> y <- match(words2, words2)

> *** caught segfault ***
> address 0x10, cause 'memory not mapped'
> *** recursive gc invocation
> *** recursive gc invocation
> *** recursive gc invocation
> *** recursive gc invocation
> *** recursive gc invocation
> *** recursive gc invocation
> *** recursive gc invocation
> *** recursive gc invocation
> *** recursive gc invocation
> *** recursive gc invocation

> Traceback:
> 1: match(words2, words2)
> An irrecoverable exception occurred. R is aborting now ...
> Falta de segmentação (núcleo despejado)



> This last line is Portuguese for

> Segmentation fault (core dumped)

> Hope this helps,

Yes, it does, thank you!

I can confirm the problem:  Only in R 4.1.0 and newer, and
including current "R-patched" and "R-devel" versions.

I've now turned this into a formal R bug report on R's bugzilla,
and (slightly) extended your (Travers') example into a
self-contained (no internet access) R script.

Bugzilla PR#18211 :" match() memory corruption "

  https://bugs.r-project.org/show_bug.cgi?id=18211

  with attachment 2929
  --> https://bugs.r-project.org/attachment.cgi?id=2929&action=edit
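
For readers without access to the attachment, here is a minimal sketch of
what such a self-contained script might look like. It is not the content of
attachment 2929: the word list below is made up for illustration, and
whether a vector this short triggers the crash on affected versions is
untested. On an affected build the match() call may segfault instead of
returning.

```r
# Hypothetical word list (an assumption); the original pastebin data is
# not reproduced here. \u escapes give UTF-8 strings without needing
# internet access or a particular source-file encoding.
words <- c("caf\u00e9", "na\u00efve", "fa\u00e7ade", "ma\u00f1ana",
           "\u00fcber", "cr\u00e8me", "plain", "ascii")
words <- rep(words, times = 50)      # lengthen the vector a little

# Re-encode from UTF-8 to latin1, as in the report
words2 <- iconv(words, "UTF-8", "latin1")
print(table(Encoding(words2)))       # non-ASCII entries are marked "latin1"

gctorture(TRUE)                      # provoke a GC at (almost) every allocation
y <- match(words2, words2)
gctorture(FALSE)                     # always switch torture mode off again

# On an unaffected build, each word matches its first occurrence
stopifnot(identical(y, rep(1:8, times = 50)))
```

gctorture(TRUE) makes R very slow, so it is switched off again right after
the call under test.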

==> please if possible follow up on bugzilla

Thanks again to you both!
Martin Maechler


> Rui Barradas

> At 06:05 on 11/10/21, Travers Ching wrote:
>> Here's a brief example:
>> 
>> # A bunch of words in UTF8; replace *'s
>> words <- readLines("h://pastebin.c**/raw/MFCQfhpY", encoding = "UTF-8")
>> words2 <- iconv(words, "utf-8", "latin1")
>> gctorture(TRUE)
>> y <- match(words2, words2)
>> 
>> 
>> I searched bugzilla but didn't see anything. Apologies if this is already
>> reported.
>> 
>> The bug appears in both R-devel and the release, but doesn't seem to affect
>> R 4.0.5.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Crash/bug when calling match on latin1 strings

2021-10-11 Thread Rui Barradas

Hello,

R 4.1.1 on Ubuntu 20.04.

I can reproduce this error, though not ~90% of the time; only the 1st time I
run the script.
If I run other (terminal) commands before rerunning the R script, it
sometimes segfaults again, but once again far less often than 90% of the time.



rui@rui:~/tmp$ R -q -f rhelp.R
> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=pt_PT.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=pt_PT.UTF-8        LC_COLLATE=pt_PT.UTF-8
 [5] LC_MONETARY=pt_PT.UTF-8    LC_MESSAGES=pt_PT.UTF-8
 [7] LC_PAPER=pt_PT.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.1.1
>
> # A bunch of words in UTF8; replace *'s
> words <- readLines("h://pastebin.c**/raw/MFCQfhpY", encoding = "UTF-8")
> words2 <- iconv(words, "utf-8", "latin1")
> gctorture(TRUE)
> y <- match(words2, words2)

 *** caught segfault ***
address 0x10, cause 'memory not mapped'
*** recursive gc invocation
*** recursive gc invocation
*** recursive gc invocation
*** recursive gc invocation
*** recursive gc invocation
*** recursive gc invocation
*** recursive gc invocation
*** recursive gc invocation
*** recursive gc invocation
*** recursive gc invocation

Traceback:
 1: match(words2, words2)
An irrecoverable exception occurred. R is aborting now ...
Falta de segmentação (núcleo despejado)



This last line is Portuguese for

Segmentation fault (core dumped)

Hope this helps,

Rui Barradas


At 06:05 on 11/10/21, Travers Ching wrote:

Here's a brief example:

# A bunch of words in UTF8; replace *'s
words <- readLines("h://pastebin.c**/raw/MFCQfhpY", encoding = "UTF-8")
words2 <- iconv(words, "utf-8", "latin1")
gctorture(TRUE)
y <- match(words2, words2)


I searched bugzilla but didn't see anything. Apologies if this is already
reported.

The bug appears in both R-devel and the release, but doesn't seem to affect
R 4.0.5.



[Rd] Crash/bug when calling match on latin1 strings

2021-10-10 Thread Travers Ching
Here's a brief example:

# A bunch of words in UTF8; replace *'s
words <- readLines("h://pastebin.c**/raw/MFCQfhpY", encoding = "UTF-8")
words2 <- iconv(words, "utf-8", "latin1")
gctorture(TRUE)
y <- match(words2, words2)


I searched bugzilla but didn't see anything. Apologies if this is already
reported.

The bug appears in both R-devel and the release, but doesn't seem to affect
R 4.0.5.
