Re: [Rd] Unicode characters in ISO8859-15 locale

2020-12-18 Thread Prof Brian Ripley

On 17/12/2020 12:28, Jeroen Ooms wrote:

> The hunspell package uses the code below to replace curly quotes (aka

Officially these are directional (or right/left) quotes.

> fancyquotes) with the regular ASCII quotes that are needed for spell
> checking:
>
>   chartr("\u2019", "'", input)
>
> As of last week this stopped working on CRAN on the Linux server that
> runs in an ISO8859-15 locale. From the error message, it seems that R no
> longer parses the escaped Unicode string, which gets turned into
> "<U+2019>".

You need to distinguish the string from what gets printed.  I see (in
the CRAN log from a 2-day-old R)

  Error in chartr("<U+2019>", "'", as.character(add_words)) :

which is what I would expect to be printed in that locale:

  x <- "\u2019"
  Encoding(x)
  [1] "UTF-8"
  x
  [1] "<U+2019>"

> Is this expected?

Changes are expected, as this is an area being worked on (to try to
remove some system-dependent behaviour).  However, I cannot reproduce
this easily:

  chartr("\u2019", "'", "abc\u2019")
  [1] "abc'"

As I say, work in progress.
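
The difference between a string and its printed form can be checked directly. The following is only a minimal sketch, not code from the thread: it inspects the contents of the string with base R functions that do not depend on how print() renders it in the current locale. The variable `input` and the gsub() call are illustrative assumptions (a possible alternative to chartr() for this single replacement), not what hunspell actually does.

```r
x <- "\u2019"                  # RIGHT SINGLE QUOTATION MARK

# What the string contains, independent of how print() renders it:
Encoding(x)                    # declared encoding of the string
utf8ToInt(x)                   # code point 8217 (0x2019)
charToRaw(x)                   # UTF-8 bytes e2 80 99

# Assumption: a fixed-string substitution as an alternative to chartr().
input <- "abc\u2019"           # hypothetical input
gsub("\u2019", "'", input, fixed = TRUE)
```

If the code point and the bytes are as expected, the string itself is intact and only its printed representation differs between locales.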

--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel crash

2020-12-18 Thread Prof Brian Ripley

On 18/12/2020 13:39, Gábor Csárdi wrote:

> FYI.

tolower, toupper and chartr with non-native characters are work in progress.
(They were getting things wrong in a platform-dependent way, and it
seems the replacement code is also flaky, if in general better than what
went before.)
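
To judge whether a given call might reach this code path, one can check whether a string is marked as UTF-8 while the session is not running in a UTF-8 locale. This is a rough sketch under that narrow reading of "non-native"; the helper name `utf8_in_non_utf8_locale` is hypothetical and not part of R.

```r
# Hypothetical helper: TRUE for elements that are marked as UTF-8 while the
# session itself is not running in a UTF-8 locale.
utf8_in_non_utf8_locale <- function(x) {
  Encoding(x) == "UTF-8" & !isTRUE(l10n_info()[["UTF-8"]])
}

ben <- "\u098F\u099F\u09BF"    # first word of the string in the crash report
utf8_in_non_utf8_locale(c("IBM", ben))
```

ASCII-only strings such as "IBM" are never marked as UTF-8, so the first element is FALSE in any locale; the second depends on the locale of the session.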





--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford



[Rd] R-devel crash

2020-12-18 Thread Gábor Csárdi
FYI.

> sessionInfo()
R Under development (unstable) (2020-12-17 r79645)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux bullseye/sid

Matrix products: default
BLAS:   /opt/R-devel/lib/R/lib/libRblas.so
LAPACK: /opt/R-devel/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.iso885915       LC_NUMERIC=C
 [3] LC_TIME=en_US.iso885915        LC_COLLATE=en_US.iso885915
 [5] LC_MONETARY=en_US.iso885915    LC_MESSAGES=en_US.iso885915
 [7] LC_PAPER=en_US.iso885915       LC_NAME=C
 [9] LC_ADDRESS=C                   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.1.0


## Run this:

fun <- function() {
  ben <- paste0(
    "\u098F\u099F\u09BF \u098F\u0995\u099F\u09BF ",
    "\u09AD\u09BE\u09B7\u09BE \u098F\u0995\u0995 IBM ",
    "\u09B8\u09CD\u0995\u09CD\u09B0\u09BF\u09AA\u09CD\u099F"
  )
  tolower(ben)
}
fun()

## To crash:

free(): invalid next size (fast)

Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) fun()
Undefined function command: "()".  Try "help function".
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x7fe471fb3537 in __GI_abort () at abort.c:79
#2  0x7fe47200c708 in __libc_message (action=action@entry=do_abort,
    fmt=fmt@entry=0x7fe47211ae31 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3  0x7fe4720139fa in malloc_printerr
    (str=str@entry=0x7fe47211d180 "free(): invalid next size (fast)")
    at malloc.c:5347
#4  0x7fe472014c34 in _int_free (av=0x7fe47214cb80 <main_arena>,
    p=0x2696a30, have_lock=0)
    at malloc.c:4249
#5  0x7fe47245c090 in R_chk_free () from /opt/R-devel/lib/R/lib/libR.so
#6  0x7fe47235c8cc in do_tolower () from /opt/R-devel/lib/R/lib/libR.so
#7  0x7fe4723f466e in bcEval () from /opt/R-devel/lib/R/lib/libR.so
#8  0x7fe4723ed20d in Rf_eval () from /opt/R-devel/lib/R/lib/libR.so
#9  0x7fe472409111 in R_execClosure () from /opt/R-devel/lib/R/lib/libR.so
#10 0x7fe472408cce in Rf_applyClosure () from /opt/R-devel/lib/R/lib/libR.so
#11 0x7fe4723ed884 in Rf_eval () from /opt/R-devel/lib/R/lib/libR.so
#12 0x7fe47240f6b5 in do_begin () from /opt/R-devel/lib/R/lib/libR.so
#13 0x7fe4723ed5cb in Rf_eval () from /opt/R-devel/lib/R/lib/libR.so
#14 0x7fe472409111 in R_execClosure () from /opt/R-devel/lib/R/lib/libR.so
#15 0x7fe472408cce in Rf_applyClosure () from /opt/R-devel/lib/R/lib/libR.so
#16 0x7fe4723ed884 in Rf_eval () from /opt/R-devel/lib/R/lib/libR.so
#17 0x7fe472443550 in Rf_ReplIteration () from /opt/R-devel/lib/R/lib/libR.so
#18 0x7fe472445766 in R_ReplConsole () from /opt/R-devel/lib/R/lib/libR.so
#19 0x7fe4724456bd in run_Rmainloop () from /opt/R-devel/lib/R/lib/libR.so
#20 0x7fe4724457fe in Rf_mainloop () from /opt/R-devel/lib/R/lib/libR.so
#21 0x00401177 in main ()
#22 0x7fe471fb4d0a in __libc_start_main (main=0x401140 <main>,
    argc=1, argv=0x7ffc058512d8,
    init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc058512c8)
    at ../csu/libc-start.c:308
#23 0x0040107a in _start ()



[Rd] motivation behind the fact that printing R vectors take equal (maximum one) amount of lines for each element

2020-12-18 Thread gaurav arora
What is the motivation for printing every element of an R vector in the
same number of lines (that of the longest element)?


The vector x below has a long string at index 2 and a short string at index
1. When x is printed, even the entry at index 1 takes two lines. In other
words, the element at index 2 is printed after an empty line. When we run a
command via system() with the intern argument, the output lines may vary in
length, and the printed result is then clumsy, because the maximum number of
lines is used for every element of the vector.

x = c("111")
x = c("2", x)
x
#Output
[1] "2"

[2] 
"111"

Why would printing with some delimiter, such as a comma or a space, not
suffice? Similar arguments may apply to a vector containing some large and
some small numbers, which also takes more space to print.

-- 
Gaurav Arora
