On Mon, 2 Dec 2019 10:57:51 -0300 Rafael Pereira <rafa.pereira...@gmail.com> wrote:
> checking data for non-ASCII characters ... NOTE
>   Note: found 58 marked Latin-1 strings
>
> I have used to code below to identify my scripts that have strings
> using non-ASCII characters.

I don't think it's about non-ASCII in source code; it's about Latin-1
strings in the package data:

    git clone https://github.com/ipeaGIT/geobr
    cd geobr/data
    R

    load('grid_state_correspondence_table.RData')
    sum(
      unlist(
        lapply(grid_state_correspondence_table, Encoding)
      ) == 'latin1'
    )
    # [1] 58

What I'm not sure of is *how* this NOTE should be fixed. "Writing R
extensions" §1.6.3 provides advice on UTF-8 strings in the R code, not
data; §5.15 only says that strings *could* be marked as Latin-1 or
UTF-8 (but doesn't say what *should* be done); finally,
tools:::.check_package_datasets seems to produce NOTEs about Latin-1,
UTF-8 and strings marked as bytes.

--
Best regards,
Ivan

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
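
Since the check appears to report Latin-1, UTF-8 and bytes marks separately, it may help to tally every kind of mark at once rather than only 'latin1'. The following is a minimal sketch, not a fix for the NOTE: the data frame `tbl` and the helper `count_marked_encodings` are made up for illustration and are not part of geobr or of the tools package. It also looks at factor levels, on the assumption that a dataset may store its strings there.

```r
# Hypothetical stand-in for a package dataset (not the geobr table).
name <- c("S\xe3o Paulo", "Acre", "Par\xe1")
Encoding(name) <- "latin1"  # pure-ASCII strings stay "unknown"
tbl <- data.frame(
  name  = name,
  state = factor(c("SP", "AC", "PA")),
  code  = c(35, 12, 15),
  stringsAsFactors = FALSE
)

# Hypothetical helper: tabulate Encoding() marks over character columns
# and factor levels. Other column types are skipped, because Encoding()
# only accepts character vectors.
count_marked_encodings <- function(df) {
  marks <- unlist(lapply(df, function(col) {
    if (is.factor(col)) col <- levels(col)
    if (is.character(col)) Encoding(col) else character(0)
  }), use.names = FALSE)
  table(factor(marks, levels = c("unknown", "latin1", "UTF-8", "bytes")))
}

counts <- count_marked_encodings(tbl)
print(counts)
```

Calling `count_marked_encodings(grid_state_correspondence_table)` after the `load()` above should then show how many strings carry each mark, which at least tells you which of the possible NOTEs a dataset could trigger.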