This doesn't seem to happen on macOS, in either Terminal or RStudio (R 3.5.1, R-devel, R-patched), so it is probably Windows-specific.
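For anyone who wants to compare platforms, here is a minimal diagnostic sketch (not part of the original report). It prints the session's locale and the raw bytes of whatever read.table() returns for the infinity symbol. The variable names are my own, and the symbol is written as the escape "\u221e" so that the script itself stays ASCII.

```r
# Diagnostic sketch: inspect the session's native encoding and the bytes
# that read.table() hands back for the infinity symbol (U+221E).
info <- l10n_info()                 # reports MBCS / UTF-8 / Latin-1 status
print(Sys.getlocale("LC_CTYPE"))    # the character-type locale in use
print(info[["UTF-8"]])              # TRUE when the session runs in a UTF-8 locale

inf_sym <- "\u221e"                 # infinity symbol, written as an escape
df <- read.table(text = inf_sym, encoding = "UTF-8",
                 stringsAsFactors = FALSE)
str(df)

# Raw bytes of the single value that was read back.
bytes <- charToRaw(as.character(df[[1]]))
print(bytes)
```

In a UTF-8 session I would expect the bytes e2 88 9e (the UTF-8 encoding of U+221E); a single byte 38 (the digit 8) would indicate the substitution described in the report.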
-pd

> On 7 Feb 2019, at 11:17 , David Byrne <david.byrne...@gmail.com> wrote:
>
> Bug
> Using read.table(file, encoding="UTF-8") to import a UTF-8 encoded
> file containing the infinity symbol (' ∞ ') results in the infinity
> symbol imported as the number 8. Other Unicode characters seem
> unaffected, for example, Zhe: ж
>
> Expected Behavior:
> The imported data.frame should represent the infinity symbol as the
> expected 'Inf' so that normal mathematical operations can be processed
>
> Stack Overflow Post:
> I created a question on Stack Overflow where one other member was able
> to reproduce the same issues I was having. This question can be found
> at:
> https://stackoverflow.com/questions/54522196/r-read-table-with-utf-8-encoded-file-reads-infinity-symbol-as-8-int
>
> Method to Reproduce - 1:
> A simple method to reproduce this issue is to use R-Studio: In the
> console, type the following:
>> read.table(text=" ∞", encoding="UTF-8")
>
> The result should be a data.frame with a single value of '8'
>
> Repeating the same with ж results in the correct, expected behavior
>
> Method to Reproduce - 2:
> Create a .csv file containing the infinity and Zhe characters (I have
> attached the file for convenience, hopefully it is not rejected by your
> email service). Launch an interactive session using
>
>> r --vanilla
>
> Enter the following statement taking care to replace the
> <path-to-file> with the appropriate one:
>
>> read.table("<path-to-file>/unicode_chars.csv", sep=",", encoding="UTF-8")
>
>
> This should result in a two element data.frame; the first being the
> incorrect value of 8 with an additional <U+FEFF> and the second the
> correct value of Zhe.
>
> Note the additional <U+FEFF> prefixed to the front of the '8'. This
> appears to be a hidden character for the purposes of letting editors
> know the encoding. The following link has some explanation however, it
> states this is caused by excel. The file I created was done so using
> notepad and not Excel.
>
> https://medium.freecodecamp.org/a-quick-tale-about-feff-the-invisible-character-cd25cd4630e7
>
> System Details:
> OS:
>> Windows 10.0.17134 Build 17134
>
>
> R Version:
>> platform       x86_64-w64-mingw32
>> arch           x86_64
>> os             mingw32
>> system         x86_64, mingw32
>> status
>> major          3
>> minor          4.1
>> year           2017
>> month          06
>> day            30
>> svn rev        72865
>> language       R
>> version.string R version 3.4.1 (2017-06-30)
>> nickname       Single Candle
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd....@cbs.dk  Priv: pda...@gmail.com
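As a possible stopgap for the second reproduction method quoted above, here is a hedged sketch that strips a leading byte-order mark and maps the infinity symbol to the ASCII string "Inf" before the text reaches read.table(). It is a workaround under my own assumptions, not a fix for read.table() itself: the temporary file merely stands in for unicode_chars.csv, and its bytes (a BOM, then the infinity symbol, a comma, Zhe and a newline) are my guess at what Notepad wrote.

```r
# Workaround sketch: pre-process the lines so read.table() only has to
# parse the ASCII token "Inf" instead of the infinity symbol.
path <- tempfile(fileext = ".csv")   # hypothetical stand-in for unicode_chars.csv

# Write the assumed file contents as raw UTF-8 bytes:
# BOM (EF BB BF), U+221E (E2 88 9E), ",", U+0436 (D0 B6), newline.
writeBin(as.raw(c(0xEF, 0xBB, 0xBF,
                  0xE2, 0x88, 0x9E,
                  0x2C,
                  0xD0, 0xB6,
                  0x0A)), path)

# Read the bytes as-is and mark the resulting strings as UTF-8.
lines <- readLines(path, encoding = "UTF-8")

# Drop a leading BOM (U+FEFF) if one is still present.
lines <- sub("^\ufeff", "", lines)

# Replace the infinity symbol with the literal that R parses as Inf.
lines <- gsub("\u221e", "Inf", lines, fixed = TRUE)

df <- read.table(text = lines, sep = ",", stringsAsFactors = FALSE)
str(df)

unlink(path)                         # remove the temporary file
```

The first column should then be numeric and hold Inf. How the second column (Zhe) is displayed still depends on the session's native encoding.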