Bug:

Using read.table(file, encoding="UTF-8") to import a UTF-8 encoded file containing the infinity symbol ('∞') results in the infinity symbol being imported as the number 8. Other Unicode characters seem unaffected, for example Zhe: ж

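To check that the imported value really is the ASCII digit 8 and not just a display artifact, its code points can be inspected. The sketch below is only an illustration: code_points() is a hypothetical helper, not part of base R, and the result of the final read.table() call depends on the platform and locale, so no output is claimed for it.

```r
# Hypothetical helper (not part of base R): return the Unicode code
# points of a single value, converting it to a UTF-8 string first.
code_points <- function(x) utf8ToInt(enc2utf8(as.character(x)))

code_points("\u221e")  # infinity symbol, U+221E -> 8734
code_points("8")       # ASCII digit eight, U+0038 -> 56
code_points("\u0436")  # Cyrillic zhe, U+0436 -> 1078

# Applied to the call from this report. On an affected setup the value
# would be expected to come back as 56 (the digit 8); elsewhere the
# result depends on the platform and locale.
df <- read.table(text = "\u221e", encoding = "UTF-8",
                 stringsAsFactors = FALSE)
imported <- code_points(df$V1[1])
```

A result of 56 for `imported` would confirm that the symbol was replaced by a plain digit before type conversion, while 8734 would mean the symbol survived the import.
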
Expected Behavior:

The imported data.frame should represent the infinity symbol as 'Inf' so that normal mathematical operations can be performed on it.

Stack Overflow Post:

I created a question on Stack Overflow, where one other member was able to reproduce the same issue I was having. The question can be found at:
https://stackoverflow.com/questions/54522196/r-read-table-with-utf-8-encoded-file-reads-infinity-symbol-as-8-int

Method to Reproduce - 1:

A simple method to reproduce this issue is to use RStudio. In the console, type the following:

> read.table(text=" ∞", encoding="UTF-8")

The result is a data.frame with a single value of '8'. Repeating the same with ж results in the correct, expected behavior.

Method to Reproduce - 2:

Create a .csv file containing the infinity and Zhe characters (I have attached the file for convenience; hopefully it is not rejected by your email service).

Launch an interactive session using:

> r --vanilla

Enter the following statement, taking care to replace <path-to-file> with the appropriate path:

> read.table("<path-to-file>/unicode_chars.csv", sep=",", encoding="UTF-8")

This results in a two-element data.frame: the first element is the incorrect value 8 with <U+FEFF> prefixed to it, and the second is the correct value, Zhe.

The <U+FEFF> prefix appears to be a hidden character (a byte order mark) that tells editors which encoding the file uses. The following link has some explanation; however, it states that this is caused by Excel, whereas I created the file using Notepad, not Excel.
https://medium.freecodecamp.org/a-quick-tale-about-feff-the-invisible-character-cd25cd4630e7

System Details:

OS:
> Windows 10.0.17134 Build 17134

R Version:
> platform       x86_64-w64-mingw32
> arch           x86_64
> os             mingw32
> system         x86_64, mingw32
> status
> major          3
> minor          4.1
> year           2017
> month          06
> day            30
> svn rev        72865
> language       R
> version.string R version 3.4.1 (2017-06-30)
> nickname       Single Candle

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel