Hi! I'm trying to read individual files from a ZIP archive, using the unz() function. Some of the files contain non-ASCII characters and I'd like to avoid unpacking them into a temporary directory.
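
To make this concrete, here is a minimal sketch of the kind of call involved. The archive and member names are made up for illustration, and the setup assumes an R version that provides utils::zip() and a zip program it can call. The last step, reading the member without an encoding argument and converting with iconv() afterwards, is only one possible way to get at the characters, not something the documentation prescribes.

```r
# Build a small ZIP archive holding one Latin-1 encoded text file.
# All names below are hypothetical; assumes a `zip` executable is available.
dir <- tempfile("unzdemo")
dir.create(dir)
member  <- "demo.txt"
zipfile <- file.path(dir, "demo.zip")
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)),   # "cafe" with e-acute, as Latin-1 bytes
         file.path(dir, member))
utils::zip(zipfile, file.path(dir, member), flags = "-jq")  # -j: store without the path

# The call in question: read the member straight from the archive,
# asking the connection to re-encode from Latin-1.
con <- unz(zipfile, member, encoding = "latin1")
as_read <- readLines(con)
close(con)

# Possible workaround: read the bytes as they are, then convert explicitly.
con <- unz(zipfile, member)
bytes <- readLines(con)
close(con)
fixed <- iconv(bytes, from = "latin1", to = "UTF-8")
```

Comparing as_read with fixed (for example via charToRaw()) shows whether the connection actually applied the requested encoding.
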

My problem is that unz() seems to ignore the encoding="latin1" option I need to read the non-ASCII characters properly. I can't find a clear indication in the documentation that this is expected behaviour, except for the remark that "unz reads (only) single files within zip files, in binary mode" (and a short comment further below that re-encoding only works for text connections).

Digging a bit in the source code, the ultimate cause seems to be this line in the unz_open() C-level function, on line 359 of src/main/dounzip.c:

> /* set_iconv(); not yet */

Any ideas why this is commented out? The previous lines set up con->text appropriately and con->encname was set by do_unz(), so I don't see an obvious reason why the iconv layer can't be added.

I'm working on 2.11.1

>                _
> platform       i386-apple-darwin9.8.0
> arch           i386
> os             darwin9.8.0
> system         i386, darwin9.8.0
> status
> major          2
> minor          11.1
> year           2010
> month          05
> day            31
> svn rev        52157
> language       R
> version.string R version 2.11.1 (2010-05-31)

but have been looking at the current R-devel source code, so I suspect my problem won't just go away with the next release.

Best regards,
Stefan Evert

[ stefan.ev...@uos.de | http://purl.org/stefan.evert ]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.