William Dunlap wrote:
> You may have to use
>    (unsigned int)(unsigned char)*s++
> instead of just
>    (unsigned int)*s++
> to avoid the sign extension.
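For the archive, a minimal C sketch of the sign-extension point above. This is not the actual R patch: widen_direct, widen_via_uchar and is_blank_single_byte are hypothetical names used only for illustration, and the loop covers only the single-byte (else) branch of isBlankString(). It relies on the C rule that the argument of isspace() must be representable as unsigned char or be EOF.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: the direct cast. Where plain char is signed,
   a byte such as '\247' is negative and gets sign-extended. */
unsigned int widen_direct(char c)
{
    return (unsigned int)c;
}

/* Hypothetical helper: go through unsigned char first, so the result
   always lies in 0..UCHAR_MAX regardless of the signedness of char. */
unsigned int widen_via_uchar(char c)
{
    return (unsigned int)(unsigned char)c;
}

/* Sketch of the single-byte branch only (assumption: no multibyte locale).
   Returns 1 if s consists solely of white space, 0 otherwise. */
int is_blank_single_byte(const char *s)
{
    assert(s != NULL);
    while (*s)
        if (!isspace((unsigned char)*s++)) return 0;
    return 1;
}
```

With 8-bit bytes, widen_via_uchar('\247') is 0xA7 on every platform, whereas the value of widen_direct('\247') depends on whether plain char is signed. In the default "C" locale, is_blank_single_byte("\247") returns 0.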
Thanks again, I probably won't be doing the change since I don't have a
Windows build environment around, and I'm a bit superstitious about
fixing bugs that I cannot see... Let me just filter this information
into the bug repository for now.

    -pd

>
> Bill Dunlap
> TIBCO Software Inc - Spotfire Division
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk]
>> Sent: Friday, April 10, 2009 1:41 PM
>> To: William Dunlap
>> Cc: r-devel@r-project.org
>> Subject: Re: [Rd] type.convert (PR#13646)
>>
>> William Dunlap wrote:
>>> I can reproduce the difference that Stefan saw, depending
>>> on whether or not I start Rgui with the flags
>>>    --no-environ --no-Rconsole
>>> I think it boils down to the isBlankString() function.
>>> For the string "\247" it returns 1 when those flags are
>>> not present and 0 when they are.  isBlankString does use
>>> some locale-specific functions:
>>>    Rboolean isBlankString(const char *s)
>>>    {
>>>    #ifdef SUPPORT_MBCS
>>>        if(mbcslocale) {
>>>            wchar_t wc; int used; mbstate_t mb_st;
>>>            mbs_init(&mb_st);
>>>            while( (used = Mbrtowc(&wc, s, MB_CUR_MAX, &mb_st)) ) {
>>>                if(!iswspace(wc)) return FALSE;
>>>                s += used;
>>>            }
>>>        } else
>>>    #endif
>>>            while (*s)
>>>                if (!isspace((int)*s++)) return FALSE;
>>>        return TRUE;
>>>    }
>>>
>>> I was using R 2.8.1, downloaded precompiled from CRAN, on Windows
>>> XP SP3.  The outputs of sessionInfo() and Sys.getenv() are the same
>>> in both sessions.  'Process Explorer' shows that the 2 sessions
>>> have the same dll's opened.
>> Thanks for that analysis Bill!
>>
>> Stefan was in "German_Austria.1252" which I don't think is multibyte, so
>> only the else-clause should be relevant, pointing the finger rather
>> squarely at isspace(). Googling indicates that others have been caught
>> out by signed/unsigned char issues there. Should this possibly rather read
>>
>>     if (!isspace((unsigned int)*s++)) return FALSE;
>>
>> ??
>>
>>> > sessionInfo()
>>> R version 2.8.1 (2008-12-22)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>> I did the test with a dll compiled from
>>>    #include <R.h>
>>>    #include <R_ext/Utils.h>
>>>
>>>    void test_isBlankString(char **s, int *res)
>>>    {
>>>        *res = isBlankString(*s) ;
>>>    }
>>>
>>> and called by .C("test_isBlankString","\247",-1L)
>>>
>>> I don't see the difference while running a version of 2.9.0(devel)
>>> compiled locally on 11 March 2009 (from svn rev 48116).
>>>
>>> Bill Dunlap
>>> TIBCO Software Inc - Spotfire Division
>>> wdunlap tibco.com
>>>
>>>> -----Original Message-----
>>>> From: r-devel-boun...@r-project.org
>>>> [mailto:r-devel-boun...@r-project.org] On Behalf Of Peter Dalgaard
>>>> Sent: Friday, April 10, 2009 2:03 AM
>>>> To: Raberger, Stefan
>>>> Cc: r-b...@r-project.org; r-de...@stat.math.ethz.ch
>>>> Subject: Re: [Rd] type.convert (PR#13646)
>>>>
>>>> Raberger, Stefan wrote:
>>>>> Hi Peter,
>>>>>
>>>>> each of the four PCs actually has the same locale setting:
>>>>>
>>>>> > Sys.setlocale("LC_CTYPE")
>>>>> [1] "German_Austria.1252"
>>>>>
>>>>> (all the other settings returned by invoking Sys.getlocale() are identical as well).
>>>>> Just to be sure (because it's displayed incorrectly in my browser on the
>>>>> bugtracking page): the character inside the type.convert function ought to
>>>>> be a "section"-sign (HTML Code § or § , in R "\247", and not a dot ".").
>>>>
>>>> I saw it correctly. It's "\302\247" in UTF8 locales, which is of course
>>>> the reason I suspected locale settings, but I can't seem to trigger the
>>>> NA behaviour.
>>>>
>>>> I'm at a loss here, but some ideas:
>>>>
>>>> In the cases where it returns NA, what type is it? (I.e.
>>>> storage.mode(type.convert(....)))
>>>>
>>>> What do you get from
>>>>
>>>> > charToRaw("§")
>>>> [1] c2 a7
>>>>
>>>> (a7, presumably, but better check).
>>>>
>>>>     -p
>>>>
>>>>> -----Original Message-----
>>>>> From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk]
>>>>> Sent: Thursday, 09 April 2009 19:26
>>>>> To: Raberger, Stefan
>>>>> Cc: r-de...@stat.math.ethz.ch; r-b...@r-project.org
>>>>> Subject: Re: [Rd] type.convert (PR#13646)
>>>>>
>>>>> s.raber...@innovest.at wrote:
>>>>>> Full_Name: Stefan Raberger
>>>>>> Version: 2.8.1
>>>>>> OS: Windows XP
>>>>>> Submission from: (NULL) (213.185.163.242)
>>>>>>
>>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> I recently noticed some strange behaviour of the command "type.convert",
>>>>>> depending on the startup mode used. But there also seems to be different
>>>>>> behaviour on different PCs (all running the same OS and the same version of R).
>>>>>> On PC1:
>>>>>> When I start R in SDI mode (RGui --no-save --no-restore --no-site-file
>>>>>> --no-init-file --no-environ) and try to convert, the result is
>>>>>>
>>>>>> > type.convert("§")
>>>>>> [1] NA
>>>>>>
>>>>>> If I use MDI mode (RGui --no-save --no-restore --no-site-file --no-init-file
>>>>>> --no-environ --no-Rconsole) instead, the result is
>>>>>>
>>>>>> > type.convert("§")
>>>>>> [1] §
>>>>>> Levels: §
>>>>>>
>>>>>> On PC2 it's exactly the other way round (SDI: §, MDI: NA), on PC3 the result is
>>>>>> always NA, independent of the startup mode used, and on PC4 it's always §.
>>>>>> What's the result I should expect R to return, and why is it different in so
>>>>>> many cases?
>>>>> Which locale does R think it is in in the four cases?
>>>>> (Sys.setlocale("LC_CTYPE"), I think).
>>>>>
>>>>> Might well not be a bug (so please don't file it as one).
>>>>>
>>>>>> Any help is much appreciated!
>>>>>> Regards, Stefan
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-devel@r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>> --
>>>>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>>>>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>>>>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
>>>> ~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907