>>>>> Gabriel Becker <gmbec...@ucdavis.edu> >>>>> on Tue, 24 May 2016 10:30:48 -0700 writes:
> On Tue, May 24, 2016 at 9:30 AM, Jeroen Ooms <jeroen.o...@stat.ucla.edu> > wrote: >> On Tue, May 24, 2016 at 5:59 PM, Gabriel Becker <gmbec...@ucdavis.edu> >> wrote: >> > Shouldn't Rf_mkString(NULL) return (the c-level equivalent of) >> character() >> > rather than the NA_character_? >> >> No. It should still be safe to assume that mkString() always returns a >> character vector of exactly length one. Anything else could lead to >> type errors. >> > Well the thing is you're passing an invalid pointer, that doesn't point to > a C string, to a constructor expecting a valid const char *. I'm fine with > the contract being that mkString always returns a character vector of > length one, but that doesn't necessarily mean that the function needs to > accept NULL pointers. The contract as I understand it is that if you give > it a C string, it will create a CHARSXP for that string. In this light, > Bill's suggestion that it throw an error seems the most principled > response. I would think you would need to at the very least emit a warning. I agree with Jerooen that mkChar() and mkString() may be used in contexts where they can end up with a NULL and hence should not segfault... and hence am willing the extra (very small) penalty of checking for NULL. >> >> > An empty string and NULL aren't the same. >> >> Exactly! So if you pass in an empty C string, you get an empty R >> string, and if you pass in a null pointer you get NA. >> >> Rf_mkString(NULL) <--> NA >> Rf_mkString("") <--> "" >> >> There is no ambiguity, and much better than segfaulting. Better than segfaulting, yes, but really agree with Bill (and Gabe), also for Rf_mkChar(NULL): I think both functions should give an error in such a case rather than returning NA_character_ It is an accident of some kind if they got NULL, no? -- Martin Maechler, ETH Zurich > Well, better than segfaulting is not really relevant here. No one is > arguing that it should segfault. The question is what behavior it should > have when it doesn't segfault. > It's true that a C empty string is not the same as NULL, but NULL isn't the > same as NA either. Semantically, for your use-case (which I gather arose > from interactions we had :) ) the NULL means there is no version, while NA > indicates there is a version but we don't know what it is. Imagine an > object class that represents a persons name (first, middle, last). Now take > two people, One has no middle name (and we know that when creating the > object) and another for whom we don't have any information about the middle > name, only first and last were reported. I would expect the first one to > have middle name either NULL or (in a data.frame context) "", while the > second would have NA_character_. In this light, mkString should arguably > generate "". i don't think the fact that there is another way to get "" is > a particularly large problem. > On the other hand, and in support of your position it came up as Michael > Lawrence and I were talking about this that asChar from utils.c will give > you NA_STRING when you give it R_NilValue. That is a coercion though, > whereas arguably mkString is not. That said, consistency would probably be > good. > ~G > -- > Gabriel Becker, PhD > Associate Scientist (Bioinformatics) > Genentech Research > [[alternative HTML version deleted]] > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel