Re: [R] Creating NA equivalent
Apologies for the slow reply over the holidays.

Indeed, following Duncan Murdoch's advice I created the package "declared",
which has a simple mechanism for attributing different interpretations to the
same NA value within a vector.

I learned that there is no need for different NA values: the built-in NA is
enough. By allocating different interpretations (labels), the end result is
similar to having different NA values, all in base R.

This meta-information can subsequently be used for any conceivable purpose,
including what I read in this thread about censoring etc.

I hope this helps; best wishes and season's greetings,
Adrian

On Tue, 21 Dec 2021 at 21:45, Avi Gross wrote:

> I wonder if the package Adrian Dușa created might be helpful or point you
> along the way.
>
> It was eventually named "declared"
>
> https://cran.r-project.org/web/packages/declared/index.html
>
> With a vignette here:
>
> https://cran.r-project.org/web/packages/declared/vignettes/declared.pdf
>
> I do not know if it would easily satisfy your needs but it may be a step
> along the way. A package called Haven was part of the motivation and Adrian
> wanted a way to import data from external sources that had more than one
> category of NA that sounds a bit like what you want. His functions should
> allow the creation of such data within R, as well. I am including him in
> this email if you want to contact him or he has something to say.
>
>
> -----Original Message-----
> From: R-help On Behalf Of Duncan Murdoch
> Sent: Tuesday, December 21, 2021 5:26 AM
> To: Marc Girondot ; r-help@r-project.org
> Subject: Re: [R] Creating NA equivalent
>
> On 20/12/2021 11:41 p.m., Marc Girondot via R-help wrote:
> > Dear members,
> >
> > I work about dosage and some values are bellow the detection limit. I
> > would like create new "numbers" like LDL (to represent lower than
> > detection limit) and UDL (upper the detection limit) that behave like
> > NA, with the possibility to test them using for example is.LDL() or
> > is.UDL().
> >
> > Note that NA is not the same than LDL or UDL: NA represent missing data.
> > Here the data is available as LDL or UDL.
> >
> > NA is built in R language very deep... any option to create new
> > version of NA-equivalent ?
> >
>
> There was a discussion of this back in May. Here's a link to one approach
> that I suggested:
>
>    https://stat.ethz.ch/pipermail/r-devel/2021-May/080776.html
>
> Read the followup messages, I made at least one suggested improvement.
> I don't know if anyone has packaged this, but there's a later version of
> the code here:
>
>    https://stackoverflow.com/a/69179441/2554330
>
> Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
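The idea of keeping the built-in NA and attaching an interpretation to it can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the helper label_na(), the attribute name "na_label" and the testers is.LDL()/is.UDL() are hypothetical names made up here, and this is not the interface of the "declared" package.

```r
# Hypothetical helper: attach one label per element ("LDL", "UDL" or NA)
# to a numeric vector whose undetermined readings are stored as plain NA.
label_na <- function(x, labels) {
  stopifnot(length(x) == length(labels))
  # Only positions that are NA in x may carry a label.
  stopifnot(all(is.na(labels[!is.na(x)])))
  attr(x, "na_label") <- labels
  x
}

# Hypothetical testers in the spirit of the is.LDL()/is.UDL() request.
has_label <- function(x, what) {
  lab <- attr(x, "na_label")
  if (is.null(lab)) return(rep(FALSE, length(x)))
  is.na(x) & !is.na(lab) & lab == what
}
is.LDL <- function(x) has_label(x, "LDL")
is.UDL <- function(x) has_label(x, "UDL")

# Made-up readings: two censored values and one genuinely missing value.
dose <- label_na(c(1.2, NA, 3.4, NA, NA),
                 c(NA, "LDL", NA, "UDL", NA))

is.na(dose)               # all three behave as ordinary NA
is.LDL(dose)              # TRUE only for the reading below the limit
is.UDL(dose)              # TRUE only for the reading above the limit
mean(dose, na.rm = TRUE)  # base R functions treat them as NA
```

Note that plain attributes are dropped by many operations (subsetting with `[`, arithmetic with other vectors, and so on), which is one reason a dedicated class with its own methods is the more robust route.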
Re: [R] Creating NA equivalent
On Tue, 21 Dec 2021 05:41:31 +0100 Marc Girondot via R-help wrote:

> Dear members,
>
> I work about dosage and some values are bellow the detection limit. I
> would like create new "numbers" like LDL (to represent lower than
> detection limit) and UDL (upper the detection limit) that behave like
> NA, with the possibility to test them using for example is.LDL() or
> is.UDL().
>
> Note that NA is not the same than LDL or UDL: NA represent missing
> data. Here the data is available as LDL or UDL.
>
> NA is built in R language very deep... any option to create new
> version of NA-equivalent ?
>
> Thanks
>
> Marc

You are concerned with a distinct quality in the data with respect to a
specific method. You might want to code a qualitative variable that defines
the detectability state of each reading. Then filter on the state of
interest and, as a means of establishing the quality of the method or the
data, summarize the detection properties in your sample for the analytical
method employed.

I had an engineer tell me flatly that the measures claimed in a paper were
"impossible." The method used was already common, but his system was not
sensitive enough.

As far as the statistical properties go, there are measures that could be
made and measures that could not be made. If a different method became
available, you would probably still want either to reanalyze the older data
with the new method, or to append new measures where they were previously
unavailable. Either way you encounter data range or compatibility issues
that have to be addressed methodologically.
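The suggestion above (a qualitative variable holding the detectability state, then filtering and summarizing on it) might look as follows in base R. The data, the column names and the level names are invented for illustration only.

```r
# Made-up readings; 'state' records why a reading is or is not available.
dat <- data.frame(
  reading = c(2.1, NA, 7.4, NA, 0.9, NA),
  state   = factor(c("detected", "LDL", "detected", "UDL", "detected", "missing"),
                   levels = c("detected", "LDL", "UDL", "missing"))
)

# Summarize the detection properties of the sample.
state_counts <- table(dat$state)
state_counts
prop.table(state_counts)

# Filter on the state of interest before analysing the numeric values.
detected <- subset(dat, state == "detected")
mean(detected$reading)
```

Keeping the state in its own column means it survives subsetting, merging and export, at the cost of having to carry two columns around together.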
Re: [R] Creating NA equivalent
attributes record what the NA is supposed to represent.

-----Original Message-----
From: Jim Lemon
Sent: Tuesday, December 21, 2021 5:00 PM
To: Avi Gross
Cc: r-help mailing list ; Adrian Dușa
Subject: Re: [R] Creating NA equivalent

Please pardon a comment that may be off-target as well as off-topic.
This appears similar to a number of things like fuzzy logic, where an
instance can take incompatible truth values.

It is known that an instance may have an attribute with a numeric
value, but that value cannot be determined.

It seems to me that an appropriate designation for the value is Unk,
perhaps with an associated probability of determination to distinguish
it from NA (it is definitely not known).

Jim
Re: [R] Creating NA equivalent
Hi Bert,

What troubles me about this is that something like detectable level(s) is
determined at a particular time and may change. Censoring in survival tells
us that the case lasted "at least this long". While a less than detectable
value doesn't give any useful information apart from perhaps "non-zero", an
over limit value gives something like censoring with "at least this much".
However, it is more difficult to conceptualize and I suspect, to quantify.
To me, the important information is that we think there _may be_ a value but
we don't (yet?) know it.

Jim

On Wed, Dec 22, 2021 at 9:56 AM Bert Gunter wrote:
>
> But you appear to be missing something, Jim -- see inline below (and
> the original post):
>
> Bert
>
>
> On Tue, Dec 21, 2021 at 2:00 PM Jim Lemon wrote:
> >
> > Please pardon a comment that may be off-target as well as off-topic.
> > This appears similar to a number of things like fuzzy logic, where an
> > instance can take incompatible truth values.
> >
> > It is known that an instance may have an attribute with a numeric
> > value, but that value cannot be determined.
>
> Yes, but **something** about the value is known: that it is > an upper
> value or < a lower value. Such information should be used
> (censoring!), not characterized as completely unknown. Think about it
> in terms of survival time: saying that a person lasted longer than k
> months is much more informative than saying that how long they lasted
> is completely unknown!
>
> >
> > It seems to me that an appropriate designation for the value is Unk,
> > perhaps with an associated probability of determination to distinguish
> > it from NA (it is definitely not known).
> >
> > Jim
Re: [R] Creating NA equivalent
But you appear to be missing something, Jim -- see inline below (and
the original post):

Bert

On Tue, Dec 21, 2021 at 2:00 PM Jim Lemon wrote:
>
> Please pardon a comment that may be off-target as well as off-topic.
> This appears similar to a number of things like fuzzy logic, where an
> instance can take incompatible truth values.
>
> It is known that an instance may have an attribute with a numeric
> value, but that value cannot be determined.

Yes, but **something** about the value is known: that it is > an upper
value or < a lower value. Such information should be used
(censoring!), not characterized as completely unknown. Think about it
in terms of survival time: saying that a person lasted longer than k
months is much more informative than saying that how long they lasted
is completely unknown!

>
> It seems to me that an appropriate designation for the value is Unk,
> perhaps with an associated probability of determination to distinguish
> it from NA (it is definitely not known).
>
> Jim
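The point that "below the lower limit" or "above the upper limit" is usable information can be made concrete with a hand-written censored-normal likelihood. The sketch below is an assumption-laden illustration, not anything posted in the thread: the data are simulated, the limits ldl and udl are arbitrary, and negloglik() is a hypothetical helper. A detected reading contributes its normal density, a reading below the limit contributes P(Y < ldl), and a reading above the limit contributes P(Y > udl).

```r
set.seed(1)
ldl <- 1   # assumed lower detection limit
udl <- 4   # assumed upper detection limit

# Simulated latent values; only those inside the limits are actually observed.
y_true <- rnorm(200, mean = 2.5, sd = 1.5)
state  <- ifelse(y_true < ldl, "LDL", ifelse(y_true > udl, "UDL", "detected"))
y_obs  <- ifelse(state == "detected", y_true, NA)

# Negative log-likelihood of a normal model with censoring on both sides.
# par[1] is the mean, par[2] is log(sd) so that the sd stays positive.
negloglik <- function(par) {
  mu    <- par[1]
  sigma <- exp(par[2])
  ll_det <- dnorm(y_obs[state == "detected"], mu, sigma, log = TRUE)
  ll_ldl <- pnorm(ldl, mu, sigma, log.p = TRUE)                      # P(Y < ldl)
  ll_udl <- pnorm(udl, mu, sigma, lower.tail = FALSE, log.p = TRUE)  # P(Y > udl)
  -(sum(ll_det) + sum(state == "LDL") * ll_ldl + sum(state == "UDL") * ll_udl)
}

start <- c(mean(y_obs, na.rm = TRUE), log(sd(y_obs, na.rm = TRUE)))
fit   <- optim(start, negloglik)

# Estimates that use the censoring information ...
c(mean = fit$par[1], sd = exp(fit$par[2]))
# ... versus treating the censored readings as plain NA.
c(mean = mean(y_obs, na.rm = TRUE), sd = sd(y_obs, na.rm = TRUE))
```

Dropping the censored readings typically understates the spread, because both tails of the distribution are cut off; the censored likelihood uses the counts in each tail to recover it.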
Re: [R] Creating NA equivalent
Please pardon a comment that may be off-target as well as off-topic.
This appears similar to a number of things like fuzzy logic, where an
instance can take incompatible truth values.

It is known that an instance may have an attribute with a numeric
value, but that value cannot be determined.

It seems to me that an appropriate designation for the value is Unk,
perhaps with an associated probability of determination to distinguish
it from NA (it is definitely not known).

Jim

On Wed, Dec 22, 2021 at 6:55 AM Avi Gross via R-help wrote:
>
> I wonder if the package Adrian Dușa created might be helpful or point you
> along the way.
>
> It was eventually named "declared"
>
> https://cran.r-project.org/web/packages/declared/index.html
>
> With a vignette here:
>
> https://cran.r-project.org/web/packages/declared/vignettes/declared.pdf
>
> I do not know if it would easily satisfy your needs but it may be a step
> along the way. A package called Haven was part of the motivation and Adrian
> wanted a way to import data from external sources that had more than one
> category of NA that sounds a bit like what you want. His functions should
> allow the creation of such data within R, as well. I am including him in this
> email if you want to contact him or he has something to say.
Re: [R] Creating NA equivalent
I wonder if the package Adrian Dușa created might be helpful or point you
along the way.

It was eventually named "declared"

https://cran.r-project.org/web/packages/declared/index.html

With a vignette here:

https://cran.r-project.org/web/packages/declared/vignettes/declared.pdf

I do not know if it would easily satisfy your needs but it may be a step
along the way. A package called Haven was part of the motivation and Adrian
wanted a way to import data from external sources that had more than one
category of NA that sounds a bit like what you want. His functions should
allow the creation of such data within R, as well. I am including him in
this email if you want to contact him or he has something to say.


-----Original Message-----
From: R-help On Behalf Of Duncan Murdoch
Sent: Tuesday, December 21, 2021 5:26 AM
To: Marc Girondot ; r-help@r-project.org
Subject: Re: [R] Creating NA equivalent

On 20/12/2021 11:41 p.m., Marc Girondot via R-help wrote:
> Dear members,
>
> I work about dosage and some values are bellow the detection limit. I
> would like create new "numbers" like LDL (to represent lower than
> detection limit) and UDL (upper the detection limit) that behave like
> NA, with the possibility to test them using for example is.LDL() or
> is.UDL().
>
> Note that NA is not the same than LDL or UDL: NA represent missing data.
> Here the data is available as LDL or UDL.
>
> NA is built in R language very deep... any option to create new
> version of NA-equivalent ?
>

There was a discussion of this back in May. Here's a link to one approach
that I suggested:

   https://stat.ethz.ch/pipermail/r-devel/2021-May/080776.html

Read the followup messages, I made at least one suggested improvement.
I don't know if anyone has packaged this, but there's a later version of
the code here:

   https://stackoverflow.com/a/69179441/2554330

Duncan Murdoch
Re: [R] Creating NA equivalent
Say 'yi' is left censored. Then:

# packages used below
library(survival)  # Surv(), survreg()
library(AER)       # tobit()
library(censReg)   # censReg()

# naive regression model
res1 <- lm(yi ~ xi, data=dat)

# tobit model via survreg()
# (the second argument of Surv() is TRUE for observed, FALSE for censored values)
res2a <- survreg(Surv(yi, yi > censval, type="left") ~ xi, dist="gaussian", data=dat)

# tobit model via tobit() from AER package
res2b <- tobit(yi ~ xi, left=censval, data=dat)

# tobit model via censReg() from censReg package
res2c <- censReg(yi ~ xi, left=censval, data=dat)

(forgot to mention the AER package; and I assume there are even other
packages that can fit Tobit models).

One can also have censoring on both sides in Tobit models. Just explore
these packages to see what they can do.

Best,
Wolfgang

>-----Original Message-----
>From: Chris Evans [mailto:chrish...@psyctc.org]
>Sent: Tuesday, 21 December, 2021 12:56
>To: Viechtbauer, Wolfgang (SP)
>Cc: r-help@r-project.org
>Subject: Re: Creating NA equivalent
>
>Many thanks Wolfgang,
>
>I guess I can see that survival analyses don't have to be time based but
>clearly I need to read up on that. I can't see an example in the survival
>package. And it proves to be hard to search for one. Can anyone point me
>to useful resources on that, in {survival} or not?
>
>I am probably straying way off topic and off list guide here but isn't a
>Tobit only handling censoring at one edge, i.e. the LDL scenario, or the UDL,
>but not both? I think this may be getting back to Marc's original question
>and certainly, again, I would love to be pointed to either Tobit handling
>LDL _and_ UDL or to any other existing methods.
>
>TIA,
>
>Chris
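For censoring on both sides (LDL and UDL in the same variable), one option in the survival package is the "interval2" coding of Surv(), where an NA lower bound marks a left-censored value, an NA upper bound marks a right-censored value, and equal bounds mark an exactly observed value. The sketch below uses simulated data; the limits ldl and udl and the column names lower and upper are assumptions made for illustration, not taken from the thread.

```r
library(survival)

set.seed(42)
n   <- 300
ldl <- 1   # assumed lower detection limit
udl <- 5   # assumed upper detection limit
xi  <- runif(n)
y   <- 2 + 3 * xi + rnorm(n)   # latent values, partly outside the limits

# interval2 coding:
#   below ldl -> (NA,  ldl)   left-censored
#   above udl -> (udl, NA)    right-censored
#   otherwise -> (y,   y)     exactly observed
lower <- ifelse(y < ldl, NA, pmin(y, udl))
upper <- ifelse(y > udl, NA, pmax(y, ldl))
dat   <- data.frame(xi, lower, upper)

# two-sided tobit-type model via survreg()
res <- survreg(Surv(lower, upper, type = "interval2") ~ xi,
               dist = "gaussian", data = dat)
coef(res)
```

The same coding also accommodates observation-specific limits, since each row carries its own bounds.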
Re: [R] Creating NA equivalent
Many thanks Wolfgang,

I guess I can see that survival analyses don't have to be time based but
clearly I need to read up on that. I can't see an example in the survival
package. And it proves to be hard to search for one. Can anyone point me
to useful resources on that, in {survival} or not?

I am probably straying way off topic and off list guide here but isn't a
Tobit only handling censoring at one edge, i.e. the LDL scenario, or the UDL,
but not both? I think this may be getting back to Marc's original question
and certainly, again, I would love to be pointed to either Tobit handling
LDL _and_ UDL or to any other existing methods.

TIA,

Chris

----- Original Message -----
> From: "Wolfgang Viechtbauer"
> To: "Chris Evans"
> Cc: r-help@r-project.org
> Sent: Tuesday, 21 December, 2021 11:31:55
> Subject: RE: Creating NA equivalent

> Hi Chris,
>
> The survival package provides machinery for handling censored observations.
> Whether time is censored or some other type of variable (e.g., viral load due
> to some lower detection limit) does not make a fundamental difference. In
> fact, the type of model you are thinking of with 2) is a Tobit model, which
> can be fitted using the survival package (or censReg).
>
> Best,
> Wolfgang

--
Chris Evans (he/him)
Visiting Professor, UDLA, Quito, Ecuador &
Honorary Professor, University of Roehampton, London, UK.
Work web site: https://www.psyctc.org/psyctc/
CORE site: https://www.coresystemtrust.org.uk/
Personal site: https://www.psyctc.org/pelerinage2016/
OMbook: https://ombook.psyctc.org/book/
Re: [R] Creating NA equivalent
Hi Chris,

The survival package provides machinery for handling censored observations. Whether the censored variable is time or some other quantity (e.g., viral load subject to a lower detection limit) makes no fundamental difference. In fact, the type of model you are thinking of in 2) is a Tobit model, which can be fitted using the survival package (or censReg).

Best,
Wolfgang

>-----Original Message-----
>From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Chris Evans
>Sent: Tuesday, 21 December, 2021 12:17
>To: Duncan Murdoch
>Cc: r-help@r-project.org
>Subject: Re: [R] Creating NA equivalent
>
> I am neither a programmer nor a professional statistician but this topic
> interests me because:
>
> 1) I remember from long, long ago that S had a way to create labels that could
>    denote multiple ways in which a value could be missing. That was sometimes
>    useful to me, as my field sometimes has such situations. In R I handle this
>    with a second variable, but I can see that using attributes is cleaner and
>    might have real benefits when doing missing value analyses. That might
>    raise the question of whether some of the nice packages that help with
>    missing value analyses would take on board some standardised use of
>    attributes for this.
>
> 2) I think Marc's LDL/UDL question is about a very particular sort of value
>    that isn't missing and _is_ censored, but not in the survival analysis
>    meaning of censored. (At least, it's not the same to my mind; perhaps it is?
>    To me the difference is that I most often hit the LDL/UDL issue in data that
>    don't have much, or any, time frame.) Again, this comes up a lot for me
>    where people are given limited possible answers in questionnaires, and I've
>    often wondered if I should explore simulating probability models for the
>    "off the edge" value on a latent variable beneath/behind the measured
>    responses. I'd be very grateful to hear of any work in R packages (to stay
>    only just "off the edge" of the posting guide). Or of any work along
>    the lines that Duncan offers, which sort of pulls this toward base R,
>    though that sounds to me as if it would be a huge undertaking.
>
> I'm very interested to hear any thoughts on either aspect.
>
> Seasonal (multivalued) greetings to all!
>
> Chris

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
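The Tobit approach Wolfgang mentions can be sketched roughly as follows. This is only an illustration on simulated data: the detection limit `lod`, the variable names, and the simulated coefficients are assumptions invented for the example, not anything taken from Marc's or Chris's data. Left-censoring is declared with Surv(..., type = "left") and a Gaussian error distribution is requested from survreg().

```r
library(survival)

set.seed(1)
n <- 200
x <- rnorm(n)

# Simulated "true" measurements (assumed model: intercept 1, slope 2).
latent <- 1 + 2 * x + rnorm(n)

# Hypothetical lower detection limit: anything below it is only known
# to be "<= lod", so it is recorded at the limit and flagged.
lod <- 0
observed <- pmax(latent, lod)
detected <- latent > lod   # TRUE = actually measured, FALSE = left-censored

# Tobit model: Gaussian regression with left-censored responses.
fit <- survreg(Surv(observed, detected, type = "left") ~ x,
               dist = "gaussian")
print(coef(fit))
```

The same call should work without any time variable in the data, which is the point Wolfgang makes: the censoring machinery does not care whether the response is a time or a concentration.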
Re: [R] Creating NA equivalent
I am neither a programmer nor a professional statistician but this topic interests me because:

1) I remember from long, long ago that S had a way to create labels that could denote multiple ways in which a value could be missing. That was sometimes useful to me, as my field sometimes has such situations. In R I handle this with a second variable, but I can see that using attributes is cleaner and might have real benefits when doing missing value analyses. That might raise the question of whether some of the nice packages that help with missing value analyses would take on board some standardised use of attributes for this.

2) I think Marc's LDL/UDL question is about a very particular sort of value that isn't missing and _is_ censored, but not in the survival analysis meaning of censored. (At least, it's not the same to my mind; perhaps it is? To me the difference is that I most often hit the LDL/UDL issue in data that don't have much, or any, time frame.) Again, this comes up a lot for me where people are given limited possible answers in questionnaires, and I've often wondered if I should explore simulating probability models for the "off the edge" value on a latent variable beneath/behind the measured responses. I'd be very grateful to hear of any work in R packages (to stay only just "off the edge" of the posting guide). Or of any work along the lines that Duncan offers, which sort of pulls this toward base R, though that sounds to me as if it would be a huge undertaking.

I'm very interested to hear any thoughts on either aspect.

Seasonal (multivalued) greetings to all!

Chris

----- Original Message -----
> From: "Duncan Murdoch"
> To: "Marc Girondot" , r-help@r-project.org
> Sent: Tuesday, 21 December, 2021 10:26:12
> Subject: Re: [R] Creating NA equivalent

> On 20/12/2021 11:41 p.m., Marc Girondot via R-help wrote:
>> Dear members,
>>
>> I work on dosage and some values are below the detection limit. I
>> would like to create new "numbers" like LDL (to represent lower than the
>> detection limit) and UDL (above the upper detection limit) that behave like
>> NA, with the possibility to test them using for example is.LDL() or
>> is.UDL().
>>
>> Note that NA is not the same as LDL or UDL: NA represents missing data.
>> Here the data is available as LDL or UDL.
>>
>> NA is built very deeply into the R language... is there any option to
>> create a new NA equivalent?
>>
>
> There was a discussion of this back in May. Here's a link to one
> approach that I suggested:
>
> https://stat.ethz.ch/pipermail/r-devel/2021-May/080776.html
>
> Read the followup messages, I made at least one suggested improvement.
> I don't know if anyone has packaged this, but there's a later version of
> the code here:
>
> https://stackoverflow.com/a/69179441/2554330
>
> Duncan Murdoch

--
Chris Evans (he/him)
Visiting Professor, UDLA, Quito, Ecuador & Honorary Professor, University of Roehampton, London, UK.
Work web site: https://www.psyctc.org/psyctc/
CORE site: https://www.coresystemtrust.org.uk/
Personal site: https://www.psyctc.org/pelerinage2016/
OMbook: https://ombook.psyctc.org/book/
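The idea in point 1) of keeping the reason for a missing value in an attribute, combined with Marc's wish for is.LDL() and is.UDL(), might look something like the base R sketch below. It is not Duncan's code and not the API of any package: the class name `detlim`, the constructor, and the attribute name `status` are all hypothetical names chosen for illustration. Every flagged value is stored as an ordinary NA, and the label that says why lives alongside it.

```r
# Hypothetical constructor: 'status' is "ok", "LDL" or "UDL" per element.
detlim <- function(value, status = rep("ok", length(value))) {
  status <- factor(status, levels = c("ok", "LDL", "UDL"))
  stopifnot(length(value) == length(status), !anyNA(status))
  value[status != "ok"] <- NA_real_   # flagged values behave like NA
  structure(value, status = status, class = "detlim")
}

# Tests in the spirit of is.na(); the names follow Marc's request.
is.LDL <- function(x) attr(x, "status") == "LDL"
is.UDL <- function(x) attr(x, "status") == "UDL"

# Subsetting would otherwise drop the attribute, so carry it along.
`[.detlim` <- function(x, i) {
  detlim(unclass(x)[i], as.character(attr(x, "status"))[i])
}

x <- detlim(c(1.2, NA, 0, 50, NA),
            c("ok", "ok", "LDL", "UDL", "ok"))
print(is.na(x))    # the plain NA and both flagged kinds are all NA
print(is.LDL(x))
print(is.UDL(x[3:4]))
```

A genuinely missing value is then one where is.na(x) is TRUE but neither is.LDL(x) nor is.UDL(x) is. A real implementation would need more methods (for example c(), printing, and assignment) than this sketch provides.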
Re: [R] Creating NA equivalent
On 20/12/2021 11:41 p.m., Marc Girondot via R-help wrote:
> Dear members,
>
> I work on dosage and some values are below the detection limit. I
> would like to create new "numbers" like LDL (to represent lower than the
> detection limit) and UDL (above the upper detection limit) that behave like
> NA, with the possibility to test them using for example is.LDL() or
> is.UDL().
>
> Note that NA is not the same as LDL or UDL: NA represents missing data.
> Here the data is available as LDL or UDL.
>
> NA is built very deeply into the R language... is there any option to
> create a new NA equivalent?

There was a discussion of this back in May. Here's a link to one approach that I suggested:

https://stat.ethz.ch/pipermail/r-devel/2021-May/080776.html

Read the followup messages, I made at least one suggested improvement. I don't know if anyone has packaged this, but there's a later version of the code here:

https://stackoverflow.com/a/69179441/2554330

Duncan Murdoch
Re: [R] Creating NA equivalent
Values beyond known limits are left- or right-censored data. You need to use statistical methodology that handles censoring. See the survival package and the CRAN Survival task view for this -- or consult an appropriate expert. There are of course standard ways of annotating such data in these packages.

Bert

On Mon, Dec 20, 2021, 8:41 PM Marc Girondot via R-help wrote:

> Dear members,
>
> I work on dosage and some values are below the detection limit. I
> would like to create new "numbers" like LDL (to represent lower than the
> detection limit) and UDL (above the upper detection limit) that behave like
> NA, with the possibility to test them using for example is.LDL() or
> is.UDL().
>
> Note that NA is not the same as LDL or UDL: NA represents missing data.
> Here the data is available as LDL or UDL.
>
> NA is built very deeply into the R language... is there any option to
> create a new NA equivalent?
>
> Thanks
>
> Marc
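One of the annotation conventions Bert alludes to is the "interval2" form of Surv() in the survival package, where each observation is written as an interval and NA marks an open end. The sketch below shows how LDL/UDL flags could be translated into that form; the limits `ldl` and `udl`, the example values, and the `status` labels are all made up for illustration.

```r
library(survival)

# Hypothetical detection limits and measurements, invented for illustration.
ldl <- 0.5    # lower detection limit
udl <- 100    # upper detection limit
value  <- c(12.3, NA, 48.0, NA)
status <- c("ok", "LDL", "ok", "UDL")

# type = "interval2" describes each observation as an interval (lo, hi);
# an NA bound stands for an open (infinite) end.
#   below LDL : (NA, ldl)   left-censored
#   above UDL : (udl, NA)   right-censored
#   measured  : (v, v)      exact
lo <- ifelse(status == "LDL", NA, ifelse(status == "UDL", udl, value))
hi <- ifelse(status == "UDL", NA, ifelse(status == "LDL", ldl, value))

y <- Surv(lo, hi, type = "interval2")
print(y)
```

An object built this way can be used as the response in survreg(), so values below and above the detection limits are handled in one model.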
[R] Creating NA equivalent
Dear members,

I work on dosage and some values are below the detection limit. I would like to create new "numbers" like LDL (to represent lower than the detection limit) and UDL (above the upper detection limit) that behave like NA, with the possibility to test them using for example is.LDL() or is.UDL().

Note that NA is not the same as LDL or UDL: NA represents missing data. Here the data is available as LDL or UDL.

NA is built very deeply into the R language... is there any option to create a new NA equivalent?

Thanks

Marc