Re: [R] Multiple missing values
NA, Inf, -Inf and NaN would give you four possibilities, and is.finite returns FALSE for every one of them:

    x <- c(1, NA, 2, Inf, 3, -Inf, 4, NaN, 5)
    is.finite(x)
    [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE

You might need to map them all to NA before passing the data to various functions, depending on how those functions deal with these values.

Another possibility is to have an attribute holding a factor that defines the type of each NA:

    x <- c(1, NA, 2, NA, 3, NA)
    attr(x, "type.of.na") <- factor(c("A", "B", "A"))

and, depending on how much work you are prepared to do, you could define a new R class that handles objects with such an attribute.

On Sun, Feb 14, 2010 at 9:33 AM, John john.macin...@ed.ac.uk wrote:
> Does anyone know, or know documentation that describes, how to declare
> multiple values in R as missing that does not involve coding them as NA?
> I wish to be able to treat values as missing, while still retaining
> codes that describe the reason for the value being missing.
>
> Thanks
>
> John MacInnes
>
> --
> Professor John MacInnes
> Sociology, School of Social and Political Studies,
> No 8 Buccleuch Place
> University of Edinburgh
> Edinburgh EH8 9LN
> +44 (0)131 651 3867
>
> Centre d'Estudis Demogràfics
> Universitat Autònoma de Barcelona
> Edifici E-2
> 08193 Bellaterra (Barcelona)
> Spain
> +34 93 581 3060
>
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
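For what it is worth, here is a minimal sketch of the two ideas in the reply above: mapping every non-finite value to NA, and keeping the reason for each NA in an attribute. The attribute name type.of.na comes from the post; the reason labels "A" and "B" are placeholders, and the summary calls at the end are only one way such an attribute might be used.

```r
x <- c(1, NA, 2, Inf, 3, -Inf, 4, NaN, 5)

# Map every non-finite value (NA, NaN, Inf, -Inf) to NA
y <- x
y[!is.finite(y)] <- NA
is.na(y)

# Keep the reason for each NA in an attribute: one factor entry per NA,
# in the order in which the NAs occur
z <- c(1, NA, 2, NA, 3, NA)
attr(z, "type.of.na") <- factor(c("A", "B", "A"))

# Tabulate the reasons; the NAs themselves behave as usual
table(attr(z, "type.of.na"))
mean(z, na.rm = TRUE)
```

Note that the factor here has one entry per NA rather than one per element, so it only stays meaningful as long as the vector is not reordered or subsetted.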
Re: [R] Multiple missing values
I can think of a few solutions, none perfect.

* You could have a master dataset that has the missing value codes you want, and a dataset that you use which is a copy of it with real NAs in it.

* You could add an attribute that gives the types of missing values in the various positions. The downside is that attributes tend to disappear with subsetting.

* If you only have two types, you might be able to get away with using NaN as the second type of NA.

--
Patrick Burns
pbu...@pburns.seanet.com
http://www.burns-stat.com
(home of 'The R Inferno' and 'A Guide for the Unwilling S User')
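The NaN suggestion can be sketched as follows. This is only an illustration under assumed labels: the meanings "not asked" and "refused" are hypothetical. It relies on is.na() being TRUE for both NA and NaN while is.nan() is TRUE only for NaN.

```r
# Hypothetical coding: NA = "not asked", NaN = "refused"
x <- c(10, NA, 12, NaN, 15)

is.na(x)               # TRUE for both NA and NaN
is.nan(x)              # TRUE only for NaN
is.na(x) & !is.nan(x)  # TRUE only for the plain NA

# Recover the reason for each position
reason <- ifelse(is.nan(x), "refused",
                 ifelse(is.na(x), "not asked", "observed"))
table(reason)

# Both kinds are dropped by na.rm = TRUE
mean(x, na.rm = TRUE)
```

The distinction is fragile: R's documentation notes that a computation involving both NA and NaN may return either one, so the reasons are best extracted before any arithmetic is done on the vector.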
Re: [R] Multiple missing values
Patrick Burns wrote:
> I can think of a few solutions, none perfect.
>
> * You could have a master dataset that has the missing value codes
> you want, and a dataset that you use which is a copy of it with
> real NA's in it.
>
> * You could add an attribute that gives the types of missing values
> in the various positions. The downside is that attributes tend to
> disappear with subsetting.

The sas.get function in the Hmisc package exemplifies that approach, and it has a subsetting method that preserves the special.miss attribute.

Frank

> * If you only have two types, you might be able to get away with
> using NaN as the second type of NA.

--
Frank E Harrell Jr
Professor and Chairman
School of Medicine, Department of Biostatistics
Vanderbilt University
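To illustrate why a subsetting method matters, here is a generic sketch of an S3 class whose `[` method carries the reason codes along. This is not Hmisc's implementation: the class name coded_na, the attribute name reason, and the reason labels are all hypothetical, and the reasons are stored one per element (NA where the value is observed) so that they can be subsetted with the same index.

```r
# Hypothetical constructor: values plus one reason per element
# (reason is NA where the value is observed)
coded_na <- function(x, reason) {
  stopifnot(length(x) == length(reason))
  structure(x, reason = reason, class = "coded_na")
}

# Subsetting method that keeps the matching reasons
`[.coded_na` <- function(x, i) {
  coded_na(unclass(x)[i], attr(x, "reason")[i])
}

# With a plain attribute, subsetting drops the reasons
plain <- c(1, NA, 2, NA, 3)
attr(plain, "reason") <- c(NA, "refused", NA, "not asked", NA)
attributes(plain[2:4])   # NULL: the attribute is gone

# With the class, the reasons follow the values
v <- coded_na(c(1, NA, 2, NA, 3),
              c(NA, "refused", NA, "not asked", NA))
s <- v[2:4]
attr(s, "reason")        # "refused" NA "not asked"
```

A fuller class would also need methods for printing, combining with c(), and assignment; the sketch covers only single-index extraction.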
Re: [R] Multiple missing values
Gary King's Amelia package for R, which is also available as a stand-alone version, does multiple imputation using the EM algorithm.

Joe King
j...@joepking.com
Re: [R] Multiple missing values
John wrote:
> ...
> Does anyone know, or know documentation that describes, how to declare
> multiple values in R as missing that does not involve coding them as NA?
> I wish to be able to treat values as missing, while still retaining
> codes that describe the reason for the value being missing.

I would suggest leaving the missing value codes as they are in your data file and recoding them to NA at the top of each analysis script you run. I find that the only place I usually make use of such information is in the initial descriptives, although you may want to selectively recode for different analyses.

Jim
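A minimal sketch of that workflow, assuming the reasons are stored as negative sentinel codes. The data frame, the variable names, and the codes -8 and -9 with their meanings are all made up for illustration.

```r
# Hypothetical raw data: -8 = "refused", -9 = "not applicable"
raw <- data.frame(age    = c(34, -8, 51, -9, 27),
                  income = c(-9, 42000, -8, 38000, 51000))
missing_codes <- c(-8, -9)

# Initial descriptives: count each missing-value code per variable
sapply(raw, function(col) {
  table(factor(col[col %in% missing_codes], levels = missing_codes))
})

# Top of the analysis script: working copy with the codes recoded to NA
dat <- raw
dat[] <- lapply(dat, function(col) replace(col, col %in% missing_codes, NA))

colMeans(dat, na.rm = TRUE)
```

The raw object is left untouched, so a later analysis can recode selectively, for example by passing only -9 as the code to replace.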