There is a difference between the levels of a factor and the values in the 
vector. If you make NA one of the levels, R uses an integer code to 
represent that level in the data, just like any other level. At that point it 
seems to me that you can do meta-analysis on the existence of NA in the 
original data, but the data in your working vector no longer really contain NA.
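
To illustrate the distinction, here is a minimal sketch that rebuilds the GG factor from example 2 of the quoted question and a KK-style factor like the one in example 1. The helper name count_na_labels is hypothetical, not a base R function. It tests the level labels rather than the integer codes, so under these assumptions it should count NAs from either origin.

```r
# NA present in the data and kept as a level by exclude = NULL
GG <- factor(c("A", "A", "B", "B", NA, NA), exclude = NULL)
codes_GG <- as.integer(GG)   # every element carries a non-missing code

# NA assigned afterwards: here the codes themselves become NA
KK <- factor(c("A", "A", "B", "B", "C", "C"), exclude = NULL)
is.na(KK) <- 5:6

# Hypothetical helper: convert to labels first, then test for NA
count_na_labels <- function(f) sum(is.na(as.character(f)))

n_GG_codes <- sum(is.na(GG))      # tests the codes only
n_GG <- count_na_labels(GG)       # tests the labels
n_KK <- count_na_labels(KK)
```

is.na(GG) looks only at the integer codes, which is why it reports no missing values once NA is a level. Converting to character maps both an NA code and an NA level label to NA.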

For my data analysis needs, I would stay away from exclude=NULL entirely, but 
someone else might offer a good justification for using it. I would imagine 
that avoiding mixing the actual data and your meta-analysis data (with 
exclude=NULL) would be advisable in such a case, and that would have the side 
benefit of eliminating the concerns you have raised.
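
One way to keep the two kinds of information apart is sketched below, assuming the raw data are available as a character vector. The names x, was_missing and f are illustrative only. The missingness indicator is recorded separately, and the factor is built with the default exclude so that NA never becomes a level.

```r
x <- c("A", "A", "B", "B", NA, NA)   # illustrative raw data
was_missing <- is.na(x)              # meta-information, kept in its own vector
f <- factor(x)                       # default exclude: NA is not a level

n_missing <- sum(was_missing)
n_missing_f <- sum(is.na(f))         # is.na() behaves as expected here
tab <- table(f, useNA = "ifany")     # tabulates NA without storing it as a level
```

With this arrangement is.na() on the working factor and the separate indicator agree, and table() can still report the NA count when it is needed.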

If your goal is to eliminate unused levels, I usually convert to character and 
then back to a factor to accomplish that, which works fine with NAs.
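
A small sketch of that round trip, using made-up data (the names f, sub and sub2 are illustrative):

```r
f <- factor(c("A", "A", "B", "C", NA))
sub <- f[c(1, 2, 3, 5)]              # drops the only "C"; the level "C" remains
levels_before <- levels(sub)

sub2 <- factor(as.character(sub))    # round trip through character
levels_after <- levels(sub2)         # unused level is gone
n_na <- sum(is.na(sub2))             # the NA value is still an ordinary NA
```

Base R's droplevels() is an alternative for the same purpose.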
---------------------------------------------------------------------------
Jeff Newmiller
DCN:<[email protected]>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.

andrewH <[email protected]> wrote:

Dear folks,

Is there a function to correctly find (and count) the NAs in a factor when
exclude=NULL, regardless of whether their origin is in the original data or
by subsequent assignment?

In example number 1 below, where NAs are assigned by is.na()<-, testing the
factor with is.na() finds the correct number of NAs. In example number 2,
where the NAs are from the data, neither is.na(), ==NA, nor =="NA" correctly
identifies the NAs. In example number 3, which mixes NAs from assignment
with NAs from data, is.na does not even find the NAs created by assignment,
as it did in example 1.

I'm running R 2.13.2 on Windows XP with ServicePack 3

Any assistance would be greatly appreciated.

Appreciatively, andrewH


Example #1

> # Origin: is.na()<- Exclude: NULL
> KK <- factor(c("A","A","B","B","C","C"), exclude=NULL)
> KK[KK=="C"]
[1] C C
Levels: A B C
> is.na(KK[KK=="C"]) <- TRUE
> KK
[1] A A B B <NA> <NA>
Levels: A B C
> levels(KK)
[1] "A" "B" "C"
> levels(KK)[KK]
[1] "A" "A" "B" "B" NA NA 
> KK==NA
[1] NA NA NA NA NA NA
> sum(KK==NA)
[1] NA
> KK=="NA"
[1] FALSE FALSE FALSE FALSE NA NA
> sum(KK=="NA")
[1] NA
> is.na(KK)
[1] FALSE FALSE FALSE FALSE TRUE TRUE
> sum(is.na(KK))
[1] 2

Example #2

> # Origin: data Exclude: NULL
> GG <- factor(c("A","A","B","B", NA, NA), exclude=NULL)
> GG
[1] A A B B <NA> <NA>
Levels: A B <NA>
> levels(GG)
[1] "A" "B" NA 
> levels(GG)[GG]
[1] "A" "A" "B" "B" NA NA 
> GG==NA
[1] NA NA NA NA NA NA
> sum(GG==NA)
[1] NA
> GG=="NA"
[1] FALSE FALSE FALSE FALSE FALSE FALSE
> sum(GG=="NA")
[1] 0
> is.na(GG)
[1] FALSE FALSE FALSE FALSE FALSE FALSE
> sum(is.na(GG))
[1] 0

Example #3

> MM <- factor(c("A","A","B","B","C","C", NA), exclude=NULL)
> is.na(MM[MM=="C"]) <- TRUE
> MM
[1] A A B B <NA> <NA> <NA>
Levels: A B C <NA>
> levels(MM)
[1] "A" "B" "C" NA 
> levels(MM)[MM]
[1] "A" "A" "B" "B" NA NA NA 
> MM==NA
[1] NA NA NA NA NA NA NA
> sum(MM==NA)
[1] NA
> MM=="NA"
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> sum(MM=="NA")
[1] 0
> is.na(MM)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> sum(is.na(MM))
[1] 0

--
View this message in context: 
http://r.789695.n4.nabble.com/Consistant-test-for-NAs-in-a-factor-when-exclude-NULL-tp3942755p3942755.html
Sent from the R help mailing list archive at Nabble.com.

_____________________________________________

[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

