Philip,

Your approach to using factors misses an important consideration: a
class that was observed in the full dataset should not disappear just
because you subsetted the data in some manner. Also, `droplevels()` is
a useful function to call on a factor or data frame if subsetting
produces levels with zero observations and that information is not
used in whatever computations follow.
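
A small sketch of what I mean, using made-up data (the `dat` data frame
and its `habitat` and `count` columns are hypothetical, not from
Kendra's analysis):

```r
# Hypothetical data: three habitat classes observed in the full dataset
dat <- data.frame(
  habitat = factor(c("bog", "fen", "marsh", "bog", "fen")),
  count   = c(3, 7, 2, 5, 1)
)

# Subsetting keeps all three levels, even though "marsh" no longer occurs
sub <- dat[dat$habitat != "marsh", ]
levels(sub$habitat)   # "bog" "fen" "marsh"
table(sub$habitat)    # "marsh" is still reported, with a count of 0

# droplevels() removes the unused level when the zero count is not wanted
sub2 <- droplevels(sub)
levels(sub2$habitat)  # "bog" "fen"
```

Whether to keep `sub` or `sub2` depends on whether the empty class
should be carried through to the later computations.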

G

On 5 December 2013 10:42, Dixon, Philip M [STAT] <pdi...@iastate.edu> wrote:
> Kendra,
>
> I wonder if the problem is a factor level with no observations.  One of the 
> frustrating things about factors (class variables) in R is that the list of 
> levels is stored separately from the data.  This can cause all sorts of 
> problems if you create the factor, then subset the data, and the subset is 
> missing one or more levels of the factor.  You are subsetting your data, so 
> this may be the source of the problem.
>
> My working philosophy is to keep variables as character strings or numbers 
> until just before I need the factors.  That avoids any issues with extraneous 
> levels.  That means reading data sets (.txt or .csv files) with as.is=TRUE to 
> avoid default creation of factors.  relevel() may recreate the list of 
> levels.  I usually use factor(as.character(variable)) to flip a factor to a 
> vector of character strings then back to a factor with the correct set of 
> levels.
>
> Best wishes,
> Philip Dixon
>
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology



-- 
Gavin Simpson, PhD
