[R] factor level issue after subsetting

2011-11-01 Thread Schreiber, Stefan
Dear list,

I cannot figure out why, after sub-setting my data, that particular item
which I don't want to plot is still in the newly created subset (please
see example below). R somehow remembers what was in the original data
set. A workaround is exporting and importing the new subset. Then it's
all fine; but I don't like this idea and was wondering what I am missing
here.

Thanks!
Stefan

P.S. I am using R 2.13.2 for Mac.

> dat<-read.csv("~/MyFiles/data.csv")
> class(dat$treat)
[1] "factor"
> dat
   treat yield
1   cont  98.7
2   cont  97.2
3   cont  96.1
4   cont  98.1
5     10 103.0
6     10 101.3
7     10 102.1
8     10 101.9
9     30 121.1
10    30 123.1
11    30 119.7
12    30 118.9
13    60 109.9
14    60 110.1
15    60 113.1
16    60 112.3
> plot(dat$treat,dat$yield)
> dat.sub<-dat[which(dat$treat!='cont'),]
> class(dat.sub$treat)
[1] "factor"
> dat.sub
   treat yield
5     10 103.0
6     10 101.3
7     10 102.1
8     10 101.9
9     30 121.1
10    30 123.1
11    30 119.7
12    30 118.9
13    60 109.9
14    60 110.1
15    60 113.1
16    60 112.3
> plot(dat.sub$treat,dat.sub$yield)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] factor level issue after subsetting

2011-11-01 Thread Nordlund, Dan (DSHS/RDA)
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org]
> On Behalf Of Schreiber, Stefan
> Sent: Tuesday, November 01, 2011 2:29 PM
> To: r-help@r-project.org
> Subject: [R] factor level issue after subsetting
>
> Dear list,
>
> I cannot figure out why, after sub-setting my data, that particular item
> which I don't want to plot is still in the newly created subset (please
> see example below). R somehow remembers what was in the original data
> set.

That is the nature of factors.  Once created, unused levels must be
explicitly dropped:

plot(droplevels(dat.sub$treat),dat.sub$yield)
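
To see why the empty box shows up in the plot, it can help to tabulate the factor before and after dropping levels. The following is only a minimal sketch: the data.frame() call is a stand-in for the poster's data.csv (values copied from the thread), and the name treat.used is made up for illustration.

```r
# Stand-in for the poster's data.csv (values copied from the thread).
dat <- data.frame(
  treat = factor(rep(c("cont", "10", "30", "60"), each = 4)),
  yield = c(98.7, 97.2, 96.1, 98.1, 103.0, 101.3, 102.1, 101.9,
            121.1, 123.1, 119.7, 118.9, 109.9, 110.1, 113.1, 112.3)
)

dat.sub <- dat[dat$treat != "cont", ]

# The 'cont' rows are gone, but the level is still registered
# with a count of zero, which is what plot() picks up.
levels(dat.sub$treat)
table(dat.sub$treat)

# droplevels() returns a factor holding only the levels still in use.
treat.used <- droplevels(dat.sub$treat)
levels(treat.used)
table(treat.used)
```

Note that droplevels() does not modify dat.sub in place; the result has to be assigned back if the cleaned factor is needed later.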


Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204





Re: [R] factor level issue after subsetting

2011-11-01 Thread Justin Haynes
First of all, the subsetting line is overly complicated.

dat.sub<-dat[dat$treat!='cont',]

will work just fine.  R does exactly what you're describing.  It knows
the levels of the factor.  Removing 'cont' from the data does not
remove the level from the factor:

> df<-data.frame(let=factor(sample(letters[1:5],100,replace=TRUE)),num=rnorm(100))
> str(df)
'data.frame':   100 obs. of  2 variables:
 $ let: Factor w/ 5 levels "a","b","c","d",..: 1 5 1 4 3 5 2 2 1 3 ...
 $ num: num  0.224 -0.523 0.974 -0.268 -0.61 ...

> df.sub<-df[df$let!='a',]
> str(df.sub)
'data.frame':   82 obs. of  2 variables:
 $ let: Factor w/ 5 levels "a","b","c","d",..: 5 4 3 5 2 2 3 3 5 3 ...
 $ num: num  -0.523 -0.268 -0.61 -1.383 -0.193 ...

> unique(df.sub$let)
[1] e d c b
Levels: a b c d e

> df.sub$let<-factor(df.sub$let)
> unique(df.sub$let)
[1] e d c b
Levels: e d c b

> str(df.sub$let)
 Factor w/ 4 levels "e","d","c","b": 1 2 3 1 4 4 3 3 1 3 ...
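
Re-running factor() on one column and calling droplevels() on the whole data frame should give the same result for that column; the second form simply handles every factor column at once. Here is a small sketch with made-up data (the names d, grp, site and val are illustrative only and do not come from the thread):

```r
# Hypothetical data with two factor columns.
d <- data.frame(
  grp  = factor(c("a", "b", "c", "b", "c")),
  site = factor(c("x", "x", "y", "y", "y")),
  val  = c(1.5, 2.0, 3.5, 4.0, 5.5)
)
d.sub <- d[d$grp != "a" & d$site != "x", ]

# Option 1: rebuild a single column with factor().
by.column <- d.sub
by.column$grp <- factor(by.column$grp)
levels(by.column$grp)    # unused level "a" is gone
levels(by.column$site)   # untouched column still carries "x"

# Option 2: droplevels() on the data frame cleans every factor column.
all.columns <- droplevels(d.sub)
levels(all.columns$grp)
levels(all.columns$site)

# Option 3: drop = TRUE when subsetting a bare factor.
grp.only <- d$grp[d$grp != "a", drop = TRUE]
levels(grp.only)
```

Option 2 is probably the least error-prone when a data frame has several factor columns, since nothing has to be listed by name.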


By redefining your factor you can eliminate the problem.  The other
option, if you don't want factors to begin with, is:

options(stringsAsFactors=FALSE)  # to set the global option

or

dat<-read.csv("~/MyFiles/data.csv",stringsAsFactors=FALSE)  # to set
# the option locally for this single read.csv call.
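
If the column is read as character rather than factor, there are no stored levels to go stale, and a factor built after subsetting only contains the values that remain. A hedged sketch follows; it uses the text argument of read.csv() in place of the original file, and the inline CSV is a shortened, illustrative copy of the thread's data (the names csv and treat.f are made up). From R 4.0.0 onward, stringsAsFactors = FALSE is already the default.

```r
# Shortened stand-in for data.csv; not the poster's actual file.
csv <- "treat,yield
cont,98.7
cont,97.2
10,103.0
10,101.3
30,121.1
30,123.1"

dat <- read.csv(text = csv, stringsAsFactors = FALSE)
class(dat$treat)         # character, so no levels are stored

dat.sub <- dat[dat$treat != "cont", ]

# A factor created after subsetting only knows the values still present.
treat.f <- factor(dat.sub$treat)
levels(treat.f)
```

For plotting, the character column then has to be converted explicitly, for example with plot(factor(dat.sub$treat), dat.sub$yield).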




Re: [R] factor level issue after subsetting

2011-11-01 Thread Felipe Carrillo
Stefan:
Use the droplevels function...
dat <- read.table(textConnection("
   treat yield
1   cont  98.7
2   cont  97.2
3   cont  96.1
4   cont  98.1
5     10 103.0
6     10 101.3
7     10 102.1
8     10 101.9
9     30 121.1
10    30 123.1
11    30 119.7
12    30 118.9
13    60 109.9
14    60 110.1
15    60 113.1
16    60 112.3"), header=TRUE,
  stringsAsFactors=TRUE)    # needed on R >= 4.0.0 to get a factor column
dat
plot(dat$treat, dat$yield)
dat.sub <- subset(dat, treat != "cont"); dat.sub
dat.sub <- droplevels(dat.sub)    # drop unwanted levels
plot(dat.sub$treat, dat.sub$yield)

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx




Re: [R] factor level issue after subsetting

2011-11-01 Thread Schreiber, Stefan
Thanks for the fast response and your comments!

That works perfectly!

Another little mystery solved ;)

Stefan

 

 
