Re: [R-sig-eco] subsetting data in R

2011-04-29 Thread Chris Howden
Try

pa2$influencia <- factor(pa2$influencia)
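
A minimal sketch of why factor() helps where as.factor() did not, assuming a small made-up data frame (the names `toy` and `sub` are illustrative and not from the thread). as.factor() returns an existing factor unchanged, whereas factor() rebuilds the levels from the values actually present.

```r
# Toy data standing in for the real file (hypothetical values).
toy <- data.frame(influencia = factor(c("AID", "AII", "AP", "AID")),
                  value = 1:4)

sub <- subset(toy, influencia == "AID")

# as.factor() leaves an existing factor untouched, so unused levels remain.
levels(as.factor(sub$influencia))   # "AID" "AII" "AP"

# factor() recomputes the levels from the values that are present.
levels(factor(sub$influencia))      # "AID"
```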




Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP Commercialisation and Innovation,
Data Analysis, Modelling and Training
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
ch...@trickysolutions.com.au


-Original Message-
From: r-sig-ecology-boun...@r-project.org
[mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of Manuel Spínola
Sent: Monday, 25 April 2011 12:37 AM
To: Christian Parker
Cc: r-sig-ecology@r-project.org
Subject: Re: [R-sig-eco] subsetting data in R

Thank you Christian.

Following your suggestion I got the following result:

> pa2 = subset(pa, influencia == "AP")
> pa2$influencia <- as.factor(pa2$influencia)
> levels(pa$influencia)
[1] "AID" "AII" "AP"


On 24/04/2011 07:42 a.m., Christian Parker wrote:
 You are creating a new object, but the columns that are stored as
 factors are not being 'refactored', so you are retaining the original
 list of levels. To fix this you can use the factor function after you
 subset:

 pa2 = subset(pa, influencia == "AID")
 pa2$influencia <- as.factor(pa2$influencia)



 On Apr 24, 2011, at 6:04 AM, Manuel Spínola mspinol...@gmail.com wrote:

 Dear list members,

 I have a question regarding subsetting a data set in R.

 I created an object for my data:

 pa = read.csv("espec_indic.csv", header = TRUE, sep = ",", check.names = FALSE)
 levels(pa$influencia)
 [1] "AID" "AII" "AP"

 The object has 3 levels for influencia (AP, AID, AII).

 Now I subset only the observations with influencia == "AID":

 pa2 = subset(pa, influencia == "AID")

 but if I ask for the levels of influencia, it still shows me the 3 levels,
 AP, AID, AII.

 levels(pa2$influencia)
 [1] "AID" "AII" "AP"

 Why is that?

 I thought I was creating a new data frame with only AID as
 a level for influencia.

 How can I make a completely new object that contains only the observations
 for AID, and in which the only level for influencia is indeed AID?

 Best,

 Manuel




 --
 *Manuel Spínola, Ph.D.*
 Instituto Internacional en Conservación y Manejo de Vida Silvestre
 Universidad Nacional Apartado 1350-3000 Heredia COSTA RICA
 mspin...@una.ac.cr mspinol...@gmail.com
 Phone: (506) 2277-3598
 Fax: (506) 2237-7036
 Personal website: Lobito de río
 https://sites.google.com/site/lobitoderio/
 Institutional website: ICOMVIS http://www.icomvis.una.ac.cr/


 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




Re: [R-sig-eco] subsetting data in R

2011-04-26 Thread Ben Bolker
  If this isn't already answered:

  I don't quite understand the question: what do you mean by "do a
complete data set from an object in R"?  What do you mean by "the
subsetting is dangerous ... as you need to specify the levels for all
your factors again"?

  (What do your 3000 columns of data represent?  If these are predictor
variables I hope you have a truly enormous number of responses ...)

  It may have been mentioned already, but droplevels(subset(...)) will
probably do what you want.  (I have tried very hard over the years to
get drop.levels= to be an optional argument to subset(), but so far I
have failed.)  droplevels() is an improvement over the drop.levels()
function in gdata because (1) it is in base R and (2) it doesn't reorder
the factor by default (which is what gdata::drop.levels [insanely in my
opinion] does).
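
A small sketch of the droplevels(subset(...)) idiom, assuming a made-up data frame `sites` with two factor columns (all names and values are hypothetical). It is meant only to show that the unused levels are dropped from every factor column of the result in one call.

```r
# Hypothetical data with two factor columns (illustrative names only).
sites <- data.frame(
  influencia = factor(c("AID", "AII", "AP", "AID")),
  habitat    = factor(c("forest", "pasture", "forest", "river")),
  count      = c(3, 0, 5, 2)
)

# subset() keeps every original level; droplevels() then removes the
# unused ones from all factor columns of the result in one step.
aid <- droplevels(subset(sites, influencia == "AID"))

levels(aid$influencia)
levels(aid$habitat)
```

With many factor columns this avoids having to call factor() on each column by hand.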



Re: [R-sig-eco] subsetting data in R

2011-04-26 Thread Manuel Spínola
Thank you very much Ben.

I was running an indicator species analysis on the subset data, but the
other levels were still present in the subset and the analysis was
taking them into account.
My 3000 columns are presence/absence data for plant species.
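
A hedged illustration of how unused levels can leak into a summary, using base R's table() on a made-up factor (the object names are hypothetical). Whether a particular indicator species routine behaves the same way depends on that routine; this only shows the general base R behaviour.

```r
# Hypothetical factor with three levels, subset to one of them.
influencia <- factor(c("AID", "AII", "AP", "AID"))
kept <- influencia[influencia == "AID"]

# The unused levels are still tabulated, with zero counts.
table(kept)

# After dropping them, only the level that is present remains.
table(droplevels(kept))
```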

Best,

Manuel


Re: [R-sig-eco] subsetting data in R

2011-04-24 Thread Christian Parker
You are creating a new object, but the columns that are stored as factors are 
not being 'refactored', so you are retaining the original list of levels. To fix 
this you can use the factor function after you subset:

pa2 = subset(pa, influencia == "AID")
pa2$influencia <- as.factor(pa2$influencia)





Re: [R-sig-eco] subsetting data in R

2011-04-24 Thread Roman Luštrik
You can also use droplevels() on your new object (as of R 2.12).
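
As a sketch of one option droplevels() offers for data frames, the `except` argument names columns whose levels should be left alone. The data frame `d` and its columns are invented for illustration.

```r
# Hypothetical data frame with two factor columns.
d <- data.frame(
  influencia = factor(c("AID", "AII", "AP")),
  season     = factor(c("dry", "wet", "dry"))
)
d2 <- d[d$influencia == "AID", ]

# Drop unused levels everywhere except in the second column (season).
d3 <- droplevels(d2, except = 2)

levels(d3$influencia)
levels(d3$season)
```

This may be useful when some factors should keep their full set of levels for later comparisons.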

Cheers,
Roman







-- 
In God we trust, all others bring data.



Re: [R-sig-eco] subsetting data in R

2011-04-24 Thread Sarah Goslee
By default, read.csv() turns character variables into factors, using all the
unique values as the levels.

subset() retains those levels by default, as they are a vital element of the
data. If you are studying some attribute of men and women, say height,
even if you are only looking at the heights for women it's important to remember
that men still exist.

If you don't want influencia to be a factor, you can change that at import
with stringsAsFactors=FALSE.
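
A minimal sketch of the import option, assuming inline CSV text in place of the real file (the contents and the name `pa_chr` are hypothetical). Note that from R 4.0.0 onward stringsAsFactors = FALSE is already the default, so passing it explicitly mainly matters on older versions.

```r
# Inline CSV text standing in for the real file (hypothetical contents).
csv_text <- "influencia,value
AID,1
AII,2
AP,3
AID,4"

pa_chr <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# The column is plain character, so there are no levels to carry along.
is.character(pa_chr$influencia)
unique(subset(pa_chr, influencia == "AID")$influencia)
```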

If you do want influencia to be a factor, but want the unused levels to be
removed, you can use factor() to do that.

> testdata <- data.frame(group=c("A", "B", "C", "A", "B", "C"), value=1:6)
> testdata
  group value
1     A     1
2     B     2
3     C     3
4     A     4
5     B     5
6     C     6
> str(testdata)
'data.frame':   6 obs. of  2 variables:
 $ group: Factor w/ 3 levels "A","B","C": 1 2 3 1 2 3
 $ value: int  1 2 3 4 5 6
> subset(testdata, group=="A")
  group value
1     A     1
4     A     4
> subset(testdata, group=="A")$group
[1] A A
Levels: A B C
> ?subset
> factor(subset(testdata, group=="A")$group)
[1] A A
Levels: A

Sarah



-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R-sig-eco] subsetting data in R

2011-04-24 Thread Manuel Spínola
Thank you very much for your response, Christian, Roman, and Sarah.

Sarah,

I am trying your suggestion but I cannot see the levels:

> pa2 = factor(subset(pa, influencia == "AP")$influencia)
> levels(pa2$influencia)
Error in pa2$influencia : $ operator is invalid for atomic vectors
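
A short sketch of what the error is pointing at, using an invented data frame `toy` (hypothetical names and values): wrapping the extracted column in factor() yields a factor vector rather than a data frame, so there is no column to reach with `$`.

```r
# Hypothetical toy data (names are illustrative).
toy <- data.frame(influencia = factor(c("AID", "AII", "AP", "AP")),
                  value = 1:4)

# This returns a factor vector, not a data frame ...
f <- factor(subset(toy, influencia == "AP")$influencia)
is.data.frame(f)

# ... so its levels are read from the vector itself, without `$`.
levels(f)
```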

Best,

Manuel





Re: [R-sig-eco] subsetting data in R

2011-04-24 Thread Gustavo Carvalho
pa2 <- subset(pa, influencia == "AP")
pa2$influencia <- factor(pa2$influencia)
levels(pa2$influencia)



Re: [R-sig-eco] subsetting data in R

2011-04-24 Thread Manuel Spínola
Thank you Christian.

Following your suggestion I got the following result:

> pa2 = subset(pa, influencia == "AP")
> pa2$influencia <- as.factor(pa2$influencia)
> levels(pa$influencia)
[1] "AID" "AII" "AP"




Re: [R-sig-eco] subsetting data in R

2011-04-24 Thread Manuel Spínola
Thank you very much Gustavo.

That works.

Manuel



Re: [R-sig-eco] subsetting data in R

2011-04-24 Thread Manuel Spínola
Thank you for all the responses.

Is there a way to do a complete data set from an object in R?
I have a data set with more than 3000 columns.

The subsetting is ok but it could be dangerous if you are using other 
factors to do some analysis as you need to specify the levels for all 
your factors again.
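
One possible way to re-level every factor column at once in plain base R, sketched on a made-up data frame (the names `d`, `d2`, and the columns are hypothetical). It applies factor() to each factor column and leaves the other columns as they are.

```r
# Hypothetical data frame mixing factor and non-factor columns.
d <- data.frame(
  influencia = factor(c("AID", "AII", "AP", "AID")),
  zone       = factor(c("north", "south", "north", "north")),
  presence   = c(1L, 0L, 1L, 1L)
)
d2 <- subset(d, influencia == "AID")

# Rebuild every factor column; leave other columns untouched.
d2[] <- lapply(d2, function(col) if (is.factor(col)) factor(col) else col)

sapply(d2, nlevels)
```

Assigning into `d2[]` keeps the result a data frame, so the same line should work regardless of how many columns there are.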

Best,

Manuel
