Re: [R] Mann-Whitney U

2007-08-15 Thread Natalie O'Toole
Hi,

I do want to use the Mann-Whitney test which ranks my data and then uses 
those ranks rather than the actual data.

Here is the R code i am using:

group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94,NA)
result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95, na.action)

paired = FALSE so that the Wilcoxon rank sum test which is equivalent to 
the Mann-Whitney test is used (my samples are NOT paired).
conf.level = 0.95 to specify the confidence level
na.action is used because i have a NA value (i suspect i am not using 
na.action in the correct manner)

When i use this code i get the following error message:

Error in arg == choices : comparison (1) is possible only for atomic and 
list types
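
A possible reading of this error: in the default method the third positional argument is `alternative`, so a bare `na.action` (which is a function) ends up being compared against the allowed character choices. The sketch below shows two ways to handle the NA instead. The short vectors `g1` and `g2` and the data frame `long` are hypothetical stand-ins, not the data from this post.

```r
# Hypothetical short vectors standing in for group1 and group2.
g1 <- c(1.34, 1.47, 1.62, 1.70, 2.19)
g2 <- c(0.98, 1.18, 1.25, 1.33, NA)

# Option 1: drop the NA explicitly before calling the default method.
res_default <- wilcox.test(g1, g2[!is.na(g2)], paired = FALSE)

# Option 2: use the formula interface, where na.action is a named argument.
long <- data.frame(
  value = c(g1, g2),
  group = factor(rep(c("g1", "g2"), times = c(length(g1), length(g2))))
)
res_formula <- wilcox.test(value ~ group, data = long, na.action = na.omit)

print(res_default$statistic)
print(res_formula$statistic)
```

Both calls should report the same W, since the same observations are used either way.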

When i use this code:

group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94,NA)
result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95)

I get the following result:

  Wilcoxon rank sum test with continuity correction

data:  group1 and group2 
W = 405.5, p-value = 0.6494
alternative hypothesis: true location shift is not equal to 0 

Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(group1, 
group2, paired = FALSE, conf.level = 0.95) 

The W value here is 405.5 with a p-value of 0.6494


in SPSS, i am ranking my data and then performing a Mann-Whitney U by 
selecting analyze -> non-parametric tests -> 2 independent samples and then 
checking off the Mann-Whitney U test.

For the Mann-Whitney test in SPSS i am getting the following results:

Mann-Whitney U = 350.5
2-tailed p value = 0.643

I think maybe the discrepancy has to do with the specification of the NA 
values in R, but i'm not sure.


If anyone has any suggestions, please let me know!

I hope i have provided enough information to convey my problem.

Thank-you, 

Nat
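
One way to check whether the two programs actually disagree: R's W counts the pairs in which a group1 value exceeds a group2 value (ties count one half), and the two possible U statistics always sum to n1 * n2. With 28 and 27 non-missing values, 28 * 27 = 756 and 756 - 405.5 = 350.5, which is the SPSS figure. The sketch below reproduces that arithmetic with the data from this post. That SPSS reports the smaller U, and that its asymptotic p-value omits the continuity correction, are assumptions here, not something confirmed in the thread.

```r
group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,
            2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,
            1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94,NA)

# Sample sizes after dropping the NA (the default method ignores it anyway).
n1 <- sum(!is.na(group1))
n2 <- sum(!is.na(group2))

# exact = FALSE requests the normal approximation directly, so no tie warning.
res <- wilcox.test(group1, group2, paired = FALSE, exact = FALSE)
W <- unname(res$statistic)

# The two U statistics sum to n1 * n2; take the smaller one.
U_small <- min(W, n1 * n2 - W)
print(c(W = W, U_small = U_small))

# Assumption: turning off the continuity correction may bring the p-value
# closer to the SPSS asymptotic value.
res_nocc <- wilcox.test(group1, group2, paired = FALSE,
                        exact = FALSE, correct = FALSE)
print(c(with_correction = res$p.value, without_correction = res_nocc$p.value))
```

If this reading is right, the NA is not the cause of the difference; the two statistics describe the same ranking from opposite sides.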
__


Natalie,

It's best to provide at least a sample of your data.  Your field names 
suggest that your data might be collected in units of mm^2 or some similar 
measurement of area.  Why do you want to use Mann-Whitney, which will rank 
your data and then use those ranks rather than your actual data?  Unless 
your sample is quite small, why not use a two sample t-test?  Also, are your 
samples paired?  If they aren't, did you use the paired = FALSE option?

JWDougherty

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




 

This communication is intended for the use of the recipient to which it is 
addressed, and may
contain confidential, personal, and or privileged information. Please 
contact the sender
immediately if you are not the intended recipient of this communication, 
and do not copy,
distribute, or take action relying on it. Any communication received in 
error, or subsequent
reply, should be deleted or destroyed.



 



[R] Mann-Whitney U test discrepancies

2007-08-15 Thread Natalie O'Toole

Hi,

I do want to use the Mann-Whitney test which ranks my data and then uses
those ranks rather than the actual data.

Here is the R code i am using:

group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94,NA)
result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95, na.action)

paired = FALSE so that the Wilcoxon rank sum test which is equivalent to
the Mann-Whitney test is used (my samples are NOT paired).
conf.level = 0.95 to specify the confidence level
na.action is used because i have a NA value (i suspect i am not using
na.action in the correct manner)

When i use this code i get the following error message:

Error in arg == choices : comparison (1) is possible only for atomic and
list types

When i use this code:

group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94,NA)
result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95)

I get the following result:

  Wilcoxon rank sum test with continuity correction

data:  group1 and group2
W = 405.5, p-value = 0.6494
alternative hypothesis: true location shift is not equal to 0

Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(group1,
group2, paired = FALSE, conf.level = 0.95)

The W value here is 405.5 with a p-value of 0.6494

in SPSS, i am ranking my data and then performing a Mann-Whitney U by
selecting analyze -> non-parametric tests -> 2 independent samples and then
checking off the Mann-Whitney U test.

For the Mann-Whitney test in SPSS i am getting the following results:

Mann-Whitney U = 350.5
2-tailed p value = 0.643

I think maybe the discrepancy has to do with the specification of the NA
values in R, but i'm not sure.

If anyone has any suggestions, please let me know!

I hope i have provided enough information to convey my problem.

Thank-you,

Nat


[R] Mann-Whitney U

2007-08-14 Thread Natalie O'Toole
Hi,

Could someone please tell me how to perform a Mann-Whitney U test on a 
dataset with 2 groups where one group has more data values than another?

I have split up my 2 groups into 2 columns in my .txt file i'm using with 
R. Here is the code i have so far...

group1 <- c(LeafArea2)
group2 <- c(LeafArea1)
wilcox.test(group1, group2)

This code works for datasets with the same number of data values in each 
column, but not when there is a different number of data values in one 
column than another column of data.

Is the solution that i have to have a null value in the data column with 
the fewer data values?

I'm testing for significant differences between the 2 groups, and the 
result i'm getting in R with the uneven values is different from what i'm 
getting in SPSS.

Help please!

Nat
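
A minimal sketch of one way to handle columns of unequal length, assuming the shorter column is padded with NA in the text file. The inline text `txt` and its values are hypothetical; only the column names LeafArea1 and LeafArea2 come from the post. The rank sum test does not need equal group sizes, so each column is simply stripped of its NA padding before the call.

```r
# Hypothetical file contents; the shorter column is padded with NA.
txt <- "LeafArea1 LeafArea2
1.2 2.3
1.5 2.1
1.7 NA"

leaf <- read.table(text = txt, header = TRUE)

# Keep only the observed values from each column.
group1 <- leaf$LeafArea2[!is.na(leaf$LeafArea2)]
group2 <- leaf$LeafArea1[!is.na(leaf$LeafArea1)]

res <- wilcox.test(group1, group2)
print(c(n1 = length(group1), n2 = length(group2)))
print(res$statistic)
```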



 



Re: [R] weight

2007-04-30 Thread Natalie O'Toole
Hi,

These are the variables in my file. I think the variable i'm having 
problems with is WTPP which is of the Factor type. Does anyone know how to 
fix this, please?

Thanks,

Nat

'data.frame':   290 obs. of  5 variables:
 $ PROV  : num  48 48 48 48 48 48 48 48 48 48 ...
 $ REGION: num  4 4 4 4 4 4 4 4 4 4 ...
 $ GRADE : num  7 7 7 7 7 7 7 7 7 7 ...
 $ Y_Q10A: num  1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 ...
 $ WTPP  : Factor w/ 1884 levels "1,106.8250","1,336.5138",..: 1544 67 1568 40 221 1702 1702 1434 310 310 ...


__



--- Douglas Bates [EMAIL PROTECTED] wrote:

 On 4/28/07, John Kane [EMAIL PROTECTED] wrote:
  IIRC you have a yes/no smoking variable scored 1/2?
 
  It is possibly being read in as a factor, not as an integer.
 
  Try
   class(df$smoking.variable)
  to see.
 
 Good point.  In general I would recommend using
 
 str(df)
 
 to check on the class or storage type of all variables in a data frame
 if you are getting unexpected results when manipulating it.  That
 function is carefully written to provide a maximum of information in a
 minimum of space.

Yes, but I'm a relative newbie at R and didn't realise
that str() would do that.  I always thought it was
some kind of string function.

Thanks, it makes life much easier.
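
As a small illustration of the check recommended above, the sketch below builds a hypothetical data frame `df_demo` with one factor column and inspects it. The first two WTPP strings are taken from the str() output earlier in this thread; the third value and the GRADE column are made up.

```r
# Hypothetical data frame with one column stored as a factor.
df_demo <- data.frame(
  GRADE = c(7, 7, 7),
  WTPP  = factor(c("1,106.8250", "1,336.5138", "980.1000"))
)

# str() shows the storage type of every column in a compact form.
str(df_demo)

# sapply(..., class) gives the same information as a named vector.
col_classes <- sapply(df_demo, class)
print(col_classes)
```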

 



[R] thousand separator (was RE: weight)

2007-04-30 Thread Natalie O'Toole
Thank-you Andy!! That works great now

Nat

__


I've run into this occasionally.  My current solution is simply to read
it into Excel, re-format the offending column(s) by unchecking the
thousand separator box, and write it back out.  Not exactly ideal to
say the least.  If anyone can provide a better solution in R, I'm all
ears...

Andy 
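
One possible way to do this inside R, offered as a sketch and not as the fix that was actually used in this thread: convert the factor to character, strip the commas, and then convert to numeric. The vector `wtpp_raw` is a hypothetical stand-in for the WTPP column; its first two values come from the str() output in this thread and the third is made up.

```r
# Hypothetical stand-in for a WTPP column read in with thousand separators.
wtpp_raw <- factor(c("1,106.8250", "1,336.5138", "980.1000"))

# Go through character first; as.numeric() on a factor returns level codes.
wtpp_num <- as.numeric(gsub(",", "", as.character(wtpp_raw), fixed = TRUE))
print(wtpp_num)
```

In a data frame this would be applied to the single offending column, for example `df$WTPP`, leaving the other columns untouched.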


Re: [R] weight

2007-04-28 Thread Natalie O'Toole
Hi,

I'm getting an error message:

Error in df[, 1:4] * df[, 5] : non-numeric argument to binary operator
In addition: Warning message:
Incompatible methods (Ops.data.frame, Ops.factor) for * 

here is my code:


##reading in the file
happyguys <- read.table("c:/test4.dat", header=TRUE, row.names=1)

##subset the file based on Select If

test <- subset(happyguys, PROV==48 & GRADE == 7 & Y_Q10A < 9)

##sorting the file

mydata <- test
mydataSorted <- mydata[ order(mydata$Y_Q10A), ]
print(mydataSorted)


##assigning  a different name to file

happyguys <- mydataSorted


##trying to weight my data

data.frame <- happyguys
df <- data.frame
df1 <- df[, 1:4] * df[, 5]

##getting error message here??

Error in df[, 1:4] * df[, 5] : non-numeric argument to binary operator
In addition: Warning message:
Incompatible methods (Ops.data.frame, Ops.factor) for * 

Does anyone know what this error message means?

I've been reviewing R code all day and getting more familiar with it

Thanks,

Nat
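
A sketch of what the message is likely reporting: the fifth column is stored as a factor, and arithmetic between a data frame and a factor is not defined. The data frame `demo` below is hypothetical (two rows, with WTPP strings borrowed from later in the thread). It also shows a common pitfall, namely that as.numeric() applied directly to a factor returns the internal level codes, not the printed values.

```r
# Hypothetical two-row data frame with the weight stored as a factor.
demo <- data.frame(
  PROV  = c(48, 48),
  GRADE = c(7, 7),
  WTPP  = factor(c("1,106.8250", "1,336.5138"))
)

# Capture whatever condition the multiplication raises, without stopping.
attempt <- tryCatch(
  demo[, 1:2] * demo[, 3],
  warning = function(w) conditionMessage(w),
  error   = function(e) conditionMessage(e)
)
print(attempt)

# Pitfall: this gives the level codes 1 and 2, not 1106.825 and 1336.5138.
codes <- as.numeric(demo$WTPP)
print(codes)
```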



 



Re: [R] select if + other questions

2007-04-27 Thread Natalie O'Toole
Hi,

Does anyone know how to skip variables (or columns) in R. Say, for example 
i had PUMFID position1 and Y_Q10A position 33 and i do not want to include 
all the variables in between. Is there a way to do this in R when you are 
extracting variables from a large .txt file with many, many variables?

Thanks,

Nat
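
One way to skip columns while reading a fixed-width file is to give read.fwf() negative widths for the stretches to ignore. The sketch below uses the layout described in this thread (PUMFID in columns 1-5, PROV in 6-7, GRADE in 9-10, Y_Q10A in 33). The helper `make_record`, the temporary file and the two records are hypothetical and exist only to make the example self-contained.

```r
# Hypothetical helper that builds one 33-character record.
make_record <- function(id, prov, grade, q10a) {
  paste0(
    sprintf("%05d", id),     # columns 1-5:  PUMFID
    sprintf("%02d", prov),   # columns 6-7:  PROV
    " ",                     # column 8:     skipped
    sprintf("%02d", grade),  # columns 9-10: GRADE
    strrep("x", 22),         # columns 11-32: skipped
    q10a                     # column 33:    Y_Q10A
  )
}

records <- c(make_record(1, 48, 7, 1), make_record(2, 48, 8, 9))
path <- tempfile(fileext = ".txt")
writeLines(records, path)

# Negative widths tell read.fwf() to skip that many characters.
dat <- read.fwf(
  path,
  widths    = c(5, 2, -1, 2, -22, 1),
  col.names = c("PUMFID", "PROV", "GRADE", "Y_Q10A")
)
print(dat)
```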

__


Yes, but I believe it will vary depending on what
package you're using.  I don't deal with weighted data,
so I'm not a good source.

Have a look at the help for something like lm in the stats
package (part of the base installation) for an example.

?lm

weights is the fourth argument down.

However, for more information try
http://finzi.psych.upenn.edu/search.html and type in
weight.

As Brian Ripley says in a reply to a question about
weights:
 Almost all methods I know of do: logistic
regression, neural nets, classification trees, PPR
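
For frequency tables specifically, one base R option is xtabs() with the weight on the left-hand side of the formula, which sums the weights within each category. This is only a sketch: the data frame `survey` and its values are hypothetical, and whether the result matches what SPSS does under its weighting command for every procedure is an assumption, not something established in this thread.

```r
# Hypothetical survey extract: an answer code and a numeric weight per row.
survey <- data.frame(
  Y_Q10A = c(1, 2, 2, 1),
  WTPP   = c(10.5, 20, 30, 4.5)
)

# Sum of weights per answer category (weighted frequency counts).
weighted_counts <- xtabs(WTPP ~ Y_Q10A, data = survey)
print(weighted_counts)
```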
 


--- Natalie O'Toole [EMAIL PROTECTED] wrote:

 Hi,
 
 Thank-you for the response!! That worked great!! Is
 there any way to apply 
 a weight variable to your file similar to what you
 can do in SPSS? So that 
 all of your other variables will be weighted by the
 weight variable?
 
 Thanks,
 
 Nat
 
 __
 
 
 Hi, 
 
 i am trying to read a .txt file, do a couple of
 select if statements on my 
 data, and then finally use the ?table function to
 get frequency counts on 
 the data. Specifically, i am looking at answering
 the following question: 
 
 What is the frequency of Grade 7 students in the
 province of Alberta who 
 are smokers? 
 
 I am having some problems: 
 
 1)i cannot get the column names to show up when
 print to screen 
 
 2)I cannot seem to skip variables properly when i
 choose certain other 
 variables 
 
 3)i cannot get the combination of Select If
 statements to work to produce 
 a different table with my new criteria
 
 Here are the variables 
 
 PUMFID position1 length 5 
 PROV position 6 length 2 
 GRADE position 9 length 2 
 Y_Q10A position 33 length 1 
 
 
 Y_Q10A has the following values: 1=yes, 2=no, 9=skip 
 
 all the others have no skipped or missing values 
 
 Here is my code: 
 
 myfile <- ("c:/test2.txt") 
 myVariableNames <- c("PUMFID","PROV","GRADE","Y_Q10A")
 
 myVariableWidths <- c(5,2,2,1) 
 
 
 mydata <- read.fwf( 
 file=myfile, 
 width=myVariableWidths, 
 col.names=myVariableNames, 
 row.names="PUMFID", 
 fill=TRUE, 
 strip.white=TRUE) 
 
 
 print(mydata) 
 
 print( mydata [which(PROV=="AB" & GRADE==7 & Y_Q10A<9), ] ) 
 
 
 
 Any help would be greatly appreciated!! 
 
 Thank-you, 
 
 Nat 
 


 
 

Re: [R] select if + other questions

2007-04-27 Thread Natalie O'Toole
Hi John,

I figured out the skipping of variables using a first subset and then making 
a second subset, and it worked!

Thanks,

Natalie

__

You need to have a look at Chapter 5 of the Intro to
R.   I would recommend downloading the pdf and
printing it out.  It is not an easy read but it should
help.

newdata <- mydata[, c("PUMFID", "Y_Q10A")]

or
newdata <- mydata[, c(1,33)]

should do the trick
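
A small runnable version of the two forms above, using a hypothetical four-column `mydata` (so the positional form uses column 4 where the real file would use 33):

```r
# Hypothetical data frame with the column names from this thread.
mydata <- data.frame(
  PUMFID = c(1, 2),
  PROV   = c(48, 48),
  GRADE  = c(7, 8),
  Y_Q10A = c(1, 9)
)

# Select by name: robust if the column order ever changes.
by_name <- mydata[, c("PUMFID", "Y_Q10A")]

# Select by position: shorter, but tied to the current layout.
by_pos <- mydata[, c(1, 4)]

print(by_name)
```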


[R] weight

2007-04-27 Thread Natalie O'Toole
Hi,

I have the file below called happyguys. It is a subset of data. How do I 
apply the weight variable (WTPP) to this file? Can I just multiply each 
column (except the first column because it is a record id) by WTPP? If the 
answer is yes, how do I multiply one variable name by another?

Thanks,

Nat



     PROV REGION GRADE Y_Q10A         WTPP
83     48      4     7      2 342233324020
115    48      4     7      1 434413433040
185    48      4     7      1 432312433040
222    48      4     7      2     13311030
242    48      4     7      1 421313332020
247    48      4     7      2 312134212030
352    48      4     7      1 331112411040
562    48      4     7      2 331112321030
591    48      4     7      1 321334413030
663    48      4     7      2 441412442040
691    48      4     7      1 333213343020
730    48      4     7      1     43321030
850    48      4     7      1 343113422040
1016   48      4     7      1 322124413050
1041   48      4     7      1 331133432040
1163   48      4     7      1 433913439040
1211   48      4     7      2 211213421030
1245   48      4     7      2 231113331020
1283   48      4     7      1 432114432030
1723   48      4     7      2 233112422040
1765   48      4     7      1 331113421040
1766   48      4     7      2 443434234040
1894   48      4     7      2 311142321040
1976   48      4     7      1 113124312040
2092   48      4     7      1 333122343040
2093   48      4     7      1 341312412040
2248   48      4     7      2     31213040
2396   48      4     7      2 424113332040
2405   48      4     7      1     43220030
2438   48      4     7      1 421314432030
2488   48      4     7      1 421123322040
2579   48      4     7      2 312113241040
2637   48      4     7      1 421132432030
2699   48      4     7      1 444212433050
2738   48      4     7      1     24311040
2759   48      4     7      1     43311040
2856   48      4     7      1     14410060
2964   48      4     7      2 413223413030
3107   48      4     7      2 232233324030
3166   48      4     7      2 322234324030
3169   48      4     7      2     32424040
3480   48      4     7      2 311122421040
3519   48      4     7      2 432224234020
3645   48      4     7      1 321112221040
3681   48      4     7      2 344112432040
3698   48      4     7      1     44311030
3703   48      4     7      1 313311412040
3737   48      4     7      2 343234324040
3889   48      4     7      1 431123322020
3896   48      4     7      2 233313223030
3915   48      4     7      1 311312411040
3929   48      4     7      2 243314223030
3934   48      4     7      2 223112332040
3937   48      4     7      2 332122423030
3957   48      4     7      2 211194449030
3983   48      4     7      1 331312432040
4052   48      4     7      2 423313413040
4147   48      4     7      1     33321030
4168   48      4     7      2 322131323040
4253   48      4     7      1 343432324040
4263   48      4     7      1 211132411060
4324   48      4     7      1 331113421040
4402   48      4     7      2 321112331030
4528   48      4     7      1 444113312030
4570   48      4     7      2 441114221040
4600   48      4     7      1     22220030
4640   48      4     7      2 321234323050
4672   48      4     7      2 342134433040
4701   48      4     7      2 241433423020
4710   48      4     7      2 331114331030
4728   48      4     7      2 321213422050
4764   48      4     7      2 333413233040
4765   48      4     7      2     24224030
4794   48      4     7      2     32320030
4915   48      4     7      1     43420050
4921   48      4     7      2 443412413040
4944   48      4     7      1 411343412050
4963   48      4     7      2 322314313030
5089   48      4     7      1     22411040
5173   48      4     7      2 311134431030
5466   48      4     7      2 332332424030
5484   48      4     7      2     33324030
__




 

This communication is intended for the use of the recipient to which it is 
addressed, and may
contain confidential, personal, and or privileged information. Please 
contact the sender
immediately if you are not the intended recipient of this communication, 
and do not copy,
distribute, or take action relying on it. Any communication received in 
error, or subsequent
reply, should be deleted or destroyed.
[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] weight

2007-04-27 Thread Natalie O'Toole
__


Hi Dr. Kubovy, 

Here is my code so far. My question is: how do I then get a frequency 
count of Y_Q10A with the WTPP applied to it? 

myfile <- "c:/test2.txt"
mysubset <- myfile
mysubset$Y_Q02 <- mysubset$DVSELF <- NULL
mysubset2 <- mysubset
mysubset2$Y_Q10B <- mysubset2$GP2_07 <- NULL

myVariableNames <- c("PUMFID","PROV","REGION","GRADE","Y_Q10A","WTPP")
myVariableWidths <- c(5,2,1,2,1,12.4)


mysubset2 <- read.fwf(
file=myfile,
width=myVariableWidths,
col.names=myVariableNames,
row.names="PUMFID",
fill=TRUE,
strip.white=TRUE)



print(mysubset2)

happyguys <- subset(mysubset2, PROV==48 & GRADE == 7 & Y_Q10A < 9)
print(happyguys)


df <- data.frame(PROV = rnorm(10), REGION = rnorm(10), GRADE = rnorm(10),
Y_Q10A = rnorm(10), WTTP = rnorm(10))
df1 <- df[, 1:4] * df[, 5]
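
One possible way to get a weighted frequency count, sketched under assumptions: the data frame happy below is a made-up stand-in for happyguys with only the two columns that matter here, and the weights are invented. The idea is to add up the weights within each answer category instead of counting rows.

```r
# Made-up stand-in for happyguys: Y_Q10A codes (1 = yes, 2 = no) and weights.
happy <- data.frame(Y_Q10A = c(2, 1, 1, 2, 1),
                    WTPP   = c(34.2, 43.4, 43.2, 13.3, 42.1))

unweighted <- table(happy$Y_Q10A)                 # number of records per answer
weighted   <- xtabs(WTPP ~ Y_Q10A, data = happy)  # sum of weights per answer

print(unweighted)
print(weighted)
```

With a formula that has a left-hand side, xtabs sums that variable within each cell, which is what a weighted count amounts to.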

Thanks, 

Nat 


__


df <- data.frame(PROV = rnorm(10), REGION = rnorm(10), GRADE = rnorm(10),
Y_Q10A = rnorm(10), WTTP = rnorm(10))
df1 <- df[, 1:4] * df[, 5]
The column you were worried about is not part of the data.
You can get a vector of the record ids by
rownames(df)

On Apr 27, 2007, at 1:05 PM, Natalie O'Toole wrote:

 I have the file below called happyguys. It is a subset of data. How
 do I apply the weight variable (WTPP) to this file? Can I just multiply
 each column (except the first column because it is a record id) by WTPP?
 If the answer is yes, how do I multiply one variable name by another?

   PROV REGION GRADE Y_Q10A WTPP
 83  48  4 7  2 342233324020
 115 48  4 7  1 434413433040
 185 48  4 7  1 432312433040
 222 48  4 7  2 13311030
 242 48  4 7  1 421313332020
 247 48  4 7  2 312134212030
snip
_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O.Box 400400    Charlottesville, VA 22904-4400
Parcels:  Room 102          Gilmer Hall
          McCormick Road    Charlottesville, VA 22903
Office:   B011              +1-434-982-4729
Lab:      B019              +1-434-982-4751
Fax:      +1-434-982-4766
WWW:      http://www.people.virginia.edu/~mk9y/




 



Re: [R] weight

2007-04-27 Thread Natalie O'Toole
Does anyone know why it is giving me this error? Any help would be greatly 
appreciated!!

Thanks,

Nat



myfile <- "c:/test2.txt"
mysubset <- myfile
mysubset$Y_Q02 <- mysubset$DVSELF <- NULL
mysubset2 <- mysubset
mysubset2$Y_Q10B <- mysubset2$GP2_07 <- NULL

myVariableNames <- c(PUMFID=rnorm(10),PROV=rnorm(10),REGION=rnorm(10),GRADE=rnorm(10),Y_Q10A=rnorm(10),WTPP=rnorm(10))
df <- mysubset2[, 2:5] * mysubset2[, 6]
myVariableWidths <- c(5,2,1,2,1,12.4)
df <- read.fwf(
file=myfile,
width=myVariableWidths,
col.names=myVariableNames,
row.names="PUMFID",
fill=TRUE,
strip.white=TRUE)

happyguys <- subset(df, PROV==48 & GRADE == 7 & Y_Q10A < 9)
print(happyguys)


On the line df <- mysubset2[, 2:5] * mysubset2[, 6] (the line that was
bolded in the original message), I'm getting the following error:
Error in mysubset2[, 2:5] : incorrect number of dimensions
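
A likely reading of this error, offered as a sketch and not a definitive diagnosis: when the failing line runs, mysubset2 is still derived from the file name and not from the data, because read.fwf has not been called yet. An object with no rows and columns cannot be indexed with [, 2:5]. The small data frame d below is made up for contrast.

```r
# The file name is a length-1 character vector, so two-index subsetting fails.
myfile <- "c:/test2.txt"
msg <- tryCatch(myfile[, 2:5], error = function(e) conditionMessage(e))
print(msg)

# A data frame has rows and columns, so the same kind of indexing works.
d <- data.frame(a = 1:3, b = 4:6, w = c(1, 2, 3))
weighted <- d[, 1:2] * d[, 3]
print(weighted)
```

Under that reading, moving the multiplication to after the read.fwf call, and applying it to the data frame read.fwf returns, would be the first thing to try.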


[R] like SPSS

2007-04-27 Thread Natalie O'Toole
Hi,

I've written code to extract a PUMF file in R, subset it, and weight it 
like you would do in SPSS. My code is below and it works great. My question 
is: how do I then calculate the frequencies of smokers (1) versus 
non-smokers (2) after having weighted my file? Or even, what is the process 
that SPSS goes through to aggregate the data?

Thanks,

Nat


Here is my code:

myfile <- "c:/test2.txt"
mysubset <- myfile
mysubset$Y_Q02 <- mysubset$DVSELF <- NULL
mysubset2 <- mysubset
mysubset2$Y_Q10B <- mysubset2$GP2_07 <- NULL

myVariableNames <- c("PUMFID","PROV","REGION","GRADE","Y_Q10A","WTPP")
myVariableWidths <- c(5,2,1,2,1,12.4)


mysubset2 <- read.fwf(
file=myfile,
width=myVariableWidths,
col.names=myVariableNames,
row.names="PUMFID",
fill=TRUE,
strip.white=TRUE)



print(mysubset2)

happyguys <- subset(mysubset2, PROV==48 & GRADE == 7 & Y_Q10A < 9)
print(happyguys)

data.frame <- happyguys


df <- data.frame(PROV,REGION,GRADE,Y_Q10A,WTPP)
df1 <- df[, 1:4] * df[, 5]
print(df1)
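
A hedged sketch of what weighting a frequency table usually means: each record stands for as many people as its weight says, so the weighted frequency of an answer is the sum of the weights of the records giving that answer. The vectors smoke and weight below are made up, and this is an illustration of the general idea, not a description of SPSS internals.

```r
# Made-up records: smoking answer (1 = smoker, 2 = non-smoker) and survey weight.
smoke  <- c(1, 1, 2, 2, 2)
weight <- c(10, 30, 20, 20, 20)

# Each record counts for 'weight' people, so add up the weights per answer.
weighted_n   <- tapply(weight, smoke, sum)
weighted_pct <- 100 * weighted_n / sum(weighted_n)

# Multiplying the answer codes by the weights (for example 1 * 10 = 10)
# produces numbers that no longer mean "smoker" or "non-smoker".
print(weighted_n)
print(weighted_pct)
```

In this toy data smokers are 2 of 5 records, and also 40 percent of the weighted total, because their weights sum to 40 out of 100.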






 



[R] select if + other questions

2007-04-26 Thread Natalie O'Toole
Hi,

I am trying to read a .txt file, do a couple of select-if statements on my 
data, and then finally use the table function (?table) to get frequency 
counts on the data. Specifically, I am looking at answering the following 
question:

What is the frequency of Grade 7 students in the province of Alberta who 
are smokers?

I am having some problems:

1) I cannot get the column names to show up when printing to the screen

2) I cannot seem to skip variables properly when I choose certain other 
variables

3) I cannot get the combination of Select If statements to work to produce 
a different table with my new criteria

Here are the variables:

PUMFID position 1 length 5
PROV position 6 length 2
GRADE position 9 length 2
Y_Q10A position 33 length 1


Y_Q10A has the following values:
1=yes
2=no
9=skip

All the others have no skipped or missing values

Here is my code:

myfile <- "c:/test2.txt"
myVariableNames <- c("PUMFID","PROV","GRADE","Y_Q10A")
myVariableWidths <- c(5,2,2,1)


mydata <- read.fwf(
file=myfile,
width=myVariableWidths,
col.names=myVariableNames,
row.names="PUMFID",
fill=TRUE,
strip.white=TRUE)


print(mydata)

print( mydata [which(PROV=="AB" & GRADE==7 & Y_Q10A<9), ] )



Any help would be greatly appreciated!!

Thank-you,

Nat
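
A minimal sketch of one way to approach the three problems above, under stated assumptions. The four records are made up and written to a temporary file with the column layout listed in the post (PUMFID 1-5, PROV 6-7, GRADE 9-10, Y_Q10A 33); the helper make_line is hypothetical; and PROV is assumed to hold the numeric code 48 for Alberta, as in the other posts in this thread. Negative widths in read.fwf mark columns to skip.

```r
# Hypothetical helper that builds one 33-character record.
make_line <- function(id, prov, grade, smoke) {
  paste0(sprintf("%05d", id), sprintf("%2d", prov), " ",
         sprintf("%2d", grade), strrep("0", 22), smoke)
}

tmp <- tempfile(fileext = ".txt")
writeLines(c(make_line(1L, 48L, 7L, 1L),
             make_line(2L, 48L, 7L, 2L),
             make_line(3L, 48L, 8L, 1L),
             make_line(4L, 35L, 7L, 9L)), tmp)

# Skip column 8 and columns 11-32 with negative widths.
mydata <- read.fwf(tmp,
                   widths = c(5, 2, -1, 2, -22, 1),
                   col.names = c("PUMFID", "PROV", "GRADE", "Y_Q10A"))

# subset() looks the column names up inside mydata, so no mydata$ prefix is needed.
ab7 <- subset(mydata, PROV == 48 & GRADE == 7 & Y_Q10A < 9)
counts <- table(ab7$Y_Q10A)
print(mydata)
print(counts)
```

The widths in the original call (5, 2, 2, 1) read the first ten characters of each line contiguously, which would not match the positions listed; that may be part of the problem, though this is a guess from the layout alone.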


 



[R] select if + other questions

2007-04-26 Thread Natalie O'Toole
Hi,

Thank-you for the response!! That worked great!! Is there any way to apply 
a weight variable to your file similar to what you can do in SPSS? So that 
all of your other variables will be weighted by the weight variable?

Thanks,

Nat



[R] help

2007-04-25 Thread Natalie O'Toole
Hi all,

I have a question:

How do I calculate the mean on an imported txt file? I've imported the 
file below and that's what it looks like imported. How do I then calculate 
the mean, median, or mode on the column LeafArea using the desktop R 
package?

Any help would be greatly appreciated!!

Thanks,

Nat

   LeafType Leaflets LeafArea ShapeRatio LeafWeight LeafThickness
1         1        3     0.12       0.12       0.21          0.00
2         1        3     0.17       0.17       0.36          0.00
3         1        3     0.21       0.05       0.47          0.16
4         1        3     0.11       0.14       0.23          0.21
5         2        3     0.03       0.27       0.16          0.60
6         2        3     0.08       0.20       0.15          0.75
7         2        3     0.22       0.05       0.24          1.09
8         2        3     0.20       0.10       0.26          1.30
9         2        3     0.18       0.10       0.33          1.33
10        2        3     0.14       0.07       0.19          1.36
11        2        3     0.16       0.13       0.22          1.38
12        2        3     0.18       0.06       0.25          1.39
13        2        3     0.05       0.00       0.07          1.40
14        2        3     0.11       0.01       0.21          1.41
15        2        3     0.22       0.04       0.31          1.41
16        2        3     0.09       0.10       0.13          1.44
17        2        3     0.09       0.10       0.13          1.44
18        2        3     0.13       0.08       0.19          1.46
19        2        3     0.15       0.13       0.22          1.47
20        2        3     0.15       0.03       0.22          1.47
21        2        3     0.21       0.01       0.31          1.48
22        1        3     0.21       0.14       0.32          1.50
23        2        3     0.10       0.00       0.15          1.50
24        1        3     0.26       0.60       0.40          1.53
25        2        3     0.12       0.18       0.20          1.54
26        2        3     0.20       0.15       0.31          1.55
27        1        3     0.19       0.16       0.31          1.60
28        1        3     0.13       0.00       0.21          1.62
29        1        3     0.13       0.01       0.21          1.62
30        1        3     0.37       0.27       0.60          1.62
31        2        3     0.11       0.09       0.18          1.64
32        2        3     0.14       0.00       0.23          1.64
33        2        3     0.15       0.08       0.21          1.64
34        2        3     0.20       0.10       0.33          1.65
35        2        3     0.15       0.01      -0.25          1.67
36        2        3     0.17       0.06       0.29          1.67
37        2        3     0.13       0.08       0.22          1.69
38        1        3     0.16       0.31       0.27          1.70
39        1        3     0.21       0.01       0.40          1.70
40        1        3     0.14       0.07       0.29          1.71
41        2        3     0.14       0.00       0.24          1.71
42        2        3     0.21       0.14       0.35          1.71
43        2        3     0.11       0.09       0.19          1.73
44        2        3     0.15       0.01       0.26          1.73
45        2        3     0.19       0.11       0.33          1.74
46        1        0     0.28       0.27       0.50          1.79
47        1        3     0.10       0.01       0.18          1.80
48        2        3     0.05       0.00       0.09          1.80
49        1        3     0.12       0.11       0.22          1.83
50        1        3     0.20       0.05       0.37          1.85
51        2        3     0.14       0.14       0.26          1.86
52        1        3     0.15       0.07       0.28          1.87
53        1        3     0.15       0.01       0.28          1.87
54        2        3     0.12       0.08       0.23          1.92
55        2        3     0.15       0.00       0.29          1.93
56        1        3     0.17       0.00       0.34          2.00
57        1        3     0.21       0.02       0.42          2.00
58        2        3     0.13       0.08       0.26          2.00
59        1        3     0.16       0.06       0.32          2.05
60        1        3     0.14       0.14       0.29          2.07
61        2        3     0.12       0.08       0.25          2.08
62        1        3     0.17       0.06       0.36          2.12
63        1        3     0.13       0.08       0.28          2.13
64        1        3     0.20       0.10       0.43          2.15
65        1        3     0.26       0.08       0.56          2.15
66        1        3     0.20       0.10       0.44          2.20
67        1        3     0.19       0.11       0.42          2.21
68        1        3     0.08       0.00       0.18          2.25
69        1        3     0.12       0.00       0.27          2.25
70        1        3     0.12       0.08       0.27
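
For the mean and median, a small sketch using base R. The data frame leaves is a made-up stand-in for the imported file, filled with what appear to be the first eight LeafArea values from the listing above; with the real import, the same calls would be applied to its LeafArea column.

```r
# Stand-in for the imported data: first eight LeafArea values from the listing.
leaves <- data.frame(LeafArea = c(0.12, 0.17, 0.21, 0.11, 0.03, 0.08, 0.22, 0.20))

area_mean   <- mean(leaves$LeafArea)    # arithmetic mean of the column
area_median <- median(leaves$LeafArea)  # middle value of the sorted column

print(area_mean)
print(area_median)
```

The $ operator picks one column out of the data frame by name, which is all that is needed before calling mean or median.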

[R] help

2007-04-25 Thread Natalie O'Toole
Hi,

Would anyone know how to calculate the modal value of LeafArea?

Thank-you very much!!

Nat
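
Base R has no ready-made function for the statistical mode (the function called mode reports an object's storage type), so one common approach is to tabulate the values and pick the most frequent. The vector leaf_area below is a made-up stand-in holding what appear to be the first twelve LeafArea values from the earlier listing.

```r
# Stand-in data: first twelve LeafArea values from the earlier listing.
leaf_area <- c(0.12, 0.17, 0.21, 0.11, 0.03, 0.08,
               0.22, 0.20, 0.18, 0.14, 0.16, 0.18)

counts <- table(leaf_area)                                   # frequency of each value
modal  <- as.numeric(names(counts)[counts == max(counts)])  # value(s) seen most often

print(counts)
print(modal)
```

If several values tie for the highest frequency, modal contains all of them.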


[R] aggregate similar to SPSS

2007-04-25 Thread Natalie O'Toole
Hi,

Does anyone know whether R can take a set of numbers and aggregate them
like you can in SPSS? For example, could you calculate the percentage
of people who smoke based on a dataset like the following:

smoke = 1
non-smoke = 2

variable
1
1
1
2
2
1
1
1
2
2
2
2
2
2


When aggregated, SPSS can tell you what percentage of persons are smokers
based on the frequency of 1's and 2's. Can the R statistical package do a
similar thing?

Thanks,

Nat
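
A short sketch of one way to do this in base R, using the fourteen values listed above. The variable name smoke is an assumption made for the example.

```r
# The fourteen values from the post: 1 = smoke, 2 = non-smoke.
smoke <- c(1, 1, 1, 2, 2, 1, 1, 1, 2, 2, 2, 2, 2, 2)

counts <- table(smoke)                # how many 1's and 2's
pct    <- 100 * prop.table(counts)    # the same counts as percentages

print(counts)
print(round(pct, 1))
```

table counts each distinct value and prop.table turns those counts into proportions of the total.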



[R] Rweb - calculating mean for an excel table opened in Rweb

2007-03-23 Thread Natalie O'Toole
Hi,

Would anyone happen to know the steps and code for calculating the mean of 
one column in a multicolumn Excel spreadsheet? I would like to do this 
calculation on a file opened from my computer into Rweb using the Rweb 
open file interface.

Any help would be greatly appreciated!

Thank-you,

Nat
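
A sketch under assumptions, since how Rweb exposes an uploaded file is not covered here: it assumes the spreadsheet has been saved as a CSV file that read.csv can read. The temporary file and its columns Plot, LeafArea and LeafWeight are made up.

```r
# Stand-in for a spreadsheet saved as CSV; the file and its columns are made up.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Plot,LeafArea,LeafWeight",
             "A,0.12,0.21",
             "B,0.17,0.36",
             "C,,0.47"), tmp)

sheet <- read.csv(tmp)
leaf_area_mean <- mean(sheet$LeafArea, na.rm = TRUE)  # skip the empty cell

print(leaf_area_mean)
```

The na.rm = TRUE argument matters when the column has empty cells, because mean otherwise returns NA.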
[EMAIL PROTECTED]

403-440-6794

 