Re: [R] Deparse substitute assign with list elements

2015-05-15 Thread soeren . vogel
Thanks, Bill, I should have googled more carefully:

http://stackoverflow.com/questions/9561053/assign-values-to-a-list-element-in-r

So, remove

assign(nm, tmp, parent.frame())

and add

txt <- paste( nm, '<-', tmp, sep='' )
eval( parse(text=txt), parent.frame() )

in `foo()` will do the trick.
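
For reference, a minimal sketch of `foo()` with that change applied, reusing `bar()` and the two test cases from the original post. The second variant, `foo2()`, is a hypothetical alternative that is not from this thread: it builds the assignment call directly instead of pasting text, so it should also work when the new value cannot be written as a short literal.

```r
bar <- function(x) x + 3

# Variant from this thread: paste the assignment as text and evaluate it
# in the caller's frame.
foo <- function(x, value) {
  nm  <- deparse(substitute(x))
  tmp <- bar(value)
  txt <- paste(nm, '<-', tmp, sep = '')
  eval(parse(text = txt), parent.frame())
}

# Hypothetical alternative (assumption, not from the thread): build the
# call `target <- value` without going through text.
foo2 <- function(x, value) {
  eval(call("<-", substitute(x), bar(value)), parent.frame())
}

a <- NA
foo(a, 4)
a

b <- list(NA, NA)
foo(b[[1]], 4)
b[[1]]

d <- list(NA, NA)
foo2(d[[2]], c(1, 2))
d[[2]]
```

The text-based variant depends on the value surviving the round trip through `paste()` and `parse()`, which is why the call-based variant may be the safer choice for anything but single numbers.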

Bests
Sören

 On 15.05.2015, at 00:02, William Dunlap wdun...@tibco.com wrote:
 
 You could use a 'replacement function' named 'bar<-', whose last argument
 is called 'value', and use bar(variable) <- newValue where you currently
 use foo(variable, newValue).
 
 bar <- function(x) {
   x + 3
 }
 `bar<-` <- function(x, value) {
   bar(value)
 }
 
 a <- NA
 bar(a) <- 4
 a
 # [1] 7
 b <- list(NA, NA)
 bar(b[[1]]) <- 4
 b
 #[[1]]
 #[1] 7
 #
 #[[2]]
 #[1] NA
 
 
 Bill Dunlap
 TIBCO Software
 wdunlap tibco.com
 
 On Thu, May 14, 2015 at 11:28 AM, soeren.vo...@posteo.ch wrote:
 Hello,
 
 When I use function `foo` with list elements (example 2), it defines a new 
 object named `b[[1]]`, which is not what I want.  How can I change the 
 function code to show the desired behaviour for all data structures passed to 
 the function?  Or is there a more appropriate way to sort of pass by 
 references in a function?
 
 Thanks
 Sören
 
 <src>
 bar <- function(x) {
   return( x + 3 )
 }
 
 foo <- function(x, value) {
   nm <- deparse(substitute(x))
   tmp <- bar( value )
   assign(nm, tmp, parent.frame())
 }
 
 # 1)
 a <- NA
 foo(a, 4)
 a # 7, fine :-)
 
 # 2)
 b <- list(NA, NA)
 foo(b[[1]], 4) # the first list item should be 7
 b # wanted 7 but still list with two NAs :-(
 </src>
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


[R] Deparse substitute assign with list elements

2015-05-14 Thread soeren . vogel
Hello,

When I use function `foo` with list elements (example 2), it defines a new 
object named `b[[1]]`, which is not what I want.  How can I change the function 
code to show the desired behaviour for all data structures passed to the 
function?  Or is there a more appropriate way to sort of pass by references 
in a function?

Thanks
Sören

<src>
bar <- function(x) {
  return( x + 3 )
}

foo <- function(x, value) {
  nm <- deparse(substitute(x))
  tmp <- bar( value )
  assign(nm, tmp, parent.frame())
}

# 1)
a <- NA
foo(a, 4)
a # 7, fine :-)

# 2)
b <- list(NA, NA)
foo(b[[1]], 4) # the first list item should be 7
b # wanted 7 but still list with two NAs :-(
</src>


[R] Get previous R digest volumes/issues

2011-09-10 Thread soeren . vogel
Hello, (how) can I download/re-retrieve/order previous R-** digest 
volumes/issues to my mailbox for local browsing? Thank you, Sören



Re: [R] Instances of a C++ class in R

2011-05-13 Thread soeren . vogel
On 28.04.2011, at 12:18, soeren.vo...@uzh.ch wrote:

 On 27.04.2011, at 11:59, soeren.vo...@uzh.ch wrote:
 
 We are working on a class in C++. The files compile fine (R CMD SHLIB ...) 
 and run in R. A bzipped tar archive with source code can be downloaded from 
 here:
 
 http://sovo.md-hh.com/files/GUTS3.tar.bz
 
 In R, dyn.load("GUTS.so") generates an instance of the GUTS class. (How) Is
 it possible to generate (within R) several instances of this class? (E.g.,
 we would like to consider several experiments, each of which is represented
 by an instance of GUTS.)
 
 Hm, no comment so far. Is the question unclear? Did we miss something
 obvious? Are there other resources to post the issue to? Regards, Carlo/Sören

For the record: the discussion continued here:

https://stat.ethz.ch/pipermail/r-devel/2011-May/060922.html

and is now here:

http://lists.r-forge.r-project.org/pipermail/rcpp-devel/2011-May/002261.html

Regards
Sören

-- 
soeren.vo...@uzh.ch, carlo.alb...@eawag.ch



Re: [R] Instances of a C++ class in R

2011-04-28 Thread soeren . vogel
On 27.04.2011, at 11:59, soeren.vo...@uzh.ch wrote:

 We are working on a class in C++. The files compile fine (R CMD SHLIB ...) 
 and run in R. A bzipped tar archive with source code can be downloaded from 
 here:
 
 http://sovo.md-hh.com/files/GUTS3.tar.bz
 
 In R, dyn.load("GUTS.so") generates an instance of the GUTS class. (How) Is
 it possible to generate (within R) several instances of this class? (E.g., we
 would like to consider several experiments, each of which is represented by
 an instance of GUTS.)

Hm, no comment so far. Is the question unclear? Did we miss something obvious?
Are there other resources to post the issue to? Regards, Carlo/Sören

-- 
soeren.vo...@uzh.ch, carlo.alb...@eawag.ch



[R] Instances of a C++ class in R

2011-04-27 Thread soeren . vogel
Hello

We are working on a class in C++. The files compile fine (R CMD SHLIB ...) and 
run in R. A bzipped tar archive with source code can be downloaded from here:

http://sovo.md-hh.com/files/GUTS3.tar.bz

In R, dyn.load("GUTS.so") generates an instance of the GUTS class. (How) Is it 
possible to generate (within R) several instances of this class? (E.g., we 
would like to consider several experiments, each of which is represented by an 
instance of GUTS.)

Thanks for any help and advice, Sören

-- 
soeren.vo...@uzh.ch, carlo.alb...@eawag.ch



Re: [R] Nagelkerke R square for Prediction data

2010-12-27 Thread Soeren . Vogel
Hello

I found a few short postings dated 22 Oct 2008 on this subject. Recently, I
have been working with binary logistic regressions. I did not use the Design
package, yet I needed the fit indices. Therefore, I wrote a small function
that returns Nagelkerke's R^2 and the Cox--Snell R^2 from a fitted model. I
am by no means a professional programmer, but I hope that the code is okay
and that it may be of some use to others -- or subject to useful improvement.
Please let me know if you find errors.

Regards,
Sören

Rcsnagel <- function(mod) {
  # for ungrouped binary data the deviance equals -2 * log-likelihood
  llnull <- mod$null.deviance
  llmod <- mod$deviance
  n <- length(mod$fitted.values)
  Rcs <- 1 - exp( (llmod - llnull) / n )  # Cox--Snell R^2
  Rnagel <- Rcs / (1 - exp(-llnull/n))    # Nagelkerke R^2
  out <- list('Rcs'=Rcs, 'Rnagel'=Rnagel)
  class(out) <- c("list", "table")
  return(out)
}

y <- sample(c(TRUE, FALSE), 50, replace=TRUE)
x <- sample(1:7, 50, replace=TRUE)
mod <- glm(y ~ x, family=binomial(logit))
Rcsnagel(mod)
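
As a cross-check, the same two indices can be computed from the log-likelihoods of the fitted and the intercept-only model. This is only a sketch under the assumption of an ungrouped binary (0/1) response, where the deviance equals -2 times the log-likelihood; the names `mod0`, `ll0` and `ll1` are made up for this illustration.

```r
set.seed(1)
y <- sample(c(TRUE, FALSE), 50, replace = TRUE)
x <- sample(1:7, 50, replace = TRUE)

mod  <- glm(y ~ x, family = binomial(logit))  # fitted model
mod0 <- glm(y ~ 1, family = binomial(logit))  # intercept-only model

n   <- length(y)
ll1 <- as.numeric(logLik(mod))
ll0 <- as.numeric(logLik(mod0))

Rcs    <- 1 - exp(2 * (ll0 - ll1) / n)   # Cox--Snell
Rnagel <- Rcs / (1 - exp(2 * ll0 / n))   # Nagelkerke

# The deviance-based version should agree
Rcs.dev <- 1 - exp((mod$deviance - mod$null.deviance) / n)
all.equal(Rcs, Rcs.dev)
```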


-- 
Sören Vogel, Dipl.-Psych. (Univ.), PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch



[R] Multinomial Analysis

2010-12-15 Thread soeren . vogel
I want to analyse data with an unordered, multi-level outcome variable, y. I am 
asking for the appropriate method (or R procedure) to use for this analysis.

> N <- 500
> set.seed(1234)
> data0 <- data.frame(y = as.factor(sample(LETTERS[1:3], N, repl = T,
+     prob = c(10, 12, 14))), x1 = sample(1:7, N, repl = T, prob = c(8,
+     8, 9, 15, 9, 9, 8)), x2 = sample(1:7, N, repl = T, prob = 7:1),
+     x3 = round(rnorm(N, 0, 1), 0) + 4)

In [1], the use of mlogit is described (same package name). mlogit produces 
a nice output, but for some reason it does not accept the . operator in my 
formula. In addition, it requires me to transform my data frame before running 
the analysis:

> library(mlogit)
> data1 <- mlogit.data(data0, varying = NULL, choice = "y", shape = "wide")
> summary(mod1 <- mlogit(y ~ 1 | ., data = data1, reflevel = "A"))

Error in terms.formula(object) : '.' in formula and no 'data' argument

> form2 <- paste("y", " ~ 1 | ", paste(names(data0)[-1], sep = "",
+     collapse = " + "), sep = "", collapse = "")
> summary(mod2 <- mlogit(as.formula(form2), data = data1, reflevel = "A"))

Call:
mlogit(formula = as.formula(form2), data = data1, reflevel = "A",
    method = "nr", print.level = 0)

Frequencies of alternatives:
A B C 
0.282 0.338 0.380 

nr method
3 iterations, 0h:0m:0s 
g'(-H)^-1g = 8.8E-07 
gradient close to zero 

Coefficients :
         Estimate Std. Error t-value Pr(>|t|)
altB     0.269645   0.569943  0.4731   0.6361
altC     0.642654   0.551250  1.1658   0.2437
altB:x1  0.018484   0.061061  0.3027   0.7621
altC:x1  0.075489   0.059837  1.2616   0.2071
altB:x2 -0.102349   0.068194 -1.5008   0.1334
altC:x2 -0.031934   0.065491 -0.4876   0.6258
altB:x3  0.034394   0.115189  0.2986   0.7653
altC:x3 -0.137739   0.112124 -1.2284   0.2193

Log-Likelihood: -542.05
McFadden R^2:  0.0065888 
Likelihood ratio test : chisq = 7.1903 (p.value=0.30361)

This is no big problem if the data frame holds only a few variables; however, it
can easily become confusing as the number of variables in the data frame
increases. So, I can use the function multinom (package nnet), as described
in [2] and [3]:

> library(nnet)
> summary(mod3 <- multinom(y ~ ., data = data0, na.action = na.omit,
+     Hess = T, model = T))

# weights:  15 (8 variable)
initial  value 549.306144 
iter  10 value 542.075434
final  value 542.046320 
converged
Call:
multinom(formula = y ~ ., data = data0, na.action = na.omit,
    Hess = T, model = T)

Coefficients:
  (Intercept)         x1          x2          x3
B   0.2696466 0.01848389 -0.10234898  0.03439345
C   0.6426552 0.07548893 -0.03193395 -0.13773888

Std. Errors:
  (Intercept)         x1         x2        x3
B   0.5699428 0.06106070 0.06819403 0.1151887
C   0.5512500 0.05983652 0.06549051 0.1121242

Residual Deviance: 1084.093 
AIC: 1100.093 

The estimates produced by this function are the same as in the analysis above
(mlogit), and in principle, I should be able to calculate the z and fit
statistics by hand. However, this seems more long-winded than using the mlogit
function.
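
The hand calculation mentioned above could look roughly like the following sketch. It assumes that `summary()` of a `multinom` fit exposes `$coefficients` and `$standard.errors` and that `logLik()` works on such fits; the intercept-only model `mod0` and the names `lr`, `df.lr` and `mcfadden` are introduced only for this illustration. The numbers will not necessarily match the output shown here, because `sample()` behaves differently across R versions.

```r
library(nnet)  # ships with R as a recommended package

set.seed(1234)
N <- 500
data0 <- data.frame(
  y  = factor(sample(LETTERS[1:3], N, replace = TRUE, prob = c(10, 12, 14))),
  x1 = sample(1:7, N, replace = TRUE, prob = c(8, 8, 9, 15, 9, 9, 8)),
  x2 = sample(1:7, N, replace = TRUE, prob = 7:1),
  x3 = round(rnorm(N, 0, 1), 0) + 4
)

mod3 <- multinom(y ~ ., data = data0, Hess = TRUE, trace = FALSE)
mod0 <- multinom(y ~ 1, data = data0, trace = FALSE)
s    <- summary(mod3)

z <- s$coefficients / s$standard.errors     # Wald z statistics
p <- 2 * pnorm(abs(z), lower.tail = FALSE)  # two-sided p-values

# Likelihood ratio test against the intercept-only model, McFadden R^2
lr       <- as.numeric(2 * (logLik(mod3) - logLik(mod0)))
df.lr    <- length(coef(mod3)) - length(coef(mod0))
p.lr     <- pchisq(lr, df = df.lr, lower.tail = FALSE)
mcfadden <- 1 - as.numeric(logLik(mod3)) / as.numeric(logLik(mod0))
```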

Alternatively, I may describe my results in the following way: Given that a 
person knows B and C (which is the case for all my cases), what drives some to 
use B, and what drives some to use C? Basically, I am wondering whether this 
analysis can be done using a set of two, parallel binary logistic regressions, 
one describing the step from A to B, and another one describing the step from A 
to C, while keeping predictors the same in both regressions.

More generally, I am not sure whether each multinomial logistic model could not 
just as well be analysed using consecutive binary logistic regressions. This 
has two advantages in my opinion, one, the model statistics and fit measures 
are well-described in binary models compared to multinomial ones (see [1] at 
the bottom), and two, they are much easier to communicate. But I am wondering, 
analogously to multiple mean comparisons, whether I run into any traps, and I 
have no idea what these traps look like.
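
One way to get a feeling for the difference is to fit both versions on the same data and put the coefficients side by side. The sketch below does that with simulated data of the kind shown above; the object names (`multi`, `binAB`, `binAC`, `cmp`) are made up. As far as I know, separate binary fits target the same parameters but each uses only part of the data, so estimates and standard errors usually differ somewhat from the joint fit, and there is no single overall likelihood to report.

```r
library(nnet)

set.seed(1234)
N <- 500
data0 <- data.frame(
  y  = factor(sample(LETTERS[1:3], N, replace = TRUE, prob = c(10, 12, 14))),
  x1 = sample(1:7, N, replace = TRUE),
  x2 = sample(1:7, N, replace = TRUE, prob = 7:1),
  x3 = round(rnorm(N), 0) + 4
)

# Joint multinomial fit, reference category A
multi <- multinom(y ~ ., data = data0, trace = FALSE)

# Separate binary logits against the reference category A
binAB <- glm(I(y == "B") ~ x1 + x2 + x3, family = binomial,
             data = subset(data0, y != "C"))
binAC <- glm(I(y == "C") ~ x1 + x2 + x3, family = binomial,
             data = subset(data0, y != "B"))

cmp <- rbind(multinom.B = coef(multi)["B", ], binary.AB = coef(binAB),
             multinom.C = coef(multi)["C", ], binary.AC = coef(binAC))
round(cmp, 3)
```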

I would kindly appreciate any advice.

Thanks, Sören

[1] http://www.ats.ucla.edu/stat/r/dae/mlogit.htm

[2] Venables, W. N. and Ripley, B. D. (2002). Modern applied statistics with S 
(4th ed.). New York: Springer.

[3] http://www.stat.washington.edu/quinn/classes/536/S/multinomexample.html



[R] Linear separation

2010-12-03 Thread soeren . vogel
In https://stat.ethz.ch/pipermail/r-help/2008-March/156868.html I found what
linear separability means. But what can I do if I find such a situation in my
data? Field (2005) suggests reducing the number of predictors or increasing the
number of cases. But I am not sure whether I can, as an alternative, take the
findings from my analysis and report them. And if so, how can I find the linear
combination of the predictors that separates the zeros from the ones?

Below is a small example to illustrate the situation.

set.seed(123)
df <- data.frame(
  'y'=c(rep(FALSE, 6), rep(TRUE, 14)),
  'x1'=c(sample(1:2, 6, repl=T), sample(3:5, 14, repl=T)),
  'x2'=c(sample(4:7, 6, repl=T), sample(1:3, 14, repl=T)),
  'x3'=round(rnorm(20, 4, 2), 0)
)
df[17:18, c(2, 3)] <- df[17:18, c(3, 2)]
glm(y ~ ., data=df[, -3], family=binomial(logit))
glm(y ~ ., data=df, family=binomial(logit))
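
A rough sketch of how one separating combination might be read off a fit, using a small made-up data set `d` rather than the example above. Under complete separation the maximum-likelihood estimates do not exist, so `glm()` stops with very large coefficients and a warning; the direction of the coefficient vector at that point is one separating combination, not the only one.

```r
# Made-up data: y is perfectly separated by x1 and x2 together
d <- data.frame(
  y  = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE),
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(6, 4, 5, 3, 1, 2)
)

# glm() warns about fitted probabilities of 0 or 1; silenced here on purpose
fit <- suppressWarnings(glm(y ~ x1 + x2, family = binomial, data = d))

eta <- predict(fit, type = "link")              # linear predictor
table(observed = d$y, eta.positive = eta > 0)   # off-diagonal cells are empty

# Direction of the separating combination (slopes scaled to unit length)
coef(fit)[-1] / sqrt(sum(coef(fit)[-1]^2))
```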

Thanks, Sören

-- 
Sören Vogel, Dipl.-Psych. (Univ.), PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch



[R] slicing list with matrices

2010-11-17 Thread soeren . vogel
A list contains several matrices. Over all matrices (list elements) I'd like to 
access one matrix cell:

m <- matrix(1:9, nrow=3, dimnames=list(LETTERS[1:3], letters[1:3]))
l <- list(m1=m, m2=m*2, m3=m*3)
l[[3]] # works
l[[3]][1:2, ] # works
l[[1:3]][1, 1] # does not work

How can I slice all C-c combinations in the list?
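
One possible approach, sketched with the objects from above: `[[` takes a single element, so the extraction has to be applied to each list element in turn, for example with `sapply()` or `lapply()`.

```r
m <- matrix(1:9, nrow = 3, dimnames = list(LETTERS[1:3], letters[1:3]))
l <- list(m1 = m, m2 = m * 2, m3 = m * 3)

# Cell C-c from every matrix, as a named vector
cc <- sapply(l, function(mat) mat["C", "c"])
cc

# The same, passing the indices straight to the extraction function `[`
cc2 <- sapply(l, `[`, "C", "c")

# A sub-block from every matrix, returned as a list of matrices
blocks <- lapply(l, function(mat) mat[1:2, , drop = FALSE])
```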

Sören

-- 
Sören Vogel, Dipl.-Psych. (Univ.), PhD-Student, Eawag, Dept. SIAM
+41 76 233 3637, http://www.eawag.ch, http://sozmod.eawag.ch



[R] S: appropriate significance tests

2010-10-20 Thread soeren . vogel
Hello

(1) How can I compare two Pearson correlation coefficients for significant 
differences without the use of the raw data?

(2) How can I compare two linear regression coefficients for significant 
differences without the use of the raw data?

(3) How can I compare two multiple correlation coefficients (as produced in a 
linear regression) for significant differences, again, if the raw data should 
not be used?

<src type="perhaps useful">
da <- data.frame(
  y1=sample(1:5, 40, repl=T),
  y2=sample(1:5, 40, repl=T),
  x1=sample(1:5, 40, repl=T),
  x2=sample(1:5, 40, repl=T)
)
cro1 <- cor(da$y1, da$x1)
cro2 <- cor(da$y2, da$x1)
lmo1 <- lm(y1 ~ x1 + x2, data=da)
lmo2 <- lm(y2 ~ x1 + x2, data=da)
</src>

Thanks, Sören



Re: [R] S: appropriate significance tests

2010-10-20 Thread soeren . vogel
Thanks Greg for the additional remarks. Basically I have two questions, let me 
try to specify them as follows:

(1) Height and intelligence may correlate at, say, X, while speed and finger
length may correlate at Y. Leaving aside whether such a statement makes sense,
is X significantly larger than Y? How can I perform this significance test (of
two correlation coefficients) if I don't have the original data, but only the
correlation coefficients, the degrees of freedom and the standard errors?

(2) Let's assume I've studied whether height (as independent variable) has an
influence on (a, regression 1) intelligence and (b, regression 2) speed. Suppose
the linear regression of intelligence on height (1) yields a standardised
estimate of .6 for height, while the regression of speed on height (2) yields
an estimate of .65 for height. Is the estimate for height in regression 1
significantly different from that for height in regression
2?
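
For the simplest version of both questions, where the two statistics come from independent samples, standard large-sample z tests need only the summary numbers. The sketch below uses Fisher's z transformation for two correlations and a difference-over-pooled-standard-error test for two coefficients. The function names and the example numbers are made up, and neither test is appropriate when both statistics come from the same sample, as in the example data of the original post.

```r
# Two correlations from independent samples (Fisher's z transformation)
compare_cor <- function(r1, n1, r2, n2) {
  z <- (atanh(r1) - atanh(r2)) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
  c(z = z, p.value = 2 * pnorm(abs(z), lower.tail = FALSE))
}

# Two regression coefficients from independent samples, given the estimates
# and their standard errors (large-sample z test)
compare_coef <- function(b1, se1, b2, se2) {
  z <- (b1 - b2) / sqrt(se1^2 + se2^2)
  c(z = z, p.value = 2 * pnorm(abs(z), lower.tail = FALSE))
}

compare_cor(r1 = 0.60, n1 = 40, r2 = 0.65, n2 = 40)          # made-up numbers
compare_coef(b1 = 0.60, se1 = 0.10, b2 = 0.65, se2 = 0.12)   # made-up numbers
```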


 I have asked a couple of persons how much they like music by Robbie Williams 
 or Nora Jones. Additionally I recorded their hair colour and their attitude 
 towards pop music (ATP). After my analysis, I'd like to state that for RW and 
 NJ neither hair colour nor ATP is of *different* importance, whereas the hair 
 colour is much more important for liking RW. I thus thought that I should 
 compare two regression estimates for significant differences since saying 
 that the one is lower/higher than the other doesn't satisfy my editor.
 
 On 20.10.2010, at 20:48, Greg Snow wrote:
 
 It is not completely clear what question you are trying to answer or what you 
 are trying to accomplish.  But here are some additional questions that may 
 help:
 
 What tests would you use if you could use the original data?
 What assumptions are you willing to make about the data and/or statistics?
 What are you trying to accomplish?
 
 -- 
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111
 
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of soeren.vo...@eawag.ch
 Sent: Wednesday, October 20, 2010 7:33 AM
 To: r-help@r-project.org
 Subject: [R] S: appropriate significance tests
 
 Hello
 
 (1) How can I compare two Pearson correlation coefficients for
 significant differences without the use of the raw data?
 
 (2) How can I compare two linear regression coefficients for
 significant differences without the use of the raw data?
 
 (3) How can I compare two multiple correlation coefficients (as
 produced in a linear regression) for significant differences, again, if
 the raw data should not be used?
 
 <src type="perhaps useful">
 da <- data.frame(
  y1=sample(1:5, 40, repl=T),
  y2=sample(1:5, 40, repl=T),
  x1=sample(1:5, 40, repl=T),
  x2=sample(1:5, 40, repl=T)
 )
 cro1 <- cor(da$y1, da$x1)
 cro2 <- cor(da$y2, da$x1)
 lmo1 <- lm(y1 ~ x1 + x2, data=da)
 lmo2 <- lm(y2 ~ x1 + x2, data=da)
 </src>
 
 Thanks, Sören
 
 
 
 



[R] Sweave line breaks

2010-07-30 Thread soeren . vogel
Hello, when I print x in Sweave, the lines do not wrap. However, I want them to 
wrap (perhaps at a specified width). How? Thanks, *S*

<<keep.source=TRUE>>=
x <- "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
print(x)
@
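
One possibility, sketched here outside of Sweave with a shortened text: `strwrap()` breaks a long string into pieces below a given width, and `writeLines()` prints them one per line. The width of 60 is an arbitrary choice for this example.

```r
x <- paste("Lorem ipsum dolor sit amet, consectetur adipisicing elit,",
           "sed do eiusmod tempor incididunt ut labore et dolore magna",
           "aliqua. Ut enim ad minim veniam, quis nostrud exercitation",
           "ullamco laboris nisi ut aliquip ex ea commodo consequat.")

wrapped <- strwrap(x, width = 60)  # character vector, one element per line
writeLines(wrapped)                # print without quotes or index prefixes
```

Inside the chunk, `print(x)` would then become `writeLines(strwrap(x, width = 60))`.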



[R] ROpenOffice (which requires Rcompression)

2010-07-23 Thread soeren . vogel
Hello, for my data preparation and administration (data, labels, etc.) I use 
OpenOffice.org ODS spreadsheet files with several sheets in one file. However, 
I find it inconvenient to export every single sheet to a csv file whenever I 
apply changes to the labelling or so, and I haven't found a plugin or an 
application which does this batch in OOorg. Is there new development on a 
direct import package for ODS files? I've googled around one afternoon, found 
ROpenOffice (http://www.omegahat.org/ROpenOffice/) which requires Rcompression 
(http://www.mail-archive.com/r-help@r-project.org/msg55480.html), which in turn 
is quite outdated. Is the whole idea dead or did I miss something important 
when googling?

*S*

(R version 2.11.1 (2010-05-31), x86_64-apple-darwin9.8.0)

-- 
Sören Vogel, Dipl.-Psych. (Univ.), PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch


[R] Default For list.len Argument (list output truncated)

2010-07-21 Thread Soeren . Vogel
Hello

How can I set a default for the 'list.len' argument in 'str'?

x <- as.data.frame(matrix(1:1000, ncol=1000))
str(x)
formals(str) <- alist(object=, ...=, list.len=1000)
args(str); formals(str)
str(x)

Does not display errors, but does not work either.
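
A likely explanation, as far as I understand S3 dispatch: `str` is a generic, and the modified copy still only calls `UseMethod("str")`, which dispatches with the arguments as supplied in the call. A default that exists only in the generic therefore never reaches `str.default()` or `str.data.frame()`. A small wrapper avoids the problem; the name `str1000` is made up for this sketch.

```r
x <- as.data.frame(matrix(1:1000, ncol = 1000))

# Hypothetical wrapper: forwards everything to utils::str() and sets list.len
str1000 <- function(object, ...) utils::str(object, list.len = 1000, ...)

out.default <- capture.output(str(x))      # truncated at the default list.len
out.full    <- capture.output(str1000(x))  # all 1000 columns
length(out.default)
length(out.full)
```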

Thanks for help

Sören



[R] S: Book for time series analysis with R

2010-07-09 Thread soeren . vogel
Hello

Could you recommend a printed/electronic book that teaches time series analysis 
(using R) for students? I am searching for something similar to the MASS book, 
with a level lower but close, for TSA. Easy examples would be fine to 
understand deeper statistical procedures. Functions, results, and their 
practical interpretation should be explained next to the R code.

Thanks for help and/or suggestions!

Sören

-- 
Sören Vogel, Dipl.-Psych. (Univ.), PhD-Student, Eawag, Dept. SIAM
+41 76 233 3637, http://www.eawag.ch, http://sozmod.eawag.ch



[R] Adding NAs to data.frame

2010-05-08 Thread soeren . vogel
Hello, after creating a data.frame I would like to add NAs as follows:

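n <- 743;
x <- runif(n, 1, 7);
Y <- runif(n, 1, 7);
Ag6 <- runif(n, 1, 7);
df <- data.frame(x, Y, Ag6);
# a list with positions (lapply always returns a list, whereas apply returns
# a matrix whenever all columns happen to get the same number of positions):
v <- lapply(df, function(x) sample(n, sample(1:ceiling(5*n/100), 1), replace=FALSE));
# a loop too much?
for (i in 1:length(df)){
  df[unlist(v[i]), i] <- NA;
}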
summary(df);

This works fine but it appears to me that there is a more elegant and simple 
way -- which one?
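
One candidate, sketched under the same assumptions as the code above (between 1 and 5% of the values per column set to NA at random positions): do the replacement column by column inside a single `lapply()` call and assign the result back with `df[] <-`, which keeps `df` a data frame. The helper name `max.na` is made up.

```r
set.seed(42)
n  <- 743
df <- data.frame(x = runif(n, 1, 7), Y = runif(n, 1, 7), Ag6 = runif(n, 1, 7))

max.na <- ceiling(5 * n / 100)  # at most 5% missing values per column

df[] <- lapply(df, function(col) {
  k <- sample(max.na, 1)        # how many values to blank out
  col[sample(n, k)] <- NA       # where to blank them out
  col
})
summary(df)
```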

Thanks
Sören



[R] apply fun to df returning a matrix

2010-04-30 Thread Soeren . Vogel
Hello, a data.frame, df, holds the numerics, x, y, and z. A function,  
fun, should return some arbitrary statistics about the arguments, e.g.  
the sum or anything else. What I want to do is to apply this function  
to every pair of variables in df, and the return should be a matrix as  
found with cov. How can I achieve that? Thanks, Sören


df <- data.frame(x=1:10, y=11:20, z=21:30);
fun <- function(x){
  return(sum(x));
}
# and now???
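
A sketch of one way to get a matrix laid out like the result of `cov()`, assuming that `fun` takes a data frame holding the two columns of a pair. The helper name `pairwise` is made up; it simply nests two `sapply()` calls over the column names.

```r
df  <- data.frame(x = 1:10, y = 11:20, z = 21:30)
fun <- function(x) sum(x)

# Apply f to every pair of columns; rows and columns are named like the data
pairwise <- function(data, f) {
  nms <- names(data)
  sapply(nms, function(j) sapply(nms, function(i) f(data[, c(i, j)])))
}

res <- pairwise(df, fun)
res
```

Note that the diagonal holds `fun` applied to a column paired with itself, which may or may not be meaningful for a given statistic.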



Re: [R] apply fun to df returning a matrix

2010-04-30 Thread Soeren . Vogel
Hi Mohamed, thanks for your answer. Anyway, the how to is exactly my  
problem, since ...


fun2 <- function(x){
  please_use_aggregate_and_apply_in_some_way_and_return_the_output_of_my_example_as_requested(fun(x));
}
fun2(df);

... unfortunately returns an error ;-). Could you please give a simple  
example? Thanks, Sören


On 30.04.2010, at 12:59, Mohamed Lajnef wrote:


Hi Soeren

Apply or aggregate functions

best regards
M
soeren.vo...@eawag.ch wrote:
Hello, a data.frame, df, holds the numerics, x, y, and z. A  
function, fun, should return some arbitrary statistics about the  
arguments, e.g. the sum or anything else. What I want to do is to  
apply this function to every pair of variables in df, and the  
return should be a matrix as found with cov. How can I achieve  
that? Thanks, Sören


df <- data.frame(x=1:10, y=11:20, z=21:30);
fun <- function(x){
 return(sum(x));
}
# and now???




Re: [R] apply fun to df returning a matrix

2010-04-30 Thread Soeren . Vogel
Thanks for the code, that was exactly what I was looking for. Regards,  
Sören


On 30.04.2010, at 14:04, David Winsemius wrote:



On Apr 30, 2010, at 6:59 AM, Mohamed Lajnef wrote:


Hi Soeren

Apply or aggregate functions



Probably needs combn as well. Could do it all with numeric indices,  
but this effort with character vectors seems acceptable:


fun <- function(x){ cnms <- colnames(x)
 return(apply(combn(cnms,2), 2, function(y) sum(x[,y])))
}
fun(df)
#[1] 210 310 410

I do have a question about returning a matrix though. Did you mean
that you wanted the pairs of sums rather than the sums of pairs? In
that case:


fun2 <- function(x){cnms <- colnames(x)
 return(apply(combn(cnms,2), c(1,2), function(y) sum(x[,y])))
}

fun2(df)
#     [,1] [,2] [,3]
#[1,]   55   55  155
#[2,]  155  255  255

--
David.


best regards
M
soeren.vo...@eawag.ch wrote:
Hello, a data.frame, df, holds the numerics, x, y, and z. A  
function, fun, should return some arbitrary statistics about the  
arguments, e.g. the sum or anything else. What I want to do is to  
apply this function to every pair of variables in df, and the  
return should be a matrix as found with cov. How can I achieve  
that? Thanks, Sören


df <- data.frame(x=1:10, y=11:20, z=21:30);
fun <- function(x){
return(sum(x));
}
# and now???



--


Mohamed Lajnef,IE INSERM U955 eq 15
Pôle de Psychiatrie
Hôpital CHENEVIER
40, rue Mesly
94010 CRETEIL Cedex FRANCE
mohamed.laj...@inserm.fr
tel : 01 49 81 31 31 (poste 18470)
Sec : 01 49 81 32 90
fax : 01 49 81 30 99


David Winsemius, MD
West Hartford, CT





--
Sören Vogel, Dipl.-Psych. (Univ.), PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch



[R] Return a variable name

2010-04-16 Thread soeren . vogel

Hello,

how can I return the name of a variable, say a$b, from a function?

fun <- function(x){
  return(substitute(x));
}
a <- data.frame(b=1:10);
fun(a$b)

... returns a$b, but this is of type "language", thus I can't use it as a  
character string, can I? How?
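
A minimal sketch of the usual idiom: wrapping `substitute()` in `deparse()` converts the unevaluated argument to a character string.

```r
fun <- function(x) {
  deparse(substitute(x))  # unevaluated argument, converted to character
}

a  <- data.frame(b = 1:10)
nm <- fun(a$b)
nm
is.character(nm)
```

For very long expressions `deparse()` can return more than one string, in which case `paste(deparse(substitute(x)), collapse = "")` may be preferable.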


Thanks for help,

Sören



[R] cbind, row names

2010-01-29 Thread soeren . vogel

Hello,

I read the help as well as the examples, but I can not figure out why  
the following code does not produce the *given* row names, x and y:


x <- 1:20
y <- 21:40
rbind(
  x=cbind(N=length(x), M=mean(x), SD=sd(x)),
  y=cbind(N=length(y), M=mean(y), SD=sd(y))
)

Could you please help?
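
A possible explanation, going by my reading of the `rbind` help page: argument names are used as row names only for vector arguments, while for matrix arguments the row names are taken from the matrices themselves, and the one-row matrices built by `cbind()` have none. Building each row with `c()` instead gives named vectors, so the names passed to `rbind()` are kept. The helper name `desc` is made up for this sketch.

```r
x <- 1:20
y <- 21:40

# One named vector of statistics per variable
desc <- function(v) c(N = length(v), M = mean(v), SD = sd(v))

res <- rbind(x = desc(x), y = desc(y))
res
```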

Thank you

Sören



[R] Exact test for count data

2009-11-28 Thread soeren . vogel

Hello!

Bortz, Lienert, & Boehnke (2008, pp. 140--142) suggest an exact
polynomial (i.e., multinomial) test for low-frequency tables. I used it
recently and therefore wrote the code attached. Maybe someone would use (and likely
modify) it or incorporate it into their package.
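
For small tables, the same idea can be sketched in base R without extra packages, using `dmultinom()` for the point probabilities. The function name `exact_multinom_p` is made up, and the enumeration via `expand.grid()` is a deliberately simple assumption that only scales to a handful of cells and small totals. The factor `1 + 1e-7` is a tolerance for floating-point ties.

```r
exact_multinom_p <- function(obs, prob) {
  prob <- prob / sum(prob)
  s <- sum(obs)
  # all count vectors with entries 0..s, then keep those that sum to s
  grid  <- as.matrix(expand.grid(rep(list(0:s), length(obs))))
  comps <- grid[rowSums(grid) == s, , drop = FALSE]
  p.all <- apply(comps, 1, dmultinom, prob = prob)
  p.obs <- dmultinom(obs, prob = prob)
  # sum over all outcomes that are no more likely than the observed one
  sum(p.all[p.all <= p.obs * (1 + 1e-7)])
}

# Data of the book example used in Trial 1 below
exact_multinom_p(obs  = c(0, 3, 0, 0, 0, 0),
                 prob = c(.28, .08, .04, .42, .12, .06))
```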


Sören

References:

Bortz, J., Lienert, G. A., & Boehnke, K. (2008). Verteilungsfreie
Methoden in der Biostatistik. Berlin: Springer.


Code:

library(forensim)
myfun <- function(cur, exp, p.obs, force=F){
  if(length(cur) != length(exp)){
    stop("wrong length of dimensions, try transpose the matrix\n")
  }
  # keep users from hot laptops
  # (uses cur instead of the global obs; both have the same sum and length)
  if(!force){
    if(Cmn(sum(cur), length(cur)) > (2 * (10^6))){
      stop("Use option \"force=T\" to compute more than 2 million combinations.\n")
    }
  }
  # multinomial point probability of the current combination
  p.cur <- factorial(sum(cur)) / prod(sapply(cur, function(x) factorial(x))) * prod(exp ^ cur)
  # keep it only if it is no more likely than the observed combination
  if(p.cur <= p.obs){
    return(p.cur)
  }
}

# Trial 1, Book example
d.sta <- Sys.time()
exp <- c(.28, .08, .04, .42, .12, .06)
exp <- exp/sum(exp)
obs <- c(0, 3, 0, 0, 0, 0)
Cmn(sum(obs), length(obs)) # 56
p0 <- factorial(sum(obs)) / prod(sapply(obs, function(x) factorial(x))) * prod(exp ^ obs)
mat <- comb(sum(obs), length(obs))
sum(unlist(apply(mat, 1, function(x) myfun(x, exp, p0))))
d.sto <- Sys.time()
difftime(d.sto, d.sta) # Time difference of 0.1103680 secs

# Trial 2
d.sta <- Sys.time()
exp <- c(3, 5, 7, 4, 5, 4, 2, 9, 1)
exp <- exp/sum(exp)
obs <- c(3, 0, 2, 0, 1, 1, 0, 5, 3) # now with a longer vector
Cmn(sum(obs), length(obs)) # 490314
p0 <- factorial(sum(obs)) / prod(sapply(obs, function(x) factorial(x))) * prod(exp ^ obs)
mat <- comb(sum(obs), length(obs))
sum(unlist(apply(mat, 1, function(x) myfun(x, exp, p0))))
d.sto <- Sys.time()
difftime(d.sto, d.sta) # Time difference of 1.050846 mins

# Trial 3
d.sta <- Sys.time()
exp <- c(3, 5, 7, 4, 5, 4, 2, 9, 1)
exp <- exp/sum(exp)
obs <- c(3, 2, 2, 0, 1, 1, 0, 5, 3) # changed 0 to 2 on position 2
Cmn(sum(obs), length(obs)) # 1081575
p0 <- factorial(sum(obs)) / prod(sapply(obs, function(x) factorial(x))) * prod(exp ^ obs)
mat <- comb(sum(obs), length(obs))
sum(unlist(apply(mat, 1, function(x) myfun(x, exp, p0))))
d.sto <- Sys.time()
difftime(d.sto, d.sta) # Time difference of 2.425888 mins

# Trial 4
d.sta <- Sys.time()
exp <- c(3, 5, 7, 4, 5, 4, 2, 9, 1)
exp <- exp/sum(exp)
obs <- c(3, 2, 2, 0, 2, 1, 0, 5, 3) # changed 1 to 2 on position 5
Cmn(sum(obs), length(obs)) # 1562275
p0 <- factorial(sum(obs)) / prod(sapply(obs, function(x) factorial(x))) * prod(exp ^ obs)
mat <- comb(sum(obs), length(obs))
sum(unlist(apply(mat, 1, function(x) myfun(x, exp, p0))))
d.sto <- Sys.time()
difftime(d.sto, d.sta) # Time difference of 3.658092 mins

# Trial 5
d.sta <- Sys.time()
exp <- c(3, 5, 7, 4, 5, 4, 2, 9, 1)
exp <- exp/sum(exp)
obs <- c(3, 3, 2, 0, 2, 1, 0, 5, 3) # changed 2 to 3 on position 2
Cmn(sum(obs), length(obs)) # 2220075
p0 <- factorial(sum(obs)) / prod(sapply(obs, function(x) factorial(x))) * prod(exp ^ obs)
mat <- comb(sum(obs), length(obs))
# more than 2 million combinations, so the check in myfun needs force=T
sum(unlist(apply(mat, 1, function(x) myfun(x, exp, p0, force=T))))
d.sto <- Sys.time()
difftime(d.sto, d.sta) # Time difference of 3.658092 mins

--
Sören Vogel, Dipl.-Psych. (Univ.), PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch



[R] searching code for combination of vector

2009-11-25 Thread soeren . vogel
For a given numeric vector v of length n and sum s, is there ready-to-run
code that returns every vector of n non-negative integers summing to s?
Example for n=3 and s=2:


v <- c(2, 0, 0)
# find some code here that returns
[1] 2 0 0
[2] 1 1 0
[3] 1 0 1
[4] 0 2 0
[5] 0 1 1
[6] 0 0 2

Thanks

Sören

--
Sören Vogel, Dipl.-Psych. (Univ.), PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch
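
One possible approach, given only as a minimal sketch: build the vectors recursively by fixing the first entry and enumerating the rest. The helper name `compositions` is made up for this illustration and is not a base R function. It assumes s is a non-negative whole number and n >= 1.

```r
# Enumerate all vectors of n non-negative integers that sum to s.
# `compositions` is a hypothetical helper name (an assumption of this sketch).
compositions <- function(s, n) {
  if (n == 1) return(matrix(s, nrow = 1))   # only one way to fill one slot
  rows <- lapply(s:0, function(first) {
    rest <- compositions(s - first, n - 1)   # fill the remaining n - 1 slots
    cbind(first, rest, deparse.level = 0)
  })
  do.call(rbind, rows)
}

v <- c(2, 0, 0)
res <- compositions(sum(v), length(v))
res
```

The number of rows is choose(s + n - 1, n - 1), so the result grows very quickly with s and n; for vectors as long as those in the timing trials, building the full matrix may itself be the bottleneck.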



[R] Define return values of a function

2009-11-22 Thread Soeren . Vogel

I have created a function to do something:

i <- factor(sample(c("A", "B", "C", NA), 793, replace=TRUE,
  prob=c(8, 7, 5, 1)))
k <- factor(sample(c("X", "Y", "Z", NA), 793, replace=TRUE,
  prob=c(12, 7, 9, 1)))

mytable <- function(x){
  xtb <- x
  btx <- x
  # do more with x, not relevant here
  cat("The table has been created, see here:\n")
  print(xtb)
  list(table=xtb, elbat=btx)
}
tbl <- table(i, k)
mytable(tbl) # (1)
z <- mytable(tbl) # (2)
str(z) # (3)

(1) Wanted: prints the string and the table properly. *Unwanted*: also
prints the list elements.


(2) and (3) Wanted: prints the string properly and assigns the list
properly.


How can I get rid of the *unwanted* part? That is, how do I define what
the function prints and, on the other hand, what it returns without
printing?


Thanks

Sören

--
Sören Vogel, Dipl.-Psych. (Univ.), PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch
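
A minimal sketch of one way to separate printing from returning: wrap the return value in invisible(), so it is not auto-printed at top level but can still be assigned. The small data set below is made up for illustration; only the invisible() call differs from the function above.

```r
# Same function as in the post, but the list is returned invisibly.
mytable <- function(x) {
  xtb <- x
  btx <- x
  cat("The table has been created, see here:\n")
  print(xtb)
  invisible(list(table = xtb, elbat = btx))
}

# Illustrative data (an assumption, smaller than the original example)
set.seed(1)
i <- factor(sample(c("A", "B", "C"), 50, replace = TRUE))
k <- factor(sample(c("X", "Y", "Z"), 50, replace = TRUE))
tbl <- table(i, k)

mytable(tbl)        # prints the message and the table, not the list
z <- mytable(tbl)   # prints the same, and z holds the list
str(z)
```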



Re: [R] shrink list by mathed entries

2009-11-14 Thread Soeren . Vogel

On 14.11.2009, at 03:58, David Winsemius wrote:


On Nov 13, 2009, at 11:19 AM, soeren.vo...@eawag.ch wrote:


a <- c("Mama", "Papa", "Papa; Mama", "", "Sammy; Mama; Papa")
a <- strsplit(a, "; ")
mama <- rep(FALSE, length(a))
mama[sapply(a, function(x) { sum(x=="Mama") }, simplify=TRUE) > 0] <- TRUE


[...]

... produces the variables mama and papa correctly. But how do  
I remove all Mama list entries


[...]


Maybe you should explain what you were trying to do?  Perhaps:

 a[!mama]


[...]

I would sidestep that confusing sequence of logical assignments and  
just do this:


 a[ -grep("Mama", a) ]


[...]

Explanation of what I want to do: This code is PHP, maybe rather crude  
but it works the way I want it and explains my goal:


#!/usr/bin/php
<?php
error_reporting(E_ALL);
ini_set('display_errors', true);
ini_set('html_errors', false);
$strings = array("Mama", "Papa", "Papa; Mama", "", "Sammy; Mama; Papa",
  "Josh", "Mama");

$vars = array("Mama", "Papa", "Sammy");
$i=0;
foreach($strings as $line){
  $line = explode("; ", $line);
  $matches = array_intersect($line, $vars);
  $diffs = array_diff($line, $vars);
  foreach($matches as $match){
    eval("\$$match"."[".$i."] = 1;"); // no easier way
  }
  foreach($diffs as $diff){
    $others[$i] = $diff;
  }
  $i++;
}
print_r($Mama); // array with elements 0, 2, 4, and 6 set to 1
print_r($Papa); // array with elements 1, 2, and 4, set to 1
print_r($Sammy); // array with element 4 set to 1
print_r($others); // array with elements 3 set to "", and 5 set to "Josh"

?>

Sören
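
For comparison, a rough R counterpart of the PHP logic, offered as a sketch rather than a definitive translation. The object names `flags` and `others` are made up here. Two differences from the PHP version are worth noting: R indices start at 1, and strsplit() turns the empty string into character(0) instead of "".

```r
strings <- c("Mama", "Papa", "Papa; Mama", "", "Sammy; Mama; Papa",
             "Josh", "Mama")
vars <- c("Mama", "Papa", "Sammy")

parts <- strsplit(strings, "; ", fixed = TRUE)

# One logical column per name: TRUE where the name occurs in that entry
flags <- sapply(vars, function(v) sapply(parts, function(p) v %in% p))

# Whatever is left in each entry after removing the known names
others <- lapply(parts, setdiff, y = vars)

which(flags[, "Mama"])   # positions containing "Mama" (1-based)
others[[6]]              # "Josh"
```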



[R] Craddock-Flood Test in R?

2009-11-13 Thread soeren . vogel

Hello

The Craddock-Flood Test is recommended for large tables with small  
degrees of freedom and low-frequency cells. Is there an R procedure  
and/or package which does the test?


Thank you for your help!

Sören Vogel

--
Sören Vogel, Dipl.-Psych. (Univ.), PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch



[R] shrink list by mathed entries

2009-11-13 Thread Soeren . Vogel

Hello

a <- c("Mama", "Papa", "Papa; Mama", "", "Sammy; Mama; Papa")
a <- strsplit(a, "; ")
mama <- rep(FALSE, length(a))
mama[sapply(a, function(x) { sum(x=="Mama") }, simplify=TRUE) > 0] <- TRUE
papa <- rep(FALSE, length(a))
papa[sapply(a, function(x) { sum(x=="Papa") }, simplify=TRUE) > 0] <- TRUE
# ... more variables

... produces the variables mama and papa correctly. But how do I  
remove all Mama list entries in a in the same run, that is, shrink  
the list by what was already matched?


Thank you for your help!

Sören Vogel


--
Sören Vogel, Dipl.-Psych. (Univ.), PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch



[R] when use which()

2009-11-13 Thread soeren . vogel

Hello:

# some code to assign with and without which
q <- 1:20; q[c(9, 12, 14)] <- NA
r <- 1:20; r[c(8:9, 12:15)] <- NA
s <- 1:20; s[c(8:9, 12:15)] <- NA
r[q < 16] <- 0
s[which(q < 16)] <- 0
r;s # both: 0  0  0  0  0  0  0  0 NA  0  0 NA  0 NA  0 16 17 18 19 20
r <- 1:20; r[c(8:9, 12:15)] <- NA
s <- 1:20; s[c(8:9, 12:15)] <- NA
r[is.na(q)] <- 30
s[which(is.na(q))] <- 30
r;s # both: 1  2  3  4  5  6  7 NA 30 10 11 30 NA 30 NA 16 17 18 19 20

So it appears to me that a[b] <- c delivers the same results as
a[which(b)] <- c. Is there any situation where the assignment
with/without which() indeed makes a difference?


Thanks for help

Regards

Sören
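
A few cases where the two forms do differ, as a small sketch with made-up vectors. The differences all come from how NA and empty index vectors are treated.

```r
x <- c(5, NA, 12, 20)
cond <- x > 10             # FALSE NA TRUE TRUE

# Extraction: a logical index keeps a slot for NA, which() drops it.
x[cond]                    # NA 12 20
x[which(cond)]             # 12 20

# Assignment of more than one value: NA in a logical index is an error,
# while which() has already removed the NA position.
vals <- c(100, 200)
y1 <- x
res <- tryCatch({ y1[cond] <- vals; "ok" }, error = function(e) "error")
res                        # "error"
y2 <- x
y2[which(cond)] <- vals    # works: 5 NA 100 200

# Negative indexing when nothing matches: which() returns integer(0),
# and z[-integer(0)] selects nothing instead of everything.
z <- 1:5
z[-which(z > 10)]          # integer(0)
z[!(z > 10)]               # 1 2 3 4 5
```

With a single replacement value, as in the examples above, NA positions in a logical index are simply skipped, which is why both forms gave the same result there.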



[R] Formatted contingency tables with (%)

2009-11-10 Thread soeren . vogel

Quite often, I need those tables:

x <- sample(c("a", "b", "c"), 40, replace=TRUE)
y <- sample(c("X", "Y"), 40, replace=TRUE)
(tbl <- table(x, y))
(z <- as.factor(paste(as.vector(tbl), " (",
  round(prop.table(as.vector(tbl)) * 100, 1), "%)", sep="")))

matrix(as.factor(z), nrow=3, dimnames=dimnames(tbl))

But the result looks ugly and is not copy-and-paste-able into a LaTeX
verbatim or table environment; moreover, the quote character (") is not
what I want in the printout. How to achieve:


   y
x   X            Y
a    3  (7.5%)    7 (17.5%)
b    9 (22.5%)    5 (12.5%)
c    6 (15.0%)   10 (25.0%)

Thank you for help or hints.

Sören
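
One way to get such a printout, as a sketch: format each cell with sprintf() so the widths are fixed, put the strings back into a matrix with the table's dimnames, and print it with noquote(). The seed and the object names `cells` and `out` are assumptions of this example.

```r
set.seed(1)
x <- factor(sample(c("a", "b", "c"), 40, replace = TRUE),
            levels = c("a", "b", "c"))
y <- factor(sample(c("X", "Y"), 40, replace = TRUE), levels = c("X", "Y"))
tbl <- table(x, y)

# Percentages of the grand total, one per cell
pct <- prop.table(tbl) * 100

# Fixed-width cells such as " 3 ( 7.5%)" and "10 (25.0%)"
cells <- sprintf("%2d (%4.1f%%)", as.vector(tbl), as.vector(pct))
out <- matrix(cells, nrow = nrow(tbl), dimnames = dimnames(tbl))

noquote(out)   # prints without the surrounding double quotes
```

For LaTeX output, write.table(out, quote = FALSE, sep = " & ") is one possible starting point.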



[R] pca vs. pfa: dimension reduction

2009-03-25 Thread soeren . vogel

Can't make sense of calculated results and hope I'll find help here.

I've collected answers from about 600 persons concerning three
variables. I hypothesise those three variables to be components (or
indicators) of one latent factor. In order to reduce data (vars), I
had the following idea: Calculate the factor underlying these three
vars. Use the loadings and the original var values to construct a new
(artificial) var: (B1 * X1) + (B2 * X2) + (B3 * X3) = ArtVar (brackets
for readability). Use ArtVar for further analysis of the data, that
is, as predictor etc.


In my (I realise, elementary) psychological statistics readings I was
taught to use pca for these problems. Referring to Venables & Ripley
(2002, chapter 11), I applied princomp() to my vars. But the outcome
shows 4 components -- which is obviously not what I want. Reading
further I found factanal(), which produces loadings on the one
specified factor just fine. But since this contradicts the
theoretical introductions in so many texts, I'm completely confused
whether I'm right with these calculations.


(1) Is there an easy example which explains the differences between
pca and pfa? (2) Which R procedure should I use to get what I want?


Thank you for your help

Sören


Refs.:

Venables, W. N., and Ripley, B. D. (2002). Modern applied statistics  
with S (4th edition). New York: Springer.
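
To make the comparison concrete, here is a small simulated sketch; the data, loadings and noise levels are invented for illustration and say nothing about the real survey. princomp() returns as many components as there are variables, and the first one can serve as the composite, whereas factanal() with factors = 1 fits a single latent factor directly.

```r
set.seed(42)
n <- 600
latent <- rnorm(n)                       # simulated common factor
X <- data.frame(
  x1 = 0.8 * latent + rnorm(n, sd = 0.6),
  x2 = 0.7 * latent + rnorm(n, sd = 0.7),
  x3 = 0.6 * latent + rnorm(n, sd = 0.8)
)

# PCA: three components for three variables; keep only the first
pca <- princomp(X, cor = TRUE)
pc1 <- pca$scores[, 1]
pca$loadings[, 1]                        # weights of the first component

# Factor analysis: one factor, regression-type factor scores
fa <- factanal(X, factors = 1, scores = "regression")
f1 <- fa$scores[, 1]
fa$loadings

# The two composites are usually very similar for data like these
abs(cor(pc1, f1))
```

Either score vector could then play the role of ArtVar in later analyses; which one is appropriate depends on whether the three variables are meant as a summary (PCA) or as indicators of a latent variable (factor analysis).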




[R] sunflowerplot error

2009-03-19 Thread soeren . vogel

A sunflowerplot crossing two categorical variables with NAs fails:

### sample: start ###
set.seed(20)
a <- c(letters[1:4])
z <- c(letters[23:26])
fa <- factor(sample(rep.int(a, 1000), 100, replace=T), levels=a,
  ordered=T)
fz <- factor(sample(rep.int(z, 1000), 100, replace=T), levels=z,
  ordered=T)

sunflowerplot(fa, fz)
# okay,  but:
r <- xyTable(fa, fz)
length(r$x)==length(r$y)
length(r$x)==length(r$number)
# TRUE, TRUE
is.na(fa) <- sort(sample(1:100, 3))
sunflowerplot(fa, fz)
# Error in rep.int(i.multi, number[number > 1]) : invalid 'times' value
s <- xyTable(fa, fz)
length(s$x)==length(s$y)
length(s$x)==length(s$number)
# TRUE, TRUE
### sample: end ###

Seems to fail due to NAs, but (1) why and (2) how to get by?

Thanks, *Sören*
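
A possible workaround (a sketch, not an explanation of the internals): drop the pairs with a missing value before plotting and pass the integer codes of the factors, then label the axes by hand. The variable `ok` and the axis handling are assumptions of this example.

```r
set.seed(20)
a <- letters[1:4]
z <- letters[23:26]
fa <- factor(sample(a, 100, replace = TRUE), levels = a, ordered = TRUE)
fz <- factor(sample(z, 100, replace = TRUE), levels = z, ordered = TRUE)
is.na(fa) <- sort(sample(1:100, 3))

# Keep only pairs where both values are present
ok <- !is.na(fa) & !is.na(fz)
xa <- as.integer(fa[ok])
xz <- as.integer(fz[ok])

# Multiplicities of the remaining pairs contain no NA
tab <- xyTable(xa, xz, digits = 6)

sunflowerplot(xa, xz, xaxt = "n", yaxt = "n", xlab = "fa", ylab = "fz")
axis(1, at = seq_along(a), labels = a)
axis(2, at = seq_along(z), labels = z)
```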



[R] chisq.test: decreasing p-value

2009-03-11 Thread soeren . vogel
A Likert scale may have produced counts of answers per category.
According to theory I may expect equality over the categories. A
statistical test shall reveal the actual equality in my sample.


When applying a chi square test with an increasing number of replicates
(simulate.p.value) over a fixed sample, the p-value decreases
dramatically (it looks as if it converges to zero).


(1) Why?
(2) (If this test is wrong), then which test can check what I want to
check, that is: are the two distributions of frequencies (observed and
expected) in principle the same?

(3) By the way, how to deal with low frequency cells?

r <- c(10, 100, 500, 1000, 2000, 5000)
v <- c(35, 40, 45, 45, 40, 35)
sapply(list(r), function (x) { chisq.test(v, p=c(rep.int(40, 6)),
  rescale.p=T, simulate.p.value=T, B=x)$p.value })


Thank you, Sören


--
Sören Vogel, PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch
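
One thing to check in the call above: sapply(list(r), ...) iterates over a list with a single element, so the function runs once with the whole vector r as B instead of once per value. Whether this is the coding error later referred to in the thread is an assumption. A sketch that loops over the individual values:

```r
set.seed(1)
r <- c(10, 100, 500, 1000, 2000, 5000)
v <- c(35, 40, 45, 45, 40, 35)

# One simulated p-value per number of replicates B
pvals <- sapply(r, function(B) {
  chisq.test(v, p = rep.int(40, 6), rescale.p = TRUE,
             simulate.p.value = TRUE, B = B)$p.value
})
names(pvals) <- r
pvals

# Asymptotic p-value for comparison
chisq.test(v)$p.value
```

With more replicates the simulated p-value should settle near a fixed value rather than drift towards zero.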



Re: [R] chisq.test: decreasing p-value

2009-03-11 Thread soeren . vogel
Thanks to Peter, David, and Michael! After having corrected the coding
error, the p-values converge to a particular value, not necessarily
zero. The whole story is, 634 respondents in 6 different areas marked
their answer on a 7-step Likert scale ("very bad", "bad", ..., "very good"
-- later recoded to 5 scale levels). The statistical question now is:
do the answer distributions (number of "goods", "bads", etc.) in each
area differ from the mean answer distribution calculated by
summing up all "goods", "bads", etc.? Anyway, an omnibus chi square would
not answer my question, and due to spurious significances I'd rather
go back to my chi square book ;-) (for the interested, see
http://sozmod.eawag.ch/files/file.Robj for the entire table).


Thanks for your help

Sören



[R] Summary of data.frame according to colnames and grouping factor

2009-03-08 Thread soeren . vogel
A data frame holds 3 vars, each checked true or false (1, 0). Another
var holds the grouping, "r" and "s":


### start:example
set.seed(20)
d <- data.frame(sample(c(0, 1), 20, replace=T), sample(c(0, 1), 20,
  replace=T), sample(c(0, 1), 20, replace=T))

names(d) <- c("A", "B", "C")
e <- rep(c("r", "s"), 10)
### end:example

How do I get the count of 1's (or any other function) applied over
each var according to the grouping? That is:


Desired output table:

   A      B      C
r  count  count  count
s  ...    ...    ...

or likewise transposed. I'd like to use the table for textual display  
and/or barplot creation.


Thx, Sören

--
Sören Vogel, PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch
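
Two base R options, sketched on the example data: rowsum() adds up each column within groups (which counts the 1's for 0/1 data), and aggregate() accepts any summary function. The object names `counts` and `agg` are made up here.

```r
set.seed(20)
d <- data.frame(A = sample(c(0, 1), 20, replace = TRUE),
                B = sample(c(0, 1), 20, replace = TRUE),
                C = sample(c(0, 1), 20, replace = TRUE))
e <- rep(c("r", "s"), 10)

# Group sums: one row per group, one column per variable
counts <- rowsum(d, group = e)
counts

# Same idea with an arbitrary function per group and variable
agg <- aggregate(d, by = list(group = e),
                 FUN = function(col) sum(col == 1))
agg

# The matrix form can go straight into a grouped bar plot
barplot(as.matrix(counts), beside = TRUE, legend.text = rownames(counts))
```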



[R] Recode factor into binary factor-level vars

2009-03-07 Thread soeren . vogel
How do I recode a factor into a binary data frame according to the
factor levels:


### example:start
set.seed(20)
l <- sample(rep.int(c("locA", "locB", "locC", "locD"), 100), 10,
  replace=T)
# [1] "locD" "locD" "locD" "locD" "locB" "locA" "locA" "locA" "locD"
# "locA"

### example:end

What I want in the end is the following:

m$locA: 0, 0, 0, 0, 0, 1, 1, 1, 0, 1
m$locB: 0, 0, 0, 0, 1, 0, 0, 0, 0, 0
m$locC: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
m$locD: 1, 1, 1, 1, 0, 0, 0, 0, 1, 0

Instead of 0, NA's would also be fine.

Thanks, Sören

--
Sören Vogel, PhD-Student, Eawag, Dept. SIAM
http://www.eawag.ch, http://sozmod.eawag.ch
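
A sketch of two ways to build such indicator columns, using the ten values shown above typed in directly (so the result does not depend on the random number generator). The levels are set explicitly so that "locC" gets a column even though it does not occur.

```r
l <- c("locD", "locD", "locD", "locD", "locB",
       "locA", "locA", "locA", "locD", "locA")
f <- factor(l, levels = c("locA", "locB", "locC", "locD"))

# One 0/1 column per level, named after the level
m <- as.data.frame(sapply(levels(f), function(lev) as.integer(f == lev)))
m$locA

# Alternative: the design matrix without intercept
# (columns are named "flocA", "flocB", ...)
mm <- model.matrix(~ f - 1)
```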



[R] Summary grouped by factor

2009-03-06 Thread soeren . vogel

### example:start
v <- sample(rnorm(200), 100, replace=T)
k <- rep.int(c("locA", "locB", "locC", "locD"), 25)
tapply(v, k, summary)
### example:end

... (hopefully) produces 4 summaries of v according to k group
membership. How can I transform the output into a nice table with the
groups as columns and the interesting statistics as rows?


Thx, Sören



Re: [R] Summary grouped by factor

2009-03-06 Thread soeren . vogel

On 06.03.2009, at 16:48, soeren.vo...@eawag.ch wrote:


### example:start
v <- sample(rnorm(200), 100, replace=T)
k <- rep.int(c("locA", "locB", "locC", "locD"), 25)
tapply(v, k, summary)
### example:end

... (hopefully) produces 4 summaries of v according to k group
membership. How can I transform the output into a nice table with
the groups as columns and the interesting statistics as rows?


### Right??? and good??? solution:

sapply(by(v, list(area=k), function(x)x, simplify=F), summary)

Sören (again)
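
A slightly shorter variant of the same idea, as a sketch: split() the values by group and let sapply() bind the summaries into a matrix. The seed is an assumption added for reproducibility.

```r
set.seed(1)
v <- sample(rnorm(200), 100, replace = TRUE)
k <- rep.int(c("locA", "locB", "locC", "locD"), 25)

# Groups as columns, statistics (Min., 1st Qu., ...) as rows
tab <- sapply(split(v, k), summary)
round(tab, 2)
```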



[R] Grouped Boxplot

2009-03-04 Thread soeren . vogel

Please forgive my heavy-handed data generation -- newbie ;-)

### start ###
# example data
g <- rep.int(c("A", "B", "C", "D"), 125)
t <- rnorm(5000)
a <- sample(t, 500, replace=TRUE)
b <- sample(t, 500, replace=TRUE)

# what I actually want to have:
boxplot(a | b ~ g)

# but that does obviously not produce what I want, instead
i <- data.frame(g, a, rep("one", length(g)))
j <- data.frame(g, b, rep("two", length(g)))
names(i) <- c("Group", "Number", "Word")
names(j) <- c("Group", "Number", "Word")
k <- as.data.frame(rbind(i, j))
boxplot(k$Number ~ k$Word * k$Group)
### end ###

Questions: (1) Is there an easier way? (2) How can I get additional  
space between the A:D groups?


Thank you

Sören
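
A sketch addressing both questions: build the long data frame in one step, and use the `at` argument of boxplot() to leave an empty slot between the groups. The names `long`, `at` and `n_words` are made up for this example.

```r
set.seed(1)
g <- rep.int(c("A", "B", "C", "D"), 125)
a <- rnorm(500)
b <- rnorm(500)

# Long format: one row per observation, with a label for the source vector
long <- data.frame(Group  = rep(g, 2),
                   Number = c(a, b),
                   Word   = rep(c("one", "two"), each = length(g)))

n_groups <- 4
n_words  <- 2
# Box positions 1,2  4,5  7,8  10,11: one slot is skipped after each group
at <- as.vector(outer(seq_len(n_words),
                      (seq_len(n_groups) - 1) * (n_words + 1), "+"))

boxplot(Number ~ Word * Group, data = long, at = at, xaxt = "n",
        col = c("grey80", "white"), xlab = "Group")
axis(1, at = colMeans(matrix(at, nrow = n_words)),
     labels = c("A", "B", "C", "D"))
legend("topright", legend = c("one", "two"), fill = c("grey80", "white"))
```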



[R] add absolute value to bars in barplot

2009-02-27 Thread soeren . vogel

Hello,

barplot(twcons.area,
  beside=T, col=c("green4", "blue", "red3", "gray"),
  xlab="estate",
  ylab="number of persons", ylim=c(0, 110),
  legend.text=c("treated", "mix", "untreated", "NA"))

produces a barplot very fine. In addition, I'd like to get the bars'  
absolute values on the top of the bars. How can I produce this in an  
easy way?


Thanks

Sören
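
One common way to do this, as a sketch: barplot() returns the x positions of the bar midpoints, which can be passed to text() together with the bar heights. The matrix below is invented stand-in data, since the original twcons.area is not shown.

```r
# Hypothetical stand-in for twcons.area (values are made up)
twcons.area <- matrix(c(40, 25, 10, 5,
                        60, 30, 15, 8,
                        35, 45, 20, 3),
                      nrow = 4,
                      dimnames = list(c("treated", "mix", "untreated", "NA"),
                                      c("area1", "area2", "area3")))

# barplot() returns the bar midpoints, here as a 4 x 3 matrix
bp <- barplot(twcons.area, beside = TRUE,
              col = c("green4", "blue", "red3", "gray"),
              xlab = "estate", ylab = "number of persons",
              ylim = c(0, 110),
              legend.text = rownames(twcons.area))

# pos = 3 places each label just above its bar
text(x = as.vector(bp), y = as.vector(twcons.area),
     labels = as.vector(twcons.area), pos = 3, cex = 0.8)
```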



[R] cross tabulation: convert frequencies to percentages

2009-02-27 Thread soeren . vogel

Hello,

might be rather easy for R pros, but I've been searching to a dead
end ...


twsource.area <- table(twsource, area, useNA="ifany")

gives me a nice cross tabulation of frequencies of two factors, but
now I want to convert those absolute values to percentages. In
addition I'd like an extra column and an extra row with absolute sums.
I know, Excel or the like will produce it more easily, but what would
the procedure look like in R?


Thanks,

Sören
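
A sketch with prop.table() for the percentages and addmargins() for the sums. The two vectors are invented stand-ins for the real twsource and area.

```r
set.seed(1)
# Hypothetical stand-in data
twsource <- sample(c("tap", "well", "bottle", NA), 100, replace = TRUE)
area     <- sample(c("north", "south", "east"), 100, replace = TRUE)

twsource.area <- table(twsource, area, useNA = "ifany")

# Percentages of the grand total
pct <- round(prop.table(twsource.area) * 100, 1)
pct

# Row percentages (each row sums to 100)
row_pct <- round(prop.table(twsource.area, margin = 1) * 100, 1)

# Absolute counts with an extra "Sum" row and column
with_sums <- addmargins(twsource.area)
with_sums
```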



[R] Plot grouped histograms

2008-10-09 Thread soeren . vogel
r11 -- r16 are variables showing a reason for usage of a product in 6
different situations. Each variable is a factor with 4 levels imported
from an SPSS sav file with labels ranging from "not important" to "very
important", and NA's for a sample of N = 276.


(1) I need a chi square test of independence showing that the reason
does not differ depending on the situation.


(2) I need a single coloured histogram plot. The x axis should be
grouped by the 6 situations with a small space between the groups; each
group should show different bars for each factor value ("not
important", "little...", ..., NA's), but NA's are not necessary.


I've been googling the whole day, searching the mailing list and
handbooks, and struggled through the somewhat R programmer specific
documentation. Being a newbie in R, I'm now afraid that I have to go
back to SPSS and Excel where my tasks would be the work of an hour. But
I was told euphorically that R may solve many of the problems I have
(and don't like) with SPSS, or having to separate calculation (SPSS,
Excel by hand) and plotting (GNUplot).


So my two questions are: How can I easily solve my 2 tasks? Secondly,
is R really recommended for newbies in daily work?


Thank you for any help

Sören
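
A rough sketch of both tasks on simulated data; the level labels other than the two quoted above and the random answers are invented. Note that the chi square test below treats the six columns as independent samples, which is an assumption that does not strictly hold when the same respondents answered all six items.

```r
set.seed(1)
lev <- c("not important", "little important", "important", "very important")

# Simulated stand-in for r11 ... r16: six factors, N = 276
dat <- as.data.frame(lapply(1:6, function(j) {
  factor(sample(lev, 276, replace = TRUE), levels = lev)
}))
names(dat) <- paste0("r1", 1:6)

# 4 x 6 matrix of counts: levels in rows, situations in columns
counts <- sapply(dat, table)

# (2) Grouped bars: one group per situation, one bar per level
barplot(counts, beside = TRUE, col = gray.colors(4),
        legend.text = lev, xlab = "situation", ylab = "count")

# (1) Chi square test on the level-by-situation table
chisq.test(counts)
```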



[R] read.spss: variable.labels

2008-10-07 Thread soeren . vogel

Hi,

how can I attach variable labels originally read by read.spss() to the  
resulting variables?


<pre>
X <- read.spss('data.sav', use.value.labels = TRUE, to.data.frame =
  TRUE, trim.factor.names = TRUE, trim_values = TRUE, reencode = "UTF-8")

names(X) <- tolower(names(X))
attach(X)
</pre>

Thank you

Sören
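
As far as the foreign package documents it, read.spss() can store the labels in an attribute called "variable.labels" on the returned object. The sketch below fakes such a data frame by hand, because no .sav file is available here, and copies each label onto its column. Storing it under an attribute named "label" is only a convention assumed for this example.

```r
# Stand-in for the result of read.spss(..., to.data.frame = TRUE);
# the variables and labels are invented.
X <- data.frame(AGE = c(23, 41), SEX = factor(c("f", "m")))
attr(X, "variable.labels") <- c(AGE = "Age in years",
                                SEX = "Sex of respondent")

# Grab the labels first, then lower-case both sets of names
labs <- attr(X, "variable.labels")
names(X)    <- tolower(names(X))
names(labs) <- tolower(names(labs))

# Attach each label to its column
for (nm in names(X)) {
  attr(X[[nm]], "label") <- labs[[nm]]
}

attr(X$age, "label")
```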



[R] Regression with nominal data

2008-09-07 Thread soeren . vogel

Hi,

y is nominal (3 categories), x1 to x3 are scale (metric) variables. What
I want is a regression showing the probability of falling into each of
the three categories of y, depending on the x. How can I perform such a
regression in R?


Thanks for your help

Sören
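
One candidate is multinomial logistic regression, for example multinom() from the nnet package (a recommended package that ships with most R installations). The sketch below runs it on simulated data; the coefficients used to generate y are arbitrary.

```r
library(nnet)

set.seed(1)
n  <- 300
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)

# Simulate a 3-category outcome whose probabilities depend on x1..x3
eta_b <-  0.5 + 1.0 * x1 - 0.5 * x2
eta_c <- -0.5 + 0.5 * x2 + 1.0 * x3
p <- cbind(1, exp(eta_b), exp(eta_c))
p <- p / rowSums(p)
y <- factor(apply(p, 1, function(pr) sample(c("a", "b", "c"), 1, prob = pr)))
dat <- data.frame(y, x1, x2, x3)

# Multinomial logit: category "a" is the reference level
fit <- multinom(y ~ x1 + x2 + x3, data = dat, trace = FALSE)
summary(fit)

# Predicted probability of each category for every observation
probs <- predict(fit, newdata = dat, type = "probs")
head(round(probs, 3))
```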
