[R] string edit distance

2007-04-07 Thread Thomas Hills
I have a column of words, for example

DOG
DOOG
GOD
GOOD
DOOR
...

and I am interested in creating a matrix that contains the string
edit distances between each pair of words.  I am this close  ->'  '<-
to writing the algorithm myself (which will allow for different
variations on the string edit rules: indels, with or without
transpositions, and possibly some variations on that), but I figured
I'd see if anyone on the list has any experience with this and might
already have some shoulders for me to stand on.
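
For the plain insertion/deletion/substitution case, a minimal sketch using
adist() from the utils package in current versions of R might look like the
following. The names `words`, `d` and `d2` are illustrative, and the
alternative cost weights are an assumption chosen only to show the `costs`
argument. adist() does not treat a transposition as a single edit, so that
variation would still need custom code or an add-on package.

```r
# Words from the example above
words <- c("DOG", "DOOG", "GOD", "GOOD", "DOOR")

# Pairwise generalized Levenshtein distances (unit costs by default)
d <- adist(words)
dimnames(d) <- list(words, words)   # label rows and columns with the words
print(d)

# Illustrative variation (assumed weights): substitutions cost 2,
# insertions and deletions cost 1
d2 <- adist(words,
            costs = c(insertions = 1, deletions = 1, substitutions = 2))
dimnames(d2) <- list(words, words)
print(d2)
```

The result is a symmetric matrix with zeros on the diagonal, so `d["DOG", "DOOG"]` looks up the distance for one pair.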

Thanks,

Thomas


Thomas Hills Ph.D.
Department of Psychological and Brain Sciences
Indiana University
Bloomington, IN 47405






__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] interpreting coxph results

2006-08-21 Thread Thomas Hills
I am having trouble understanding the results I'm getting back from
coxph in a recurrent event analysis.  I've included the model and its
summary below.  In some cases, with minor variations, the Robust
variance and Wald tests are significant, but the individual
covariates may or may not be significant.  My main question is:  If
the Wald and robust tests both take the clustering into account, why
are they so different, and how do I make sense of them?  A second
question is:  If Wald and Robust are both significant in the summary
tests, but all individual covariates are insignificant (these are
Wald, yes?), what do I make of that?  I recognize the questions are
partly R related and partly statistical (if there is a better place
to post this, please let me know).

Call:
coxph(formula = Surv(startt, stopt, rep(1, nrow(omfi))) ~ joof1 +
    topslope1 * top1 + I(early.angle/late.angle) + spac.cov +
    ave.angle + slopef.d + cluster(id) + strata(sequence), data = thedofile)

  n= 174
                              coef exp(coef) se(coef) robust se      z    p
joof1                      -0.2755  7.59e-01   0.1590    0.2998 -0.919 0.36
topslope1                  30.9827  2.86e+13  23.2339   51.9948  0.596 0.55
top1                        0.1165  1.12e+00   0.1901    0.3951  0.295 0.77
I(early.angle/late.angle)   0.0449  1.05e+00   0.1165    0.1296  0.347 0.73
spac.cov                    0.9815  2.67e+00   3.4104    5.5871  0.176 0.86
ave.angle                   0.0396  1.04e+00   0.0156    0.0266  1.488 0.14
slopef.d                   -0.3394  7.12e-01   0.4373    0.8891 -0.382 0.70
topslope1:top1             -5.5673  3.82e-03   2.8198    6.7696 -0.822 0.41


Rsquare= 0.18   (max possible= 0.898 )
Likelihood ratio test= 34.5  on 8 df,   p=3.27e-05
Wald test= 23.5  on 8 df,   p=0.00276
Score (logrank) test = 31.8  on 8 df,   p=0.000103,   Robust = 13.5  p=0.097

  (Note: the likelihood ratio and score tests assume independence of
     observations within a cluster, the Wald and robust score tests do not).

Thanks for any help,

Thomas Hills
Indiana University



[R] averaging within columns

2005-02-26 Thread thomas hills
I have a dataframe with names in the first column and wait times 
between decisions in the second column.  Since individuals make 
multiple decisions, I want the average for each individual.  For 
example, the data might look like this

name  wtime
jo    1
jo    2
jo    1
jo    3
tim   3
tim   2
tim   2
ro    1
ro    2
etc.
I'm hoping there is something like
mean(dataname$wtime[name])
which will just create a column with length equal to the number of 
different names (levels) and an average wtime for each.  So far though, 
I haven't had much luck figuring that one out.
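
One way to get a per-name average is sketched below with base R's tapply()
and aggregate(). The data frame `dat` simply re-enters the example rows
above; `avg` and `avg_df` are illustrative names, not anything from the
original data.

```r
# Example data re-entered from the listing above
dat <- data.frame(
  name  = c("jo", "jo", "jo", "jo", "tim", "tim", "tim", "ro", "ro"),
  wtime = c(1, 2, 1, 3, 3, 2, 2, 1, 2)
)

# Named vector: one mean wait time per distinct name
avg <- tapply(dat$wtime, dat$name, mean)
print(avg)

# Same summary as a data frame with one row per name
avg_df <- aggregate(wtime ~ name, data = dat, FUN = mean)
print(avg_df)
```

tapply() returns a vector indexed by name (for example `avg[["jo"]]`), while aggregate() keeps the result in data-frame form for further merging.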

Thanks.
Thomas


[R] read.table from a list of filenames

2004-12-28 Thread thomas hills
I am wondering if it is possible to read.table repeatedly from a list 
of file names into a new list of table names.

For example:

filenames <- list.files()

then with a function like

rf <- function(i) {
word??(filenames[i]) <- read.table(filenames[i]) }

I can't seem to find a function like word?? that will be the object of 
another operation.   If this worked, then I could repeat it for the 
length of filenames.

Also, even the following function seems to give me an error, but I 
don't yet know why.

rf <- function(nam, i) {  nam <- read.table(filenames[i]) }
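
A common alternative to creating one named object per file is to read
everything into a single named list with lapply(). The sketch below is
self-contained, so it first writes two small made-up files to a temporary
directory; the file names, the columns `a` and `b`, and the list name
`tables` are all assumptions for illustration. Note that an assignment to
`nam` inside a function only binds a local variable, so nothing is created
in the caller's workspace; assign() is the usual way to create a variable
whose name is held in a string.

```r
# Create two small example files in a temporary directory (made-up data)
dir <- tempfile("tables_")
dir.create(dir)
write.table(data.frame(a = 1:3, b = 4:6), file.path(dir, "first.txt"))
write.table(data.frame(a = 7:8, b = 9:10), file.path(dir, "second.txt"))

filenames <- list.files(dir, full.names = TRUE)

# Read every file; the result is a list with one data frame per file
tables <- lapply(filenames, read.table, header = TRUE)
names(tables) <- basename(filenames)
print(tables[["first.txt"]])

# If separate top-level objects are really wanted, assign() creates a
# variable whose name is given as a character string
for (f in filenames) {
  assign(basename(f), read.table(f, header = TRUE))
}
```

Keeping the tables in one list makes later steps (renaming columns, combining, summarising) a matter of another lapply() call over `tables`.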


Any help would be very much appreciated.

Thanks,
Thomas


[R] using get() in assign()

2004-12-28 Thread thomas hills
I'm trying to rename the columns in a list of data.frames using the 
following...

for(i in 1:length(filenames)) {
assign(names(get(filenames[i])), c("name", "infood", "time") ) }

R returns no errors, but the names are unchanged in the data.frames.

The original names were things like

> names(get(filenames[2]))
[1] "Tc45w4.V1" "Tc45w4.V2" "Tc45w4.V3"

after the above procedure they are still those names.
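
A likely explanation is that assign() treats its first argument as the name
of a variable to create, so the loop binds the new names to a variable
called after the first column label and leaves the data frames untouched.
A minimal sketch of one workaround is to fetch each object, rename the
copy, and store it back. The two data frames below are made-up stand-ins
for the real files, and `tmp` is an illustrative temporary name.

```r
# Two made-up data frames standing in for the tables read from files
Tc45w4 <- data.frame(V1 = 1:2, V2 = 3:4, V3 = 5:6)
Tc45w5 <- data.frame(V1 = 7:8, V2 = 9:10, V3 = 11:12)
filenames <- c("Tc45w4", "Tc45w5")

for (f in filenames) {
  tmp <- get(f)                              # fetch the object by name
  names(tmp) <- c("name", "infood", "time")  # rename columns on the copy
  assign(f, tmp)                             # store it back under the same name
}

print(names(Tc45w4))
print(names(Tc45w5))
```

If the data frames are kept in a list instead of as separate objects, the same renaming can be done with a single lapply() call and no get()/assign() at all.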

Ideas appreciated.  Thanks.

Thomas


