[R] tree()

2007-02-19 Thread stephenc
Hi
 
I am trying to use tree() to classify movements in a futures contract.  My
data is like this:
 
   diff       dip       dim       adx
1     0  100.0865    0.10      0.0
2     0   93.18540 2044.5455  93.18540
3     0   90.30995 1549.1169  90.30995
4     1   85.22030  927.0419  85.22030
5     1   85.36084  785.6480  85.36084
6     0   85.72627  663.3814  85.72627
7     0   78.06721  500.1113  78.06721
8     1   69.59398  376.7558  69.59398
9     1   71.15429  307.4533  71.15429
10    1   71.81023  280.6238  71.81023
 
plus another 6000 lines
 
The cpus example works fine and I am trying this:
 
tree.model <- tree(as.factor(indi$diff) ~ indi$dim + indi$dip + indi$adx,
                   indi[1:4000,])
tree.model
summary(tree.model)
plot(tree.model);  text(tree.model)
 
but I get this:
> tree.model <- tree(as.factor(indi$diff) ~ indi$dim + indi$dip + indi$adx,
+ indi[1:4000,])
> tree.model
node), split, n, deviance, yval, (yprob)
      * denotes terminal node
1) root 6023 8346 0 ( 0.513 0.487 ) *
> summary(tree.model)
Classification tree:
tree(formula = as.factor(indi$diff) ~ indi$dim + indi$dip + indi$adx,
    data = indi[1:4000, ])
Variables actually used in tree construction:
character(0)
Number of terminal nodes:  1
Residual mean deviance:  1.386 = 8346 / 6022
Misclassification error rate: 0.487 = 2933 / 6023
> plot(tree.model);  text(tree.model)
Error in plot.tree(tree.model) : cannot plot singlenode tree
 
I'm not getting any sort of tree formed.
I wondered if anyone could point me in the right direction.
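A likely cause, sketched below with simulated stand-in data: because the formula spells out indi$diff, indi$dim, etc., tree() takes those full-length vectors and ignores the data = indi[1:4000,] argument (note the printout says n = 6023, not 4000). With bare column names the subset is honoured; a single-node result after that would simply mean no split reduced the deviance enough.

```r
# Sketch with simulated data standing in for the real indi data frame.
library(tree)
set.seed(1)
indi <- data.frame(diff = rbinom(6000, 1, 0.5),
                   dip  = runif(6000, 60, 110),
                   dim  = runif(6000, 100, 2500),
                   adx  = runif(6000, 60, 110))
# Bare column names, so data= and the row subset are actually used:
tree.model <- tree(as.factor(diff) ~ dim + dip + adx, data = indi[1:4000, ])
summary(tree.model)   # n should now be 4000
```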
Thanks.
Stephen Choularton

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ripper

2007-02-17 Thread stephenc
Is there a decision-tree method available in R, like RIPPER, that ends up
producing a list of the rules and can be used for prediction?
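One option, assuming the RWeka package (and a Java runtime) is available: JRip is Weka's implementation of RIPPER, prints its induced rule list, and works with predict().

```r
# Sketch using the built-in iris data; assumes RWeka + Java are installed.
library(RWeka)
fit <- JRip(Species ~ ., data = iris)
fit                        # prints the rule list JRip induced
predict(fit, iris[1:5, ])  # class predictions from those rules
```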

 

Stephen Choularton

02  2226

0413 545 182

 

 





[R] futures, investment, etc

2007-02-05 Thread stephenc
Hi



I am just starting to look at R and trading in futures, stock, etc



Can anyone point me to useful background material?



[R] futures, investment

2007-02-03 Thread stephenc
Hi

 

I am just starting to look at R and trading in futures, stock, etc

 

Can anyone point me to useful background material?

 

Stephen Choularton

02  2226

0413 545 182

 

 

 




[R] font size for xlab

2006-10-18 Thread stephenc
Hi

 

I am trying to set the xlab font size.  I have this code:

 

attach(errorsBySpeakers)

postscript("pic2.ps", width=4, height=4, paper="A4", horizontal=FALSE,
           pointsize=0, family="Times")

plot(prattpercent, uttspercent, xlab="Testing")

abline(z)

dev.off()

detach(errorsBySpeakers)

 

but I cannot find the correct form of words to render the xlab "Testing" in
an 11-point font.  Does anyone know the wording?  Also for ylab.
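One way to get 11-point axis titles, sketched with made-up data: cex.lab scales xlab/ylab relative to the device pointsize, so pointsize = 11 with cex.lab = 1 gives 11-pt titles (cex.axis does the same for tick labels).

```r
# Stand-in data; the real code would use the errorsBySpeakers columns.
prattpercent <- runif(30); uttspercent <- runif(30)
postscript("pic2.ps", width = 4, height = 4, paper = "special",
           horizontal = FALSE, pointsize = 11, family = "Times")
plot(prattpercent, uttspercent,
     xlab = "Testing", ylab = "Utterances (%)",
     cex.lab = 1,     # 1 * 11-pt device pointsize for xlab and ylab
     cex.axis = 0.9)  # slightly smaller tick labels
dev.off()
```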

 

Thanks

 

Stephen Choularton

02  2226

0413 545 182

 

 

 





[R] stepAIC

2006-09-12 Thread stephenc
Hi

 

I hope this isn't off topic, but I have always found that when I stepAIC() a
glm I get an improvement in accuracy and kappa; I have just done a case
where I got a marginal deterioration.  Is this possible, or should I be
going through my figures carefully to see if I have messed up?
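For what it's worth, that outcome is possible without any mistake: stepAIC() minimises AIC (penalised deviance), not accuracy or kappa, so the selected model can be marginally worse on those metrics. A sketch with simulated data:

```r
library(MASS)
set.seed(1)
d <- data.frame(y  = rbinom(200, 1, 0.5),
                x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
full <- glm(y ~ x1 + x2 + x3, family = binomial, data = d)
best <- stepAIC(full, trace = FALSE)  # drops terms that don't pay their AIC penalty
AIC(full); AIC(best)                  # AIC improves (or ties) by construction...
mean((predict(full, type = "response") > 0.5) == d$y)  # ...while accuracy
mean((predict(best, type = "response") > 0.5) == d$y)  # can move either way
```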

 

Stephen Choularton

02  2226

0413 545 182

 

 

 





[R] boosting - second posting

2006-05-27 Thread stephenc
Hi
 
I am using boosting for a classification and prediction problem.
 
For some reason it is giving me an outcome that doesn't fall between 0
and 1 for the predictions.  I have tried type="response" but it made no
difference.
 
Can anyone see what I am doing wrong?
 
Screen output shown below:
 
 
> boost.model <- gbm(as.factor(train$simNuance) ~ .,  # formula
+                data=train,              # dataset
+                                         # +1: monotone increase,
+                                         #  0: no monotone restrictions
+                distribution="gaussian", # bernoulli, adaboost, gaussian,
+                                         # poisson, and coxph available
+                n.trees=3000,            # number of trees
+                shrinkage=0.005,         # shrinkage or learning rate,
+                                         # 0.001 to 0.1 usually work
+                interaction.depth=3,     # 1: additive model, 2: two-way interactions, etc.
+                bag.fraction=0.5,        # subsampling fraction, 0.5 is probably best
+                train.fraction=0.5,      # fraction of data for training,
+                                         # first train.fraction*N used for training
+                n.minobsinnode=10,       # minimum total weight needed in each node
+                cv.folds=5,              # do 5-fold cross-validation
+                keep.data=TRUE,          # keep a copy of the dataset with the object
+                verbose=FALSE)           # print out progress

> best.iter <- gbm.perf(boost.model, method="cv")
> pred <- predict.gbm(boost.model, test, best.iter)
> summary(pred)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.4772  1.5140  1.6760  1.5100  1.7190  1.9420
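The predictions are not probabilities because distribution = "gaussian" fits a squared-error regression to the factor codes. For a two-class outcome the usual setup is a numeric 0/1 response with distribution = "bernoulli"; a sketch with simulated data:

```r
library(gbm)
set.seed(1)
train <- data.frame(simNuance = rbinom(500, 1, 0.4),  # numeric 0/1, NOT a factor
                    x1 = rnorm(500), x2 = rnorm(500))
boost.model <- gbm(simNuance ~ .,
                   data = train,
                   distribution = "bernoulli",  # log-loss for binary outcomes
                   n.trees = 500, shrinkage = 0.005,
                   interaction.depth = 3, cv.folds = 5)
best.iter <- gbm.perf(boost.model, method = "cv", plot.it = FALSE)
p <- predict(boost.model, train, n.trees = best.iter, type = "response")
range(p)   # probabilities, so within [0, 1]
```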



[R] boosting

2006-05-24 Thread stephenc
Hi
 
I am using boosting for a classification and prediction problem.
 
For some reason it is giving me an outcome that doesn't fall between 0
and 1 for the predictions.  I have tried type="response" but it made no
difference.
 
Can anyone see what I am doing wrong?
 
Screen output shown below:
 
 
> boost.model <- gbm(as.factor(train$simNuance) ~ .,  # formula
+                data=train,              # dataset
+                                         # +1: monotone increase,
+                                         #  0: no monotone restrictions
+                distribution="gaussian", # bernoulli, adaboost, gaussian,
+                                         # poisson, and coxph available
+                n.trees=3000,            # number of trees
+                shrinkage=0.005,         # shrinkage or learning rate,
+                                         # 0.001 to 0.1 usually work
+                interaction.depth=3,     # 1: additive model, 2: two-way interactions, etc.
+                bag.fraction=0.5,        # subsampling fraction, 0.5 is probably best
+                train.fraction=0.5,      # fraction of data for training,
+                                         # first train.fraction*N used for training
+                n.minobsinnode=10,       # minimum total weight needed in each node
+                cv.folds=5,              # do 5-fold cross-validation
+                keep.data=TRUE,          # keep a copy of the dataset with the object
+                verbose=FALSE)           # print out progress

> best.iter <- gbm.perf(boost.model, method="cv")
> pred <- predict.gbm(boost.model, test, best.iter)
> summary(pred)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.4772  1.5140  1.6760  1.5100  1.7190  1.9420



[R] products and polynomials in formulae

2006-03-27 Thread stephenc
Hi
 
I can do this:
 
formula = as.factor(outcome) ~ .
 
in glm and other model-building functions.  I think there is a way to
get the products of the predictors (that is, d1 * d2, d1 * d3, etc.) and
also another way to get all the polynomials (like what poly(d1, 2)
would produce for a single predictor).
 
Can anyone tell me how you write them?
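Two standard formula idioms cover both, sketched here with a made-up data frame whose predictors are d1, d2, d3: the ^2 operator expands to all main effects plus pairwise products, and poly() supplies polynomial terms.

```r
set.seed(1)
df <- data.frame(outcome = rbinom(50, 1, 0.5),
                 d1 = rnorm(50), d2 = rnorm(50), d3 = rnorm(50))
# All main effects plus every pairwise product (d1:d2, d1:d3, d2:d3):
m1 <- glm(as.factor(outcome) ~ (d1 + d2 + d3)^2, family = binomial, data = df)
# Degree-2 orthogonal polynomial in each predictor:
m2 <- glm(as.factor(outcome) ~ poly(d1, 2) + poly(d2, 2) + poly(d3, 2),
          family = binomial, data = df)
```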
 
Stephen 



[R] error message

2006-03-27 Thread stephenc
Hi
 
Does anyone know what this means:
 
 
> glm.model <- glm(formula = as.factor(nextDay) ~ ., family=binomial,
+                  data=spi[1:1000,])
> pred <- predict(glm.model, spi[1001:1250,-9], type="response")
Warning message:
prediction from a rank-deficient fit may be misleading in:
 predict.lm(object, newdata, se.fit, scale = 1, type = ifelse(type ==
 
Column 9 is my response variable, and I still get this message even when I
remove the 9.
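The warning means some predictor columns are linearly dependent, so glm() dropped one or more coefficients (they show as NA in coef()). A minimal sketch of the mechanism with made-up data:

```r
set.seed(1)
d <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
d$x3 <- d$x1 + d$x2        # exactly collinear with x1 and x2
d$y  <- rbinom(20, 1, 0.5)
fit  <- glm(y ~ x1 + x2 + x3, family = binomial, data = d)
coef(fit)["x3"]            # NA: the aliased term was dropped
# predict() on such a fit is what triggers the rank-deficient warning.
```

Checking summary(glm.model) for NA coefficients (or constant/duplicated columns in spi) should show which term is redundant.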
 
Stephen
 



[R] There were 25 warnings (use warnings() to see them)

2006-03-25 Thread stephenc
I am trying to use bagging like this:
 
 
> bag.model <- bagging(as.factor(nextDay) ~ ., data = spi[1:1250,])
> pred <- predict(bag.model, spi[1251:13500,-9])
There were 25 warnings (use warnings() to see them)
> t <- table(pred, spi[1251:13500,9])
> t

pred  0  1
   0 42 40
   1 12 22
> classAgreement(t)
 
but I get the warning.
 
The warnings run like this:
 
 
> warnings()
Warning messages:
1: number of rows of result
is not a multiple of vector length (arg 2) in: cbind(1, 1:N,
predict(object$mtrees[[i]], newdata, type = "class"))
[warnings 2 through 25 are identical]

 
Can anyone tell me what is going wrong?
 
Stephen



[R] There were 25 warnings (use warnings() to see them)

2006-03-25 Thread stephenc
Don't worry I can see my typo.  Sorry for the posting!
 
 
> bag.model <- bagging(as.factor(nextDay) ~ ., data = spi[1:1250,])
> pred <- predict(bag.model, spi[1251:13500,-9])
There were 25 warnings (use warnings() to see them)
> t <- table(pred, spi[1251:13500,9])
> t




[R] graphs - saving and multiple

2004-12-11 Thread stephenc
Hi
 
I am doing something like this:
 
hist(maximumPitch, xlab="Maximum Pitch in Hertz")
 
which produces a nice histogram but what do I do to get two or three,
etc on one page?
 
I want to save the resulting file to an eps.  I can find:
 
 
postscript("ex.eps")
 
after which I run something like my hist() call above and then
 
dev.off()
 
but I don't get anything in my eps file!
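A sketch covering both questions, with simulated stand-in data: par(mfrow = ...) lays out several plots per page, and the device must be open before plotting, with dev.off() called before the file is complete (an .eps stays empty until the device is closed).

```r
maximumPitch <- rnorm(100, mean = 200, sd = 30)   # stand-in data
postscript("ex.eps", onefile = FALSE, horizontal = FALSE,
           width = 6, height = 3, paper = "special")
par(mfrow = c(1, 2))             # 1 row x 2 columns of plots
hist(maximumPitch, xlab = "Maximum Pitch in Hertz", main = "")
plot(density(maximumPitch), main = "Density")
dev.off()                        # nothing is written until this runs
```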
 
Thanks.
 
Stephen 



[R] tuning SVM's

2004-11-30 Thread stephenc
Hi  
 
I am doing this sort of thing:
 
POLY:
 
> obj <- best.tune(svm, similarity ~ ., data = training, kernel =
+ "polynomial")
> summary(obj)

Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "polynomial")

Parameters:
   SVM-Type:  eps-regression
 SVM-Kernel:  polynomial
       cost:  1
     degree:  3
      gamma:  0.04545455
     coef.0:  0
    epsilon:  0.1


Number of Support Vectors:  754

> svm.model <- svm(similarity ~ ., data = training, kernel = "polynomial",
+ cost = 1, degree = 3, gamma = 0.04545455, coef.0 = 0, epsilon = 0.1)
> pred <- predict(svm.model, testing)
> pred[pred > .5] <- 1
> pred[pred <= .5] <- 0
> table(testing$similarity, pred)
   pred
      0  1
  0  30  8
  1  70 63
LINEAR:

> obj <- best.tune(svm, similarity ~ ., data = training, kernel =
+ "linear")
> summary(obj)

Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "linear")

Parameters:
   SVM-Type:  eps-regression
 SVM-Kernel:  linear
       cost:  1
      gamma:  0.04545455
    epsilon:  0.1


Number of Support Vectors:  697

> svm.model <- svm(similarity ~ ., data = training, kernel = "linear",
+ cost = 1, gamma = 0.04545455, epsilon = 0.1)
> pred <- predict(svm.model, testing)
> pred[pred > .5] <- 1
> pred[pred <= .5] <- 0
> table(testing$similarity, pred)
   pred
      0    1
  0    6   32
  1    4  129
 
 
RADIAL:

> obj <- best.tune(svm, similarity ~ ., data = training, kernel =
+ "radial")
> summary(obj)

Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "linear")

Parameters:
   SVM-Type:  eps-regression
 SVM-Kernel:  linear
       cost:  1
      gamma:  0.04545455
    epsilon:  0.1


Number of Support Vectors:  697

> svm.model <- svm(similarity ~ ., data = training, kernel = "radial",
+ cost = 1, gamma = 0.04545455, epsilon = 0.1)
> pred <- predict(svm.model, testing)
> pred[pred > .5] <- 1
> pred[pred <= .5] <- 0
> table(testing$similarity, pred)
   pred
      0  1
  0  27 11
  1  64 69
 
 
SIGMOID:

> obj <- best.tune(svm, similarity ~ ., data = training, kernel =
+ "sigmoid")
> summary(obj)

Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "sigmoid")

Parameters:
   SVM-Type:  eps-regression
 SVM-Kernel:  sigmoid
       cost:  1
      gamma:  0.04545455
     coef.0:  0
    epsilon:  0.1


Number of Support Vectors:  986

> svm.model <- svm(similarity ~ ., data = training, kernel = "sigmoid",
+ cost = 1, gamma = 0.04545455, coef.0 = 0, epsilon = 0.1)
> pred <- predict(svm.model, testing)
> pred[pred > .5] <- 1
> pred[pred <= .5] <- 0
> table(testing$similarity, pred)
   pred
      0    1
  0    8   30
  1   26  107

 
and then taking out the kappa statistic to see if I am getting anything
significant.
 
I get kappas of 15 - 17%, which I don't think is very good.  I know
kappa is really for comparing the outcomes of two taggers, but it seems a
good way to check whether the results might be due to chance.
 
Two questions:
 
Any comments on Kappa and what it might be telling me?
 
What can I do to tune my kernels further?
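One direction, assuming e1071's tune.svm() grid search (data here is simulated; the real training/similarity would replace it): vary gamma and cost over powers of two and let cross-validation pick.

```r
library(e1071)
set.seed(1)
training <- data.frame(similarity = rbinom(200, 1, 0.5),
                       x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
tuned <- tune.svm(similarity ~ ., data = training, kernel = "radial",
                  gamma = 2^(-2:1),   # kernel width around the default
                  cost  = 2^(0:3))    # margin vs. training-error trade-off
summary(tuned)         # cross-validated error for each gamma/cost pair
tuned$best.model       # model refit with the winning parameters
```

For a genuine 0/1 classification (rather than eps-regression followed by thresholding at .5), convert the response first with training$similarity <- as.factor(training$similarity).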
 
Stephen



[R] (no subject)

2004-11-29 Thread stephenc
Hi 
 
I am trying to tune an svm by doing the following:
 
tune(svm, similarity ~ ., data = training, degree = 2^(1:2), gamma =
2^(-1:1), coef0 = 2^(-1:1), cost = 2^(2:4), type = "polynomial")
 
but I am getting 
 
Error in svm.default(x, y, scale = scale, ...) : 
wrong type specification!

 
I have to admit I am not sure what I am doing wrong.  Could anyone tell
me why the parameters I am using are wrong? 
 
Plus could anyone tell me how to go about picking the correct ranges for
my tuning?
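A sketch of the likely fix: "polynomial" is a kernel, not an SVM type (type means things like "C-classification" or "eps-regression", hence the "wrong type specification!" error), and the parameter grids belong in the ranges list. Data here is simulated.

```r
library(e1071)
set.seed(1)
training <- data.frame(similarity = rnorm(100),
                       x1 = rnorm(100), x2 = rnorm(100))
tuned <- tune(svm, similarity ~ ., data = training,
              kernel = "polynomial",             # kernel=, not type=
              ranges = list(degree = 2:3,
                            gamma  = 2^(-1:1),
                            cost   = 2^(2:4)))
summary(tuned)
```

Powers of two spanning a couple of orders of magnitude, as above, are a common starting grid; once the best cell is found, a finer grid around it refines the choice.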
 
Thanks
 
S



[R] tune()

2004-11-29 Thread stephenc
Hi 
 
I am trying to tune an svm by doing the following:
 
tune(svm, similarity ~ ., data = training, degree = 2^(1:2), gamma =
2^(-1:1), coef0 = 2^(-1:1), cost = 2^(2:4), type = "polynomial")
 
but I am getting 
 
Error in svm.default(x, y, scale = scale, ...) : 
wrong type specification!

 
I have to admit I am not sure what I am doing wrong.  Could anyone tell
me why the parameters I am using are wrong? 
 
Plus could anyone tell me how to go about picking the correct ranges for
my tuning?
 
Thanks
 
S



[R] support vector machine

2004-11-25 Thread stephenc
Hi Everyone
 
Thanks to those who responded last time.
 
I am still having problems.  I really want to find one of those
tutorials on how to use svm() so I can then get going using it myself.
Issues are which kernel to choose and how to tune the parameters.  If
anyone knows of a tutorial, please let me know.
 
Stephen



[R] SMVs

2004-11-24 Thread stephenc
Hi Everyone
 
I am struggling to get going with support vector machines in R - svm()
and predict() etc.  Does anyone know of a good tutorial covering R and
these things?
 
Stephen
