[R] feature not available

2006-02-21 Thread Stephen Choularton
Hi I am working with this data: my data summary is: summary(spi)
      open           high           low            close          volume
 Min.   :4315   Min.   :4365   Min.   :4301   Min.   :4352   Min.   :    0
 1st Qu.:4480   1st Qu.:4497   1st Qu.:4458   1st Qu.:4475   1st Qu.:11135

[R] randomForest - classifier switch

2006-01-03 Thread Stephen Choularton
Hi I am trying to use randomForest for classification. I am using this code: set.seed(71) rf.model <- randomForest(similarity ~ ., data=set1[1:100,], importance=TRUE, proximity=TRUE) Warning message: The response has five or fewer unique values. Are you sure you want to do regression? in:
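The warning in this post appears when the response is numeric, so randomForest fits a regression. A minimal sketch of the usual remedy, converting the response to a factor before fitting, is shown here. The data frame below is made up for illustration (only the names `set1` and `similarity` come from the post), and the randomForest call runs only if that package is installed.

```r
set.seed(71)

# Hypothetical stand-in for set1: a 0/1 response and two made-up features.
set1 <- data.frame(
  similarity = rep(c(0, 1), each = 50),
  f1 = rnorm(100),
  f2 = rnorm(100)
)

# A factor response makes randomForest treat the problem as classification.
set1$similarity <- as.factor(set1$similarity)

if (requireNamespace("randomForest", quietly = TRUE)) {
  rf.model <- randomForest::randomForest(
    similarity ~ ., data = set1[1:100, ],
    importance = TRUE, proximity = TRUE
  )
  print(rf.model$type)  # reports whether a classification or regression forest was fitted
}
```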

[R] selecting every nth item in the data

2005-10-24 Thread Stephen Choularton
I want to make a glm and then use predict. I have a fairly small sample (4000 cases) and I want to train on 90% and test on 10% but I want to do it in slices so I test on every 10th case and train on the others. Is there some simple way to get these elements? Stephen
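One way to take every 10th case is to index with `seq()`. The sketch below assumes a made-up data frame `dat` with a binary outcome `y` and one predictor `x`; none of these names come from the post.

```r
set.seed(1)

# Hypothetical data standing in for the 4000-case sample.
dat <- data.frame(x = rnorm(4000))
dat$y <- rbinom(4000, 1, plogis(dat$x))

# Rows 10, 20, ..., 4000 are held out for testing; the rest are used for training.
test_idx <- seq(10, nrow(dat), by = 10)
test  <- dat[test_idx, ]
train <- dat[-test_idx, ]

fit  <- glm(y ~ x, family = binomial, data = train)
pred <- predict(fit, newdata = test, type = "response")
```

To test on a different slice, start the sequence elsewhere, for example `seq(3, nrow(dat), by = 10)`.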

[R] correct syntax

2005-09-28 Thread Stephen Choularton
Hi I am trying to produce a little table like this:
        0     1
  0   601   408
  1   290  2655
but I cannot get the syntax right. Can anyone help. Stephen
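A two-way table of this shape is what `table()` returns for two 0/1 vectors. The sketch below builds hypothetical `actual` and `predicted` vectors purely so the counts match the ones in the post; the post does not say what the rows and columns represent, so those names are assumptions.

```r
# Hypothetical label vectors constructed to reproduce the posted counts.
counts    <- c(601, 408, 290, 2655)
actual    <- rep(c(0, 0, 1, 1), times = counts)
predicted <- rep(c(0, 1, 0, 1), times = counts)

# Rows are the first argument, columns the second.
tab <- table(actual, predicted)
print(tab)
```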

[R] different models

2005-09-27 Thread Stephen Choularton
Hi I have a largish dataset (26 columns 35000 rows) which I have been subjecting to logistic regression and support vector machine analysis. I have noticed that R easily copes with using the data in either technique. Now I have to try and see what the best modeling technique to use is. I

[R] chisq.test

2005-08-26 Thread Stephen Choularton
Hi I am trying to do this: chisq.test(c(11, 13, 12, 18, 21, 43, 15, 12, 9, 10, 5, 28, 22, 11, 15, 11, 18, 28, 16, 8, 15, 19, 44, 18, 11, 23, 15, 23, 2, 5, 4, 14, 3, 22, 9, 0, 6, 19, 15, 32, 3, 16, 14, 10, 24, 16, 24, 31, 29, 28, 16, 26, 11, 11, 4, 17, 16, 13, 20, 26, 16, 19, 34, 19, 17, 14,

[R] chisq warning

2005-08-11 Thread Stephen Choularton
Hi I am running chisq as below and getting a warning. Can anyone tell me the significance of the warning? chisq.test(c(10 ,4 ,2 ,6 ,5 ,3 ,4 ,4 ,6 ,3 ,2 ,2 ,2 ,4 ,7 ,10 ,0 ,6 ,19 ,3 ,2 ,7 ,2 ,2 ,2 ,1 ,32 ,2 ,3 ,10 ,1 ,3 ,9 ,4 ,10 ,2 ,2 ,4 ,5 ,7 ,6 ,3 ,7 ,4 ,3 ,3 ,7 ,1 ,4 ,2 ,2 ,3 ,3 ,5 ,5 ,4

[R] chisq.test

2005-08-08 Thread Stephen Choularton
Hi I am trying to use this function. Can anyone show me how I would input the following example? Chi-Squared = (40-30)^2/30 + (20-30)^2/30 + (30-30)^2/30 = 3.333 + 3.333 + 0 = 6.666 (p value = 0.036) I want to be able to use different
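The worked example in this post can be entered as a single vector of observed counts. With no other arguments, `chisq.test()` runs a goodness-of-fit test against equal expected proportions, which gives an expected count of 30 per cell here. The variable names are illustrative.

```r
# Observed counts from the post; expected counts default to 90 / 3 = 30 each.
observed <- c(40, 20, 30)

res <- chisq.test(observed)
print(res$statistic)  # about 6.667
print(res$parameter)  # 2 degrees of freedom
print(res$p.value)    # about 0.036
```

`chisq.test()` also accepts a `p` argument, a vector of expected proportions summing to 1, for cases where the cells are not expected to be equal.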

[R] ranking predictive features in logistic regression

2005-06-30 Thread Stephen Choularton
Hi Is there some function in R that multiplies each coefficient by the standard deviation of the corresponding variable and produces a ranking? Stephen
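The calculation described here can be done by hand in a few lines. This is a minimal sketch on made-up data, assuming every predictor is numeric so that each coefficient name matches a column name; `dat`, `x1`, `x2`, `x3` and `y` are hypothetical.

```r
set.seed(1)
n <- 500

# Hypothetical predictors on very different scales.
dat <- data.frame(
  x1 = rnorm(n, sd = 1),
  x2 = rnorm(n, sd = 10),
  x3 = rnorm(n, sd = 0.1)
)
dat$y <- rbinom(n, 1, plogis(0.5 * dat$x1 + 0.2 * dat$x2))

fit <- glm(y ~ ., family = binomial, data = dat)

# Drop the intercept, then scale each coefficient by its predictor's standard deviation.
coefs  <- coef(fit)[-1]
sds    <- sapply(dat[names(coefs)], sd)
scaled <- coefs * sds

# Rank by absolute size of the scaled coefficient, largest first.
ranking <- sort(abs(scaled), decreasing = TRUE)
print(ranking)
```

Factor predictors would need separate handling, because their coefficient names do not match column names.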

[R] logistic regression - using polys and products of features

2005-06-16 Thread Stephen Choularton
Hi I can get all my features by doing this: logistic.model = glm(similarity ~ ., family=binomial, data = cData[3001:3800,]) I can get the product of all my features by this: logistic.model = glm(similarity ~ . ^ 2, family=binomial, data = cData[3001:3800,]) I don't seem to be able to

[R] logistic regression - course or notes

2005-06-09 Thread Stephen Choularton
Hi I am using R for logistic regression and finding it very useful. However, I wondered if anyone could point me to any course or notes on this subject using R. All help most welcome. Stephen

[R] logistic regression (glm binary)

2005-06-07 Thread Stephen Choularton
Hi I am looking for a couple of pointers using glm (family = binary). 1. I want to add all the products of my predictive features as additional features (and I have 23 of them). Is there some easy way to add them? 2. I want to drop each feature in turn and get the most significant, then

[R] LPC

2005-05-31 Thread Stephen Choularton
I want to use the LPC to estimate the vocal tract of various speakers making short utterances. Can anyone point me in the direction of a script that extracts the correct features? Stephen

[R] logistic regression

2005-05-26 Thread Stephen Choularton
Hi I am working on corpora of automatically recognized utterances, looking for features that predict error in the hypothesis the recognizer is proposing. I am using the glm functions to do logistic regression. I do this type of thing: * logistic.model = glm(formula = similarity ~.,

[R] best.svm

2005-05-24 Thread Stephen Choularton
Hi I am trying to fit an svm to predict speech recognition errors. I am using best.svm like this: svm.model = best.svm(data[1:3000,1:23],data[1:3000,24],tunecontrol = tune.control()) I got this: print(svm.model) Call: best.svm(x = data[1:3000, 1:23], tunecontrol = tune.control(),

[R] making table() work

2005-04-26 Thread Stephen Choularton
I am trying to do some verification across a large dataset, cuData, that has 23 columns. Column 23 (similarity) is the outcome 0 or 1 and the other columns are the features. I do this: verificationglm.model <- glm(formula = similarity ~ ., family=binomial, data=cuData[1:1000,]) and produce

[R] running out of memory

2005-02-16 Thread Stephen Choularton
Hi I am trying to do a large glm and running into this message. Error: cannot allocate vector of size 3725426 Kb In addition: Warning message: Reached total allocation of 494Mb: see help(memory.size) Am I simply out of memory (I only have .5 gig)? Is there something I can do? Stephen

[R] Error in eval(expr, envir, enclos) : numeric envir arg not of length one

2005-02-16 Thread Stephen Choularton
I am working with a largish dataset of 25k lines and I am now trying to use predict. pred = predict(cuDataGlmModel, length + meanPitch + minimumPitch + maximumPitch + meanF1 + meanF2 + meanF3 + meanF4 + meanF5 + ratioF1ToF2 + rationF3ToF1 + jitter + shimmer + percentUnvoicedFrames +

[R] plotting percent of incidents within different 'bins'

2005-01-05 Thread Stephen Choularton
Hi Say I have some data, two columns in a table being a binary outcome plus a predictor and I want to plot a graph that shows the percentage positives of the binary outcome within bands of the predictor, e.g. Outcome predictor 0 1 1 2 1
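One way to get the percentage of positives within bands of a predictor is to bin with `cut()` and average the 0/1 outcome per bin. The sketch below uses made-up data and an arbitrary band width of 2; the variable names and the break points are assumptions, not taken from the post.

```r
set.seed(1)

# Hypothetical data: a predictor in [0, 10] and a binary outcome that becomes
# more likely as the predictor grows.
predictor <- runif(1000, min = 0, max = 10)
outcome   <- rbinom(1000, 1, predictor / 10)

# Assign each case to a band of the predictor.
bins <- cut(predictor, breaks = seq(0, 10, by = 2), include.lowest = TRUE)

# The mean of a 0/1 outcome is the proportion of positives; scale to percent.
pct <- tapply(outcome, bins, mean) * 100

barplot(pct, xlab = "Predictor band", ylab = "% positive")
```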