Hi
I am working with this data:
my data summary is:
summary(spi)
      open            high            low             close           volume
 Min.   :4315    Min.   :4365    Min.   :4301    Min.   :4352    Min.   :    0
 1st Qu.:4480    1st Qu.:4497    1st Qu.:4458    1st Qu.:4475    1st Qu.:11135
Hi
I am trying to use randomForest for classification. I am using this
code:
set.seed(71)
rf.model <- randomForest(similarity ~ ., data=set1[1:100,],
importance=TRUE, proximity=TRUE)
Warning message:
The response has five or fewer unique values. Are you sure you want to
do regression? in:
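The warning above usually means the response is stored as a number, so randomForest fits a regression. A minimal sketch of the common remedy, converting the response to a factor before fitting, follows. The data frame here is a made-up stand-in for set1 (the columns x1 and x2 are hypothetical), and the randomForest call only runs if the package is installed.

```r
# Hypothetical stand-in for set1: two numeric predictors and a 0/1 response.
set.seed(71)
set1 <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
set1$similarity <- rbinom(100, 1, 0.5)

# A numeric response leads to regression; a factor response leads to
# classification, so convert before fitting.
set1$similarity <- as.factor(set1$similarity)

if (requireNamespace("randomForest", quietly = TRUE)) {
  rf.model <- randomForest::randomForest(similarity ~ ., data = set1[1:100, ],
                                         importance = TRUE, proximity = TRUE)
  print(rf.model$type)  # reports whether the forest is classification or regression
}
```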
I want to make a glm and then use predict. I have a fairly small sample
(4000 cases) and I want to train on 90% and test on 10% but I want to do
it in slices so I test on every 10th case and train on the others. Is
there some simple way to get these elements?
Stephen
--
21/10/2005
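One way to get the slices described above is to build the test indices with seq() and use negative indexing for the training rows. This is only a sketch on simulated data: the data frame dat and its columns x and y are hypothetical, not the poster's data.

```r
# Simulated stand-in for a 4000-case dataset with a binary outcome.
set.seed(1)
n <- 4000
dat <- data.frame(x = rnorm(n))
dat$y <- rbinom(n, 1, plogis(dat$x))

# Every 10th case goes to the test set; the rest are used for training.
test.idx <- seq(10, n, by = 10)
train <- dat[-test.idx, ]
test  <- dat[test.idx, ]

# Fit on the training slice and predict probabilities for the held-out slice.
fit  <- glm(y ~ x, family = binomial, data = train)
pred <- predict(fit, newdata = test, type = "response")
```

Replacing the starting value, as in seq(k, n, by = 10) for k from 1 to 10, gives each of the ten possible slices in turn.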
Hi
I am trying to produce a little table like this:
      0    1
0   601  408
1   290 2655
but I cannot get the syntax right.
Can anyone help.
Stephen
--
27/09/2005
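A table of this shape is what table() returns when given two 0/1 vectors. The sketch below assumes the rows and columns are two such vectors (the names actual and predicted and their values are made up, since the post does not say what the margins represent), and also shows how to build the same layout directly from the four counts.

```r
# Hypothetical 0/1 vectors; substitute the real outcome and prediction columns.
actual    <- c(0, 0, 1, 1, 1, 0)
predicted <- c(0, 1, 1, 1, 0, 0)

# Cross-tabulate: rows are 'actual', columns are 'predicted'.
tab <- table(actual, predicted)
print(tab)

# If only the four counts are known, a labelled matrix gives the same layout.
# matrix() fills column by column, so the first column is 601, 290.
m <- matrix(c(601, 290, 408, 2655), nrow = 2,
            dimnames = list(c("0", "1"), c("0", "1")))
print(m)
```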
Hi
I have a largish dataset (26 columns 35000 rows) which I have been
subjecting to logistic regression and support vector machine analysis.
I have noticed that R easily copes with using the data in either
technique. Now I have to try and see what the best modeling technique
to use is.
I
Hi
I am trying to do this:
chisq.test(c(11, 13, 12, 18, 21, 43, 15, 12, 9, 10, 5, 28, 22, 11, 15,
11, 18, 28, 16, 8, 15, 19, 44, 18, 11, 23, 15, 23, 2, 5, 4, 14, 3, 22,
9, 0, 6, 19, 15, 32, 3, 16, 14, 10, 24, 16, 24, 31, 29, 28, 16, 26, 11,
11, 4, 17, 16, 13, 20, 26, 16, 19, 34, 19, 17, 14,
Hi
I am running chisq as below and getting a warning. Can anyone tell me
the significance of the warning?
chisq.test(c(10 ,4 ,2 ,6 ,5 ,3 ,4 ,4 ,6 ,3 ,2 ,2 ,2 ,4 ,7 ,10 ,0 ,6
,19 ,3 ,2 ,7 ,2 ,2 ,2 ,1 ,32 ,2 ,3 ,10 ,1 ,3 ,9 ,4 ,10 ,2 ,2 ,4 ,5 ,7 ,6
,3 ,7 ,4 ,3 ,3 ,7 ,1 ,4 ,2 ,2 ,3 ,3 ,5 ,5 ,4
Hi
I am trying to use this function. Can anyone show me how I would input
the following example?
Chi-Squared = (40-30)^2/30 + (20-30)^2/30 + (30-30)^2/30
            = 3.333 + 3.333 + 0 = 6.666 (p value = 0.036)
I want to be able to use different
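The worked example above can be entered as a goodness-of-fit test: given a single vector of counts, chisq.test() assumes equal expected proportions unless the p argument says otherwise. This is a sketch only; the unequal proportions in the second call are invented purely to illustrate the p argument.

```r
observed <- c(40, 20, 30)

# Default: equal expected proportions, i.e. 30 expected in each cell here.
res <- chisq.test(observed)
print(res$statistic)
print(res$p.value)

# Other expected proportions go in 'p' and must sum to 1
# (or pass rescale.p = TRUE). These values are hypothetical.
res2 <- chisq.test(observed, p = c(0.5, 0.25, 0.25))
print(res2$expected)
```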
Hi
Is there some function in R that multiplies each coefficient by the
standard deviation of the corresponding variable and produces a ranking?
Stephen
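The calculation described above can be done in a few lines of base R. The sketch below assumes every predictor is numeric, so that coefficient names match column names; the data frame dat and its columns are simulated for illustration.

```r
# Simulated data: three numeric predictors on very different scales.
set.seed(1)
dat <- data.frame(x1 = rnorm(200, sd = 1),
                  x2 = rnorm(200, sd = 10),
                  x3 = rnorm(200, sd = 0.1))
dat$y <- rbinom(200, 1, plogis(0.5 * dat$x1 + 0.05 * dat$x2))

fit <- glm(y ~ ., family = binomial, data = dat)

# Drop the intercept, then scale each coefficient by its predictor's sd.
b      <- coef(fit)[-1]
sds    <- sapply(dat[names(b)], sd)
scaled <- b * sds

# Rank predictors by the absolute size of the scaled coefficient.
ranking <- sort(abs(scaled), decreasing = TRUE)
print(ranking)
```

Factor predictors would need separate handling, because their coefficients are named after factor levels and not after columns.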
Hi
I can get all my features by doing this:
logistic.model = glm(similarity ~ ., family=binomial, data =
cData[3001:3800,])
I can get the product of all my features by this:
logistic.model = glm(similarity ~ . ^ 2, family=binomial, data =
cData[3001:3800,])
I don't seem to be able to
Hi
I am using R for logistic regression and finding it very useful.
However, I wondered if anyone could point me to any course or notes on
this subject using R.
All help most welcome.
Stephen
Hi
I am looking for a couple of pointers using glm (family = binomial).
1. I want to add all the products of my predictive features as
additional features (and I have 23 of them). Is there some easy way to
add them?
2. I want to drop each feature in turn and get the most significant,
then
I want to use the LPC to estimate the vocal tract of various speakers
making short utterances.
Can anyone point me in the direction of a script that extracts the
correct features
Stephen
Hi
I am working on corpora of automatically recognized utterances, looking
for features that predict error in the hypothesis the recognizer is
proposing.
I am using the glm functions to do logistic regression. I do this type
of thing:
* logistic.model = glm(formula = similarity ~.,
Hi
I am trying to fit an svm to predict speech recognition errors. I am
using best.svm like this:
svm.model = best.svm(data[1:3000,1:23],data[1:3000,24],tunecontrol =
tune.control())
I got this:
print(svm.model)
Call:
best.svm(x = data[1:3000, 1:23], tunecontrol = tune.control(),
I am trying to do some verification across a large dataset, cuData, that
has 23 columns.
Column 23 (similarity) is the outcome 0 or 1 and the other columns are
the features.
I do this:
verificationglm.model <- glm(formula = similarity ~ ., family=binomial,
data=cuData[1:1000,])
and produce
Hi
I am trying to do a large glm and running into this message.
Error: cannot allocate vector of size 3725426 Kb
In addition: Warning message:
Reached total allocation of 494Mb: see help(memory.size)
Am I simply out of memory (I only have .5 gig)?
Is there something I can do?
Stephen
I am working with a largish dataset of 25k lines and I am now trying to
use predict.
pred = predict(cuDataGlmModel, length + meanPitch + minimumPitch +
maximumPitch + meanF1 + meanF2 + meanF3 + meanF4 + meanF5 +
ratioF1ToF2 + rationF3ToF1 + jitter + shimmer + percentUnvoicedFrames
+
Hi
Say I have some data, two columns in a table being a binary outcome plus
a predictor and I want to plot a graph that shows the percentage
positives of the binary outcome within bands of the predictor, e.g.
Outcome predictor
0 1
1 2
1
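A plot like the one described can be built by cutting the predictor into bands with cut() and averaging the 0/1 outcome within each band with tapply(). This is a sketch on simulated data; the band width of 2 and the variable values are arbitrary assumptions.

```r
# Simulated data: a numeric predictor and a binary outcome that depends on it.
set.seed(1)
predictor <- runif(500, 0, 10)
outcome   <- rbinom(500, 1, plogis(predictor - 5))

# Cut the predictor into bands of width 2.
bands <- cut(predictor, breaks = seq(0, 10, by = 2), include.lowest = TRUE)

# The mean of a 0/1 outcome is the proportion of positives; scale to percent.
pct <- 100 * tapply(outcome, bands, mean)

barplot(pct, xlab = "Predictor band", ylab = "% positive")
```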