Hello, all experts,
My major is computer-aided drug design (mainly QSAR).
My paper needs to be revised, and one reviewer has asked me to use a genetic algorithm
coupled with a Gaussian process method (GA+GP).
My data:
training set: 191*106
test set: 73*106
Here, I need to use GA+GP to do variable
Thanks, Berend,
It seems better to attach a PDF file to avoid messy
code.
Yes, what I want to obtain is the Tanimoto coefficient, and the Wikipedia page you pointed to
is about this coefficient. I also searched the R site for the Tanimoto coefficient
to learn more about it.
About your code, I have saved and
Dear all,
My double for loop is below, but it is not very efficient. I hope
someone can give me a vectorized version to replace my code. Thanks.
x is a 202*263 matrix, that is, 202 samples and 263 independent variables.
num.compd <- nrow(x)  # number of compounds
diss.all <- 0
for( i in
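Since the loop above is cut off, the following is only a sketch of how a pairwise Tanimoto computation can be vectorized in base R, not a reconstruction of the original code. It assumes the rows of x are compounds and uses the common definition T(i, j) = <xi, xj> / (<xi, xi> + <xj, xj> - <xi, xj>), which for 0/1 descriptors reduces to shared bits over total bits. The example matrix and the function name `tanimoto_similarity` are hypothetical.

```r
# Hypothetical example data: 4 compounds described by 6 binary descriptors.
x <- matrix(c(1, 1, 0, 0, 1, 0,
              1, 0, 0, 1, 1, 0,
              0, 1, 1, 0, 0, 1,
              1, 1, 0, 0, 1, 0),
            nrow = 4, byrow = TRUE)

tanimoto_similarity <- function(x) {
  x <- as.matrix(x)               # a matrix is much faster here than a data frame
  cross <- tcrossprod(x)          # cross[i, j] = sum(x[i, ] * x[j, ])
  self <- rowSums(x^2)            # self[i] = sum(x[i, ]^2)
  # All pairs at once, no explicit double loop.
  cross / (outer(self, self, "+") - cross)
}

sim <- tanimoto_similarity(x)     # similarity matrix, 1 on the diagonal
diss <- 1 - sim                   # dissimilarity matrix
```

A row that is all zeros gives 0/0 (NaN) under this definition, so such rows would need separate handling.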
Thanks for your help, it is great. In addition, at the beginning x was
a data frame, and my code ran very slowly. After your help, I
changed x to a matrix, and now it is very quick. I am very grateful for your kind help, and
your code is so good!
kevin
--
View this message in context:
Thanks for your help. I am sorry that I do not fully understand your code, so I
cannot apply it correctly to my data. Here is the attachment with my data,
and what I want to compute is the equation in the Word document in the
attachment:
The code from Berend gives the answer I wanted.
http://r.789695.n4.nabble.com/file/n3060425/fig_1.png fig. 1
http://r.789695.n4.nabble.com/file/n3060425/fig_2.png fig. 2
I want a picture like the one above, where the axes cross at the origin,
while the picture below is what I get by default, with the origin
detached, but through
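For readers with the same question, one common base-graphics way to make the axes meet at the origin is to switch off the default 4% axis padding with xaxs = "i" and yaxs = "i" and start both limits at 0. This is a minimal sketch with made-up data, not the solution from the thread, which is not shown in the archive snippet.

```r
# Hypothetical data; any non-negative x and y would do.
x <- c(0, 1, 2, 3, 4)
y <- c(0, 2, 3, 5, 8)

pdf(NULL)  # draw to a null device so the sketch runs without opening a window
# xaxs/yaxs = "i" use exactly the given limits, so the axes cross at (0, 0).
plot(x, y, xaxs = "i", yaxs = "i",
     xlim = c(0, max(x)), ylim = c(0, max(y)))
usr <- par("usr")  # plotting region: c(x1, x2, y1, y2)
dev.off()
```

With the default style ("r") the plotting region is extended slightly beyond the data range, which is why the origin looks detached.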
Thanks, I succeeded.
kevin
--
View this message in context:
http://r.789695.n4.nabble.com/how-to-get-the-plot-like-the-attachment-tp3060425p3061217.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
http://r.789695.n4.nabble.com/file/n3032045/rsv1.txt rsv1.txt
I am very grateful for David's suggestion. Here I upload my dataset
rsv1.txt, along with the question:
ks <- ken.sto(rsv1, per=TRUE, per.n=0.3, va=FALSE, sav=FALSE)
It does not work: all results are NULL, and I do not know why.
hope,
http://r.789695.n4.nabble.com/file/n3031344/RSV.Rdata RSV.Rdata
I want to split my dataset into a training set and a test set using the
Kennard-Stone (KS) algorithm. Luckily, there is an R package, soil.spec, that
implements it.
But when I applied it to my dataset, it did not work. Who can help me, how
reasons
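Independently of the soil.spec package, the Kennard-Stone idea itself is short enough to sketch in base R: start from the two most distant samples, then repeatedly add the sample whose nearest already-selected neighbour is farthest away. The function name `kennard_stone` and the toy data are assumptions for illustration; this is not the package's implementation, and it uses plain Euclidean distance on the raw columns.

```r
kennard_stone <- function(x, k) {
  d <- as.matrix(dist(x))  # pairwise Euclidean distances between samples
  # Start with the two samples that are farthest apart.
  selected <- as.integer(which(d == max(d), arr.ind = TRUE)[1, ])
  while (length(selected) < k) {
    remaining <- setdiff(seq_len(nrow(d)), selected)
    # For each remaining sample: distance to its nearest selected sample.
    min_d <- apply(d[remaining, selected, drop = FALSE], 1, min)
    # Add the sample that is farthest from the current selection.
    selected <- c(selected, remaining[which.max(min_d)])
  }
  selected
}

# Hypothetical one-descriptor data set with 4 samples.
x <- matrix(c(0, 1, 2, 10), ncol = 1)
train_idx <- kennard_stone(x, k = 3)
test_idx <- setdiff(seq_len(nrow(x)), train_idx)
```

The selected rows would form the training set and the rest the test set. In practice the descriptors are usually scaled first so that no single variable dominates the distances.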
Now it is OK. I uninstalled R 2.11.0, deleted the packages in the library,
and installed R 2.11.0 again. It works now.
Thank you!
1. Is there some criterion to detect overfitting, e.g. based on R2 and Q2 in the
training set and R2 in the test set? Which values indicate overfitting? For
example, in my data, I have R2=0.94 for the training set and R2=0.70 for the test
set. Is this overfitting?
2. in this scatter, can one say this
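The comparison described in point 1 can be made concrete with a small simulation. The sketch below is an illustration with simulated data, not the poster's data or model: it fits a linear model with many predictors on a small training set and computes R2 on both the training and the test set with the same formula. A large gap between the two is the usual informal symptom of overfitting; there is no fixed threshold. The helper name `r_squared` is an assumption.

```r
# R2 relative to the mean of the observed values in the set being evaluated.
r_squared <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

set.seed(1)
n <- 60
dat <- data.frame(matrix(rnorm(n * 20), n, 20))  # columns X1 ... X20
dat$y <- dat$X1 + rnorm(n)                        # only X1 carries signal

train <- dat[1:40, ]
test <- dat[41:60, ]

fit <- lm(y ~ ., data = train)  # 20 predictors for 40 samples: easy to overfit
r2_train <- r_squared(train$y, predict(fit, newdata = train))
r2_test <- r_squared(test$y, predict(fit, newdata = test))
```

Comparing r2_train with r2_test (or with a cross-validated Q2) shows how much of the apparent fit survives on data the model has not seen.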
a <- 1:5
b <- 2:6
plot(a,b)
Error in function (width, height, pointsize, record, rescale, xpinch, :
Graphics API version mismatch
Before, with R 2.10, plot() was fine. Now, with R 2.11.0, it does not work.
Thanks for your suggestion.
I have a lot to learn indeed. I will buy that good book.
kevin
Thank you, I have downloaded it and am studying it.
Many thanks. I can try to use a test set with 100 samples.
Another question is how I can rationally split my data into a training set
and a test set (training set with 108 samples, and test set with 100 samples).
As far as I know, the test set should have the same distribution as the training set. and
what
thanks, it is ok!
Thanks for your reply. I just want to get a figure like y1.jpg using the
data from y1.txt.
From the figure I want to obtain the split point as in y1.jpg, and
take 2.5 as the split point. This figure was drawn by someone else; I
just want to draw it using R, but I cannot, so I hope,
thanks for your help. I can have a try.
Thank you, I will try the barplot function.
I want to get the plot like this,
http://n4.nabble.com/file/n1839303/%25E9%25A2%2591%25E7%258E%2587%25E5%2588%2586%25E5%25B8%2583%25E5%259B%25BE%25E6%25A0%2587%25E5%2587%2586.jpg
%E9%A2%91%E7%8E%87%E5%88%86%E5%B8%83%E5%9B%BE%E6%A0%87%E5%87%86.jpg
not this,
Hello,
I am learning the caret package, and I want to use RFE to reduce the number of
features. I want to use RFE coupled with random forest (RFE+RF) to complete this
task. As we know, there are a number of pre-defined sets of functions, like
random forest (rfFuncs); however, I want to tune the parameters
This topic concerns reducing the number of independent variables. As we know, many
methods can do this. However, for pre-processing the independent variables, a
method like the one in the sentence below can remove many variables. How should I
understand it?
What is significant correlation at the 5% level, what is the
Hello,
I am learning randomForest. I want to draw a boxplot of MSE against mtry using 20 repeats of
5-fold cross-validation (using the median value), but I have no good method
for it, only a poor one.
The randomForest package itself does not contain a cross-validation method, and
the caret package contains cross
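The bookkeeping for "20 repeats of 5-fold cross-validation, keeping the median MSE of each repeat" can be sketched in base R. To keep the sketch free of package dependencies, a plain lm() fit stands in for the random forest; that swap is an assumption made only for illustration, and with randomForest one would run the same loop once per mtry value and draw one box per mtry. The function name `repeated_cv_mse` and the simulated data are hypothetical.

```r
repeated_cv_mse <- function(x, y, k = 5, repeats = 20) {
  sapply(seq_len(repeats), function(r) {
    # Randomly assign each sample to one of k folds.
    folds <- sample(rep(seq_len(k), length.out = nrow(x)))
    fold_mse <- sapply(seq_len(k), function(f) {
      test <- folds == f
      # Stand-in model (lm); a random forest fit would go here instead.
      fit <- lm(y ~ ., data = data.frame(x[!test, , drop = FALSE], y = y[!test]))
      pred <- predict(fit, newdata = data.frame(x[test, , drop = FALSE]))
      mean((y[test] - pred)^2)
    })
    median(fold_mse)  # one summary value per repeat
  })
}

set.seed(1)
x <- matrix(rnorm(100 * 3), 100, 3)
y <- x[, 1] + rnorm(100)
mse <- repeated_cv_mse(x, y)  # 20 values, one per repeat

pdf(NULL)  # null device so the sketch runs without opening a window
boxplot(mse, ylab = "median 5-fold CV MSE")
dev.off()
```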
...@yeah.net
Subject: Re: [R] Help me! using random Forest package, how to calculate Error
Rates in the training set ?
From: bbslover
Now I am learning random forests. Using the randomForest package, I can get
the OOB error rate and the test set error rate; now I want to get the training set
error
Now I am learning random forests. Using the randomForest package, I can get
the OOB error rate and the test set error rate; now I want to get the training set
error rate. How can I do that?
pgp.rf <- randomForest(x.tr, y.tr, x.ts, y.ts, ntree=1e3, keep.forest=FALSE, do.trace=1e2)
using the code can get oob and
http://n4.nabble.com/file/n998182/pca.jpg pca.jpg
http://n4.nabble.com/file/n998182/som.jpg som.jpg
http://n4.nabble.com/file/n998182/all%2Bindepents.xls all+indepents.xls
As we know, SOM is a good tool to cluster high-dimensional data into 2D and show it as
a 2D picture, just like in the attachment
.
Either create a modified version of lmFuncs that suits your needs or
remove variables prior to modeling (or try some other method that
doesn't require more samples than predictors, such as the lasso or
elasticnet).
Max
On Fri, Jan 1, 2010 at 10:14 PM, bbslover dlu...@yeah.net wrote:
I
I am learning the caret package. After running the rfe function, I get the
following error:
Error in `[.data.frame`(x, , retained, drop = FALSE) :
undefined columns selected
In addition: Warning message:
In predict.lm(object, x) :
prediction from a rank-deficient fit may be misleading
I
I want to split my whole dataset into a training set and a test set, build the model
on the training set, and validate it using the test set. How can I split my
dataset in a reasonable way? Please give me a hand; it would be best to give me
some R code.
I have also seen some approaches such as using SOM to project the whole
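One simple way to get a training and test set with similar response distributions is a stratified random split: bin the response by its quantiles and sample the same fraction from every bin. The sketch below is a base-R illustration of that idea under stated assumptions (numeric response, five quantile bins, simulated data); the function name `stratified_split` is hypothetical and this is not code from the thread.

```r
stratified_split <- function(y, p = 0.7, groups = 5) {
  # Bin the response into quantile groups.
  breaks <- quantile(y, probs = seq(0, 1, length.out = groups + 1))
  bins <- cut(y, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  # Sample a fraction p of the indices inside each bin.
  train_idx <- unlist(lapply(split(seq_along(y), bins), function(idx) {
    idx[sample.int(length(idx), size = round(p * length(idx)))]
  }))
  sort(unname(train_idx))
}

set.seed(123)
y <- rnorm(208)  # hypothetical response for 208 samples
train_idx <- stratified_split(y, p = 0.5)
test_idx <- setdiff(seq_along(y), train_idx)
```

The split is on the response only; if the descriptor space should also be covered evenly, a distance-based method is an alternative.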
Thank you for all help. It is helpful for me.
Max Kuhn wrote:
I noticed Max already pointed you to the caret package.
Load the library and look at the help for the createFolds function, eg:
library(caret)
?createFolds
I think that the createDataPartition function in caret might work
://www.jstatsoft.org/v28/i05/paper
about the package.
Max
On Fri, Dec 18, 2009 at 12:26 PM, bbslover [hidden email] wrote:
As is known, SVM needs some parameters such as cost, gamma and epsilon to be tuned to get
better performance, but one question arises: how can I monitor the
performance? generally
Hello, all
I have many independent variables and one dependent variable. In the end, I want to build
a model using them and predict the values of new samples, that is, regression.
Before that, I must remove some independent variables according to some criteria:
1. independent variables with constant values. 2. variance near zero.
As is known, SVM needs some parameters such as cost, gamma and epsilon to be tuned to get
better performance, but one question arises: how can I monitor the
performance? Generally speaking, we choose the cross-validation MSE in the
training set, but it seems svm cannot return the cross-validation MSE
value, we
Hi, all friends,
Please help me understand the sentences below:
"From this set, 858 columns not significantly correlated with the
response variable TBG at the 5% level were removed, leaving a set of 390
columns." and "the F-test's value for the one-parameter correlation with
the descriptor
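One plausible reading of the quoted sentence is a univariate filter: each descriptor column is tested on its own for correlation with the response, and columns whose p-value is not below 0.05 are dropped. For a regression on a single descriptor, the F statistic is the square of the t statistic of the correlation test, so both give the same p-value. The sketch below uses simulated data (not the TBG set) and base R's cor.test(); whether the quoted paper did exactly this is an assumption.

```r
set.seed(42)
n <- 50
# Hypothetical descriptor matrix with 10 columns named x1 ... x10.
x <- matrix(rnorm(n * 10), n, 10, dimnames = list(NULL, paste0("x", 1:10)))
y <- 2 * x[, 1] + rnorm(n)  # only x1 is truly related to the response

# p-value of the Pearson correlation test for each column against y.
p_values <- apply(x, 2, function(col) cor.test(col, y)$p.value)

keep <- p_values < 0.05              # significant at the 5% level
x_reduced <- x[, keep, drop = FALSE]  # columns that survive the filter
```

With many columns, some unrelated descriptors will pass a 5% test by chance, so this filter only thins out the set; it does not guarantee that the survivors are useful.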
http://old.nabble.com/file/p26443595/Edragonr.txt Edragonr.txt
Hi all,
I have a 72*495 matrix; the first column is the response, and the
remaining columns are independent variables. In the end I want to select some independent variables to fit
y, but there are so many of them that the fit is not meaningful, so
Dear all,
I am learning the subselect package in R. I want to use GA to select
some important variables, but I am puzzled by some questions.
What I want to solve is this: I have one dependent column y and 219
independent columns x. The dataset contains a total of 72 observations.
I want
OK, I understand what you mean; maybe PLS is better for my aim. But I have tried
that, and the result was also bad. The main question for me is how to select fewer variables
from the independent variables to fit the dependent one. GA may be a good way, but I have not
learned it well.
Ben Bolker wrote:
bbslover dluthm at yeah.net writes
my code below is not right:
rm(list=ls())
# define data.frame
a=c(1,2,3,5,6); b=c(1,2,3,4,7); c=c(1,2,3,4,8); d=c(1,2,3,5,1);
e=c(1,2,3,5,7)
data.f=data.frame(a,b,c,d,e)
# back up data.f
origin.data <- data.f
# get correlation matrix
cor.matrix <- cor(origin.data)
#backup
use modulus instead of integer division.
(which(cor.matrix >= 0.95) %% dim(cor.matrix)[1])
There are probably better ways than this.
Nikhil
but probably a better way to do this would be
On 6 Nov 2009, at 3:16AM, bbslover wrote:
for(i in 1:(cor.matrix[1]-1))
{
for(j in (i+1
rm(list=ls())
yx.df <- read.csv("c:/MK-2-72.csv", sep=',', header=TRUE, dec='.')
dim(yx.df)
# get the response y and the X matrix
y <- yx.df[,1]
x <- yx.df[,2:643]
# convert to matrix
mat <- as.matrix(x)
# get row number
rownum <- nrow(mat)
# remove the constant parameters
mat1 <- mat[,apply(mat,2,function(.col)!(all(.col[1]==.col[2:rownum])))]
e.g.
a=
  a b c d e
1 1 1 3 1 1
2 1 2 3 4 5
3 1 3 3 8 3
4 1 4 3 3 5
5 1 1 3 1 1
I want to delete column a and column c, because they
have the same value in every row; then I want to get this data.frame.
b=
  b d e
1 1 1 1
2 2 4 5
3 3 8 3
4 4 3 5
5 1 1 1
the following
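For the example above, one short base-R way to drop the constant columns is to count the distinct values per column. This is a sketch using the example table as data; the variable names are chosen for illustration only.

```r
# The example data frame from the message above.
df <- data.frame(a = c(1, 1, 1, 1, 1),
                 b = c(1, 2, 3, 4, 1),
                 c = c(3, 3, 3, 3, 3),
                 d = c(1, 4, 8, 3, 1),
                 e = c(1, 5, 3, 5, 1))

# A column is constant if it has exactly one distinct value.
is_constant <- sapply(df, function(col) length(unique(col)) == 1)

# Keep only the non-constant columns (drop = FALSE keeps a data frame
# even if a single column remains).
df_reduced <- df[, !is_constant, drop = FALSE]
```

Here a and c are flagged as constant, so df_reduced contains b, d and e.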
my program is below:
a=c(1,2,1,1,1); b=c(1,2,3,4,1); c=c(3,4,3,3,3); d=c(1,2,3,5,1);
e=c(1,5,3,5,1)
data.f=data.frame(a,b,c,d,e)
origin.data <- data.f
cor.matrix <- cor(origin.data)
origin.cor <- cor.matrix
m <- 0
for(i in 1:(cor.matrix[1]-1))
{
for(j in (i+1):(cor.matrix[2]))
{
if
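The loop above is cut off, so its exact intent is not recoverable. Assuming the goal is the usual one, removing one variable from every pair whose absolute correlation is at least 0.95, the nested loops can be replaced by a single which() call on the upper triangle of the correlation matrix. The sketch below uses the data from the message; the 0.95 cut-off and the rule "drop the later column of each pair" are assumptions.

```r
a <- c(1, 2, 1, 1, 1); b <- c(1, 2, 3, 4, 1); c <- c(3, 4, 3, 3, 3)
d <- c(1, 2, 3, 5, 1); e <- c(1, 5, 3, 5, 1)
data.f <- data.frame(a, b, c, d, e)

cor.matrix <- cor(data.f)

# Row/column indices of highly correlated pairs; upper.tri() keeps each
# pair once and excludes the diagonal.
high <- which(abs(cor.matrix) >= 0.95 & upper.tri(cor.matrix), arr.ind = TRUE)

# Drop the later column of every such pair.
drop_cols <- unique(colnames(cor.matrix)[high[, "col"]])
reduced <- data.f[, setdiff(names(data.f), drop_cols), drop = FALSE]
```

This rule can remove more columns than strictly necessary, but every pair left in reduced has an absolute correlation below the cut-off.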
is the dimension of your problem?
HTH,
Rick
--
From: Frank E Harrell Jr f.harr...@vanderbilt.edu
Sent: Thursday, November 05, 2009 4:12 PM
To: Ricardo Gonçalves Silva ricard...@terra.com.br
Cc: bbslover dlu...@yeah.net; r-help@r-project.org
hello,
my problem is like this: after processing the variables, 160 independent
variables and a dependent y remain. When I used the PLS method with 10
components, a good r2 could be obtained. But I do not know how to express
my equation with fewer variables and y. It is better to
--
From: bbslover dlu...@yeah.net
Sent: Wednesday, November 04, 2009 10:23 AM
To: r-help@r-project.org
Subject: [R] variable selectin---reduce the numbers of initial variable
hello,
my problem is like this: now after processing the variables
Thank you for your help, it is a good way.
Steven Kang wrote:
can try
matrix.x <- as.matrix(x)
On Mon, Nov 2, 2009 at 8:38 PM, bbslover dlu...@yeah.net wrote:
On my C:/ disk I have a file a.csv that I want to read into R. Importantly,
when
I use x=read.csv("C:/a.csv"), the x format
On my C:/ disk I have a file a.csv that I want to read into R. Importantly, when
I use x=read.csv("C:/a.csv"), x is a data.frame. I want it to
become a matrix. How can I do that?
thank you!
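A small self-contained illustration of the conversion asked about here: read.csv() always returns a data frame, and as.matrix() turns it into a matrix. The sketch writes a temporary file instead of assuming that C:/a.csv exists; the file contents are made up.

```r
# Create a small hypothetical CSV file in a temporary location.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = c(2.5, 3.5, 4.5)), csv_path, row.names = FALSE)

x <- read.csv(csv_path)  # x is a data.frame
m <- as.matrix(x)        # numeric matrix when all columns are numeric

unlink(csv_path)         # remove the temporary file
```

If any column holds text, as.matrix() returns a character matrix, so non-numeric columns should be dropped or converted first.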
I have tried it. paste can add the letter I want, but it cannot paste the column
names. Maybe I should study it harder.
Thank you, Don MacQueen, I will try it.
There are many R packages: yesterday 2031, but today 2033. How can I
know which packages were added or updated?
That is amazing. Thanks, Gabor Grothendieck. I got it.
Gabor Grothendieck wrote:
Google for
CRANberries aggregates
and check first hit.
On Sat, Oct 24, 2009 at 4:44 AM, bbslover dlu...@yeah.net wrote:
there are many R packages, yesterday, 2031 but today 2033 packages. how
can I
Steve Lianoglou-6 wrote:
Hi,
On Oct 22, 2009, at 2:35 PM, bbslover wrote:
Usage
data(gasoline)
Format
A data frame with 60 observations on the following 2 variables.
octane
a numeric vector. The octane number.
NIR
a matrix with 401 columns. The NIR spectrum
and I see
I have read that one. I want to apply this method to my data, but I do not
know how to put my data into R.
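The gasoline data described in this thread is a data frame in which one "variable" (NIR) is a whole matrix. A data frame with the same layout can be built from one's own data by wrapping the predictor matrix in I(). The sketch below uses simulated numbers and the hypothetical names y, X and my_data.

```r
set.seed(1)
# Hypothetical data: 60 samples, 5 predictor columns, one response.
x <- matrix(rnorm(60 * 5), 60, 5, dimnames = list(NULL, paste0("v", 1:5)))
y <- rnorm(60)

# I() keeps the matrix as a single column of the data frame,
# the same layout as octane / NIR in the gasoline data.
my_data <- data.frame(y = y, X = I(x))

dim(my_data)    # 60 rows, 2 variables
dim(my_data$X)  # the matrix is still 60 x 5

# With the pls package installed, a model could then be written as
# plsr(y ~ X, data = my_data)   # not run here
```

Data stored in a CSV file would first be read with read.csv() and split into the response column and the predictor columns before this step.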
Usage
data(gasoline)
Format
A data frame with 60 observations on the following 2 variables.
octane
a numeric vector. The octane number.
NIR
a matrix with 401 columns. The NIR spectrum
and when I look at the gasoline data I see the following
NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.1694 nm