Re: [R] Classification
For all who sent help on topic Classification:
Thank you very much folks. I have got some inspiration on how to solve this task.

Michael

- Original Message -
From: "Marc Schwartz" <[EMAIL PROTECTED]>
To: "Ing. Michal Kneifl, Ph.D." <[EMAIL PROTECTED]>
Cc:
Sent: Wednesday, July 18, 2007 7:53 PM
Subject: Re: [R] Classification

> On Wed, 2007-07-18 at 19:36 +0200, Ing. Michal Kneifl, Ph.D. wrote:
>> Hi,
>> I am also a quite new user of R and would like to ask you for help:
>> I have a data frame where all columns are numeric variables. My aim is
>> to convert one column into a factor.
>> Example:
>> MD
>> 0.2
>> 0.1
>> 0.8
>> 0.3
>> 0.7
>> 0.6
>> 0.01
>> 0.2
>> 0.5
>> 1
>> 1
>>
>>
>> I want to make classes:
>> 0-0.2 A
>> 0.21-0.4 B
>> 0.41-0.6 C
>> . and so on
>>
>> So after classification I will get:
>> MD
>> A
>> A
>> D
>> B
>> .
>> .
>> .
>> and so on
>>
>> Please could you give an advice to a newbie?
>> Thanks a lot in advance..
>>
>> Michael
>
> See ?cut
>
> You can then do something like:
>
> > DF
>      MD
> 1  0.20
> 2  0.10
> 3  0.80
> 4  0.30
> 5  0.70
> 6  0.60
> 7  0.01
> 8  0.20
> 9  0.50
> 10 1.00
> 11 1.00
>
>
> > cut(DF$MD, breaks = c(seq(0, 1, .2)), labels = LETTERS[1:5])
> [1] A A D B D C A A C E E
> Levels: A B C D E
>
>
> HTH,
>
> Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Classification
On Wed, 2007-07-18 at 12:53 -0500, Marc Schwartz wrote:
> On Wed, 2007-07-18 at 19:36 +0200, Ing. Michal Kneifl, Ph.D. wrote:
> > Hi,
> > I am also a quite new user of R and would like to ask you for help:
> > I have a data frame where all columns are numeric variables. My aim is
> > to convert one column into a factor.
> > Example:
> > MD
> > 0.2
> > 0.1
> > 0.8
> > 0.3
> > 0.7
> > 0.6
> > 0.01
> > 0.2
> > 0.5
> > 1
> > 1
> >
> >
> > I want to make classes:
> > 0-0.2 A
> > 0.21-0.4 B
> > 0.41-0.6 C
> > . and so on
> >
> > So after classification I will get:
> > MD
> > A
> > A
> > D
> > B
> > .
> > .
> > .
> > and so on
> >
> > Please could you give an advice to a newbie?
> > Thanks a lot in advance..
> >
> > Michael
>
> See ?cut
>
> You can then do something like:
>
> > DF
>      MD
> 1  0.20
> 2  0.10
> 3  0.80
> 4  0.30
> 5  0.70
> 6  0.60
> 7  0.01
> 8  0.20
> 9  0.50
> 10 1.00
> 11 1.00
>
>
> > cut(DF$MD, breaks = c(seq(0, 1, .2)), labels = LETTERS[1:5])
> [1] A A D B D C A A C E E
> Levels: A B C D E

For precision, let's clean that up, as I just realized that I left the
remnants of c() in there from an alternative solution, which is not
needed here:

cut(DF$MD, breaks = seq(0, 1, .2), labels = LETTERS[1:5])

Marc
Re: [R] Classification
You can use 'cut':

> x
     MD
1  0.20
2  0.10
3  0.80
4  0.30
5  0.70
6  0.60
7  0.01
8  0.20
9  0.50
10 1.00
11 1.00
> cut(x$MD, breaks=seq(0,1,.2), include.lowest=TRUE, labels=LETTERS[1:5])
[1] A A D B D C A A C E E
Levels: A B C D E
>

On 7/18/07, Ing. Michal Kneifl, Ph.D. <[EMAIL PROTECTED]> wrote:
> Hi,
> I am also a quite new user of R and would like to ask you for help:
> I have a data frame where all columns are numeric variables. My aim is
> to convert one column into a factor.
> Example:
> MD
> 0.2
> 0.1
> 0.8
> 0.3
> 0.7
> 0.6
> 0.01
> 0.2
> 0.5
> 1
> 1
>
>
> I want to make classes:
> 0-0.2 A
> 0.21-0.4 B
> 0.41-0.6 C
> . and so on
>
> So after classification I will get:
> MD
> A
> A
> D
> B
> .
> .
> .
> and so on
>
> Please could you give an advice to a newbie?
> Thanks a lot in advance..
>
> Michael

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
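The two cut() calls in this thread differ only in include.lowest. A minimal sketch of what that argument changes, assuming the same toy values plus an extra 0 (the 0 and the variable names md, open_left and closed are illustrative additions, not part of the original posts):

```r
# Toy data from the thread, plus a 0 to exercise the lower boundary.
md <- c(0.2, 0.1, 0.8, 0.3, 0.7, 0.6, 0.01, 0.2, 0.5, 1, 1, 0)
breaks <- seq(0, 1, 0.2)

# Default behaviour: intervals are open on the left and closed on the right.
open_left <- cut(md, breaks = breaks)
levels(open_left)        # the default labels show which interval each class covers
is.na(open_left[12])     # the 0 lies outside the first interval, so it becomes NA

# include.lowest = TRUE closes the first interval at its lower end.
closed <- cut(md, breaks = breaks, include.lowest = TRUE,
              labels = LETTERS[1:5])
as.character(closed[12]) # the 0 is now assigned to class A
```

If the data can contain an exact 0, the include.lowest = TRUE form is probably the one that matches the requested "0-0.2" class.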
Re: [R] Classification
On Wed, 2007-07-18 at 19:36 +0200, Ing. Michal Kneifl, Ph.D. wrote:
> Hi,
> I am also a quite new user of R and would like to ask you for help:
> I have a data frame where all columns are numeric variables. My aim is
> to convert one column into a factor.
> Example:
> MD
> 0.2
> 0.1
> 0.8
> 0.3
> 0.7
> 0.6
> 0.01
> 0.2
> 0.5
> 1
> 1
>
>
> I want to make classes:
> 0-0.2 A
> 0.21-0.4 B
> 0.41-0.6 C
> . and so on
>
> So after classification I will get:
> MD
> A
> A
> D
> B
> .
> .
> .
> and so on
>
> Please could you give an advice to a newbie?
> Thanks a lot in advance..
>
> Michael

See ?cut

You can then do something like:

> DF
     MD
1  0.20
2  0.10
3  0.80
4  0.30
5  0.70
6  0.60
7  0.01
8  0.20
9  0.50
10 1.00
11 1.00

> cut(DF$MD, breaks = c(seq(0, 1, .2)), labels = LETTERS[1:5])
[1] A A D B D C A A C E E
Levels: A B C D E

HTH,

Marc Schwartz
Re: [R] Classification
maybe:

x = c(.2, .1, .8, .3, .7, .6, .01, .2, .5, 1, 1)
breaks = seq(0, 1, .2)
LETTERS[1:(length(breaks)-1)][cut(x, breaks)]

b

On Jul 18, 2007, at 1:50 PM, Doran, Harold wrote:

> Michael
>
> Assume your data frame is called "data" and your variable is called
> "V1". Converting this to a factor is:
>
> data$V1 <- factor(data$V1)
>
> Creating the classes can be done using ifelse(). Something like
>
> data$class <- ifelse(data$V1 < .21, "A", ifelse(data$V1 < .41, "B", "C"))
>
> Harold
>
>
>> -Original Message-
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] On Behalf Of Ing.
>> Michal Kneifl, Ph.D.
>> Sent: Wednesday, July 18, 2007 1:37 PM
>> To: r-help@stat.math.ethz.ch
>> Subject: [R] Classification
>>
>> Hi,
>> I am also a quite new user of R and would like to ask you for help:
>> I have a data frame where all columns are numeric variables.
>> My aim is to convert one column into a factor.
>> Example:
>> MD
>> 0.2
>> 0.1
>> 0.8
>> 0.3
>> 0.7
>> 0.6
>> 0.01
>> 0.2
>> 0.5
>> 1
>> 1
>>
>>
>> I want to make classes:
>> 0-0.2 A
>> 0.21-0.4 B
>> 0.41-0.6 C
>> . and so on
>>
>> So after classification I will get:
>> MD
>> A
>> A
>> D
>> B
>> .
>> .
>> .
>> and so on
>>
>> Please could you give an advice to a newbie?
>> Thanks a lot in advance..
>>
>> Michael
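The one-liner above indexes a vector of labels with the factor returned by cut(); R uses the factor's integer codes for the lookup. A small sketch of that idea, assuming the same x and breaks (the names f, codes, by_index and as_factor are illustrative):

```r
x <- c(.2, .1, .8, .3, .7, .6, .01, .2, .5, 1, 1)
breaks <- seq(0, 1, .2)
labs <- LETTERS[seq_len(length(breaks) - 1)]

f <- cut(x, breaks)      # factor with interval labels such as "(0,0.2]"
codes <- as.integer(f)   # 1-based bin number for every value

# Indexing the label vector with the factor uses those integer codes,
# so the result is a plain character vector of class letters.
by_index <- labs[f]

# Passing the labels to cut() yields the same classes, but as a factor.
as_factor <- cut(x, breaks, labels = labs)
```

The indexing form returns a character vector, so wrap it in factor() if a factor column is what the data frame needs.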
Re: [R] Classification
Have a look at the recode function in the car package

library(car)
?recode

should give you what you need.

--- "Ing. Michal Kneifl, Ph.D." <[EMAIL PROTECTED]> wrote:

> Hi,
> I am also a quite new user of R and would like to
> ask you for help:
> I have a data frame where all columns are numeric
> variables. My aim is
> to convert one column into a factor.
> Example:
> MD
> 0.2
> 0.1
> 0.8
> 0.3
> 0.7
> 0.6
> 0.01
> 0.2
> 0.5
> 1
> 1
>
>
> I want to make classes:
> 0-0.2 A
> 0.21-0.4 B
> 0.41-0.6 C
> . and so on
>
> So after classification I will get:
> MD
> A
> A
> D
> B
> .
> .
> .
> and so on
>
> Please could you give an advice to a newbie?
> Thanks a lot in advance..
>
> Michael
Re: [R] Classification
Michael

Assume your data frame is called "data" and your variable is called
"V1". Converting this to a factor is:

data$V1 <- factor(data$V1)

Creating the classes can be done using ifelse() on the original numeric
values (that is, before the conversion above, since < is not meaningful
for factors). Something like

data$class <- ifelse(data$V1 < .21, "A", ifelse(data$V1 < .41, "B", "C"))

Harold

> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Ing.
> Michal Kneifl, Ph.D.
> Sent: Wednesday, July 18, 2007 1:37 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Classification
>
> Hi,
> I am also a quite new user of R and would like to ask you for help:
> I have a data frame where all columns are numeric variables.
> My aim is to convert one column into a factor.
> Example:
> MD
> 0.2
> 0.1
> 0.8
> 0.3
> 0.7
> 0.6
> 0.01
> 0.2
> 0.5
> 1
> 1
>
>
> I want to make classes:
> 0-0.2 A
> 0.21-0.4 B
> 0.41-0.6 C
> . and so on
>
> So after classification I will get:
> MD
> A
> A
> D
> B
> .
> .
> .
> and so on
>
> Please could you give an advice to a newbie?
> Thanks a lot in advance..
>
> Michael
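A hedged sketch of the ifelse() route carried through all five classes, assuming a data frame named data with a numeric column V1 as in the message above (the cut points .61 and .81 simply continue the poster's pattern and are an assumption). The comparison is done while V1 is still numeric, and the class labels are quoted strings:

```r
data <- data.frame(V1 = c(0.2, 0.1, 0.8, 0.3, 0.7, 0.6, 0.01, 0.2, 0.5, 1, 1))

# Nested ifelse(): each test is only reached when the previous ones failed.
data$class <- ifelse(data$V1 < 0.21, "A",
              ifelse(data$V1 < 0.41, "B",
              ifelse(data$V1 < 0.61, "C",
              ifelse(data$V1 < 0.81, "D", "E"))))

# Convert the labels to a factor with a fixed level order afterwards.
data$class <- factor(data$class, levels = LETTERS[1:5])
```

For more than a handful of classes, cut() as suggested elsewhere in the thread is likely easier to maintain than a deep ifelse() chain.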
Re: [R] classification tables
Also check out CrossTable in the gmodels package.

Regarding your other question, assuming we have tab <- table(x, y) as in
Philippe's post, the fraction of pairs in x and y that match can be
calculated via any of these:

sum(x==y) / length(x)
sum(diag(tab)) / sum(tab)
library(e1071)
classAgreement(tab) # tab from above
sum(diag(prop.table(tab)))

On 8/7/06, Philippe Grosjean <[EMAIL PROTECTED]> wrote:
>
> > x <- c(1,2,3,4,2,3,3,1,2,3)
> > y <- c(2,1,3,4,1,3,3,2,2,3)
> > table(x, y)
>    y
> x   1 2 3 4
>   1 0 2 0 0
>   2 2 1 0 0
>   3 0 0 4 0
>   4 0 0 0 1
> > ?table
>
> Best,
>
> Philippe Grosjean
>
> ..<°}))><
>  ) ) ) ) )
> ( ( ( ( (    Prof. Philippe Grosjean
>  ) ) ) ) )
> ( ( ( ( (    Numerical Ecology of Aquatic Systems
>  ) ) ) ) )   Mons-Hainaut University, Belgium
> ( ( ( ( (
> ..
>
> Taka Matzmoto wrote:
> > Dear R-users
> >
> > I have two vectors. One vector includes true values and the other vector has
> > estimated values. Values are all integers from 1 to 4.
> >
> > For example,
> >
> > x <- c(1,2,3,4,2,3,3,1,2,3)
> > y <- c(2,1,3,4,1,3,3,2,2,3)
> >
> > I would like to create a classification table x by y. With the table, I would like
> > to calculate what percentage is correct classification.
> >
> > Which R function do I need to use for creating a 4 * 4 classification table?
> >
> > Thank you.
> >
> > Taka,
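One caveat with the diag() approach: it only gives the right answer when the table is square with rows and columns in the same order. A minimal base-R sketch that forces this by giving both vectors the same factor levels (the names lev, tab, acc_table and acc_direct are illustrative):

```r
x <- c(1, 2, 3, 4, 2, 3, 3, 1, 2, 3)   # true classes
y <- c(2, 1, 3, 4, 1, 3, 3, 2, 2, 3)   # estimated classes

# Use a common set of levels so the table is square even if one class
# never occurs in x or in y.
lev <- sort(union(x, y))
tab <- table(truth = factor(x, levels = lev),
             estimate = factor(y, levels = lev))

acc_table  <- sum(diag(tab)) / sum(tab)  # fraction on the diagonal
acc_direct <- mean(x == y)               # same quantity without the table
```

Multiplying either value by 100 gives the percentage of correct classifications asked for in the original question.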
Re: [R] classification tables
> x <- c(1,2,3,4,2,3,3,1,2,3)
> y <- c(2,1,3,4,1,3,3,2,2,3)
> table(x, y)
   y
x   1 2 3 4
  1 0 2 0 0
  2 2 1 0 0
  3 0 0 4 0
  4 0 0 0 1
> ?table

Best,

Philippe Grosjean

..<°}))><
 ) ) ) ) )
( ( ( ( (    Prof. Philippe Grosjean
 ) ) ) ) )
( ( ( ( (    Numerical Ecology of Aquatic Systems
 ) ) ) ) )   Mons-Hainaut University, Belgium
( ( ( ( (
..

Taka Matzmoto wrote:
> Dear R-users
>
> I have two vectors. One vector includes true values and the other vector has
> estimated values. Values are all integers from 1 to 4.
>
> For example,
>
> x <- c(1,2,3,4,2,3,3,1,2,3)
> y <- c(2,1,3,4,1,3,3,2,2,3)
>
> I would like to create a classification table x by y. With the table, I would like
> to calculate what percentage is correct classification.
>
> Which R function do I need to use for creating a 4 * 4 classification table?
>
> Thank you.
>
> Taka,
Re: [R] Classification trees and written conditions
On 5/18/06, Carlos Ortega <[EMAIL PROTECTED]> wrote:
> Yes, that is right.
> The conditions on top of the branches refer to the left-hand side.

Thanks, Carlos. Then, it should be explicitly said in ?text.rpart.

Paul
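One way to check the left-branch convention for yourself is to compare the printed tree with the data. A rough sketch, assuming the built-in iris data and the default rpart settings (the 2.45 threshold is read off the printed tree, and the variable names are illustrative):

```r
library(rpart)

fit <- rpart(Species ~ ., data = iris)
print(fit)   # node 2 is listed with the condition "Petal.Length< 2.45"

# To see the same condition on the plot, run: plot(fit); text(fit)

# Node 2 is the left child of the root. If the condition written at the
# split describes the left branch, node 2 must hold exactly the
# observations for which the condition is TRUE.
n_left <- fit$frame["2", "n"]
n_true <- sum(iris$Petal.Length < 2.45)
c(n_left = n_left, n_true = n_true)
```

The two counts agree here, which is consistent with Carlos's answer that the written condition refers to the left-hand branch.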
Re: [R] Classification trees and written conditions
On 5/18/06, Carlos Ortega <[EMAIL PROTECTED]> wrote:
> Are you referring to ?:
> - library(tree)
> - library(rpart)
>
> On 5/18/06, Paul Smith <[EMAIL PROTECTED]> wrote:
> >
> > Dear All
> >
> > When drawing a classification tree with
> >
> > plot(mytree)
> > text(mytree)
> >
> > the conditions are written just before the nodes branch. My question
> > is: can one be certain that those conditions refer to the left-side
> > branches? (The R documentation surprisingly lacks the information that
> > I am asking for.)

Thanks, Carlos. I am referring to

library(rpart)

Paul
Re: [R] Classification of Imbalanced Data
The implementation of weighted RF is still on the to-do list for the
package. Use Breiman & Cutler's Fortran code for now.

Andy

From: [EMAIL PROTECTED]
>
> Hi,
> I'm looking to perform a classification analysis on an imbalanced data
> set using random Forest and I'd like to reproduce the weighted random
> forest analysis proposed in the Chen, Liaw & Breiman paper "Using Random
> Forest to Learn Imbalanced Data"; can I use the R package randomForest
> to perform such analysis? What is the easiest way to accomplish this task?
> Thanks,
> Paolo Sonego
Re: [R] Classification tree data structure
That's most helpful. Thank you very much for your time.

Best regards,
Maria

On Tue Oct 18 17:18, Prof Brian Ripley <[EMAIL PROTECTED]> sent:

On Tue, 18 Oct 2005, Hades wrote:

> Hi there,
>
> I am growing classification trees using the 'tree' package add-on to R.
>
> I would like to convert the 'R' output to the SAS format used by Salford
> Systems' commercial CART software in order to interface with some
> other software.
>
> My question is:
> How can I parse the R tree data structure in order to infer the tree
> structure? The 'tree' class has a member '$frame' which gives the
> splits at each node, but as far as I can see does not specify the
> daughter nodes. Is this information accessible through the interface to
> class 'tree' or do I need to dive into the C code?

The daughter nodes of n are 2n and 2n+1. The print method, print.tree, is
written entirely in R and shows you how to parse the tree (and you can see
the pattern of the numbers from its result).

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [R] Classification tree data structure
On Tue, 18 Oct 2005, Hades wrote:

> Hi there,
>
> I am growing classification trees using the 'tree' package add-on to R.
>
> I would like to convert the 'R' output to the SAS format used by Salford
> Systems' commercial CART software in order to interface with some
> other software.
>
> My question is:
> How can I parse the R tree data structure in order to infer the tree
> structure? The 'tree' class has a member '$frame' which gives the
> splits at each node, but as far as I can see does not specify the
> daughter nodes. Is this information accessible through the interface to
> class 'tree' or do I need to dive into the C code?

The daughter nodes of n are 2n and 2n+1. The print method, print.tree, is
written entirely in R and shows you how to parse the tree (and you can see
the pattern of the numbers from its result).

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
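The 2n / 2n+1 rule can be turned into an explicit parent/child table. The sketch below uses rpart as a stand-in because it ships with R; that swap is an assumption, and it relies on the 'tree' package storing node numbers as the row names of $frame in the same way, as the message above describes. The names node, is_leaf and edges are illustrative:

```r
library(rpart)

fit <- rpart(Species ~ ., data = iris)
frame <- fit$frame

# Row names of the frame are the node numbers (1 = root).
node <- as.integer(rownames(frame))

# Terminal nodes are marked "<leaf>" in the var column.
is_leaf <- frame$var == "<leaf>"

# Daughters of internal node n are 2n (left) and 2n + 1 (right).
edges <- data.frame(
  parent = node[!is_leaf],
  left   = 2L * node[!is_leaf],
  right  = 2L * node[!is_leaf] + 1L
)
edges
```

Going the other way, the parent of any node n > 1 is n %/% 2, which is enough to rebuild the full tree from the frame alone.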
RE: [R] Classification of an image
Search CRAN!

-- Bert Gunter

> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Poizot Emmanuel
> Sent: Thursday, March 31, 2005 11:59 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Classification of an image
>
> Dear all,
>
> I need to do an automatic classification of a raster file (image) using
> training samples. I would like to know if there is a library able to do
> such a work.
>
> Thanks
>
>
> Emmanuel Poizot
Re: [R] classification using logistic regression
Hi,

On Mon, 27 Dec 2004, Rajdeep Das wrote:

> I would like to do classification using logistic regression. Which R package
> can I use?

Have you tried the glm() function?

> Also is there any package for feature selection for logistic regression based
> method?

Do you mean model selection methods like forward selection? If so, try
step()

HTH,

Kevin

Ko-Kang Kevin Wang
PhD Student
Centre for Mathematics and its Applications
Building 27, Room 1004
Mathematical Sciences Institute (MSI)
Australian National University
Canberra, ACT 0200
Australia

Homepage: http://wwwmaths.anu.edu.au/~wangk/
Ph (W): +61-2-6125-2431
Ph (H): +61-2-6125-7407
Ph (M): +61-40-451-8301
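A minimal sketch of the glm() plus step() route, assuming a two-class subset of the built-in iris data as a stand-in for the poster's data; the 0.5 cut-off and all variable names are illustrative choices, not something stated in the thread:

```r
# Two-class problem: versicolor versus virginica.
d <- droplevels(iris[iris$Species != "setosa", ])
d$is_virginica <- as.integer(d$Species == "virginica")

# Logistic regression is a binomial GLM.
full <- glm(is_virginica ~ Sepal.Length + Sepal.Width,
            family = binomial, data = d)

# step() does AIC-based selection; with no scope given it works backward
# from the full model. trace = 0 suppresses the progress output.
selected <- step(full, trace = 0)

# Turn fitted probabilities into class labels with a 0.5 threshold.
prob <- predict(selected, type = "response")
pred <- ifelse(prob > 0.5, "virginica", "versicolor")
table(truth = d$Species, predicted = pred)
```

The resulting table is computed on the training data, so it will tend to look better than performance on new observations.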
Re: [R] classification for huge datasets: SVM yields memory troubles
While it is true that the large number of variables relative to the
number of observations restricts what can be inferred, the situation
is not as hopeless as Bert seems to suggest. If it were, attempts at
the analysis of expression array data would be a waste of time.
Methods developed for that general area may well be relevant to other
data where the number of variables is similarly far larger than the
number of observations. See

Ambroise, C. and McLachlan, G.J. 2002. Selection bias in gene
extraction on the basis of microarray gene-expression data.
PNAS 99: 6562--6566.

This discusses some of the literature on the use of SVMs.

The selection bias that these authors discuss also affects plots,
even principal components and other ordination-based plots where
features have been selected on the basis of their ability to separate
into known groups. I have draft versions of code that addresses this
selection bias as it affects the plotting of graphs, which (along with
a paper that has been submitted for inclusion in a conference
proceedings) I am happy to make available to anyone who wants to
experiment.

Another good place to look, as a starting point, may be Gordon Smyth's
LIMMA User's Guide. This can be a bit hard to find. With limma
installed, type help.start(). After some time a browser window should
open. Click on Packages | limma | Overview | LIMMA User's Guide (pdf)

John Maindonald             email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 14 Dec 2004, at 10:09 PM, [EMAIL PROTECTED] wrote:

From: Berton Gunter <[EMAIL PROTECTED]>
Date: 14 December 2004 9:23:08 AM
To: "'Andreas'" <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>
Cc:
Subject: RE: [R] classification for huge datasets: SVM yields memory troubles

"
I have a matrix with 30 observations and roughly 3 variables, ...
"

Comment: This is ** not ** a "huge" data set -- it is a tiny one with a
large number of covariates. The difference is: If it were truly huge,
SVM and/or LDA or ... might actually be able to produce useful results.
With so few data and so many variables, it is hard to see how any
approach that one uses is not simply a fancy random number generator.
RE: [R] classification for huge datasets: SVM yields memory troubles
"
I have a matrix with 30 observations and roughly 3 variables, ...
"

Comment: This is ** not ** a "huge" data set -- it is a tiny one with a
large number of covariates. The difference is: If it were truly huge,
SVM and/or LDA or ... might actually be able to produce useful results.
With so few data and so many variables, it is hard to see how any
approach that one uses is not simply a fancy random number generator.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning
process." - George E. P. Box

> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Andreas
> Sent: Monday, December 13, 2004 12:56 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [R] classification for huge datasets: SVM yields
> memory troubles
>
> Hi,
>
> I'm a beginner in the SVM-module but I have seen there is a
> parameter called:
> cachesize #cache memory in MB (default 40)
>
> please let me know if this parameter solved your problem, I
> might get the same number of samples in the near future.
>
> regards Andreas
>
> "Christoph Lehmann" <[EMAIL PROTECTED]> wrote in the newsgroup message
> news:[EMAIL PROTECTED]
> > Hi
> > I have a matrix with 30 observations and roughly 3 variables, each
> > obs belongs to one of two groups. With svm and slda I get into memory
> > troubles ('cannot allocate vector of size' roughly 2G). PCA LDA runs
> > fine. Is there any way to work around the memory issue with SVMs? Or
> > can you recommend any other classification method for such huge datasets?
> >
> >
> > P.S. I run suse 9.1 on a 2G RAM PIV machine.
> > thanks for a hint
> >
> > Christoph
Re: [R] classification for huge datasets: SVM yields memory troubles
Hi,

I'm a beginner in the SVM-module but I have seen there is a parameter
called:

cachesize #cache memory in MB (default 40)

please let me know if this parameter solved your problem, I might get the
same number of samples in the near future.

regards Andreas

"Christoph Lehmann" <[EMAIL PROTECTED]> wrote in the newsgroup message
news:[EMAIL PROTECTED]
> Hi
> I have a matrix with 30 observations and roughly 3 variables, each
> obs belongs to one of two groups. With svm and slda I get into memory
> troubles ('cannot allocate vector of size' roughly 2G). PCA LDA runs
> fine. Is there any way to work around the memory issue with SVMs? Or
> can you recommend any other classification method for such huge datasets?
>
>
> P.S. I run suse 9.1 on a 2G RAM PIV machine.
> thanks for a hint
>
> Christoph
Re: [R] classification trees
[EMAIL PROTECTED] wrote:

> I'm working with S-Plus 6 in Windows. Does anyone know if the prune.tree
> or prune.misclass function automatically cross-validates or do you have
> to use cv.tree if you want to do cross-validation?

This mailing list is about R. There is, e.g., the s-news list for
questions related to S-PLUS.

Uwe Ligges

> Heather
Re: [R] classification and association rules in R
Rong-En Fan wrote:
> By the way, I heard that there are some people developing a better
> search interface for R (or CRAN?). Where can I get the related
> information?

Strangely enough, by following the "Search" link on CRAN.

Jason
Re: [R] classification with quantitative variables
> "OlivierM" == Martin Olivier <[EMAIL PROTECTED]>
> on Tue, 12 Aug 2003 14:45:58 + writes:

    OlivierM> I want to conduct a cluster analysis with
    OlivierM> quantitative variables. More precisely, it
    OlivierM> concerns binary and non-ordered categorical
    OlivierM> variables. For such data, various similarity
    OlivierM> measures have been proposed, such as the Jaccard
    OlivierM> index or the simple matching index.

    OlivierM> So, is there a package such as mva or multiv in
    OlivierM> the case of quantitative variables? Could you
    OlivierM> indicate me reviews, papers or technical reports
    OlivierM> dealing with this problem?

The package 'cluster' has a function daisy() that lets you work with
combinations of "all" kinds of variables.

Note that I think you mistyped "quantitative" where you meant
"qualitative".

Regards,
Martin Maechler <[EMAIL PROTECTED]>    http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16    Leonhardstr. 27
ETH (Federal Inst. Technology)    8092 Zurich    SWITZERLAND
phone: x-41-1-632-3408    fax: ...-1228    <><
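A small sketch of the daisy() suggestion for purely categorical data, assuming a made-up data frame of factors (the data, the choice of average linkage and k = 2 are all illustrative). With nominal factors, the Gower dissimilarity reduces to the fraction of variables on which two rows disagree, that is, one minus the simple matching coefficient:

```r
library(cluster)

# Hypothetical qualitative data: every column is a factor.
d <- data.frame(
  colour = factor(c("red", "red", "blue", "blue", "green")),
  shape  = factor(c("round", "round", "square", "round", "square")),
  smoker = factor(c("yes", "no", "no", "no", "yes"))
)

# Gower dissimilarity; factors are treated as nominal variables.
diss <- daisy(d, metric = "gower")

# Rows 1 and 2 differ in one of three variables.
as.matrix(diss)[1, 2]

# The result is a 'dist'-like object, so it can be passed to hclust().
hc <- hclust(diss, method = "average")
groups <- cutree(hc, k = 2)
```

For asymmetric binary variables (where a shared absence should not count as a match, as in the Jaccard index), the daisy() help page describes the type argument; that case is not covered by this sketch.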