[R] Random Forest, Giving More Importance to Some Data

2013-03-24 Thread Lorenzo Isella
Dear All, I am using randomForest to predict the final selling price of some items. As often happens, I have a lot of (noisy) historical data, but the question is not so much about data cleaning. The data for which I need to carry out some predictions are fairly recent sales or even

[R] Parallelizing GBM

2013-03-24 Thread Lorenzo Isella
Dear All, I am far from being a guru about parallel programming. Most of the time, I rely on randomForest for data mining large datasets. I would also like to give a try to the gradient boosted methods in GBM, but I need parallelization. I normally rely on gbm.fit for speed reasons,

Re: [R] Parallelizing GBM

2013-03-24 Thread Max Kuhn
See this: https://code.google.com/p/gradientboostedmodels/issues/detail?id=3 and this: https://code.google.com/p/gradientboostedmodels/source/browse/?name=parallel Max On Sun, Mar 24, 2013 at 7:31 AM, Lorenzo Isella lorenzo.ise...@gmail.com wrote: Dear All, I am far from being a guru

Re: [R] boxplot

2013-03-24 Thread John Kane
Unless you have a really large number of wells I'd just use the brute force approach of reading in each data set with a simple read.table or read.csv, like a well1 <- read.csv("well1.csv") type of statement, and repeat for each well. Here is a simple example that may give you an idea

Re: [R] Parallelizing GBM

2013-03-24 Thread Lorenzo Isella
Thanks a lot for the quick answer. However, from what I see, the parallelization affects only the cross-validation part in the gbm interface (but it changes nothing when you call gbm.fit). Am I missing anything here? Is there any fundamental reason why gbm.fit cannot be parallelized?

Re: [R] Parallelizing GBM

2013-03-24 Thread Mxkuhn
Yes, I think the second link is a test build of a parallelized cv loop within gbm(). On Mar 24, 2013, at 9:28 AM, Lorenzo Isella lorenzo.ise...@gmail.com wrote: Thanks a lot for the quick answer. However, from what I see, the parallelization affects only the cross-validation part in the

[R] Creating a boxplot from a data summary

2013-03-24 Thread Josh Hall
Hi, I'm trying to create a boxplot from the summary of a large data set and I'm having trouble finding any way to do this. I'm familiar with, but by no means good at, using R, so the only two websites I've found pertaining to this issue have been way over my head. I was hoping for a simple set

Re: [R] LOOCV over SVM,KNN

2013-03-24 Thread Nicolás Sánchez
Thank you very much! Your help has been very useful! Regards! 2013/3/23 mxkuhn mxk...@gmail.com train() in caret. See http://caret.r-forge.r-project.org/ Also, the C5.0 function in the C50 package is much more effective than J48. Max On Mar 23, 2013, at 2:57 PM, Nicolás Sánchez

Re: [R] Creating a boxplot from a data summary

2013-03-24 Thread Robert Baer
On 3/24/2013 11:39 AM, Josh Hall wrote: Hi, I'm trying to create a boxplot from the summary of a large data set and I'm having trouble finding any way to do this. I'm familiar with, but by no means good at, using R, so the only two websites I've found pertaining to this issue have been way over
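
For readers landing on this thread: the reply above is truncated in this listing, so the following is not the poster's answer, only a minimal sketch of one way to draw a boxplot from summary statistics in base R. It assumes the five numbers (minimum, first quartile, median, third quartile, maximum) are already known, uses the minimum and maximum as whisker ends because no outlier information is available, and passes them to bxp() from the graphics package. All values and the names `five_num` and `box_summary` are made up for illustration.

```r
# Hypothetical five-number summary: min, 1st quartile, median, 3rd quartile, max
five_num <- c(2.1, 4.0, 5.3, 6.8, 9.9)

# bxp() draws from a list shaped like the return value of boxplot();
# 'stats' needs one column per box, 'n' the (assumed) number of observations.
box_summary <- list(
  stats = matrix(five_num, ncol = 1),
  n     = 100,            # assumed sample size, only needed for notches/varwidth
  out   = numeric(0),     # no individual outliers are known from a summary
  group = numeric(0),
  names = "sample"
)

pdf(NULL)                 # draw to a null device so the sketch runs headless
bxp(box_summary, main = "Boxplot from summary statistics")
invisible(dev.off())
```

With several summaries, each one would become an additional column of `stats`, with matching entries in `n` and `names`.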

[R] Rscript does not load/capture all shell arguments

2013-03-24 Thread Paulo van Breugel
Hi, I am working on a GRASS script (bash script), which should run an R script. I am working on Ubuntu 12.10, with R 2.15.3 and GRASS GIS 7.0 (I am not sure the latter is really relevant, as the GRASS script is just a bash script). The R script is invoked with a call to Rscript ($RGRASSSCRIPT is

Re: [R] Random Forest, Giving More Importance to Some Data

2013-03-24 Thread Wensui Liu
Your question doesn't seem to be specifically related to either R or random forest. Instead, it is about how to assign weights to training observations. On Sun, Mar 24, 2013 at 6:43 AM, Lorenzo Isella lorenzo.ise...@gmail.com wrote: Dear All, I am using randomForest to predict the final selling

Re: [R] boxplot

2013-03-24 Thread Janh Anni
Hello John, Thank you so much for your kind assistance and the detailed descriptions. I will play with the scripts and see which one is the easiest that serves the purpose. Best regards, Janh On Sun, Mar 24, 2013 at 7:50 AM, John Kane jrkrid...@inbox.com wrote: Unless you have a really

Re: [R] Ordering a matrix by row value in R2.15

2013-03-24 Thread Pete Brecknock
fitz_ra wrote I know this is posted a lot, I've been through about 40 messages reading how to do this so let me apologize in advance because I can't get this operation to work unlike the many examples shown. I have a 2 row matrix temp [,1] [,2] [,3] [,4] [,5]

Re: [R] Integrate with vectors and varying upper limit

2013-03-24 Thread Pete Brecknock
sunny0 wrote I'd like to integrate vectors 't' and 'w' for log(w)/(1-t)^2 where I can vary the upper limit of the integral to change with each value of 't' and 'w', and then put the output into another vector. So, something like this... w=c(.33,.34,.56) t=c(.2,.5,.1) k<-c(.3,.4,.5)

Re: [R] Ordering a matrix by row value in R2.15

2013-03-24 Thread soon yi
or this with Pete's example orig[,order(orig[2,])] Pete Brecknock wrote fitz_ra wrote I know this is posted a lot, I've been through about 40 messages reading how to do this so let me apologize in advance because I can't get this operation to work unlike the many examples shown. I

Re: [R] Ordering a matrix by row value in R2.15

2013-03-24 Thread William Dunlap
fitz_ra no address I want to order the matrix using the second row in ascending order. From the many examples (usually applied to columns) the typical solution appears to be: temp[order(temp[2,]),] Error: subscript out of bounds That tries to reorder the rows of temp according the
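
The point made in the replies above can be sketched as follows. This is an illustration only: the matrix values are invented and stand in for the poster's `temp`. Putting order(temp[2, ]) in the row position fails because it returns a permutation of the column indices (1 to 5 here) while the matrix has only two rows; putting it in the column position reorders the columns by the second row.

```r
# Hypothetical 2-row matrix standing in for the poster's 'temp'
temp <- matrix(c(5,   3,   9,   1,   7,
                 0.4, 0.1, 0.5, 0.3, 0.2),
               nrow = 2, byrow = TRUE)

# Permutation of column indices that sorts row 2 in ascending order
idx <- order(temp[2, ])

# Index the COLUMNS (after the comma), not the rows
sorted <- temp[, idx]

# The failing form from the question, kept inside try() so the script continues:
# idx contains values up to 5, but temp has only 2 rows.
bad <- try(temp[idx, ], silent = TRUE)
```

Here `bad` holds the "subscript out of bounds" error, and `sorted` has its second row in ascending order with the first row carried along column by column.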

[R] a contrast question

2013-03-24 Thread Erin Hodgess
Dear R People: I have the following in a file: resp factA factB 39.5 low B- 38.6 high B- 27.2 low B+ 24.6 high B+ 43.1 low B- 39.5 high B- 23.2 low B+ 24.2 high B+ 45.2 low B- 33.0 high B- 24.8 low B+ 22.2 high B+ and I construct the data frame: collard.df <-

Re: [R] a contrast question

2013-03-24 Thread Erin Hodgess
I found the solution: http://stats.stackexchange.com/questions/12993/how-to-setup-and-interpret-anova-contrasts-with-the-car-package-in-r Sorry for the trouble. On Sun, Mar 24, 2013 at 8:58 PM, Erin Hodgess erinm.hodg...@gmail.com wrote: Dear R People: I have the following in a file: resp

[R] Error with paired t-test

2013-03-24 Thread Charlotte Rayner
This error keeps appearing when I perform a paired t-test in R: Error in t.test.default(payoff, paired = T) : 'y' is missing for paired test This is the method I have used: read.table("MeanPayoff.txt",header=T) Open Closed1 47.5 42.37502 49.25000 50.3 50.0 49.80004

Re: [R] Clip a contour with shapefile while using contourplot

2013-03-24 Thread Paul Murrell
Hi Below is some code that does what I think you want by drawing a path based on the map data. This does some grubby low-level work with the 'sp' objects that someone else may be able to tidy up # The 21st polygon in 'hello' is the big outer boundary # PLUS the 20 other inner holes map <-

Re: [R] Error with paired t-test

2013-03-24 Thread Pascal Oettli
Hi, The error message is explicit enough. You need 'y' for the paired test. with(payoff, t.test(Open, Closed1, paired=TRUE)) HTH, Pascal On 25/03/13 07:42, Charlotte Rayner wrote: This error keeps appearing when i perform a paired t-test in R Error in t.test.default(payoff, paired = T) :
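
As a small illustration of the fix suggested above, here is a sketch with made-up numbers (the poster's actual data are garbled in this listing, so the values below are purely hypothetical; only the column names `Open` and `Closed1` come from the thread). A paired test needs both samples passed as `x` and `y`; passing the whole data frame as a single argument is what triggers "'y' is missing for paired test".

```r
# Hypothetical data; only the column names are taken from the thread
payoff <- data.frame(
  Open    = c(47.5, 49.3, 50.0, 48.2, 51.1),
  Closed1 = c(42.4, 50.3, 49.8, 46.0, 50.5)
)

# Paired t-test: supply both columns explicitly
res <- with(payoff, t.test(Open, Closed1, paired = TRUE))

# Equivalent formulation: one-sample t-test on the within-pair differences
res_diff <- t.test(payoff$Open - payoff$Closed1)

print(res)
```

Note that read.table() on its own only prints the data; its result has to be assigned (for example to `payoff`) before the columns can be used in the test.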