[R] Regular Expressions

2010-11-05 Thread Noah Silverman
Hi, I'm trying to figure out how to use capturing parentheses in regular expressions in R. (Doing this in Perl, Java, etc. is fairly trivial, but I can't seem to find the functionality in R.) For example, given the string: 10 Nov 13.00 (PFE1020K13) I want to capture the first two digits
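
A minimal sketch of one way to capture groups in base R, using the example string from the post. The pattern and the variable names (first_two, day, month, price) are illustrative assumptions, not taken from the replies in the thread.

```r
# Example string taken from the post above
s <- "10 Nov 13.00 (PFE1020K13)"

# sub() with a backreference: replace the whole string by the first group
first_two <- sub("^([0-9]{2}).*$", "\\1", s)

# regexec() + regmatches() return the full match followed by each group
m <- regmatches(s, regexec("^([0-9]{2}) ([A-Za-z]{3}) ([0-9.]+)", s))[[1]]
day   <- m[2]              # first parenthesised group
month <- m[3]              # second group
price <- as.numeric(m[4])  # third group, converted to a number
```

sub() is convenient when only one group is needed; regexec() with regmatches() returns every group at once, which is closer to how Perl exposes captures.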

Re: [R] Regular Expressions

2010-11-05 Thread Noah Silverman
: On Thu, 4 Nov 2010, Noah Silverman wrote: Hi, I'm trying to figure out how to use capturing parentheses in regular expressions in R. (Doing this in Perl, Java, etc. is fairly trivial, but I can't seem to find the functionality in R.) For example, given the string: 10 Nov 13.00

[R] Populating then sorting a matrix and/or data.frame

2010-11-10 Thread Noah Silverman
Hi, I have a process in R that produces a lot of output. My plan was to build up a matrix or data.frame row by row, so that I'll have a nice object with all the resulting data. I started with: results <- matrix(ncol=3) names(results) <- c("one", "two", "three") Then, when looping through the data:

Re: [R] Populating then sorting a matrix and/or data.frame

2010-11-10 Thread Noah Silverman
That was a typo. It should have read: results[results$one 100,] It does still fail. There is ONE column that is text. So my guess is that R is seeing that and assuming that the entire data.frame should be factors. -N On 11/10/10 11:16 PM, Michael Bedward wrote: Hello Noah, If you set

Re: [R] Populating then sorting a matrix and/or data.frame

2010-11-11 Thread Noah Silverman
That makes perfect sense. Since I need to build up the results table sequentially as I iterate through the data, how would you recommend it?? Thanks, -N On 11/11/10 12:03 AM, Michael Bedward wrote: All values in a matrix are the same type, so if you've set up a matrix with a character

Re: [R] Populating then sorting a matrix and/or data.frame

2010-11-11 Thread Noah Silverman
Still doesn't work. When using rbind to build the data.frame, I get a structure mostly full of NA. The data is correct, so something about pushing into the data.frame is breaking. Example code: results <- data.frame() for(i in 1:n){ #do all the work #a is a test label. b,c,d are

Re: [R] Populating then sorting a matrix and/or data.frame

2010-11-11 Thread Noah Silverman
Langfelder Sent: Thursday, November 11, 2010 12:25 PM To: Noah Silverman Cc: r-help@r-project.org Subject: Re: [R] Populating then sorting a matrix and/or data.frame On Thu, Nov 11, 2010 at 11:33 AM, Noah Silverman n...@smartmediacorp.com wrote: Still doesn't work. When using rbind to build

Re: [R] Populating then sorting a matrix and/or data.frame

2010-11-11 Thread Noah Silverman
That makes perfect sense. All of my numbers are being coerced into strings by the c() function. Subsequently, my data.frame contains all strings. I can't know the length of the data.frame ahead of time, so can't predefine it like your example. One thought would be to make it arbitrarily long

Re: [R] Populating then sorting a matrix and/or data.frame

2010-11-11 Thread Noah Silverman
David, Great solution. While a bit longer to enter, it lets me explicitly define a type for each column. Thanks!!! -N On 11/11/10 4:02 PM, David Winsemius wrote: On Nov 11, 2010, at 6:38 PM, Noah Silverman wrote: That makes perfect sense. All of my numbers are being coerced
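
The coercion problem discussed in this thread can be sketched as follows. This is only an illustration under assumed column names (label, one, two) and made-up values; it is not necessarily the solution David posted, which is cut off in the snippet above. Building each row as a one-row data.frame keeps a type per column, whereas c() would coerce everything to character.

```r
n <- 5
rows <- vector("list", n)  # collect rows in a list; length can also grow as needed

for (i in seq_len(n)) {
  # each row is a one-row data.frame, so every column keeps its own type
  rows[[i]] <- data.frame(label = paste0("test", i),
                          one   = i * 50,
                          two   = i / 2,
                          stringsAsFactors = FALSE)
}

# bind all rows once at the end instead of calling rbind() inside the loop
results <- do.call(rbind, rows)

big    <- results[results$one > 100, ]      # numeric comparison now works
sorted <- results[order(-results$two), ]    # sort descending by a numeric column
```

Binding once at the end also avoids copying the growing data.frame on every iteration.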

[R] Can't invert matrix

2010-11-20 Thread Noah Silverman
Hi, I'm trying to use the solve() function in R to invert a matrix. I get the following error: Lapack routine dgesv: system is exactly singular. However, my matrix doesn't appear to be singular. [,1] [,2] [,3] [,4] [1,] 0.99252358 0.93715047 0.7540535 0.4579895

[R] Counting things in a time series.

2010-11-20 Thread Noah Silverman
Hi, I have a process (not in R) that records events with a time stamp. So, I have a huge series of maybe 100,000 time stamps. I'd like to break it up into hourly (Or daily) intervals and then count how many events occurred in each interval. That way I can graph it. Ideally, converting the
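
A minimal sketch of counting events per interval in base R. The four time stamps are made up for illustration, and the UTC time zone is an assumption.

```r
# Hypothetical event time stamps (the real data would be read from the log)
stamps <- as.POSIXct(c("2010-11-20 09:05:00", "2010-11-20 09:40:00",
                       "2010-11-20 10:15:00", "2010-11-20 12:59:00"),
                     tz = "UTC")

# cut() on date-times accepts an interval name; empty hours are kept as levels
hourly <- table(cut(stamps, breaks = "hour"))

# daily counts: tabulate the calendar date of each stamp
daily <- table(format(stamps, "%Y-%m-%d"))

# plot(hourly) or barplot(hourly) would then graph the counts
```

Because empty intervals are retained as factor levels, the hour with no events still appears in the table with a count of zero.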

Re: [R] Can't invert matrix

2010-11-20 Thread Noah Silverman
] -2.132894e-08 2.128452e-08 When dealing with numerical matrices you have to be prepared for the unexpected. Bill Venables. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Noah Silverman Sent: Sunday, 21 November 2010 2

[R] More detail in chart axis?

2010-11-23 Thread Noah Silverman
Hi, I have a series of data (about 80,000 pairs of x,y). Plotting it shows a great chart. However, R has randomly chosen about 6 labels for my x axis. Now, clearly I can't show them all, but would like some control over the granularity of what is displayed. I can't find anything in the

[R] Possible command line bug.

2009-10-13 Thread Noah Silverman
I think I've come across a bug in the command line switches. From R --help: --vanilla Combine --no-save, --no-restore, --no-site-file, --no-init-file and --no-environ --slave Make R run as quietly as possible -q, --quiet

[R] Converting dataframe to matrix

2009-10-16 Thread Noah Silverman
Hi, I'm experimenting with a few learners that require a matrix as their input. (Currently svmpath, vbmp, etc.) I currently have a dataframe with 50 columns and 20,000 rows. I tried using: x <- as.matrix(my_data.frame) If I then ask is.matrix(x), I get TRUE. However everywhere I've tried

Re: [R] Converting dataframe to matrix

2009-10-16 Thread Noah Silverman
: On Fri, Oct 16, 2009 at 01:33:14AM -0700, Noah Silverman wrote: Hi, I'm experimenting with a few learners that require a matrix as their input. (Currently svmpath, vbmp, etc.) I currently have a dataframe with 50 columns and 20,000 rows. I tried using: x <- as.matrix(my_data.frame) If I

[R] Different way of scaling data

2009-10-16 Thread Noah Silverman
Hi, I have a data.frame that I need to scale. I've been using the scale function and it works nicely. Some of the libraries I'm testing won't accept negative values for data, so I need to find a way to scale the data from 0 to 1. Any ideas? Thanks!
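
One common way to do this is min-max rescaling, sketched below. The helper name range01 and the toy data frame are assumptions for illustration.

```r
# Rescale a numeric vector linearly so that min -> 0 and max -> 1
range01 <- function(x) {
  lo <- min(x, na.rm = TRUE)
  hi <- max(x, na.rm = TRUE)
  (x - lo) / (hi - lo)   # note: a constant column gives 0/0 = NaN
}

df <- data.frame(a = c(-2, 0, 2), b = c(10, 15, 30))

# apply the helper to every column and rebuild the data.frame
scaled <- as.data.frame(lapply(df, range01))
```

Unlike scale(), this does not centre on the mean, so all results are non-negative.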

[R] Best SVM Performance measure?

2009-10-19 Thread Noah Silverman
Hi, This is probably going to be one of those "It depends what you want" kinds of answers, but I'm very curious to see if the group has an opinion or some general suggestions. The actual experiment is too complicated for a quick e-mail, but I'll summarize well enough (hopefully) to get the

[R] Best SVM Performance measure?

2009-10-20 Thread Noah Silverman
Hi, This is probably going to be one of those "It depends what you want" kinds of answers, but I'm very curious to see if the group has an opinion or some general suggestions. The actual experiment is too complicated for a quick e-mail, but I'll summarize well enough (hopefully) to get the

[R] Data format for KSVM

2009-10-23 Thread Noah Silverman
Hi, I have a process using svm from the e1071 library. It works. I want to try using the KSVM library instead. The same data used with e1071 gives me an error with KSVM. My data is a data.frame. Sample code: svm_formula <- formula(y ~ a + B + C) svm_model <- ksvm(formula, data=train_data,

Re: [R] RMySql problem

2009-10-23 Thread Noah Silverman
Hi, It looks like you are potentially dealing with two separate issues. 1) Access - MySQL has very fine-grained permissions as to who can access what and from where. You need to make sure that your username in MySQL is allowed to access the database/tables from your location. 2) Corruption

[R] Stratified Maximum Likelihood

2009-10-30 Thread Noah Silverman
Hi, I've searched rseek.org high and low and can't seem to find an answer to this. I want to maximize likelihood for a set of training data, but the data is grouped. (Think multiple trials.) It would probably be possible to do this with some nested for loops manually, but would be painfully

[R] Avoiding for loops

2009-11-02 Thread Noah Silverman
Hi, I'm trying to normalize some data. My data is organized by groups. I want to normalize PER GROUP as opposed to over the entire data set. The current double loop that I'm using takes almost an hour to run on about 30,000 rows of data in 2,500 groups. I'm currently doing this:

Re: [R] Avoiding for loops

2009-11-02 Thread Noah Silverman
/ data$sum data .. or even transform(data, norm=ave(y, group, FUN = function(x) x/sum(x))) I hope it helps. Best, Dimitris Noah Silverman wrote: Hi, I'm trying to normalize some data. My data is organized by groups. I want to normalize PER GROUP as opposed to over
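
The ave() approach quoted above can be tried on a small example. The toy data frame below is an assumption; the column names group and y follow the quoted call.

```r
data <- data.frame(group = c("a", "a", "b", "b", "b"),
                   y     = c(1, 3, 2, 2, 4))

# ave() applies FUN within each group and returns a vector aligned with the rows
data$norm <- ave(data$y, data$group, FUN = function(x) x / sum(x))

# sanity check: normalised values should sum to 1 inside every group
group_totals <- tapply(data$norm, data$group, sum)
```

Because ave() is vectorised over groups, no explicit double loop is needed.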

Re: [R] Binning Question

2010-04-12 Thread Noah Silverman
David, That helps me a lot. Thanks!!! -N On 4/12/10 9:06 PM, David Winsemius wrote: dat <- as.data.frame(matrix( rnorm(200), 100 , 2)) # bivariate normal n=100 ab <- matrix( c(-5,-5,5,5), 2, 2) # interval [-5,5) x [-5,5) nbin <- c( 20, 20) # 400 bins bins <- bin2(dat, ab, nbin) # bin

[R] Results from clogit out of range?

2010-04-20 Thread Noah Silverman
Hi, I'm calculating a conditional logit on some data stratified by group. My understanding was that a conditional logit by definition returns a value between 0 and 1 as a probability. Can anyone suggest why I'm seeing results outside of the {0,1} range?? The call in R is: m <- clogit(score ~

Re: [R] Results from clogit out of range?

2010-04-20 Thread Noah Silverman
Thanks David, That explains a lot. I appreciate it. -- Noah On 4/20/10 3:48 PM, David Winsemius wrote: On Apr 20, 2010, at 5:59 PM, Noah Silverman wrote: Hi, I'm calculating a conditional logit on some data stratified by group. My understanding was that a conditional logit

Re: [R] Results from clogit out of range?

2010-04-20 Thread Noah Silverman
On 4/20/10 4:22 PM, Noah Silverman wrote: Thanks David, That explains a lot. I appreciate it. -- Noah On 4/20/10 3:48 PM, David Winsemius wrote: On Apr 20, 2010, at 5:59 PM, Noah Silverman wrote: Hi, I'm calculating a conditional logit on some data stratified by group. My

[R] Summarizing counts by multiple factors

2010-05-11 Thread Noah Silverman
Hi, An example data set is (columns: group, level, color): A 1 blue; A 1 Red; B 1 blue; B 2 Red; A 2 Red; B 2 Red; B 2 blue; B 2 blue; A 2 blue; A 2 Red. I'd like to
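
The request is cut off above, so the following is only a sketch under the assumption (suggested by the subject line) that the goal is a count for every combination of the three factors. The data frame reproduces the ten rows of the example.

```r
d <- data.frame(
  group = c("A", "A", "B", "B", "A", "B", "B", "B", "A", "A"),
  level = c(1, 1, 1, 2, 2, 2, 2, 2, 2, 2),
  color = c("blue", "Red", "blue", "Red", "Red", "Red",
            "blue", "blue", "blue", "Red")
)

# table() on a data.frame cross-tabulates all of its columns
counts <- table(d)

# long format: one row per group/level/color combination with its count
flat <- as.data.frame(counts, responseName = "n")
```

ftable(counts) prints the same three-way table in a flatter layout that is easier to read.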

[R] Discretize factors?

2010-05-15 Thread Noah Silverman
Hi, I'm looking for an easy way to discretize factors in R. I've noticed that the lm function does this automatically with a nice result. If I have group <- c("A", "B", "B", "C", "C", "C") and run: lm(result ~ x1 + group) The lm function has split the group into separate binary variables {0,1} before

Re: [R] Discretize factors?

2010-05-16 Thread Noah Silverman
to lm, but not actually doing any regression? Thanks again! -N On 5/15/10 11:17 AM, Thomas Stewart wrote: Maybe this? group <- factor(c("A", "B", "B", "C", "C", "C")) model.matrix(~0+group) -tgs On Sat, May 15, 2010 at 2:02 PM, Noah Silverman n...@smartmediacorp.com wrote
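
Thomas's model.matrix() suggestion can be run as below. Renaming the columns to the bare level names is an added assumption about what tidier labels might look like.

```r
group <- factor(c("A", "B", "B", "C", "C", "C"))

# ~ 0 + group drops the intercept, so every level gets its own 0/1 column
dummies <- model.matrix(~ 0 + group)

# default column names are "groupA", "groupB", ...; use the plain levels instead
colnames(dummies) <- levels(group)
```

This is the same indicator coding lm() builds internally, except that lm() keeps an intercept and therefore drops one reference level.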

Re: [R] Discretize factors?

2010-05-16 Thread Noah Silverman
1 0 4 4 3 0 0 1 5 5 4 0 0 1 6 6 5 0 0 1 Any ideas? -N On 5/15/10 11:02 AM, Noah Silverman wrote: Hi, I'm looking for an easy way to discretize factors in R

Re: [R] Discretize factors?

2010-05-16 Thread Noah Silverman
I could, but with close to 100 columns, it's messy. On 5/16/10 11:22 AM, Peter Ehlers wrote: On 2010-05-16 11:06, Noah Silverman wrote: Update, I have it working, but now it's producing really ugly labels. Must be a small adjustment to the code. Any ideas?? ##Create example data.frame

[R] Building a list

2010-05-30 Thread Noah Silverman
Hello, I need to build a list of lists. We have 20 groups we are generating MCMC samples for. There are 10 coefficients, and 1 MCMC iterations. I would like to store each iteration by-group in a list. My problem is with the first iteration. Here is a toy example: Chain <- list() for (j in

Re: [R] Building a list

2010-05-30 Thread Noah Silverman
]], coef) If it does, this has the additional advantage that it tends to be faster to initialize the list at size rather than expanding it as needed. HTH, Josh On Sun, May 30, 2010 at 2:52 PM, Noah Silverman n...@smartmediacorp.com wrote: Hello, I need to build a list of lists We

Re: [R] Building a list

2010-05-30 Thread Noah Silverman
) } } On Sun, May 30, 2010 at 6:05 PM, Noah Silverman n...@smartmediacorp.com wrote: That would be great, except I just realized I made a typo when sending my code. I'm tracking 20 coefficients for 10 groups. So I need a top list of 10 groups. Then each of the 10,000 samples for each of the 20

[R] Comparing multiple columns in matrix

2010-05-31 Thread Noah Silverman
We're running Monte Carlo repeated measures for several groups. The goal is to determine the number of times each group has the highest score. A toy example: [,1] [,2] [,3] 0.1 0.2 0.3 0.1 0.2 0.3 0.1 0.2 0.3 0.1 0.3 0.2 0.1 0.3 0.2 0.2 0.3 0.1 For this example:
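
A minimal sketch of one way to count the winners, using the six rows of the toy example. The object names (scores, winners, wins) are assumptions.

```r
# One row per Monte Carlo run, one column per group (values from the post)
scores <- matrix(c(0.1, 0.2, 0.3,
                   0.1, 0.2, 0.3,
                   0.1, 0.2, 0.3,
                   0.1, 0.3, 0.2,
                   0.1, 0.3, 0.2,
                   0.2, 0.3, 0.1),
                 ncol = 3, byrow = TRUE)

# column index of the highest score in each row
winners <- apply(scores, 1, which.max)

# tabulate with explicit levels so groups that never win still show a 0
wins <- table(factor(winners, levels = seq_len(ncol(scores))))
```

For a large matrix, max.col(scores, ties.method = "first") gives the same indices without an explicit apply().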

[R] Fancy Page layout

2010-05-31 Thread Noah Silverman
Hi, Working on a report that is going to have a large number of graphs and summaries. We have 80 groups with 20 variables each. Ideally, I'd like to produce ONE page for each group. It would have two columns of 10 graphs and then the 5 number summary of the variables at the bottom. So, perhaps

Re: [R] Fancy Page layout

2010-05-31 Thread Noah Silverman
Lattice looks nice, but how can I put some summary text at the bottom? On 5/31/10 11:27 AM, RICHARD M. HEIBERGER wrote: Use lattice. require(lattice) ?lattice ?xyplot

[R] Plot multiple columns

2010-06-01 Thread Noah Silverman
I'm running a long MCMC chain that is generating samples for 22 variables. I have each run of the chain as a row in a matrix. So: Chain[,1] is the column with all the samples for variable one. Chain[,2] is the column with all the samples for variable 2, etc. I'd like to fit all 22 on a single

Re: [R] Plot multiple columns

2010-06-01 Thread Noah Silverman
Hi, I used the term "run", as each iteration of the Gibbs sampler produces 22 variables (coefficients for Beta in a regression model). The example won't work On 6/1/10 5:54 AM, Ben Bolker wrote: Noah Silverman noah at smartmediacorp.com writes: I'm running a long MCMC chain

Re: [R] Fancy Page layout

2010-06-01 Thread Noah Silverman
AM, Noah Silverman wrote: Hi, Working on a report that is going to have a large number of graphs and summaries. We have 80 groups with 20 variables each. Ideally, I'd like to produce ONE page for each group. It would have two columns of 10 graphs and then the 5 number summary

[R] textbox in lattice

2010-06-01 Thread Noah Silverman
Hi, I want to add a box at the bottom of a lattice window (device/page?). Lattice has drawn a nice group of panels with all the plots I need. How do I add my own summary text at the bottom (several lines' worth)?

Re: [R] Plot multiple columns

2010-06-01 Thread Noah Silverman
On Tue, Jun 1, 2010 at 9:37 AM, Noah Silverman n...@smartmediacorp.com wrote: Hi, I used the term "run", as each iteration of the Gibbs sampler produces 22 variables (coefficients for Beta in a regression model). The example won't work

Re: [R] Plot multiple columns

2010-06-01 Thread Noah Silverman
, 2010 at 10:51 AM, Noah Silverman n...@smartmediacorp.com wrote: You are correct, I initially missed the as.mcmc step. Without it, R doesn't want to squeeze so many plots onto a page. I've had that problem before with lattice...which makes me wonder

Re: [R] textbox in lattice

2010-06-01 Thread Noah Silverman
::splitTextGrob function might help. HTH, baptiste On 1 June 2010 19:37, Noah Silverman n...@smartmediacorp.com wrote: Hi, I want to add a box at the bottom of a lattice window (device/page?). Lattice has drawn a nice group of panels with all the plots I need. How do I add my own summary text

Re: [R] textbox in lattice

2010-06-01 Thread Noah Silverman
- matrix(runif(2200),ncol=22) m <- as.mcmc(x) p = xyplot(m, layout = c(2, 11)) pdf(,height=15) arrange(p, tableGrob(as.matrix(summary(iris)), theme=theme.white()), heights= unit(c(3,1),"null")) dev.off() HTH, baptiste On 1 June 2010 20:52, Noah Silverman n...@smartmediacorp.com wrote

[R] Decision values from KSVM

2010-06-11 Thread Noah Silverman
Hi, I'm working on a project using the kernlab library. For one phase, I want the decision values from the SVM prediction, not the class label. The e1071 library has this function, but I can't find the equivalent in ksvm. In general, when an SVM is used for classification, the label of an

[R] Sim function

2010-06-12 Thread Noah Silverman
I'm reading Gelman's book Data Analysis Using Regression and Multilevel/Hierarchical Models. In Chapter 7 (and later), he makes frequent reference to a function named sim. I can't find the function anywhere, not in my standard R install, or in any of the packages. Does anyone have a

[R] Factoring a variable

2010-06-17 Thread Noah Silverman
Hi, I have a dataset where the results are coded ("yes", "no"). We want to do some machine learning with SVM to predict the "yes" outcome. My problem is that if I just use the as.factor function to convert, then it reverses the levels. -- x <- c("no", "no", "no", "yes", "yes", "no", "no")
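
as.factor() orders levels alphabetically, which is why "no" comes first. A minimal sketch of setting the order explicitly, using the vector from the post; which level should come first for a particular SVM routine is an assumption to check against that routine's documentation.

```r
x <- c("no", "no", "no", "yes", "yes", "no", "no")

f_default <- as.factor(x)                  # levels sorted alphabetically: no, yes

# state the desired order explicitly
f <- factor(x, levels = c("yes", "no"))

# or move one level to the front of an existing factor
f2 <- relevel(f_default, ref = "yes")
```

The underlying integer codes follow the level order, so "yes" is coded 1 in f and f2 but 2 in f_default.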

[R] accessing return variables from a function

2010-07-09 Thread Noah Silverman
Hi, I am trying to figure out a short way to access two values output from the sort function. x <- c(3,4,3,6,78,3,1,2) sort(x, index.return=T) $x [1] 1 2 3 3 3 4 6 78 $ix [1] 7 8 1 3 6 2 4 5 It would be great to do something like this (doesn't work): c(y, indexes) <- sort(x,

Re: [R] interpretation of svm models with the e1071 package

2010-07-09 Thread Noah Silverman
Steve, Couldn't he also just use the decision.value property to see the equivalent of t(x) %*% b for each row? -N On 7/9/10 7:11 PM, Steve Lianoglou wrote: Hi, On Fri, Jul 9, 2010 at 12:15 PM, manuel.martin manuel.mar...@orleans.inra.fr wrote: Dear all, after having calibrated a svm

Re: [R] accessing return variables from a function

2010-07-10 Thread Noah Silverman
Thanks! -N On 7/9/10 2:20 AM, Noah Silverman wrote: Hi, I am trying to figure out a short way to access two values output from the sort function. x <- c(3,4,3,6,78,3,1,2) sort(x, index.return=T) $x [1] 1 2 3 3 3 4 6 78 $ix [1] 7 8 1 3 6 2 4 5 It would be great
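
R has no built-in multiple assignment, so the usual approach is to keep the returned list and pull out its components. A minimal sketch with the vector from the post; the names s, y and indexes are assumptions (the reply that was thanked is not shown here).

```r
x <- c(3, 4, 3, 6, 78, 3, 1, 2)

# sort() with index.return = TRUE returns a list with components $x and $ix
s <- sort(x, index.return = TRUE)
y       <- s$x    # sorted values
indexes <- s$ix   # positions of those values in the original vector

# equivalent route: compute the permutation first, then apply it
ord <- order(x)
y2  <- x[ord]
```

Indexing the original vector by the returned positions reproduces the sorted values.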

[R] Help with Conditional Logit

2009-07-16 Thread Noah Silverman
Hello, I'm brand new to using R. (I've been using Rapid Miner, but would like to move over to R since it gives me much more functionality.) I'm trying to learn how to do a conditional logit model. My data has one dependent variable, 2 independent variables and a group variable. example:

[R] svm works but tune.svm give error

2009-07-18 Thread Noah Silverman
Hello, I'm using the e1071 library for SVM functions. I can quickly train an SVM with: svm(formula = label ~ ., data = testdata) That works well. I want to tune the parameters, so I tried: tune.svm(label ~ ., data=testdata[1:2000, ], gamma=10^(-6:3), cost=10^(1:2)) THIS FAILS WITH AN ERROR:

[R] Normalize data

2009-07-20 Thread Noah Silverman
Hello, I'm coming from RapidMiner, so some of the easy things there are a bit difficult for me to find in R. How do I normalize data in a data frame? Ideally I want to scale the values for each column in the range of (-1,1). Thank You,

[R] Strange Memory issue

2009-07-27 Thread Noah Silverman
Hi, I am testing out some things with the kernlab library. The dataframe is 22,000 rows of 32 columns. The command I execute is: model <- ksvm(label ~ ., data = traindata, type="C-svc", kernel = "rbfdot", class.weights= c("0" =1, "1" =3), kpar = "automatic", C = 10, cross = 3, prob.model = TRUE) I

[R] Forumla format?

2009-07-27 Thread Noah Silverman
Hi, Quick question. I'm working on training an SVM. I have a dataframe with about 50 columns. I want to train on 46 of them. Is there a way to say All except columns 22,23,25 and 31? It would be nice to not have to do +c1 +c2 +c3 +c4, etc for all 48 columns. Thanks! -N

Re: [R] Forumla format?

2009-07-27 Thread Noah Silverman
Hi, I'm not sure that would work for the formula format of an SVM function. The idea is normally svm(label ~ c1 + c2 +c3, data=mydata); It doesn't work to say svm(label ~ -c(22,23,24), data=mydata) On 7/27/09 12:17 PM, Steve Lianoglou wrote: Hi, On Jul 27, 2009, at 3:01 PM, Noah
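
Two base-R ways to express "all columns except some" are sketched below. The toy data frame and the dropped positions are assumptions, and Steve's actual suggestion is cut off in the snippet above.

```r
mydata <- data.frame(label = factor(c("a", "b", "a", "b")),
                     c1 = 1:4, c2 = 4:1, c3 = c(2, 2, 3, 3), c4 = c(0, 1, 0, 1))

drop_cols <- c(3, 5)   # column positions to leave out (here c2 and c4)

# Option 1: drop the columns from the data, then use the "." shorthand,
# e.g. svm(label ~ ., data = train) with e1071 loaded
train <- mydata[, -drop_cols]

# Option 2: build the formula from the remaining column names
keep <- setdiff(names(mydata), c("label", names(mydata)[drop_cols]))
f <- reformulate(keep, response = "label")   # label ~ c1 + c3
```

Option 1 leaves the formula short; option 2 keeps the data untouched and makes the selected predictors explicit.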

[R] Watching tune parameters for SVM?

2009-07-29 Thread Noah Silverman
Hi, I'm switching over from RapidMiner to R. (The learning curve is steep, but there is so much more I can do with R and it runs much faster overall.) In RapidMiner, I can tune a parameter of my svm in a nice cross validation loop. The process will print out the progress as it goes. So for a

[R] scale subset of data

2009-07-31 Thread Noah Silverman
Hi, This should be an easy one, but I have some trouble formatting the data right. I'm trying to replace the column of a subset of a dataframe with the scaled data for that column of the subset: subset(rawdata, code== "foo", select = a) <- scale( subset(rawdata, code== "foo", select = a) ) It

Re: [R] scale subset of data

2009-07-31 Thread Noah Silverman
That works perfectly. Thanks! -N On 7/31/09 2:04 PM, Steve Lianoglou wrote: Hi, On Jul 31, 2009, at 4:13 PM, Noah Silverman wrote: Hi, This should be an easy one, but I have some trouble formatting the data right I'm trying to replace the column of a subset of a dataframe
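
The reply that worked is cut off above, so the following is only a sketch of one common approach: assign through a logical index rather than through subset(), which cannot be used on the left-hand side of an assignment. The toy values are assumptions; the column names code and a come from the question.

```r
rawdata <- data.frame(code = c("foo", "foo", "foo", "bar", "bar"),
                      a    = c(1, 2, 3, 10, 20))

# logical index of the rows belonging to the subset
idx <- rawdata$code == "foo"

# scale() returns a one-column matrix, so flatten it before assigning back
rawdata$a[idx] <- as.numeric(scale(rawdata$a[idx]))
```

Rows outside the subset are left unchanged.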

[R] scale subsets of grouped data in data frame

2009-07-31 Thread Noah Silverman
Hello, I'm trying to duplicate what's an easy process in RapidMiner. In RM, we can simply use two operators: subgroup iteration attribute value selection (Can use a regex for the attribute name.) I can do this in R with a lot of code and manual steps. It would be really nice to

[R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
Hi, I am reading in a dataframe from a CSV file. It has 70 columns. I do not have any kind of unique row id. rawdata <- read.table("r_work/train_data.csv", header=T, sep=",", na.strings=0) When training an svm, I keep getting an error. So, as an experiment, I wrote the data back out to a new

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
Jim, The write.table was simply a diagnostic step. My problem is that R is automatically adding row_names and then shifting my column labels over. (The shifting creates a bunch of related problems.) Thanks for the help. -Noah On 8/2/09 2:22 PM, jim holtman wrote: try 'row.names=FALSE' in

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
The column names have to be obfuscated, but here are 10 rows of the data. label c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15 c16 c17 c18 c19 c20 c21 c22 c23 c24 c25 c26 c27

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
Somehow, my data is still getting mangled. Running the SVM gives me the following error: names attribute[1994] must be the same length as the vector[1950] Any thoughts? -N On 8/2/09 2:35 PM, (Ted Harding) wrote: On 02-Aug-09 21:10:12, Noah Silverman wrote: Hi, I am reading

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
command. But, issuing the same 0 substitution AFTER the scale command makes everything work again. rawdata[is.na(rawdata)] <- 0 VERY strange behavior. -N On 8/2/09 3:57 PM, J Dougherty wrote: On Sunday 02 August 2009 02:34:43 pm Noah Silverman wrote: The column names have to be obfuscated

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
. Additionally, R seems MUCH MUCH faster.) I'm open to ideas. Thanks! -N On 8/2/09 4:14 PM, David Winsemius wrote: On Aug 2, 2009, at 7:02 PM, Noah Silverman wrote: Hi, It seems as if the problem was caused by an odd quirk of the scale function. Some of my data have NA entries. So

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
Just tried your suggestion. rawdata[is.na(rawdata), ] <- 0 It FAILS with the following error: Error in `[<-.data.frame`(`*tmp*`, is.na(rawdata), , value = 0) : non-existent rows not allowed

Re: [R] Strange column shifting with read.table

2009-08-02 Thread Noah Silverman
, at 7:02 PM, Noah Silverman wrote: Hi, It seems as if the problem was caused by an odd quirk of the scale function. Some of my data have NA entries. So, I substitute 0 for any NA with: rawdata[is.na(rawdata)] <- 0 Perhaps this would have done what you intended: rawdata[is.na(rawdata), ] <- 0
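
The difference between the two forms can be seen on a tiny example. The toy data frame is an assumption. is.na() on a data.frame returns a logical matrix; used as the only index it addresses individual cells, whereas used as a row index (with the trailing comma) it is not a valid row selector.

```r
rawdata <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 6))

na_cells <- is.na(rawdata)   # logical matrix, same shape as the data.frame

# matrix-style indexing: replaces exactly the NA cells
rawdata[na_cells] <- 0

# rawdata[is.na(rawdata), ] <- 0   # row-style indexing: this is the form that
#                                  # produced the error reported in the thread
```

Whether zero is a sensible stand-in for missing values is a separate modelling question.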

[R] Scale set of 0 values returns NAN??

2009-08-03 Thread Noah Silverman
Hi, More questions in my ongoing quest to convert from RapidMiner to R. One thing has become VERY CLEAR: None of the issues I'm asking about here are addressed in RapidMiner. How it handles missing values, scaling, etc. is hidden within the black box. Using R is forcing me to take a much
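
The behaviour named in the subject line can be reproduced in a few lines: scale() divides by the standard deviation, which is 0 for a constant column, so 0/0 gives NaN. The helper safe_scale is a hypothetical workaround, not something proposed in the thread.

```r
z <- c(0, 0, 0)
scaled <- as.numeric(scale(z))   # (0 - 0) / 0 for every element

# Hypothetical helper: return zeros for a constant column instead of NaN
safe_scale <- function(x) {
  s <- sd(x, na.rm = TRUE)
  if (is.na(s) || s == 0) {
    rep(0, length(x))
  } else {
    (x - mean(x, na.rm = TRUE)) / s
  }
}
```

Dropping constant columns before scaling is another option, since they carry no information for a classifier.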

Re: [R] Strange column shifting with read.table

2009-08-03 Thread Noah Silverman
Hi, Thanks for the continued support. I've been working on this all night, and have learned some things: 1) Since I'm really committed to using an SVM, I need to skip the examples with missing data. I have a training set of approximately 22,000 examples of which about 500 have missing

[R] Save model and predictions from svm

2009-08-03 Thread Noah Silverman
Hello, I'm using the e1071 package for training an SVM. It seems to be working well. This question has two parts: 1) Once I've trained an SVM model, I want to USE it within R at a later date to predict various new data. I see the write.svm command, but don't know how to LOAD the model back

Re: [R] Save model and predictions from svm

2009-08-03 Thread Noah Silverman
with data frame of: label, v1, v2, v3 After svm prediction, ending up with data frame of: label, v1, v2, v3, prediction, probability Thanks again! -N On 8/3/09 8:15 PM, Steve Lianoglou wrote: Hi, On Aug 3, 2009, at 10:55 PM, Noah Silverman wrote: Hello, I'm using the e1071 package for training

[R] Strange error with ROCR

2009-08-04 Thread Noah Silverman
Hello, I've come across a strange error... Here is what happens: model <- svm(traindata,trainlabels, type="C-classification", kernel="radial", cost=10, class.weights=c(win=3,lose=1), scale=FALSE, probability = TRUE) predictions <- predict(model, traindata) pred <- prediction(predictions,

Re: [R] Strange error with ROCR

2009-08-04 Thread Noah Silverman
Good point. I'm not sure how I missed that. This does lead to an additional question: Is the probability of the true label the best prediction to feed to the ROCR package, or is it better to use the decision.value Anybody have any experience on this one? Thanks! -N On 8/4/09 3:28 AM,

Re: [R] Strange error with ROCR

2009-08-04 Thread Noah Silverman
I hadn't thought of that. I'll run some tests... -N On 8/4/09 11:49 AM, Tobias Sing wrote: Is the probability of the true label the best prediction to feed to the ROCR package, or is it better to use the decision.value Since AFAIK they are related by a monotonous transformation, both

[R] Logistic Regression

2009-08-04 Thread Noah Silverman
Hi, Trying to set up a logistic regression model. (Something new to me. I usually use SVM.) The person explaining the concept explained to me that I can include a group variable so that the probabilities predicted by the model will be per group. Does this make sense to anyone? If so, how

Re: [R] Logistic Regression

2009-08-04 Thread Noah Silverman
Thanks David, But HOW do I indicate the grouping variable in the formula? Thanks! -N On 8/4/09 3:37 PM, David Winsemius wrote: On Aug 4, 2009, at 6:33 PM, Noah Silverman wrote: Hi, Trying to setup a logistic regression model. (Something new to me. I usually use SVM.) The person

Re: [R] Logistic Regression

2009-08-04 Thread Noah Silverman
be: lrm( label ~ v1 + v2, group_by(group) -N On 8/4/09 3:41 PM, David Winsemius wrote: On Aug 4, 2009, at 6:38 PM, Noah Silverman wrote: Thanks David, But HOW do I indicate the grouping variable in the formula? Hard to tell. You have told us absolutely nothing about the problem

Re: [R] Logistic Regression

2009-08-04 Thread Noah Silverman
Hmmm.. I'll try that. I recall reading somewhere that the group variable had to be indicated in a special way. -N On 8/4/09 3:49 PM, David Winsemius wrote: On Aug 4, 2009, at 6:45 PM, Noah Silverman wrote: I guess I didn't explain it well enough. I have a number of training examples

Re: [R] Logistic Regression

2009-08-04 Thread Noah Silverman
Thanks David, My apologies for the HTML e-mail. It's the default of my desktop client. -N On 8/4/09 4:03 PM, David Winsemius wrote: On Aug 4, 2009, at 6:52 PM, Noah Silverman wrote: Hmmm.. I'll try that. I recall reading somewhere that the group variable had to be indicated in a special

Re: [R] Scale set of 0 values returns NAN??

2009-08-04 Thread Noah Silverman
LOL, I'll happily support that. Can we possibly take my name off of it? :) -N On 8/4/09 4:31 PM, Rolf Turner wrote: On 5/08/2009, at 11:10 AM, Greg Snow wrote: -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Noah

[R] Build a dataframe row by row?

2009-08-04 Thread Noah Silverman
Hi, Time for another of my newbie questions. Is it possible to build up a data.frame row by row as I go I'm going to be running a bunch of experiments (many in a loop) to test different things. I'm using AUC as my main performance measure. My thought was to add a row to a data.frame for

Re: [R] Build a dataframe row by row?

2009-08-04 Thread Noah Silverman
Hi, Nice, but I need a few columns for the data. Don't know how to do this with the method you suggest. -N On 8/4/09 4:57 PM, Remko Duursma wrote: Hi Noah, there are a few ways to do this. Easiest is to keep adding an element to a list, and then make it into a dataframe at the end, like

[R] Counting things

2009-08-04 Thread Noah Silverman
I've completed an experiment and want to summarize the results. There are two things I like to create. 1) A simple count of things from the data.frame with predictions 1a) Number of predictions with probability greater than x 1b) Number of predictions with probability greater than x

[R] binning results

2009-08-05 Thread Noah Silverman
Hello, I asked this as part of a previous message, but never really figured out a usable solution. So this is a second attempt. I have a process containing an SVM. The end result is the probability that the class is true. That result is added back to the original data. So I wind up
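
The question is cut off above, so this is only a sketch under the assumption that the goal is to group predicted probabilities into ranges and summarise each range. The vectors prob and actual are made-up illustration data.

```r
prob   <- c(0.05, 0.35, 0.55, 0.62, 0.91, 0.97)  # hypothetical predicted probabilities
actual <- c(0, 0, 1, 0, 1, 1)                    # hypothetical true labels (1 = true)

# assign each probability to a 0.1-wide bin
bins <- cut(prob, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)

counts   <- table(bins)                 # number of predictions per bin
hit_rate <- tapply(actual, bins, mean)  # share of true labels per bin (NA if empty)

n_above <- sum(prob > 0.9)              # simple count above a threshold
```

Comparing hit_rate with the bin midpoints gives a rough check of how well the probabilities are calibrated.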

Re: [R] binning results

2009-08-05 Thread Noah Silverman
/how. -N On 8/5/09 11:22 AM, Steve Lianoglou wrote: Hi, On Aug 5, 2009, at 2:11 PM, Noah Silverman wrote: Hello, I asked this as part of a previous message, but never really figured out a usable solution. So this is a second attempt. I have an process containing an SVM. The end result

[R] Question with apply function

2009-08-05 Thread Noah Silverman
In my continuing quest to generate some summary data, I've come across some useful suggestions in past posts. The apply operation returns an error, and I can't figure out why. Can someone help me fix this? testlogdata <- cbind(testlogdata, range_group=cut(testlogdata$lrm_score, breaks=c(.9,

[R] Help with Logit Model

2009-08-06 Thread Noah Silverman
Hello, I have a bit of a tricky puzzle with trying to implement a logit model as described in a paper. The particular paper is on horseracing and they explain a model that is a logit trained per race, yet somehow the coefficients are combined across all the training races to come up with a

[R] Logit Model... GLM or GEE or ??

2009-08-06 Thread Noah Silverman
Posted about this earlier. Didn't receive any response. But, some further research leads me to believe that MAYBE a GLMM or a GEE function will do what I need. Hello, I have a bit of a tricky puzzle with trying to implement a logit model as described in a paper. The particular paper is on

Re: [R] Logit Model... GLM or GEE or ??

2009-08-06 Thread Noah Silverman
rkoen...@uiuc.edu Department of Economics vox: 217-333-4558 University of Illinois fax: 217-244-6678 Urbana, IL 61801 On Aug 6, 2009, at 5:00 PM, Noah Silverman wrote: Posted about this earlier. Didn't receive any response But, some further

Re: [R] Help with Logit Model

2009-08-06 Thread Noah Silverman
Yes, I already have solid code to estimate the probabilities and gather the public estimates. What I'm stuck on is how to train the race-wise logit and then somehow combine them to come up with a final set of coefficients. I could just train a glm on the whole data set, but would be losing

[R] Statistician Needed

2009-08-10 Thread Noah Silverman
Hello, I've come up with some challenges with my process that are a bit too complicated for the mailing list. Is there anyone out there, preferably a real statistician, who is willing to consult with me via phone/email for a few hours. I'm happy to pay you for your time. Thanks, -Noah

[R] nominal to numeric function

2009-08-11 Thread Noah Silverman
Hi, I'm training an SVM (C-classification from e1071 library). Some of the variables in my data set are nominal. Is there some easy/automatic way to convert them to numerical representations? Thanks, -N

Re: [R] nominal to numeric function

2009-08-12 Thread Noah Silverman
newvariable=as.numeric(variablename). This converts your factors into numeric variables, but not always with the desired result. So make sure that you check whether newvariable gives you what you want. Otherwise recoding by hand is indicated. Best, Daniel Noah Silverman-3 wrote: Hi, I'm

[R] Nominal variables in SVM?

2009-08-12 Thread Noah Silverman
Hi, The answers to my previous question about nominal variables have led me to a more important question. What is the best-practice way to feed nominal variables to an SVM? For example: color = (red, blue, green) I could translate that into an index so I wind up with color= (1,2,3) But my

Re: [R] Nominal variables in SVM?

2009-08-12 Thread Noah Silverman
Lianoglou wrote: Hi, On Aug 12, 2009, at 2:53 PM, Noah Silverman wrote: Hi, The answers to my previous question about nominal variables have led me to a more important question. What is the best-practice way to feed nominal variables to an SVM? For example: color = (red, blue, green) I could
