[R] Memory management

2011-06-01 Thread Michael Conklin
I am trying to run a very large Bradley-Terry model using the BradleyTerry2 
package.  (There are 288 players in the BT model).

My problem is that I ran the model below successfully.
WLMat is a win-loss matrix that is 288 by 288
WLdf <- countsToBinomial(WLMat)
mod1 <- BTm(cbind(win1, win2), player1, player2, ~ player, id = "player", data = WLdf)

Then I needed to run the same model with a subset of the observations that went 
into the win-loss matrix.  So I created my new win-loss matrix and tried to run 
a new model.

Now I get:  Error: cannot allocate vector of size 90.5 Mb

I found this particularly puzzling because the actual input data is the same 
size as the original model, just different values.

I tried increasing memory size, I tried running it in a clean workspace and the 
error message is always the same (sometimes the vector it is trying to allocate 
is 181.0MB (twice as large)) but it is always one of those two numbers no 
matter what I have done to the available memory.

To further complicate this...I cannot get the system to re-run my first model 
either . Same errors.

traceback indicates that the error occurs when the program is trying to do a qr 
decomposition.
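
A rough size check may explain why the same two numbers keep appearing. The sketch below rests on assumptions not stated in the post: that countsToBinomial yields one row per unordered pair of players, and that the model matrix has one column per player minus a reference player. Under those assumptions a dense double-precision model matrix is about the size named in the error, and twice that matches the larger figure.

```r
# Back-of-the-envelope size of a dense model matrix for 288 players.
# Assumptions (hypothetical, not confirmed in the thread):
#   rows    = one per unordered pair of players
#   columns = one ability parameter per player, minus a reference
n_players <- 288
n_pairs   <- choose(n_players, 2)   # 41328 rows
n_coef    <- n_players - 1          # 287 columns
size_mb   <- n_pairs * n_coef * 8 / 1024^2   # 8 bytes per double

round(size_mb, 1)       # 90.5
round(2 * size_mb, 1)   # 181
```

If this reading is right, the allocation size depends only on the number of players, not on the values in the win-loss matrix, which would fit the observation that the number never changes.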

R 2.13.0
Windows XP

Any suggestions?

W. Michael Conklin
Chief Methodologist
Google Voice: (612) 56STATS

MarketTools, Inc. | www.markettools.com <http://www.markettools.com>
6465 Wayzata Blvd | Suite 170 |  St. Louis Park, MN 55426.  PHONE: 952.417.4719 
| CELL: 612.201.8978
This email and attachment(s) may contain confidential and/or proprietary 
information and is intended only for the intended addressee(s) or its 
authorized agent(s). Any disclosure, printing, copying or use of such 
information is strictly prohibited. If this email and/or attachment(s) were 
received in error, please immediately notify the sender and delete all copies


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anyone successfully install Rgraphviz on windows with R 2.13?

2011-05-20 Thread Michael Conklin
Thanks to everyone who responded. The ReadMe file did the trick. It is too bad 
that it is so well hidden :)

W. Michael Conklin
Chief Methodologist

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Thursday, May 19, 2011 8:51 PM
To: Michael Conklin
Cc: R-help
Subject: Re: [R] Anyone successfully install Rgraphviz on windows with R 2.13?

On Thu, May 19, 2011 at 4:28 PM, Michael Conklin
michael.conk...@markettools.com wrote:
> I have been trying to get Rgraphviz to work (I know it is from Bioconductor)
> unsuccessfully. Since I have no experience with Bioconductor I thought I
> would ask here if anyone has advice. I have installed Graphviz 2.20.3 as is
> recommended on the Bioconductor site but basically R cannot seem to find the
> needed dll files.  So, even though I have added the appropriate directories
> to the system path R cannot seem to find them. Any tips would be appreciated.


Be sure to read the installation instructions.  Unfortunately they
really hid them. You have to download and detar the source package and
then look at the README in it.

Regarding the path, I have graphviz installed in C:\Program
Files\Graphviz2.20 on my Windows Vista system  yet it works so at
least on Windows I don't think it matters that there are spaces in the
path.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



[R] Anyone successfully install Rgraphviz on windows with R 2.13?

2011-05-19 Thread Michael Conklin
I have been trying to get Rgraphviz to work (I know it is from Bioconductor) 
unsuccessfully. Since I have no experience with Bioconductor I thought I would 
ask here if anyone has advice. I have installed Graphviz 2.20.3 as is 
recommended on the Bioconductor site but basically R cannot seem to find the 
needed dll files.  So, even though I have added the appropriate directories to 
the system path R cannot seem to find them. Any tips would be appreciated.

W. Michael Conklin
Chief Methodologist


[R] Windows editor suggestions - autosave

2010-12-29 Thread Michael Conklin
I am looking for advice on an editor to use with R (Windows) that has an 
autosave feature.  I typically write scripts using the RGui (and tried Tinn-R 
yesterday), but I am having continuing problems with BSODs (not R related) and 
have in the past had issues with R crashes. I would really like a system 
that does not require me to remember to hit the save button on my script every 
10 minutes so that I can avoid redoing everything.

W. Michael Conklin
Chief Methodologist


Re: [R] Analogue to SPSS regression commands ENTER and REMOVE in R?

2010-03-04 Thread Michael Conklin
I bet you stirred the pot here because you are asking about stepwise
procedures.  Look at step, or stepAIC in the MASS package.
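
As a minimal sketch of what that looks like, the example below runs step on a model fitted to the built-in mtcars data. The data set and the choice of predictors are assumptions made only for illustration; the commented lines show the stepAIC equivalent and how update can add or drop a term by hand.

```r
# Start from a full linear model on a built-in data set (illustrative only).
full <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)

# Stepwise selection by AIC; trace = 0 suppresses the step-by-step log.
sel <- step(full, direction = "both", trace = 0)
formula(sel)

# The MASS equivalent has the same basic interface:
#   MASS::stepAIC(full, direction = "both", trace = 0)

# Terms can also be entered or removed manually:
#   update(full, . ~ . - disp)   # remove disp
#   update(full, . ~ . + drat)   # enter drat
```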

\Mike


On Thu, 4 Mar 2010 07:47:34 -0800
Dimitri Liakhovitski ld7...@gmail.com wrote:

> I am not sure if this question has been asked before - but is there a
> procedure in R (in lm or glm?) that is equivalent to ENTER and REMOVE
> regression commands in SPSS?
> Thanks a lot!




[R] Scraping a web page

2009-12-03 Thread Michael Conklin
I would like to be able to submit a list of URLs of various webpages and 
extract the content i.e. not the mark-up of those pages. I can find plenty of 
examples in the XML library of extracting links from pages but I cannot seem to 
find a way to extract the text.  Any help would be greatly appreciated - I will 
not know the structure of the URLs I would submit in advance.  Any suggestions 
on where to look would be greatly appreciated.
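
For reference, one possible approach with the XML package is sketched below. It is only an illustration under assumptions: the HTML is an inline string rather than a fetched URL, and the XPath expression simply keeps text nodes under body that are not inside script or style elements, which will not suit every page.

```r
library(XML)

# Inline page used for illustration; a URL or file name could be passed
# to htmlParse() instead (without asText = TRUE).
html <- paste0(
  "<html><head><title>T</title><style>p{color:red}</style></head>",
  "<body><h1>Hello</h1><p>Some <b>bold</b> text.</p>",
  "<script>var x = 1;</script></body></html>"
)

doc <- htmlParse(html, asText = TRUE)

# Collect visible text nodes, skipping script and style content.
pieces <- xpathSApply(
  doc,
  "//body//text()[not(ancestor::script) and not(ancestor::style)]",
  xmlValue
)

# Join the pieces and squeeze runs of whitespace.
txt <- trimws(gsub("\\s+", " ", paste(pieces, collapse = " ")))
txt
```

For a list of URLs the same steps could be wrapped in a function and applied with lapply.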

Mike

W. Michael Conklin
Chief Methodologist



Re: [R] Mixed effect multinomial regression

2009-10-06 Thread Michael Conklin
The bayesm package implements such models.

Hth,

Mike


On Tue, 6 Oct 2009 12:41:18 -0700
James Martin just.strut...@gmail.com wrote:

> Hello list,
>
> I was trying to investigate the possible use of a mixed effect
> multinomial logit model in R.  Does anyone have suggestions on where
> to find information on these models and the associated functions in R.
>
> Thanks in advance,
>
> jm




Re: [R] Problem with predict.coxph

2009-08-19 Thread Michael Conklin
Comparing predict.coxph in the survival library I have with R 2.7.1 against 
the one with R 2.9.1, I find a major rewrite of the function. A number 
of internal survival functions are no longer present, so much of the code has 
changed.  This makes identifying the specific problem beyond my capabilities.  
What I want to do is generate predictions for specific combinations of 
covariates. The number of combinations I am interested in is different from the 
number of records in the original data file. Any help would be appreciated, as 
some of the graphics routines I want to use on the data are only available in 
2.8 or greater, meaning I am currently looking at trying to run two different 
versions of R to get the project done.

TIA,

Michael Conklin 



[R] Problem with predict.coxph

2009-08-18 Thread Michael Conklin
We occasionally utilize the coxph function in the survival library to fit 
multinomial logit models. (The breslow method produces the same likelihood 
function as the multinomial logit). We then utilize the predict function to 
create summary results for various combinations of covariates.  For example:

mod1 <- coxph(Depvar~Price:Product+strata(ID),data=MyDCMData2,na.action=na.omit,method="breslow")

The model runs fine.

Then we create some new data that is all combinations of Price and Product and 
retrieve the summary linear predictors.

newdata <- expand.grid(Price=factor(as.character(1:5)),Product=factor(as.character(1:5)))
## create a utility matrix for all combinations of prices and products
totalut <- predict(mod1,newdata=newdata,type="lp")

Under R 2.7.1 this produces the following output:

> totalut
          [,1]
1   0.01534582
2  -0.07628528
3  -0.88085189
4  -1.19458045
5  -1.03579684
6   0.40065672
7   0.15922492
8  -0.49233524
9  -0.65483441
10 -1.07739920
11  0.27589201
12  0.48055065
13  0.33638585
14 -0.28416678
15 -0.48762319
16  1.06071986
17  0.69041596
18  0.67479476
19  0.36360168
20 -0.09492167
21  0.66554276
22  0.55748465
23  0.37596413
24  0.01612020
25 -0.03567735


The problem is that under R 2.8.1 and R 2.9.1 the previous line fails with the 
following error:

> totalut <- predict(mod1,newdata=newdata,type="lp")
Error in model.frame.default(Terms2, newdata, xlev = object$xlevels) :
  variable lengths differ (found for 'Price')
In addition: Warning message:
'newdata' had 25 rows but variable(s) found have 43350 rows


Does anyone have an idea what is going on?
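
The "variable lengths differ" message is what model.frame reports when some variables in the formula are found in newdata and others are picked up elsewhere at the original data length. The sketch below reproduces that mechanism with lm on simulated data; it is an illustration of the general behaviour only, and whether the same thing happens inside the rewritten predict.coxph (for example for the response or the strata variable) is an assumption, not something confirmed here.

```r
# Simulated data living in the calling environment (illustrative only).
set.seed(1)
x <- rnorm(100)
z <- rnorm(100)
y <- x + z + rnorm(100)
fit <- lm(y ~ x + z)

bad_newdata  <- data.frame(x = 1:5)          # z is missing
good_newdata <- data.frame(x = 1:5, z = 0)   # every predictor supplied

# With z missing, R finds the 100-row z outside newdata and the
# lengths no longer agree, so predict() stops with an error.
msg <- tryCatch(predict(fit, newdata = bad_newdata),
                error = function(e) conditionMessage(e))
msg

# Supplying every variable named in the formula avoids the lookup.
pred <- predict(fit, newdata = good_newdata)
length(pred)
```

If the analogy holds, adding columns for every variable in the model formula to newdata would be one thing to try.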

Best regards,

Michael Conklin





[R] Problem with Random Forest predict

2009-04-28 Thread Michael Conklin
I am trying to run a partialPlot with Random Forest (as I have done many times 
before).

First I run my forest... Cell is a 6 level factor that is the dependent 
variable - all other variables are predictors, most of these are factors as 
well.

predCell <- randomForest(x=tempdata[-match("Cell",names(tempdata))],y=tempdata$Cell,importance=T)

Then I try my partial plot to look at the effect of a specific predictor.

partialPlot(x=predCell,pred.data=tempdata[-match("Cell",names(tempdata))],x.var="P7_6")

I get this error:

Error in predict.randomForest(x, x.data, type = "prob") :
  Type of predictors in new data do not match that of the training data.

In examining randomForest:::predict.randomForest I see the following code that 
produces this error message.

    cat.new <- sapply(x, function(x) if (is.factor(x) && !is.ordered(x))
        length(levels(x))
    else 1)
    if (!all(object$forest$ncat == cat.new))
        stop("Type of predictors in new data do not match that of the training data.")
}


The odd thing is that if I run this code outside of the function:

> all(predCell$forest$ncat ==
+ sapply(tempdata[-match("Cell",names(tempdata))], function(x) if (is.factor(x) &&
+ !is.ordered(x))
+ length(levels(x))
+ else 1))
[1] TRUE

Which should avoid the stop function.
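
To narrow down which predictor trips the check, the same per-column count can be computed for two data frames and compared by name. The helper below is a sketch in base R; the function name ncat_by_column and the toy data frames are made up for illustration and are not part of randomForest.

```r
# Number of categories per column, following the rule quoted above:
# unordered factors count their levels, everything else counts as 1.
ncat_by_column <- function(df) {
  vapply(df, function(col) {
    if (is.factor(col) && !is.ordered(col)) length(levels(col)) else 1L
  }, integer(1))
}

# Toy data: column a has three levels in one frame and two in the other.
train_like <- data.frame(a = factor(c("x", "y", "z")), b = c(1, 2, 3))
new_like   <- data.frame(a = factor(c("x", "y")),      b = c(1, 2))

# Names of the columns whose category counts disagree.
mismatch <- names(which(ncat_by_column(train_like) != ncat_by_column(new_like)))
mismatch
```

Running the comparison on the data that partialPlot builds internally (rather than on tempdata itself) might show where the two diverge.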

Here is the session info.

R version 2.8.1 (2008-12-22)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
States.1252;LC_MONETARY=English_United 
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] randomForest_4.5-30


Any ideas would be greatly appreciated.

W. Michael Conklin
Chief Methodologist



Re: [R] performing function on data frame

2009-04-16 Thread Michael Conklin
newDF <- as.data.frame(scale(oldDF))

see ?scale
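
A small check on made-up data (the values below are assumptions chosen only for illustration) shows that this gives the same result as computing each column's z values by hand:

```r
# Toy data frame; each column is treated as its own data set.
oldDF <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 40, 80))

# scale() centers each column on its mean and divides by its sd.
newDF <- as.data.frame(scale(oldDF))

# The same thing spelled out column by column.
manual <- as.data.frame(lapply(oldDF, function(col) (col - mean(col)) / sd(col)))

all.equal(newDF$a, manual$a)
all.equal(newDF$b, manual$b)
```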

Hope that helps.

Michael Conklin


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Karin Lagesen
Sent: Thursday, April 16, 2009 5:29 AM
To: r-help@r-project.org
Subject: Re: [R] performing function on data frame

David Hajage dhajag...@gmail.com writes:

> Hi Karin,
>
> I'm not sure I understand... Is this what you want ?
>
> d$y <- mean(d$y)/sd(d$y)




Yes, and also a bit no.

Each column in my data frame represents one data set. For every
element in this data set I want to know the z value for that
element. I.e: I want to create a new data frame from the old data
frame, where each element in the new data frame is

newDF[i,j] = (oldDF[i,j] - mean(oldDF[,j])) / sd(oldDF[,j])

I could, I think, iterate like this over the data frame, but I keep
thinking that one of the apply functions should be employed...

Karin
--
Karin Lagesen, Ph.D.
karin.lage...@medisin.uio.no
http://folk.uio.no/karinlag



Re: [R] Convert bits to numbers in base 10

2009-04-09 Thread Michael Conklin
Alternatively

> (nn <- c(1, 0, 0, 1, 0, 1, 0))
[1] 1 0 0 1 0 1 0

> sum(2^(0:(length(nn)-1))*nn)

but of course it depends if your bits are stored big-endian or little-endian
so you might want

> sum(2^((length(nn)-1):0)*nn)


I like Marc's approach better (certainly more elegant). If you have the big vs 
little endian issue you can just remove the rev from Marc's code below.
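
The two variants above can be wrapped in one small function. This is only a sketch; the function name bits_to_int and the msb_first argument are made up for illustration.

```r
# Convert a vector of 0/1 bits to a base-10 number.
# msb_first = TRUE  treats the first element as the most significant bit;
# msb_first = FALSE treats the first element as the least significant bit.
bits_to_int <- function(bits, msb_first = TRUE) {
  powers <- seq_along(bits) - 1          # 0, 1, ..., length - 1
  if (msb_first) powers <- rev(powers)   # highest power goes first
  sum(bits * 2^powers)
}

nn <- c(1, 0, 0, 1, 0, 1, 0)
bits_to_int(nn)                      # 74
bits_to_int(nn, msb_first = FALSE)   # 41
```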

Michael Conklin


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Marc Schwartz
Sent: Thursday, April 09, 2009 4:51 PM
To: Jorge Ivan Velez
Cc: R-help; Gang Chen
Subject: Re: [R] Convert bits to numbers in base 10

I suspect that Gang was looking for something along the lines of:

> sum(2 ^ (which(as.logical(rev(nn))) - 1))
[1] 74

You might also want to look at the digitsBase() function in Martin's
sfsmisc package on CRAN.

HTH,

Marc Schwartz

On Apr 9, 2009, at 4:34 PM, Jorge Ivan Velez wrote:

> Dear Gang,
> Try this:
>
> nn <- c(1, 0, 0, 1, 0, 1,0)
> paste(nn,sep="",collapse="")
>
> See ?paste for more information.
>
> HTH,
>
> Jorge
>
>
> On Thu, Apr 9, 2009 at 5:23 PM, Gang Chen gangch...@gmail.com wrote:
>
>> I have some bits stored like the following variable nn
>>
>> (nn <- c(1, 0, 0, 1, 0, 1,0))
>> [1] 1 0 0 1 0 1 0
>>
>> not in the format of
>>
>> 1001010
>>
>> and I need to convert them to numbers in base 10. What's an easy
>> way to do
>> it?
>>
>> TIA,
>> Gang



Re: [R] Winsorizing Multiple Variables

2009-01-16 Thread Michael Conklin
Don't sort y. Calculate xbot and xtop using
xtemp <- quantile(y,c(tr,1-tr),na.rm=na.rm)
xbot <- xtemp[1]
xtop <- xtemp[2]
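
Put together, a version that leaves the observations in their original order might look like the sketch below. It swaps the order-statistic cut-offs of the original function for quantile-based ones, so the limits can differ slightly; the name winsorize and the small simulated data frame are assumptions for illustration.

```r
# Winsorize without sorting: clamp values to the tr and 1 - tr quantiles.
# Note: quantile() interpolates, so the limits are not exactly the
# order statistics used by the original function.
winsorize <- function(x, tr = 0.2, na.rm = FALSE) {
  lims <- unname(quantile(x, c(tr, 1 - tr), na.rm = na.rm))
  pmin(pmax(x, lims[1]), lims[2])
}

# Example data frame: ss is the observation id, the rest are variables.
set.seed(42)
dat <- data.frame(ss = 1:5, var1 = rnorm(5), var2 = rnorm(5))

# Apply to every column except the id; rows stay aligned.
wdat <- dat
wdat[-1] <- lapply(dat[-1], winsorize)
wdat
```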

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Karl Healey
Sent: Friday, January 16, 2009 2:51 PM
To: r-help@r-project.org
Subject: [R] Winsorizing Multiple Variables

Hi All,

I want to take a matrix (or data frame) and winsorize each variable.
So I can, for example, correlate the winsorized variables.

The code below will winsorize a single vector, but when applied to
several vectors, each ends up sorted independently in ascending order
so that a given observation is no longer on the same row for each
vector.

So I need to winsorize the variable but then return it to its original
order. Or another solution that will take a data frame, wisorize each
variable, and return a new data frame with all the variables in the
original order.

Thanks for any help!

-Karl


#The function I'm working from

win<-function(x,tr=.2,na.rm=F){

if(na.rm)x<-x[!is.na(x)]
y<-sort(x)
n<-length(x)
ibot<-floor(tr*n)+1
itop<-length(x)-ibot+1
xbot<-y[ibot]
xtop<-y[itop]
y<-ifelse(y<=xbot,xbot,y)
y<-ifelse(y>=xtop,xtop,y)
win<-y
win
}

#Produces an example data frame, ss is the observation id, vars 1-5
#are the variables I want to winsorize.

ss=c(1:5);var1=rnorm(5);var2=rnorm(5);var3=rnorm(5);var4=rnorm(5);as.data.frame(cbind(ss,var1,var2,var3,var4))->data
data

#Winsorizes each variable, but sorts them independently so the
#observations no longer line up.

sapply(data,win)


___
M. Karl Healey
Ph.D. Student

Department of Psychology
University of Toronto
Sidney Smith Hall
100 St. George Street
Toronto, ON
M5S 3G3

k...@psych.utoronto.ca



Re: [R] running R-code outside of R

2008-06-25 Thread Michael Conklin

Spencer Graves wrote:

>   If you want to hide the fact that you are using R -- especially
> if you charge people for your software that uses R clandestinely --
> that's a violation of the license (GPL).  I doubt if anyone associated
> with R would bother with a lawsuit, but a competitor who offers related
> software might.
>
>   Best Wishes,
>   Spencer

Do I understand the implication of the license correctly (forgive my
ignorance here).  

If I analyze a client's data using an R script I created then I can
charge the client a $20,000 consulting fee, but, if I let the client
push the button to execute the R script and charge him 10 cents for the
privilege then I can be sued for violating the GPL?  Or are my
assumptions on the first part also incorrect and R can only be used for
the free betterment of mankind?

Mike



Re: [R] LDA on pre-assigned training and testing data sets

2008-06-25 Thread Michael Conklin
I think this line 

mafdiscpred <- predict(mafdisc, data = test)

needs to be

mafdiscpred <- predict(mafdisc, newdata = test)
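
The difference is easy to see on a built-in data set. The sketch below uses iris with an arbitrary 100/50 split (both are assumptions for illustration, not the data from the question): with data = the argument is not recognised and the predictions come back for the training rows, while newdata = gives one prediction per test row.

```r
library(MASS)

# Arbitrary pre-assigned split of a built-in data set (illustrative only).
set.seed(1)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]

fit <- lda(Species ~ ., data = train)

# 'data' is not an argument of predict() for lda objects, so it is
# silently ignored and the training data are used instead.
n_wrong <- length(predict(fit, data = test)$class)

# 'newdata' is the argument that supplies the test set.
n_right <- length(predict(fit, newdata = test)$class)

c(n_wrong, n_right)   # 100 and 50
```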




Michael Conklin


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Peter Flom
Sent: Wednesday, June 25, 2008 11:22 AM
To: r-help@r-project.org
Subject: [R] LDA on pre-assigned training and testing data sets

Dear r-help

I am trying to run LDA on a training data set, and test it on another
data set with the same variables.  I found examples using
crossvalidation, and using training and testing data sets set up with
sample, but not when they are preassigned.

Here is what I tried

# FIRST SET UP A DATAFRAME WITH ALL THE DATA AND CREATE  NEW VARIABLES

traintest1 <- arnaudnognod1[arnaudnognod1$DISC_USE1 == 1.01 |
  arnaudnognod1$DISC_USE1 == 1.03 | arnaudnognod1$DISC_USE1 == 1.04 |
  arnaudnognod1$DISC_USE1 == 1.02 | arnaudnognod1$DISC_USE1 == 1.05 |
  arnaudnognod1$DISC_USE1 == 1.06,]
traintest1$normal <- traintest1$DISC_USE1 == 1.01 |
  traintest1$DISC_USE1 == 1.03 | traintest1$DISC_USE1 == 1.04
traintest1$mafelev <- apply(traintest1[,1:40], 1, FUN = mean)
traintest1$mafscatter <- apply(traintest1[,1:40], 1, FUN = sd)

# NEXT CREATE TRAINING AND TESTING DATAFRAMES

train <- traintest1[traintest1$DISC_USE1 == 1.01 |
  traintest1$DISC_USE1 == 1.02,]
test <- traintest1[traintest1$DISC_USE1 > 1.02,]

# NOW, TRAIN HAS 400 ROWS, TEST HAS 396 ROWS, AND TRAINTEST1 HAS 796
# ROWS, EACH HAS 615 COLUMNS, AS EXPECTED

# RUN DISCRIM ON TRAINING DATA

mafdisc <- lda(normal~mafelev + mafscatter, data = train)

#mafdisc$counts IS 210 AND 190, AS EXPECTED

#FINALLY, TEST IT ON THE TEST DATA

mafdiscpred <- predict(mafdisc, data = test)

#BUT mafdiscpred$class HAS LENGTH = 400, NOT 396, AS EXPECTED.

any help appreciated

thanks

Peter

Peter L. Flom, PhD
Brainscope, Inc.
212 263 7863 (MTW)
212 845 4485 (Th)
917 488 7176 (F)





Re: [R] request: a class having max frequency

2008-06-06 Thread Michael Conklin
The 0 is the name of the item and the 1 is the index in f of the maximum
class. (since f is a table, and the first element of the table is the
maximum, which.max returns a 1) So, if you just want to know which class
is maximum you can say

 names(which.max(f))
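
A small reproduction, using counts made up to match the table in the question (the pima data itself is not needed for the illustration):

```r
# Table with the same counts as in the question: 500 zeros, 268 ones.
f <- table(c(rep(0, 500), rep(1, 268)))

idx <- which.max(f)   # named integer: name "0", value 1 (position in f)
names(idx)            # the class label, as a character string
unname(idx)           # the position of the maximum within the table
f[[idx]]              # the maximum frequency itself
```

If the label is needed as a number rather than a string, as.numeric(names(idx)) converts it.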




Michael Conklin


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Muhammad Azam
Sent: Friday, June 06, 2008 8:15 AM
To: R Help; R-help request
Subject: [R] request: a class having max frequency

Dear R users
I have a very basic question. I tried but could not find the required
result. Using
dat <- pima
f <- table(dat[,9])

> f
  0   1
500 268
I want to find the class, say 0, having maximum frequency, i.e. 500. I
used
which.max(f)
which provides
0
1
How can I get only the 0?  Thanks and


best regards

Muhammad Azam 
Ph.D. Student 
Department of Medical Statistics, 
Informatics and Health Economics 
University of Innsbruck, Austria 


  


Re: [R] Percentages for categorical data by group

2008-05-23 Thread Michael Conklin
tapply(example.data$responseVar,example.data$groupVar,function(x){prop.table(table(x))})
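
A short sketch of how this behaves, on a much smaller simulated data frame than the one in the question (three made-up group IDs and 60 rows, chosen only for illustration). It also shows a two-way table with row proportions as an alternative that returns everything in one matrix.

```r
# Small stand-in for the data frame described in the question.
set.seed(7)
example.data <- data.frame(
  groupVar    = sample(c(101, 202, 303), 60, replace = TRUE),
  responseVar = sample(1:5, 60, replace = TRUE)
)

# One proportion table per group, returned as a list-like array.
by_group <- tapply(example.data$responseVar, example.data$groupVar,
                   function(x) prop.table(table(x)))
by_group[["101"]]

# Alternative: a groups-by-responses table of row percentages.
pct <- 100 * prop.table(table(example.data$groupVar,
                              example.data$responseVar), margin = 1)
round(pct, 1)
```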

Michael Conklin


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Economics Guy
Sent: Friday, May 23, 2008 9:52 AM
To: [EMAIL PROTECTED]
Subject: [R] Percentages for categorical data by group

I can think of several ways to blunt force hard code what I want, but I
imagine there is a command or two that can be easily combined to do
this:

I have a data frame with about 23000 observations. The first variable is
the group to which the observation belongs (about 500 different groups).
The second variable is a response for each observation that is a 1, 2, 3,
4 or 5. I want to be able to calculate the percentage of each group that
chose each response. For example, I want to know what percentage of group 1
(which may have a value of 34456) chose response 1, and so on.

Here is some code I wrote that generates a data frame like the one I
have.

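# NOTE: as archived, pop holds only 10 values, so sample(pop,500) would
# fail without replacement; the original population size is not recoverable.
pop <- matrix(1:10)
groupIDs <- sample(pop,500)
groupVar <- sample(groupIDs,23000,replace=TRUE)
responseVar <- sample(1:5,23000,replace=TRUE)

example.data <- data.frame(groupVar,responseVar)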

Is there a fast way to calculate these percentages beyond writing loops
to manually count the responses for each of the groups?

Thanks,

EG



Re: [R] Percentages for categorical data by group

2008-05-23 Thread Michael Conklin

 prop.table(table(factor(x,levels=1:5)))
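
A minimal sketch of how that fits into the earlier tapply() and do.call(rbind, ...) steps, using made-up toy data in which group "b" never picks responses 4 or 5 (the data and the name props are assumptions for illustration). Fixing the factor levels makes table() report a zero count for every unused response, so all pieces have length 5:

```r
# Hypothetical toy data: group "b" has no 4s or 5s.
example.data <- data.frame(
  groupVar    = c("a", "a", "a", "a", "a", "b", "b", "b"),
  responseVar = c(1, 2, 3, 4, 5, 1, 1, 3)
)

# factor(x, levels = 1:5) keeps all five categories, even empty ones.
res <- tapply(example.data$responseVar, example.data$groupVar,
              function(x) prop.table(table(factor(x, levels = 1:5))))

# Every element now has length 5, so rbind() lines the columns up
# without the "not a multiple of vector length" warning.
props <- do.call(rbind, res)
props
```

Responses a group never chose show up as 0 in the corresponding row.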


Michael Conklin


 


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Economics Guy
Sent: Friday, May 23, 2008 1:36 PM
To: [EMAIL PROTECTED]
Subject: Re: [R] Percentages for categorical data by group

I appreciate all the help.  The trouble is that in my real data set not
every group has an observation for each response. This makes some of the
rows returned from prop.table() shorter than others, so I get:

Warning message:
In function (..., deparse.level = 1)  :
  number of columns of result is not a multiple of vector length (arg 8)

Is there a way to tell rbind() or do.call() to treat missing values as
zero, or to make prop.table() include the zero proportions?



On Fri, May 23, 2008 at 1:59 PM, Phil Spector [EMAIL PROTECTED] wrote:

 EG -
 Thanks for the reproducible example!

 When I run your code, and check the class of the result from tapply(), I
 see that it is an array, and using dim(), I see it's an array of length
 500.  How big is each element?

   table(sapply(res,length))

     5
   500

 So each piece is the same length.  That means we could
 make a 500x5 matrix as follows:

   do.call(rbind,res)

    - Phil Spector
      Statistical Computing Facility
      Department of Statistics
      UC Berkeley
      [EMAIL PROTECTED]


Re: [R] Problem with R version 2.6.0

2007-11-09 Thread Michael Conklin



On Fri, 9 Nov 2007, Prof Brian Ripley wrote:

 This is of course not how the rw-FAQ suggests you make use of R, and the
 best recommendation is to follow the FAQ's workflow.

The workflow recommendation that I read in the FAQ is:
2.5 How do I run it?
Just double-click on the shortcut you prepared at installation. 

If you want to set up another project, make a new shortcut or use the
existing one and change the `Start in' field of the Properties. 

---

I am wondering why this is the best workflow.  I work on several
hundred projects per year (R is definitely a production vehicle for us)
and can have as many as 20 going at the same time. Having a single
shortcut and changing the working directory on startup to the
appropriate folder (as opposed to changing the `Start in' property on
the shortcut) is much more efficient for me than creating a new shortcut
for a specific project. The beauty of R is that there are multiple ways
to do many things and the user can find the way that is best for him.
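
As a rough sketch of the single-shortcut approach, assuming setwd() is what does the switching (the folder name is hypothetical and a temporary directory stands in for a real project path):

```r
# Hypothetical project folder under a temporary directory.
project_dir <- file.path(tempdir(), "project-a")
dir.create(project_dir, showWarnings = FALSE)

old_dir <- setwd(project_dir)  # setwd() returns the previous working directory
getwd()                        # now reports the project folder
setwd(old_dir)                 # switch back when done
```

The same setwd() call could be typed at the start of a session or kept at the top of a project script.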

Michael Conklin
