Re: [R] Heuristic optimisation?
Ajay,

bayesm deals with this very issue in choice modelling (a form of econometric modelling, as outlined in the article). The approach its developers recommend, navigating the likelihood function through a Bayesian approach, makes a lot of sense to me; in fact, I think they are really onto something amazing here. I am still trying to get the execution side of this package adequately sorted for my own purposes.

Paul

Ajay Shah [EMAIL PROTECTED] wrote:

> I wondered what people on this list felt about this article:
> http://www.voxeu.org/index.php?q=node/2363
> which talks about the problems of obtaining sound answers in numerical
> optimisation in settings such as MLE or NLS.
>
> --
> Ajay Shah http://www.mayin.org/ajayshah
> [EMAIL PROTECTED] http://ajayshahblog.blogspot.com
> *(:-? - wizard who doesn't know the answer.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regression with nominal data
Soren,

It sounds like you are new to R, so I will refer you to some packages that beginners tend to find more user-friendly.

Zelig is excellent. You could run a series of logistic regressions, coding your dependent variable as follows: a versus b, a versus c, b versus c. See the website below:
http://gking.harvard.edu/zelig/docs/index.html

Alternatively, you could try Rattle. See the website below:
http://rattle.togaware.com/rattle-features.html

Or you could try Rcmdr.

HTH
Paul

[EMAIL PROTECTED] wrote:

> Hi,
> y is nominal (3 categories), x1 to x3 are scale. What I want is a
> regression showing the probability of falling into each of the three
> categories of y according to the x variables. How can I perform such a
> regression in R?
> Thanks for your help
> Sören
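For reference, the kind of regression Sören describes can also be fitted in one step as a multinomial logit. The following is only a sketch under stated assumptions: it uses multinom() from the nnet package (a recommended package that ships with standard R installations, not one of the packages named above), and the data frame `dat` with columns y, x1, x2 and x3 is simulated purely for illustration.

```r
library(nnet)  # assumption: nnet is installed (it is a recommended package)

set.seed(1)
n <- 300

# Hypothetical data: three scale predictors and a nominal outcome y (a/b/c)
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
eta_b <- 0.8 * dat$x1 - 0.5 * dat$x2
eta_c <- -0.6 * dat$x1 + 0.9 * dat$x3
p <- cbind(a = 1, b = exp(eta_b), c = exp(eta_c))
p <- p / rowSums(p)
dat$y <- factor(apply(p, 1, function(pr) sample(c("a", "b", "c"), 1, prob = pr)))

# Multinomial logit: one model for all three categories
fit <- multinom(y ~ x1 + x2 + x3, data = dat, trace = FALSE)

# Predicted probability of each category for every case
probs <- predict(fit, newdata = dat, type = "probs")
head(round(probs, 3))
```

Each row of `probs` holds the estimated probabilities of falling into a, b and c for one case, which is the quantity asked for in the original question.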
Re: [R] New to R. Question about very large files
Hi Richard,

It is if you use Rattle. Rattle allows you to do that for quite a few types of models and has a really nice GUI. It has been developed for data mining, and given your use of the term "score" in your post, I assume that this is at least part of what you are looking for. You can import a random sample of cases (a training percentage of the total dataset), and the package will use the remainder of your sample for testing and scoring. Rattle is very sophisticated in terms of the algorithms it offers, and once you get going in R, SAS will become obsolete (just kidding). Sorry if I am wrong here about what you need. Welcome.

Paul

----- Original Message -----
From: Richard Palmer [EMAIL PROTECTED]
To: r-help@r-project.org
Sent: Thursday, August 14, 2008 10:40 PM
Subject: [R] New to R. Question about very large files

I am new to R but experienced in SAS. SAS has the capability to let me develop a model from a sample and use the results to score the records of another file which won't fit in memory. Is this straightforward in R, or does it require coding to do the scoring in segments? Can someone point me to sample code that I can copy or modify to do this quickly?

--
Richard Palmer
Home 508 877-3862
Cell 508 982-7266
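For the "scoring in segments" part of the question, here is a rough base R sketch that does not rely on Rattle. Everything in it is an assumption made for illustration: the file, the column names x1, x2 and y, the logistic model, and the helper score_in_chunks() are all hypothetical. The idea is simply to fit on a sample, then read the big file a block of lines at a time and call predict() on each block.

```r
set.seed(1)

# Hypothetical "large" file, written to a temporary CSV for the demonstration
n <- 1000
big <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
big$y <- rbinom(n, 1, plogis(0.5 * big$x1 - 0.25 * big$x2))
big_file <- tempfile(fileext = ".csv")
write.csv(big, big_file, row.names = FALSE)

# Develop the model on a sample that fits in memory
train <- big[sample(n, 200), ]
fit <- glm(y ~ x1 + x2, data = train, family = binomial)

# Score the file chunk by chunk so that only chunk_size rows are held at once
score_in_chunks <- function(path, model, chunk_size = 250) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # First line holds the column names (quoted by write.csv)
  header <- gsub('"', "", strsplit(readLines(con, n = 1), ",")[[1]])
  scores <- numeric(0)
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break
    chunk <- read.csv(text = lines, header = FALSE, col.names = header)
    scores <- c(scores, predict(model, newdata = chunk, type = "response"))
  }
  scores
}

scores <- score_in_chunks(big_file, fit)
length(scores)
```

In a real job the scores would normally be appended to an output file inside the loop rather than accumulated in memory; they are collected here only so the result can be inspected.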
[R] error with .dll file and RGtk2
Hi all,

I am getting the following error message. Does somebody know what needs to happen here? I have tried re-installing the RGtk2 package, and also downloading a .dll file and installing it in the RGtk2 folder.

Error in dyn.load(x, as.logical(local), as.logical(now)) :
  unable to load shared library 'C:/PROGRA~1/R/R-25~1.1/library/RGtk2/libs/RGtk2.dll':
  LoadLibrary failure: The specified module could not be found.

Thanks in advance
Paul
Re: [R] Excel Trend Function
Hi Felipe,

Daniel mentions imputation is a disputed practice. There are recommendations and rules of thumb for its use. I am not sure that imputation is disputed. I would be interested to see some links to articles recommending against its use.

Paul

----- Original Message -----
From: Felipe Carrillo [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, July 13, 2008 5:46 AM
Subject: [R] Excel Trend Function

Hi:
I have a dataset and need to interpolate for missing days. In Excel I either average from sampled days above and below the missing days or use the TREND function to make up for the missing values. I have been reading about na.approx; is this function similar to the TREND function? Which is the best recommendable way to make up for missing data? Here's my dataset: weeks 17, 18, 26 and 46 have 0 daysSamp.

Year  Week  daysSamp   Lower  TotalPD    Upper    varTotalPD
2006    47         6  126988   188259   249530    1045878675
2006    48         7  189155   253350   317545    1148102355
2006    49         7  103300   132741   162182     241480186
2006    50         6   11801   252576   493352   16151006813
2006    51         7    2348     3671     4994        487926
2006    52         5    2606    29901    57196     215454181
2006     2         7    2968     4513     6058        664723
2006     3         7    1128     1889     2650        161231
2006     4         7     479      963     1447         65196
2006     5         7    2819     4413     6007        708094
2006     6         6   -1009     3128     7264       4766743
2006     7         7   -5239    10769    26777      71387835
2006     8         7     150      503      856         34685
2006     9         7    1858     2989     4120        356562
2006    10         7     193      494      795         25281
2006    11         7     125      346      567         13627
2006    12         7     432      767     1102         31189
2006    13         7    1229     1867     2505        113569
2006    14         7     813     1339     1865         77140
2006    15         4     -66      124      315         10105
2006    16         7     152      903     1654        157242
2006    17         0
2006    18         0
2006    19         5       0        0        0             0
2006    20         4       0        0        0             0
2006    21         5       0        0        0             0
2006    22         6       0        0        0             0
2006    23         7     -65      285      635         34112
2006    24         6       0        0        0             0
2006    25         7       0        0        0             0
2006    26         0
2006    27         4     228      931     1634        137726
2006    28         4     801     2231     3662        569977
2006    29         4    4544     9242    13939       6147522
2006    30         5   15798    28465    41131      44697915
2006    31         5   25398    41049    56701      68245523
2006    32         5   48197    82216   116235     322416917
2006    33         5  142980   230411   317841    2129630128
2006    34         5  227141   360468   493794    4952314336
2006    35         5  467244   756325  1045405   23281569629
2006    36         5  281049   463331   645614    9256900449
2006    37         2  227636   620330  1013023   42961663047
2006    38         3  478990   983472  1487954   70903343603
2006    39         7  539690   846522  1153354   26228718974
2006    40         7  320959   457866   594773    5221891252
2006    41         7  427561   582452   737343    6683813344
2006    42         7  271788   351103   430418    1752614293
2006    43         7  165019   208853   252687     535301133
2006    44         7   91514   117390   143266     186537178
2006    45         7   59061    79187    99313     112842787
2006    46         0

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish Wildlife Service
California, USA
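On the na.approx versus TREND part of the question, a small sketch may help show the difference. It uses only base R and weeks 15 to 19 of TotalPD from the table above; approx() is used as a stand-in because na.approx (from the zoo package) does linear interpolation between the neighbouring observed points, whereas Excel's TREND fits a least-squares straight line, for which lm() plus predict() is the nearest base R equivalent. The variable names are made up for the example.

```r
# Weeks 15-19 from the posted data; weeks 17 and 18 were not sampled
week    <- c(15, 16, 17, 18, 19)
totalPD <- c(124, 903, NA, NA, 0)
ok <- !is.na(totalPD)

# Linear interpolation between the nearest sampled weeks
# (the same idea as zoo::na.approx)
interp <- approx(x = week[ok], y = totalPD[ok], xout = week)$y

# Least-squares trend line through all sampled weeks
# (the same idea as Excel's TREND)
fit   <- lm(totalPD ~ week)          # rows with NA are dropped by default
trend <- predict(fit, newdata = data.frame(week = week))

# Keep observed values, fill the gaps from the interpolation
filled <- ifelse(ok, totalPD, interp)

data.frame(week, totalPD, interp, trend = round(trend, 1), filled)
```

Interpolation passes exactly through the observed weeks, while the trend line generally does not, so the two methods can give quite different fill-in values for the same gap.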
Re: [R] Cluster on both categorical and numerical data
Okay, when you cluster information, you can have two kinds of input:

- raw data, which the algorithm converts into a matrix and then processes;
- a pre-processed matrix, which you create yourself to input into a package.

Essentially, packages will have a default assumption about the data you are using or the type of matrix you are using. These matrices are often defined, in simplistic terms, as either a similarity or a dissimilarity matrix. Think of a correlation matrix as an example of a matrix which represents similarity.

I think you will need to create a dissimilarity matrix. Think of something like a correlation matrix, which measures similarity in its off-diagonal entries; a dissimilarity matrix is the opposite of this (technically not correct, but you get the idea, I hope).

I use Clustan Graphics for all my clustering needs, and Gower's coefficient is the input I use when I have mixed variables. If you pre-process (create a dissimilarity matrix) using Gower's coefficient and then specify this as the input, everything should work fine. Once you get this sorted, it should all be straightforward.

PD

----- Original Message -----
From: Chua Siang Li [EMAIL PROTECTED]
To: r-help@r-project.org
Sent: Wednesday, June 18, 2008 7:46 PM
Subject: [R] Cluster on both categorical and numerical data

Hello there. Is there any function in R that can do cluster on a set of data that has both categorical and numerical variables? Thanks.

siangli
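The pre-processing step described above can be sketched in R roughly as follows. This is an illustration under assumptions, not what Clustan Graphics does internally: it uses daisy() from the cluster package (a recommended package shipped with R) to compute Gower dissimilarities, and the small data frame `dat` with its column names is invented for the example.

```r
library(cluster)  # assumption: the recommended package cluster is installed

# Hypothetical mixed data: two numeric and two categorical variables
dat <- data.frame(
  age    = c(23, 45, 31, 52, 38, 27, 60, 41),
  income = c(28, 75, 40, 90, 55, 30, 82, 61),
  region = factor(c("north", "south", "north", "south",
                    "east", "north", "south", "east")),
  owner  = factor(c("no", "yes", "no", "yes", "yes", "no", "yes", "no"))
)

# Step 1: pre-process into a dissimilarity matrix with Gower's coefficient
d <- daisy(dat, metric = "gower")

# Step 2: hand the dissimilarities to a clustering routine
hc     <- hclust(as.dist(d), method = "average")
groups <- cutree(hc, k = 2)

# The same matrix can be passed to a partitioning method
pm <- pam(d, k = 2)

table(hclust = groups, pam = pm$clustering)
```

The point of the design is that the clustering routine never sees the raw mixed variables, only the dissimilarities, so any method that accepts a dissimilarity matrix can be used.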
Re: [R] Fractional Factorial Design
AlgDesign will do the trick.

Regards
Paul

----- Original Message -----
From: Caio Azevedo [EMAIL PROTECTED]
To: R - discussion list [EMAIL PROTECTED]
Sent: Monday, April 28, 2008 11:11 PM
Subject: [R] Fractional Factorial Design

Hi all,

Does anybody know if it is possible to build a fractional factorial design in R? That is, suppose that we want to design an experiment with 3 factors with 2, 3 and 3 levels, respectively. However, we want to consider, let's say, only 6 of all the possible level combinations. Can R design such an experiment?

Thanks in advance,
Caio
Re: [R] Fractional Factorial Design
Caio,

Using AlgDesign, the code below produces a full factorial 2*3*3 design (assigned to dat so that it can be used as the candidate list in the next step):

  dat <- gen.factorial(c(2,3,3))
  dat

     X1 X2 X3
  1  -1 -1 -1
  2   1 -1 -1
  3  -1  0 -1
  4   1  0 -1
  5  -1  1 -1
  6   1  1 -1
  7  -1 -1  0
  8   1 -1  0
  9  -1  0  0
  10  1  0  0
  11 -1  1  0
  12  1  1  0
  13 -1 -1  1
  14  1 -1  1
  15 -1  0  1
  16  1  0  1
  17 -1  1  1
  18  1  1  1

Using ...

  optFederov(~.,dat,6)

here is a design that is produced with six trials:

     X1 X2 X3
  3   1 -1 -1
  4  -1  1 -1
  13 -1 -1  1
  15  1 -1  1
  16 -1  1  1
  18  1  1  1

This does the job with good efficiency. I would be interested to know what your objection to this is, S.

Regards
Paul

----- Original Message -----
From: Caio Azevedo [EMAIL PROTECTED]
To: R - discussion list [EMAIL PROTECTED]
Sent: Monday, April 28, 2008 11:11 PM
Subject: [R] Fractional Factorial Design

Hi all,

Does anybody know if it is possible to build a fractional factorial design in R? That is, suppose that we want to design an experiment with 3 factors with 2, 3 and 3 levels, respectively. However, we want to consider, let's say, only 6 of all the possible level combinations. Can R design such an experiment?

Thanks in advance,
Re: [R] Fractional Factorial Design
Excellent points, Steve. I am ever expanding my understanding in the area, and power is an interesting one. I do a lot of choice modelling myself, and I am confounded (grin) by the optimal way to develop designs with conditional levels (deliberate confounds) etc. Thanks for that.

Regards
P

----- Original Message -----
From: S Ellison [EMAIL PROTECTED]
To: Caio Azevedo [EMAIL PROTECTED]; paulandpen [EMAIL PROTECTED]; R - discussion list [EMAIL PROTECTED]
Sent: Tuesday, April 29, 2008 1:48 AM
Subject: Re: [R] Fractional Factorial Design

Paul;

You asked

> using ... optFederov(~.,dat,6) ... does the job with good efficiency.
> I would be interested to know what your objection to this is S

I have no issue with AlgDesign in principle, but the question was specifically about _fractional_ factorials, so I answered that.

As to which is best: well, first pick your definition of 'best'. Both can improve drastically on full factorials.

For me, the advantage of a fractional factorial is that it retains balance and, more importantly from a design perspective, I get to choose which effects are confounded and can arrange matters so that some effects are guaranteed unconfounded. The deterministic nature of the selection also makes it a bit easier to build power considerations into the process if you're so minded. The price of that is that the number of observations is typically larger than the smallest algorithmic design that might do a broadly similar job, though never as large as a full factorial.

As I see it, the main advantage of algorithmic design is that you get to pick the size of the experiment. A second plus is that you can handle arbitrarily constrained designs much more easily, which is a feature I've sometimes found important. The disadvantage is that you may incur bias in some of the effect estimates, and because the selection process to fit an arbitrary experiment size typically involves some random selection from a candidate list, you don't necessarily get to choose which effects are biased. I guess you will also have a more interesting job deciding how many observations you need for a given power, if that's relevant.

Steve E.
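Steve's point about choosing which effects are confounded can be illustrated with a small base R sketch. The example is an assumption chosen for simplicity (four two-level factors A to D, not the 2 x 3 x 3 problem from the thread): a half fraction 2^(4-1) is built from the full design in A, B and C by setting the generator D = ABC, so the aliasing pattern is fixed in advance rather than left to a search algorithm.

```r
# Full factorial in three two-level factors, coded -1/+1
base <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

# Half fraction of a 2^4 design: the generator D = ABC is chosen deliberately
design <- transform(base, D = A * B * C)
design

# Balance and orthogonality of the main effects: X'X is 8 times the identity
X <- as.matrix(design)
crossprod(X)

# The price of the fraction: two-factor interactions are aliased in known pairs,
# for example AB with CD (because D = ABC implies CD = AB)
ab_aliased_with_cd <- all(design$A * design$B == design$C * design$D)
ab_aliased_with_cd
```

All four main effects stay clear of each other here, and which interactions are sacrificed is decided by the choice of generator, which is the control Steve describes.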
Re: [R] optFederov/AlgDesign - help avail?
I would suggest reading the document below:
http://support.sas.com/techsup/technote/ts722d.pdf

optFederov is the go for you; you are correct. I don't know of anybody who has come up with design principles in choice modelling that apply specifically to logit and probit models etc. We all assume that what is good for linear models in principle is also good for choice models, even when the utilities we are estimating are non-linear.

I think it is important to recognise that D-efficiency is the best method of evaluation for choice designs, and you should be aiming for an orthogonal array in your design, as suggested in this article.

If you are using traditional logit and MNL, please make sure you allow enough choices and enough cases to complete the choices, and also that your algorithms are geared towards repeated choices if you are using a stated preference approach where people are answering a number of choice alternatives.

Bob Wheeler may be able to comment further, but I think you are on the right track.

Thanks
Paul

----- Original Message -----
From: zubin [EMAIL PROTECTED]
To: r-help@r-project.org
Sent: Monday, April 21, 2008 9:59 PM
Subject: [R] optFederov/AlgDesign - help avail?

Hello, we need to generate optimal (fractional) designs for discrete choice applications, where we will be using logistic regression or multinomial logit as the modeling technique. It looks like optFederov, in the AlgDesign package, may work, but we are not sure if this algorithm works when the variable of interest is binary or nominal. Is anyone an expert in this area and interested in consulting with us on this topic (if so, email me and we can arrange)? Or can anyone confirm or deny that optFederov can work in the discrete case?

thx!
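To make the D-efficiency criterion mentioned above a little more concrete, here is a base R sketch. It assumes the commonly used linear-model definition for -1/+1 coded designs, 100 * det(X'X)^(1/p) / N, where X is the model matrix with p columns and N runs; the helper d_efficiency() and the three small designs are hypothetical, and the calculation deliberately ignores the non-linear nature of choice models, in line with the working assumption described in the message.

```r
# D-efficiency (in percent) of a coded design for a given linear model
# Assumption: 100 * det(X'X)^(1/p) / N, with p parameters and N runs
d_efficiency <- function(design, formula = ~ .) {
  X <- model.matrix(formula, data = design)
  100 * det(crossprod(X))^(1 / ncol(X)) / nrow(X)
}

# Candidate set: full 2^3 factorial, coded -1/+1
full <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

# An orthogonal half fraction (ABC = +1) and an arbitrary five-run subset
half     <- full[full$A * full$B * full$C == 1, ]
lopsided <- full[1:5, ]

d_efficiency(full)      # orthogonal and balanced
d_efficiency(half)      # still orthogonal for the main-effects model
d_efficiency(lopsided)  # unbalanced, so efficiency drops
```

Orthogonal, balanced designs reach the maximum of 100 under this definition, which is why an orthogonal array is a natural target when one exists for the required number of runs.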
Re: [R] optFederov/AlgDesign - help avail?
Oh yeah, and your design needs to account for main effects and interactions if you intend to model them, so make sure to program that into AlgDesign as well.

----- Original Message -----
From: zubin [EMAIL PROTECTED]
To: r-help@r-project.org
Sent: Monday, April 21, 2008 9:59 PM
Subject: [R] optFederov/AlgDesign - help avail?

Hello, we need to generate optimal (fractional) designs for discrete choice applications, where we will be using logistic regression or multinomial logit as the modeling technique. It looks like optFederov, in the AlgDesign package, may work, but we are not sure if this algorithm works when the variable of interest is binary or nominal. Is anyone an expert in this area and interested in consulting with us on this topic (if so, email me and we can arrange)? Or can anyone confirm or deny that optFederov can work in the discrete case?

thx!
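Why the interactions matter for the design size can be seen by counting model parameters. The sketch below is base R only and uses a hypothetical candidate set with 2, 3 and 3 levels (the factor names X1 to X3 are made up); the same kind of model formula is what a design routine would be given, and the number of trials has to be at least the number of columns in the model matrix.

```r
# Hypothetical candidate set: three factors with 2, 3 and 3 levels
cand <- expand.grid(X1 = factor(1:2), X2 = factor(1:3), X3 = factor(1:3))

# Parameters needed for main effects only
p_main <- ncol(model.matrix(~ X1 + X2 + X3, data = cand))

# Parameters needed once all two-way interactions are included
p_int <- ncol(model.matrix(~ (X1 + X2 + X3)^2, data = cand))

c(candidates = nrow(cand), main_effects = p_main, with_interactions = p_int)
```

With these factors the main-effects model needs 6 parameters and the two-way interaction model needs 14, so a design sized for main effects alone cannot estimate the interactions afterwards.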
Re: [R] Conjoint Analysis in R??
Faisal,

Can you elaborate further on your conjoint design? There is:

- bayesm, which offers a hierarchical Bayes approach to analysing choice data;
- mlogit, available through Zelig (see below):
  http://gking.harvard.edu/zelig/docs/index.html
- MNP, as a standalone package for the probit model.

Thanks
Paul

----- Original Message -----
From: faisal afzal siddiqui [EMAIL PROTECTED]
To: R Help [EMAIL PROTECTED]
Sent: Thursday, December 06, 2007 6:00 PM
Subject: [R] Conjoint Analysis in R??

Pls advise how I can use R in conjoint analysis??

regds
Faisal Afzal Siddiqui
Karachi, Pakistan
Re: [R] cluster analysis
Amina Shahzadi,

The eternal question. What I do is generate a range of solutions, profile them on the variables used to cluster the data and on any other information I have, and then present the solutions to a group of others to assess meaningfulness, debate the solutions and attempt to reach a consensus.

In many cases, e.g. for algorithms based on k-means and hierarchical clustering, you are using an exploratory technique and there are no right or wrong answers.

Having used cluster analysis for years, here are some things to look at, because there is no way to answer this statistically (unless you are using a latent class type model with goodness-of-fit measures):

1. What is the minimum size you believe to be robust for a single cluster (e.g. n=30, n=100)? The larger the number of clusters you generate relative to the sample size, the smaller your clusters will be, so you must define a cut-off below which you are not prepared to go.

2. If you run the clusters through different algorithms, how comparable are the results (cluster stability)?

3. What differences emerge between the 2, 3, 4 cluster solutions etc.? As you use larger numbers of clusters, does this still produce a meaningful result, in that the clusters are distinct and unique, or are you just cutting larger clusters into smaller ones without generating unique and usable information? Examine the clusters via a series of cross-tabs as you go from the 2 to the 3 to the 4 cluster solution: what happens to the members within clusters, and are they distributed differently?

Thanks
Paul

----- Original Message -----
From: amna khan [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Friday, November 02, 2007 2:19 AM
Subject: [R] cluster analysis

Hi Sir
How can we select the optimum number of clusters?
Best Regards

--
AMINA SHAHZADI
Department of Statistics
GC University Lahore, Pakistan.
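The three checks listed above can be sketched in base R along the following lines. The data are simulated and every object name is invented for the example; the sketch only shows how to produce the tables one would then inspect and discuss, and it does not pick a "best" number of clusters.

```r
set.seed(42)

# Hypothetical data: 150 cases, two variables, three loose groups
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2),
           matrix(rnorm(100, mean = 8), ncol = 2))

# Generate a range of solutions (k = 2 to 5) with k-means
ks   <- 2:5
fits <- lapply(ks, function(k) kmeans(x, centers = k, nstart = 20))
names(fits) <- ks

# Check 1: smallest cluster in each solution, to compare with a minimum size
min_sizes <- sapply(fits, function(f) min(f$size))
min_sizes

# Check 2: stability across algorithms (k-means versus Ward hierarchical, k = 3)
hc3 <- cutree(hclust(dist(x), method = "ward.D2"), k = 3)
stability <- table(kmeans = fits[["3"]]$cluster, hclust = hc3)
stability

# Check 3: cross-tab of memberships when moving from 3 to 4 clusters
split_3_to_4 <- table(k3 = fits[["3"]]$cluster, k4 = fits[["4"]]$cluster)
split_3_to_4
```

A cross-tab in which one row simply splits into two columns suggests that the extra cluster is only a subdivision of an existing one, which is the situation described in the third point.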
[R] query on random.seed not found error in code
Hi,

I am trying to use AlgDesign and am partially successful. Two lines of code are taken from the help file.

1. Line 1 (below) works fine:

  dat <- gen.factorial(levels=3, nVars=3, varNames=c("A","B","C"))

2. Line 2 (below) does not work:

  desD <- optFederov(~quad(A,B,C), dat, nTrials=14, eval=TRUE)

Here is the result I get:

  Error in optFederov(~quad(A, B, C), dat, nTrials = 14, eval = TRUE) :
    object .Random.seed not found

What do I need to do to introduce this object into the process?

Thanks in advance
Paul

----- Original Message -----
From: marcg [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, September 27, 2007 5:21 PM
Subject: [R] different colors for two wireframes in same plot

Hello R,

According to:

  g <- expand.grid(x = 1:10, y = 5:15, gr = 1:2)
  g$z <- log((g$x^g$g + g$y^2) * g$gr)
  wireframe(z ~ x * y, data = g, groups = gr,
            scales = list(arrows = FALSE),
            drape = TRUE, colorkey = TRUE,
            screen = list(z = 30, x = -60))

I have two wireframes in one plot. How could I change the color of the top one to transparent (or show only the grid)? I want to give insight into the lower layer. Could one make an if-statement like (if gr==1 do drape=F or color=none; if gr==2 do drape=T, colorkey=T)?

Thanks for your help
Marc
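One possible explanation for the error above, offered as an assumption rather than a confirmed diagnosis: in a fresh R session the object .Random.seed does not exist in the global environment until the random number generator has been used or seeded, so code that tries to read it first can fail. The base R sketch below only demonstrates that behaviour; the AlgDesign call is left commented out because it needs the package and the `dat` object from the message.

```r
# Simulate a fresh session: remove any existing seed from the workspace
if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
  rm(".Random.seed", envir = globalenv())
}
seed_before <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)

# Seeding (or drawing any random number, e.g. runif(1)) creates the object
set.seed(123)
seed_after <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)

c(before = seed_before, after = seed_after)

# With the seed in place, the call from the help file could then be retried:
# desD <- optFederov(~quad(A,B,C), dat, nTrials = 14, eval = TRUE)
```

Calling set.seed() first has the side benefit of making an algorithmic design reproducible from run to run.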