Re: [R] processing matrix equation derived from rows of two matrices
Hi,

Maybe this helps:

tb[1,] %*% (((val - rep(meansb79[1,], 5))^2)/6)
#         [,1]
# [1,] 1.47619

tryvarb <- c(1,2,3,4,4,4,4)
var(tryvarb)
# [1] 1.47619

tb[2,] %*% (((val - rep(meansb79[2,], 5))^2)/6)
#          [,1]
# [1,] 1.904762

sapply(seq_len(nrow(tb)), function(i) tb[i,] %*% (((val - rep(meansb79[i,], 5))^2/6)))
#  [1] 1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143 1.9047619
#  [8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619 2.2857143 2.2857143
# [15] 2.2857143 1.9523810 1.9523810 2.2857143 2.2857143 2.2857143 2.2857143
# [22] 1.6190476 1.6190476 1.9047619 1.9047619 1.9047619 1.9047619 1.8095238
# [29] 0.9047619 0.9047619 1.1428571 1.1428571 1.1428571 1.1428571 1.0000000
# [36] 0.4761905

# or
mat1 <- matrix(((val - rep(meansb79, each = 5))^2/6), ncol = 5, byrow = TRUE)
diag(t(apply(tb, 1, function(x) x)) %*% apply(mat1, 1, function(x) x))
# (same 36 values as above)

# or
diag(apply(tb, 1, function(x) x %*% t(mat1)))
# (same 36 values as above)

A.K.

> I have a matrix (tb) that represents all samples of size n=7 from a group of N=9, with scores c(1,2,3,4,4,4,4,5,5). I want to calculate the variance for every sample. Here is my code. The bottom shows the matrix equations and an attempt to process them for each row. I got the strategies from reading r-help, but neither works.
> (I do not understand the syntax well enough.) Any suggestions? (I need to do many additional matrices in the same way.) Thanks, Jan
>
> require(combinat)
> n <- 7
> base79 <- combn(c(1,2,3,4,4,4,4,5,5), n, tabulate, nbins = 5)
> # find all samples, n=7 (gives columns of counts for the 7 cases in a sample, like 1 1 1 4 0)
> tb <- t(base79)
> val <- c(1,2,3,4,5)  # values on the scale
> meansb79 <- t(base79) %*% (val/7)
>
> tb[1,] %*% (((val - rep(meansb79[1,], 5))^2)/6)
> # computes the sample variance for the first sample
>
> # check
> tryvarb <- c(1,2,3,4,4,4,4)
> var(tryvarb)
>
> # Now I try to get the variance for each sample (row) in tb, but neither
> # of the following attempts works.
> trybase <- apply(tb, 1, function(i) t(base79[,i]) %*% (((val - rep(meansb79[i,], 5))^2)/6))
>
> # or
> domatrix.f <- function(A, B) {
>   a <- nrow(A); b <- nrow(B)
>   C <- matrix(NA, nrow = a, ncol = b)
>   for (i in 1:a) for (j in 1:b)
>     C[i,j] <- t(A[,i]) %*% (((val - rep(B[i,], 5))^2)/6)
> }
> domatrix.f(tb, meansb79)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
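For completeness, here is a self-contained sketch of one way to get all 36 sample variances with base R only. The object names follow Jan's post, but the setup is rebuilt from the description above, so treat this as an illustration under those assumptions rather than a tested answer:

```r
val <- c(1, 2, 3, 4, 5)                  # values on the scale
scores <- c(1, 2, 3, 4, 4, 4, 4, 5, 5)   # the N = 9 scores
n <- 7

# One column per sample of size 7; each column holds the counts of the
# five scale values in that sample (base combn() with a FUN is enough).
base79 <- combn(scores, n, tabulate, nbins = 5)
tb <- t(base79)                          # one row per sample
meansb79 <- tb %*% (val / n)             # mean of each sample

# Sample variance of row i: counts times squared deviations, over n - 1.
vars <- sapply(seq_len(nrow(tb)), function(i)
  as.numeric(tb[i, ] %*% ((val - meansb79[i])^2)) / (n - 1))

# Sanity check against var() on the first sample, c(1,2,3,4,4,4,4).
all.equal(vars[1], var(c(1, 2, 3, 4, 4, 4, 4)))
```

The only real difference from the attempts in the post is that the mean is indexed as the scalar `meansb79[i]` inside `sapply()`, so no `rep()` bookkeeping is needed.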
[R] Creating %d/%m/%Y %H:%M:%S format from separate date and time columns
Hi R forum,

Each row of my data (below) shows a new contact event between animals. In order to ultimately look at the patterns of intervals between contacts, I need to calculate a contact end time. The contact starts at the date and time shown in V4 and V5, and lasts for the duration shown IN SECONDS in V6:

data2 <- read.csv(file = file.choose(), header = F, sep = " ")
head(data2)
#   V1  V2  V3         V4       V5 V6
# 1  3 PO4 CO1 2011-04-29 07:27:21 28
# 2  3 PO4 CO1 2011-04-24 05:57:39 20
# 3  3 PO4 CO1 2011-04-14 10:29:49  4
# 4  3 PO4 CO1 2011-04-16 07:27:31 63
# 5  3 PO4 CO1 2011-04-18 15:46:20  1
# 6  3 PO4 CO1 2011-04-18 15:45:57  1

To start with I have tried to make the start date and time into one new column:

startt <- strptime(paste(data2$V4, data2$V5), "%d/%m/%Y %H:%M:%S")

This executes without any warnings, but returns a full column of NA values. It would be great to fix this, and then to know how to correctly add column V6 as seconds to the resulting column.

The problem is further exacerbated by an error with dput() with this data. It's a large dataset of over 9000 rows, and when I call dput(head(data2, 50)) it returns output for all the data (i.e. not just the first 50 rows). This of course does not fit on the workstation screen, and therefore I cannot find out what class it has assigned to any of the data. The times appear sorted, suggesting they are being classed as a factor? Sorry I can't provide dput() data!

Thanks,
Cat
Re: [R] Creating %d/%m/%Y %H:%M:%S format from separate date and time columns
Hello, Cat,

see inline below. Hth -- Gerrit

On Fri, 12 Apr 2013, Cat Cowie wrote:

> To start with I have tried to make the start date and time into one new column:
>
> startt <- strptime(paste(data2$V4, data2$V5), "%d/%m/%Y %H:%M:%S")

Shouldn't you use "-" instead of "/", as it is used in data2$V4?

> This executes without any warnings, but returns a full column of NA values.
Re: [R] Assessing the fit of a nonlinear model to a new dataset
Thank you all for your useful suggestions. Jean, I had already tried to use the maxiter command in that way, but it simply told me that the model had not converged. Based on Profs Ripley and Nash's comments, I have opted for an alternative approach: I am creating a distribution of plausible models and comparing my sum of squared residuals to those arising from that distribution. This seems to work. Thanks again.

Cheers,
Rebecca

-
Rebecca Lester
Lecturer, Freshwater Ecology
School of Life & Environmental Sciences
Deakin University
PO Box 423, Warrnambool, 3280
Ph: +61 3 5563 3330 Fax: +61 3 5563 3462

-----Original Message-----
From: Prof J C Nash (U30A) [mailto:nas...@uottawa.ca]
Sent: Saturday, 6 April 2013 7:42 AM
To: r-help@r-project.org; Rebecca Lester
Subject: Re: Assessing the fit of a nonlinear model to a new dataset

Given nls has a lot of C code (and is pretty complicated), I doubt you'll find much joy doing that. nlxb from my nlmrt package is all in R, but you'll need to do quite a bit of work at each stage. I don't form the J'J matrix; I do a Marquardt approximation by adding appropriate rows to the J matrix and then do a QR decomposition on that. In any event, the Hessian (to which J'J is only a rather poor approximation) is what you want, and it may not be positive definite at the iterates, so you have infinite standard errors. Well, if the curvature were 0, they'd be infinite. Since the curvature is negative, maybe the SEs are more than infinite, if that has any meaning. I have one problem for which I generated 1000 starting points and 75% had the Hessian indefinite. That is a simple logistic nonlinear regression, albeit with nasty data.
JN

Message: 90
Date: Fri, 5 Apr 2013 05:06:57 +
From: Rebecca Lester rebecca.les...@deakin.edu.au
To: r-help@r-project.org
Subject: [R] Assessing the fit of a nonlinear model to a new dataset
Message-ID: 5a72faa65583bc45a816a698a960e92788612...@mbox-f-3.du.deakin.edu.au
Content-Type: text/plain

Hi all,

I am attempting to apply a nonlinear model developed using nls to a new dataset and assess the fit of that model. At the moment, I am using the fitted model from my fit dataset as the starting point for an nls fit for my test dataset (see below). I would like to be able to view the t-statistic and p-values for each of the iterations using the trace function, but have not yet worked out how to do this. Any other suggestions are also welcome.

Many thanks,
Rebecca

model.wa <- nls(y ~ A*(x^B), start = list(A = 107614, B = -0.415))  # create nls() power model for WA data
summary(model.wa)  # model summary

# Formula: y ~ A * (x^B)
# Parameters:
#     Estimate Std. Error t value Pr(>|t|)
# A  7.644e+04  1.240e+04   6.165 4.08e-06 ***
# B -3.111e-01  4.618e-02  -6.736 1.15e-06 ***
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
# Residual standard error: 5605 on 21 degrees of freedom
# Number of iterations to convergence: 6
# Achieved convergence tolerance: 7.184e-06
# (6 observations deleted due to missingness)

model.vic <- nls(y.vic ~ A*(x.vic^B), start = list(A = 7.644e+04, B = -3.111e-01), trace = TRUE)
# 3430193778 : 76440.        -0.3111
# 2634092902 : 48251.9235397 -0.2552481
# 2614516166 : 27912.1921354 -0.1772322
# 2521588892 : 32718.3764594 -0.1862611
# 2521233646 : 32476.4536126 -0.1836836
# 2521230904 : 32553.0767231 -0.1841362
# 2521230824 : 32540.063480  -0.184059
# 2521230822 : 32542.2970040 -0.1840721

Important Notice: The contents of this email are intended solely for the named addressee and are confidential; any unauthorised use, reproduction or storage of the contents is expressly prohibited.
If you have received this email in error, please delete it and any attachments immediately and advise the sender by return email or telephone. Deakin University does not warrant that this email and any attachments are error or virus free.
Re: [R] Creating %d/%m/%Y %H:%M:%S format from separate date and time columns
Hi Gerrit,

Thanks for your quick reply - this, in combination with reversing the date format as it is shown in my data, worked perfectly:

startt <- strptime(paste(data2$V4, data2$V5), "%Y-%m-%d %H:%M:%S")

Apologies for the silly question it seems, but you've saved me a lot of time. Always learning!

Thanks,
Cat

On 12 April 2013 15:00, Gerrit Eichner gerrit.eich...@math.uni-giessen.de wrote:

> Shouldn't you use "-" instead of "/", as it is used in data2$V4?
Re: [R] Creating %d/%m/%Y %H:%M:%S format from separate date and time columns
Hi,

The following should work:

x <- paste(paste(data2$V4, data2$V5), data2$V6, sep = '.')
startt <- strptime(x, "%Y-%m-%d %H:%M:%OS")

Regards,
Pascal

2013/4/12 Cat Cowie cat.e.co...@gmail.com

> To start with I have tried to make the start date and time into one new column:
>
> startt <- strptime(paste(data2$V4, data2$V5), "%d/%m/%Y %H:%M:%S")
>
> This executes without any warnings, but returns a full column of NA values. It would be great to fix this, and then to know how to correctly add column V6 as seconds to the resulting column.
Re: [R] Ordination Plotting: Warning: Species scores not available
Micah Bennett micahgbennett at yahoo.com writes:

> Hi, I am working with a species-by-trait .csv file (columns = traits, rows = species) and get the following warning message when trying to plot results of both metaMDS and pcoa:
>
> Warning message:
> In ordiplot(x, choices = choices, type = type, display = display, :
>   Species scores not available
>
> I am using a Gower's transformation in both procedures within the metaMDS or pcoa functions, and I know I am not using something already made into a distance matrix. There is missing trait data for some species, but there must be a way to plot the trait locations/vectors. There has been at least one posting about this in years past but it did not receive a satisfactory response. Any help would be much appreciated. -Micah

Dear Micah Bennett,

I don't know why you say that there "must be" a way: there is no obligation to have this feature, though someone could write code to do it. However, as far as I know, this cannot be done directly. Both metaMDS() and PCoA (which PCoA? Which function, which package?) work with dissimilarities, and dissimilarities have no information about the original column variables. If you calculate dissimilarities within metaMDS, the function knows the original column values, and may be able to add scores for those. However, it is not able to add those scores if you have missing values. Actually, I was not able to run metaMDS with missing values unless I suppressed calculating column scores (wascores = FALSE). So there is no direct way of getting those. I don't know how you calculated PCoA, but the standard R function cmdscale (or its variant wcmdscale in vegan) only accepts dissimilarities as input and is unable to add scores for the original cover values. Perhaps there is a function called pcoa() somewhere (labdsv?) which can calculate the dissimilarities internally and is able to add column scores, but consult the appropriate documentation in that package.
You may be able to calculate column scores after the analysis. The default scores in vegan::metaMDS() are wascores(), which do not handle missing values -- no hope there. However, if you want fitted vectors or class centroids, you can use envfit(..., na.rm = TRUE), which will use the complete.cases() subset of the ordination scores for fitting the vectors or class centroids. If you want NA handling in wascores, you had better submit new code to vegan and we will incorporate it in the next release.

Best wishes,
Jari Oksanen
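Jari's envfit() suggestion can be sketched with the example data shipped with vegan, using dune in place of the trait matrix; the NA introduced below is only there to mimic Micah's missing values:

```r
library(vegan)
data(dune, dune.env)

# Ordinate the community data (Bray-Curtis is the metaMDS default).
ord <- metaMDS(dune, trace = FALSE)

# Mimic missing data in the variables to be fitted.
env <- dune.env
env$A1[1] <- NA

# envfit() with na.rm = TRUE uses only the complete cases of the
# ordination scores to fit vectors (numeric) and centroids (factors).
fit <- envfit(ord, env, na.rm = TRUE, permutations = 99)
fit

# The fitted vectors/centroids can then be drawn over the ordination:
plot(ord, display = "sites")
plot(fit)
```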
Re: [R] Creating %d/%m/%Y %H:%M:%S format from separate date and time columns
Avoid using F; spell out FALSE.

Use an appropriate TZ, e.g.

Sys.setenv(TZ = "Etc/GMT+5")

before converting, if your time is local standard time year round.

Use difftime to add to time values for consistent results:

endt <- startt + as.difftime(data2$V6, units = "secs")

Use the str function to explore your data.
---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Cat Cowie cat.e.co...@gmail.com wrote:

> Thanks for your quick reply - this, in combination with reversing the date as it is shown in my data, worked perfectly.
>
> startt <- strptime(paste(data2$V4, data2$V5), "%Y-%m-%d %H:%M:%S")
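Putting the thread's advice together, a minimal end-to-end sketch; the two rows of data are reconstructed from Cat's head(data2) output, and the timezone is only an example, per Jeff's suggestion:

```r
# Reconstruct a couple of rows of the posted data (assumed layout).
data2 <- data.frame(V4 = c("2011-04-29", "2011-04-24"),
                    V5 = c("07:27:21", "05:57:39"),
                    V6 = c(28L, 20L),
                    stringsAsFactors = FALSE)

Sys.setenv(TZ = "Etc/GMT+5")  # pin an (example) standard-time zone first

# The format string has to match the data: ISO dates use "-", not "/".
startt <- strptime(paste(data2$V4, data2$V5), "%Y-%m-%d %H:%M:%S")

# End of contact = start + duration in seconds, via difftime.
endt <- startt + as.difftime(data2$V6, units = "secs")
format(endt, "%Y-%m-%d %H:%M:%S")
# first element: "2011-04-29 07:27:49" (28 s after the start)
```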
Re: [R] bnlearn: how to compute boot strength with mmhc and a blacklist
Dear Leonore,

On Wed, Apr 10, 2013 at 5:51 PM, Leonore Wigger leonore.wig...@unil.ch wrote:

> Question: I have specified a blacklist. I would have expected this to completely disallow the arcs on the blacklist. But the result shows that some of the blacklisted arcs have a strength > 0 (rows 7, 8, 10, 11). It seems that only the arc that was blacklisted in both directions was actually banned (x1-x2, in rows 9 and 12). What is the reason for this? Is there a way to completely disallow all blacklisted arcs, such that their strength is 0.0? Or is there a compelling reason why that should not be done?

Because by default boot.strength() runs with cpdag = TRUE. This means that reversible arcs can have positive strength in both directions. You should set cpdag = FALSE to get the result you are expecting. In that case the probabilities of the arc directions should be taken with a grain of salt, as they can be influenced by many things (optimized = TRUE/FALSE, order of the variables in the data set) unless you are doing causal modelling.

> This code makes 50 different networks from the same data, then uses them as input for custom.strength. The networks are constructed using the algorithm hc. A different network is produced every time hc is invoked because a random starting network is supplied to the parameter start. I would like to do the same thing, but use mmhc instead of hc. However, in my hands, the networks that are constructed by mmhc are all identical, and I am not sure how to introduce a random element into the construction. Question: Which parameters do I need to give to mmhc in order to obtain a different network every time it is run on the same data set?

This is not surprising, because mmhc() does not have a start argument, so it's starting from the same network over and over. There is no way to provide a random seed to mmhc(), so the only way to perturb it is through bootstrap.

Marco

--
Marco Scutari, Ph.D.
Research Associate, Genetics Institute (UGI)
University College London (UCL), United Kingdom
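Marco's two points combined in a hedged sketch, using bnlearn's bundled learning.test data; the blacklist itself is invented for illustration, and the exact strengths depend on the bootstrap replicates:

```r
library(bnlearn)
data(learning.test)           # discrete toy data shipped with bnlearn

bl <- data.frame(from = "A", to = "B")   # ban the arc A -> B only

# The bootstrap resampling is what perturbs mmhc (which has no start
# argument); cpdag = FALSE makes directed arc strengths honour the
# blacklist exactly, as discussed above.
set.seed(1)
st <- boot.strength(learning.test, R = 20, algorithm = "mmhc",
                    algorithm.args = list(blacklist = bl),
                    cpdag = FALSE)
subset(st, from == "A" & to == "B")   # strength should come out as 0
```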
[R] pairwise ANOSIM test
Hello,

I am studying ANOSIM (analysis of similarities) using the vegan package in R. I am wondering how to do a pairwise ANOSIM test among three groups. Can anybody please explain how to do this?

Thanks.
[R] Error loading workspaces after upgrade
Dear Members,

I upgraded R from 2.15.3 to 3.0.0 (binary) on my WinXP and now I can't load my workspaces. The error message is:

Error: requested primitive type is not consistent with cached value
During startup - Warning message:
unable to restore saved data in .RData

Any idea how to fix that?

Thanks,
Tamas
Re: [R] Error loading workspaces after upgrade
Hm. Sometimes a workspace cannot be loaded when some packages which were used for creating objects in the workspace are missing, but in that case the error message is different. So you can either delete .RData, which will obviously **delete** all objects, or you need to find which package is missing from your R 3.0.0 installation. If you want to keep only some objects from your workspace, go back to 2.15.3 and save them explicitly; see ?save.

Regards,
Petr

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Tamas Barjak
Sent: Friday, April 12, 2013 1:14 PM
To: r-help@r-project.org
Subject: [R] Error loading workspaces after upgrade

> I upgraded R from 2.15.3 to 3.0.0 (binary) on my WinXP and now I can't load my workspaces. The error message is:
>
> Error: requested primitive type is not consistent with cached value
> During startup - Warning message:
> unable to restore saved data in .RData
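Petr's last suggestion, sketched with made-up object names:

```r
# Back in R 2.15.3: save only the objects worth keeping (see ?save).
x <- 1:10
fit <- lm(dist ~ speed, data = cars)
save(x, fit, file = "keep.RData")

# Then, in R 3.0.0: restore just those objects instead of the whole
# (unreadable) workspace.
rm(x, fit)
load("keep.RData")
ls()   # "fit" and "x" are back
```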
Re: [R] bnlearn: how to compute boot strength with mmhc and a blacklist
Thanks a lot, Marco. This works. And your remark regarding less reliable arc strengths with cpdag = FALSE was very helpful.

Leonore

On 12/04/2013 11:08, Marco Scutari wrote:

> Because by default boot.strength() runs with cpdag = TRUE. This means that reversible arcs can have positive strength in both directions. You should set cpdag = FALSE to get the result you are expecting.
--
Leonore Wigger
University of Lausanne
Genomic Technologies Facility
Génopode Building
1015 Lausanne
Switzerland
++41 (0)21 692 41 16
Re: [R] A strange behaviour in the graphical function curve
Berend Hasselman bhh at xs4all.nl writes:

> Yes. curve expects the function you give it to return a vector if the input argument is a vector. This is clearly documented for the argument expr of curve.

Thanks a lot, Berend! In fact, I hadn't read the documentation of curve carefully. Anyway, I wouldn't have known how to transform my function into one that would be accepted by curve, so your feedback has been very useful for me.

-Sergio.
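For the record, a common way to make a scalar-only function acceptable to curve() is Vectorize(); a sketch with a function built on integrate(), a typical offender (this toy f is invented for illustration):

```r
# f() only handles a single x at a time, because integrate() is not
# vectorized over its upper limit.
f <- function(x) integrate(dnorm, lower = -Inf, upper = x)$value

# Vectorize() wraps f so that vector input yields vector output,
# which is exactly what curve() requires of its expression.
fv <- Vectorize(f)
fv(c(-1, 0, 1))          # three values, matching pnorm(c(-1, 0, 1))
curve(fv(x), from = -3, to = 3)
```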
[R] RODBC MSSQL query with date - time tag
Greetings:

I am trying to query an MSSQL database which contains time series data using RODBC. Using Server Management Studio, I can retrieve the data with this query:

Select * from tb_ace_mag_1m
where time_tag >= '2012-01-08 00:00:00' AND time_tag < '2012-01-08 03:00:00'

However, when I try to accomplish this using R:

sqlQuery(channel1, "Select * from tb_ace_mag_1m where time_tag >= '2012-01-08 00:00:00' AND time_tag < '2012-01-08 03:00:00'", max = 10)

I obtain no results:

[1] time_tag insert_time dsflag numpts gse_bx gse_by gse_bz gse_lat gse_lon gsm_bx gsm_by gsm_bz gsm_lat gsm_lon bt
<0 rows> (or 0-length row.names)

If I omit the time_tag parameter, I get the following:

sqlQuery(channel1, "select * from tb_ace_mag_1m", max = 10)

#              time_tag         insert_time dsflag numpts     gse_bx    gse_by       gse_bz     gse_lat  gse_lon     gsm_bx    gsm_by     gsm_bz    gsm_lat  gsm_lon       bt
# 1 2013-01-21 15:23:00 2013-01-21 15:28:26      0     37 0.34268624 -2.814154 -0.460940033  -9.2350445 276.9429 0.33351120 -2.713921 -0.8790606 -17.822054 277.0059 2.872170
# 2 2013-01-21 15:24:00 2013-01-21 15:29:29      0     60 0.45056427 -2.862180 -0.404557437  -7.9486165 278.9461 0.44123372 -2.769705 -0.8321871 -16.526619 279.0516 2.925534
# 3 2013-01-21 15:25:00 2013-01-21 15:30:00      0     15 0.16728164 -2.787083 -1.013695598 -19.9538670 273.4348 0.15818150 -2.602005 -1.4240593 -28.647099 273.4789 2.970420

How do I properly format the query in RODBC to obtain the results I seek? I was unable to discover a solution in the archives, although it appears I'm not the only one who has struggled with date-time queries.

Thanks,
Rob Steenburgh
NOAA/NWS Space Weather Prediction Center
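One thing worth trying (untested here without a live DSN; the connection and table names are taken from the post) is to mark the literals explicitly as timestamps with the standard ODBC {ts '...'} escape, so the driver cannot misinterpret them as plain strings:

```r
library(RODBC)
# channel1 <- odbcConnect("my_dsn")   # assumed already open, as in the post

# Build the query; {ts '...'} is the ODBC escape for timestamp literals.
qry <- paste0("SELECT * FROM tb_ace_mag_1m ",
              "WHERE time_tag >= {ts '2012-01-08 00:00:00'} ",
              "AND time_tag < {ts '2012-01-08 03:00:00'}")

# res <- sqlQuery(channel1, qry, max = 10)   # run against the open channel
```

If that still returns zero rows, checking the column type on the server side (datetime vs. character) is the next thing to rule out.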
[R] produce a map in ARCGIS from a GAM output
Hi, I am trying to predict the habitat suitability of lobster with GAMs. I need to produce a map in ArcGIS of the predicted densities. I did some searching, and the function predict.gam seems to do the job. Is this the right way to do it? When you apply the function you get the predicted values (different from your input data points), but I do not understand the procedure for the spatial coordinates: what lat and long do the new points have? Can I use the same ones as my observations? Thank you very much for your time, Valentina

Dr. Valentina Lauria, Postdoctoral researcher, Room 118, Martin Ryan Institute, Department of Earth and Ocean Sciences, National University of Ireland, Galway, Ireland, www.nephrops.eu
Re: [R] odfWeave: Some questions about potential formatting options
On Thursday 11 April 2013 at 13:40 -0700, Paul Miller wrote: Hello All, I am learning to use the odfWeave package. I really like it: it has good documentation, makes some very nice-looking tables, and seems to have lots of options for customizing output. There are a few things I'd like to do that don't seem to be covered in the documentation, though, so I'm not sure whether they're possible. Here's a list of some things I'd like to be able to do:

1. Make titles generated by odfTableCaption/odfFigureCaption bold
2. Add footnotes to tables (using something other than odfCat)
3. Control the width of columns
4. Control the alignment of columns (first column left-aligned, centered otherwise)

Is it possible to do any or all of these things using odfWeave?

For points 1 and 4, you have to use a style, and edit the style via LibreOffice to make the text bold or change its alignment. See tableStyles() for the latter. For point 3 (column widths), I needed to make small changes to odfWeave to support that part of the ODF spec. I've sent them to Max Kuhn for review, but so far he apparently has not found the time to study them. If you are interested, you can grab the modified version of the package from here: http://nalimilan.perso.neuf.fr/transfert/odfWeave.tar.gz Then you can set the widths like this:

currentDefs <- getStyleDefs()
currentDefs$firstColumn$type <- "Table Column"
currentDefs$firstColumn$columnWidth <- "5 cm"
currentDefs$secondColumn$type <- "Table Column"
currentDefs$secondColumn$columnWidth <- "3 cm"
setStyleDefs(currentDefs)

Then you can use these styles this way:

odfTable(df, styles = style, colStyles = c("firstColumn", "secondColumn"))

For point 2, I'm not sure what you mean. Do you want to include a legend below the table, or do you want to use real footnotes?
Regards
[R] Problems with the DSL package
Hello, I'm trying to use R with a distributed file system (HDFS), and when I launch a MapReduce job with the DSL package I get this message:

Streaming Command Failed!
Error in !unlist(lapply(chunks, is.null)) : invalid argument type

I am using the Cloudera CDH4 Virtual Machine; the R version is 2.15.3 with Hadoop 0.20.2 CDH3u4. Command lines:

library(DSL)
storage <- DStorage(type = "HDFS", base_dir = "/user/cloudera/dc")
dl <- DList(line1 = "This is the first line.",
            line2 = "Now, the second line.",
            DStorage = storage)
res <- DLapply(dl, function(x) unlist(strsplit(x, " ")))

Does anybody know something about this problem? Thanks and Regards, Vincent. -- View this message in context: http://r.789695.n4.nabble.com/Problems-with-the-DSL-package-tp4664059.html Sent from the R help mailing list archive at Nabble.com.
[R] gc()
Folks, I have some computations in a function that create some large matrices. In these circumstances I have been in the habit of nulling out a matrix once it has been used and calling gc(). Some pseudo-code:

theFunction <- function(x, y, z, l) {
  myMatrix <- black.box.function(x, y, z, l)
  interesting.data <- myFunc(myMatrix)
  myMatrix <- NULL
  gc()
  interesting.data/pi
}

Would it be different if I chose to replace myMatrix <- NULL with rm(myMatrix)? The gc docs are fine, but I was wondering if there were any other tips or links on the subject. Thanks for your time, KW
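For what it's worth, `rm(myMatrix)` removes the binding entirely, while `myMatrix <- NULL` rebinds the name to NULL; in both cases the large object becomes unreferenced and eligible for collection, which R performs automatically when it needs memory. A small sketch (the matrix size is an arbitrary illustration, not from the original post):

```r
f <- function() {
  m <- matrix(0, 2000, 2000)   # ~32 MB of doubles
  s <- sum(m)                  # use the matrix
  rm(m)                        # drop the only reference; equivalent in effect
                               # to m <- NULL for freeing the big object
  # gc() here is optional: R collects automatically, but an explicit call
  # can return memory to the OS sooner on some platforms
  s
}
f()  # 0
```

Since the matrix is local to the function, it would also become collectable as soon as f() returns, even without the explicit rm() or gc().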
Re: [R] produce a map in ARCGIS from a GAM output
On Fri, Apr 12, 2013 at 10:37 AM, Lauria, Valentina valentina.lau...@nuigalway.ie wrote: [...]

I suspect you'll get a better response if you: a) ask on R-sig-geo, the mailing list for all things R and mappy; b) try to ask with a reproducible example. It doesn't have to be your data, but something similar, and small, that we can run to get a better idea of your problem. Show us how you are running predict.gam, at least, and what your data look like. To get data from R into ArcGIS, you probably want either the raster package, to create a grid of values, or the rgdal package, to create a shapefile of point locations and estimates. Exactly how you get that from your predict.gam results is guesswork until we know a bit more about how you are running predict.gam. Barry
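To illustrate the raster route Barry mentions, here is a hedged sketch of the usual workflow. The model object `g`, the predictor names, the grid extent, and the filled-in covariate value are all hypothetical stand-ins, not taken from the original post:

```r
library(mgcv)
library(raster)

# g is assumed to be a fitted GAM, e.g.:
# g <- gam(density ~ s(lon, lat) + s(depth), data = obs, family = ...)

# Build a regular grid of prediction locations covering the study area
grd <- expand.grid(lon = seq(-11, -8, by = 0.01),
                   lat = seq(51, 54, by = 0.01))
grd$depth <- 100   # hypothetical: each covariate must be supplied on the grid

# Predict on the response scale at every grid point
grd$pred <- predict(g, newdata = grd, type = "response")

# Turn lon/lat/value triples into a raster and write an ESRI ASCII grid,
# which ArcGIS can read directly
r <- rasterFromXYZ(grd[, c("lon", "lat", "pred")])
writeRaster(r, "lobster_pred.asc", format = "ascii")
```

The key point is that the coordinates of the predictions are whatever coordinates you put in `newdata`: a regular grid gives a map, whereas reusing the observation locations just gives fitted values at the sample points.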
Re: [R] Adding unequal tables
Hi,

table.x <- as.table(as.matrix(read.table(text = "
A B C D
2 4 6 5
", sep = "", header = TRUE)))
table.y <- as.table(as.matrix(read.table(text = "
C D
5 1
", sep = "", header = TRUE)))

library(plyr)
library(reshape2)
vec1 <- rowSums(join(melt(table.x)[,-1], melt(table.y)[,-1], by = "Var2")[,-1], na.rm = TRUE)
names(vec1) <- colnames(table.x)
table.xy <- as.table(vec1)
table.xy
# A  B  C  D
# 2  4 11  6

#or
dat1 <- merge(table.x, table.y, all = TRUE)[,-1]
table.xy <- as.table(with(dat1, tapply(Freq, list(Var2), FUN = sum)))
table.xy
# A  B  C  D
# 2  4 11  6
str(table.xy)
# 'table' int [1:4(1d)] 2 4 11 6
# - attr(*, "dimnames")=List of 1
#  ..$ : chr [1:4] "A" "B" "C" "D"

#or
datNew <- join(as.data.frame(table.x), as.data.frame(table.y), type = "full")[,-1]
res <- aggregate(Freq ~ Var2, data = datNew, sum)[,2]
names(res) <- colnames(table.x)
table.xy <- as.table(res)
table.xy
# A  B  C  D
# 2  4 11  6

A.K.

This is probably really simple, but I'm missing something: I have two tables generated with the table() function that have overlapping columns. I want to add the entries where the column is the same. For instance:

table.x
A B C D
2 4 6 5

table.y
C D
5 1

Output needed:
A B C D
2 4 11 6
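A base-R alternative that avoids plyr/reshape2 is to line the two tables up by name on the union of their names. A sketch, using the same counts as the question (the tables are built directly from named vectors here for self-containment):

```r
# Two overlapping frequency tables, as table() would produce
table.x <- as.table(c(A = 2, B = 4, C = 6, D = 5))
table.y <- as.table(c(C = 5, D = 1))

# Accumulate entries with the same name across both tables
all.names <- union(names(table.x), names(table.y))
res <- setNames(numeric(length(all.names)), all.names)
res[names(table.x)] <- res[names(table.x)] + table.x
res[names(table.y)] <- res[names(table.y)] + table.y
table.xy <- as.table(res)
table.xy
# A  B  C  D
# 2  4 11  6
```

Because indexing is by name, this also works when the tables overlap partially or not at all, and it generalizes to any number of tables by repeating the accumulation line.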
[R] Stat question: How to deal w/ negative outliers?
Hello all, I have a question: I am using the interquartile-range method to spot outliers, and it gives me values of, say, 234 / -120 for the upper / lower benchmarks. I don't have any issues with the higher end. However, I don't have any negative values; my lowest possible value is 0. Should I consider 0 an outlier? Thanks ahead for your thoughts
Re: [R] Read the data from a text file and reshape the data
Hi Arun, Thank you so much for your answer. It surely does help. Getting to know different approaches to the same problem has given me more insight into the R language. I appreciate your time and effort. Best, Janesh Devkota

On Thu, Apr 11, 2013 at 10:00 PM, arun smartpink...@yahoo.com wrote: Hi, May be this helps:

library(stringr)
lines1 <- readLines("WAT_DEP.DAT.part")
indx <- which(grepl("[*]", lines1))
indx2 <- indx[seq(from = indx[2], length(indx), by = 2)] + 1
lines2 <- str_trim(lines1[indx2], side = "left")
dat1 <- read.table(text = lines2, sep = "", header = FALSE)
lst1 <- lapply(split(indx, ((seq_along(indx) - 1) %/% 2) + 1), function(x) {
  x1 <- seq(max(x) + 2, max(x) + 2 + 49, by = 1)
  x2 <- str_trim(lines1[x1][!is.na(lines1[x1])], side = "left")
  x3 <- as.vector(as.matrix(read.table(text = x2[x2 != ""], header = FALSE, sep = "")))
  x4 <- if (length(x3) < 500) c(x3, rep(NA, 500 - length(x3))) else x3
  x4
})
dat2 <- as.data.frame(do.call(cbind, lst1))
colnames(dat2) <- paste("t", colnames(dat2), "_", dat1[, 2], sep = "")
dim(dat2)
#[1] 500 622
dat2[1:3, 1:3]
#  t1_0.00208 t2_0.00417 t3_0.00625
#1      3.224      4.124      4.502
#2      3.205      4.118      4.500
#3      3.189      4.114      4.498

A.K.

- Original Message - From: Janesh Devkota janesh.devk...@gmail.com To: r-help@r-project.org Sent: Thursday, April 11, 2013 5:55 PM Subject: [R] Read the data from a text file and reshape the data

I have a data set for different time intervals. The data has three comment lines before the data for each time interval, and for each time interval there are 500 data points. I want to reshape the data set into the following format:

t1      t2      t3      ...
0.00208 0.00417 0.00625 ...
a1      a2      a3      ...
b1      b2      b3      ...
c1      c2      c3      ...
...

The link to the file is as follows: https://www.dropbox.com/s/hc8n3qcai1mlxca/WAT_DEP.DAT As you will see in the file, the time for each interval is the second value on the third line before the data starts; for the first interval, t = 0.00208. I need to reshape the data from several rows into one column, and finally create a data frame in the format shown above. In the sample above, a1, b1, c1 are the data for time t1, and so on.
The sample data is as follows: ** N:SNAPSHOTTIME DELT[S] ** WATER DEPTH [M]: (HP(L),L=2,LA) 18000.00208 0.1 3.224 3.221 3.220 3.217 3.216 3.214 3.212 3.210 3.209 3.207 3.205 3.203 3.202 3.200 3.199 3.197 3.196 3.193 3.192 3.190 3.189 3.187 3.186 3.184 3.184 3.182 3.181 3.179 3.178 3.176 3.175 3.174 3.173 3.171 3.170 3.169 3.168 3.167 3.166 3.164 3.164 3.162 3.162 3.160 3.160 3.158 3.158 3.156 3.156 3.155 3.154 3.153 3.152 3.151 3.150 3.150 3.149 3.149 3.147 3.147 3.146 3.146 3.145 3.145 3.144 3.144 3.143 3.143 3.142 3.142 3.141 3.142 3.141 3.141 3.140 3.141 3.140 3.140 3.139 3.140 3.139 3.140 3.139 3.140 3.139 3.140 3.139 3.140 3.139 3.140 3.139 3.140 3.140 3.140 3.140 3.141 3.141 3.142 3.141 3.142 3.142 3.142 3.143 3.143 3.144 3.144 3.145 3.145 3.146 3.146 3.147 3.148 3.149 3.149 3.150 3.150 3.152 3.152 3.153 3.154 3.155 3.156 3.157 3.158 3.159 3.160 3.161 3.162 3.163 3.164 3.165 3.166 3.168 3.169 3.170 3.171 3.173 3.174 3.176 3.176 3.178 3.179 3.181 3.182 3.184 3.185 3.187 3.188 3.190 3.191 3.194 3.195 3.196 3.198 3.199 3.202 3.203 3.205 3.207 3.209 3.210 3.213 3.214 3.217 3.218 3.221 3.222 3.225 3.226 3.229 3.231 3.233 3.235 3.238 3.239 3.242 3.244 3.247 3.248 3.251 3.253 3.256 3.258 3.261 3.263 3.266 3.268 3.271 3.273 3.276 3.278 3.281 3.283 3.286 3.289 3.292 3.294 3.297 3.299 3.303 3.305 3.307 3.311 3.313 3.317 3.319 3.322 3.325 3.328 3.331 3.334 3.337 3.340 3.343 3.347 3.349 3.353 3.356 3.359 3.362 3.366 3.369 3.372 3.375 3.379 3.382 3.386 3.388 3.392 3.395 3.399 3.402 3.406 3.409 3.413
Re: [R] A strange behaviour in the graphical function curve
Berend Hasselman bhh at xs4all.nl writes: Your function miBeta returns a scalar when the argument mu is a vector. Use Vectorize to vectorize it, like this:

VmiBeta <- Vectorize(miBeta, vectorize.args = c("mu"))
VmiBeta(c(420, 440))

and draw the curve with this:

curve(VmiBeta, xlim = c(370, 430), xlab = "mu", ylab = "L(mu)")

Berend

Taking into account what you have pointed out, I reprogrammed my function as follows, as an alternative solution to yours:

zetas <- function(alpha) { z <- qnorm(alpha/2); c(z, -z) }
# First transformation function
Tzx <- function(z, sigma_p, mu_p) sigma_p*z + mu_p
# Second transformation function
Txz <- function(x, sigma_p, mu_p) (x - mu_p)/sigma_p
BetaG <- function(mu, alpha, n, sigma, mu_0) {
  lasZ <- zetas(alpha)               # Zs corresponding to alpha
  sigma_M <- sigma/sqrt(n)           # sd of my distribution
  lasX <- Tzx(lasZ, sigma_M, mu_0)   # Transformed Zs into Xs
  # Now I consider mu to be a vector composed of m's
  NewZ <- lapply(mu, function(m) Txz(lasX, sigma_M, m))
  # NewZ is a list, the same length as mu, with 2D vectors
  # The result will be a vector, the same length as mu (and NewZ)
  sapply(NewZ, function(zz) pnorm(zz[2]) - pnorm(zz[1]))
}
miBeta <- function(mu) BetaG(mu, 0.05, 36, 48, 400)
plot(miBeta, xlim = c(370, 430), xlab = "mu", ylab = "L(mu)")

I hope this is useful to people following this discussion, -Sergio.
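The requirement that curve imposes can be seen with a toy function (a hypothetical example, not from the thread): one that collapses a vector input to a single number fails, while its Vectorize'd wrapper works.

```r
f <- function(x) max(x, 0)   # returns ONE number even when x is a vector
f(c(-1, 2))                  # 2 -- not the elementwise result c(0, 2)

Vf <- Vectorize(f)           # wraps f so it is applied element by element
Vf(c(-1, 2))                 # c(0, 2)

# curve(f, -2, 2)            # would fail: f(vector) has length 1, not length n
curve(Vf, -2, 2)             # works: Vf returns a vector matching its input
```

Vectorize is a convenience built on mapply, so it carries loop overhead; rewriting the function in genuinely vectorized form (as the follow-ups in this thread do) is usually faster.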
[R] Multivariate AICc
Dear all, I would like to find out whether anyone is aware of implemented functions, or code, for computing the multivariate AICc as developed by Fujikoshi and Satoh (1997). I've been searching for it, but Google, rseek, and all the other search engines return nothing. Thanks!

Fujikoshi, Y., and Satoh, K. (1997). Modified AIC and Cp in multivariate linear regression. Biometrika, 84(3), 707-716.

-- Vinicius A. G. Bastazini, Laboratório de Ecologia Quantitativa, PPG-Ecologia UFRGS
Re: [R] produce a map in ARCGIS from a GAM output
Dear Barry, Thank you very much for your reply. I am trying to create a map in GIS of the habitat prediction of lobster using environmental predictors. I am attaching a brief example of my script and data. I am fine with the GAM fitting and evaluation (residual plots etc.), but I am having problems creating a prediction map (a raster of my GAM output) in GIS. In particular, when I run predict.gam I get a set of values, which I guess are the predicted lobster densities as a function of my environmental predictors, but I do not know how to geo-reference these points. I will read about the raster package, as I have to extrapolate my predictions to a larger area (where we have no data). Any help and suggestions are very much appreciated. Best Regards, Valentina

Dr. Valentina Lauria, Postdoctoral researcher, Room 118, Martin Ryan Institute, Department of Earth and Ocean Sciences, National University of Ireland, Galway, Ireland, www.nephrops.eu

From: b.rowling...@gmail.com on behalf of Barry Rowlingson b.rowling...@lancaster.ac.uk Sent: 12 April 2013 14:15 To: Lauria, Valentina Subject: Re: [R] produce a map in ARCGIS from a GAM output [...]
Re: [R] processing matrix equation derived from rows of two matrices
Thanks so much, Arun. If you have time, what reference(s) do you use to learn techniques like seq_len(nrow(...)) and the other aspects of your code? Again, thanks very much. Jan Beckstrand
Re: [R] A strange behaviour in the graphical function curve
See below. On Fri, 12 Apr 2013, Julio Sergio wrote: [...]
Adding to what you have defined above, consider the benefit that true vectorization provides:

BetaGv <- function(mu, alpha, n, sigma, mu_0) {
  lasZ <- zetas(alpha)               # Zs corresponding to alpha
  sigma_M <- sigma/sqrt(n)           # sd of my distribution
  lasX <- Tzx(lasZ, sigma_M, mu_0)   # Transformed Zs into Xs
  # Now fold lasX and mu into matrices where columns are defined by lasX
  # and rows are defined by mu
  lasX_M <- matrix(lasX, nrow = length(mu), ncol = 2, byrow = TRUE)
  mu_M <- matrix(mu, nrow = length(mu), ncol = 2)
  # Compute NewZ
  NewZ <- Txz(lasX_M, sigma_M, mu_M)
  # NewZ is a matrix, where the first column corresponds to the lower Z and
  # the second column corresponds to the upper Z
  # The result will be a vector, the same length as mu
  pnorm(NewZ[,2]) - pnorm(NewZ[,1])
}
miBetav <- function(mu) BetaGv(mu, 0.05, 36, 48, 400)

system.time(miBeta(seq(370, 430, length.out = 1e5)))
system.time(miBetav(seq(370, 430, length.out = 1e5)))

On my machine, I get:

system.time(miBeta(seq(370, 430, length.out = 1e5)))
   user  system elapsed
  1.300   0.024   1.476
system.time(miBetav(seq(370, 430, length.out = 1e5)))
   user  system elapsed
  0.076   0.000   0.073

So a bit of matrix manipulation speeds things up considerably. (That being said, I have a feeling that some algorithmic optimization could speed things up even more, using theoretical considerations.) --- Jeff Newmiller
Re: [R] A strange behaviour in the graphical function curve
On Apr 12, 2013, at 7:58 AM, Julio Sergio wrote: [...] This could be completely tangential to your problem with vectorization of arguments to 'curve'. Feel free to ignore. The second of your functions looks to be doing the same as the R function 'scale'.
I would have expected it to be applied first and then to have an 'unscale' operation performed to restore the original values (but I am not aware of a built-in function with that feature):

umat <- sweep(mat, 2, attr(m2, 'scaled:scale'), '*')
umat <- sweep(umat, 2, attr(m2, 'scaled:center'), '+')

# Or perhaps this, where 'sx' was the result of a 'scale' call:
scale(sx, center = -attr(sx, 'scaled:center'), scale = 1/attr(sx, 'scaled:scale'))

-- David

David Winsemius, MD, Alameda, CA, USA
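A small self-contained check of the scale/unscale round trip via sweep, on hypothetical toy data:

```r
m <- matrix(1:12, nrow = 4)   # toy data, 3 columns
sm <- scale(m)                # center and scale each column

# Undo the transformation using the attributes scale() stored:
# multiply back by the column sds, then add back the column means
um <- sweep(sm, 2, attr(sm, "scaled:scale"), "*")
um <- sweep(um, 2, attr(sm, "scaled:center"), "+")

all.equal(c(um), as.numeric(m))   # TRUE: the original values are recovered
```

The order matters: scale() subtracts the center first and then divides by the scale, so the inverse must multiply first and then add.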
Re: [R] Stat question: How to deal w/ negative outliers?
Hi, This is not an R question and is therefore not appropriate for this list. You should post to a statistics forum such as http://stats.stackexchange.com/. But the answer is NO! Best, Ista

On Fri, Apr 12, 2013 at 9:49 AM, ramoss ramine.mossad...@finra.org wrote: [...]
Re: [R] Stat question: How to deal w/ negative outliers?
-Original Message- I have a question: I am using the interquartile method to spot outliers, and it gives me values of, say, 234 / -120 for the upper / lower benchmarks. I don't have any issues with the higher end. However, I don't have any negative values; my lowest possible value is 0. Should I consider 0 as an outlier?

If your limits have been appropriately set and the lower outlier bound is -120, obviously not. However, if your data are constrained to be positive and unusually small values are as likely to be erroneous as unusually large values, you probably should have set the bounds differently. Try taking logs and see if that gives you a more or less symmetric distribution? S Ellison
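S Ellison's suggestion can be sketched in a few lines. The data here are hypothetical skewed values, and the fences use the usual 1.5 x IQR (Tukey) convention:

```r
set.seed(1)
x <- rlnorm(200, meanlog = 2, sdlog = 1)   # positive, right-skewed data

# Tukey fences: 1.5 IQR beyond the quartiles
fences <- function(v) {
  q <- unname(quantile(v, c(0.25, 0.75)))
  iqr <- q[2] - q[1]
  c(lower = q[1] - 1.5 * iqr, upper = q[2] + 1.5 * iqr)
}

fences(x)        # lower fence is typically negative: useless for positive data

lf <- fences(log(x))   # compute the fences on the log scale instead
exp(lf)                # back-transformed: a strictly positive lower bound
```

On the log scale the distribution is roughly symmetric, so both fences become informative, and back-transforming with exp() gives bounds on the original scale.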
Re: [R] processing matrix equation derived from rows of two matrices
Hi Janis, For example, the goal here is to get results per row. BTW, your original posting had tb[,1]:

tb[,1]%*%(((val-rep(meansb79[1,],5))^2)/6)
#Error in tb[, 1] %*% (((val - rep(meansb79[1, ], 5))^2)/6) :
#  non-conformable arguments

I guess it should be:

tb[1,]%*%(((val-rep(meansb79[1,],5))^2)/6)
#        [,1]
#[1,] 1.47619

If you split up the code in the first solution:

seq_len(nrow(tb))
# [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#[26] 26 27 28 29 30 31 32 33 34 35 36

sapply(seq_len(nrow(tb)),function(i) i)
# [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#[26] 26 27 28 29 30 31 32 33 34 35 36

#you can also do it with lapply (which makes the pieces easier to see)
lapply(seq_len(nrow(tb)),function(i) i)[1:2]
#[[1]]
#[1] 1
#
#[[2]]
#[1] 2

lapply(seq_len(nrow(tb)),function(i) tb[i,])[1:2]
#[[1]]
#[1] 1 1 1 4 0
#
#[[2]]
#[1] 1 1 1 3 1

tb[1:2,]
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    1    1    1    4    0
#[2,]    1    1    1    3    1

lapply(seq_len(nrow(tb)),function(i) (((val-rep(meansb79[i,],5))^2/6)))[1:2]
#[[1]]
#[1] 0.765306122 0.217687075 0.003401361 0.122448980 0.574829932
#
#[[2]]
#[1] 0.87074830 0.27551020 0.01360544 0.08503401 0.48979592

(((val-rep(meansb79[1:2,],5))^2/6))
# [1] 0.765306122 0.275510204 0.003401361 0.085034014 0.574829932 0.870748299
# [7] 0.217687075 0.013605442 0.122448980 0.489795918

lapply(seq_len(nrow(tb)),function(i) tb[i,]%*%(((val-rep(meansb79[i,],5))^2/6)))[1:2]
#[[1]]
#        [,1]
#[1,] 1.47619
#
#[[2]]
#         [,1]
#[1,] 1.904762

sapply(seq_len(nrow(tb)),function(i) tb[i,]%*%(((val-rep(meansb79[i,],5))^2/6)))[1:2]
#[1] 1.476190 1.904762

Hope it helps.
A.K.

- Original Message - From: Beckstrand, Janis, NCOD janis.beckstr...@va.gov To: smartpink...@yahoo.com Sent: Friday, April 12, 2013 1:05 PM Subject: FW: [R] processing matrix equation derived from rows of two matrices

Thanks so much, Arun. If you have time, what reference(s) do you use to learn the techniques like seq_len(nrow(...)) and other aspects of your code? Again, thanks very much.
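As a cross-check on the matrix formula, a row of counts can be expanded back into raw scores and passed to var() directly. A short sketch, using the first row of tb from the thread (counts of how many 1s, 2s, 3s, 4s, and 5s appear in the sample):

```r
counts <- c(1, 1, 1, 4, 0)        # first row of tb: one 1, one 2, one 3, four 4s
vals <- rep(1:5, times = counts)  # expand back to the raw sample: 1 2 3 4 4 4 4
var(vals)                         # 1.47619, matching the tb[1,] %*% ... result
```

This is slower than the matrix route for thousands of rows, but it is a handy independent check that the weighted formula implements the ordinary sample variance.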
Jan Beckstrand

From: arun kirshna [via R] [mailto:ml-node+s789695n4664037...@n4.nabble.com]
Sent: Friday, April 12, 2013 2:01 AM
To: Beckstrand, Janis, NCOD
Subject: Re: processing matrix equation derived from rows of two matrices

Hi,
Maybe this helps:

tb[1,]%*%(((val-rep(meansb79[1,],5))^2)/6)
#        [,1]
#[1,] 1.47619

tryvarb <- c(1,2,3,4,4,4,4)
var(tryvarb)
#[1] 1.47619

tb[2,]%*%(((val-rep(meansb79[2,],5))^2)/6)
#         [,1]
#[1,] 1.904762

sapply(seq_len(nrow(tb)),function(i) tb[i,]%*%(((val-rep(meansb79[i,],5))^2/6)))
# [1] 1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143 1.9047619
# [8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619 2.2857143 2.2857143
#[15] 2.2857143 1.9523810 1.9523810 2.2857143 2.2857143 2.2857143 2.2857143
#[22] 1.6190476 1.6190476 1.9047619 1.9047619 1.9047619 1.9047619 1.8095238
#[29] 0.9047619 0.9047619 1.1428571 1.1428571 1.1428571 1.1428571 1.0000000
#[36] 0.4761905

#or
mat1 <- matrix(((val-rep(meansb79,each=5))^2/6),ncol=5,byrow=TRUE)
diag(t(apply(tb,1,function(x) x)) %*% apply(mat1,1,function(x) x))
# [1] 1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143 1.9047619
# [8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619 2.2857143 2.2857143
#[15] 2.2857143 1.9523810 1.9523810 2.2857143 2.2857143 2.2857143 2.2857143
#[22] 1.6190476 1.6190476 1.9047619 1.9047619 1.9047619 1.9047619 1.8095238
#[29] 0.9047619 0.9047619 1.1428571 1.1428571 1.1428571 1.1428571 1.0000000
#[36] 0.4761905

#or
diag(apply(tb,1,function(x) x %*% t(mat1)))
# [1] 1.4761905 1.9047619 1.9047619 1.9047619 1.9047619 2.2857143 1.9047619
# [8] 1.9047619 2.2857143 2.2857143 1.9047619 1.9047619 2.2857143 2.2857143
#[15] 2.2857143 1.9523810 1.9523810 2.2857143 2.2857143 2.2857143 2.2857143
#[22] 1.6190476 1.6190476 1.9047619 1.9047619 1.9047619 1.9047619 1.8095238
#[29] 0.9047619 0.9047619 1.1428571 1.1428571 1.1428571 1.1428571 1.0000000
#[36] 0.4761905

A.K.

I have a matrix (tb) that represents all samples of size n=7 from a group of N=9, with scores c(1,2,3,4,4,4,4,5,5).
I want to calculate the variance for every sample. Here is my code. The bottom shows the matrix equations and an attempt to process them for each row. I got the strategies from reading r-help, but neither works. (I do not understand the syntax well enough.) Any suggestions? (I need to do many additional matrices in the same way.) Thanks. Jan

require(combinat)
n=7
base79 <- combn(c(1,2,3,4,4,4,4,5,5), n, tabulate, nbins=5) #find all samples, n=7 (gives cols of 7 cases in sample like 1 1 1 4 0)
tb <- t(base79)
val <- c(1,2,3,4,5) #values on the scale
meansb79 <- t(base79) %*% (val/7)
tb[,1]%*%(((val-rep(meansb79[1,],5))^2)/6) #computes the sample variance for the first sample
#check
tryvarb <- c(1,2,3,4,4,4,4)
var(tryvarb)
#Now I try to get the variance for each sample (row) in tb, but neither of the following works
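Pulling the thread's pieces together, here is a self-contained sketch of the per-row variance computation; object names follow the original post, and base R's combn() is enough (the combinat package is not required):

```r
## Self-contained sketch of the computation discussed above.
n <- 7
base79 <- combn(c(1, 2, 3, 4, 4, 4, 4, 5, 5), n, tabulate, nbins = 5)
tb <- t(base79)                       # one sample per row, as counts per score
val <- c(1, 2, 3, 4, 5)               # values on the scale
meansb79 <- t(base79) %*% (val / 7)   # mean of each sample

## sample variance for every row of tb
vars <- sapply(seq_len(nrow(tb)), function(i)
  tb[i, ] %*% ((val - rep(meansb79[i, ], 5))^2 / 6))
vars[1]  # 1.47619, the same as var(c(1, 2, 3, 4, 4, 4, 4))
```

The first combination that combn() enumerates is the first seven scores, c(1,2,3,4,4,4,4), so vars[1] can be checked directly against var() as in the reply above.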
Re: [R] A strange behaviour in the graphical function
Jeff Newmiller jdnewmil at dcn.davis.ca.us writes: system.time( miBeta( seq( 370, 430, length.out=1e5 ) ) ) user system elapsed 1.300 0.024 1.476 system.time( miBetav( seq( 370, 430, length.out=1e5 ) ) ) user system elapsed This is very interesting, Jeff. Of course, I appreciate very much your contribution. -Sergio. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Solving an integral in R gives the error “The integral is probably divergent”
But is it supposed to be t^{-3/2} or t^{-0.5}?? The formula has the former and the code the latter, and the integral is clearly divergent with the former.

-pd

On Apr 12, 2013, at 04:51 , Thomas Lumley wrote:

I don't get an error message (after I correct the missing line break after the comment):

b <- sapply(a, Cfun, upper=1)
b
  [1] 1.583458e-54 7.768026e-50 2.317562e-45 4.206260e-41 4.645737e-37 3.123801e-33 1.279358e-29 3.193257e-26 4.860876e-23
 [10] 4.516582e-20 2.564400e-17 8.908932e-15 1.896996e-12 2.481084e-10 1.998561e-08 9.946570e-07 3.067751e-05 5.862075e-04
 [19] 6.818952e-03 4.297061e-02 0.000000e+00 3.175122e-01 3.723022e-01 2.364930e-01 9.144836e-02 2.190878e-02 3.252754e-03
 [28] 2.983763e-04 1.685692e-05 5.849602e-07 1.244158e-08 1.619155e-10 1.287603e-12 6.250149e-15 1.850281e-17 3.338241e-20
 [37] 3.668412e-23 2.454192e-26 9.991546e-30 2.474577e-33 3.727226e-37 3.413319e-41 1.900112e-45 6.428505e-50 1.321588e-54
 [46] 1.650722e-59 1.252524e-64 5.772750e-70 1.615916e-75 2.746972e-81 2.835655e-87 1.777399e-93 6.764271e-100 1.562923e-106
 [55] 2.192373e-113 1.866955e-120 9.651205e-128 3.028623e-135 5.769185e-143 6.670835e-151 4.682023e-159 1.994643e-167 5.157808e-176
 [64] 8.095084e-185 7.711162e-194 4.458042e-203 1.564139e-212 3.330362e-222 4.302974e-232 3.373500e-242 1.604721e-252 4.631224e-263
 [73] 8.108474e-274 8.611898e-285 5.547745e-296 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
 [82] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
 [91] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
[100] 0.000000e+00

-thomas

On Tue, Apr 9, 2013 at 3:14 PM, Janesh Devkota janesh.devk...@gmail.com wrote:

I am trying to solve an integral in R. However, I am getting an error when I am trying to solve for that integral.
The equation that I am trying to solve is as follows:

$$ C_m = \frac{|x| e^{2x}}{\pi^{1/2}}\int_0^t t^{-3/2}e^{-x^2/t-t}\,dt $$

The code that I am using is as follows:

a <- seq(from=-10, by=0.5, length=100)
## Create a function to compute integration
Cfun <- function(XX, upper){
  integrand <- function(x) x^(-0.5)*exp((-XX^2/x)-x)
  integrated <- integrate(integrand, lower=0, upper=upper)$value
  (final <- abs(XX)*pi^(-0.5)*exp(2*XX)*integrated)
}
b <- sapply(a, Cfun, upper=1)

The error that I am getting is as follows:

Error in integrate(integrand, lower = 0, upper = upper) :
  the integral is probably divergent

Does this mean I cannot solve the integral? Any possible ways to fix this problem will be highly appreciated. The question can be found on http://stackoverflow.com/questions/15892586/solving-an-integral-in-r-gives-error-the-integral-is-probably-divergent also.

Thanks.

--
Thomas Lumley
Professor of Biostatistics
University of Auckland

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
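For what it's worth, a small sketch (an assumption-laden check, not part of the original thread) illustrates pd's point. With t^(-3/2), as in the formula, a standard Bessel-K identity gives the full integral in closed form, ∫₀^∞ t^(-3/2) e^(-x²/t - t) dt = (√π/|x|) e^(-2|x|), so C_m(x) = e^(2x - 2|x|) when the upper limit is ∞; integrate() reproduces this away from x = 0, while at x = 0 the integrand behaves like t^(-3/2) near 0 and the integral diverges, which is exactly the error reported. Cfun2 is a hypothetical name, not from the thread:

```r
## Hedged sketch: Cfun2 uses t^(-3/2) as in the formula (the posted code had
## x^(-0.5) in the integrand).
Cfun2 <- function(XX, upper) {
  integrand <- function(t) t^(-3/2) * exp(-XX^2 / t - t)
  abs(XX) * pi^(-0.5) * exp(2 * XX) *
    integrate(integrand, lower = 0, upper = upper)$value
}
Cfun2(1, upper = Inf)   # close to exp(2*1 - 2*1) = 1
Cfun2(-1, upper = Inf)  # close to exp(-2 - 2) = exp(-4)
## Cfun2(0, upper = 1) fails: "the integral is probably divergent"
```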
Re: [R] Error loading workspaces after upgrade
On 12/04/2013 7:14 AM, Tamas Barjak wrote:

Dear Members,

I upgraded R from 2.15.3 to 3.0.0 (binary) on my WinXP and now I can't load my workspaces. The error message is:

Error: requested primitive type is not consistent with cached value
During startup - Warning message:
unable to restore saved data in .RData

Any idea to fix that?

Tamas sent me a copy of the bad file, and I've got a workaround: The problem is an object called .SavedPlots. It was produced by the Windows graphics device when it was recording the history. It can't be loaded into R 3.0.0, because some of the internal organization of things has changed.

The workaround is to restart 2.15.3, load the workspace, then rm(.SavedPlots) and save the workspace. After that things should be fine.

I'll try to get something in place in R-patched so that this doesn't cause the whole .RData load to fail.

Duncan Murdoch
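In script form, the workaround looks like this (a sketch to be run in the old R 2.15.3 session, assuming the workspace file is ./.RData in the working directory):

```r
## Run this in R 2.15.3 (NOT 3.0.0); assumes the workspace is ./.RData
load(".RData")        # load the problematic workspace into the old R
rm(.SavedPlots)       # drop the incompatible plot-history object
save.image(".RData")  # re-save; the file should now load in R 3.0.0
```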
Re: [R] Solving an integral in R gives the error The integral is probably divergent
Hi Peter,

It is supposed to be t^{-3/2} instead of t^{-0.5}. The code had a typo. Yes, I figured out that when x = 0 the solution is divergent, because it is like dividing by zero.

Thanks.

Best,
Janesh

-----Original Message-----
From: peter dalgaard [mailto:pda...@gmail.com]
Sent: Friday, April 12, 2013 1:32 PM
To: Thomas Lumley
Cc: Janesh Devkota; groep R-help
Subject: Re: [R] Solving an integral in R gives the error The integral is probably divergent

But is it supposed to be t^{-3/2} or t^{-0.5}?? The formula has the former and the code the latter, and the integral is clearly divergent with the former.

-pd

[...]
[R] Why copying columns of a data.frame becomes numeric?
Dear list,

I want the 1st, 2nd, 5th, and 6th columns of mtcars. After copying them, the columns become numeric class rather than a data frame. But, when I copy rows, the data frame retains its class. Why is this? I don't see why copying rows vs columns is so different.

class(mtcars)
[1] "data.frame"

head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

a <- mtcars$mpg
class(a)
[1] "numeric"

b <- mtcars[1:5, ]
class(b)
[1] "data.frame"

Thanks a lot,
Mike
Re: [R] Why copying columns of a data.frame becomes numeric?
It's another reason not to use $ for extraction. By default, R reduces dimensionality when subsetting, so mtcars$mpg actually returns a numeric vector. With $, there's no way to override the default behavior.

data(mtcars)
a <- mtcars$mpg
class(a)
[1] "numeric"
dim(a)
NULL

a <- mtcars[, "mpg", drop=FALSE]
class(a)
[1] "data.frame"
dim(a)
[1] 32  1

Sarah

On Fri, Apr 12, 2013 at 3:32 PM, C W tmrs...@gmail.com wrote: [...]

--
Sarah Goslee
http://www.functionaldiversity.org
Re: [R] Why copying columns of a data.frame becomes numeric?
class(mtcars[1:4,])  # select some rows
[1] "data.frame"
class(mtcars[,1:4])  # select some columns
[1] "data.frame"
class(mtcars[,3])  # select one column
[1] "numeric"
class(mtcars[, 3, drop=FALSE])  # select one column
[1] "data.frame"

I cannot give a definitive reason why it is done this way, but you do need some way to get from a data.frame to a column it contains, and some way to get from a data.frame to a single-column data.frame. The above methods give you that choice. Also note that rows and columns of a data.frame are intrinsically different: a column is generally a vector of one type, while a row is a list of 1-long vectors.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of C W
Sent: Friday, April 12, 2013 12:33 PM
To: r-help
Subject: [R] Why copying columns of a data.frame becomes numeric?

[...]
Re: [R] Error loading workspaces after upgrade
YES, it works! Thank you! I'll wait for the R patch.

Regards,
Tamas

2013/4/12 Duncan Murdoch murdoch.dun...@gmail.com

[...]
Re: [R] Why copying columns of a data.frame becomes numeric?
On 12-04-2013, at 21:32, C W tmrs...@gmail.com wrote:

Dear list, I want the 1st, 2nd, 5th, and 6th columns of mtcars. After copying them, the columns become numeric class rather than a data frame. But, when I copy rows, the data frame retains its class. Why is this? I don't see why copying rows vs columns is so different.

a <- mtcars$mpg
class(a)
[1] "numeric"

Here you are assigning a single column of mtcars, which is a numeric vector, to another object. So that is a numeric vector.

b <- mtcars[1:5, ]
class(b)
[1] "data.frame"

Here you are assigning a couple of rows of the complete data frame, and the result is a data frame.

If you want the 1st, 2nd, 5th, and 6th columns of mtcars in a new data frame, why don't you do this:

a <- mtcars[,c(1,2,5,6)]

then

class(a)
[1] "data.frame"

Berend
Re: [R] how to change the date into an interval of date?
Hi,

I am not sure I understand your question correctly.

dat1 <- read.table(text="
id responsed_at number_of_connection scores
1 12-01-2010 1 2
1 15-02-2010 2 3
1 01-04-2010 3 2
", sep="", header=TRUE, stringsAsFactors=FALSE)

datNew <- dat1
datNew$responsed_at <- gsub("^.*\\-(.*\\-.*)", "\\1", datNew$responsed_at)
dat2 <- data.frame(responsed_at=format(seq(from=as.Date("01-01-2010", format="%d-%m-%Y"), to=as.Date("01-04-2010", format="%d-%m-%Y"), by="month"), "%m-%Y"))
res <- merge(datNew, dat2, all=TRUE, by="responsed_at")
res$responsed_at[is.na(res$id)] <- NA
res$responsed_at[!is.na(res$responsed_at)] <- paste(gsub("(.*)\\-.*\\-.*", "\\1", dat1$responsed_at), "-", res$responsed_at[!is.na(res$responsed_at)], sep="")
res$number_of_month <- seq_len(nrow(res))-1
res
#  responsed_at id number_of_connection scores number_of_month
#1   12-01-2010  1                    1      2               0
#2   15-02-2010  1                    2      3               1
#3         <NA> NA                   NA     NA               2
#4   01-04-2010  1                    3      2               3

Dear experts:

I have my table like this:

id responsed_at number_of_connection scores
1 12-01-2010 1 2
1 15-02-2010 2 3
1 01-04-2010 3 2

I want to have one table like this:

id responsed_at number_of_month number_of_connection scores
1 12-01-2010 0 1 2
1 15-02-2010 1 2 3
1 NA 2 NA NA
1 01-04-2010 3 3 2

I want to add a column indicating the number of the month in which patients responded. How can I do that in R? The rule for the number of the month is that the dates within one month, i.e. from the first day to the last day of that month, count as one month number. Thank you in advance.
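If only the month offset is needed (and not the padded rows), a shorter route is to convert to Date and count months directly. This is a sketch assuming day-month-year dates and that the count is months elapsed since the first response; month_index is a hypothetical helper, not from the thread:

```r
## Hedged alternative: compute months elapsed since the first response.
dat1 <- data.frame(id = 1,
                   responsed_at = c("12-01-2010", "15-02-2010", "01-04-2010"),
                   stringsAsFactors = FALSE)
d <- as.Date(dat1$responsed_at, format = "%d-%m-%Y")
month_index <- function(x)   # year*12 + month, so differences count months
  as.integer(format(x, "%Y")) * 12L + as.integer(format(x, "%m"))
dat1$number_of_month <- month_index(d) - month_index(d[1])
dat1$number_of_month  # 0 1 3
```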
Re: [R] Why copying columns of a data.frame becomes numeric?
What if my data is much larger, and I don't know the column number but know its name? Do I have to grep it by name? How about subset()? Is that what people commonly use?

Mike

On Fri, Apr 12, 2013 at 3:44 PM, Berend Hasselman b...@xs4all.nl wrote: [...]
Re: [R] Why copying columns of a data.frame becomes numeric?
On 12-04-2013, at 21:56, C W tmrs...@gmail.com wrote:

What if my data is much larger, and I don't know the column number but know its name? Do I have to grep it by name? How about subset()? Is that what people commonly use?

Continuing with mtcars and the desired columns:

b <- mtcars[, c("mpg","cyl","drat","wt")]

Berend
Re: [R] Removing rows that are duplicates but column values are in reversed order
Hi,

From your example data,

dat1 <- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
c e 12
", sep="", header=TRUE, stringsAsFactors=FALSE)

#it is easy to get the output you wanted
dat1[!duplicated(dat1$value),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#4   c   e    12

But, if you have cases like the one below (assuming that all the instances where the ids are in reversed order have the same value):

dat2 <- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
e c 12
c e 12
", sep="", header=TRUE, stringsAsFactors=FALSE)

dat2[apply(dat2[,-3], 1, function(x) {x1 <- order(x); x1[1] < x1[2]}),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#5   c   e    12

#or you have cases like these:
dat3 <- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
a b 10
e c 12
c e 12
c d 11
", sep="", header=TRUE, stringsAsFactors=FALSE)

dat3New <- dat3[apply(dat3[,-3], 1, function(x) {x1 <- order(x); x1[1] < x1[2]}),]
dat3New[!duplicated(dat3New$value),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#6   c   e    12

A.K.

Hi everybody,

I was hoping that someone could help me with this problem. I have a table with 3 columns. Some rows contain duplicates where the identifiers in columns 1 and 2 are in reverse order, but the value associated with the row is the same. For example:

id1 id2 value
a b 10
c d 11
b a 10
c e 12

Rows 1 and 3 are duplicates (have the same value). I would like to retain only row 1 and delete row 3. The final table should look like this:

id1 id2 value
a b 10
c d 11
c e 12

Thanks in advance for any help provided.

Vince
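A compact alternative for the same task — a sketch assuming the id columns are character — is to build an order-independent key with pmin()/pmax() and drop duplicated keys:

```r
## Order-independent de-duplication via pmin/pmax (assumes character ids).
dat1 <- read.table(text = "
id1 id2 value
a b 10
c d 11
b a 10
c e 12
", header = TRUE, stringsAsFactors = FALSE)

## key is the same whether the pair appears as (a,b) or (b,a)
key <- paste(pmin(dat1$id1, dat1$id2), pmax(dat1$id1, dat1$id2))
dat1[!duplicated(key), ]   # keeps rows 1, 2 and 4
```

Unlike de-duplicating on value alone, this stays correct when two unrelated pairs happen to share the same value.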
Re: [R] Why copying columns of a data.frame becomes numeric?
I was thinking:

b <- subset(mtcars, select=c(mpg, cyl, drat, wt))

Yours is even simpler. :)

Thanks,
Mike

On Fri, Apr 12, 2013 at 4:06 PM, Berend Hasselman b...@xs4all.nl wrote: [...]
Re: [R] Search for common character strings within a column
Hi,

Maybe this helps. Not sure how you wanted to select those two letters.

dat1 <- read.table(text="
Seq,Output
A B B C D A C,Yes
B C A C B D A C,Yes
C D A A C D,No
", sep=",", header=TRUE, stringsAsFactors=FALSE)

library(stringr)
lapply(str_split(str_trim(dat1$Seq), " ")[dat1$Output=="Yes"], function(x) {x1 <- t(combn(x,2)); apply(x1, 1, paste0, collapse="")})
#[[1]]
# [1] "AB" "AB" "AC" "AD" "AA" "AC" "BB" "BC" "BD" "BA" "BC" "BC" "BD" "BA" "BC"
#[16] "CD" "CA" "CC" "DA" "DC" "AC"
#
#[[2]]
# [1] "BC" "BA" "BC" "BB" "BD" "BA" "BC" "CA" "CC" "CB" "CD" "CA" "CC" "AC" "AB"
#[16] "AD" "AA" "AC" "CB" "CD" "CA" "CC" "BD" "BA" "BC" "DA" "DC" "AC"

res <- sapply(str_split(str_trim(dat1$Seq), " ")[dat1$Output=="Yes"], function(x) {x1 <- t(combn(x,2)); x2 <- table(apply(x1, 1, paste0, collapse="")); x2[which.max(x2)]})
res
#BC BC
# 4  4

dat1$MaxCombn <- NA
dat1$MaxCombn[dat1$Output=="Yes"] <- names(res)
dat1
#              Seq Output MaxCombn
#1   A B B C D A C    Yes       BC
#2 B C A C B D A C    Yes       BC
#3     C D A A C D     No       NA

A.K.

I have a dataset (data) that consists of two columns: Seq and Output. Each entry in Seq is a combination of As, Bs, Cs and Ds and ranges from 5–30 characters in length. Each sequence is associated with an output of either yes or no, such that:

Seq Output
(1) A B B C D A C Yes
(2) B C A C B D A C Yes
(3) C D A A C D No

etc, etc. I want to find which 2-letter strings (A B, A C, A D, etc.) are most associated with each output. Essentially I want to find which 2-letter combinations occur most frequently in the column Seq when the output is Yes. I'm new to R and can't figure out a solution to this problem. Any help greatly appreciated!

Cheers,
AB
[R] split date and time
Hi R experts,

For example, I have a dataset that looks like this:

Number TimeStamp Value
1 1/1/2013 0:00 1
2 1/1/2013 0:01 2
3 1/1/2013 0:03 3

How can I split the TimeStamp column into two and return a new table like this:

Number Date Time Value
1 1/1/2013 0:00 1
2 1/1/2013 0:01 2
3 1/1/2013 0:03 3

Thanks!
Re: [R] Search for common character strings within a column
HI, In case if you wanted to take BC and CB as the same. dat1- read.table(text= Seq,Output A B B C D A C,Yes B C A C B D A C,Yes C D A A C D,No ,sep=,,header=TRUE,stringsAsFactors=FALSE) lapply(str_split(str_trim(dat1$Seq), )[dat1$Output==Yes],function(x) {x1-t(combn(x,2)); x2-sapply(strsplit(apply(x1,1,paste0,collapse=),),function(x) paste(x[order(x)],collapse=)); table(x2)}) [[1]] #x2 #AA AB AC AD BB BC BD CC CD # 1 4 4 2 1 4 2 1 2 #[[2]] #x2 #AA AB AC AD BB BC BD CC CD # 1 4 6 2 1 6 2 3 3 dat1$MaxCombn- NA res1-sapply(str_split(str_trim(dat1$Seq), )[dat1$Output==Yes],function(x) {x1-t(combn(x,2)); x2-sapply(strsplit(apply(x1,1,paste0,collapse=),),function(x) paste(x[order(x)],collapse=)); x3-table(x2); x3[x3%in% max(x3)]}) dat1$MaxCombn[dat1$Output==Yes]-lapply(res1,names) dat1 # Seq Output MaxCombn #1 A B B C D A C Yes AB, AC, BC #2 B C A C B D A C Yes AC, BC #3 C D A A C D No NA A.K. - Original Message - From: arun smartpink...@yahoo.com To: R help r-help@r-project.org Cc: Sent: Friday, April 12, 2013 4:37 PM Subject: Re: Search for common character strings within a column Hi, May be this helps: Not sure how you wanted to select those two letters. dat1- read.table(text= Seq,Output A B B C D A C,Yes B C A C B D A C,Yes C D A A C D,No ,sep=,,header=TRUE,stringsAsFactors=FALSE) library(stringr) lapply(str_split(str_trim(dat1$Seq), )[dat1$Output==Yes],function(x) {x1-t(combn(x,2)); apply(x1,1,paste0,collapse=)}) #[[1]] # [1] AB AB AC AD AA AC BB BC BD BA BC BC BD BA BC #[16] CD CA CC DA DC AC #[[2]] # [1] BC BA BC BB BD BA BC CA CC CB CD CA CC AC AB #[16] AD AA AC CB CD CA CC BD BA BC DA DC AC res- sapply(str_split(str_trim(dat1$Seq), )[dat1$Output==Yes],function(x) {x1-t(combn(x,2)); x2-table(apply(x1,1,paste0,collapse=)); x2[which.max(x2)]}) res #BC BC # 4 4 dat1$MaxCombn-NA dat1$MaxCombn[dat1$Output==Yes]- names(res) dat1 # Seq Output MaxCombn #1 A B B C D A C Yes BC #2 B C A C B D A C Yes BC #3 C D A A C D No NA A.K. 
I have a dataset (data) that consists of two columns: Seq and output. Each entry in Seq is a combination of As,Bs,Cs and Ds and ranges from 5 – 30 characters in length. Each sequence is associated with an output of either yes or no such that: Seq Output (1) A B B C D A C Yes (2) B C A C B D A C Yes (3) C D A A C D No etc, etc. I want to find which 2 letter (A B, A C, A D, etc) strings are most associated with each output. Essentially I want to find which 2 letter combinations occur most frequently in the column Seq, when the output is Yes. I’m new to R and can’t figure out a solution to this problem. Any help greatly appreciated! Cheers, AB __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
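The order-insensitive pair counting used in the replies above can be sketched as a small standalone example; this is a minimal base-R version (no stringr), assuming all unordered pairs within a sequence are to be counted, not just adjacent ones:

```r
# Count unordered letter pairs in the "Yes" sequences.
# Sorting each pair's letters makes "BC" and "CB" count as the same pair.
seqs <- c("A B B C D A C", "B C A C B D A C")
pair_counts <- lapply(strsplit(seqs, " ", fixed = TRUE), function(letters) {
  pairs <- combn(letters, 2)                                # all 2-letter combinations
  keys  <- apply(pairs, 2, function(p) paste(sort(p), collapse = ""))
  table(keys)
})
# Most frequent pair per sequence:
sapply(pair_counts, function(tb) names(tb)[which.max(tb)])
```

The `sort()` inside the key construction is what collapses reversed pairs; everything else mirrors the `combn`/`table` approach from the thread.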
Re: [R] split date and time
Hi,

dat1 <- read.table(text="Number,TimeStamp,Value
1,1/1/2013 0:00,1
2,1/1/2013 0:01,2
3,1/1/2013 0:03,3
",sep=",",header=TRUE,stringsAsFactors=FALSE)

dat2 <- data.frame(Number=dat1[,1], do.call(rbind, strsplit(dat1[,2], " ")), Value=dat1[,3])
names(dat2)[2:3] <- c("Date","Time")
dat2
#  Number     Date Time Value
#1      1 1/1/2013 0:00     1
#2      2 1/1/2013 0:01     2
#3      3 1/1/2013 0:03     3

A.K.

- Original Message - From: Ye Lin ye...@lbl.gov To: R help r-help@r-project.org Cc: Sent: Friday, April 12, 2013 5:49 PM Subject: [R] split date and time

Hi R experts, For example I have a dataset that looks like this:

Number TimeStamp     Value
1      1/1/2013 0:00 1
2      1/1/2013 0:01 2
3      1/1/2013 0:03 3

How can I split the TimeStamp column into two and return a new table like this:

Number Date     Time Value
1      1/1/2013 0:00 1
2      1/1/2013 0:01 2
3      1/1/2013 0:03 3

Thanks!
[R] Fw: split date and time
- Forwarded Message - From: arun smartpink...@yahoo.com To: Ye Lin ye...@lbl.gov Cc: Sent: Friday, April 12, 2013 6:25 PM Subject: Re: [R] split date and time

Hi Ye, Is this okay?

dat2 <- cbind(dat1[,-2], do.call(rbind, strsplit(dat1[,2], " ")), stringsAsFactors=FALSE)
dat2
#  Number Value        1    2
#1      1     1 1/1/2013 0:00
#2      2     2 1/1/2013 0:01
#3      3     3 1/1/2013 0:03
colnames(dat2)[3:4] <- c("Date","Time")
dat2
#  Number Value     Date Time
#1      1     1 1/1/2013 0:00
#2      2     2 1/1/2013 0:01
#3      3     3 1/1/2013 0:03
str(dat2)
#'data.frame': 3 obs. of 4 variables:
# $ Number: int 1 2 3
# $ Value : int 1 2 3
# $ Date  : chr "1/1/2013" "1/1/2013" "1/1/2013"
# $ Time  : chr "0:00" "0:01" "0:03"

From: Ye Lin ye...@lbl.gov To: arun smartpink...@yahoo.com Sent: Friday, April 12, 2013 6:16 PM Subject: Re: [R] split date and time

What if I have many columns? For example, dat1 has 50 columns and the TimeStamp column is the 10th. Is there a way to do that without listing all the other columns when building dat2? Thanks!
Re: [R] Fw: split date and time
Thanks! On Fri, Apr 12, 2013 at 3:30 PM, arun smartpink...@yahoo.com wrote:
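The follow-up question (a TimeStamp column at an arbitrary position among many columns) can be handled by selecting on the column name rather than its position; a minimal sketch, assuming the column still holds "date time" strings separated by a single space:

```r
# Split a timestamp column (selected by name) into Date and Time,
# keeping every other column of the data frame untouched.
split_timestamp <- function(df, col = "TimeStamp") {
  parts <- do.call(rbind, strsplit(df[[col]], " ", fixed = TRUE))
  cbind(df[setdiff(names(df), col)],
        Date = parts[, 1], Time = parts[, 2],
        stringsAsFactors = FALSE)
}

dat1 <- data.frame(Number = 1:3,
                   TimeStamp = c("1/1/2013 0:00", "1/1/2013 0:01", "1/1/2013 0:03"),
                   Value = 1:3, stringsAsFactors = FALSE)
split_timestamp(dat1)
```

Because the column is addressed by name, this works whether TimeStamp is the 2nd column or the 10th of 50.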
[R] kNNimpute error
Hi all, I'm trying to use kNNImpute in the imputation package to fill in missing precipitation data for a data frame I have. Example is:

okee:
Date        rainfall
1997-05-01  0
1997-05-02  0
1997-05-03  NA
1997-05-04  0
1997-05-05  0
...
2007-04-01  NA
2007-04-02  NA
2007-04-03  NA
2007-04-04  NA
2007-04-05  0
...

where there are large swaths (30 days) of data missing in the ten-year time series. I tried

newokee <- kNNImpute(okee, k=30, verbose=TRUE)

hoping it would impute data for the rows with NA values according to weighted means of the closest 30 non-NA neighbours, and I got the following message back:

[1] imputing on 270 missing values with matrix size 7304
[1] Computing distance matrix...
[1] Distance matrix complete
[1] Imputing row 70
Error in which(missing.matrix[rowIndex, ]) : subscript out of bounds
In addition: Warning message:
In dist(x, upper = T) : NAs introduced by coercion

What syntax do I need to impute the precipitation data? Failing that, do you have another recommendation of a method to use? Statistics is not my strong point. Thank you for any help you are able to give, Aimee.
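As a simpler baseline than kNN, linear interpolation over the date index is a common fallback for gappy series; this is a minimal base-R sketch (the `rain` vector is a made-up stand-in for the rainfall column). Note that interpolating precipitation across month-long gaps is statistically questionable; a climatological fill (e.g. the mean for that day of year) may be more defensible for long runs of NA.

```r
# Linear interpolation of NA runs with approx(); rule = 2 carries the
# nearest observed value outward so leading/trailing NAs are also filled.
rain <- c(0, 0, NA, 0, 2, NA, NA, NA, 1, 0)
idx  <- seq_along(rain)
ok   <- !is.na(rain)
filled <- approx(idx[ok], rain[ok], xout = idx, rule = 2)$y
filled   # no NAs remain; interpolated values fall between their neighbours
```

`approx()` is base R, so this avoids the matrix-shape assumptions that tripped up kNNImpute on a two-column data frame.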
[R] Batch open netcdf files and get variables
Hi, I'm new to R. I have some daily soil moisture data for the year 1979 in netcdf format, such as:

sm19790101.1.nc
sm19790102.1.nc
.
.
.
sm19791231.1.nc

I need to average a variable called "sm" to monthly resolution. I've done this

days <- formatC(1:31, width=2, flag="0")
ncfiles <- lapply(days, function(d){
  filename <- paste("sm197901", d, ".1.nc", sep="")
  # print(filename)
  open.nc(filename)
})

to open the files for January. The question is: how may I get the variable from each opened file? The RNetCDF package has var.get.nc(), which only reads objects of class NetCDF returned from open.nc(). Any help would be appreciated. Enhao
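Since `open.nc()` returns NetCDF handles, `var.get.nc()` can be applied to each element of the list; a minimal sketch, assuming the variable is named "sm", the files exist, and each daily array has the same shape:

```r
library(RNetCDF)

# Pull the "sm" array out of each opened file, then average across days
# to get the monthly mean field.
sm_list <- lapply(ncfiles, function(nc) var.get.nc(nc, "sm"))
monthly_mean <- Reduce(`+`, sm_list) / length(sm_list)

# Close the handles once the data are extracted.
invisible(lapply(ncfiles, close.nc))
```

If any day contains missing values, averaging elementwise with `rowMeans(sapply(sm_list, c), na.rm = TRUE)` (reshaped back afterwards) is one way to skip them.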
[R] CCA report species environment correlation
Hi everyone, I did a CCA with R in the vegan package and got all the outputs I need/want to report except for one: the species-environment correlation values. I only know of them because one of the sources I read about CCAs reported them, and I suspect it's a value reported when you do a CCA in SPSS (which I have never done) that just doesn't come up in R. Can anyone tell me how to find it or calculate it myself? Do I need to run post-hoc stats or additional tests on my results? Thanks in advance!
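vegan does compute species-environment correlations; they are simply not part of the default print output. A short sketch using vegan's `spenvcor()` on one of the package's bundled example datasets (substitute your own ordination object):

```r
library(vegan)
data(varespec, varechem)   # example community and environment data in vegan

ord <- cca(varespec ~ Al + P + K, data = varechem)
spenvcor(ord)   # species-environment correlation for each constrained axis
```

Note that vegan's documentation cautions that these correlations can be sensitive to noise, which is one reason they are not printed by default.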
[R] remove colinear variables
Hello, I have a dataset with 151 dependent variables estimated in 6 different conditions, with each condition repeated 3 times. It is proteomic and therefore expensive, hence the few repetitions. I want to conduct a stepwise discriminant analysis to identify the variables that really matter. To do so, I will use greedy.wilks and stepclass. However, I need to remove collinear variables to run these functions. Can anyone tell me how to identify the collinear variables so I can remove them? Thank you in advance, julien.
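One common way to flag (near-)collinear variables before a stepwise procedure is to scan the correlation matrix; caret's `findCorrelation()` does exactly this. A sketch on toy data (the 0.9 cutoff is an arbitrary placeholder, not a recommendation):

```r
library(caret)

# Toy matrix: 10 samples, 5 variables, with column 5 built to be
# nearly collinear with column 1.
set.seed(1)
X <- matrix(rnorm(50), nrow = 10)
X[, 5] <- X[, 1] + rnorm(10, sd = 0.01)

drop_idx  <- findCorrelation(cor(X), cutoff = 0.9)  # columns to remove
X_reduced <- X[, -drop_idx, drop = FALSE]           # one of the collinear pair is gone
```

With 151 variables and only 18 samples, exact linear dependence is guaranteed regardless of correlation; checking `qr(X)$rank` against `ncol(X)` tells you how many dimensions the data actually span.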
[R] model frame and formula mismatch in model.matrix()
Hello everyone, I am trying to fit the following model. All X. variables are continuous, while the conditions are categorical.

model <- lm(X2 +X3+X4+X5+X6+X7+X8+X9+X10+X11+X12+X13+X14+X15+X16+X17+X18+X19+X20+X21+X22+X23+X24+X25+X26+X27+X28+X29+X30+X31+X32+X33+X34+X35+X36+X37+X38+X39+X40+X41+X42+X43+X44+X45+X46+X47+X48+X49+X50+X51+X52+X53+X54+X55+X56+X57+X58+X59+X60+X61+X62+X63+X64+X65+X66+X67+X68+X69+X70+X71+X72+X73+X74+X75+X76+X77+X78+X79+X80+X81+X82+X83+X84+X85+X86+X87+X88+X89+X90+X91+X92+X93+X94+X95+X96+X97+X98+X99+X100+X101+X102+X103+X104+X105+X106+X107+X108+X109+X110+X111+X112+X113+X114+X115+X116+X117+X118+X119+X120+X121+X122+X123+X124+X125+X126+X127+X128+X129+X130+X131+X132+X133+X134+X135+X136+X137+X138+X139+X140+X141+X142+X143+X144+X145+X146+X147+X148+X149+X150+X151+X152 ~ conditions, data = proteo)

but I get this error message:

Error in model.matrix.default(mt, mf, contrasts) : model frame and formula mismatch in model.matrix()

Can someone help me please? Julien.
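For a multivariate response, `lm()` expects the response columns combined with `cbind()` on the left-hand side; a bare `+` there is interpreted as a sum, which is a typical cause of this mismatch error. A minimal sketch with a few placeholder columns standing in for the real `proteo` data:

```r
# Toy data frame standing in for `proteo`: 3 responses, 6 conditions x 2 reps.
set.seed(1)
proteo <- data.frame(X2 = rnorm(12), X3 = rnorm(12), X4 = rnorm(12),
                     conditions = gl(6, 2))

# Multivariate lm: one least-squares fit per response column.
fit <- lm(cbind(X2, X3, X4) ~ conditions, data = proteo)
coef(fit)   # coefficient matrix: one column per response
```

With 151 responses, writing the `cbind()` out by hand is impractical; passing a matrix built as `as.matrix(proteo[, response_cols])` on the left-hand side achieves the same thing.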
Re: [R] Removing rows that are duplicates but column values are in reversed order
Thanks very much for your rapid help Arun. Vince

On Apr 12, 2013, at 4:10 PM, arun kirshna [via R] wrote:

Hi, From your example data,

dat1 <- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
c e 12
",sep="",header=TRUE,stringsAsFactors=FALSE)

#it is easier to get the output you wanted
dat1[!duplicated(dat1$value),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#4   c   e    12

But, if you have cases like the one below (assuming that all instances where there is a reversed order have the same value):

dat2 <- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
e c 12
c e 12
",sep="",header=TRUE,stringsAsFactors=FALSE)

dat2[apply(dat2[,-3],1,function(x) {x1 <- order(x); x1[1] < x1[2]}),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#5   c   e    12

#or if you have cases like these:
dat3 <- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
a b 10
e c 12
c e 12
c d 11
",sep="",header=TRUE,stringsAsFactors=FALSE)

dat3New <- dat3[apply(dat3[,-3],1,function(x) {x1 <- order(x); x1[1] < x1[2]}),]
dat3New[!duplicated(dat3New$value),]
#  id1 id2 value
#1   a   b    10
#2   c   d    11
#6   c   e    12

A.K.

Hi everybody, I was hoping that someone could help me with this problem. I have a table with 3 columns. Some rows contain duplicates where the identifiers in columns 1 and 2 are in reverse order, but the value associated with the row is the same. For example:

id1 id2 value
a   b   10
c   d   11
b   a   10
c   e   12

Rows 1 and 3 are duplicates (have the same value). I would like to retain only row 1 and delete row 3. The final table should look like this:

id1 id2 value
a   b   10
c   d   11
c   e   12

Thanks in advance for any help provided. Vince
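An alternative sketch that drops reversed duplicates without relying on the value column at all: normalize each id pair into a canonical order with `pmin()`/`pmax()` and then apply `duplicated()` to the resulting key.

```r
dat <- data.frame(id1   = c("a", "c", "b", "c"),
                  id2   = c("b", "d", "a", "e"),
                  value = c(10, 11, 10, 12),
                  stringsAsFactors = FALSE)

# Canonical key: smaller id first, so "a b" and "b a" map to the same key.
key <- paste(pmin(dat$id1, dat$id2), pmax(dat$id1, dat$id2))
dat[!duplicated(key), ]   # rows 1, 2 and 4 remain
```

Because the key ignores the value column, this still works when two distinct pairs happen to share the same value, which trips up the `!duplicated(dat1$value)` shortcut.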
[R] BLAS with glmnet
I'm using a multithreaded BLAS library with R and I see the expected speed improvements with matrix multiplication, svd, etc. However, glmnet continues to use only a single CPU. Since this package is compiled from Fortran, is this the expected behavior, or is there a way to compile the glmnet package so that it uses the multithreaded BLAS library? Thank you, Brian
[R] Problem with handling of attributes in xmlToList in XML package
Hello all, I have a problem with the way attributes are dealt with in the function xmlToList(), and I haven't been able to figure it out for days now. Say I have a document (produced by nmap) like this:

mydoc <- '<host starttime="1365204834" endtime="1365205860"><status state="up" reason="echo-reply" reason_ttl="127"/>
<address addr="XXX.XXX.XXX.XXX" addrtype="ipv4"/>
<ports><port protocol="tcp" portid="135"><state state="open" reason="syn-ack" reason_ttl="127"/><service name="msrpc" product="Microsoft Windows RPC" ostype="Windows" method="probed" conf="10"><cpe>cpe:/o:microsoft:windows</cpe></service></port>
<port protocol="tcp" portid="139"><state state="open" reason="syn-ack" reason_ttl="127"/><service name="netbios-ssn" method="probed" conf="10"/></port>
</ports>
<times srtt="647" rttvar="71" to="10"/>
</host>'

I want to store this as a list of lists, so I do:

mytree <- xmlTreeParse(mydoc)
myroot <- xmlRoot(mytree)
mylist <- xmlToList(myroot)

Now my problem is that when I want to fetch the attributes of the service running on each port, the behavior is not consistent:

mylist[["ports"]][[1]][["service"]]$.attrs["name"]
#   name
#"msrpc"
mylist[["ports"]][[2]][["service"]]$.attrs["name"]
#Error in mylist[["ports"]][[2]][["service"]]$.attrs :
#  $ operator is invalid for atomic vectors

I understand that the way they are defined in the document is not the same, but I think there should still be a consistent behavior. I've tried many combinations of parameters for xmlTreeParse() but nothing has helped. I can't find a way to call up the name of the service consistently, regardless of whether the node has children or not. Any tips? All the best, S.G.
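The inconsistency arises because xmlToList() flattens a childless node into a plain attribute vector (so `$.attrs` no longer exists). One way to sidestep it is to query the parsed tree directly with XPath; a sketch using the XML package's `xpathSApply()` and `xmlGetAttr()` on the document above:

```r
library(XML)

# mydoc as defined above; xmlParse() gives an internal document
# that XPath queries can walk.
doc <- xmlParse(mydoc)

# One "name" attribute per <service> node, whether or not it has children.
xpathSApply(doc, "//service", xmlGetAttr, "name")
```

This returns a plain character vector of service names in document order, so no per-node branching on `.attrs` is needed.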
[R] RJSONIO Installation
Trying to install RJSONIO and I've run into a couple of issues. 1. When installing from R, it starts to download, then I get: * installing *source* package 'RJSONIO' ... ERROR: configuration failed for package 'RJSONIO' * removing 'C:/Program Files/R/R-3.0.0/library/RJSONIO'. 2. I've tried similar steps from the cmd prompt (I'm on Win7). I receive the same message: it starts to install, then configuration fails. 3. Looking into the 'read_me' doc that comes with RJSONIO, it mentions the libjson C libraries. It gives a special command to use if you don't have these installed, which I tried, but it didn't work. At this point I want to generate the libjson libraries to try the other 'read_me' suggestions, or just get it to work from R. FYI, with R CMD I've tried --binary and I've tried --build; neither works. Please help, running out of options.
Re: [R] Comparison of Date format
Hi, In the example you provided, it looks like the dates in Date2 happen first. So, I changed it a bit.

DataA <- read.table(text="
ID,Status,Date1,Date2
1,A,3-Feb-01,15-May-01
1,B,15-May-01,16-May-01
1,A,16-May-01,3-Sep-01
1,B,3-Sep-01,13-Sep-01
1,C,13-Sep-01,26-Feb-04
2,A,9-Feb-01,24-May-01
2,B,24-May-01,25-May-01
2,A,25-May-01,16-Mar-02
2,A,6-Mar-02,18-Mar-02
2,A,14-Sep-01,6-Mar-02
",sep=",",header=TRUE,stringsAsFactors=FALSE)

library(stringr)
DataA[,3] <- str_trim(DataA[,3])
DataA[,4] <- str_trim(DataA[,4])

DataB <- read.table(text="
ID Date.Accident
1 3-Sep-01
1 20-Jan-05
1 26-Feb-04
2 6-Mar-02
",sep="",header=TRUE,stringsAsFactors=FALSE)

lst1 <- lapply(seq_len(nrow(DataB)), function(i) {
  x1 <- unlist(mapply(function(x,y) which(x==y), DataA[,3:4], DataB[i,2]))
  x2 <- if(length(x1)==2) {
          DataA[x1[which.min(x1)], !names(DataA) %in% names(x1[which.max(x1)])]
        } else if(length(x1)==1) {
          DataA[x1, c("ID","Status",names(x1))]
        } else NULL
})
lst2 <- lapply(lst1, data.frame)
lst2 <- lst2[lapply(lst2, nrow) != 0]
lst2
#[[1]]
#  ID Status    Date2
#3  1      A 3-Sep-01
#[[2]]
#  ID Status     Date2
#5  1      C 26-Feb-04
#[[3]]
#  ID Status    Date1
#9  2      A 6-Mar-02

library(plyr)
dataNew <- do.call(rbind, lapply(lst2, function(x) {colnames(x)[3] <- colnames(DataB)[2]; x}))
res <- join(dataNew, DataB, by=c("Date.Accident","ID"), type="right")
res
#  Date.Accident ID Status
#1      3-Sep-01  1      A
#2     20-Jan-05  1   <NA>
#3     26-Feb-04  1      C
#4      6-Mar-02  2      A

#or you can split by ID
lst1New <- lapply(unique(DataA$ID), function(i) {
  x1 <- DataA[DataA$ID==i,]
  x2 <- DataB[DataB$ID==i,]
  do.call(rbind, lapply(seq_len(nrow(x2)), function(i) {
    x3 <- unlist(mapply(function(x,y) which(x==y), x1[,3:4], x2[i,2]))
    x4 <- if(length(x3)==2) {
            x1[x3[which.min(x3)], !names(x1) %in% names(x3[which.max(x3)])]
          } else if(length(x3)==1) {
            x1[x3, c("ID","Status",names(x3))]
          } else NULL
  }))
})
lst1New
#[[1]]
#  ID Status     Date2
#3  1      A  3-Sep-01
#5  1      C 26-Feb-04
#[[2]]
#  ID Status    Date1
#9  2      A 6-Mar-02

dataNew1 <- do.call(rbind, lapply(lst1New, function(x) {colnames(x)[3] <- colnames(DataB)[2]; x}))
res1 <- join(dataNew1, DataB, by=c("Date.Accident","ID"), type="right")
res1
#  Date.Accident ID Status
#1      3-Sep-01  1      A
#2     20-Jan-05  1   <NA>
#3     26-Feb-04  1      C
#4      6-Mar-02  2      A

A.K.

From: farnoosh sheikhi farnoosh...@yahoo.com To: smartpink...@yahoo.com Sent: Friday, April 12, 2013 5:40 PM Subject: Comparison of Date format

Hi there, Hope all is well. I have complicated data and I need to create a new variable based on the date and description of the data. I would really appreciate it if you can help me. Here is how the data look:

DataA:
ID Status Date1     Date2
1  A      3-Feb-01  15-May-01
1  B      15-May-01 16-May-01
1  A      16-May-01 3-Sep-01
1  B      3-Sep-01  13-Sep-01
1  C      13-Sep-01 26-Feb-04
2  A      9-Feb-01  24-May-01
2  B      24-May-01 25-May-01
2  A      25-May-01 6-Mar-02
2  A      6-Mar-02  18-Mar-02

DataB:
ID Date.Accident Result
1  3-Sep-01      A
1  20-Jan-05     NA
2  6-Mar-02      A

I want to compare DataA to DataB for each ID. If Date1 or Date2 matches Date.Accident, return the Status from DataA as a new Result in DataB. The trick here is that I have two dates that are matched, but I want the status of the one that happened first. The sample sizes of the two datasets are not the same. I really appreciate your time and help. Thanks.
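The matching above can also be phrased as a reshape-and-merge: stack Date1 and Date2 into one long date column, keep the first match per (ID, date), and right-join onto DataB. A base-R sketch (column names as in the thread; the first-row tie-break only approximates the which.min logic used above):

```r
DataA <- data.frame(ID = c(1, 1, 2), Status = c("A", "C", "A"),
                    Date1 = c("16-May-01", "13-Sep-01", "14-Sep-01"),
                    Date2 = c("3-Sep-01", "26-Feb-04", "6-Mar-02"),
                    stringsAsFactors = FALSE)
DataB <- data.frame(ID = c(1, 1, 2), Date.Accident = c("3-Sep-01", "20-Jan-05", "6-Mar-02"),
                    stringsAsFactors = FALSE)

# Long format: one row per (ID, Status, date), in original row order.
long <- rbind(data.frame(DataA[c("ID", "Status")], Date.Accident = DataA$Date1),
              data.frame(DataA[c("ID", "Status")], Date.Accident = DataA$Date2))
long <- long[!duplicated(long[c("ID", "Date.Accident")]), ]

merge(DataB, long, by = c("ID", "Date.Accident"), all.x = TRUE)
```

Unmatched accident dates come back with Status NA, matching the right-join behavior of plyr's join() above.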
Re: [R] Solving an integral in R gives the error “The integral is probably divergent”
On Sat, Apr 13, 2013 at 6:31 AM, peter dalgaard pda...@gmail.com wrote: But is it supposed to be t^{-3/2} or t^{-0.5}?? The formula has the former and the code the latter, and the integral is clearly divergent with the former.

I didn't look at that -- I just assumed that the code would reproduce the reported error. -thomas

-pd

On Apr 12, 2013, at 04:51, Thomas Lumley wrote:

I don't get an error message (after I correct the missing line break after the comment):

b <- sapply(a, Cfun, upper=1)
b
  [1]  1.583458e-54  7.768026e-50  2.317562e-45  4.206260e-41  4.645737e-37  3.123801e-33  1.279358e-29  3.193257e-26  4.860876e-23
 [10]  4.516582e-20  2.564400e-17  8.908932e-15  1.896996e-12  2.481084e-10  1.998561e-08  9.946570e-07  3.067751e-05  5.862075e-04
 [19]  6.818952e-03  4.297061e-02  0.000000e+00  3.175122e-01  3.723022e-01  2.364930e-01  9.144836e-02  2.190878e-02  3.252754e-03
 [28]  2.983763e-04  1.685692e-05  5.849602e-07  1.244158e-08  1.619155e-10  1.287603e-12  6.250149e-15  1.850281e-17  3.338241e-20
 [37]  3.668412e-23  2.454192e-26  9.991546e-30  2.474577e-33  3.727226e-37  3.413319e-41  1.900112e-45  6.428505e-50  1.321588e-54
 [46]  1.650722e-59  1.252524e-64  5.772750e-70  1.615916e-75  2.746972e-81  2.835655e-87  1.777399e-93 6.764271e-100 1.562923e-106
 [55] 2.192373e-113 1.866955e-120 9.651205e-128 3.028623e-135 5.769185e-143 6.670835e-151 4.682023e-159 1.994643e-167 5.157808e-176
 [64] 8.095084e-185 7.711162e-194 4.458042e-203 1.564139e-212 3.330362e-222 4.302974e-232 3.373500e-242 1.604721e-252 4.631224e-263
 [73] 8.108474e-274 8.611898e-285 5.547745e-296  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00
 [82]  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00
 [91]  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00
[100]  0.000000e+00

-thomas

On Tue, Apr 9, 2013 at 3:14 PM, Janesh Devkota janesh.devk...@gmail.com wrote: I am trying to solve an integral in R. However, I am getting an error when I am trying to solve for that integral.
The equation that I am trying to solve is as follows:

$$ C_m = \frac{|x|\, e^{2x}}{\pi^{1/2}} \int_0^t t^{-3/2} e^{-x^2/t - t}\, dt $$

The code that I am using is as follows:

a <- seq(from=-10, by=0.5, length=100)

## Create a function to compute the integration
Cfun <- function(XX, upper){
  integrand <- function(x) x^(-0.5)*exp((-XX^2/x)-x)
  integrated <- integrate(integrand, lower=0, upper=upper)$value
  (final <- abs(XX)*pi^(-0.5)*exp(2*XX)*integrated)
}
b <- sapply(a, Cfun, upper=1)

The error that I am getting is as follows:

Error in integrate(integrand, lower = 0, upper = upper) : the integral is probably divergent

Does this mean I cannot solve the integral? Any possible ways to fix this problem would be highly appreciated. The question can also be found at http://stackoverflow.com/questions/15892586/solving-an-integral-in-r-gives-error-the-integral-is-probably-divergent. Thanks.

-- Thomas Lumley Professor of Biostatistics University of Auckland
-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com

-- Thomas Lumley Professor of Biostatistics University of Auckland
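With the exponent corrected to t^{-3/2} as in the formula, the integrand still vanishes as t -> 0 for x != 0 (the e^{-x^2/t} factor dominates the t^{-3/2} blow-up), so integrate() handles it for nonzero x; only x = 0 is genuinely divergent. A sketch of the corrected function:

```r
# Integrand with the t^(-3/2) factor from the formula (not the t^(-0.5)
# from the original code). Diverges only at XX = 0.
Cfun2 <- function(XX, upper) {
  integrand <- function(t) t^(-1.5) * exp(-XX^2 / t - t)
  val <- integrate(integrand, lower = 0, upper = upper)$value
  abs(XX) * pi^(-0.5) * exp(2 * XX) * val
}
Cfun2(1, upper = 1)   # finite; XX = 0 would trigger the divergence error
```

Skipping (or special-casing) x = 0 in the `sapply()` over the grid avoids the "integral is probably divergent" error for the corrected integrand.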
Re: [R] Comparison of Date format
HI, In cases like the one below:

DataA <- read.table(text="
ID,Status,Date1,Date2
1,A,3-Feb-01,15-May-01
1,B,15-May-01,16-May-01
1,A,16-May-01,3-Sep-01
1,B,3-Sep-01,13-Sep-01
1,C,13-Sep-01,26-Feb-04
2,A,9-Feb-01,24-May-01
2,B,24-May-01,25-May-01
2,A,25-May-01,16-Mar-02
2,A,6-Mar-02,18-Mar-02
2,A,14-Sep-01,6-Mar-02
",sep=",",header=TRUE,stringsAsFactors=FALSE)

library(stringr)
DataA[,3] <- str_trim(DataA[,3])
DataA[,4] <- str_trim(DataA[,4])

DataB <- read.table(text="
ID Date.Accident
1 3-Sep-01
1 20-Jan-05
1 26-Feb-04
2 25-May-01
2 6-Mar-02
",sep="",header=TRUE,stringsAsFactors=FALSE)
DataB[,2] <- str_trim(DataB[,2])

#Needed modification in the 2nd method:
dataNew2 <- do.call(rbind, lapply(unique(DataA$ID), function(i) {
  x1 <- DataA[DataA$ID==i,]
  x2 <- DataB[DataB$ID==i,]
  do.call(rbind, lapply(seq_len(nrow(x2)), function(i) {
    x3 <- unlist(mapply(function(x,y) which(x==y), x1[,3:4], x2[i,2]))
    x4 <- if(length(x3)==2) {
            x1[x3[which.min(x3)], !names(x1) %in% names(x3[which.max(x3)])]
          } else if(length(x3)==1) {
            x1[x3, c("ID","Status",names(x3))]
          } else NULL
    if(length(x4) > 0) {colnames(x4)[3] <- colnames(DataB)[2]; x4} else NULL
  }))
}))

library(plyr)
res2 <- join(dataNew2, DataB, by=c("Date.Accident","ID"), type="right")
res2
#  Date.Accident ID Status
#1      3-Sep-01  1      A
#2     20-Jan-05  1   <NA>
#3     26-Feb-04  1      C
#4     25-May-01  2      B
#5      6-Mar-02  2      A

Using the first method:

lst1 <- lapply(seq_len(nrow(DataB)), function(i) {
  x1 <- unlist(mapply(function(x,y) which(x==y), DataA[,3:4], DataB[i,2]))
  x2 <- if(length(x1)==2) {
          DataA[x1[which.min(x1)], !names(DataA) %in% names(x1[which.max(x1)])]
        } else if(length(x1)==1) {
          DataA[x1, c("ID","Status",names(x1))]
        } else NULL
})
lst2 <- lapply(lst1, data.frame)
lst2 <- lst2[lapply(lst2, nrow) != 0]

library(plyr)
dataNew <- do.call(rbind, lapply(lst2, function(x) {colnames(x)[3] <- colnames(DataB)[2]; x}))
res <- join(dataNew, DataB, by=c("Date.Accident","ID"), type="right")

identical(res, res2)
#[1] TRUE

A.K.
- Original Message - From: arun smartpink...@yahoo.com To: farnoosh sheikhi farnoosh...@yahoo.com Cc: R help r-help@r-project.org Sent: Saturday, April 13, 2013 12:41 AM Subject: Re: Comparison of Date format