[R] favorable mention and use of R on the net
http://equitablegrowth.org/2014/08/26/moving-corrected-calculations-last-weeks-shiller-stock-market-posts-r-afternoon-note/ Begins "Because only people who really, really, really want to make bad mistakes do things in the un-debuggable Excel (or Numbers)..." and then proceeds with an extended analysis in R. He may have used Excel in the earlier posts; the comment is certainly a reference to an infamous series of errors in a spreadsheet behind a much-publicized claim that it was disastrous for countries to have debt above 90% of GDP. See http://en.wikipedia.org/wiki/Growth_in_a_Time_of_Debt for details. The background question of the blog post is whether current stock market valuations indicate stocks are likely to be a poor medium-term investment (10 years). A quick look does not reveal to me what, if any, substantive changes to the conclusions resulted from using R rather than Excel (or even whether his previous conclusions were based on Excel). In fact, it's kind of hard to tell what the conclusion is! A couple of more technical comments: DeLong (the post's author) uses simple regression (lm) and then observes "The significance levels that R reports are wrong: its naive regression package assumes that each of the 1482 observed 10-year returns is independent of each of the others. They are not." I assume R has some tools that can do proper time series analyses; I'm not sure why he didn't use them. DeLong is an economic historian, not an econometrician. One important observation stems from something even simpler than regression: "Basically what we know about expected returns is that on the one occasion when CAPE rose above 30, the dot-com crash of 2000 was in the near future and the housing crash of 2008 came into the ten-year return window." That is not much information on which to base a long-run sell decision.
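On the time-series point: R does have standard tools for this. Overlapping 10-year returns make the regression errors strongly serially correlated, and heteroskedasticity-and-autocorrelation-consistent (Newey-West) standard errors from the sandwich package are one conventional remedy. A hedged sketch on simulated data (the variable names and numbers are illustrative, not DeLong's):

```r
library(sandwich)  # NeweyWest() HAC covariance estimator
library(lmtest)    # coeftest() for coefficient tests with a supplied vcov

set.seed(1)
cape <- rnorm(200)
# AR(1) errors mimic the autocorrelation induced by overlapping return windows
err <- as.numeric(arima.sim(list(ar = 0.9), n = 200))
ret10 <- 0.06 - 0.02 * cape + err

fit <- lm(ret10 ~ cape)
# Naive lm() standard errors understate uncertainty here;
# HAC standard errors with a lag near the overlap length are more honest
coeftest(fit, vcov = NeweyWest(fit, lag = 12))
```

The coefficient estimates are unchanged; only the standard errors (and hence the significance levels DeLong complains about) are corrected.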
Ross Boylan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] wilcox.test - difference between p-values of R and online calculators
It seems your numbers have ties. What happens if you run wilcox.test with correct=FALSE -- will the results be the same as the online calculators? Contact Details:--- Contact me: tal.gal...@gmail.com | Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Wed, Sep 3, 2014 at 3:54 AM, W Bradley Knox bradk...@mit.edu wrote: Hi. I'm taking the long-overdue step of moving from using online calculators to compute results for Mann-Whitney U tests to a more streamlined system involving R. However, I'm finding that R computes a different result than the 3 online calculators that I've used before (all of which approximately agree). These calculators are here: http://elegans.som.vcu.edu/~leon/stats/utest.cgi http://vassarstats.net/utest.html http://www.socscistatistics.com/tests/mannwhitney/ An example calculation is

wilcox.test(c(359,359,359,359,359,359,335,359,359,359,359,359,359,359,359,359,359,359,359,359,359,303,359,359,359),
            c(332,85,359,359,359,220,231,300,359,237,359,183,286,355,250,105,359,359,298,359,359,359,28.6,359,359,128))

which prints

Wilcoxon rank sum test with continuity correction
data:  c(359, 359, 359, 359, 359, 359, 335, 359, 359, 359, 359, 359, 359, 359, 359, 359, 359, 359, 359, 359, 359, 303, 359, 359, 359) and c(332, 85, 359, 359, 359, 220, 231, 300, 359, 237, 359, 183, 286, 355, 250, 105, 359, 359, 298, 359, 359, 359, 28.6, 359, 359, 128)
W = 485, p-value = 0.0002594
alternative hypothesis: true location shift is not equal to 0
Warning message:
In wilcox.test.default(c(359, 359, 359, 359, 359, 359, 335, 359, ... :
  cannot compute exact p-value with ties

However, all of the online calculators find p-values close to 0.0025, 10x the value output by R. All results are for a two-tailed case. Importantly, the W value computed by R *does agree* with the U values output by the first two online calculators listed above, yet it has a different p-value.
Can anyone shed some light on how and why R's calculation differs from that of these online calculators? Thanks for your time. W. Bradley Knox, PhD http://bradknox.net bradk...@mit.edu
Re: [R] shiny datatables column filtering plugin
Thank you for checking Yihui, on the off chance are you familiar with any other methods to filter on multiple conditions? On Tue, Sep 2, 2014 at 11:07 PM, Yihui Xie x...@yihui.name wrote: I just tested it and this plugin does not seem to work with the new .DataTable() API in DataTables 1.10.x, so I guess it is unlikely to make it work in (the current development version of) shiny. It is not in the official list of plugins, either: http://www.datatables.net/extensions/index Regards, Yihui -- Yihui Xie xieyi...@gmail.com Web: http://yihui.name On Tue, Sep 2, 2014 at 11:59 AM, Charles Determan Jr deter...@umn.edu wrote: Greetings, I am currently exploring some capabilities of the 'Shiny' package. I am currently working with the most recent version of 'shiny' from the rstudio github repository (version - 0.10.1.9006) in order to use the most up to date datatables plugin. Using the ggplot2 diamonds dataset, I can easily set columns as unsearchable (commented out below) and I could also subset out all the 'Ideal' diamonds for example, however I cannot filter out multiple conditions such as 'Ideal' and 'Fair' diamonds together. From my searching, this multiple filtering can be done with checkboxes from the column using the jquery column filtering plugin ( http://jquery-datatables-column-filter.googlecode.com/svn/trunk/checkbox.html ). Despite this, I cannot get this plugin to work with my shiny app. Any insight would be appreciated. 
library(shiny)
library(ggplot2)
runApp(list(
  ui = basicPage(
    h1('Diamonds DataTable with TableTools'),
    # added column filter plugin
    singleton(tags$head(tags$script(
      src = 'https://code.google.com/p/jquery-datatables-column-filter/source/browse/trunk/media/js/jquery.dataTables.columnFilter.js',
      type = 'text/javascript'))),
    dataTableOutput('mytable')
  ),
  server = function(input, output) {
    output$mytable = renderDataTable({
      diamonds[, 1:6]
    }, options = list(
      pageLength = 10,
      # columnDefs = I('[{"targets": [0,1], "searchable": false}]')
      columnFilter = I('[{ "columnDefs": [{"targets": [0,1], "type": "checkbox"}] }]')
    ))
  }
))

Charles
[R] .C and .Fortran
Hi, I'd like to know what is the difference between the functions .C() and .Fortran(). I noticed that by enclosing my F90 subroutines in modules, .Fortran() can't find them any more in the load table, while .C() still can. I also checked that the subroutine was loaded with the is.loaded() function... So can anyone explain to me the difference and which is better to use? Thanks in advance, Filippo
Re: [R] frequencies of a discrete numeric variable, including zeros
Thanks to all who replied to this thread. To summarize, John Fox and William Dunlap's suggestion amounts to this plot, where it becomes *crucial* to eliminate the zeros (otherwise they would not be distinguishable from the counts of 1, with points()):

# Fox/Dunlap plot, using plot.table method
art.tab0 <- table(art)
plot(art.tab0, ylab="Frequency", xlab="Number of articles")
points(as.numeric(names(art.tab0)), art.tab0, pch=16)

Here, I actually prefer the barplot, using factor() to retain the zeros:

# coerce to a factor, then use table()
art.fac <- factor(art, levels = 0:19)
art.tab <- table(art.fac)
barplot(art.tab, ylab="Frequency", xlab="Number of articles")

However, the frequencies for small values of art dominate the display, and I'm contemplating a Poisson regression anyway, so why not plot on a log scale:

# plot on log scale, but start at 1 to avoid log(0)
barplot(art.tab+1, ylab="log(Frequency+1)", xlab="Number of articles", log="y")
# plot on log scale, directly
barplot(log(art.tab+1), ylab="log(Frequency+1)", xlab="Number of articles")

The first method, using log="y", gives axis labels on the scale of frequency. best, -Michael

On 9/2/2014 3:49 PM, John Fox wrote: Hi Michael, I think that histograms are intrinsically misleading for discrete data, and that while bar graphs are an improvement, they also invite misinterpretation.
I generally do something like this:

f <- table(factor(art, levels=0:19))
plot(as.numeric(names(f)), as.numeric(f), type="h", xlab="art", ylab="frequency", axes=FALSE)
axis(1, pos=0, at=0:19)
axis(2)
points(as.numeric(names(f)), f, pch=16)
abline(h=0)

Actually, I prefer omitting the points corresponding to 0 counts, which is even simpler:

f <- table(art)
plot(as.numeric(names(f)), as.numeric(f), type="h", xlab="art", ylab="frequency", axes=FALSE)
axis(1, pos=0, at=min(art):max(art))
axis(2)
points(as.numeric(names(f)), f, pch=16)
abline(h=0)

Best, John --- John Fox, Professor McMaster University Hamilton, Ontario, Canada http://socserv.socsci.mcmaster.ca/jfox/

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Friendly Sent: Tuesday, September 02, 2014 1:29 PM To: R-help Subject: [R] frequencies of a discrete numeric variable, including zeros

The data vector, art, given below using dput(), gives a set of discrete numeric values for 915 observations, in the range of 0:19. I want to make some plots of the frequency distribution, but the standard tools (hist, barplot, table) don't give me what I want to make a custom plot, due to 0 frequencies for some of the 0:19 counts. table() excludes the values of art that occur with zero frequency, and these are excluded in barplot():

table(art)
art
  0   1   2   3   4   5   6   7   8   9  10  11  12  16  19
275 246 178  84  67  27  17  12   1   2   1   1   2   1   1

barplot(table(art))

A direct calculation, using colSums of outer(), gives me the values I want, but this seems unnecessarily complicated for this simple task.
(art.freq - colSums(outer(art, 0:19, `==`))) [1] 275 246 178 84 67 27 17 12 1 2 1 1 2 0 0 0 1 0 0 1 barplot(art.freq, names.arg=0:19) Moreover, I was surprised by the result of hist() on this data, because the 0 1 counts from the above were combined in this call: art.hist - hist(art, breaks=0:19, plot=FALSE) art.hist$breaks [1] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 art.hist$counts [1] 521 178 84 67 27 17 12 1 2 1 1 2 0 0 0 1 0 0 1 Is there some option I missed here? The data: dput(art) c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
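For counts of integer values over a fixed range like 0:19, base R's tabulate() also returns the zero counts directly and is simpler than the colSums/outer construction. A small sketch with toy data (the same call applied to art with nbins = 20 gives the art.freq vector above):

```r
# tabulate() counts occurrences of the integers 1..nbins,
# so shift 0-based values up by 1: bin k holds the count of value k-1
art.toy <- c(0, 0, 0, 1, 1, 3)            # values in 0:3, value 2 absent
freq <- tabulate(art.toy + 1, nbins = 4)  # zero counts are kept
freq
# [1] 3 2 0 1
barplot(freq, names.arg = 0:3)
```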
[R] strange date format after extracting calendar data with RSQLite
Hi, I have recently downloaded RSQLite with the intention of using it to access Google Calendar data stored in Thunderbird's cache.sqlite and local.sqlite. Thunderbird exports the data just fine as a .ics file, but when I access it directly from R using the RSQLite function dbGetQuery I get strange dates. As examples: a date stored as 20140818 appears to me in R as 1129528320, and a date stored as 20140823 appears to me in R as -1663476736. I have not been able to make sense of this and I have found no way to convert it back. I am wondering if anyone has experienced these issues or has any suggestions what is causing this and how to correct it. Thank you, Nuno
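The garbled values (including the negative one) look like 32-bit overflow of a 64-bit timestamp: Thunderbird's calendar tables store times as PRTime, i.e. microseconds since 1970-01-01 UTC, which does not fit in a 32-bit integer. A hedged sketch of the conversion, assuming the column is read back at full precision (e.g. via CAST(... AS REAL) in the SQL; the table and column names here are illustrative, not verified against Nuno's database):

```r
# e.g. dbGetQuery(con, "SELECT CAST(event_start AS REAL) AS t FROM cal_events")
us <- 1408320000000000  # hypothetical PRTime value for 2014-08-18 00:00:00 UTC
as.POSIXct(us / 1e6, origin = "1970-01-01", tz = "UTC")  # microseconds -> seconds
# [1] "2014-08-18 UTC"
```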
[R] predictNLS {propagate} bug for multiple predictors?
Many R users have requested means to generate prediction/confidence intervals for their non-linear models in R. The propagate package offers this service. However, I'm having issues when applying predictNLS to my model. Has anyone else had issues with predictNLS? Here is my simple problem.

t <- c(194.84, 95.62, 40.87, 31.64, 0.00)
m <- c(0.00, 2.29, 4.72, 10.95, 28.36)
t0 <- t[1]
m0 <- m[5]
dat <- data.frame(t = t, m = m)
nlls <- nls(m ~ (m0 ^ (1 / lambda) - (t * m0 / t0) ^ (1 / lambda)) ^ lambda,
            start = list(lambda = 1), data = dat)
xv <- seq(0, t0, length.out = 100)
library(propagate)

predictNLS forces me to redefine m0 and t0, which are constants in the model.

predictNLS(nlls, newdata = data.frame(t = xv))
Error in predictNLS(nlls, newdata = data.frame(t = xv, t0 = t0, m0 = m0)) :
  newdata should have name(s) m0, t, t0!

OK, predict.nls doesn't have this issue. But let's just give it what it wants. Notice the predictors need to be defined in the right order as well.

predictNLS(nlls, newdata = data.frame(m0 = m0, t = xv, t0 = t0))
Propagating predictor value #1 ...
Error in propagate(expr = EXPR, data = DF, use.cov = COV, alpha = alpha, :
  Names of input dataframe and var-cov matrix do not match!

I'm not sure what else I can do here to fix the problem, as the issue seems to be related to names, which predictNLS has accepted until its internal use of propagate(). My model is quite simple (i.e., plot the data and see). predictNLS works fine on the example provided in the package's documentation, which has only one predictor. Could this be where the error is being generated (i.e., multiple predictors)? What am I doing wrong here? Thanks for any help/advice, -- Patrick Gauthier PhD Candidate Faculty of Natural Resources Management Lakehead University Office: BB-1007-D (tel. 807 766-7127) Lab: BB-1071-A (tel.
807 343-8739) http://abel.lakeheadu.ca/index.shtml http://forward.lakeheadu.ca/gradstudents.htm
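Until the propagate issue is sorted out, an approximate confidence band can be built with base R alone: simulate the single parameter from its asymptotic normal distribution and push the draws through the mean function. A rough sketch under the usual nls asymptotics (this is not the propagate method, just a stand-in):

```r
# model and data from the post
t <- c(194.84, 95.62, 40.87, 31.64, 0.00)
m <- c(0.00, 2.29, 4.72, 10.95, 28.36)
t0 <- t[1]; m0 <- m[5]
nlls <- nls(m ~ (m0^(1/lambda) - (t * m0 / t0)^(1/lambda))^lambda,
            start = list(lambda = 1), data = data.frame(t = t, m = m))

# draw lambda from its asymptotic N(lambda-hat, se^2) and evaluate the model
set.seed(1)
lam <- rnorm(2000, coef(nlls), sqrt(vcov(nlls)[1, 1]))
lam <- lam[lam > 0]  # guard: the model is only defined for positive lambda
f <- function(tt) (m0^(1/lam) - (tt * m0 / t0)^(1/lam))^lam

xv <- seq(0, t0, length.out = 100)
# pointwise 95% band for the fitted curve over xv
band <- vapply(xv, function(tt) quantile(f(tt), c(0.025, 0.975)), numeric(2))
```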
Re: [R] wilcox.test - difference between p-values of R and online calculators
That does not change the results. The problem is likely to be the way ties are handled. The first sample has 25 values of which 23 are identical (359). The second sample has 26 values of which 12 are identical (359). The difference between the implementations may be a result of the way the ties are ranked. For example, the R function rank() offers 5 different ways of handling the rank on tied observations. With so many ties, that could make a substantial difference. Package coin has wilcox_test(), which uses Monte Carlo simulation to estimate the confidence limits. - David L Carlson Department of Anthropology Texas A&M University College Station, TX 77840-4352

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Tal Galili Sent: Wednesday, September 3, 2014 5:24 AM To: W Bradley Knox Cc: r-help@r-project.org Subject: Re: [R] wilcox.test - difference between p-values of R and online calculators

It seems your numbers have ties. What happens if you run wilcox.test with correct=FALSE -- will the results be the same as the online calculators? Contact Details:--- Contact me: tal.gal...@gmail.com | Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Wed, Sep 3, 2014 at 3:54 AM, W Bradley Knox bradk...@mit.edu wrote: Hi. I'm taking the long-overdue step of moving from using online calculators to compute results for Mann-Whitney U tests to a more streamlined system involving R. However, I'm finding that R computes a different result than the 3 online calculators that I've used before (all of which approximately agree).
[...]
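The discrepancy can in fact be reproduced by hand. wilcox.test() corrects the variance of the normal approximation for ties; calculators that use the plain (uncorrected) variance land near 0.0025 from the very same W. A sketch of the two calculations on the posted data (this mirrors the formula wilcox.test uses internally):

```r
x <- c(rep(359, 23), 335, 303)                       # first sample, n1 = 25
y <- c(332, 85, 220, 231, 300, 237, 183, 286, 355,
       250, 105, 298, 28.6, 128, rep(359, 12))       # second sample, n2 = 26

n1 <- length(x); n2 <- length(y); n <- n1 + n2
r <- rank(c(x, y))                                   # midranks for ties
W <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2         # 485, as R reports
mu <- n1 * n2 / 2

nties <- table(r)                                    # tie-group sizes
sigma2.ties <- n1 * n2 / 12 *
  ((n + 1) - sum(nties^3 - nties) / (n * (n - 1)))   # tie-corrected variance
sigma2.raw <- n1 * n2 * (n + 1) / 12                 # no tie correction

z <- function(s2) (W - mu - 0.5) / sqrt(s2)          # with continuity correction
p.ties <- 2 * pnorm(-z(sigma2.ties))  # ~0.00026, R's answer
p.raw  <- 2 * pnorm(-z(sigma2.raw))   # ~0.0027, close to the online calculators
```

So W alone does not determine the p-value once ties enter: the tie correction shrinks the variance (here from about 2817 to about 1907) and produces the roughly 10x smaller p-value.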
Re: [R] shiny datatables column filtering plugin
The built-in version of DataTables in shiny already supports numeric ranges. For a numeric column x in data, if you type a,b in the search box, the data will be filtered using a <= x <= b. The check boxes are not supported, but you can use regular expressions (more flexible) to achieve the same thing, e.g. (this example requires the development version of shiny: https://groups.google.com/forum/#!topic/shiny-discuss/-0u-wTnq_lA)

library(shiny)
runApp(list(
  ui = fluidPage(
    dataTableOutput('mytable')
  ),
  server = function(input, output) {
    output$mytable = renderDataTable(
      iris[sample(nrow(iris)), ],
      options = list(search = list(regex = TRUE))
    )
  }
))

Then you can search for ^setosa|versicolor$, which means both setosa and versicolor in the iris data. Or 4,5 in the search box of Sepal.Length to filter this column. Depending on what you want, this may or may not be enough. Regards, Yihui -- Yihui Xie xieyi...@gmail.com Web: http://yihui.name

On Wed, Sep 3, 2014 at 7:12 AM, Charles Determan Jr deter...@umn.edu wrote: Thank you for checking Yihui, on the off chance are you familiar with any other methods to filter on multiple conditions? On Tue, Sep 2, 2014 at 11:07 PM, Yihui Xie x...@yihui.name wrote: I just tested it and this plugin does not seem to work with the new .DataTable() API in DataTables 1.10.x, so I guess it is unlikely to make it work in (the current development version of) shiny. It is not in the official list of plugins, either: http://www.datatables.net/extensions/index Regards, Yihui On Tue, Sep 2, 2014 at 11:59 AM, Charles Determan Jr deter...@umn.edu wrote: Greetings, I am currently exploring some capabilities of the 'Shiny' package. I am currently working with the most recent version of 'shiny' from the rstudio github repository (version 0.10.1.9006) in order to use the most up-to-date datatables plugin.
[...]
Re: [R] .C and .Fortran
On 03/09/2014 13:28, Filippo Monari wrote: Hi, I'd like to know what is the difference between the functions .C() and .Fortran(). I noticed that by enclosing my F90 subroutines in modules, .Fortran() can't find them any more in the load table, while .C() still can. I also checked that the subroutine was loaded with the is.loaded() function... So can anyone explain to me the difference and which is better to use?

This is not the right list: see the posting guide. But .Fortran is intended for *Fortran 77* code (as the help page says), and maps the supplied NAME argument in the same way as the Fortran 77 compiler does, which is often different from the way the C compiler does. I would strongly recommend that new code use .Call and a C wrapper to F90 code: it is a safer and more portable route. If you want any more help, you need to follow the posting guide and - post to R-devel, - supply the 'at a minimum' information asked for there, - supply the minimal example asked for, and say what the messages you see are.

Thanks in advance, Filippo

-- Brian D. Ripley, rip...@stats.ox.ac.uk Emeritus Professor of Applied Statistics, University of Oxford 1 South Parks Road, Oxford OX1 3TG, UK
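To illustrate the .Call route being recommended, here is a minimal hedged sketch of a C wrapper forwarding to a Fortran subroutine. The routine name `mysub` and its interface are hypothetical (a bare, non-module subroutine taking a double-precision array and its length); a procedure inside an F90 module would instead need a `bind(C)` interface via `iso_c_binding`, since module procedures get a compiler-specific mangled name, which is exactly why .Fortran() stops finding them.

```c
/* wrapper.c -- sketch only; 'mysub' is a hypothetical Fortran subroutine:
 *   subroutine mysub(x, n)
 *     integer n
 *     double precision x(n)
 */
#include <R.h>
#include <Rinternals.h>

/* R's macros handle the Fortran name mangling (trailing underscore etc.) */
void F77_NAME(mysub)(double *x, int *n);

SEXP C_mysub(SEXP x)
{
    SEXP out = PROTECT(duplicate(x));   /* never modify the caller's vector */
    int n = LENGTH(out);
    F77_CALL(mysub)(REAL(out), &n);     /* pass pointers into the copy */
    UNPROTECT(1);
    return out;
}
```

Compiled with `R CMD SHLIB mysub.f90 wrapper.c`, this is called from R as `dyn.load("wrapper.so"); .Call("C_mysub", as.numeric(x))`. Unlike .C/.Fortran, .Call passes R objects directly, so no argument copying or type coercion happens behind your back.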
Re: [R] wilcox.test - difference between p-values of R and online calculators
Since they all have the same W/U value, it seems likely that the difference is in how the different versions adjust the standard error for ties. Here are a couple of posts addressing the issue of ties: http://tolstoy.newcastle.edu.au/R/e8/help/09/12/9200.html http://stats.stackexchange.com/questions/6127/which-permutation-test-implementation-in-r-to-use-instead-of-t-tests-paired-and David C

From: wbradleyk...@gmail.com [mailto:wbradleyk...@gmail.com] On Behalf Of W Bradley Knox Sent: Wednesday, September 3, 2014 9:20 AM To: David L Carlson Cc: Tal Galili; r-help@r-project.org Subject: Re: [R] wilcox.test - difference between p-values of R and online calculators

Tal and David, thanks for your messages. I should have added that I tried all variations of true/false values for the exact and correct parameters. Running with correct=FALSE makes only a tiny change, resulting in W = 485, p-value = 0.0002481. At one point, I also thought that the discrepancy between R and these online calculators might come from how ties are handled, but the fact that R and two of the online calculators reach the same U/W values seems to indicate that ties aren't the issue, since (I believe) the U or W values contain all of the information needed to calculate the p-value, assuming the number of samples is also known for each condition. (However, it's been a while since I looked into how MWU tests work, so maybe now's the time to refresh.) If that's correct, the discrepancy seems to be based in what R does with the W value that is identical to the U values of two of the online calculators. (I'm also assuming that U and W have the same meaning, which seems likely.) - Brad W. Bradley Knox, PhD http://bradknox.net bradk...@mit.edu

On Wed, Sep 3, 2014 at 9:10 AM, David L Carlson dcarl...@tamu.edu wrote: That does not change the results. The problem is likely to be the way ties are handled.
[...]
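To follow up on the coin suggestion (note the function is wilcox_test(), not "wilxon_test"): a resampling-based version sidesteps the normal approximation entirely and is robust to the heavy ties. A sketch on the posted data (argument choices are one reasonable option, not the only one):

```r
library(coin)  # wilcox_test() with exact / resampling null distributions

x <- c(rep(359, 23), 335, 303)
y <- c(332, 85, 220, 231, 300, 237, 183, 286, 355,
       250, 105, 298, 28.6, 128, rep(359, 12))

dat <- data.frame(val = c(x, y),
                  grp = factor(rep(c("x", "y"), c(length(x), length(y)))))

# Monte Carlo approximation of the permutation null distribution,
# which handles the ties without a variance-correction formula
wilcox_test(val ~ grp, data = dat, distribution = "approximate")
```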
Re: [R] shiny datatables column filtering plugin
Thank you Yihui, this would certainly work for me; however, I am having trouble getting the regex to work appropriately. I am using the development version of shiny and have copied your code. I launch the app and the filtering of numbers works fine (i.e. 4,5) but the search for setosa and versicolor gives me a blank datatable. Is there some dependency that I am missing that would prevent this regex from working with shiny?

On Wed, Sep 3, 2014 at 11:27 AM, Yihui Xie x...@yihui.name wrote: [...]
It is not in the official list of plugins, either: http://www.datatables.net/extensions/index Regards, Yihui -- Yihui Xie xieyi...@gmail.com Web: http://yihui.name On Tue, Sep 2, 2014 at 11:59 AM, Charles Determan Jr deter...@umn.edu wrote: Greetings, I am currently exploring some capabilities of the 'Shiny' package. I am currently working with the most recent version of 'shiny' from the rstudio github repository (version - 0.10.1.9006) in order to use the most up to date datatables plugin. Using the ggplot2 diamonds dataset, I can easily set columns as unsearchable (commented out below) and I could also subset out all the 'Ideal' diamonds for example, however I cannot filter out multiple conditions such as 'Ideal' and 'Fair' diamonds together. From my searching, this multiple filtering can be done with checkboxes from the column using the jquery column filtering plugin ( http://jquery-datatables-column-filter.googlecode.com/svn/trunk/checkbox.html ). Despite this, I cannot get this plugin to work with my shiny app. Any insight would be appreciated. library(shiny) library(ggplot2) runApp( list(ui = basicPage( h1('Diamonds DataTable with TableTools'), # added column filter plugin singleton(tags$head(tags$script(src=' https://code.google.com/p/jquery-datatables-column-filter/source/browse/trunk/media/js/jquery.dataTables.columnFilter.js ', type='text/javascript'))), dataTableOutput(mytable) ) ,server = function(input, output) { output$mytable = renderDataTable({ diamonds[,1:6] }, options = list( pageLength = 10,# columnDefs = I('[{targets: [0,1], searchable: false}]') columnFilter = I('[{ columnDefs: [targets: [0,1], type: checkbox] }]') ) ) } )) Charles Charles -- Dr. 
Charles Determan, PhD Integrated Biosciences __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] shiny datatables column filtering plugin
It looks like a problem with DataTables -- I cannot find a way to specify the search.regex option for individual columns. You may ask this question on the DataTables forum. Basically I was expecting this to work:

  .DataTable({
    search: { regex: true },
    columnDefs: [{
      search: { regex: true },
      targets: [0, 1, 2, 3, 4]
    }]
  })

The global search box works, though. Regards, Yihui -- Yihui Xie xieyi...@gmail.com Web: http://yihui.name
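For what it's worth, a blank table can also come from the pattern itself rather than a missing dependency: in ^setosa|versicolor$ the anchors bind to the alternatives separately ("begins with setosa" OR "ends with versicolor"). That happens to work on iris, but explicit grouping is safer. A quick base-R check of the two patterns, independent of shiny (my own sketch, not from the thread):

```r
# Both patterns select exactly the setosa and versicolor rows of iris,
# but the grouped version makes the intended anchoring explicit.
loose <- subset(iris, grepl("^setosa|versicolor$", Species))
tight <- subset(iris, grepl("^(setosa|versicolor)$", Species))
nrow(tight)             # 100 rows: 50 setosa + 50 versicolor
identical(loose, tight) # TRUE on this data
```

If the two subsets ever differed on other data, the loose pattern would be the one behaving unexpectedly.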
[R] Four available places on GLMM course in Banff
There are four remaining places on the following course: Course: Introduction to MCMC, Linear mixed effects models and GLMM with R When: 22-26 September, 2014 Where: Parks Canada, Banff, Canada Flyer: http://www.highstat.com/Courses/Flyer2014_09Banff.pdf Course website: http://www.highstat.com/statscourse.htm Kind regards, Alain Zuur -- Dr. Alain F. Zuur First author of: 1. Beginner's Guide to GAMM with R (2014). 2. Beginner's Guide to GLM and GLMM with R (2013). 3. Beginner's Guide to GAM with R (2012). 4. Zero Inflated Models and GLMM with R (2012). 5. A Beginner's Guide to R (2009). 6. Mixed effects models and extensions in ecology with R (2009). 7. Analysing Ecological Data (2007). Highland Statistics Ltd. 9 St Clair Wynd, Newburgh AB41 6DZ, UK Tel: 0044 1358 788177 Email: highs...@highstat.com URL: www.highstat.com
Re: [R] wilcox.test - difference between p-values of R and online calculators
Notice that correct=TRUE for wilcox.test refers to the continuity correction, not the correction for ties. You can fairly easily simulate from the exact distribution of W:

  x <- c(359,359,359,359,359,359,335,359,359,359,359,
         359,359,359,359,359,359,359,359,359,359,303,359,359,359)
  y <- c(332,85,359,359,359,220,231,300,359,237,359,183,286,
         355,250,105,359,359,298,359,359,359,28.6,359,359,128)
  R <- rank(c(x, y))
  sim <- replicate(1e6, sum(sample(R, 25))) - 325
  # With no ties, the ranks would be a permutation of 1:51, and we could do
  sim2 <- replicate(1e6, sum(sample(1:51, 25))) - 325

In either case, the p-value is the probability that W >= 485 or W <= 165, and

  mean(sim >= 485 | sim <= 165)
  [1] 0.000151
  mean(sim2 >= 485 | sim2 <= 165)
  [1] 0.002182

Also, try

  plot(density(sim))
  lines(density(sim2))

and notice that the distribution of sim is narrower than that of sim2 (hence the smaller p-value with tie correction), but also that the normal approximation is not nearly as good as for the untied case. The clumpiness is due to the fact that 35 of the ranks have the maximum value of 34 (corresponding to the original 359's). -pd

On 03 Sep 2014, at 19:13, David L Carlson dcarl...@tamu.edu wrote:

Since they all have the same W/U value, it seems likely that the difference is how the different versions adjust the standard error for ties. Here are a couple of posts addressing the issue of ties: http://tolstoy.newcastle.edu.au/R/e8/help/09/12/9200.html http://stats.stackexchange.com/questions/6127/which-permutation-test-implementation-in-r-to-use-instead-of-t-tests-paired-and David C

From: wbradleyk...@gmail.com [mailto:wbradleyk...@gmail.com] On Behalf Of W Bradley Knox Sent: Wednesday, September 3, 2014 9:20 AM To: David L Carlson Cc: Tal Galili; r-help@r-project.org Subject: Re: [R] wilcox.test - difference between p-values of R and online calculators

Tal and David, thanks for your messages.
I should have added that I tried all variations of true/false values for the exact and correct parameters. Running with correct=FALSE makes only a tiny change, resulting in W = 485, p-value = 0.0002481. At one point, I also thought that the discrepancy between R and these online calculators might come from how ties are handled, but the fact that R and two of the online calculators reach the same U/W values seems to indicate that ties aren't the issue, since (I believe) the U or W values contain all of the information needed to calculate the p-value, assuming the number of samples is also known for each condition. (However, it's been a while since I looked into how MWU tests work, so maybe now's the time to refresh.) If that's correct, the discrepancy seems to be based in what R does with the W value that is identical to the U values of two of the online calculators. (I'm also assuming that U and W have the same meaning, which seems likely.) - Brad

W. Bradley Knox, PhD http://bradknox.net bradk...@mit.edu

On Wed, Sep 3, 2014 at 9:10 AM, David L Carlson dcarl...@tamu.edu wrote:

That does not change the results. The problem is likely to be the way ties are handled. The first sample has 25 values of which 23 are identical (359). The second sample has 26 values of which 12 are identical (359). The difference between the implementations may be a result of the way the ties are ranked. For example, the R function rank() offers 5 different ways of handling the rank on tied observations. With so many ties, that could make a substantial difference. Package coin has wilcox_test(), which uses Monte Carlo simulation to estimate the confidence limits.
- David L Carlson Department of Anthropology Texas A&M University College Station, TX 77840-4352

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Tal Galili Sent: Wednesday, September 3, 2014 5:24 AM To: W Bradley Knox Cc: r-help@r-project.org Subject: Re: [R] wilcox.test - difference between p-values of R and online calculators

It seems your numbers have ties. What happens if you run wilcox.test with correct=FALSE; will the results be the same as the online calculators?

Contact Details: Contact me: tal.gal...@gmail.com | Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Wed, Sep 3, 2014 at 3:54 AM, W Bradley Knox
Re: [R] wilcox.test - difference between p-values of R and online calculators
Tal and David, thanks for your messages. I should have added that I tried all variations of true/false values for the exact and correct parameters. Running with correct=FALSE makes only a tiny change, resulting in W = 485, p-value = 0.0002481. At one point, I also thought that the discrepancy between R and these online calculators might come from how ties are handled, but the fact that R and two of the online calculators reach the same U/W values seems to indicate that ties aren't the issue, since (I believe) the U or W values contain all of the information needed to calculate the p-value, assuming the number of samples is also known for each condition. (However, it's been a while since I looked into how MWU tests work, so maybe now's the time to refresh.) If that's correct, the discrepancy seems to be based in what R does with the W value that is identical to the U values of two of the online calculators. (I'm also assuming that U and W have the same meaning, which seems likely.) - Brad

W. Bradley Knox, PhD http://bradknox.net bradk...@mit.edu

On Wed, Sep 3, 2014 at 9:10 AM, David L Carlson dcarl...@tamu.edu wrote:

That does not change the results. The problem is likely to be the way ties are handled. The first sample has 25 values of which 23 are identical (359). The second sample has 26 values of which 12 are identical (359). The difference between the implementations may be a result of the way the ties are ranked. For example, the R function rank() offers 5 different ways of handling the rank on tied observations. With so many ties, that could make a substantial difference. Package coin has wilcox_test(), which uses Monte Carlo simulation to estimate the confidence limits.
- David L Carlson Department of Anthropology Texas A&M University College Station, TX 77840-4352

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Tal Galili Sent: Wednesday, September 3, 2014 5:24 AM To: W Bradley Knox Cc: r-help@r-project.org Subject: Re: [R] wilcox.test - difference between p-values of R and online calculators

It seems your numbers have ties. What happens if you run wilcox.test with correct=FALSE; will the results be the same as the online calculators?

Contact Details: Contact me: tal.gal...@gmail.com | Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

On Wed, Sep 3, 2014 at 3:54 AM, W Bradley Knox bradk...@mit.edu wrote:

Hi. I'm taking the long-overdue step of moving from using online calculators to compute results for Mann-Whitney U tests to a more streamlined system involving R. However, I'm finding that R computes a different result than the 3 online calculators that I've used before (all of which approximately agree).
These calculators are here: http://elegans.som.vcu.edu/~leon/stats/utest.cgi http://vassarstats.net/utest.html http://www.socscistatistics.com/tests/mannwhitney/

An example calculation is

  wilcox.test(c(359,359,359,359,359,359,335,359,359,359,359,359,359,
                359,359,359,359,359,359,359,359,303,359,359,359),
              c(332,85,359,359,359,220,231,300,359,237,359,183,286,
                355,250,105,359,359,298,359,359,359,28.6,359,359,128))

which prints

        Wilcoxon rank sum test with continuity correction

  data:  c(359, 359, 359, ...) and c(332, 85, 359, ...)
  W = 485, p-value = 0.0002594
  alternative hypothesis: true location shift is not equal to 0

  Warning message:
  In wilcox.test.default(c(359, 359, 359, 359, 359, 359, 335, 359, ... :
    cannot compute exact p-value with ties

However, all of the online calculators find p-values close to 0.0025, 10x the value output by R. All results are for a two-tailed case. Importantly, the W value computed by R *does agree* with the U values output by the first two online calculators listed above, yet it has a different p-value. Can anyone shed some light on how and why R's calculation differs from that of these online calculators? Thanks for your time. W. Bradley Knox, PhD http://bradknox.net bradk...@mit.edu
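The thread's diagnosis (tie handling in the normal approximation, per David Carlson and Peter Dalgaard) can be confirmed arithmetically. The sketch below is my own check, not from the thread: the approximate test uses Var(W) = n1*n2/12 * ((N+1) - sum(t^3 - t)/(N*(N-1))), where t runs over the tie-group sizes. Keeping the tie term reproduces R's 0.0002594; dropping it lands near the online calculators' ~0.0025, so the calculators are apparently using the no-ties variance.

```r
x <- c(359,359,359,359,359,359,335,359,359,359,359,359,359,
       359,359,359,359,359,359,359,359,303,359,359,359)
y <- c(332,85,359,359,359,220,231,300,359,237,359,183,286,
       355,250,105,359,359,298,359,359,359,28.6,359,359,128)
n1 <- length(x); n2 <- length(y); N <- n1 + n2
r <- rank(c(x, y))
W <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2   # rank-sum statistic: 485
tie <- table(c(x, y))                          # tie-group sizes (35 values equal 359)
v.tie   <- n1 * n2 / 12 * ((N + 1) - sum(tie^3 - tie) / (N * (N - 1)))
v.notie <- n1 * n2 * (N + 1) / 12
num <- W - n1 * n2 / 2 - 0.5                   # continuity-corrected numerator
p.tie   <- 2 * pnorm(-num / sqrt(v.tie))       # ~0.00026: matches R's wilcox.test
p.notie <- 2 * pnorm(-num / sqrt(v.notie))     # ~0.0027: roughly the calculators' value
```

The huge tie group (35 of 51 observations at 359) shrinks the variance enough to move the p-value by an order of magnitude, which is exactly the discrepancy Brad observed.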
[R] snow/Rmpi without MPI.spawn?
I'm a programmer at a high-performance computing center. I'm not very familiar with R, but I have used MPI from C, C++, and Python. I have to run R code provided by a guy who knows R, but not MPI. So, this fellow used the R snow library to parallelize his R code (theoretically; I'm not actually sure what he did). I need to get this code running on our machines. However, Rmpi and snow seem to require MPI spawn, which our computing center doesn't support. I even tried building Rmpi with MPICH1 instead of 2, because Rmpi has that option, but it still tries to use spawn. I can launch plenty of processes, but I have to launch them all at once at the beginning. Is there any way to convince Rmpi to just use those processes rather than trying to spawn its own? I haven't found any documentation on this issue, although I would've thought it would be quite common. Thanks, Jim
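If the snow-based code only asks for a generic cluster object (rather than Rmpi specifically -- an assumption about the code in question), one route that sidesteps MPI spawning entirely is a socket cluster. A minimal sketch; the 'parallel' package ships with R and provides snow's makeCluster/parSapply interface:

```r
library(parallel)                     # bundled with R; subsumes snow's cluster API

cl <- makeCluster(4, type = "PSOCK")  # 4 local worker processes; no mpirun, no spawn
res <- parSapply(cl, 1:8, function(i) i^2)  # same call shape as snow's parSapply
stopCluster(cl)
print(res)
```

PSOCK workers are plain R processes talking over sockets, so they work on a login or compute node without any MPI support; the trade-off is that they do not span nodes the way a real MPI job does.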
[R] GLM Help
Hi all, I have a large set of data that looks something like this, although this data frame is much smaller and includes made-up numbers to make my question easier.

  x.df <- data.frame(Region = c("A","A","A","A","A","B","B","B","B","B","B","C","C","C","C"),
                     Group_ID = c(1:15),
                     No_Offspring = c(3, 0, 4, 2, 1, 0, 3, 4, 3, 2, 2, 5, 4, 1, 3),
                     M_Offspring = c(2, 0, 2, 1, 0, 0, 1, 1, 2, 0, 1, 3, 2, 1, 1),
                     F_Offspring = c(1, 0, 2, 1, 1, 0, 2, 3, 1, 2, 1, 2, 2, 0, 2),
                     No_Helpers = c(5, 0, 2, 1, 0, 1, 3, 4, 2, 3, 2, 3, 4, 0, 0))
  x.df
     Region Group_ID No_Offspring M_Offspring F_Offspring No_Helpers
  1       A        1            3           2           1          5
  2       A        2            0           0           0          0
  3       A        3            4           2           2          2
  4       A        4            2           1           1          1
  5       A        5            1           0           1          0
  6       B        6            0           0           0          1
  7       B        7            3           1           2          3
  8       B        8            4           1           3          4
  9       B        9            3           2           1          2
  10      B       10            2           0           2          3
  11      B       11            2           1           1          2
  12      C       12            5           3           2          3
  13      C       13            4           2           2          4
  14      C       14            1           1           0          0
  15      C       15            3           1           2          0

I have been using GLMs to determine whether the number of helpers (No_Helpers) has an effect on the sex ratio of the offspring. Here's the GLM I have been using:

  prop.male <- x.df$M_Offspring / x.df$No_Offspring
  glm1 <- glm(prop.male ~ No_Helpers, binomial, data = x.df)

However, now I'd like to fit a model with region-specific regressions and see if this has more support than the model without region-specificity. So, I'd like one model that generates a regression for each region (A, B, C). I've tried treating No_Helpers and Region as covariates:

  glm2 <- glm(prop.male ~ No_Helpers + Region - 1, binomial, data = x.df)

which includes region-specificity in the intercepts, but not the entire regression, and as interaction terms:

  glm3 <- glm(prop.male ~ No_Helpers * Region - 1, binomial, data = x.df)

which also does not give me an intercept and slope for each region. I'm not sure how else to adjust the formula, or if the adjustment should be somewhere else in the glm() call. Thanks in advance for your help.
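Two details may help here. First, with a proportion response glm() also needs the binomial denominators (via a two-column cbind() response or the weights argument); a bare proportion treats every group as a single trial. Second, the nested-factor formula Region/No_Helpers fits a separate intercept and slope for each region in one model. A sketch using the example data above; the subset() step dropping zero-offspring groups is my addition, since those groups carry no information about the sex ratio:

```r
x.df <- data.frame(
  Region       = rep(c("A", "B", "C"), c(5, 6, 4)),
  No_Offspring = c(3, 0, 4, 2, 1, 0, 3, 4, 3, 2, 2, 5, 4, 1, 3),
  M_Offspring  = c(2, 0, 2, 1, 0, 0, 1, 1, 2, 0, 1, 3, 2, 1, 1),
  No_Helpers   = c(5, 0, 2, 1, 0, 1, 3, 4, 2, 3, 2, 3, 4, 0, 0))

## groups with no offspring contribute nothing to a sex-ratio model
x.df <- subset(x.df, No_Offspring > 0)

## Region/No_Helpers expands to Region + Region:No_Helpers:
## one intercept and one No_Helpers slope per region
fit <- glm(cbind(M_Offspring, No_Offspring - M_Offspring) ~ Region / No_Helpers - 1,
           family = binomial, data = x.df)
coef(fit)   # three region intercepts plus three region-specific slopes
```

This single model can then be compared (e.g. by AIC) against the pooled-slope fit to ask whether region-specific regressions have more support.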
Re: [R] snow/Rmpi without MPI.spawn?
On 09/03/2014 03:25 PM, Jim Leek wrote: I'm a programmer at a high-performance computing center. I'm not very familiar with R, but I have used MPI from C, C++, and Python. I have to run an R code provided by a guy who knows R, but not MPI. So, this fellow used the R snow library to parallelize his R code (theoretically, I'm not actually sure what he did.) I need to get this code running on our machines. However, Rmpi and snow seem to require mpi spawn, which our computing center doesn't support. I even tried building Rmpi with MPICH1 instead of 2, because Rmpi has that option, but it still tries to use spawn. I can launch plenty of processes, but I have to launch them all at once at the beginning. Is there any way to convince Rmpi to just use those processes rather than trying to spawn its own? I haven't found any documentation on this issue, although I would've thought it would be quite common.

This script, spawn.R:

  # salloc -n 12 orterun -n 1 R -f spawn.R
  library(Rmpi)
  ## Recent Rmpi bug -- should be mpi.universe.size()
  nWorkers <- mpi.universe.size()
  mpi.spawn.Rslaves(nslaves = nWorkers)
  mpiRank <- function(i) c(i = i, rank = mpi.comm.rank())
  mpi.parSapply(seq_len(2 * nWorkers), mpiRank)
  mpi.close.Rslaves()
  mpi.quit()

can be run like the comment suggests:

  salloc -n 12 orterun -n 1 R -f spawn.R

This uses slurm (or whatever job manager) to allocate resources for 12 tasks and spawn within that allocation. Maybe that's 'good enough' -- spawning within the assigned allocation? Likely this requires minimal modification of the current code.
More extensive is to revise the manager/worker-style code to something more like single instruction, multiple data, simd.R:

  ## salloc -n 4 orterun R --slave -f simd.R
  sink("/dev/null")   # don't capture output -- more care needed here
  library(Rmpi)
  TAGS = list(FROM_WORKER = 1L)
  .comm = 0L
  ## shared `work', here just determine rank and host
  work = c(rank = mpi.comm.rank(.comm), host = system("hostname", intern = TRUE))
  if (mpi.comm.rank(.comm) == 0) {        ## manager
      mpi.barrier(.comm)
      nWorkers = mpi.comm.size(.comm)
      res = vector("list", nWorkers)
      for (i in seq_len(nWorkers - 1L)) {
          res[[i]] <- mpi.recv.Robj(mpi.any.source(), TAGS$FROM_WORKER, comm = .comm)
      }
      res[[nWorkers]] = work
      sink()                              # stop redirecting output
      print(do.call(rbind, res))
  } else {                                ## worker
      mpi.barrier(.comm)
      mpi.send.Robj(work, 0L, TAGS$FROM_WORKER, comm = .comm)
  }
  mpi.quit()

but this likely requires some serious code revision; if going this route then http://r-pbd.org/ might be helpful (and from a similar HPC environment). It's always worth asking whether the code is written to be efficient in R -- a typical 'mistake' is to write R-level explicit 'for' loops that copy-and-append results, along the lines of

  len <- 10^5
  result <- NULL
  for (i in seq_len(len))
      ## some complicated calculation, then...
      result <- c(result, sqrt(i))

whereas it's much better to pre-allocate and fill

  result <- numeric(len)
  for (i in seq_len(len))
      result[[i]] <- sqrt(i)

or lapply(seq_len(len), sqrt), and very much better still to 'vectorize'

  result <- sqrt(seq_len(len))

(timings for me are about 1 minute for copy-and-append, 0.2 s for pre-allocate and fill, and 0.002 s for vectorize). Pushing back on the guy providing the code (grep for 'for' loops, and look for that copy-and-append pattern) might save you from having to use parallel evaluation at all.
Martin

Thanks, Jim

-- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
Re: [R] Overwriting a procedure
If you have more than one version of fixx(), then the one that is used is the one that comes first in the search path. The search path is revealed by the search() function. So if you can learn to control the search path, then you can control which version of fixx() is used. That would be my initial approach.

As an aside, you can define your first version of fixx() more simply as

  fixx <- function(x) list(x = x)

and the second more simply as

  fixx <- function(x) {
    x[, 6] <- x[, 5]^2 / 10
    list(x = x)
  }

Using return() is completely unnecessary (but of course OK if preferred as a matter of programming style). Of course, this all assumes you truly need two different functions with the same name. I would think that is unlikely, but since there's no indication of what determines which one should be used, I can't say. However, there must be some criterion that determines that the second version should be used, so perhaps

  fixx <- function(x, criterion) {
    ## criterion must be a logical value of length 1
    if (criterion) x[, 6] <- x[, 5]^2 / 10
    list(x = x)
  }

would work.

-- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062

On 9/2/14, 11:45 AM, Steven Yen sye...@gmail.com wrote:

Is there a way to overwrite a procedure (subroutine)? I include a default procedure fixx in a list of procedures which are compiled into a package. By default, the procedure delivers the data matrix x.

  fixx <- function(x) {
    result <- list(x = x)
    return(result)
  }

In some applications, I have transformations (such as squared terms) in some column(s) of x. So I include the following procedure in the main (calling) program, hoping to overwrite the default procedure of the same name in the package (which is the way other languages work, e.g., Gauss):

  fixx <- function(x) {
    x[, 6] <- x[, 5]^2 / 10
    result <- list(x = x)
    return(result)
  }

This does not seem to work. The procedure in the main (calling) program seems to get ignored. Any idea? Thanks.
Re: [R] Overwriting a procedure
On 03/09/2014, 8:58 PM, MacQueen, Don wrote: If you have more than one version of fixx(), then the one that is used is the one that comes first in the search path. The search path is revealed by the search() function. So if you can learn to control the search path, then you can control which version of fixx() is used. That would be my initial approach.

This is only partially correct information. It only applies to uses of fixx() from the console. The original poster wanted to replace a function in a package: uses within the package will search the package namespace first, regardless of the search path. Duncan Murdoch
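As a footnote to Duncan's point: R's documented hook for replacing a function inside a package namespace is assignInNamespace() (see ?assignInNamespace for its caveats), but a design that avoids the need is usually cleaner. One option is to let callers pass the transformation in. The wrapper below is a hypothetical illustration of that pattern, not the actual fixx() from the poster's package:

```r
# Hypothetical variant of fixx() that takes the column transformation as an
# argument; identity (no change) is the default.
fixx <- function(x, transform = identity) {
  list(x = transform(x))
}

m <- matrix(1:60, ncol = 6)
out1 <- fixx(m)                                              # default: data unchanged
out2 <- fixx(m, function(x) { x[, 6] <- x[, 5]^2 / 10; x })  # squared-term version
out2$x[, 6]                                                  # now m[, 5]^2 / 10
```

The calling program then chooses the behavior per call, with no need to shadow or patch anything in the package.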
Re: [R] depth of labels of axis
On 2014/9/1 20:39, David Winsemius wrote: On Sep 1, 2014, at 4:40 PM, Jinsong Zhao wrote:

Hi there, With the following code,

  plot(1:5, xaxt = "n")
  axis(1, at = 1:5, labels = c(expression(E[g]), "E", expression(E[j]), "E", expression(E[t])))

you may notice that the E's within the labels of axis(1) are not at the same depth, so the axis(1) labels look something like a wave. Is there a possible way to typeset the labels so that they have the same depth?

I'm not sure that we share an interpretation of the term depth in this context. I'm interpreting your request to mean vertical alignment.

depth is not the accurate word for this situation. Maybe baseline is a better one.

Any suggestions will be really appreciated.

Read the help page, especially the paragraph about padj. I will admit that I thought the descriptions of the actions of padj=0 and padj=1 were not what I experienced when I tried alternate versions. It did not seem to me that padj=0 produced top alignment.

Thank you very much for pointing me to this. I did not notice this argument. Best regards, Jinsong
Re: [R] depth of labels of axis
On 2014/9/2 11:50, David L Carlson wrote:

The bottom of the expression is set by the lowest character (which can even change for subscripted letters with descenders). The solution is to get axis() to align the tops of the axis labels and move the line up to reduce the space, e.g.

  plot(1:5, xaxt = "n")
  axis(1, at = 1:5, labels = c(expression(E[g]), "E", expression(E[j]), "E",
       expression(E[t])), padj = 1, mgp = c(3, .1, 0))
  # Check alignment
  abline(h = .7, xpd = TRUE, lty = 3)

yes. In this situation, padj = 1 is the fast solution. However, if there are also superscripts, then it's hard to align all the labels. If R provided a mechanism that aligned the label in axis() or text() with the baseline of the characters without the super- and/or subscript, that would be terrific. Regards, Jinsong

- David L Carlson Department of Anthropology Texas A&M University College Station, TX 77840-4352
Re: [R] depth of labels of axis
On 2014/9/3 21:33, Jinsong Zhao wrote: On 2014/9/2 11:50, David L Carlson wrote:

The bottom of the expression is set by the lowest character (which can even change for subscripted letters with descenders). The solution is to get axis() to align the tops of the axis labels and move the line up to reduce the space, e.g.

  plot(1:5, xaxt = "n")
  axis(1, at = 1:5, labels = c(expression(E[g]), "E", expression(E[j]), "E",
       expression(E[t])), padj = 1, mgp = c(3, .1, 0))
  # Check alignment
  abline(h = .7, xpd = TRUE, lty = 3)

yes. In this situation, padj = 1 is the fast solution. However, if there are also superscripts, then it's hard to align all the labels. If R provided a mechanism that aligned the label in axis() or text() with the baseline of the characters without the super- and/or subscript, that would be terrific.

it seems that the above wish is on the Graphics TODO list: https://www.stat.auckland.ac.nz/~paul/R/graphicstodos.html

  Allow text adjustment for mathematical annotations which is relative to a
  text baseline (in addition to the current situation where adjustment is
  relative to the bounding box).

Regards, Jinsong
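Until that TODO lands, one workaround is to pad the plain labels with plotmath's phantom(), which reserves an invisible box the size of its argument: every label then carries an (empty) subscript box, so all the bounding boxes share the same depth and the E's align. A sketch (my suggestion, not from the thread):

```r
# Giving the plain E's an invisible subscript via phantom() equalizes the
# bounding-box depth of all five labels, removing the "wave" effect.
plot(1:5, xaxt = "n")
axis(1, at = 1:5,
     labels = c(expression(E[g]), expression(E[phantom(g)]),
                expression(E[j]), expression(E[phantom(j)]),
                expression(E[t])))
```

The same trick works for superscripts (e.g. E^phantom(2)), since it only manipulates box sizes, not visible output.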
Re: [R] snow/Rmpi without MPI.spawn?
Thanks for the tips. I'll take a look around for for loops in the morning. I think the example you provided worked for OpenMPI. (The default on our machine is MPICH2, but it gave the same error about calling spawn.) Anyway, with OpenMPI I got this: # salloc -n 12 orterun -n 1 R -f spawn.R library(Rmpi) ## Recent Rmpi bug -- should be mpi.universe.size() nWorkers - mpi.universe.size() nslaves = 4 mpi.spawn.Rslaves(nslaves) Reported: 2 (out of 2) daemons - 4 (out of 4) procs Then it hung there. So things spawned anyway, which is progress. I'm just not sure is that expected behavior for parSupply or not. Jim -Original Message- From: Martin Morgan [mailto:mtmor...@fhcrc.org] Sent: Wednesday, September 03, 2014 5:08 PM To: Leek, Jim; r-help@r-project.org Subject: Re: [R] snow/Rmpi without MPI.spawn? On 09/03/2014 03:25 PM, Jim Leek wrote: I'm a programmer at a high-performance computing center. I'm not very familiar with R, but I have used MPI from C, C++, and Python. I have to run an R code provided by a guy who knows R, but not MPI. So, this fellow used the R snow library to parallelize his R code (theoretically, I'm not actually sure what he did.) I need to get this code running on our machines. However, Rmpi and snow seem to require mpi spawn, which our computing center doesn't support. I even tried building Rmpi with MPICH1 instead of 2, because Rmpi has that option, but it still tries to use spawn. I can launch plenty of processes, but I have to launch them all at once at the beginning. Is there any way to convince Rmpi to just use those processes rather than trying to spawn its own? I haven't found any documentation on this issue, although I would've thought it would be quite common. 
This script, spawn.R:

    # salloc -n 12 orterun -n 1 R -f spawn.R
    library(Rmpi)
    ## Recent Rmpi bug -- should be mpi.universe.size()
    nWorkers <- mpi.universe.size()
    mpi.spawn.Rslaves(nslaves=nWorkers)
    mpiRank <- function(i) c(i=i, rank=mpi.comm.rank())
    mpi.parSapply(seq_len(2 * nWorkers), mpiRank)
    mpi.close.Rslaves()
    mpi.quit()

can be run like the comment suggests:

    salloc -n 12 orterun -n 1 R -f spawn.R

This uses slurm (or whatever job manager) to allocate resources for 12 tasks, and the spawning then happens within that allocation. Maybe that's 'good enough' -- spawning within the assigned allocation? Likely this requires minimal modification of the current code.

More extensive is to revise the manager/worker-style code to something more like single instruction, multiple data. simd.R:

    ## salloc -n 4 orterun R --slave -f simd.R
    sink("/dev/null")    # don't capture output -- more care needed here
    library(Rmpi)
    TAGS <- list(FROM_WORKER=1L)
    .comm <- 0L
    ## shared 'work', here just determine rank and host
    work <- c(rank=mpi.comm.rank(.comm), host=system("hostname", intern=TRUE))
    if (mpi.comm.rank(.comm) == 0) {        ## manager
        mpi.barrier(.comm)
        nWorkers <- mpi.comm.size(.comm)
        res <- vector("list", nWorkers)
        for (i in seq_len(nWorkers - 1L)) {
            res[[i]] <- mpi.recv.Robj(mpi.any.source(), TAGS$FROM_WORKER, comm=.comm)
        }
        res[[nWorkers]] <- work
        sink()                              # start capturing output
        print(do.call(rbind, res))
    } else {                                ## worker
        mpi.barrier(.comm)
        mpi.send.Robj(work, 0L, TAGS$FROM_WORKER, comm=.comm)
    }
    mpi.quit()

but this likely requires some serious code revision; if going this route then http://r-pbd.org/ might be helpful (and it comes from a similar HPC environment).

It's always worth asking whether the code is written to be efficient in R -- a typical 'mistake' is to write R-level explicit 'for' loops that copy-and-append results, along the lines of

    len <- 10
    result <- NULL
    for (i in seq_len(len))
        ## some complicated calculation, then...
        result <- c(result, sqrt(i))

whereas it's much better to pre-allocate and fill

    result <- numeric(len)
    for (i in seq_len(len))
        result[[i]] <- sqrt(i)

or

    lapply(seq_len(len), sqrt)

and very much better still to 'vectorize'

    result <- sqrt(seq_len(len))

(timings for me are about 1 minute for copy-and-append, 0.2 s for pre-allocate and fill, and 0.002 s for vectorize). Pushing back on the guy providing the code (grep for `for` loops, and look for that copy-and-append pattern) might save you from having to use parallel evaluation at all.

Martin

Thanks,
Jim

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
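Martin's three versions can be timed directly with system.time(); a minimal sketch (the value of `len` here is my own choice, picked so the copy-and-append version is slow enough to measure but still finishes quickly):

```r
len <- 1e5

## copy-and-append: result is reallocated and copied on every iteration
t_append <- system.time({
    result <- NULL
    for (i in seq_len(len))
        result <- c(result, sqrt(i))
})

## pre-allocate and fill: one allocation, in-place assignment
t_fill <- system.time({
    result <- numeric(len)
    for (i in seq_len(len))
        result[[i]] <- sqrt(i)
})

## vectorized: a single call into compiled code
t_vec <- system.time(result <- sqrt(seq_len(len)))

rbind(append = t_append, fill = t_fill, vectorized = t_vec)[, "elapsed"]
```

The elapsed times grow roughly quadratically with `len` for copy-and-append and linearly for the other two, which is why the gap in Martin's figures is so large.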
Re: [R] snow/Rmpi without MPI.spawn?
I spoke a bit too soon. You may have noticed that it isn't hanging in parSapply; it's hanging in mpi.spawn.Rslaves(). It claims to have launched the slaves, but I can't see them by logging into the target node and running 'top'. I only see the top-level R process (which is burning up 100% of a CPU). So I don't know what's going on. It never gets back from the spawn call anyway.

Jim
[R-es] Programming question
Hello everyone. I'm new to this list and, above all, new to R. I have a question I haven't been able to solve. In a data.frame I have several variables, and I want to compute a figure broken down by one of those variables. I have managed to do it when the calculation is the mean of a variable, but in my case it is a percentage, so I cannot take the simple mean; it would rather be a weighted mean, but I cannot get that to work either. An example:

    Campo    C1   C2   %C2/C1
    A         10    1      10
    A         10    9      90
    A        200    4       2
    B         50    7      14
    B         10    7      70
    B        100    2       2

Grouping:

    Campo    C1   C2   Promedio%   Real%
    A       220   14        34.0     6.4
    B       160   16        28.7    10.0
    Total   380   30        31.3     7.9

For example, if the simple mean were correct I would do it like this:

    aggregate(Datos$C1, list(Datos$`%C2/C1`), mean)

Could you help me? Thanks.
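One way to get the weighted ("real") percentage per group is to sum both columns within each group and divide the sums, rather than averaging the row-wise percentages. A minimal sketch, assuming a data.frame with the columns from the example (the names `Datos`, `Campo`, `C1`, `C2` are my guesses at the poster's actual names):

```r
## Hypothetical reconstruction of the poster's data
Datos <- data.frame(
    Campo = c("A", "A", "A", "B", "B", "B"),
    C1    = c(10, 10, 200, 50, 10, 100),
    C2    = c(1, 9, 4, 7, 7, 2)
)

## Sum C1 and C2 within each level of Campo in one call,
## then compute the weighted percentage from the group sums
sums <- aggregate(cbind(C1, C2) ~ Campo, data = Datos, FUN = sum)
sums$RealPct <- 100 * sums$C2 / sums$C1
sums
##   Campo  C1 C2   RealPct
## 1     A 220 14  6.363636
## 2     B 160 16 10.000000
```

This reproduces the 6.4 and 10.0 in the "Real" column of the grouped table; the simple mean of the row percentages would give 34.0 and 28.7 instead.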
[R-es] How do you use the OR operator to select two or more levels of a vector?
Dear all, I have a data.frame with a column that takes three different levels (although the column is not strictly a factor; they are just three different letters), for example c, t and s, and I need to use the rows whose value is c or t. How should I do this? I think I have sometimes used something like this:

    dataframe <- dataframe[dataframe$columna == c("c", "t"), ]

but for some reason, when I use that form inside the code that creates a plot, for example:

    xyplot(are ~ con | sol, data = datEnd[datEnd$iso == c("c", "t"), ])

the result is not correct. Any ideas? Many thanks, Eric.

--
Forest Engineer
Master in Environmental and Natural Resource Economics
Ph.D. student in Sciences of Natural Resources at La Frontera University
Member in AguaDeTemu2030, citizen movement for Temuco with green city standards for living
Note: accents have been omitted for compatibility with some mail readers.
Re: [R-es] How do you use the OR operator to select two or more levels of a vector?
Hi Eric,

Have a look at ?`%in%`.

Regards,
Jorge.-

2014-09-04 8:41 GMT+10:00 eric ericconchamu...@gmail.com:
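The reason `==` misbehaves here is recycling: `columna == c("c", "t")` compares element by element, recycling the two-element vector along the column, so the result depends on each row's position rather than on membership. `%in%` tests membership, which is the "c OR t" selection Eric wants. A small sketch (the vector is invented to make the difference visible):

```r
columna <- c("t", "c", "c", "t")

## == recycles c("c", "t") to c, t, c, t and compares positionally
columna == c("c", "t")
#> [1] FALSE FALSE  TRUE  TRUE

## %in% asks "is each element one of c or t?" -- the intended OR
columna %in% c("c", "t")
#> [1] TRUE TRUE TRUE TRUE
```

So the subset would be written `dataframe[dataframe$columna %in% c("c", "t"), ]`, and the same `%in%` expression works inside the xyplot() call.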