Re: [R] "And" condition spanning over multiple columns in data frame
On Thu, 12 Sep 2024 09:42:57 +0200 Francesca wrote:

> c10Dt <- mutate(c10Dt, exit1= ifelse(is.na(cp1) & id!=1, 1, 0))

> So, I create a new variable, called exit1, in which the program
> selects cp1, checks if it is NA, and if it is NA but also the value
> of the column "id" is not 1, then it gives back a 1, otherwise 0.
> So, what I want is that it selects all the cases in which the id=2,3,
> or 4 is not NA in the corresponding values of the matrix.

Since all your columns except the first one are the desired "cp*" columns, you can obtain your "exit" columns in bulk:

(
 c10Dt$id != 1 & # will be recycled column-wise, as we need
 is.na(c10Dt[-1])
) |>
 # ...and then convert back into a data.frame,
 as.data.frame() |>
 # rename the columns...
 (\(x) setNames(x, sub('cp', 'exit', names(x))))() |>
 # ...and finally attach to the original data.frame
 cbind(c10Dt)

--
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
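The bulk construction can be tried on a toy data frame. This is only a sketch: the data and the two cp* columns are made up for illustration, and the extra as.integer() step is an assumption added here to reproduce the 0/1 coding of the ifelse() call in the question (the pipeline itself yields TRUE/FALSE). It needs R >= 4.1 for |> and \(x).

```r
# Toy data (hypothetical): 'id' first, then the cp* columns
c10Dt <- data.frame(
  id  = c(1, 2, 3, 4),
  cp1 = c(NA, NA, 5, NA),
  cp2 = c(NA, 7, NA, 8)
)

result <- (
  c10Dt$id != 1 &        # recycled down each column of the matrix below
    is.na(c10Dt[-1])     # logical matrix, one column per cp* column
) |>
  as.data.frame() |>
  # rename cp1, cp2, ... to exit1, exit2, ...
  (\(x) setNames(x, sub("cp", "exit", names(x))))() |>
  # optional: turn TRUE/FALSE into 1/0, as in the ifelse() version
  (\(x) { x[] <- lapply(x, as.integer); x })() |>
  cbind(c10Dt)

print(result)
```

The exit columns come first in the result because cbind() receives them as its first argument.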
Re: [R] Stop or limit to console printing large list and so on
Hello Stephen.

I am not sure of the exact details of your problem, but regarding the second part of your e-mail: if you accidentally print a large object in the console and do not want to wait for it to finish, just press C-c C-c. That interrupts whatever is running in the console, so it also works if you have sent a long loop and want to abort it.

cheers
F.

On 9/11/24 15:44, stephen sefick wrote:
> Stephen Sefick
Re: [R] non-conformable arrays
On Wed, 11 Sep 2024 17:55:53 +0200 Thierry Onkelinx wrote:

> For those who like a fully reproducible example:
> the offending line in the code:
> https://github.com/inbo/multimput/blob/e1cd0cdff7d2868e4101c411f7508301c7be7482/R/impute_glmermod.R#L65
> a (failing) unit test for the code:
> https://github.com/inbo/multimput/blob/e1cd0cdff7d2868e4101c411f7508301c7be7482/tests/testthat/test_ccc_hurdle_impute.R#L10

Setting options(error = recover) and evaluating your test expression directly using eval(parse('testthat/test_ccc_hurdle_impute.R')[[1]][[3]]), I see:

Browse[1]> names(random)
[1] "Year"
Browse[1]> paste("~0 + ", 'Year') |>
+ as.formula() |>
+ model.matrix(data = data[missing_obs,]) |>
+ str()
 num [1:68, 1] 7 6 9 4 3 5 10 1 6 8 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:68] "37" "56" "59" "114" ...
  ..$ : chr "Year"
 - attr(*, "assign")= int 1
Browse[1]> t(random[['Year']]) |> str()
 num [1:19, 1:10] 0.175 0.181 0.102 0.119 0.158 ...

...and they are indeed non-conformable. Why is random$Year a matrix?

--
Best regards,
Ivan
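For readers who have not met this error before, a minimal sketch of how it arises. The two matrices below are placeholders that only copy the dimensions seen in the debugger (68 x 1 and 19 x 10); they are not the multimput data, and the element-wise product is an assumption about the kind of operation involved.

```r
# Placeholder matrices with the dimensions reported by str() in the debugger
mm  <- matrix(1,   nrow = 68, ncol = 1)   # stands in for the model matrix
rnd <- matrix(0.1, nrow = 19, ncol = 10)  # stands in for t(random[["Year"]])

# Element-wise arithmetic on arrays requires identical dimensions
same_dims <- identical(dim(mm), dim(rnd))

# Catch the error instead of stopping, and keep its message
msg <- tryCatch(
  { mm * rnd; NA_character_ },
  error = function(e) conditionMessage(e)
)
print(same_dims)
print(msg)
```

Comparing dim() of both operands right before the failing line is usually the quickest way to see which object has the unexpected shape.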
Re: [R] Stop or limit to console printing large list and so on
On Wed, 11 Sep 2024 09:44:05 -0400 stephen sefick wrote:

> I am having a problem with accidentally typing an object name at the
> console that is a very large list and then having to wait for it to be
> printed until I can resume my work.

Does it help to interrupt the process?

https://www.gnu.org/software/emacs/manual/html_mono/emacs.html#index-C_002dc-C_002dc-_0028Shell-mode_0029
https://ess.r-project.org/Manual/ess.html#index-interrupting-R-commands

I'm afraid that the behaviour of the print() method is very class-dependent and limiting options(max.print=...) may not help in your case.

--
Best regards,
Ivan
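Besides interrupting, one way to avoid the long wait is to not call print() on the whole object in the first place. The sketch below is not from the thread; it uses a made-up list as a stand-in and shows str() and head(), which produce bounded output regardless of the object's size.

```r
# A large nested list standing in for the object printed by accident
big <- replicate(1000, list(a = rnorm(10), b = letters), simplify = FALSE)

# str() gives a short overview: only the top level, only the first 3 elements
overview <- capture.output(str(big, max.level = 1, list.len = 3))
writeLines(overview)

# head() keeps just the first few elements for closer inspection
first_two <- head(big, 2)
length(first_two)
```

Whether this helps depends on habit: it only works if one types str(x) instead of x, so the interrupt key remains the remedy for the accidental case.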
Re: [R] Reading a txt file from internet
I tried it on R 4.4.1 on Linux Mint 21.3 just before I posted it, and I just tried it on R 3.4.2 on Ubuntu 16.04 and R 4.3.2 on Windows 11 just now, and it works on all of them. I don't have a big-endian machine to test on, but the Unicode spec says to honor the BOM and, if there isn't one, to assume that it is big-endian data. But in this case there is a BOM, so your machine has a buggy decoder?

On September 7, 2024 2:43:24 PM PDT, Duncan Murdoch wrote:
>On 2024-09-07 4:52 p.m., Jeff Newmiller via R-help wrote:
>> When you specify LE in the encoding type, you are logically telling the
>> decoder that you know the two-byte pairs are in little-endian order... which
>> could override whatever the byte-order-mark was indicating. If the BOM
>> indicated big-endian then the file decoding would break. If there is a BOM,
>> don't override it unless you have to (e.g. for a wrong BOM)... leave off the
>> LE unless you really need it.
>
>That sounds like good advice, but it doesn't work:
>
> > read.delim(
> + 'https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt',
> + fileEncoding = "UTF-16"
> + )
> [1] time
> [2] vendor.洀攀琀愀氀㐀㐀㜀.㐀㐀㤀.㐀㐀.㐀..㐀.㐀..㐀..㔀...㜀.㐀..㠀..㘀...㠀.㐀㐀㜀...㔀.㐀㐀.
>
>and so on.
>>
>> On September 7, 2024 1:22:23 PM PDT, Enrico Schumann wrote:
>>> On Sun, 08 Sep 2024, Christofer Bogaso writes:
>>>
>>>> Hi,
>>>>
>>>> I am trying to read the data from
>>>> https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt
>>>> without any success. Below is the error I am getting:
>>>>
>>>>> read.delim('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt')
>>>>
>>>> Error in make.names(col.names, unique = TRUE) :
>>>>   invalid multibyte string at 't'
>>>> In addition: Warning messages:
>>>> 1: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>>>   line 1 appears to contain embedded nulls
>>>> 2: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>>>   line 2 appears to contain embedded nulls
>>>> 3: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>>>   line 3 appears to contain embedded nulls
>>>> 4: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>>>   line 4 appears to contain embedded nulls
>>>> 5: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>>>   line 5 appears to contain embedded nulls
>>>>
>>>> Is there any way to read this data directly onto R?
>>>>
>>>> Thanks for your time
>>>>
>>>
>>> That looks like a byte-order mark
>>> (https://en.wikipedia.org/wiki/Byte_order_mark).
>>> Try this:
>>>
>>> fn <-
>>> file('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt',
>>>      encoding = "UTF-16LE")
>>> read.delim(fn)
>>>
>>
>
--
Sent from my phone. Please excuse my brevity.
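The disagreement above can be explored without network access. The sketch below writes a tiny tab-separated file as UTF-16LE with a BOM (the contents are made up; only the two header names echo the real file) and reads it back with the explicit little-endian encoding, which ?file documents as being handled specially with respect to a leading BOM. Whether plain "UTF-16" also works depends on the platform's iconv, which may be what differs between the machines in this thread, so it is not exercised here.

```r
# Made-up table contents; "time" and "vendor" mimic the real header names
txt <- "time\tvendor\n1\t2\n3\t4\n"

# Build UTF-16LE bytes by hand: BOM (FF FE), then each ASCII byte followed by 00
bytes <- as.raw(c(0xFF, 0xFE, rbind(utf8ToInt(txt), 0L)))

tmp <- tempfile(fileext = ".txt")
writeBin(bytes, tmp)

# Same effect as file(tmp, encoding = "UTF-16LE") followed by read.delim()
dat <- read.delim(tmp, fileEncoding = "UTF-16LE")
print(dat)

unlink(tmp)
```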
Re: [R] Reading a txt file from internet
When you specify LE in the encoding type, you are logically telling the decoder that you know the two-byte pairs are in little-endian order... which could override whatever the byte-order-mark was indicating. If the BOM indicated big-endian then the file decoding would break. If there is a BOM, don't override it unless you have to (e.g. for a wrong BOM)... leave off the LE unless you really need it.

On September 7, 2024 1:22:23 PM PDT, Enrico Schumann wrote:
>On Sun, 08 Sep 2024, Christofer Bogaso writes:
>
>> Hi,
>>
>> I am trying to read the data from
>> https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt
>> without any success. Below is the error I am getting:
>>
>>> read.delim('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt')
>>
>> Error in make.names(col.names, unique = TRUE) :
>>   invalid multibyte string at 't'
>> In addition: Warning messages:
>> 1: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>   line 1 appears to contain embedded nulls
>> 2: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>   line 2 appears to contain embedded nulls
>> 3: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>   line 3 appears to contain embedded nulls
>> 4: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>   line 4 appears to contain embedded nulls
>> 5: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>   line 5 appears to contain embedded nulls
>>
>> Is there any way to read this data directly onto R?
>>
>> Thanks for your time
>>
>
>That looks like a byte-order mark
>(https://en.wikipedia.org/wiki/Byte_order_mark).
>Try this:
>
>fn <-
> file('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt',
>      encoding = "UTF-16LE")
>read.delim(fn)
>
--
Sent from my phone. Please excuse my brevity.
Re: [R] Reading a txt file from internet
Add the fileEncoding = "UTF-16" argument to the read call.

For a human explanation of why this is going on I recommend [1]. For a more R-related take, try [2].

For reference, I downloaded your file and used the "file" command line program, typically available on Linux (and possibly MacOSX), which will tell you what encoding is used in a particular file.

[1] https://www.youtube.com/watch?v=4mRxIgu9R70
[2] https://kevinushey.github.io/blog/2018/02/21/string-encoding-and-r/

On September 7, 2024 12:56:36 PM PDT, Christofer Bogaso wrote:
>Hi,
>
>I am trying to read the data from
>https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt
>without any success. Below is the error I am getting:
>
>> read.delim('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt')
>
>Error in make.names(col.names, unique = TRUE) :
>  invalid multibyte string at 't'
>In addition: Warning messages:
>1: In read.table(file = file, header = header, sep = sep, quote = quote, :
>  line 1 appears to contain embedded nulls
>2: In read.table(file = file, header = header, sep = sep, quote = quote, :
>  line 2 appears to contain embedded nulls
>3: In read.table(file = file, header = header, sep = sep, quote = quote, :
>  line 3 appears to contain embedded nulls
>4: In read.table(file = file, header = header, sep = sep, quote = quote, :
>  line 4 appears to contain embedded nulls
>5: In read.table(file = file, header = header, sep = sep, quote = quote, :
>  line 5 appears to contain embedded nulls
>
>Is there any way to read this data directly onto R?
>
>Thanks for your time
>
--
Sent from my phone. Please excuse my brevity.
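Where the "file" program is not at hand, a rough equivalent of the BOM check can be done from R by looking at the first bytes of the file. This is a sketch only: detect_bom() is a hypothetical helper written for this note, not a function from R or any package, and it ignores UTF-32 (whose little-endian BOM also starts with FF FE).

```r
# Hypothetical helper: guess the encoding from a leading byte-order mark
detect_bom <- function(path) {
  b <- readBin(path, what = "raw", n = 3L)
  starts_with <- function(sig) {
    length(b) >= length(sig) && identical(b[seq_along(sig)], as.raw(sig))
  }
  if (starts_with(c(0xEF, 0xBB, 0xBF))) {
    "UTF-8-BOM"
  } else if (starts_with(c(0xFF, 0xFE))) {
    "UTF-16LE"
  } else if (starts_with(c(0xFE, 0xFF))) {
    "UTF-16BE"
  } else {
    NA_character_   # no BOM found; the encoding must be determined otherwise
  }
}

# Demo on a made-up file: BOM FF FE, then "t" encoded as 74 00
tmp <- tempfile()
writeBin(as.raw(c(0xFF, 0xFE, 0x74, 0x00)), tmp)
guess <- detect_bom(tmp)
print(guess)
unlink(tmp)
```

The returned name can then be passed as fileEncoding to read.delim(), with the caveat from this thread that decoders differ between platforms.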
Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?
It seems to me that the documentation of R's complex class & R's atan function do not tell us what to expect, so (as others have suggested), some additional notes are needed. I think that mathematically atan(1i) should be NA_complex_, but R seems not to use any mathematically standard compactification of the complex plane (and I'm not sure that IEEE does either).

Incidentally, the signature of the complex constructor is confusing. complex(1L) returns zero, but complex(1L, argument=theta) is an element of the unit circle. The defaults suggest ambiguous results in case only length.out is specified, and you have to read a parenthesis in the details to figure out what will happen. Even then, the behaviour in my example is not spelled out (although it is suggested by negative inference). Moreover, the real & imaginary parts are ignored if either modulus or argument is provided, and I don't see that this is explained at all.

R's numeric (& IEEE's floating-point types) seem to approximate a multi-point compactification of the real line. +Inf & -Inf fill out the approximation to the extended real line, and NaN, NA_real_ & maybe some others handle some cases in which the answer does not live in the extended real line. (I'm not digging into bit patterns here. I suspect that there are several versions of NaN, but I hope that they all behave the same way.) The documentation suggests that a complex scalar in R is just a pair of numeric scalars, so we are not dealing with the Riemann sphere or any other usually-studied extension of the complex plane. Since R distinguishes various complex infinities (and seems to allow any combination of numeric values in real & imaginary parts), the usual mathematical answer for atan(1i) may no longer be relevant.

The tangent function has an essential singularity at complex infinity (the compactification point in the Riemann sphere, which I consider the natural extension for the study of meromorphic functions, for example making the tangent function well defined on the whole plane), so the usual extension of the plane does not give us an answer for atan(1i). However, another possible extension is the Cartesian square of the extended real line, and in that extension continuity suggests that tan(x + Inf*1i) = 1i and tan(x - Inf*1i) = -1i (for x real & finite). That is the result from R's tan function, and it explains why atan(1i) in R is not NA or NaN.

The specific choice of pi/4 + Inf*1i puzzled me at first, but I think it's related to the branch-cut rules given in the documentation. The real part of atan((1+.Machine$double.eps)*1i) is pi/2, and the real part of atan((1-.Machine$double.eps)*1i) is zero, and someone apparently decided to average those for atan(1i).

TL;DR: The documentation needs more details, and I don't really like the extended complex plane that R implemented, but within that framework the answers for atan(1i) & atan(-1i) make sense.

Regards,
Jorgen Harmse.
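The remarks about the complex() constructor can be checked directly. A short sketch; the angle theta and the sample values are arbitrary choices for illustration.

```r
theta <- pi / 3   # arbitrary angle, chosen only for the demonstration

# Only length.out given: real and imaginary parts default to zero
z0 <- complex(1L)

# argument given (modulus defaults to 1): a point on the unit circle
z1 <- complex(1L, argument = theta)

# real is supplied together with modulus: the real part is ignored
z2 <- complex(real = 5, modulus = 2)

print(z0)        # zero
print(Mod(z1))   # modulus 1
print(Arg(z1))   # the angle theta
print(z2)        # modulus 2, argument 0, so 5 plays no role
```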
Re: [R] Fwd: effects() extractor for a quantile reqression object: error message
Need to kill some time, so thought I'd opine.

Given the intent, as I understood it: to extract components from a quantile regression (rq) object similar to how one might extract effects from an lm object. Since it seems effects() is not implemented for rq, here are some alternative approaches to achieve similar functionality or extract useful information from the quantile regression model:

1. Extracting Coefficients. Use the coef() function to extract the coefficients of the quantile regression model. This is similar to extracting the effects in a linear model.
> coef(qm.9)
This will return the estimated coefficients for the 0.9 quantile.

2. Extracting Fitted Values. To get the fitted values from the quantile regression model, use:
> fitted(qm.9)
This gives the fitted values for the quantile regression model, similar to what you'd see in an lm() model.

3. Extracting Residuals. Residuals from the quantile regression can be extracted using the resid() function:
> resid(qm.9)
This gives the residuals for the 0.9 quantile regression model, which can be a useful diagnostic for checking model fit.

4. Manually Calculating "Effects". While effects() is not available, you can manually calculate effects by examining the design matrix and the coefficients. If the term "effects" refers to the influence of different covariates in the model, these can be assessed by looking at the coefficients themselves and their impact on the fitted values.

5. Using the summary() Function. The summary() function for rq objects provides detailed information about the quantile regression fit, including coefficient estimates, standard errors, and statistical significance:
> summary(qm.9)
This can give a more comprehensive understanding of how covariates are contributing to the model.

6. Working with Design Matrices. If you need the design matrix for further custom calculations, you can extract it using:
> model.matrix(qm.9)
This returns the matrix of predictor variables, which you can then use to manually compute the "effects" of changes in the predictors.

7. Explore Partial Effects with predict(). The predict() function can help you assess how changes in predictor values affect the outcome. For instance, to predict values at specific points in the design space, you can use:
> predict(qm.9, newdata = data.frame(your_new_data))

8. Bootstrapping to Examine Variability. If you want to assess variability in the effects of predictors, you could use bootstrapping. The boot.rq() function in the quantreg package bootstraps the quantile regression coefficients, giving insight into the variability of the estimated "effects." Note that boot.rq() takes a design matrix and a response vector rather than a formula; the simplest route is to let summary() call it for you. Example:
> summary(qm.9, se = "boot", R = 1000)

9. Interaction or Additive Effects (If Applicable). If you're trying to capture interaction or additive effects, you might need to specify interaction terms directly in your formula and then inspect the coefficients for these terms. Quantile regression will estimate these coefficients in the same manner as linear regression but specific to the quantile of interest.

In conclusion, while effects() is not available for quantile regression, the combination of coef(), fitted(), resid(), model.matrix(), and summary() provides the main components of a quantile regression model and should provide similar insights to what effects() provides for linear models.

Forgive any typos please... I'm on a mobile device.

Kind regards,
Gregg Powell
Sent from Proton Mail Android

-------- Original Message --------
On 06/09/2024 01:37, Koenker, Roger W wrote:
> Apologies, forgot to copy R-help on this response.
>
> Begin forwarded message:
>
> From: Roger Koenker
> Subject: Re: [R] effects() extractor for a quantile reqression object: error message
> Date: September 6, 2024 at 8:38:47 AM GMT+1
> To: "Christopher W. Ryan"
>
> Chris,
>
> This was intended to emulate the effects component of lm() fitting, but was
> never implemented. Frankly, I don't quite see on first glance how this works
> for lm(); it seems to be (mostly) about situations where X is not full rank
> (see lm.fit) and I also never bothered to implement rq for X that were not
> full rank.
>
> Roger
>
> On Sep 6, 2024, at 3:50 AM, Christopher W. Ryan via R-help wrote:
>
> I'm using quantreg package version 5.98 of 24 May 2024, in R 4.4.1 on
> Linux Mint.
>
> The online documentation for quantreg says, in part, under the
> description of the rq.object, "The coefficients, residuals, and effects
> may be extracted by the generic functions of the same name, rather than
> by the $ operator."
>
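As background to Roger's remark about the effects component of lm(): for a linear model, effects() returns the response rotated by the Q factor of the QR decomposition of the design matrix, which is a different thing from the coefficients. The sketch below uses the built-in mtcars data and an arbitrary formula purely for illustration; it uses base R only and says nothing about what an rq analogue would look like.

```r
# Arbitrary linear model on a built-in data set
fit <- lm(mpg ~ wt + hp, data = mtcars)

# effects() for lm: one value per observation, the first 'rank' of them
# named after the coefficients
eff <- effects(fit)

# The same numbers obtained by hand as Q'y from the stored QR decomposition
qty <- qr.qty(fit$qr, mtcars$mpg)

# The trailing n - rank effects carry the residual sum of squares
rss_from_effects <- sum(eff[-seq_len(fit$rank)]^2)
rss_direct       <- sum(resid(fit)^2)

print(head(eff, 3))
print(c(rss_from_effects, rss_direct))
```

This is why coef(), fitted() and resid() cover most practical needs even without an effects() method: the effects are mainly an intermediate quantity of the least-squares fit.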
[R] effects() extractor for a quantile reqression object: error message
I'm using quantreg package version 5.98 of 24 May 2024, in R 4.4.1 on Linux Mint.

The online documentation for quantreg says, in part, under the description of the rq.object, "The coefficients, residuals, and effects may be extracted by the generic functions of the same name, rather than by the $ operator."

I create an rq object for the 0.9 quantile, called qm.9.

effects(qm.9) yields the error message:

> effects(qm.9)
Error in UseMethod("effects") :
  no applicable method for 'effects' applied to an object of class "rq"

I'm confused. Appreciate any suggestions. Thanks.

--Chris Ryan
Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?
atan(1i) -> 0 + Inf i
as.complex(1/5) -> 0.2 + 0i
atan(1i) / 5 -> (0 + Inf i) * (0.2 + 0i)
 -> 0*0.2 + 0*0i + Inf i * 0.2 + Inf i * 0i
(infinity times zero is undefined)
 -> 0 + 0i + Inf i + NaN * i^2
 -> 0 + 0i + Inf i - NaN
 -> NaN + Inf i

I am not sure how complex arithmetic could arrive at another answer. I advise against messing with infinities... use atan2() if you don't actually need complex arithmetic.

On September 5, 2024 3:38:33 PM PDT, Bert Gunter wrote:
>> complex(real = 0, imaginary = Inf)
>[1] 0+Infi
>
>> Inf*1i
>[1] NaN+Infi
>
>>> complex(real = 0, imaginary = Inf)/5
>[1] NaN+Infi
>
>See the Note in ?complex for the explanation, I think. Duncan can correct
>if I'm wrong.
>
>-- Bert
>
>On Thu, Sep 5, 2024 at 3:20 PM Leo Mada wrote:
>
>> Dear Bert,
>>
>> These behave like real divisions/multiplications:
>> complex(re=Inf, im = Inf) * 5
>> # Inf+Infi
>> complex(re=-Inf, im = Inf) * 5
>> # -Inf+Infi
>>
>> The real division / multiplication should be faster and also is well
>> behaved. I was expecting R to do the real division/multiplication on a
>> complex number. Which R actually does for these very particular cases; but
>> not when only Im(x) is Inf.
>>
>> Sincerely,
>>
>> Leonard
>>
>> --
>> *From:* Bert Gunter
>> *Sent:* Friday, September 6, 2024 1:12 AM
>> *To:* Duncan Murdoch
>> *Cc:* Leo Mada ; r-help@r-project.org
>> *Subject:* Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?
>>
>> Perhaps
>>
>> > Inf*1i
>> [1] NaN+Infi
>>
>> clarifies why it is *not* a bug.
>> (Boy, did that jog some long dusty math memories :-) )
>>
>> -- Bert
>>
>> On Thu, Sep 5, 2024 at 2:48 PM Duncan Murdoch wrote:
>>
>> On 2024-09-05 4:23 p.m., Leo Mada via R-help wrote:
>> > Dear R Users,
>> >
>> > Is this desired behaviour?
>> > I presume it's a bug.
>> >
>> > atan(1i)
>> > # 0+Infi
>> >
>> > tan(atan(1i))
>> > # 0+1i
>> >
>> > atan(1i) / 5
>> > # NaN+Infi
>>
>> There's no need to involve atan() and tan() in this:
>>
>> > (0+Inf*1i)/5
>> [1] NaN+Infi
>>
>> Why do you think this is a bug?
>>
>> Duncan Murdoch
>>
--
Sent from my phone. Please excuse my brevity.
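The expansion above can be replayed with ordinary real arithmetic, which shows where the NaN comes from without relying on how a particular R version implements complex division. The variable names are made up for this sketch, and the component-wise division at the end is the "real division" Leonard expected, written out by hand.

```r
# z = a + b*i with a = 0, b = Inf;  w = c + d*i with c = 0.2, d = 0
a <- 0;   b <- Inf
c_re <- 0.2; d_im <- 0

# (a + b i)(c + d i) = (a*c - b*d) + (a*d + b*c) i
re_part <- a * c_re - b * d_im   # 0 - Inf*0 = 0 - NaN = NaN
im_part <- a * d_im + b * c_re   # 0 + Inf*0.2 = Inf
print(c(re_part, im_part))

# Dividing the components separately avoids the Inf*0 term altogether
z <- complex(real = a, imaginary = b)
w <- complex(real = Re(z) / 5, imaginary = Im(z) / 5)
print(w)
```

The second form is only valid because the divisor is known to be real; general complex division cannot skip the cross terms.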
Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?
Dear Bert,

These behave like real divisions/multiplications:

complex(re=Inf, im = Inf) * 5
# Inf+Infi
complex(re=-Inf, im = Inf) * 5
# -Inf+Infi

The real division / multiplication should be faster and also is well behaved. I was expecting R to do the real division/multiplication on a complex number. Which R actually does for these very particular cases; but not when only Im(x) is Inf.

Sincerely,

Leonard

From: Bert Gunter
Sent: Friday, September 6, 2024 1:12 AM
To: Duncan Murdoch
Cc: Leo Mada ; r-help@r-project.org
Subject: Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?

Perhaps

> Inf*1i
[1] NaN+Infi

clarifies why it is *not* a bug.
(Boy, did that jog some long dusty math memories :-) )

-- Bert

On Thu, Sep 5, 2024 at 2:48 PM Duncan Murdoch wrote:

On 2024-09-05 4:23 p.m., Leo Mada via R-help wrote:
> Dear R Users,
>
> Is this desired behaviour?
> I presume it's a bug.
>
> atan(1i)
> # 0+Infi
>
> tan(atan(1i))
> # 0+1i
>
> atan(1i) / 5
> # NaN+Infi

There's no need to involve atan() and tan() in this:

> (0+Inf*1i)/5
[1] NaN+Infi

Why do you think this is a bug?

Duncan Murdoch
[R] BUG: atan(1i) / 5 = NaN+Infi ?
Dear R Users,

Is this desired behaviour? I presume it's a bug.

atan(1i)
# 0+Infi

tan(atan(1i))
# 0+1i

atan(1i) / 5
# NaN+Infi

There were some changes in handling of complex numbers. But it looks like a bug.

Sincerely,

Leonard
Re: [R] Adding parameters for Benchmark normal distribution in shapiro.test
Wouldn't that be because the sample is not being compared to a specific distribution but rather to many possible distributions by MC? [1]

If you think that need not be the case, perhaps you can write your own test... but then it will probably be answering a different question?

[1] https://en.m.wikipedia.org/wiki/Shapiro%E2%80%93Wilk_test

On September 2, 2024 4:26:17 PM PDT, Christofer Bogaso wrote:
>Hi,
>
>In ?shapiro.test, there seems to be no option to pass mean and sd
>information of the Normal distribution which I want to compare my
>sample data to.
>
>For example in the code below, I want to test my sample to N(0, 10).
>
>shapiro.test(rnorm(100, mean = 5, sd = 3))
>
>Is there any way to pass the information of the benchmark normal distribution?
>
--
Sent from my phone. Please excuse my brevity.
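A small sketch of the "different question" point. The Shapiro-Wilk statistic does not change when the sample is shifted or rescaled, so there is nothing for a mean or sd argument to act on. A test against one fully specified normal distribution does exist in base R as ks.test(); using it here is a suggestion of this note, not something proposed in the thread, and reading "N(0, 10)" as sd = 10 is an assumption.

```r
set.seed(42)
x <- rnorm(100, mean = 5, sd = 3)

# Shapiro-Wilk: the statistic W is the same for x and for a linear rescaling
w_original <- unname(shapiro.test(x)$statistic)
w_rescaled <- unname(shapiro.test(2 * x + 5)$statistic)
print(c(w_original, w_rescaled))

# Kolmogorov-Smirnov against one specific normal (mean 0, sd 10 assumed)
ks <- ks.test(x, "pnorm", mean = 0, sd = 10)
print(ks$p.value)
```

The first test asks "is the sample normal at all?", the second asks "does the sample come from this particular normal?", so their results are not interchangeable.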
Re: [R] Prediction from Arima model
Sent from my iPhone

> On Aug 31, 2024, at 10:55 AM, Christofer Bogaso wrote:
>
> Hi,
>
> I have run following code to obtain one step ahead confidence interval
> from an arima model
>
> library(forecast)
>
> set.seed(100)
>
> forecast(Arima(rnorm(100), order = c(1,0,1), xreg = rt(100, 1)), h =
> 1, xreg = 10)
>
> However this appear to provide the Prediction interval, however I
> wanted to get the confidence interval for the new value.
>
> Is there any way to get the confidence interval for the new value?

I'm not sure it makes sense to output a confidence interval when you are projecting a time series. If you wanted a confidence interval on the past data then you can just use standard descriptive methods.

-- David.

>
> I also wanted to get the estimate of SE for the new value which is
> used to obtain the confidence interval of the new value. Is there any
> method available to obtain that?
>
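On the last question in the quoted message, base R's predict() method for arima() fits returns a standard error alongside the point forecast. The sketch below uses stats::arima rather than the forecast package, and a normally distributed stand-in regressor instead of the rt(100, 1) from the question to keep the toy fit well-behaved; both are assumptions of this note. The standard error reported is that of the prediction, so the interval built from it is a prediction interval in the sense discussed above.

```r
set.seed(100)
y <- rnorm(100)
x <- rnorm(100)   # stand-in regressor (the question used rt(100, 1))

fit <- arima(y, order = c(1, 0, 1), xreg = x)

# One step ahead, with the new regressor value 10 as in the question
p <- predict(fit, n.ahead = 1, newxreg = 10)
pred <- as.numeric(p$pred)
se   <- as.numeric(p$se)

# Approximate 95% prediction interval from the reported standard error
interval <- pred + c(-1, 1) * qnorm(0.975) * se
print(c(pred = pred, se = se))
print(interval)
```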
Re: [R] aggregating data with quality control
On Sat, 31 Aug 2024 11:15:10 + Stefano Sofia wrote:

> Evaluating the daily mean independently from the status is very easy:
>
> aggregate(mydf$hs, by=list(format(mydf$data_POSIX, "%Y"),
> format(mydf$data_POSIX, "%m"), format(mydf$data_POSIX, "%d")),
> my.mean)
>
>
> Things become more complicated when I need to export also the status:
> this should be "C" when all 48 data have status equal to "C", and
> status "D" when at least one value has status ="D".
>
>
> I have no clue on how to do that in an efficient way.

You can make the status into an ordered factor:

# come up with some statuses
status <- sample(c('C', 'D'), 42, TRUE, c(.9, .1))
# convert them into factors, specifying that D is "more than" C
status <- ordered(status, c('C', 'D'))

Since the factor is ordered and can be subject to comparison like status[1] < status[2], you can now use max() on your groups. If the sample contains any 'D's, max() will return a 'D', because it's larger than any 'C's. If the sample contains only 'C's, that's the maximal value by default.

--
Best regards,
Ivan
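Putting the ordered-factor idea together with the daily grouping might look like the sketch below. The data frame is invented for illustration (two days of 48 half-hourly values, with a single "D" on the second day); only the column names hs and status follow the question, and the day column is a hypothetical simplification of the date formatting used there.

```r
# Invented example data: two days, 48 values each
mydf <- data.frame(
  day    = rep(as.Date("2024-01-01") + 0:1, each = 48),
  hs     = rep(c(10, 20), each = 48),
  status = c(rep("C", 48), "D", rep("C", 47))
)

# D is declared "more than" C, so max() picks D whenever one is present
mydf$status <- ordered(mydf$status, levels = c("C", "D"))

# Daily mean of the measurements
daily_mean <- tapply(mydf$hs, mydf$day, mean)

# Daily status: the worst (maximal) status within each day
daily_status <- tapply(mydf$status, mydf$day,
                       function(s) as.character(max(s)))

print(data.frame(day = names(daily_mean),
                 hs = as.numeric(daily_mean),
                 status = unname(daily_status)))
```

The as.character() call is there so that the per-day results combine into a plain character vector regardless of how the grouping function handles factors.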
[R] lattice panel layout like cross-tabs, like a 2 x 2 table.
Years ago, I recall creating lattice plots with two binary factors, call them f1 and f2, as in

xyplot(y ~ x | f1 + f2, data = dd)

and I made it so the rows had strips on the left with the levels of one factor, and the columns had strips on the top with the levels of the other factor. Sort of like strip.left() but for just one of the factors.

I can't remember or find how I did it, what options to set. Can anyone remind me?

Thanks.

--Chris Ryan
Re: [R] Fill NA values in columns with values of another column
Use the ave function. On August 29, 2024 2:29:16 PM PDT, Bert Gunter wrote: >Petr et.al: > >I think using merge is a very nice idea! (note that the email omitted the >last rows of the result, though your code of course produced them) > >The only minor problem is that the order of the rows in the result is >changed from the original. If the OP needs to preserve the original >ordering, that can be easily done. Here is a complete implementation of >your idea (I think). > >## assume that dat is a data frame with the first two columns as in the >OP's post, i.e. the first column is the Value with NA's and the second is >the Group > > spl <- dat |> nrow() |> seq_len() |> split(dat[[2]]) |> unlist() ## for >reordering > dat[spl, 1:2] <- >dat[, 1:2] |> >na.omit() |> ## remove rows with NA's >unique() |> ## remove duplicate rows >merge(dat[, 1:2], by.x=2, by.y=2) |> _[, 2:1] ## and now merge() > > Note the final reordering of the first two columns because of the way >merge() works. >I suspect there may be a slicker way to do this using unsplit(), but I >could not figure out how > >The result is: >> dat > Value Group >1 6 8 >2 9 5 >3 2 1 >4 5 6 >5 2 7 >6 7 2 >7 4 4 >8 2 7 >9 2 7 >1010 3 >11 7 2 >12 4 4 >13 5 6 >14 9 5 >15 9 5 >16 5 6 >1710 3 >18 7 2 >19 2 1 >20 2 7 >21 7 2 >22 6 8 >23 4 4 >24 9 5 >25 5 6 >26 2 1 >27 4 4 >28 6 8 >2910 3 >3010 3 >31 6 8 >32 2 1 > >Cheers, >Bert > > > > >On Wed, Aug 28, 2024 at 10:53 PM Petr Pikal wrote: > >> Hallo Francesca >> >> If you had an object with correct setting, something like template >> >> > dput(res) >> structure(list(V1 = c("1", "2", "3", "4", "5", "6", "7", "8"), >> V2 = c(2, 7, 10, 4, 9, 5, 2, 6)), class = "data.frame", row.names = >> c("1", >> "2", "3", "4", "5", "6", "7", "8")) >> >> you could merge it with your object where some values are missing >> >> > dput(daf) >> structure(list(X1 = c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, >> NA, NA, 5L, 9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, >> 2L, 4L, 6L, 10L, NA, NA, NA), 
X2 = c(8L, 5L, 1L, 6L, 7L, 2L, >> 4L, 7L, 7L, 3L, 2L, 4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L, >> 4L, 5L, 6L, 1L, 4L, 8L, 3L, 3L, 8L, 1L)), class = "data.frame", row.names = >> c(NA, >> -32L)) >> >> > merge(daf, res, by.x="X2", by.y="V1") >>X2 X1 V2 >> 1 1 NA 2 >> 2 1 NA 2 >> 3 1 2 2 >> 4 1 2 2 >> 5 2 NA 7 >> 6 2 NA 7 >> 7 2 7 7 >> 8 2 7 7 >> 9 3 10 10 >> 10 3 NA 10 >> 11 3 10 10 >> 12 3 NA 10 >> 13 4 4 4 >> 14 4 NA 4 >> 15 4 4 4 >> 16 4 NA 4 >> 17 5 9 9 >> 18 5 NA 9 >> 19 5 NA 9 >> >> Cheers. >> Petr >> >> >> >> >> st 28. 8. 2024 v 0:45 odesílatel Francesca PANCOTTO via R-help < >> r-help@r-project.org> napsal: >> >> > Dear Contributors, >> > I have a problem with a database composed of many individuals for many >> > periods, for which I need to perform a manipulation of data as follows. >> > Here I report the procedure I need to do for the first 32 observations of >> > the first period. >> > >> > >> > cbind(VB1d[,1],s1id[,1]) >> > [,1] [,2] >> > [1,]68 >> > [2,]95 >> > [3,] NA1 >> > [4,]56 >> > [5,] NA7 >> > [6,] NA2 >> > [7,]44 >> > [8,]27 >> > [9,]27 >> > [10,] NA3 >> > [11,] NA2 >> > [12,] NA4 >> > [13,]56 >> > [14,]95 >> > [15,] NA5 >> > [16,] NA6 >> > [17,] 103 >> > [18,]72 >> > [19,]21 >> > [20,] NA7 >> > [21,]72 >> > [22,] NA8 >> > [23,] NA4 >> > [24,] NA5 >> > [25,] NA6 >> > [26,]21 >> > [27,]4
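The ave() suggestion at the top of this reply can be sketched as follows. This is only a minimal illustration, not necessarily what the poster had in mind: the data frame is rebuilt from the dput(daf) output quoted above, and the helper name fill_group and the new column X1_filled are made up for the example. It assumes every group has exactly one distinct non-NA value.

```r
# Data rebuilt from the dput(daf) output quoted in this thread
daf <- data.frame(
  X1 = c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA, NA, NA, 5L, 9L, NA, NA,
         10L, 7L, 2L, NA, 7L, NA, NA, NA, NA, 2L, 4L, 6L, 10L, NA, NA, NA),
  X2 = c(8L, 5L, 1L, 6L, 7L, 2L, 4L, 7L, 7L, 3L, 2L, 4L, 6L, 5L, 5L, 6L,
         3L, 2L, 1L, 7L, 2L, 8L, 4L, 5L, 6L, 1L, 4L, 8L, 3L, 3L, 8L, 1L)
)

# Hypothetical helper: replace every element of a group by the group's
# single known (non-NA) value; stops if a group has zero or several values
fill_group <- function(v) {
  known <- unique(v[!is.na(v)])
  stopifnot(length(known) == 1L)
  rep(known, length(v))
}

# ave() applies FUN within each level of X2 and keeps the original row order
daf$X1_filled <- ave(daf$X1, daf$X2, FUN = fill_group)
print(daf)
```

Because ave() returns its result in the original row order, no reordering step is needed afterwards.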
Re: [R] boxplot of raster and shapefile
max : num 53 .. .. ..@ band : int 1 .. .. ..@ unit : chr "" .. .. ..@ names : chr "Andrena.barbilabris_glo_ensemble" ..@ legend :Formal class '.RasterLegend' [package "raster"] with 5 slots .. .. ..@ type : chr(0) .. .. ..@ values: logi(0) .. .. ..@ color : logi(0) .. .. ..@ names : logi(0) .. .. ..@ colortable: logi(0) ..@ title : chr(0) ..@ extent :Formal class 'Extent' [package "raster"] with 4 slots .. .. ..@ xmin: num 248 .. .. ..@ xmax: num 284 .. .. ..@ ymin: num 107 .. .. ..@ ymax: num 130 ..@ rotated : logi FALSE ..@ rotation:Formal class '.Rotation' [package "raster"] with 2 slots .. .. ..@ geotrans: num(0) .. .. ..@ transfun:function () ..@ ncols : int 14400 ..@ nrows : int 9200 ..@ crs :Formal class 'CRS' [package "sp"] with 1 slot .. .. ..@ projargs: chr "+proj=somerc +lat_0=46.952405556 +lon_0=7.439583 +k_0=1 +x_0=260 +y_0=120 +ellps=bessel +units=m +no_defs" .. .. ..$ comment: chr "PROJCRS[\"unknown\",\nBASEGEOGCRS[\"unknown\",\n DATUM[\"Unknown based on Bessel 1841 ellipsoid\",\n"| __truncated__ ..@ srs : chr "+proj=somerc +lat_0=46.952405556 +lon_0=7.4395833333 +k_0=1 +x_0=260 +y_0=120 +ellps=bessel +units=m +no_defs" ..@ history : list() ..@ z : list() > e <- extract(r,v) Error in (function (classes, fdef, mtable) : kann keine vererbte Methode finden für Funktion ‘extract’ für Signatur ‘"RasterLayer", "SpatVector"’ Kind regards Sibylle -Original Message- From: Ivan Krylov Sent: Tuesday, August 27, 2024 6:55 PM To: SIBYLLE STÖCKLI via R-help Cc: sibylle.stoec...@gmx.ch Subject: Re: [R] boxplot of raster and shapefile В Mon, 26 Aug 2024 14:33:02 +0200 SIBYLLE STÖCKLI via R-help < <mailto:r-help@r-project.org> r-help@r-project.org> пишет: > > # Extract raster values within the shapefile extracted_values <- > > extract(raster_file, shape_file) > > # Assuming the shapefile has multiple polygons and you want to # > > create a boxplot for each data_list <- > > lapply(1:length(extracted_values), function(i) { > + data.frame(value = extracted_values[[i]], 
polygon = i) > + }) > > data <- do.call(rbind, data_list) > > names(data) > [1] "value" "polygon" > > # Create the boxplot > > bp<-ggplot(data, aes(x = factor(polygon), y = value)) + > + geom_boxplot() + > + labs(x = "Polygon", y = "Raster Values") + > + theme_minimal() > > bp > Error in UseMethod("depth") : > no applicable method for 'depth' applied to an object of class > "NULL" > In addition: Warning message: > Removed 452451 rows containing non-finite outside the scale range > (`stat_boxplot()`). Thank you for providing a runnable example! Could you please also show the output of str(extracted_values) and str(data)? -- Best regards, Ivan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide https://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fill NA values in columns with values of another column
Dear Contributors,

I have a problem with a database composed of many individuals for many
periods, for which I need to perform a manipulation of data as follows.
Here I report the procedure I need to do for the first 32 observations
of the first period.

cbind(VB1d[,1],s1id[,1])
      [,1] [,2]
 [1,]    6    8
 [2,]    9    5
 [3,]   NA    1
 [4,]    5    6
 [5,]   NA    7
 [6,]   NA    2
 [7,]    4    4
 [8,]    2    7
 [9,]    2    7
[10,]   NA    3
[11,]   NA    2
[12,]   NA    4
[13,]    5    6
[14,]    9    5
[15,]   NA    5
[16,]   NA    6
[17,]   10    3
[18,]    7    2
[19,]    2    1
[20,]   NA    7
[21,]    7    2
[22,]   NA    8
[23,]   NA    4
[24,]   NA    5
[25,]   NA    6
[26,]    2    1
[27,]    4    4
[28,]    6    8
[29,]   10    3
[30,]   NA    3
[31,]   NA    8
[32,]   NA    1

In column s1id, I have numbers from 1 to 8, which are the ids of 8
groups, randomly mixed in the larger group of 32. For each group, I want
the value that is reported for only two group members to be assigned to
all four group members.

For example, the value 8 in the first row, second column, is group 8.
The value of the variable VB1d for group 8 is 6. At row 28, again for
s1id equal to 8, I have 6. But in row 22, where the second variable is
also 8, the value is NA. It is the same in each group: only two members
have the correct number, the other two are NA. I need each group,
identified by the values of the variable s1id, to correctly report the
value of the variable VB1d that is present for just two group members.

I hope my explanation is acceptable. The task appears complex to me
right now, especially because I will need to repeat this procedure for
x12x14 similar databases. Has anyone ever encountered a similar problem?

Thanks in advance for any help provided.

--
Francesca Pancotto
Associate Professor Political Economy
University of Modena, Largo Santa Eufemia, 19, Modena
Office Phone: +39 0522 523264
Web: https://sites.google.com/view/francescapancotto/home
Re: [R] boxplot of raster and shapefile
On Mon, 26 Aug 2024 14:33:02 +0200, SIBYLLE STÖCKLI via R-help wrote:

> > # Extract raster values within the shapefile
> > extracted_values <- extract(raster_file, shape_file)
> > # Assuming the shapefile has multiple polygons and you want to
> > # create a boxplot for each
> > data_list <- lapply(1:length(extracted_values), function(i) {
> +   data.frame(value = extracted_values[[i]], polygon = i)
> + })
> > data <- do.call(rbind, data_list)
> > names(data)
> [1] "value"   "polygon"
> > # Create the boxplot
> > bp<-ggplot(data, aes(x = factor(polygon), y = value)) +
> +   geom_boxplot() +
> +   labs(x = "Polygon", y = "Raster Values") +
> +   theme_minimal()
> > bp
> Error in UseMethod("depth") :
>   no applicable method for 'depth' applied to an object of class
>   "NULL"
> In addition: Warning message:
> Removed 452451 rows containing non-finite outside the scale range
> (`stat_boxplot()`).

Thank you for providing a runnable example! Could you please also show
the output of str(extracted_values) and str(data)?

--
Best regards,
Ivan
[R] boxplot of raster and shapefile
Dear community

This example code works:

library(raster)
library(sp)
library(rgdal)
library(ggplot2)

# Create some sample raster data
raster_file <- raster(ncol=36, nrow=18)
raster_file[] <- 1:ncell(raster_file)
plot(raster_file)

# Create some sample polygons
cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
shape_file <- SpatialPolygons(list(Polygons(list(Polygon(cds1)), 1),
                                   Polygons(list(Polygon(cds2)), 2)))
plot(shape_file)

# Extract raster values within the shapefile
extracted_values <- extract(raster_file, shape_file)

# Assuming the shapefile has multiple polygons and you want to create a
# boxplot for each
data_list <- lapply(1:length(extracted_values), function(i) {
  data.frame(value = extracted_values[[i]], polygon = i)
})
data <- do.call(rbind, data_list)

# Create the boxplot
bp<-ggplot(data, aes(x = factor(polygon), y = value)) +
  geom_boxplot() +
  labs(x = "Polygon", y = "Raster Values") +
  theme_minimal()
bp

For my own dataset I encountered problems in reading in the polygons.
The error message appears when the boxplot is printed.

Here is my shape file:

> # load shape file including all layers and print layers
> shape_file<-shapefile("C:/Users/.BiogeoRegion.shp")
Warning message:
[vect] Z coordinates ignored
> names(shape_file)
 [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"  "Version"    "Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa"
[11] "ITRegionNa" "DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
> str(shape_file)
Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
  ..@ data       :'data.frame': 12 obs. of 14 variables:
  .. ..$ RegionNumm: int [1:12] 1 2 2 2 2 3 3 4 5 5 ...
  .. ..$ RegionName: chr [1:12] "R1" "R2" "R2" "R2" ...
  .. ..$ Unterregio: int [1:12] 11 21 22 23 24 31 32 41 51 52 ...
  .. ..$ Unterreg_1: chr [1:12] "U11" "U21" "U22" "U23" ...
  .. ..$ ObjNummer : chr [1:12] "1" "2" "3" "4" ...
  .. ..$ Version   : chr [1:12] "2020/05/08" "2020/05/08" "2020/05/08" "2020/05/08" ...
  .. ..$ Shape_Leng: num [1:12] 725117 334364 539746 576810 41 ...
  .. ..$ Shape_Area: num [1:12] 4.17e+09 1.11e+09 1.07e+09 4.64e+09 4.47e+09 ...
  .. ..$ DERegionNa: chr [1:12] "Jura" "Mittelland" "Mittelland" "Mittelland" ...
  .. ..$ FRRegionNa: chr [1:12] "Jura" "Plateau" "Plateau" "Plateau" ...
  .. ..$ ITRegionNa: chr [1:12] "Giura" "Altipiano" "Altipiano" "Altipiano" ...
  .. ..$ DEBioBedeu: chr [1:12] "Jura und Randen" "Genferseegebiet" "Hochrheingebiet" "Westliches Mittelland" ...
  .. ..$ FRBioBedeu: chr [1:12] "Jura et Randen" "Bassin lémanique" "Bassin rhénan" "Plateau occidental" ...
  .. ..$ ITBioBedeu: chr [1:12] "Giura e Randen" "Regione del Lemano" "Regione dellAlto Reno" "Altipiano occidentale" ...
  ..@ polygons   :List of 12
> # select a specific raster file from a list
> raster_file<-allrasters_pres[[1]]
> names(raster_file)
[1] "Andrena.barbilabris_glo_ensemble"
>
> # Extract raster values within the shapefile
> extracted_values <- extract(raster_file, shape_file)
>
> # Assuming the shapefile has multiple polygons and you want to create a boxplot for each
> data_list <- lapply(1:length(extracted_values), function(i) {
+   data.frame(value = extracted_values[[i]], polygon = i)
+ })
> data <- do.call(rbind, data_list)
> names(data)
[1] "value"   "polygon"
>
> # Create the boxplot
> bp<-ggplot(data, aes(x = factor(polygon), y = value)) +
+   geom_boxplot() +
+   labs(x = "Polygon", y = "Raster Values") +
+   theme_minimal()
> bp
Error in UseMethod("depth") :
  no applicable method for 'depth' applied to an object of class "NULL"
In addition: Warning message:
Removed 452451 rows containing non-finite outside the scale range
(`stat_boxplot()`).
Re: [R] checking for unstated dependencies in examples ... ERROR
On Sun, 25 Aug 2024 20:32:07 + Søren Højsgaard via R-help wrote:

> checking for unstated dependencies in examples ... ERROR
> Warning: parse error in file 'lines':
> 77:20)

Looks like one of your Rd \examples{} has a syntax error in it. If there
is a file gRain.Rcheck/gRain-Ex.R, try to find out which help page
line 77 corresponds to.

--
Best regards,
Ivan
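One way to follow the suggestion above is sketched below. It assumes, as far as I recall, that the combined examples file written by R CMD check starts each help page's section with a "### Name: topic" comment; the function name locate_example_line and the small demo file are made up for illustration, and for the real case the path would be gRain.Rcheck/gRain-Ex.R with line = 77.

```r
# Hypothetical helper: show the lines around a given line number and the
# most recent "### Name: ..." header above it (assumed header format)
locate_example_line <- function(path, line, context = 3L) {
  src <- readLines(path, warn = FALSE)
  headers <- grep("^### Name: ", src)
  owner <- headers[headers <= line]
  topic <- if (length(owner)) {
    sub("^### Name: ", "", src[max(owner)])
  } else {
    NA_character_
  }
  from <- max(1L, line - context)
  to <- min(length(src), line + context)
  list(topic = topic, lines = src[from:to])
}

# Demo on a tiny made-up file with two "help pages"
demo <- tempfile(fileext = ".R")
writeLines(c("### Name: alpha", "x <- 1",
             "### Name: beta", "y <- (2 +", "z <- 3)"), demo)
hit <- locate_example_line(demo, line = 4L, context = 1L)
print(hit)
```

Once the help page is known, the \examples{} section of the corresponding Rd file is the place to look for the unbalanced parenthesis or similar typo.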
Re: [R] checking for unstated dependencies in examples ... ERROR
> On 08/25/2024 1:32 PM PDT Søren Højsgaard via R-help > wrote: > > > Dear all, > When checking a package (ubuntu) I get > > checking for unstated dependencies in examples ... ERROR > Warning: parse error in file 'lines': > 77:20) > > Can anyone help on this? I doubt anyone can help with such scanty information about how this problem comes about. I don't think there's an "ubuntu" package, but rather that you are on a Linux box. Do you actually have a file named "lines"? Have you looked for a syntax error near line 77? -- David. Thanks in advance. > > Best > Søren > > > --- > > > sessionInfo() > R version 4.3.3 (2024-02-29) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu 24.04 LTS > > Matrix products: default > BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0 > LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > time zone: Europe/Copenhagen > tzcode source: system (glibc) > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] gRain_1.4.1.9012 gRbase_2.0.2 Rcpp_1.0.13 shtools_1.0 > [5] markdown_1.13knitr_1.48 rmarkdown_2.28 devtools_2.4.5 > [9] usethis_3.0.0 > > loaded via a namespace (and not attached): > [1] tidyselect_1.2.1 dplyr_1.1.4 fastmap_1.2.0 > [4] xopen_1.0.1 promises_1.3.0digest_0.6.37 > [7] mime_0.12 lifecycle_1.0.4 Deriv_4.1.3 > [10] ellipsis_0.3.2processx_3.8.4magrittr_2.0.3 > [13] compiler_4.3.3rlang_1.1.4 tools_4.3.3 > [16] igraph_2.0.3 utf8_1.2.4prettyunits_1.2.0 > [19] htmlwidgets_1.6.4 pkgbuild_1.4.4curl_5.2.1 > [22] xml2_1.3.6pkgload_1.4.0 miniUI_0.1.1.1 > [25] withr_3.0.1 purrr_1.0.2 desc_1.4.3 > [28] grid_4.3.3stats4_4.3.3 fansi_1.0.6 > [31] 
roxygen2_7.3.2urlchecker_1.0.1 profvis_0.3.8 > [34] xtable_1.8-4 colorspace_2.1-1 ggplot2_3.5.1 > [37] scales_1.3.0 MASS_7.3-60.0.1 cli_3.6.3 > [40] generics_0.1.3remotes_2.5.0 rstudioapi_0.16.0 > [43] modelr_0.1.11 commonmark_1.9.1 sessioninfo_1.2.2 > [46] cachem_1.1.0 stringr_1.5.1 vctrs_0.6.5 > [49] boot_1.3-30 Matrix_1.6-5 callr_3.7.6 > [52] rcmdcheck_1.4.0 testthat_3.2.1.1 tidyr_1.3.1 > [55] glue_1.7.0ps_1.7.7 cowplot_1.1.3 > [58] stringi_1.8.4 gtable_0.3.5 later_1.3.2 > [61] munsell_0.5.1 tibble_3.2.1 pillar_1.9.0 > [64] htmltools_0.5.8.1 brio_1.1.5doBy_4.6.22 > [67] R6_2.5.1 microbenchmark_1.4.10 rprojroot_2.0.4 > [70] evaluate_0.24.0 shiny_1.9.1 lattice_0.22-6 > [73] backports_1.5.0 memoise_2.0.1 broom_1.0.6 > [76] httpuv_1.6.15 xfun_0.47 fs_1.6.4 > [79] pkgconfig_2.0.3 > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide https://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide https://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] checking for unstated dependencies in examples ... ERROR
Dear all, When checking a package (ubuntu) I get checking for unstated dependencies in examples ... ERROR Warning: parse error in file 'lines': 77:20) Can anyone help on this? Thanks in advance. Best Søren --- > sessionInfo() R version 4.3.3 (2024-02-29) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 24.04 LTS Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C time zone: Europe/Copenhagen tzcode source: system (glibc) attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] gRain_1.4.1.9012 gRbase_2.0.2 Rcpp_1.0.13 shtools_1.0 [5] markdown_1.13knitr_1.48 rmarkdown_2.28 devtools_2.4.5 [9] usethis_3.0.0 loaded via a namespace (and not attached): [1] tidyselect_1.2.1 dplyr_1.1.4 fastmap_1.2.0 [4] xopen_1.0.1 promises_1.3.0digest_0.6.37 [7] mime_0.12 lifecycle_1.0.4 Deriv_4.1.3 [10] ellipsis_0.3.2processx_3.8.4magrittr_2.0.3 [13] compiler_4.3.3rlang_1.1.4 tools_4.3.3 [16] igraph_2.0.3 utf8_1.2.4prettyunits_1.2.0 [19] htmlwidgets_1.6.4 pkgbuild_1.4.4curl_5.2.1 [22] xml2_1.3.6pkgload_1.4.0 miniUI_0.1.1.1 [25] withr_3.0.1 purrr_1.0.2 desc_1.4.3 [28] grid_4.3.3stats4_4.3.3 fansi_1.0.6 [31] roxygen2_7.3.2urlchecker_1.0.1 profvis_0.3.8 [34] xtable_1.8-4 colorspace_2.1-1 ggplot2_3.5.1 [37] scales_1.3.0 MASS_7.3-60.0.1 cli_3.6.3 [40] generics_0.1.3remotes_2.5.0 rstudioapi_0.16.0 [43] modelr_0.1.11 commonmark_1.9.1 sessioninfo_1.2.2 [46] cachem_1.1.0 stringr_1.5.1 vctrs_0.6.5 [49] boot_1.3-30 Matrix_1.6-5 callr_3.7.6 [52] rcmdcheck_1.4.0 testthat_3.2.1.1 tidyr_1.3.1 [55] glue_1.7.0ps_1.7.7 cowplot_1.1.3 [58] stringi_1.8.4 gtable_0.3.5 later_1.3.2 [61] munsell_0.5.1 
tibble_3.2.1 pillar_1.9.0 [64] htmltools_0.5.8.1 brio_1.1.5doBy_4.6.22 [67] R6_2.5.1 microbenchmark_1.4.10 rprojroot_2.0.4 [70] evaluate_0.24.0 shiny_1.9.1 lattice_0.22-6 [73] backports_1.5.0 memoise_2.0.1 broom_1.0.6 [76] httpuv_1.6.15 xfun_0.47 fs_1.6.4 [79] pkgconfig_2.0.3 ______ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide https://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] paired raster boxplots
I see, the selection of the layers in the shape or raster file is tricky.

Here is what I found out:
- It seems that my raster file r is fine, but it is definitely my shape
  file s that causes problems reading the right layer: read_sf should be
  OK, because it reads in all layers, see names(sf). However, when
  selecting the third layer, it reads in number 3 (Unterregio) and
  number 15 (geometry). Secondly, when using stack() it does not read in
  the s layer (probably because there are two layers).

> sf <- read_sf("C:/Users/Sibylle Stöckli/Desktop/NCCS_Impacts_Lot2_2022/InVEST/BAFU_ALLEMA_strata/Grossraume/data/BiogeographischeRegionen/N2020_Revision_BiogeoRegion.shp")
> names(sf)
 [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"  "Version"    "Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa" "ITRegionNa" "DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
[15] "geometry"
>
> s<-sf[3]
> names(s)
[1] "Unterregio" "geometry"
>
> r<-allrasters_pres[[1]]
> names(r)
[1] "Andrena.barbilabris_glo_ensemble"
>
> rs <- stack(r, s)
> names(rs) <- c('r', 's')
Error in `names<-`(`*tmp*`, value = c("r", "s")) :
  incorrect number of layer names
> names(rs)
[1] "Andrena.barbilabris_glo_ensemble"

-----Original Message-----
From: Ivan Krylov
Sent: Saturday, August 24, 2024 6:04 PM
To: sibylle.stoec...@gmx.ch
Cc: 'SIBYLLE STÖCKLI via R-help'
Subject: Re: [R] paired raster boxplots

Written on Sat, 24 Aug 2024 10:24:36 +0200:

> Yes indeed my raster "s" (the shape file for the boxplot classes) has
> several layers.

If 's' contains more than one layer, then this already prevents you from
giving two names to stack(r, s).

> That's why I tried to select a layer by "
> s<-sf$Unterregio".

'sf' is a data.frame-like object returned by read_sf, not a raster. If
's' is a raster, it could still contain multiple layers.

> > sf <- read_sf("C:/Users/._BiogeoRegion.shp")
> > names(sf)
> >
> > names(sf)
> [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"
> "Version"    "Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa"
> "ITRegionNa" "DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
> [15] "geometry"
>
> > s<-sf$Unterregio
> > r<-allrasters_pres[[1]]

Sorry, that's still not enough information because we don't know what
names(rs) is. Since 'allrasters_pres' is a list of rasters, 'r' could
also contain more than one layer, resulting in stack(r, s) containing
more than two layers.

In order to avoid the error, you need to see names(rs) and either give
the same number of names to the object instead of two, or additionally
extract one layer (using raster(r, layer = NUMBER_OR_NAME)) from each of
'r' and 's' before stacking them.

--
Best regards,
Ivan
Re: [R] A question on Statistics regarding regression
You say you asked elsewhere, but so many hits come up when I just search
for "unbalanced sample size" that your justification for not following
the posting guide does not seem honest. I also recall that various
discussions of statistical power address this in basic statistics.

On August 24, 2024 11:05:12 AM PDT, Christofer Bogaso wrote:
>Hi,
>
>I have asked this question elsewhere however failed to get any
>response, so hoping to get some insight from experts and statisticians
>here.
>
>Let say we are fitting a regression equation where one explanatory
>variable is categorical with 2 categories. However in the sample, one
>category has 95% of values but other category has just 5%. Means, the
>categories are highly unbalanced.
>
>Typically SE of estimate may be inflated for such highly unbalanced
>categorical explanatory variable.
>
>Such unbalanced case may come from 2 scenarios 1) there is a flaw in
>sample or it is just by chance that second category has just 5% values
>in the sample or 2) in the population itself, the second category has
>very small number of occurrences which is reflected in the sample.
>
>My question how the SE would be impacted in above 2 cases? Will the
>impact be same i.e. we would get incorrect estimate of SE in both
>cases? If yes, is there any way to prove analytically or may be based
>on simulation?
>
>My apologies as this question is not directly R related. However I
>just wanted to get some insight on above problem related to Statistics
>from some of the great Statisticians in this forum.
>
>Thanks for your time.

--
Sent from my phone. Please excuse my brevity.
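The simulation the quoted question asks about could be sketched roughly as below. This is only an illustration under stated assumptions, not an answer given in the thread: it assumes a correctly specified linear model with independent, homoscedastic normal errors, in which case the standard error of the coefficient of a two-category predictor, conditional on the observed group sizes n1 and n0, is sigma * sqrt(1/n1 + 1/n0), whatever the reason for the split. The function names, the sample size of 200 and the effect size of 0.5 are made up for the example.

```r
set.seed(1)

# Analytic SE of the group difference for group sizes n1 and n0
se_analytic <- function(n1, n0, sigma = 1) sigma * sqrt(1 / n1 + 1 / n0)

# Empirical SE: simulate many samples with a fixed split and take the
# standard deviation of the estimated group difference
sim_se <- function(n, p, reps = 2000, sigma = 1) {
  n1 <- round(n * p)
  x <- rep(c(1, 0), c(n1, n - n1))
  est <- replicate(reps, {
    y <- 0.5 * x + rnorm(n, sd = sigma)
    # difference of group means equals the OLS slope for a 0/1 predictor
    mean(y[x == 1]) - mean(y[x == 0])
  })
  sd(est)
}

balanced   <- sim_se(200, 0.50)
unbalanced <- sim_se(200, 0.05)
print(c(balanced = balanced, analytic = se_analytic(100, 100)))
print(c(unbalanced = unbalanced, analytic = se_analytic(10, 190)))
```

Under these assumptions the standard error is larger for the 95%/5% split, but it is still a valid standard error for that sample; what changes is the precision, not the correctness of the formula.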
Re: [R] paired raster boxplots
Written on Sat, 24 Aug 2024 10:24:36 +0200:

> Yes indeed my raster "s" (the shape file for the boxplot classes) has
> several layers.

If 's' contains more than one layer, then this already prevents you from
giving two names to stack(r, s).

> That's why I tried to select a layer by "
> s<-sf$Unterregio".

'sf' is a data.frame-like object returned by read_sf, not a raster. If
's' is a raster, it could still contain multiple layers.

> > sf <- read_sf("C:/Users/._BiogeoRegion.shp")
> > names(sf)
> >
> > names(sf)
> [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"
> "Version"    "Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa"
> "ITRegionNa" "DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
> [15] "geometry"
>
> > s<-sf$Unterregio
> > r<-allrasters_pres[[1]]

Sorry, that's still not enough information because we don't know what
names(rs) is. Since 'allrasters_pres' is a list of rasters, 'r' could
also contain more than one layer, resulting in stack(r, s) containing
more than two layers.

In order to avoid the error, you need to see names(rs) and either give
the same number of names to the object instead of two, or additionally
extract one layer (using raster(r, layer = NUMBER_OR_NAME)) from each of
'r' and 's' before stacking them.

--
Best regards,
Ivan
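The advice above (look at names(rs), then either supply as many names as there are layers or pull out one layer first) can be sketched with toy rasters. This assumes the 'raster' package is installed; the objects a, b and the two-layer s are made up for illustration and do not reproduce the sf object from the thread, which is vector data rather than a raster.

```r
if (requireNamespace("raster", quietly = TRUE)) {
  library(raster)

  # Toy single-layer raster standing in for 'r'
  r <- raster(nrows = 4, ncols = 4)
  r[] <- 1:ncell(r)

  # Toy two-layer object standing in for a multi-layer 's' (hypothetical)
  a <- raster(nrows = 4, ncols = 4); a[] <- 1:ncell(a)
  b <- raster(nrows = 4, ncols = 4); b[] <- ncell(b):1
  s <- stack(a, b)

  # Stacking gives three layers, so assigning two names would fail
  rs <- stack(r, s)
  n_all <- nlayers(rs)
  print(names(rs))

  # Extract a single layer from 's' first; then two names are enough
  rs2 <- stack(r, raster(s, layer = 1))
  names(rs2) <- c("r", "s")
  n_two <- nlayers(rs2)
  print(names(rs2))
}
```

Checking nlayers() or names() before renaming makes the mismatch visible before the `names<-` error occurs.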
Re: [R] paired raster boxplots
Dear Ivan
Dear community

Quite a nice book recommendation.

Yes indeed, my raster "s" (the shape file for the boxplot classes) has
several layers. That's why I tried to select a layer by
"s<-sf$Unterregio".

> sf <- read_sf("C:/Users/._BiogeoRegion.shp")
> names(sf)
> names(sf)
 [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"  "Version"    "Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa" "ITRegionNa" "DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
[15] "geometry"
> s<-sf$Unterregio
> r<-allrasters_pres[[1]]

Kind regards
Sibylle

-----Original Message-----
From: Ivan Krylov
Sent: Friday, August 23, 2024 5:30 PM
To: sibylle.stoec...@gmx.ch
Cc: 'SIBYLLE STÖCKLI via R-help'
Subject: Re: [R] paired raster boxplots

Written on Fri, 23 Aug 2024 10:15:55 +0200:

> > s<-sf$Unterregio
> > r<-allrasters_pres[[1]]
> >
> > rs <- stack(r, s)
> > names(rs) <- c('r', 's')
> Error in `names<-`(`*tmp*`, value = c("r", "s")) :
>   incorrect number of layer names

It looks like at least one of the rasters 'r' and 's' has multiple
layers. What does names(rs) return?

I would offer more detailed advice, but I don't know 'raster' that well.
The "R Inferno" book [1] offers a lot of generic-R troubleshooting
advice, which should help you progress past errors like this one without
waiting for someone on R-help to reply.

--
Best regards,
Ivan

[1] https://www.burns-stat.com/documents/books/the-r-inferno/
Re: [R] paired raster boxplots
Written on Fri, 23 Aug 2024 10:15:55 +0200:

> > s<-sf$Unterregio
> > r<-allrasters_pres[[1]]
> >
> > rs <- stack(r, s)
> > names(rs) <- c('r', 's')
> Error in `names<-`(`*tmp*`, value = c("r", "s")) :
>   incorrect number of layer names

It looks like at least one of the rasters 'r' and 's' has multiple
layers. What does names(rs) return?

I would offer more detailed advice, but I don't know 'raster' that well.
The "R Inferno" book [1] offers a lot of generic-R troubleshooting
advice, which should help you progress past errors like this one without
waiting for someone on R-help to reply.

--
Best regards,
Ivan

[1] https://www.burns-stat.com/documents/books/the-r-inferno/
Re: [R] paired raster boxplots
Dear Ivan

Many thanks.

Using my own dataset, my "s" is a layer of a shape file. Somewhere I need
to define:

snew<-s$Unterregio

I tried to do it before stack(), but it seems to be the wrong way. Do you
have any suggestions?

Code:

> #first import all files in a single folder as a list
> rastlist_pres <- list.files(path ="C:/Users/._bee_presence", pattern='.tif$', all.files= T, full.names= T)
> rastlist_RCP85P2 <- list.files(path ="C:/Users/_bee_RCP85P2", pattern='.tif$', all.files= T, full.names= T)
>
> #import all raster files in folder using lapply
> allrasters_pres <- lapply(rastlist_pres, raster)
> allrasters_RCP85P2 <- lapply(rastlist_RCP85P2, raster)
> sf <- read_sf("C:/Users/._BiogeoRegion.shp")
> names(sf)
> names(sf)
 [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"  "Version"    "Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa" "ITRegionNa" "DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
[15] "geometry"
> s<-sf$Unterregio
> r<-allrasters_pres[[1]]
>
> rs <- stack(r, s)
> names(rs) <- c('r', 's')
Error in `names<-`(`*tmp*`, value = c("r", "s")) :
  incorrect number of layer names

-----Original Message-----
From: Ivan Krylov
Sent: Thursday, August 22, 2024 4:50 PM
To: SIBYLLE STÖCKLI via R-help
Cc: sibylle.stoec...@gmx.ch
Subject: Re: [R] paired raster boxplots

On Thu, 22 Aug 2024 08:46:03 +0200, SIBYLLE STÖCKLI via R-help wrote:

> rr2s <- stack(r, r2,s)
> > names(rs) <- c('r', 's', 'r2')
> > Error in `names<-`(`*tmp*`, value = c("r", "s", "r2")) :
> >   incorrect number of layer names

The error must be happening because the variable named 'rs' in your
workspace originates from some other code you have previously run. The
code "works" if you replace names(rs) by names(rr2s), but the plot isn't
very useful because d$s are real numbers in this example and they don't
form groups for d$r.

Are you interested in a boxplot where the boxes are grouped by two, one
for 'r' and one for 'r2'? I'm sure they are not impossible to produce
manually using the boxplot function, but the 'lattice' or 'ggplot2'
packages would make them much easier. You will need to reshape your data
into long format, with the "value" column for the r/r2 value, the "kind"
column saying "r" or "r2", and the "s" column:
https://stackoverflow.com/q/20172560

--
Best regards,
Ivan
Re: [R] Linear regression and stand deviation at the Linux command line
On Thu, 22 Aug 2024 13:07:37 -0600, Keith Christian wrote:

> I'm interested in R construct(s) to be entered at the command
> line that would output slope, y-intercept, and r-squared values read
> from a csv or other filename entered at the command line, and the same
> for standard deviation calculations, namely the standard deviation,
> variance, and z-scores for every data point in the file.

If you'd like to script R at the command line, consider the commandArgs() function (try entering ?commandArgs at the R prompt). This way you can pass a file path to an R process without unsafely interpolating it into the R expression itself. These arguments can be given to R --args or to Rscript (without the --args).

Also consider the 'littler' scripting-oriented R front-end <https://CRAN.R-project.org/package=littler>, which puts the command line arguments into the 'argv' variable and has a very convenient -d option which loads CSV data from the standard input into a variable named 'X'.

> Are line numbers, commas, etc. needed or no?

Depends on how you read it. By default, the function read.table() will expect your data to be separated by a mixture of tabs and spaces and will recognise a header if the first line contains one less column than the rest of the file. Enter ?read.table at the R prompt to see the available options (the same help page also covers read.csv).

Good introductions to R include, well, "An Introduction to R" [1] (also available by typing RShowDoc('R-intro') into the R prompt) and "Visual Statistics" by Dr. A. Shipunov [2].

Start with the functions read.table(), lm(), scale(), sd(), summary(). Use str() to look at the structure of a variable: summary(lm(...)) will return a named list from which you can extract the values you are interested in (see ?summary.lm). When in doubt, call help(name_of_the_function).

-- 
Best regards,
Ivan

[1] https://cran.r-project.org/doc/manuals/R-intro.html
[2] http://web.archive.org/web/20230106210646/http://ashipunov.info/shipunov/school/biol_240/en/visual_statistics.pdf
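The advice above can be put together in a short script. This is only a sketch under assumptions that are not in the original message: the file is a CSV with a header line and two numeric columns named x and y, and the helper name summarise_xy() and the script name regress.R are made up for illustration.

```r
# Hypothetical usage from the shell:  Rscript regress.R data.csv
# Assumes a header line and numeric columns named x and y.
summarise_xy <- function(path) {
  d <- read.csv(path)                      # comma-separated, header expected
  fit <- summary(lm(y ~ x, data = d))      # see ?summary.lm for the components
  list(
    intercept = unname(coef(fit)[1, "Estimate"]),
    slope     = unname(coef(fit)[2, "Estimate"]),
    r_squared = fit$r.squared,
    sd        = sd(d$y),
    variance  = var(d$y),
    z_scores  = as.vector(scale(d$y))      # (y - mean(y)) / sd(y)
  )
}

# The file path arrives as a command line argument, not as pasted R code
args <- commandArgs(trailingOnly = TRUE)
if (length(args) >= 1 && file.exists(args[1])) str(summarise_xy(args[1]))
```

Passing the path through commandArgs() keeps it out of the R expression itself, which is the point made above about unsafe interpolation.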
Re: [R] fcaR not for latest R version
Dear Peter,

On Thu, 22 Aug 2024 16:53:01 +0200, Peter van Summeren wrote:

> Did anyone use fcaR for the current version of R/Rstudio on the Mac?

Maybe someone hasn't used fcaR on a Mac but can guess a solution to your problem anyway. Maybe someone has, but has no idea how to help you. On mailing lists, it is much more efficient to state the problem you are having right in the initial message, without "asking to ask". (Unless you would like to start an fcaR user group and are searching for members, in which case this must be the right question.)

If you are having package installation problems on your Apple computer, the right mailing list might be r-sig-...@r-project.org. Judging by <https://cran.r-project.org/package=fcaR>, the 'fcaR' package should be able to install on the latest version of R running on a Mac, but the problem could also be in one of its dependencies, which I didn't look at.

-- 
Best regards,
Ivan
Re: [R] paired raster boxplots
On Thu, 22 Aug 2024 08:46:03 +0200, SIBYLLE STÖCKLI via R-help wrote:

> rr2s <- stack(r, r2,s)
>
> names(rs) <- c('r', 's', 'r2')
>
> Error in `names<-`(`*tmp*`, value = c("r", "s", "r2")) :
>
> incorrect number of layer names

The error must be happening because the variable named 'rs' in your workspace originates from some other code you have previously run. The code "works" if you replace names(rs) by names(rr2s), but the plot isn't very useful because d$s are real numbers in this example and they don't form groups for d$r.

Are you interested in a boxplot where the boxes are grouped by two, one for 'r' and one for 'r2'? I'm sure they are not impossible to produce manually using the boxplot function, but the 'lattice' or 'ggplot2' packages would make them much easier. You will need to reshape your data into long format, with the "value" column for the r/r2 value, the "kind" column saying "r" or "r2", and the "s" column: https://stackoverflow.com/q/20172560

-- 
Best regards,
Ivan
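The long-format reshaping described above can be sketched in base R. The data frame d below is a stand-in for as.data.frame(rr2s), built directly with runif() and sample() so that the 'raster' package is not needed; the seed, the grey fill colours and the axis labels are illustrative assumptions, and the classes in s are treated as a factor.

```r
set.seed(1)
# Stand-in for as.data.frame(rr2s): two value columns and a class column
d <- data.frame(
  r  = runif(50, min = 10, max = 20) * 2,
  r2 = runif(50, min = 10, max = 20) * 2,
  s  = sample(1:4, size = 50, replace = TRUE)
)

# Long format: one row per (cell, kind) pair
long <- data.frame(
  value = c(d$r, d$r2),
  kind  = factor(rep(c("r", "r2"), each = nrow(d))),
  s     = factor(rep(d$s, times = 2))
)

# 'kind' varies fastest, so the boxes come in pairs: r.1, r2.1, r.2, r2.2, ...
boxplot(value ~ kind + s, data = long,
        col = c("grey80", "grey40"),
        xlab = "kind.class", ylab = "value")
```

The same long data frame could be handed to 'lattice' or 'ggplot2', as suggested above, if a grouped layout with a legend is preferred.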
Re: [R] Force conversion of (POSIXct) time zone with base R
On Thu, 22 Aug 2024 08:59:46 + Iago Giné Vázquez wrote:

> How should POSIXct time zone be changed without modifying the
> specified time (so fix the time zone).

Since POSIXct represents the number of seconds since the specified "epoch" moment (beginning of 1970 UTC), fixing the time zone while retaining the local time requires modifying the stored time. It might be easier to go through POSIXlt, which represents a local time:

  (now <- as.POSIXlt(Sys.time()))
  attr(now, 'tzone') <- 'UTC'
  (now <- as.POSIXct(now))

(Why attr(now, 'tzone') and not now$zone? Historical reasons, I suppose, but at least both are documented in ?POSIXlt.)

You might also go through the string representation (which effectively obtains the local time from the epoch time), but that feels brittle, especially if the POSIXct value has attr(., 'tzone') set:

  # time zone information successfully lost and replaced by UTC
  Sys.time() |> format() |> as.POSIXct(tz = 'UTC')

-- 
Best regards,
Ivan
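To make the effect of the POSIXlt route visible, here is the same idea applied to a fixed timestamp instead of Sys.time(). The date and the "Europe/Madrid" zone are assumptions chosen only for illustration, and the sketch relies on the time zone database being available on the system.

```r
# A fixed instant, labelled with an illustrative local time zone
x <- as.POSIXct("2024-08-22 10:59:46", tz = "Europe/Madrid")

lt <- as.POSIXlt(x)           # broken-down local time (clock reading)
attr(lt, "tzone") <- "UTC"    # relabel the zone, keep the clock reading
y <- as.POSIXct(lt)           # new instant: same clock reading, now in UTC

format(x, "%Y-%m-%d %H:%M:%S %Z")
format(y, "%Y-%m-%d %H:%M:%S %Z")

# The stored number of seconds has changed by the former UTC offset
difftime(y, x, units = "hours")
```

The two format() calls should show the same clock reading with different zone labels, while difftime() shows that the underlying instant has moved.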
[R] paired raster boxplots
Dear community

I have two raster files (here r and r2: y-axis) and a common x-axis (here s, 4 classes). Instead of plotting two separate boxplots, I would like to plot paired boxplots: for each class, the boxplots from r and r2 side by side. I tried to adapt the code, but I am struggling with this error:

> names(rs) <- c('r', 's', 'r2')
Error in `names<-`(`*tmp*`, value = c("r", "s", "r2")) :
  incorrect number of layer names

Additionally, I am not sure how to adapt the function boxplot() to two rasters, r and r2, and not only r.

Kind regards
Sibylle

Working example:

  library(raster)
  r <- raster(nc=10, nr=5)
  r[] <- runif(ncell(r), min=10, max=20) * 2
  r2 <- raster(nc=10, nr=5)
  r2[] <- runif(ncell(r2), min=10, max=20) * 2
  s <- setValues(r, sample(c(1:4), replace = T, size=50))
  rr2s <- stack(r, r2,s)
  names(rs) <- c('r', 's', 'r2')
  d <- as.data.frame(rr2s)
  boxplot(r~s, data= d)
Re: [R] allequal diff
Dear Ben and Bert

Thanks very much for the interesting discussion.

Yes, that is why I was additionally using compareRaster(r1, r2) and then resample(r2, r1) to adapt the extent.

  library(raster)
  r1 <- raster("")
  r2 <- raster("f")
  compareRaster(r1, r2)
  extent(r1)
  extent(r2)
  r2_resampled <- resample(r2, r1)
  compareRaster(r1, r2_resampled)

compareRaster: Evaluate whether two or more Raster* objects have the same extent, number of rows and columns, projection, resolution, and origin (or a subset of these comparisons).

Kind regards
Sibylle

-----Original Message-----
From: R-help On Behalf Of Bert Gunter
Sent: Sunday, August 18, 2024 11:21 PM
To: Ben Bolker
Cc: r-help@r-project.org
Subject: Re: [R] allequal diff

Ah...I see. Perhaps, then, the maintainer should be contacted, as the desired functionality seems similar to that provided in other all.equal methods. I realize that this may often not elicit a (prompt) response.

-- Bert

On Sun, Aug 18, 2024 at 11:50 AM Ben Bolker wrote:
>
> The OP's original problem is that the all.equal method for raster
> objects (raster:::all.equal.raster), which is a wrapper around the
> compareRaster() function, compares a bunch of different properties of
> rasters (extent, resolution, values, etc.) and only returns a single
> overall logical (TRUE/FALSE) value. OP wanted to see the magnitude of
> the difference (as you could get for more typical all.equal methods by
> using tolerance=0), but in order to do that one has to dig into the
> code of compareRaster() and pull out code to make the particular
> comparisons one wants by applying all.equal to specific components of
> the raster (it would be nice if there were a built-in way to get this
> information, but I don't know of one)
>
> On 8/18/24 14:40, Bert Gunter wrote:
> > "Is it true that all.equal just compares y values?"
> >
> > The following may be a bit more than you may have wanted, but I hope
> > it is nevertheless useful.
> > > > The first place you should go to for questions like this is the Help > > system, not here, i.e. > > ?all.equal > > > > When you do this, you will find that all.equal() is a so-called S3 > > generic function, which, among other things, means that it works > > differently (i.e. "dispatches") depending on its (usually) first > > argument. So, for example, if the first argument is of (S3) class > > "numeric", it will call the default method, all.equal.default(); if > > it's a function, it will call all.equal.function(). Help for > > all.equal's basic methods is found in the single all.equal (base) > > Help page. However, for non-base R packages, there may be other > > different methods provided for classed objects, e.g. perhaps of class > > "raster" > > that would be found by ?all.equal.raster . Or maybe not, if the > > class "inherits" from another class, such as "matrix" (Warning: I am > > completely unfamiliar with the raster package, so these specifics > > are very likely wrong). > > > > To sort this sort of thing out, It would probably be useful for you > > to find a tutorial on R's S3 class system (which is really a form > > of multiple dispatch) and spend some time with it. There are many > > good ones out there. This S3 system is widely used within R and many > > packages, so doing this sort of homework now should serve you well > > in your future R journey. > > > > All IMO obviously. > > > > Cheers, > > Bert > > > > > > On Sun, Aug 18, 2024 at 11:00 AM SIBYLLE STÖCKLI via R-help > > wrote: > >> Dear Ivan > >> > >> Thanks a lot for this very nice example. > >> > >> Is it true that all.equal just compares y values? > >> Based on this help here I think so and the value I got is the difference > >> for the y-values. > >> https://www.statology.org/all-equal-function-r/ > >> > >> However, here I see x and y testing? 
> >> https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/ > >> all.equal I am actually interested in the x values (x-y > >> coordinates). Test if x-y coordinates of both 25-m-pixel rasters are the > >> same. Ther may be a small shift or differences in the number of decimal > >> places. > >> > >> Kind regards > >> Sibylle > >> > >> > >> > >> -Original Message- > >> From: Ivan Krylov > >> Sent: Friday, August 16, 2024 11:45 AM > >> To: sibylle.stoec...@gmx.ch > >> Cc: 'SIBYLLE STÖCKLI via
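The approach mentioned in this thread, applying all.equal() with tolerance = 0 to individual components rather than to whole raster objects, can be sketched with plain numeric vectors. The lists g1 and g2 are hypothetical stand-ins for the geometry of two rasters (they are not 'raster' objects), and the extent values and the 0.001 shift are invented for illustration.

```r
# Hypothetical stand-ins for the geometry of two 25 m rasters
g1 <- list(
  extent = c(xmin = 2485000, xmax = 2834000, ymin = 1075000, ymax = 1296000),
  res    = c(25, 25)
)
g2 <- g1
g2$extent <- g2$extent + 0.001   # a small shift of the coordinate grid

# With the default tolerance the tiny shift is treated as "equal"
isTRUE(all.equal(g1$extent, g2$extent))

# With tolerance = 0 the numeric method reports the mean relative difference
all.equal(g1$extent, g2$extent, tolerance = 0)

# The absolute shift in map units may be easier to interpret
max(abs(g1$extent - g2$extent))
```

Which components of a real Raster* object to compare in this way (extent, resolution, origin) is left open here; the sketch only shows how the numeric method of all.equal() behaves.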
Re: [R] allequal diff
Dear Ivan

Thanks a lot for this very nice example.

Is it true that all.equal just compares y values? Based on this help page I think so, and the value I got is the difference for the y-values.
https://www.statology.org/all-equal-function-r/

However, here I see x and y testing?
https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/all.equal

I am actually interested in the x values (x-y coordinates): I want to test whether the x-y coordinates of both 25-m-pixel rasters are the same. There may be a small shift or differences in the number of decimal places.

Kind regards
Sibylle

-----Original Message-----
From: Ivan Krylov
Sent: Friday, August 16, 2024 11:45 AM
To: sibylle.stoec...@gmx.ch
Cc: 'SIBYLLE STÖCKLI via R-help'
Subject: Re: [R] allequal diff

Written on Fri, 16 Aug 2024 11:32:58 +0200:

> # values and mask r1
> r1 <- getValues(r1)
> mask1 <- is.na(r1)
> # Do the same for r2
> r2 <- getValues(r2_resampled)
> mask2 <- is.na(r2)
>
> # Combine the masks
> all.equal(r1[!(mask1 & mask2)], r2[!(mask1 & mask2)])

Let's consider a more tangible example:

  # The vectors `x` and `y` start out equal
  x <- y <- 1:10
  # But then their different elements are made missing
  x[c(1,3,4)] <- NA
  y[c(3,8)] <- NA

Now, `is.na(x) & is.na(y)` gives the third element as the only element missing in both x and y:

  mask1 <- is.na(x)
  mask2 <- is.na(y)
  all.equal( # not the comparison you are looking for
   x[!(mask1 & mask2)], # still two more elements missing
   y[!(mask1 & mask2)]  # still one more element missing
  )

If you want to ignore all missing elements, you should combine the masks using the element-wise "or" operation ("missing in x and/or y"), not the element-wise "and" operation ("missing in both x and y at the same time"):

  mask1 & mask2 # drops element 3
  # [1] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  mask1 | mask2 # drops elements 1, 3, 4, 8
  # [1] TRUE FALSE TRUE TRUE FALSE FALSE FALSE TRUE FALSE FALSE

-- 
Best regards,
Ivan
Re: [R] Terminating a cmd windows from R
Many thanks for your help. I will consider the processx package.

Best regards
Nicolas

On 17 Aug 2024 at 16:07, Ivan Krylov wrote:

On Sat, 17 Aug 2024 11:47:30 + SIMON Nicolas via R-help wrote:

> nmrun <- paste0("cmd.exe /k ",nm_log,".bat ",nmi,".ctl ",nmi,".lst")

You are using the /k option that instructs cmd.exe to keep the command prompt open. Does the batch file contain an explicit "exit" to ensure that cmd.exe terminates?

> system(nmrun, invisible = F, show.output.on.console = T, wait = T)

With wait = TRUE, it should be possible to interrupt the process by pressing Ctrl+C in the cmd.exe window, but R itself will not be running your commands until system() returns (or is interrupted, terminating the process).

You can specify a timeout for a foreground process using the 'timeout' argument of the system() function.

If you'd like to manage a background process, consider the 'processx' CRAN package: https://cran.r-project.org/package=processx

-- 
Best regards,
Ivan
Re: [R] Terminating a cmd windows from R
On Sat, 17 Aug 2024 11:47:30 + SIMON Nicolas via R-help wrote:

> nmrun <- paste0("cmd.exe /k ",nm_log,".bat ",nmi,".ctl ",nmi,".lst")

You are using the /k option that instructs cmd.exe to keep the command prompt open. Does the batch file contain an explicit "exit" to ensure that cmd.exe terminates?

> system(nmrun, invisible = F, show.output.on.console = T, wait = T)

With wait = TRUE, it should be possible to interrupt the process by pressing Ctrl+C in the cmd.exe window, but R itself will not be running your commands until system() returns (or is interrupted, terminating the process).

You can specify a timeout for a foreground process using the 'timeout' argument of the system() function.

If you'd like to manage a background process, consider the 'processx' CRAN package: https://cran.r-project.org/package=processx

-- 
Best regards,
Ivan
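The 'timeout' argument mentioned above could be used roughly as follows. The wrapper run_with_timeout() is a hypothetical helper, not part of R; it relies on the behaviour documented in ?system that a timed-out command produces a warning and the exit status 124. The slow command here is an Rscript call that simply sleeps, standing in for the batch file from the question.

```r
# Hypothetical helper: run a command in the foreground with a time limit
run_with_timeout <- function(command, seconds) {
  # system() warns when the limit is hit; the status code is inspected instead
  status <- suppressWarnings(system(command, timeout = seconds))
  list(status = status, timed_out = status == 124L)
}

# Stand-in for a long-running program: an R process that sleeps for 5 seconds
slow <- paste(
  shQuote(file.path(R.home("bin"), "Rscript")),
  "-e", shQuote("Sys.sleep(5)")
)
res <- run_with_timeout(slow, seconds = 1)
res$timed_out
```

In the situation from the question, the command string would be the one built with paste0() there; whether /k needs to be replaced so that cmd.exe exits on its own is a separate matter raised above.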
Re: [R] Terminating a cmd windows from R
I execute a program using a batch file from R. The program may run indefinitely, so I need to be able to stop it.

  nm_log <- "c:/nm74g64/run/nmfe74"
  nmi <- "202"
  nmrun <- paste0("cmd.exe /k ",nm_log,".bat ",nmi,".ctl ",nmi,".lst")
  # run
  system(nmrun, invisible = F, show.output.on.console = T, wait = T)

On 17 Aug 2024 at 13:31, Duncan Murdoch wrote:

On 2024-08-17 6:21 a.m., SIMON Nicolas via R-help wrote:
> I would like to stop a dos shell windows following the cmd (execute) command.
> Is there a way to do that from R?

I think you need to give more detail on what you are trying to do. In Windows, "cmd" is supposed to open a shell window. Could you show us an example of what you are doing?

Duncan Murdoch
[R] Terminating a cmd windows from R
I would like to stop a dos shell windows following the cmd (execute) command. Is there a way to do that from R? __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide https://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] boxplot notch
That's not really a reprex, Sibylle. I did try to use it to see if I could work out what you were trying to do and help, but there is so much in there that I suspect is a distraction from the notch issue and its error message.

Please can you give us something stripped of all unnecessary things and tell us what you want? Something like data that we can read as a tribble() or from a dput() of your data, then only the packages you actually need for the plot (I think tidyverse alone might do), then a ggplot() call stripped right down to what you need, and a clear explanation of what you are trying to do in the geom_boxplot() call and how it uses the summarised data tibble.

It may even be that if you do that, you will find what's causing the problem! (I speak from bitter experience!!)

Very best (all),
Chris

On 16/08/2024 17:51, SIBYLLE STÖCKLI via R-help wrote:
> Farm_ID Jahr Bio QI_A
> 1 2015 1 9.5
> 2 2018 1 15.7
> 3 2020 1 21.5
> 1 2015 1 50.5
> 2 2018 1 12.9
> 3 2020 1 11.2
> 1 2015 1 30.6
> 2 2018 1 28.7
> 3 2020 1 29.8
> 1 2015 1 30.1
> 2 2018 1 NA
> 3 2020 1 16.9
> 1 2015 0 6.5
> 2 2018 0 7.9
> 3 2020 0 10.2
> 1 2015 0 11.2
> 2 2018 0 18.5
> 3 2020 0 29.5
> 1 2015 0 25.1
> 2 2018 0 16.1
> 3 2020 0 15.9
> 1 2015 0 10.1
> 2 2018 0 8.4
> 3 2020 0 3.5
> 1 2015 0 NA
> 2 2018 0 NA
> 3 2020 0 3.5
>
> Code
> setwd("C:/Users/Sibylle Stöckli/Desktop/")
> #.libPaths()
> getwd()
>
> # load libraries
> library("ggplot2")
> library("gridExtra")
> library(scales)
> library(nlme)
> library(arm)
> library(blmeco)
> library(stats)
> library(dplyr)
> library(ggpubr)
> library(patchwork)
> library(plotrix)
> library(tidyverse)
> library(dplyr)
>
> #read data
> MS = read.delim("Test1.txt", na.strings="NA")
> names(MS)
>
> MS$Jahr<-as.numeric(MS$Jahr)
> MS$Bio<-as.factor(MS$Bio)
> str(MS)
>
> # boxplot BFF QI
>
> MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
> MS1$Jahr<-as.factor(MS1$Jahr)
>
> MS1s <- MS1 %>%
>   group_by(MS1$Jahr, MS1$Bio) %>%
>   summarise(
>     y0 = quantile(QI_A, 0.05),
>     y25 = quantile(QI_A, 0.25),
>     y50 = mean(QI_A),
>     y75 = quantile(QI_A, 0.75),
>     y100 = quantile(QI_A, 0.95))
>
> MS1s
> colnames(MS1s)[1]<-"Jahr"
> colnames(MS1s)[2]<-"Bio"
> MS1s
>
> p1<-ggplot(MS1s, aes(Jahr, fill = as.factor(Bio))) +
>   geom_boxplot(
>     aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
>     stat = "identity", notch=TRUE
>   ) +
>   theme(panel.background = element_blank())+
>   theme(axis.line = element_line(colour = "black"))+
>   theme(axis.text=element_text(size=18))+
>   theme(axis.title=element_text(size=20))+
>   ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
>   scale_color_manual(values=c("red","darkgreen"), labels=c("ÖLN", "BIO"))+
>   scale_fill_manual(values=c("red","darkgreen"), labels= c("ÖLN", "BIO"))+
>   theme(legend.title = element_blank())+
>   theme(legend.text=element_text(size=20))
> p1<-p1 + expand_limits(y=c(0, 80))
> p1

-- 
Chris Evans (he/him)
Visiting Professor, UDLA, Quito, Ecuador & Honorary Professor, University of Roehampton, London, UK.
Work web site: https://www.psyctc.org/psyctc/
CORE site: http://www.coresystemtrust.org.uk/
Personal site: https://www.psyctc.org/pelerinage2016/
Emeetings (Thursdays): https://www.psyctc.org/psyctc/booking-meetings-with-me/
(Beware: French time, generally an hour ahead of UK)
<https://ombook.psyctc.org/book>
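A stripped-down example of the kind requested above might look like the following. It uses only base R and the 2015 rows of the posted data (without the NA row). It is an assumption on the editor's part, not something established in the thread, that the relevant difference is this: a notched boxplot needs limits for the notch in addition to the five box numbers, and boxplot.stats() returns both.

```r
# A small subset of the posted data, read inline instead of from Test1.txt
MS <- read.table(header = TRUE, text = "
Jahr Bio QI_A
2015 1  9.5
2015 1 50.5
2015 1 30.6
2015 1 30.1
2015 0  6.5
2015 0 11.2
2015 0 25.1
2015 0 10.1
")

# Per-group box statistics: $stats holds the five box numbers,
# $conf holds the notch limits (median -/+ 1.58 * IQR / sqrt(n))
by_group <- lapply(split(MS$QI_A, MS$Bio), boxplot.stats)
by_group[["1"]]$stats
by_group[["1"]]$conf

# Base R computes the notches itself when given the raw values;
# with so few points per group it may warn that notches leave the box
boxplot(QI_A ~ Bio, data = MS, notch = TRUE)
```

The summarised table in the question contains y0 to y100 but nothing corresponding to $conf, which may be worth checking when stat = "identity" is combined with notch = TRUE.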
Re: [R] boxplot notch
Thanks Ben,

Here is the reproducible example. It works without notch=TRUE, but gives an error with notch=TRUE:

Error in `geom_boxplot()`:
! Problem while converting geom to grob.
ℹ Error occurred in the 1st layer.
Caused by error in `ans[ypos] <- rep(yes, length.out = len)[ypos]`:
! replacement has length zero
Run `rlang::last_trace()` to see where the error occurred.
Warning message:
In rep(yes, length.out = len) :
  'x' is NULL so the result will be NULL

Data

Farm_ID Jahr Bio QI_A
1 2015 1 9.5
2 2018 1 15.7
3 2020 1 21.5
1 2015 1 50.5
2 2018 1 12.9
3 2020 1 11.2
1 2015 1 30.6
2 2018 1 28.7
3 2020 1 29.8
1 2015 1 30.1
2 2018 1 NA
3 2020 1 16.9
1 2015 0 6.5
2 2018 0 7.9
3 2020 0 10.2
1 2015 0 11.2
2 2018 0 18.5
3 2020 0 29.5
1 2015 0 25.1
2 2018 0 16.1
3 2020 0 15.9
1 2015 0 10.1
2 2018 0 8.4
3 2020 0 3.5
1 2015 0 NA
2 2018 0 NA
3 2020 0 3.5

Code

  setwd("C:/Users/Sibylle Stöckli/Desktop/")
  #.libPaths()
  getwd()

  # load libraries
  library("ggplot2")
  library("gridExtra")
  library(scales)
  library(nlme)
  library(arm)
  library(blmeco)
  library(stats)
  library(dplyr)
  library(ggpubr)
  library(patchwork)
  library(plotrix)
  library(tidyverse)
  library(dplyr)

  #read data
  MS = read.delim("Test1.txt", na.strings="NA")
  names(MS)

  MS$Jahr<-as.numeric(MS$Jahr)
  MS$Bio<-as.factor(MS$Bio)
  str(MS)

  # boxplot BFF QI

  MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
  MS1$Jahr<-as.factor(MS1$Jahr)

  MS1s <- MS1 %>%
    group_by(MS1$Jahr, MS1$Bio) %>%
    summarise(
      y0 = quantile(QI_A, 0.05),
      y25 = quantile(QI_A, 0.25),
      y50 = mean(QI_A),
      y75 = quantile(QI_A, 0.75),
      y100 = quantile(QI_A, 0.95))

  MS1s
  colnames(MS1s)[1]<-"Jahr"
  colnames(MS1s)[2]<-"Bio"
  MS1s

  p1<-ggplot(MS1s, aes(Jahr, fill = as.factor(Bio))) +
    geom_boxplot(
      aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
      stat = "identity", notch=TRUE
    ) +
    theme(panel.background = element_blank())+
    theme(axis.line = element_line(colour = "black"))+
    theme(axis.text=element_text(size=18))+
    theme(axis.title=element_text(size=20))+
    ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
    scale_color_manual(values=c("red","darkgreen"), labels=c("ÖLN", "BIO"))+
    scale_fill_manual(values=c("red","darkgreen"), labels= c("ÖLN", "BIO"))+
    theme(legend.title = element_blank())+
    theme(legend.text=element_text(size=20))
  p1<-p1 + expand_limits(y=c(0, 80))
  p1

-----Original Message-----
From: R-help On Behalf Of Ben Bolker
Sent: Friday, August 16, 2024 3:30 PM
To: r-help@r-project.org
Subject: Re: [R] boxplot notch

I don't see anything obviously wrong here. There may be something subtle, but we probably won't be able to help without a reproducible example ...

On 2024-08-16 9:24 a.m., SIBYLLE STÖCKLI via R-help wrote:
> Dear community
>
> I tried the following code using geom_boxplot() and notch=TRUE. Does
> anyone know if the command notch=TRUE is at the wrong place in my
> special code construct?
>
> Without notch=TRUE the code provides the planned ggplot.
>
> Kind regards
> Sibylle
>
> Code:
>
> MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
> MS1$Jahr<-as.factor(MS1$Jahr)
>
> MS1s <- MS1 %>%
>   group_by(MS1$Jahr, MS1$Bio) %>%
>   summarise(
>     y0 = quantile(QI_A, 0.05),
>     y25 = quantile(QI_A, 0.25),
>     y50 = mean(QI_A),
>     y75 = quantile(QI_A, 0.75),
>     y100 = quantile(QI_A, 0.95))
>
> MS1s
> colnames(MS1s)[1]<-"Jahr"
> colnames(MS1s)[2]<-"Bio"
> MS1s
>
> p1<-ggplot(MS1s, aes(Jahr, fill = as.factor(Bio))) +
>   geom_boxplot(
>     aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
>     stat = "identity", notch=TRUE
>   ) +
>   theme(panel.background = element_blank())+
>   theme(axis.line = element_line(colour = "black"))+
>   theme(axis.text=element_text(size=18))+
>   theme(axis.title=element_text(size=20))+
>   ylab("Anteil BFF an LN [%]") +xlab("J
[R] boxplot notch
Dear community

I tried the following code using geom_boxplot() and notch=TRUE. Does anyone know if the command "notch=TRUE" is at the wrong place in my special code construct?

Without notch=TRUE the code provides the planned ggplot.

Kind regards
Sibylle

Code:

  MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
  MS1$Jahr<-as.factor(MS1$Jahr)

  MS1s <- MS1 %>%
    group_by(MS1$Jahr, MS1$Bio) %>%
    summarise(
      y0 = quantile(QI_A, 0.05),
      y25 = quantile(QI_A, 0.25),
      y50 = mean(QI_A),
      y75 = quantile(QI_A, 0.75),
      y100 = quantile(QI_A, 0.95))

  MS1s
  colnames(MS1s)[1]<-"Jahr"
  colnames(MS1s)[2]<-"Bio"
  MS1s

  p1<-ggplot(MS1s, aes(Jahr, fill = as.factor(Bio))) +
    geom_boxplot(
      aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
      stat = "identity", notch=TRUE
    ) +
    theme(panel.background = element_blank())+
    theme(axis.line = element_line(colour = "black"))+
    theme(axis.text=element_text(size=18))+
    theme(axis.title=element_text(size=20))+
    ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
    scale_color_manual(values=c("red","darkgreen"), labels=c("ÖLN", "BIO"))+
    scale_fill_manual(values=c("red","darkgreen"), labels= c("ÖLN", "BIO"))+
    theme(legend.title = element_blank())+
    theme(legend.text=element_text(size=20))
  p1<-p1 + expand_limits(y=c(0, 80))
  p1
Re: [R] allequal diff
Written on Fri, 16 Aug 2024 11:32:58 +0200:

> # values and mask r1
> r1 <- getValues(r1)
> mask1 <- is.na(r1)
> # Do the same for r2
> r2 <- getValues(r2_resampled)
> mask2 <- is.na(r2)
>
> # Combine the masks
> all.equal(r1[!(mask1 & mask2)], r2[!(mask1 & mask2)])

Let's consider a more tangible example:

  # The vectors `x` and `y` start out equal
  x <- y <- 1:10
  # But then their different elements are made missing
  x[c(1,3,4)] <- NA
  y[c(3,8)] <- NA

Now, `is.na(x) & is.na(y)` gives the third element as the only element missing in both x and y:

  mask1 <- is.na(x)
  mask2 <- is.na(y)
  all.equal( # not the comparison you are looking for
   x[!(mask1 & mask2)], # still two more elements missing
   y[!(mask1 & mask2)]  # still one more element missing
  )

If you want to ignore all missing elements, you should combine the masks using the element-wise "or" operation ("missing in x and/or y"), not the element-wise "and" operation ("missing in both x and y at the same time"):

  mask1 & mask2 # drops element 3
  # [1] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  mask1 | mask2 # drops elements 1, 3, 4, 8
  # [1] TRUE FALSE TRUE TRUE FALSE FALSE FALSE TRUE FALSE FALSE

-- 
Best regards,
Ivan
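Carrying the toy example one step further, the comparison with the "or" mask might be written as below. The variable name keep is introduced here for illustration only; everything else follows the vectors defined in the message.

```r
# Same toy vectors as in the message above
x <- y <- 1:10
x[c(1, 3, 4)] <- NA
y[c(3, 8)] <- NA

# Keep only positions where neither vector is missing
keep <- !(is.na(x) | is.na(y))

which(!keep)                  # positions dropped from the comparison: 1 3 4 8
all.equal(x[keep], y[keep])   # TRUE: the remaining elements agree
```

With the raster values from the question, x and y would be the outputs of getValues() for the two rasters, assuming both are already on the same grid.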
Re: [R] allequal diff
Cool, thanks

  # values and mask r1
  r1 <- getValues(r1)
  mask1 <- is.na(r1)
  # Do the same for r2
  r2 <- getValues(r2_resampled)
  mask2 <- is.na(r2)

  # Combine the masks
  all.equal(r1[!(mask1 & mask2)], r2[!(mask1 & mask2)])

Output:

> all.equal(r1[!(mask1 & mask2)], r2[!(mask1 & mask2)])
[1] "'is.NA' value mismatch: 389 in current 56989152 in target"

--> so there is just a mismatch in NA, not in the xy pixels, right?

Sibylle

-----Original Message-----
From: Ivan Krylov
Sent: Friday, August 16, 2024 10:51 AM
To: sibylle.stoec...@gmx.ch
Cc: 'SIBYLLE STÖCKLI via R-help'
Subject: Re: [R] allequal diff

Written on Fri, 16 Aug 2024 10:35:35 +0200:

> what do you mean by use is.na() in getValues(). So I need to call
> getValues a second time?

Not necessarily, but it's one of the options. I was thinking along the lines of:

  values1 <- getValues(r1)
  mask1 <- is.na(values1)
  # Do the same for r2
  # Combine the masks
  all.equal(values1[!combined_mask], values2[!combined_mask])

Unlike compareRaster(), this assumes that the coordinate grid of r1 and r2 is already the same and that only some of the values may differ.

> I suppose you mean to first prepare a mask using is.na without
> getValues and then in the second step your code?

'raster' documentation says that is.na() works on raster objects, so it should work. Even if it didn't work, since you already access the underlying data using getValues() and then compare the resulting vectors using all.equal(), using is.na(getValues(...)) should definitely work.

-- 
Best regards,
Ivan
Re: [R] allequal diff
Here is my idea, including the error:

> m1=r1[r1="NA",]
> m2=r2_resampled[r2_resampled="NA",]
>
> all.equal(getValues(r1)[!m1], getValues(r2_resampled)[!m2])
[1] "Numeric: lengths (80706867, 65806339) differ"

-----Original Message-----
From: R-help On Behalf Of SIBYLLE STÖCKLI via R-help
Sent: Friday, August 16, 2024 10:36 AM
To: 'Ivan Krylov' ; 'SIBYLLE STÖCKLI via R-help'
Subject: Re: [R] allequal diff

Many thanks Ivan

Use is.na() on getValues() outputs, combine the two masks using the | operator to get a mask of values that are missing in either raster, then negate the mask to choose the non-missing values:

  all.equal(getValues(r1)[!mask], getValues(r2)[!mask])

--> what do you mean by use is.na() in getValues(). So I need to call getValues a second time? I suppose you mean to first prepare a mask using is.na without getValues and then in the second step your code?

Kind regards
Sibylle

-----Original Message-----
From: Ivan Krylov <ikry...@disroot.org>
Sent: Friday, August 16, 2024 9:28 AM
To: SIBYLLE STÖCKLI via R-help <r-help@r-project.org>
Cc: sibylle.stoec...@gmx.ch
Subject: Re: [R] allequal diff

On Fri, 16 Aug 2024 07:19:38 +0200, SIBYLLE STÖCKLI via R-help <r-help@r-project.org> wrote:

> Is it possible to consider na.rm=TRUE?
>
> all.equal(getValues(r1), getValues(r2_resampled), tolerance = 0)
>
> [1] "'is.NA' value mismatch: 9544032 in current 66532795 in target"

Use is.na() on getValues() outputs, combine the two masks using the | operator to get a mask of values that are missing in either raster, then negate the mask to choose the non-missing values:

  all.equal(getValues(r1)[!mask], getValues(r2)[!mask])

-- 
Best regards,
Ivan
Re: [R] allequal diff
Written on Fri, 16 Aug 2024 10:35:35 +0200:

> what do you mean by use is.na() in getValues(). So I need to call
> getValues a second time?

Not necessarily, but it's one of the options. I was thinking along the lines of:

  values1 <- getValues(r1)
  mask1 <- is.na(values1)
  # Do the same for r2
  # Combine the masks
  all.equal(values1[!combined_mask], values2[!combined_mask])

Unlike compareRaster(), this assumes that the coordinate grid of r1 and r2 is already the same and that only some of the values may differ.

> I suppose you mean to first prepare a mask using is.na without
> getValues and then in the second step your code?

'raster' documentation says that is.na() works on raster objects, so it should work. Even if it didn't work, since you already access the underlying data using getValues() and then compare the resulting vectors using all.equal(), using is.na(getValues(...)) should definitely work.

-- 
Best regards,
Ivan
Re: [R] allequal diff
Many thanks Ivan

> Use is.na() on getValues() outputs, combine the two masks using the |
> operator to get a mask of values that are missing in either raster,
> then negate the mask to choose the non-missing values:
> all.equal(getValues(r1)[!mask], getValues(r2)[!mask])

What do you mean by using is.na() on getValues()? Do I need to call getValues() a second time? I suppose you mean to first prepare a mask using is.na() without getValues() and then, in a second step, run your code?

Kind regards
Sibylle

-----Original Message-----
From: Ivan Krylov
Sent: Friday, August 16, 2024 9:28 AM
To: SIBYLLE STÖCKLI via R-help
Cc: sibylle.stoec...@gmx.ch
Subject: Re: [R] allequal diff

On Fri, 16 Aug 2024 07:19:38 +0200, SIBYLLE STÖCKLI via R-help wrote:

> Is it possible to consider na.rm=TRUE?
> > all.equal(getValues(r1), getValues(r2_resampled), tolerance = 0)
> > [1] "'is.NA' value mismatch: 9544032 in current 66532795 in target"

Use is.na() on getValues() outputs, combine the two masks using the | operator to get a mask of values that are missing in either raster, then negate the mask to choose the non-missing values:

    all.equal(getValues(r1)[!mask], getValues(r2)[!mask])

--
Best regards,
Ivan
Re: [R] allequal diff
On Fri, 16 Aug 2024 07:19:38 +0200, SIBYLLE STÖCKLI via R-help wrote:

> Is it possible to consider na.rm=TRUE?
> > all.equal(getValues(r1), getValues(r2_resampled), tolerance = 0)
> > [1] "'is.NA' value mismatch: 9544032 in current 66532795 in target"

Use is.na() on getValues() outputs, combine the two masks using the | operator to get a mask of values that are missing in either raster, then negate the mask to choose the non-missing values:

    all.equal(getValues(r1)[!mask], getValues(r2)[!mask])

--
Best regards,
Ivan
Re: [R] allequal diff
Dear Ben

Many thanks. I see that a second challenge is the NA values. Is it possible to consider na.rm=TRUE?

    > r2_resampled <- resample(r2, r1)
    > compareRaster(r1, r2_resampled)
    [1] TRUE
    > all.equal(getValues(r1), getValues(r2_resampled), tolerance = 0)
    [1] "'is.NA' value mismatch: 9544032 in current 66532795 in target"

Kind regards
Sibylle

-----Original Message-----
From: R-help On Behalf Of Ben Bolker
Sent: Friday, August 16, 2024 1:06 AM
To: r-help@r-project.org
Subject: Re: [R] allequal diff

Digging into the code for raster::compareRaster():

    library(raster)
    r <- raster(ncol=3, nrow=3)
    values(r) <- 1:ncell(r)
    r2 <- r
    values(r2) <- c(1:8,10)
    all.equal(getValues(r), getValues(r2), tolerance = 0)
    [1] "Mean relative difference: 0.111"

compareRaster has fancier machinery internally for doing the comparison for large rasters a block at a time if everything can't fit in memory at the same time ...

On 2024-08-15 9:14 a.m., SIBYLLE STÖCKLI via R-help wrote:
> Dear community
>
> Similar to the example of the rdocumentation, my idea is to use
> all.equal and to print the difference.
>
> https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/all.equal
>
>     > d45 <- pi*(1/4 + 1:10)
>     > stopifnot(
>     + all.equal(tan(d45), rep(1, 10))) # TRUE, but
>     > all (tan(d45) == rep(1, 10)) # FALSE, since not exactly
>     [1] FALSE
>     > all.equal(tan(d45), rep(1, 10), tolerance = 0) # to see difference
>     [1] "Mean relative difference: 1.29526e-15"
>
> Unfortunately, I just get "FALSE" not the difference.
>
>     > r2_resampled <- resample(r2, r1)
>     > compareRaster(r1, r2_resampled)
>     [1] TRUE
>     > # Compare rasters
>     > result <- all.equal(r1, r2_resampled)
>     Warning message:
>     In compareRaster(target, current, ..., values = values, stopiffalse = stopiffalse, :
>       not all objects have the same values
>     > print(result)
>     [1] FALSE
>
> Kind regards
>
> Sibylle

--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
> E-mail is sent at my convenience; I don't expect replies outside of working hours.
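Ben's comparison can also be tried without the raster package. The sketch below uses plain vectors with the same values as his 3 x 3 example (an assumption made only so that the code is self-contained). It illustrates that all.equal() on numeric vectors returns either TRUE or a character description of the difference, so isTRUE() is needed when a plain logical is wanted.

```r
# Same cell values as in the raster example, as plain numeric vectors
v1 <- as.numeric(1:9)
v2 <- c(1:8, 10)

res <- all.equal(v1, v2, tolerance = 0)
print(res)                                   # a description, not FALSE
print(isTRUE(res))                           # FALSE
print(isTRUE(all.equal(v1, v1, tolerance = 0)))  # TRUE
```

This is why calling all.equal() on the extracted values shows the size of the difference, while the method dispatched for raster objects in the original post only returned FALSE.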
Re: [R] [EXTERNAL] Re: Very strange behavior of 'rep'
It's where n.per.grp is first calculated. I rounded. Gosh do I feel stupid. Thanks to all who weighed in.

Best Regards,
Grant Izmirlian

From: Duncan Murdoch
Sent: Thursday, August 15, 2024 2:59 PM
To: Izmirlian, Grant (NIH/NCI) [E]; r-help@r-project.org
Subject: [EXTERNAL] Re: [R] Very strange behavior of 'rep'

I also can't reproduce this. I'd guess that one of your values of n.per.grp or n.tt only prints as the values you showed, but is actually a little smaller. For example,

    n.per.grp <- 108 - 1.e-14
    n.per.grp
    #> [1] 108
    n.tt <- 5
    n.per.grp*n.tt
    #> [1] 540
    length(rep(0:1, each=n.per.grp*n.tt))
    #> [1] 1078

Duncan Murdoch

Created on 2024-08-15 with reprex v2.1.1 (https://reprex.tidyverse.org/)

On 2024-08-15 2:39 p.m., Izmirlian, Grant (NIH/NCI) [E] via R-help wrote:
> This is very weird. I was running a swarm job on the cluster and it bombed
> only for n.per.grp=108, not for the other values. Even though
> n.per.grp*n.tt is 540, so that the length of the call to 'rep'
> should be 1080, I'm getting a vector of length 1078.
>
>     n.per.grp <- 108
>     n.tt <- 5
>     n.per.grp*n.tt
>     length(rep(0:1, each=n.per.grp*n.tt))
>     length(rep(0:1, each=108*5))
>
> --please do not edit the information below--
>
> R Version:
>  platform = x86_64-pc-linux-gnu
>  arch = x86_64
>  os = linux-gnu
>  system = x86_64, linux-gnu
>  status =
>  major = 4
>  minor = 4.1
>  year = 2024
>  month = 06
>  day = 14
>  svn rev = 86737
>  language = R
>  version.string = R version 4.4.1 (2024-06-14)
>  nickname = Race for Your Life
>
> Locale:
>  LC_CTYPE=C.UTF-8;LC_NUMERIC=C;LC_TIME=C.UTF-8;LC_COLLATE=C.UTF-8;LC_MONETARY=C.UTF-8;LC_MESSAGES=C.UTF-8;LC_PAPER=C.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=C.UTF-8;LC_IDENTIFICATION=C
>
> Search Path:
>  .GlobalEnv, package:lme4, package:Matrix, package:stats,
>  package:graphics, package:grDevices, package:utils, package:datasets,
>  package:showtext, package:showtextdb, package:sysfonts,
>  package:methods, Autoloads, package:base
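The effect Duncan describes can be sketched as follows, reusing his value of n.per.grp. The round() call is one possible remedy and is an assumption on my part about what "I rounded" refers to; the names each_raw, len_raw and len_rounded are made up for this illustration.

```r
n.per.grp <- 108 - 1e-14   # prints as 108 but is slightly smaller
n.tt <- 5

each_raw <- n.per.grp * n.tt
print(each_raw)            # displayed as 540
print(each_raw < 540)      # TRUE: the stored value is just below 540

# rep() coerces 'each' to an integer by truncation, so 539 is used
len_raw <- length(rep(0:1, each = each_raw))

# Rounding first restores the intended count
len_rounded <- length(rep(0:1, each = round(each_raw)))

print(c(raw = len_raw, rounded = len_rounded))
```

Whenever a count is the result of floating-point arithmetic, rounding it explicitly before using it as a length or repetition count avoids this kind of silent truncation.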
[R] allequal diff
Dear community

Similar to the example of the rdocumentation, my idea is to use all.equal and to print the difference.

https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/all.equal

    > d45 <- pi*(1/4 + 1:10)
    > stopifnot(
    + all.equal(tan(d45), rep(1, 10))) # TRUE, but
    > all (tan(d45) == rep(1, 10)) # FALSE, since not exactly
    [1] FALSE
    > all.equal(tan(d45), rep(1, 10), tolerance = 0) # to see difference
    [1] "Mean relative difference: 1.29526e-15"

Unfortunately, I just get "FALSE" not the difference.

    > r2_resampled <- resample(r2, r1)
    > compareRaster(r1, r2_resampled)
    [1] TRUE
    > # Compare rasters
    > result <- all.equal(r1, r2_resampled)
    Warning message:
    In compareRaster(target, current, ..., values = values, stopiffalse = stopiffalse, :
      not all objects have the same values
    > print(result)
    [1] FALSE

Kind regards
Sibylle
[R] geom_boxplot nocht=TRUE
Dear community

I tried to run my ggplot() + geom_boxplot() code using notch=TRUE, but probably my term notch=TRUE is at the wrong position?

Error:

    Error in `geom_boxplot()`:
    ! Problem while converting geom to grob.
    ℹ Error occurred in the 1st layer.
    Caused by error in `ans[ypos] <- rep(yes, length.out = len)[ypos]`:
    ! replacement has length zero
    Run `rlang::last_trace()` to see where the error occurred.
    Warning message:
    In rep(yes, length.out = len) : 'x' is NULL so the result will be NULL

    > rlang::last_trace()
    Error in `geom_boxplot()`:
    ! Problem while converting geom to grob.
    ℹ Error occurred in the 1st layer.
    Caused by error in `ans[ypos] <- rep(yes, length.out = len)[ypos]`:
    ! replacement has length zero
    ---
    Backtrace:
    ▆
     1. ├─base (local) ``(x)
     2. └─ggplot2:::print.ggplot(x)
     3. ├─ggplot2::ggplot_gtable(data)
     4. └─ggplot2:::ggplot_gtable.ggplot_built(data)
     5. └─ggplot2:::by_layer(...)
     6. ├─rlang::try_fetch(...)
     7. │ ├─base::tryCatch(...)
     8. │ │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
     9. │ │ └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
    10. │ │ └─base (local) doTryCatch(return(expr), name, parentenv, handler)
    11. │ └─base::withCallingHandlers(...)
    12. └─ggplot2 (local) f(l = layers[[i]], d = data[[i]])
    13. └─l$draw_geom(d, layout)
    14. └─ggplot2 (local) draw_geom(..., self = self)
    15. └─self$geom$draw_layer(...)
    16. └─ggplot2 (local) draw_layer(..., self = self)
    17. └─base::lapply(...)
    18. └─ggplot2 (local) FUN(X[[i]], ...)
    19. ├─rlang::inject(self$draw_panel(data, panel_params, coord, !!!params))
    20. └─self$draw_panel(...)
    21. └─ggplot2 (local) draw_panel(..., self = self)
    22. └─base::lapply(...)
    23. └─ggplot2 (local) FUN(X[[i]], ...)
    24. └─self$draw_group(group, panel_params, coord, ...)
    25. └─ggplot2 (local) draw_group(..., self = self)
    26. ├─ggplot2:::data_frame0(...)
    27. │ └─vctrs::data_frame(..., .name_repair = "minimal")
    28. │ └─rlang::list2(...)
    29. └─base::ifelse(notch, data$notchlower, NA)
    Run rlang::last_trace(drop = FALSE) to see 5 hidden frames.

Code:

    MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
    MS1$Jahr<-as.factor(MS1$Jahr)
    MS1s <- MS1 %>% group_by(MS1$Jahr, MS1$Bio) %>%
      summarise(
        y0 = quantile(QI_A, 0.05),
        y25 = quantile(QI_A, 0.25),
        y50 = mean(QI_A),
        y75 = quantile(QI_A, 0.75),
        y100 = quantile(QI_A, 0.95))
    MS1s
    colnames(MS1s)[1]<-"Jahr"
    colnames(MS1s)[2]<-"Bio"
    MS1s
    p1<-ggplot(MS1s, aes(Jahr, fill = as.factor(Bio))) +
      geom_boxplot(
        aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
        stat = "identity", notch=TRUE
      ) +
      theme(panel.background = element_blank())+
      theme(axis.line = element_line(colour = "black"))+
      theme(axis.text=element_text(size=18))+
      theme(axis.title=element_text(size=20))+
      ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
      scale_color_manual(values=c("red","darkgreen"), labels=c("ÖLN", "BIO"))+
      scale_fill_manual(values=c("red","darkgreen"), labels= c("ÖLN", "BIO"))+
      theme(legend.title = element_blank())+
      theme(legend.text=element_text(size=20))
    p1<-p1 + expand_limits(y=c(0, 80))
    p1
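The last frame of the backtrace, ifelse(notch, data$notchlower, NA), suggests that the layer looks for a notchlower column that a pre-summarised data frame with stat = "identity" does not contain. The sketch below only shows how such notch limits could be computed in base R, using the median +/- 1.58 * IQR / sqrt(n) rule given in the geom_boxplot documentation. The data, the helper name notch_summary and the choice of the median (the posted code uses the mean for y50) are all assumptions for illustration. Whether mapping the resulting columns with aes(notchlower = ..., notchupper = ...) resolves the error was not confirmed in the thread.

```r
# Hypothetical data standing in for MS1 (QI_A by Jahr); not from the post
set.seed(1)
dat <- data.frame(Jahr = factor(rep(c(2019, 2020), each = 50)),
                  QI_A = c(rnorm(50, 15, 5), rnorm(50, 25, 5)))

# Hypothetical helper: box statistics plus notch limits for one group
notch_summary <- function(d) {
  q <- quantile(d$QI_A, c(0.05, 0.25, 0.5, 0.75, 0.95), names = FALSE)
  half <- 1.58 * (q[4] - q[2]) / sqrt(nrow(d))   # half-width of the notch
  data.frame(Jahr = d$Jahr[1],
             y0 = q[1], y25 = q[2], y50 = q[3], y75 = q[4], y100 = q[5],
             notchlower = q[3] - half, notchupper = q[3] + half)
}

# One row per year, including the two notch columns
summ <- do.call(rbind, lapply(split(dat, dat$Jahr), notch_summary))
print(summ)
```

The simpler alternative, if the raw data are available at plotting time, is to let geom_boxplot() compute the statistics itself from QI_A instead of using stat = "identity".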
Re: [R] [EXTERNAL] Re: Very strange behavior of 'rep'
Ok, to be fair, it looks like I need to load everything and reproduce exactly as it's occurring. I suspect it's a weird memory hole in one of the loaded packages. Here, this should do it.

    "%,%" <- paste0

    "factorial.design" <- function(...) {
        m <- match.call()
        cc <- m
        cc[[1]] <- as.name("c")
        arg.vals <- eval(cc, sys.parent())
        n.args <- length(arg.vals)
        n.l <- 0
        n.r <- n.args-1
        fwd.cpd <- c(1, cumprod(arg.vals))
        rev.cpd <- c(cumprod(arg.vals[n.args:2])[(n.args-1):1],1)
        "%,%" <- paste0
        main.call <- as.call(expression(cbind))
        for(k in 1:n.args) {
            arg.k <- eval(m[[1+k]], sys.parent())
            rep.call <- as.call(expression(rep))
            rep.call$x <- 1:arg.k
            if(n.l>0) rep.call$times <- fwd.cpd[k]
            if(n.r>0) rep.call$each <- rev.cpd[k]
            main.call[["x" %,% k]] <- rep.call
            n.l <- n.l + 1
            n.r <- n.r - 1
        }
        eval(main.call)
    }

    add.zeros <- function(x,B) {
        do.one <- function(x,B) {
            delta <- floor(logb(B, 10)) - floor(logb(x, 10))
            z <- ""
            if(delta > 0) z <- paste(rep("0", delta), collapse="")
            z%,%x
        }
        if(length(x)==1) ans <- do.one(x,B)
        if(length(x)>1) ans <- sapply(x, FUN=do.one, B=B)
        ans
    }

    library(lme4)
    rho.ICC <- 0.5
    s2.b <- 0.5*rho.ICC/(1-rho.ICC)
    s2.e <- 0.5
    delta <- 0.27
    n.per.grp.lst <- (264/2*(1-450/9900*(5:0)))

    ## I will different values of 'h' and at these contrasts as well as flat
    ## I want to compare the AUC approach with just the test of the main effects constant arm coefficient
    tt.seq <- c(0,3,6,12,18)
    n.tt <- length(tt.seq)
    cntr <- c(0.5,1,1,1,0.5)
    ## (delta1 + 2*delta2 + 2*delta3 + 2*delta4 + delta5)/2 = 0.27
    ## delta1 + 2*delta2 + 2*delta3 + 2*delta4 + delta5 = 2*0.27
    ## (delta1 + 2*delta2 + 2*delta3 + 2*delta4 + delta5)/8 = 2*0.27/8 = 0.27/4
    H <- 0.08+0.005*(0:6)
    shapes <- list(INC= (-floor(n.tt/2)):floor(n.tt/2),
                   DEC= floor(n.tt/2):(-floor(n.tt/2)),
                   FLT= rep(0,n.tt),
                   PLT= c(-floor(n.tt/2):0,rep(floor(n.tt/2)+1, n.tt-floor(n.tt/2)-1)))
    trnds <- names(shapes)
    delta0.j <- rep(5, n.tt)
    n.n.lst <- length(n.per.grp.lst)
    n.trnds <- length(trnds)
    n.H <- length(H)

    ## We're swarming over conditions. It works like this. This is the
    ## full list of conditions as a factorial design list. You have to
    ## determine the total number of conditions and then set up that many
    ## directories in the swarm folder
    conds <- factorial.design(n.n.lst, n.trnds, n.H)
    ## currently with n.n.lst=6, n.trnds=4, n.H=7 that gives
    ## length(conds)=6x4x7=168
    ## then one copy of this file sits in each of the subdirectories.
    ## Parameters for this run are determined by getting the subdirectory
    ## number and then pulling the parameter indices from that element
    ## in the factorial design.
    node <- 29
    n.per.grp <- n.per.grp.lst[conds[node, 1]]
    trnd <- trnds[conds[node, 2]]
    h <- H[conds[node, 3]]
    delta.j <- delta + h*shapes[[trnd]]

    ## covariates:
    ID.i <- c(outer(rep(1:n.per.grp,each=n.tt),c(0,n.per.grp),FUN="+"))
    arm.i <- rep(0:1, each=n.per.grp*n.tt)
    length(arm.i)

____
From: Rui Barradas
Sent: Thursday, August 15, 2024 2:51 PM
To: Izmirlian, Grant (NIH/NCI) [E]; r-help@r-project.org
Subject: [EXTERNAL] Re: [R] Very strange behavior of 'rep'

At 19:39 on 15/08/2024, Izmirlian, Grant (NIH/NCI) [E] via R-help wrote:
> This is very weird. I was running a swarm job on the cluster and it bombed
> only for n.per.grp=108, not for the other values. Even though
> n.per.grp*n.tt is 540, so that the length of the call to 'rep'
> should be 1080, I'm getting a vector of length 1078.
> n.per.grp <- 108 > n.tt <- 5 > n.per.grp*n.tt > length(rep(0:1, each=n.per.grp*n.tt)) > length(rep(0:1, each=108*5)) >>> \n<>\n\n\n\n > --please do not edit the information below-- > > R Version: > platform = x86_64-pc-linux-gnu > arch = x86_64 > os = linux-gnu > system = x86_64, linux-gnu > status = > major = 4 > minor = 4.1 > year = 2024 > month = 06 > day = 14 > svn rev = 86737 > language = R > version.string = R version 4.4.1 (2024-06-14) > nickname = Race for Your Life > > Locale: > > LC_CTYPE=C.UTF-8;LC_NUMERIC=C;LC_TIME=C.UTF-8;LC_COLLATE=C.UTF-8;LC_MONETARY=C.UTF-8;LC_MESSAGES=C.UTF-8;LC_PAPER=C.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=C.UTF-8;LC_IDENTIFICATION=C > > Search Path: > .GlobalEnv, package:lme4, package:Matrix, package:stats, > package:graphics, package:grDevices, package:utils, package:datasets, > package:showtex
[R] Very strange behavior of 'rep'
This is very weird. I was running a swarm job on the cluster and it bombed only for n.per.grp=108, not for the other values. Even though n.per.grp*n.tt is 540, so that the length of the call to 'rep' should be 1080, I'm getting a vector of length 1078.

    n.per.grp <- 108
    n.tt <- 5
    n.per.grp*n.tt
    length(rep(0:1, each=n.per.grp*n.tt))
    length(rep(0:1, each=108*5))

--please do not edit the information below--

R Version:
 platform = x86_64-pc-linux-gnu
 arch = x86_64
 os = linux-gnu
 system = x86_64, linux-gnu
 status =
 major = 4
 minor = 4.1
 year = 2024
 month = 06
 day = 14
 svn rev = 86737
 language = R
 version.string = R version 4.4.1 (2024-06-14)
 nickname = Race for Your Life

Locale:
 LC_CTYPE=C.UTF-8;LC_NUMERIC=C;LC_TIME=C.UTF-8;LC_COLLATE=C.UTF-8;LC_MONETARY=C.UTF-8;LC_MESSAGES=C.UTF-8;LC_PAPER=C.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=C.UTF-8;LC_IDENTIFICATION=C

Search Path:
 .GlobalEnv, package:lme4, package:Matrix, package:stats,
 package:graphics, package:grDevices, package:utils, package:datasets,
 package:showtext, package:showtextdb, package:sysfonts,
 package:methods, Autoloads, package:base
Re: [R] geom_smooth with sd
Thanks Erin

Quite relevant. Yes, now +sd and -sd are the same values. However, they are about +/- 5 and not the values obtained with the simple code here. I still think it is because the length of y differs.

Simple statistics

    > mean(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)
    [1] 26.81225
    > sd(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)
    [1] 21.12419
    > mean(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)
    [1] 15.86196
    > sd(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)
    [1] 15.00405

Kind regards
Sibylle

From: Erin Hodgess
Sent: Sunday, August 11, 2024 6:30 PM
To: sibylle.stoec...@gmx.ch
Cc: R-help@r-project.org
Subject: Re: [R] geom_smooth with sd

Hi!

This is probably completely off base, but your ymin and ymax setup lines are different. One uses sqrt(y), while the second uses sqrt(length(y)). Could that play a part, please?

Thank you

Erin Hodgess, PhD
mailto: erinm.hodg...@gmail.com

On Sun, Aug 11, 2024 at 10:10 AM SIBYLLE STÖCKLI via R-help <r-help@r-project.org> wrote:

Dear community

Using after_stat() I was able to visualise ggplot with standard deviations instead of a confidence interval as seen in the R help.

    p1<-ggplot(data = MS1, aes(x= Jahr, y= QI_A,color=Bio, linetype=Bio)) +
      geom_smooth(aes(fill=Bio,
                      ymax=after_stat(y+se*sqrt(length(y))),
                      ymin=after_stat(y-se*sqrt(y))),
                  method = "lm", formula = y ~ x + I(x^2), linewidth=1) +
      theme(panel.background = element_blank())+
      theme(axis.line = element_line(colour = "black"))+
      theme(axis.text=element_text(size=18))+
      theme(axis.title=element_text(size=20))+
      ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
      scale_color_manual(values=c("red","darkgreen"), labels=c("ÖLN", "BIO"))+
      scale_fill_manual(values=c("red","darkgreen"), labels= c("ÖLN", "BIO"))+
      theme(legend.title = element_blank())+
      theme(legend.text=element_text(size=20))+
      scale_linetype_manual(values=c("dashed", "solid"), labels=c("ÖLN", "BIO"))
    p1<-p1 + expand_limits(y=c(0, 30))

When comparing the plots to the simple statistics, the standard deviations do not match. I assume it is because of the na.rm=TRUE, which does not match length(y) in the after_stat code. However, I was not able to adapt the code to handle NA values.

Simple statistics

    > mean(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)
    [1] 26.81225
    > sd(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)
    [1] 21.12419
    > mean(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)
    [1] 15.86196
    > sd(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)
    [1] 15.00405

Kind regards
Sibylle
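The relation the after_stat() expression relies on, sd = se * sqrt(n), only holds for the standard error of a mean and only when n counts the non-missing observations. The base R sketch below checks this on made-up values; the vector x is hypothetical and not taken from the MS1 data. As far as I understand stat_smooth(), the se it reports is the standard error of the fitted curve, and length(y) inside after_stat() is the number of points at which the curve is evaluated rather than the number of observations, so a mismatch with sd() would be expected in any case.

```r
# Hypothetical measurements with missing values
x <- c(12, 30, NA, 7, 45, NA, 21)

n_obs <- sum(!is.na(x))              # observations actually used
sd_x <- sd(x, na.rm = TRUE)
se_mean <- sd_x / sqrt(n_obs)        # standard error of the mean

sd_from_se <- se_mean * sqrt(n_obs)      # recovers sd_x
sd_wrong <- se_mean * sqrt(length(x))    # also counts the NAs, too large

print(c(sd = sd_x, recovered = sd_from_se, with_length = sd_wrong))
```

If a band of mean +/- sd per year is what is wanted, computing mean and sd per group beforehand and drawing them with a ribbon may be more direct than rescaling the se of a fitted curve.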
[R] geom_smooth with sd
Dear community

Using after_stat() I was able to visualise ggplot with standard deviations instead of a confidence interval as seen in the R help.

    p1<-ggplot(data = MS1, aes(x= Jahr, y= QI_A,color=Bio, linetype=Bio)) +
      geom_smooth(aes(fill=Bio,
                      ymax=after_stat(y+se*sqrt(length(y))),
                      ymin=after_stat(y-se*sqrt(y))),
                  method = "lm", formula = y ~ x + I(x^2), linewidth=1) +
      theme(panel.background = element_blank())+
      theme(axis.line = element_line(colour = "black"))+
      theme(axis.text=element_text(size=18))+
      theme(axis.title=element_text(size=20))+
      ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
      scale_color_manual(values=c("red","darkgreen"), labels=c("ÖLN", "BIO"))+
      scale_fill_manual(values=c("red","darkgreen"), labels= c("ÖLN", "BIO"))+
      theme(legend.title = element_blank())+
      theme(legend.text=element_text(size=20))+
      scale_linetype_manual(values=c("dashed", "solid"), labels=c("ÖLN", "BIO"))
    p1<-p1 + expand_limits(y=c(0, 30))

When comparing the plots to the simple statistics the standard deviation do not match. I assume it is because of the na.rm=TRUE which does not match length(y) in the after_stat code. However I was not able to adapt the code using NA values?

Simple statistics

    > mean(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)
    [1] 26.81225
    > sd(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)
    [1] 21.12419
    > mean(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)
    [1] 15.86196
    > sd(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)
    [1] 15.00405

Kind regards
Sibylle
Re: [R] Printing
On Sun, 11 Aug 2024 22:36:08 +0800, Steven Yen wrote:

> All I need is printing by returning out (unless I turn it off). And,
> retrieve ap and vap as needed as shown above. Guess I need to read
> more about invisible.

Perhaps you could print(out) instead of returning it in the if (printing) branch? Then you would be able to always return(invisible(list("ei"=ap,"vi"=vap))), whether printing is TRUE or not.

--
Best regards,
Ivan
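A minimal sketch of that pattern. The function name me and the way ap, vap and out are computed are made up for illustration; only the names ap, vap, out, printing, ei and vi come from the thread.

```r
# Hypothetical function: prints a summary on request and always
# returns the estimates invisibly
me <- function(x, printing = TRUE) {
  ap  <- mean(x)                       # stand-in for the real estimate
  vap <- var(x)                        # stand-in for its variance
  out <- data.frame(estimate = ap, variance = vap)
  if (printing) print(out)             # side effect only
  invisible(list(ei = ap, vi = vap))   # same return value in both cases
}

me(c(1, 2, 3, 4))                      # prints out, no list echoed
res <- me(c(1, 2, 3, 4), printing = FALSE)
res$ei                                 # the estimate is still retrievable
```

Because the value is returned invisibly, calling the function at the top level does not echo the list, while assigning the result keeps ap and vap available.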
Re: [R] a fast way to do my job
Hi Bert and Ben,

Thanks a lot for your suggestions!

About the different residuals from the lm function and lm.fit: from an online search, it seems that I need to add an intercept column to the design matrix x:

    pur2 <- matrix(gem751be.rpkm$purity2, ncol =1)
    pur2.1 <- cbind(1,gem751be.rpkm$purity2)

and then run

    result2 <- residuals(lm.fit( x= pur2.1, y = dat))

Now I am wondering whether an intercept is required or not.

Ding

From: R-help On Behalf Of Yuan Chun Ding via R-help
Sent: Saturday, August 10, 2024 12:30 PM
To: Bert Gunter; Ben Bolker
Cc: r-help@r-project.org
Subject: Re: [R] a fast way to do my job

Hi Bert and Ben,

Yes, running lm.fit using the matrix format is much faster. I read a couple of online comments on why it is faster.

However, the residual values for three tested variables or genes from the lm function and the lm.fit function are different, with Pearson correlations of 0.55, 0.89, and 0.99. I have not found the reason.

Thanks,
Ding

From: Bert Gunter <bgunter.4...@gmail.com>
Sent: Friday, August 9, 2024 7:11 PM
To: Ben Bolker <bbol...@gmail.com>
Cc: Yuan Chun Ding <ycd...@coh.org>; r-help@r-project.org
Subject: Re: [R] a fast way to do my job

Better idea, Ben!

It would work as you might expect it to, producing the same results as the above:

    ##first make sure your regressor is a matrix:
    pur2 <- matrix(purity2, ncol =1)
    ## convert the data frame variables into a matrix
    dat <- as.matrix(gem751be.rpkm[ , 74:35164])
    ##then
    result <- residuals(lm.fit( x= pur2, y = dat))

Cheers,
Bert

On Fri, Aug 9, 2024 at 6:38 PM Ben Bolker <bbol...@gmail.com> wrote:
>
> You can also fit a linear model with a matrix-valued response
> variable, which should be even faster (not sure off the top of my head
> how to get the residuals and reshape them to the dimensions you want)
>
> On Fri, Aug 9, 2024 at 9:31 PM Bert Gunter <bgunter.4...@gmail.com> wrote:
> >
> > See ?lm.fit.
> > I must be missing something, because:
> >
> >     results <- sapply(74:35164, \(i) residuals(lm.fit(purity2, gem751be.rpkm[, i] )))
> >
> > would give you a 751 x 35091 matrix of the residuals from each of the
> > regressions.
> > I assume it will be considerably faster than all the overhead you are
> > carrying in your current code, but of course you'll have to try it and
> > see. ... Assuming that I have interpreted your request correctly.
> > Ignore if not.
> >
> > Cheers,
> > Bert
> >
> > On Fri, Aug 9, 2024 at 4:50 PM Yuan Chun Ding via R-help <r-help@r-project.org> wrote:
> > >
> > > Dear R users,
> > >
> > > I am running the following code below, the gem751be.rpkm is a dataframe
> > > with dim of 751 samples by 35164 variables, 73 phenotypic variables in
> > > the first to 73rd column and 35091 genomic variables or genes in the 74th
> > > to 35164th columns.
What I need to do is to calculate the residuals for > > > each gene using the simple linear regression model of genelist[i] ~ > > > purity2; > > > > > > The following code is running, it takes long time, but I have an > > > expensive ThinkStation window computer. > > > Can you provide a fast way to do it? > > > > > > Thank you, > > > > > > Ding > > > > > > - > > > > > > > > > gem751be.rpkm <-merge(gem751be10, as.data.frame(t(rna849.fpkm2)), > > > + by.x="id2",by.y=0) > > > > row.names(gem751be.rpkm)<-gem751be.rpkm$id3 > > > > > > > > colnames(gem751be.rpkm)<-gsub(colnames(gem751be.rpkm),pattern="-",replacemen
Re: [R] a fast way to do my job
You are right. I also just thought about that; a no-intercept model is not applicable to my case.

Ding

From: Bert Gunter
Sent: Saturday, August 10, 2024 1:06 PM
To: Yuan Chun Ding
Cc: Ben Bolker; r-help@r-project.org
Subject: Re: [R] a fast way to do my job

Ah, messages crossed. A no-intercept model **assumes** the straight line fit must pass through the origin. Unless there is a strong justification for such an assumption, you should include an intercept.

-- Bert

On Sat, Aug 10, 2024 at 1:02 PM Bert Gunter <bgunter.4...@gmail.com> wrote:
>
> Is it because I failed to add a column of ones for an intercept to
> the x matrix? That would be my bad.
>
> -- Bert
>
> On Sat, Aug 10, 2024 at 12:59 PM Bert Gunter <bgunter.4...@gmail.com> wrote:
> >
> > Probably because you inadvertently ran different models. Without your code,
> > I haven't a clue.
> >
> > On Sat, Aug 10, 2024, 12:29 Yuan Chun Ding <ycd...@coh.org> wrote:
> >>
> >> HI Bert and Ben,
> >>
> >> Yes, running lm.fit using the matrix format is much faster. I read a
> >> couple of online comments why it is faster.
> >>
> >> However, the residual values for three tested variables or genes from lm
> >> function and lm.fit function are different, with Pearson correlation of
> >> 0.55, 0.89, and 0.99.
> >>
> >> I have not found the reason.
> >>
> >> Thanks,
> >>
> >> Ding
> >>
> >> From: Bert Gunter <bgunter.4...@gmail.com>
> >> Sent: Friday, August 9, 2024 7:11 PM
> >> To: Ben Bolker <bbol...@gmail.com>
> >> Cc: Yuan Chun Ding <ycd...@coh.org>; r-help@r-project.org
> >> Subject: Re: [R] a fast way to do my job
> >>
> >> Better idea, Ben!
> >>
> >> It would work as you might expect it to to produce the same results as
> >> the above:
> >>
> >> ##first make sure your regressor is a matrix:
> >> pur2 <- matrix(purity2, ncol =1)
> >> ## convert the data frame variables into a matrix
> >> dat <- as.matrix(gem751be.rpkm[ , 74:35164])
> >> ##then
> >> result <- residuals(lm.fit( x= pur2, y = dat))
> >>
> >> Cheers,
> >> Bert
> >>
> >> On Fri, Aug 9, 2024 at 6:38 PM Ben Bolker <bbol...@gmail.com> wrote:
> >> >
> >> > You can also fit a linear model with a matrix-valued response
> >> > variable, which should be even faster (not sure off the top of my head
> >> > how to get the residuals and reshape them to the dimensions you want)
> >> >
> >> > On Fri, Aug 9, 2024 at 9:31 PM Bert Gunter <bgunter.4...@gmail.com> wrote:
> >> > >
> >> > > See ?lm.fit.
> >> > > I must be missing something, because:
> >> > >
> >> > > results <- sapply(74:35164, \(i) residuals(lm.fit(purity2,
> >> > > gem751be.rpkm[, i] )))
> >> > >
> >> > > would give you a 751 x 35091 matrix of the residuals from each of the
> >> > > regressions.
> >> > > I assume it will be considerably faster than all the overhead you are
Re: [R] a fast way to do my job
After adding an intercept, all residuals are the same from lm and lm.fit.

Ding

From: Bert Gunter
Sent: Saturday, August 10, 2024 1:00 PM
To: Yuan Chun Ding
Cc: Ben Bolker; r-help@r-project.org
Subject: Re: [R] a fast way to do my job

Probably because you inadvertently ran different models. Without your code, I haven't a clue.

On Sat, Aug 10, 2024, 12:29 Yuan Chun Ding wrote:

HI Bert and Ben,

Yes, running lm.fit using the matrix format is much faster. I read a couple of online comments why it is faster.

However, the residual values for three tested variables or genes from lm function and lm.fit function are different, with Pearson correlation of 0.55, 0.89, and 0.99.

I have not found the reason.

Thanks,
Ding
Re: [R] a fast way to do my job
HI Bert and Ben,

Yes, running lm.fit using the matrix format is much faster. I read a couple of online comments why it is faster.

However, the residual values for three tested variables or genes from lm function and lm.fit function are different, with Pearson correlation of 0.55, 0.89, and 0.99.

I have not found the reason.

Thanks,
Ding

From: Bert Gunter
Sent: Friday, August 9, 2024 7:11 PM
To: Ben Bolker
Cc: Yuan Chun Ding; r-help@r-project.org
Subject: Re: [R] a fast way to do my job

Better idea, Ben!

It would work as you might expect it to, producing the same results as the above:

  ## first make sure your regressor is a matrix:
  pur2 <- matrix(purity2, ncol = 1)
  ## convert the data frame variables into a matrix
  dat <- as.matrix(gem751be.rpkm[ , 74:35164])
  ## then
  result <- residuals(lm.fit(x = pur2, y = dat))

Cheers,
Bert

On Fri, Aug 9, 2024 at 6:38 PM Ben Bolker wrote:
> You can also fit a linear model with a matrix-valued response
> variable, which should be even faster (not sure off the top of my head
> how to get the residuals and reshape them to the dimensions you want)
>
> On Fri, Aug 9, 2024 at 9:31 PM Bert Gunter wrote:
> > See ?lm.fit.
> > I must be missing something, because:
> >
> >   results <- sapply(74:35164, \(i) residuals(lm.fit(purity2,
> >                                                     gem751be.rpkm[, i])))
> >
> > would give you a 751 x 35091 matrix of the residuals from each of the
> > regressions.
> > I assume it will be considerably faster than all the overhead you are
> > carrying in your current code, but of course you'll have to try it and
> > see. ... Assuming that I have interpreted your request correctly.
> > Ignore if not.
> >
> > Cheers,
> > Bert
Re: [R] If loop
> On 9. Aug 2024, at 10:45, CALUM POLWART wrote: > > Or use <<- assignment I think. (I usually return, but return can only > return one object and I think you want two or more > One can return multiple objects by putting them in a list and returning the list. Martin ______ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
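A minimal sketch of Martin's suggestion, in case it helps. The function name `summarise_vec` and its contents are made up for illustration; the only point is that a single named list can carry several results back to the caller.

```r
# Hypothetical helper: bundles two results into one named list
summarise_vec <- function(x) {
  list(mean = mean(x), range = range(x))
}

out <- summarise_vec(c(2, 4, 9))
out$mean     # the first returned object
out$range    # the second returned object
```

This avoids `<<-` entirely, so the function does not modify anything outside itself.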
Re: [R] a fast way to do my job
Dear R users,

I am running the code below. gem751be.rpkm is a data frame with 751 samples by 35164 variables: 73 phenotypic variables in columns 1 to 73 and 35091 genomic variables (genes) in columns 74 to 35164. What I need to do is calculate the residuals for each gene using the simple linear regression model genelist[i] ~ purity2.

The following code runs, but it takes a long time even though I have an expensive ThinkStation Windows computer. Can you suggest a faster way to do it?

Thank you,

Ding

  gem751be.rpkm <- merge(gem751be10, as.data.frame(t(rna849.fpkm2)),
                         by.x = "id2", by.y = 0)
  row.names(gem751be.rpkm) <- gem751be.rpkm$id3
  colnames(gem751be.rpkm) <- gsub(colnames(gem751be.rpkm), pattern = "-", replacement = "_")
  genelist <- gem751be.rpkm %>% dplyr::select(74:35164)
  residuals <- NULL
  for (i in 1:length(genelist)) {
    #i=1
    formula <- reformulate("purity2", response = names(genelist)[i])
    model <- lm(formula, data = gem751be.rpkm)
    resi <- as.data.frame(residuals(model))
    colnames(resi)[1] <- names(genelist)[i]
    resi <- as.data.frame(t(resi))
    residuals <- rbind(residuals, resi)
  }
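For anyone reading this thread later, here is a small sketch of where it ended up: lm.fit() uses the design matrix exactly as given, so the column of ones has to be added by hand to match lm(). The data are simulated and the names `purity`, `genes`, and `X` are stand-ins, not the poster's objects.

```r
set.seed(1)
n <- 50
purity <- runif(n)                              # stand-in for purity2
genes <- matrix(rnorm(n * 4), nrow = n,
                dimnames = list(NULL, paste0("gene", 1:4)))
genes <- genes + 2 * purity                     # every column depends on purity

# lm() adds an intercept automatically; lm.fit() does not,
# so supply the column of ones explicitly.
X <- cbind(Intercept = 1, purity = purity)
res_fit <- residuals(lm.fit(x = X, y = genes))  # one fit, n x 4 matrix of residuals

# Reference: one lm() call per column, as in the original loop
res_lm <- sapply(colnames(genes),
                 function(g) residuals(lm(genes[, g] ~ purity)))

# The no-intercept fit that caused the mismatch discussed in the thread
res_noint <- residuals(lm.fit(x = cbind(purity), y = genes))

all.equal(unname(res_fit), unname(res_lm))      # same residuals as lm()
isTRUE(all.equal(unname(res_noint), unname(res_lm)))  # not the same without the ones
```

The matrix-response call fits all columns in one decomposition, which is why it should scale much better than a loop of lm() calls; actual timings on the real data would need to be measured.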
Re: [R] WDI package commands timing out and not working
On Fri, 9 Aug 2024 20:25:51 +0530, Anupam Tyagi wrote:

> I am trying this in Bengaluru, India, using R-studio. I tried
> downloading a single variable. It happened fast, in less than 5
> seconds. I tried downloading six variables, it took much longer, but
> less than a minute. Tried eight variables and it did not download
> even in five minutes.

> I checked in the browser using the URL provided in the warning
> messages: The World Bank web API gave me the indicator in less than
> a second.

I think that the fact that this initially works from R but then only works in the browser is clear evidence of Cloudflare "protecting" the World Bank API from you trying to use it in an automated manner. That this mostly defeats the purpose of an API was probably lost on the people who set it up.

In theory, this should be reported to the people who operate the API. There seems to be a support form at <https://datahelpdesk.worldbank.org/knowledgebase/topics/125589-developer-information>. I don't have a lot of confidence in them fixing anything as a result of such a request.

Try to cache all access to the API. Consider getting a proxy in a country less discriminated against by Cloudflare. Maybe petition the WDI maintainer to export a function that parses a JSON file manually downloaded with a browser?

--
Best regards,
Ivan
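One way the "cache all access" advice could look in practice, as a rough sketch only. The wrapper `cached()` and the stand-in `slow_fetch()` are hypothetical; in real use `fetch` would be a function that performs the actual download, and `key` would need to be a string that is safe as a file name.

```r
# Hypothetical wrapper: 'fetch' is any zero-argument function doing the slow request
cached <- function(key, fetch, dir = tempdir()) {
  path <- file.path(dir, paste0(key, ".rds"))
  if (file.exists(path)) return(readRDS(path))  # reuse the stored copy
  value <- fetch()                              # only hit the API on a cache miss
  saveRDS(value, path)
  value
}

# Stand-in for a real download, counting how often it is actually called
calls <- 0
slow_fetch <- function() {
  calls <<- calls + 1
  data.frame(year = 2019:2020, value = c(1.5, 2.5))
}

a <- cached("demo_indicator", slow_fetch)
b <- cached("demo_indicator", slow_fetch)       # served from disk
calls
```

For anything meant to survive the R session, `dir` should point somewhere more permanent than `tempdir()`.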
Re: [R] WDI package commands timing out and not working
On Thu, 8 Aug 2024 12:43:23 +0530, Anupam Tyagi wrote:

> In open.connection(con, "rb") :
>   URL
> 'https://api.worldbank.org/v2/en/country/OED/indicator/NY.ADJ.NNAT.GN.ZS?format=json&date=1977:2020&per_page=32500&page=1':
>   Timeout of 60 seconds was reached

If you try to open the link in the browser, does it work? How long does it take to download? Try increasing options(timeout=...) to a larger time (in seconds).

I see there is Cloudflare sitting in front of the API, but it's relatively non-aggressive. I could only get it to deny my request by accessing it through Tor.

--
Best regards,
Ivan
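A short sketch of the timeout suggestion. The value 300 is an arbitrary choice for illustration, and whether a given package honours this option depends on how it performs its downloads.

```r
old <- options(timeout = 300)   # seconds; returns the previous setting
getOption("timeout")            # confirm the new value is in effect

# ... run the download here ...

options(old)                    # restore the previous timeout afterwards
```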
Re: [R] An error message with the command fm<-1m
> The function is lm(), not 1m(). Eric Berger is correct (except for the extra parentheses), but it is worth pointing out that variable names do not begin with digits. (You can use backticks, assign, & other features to create such names (e.g. to write the Orwellian assignment `2 + 2` <- 5L), but they are non-standard and you need special syntax to use them.) Maybe the font makes some characters hard to distinguish, but a vertical line at the start of a standard name must be lower-case 'l' or upper-case 'I', not '1' or a pipe symbol. A circle or oval must be 'o' or 'O', not the digit '0'. (Digits after the first character are standard.) Regards, Jorgen Harmse. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
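To make the point about non-standard names concrete, here is a small illustration using only base R. It is meant as a curiosity, not as a recommendation.

```r
`2 + 2` <- 5L            # a non-syntactic name; backticks are required
`2 + 2` + 1L             # the backticks are needed every time it is used
exists("2 + 2")          # the object really is stored under that name

make.names("1m")         # how R repairs a name that starts with a digit
```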
Re: [R] Difference between stats.steps() and MuMIn.dredge() to select best fit model
On Wed, 31 Jul 2024 11:56:55 +, c.bu...@posteo.jp wrote:

> step() explore the model space with a step wise approach.
> And dredge() try out all possible combinations of the variables.
>
> But isn't that the same? I might have a mental block on this.
>
> Which model (formula) would dredge() "test" that step() wouldn't?

Suppose that the predictors a, b, c, d, e, f are arranged in descending order of contribution to the model. Consider a forward stepwise algorithm that is asked to choose three variables.

It starts by testing a, b, c, d, e, f, and chooses a.
It continues by testing a + b, a + c, a + d, a + e, a + f, and chooses a + b.
It continues by testing a + b + c, a + b + d, a + b + e, a + b + f, and chooses a + b + c.

By being greedy, it doesn't consider, for example, the model d + e + f, because for that it would have to pick d before a.

A greedy algorithm choosing K variables out of N tests N + (N-1) + ... + (N-K+1) = N*K - K(K-1)/2 models. An exhaustive search would have to test choose(N,K) = N!/(N-K)!/K! models.

--
Best regards,
Ivan
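The two counts above are easy to check numerically. The helper names `greedy_count` and `exhaustive_count` are invented for this sketch.

```r
# Models tested by a forward greedy search for K out of N predictors
greedy_count <- function(N, K) sum(N - seq_len(K) + 1)   # N + (N-1) + ... + (N-K+1)

# Models tested by an exhaustive search over all subsets of size K
exhaustive_count <- function(N, K) choose(N, K)

greedy_count(6, 3)            # 6 + 5 + 4 = 15, the a..f example above
6 * 3 - 3 * (3 - 1) / 2       # 15 again, via the closed form N*K - K(K-1)/2
exhaustive_count(6, 3)        # 20, which includes d + e + f

greedy_count(20, 10)          # 155
exhaustive_count(20, 10)      # 184756
```

The gap widens quickly with N, and it grows further if every subset size is allowed rather than a single K.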
Re: [R] round and trailing zero
Duncan Murdoch answered your question, but I have another. Are you going to do some computation with the rounded numbers, or are they just for display? (One thing I like about Excel is that I can change the display format of a cell without changing answers that depend on that cell.) In the latter case, why stash them in a variable? For more control of the display, consider sprintf (or a wrapper that combines sprintf with cat). Regards, Jorgen Harmse. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
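A brief sketch of the display-only approach: keep the numbers as they are and format them only when printing. The wrapper `catf` is a hypothetical name for the "sprintf plus cat" idea.

```r
x <- c(1.5, 2, 3.14159)

round(x, 2)                        # still numeric, so trailing zeros are not shown
sprintf("%.2f", x)                 # character strings with two decimals
format(round(x, 2), nsmall = 2)    # another way to keep trailing zeros

# Hypothetical wrapper combining sprintf() with cat()
catf <- function(fmt, ...) cat(sprintf(fmt, ...), sep = "")
catf("value = %.2f\n", x[1])
```

Because `x` itself is never overwritten, later computations are unaffected by how it was displayed.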
Re: [R] Using optim() function to find MLE
On Mon, 29 Jul 2024 09:52:22 +0530, Christofer Bogaso wrote:

> LL = function(b0, b1)

help(optim) documents that the function to be optimised takes a single argument, a vector containing the parameters. Here's how your LL function can be adapted to this interface:

  LL <- function(par) {
    b0 <- par[1]
    b1 <- par[2]
    sum(apply(as.matrix(dat[, c('PurchasedProb', 'Age')]), 1, function(iROw)
      iROw['PurchasedProb'] * log(1 / (1 + exp(-1 * (b0 + b1 * iROw['Age'])))) +
      (1 - iROw['PurchasedProb']) * log(1 - 1 / (1 + exp(-1 * (b0 + b1 * iROw['Age']))))
    ))
  }

Furthermore, LL(c(0, 1)) results in -Inf. All the methods supported by optim() require at least the initial parameters to result in finite values, and L-BFGS-B requires all evaluations to be finite. You're also maximising the function, and optim() defaults to minimisation, so you need an additional parameter to adjust that (or rewrite the LL function further):

  result1 <- optim(
    par = c(0, 0), fn = LL, method = "L-BFGS-B",
    control = list(fnscale = -1)
  )

> coef(result1)

help(optim) documents the return value of optim() as not having a class or a $coefficients field. You can use result1$par to access the parameters.

> Is there any way to force optim() function to use Newton-CG algorithm?

I'm assuming you mean the method documented in <https://docs.scipy.org/doc/scipy/reference/optimize.minimize-newtoncg.html>. optim() doesn't support the truncated (line search) Newton-CG method. See the 'optimx' and 'nloptr' packages for an implementation of a truncated Newton method (not necessarily exactly the same one).

--
Best regards,
Ivan
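A self-contained sketch of the same idea on simulated data, since the original `dat` is not available. The variable names, the centring of age, and the use of plogis(..., log.p = TRUE) are choices made for this illustration, not part of the original code; the centring is there to keep the linear predictor in a range where the log-likelihood stays finite.

```r
set.seed(42)
n <- 200
age <- runif(n, 20, 60)
age_c <- (age - 40) / 10                          # centred and scaled age
purchased <- rbinom(n, 1, plogis(-0.5 + 1.2 * age_c))

# Log-likelihood taking one parameter vector, as optim() expects
LL <- function(par) {
  eta <- par[1] + par[2] * age_c
  sum(purchased * plogis(eta, log.p = TRUE) +          # log(p)
      (1 - purchased) * plogis(-eta, log.p = TRUE))    # log(1 - p)
}

fit <- optim(par = c(0, 0), fn = LL, method = "L-BFGS-B",
             control = list(fnscale = -1))        # negative fnscale => maximise
fit$par                                           # estimates live in $par

# Cross-check against the standard logistic regression fit
ref <- coef(glm(purchased ~ age_c, family = binomial))
ref
```

Working on the log scale directly avoids computing log(0) when a fitted probability rounds to 0 or 1, which is what produced -Inf in the thread.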
Re: [R] please help generate a square correlation matrix
HI Bert,

Thank you for the extra help!!

Yes, exactly, your interpretation is perfectly correct and your R code is what I should have been looking for. After generating all those negative correlation values, I thought about the extremely small p values associated with them, which are not meaningful because I truncated my data.

When examining the exclusiveness of mutation pairs, the first thing I thought of was correlation, so I stepped into a more complicated correlation journey. However, what Richard shared is very helpful in explaining why I got negative correlation values for all pairs. In my case, we measured all mutations for all 1000 samples using exactly the same sequencing method, so there is no issue of never-reporting.

I am very grateful for the help and comments from Rui, Richard and Bert!!

Ding

From: Bert Gunter
Sent: Saturday, July 27, 2024 4:50 PM
To: Yuan Chun Ding
Cc: Richard O'Keefe; r-help@r-project.org
Subject: Re: [R] please help generate a square correlation matrix

Your expanded explanation helps clarify your intent. Herewith some comments. Of course, feel free to ignore and not respond. And, as always, my apologies if I have failed to comprehend your intent.

1. I would avoid any notion of "statistical significance" like the plague. This is a purely exploratory exercise.

2. My understanding is that you want to know the proportion of rows in a pair of columns/vectors in which exactly one value of the pair is 1, out of the number of rows in which one or both values are 1. In R syntax, this is simply:

  sum(xor(x, y)) / sum(x | y)

where x and y are two columns of 1's and 0's. Better yet might be to report both this *and* sum(x|y) to help you judge "meaningfulness". Here is a simple function that does this:

  ## first, define a function that does above calculation:
  assoc <- \(z){
     x <- z[,1]; y <- z[,2]
     n <- sum(x|y)
     c(prop = sum(xor(x, y))/n, N = n)
  }

  ## Now a function that uses it for the various combinations:
  somecor <- function(dat, func = assoc){
     dat <- as.matrix(dat)
     indx <- seq_len(ncol(dat))
     rbind(w <- combn(indx,2),
           combn(indx, 2, FUN = \(m)func(dat[,m]) )) |>
        t() |> round(digits =2) |>
        'dimnames<-'(list(rep.int('',ncol(w)), c("","", "prop","N")))
  }

  # Now apply it to your example data:
  somecor(dat)

  ## which gives
       prop N
   1 2 0.67 6
   1 3 0.60 5
   1 4 0.57 7
   2 3 0.60 5
   2 4 0.33 6
   3 4 0.71 7

This seems more interpretable and directly useful to me. Bigger values of prop for bigger N are the more interesting, assuming I have interpreted you correctly.

Cheers,
Bert
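A tiny worked case of the sum(xor(x, y)) / sum(x | y) measure from Bert's message, on two made-up 0/1 vectors, so the arithmetic can be followed by hand.

```r
# Two invented mutation indicators for 8 patients
x <- c(1, 1, 0, 0, 1, 0, 0, 0)
y <- c(0, 1, 1, 0, 0, 0, 1, 0)

n_any  <- sum(x | y)       # patients with at least one of the two mutations
n_excl <- sum(xor(x, y))   # patients with exactly one of them

c(prop = n_excl / n_any, N = n_any)
```

Here five patients carry at least one mutation and four of those carry exactly one, so the exclusiveness proportion is 4/5. Patients with neither mutation do not enter the calculation at all.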
Re: [R] please help generate a square correlation matrix
Hi Richard,

Nice to know you had a similar experience.

Yes, your understanding is right: all correlations are negative after removing double-zero rows. It is consistent with a heatmap we generated.

1 is for a cancer patient with a specific mutation; 0 means the patient does not have that mutation. A pair of mutation types (two different mutations) is exclusive for most patients in the heatmap or oncoplots.

If we include all 1000 patients, 900 of whom have no mutation of either type, then the correlation is not significant at all. But when we eyeball the heatmap (oncoplot) of mutation (row) by patient (column), the mutations are exclusive for most patients, so I want to measure how strong the exclusiveness between two specific mutation types is across those patients with at least one of the two. Then I would put the pairs of mutations with strong negative correlations in the top rows, ordered by the negative correlation values.

Regarding a final application, maybe there is some use for my case. If one develops two drugs specific to the two negatively correlated mutations, drug treatment is usually only for those cancer patients carrying the specific mutation, so it is informative to know how strong the negative correlation is when considering different combinations of treatment strategies.

Ding

From: R-help On Behalf Of Richard O'Keefe
Sent: Saturday, July 27, 2024 4:47 AM
To: Bert Gunter
Cc: r-help@r-project.org
Subject: Re: [R] please help generate a square correlation matrix

Curses, my laptop is hallucinating again. Hope I can get through this.

So we're talking about correlations between binary variables. Suppose we have two 0-1-valued variables, x and y.

  Let A <- sum(x*y)     # number of cases where x and y are both 1.
  Let B <- sum(x)-A     # number of cases where x is 1 and y is 0
  Let C <- sum(y)-A     # number of cases where y is 1 and x is 0
  Let D <- sum(!x & !y) # number of cases where x and y are both 0.
                        # (also D = length(x)-A-B-C)

All the information is summarised in the 2-by-2 contingency table.

Some years ago, Nathan Rountree and I supervised Yun-Sing Koh's data-mining PhD. She surveyed the data mining literature and found some 37 different "interestingness measures" for two-variable associations -- if I remember correctly; there were a lot of them. They fell into a much smaller number of qualitatively similar groups.

At any rate, the Pearson correlation between x and y is

  (A*D - B*C)/sqrt((A+B)*(C+D)*(A+C)*(B+D))

So what happens when we delete the rows where x = 0 and y = 0? Right, it forces D to 0, leaving A, B, C unchanged. And looking at the numerator, if you delete rows with x = 0 and y = 0 you MUST get a negative correlation.

Quite a modest "true" correlation (based on all the data) like -0.2 can masquerade as quite a strong "zero-suppressed" correlation like -0.6. Even +0.2 can turn into -0.4. (These figures are from a particular simulation run and may not apply in your case.)

Now one of the reasons why Yun-Sing Koh, Nathan Rountree, and I were interested in interestingness measures is perhaps coincidentally related to the file drawer/underreporting problem: it's quite common for rows where x = 0 and y = 0 never to have been reported to you, so we were hoping there were measures immune to that.

I have argued for years that "till record analysis" for supermarkets &c is badly flawed by two facts:
(a) it is hard to measure how much of a product people WOULD have bought if only you had offered it for sale (although you can make educated guesses) and
(b) till records provide no evidence on what the people who walked out without buying anything wanted (was the price too high? could they not find it?).

Problem (a) leads to a commercial variant of the Signor-Lipps effect: "when x and/or y were available for purchase" is not the same as "the period for which data were recorded", thus inflating D, perhaps massively. Methods developed for handling the Signor-Lipps effect in paleontology can be used to estimate when x and y were available, helping you to recover a more realistic N=A+B+C+D. I really should have published that.

All of which is a long-winded way of saying that
- Pearson correlations on binary columns can be computed very efficiently
- the rows with x=0 and y=0 may be very informative, even essential for analysis
- delete them at your peril.
- really, delete them at your peril.

On Sat, 27 Jul 2024 at 23:07, Richard O'Keefe wrote:
> Let's go back to the original posting.
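A quick numerical check of the 2-by-2 formula and of the claim that dropping the double-zero rows forces a negative correlation. The data are simulated independently, and the helper name `phi` is chosen here for illustration. The counts are converted to doubles because the product in the denominator can overflow R's integer range for larger samples.

```r
set.seed(7)
x <- rbinom(500, 1, 0.2)
y <- rbinom(500, 1, 0.2)     # generated independently of x

phi <- function(x, y) {
  A <- as.numeric(sum(x & y))     # both 1
  B <- as.numeric(sum(x & !y))    # x is 1, y is 0
  C <- as.numeric(sum(!x & y))    # y is 1, x is 0
  D <- as.numeric(sum(!x & !y))   # both 0
  (A * D - B * C) / sqrt((A + B) * (C + D) * (A + C) * (B + D))
}

all.equal(phi(x, y), cor(x, y))   # the formula reproduces Pearson's r

keep <- as.logical(x | y)         # drop rows where both are 0, so D becomes 0
phi(x[keep], y[keep])             # numerator is now -B*C, hence negative
cor(x, y)                         # full-data correlation, for comparison
```

Even though x and y were simulated with no association, the zero-suppressed correlation comes out clearly negative, which is the effect described in the message above.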
Re: [R] plotting nnet function....
On Sat, 27 Jul 2024 11:00:34 +, akshay kulkarni wrote:

> My question is : how to plot the final model on the actual data
> points?

Have you been able to obtain the predictions? What happens if you call predict() on the model object returned to you by train()?

Once you have both the data and the prediction, it should be as simple as

  plot(traindata$predictor_column, traindata$response_column)
  lines(traindata$predictor_column, previously_returned_predictions)

(Or an equivalent with your favourite plotting system for R.)

Try following the vignette from the 'caret' package: https://cran.r-project.org/package=caret/vignettes/caret.html

If you do encounter an error on your way or get stuck not knowing how exactly to continue, please ask a more specific question.

--
Best regards,
Ivan
Re: [R] Automatic Knot selection in Piecewise linear splines
dear all, I apologize for my delay in replying you. Here my contribution, maybe just for completeness: Similar to "earth", "segmented" also fits piecewise linear relationships with the number of breakpoints being selected by the AIC or BIC (recommended). #code (example and code from Martin Maechler previous email) library(segmented) o<-selgmented(y, ~x, Kmax=20, type="bic", msg=TRUE) plot(o, add=TRUE) lines(o, col=2) #the approx CI for the breakpoints confint(o) #the estimated breakpoints (with CI's) slope(o) #the estimated slopes (with CI's) However segmented appears to be less efficient than earth (although with reasonable running times), it does NOT work with multivariate responses neither products between piecewise linear terms. kind regards, Vito Il 16/07/2024 11:22, Martin Maechler ha scritto: Anupam Tyagi on Tue, 9 Jul 2024 16:16:43 +0530 writes: > How can I do automatic knot selection while fitting piecewise linear > splines to two variables x and y? Which package to use to do it simply? I > also want to visualize the splines (and the scatter plot) with a graph. > Anupam NB: linear splines, i.e. piecewise linear continuous functions. Given the knots, use approx() or approxfun() however, the automatic knots selection does not happen in the base R packages. I'm sure there are several R packages doing this. The best such package in my opinion is "earth" which does a re-implementation (and extensive *generalization*) of the famous MARS algorithm of Friedman. ==> https://en.wikipedia.org/wiki/Multivariate_adaptive_regression_splines Note that their strengths and power is that they do their work for multivariate x (MARS := Multivariate Adaptive Regression Splines), but indeed do work for the simple 1D case. In the following example, we always get 11 final knots, but I'm sure one can tweak the many tuning paramters of earth() to get more: ## Can we do knot-selection for simple (x,y) splines? === Yes, via earth() {using MARS}! 
x <- (0:800)/8 f <- function(x) 7 * sin(pi/8*x) * abs((x-50)/20)^1.25 - (x-40)*(12-x)/64 curve(f(x), 0, 100, n = 1000, col=2, lwd=2) set.seed(11) y <- f(x) + 10*rnorm(x) m.sspl <- smooth.spline(x,y) # base line "standard smoother" require(earth) fm1 <- earth(x, y) # default settings summary(fm1, style = "pmax") #-- got 10 knots (x = 44 "used twice") below ## Call: earth(x=x, y=y) ## y = ## 175.9612 ## - 10.6744 * pmax(0, x - 4.625) ## + 9.928496 * pmax(0, x - 10.875) ## - 5.940857 * pmax(0, x - 20.25) ## + 3.438948 * pmax(0, x - 27.125) ## - 3.828159 * pmax(0, 44 - x) ## + 4.207046 * pmax(0, x - 44) ## + 2.573822 * pmax(0, x - 76.5) ## - 10.99073 * pmax(0, x - 87.125) ## + 10.97592 * pmax(0, x - 90.875) ## + 9.331949 * pmax(0, x - 94) ## - 8.48575 * pmax(0, x - 96.5) ## Selected 12 of 12 terms, and 1 of 1 predictors ## Termination condition: Reached nk 21 ## Importance: x ## Number of terms at each degree of interaction: 1 11 (additive model) ## GCV 108.6592RSS 82109.44GRSq 0.861423RSq 0.86894 fm2 <- earth(x, y, fast.k = 0) # (more extensive forward pass) summary(fm2) all.equal(fm1, fm2)# they are identical (apart from 'call'): fm3 <- earth(x, y, fast.k = 0, pmethod = "none", trace = 3) # extensive forward pass; *no* pruning ## still no change: fm3 "==" fm1 all.equal(predict(fm1, xx), predict(fm3, xx)) ## BTW: The chosen knots and coefficients are mat <- with(fm1, cbind(dirs, cuts=c(cuts), coef = c(coefficients))) ## Plots : fine grid for visualization: instead of xx <- seq(x[1], x[length(x)], length.out = 1024) rnx <- extendrange(x) ## to extrapolate a bit xx <- do.call(seq.int, c(rnx, list(length.out = 1200))) cbind(f = f(xx), sspl = predict(m.sspl, xx)$y, mars = predict(fm1, xx)) -> fits plot(x,y, xlim=rnx, cex = 1/4, col = adjustcolor(1, 1/2)) cols <- c(adjustcolor(2, 1/3), adjustcolor(4, 2/3), adjustcolor("orange4", 2/3)) lwds <- c(3, 2, 2) matlines(xx, fits, col = cols, lwd = lwds, lty=1) legend("topleft", c("true f(x)", "smooth.spline()", "earth()"), 
       col=cols, lwd=lwds, bty = "n")
title(paste("earth() linear spline vs. smooth.spline(); n =", length(x)))
mtext(substitute(f(x) == FDEF, list(FDEF = body(f

--
Vito M.R. Muggeo, PhD
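The remark that a linear spline is easy once the knots are given can be illustrated with base R alone. The following is only a minimal sketch: the knot locations are hypothetical (they are not the output of earth() or selgmented()), and `hinge()` is a helper name introduced here for illustration. It fits a continuous piecewise linear function by least squares, using the same pmax() hinge terms that appear in the earth() summary.

```r
# Minimal sketch: least-squares linear spline with *fixed* knots, base R only.
# The knot locations below are illustrative assumptions.
set.seed(11)
x <- (0:800)/8
f <- function(x) 7 * sin(pi/8*x) * abs((x-50)/20)^1.25 - (x-40)*(12-x)/64
y <- f(x) + 10*rnorm(length(x))

knots <- c(10, 20, 30, 45, 60, 75, 90)   # hypothetical knots

# Hinge ("pmax") basis: one column per knot, each zero to the left of its knot
hinge <- function(x, knots) {
  B <- outer(x, knots, function(x, k) pmax(0, x - k))
  colnames(B) <- paste0("h", knots)
  B
}

fit <- lm(y ~ x + hinge(x, knots))

# Evaluate the fitted (continuous, piecewise linear) curve on a fine grid
xx <- seq(min(x), max(x), length.out = 1201)
yy <- drop(cbind(1, xx, hinge(xx, knots)) %*% coef(fit))
```

Adding `lines(xx, yy)` to a scatter plot of (x, y) would draw the fit; choosing the knots automatically is exactly the part that earth and segmented take care of.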
Re: [R] please help generate a square correlation matrix
Hi Rui,

You are always very helpful!! Thank you. I just modified your R code to remove the rows with zero values in both columns of a pair, as below, for my real data.

Ding

dat<-gene22mut.coded
r <- P <- matrix(NA, nrow = 22L, ncol = 22L, dimnames = list(names(dat), names(dat)))
for(i in 1:22) {
  #i=1
  x <- dat[[i]]
  for(j in (1:22)) {
    #j=2
    if(i == j) {
      # there's nothing to test, assign correlation 1
      r[i, j] <- 1
    } else {
      tmp <-cbind(x,dat[[j]])
      row0 <-rowSums(tmp)
      tem2 <-tmp[row0!=0,]
      tmp3 <- cor.test(tem2[,1],tem2[,2])
      r[i, j] <- tmp3$estimate
      P[i, j] <- tmp3$p.value
    }
  }
}
r<-as.data.frame(r)
P<-as.data.frame(P)

From: R-help On Behalf Of Yuan Chun Ding via R-help
Sent: Thursday, July 25, 2024 11:26 AM
To: Rui Barradas ; r-help@r-project.org
Subject: Re: [R] please help generate a square correlation matrix

Hi Rui,

Thank you for the help! You did not remove a row if zero values exist in both columns of a pair, right?

Ding

From: Rui Barradas
Sent: Thursday, July 25, 2024 11:15 AM
To: Yuan Chun Ding ; r-help@r-project.org
Subject: Re: [R] please help generate a square correlation matrix

At 17:39 on 25/07/2024, Yuan Chun Ding via R-help wrote:
> Hi R users,
>
> I generated a square correlation matrix for the dat dataframe below;
>
> dat<-data.frame(g1=c(1,0,0,1,1,1,0,0,0),
>                 g2=c(0,1,0,1,0,1,1,0,0),
>                 g3=c(1,1,0,0,0,1,0,0,0),
>                 g4=c(0,1,0,1,1,1,1,1,0))
> library("Hmisc")
> dat.rcorr = rcorr(as.matrix(dat))
> dat.r <-round(dat.rcorr$r,2)
>
> However, I want to modify this correlation calculation.
> My dat has more than 1000 rows and 22 columns;
> in each column, fewer than 10% of the values are 1, most of them are 0;
> so I want to remove the rows with a value of zero in both columns when
> calculating the correlation between two columns.
> I just want to check whether those values of 1 are correlated between two
> columns.
> Please look at my code in the following;
>
> cor.4gene <-matrix(0,nrow=4*4, ncol=4)
> for (i in 1:4){
>   #i=1
>   for (j in 1:4) {
>     #j=1
>     d <-dat[,c(i,j)]%>%
>       filter(eval(as.symbol(colnames(dat)[i]))!=0 |
>              eval(as.symbol(colnames(dat)[j]))!=0)
>     c <-cor.test(d[,1],d[,2])
>     cor.4gene[i*j,]<-c(colnames(dat)[i],colnames(dat)[j],
>                        c$estimate,c$p.value)
>   }
> }
> cor.4gene<-as.data.frame(cor.4gene)%>%filter(V1 !=0)
> colnames(cor.4gene)<-c("gene1","gene2","cor","P")
>
> Can you tell me what mistakes I made?
> First, why is cor NA when calculating the correlation of g1 with g1? I thought
> it should be 1.
>
> cor.4gene$cor[is.na(cor.4gene$cor)]<-1
> cor.4gene$cor[is.na(cor.4gene$P)]<-0
> cor.4gene.sq <-pivot_wider(cor.4gene, names_from = gene1, values_from = cor)
>
> Then this line of code above did not generate a square matrix like the one the
> Hmisc library produced.
> How do I fix my code?
>
> Thank you,
>
> Ding
Re: [R] please help generate a square correlation matrix
Hi Rui,

Thank you for the help! You did not remove a row if zero values exist in both columns of a pair, right?

Ding

From: Rui Barradas
Sent: Thursday, July 25, 2024 11:15 AM
To: Yuan Chun Ding ; r-help@r-project.org
Subject: Re: [R] please help generate a square correlation matrix

At 17:39 on 25/07/2024, Yuan Chun Ding via R-help wrote:
> Hi R users,
>
> I generated a square correlation matrix for the dat dataframe below;
>
> dat<-data.frame(g1=c(1,0,0,1,1,1,0,0,0),
>                 g2=c(0,1,0,1,0,1,1,0,0),
>                 g3=c(1,1,0,0,0,1,0,0,0),
>                 g4=c(0,1,0,1,1,1,1,1,0))
> library("Hmisc")
> dat.rcorr = rcorr(as.matrix(dat))
> dat.r <-round(dat.rcorr$r,2)
>
> However, I want to modify this correlation calculation.
> My dat has more than 1000 rows and 22 columns;
> in each column, fewer than 10% of the values are 1, most of them are 0;
> so I want to remove the rows with a value of zero in both columns when
> calculating the correlation between two columns.
> I just want to check whether those values of 1 are correlated between two
> columns.
> Please look at my code in the following;
>
> cor.4gene <-matrix(0,nrow=4*4, ncol=4)
> for (i in 1:4){
>   #i=1
>   for (j in 1:4) {
>     #j=1
>     d <-dat[,c(i,j)]%>%
>       filter(eval(as.symbol(colnames(dat)[i]))!=0 |
>              eval(as.symbol(colnames(dat)[j]))!=0)
>     c <-cor.test(d[,1],d[,2])
>     cor.4gene[i*j,]<-c(colnames(dat)[i],colnames(dat)[j],
>                        c$estimate,c$p.value)
>   }
> }
> cor.4gene<-as.data.frame(cor.4gene)%>%filter(V1 !=0)
> colnames(cor.4gene)<-c("gene1","gene2","cor","P")
>
> Can you tell me what mistakes I made?
> First, why is cor NA when calculating the correlation of g1 with g1? I thought
> it should be 1.
>
> cor.4gene$cor[is.na(cor.4gene$cor)]<-1
> cor.4gene$cor[is.na(cor.4gene$P)]<-0
> cor.4gene.sq <-pivot_wider(cor.4gene, names_from = gene1, values_from = cor)
>
> Then this line of code above did not generate a square matrix like the one the
> Hmisc library produced.
> How do I fix my code?
>
> Thank you,
>
> Ding
[R] please help generate a square correlation matrix
Hi R users,

I generated a square correlation matrix for the dat dataframe below;

dat<-data.frame(g1=c(1,0,0,1,1,1,0,0,0),
                g2=c(0,1,0,1,0,1,1,0,0),
                g3=c(1,1,0,0,0,1,0,0,0),
                g4=c(0,1,0,1,1,1,1,1,0))
library("Hmisc")
dat.rcorr = rcorr(as.matrix(dat))
dat.r <-round(dat.rcorr$r,2)

However, I want to modify this correlation calculation.
My dat has more than 1000 rows and 22 columns;
in each column, fewer than 10% of the values are 1, most of them are 0;
so I want to remove the rows with a value of zero in both columns when calculating the correlation between two columns.
I just want to check whether those values of 1 are correlated between two columns.
Please look at my code in the following;

cor.4gene <-matrix(0,nrow=4*4, ncol=4)
for (i in 1:4){
  #i=1
  for (j in 1:4) {
    #j=1
    d <-dat[,c(i,j)]%>%
      filter(eval(as.symbol(colnames(dat)[i]))!=0 |
             eval(as.symbol(colnames(dat)[j]))!=0)
    c <-cor.test(d[,1],d[,2])
    cor.4gene[i*j,]<-c(colnames(dat)[i],colnames(dat)[j],
                       c$estimate,c$p.value)
  }
}
cor.4gene<-as.data.frame(cor.4gene)%>%filter(V1 !=0)
colnames(cor.4gene)<-c("gene1","gene2","cor","P")

Can you tell me what mistakes I made?
First, why is cor NA when calculating the correlation of g1 with g1? I thought it should be 1.

cor.4gene$cor[is.na(cor.4gene$cor)]<-1
cor.4gene$cor[is.na(cor.4gene$P)]<-0
cor.4gene.sq <-pivot_wider(cor.4gene, names_from = gene1, values_from = cor)

Then this line of code above did not generate a square matrix like the one the Hmisc library produced.
How do I fix my code?

Thank you,

Ding
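Two things in the question can be checked directly in base R; what follows is a hedged sketch rather than the answer given on the list, and `pair_cor()` is a hypothetical helper name. First, `i*j` is not a unique row index, so several (i, j) pairs overwrite the same row of cor.4gene. Second, once the double-zero rows are dropped, a column paired with itself contains only 1s, so its standard deviation is zero and the correlation is NA.

```r
# Sketch (base R only): pairwise correlations after dropping, for each pair,
# the rows where both columns are 0. 'pair_cor' is a hypothetical helper name.
dat <- data.frame(g1=c(1,0,0,1,1,1,0,0,0),
                  g2=c(0,1,0,1,0,1,1,0,0),
                  g3=c(1,1,0,0,0,1,0,0,0),
                  g4=c(0,1,0,1,1,1,1,1,0))

# 1) i*j is not a unique row index: (1,2) and (2,1) both map to row 2, etc.
idx_bad  <- as.vector(outer(1:4, 1:4))                                  # i*j
idx_good <- as.vector(outer(1:4, 1:4, function(i, j) (i - 1) * 4 + j))  # unique

# 2) A constant column has standard deviation 0, so cor() gives NA (with a warning)
self_cor <- suppressWarnings(cor(c(1, 1, 1, 1), c(1, 1, 1, 1)))

pair_cor <- function(x, y) {
  keep <- x != 0 | y != 0                  # drop rows that are 0 in both columns
  suppressWarnings(cor(x[keep], y[keep]))  # NA if either column is constant
}

n <- ncol(dat)
r <- matrix(NA_real_, n, n, dimnames = list(names(dat), names(dat)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    r[i, j] <- if (i == j) 1 else pair_cor(dat[[i]], dat[[j]])
  }
}
round(r, 2)
```

Filling a matrix indexed by `[i, j]` gives the square layout directly, so no pivot_wider() step is needed. Note that some off-diagonal entries can still be NA (here g2 versus g4), because one column becomes constant after the filtering.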
Re: [R] [External] Using the pipe, |>, syntax with "names<-"
I think that the simplicity of setNames is hard to beat:

z |> setNames( c( "a", "foo" ) )

and if you are determined not to load dplyr then

column_rename <- function( DF, map ) {
  on <- names( DF )
  on[ match( map, on ) ] <- names( map )
  names( DF ) <- on
  DF
}

is more robust to column reorganization than replace():

z |> column_rename( c( foo = "b" ) )

On July 20, 2024 10:07:57 PM PDT, Deepayan Sarkar wrote:
>The main challenge in Bert's original problem is that `[` and `[<-` cannot
>be called in a pipeline. The obvious solution is to define named versions,
>e.g.:
>
>elt <- `[`
>`elt<-` <- `[<-`
>
>Then,
>
>> z <- data.frame(a = 1:3, b = letters[1:3])
>> z |> names() |> elt(2)
>[1] "b"
>> z |> names() |> elt(2) <- "foo"
>> z
>  a foo
>1 1   a
>2 2   b
>3 3   c
>
>You could actually also do (using a similar function already defined in
>methods)
>
>z |> names() |> el(2) <- "bar"
>
>Iris's _ trick is of course a nice alternative; and this example in ?pipeOp
>already covers it:
>
># using the placeholder as the head of an extraction chain:
>mtcars |> subset(cyl == 4) |> lm(formula = mpg ~ disp) |> _$coef[[2]]
>
>While the replacement question is a nice exercise, I am not sure about the
>value of emphasizing that you can use pipes to do complex assignments.
>Doesn't that defeat the whole purpose of piping? For one thing, it will
>necessarily terminate the pipe. Also, it will not work if the starting
>value is not a variable. E.g.,
>
>> data.frame(a = 1:3, b = letters[1:3]) |> names() |> _[2] <- "bar"
>Error in names(data.frame(a = 1:3, b = letters[1:3]))[2] <- "bar" :
>  target of assignment expands to non-language object
>
>Duncan's rename() approach, which will just change the column name and
>return the modified object, seems more useful as part of a pipeline.
>
>Best,
>-Deepayan
>
>On Sun, 21 Jul 2024 at 04:46, Bert Gunter wrote:
>
>> I second Rich's excellent suggestion.
>>
>> As with all elegant solutions, Iris's clicked on the wee light bulb in
>> my brain, and I realized that a slightly more verbose, but perhaps
>> more enlightening, alternative may be:
>>
>> z |> attr("names") |> _[2] <- "foo"
>>
>> However, I would add this as an example *only with* Iris's solution.
>> Hers should be shown whether or not the above is.
>>
>> Cheers,
>> Bert
>>
>> On Sat, Jul 20, 2024 at 3:35 PM Richard M. Heiberger wrote:
>> >
>> > I think Iris's solution should be added to the help file: ?|>
>> > there are no examples there now that show assignment or replacement
>> > using the "_"
>> >
>> > > On Jul 20, 2024, at 18:21, Duncan Murdoch wrote:
>> > >
>> > > On 2024-07-20 6:02 p.m., Iris Simmons wrote:
>> > >> z <- data.frame(a = 1:3, b = letters[1:3])
>> > >> z |> names() |> _[2] <- "foo"
>> > >> z
>> > >
>> > > That's a great suggestion!
>> > >
>> > > Duncan Murdoch

--
Sent from my phone. Please excuse my brevity.
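To make the robustness claim about renaming by name concrete, here is a small sketch. column_rename() is the function from the message above; the toy data frame `z` and the reordering step are illustrative assumptions. It needs R 4.1 or later for the native pipe.

```r
# Sketch: renaming one column inside a native-pipe chain without dplyr.
# column_rename() is taken from the message above; 'z' is toy data.
column_rename <- function(DF, map) {
  on <- names(DF)
  on[match(map, on)] <- names(map)   # map has the form c(new_name = "old_name")
  names(DF) <- on
  DF
}

z <- data.frame(a = 1:3, b = letters[1:3])

# Renaming by position: correct only while the column order stays the same
by_position <- z |> setNames(c("a", "foo"))

# Renaming by name: still renames the right column after reordering
by_name <- z[c("b", "a")] |> column_rename(c(foo = "b"))

names(by_position)
names(by_name)
```

Both calls return a modified copy and leave `z` untouched, which is what allows them to sit in the middle of a longer pipeline.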
Re: [R] Extract
Here is another way... for data analysis, the idiomatic result is usually more useful, though for presentation in a final result the wide result might be desired.

library(dplyr)
library(tidyr)

dat<-read.csv(text=
"Year, Sex,string
2002,F,15 xc Ab
2003,F,14
2004,M,18 xb 25 35 21
2005,M,13 25
2006,M,14 ac 256 AV 35
2007,F,11"
, header=TRUE )

idiomatic <- (
  dat
  %>% mutate( string = strsplit( string, " " ) )
  %>% unnest( cols = string )
  %>% group_by( Year, Sex )
  %>% mutate( s_name = paste0( "S", seq_along( string ) ) )
  %>% ungroup()
)
idiomatic # each row has unique Year, Sex, and s_name

wide <- (
  idiomatic
  %>% spread( s_name, string )
)
wide

On July 19, 2024 11:23:48 AM PDT, Val wrote:
>Thank you and sorry for the confusion.
>The desired result should have 8 variables as a comma separated in
>each line. The string variable is considered as one variable.
>The output of your script is fine for me. Thank you!
>
>On Fri, Jul 19, 2024 at 1:00 PM Ebert,Timothy Aaron wrote:
>>
>> The desired result is odd.
>> 1) It looks like the string is duplicated in the desired result. The first
>> line of data has "15, xc, Ab", and the desired result has "15, xc, Ab, 15,
>> xc, Ab"
>> 2) The example has S1 through S5, but the desired result has data for eight
>> variables in the first line (not five).
>> 3) The desired result has a different number of variables for each line.
>> 4) Are you assuming that all missing data is at the end of the string? If
>> there are 5 variables (S1 S5), do you know that "15, xc, Ab" is S1 =
>> 15, S2 = 'xc', and S3 = 'Ab' rather than S2=15, S4='xc' and S5='Ab' ?
>>
>> This isn't exactly what you asked for, but maybe I was confused somewhere.
>> This approach puts string data into variables in order. In this approach one
>> mixes string and numeric data. The string is not duplicated.
>>
>> library(tidyr)
>>
>> dat <- read.csv(text="Year,Sex,string
>> 2002,F,15 xc Ab
>> 2003,F,14
>> 2004,M,18 xb 25 35 21
>> 2005,M,13 25
>> 2006,M,14 ac 256 AV 35
>> 2007,F,11", header=TRUE, stringsAsFactors=FALSE)
>>
>> # split the 'string' column based on spaces
>> dat_separated <- dat |>
>>   separate(string, into = paste0("S", 1:5), sep = " ",
>>            fill = "right", extra = "merge")
>>
>> Tim
>>
>>
>> -----Original Message-----
>> From: R-help On Behalf Of Val
>> Sent: Friday, July 19, 2024 12:52 PM
>> To: r-help@R-project.org (r-help@r-project.org)
>> Subject: [R] Extract
>>
>> [External Email]
>>
>> Hi All,
>>
>> I want to extract new variables from a string and add it to the dataframe.
>> Sample data is csv file.
>>
>> dat<-read.csv(text="Year, Sex,string
>> 2002,F,15 xc Ab
>> 2003,F,14
>> 2004,M,18 xb 25 35 21
>> 2005,M,13 25
>> 2006,M,14 ac 256 AV 35
>> 2007,F,11",header=TRUE)
>>
>> The string column has a maximum of five variables. Some rows have all and
>> others may not have all the five variables. If missing then fill it with
>> NA, Desired result is shown below,
>>
>>
>> Year,Sex,string, S1, S2, S3 S4,S5
>> 2002,F,15 xc Ab, 15,xc,Ab, NA, NA
>> 2003,F,14, 14,NA,NA,NA,NA
>> 2004,M,18 xb 25 35 21,18, xb, 25, 35, 21
>> 2005,M,13 25,13, 25,NA,NA,NA
>> 2006,M,14 ac 256 AV 35, 14, ac, 256, AV, 35
>> 2007,F,11, 11,NA,NA,NA,NA
>>
>> Any help?
>> Thank you in advance.

--
Sent from my phone. Please excuse my brevity.
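For readers who want the wide result without loading any packages, the following base R sketch produces the eight-column layout Val describes (Year, Sex, string, S1 to S5), keeping the original string column. It assumes, as Tim's answer does, that missing values are always at the end of the string; the object names `parts`, `padded`, and `out` are introduced here for illustration.

```r
# Sketch (base R only): split 'string' on spaces into S1..S5, padding with NA.
dat <- read.csv(text = "Year,Sex,string
2002,F,15 xc Ab
2003,F,14
2004,M,18 xb 25 35 21
2005,M,13 25
2006,M,14 ac 256 AV 35
2007,F,11", header = TRUE, stringsAsFactors = FALSE)

parts <- strsplit(dat$string, " ", fixed = TRUE)

# Indexing past the end of a vector yields NA, so p[1:5] pads short rows
padded <- t(vapply(parts, function(p) p[1:5], character(5)))
colnames(padded) <- paste0("S", 1:5)

out <- cbind(dat, padded)   # keeps the original 'string' column
out
```

All S columns are character here, because values such as "xc" and "15" share a column; converting selected columns with as.numeric() would be a separate step.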
Re: [R] ggplot two-factor legend
Thanks a lot Rui and Jeff

Yes, including labels=c() in scale_linetype_manual() was the hint.

Sibylle

-----Original Message-----
From: Rui Barradas
Sent: Thursday, July 18, 2024 6:50 PM
To: sibylle.stoec...@gmx.ch; r-help@r-project.org
Subject: Re: [R] ggplot two-factor legend

At 17:43 on 18/07/2024, Rui Barradas wrote:
> At 16:27 on 18/07/2024, SIBYLLE STÖCKLI via R-help wrote:
>> Hi
>>
>> I am using ggplot to visualise y for a two-factorial group (Bio: 0 and 1)
>> over x = 6 years. I was able to adapt the colour of the lines (green and
>> red) and the linetype (solid and dashed).
>> Challenge: my code now produces two legends. One with the colors for
>> the group and one with the linetype for the group. Does somebody have
>> a hint how to adapt the code to produce one legend? Group 0 = red and
>> dashed, Group 1 = green and solid?
>>
>>
>> MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
>> dev.new(width=4, height=2.75)
>> par(mar = c(0,6,0,0))
>> p1<-ggplot(data = MS1, aes(x= Jahr, y= QI_A,group=Bio,color=Bio, linetype=Bio)) +
>>   geom_smooth(aes(fill=Bio) , method = "lm" , formula = y ~ x + I(x^2),linewidth=1) +
>>   theme(panel.background = element_blank())+
>>   theme(axis.line = element_line(colour = "black"))+
>>   theme(axis.text=element_text(size=18))+
>>   theme(axis.title=element_text(size=20))+
>>   ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
>>   scale_color_manual(values=c("red","dark green"), labels=c("ÖLN", "BIO"))+
>>   scale_fill_manual(values=c("red","dark green"), labels= c("ÖLN", "BIO"))+
>>   theme(legend.title = element_blank())+
>>   theme(legend.text=element_text(size=20))+
>>   scale_linetype_manual(values=c("dashed", "solid"))
>> p1<-p1 + expand_limits(y=c(0, 30))
>>
>> kind regards
>> Sibylle
>
> Hello,
>
> To have one legend only, the labels must be the same. Try using
>
> labels=c("ÖLN", "BIO")
>
> in
>
> scale_linetype_manual(values=c("dashed", "solid"), labels=c("ÖLN", "BIO"))
>
>
> Hope this helps,
>
> Rui Barradas
>
>

Hello,

Here is a more complete answer with the built-in data set mtcars.
Note that the group aesthetic is not used. This is because linetype is categorical (after mutate) and there's no need to group again by the same variable (am).

Remove labels from scale_linetype_manual and there are two legends, but with the same labels the legends merge.

library(ggplot2)
library(dplyr)

mtcars %>%
  # linetype must be categorical
  mutate(am = factor(am)) %>%
  ggplot(aes(hp, disp, color = am, linetype = am)) +
  geom_line() +
  scale_color_manual(
    values = c("red","dark green"),
    labels = c("ÖLN", "BIO")
  ) +
  scale_linetype_manual(
    values = c("dashed", "solid"),
    labels = c("ÖLN", "BIO")
  ) +
  theme_bw()

Hope this helps,

Rui Barradas
Re: [R] ggplot two-factor legend
Thanks Jeff

I removed the group parameter in the p1<-ggplot() line. It doesn't change anything.
I suppose I got two legends because in the ggplot() line I have color=Bio & linetype=Bio. However, when removing linetype=Bio I just get red and green. For black and white printing I would like to additionally differentiate the two lines (groups) by linetype.

Sibylle

-----Original Message-----
From: Jeff Newmiller
Sent: Thursday, July 18, 2024 6:13 PM
To: sibylle.stoec...@gmx.ch; SIBYLLE STÖCKLI via R-help ; r-help@r-project.org
Subject: Re: [R] ggplot two-factor legend

If I follow your question, you want redundant aesthetics. Ggplot normally notices correlated aesthetic mapping variables and merges the legends, so the most likely answer is that your data are not fully correlated in all rows. I have also seen this where data are drawn from different dataframes for different layers since it is hard to merge factors, but I don't see that here.

You are using the group parameter... try removing that? The group parameter overrides the automatic group determination. There might be a syntax for specifying correlated grouping, but I don't know it... I normally just verify that my data meets the requirements to be automatically identified as correlated if that is my goal, since that is a prerequisite anyway.

On July 18, 2024 8:27:05 AM PDT, "SIBYLLE STÖCKLI via R-help" wrote:
>Hi
>
>I am using ggplot to visualise y for a two-factorial group (Bio: 0 and 1)
>over x = 6 years. I was able to adapt the colour of the lines (green and
>red) and the linetype (solid and dashed).
>Challenge: my code now produces two legends. One with the colors for
>the group and one with the linetype for the group. Does somebody have a
>hint how to adapt the code to produce one legend? Group 0 = red and
>dashed, Group 1 = green and solid?
>
>
>MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
>dev.new(width=4, height=2.75)
>par(mar = c(0,6,0,0))
>p1<-ggplot(data = MS1, aes(x= Jahr, y= QI_A,group=Bio,color=Bio, linetype=Bio)) +
>  geom_smooth(aes(fill=Bio) , method = "lm" , formula = y ~ x + I(x^2),linewidth=1) +
>  theme(panel.background = element_blank())+
>  theme(axis.line = element_line(colour = "black"))+
>  theme(axis.text=element_text(size=18))+
>  theme(axis.title=element_text(size=20))+
>  ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
>  scale_color_manual(values=c("red","dark green"), labels=c("ÖLN", "BIO"))+
>  scale_fill_manual(values=c("red","dark green"), labels= c("ÖLN", "BIO"))+
>  theme(legend.title = element_blank())+
>  theme(legend.text=element_text(size=20))+
>  scale_linetype_manual(values=c("dashed", "solid"))
>p1<-p1 + expand_limits(y=c(0, 30))
>
>kind regards
>Sibylle

--
Sent from my phone. Please excuse my brevity.
Re: [R] ggplot two-factor legend
If I follow your question, you want redundant aesthetics. Ggplot normally notices correlated aesthetic mapping variables and merges the legends, so the most likely answer is that your data are not fully correlated in all rows. I have also seen this where data are drawn from different dataframes for different layers since it is hard to merge factors, but I don't see that here.

You are using the group parameter... try removing that? The group parameter overrides the automatic group determination. There might be a syntax for specifying correlated grouping, but I don't know it... I normally just verify that my data meets the requirements to be automatically identified as correlated if that is my goal, since that is a prerequisite anyway.

On July 18, 2024 8:27:05 AM PDT, "SIBYLLE STÖCKLI via R-help" wrote:
>Hi
>
>I am using ggplot to visualise y for a two-factorial group (Bio: 0 and 1)
>over x = 6 years. I was able to adapt the colour of the lines (green and
>red) and the linetype (solid and dashed).
>Challenge: my code now produces two legends. One with the colors for the
>group and one with the linetype for the group. Does somebody have a hint how
>to adapt the code to produce one legend? Group 0 = red and dashed, Group 1 =
>green and solid?
>
>
>MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
>dev.new(width=4, height=2.75)
>par(mar = c(0,6,0,0))
>p1<-ggplot(data = MS1, aes(x= Jahr, y= QI_A,group=Bio,color=Bio, linetype=Bio)) +
>  geom_smooth(aes(fill=Bio) , method = "lm" , formula = y ~ x + I(x^2),linewidth=1) +
>  theme(panel.background = element_blank())+
>  theme(axis.line = element_line(colour = "black"))+
>  theme(axis.text=element_text(size=18))+
>  theme(axis.title=element_text(size=20))+
>  ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
>  scale_color_manual(values=c("red","dark green"), labels=c("ÖLN", "BIO"))+
>  scale_fill_manual(values=c("red","dark green"), labels= c("ÖLN", "BIO"))+
>  theme(legend.title = element_blank())+
>  theme(legend.text=element_text(size=20))+
>  scale_linetype_manual(values=c("dashed", "solid"))
>p1<-p1 + expand_limits(y=c(0, 30))
>
>kind regards
>Sibylle

--
Sent from my phone. Please excuse my brevity.
[R] ggplot two-factor legend
Hi

I am using ggplot to visualise y for a two-factorial group (Bio: 0 and 1) over x = 6 years. I was able to adapt the colour of the lines (green and red) and the linetype (solid and dashed).
Challenge: my code now produces two legends. One with the colors for the group and one with the linetype for the group. Does somebody have a hint how to adapt the code to produce one legend? Group 0 = red and dashed, Group 1 = green and solid?

MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
dev.new(width=4, height=2.75)
par(mar = c(0,6,0,0))
p1<-ggplot(data = MS1, aes(x= Jahr, y= QI_A,group=Bio,color=Bio, linetype=Bio)) +
  geom_smooth(aes(fill=Bio) , method = "lm" , formula = y ~ x + I(x^2),linewidth=1) +
  theme(panel.background = element_blank())+
  theme(axis.line = element_line(colour = "black"))+
  theme(axis.text=element_text(size=18))+
  theme(axis.title=element_text(size=20))+
  ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
  scale_color_manual(values=c("red","dark green"), labels=c("ÖLN", "BIO"))+
  scale_fill_manual(values=c("red","dark green"), labels= c("ÖLN", "BIO"))+
  theme(legend.title = element_blank())+
  theme(legend.text=element_text(size=20))+
  scale_linetype_manual(values=c("dashed", "solid"))
p1<-p1 + expand_limits(y=c(0, 30))

kind regards
Sibylle
Re: [R] grDevices segfault when building R4.4.0 on RHEL 9.1.
On Wed, 17 Jul 2024 02:35:22, Miguel Esteva wrote:

> I replaced the "--with-lapack" flag with "--with-lapack='-lflexiblas
> -L/tools/flexiblas/3.4.2/lib64'" and everything built ok.

Glad to see you managed to avoid the crash!

> From a quick check in my emails, seems the RHEL9 system lapack
> packages are broken. Will test a bit more.

Simon Andrews has also shown me how to reproduce the crash on AlmaLinux:
https://stat.ethz.ch/pipermail/r-help/2024-May/479321.html

Looks like an ABI incompatibility between gfortran-11 and blas-devel + lapack-devel.

--
Best regards,
Ivan
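Once R has been built, a quick sanity check of the linked BLAS/LAPACK can be run from R itself. This is only a sketch: extSoftVersion(), La_library() and La_version() are base R functions, while the 3x3 matrix is arbitrary illustrative data and not the matrix that grDevices passes to solve(). A build with an incompatible LAPACK would be expected to crash or return garbage at the solve() step, although that is an assumption and not something verified in this thread.

```r
# Sketch: report which BLAS/LAPACK this R build uses, then exercise LAPACK.
blas   <- extSoftVersion()[["BLAS"]]  # file providing BLAS ("" if unknown)
lapack <- La_library()                # file providing LAPACK ("" if unknown)
cat("BLAS:   ", blas, "\nLAPACK: ", lapack, "\nVersion:", La_version(), "\n")

m <- matrix(c(2, 1, 0,
              1, 3, 1,
              0, 1, 4), nrow = 3, byrow = TRUE)
inv <- solve(m)                     # solve.default() calls into LAPACK
err <- max(abs(inv %*% m - diag(3)))
err                                 # should be at rounding-error level
```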
Re: [R] grDevices segfault when building R4.4.0 on RHEL 9.1.
Hi Ivan,

Apologies, I was away for quite a bit.

To reproduce the setup: I have been using the default GCC in RHEL 9.1.

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-host-pie --enable-host-bind-now --enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --enable-plugin --enable-initfini-array --without-isl --enable-multilib --with-linker-hash-style=gnu --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_64=x86-64-v2 --with-arch_32=x86-64 --build=x86_64-redhat-linux --with-build-config=bootstrap-lto --enable-link-serialization=1
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.1 20230605 (Red Hat 11.4.1-2) (GCC)

I have been building R 4.4.0 and 4.4.1 with FlexiBLAS and with the built-in R BLAS/LAPACK.

R BLAS:

./configure --prefix=/tools/R/$RVER --enable-R-shlib --enable-memory-profiling --with-pcre2=/tools/pcre2/10.42

FlexiBLAS:

PKG_CONFIG_PATH=/tools/flexiblas/3.4.2/lib64/pkgconfig ./configure --prefix=/tools/R/flexiblas/4.4.1 --enable-R-shlib --enable-memory-profiling --with-pcre2=/tools/pcre2/10.42 --with-blas="-lflexiblas -L/tools/flexiblas/3.4.2/lib64" --with-lapack

I realised the build fails when the value of "--with-lapack" is left unspecified, even though the configure output shows this:

  Source directory:            .
  Installation directory:      /tools/R/flexiblas
  C compiler:                  gcc -g -O2
  Fortran fixed-form compiler: gfortran -g -O2
  Default C++ compiler:        g++ -std=gnu++17 -g -O2
  C++11 compiler:              g++ -std=gnu++11 -g -O2
  C++14 compiler:              g++ -std=gnu++14 -g -O2
  C++17 compiler:              g++ -std=gnu++17 -g -O2
  C++20 compiler:              g++ -std=gnu++20 -g -O2
  C++23 compiler:              g++ -std=gnu++23 -g -O2
  Fortran free-form compiler:  gfortran -g -O2
  Obj-C compiler:
  Interfaces supported:        X11, tcltk
  External libraries:          pcre2, readline, BLAS(FlexiBlas), LAPACK(in blas), curl, libdeflate
  Additional capabilities:     PNG, JPEG, TIFF, NLS, cairo, ICU
  Options enabled:             shared R library, R profiling, memory profiling, libdeflate for lazyload
  Capabilities skipped:
  Options not enabled:         shared BLAS
  Recommended packages:        yes

I replaced the "--with-lapack" flag with "--with-lapack='-lflexiblas -L/tools/flexiblas/3.4.2/lib64'" and everything built ok.

From a quick check in my emails, it seems the RHEL 9 system LAPACK packages are broken. I will test a bit more.

If you need a Singularity container in the future, I can provide one with the R dependencies installed. We set up dependencies similar to:

https://github.com/rstudio/r-builds/blob/main/builder/Dockerfile.rhel-9

or

subscription-manager repos --enable codeready-builder-for-rhel-9-$(arch)-rpms
dnf install -y yum-utils
dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
yum-builddep -y R

I am just bringing this to your attention because the issue does not appear to be with R: I was unable to reproduce it on Rocky Linux 9.1.

Kind regards and thanks!

Miguel Esteva
Senior ITS Research Systems Engineer
The Walter and Eliza Hall Institute of Medical Research
1G Royal Parade Parkville VIC 3052 Australia
Phone (03) 9345 2909
Email estev...@wehi.edu.au
Web http://www.wehi.edu.au

From: Ivan Krylov
Sent: Friday, 3 May 2024 9:40 PM
To: Miguel Esteva via R-help
Cc: Miguel Esteva
Subject: Re: [R] grDevices segfault when building R4.4.0 on RHEL 9.1.

Dear Miguel Esteva,

I couldn't get a Red Hat "ubi9" container to install enough dependencies to build R. Is there a way to reproduce your setup on a virtual machine somewhere?

On Fri, 3 May 2024 00:42:43 + Miguel Esteva via R-help wrote:

> *** caught segfault ***
>
> address 0x1801fa8f70, cause 'memory not mapped'
>
>
> Traceback:
>
> 1: solve.default(rgb)

This seems to crash inside the BLAS. Which BLAS are you using? Any custom ./configure arguments? Which compilers are you running?

To find out more information about the crash, try to follow it with a debugger. Change directory to src/library/grDevices and run:

_R_COMPILE_PKGS_=1 R_COMPILER_SUPPRESS_ALL=1 \
 R_DEFAULT_PACKAGES=NULL LC_ALL=C \
 ../../../bin/
Re: [R] reticulate + virtual environments
On Mon, 15 Jul 2024 07:56:17 +0200, Sigbert Klinke wrote:

> > py_config()
> > use_virtualenv("mmstat4.hu.data", required = TRUE)

Does it help _not_ to call py_config() before use_virtualenv()?

help(py_config) says that it forces the initialization of Python. When you later ask for a different virtual environment, no conflict is detected because normalizePath('/home/sk/.virtualenvs/r-reticulate/bin/python') is identical to normalizePath('/home/sk/.virtualenvs/mmstat4.hu.data/bin/python'): both must be symlinks to /usr/bin/python3, so reticulate likely thinks that it's the same Python. Thus Python is not initialised again, but you also don't see an error.

-- 
Best regards,
Ivan
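The symlink effect described above can be reproduced with base R alone. This is a rough sketch under assumptions: the file and link names are hypothetical stand-ins for the two virtualenv interpreters, and it needs a platform where file.symlink() works (typically Linux or macOS).

```r
# Two hypothetical "virtualenv" interpreters that are symlinks to one file
tmp <- tempfile("venv-demo-")
dir.create(tmp)
real_python <- file.path(tmp, "python3")
file.create(real_python)

env_a <- file.path(tmp, "env-a-python")
env_b <- file.path(tmp, "env-b-python")
linked <- file.symlink(real_python, env_a) && file.symlink(real_python, env_b)

if (linked) {
  # Different paths on disk, same path once the symlinks are resolved
  same_target <- identical(normalizePath(env_a), normalizePath(env_b))
  print(same_target)
}
```

With reticulate itself, the suggestion in the message amounts to calling use_virtualenv("mmstat4.hu.data", required = TRUE) first and py_config() only afterwards.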
Re: [R] Reinterpret data without saving it to a file 1st? Check for integer stopping at 1st decimal?
On Sun, 14 Jul 2024 03:16:56 -0400, DynV Montrealer wrote:

> Perhaps some way to break the spreadsheet data (eg XLdata <-
> read_excel(...)), then put it back together without any writing to a
> file (eg XLdataReformed <- reform(XLdata)) ?

read_excel() is documented to return objects of class tibble:
https://cran.r-project.org/package=tibble/vignettes/tibble.html

Long story short, tibbles are named lists of columns, so it should be possible for you to access and replace their individual parts using the standard list subset syntax XLdata[[columnname]].

Lists are described in R Intro chapter 6 and many other books on R:
https://cran.r-project.org/doc/manuals/R-intro.html#Lists-and-data-frames
http://web.archive.org/web/20230415001551if_/http://ashipunov.info/shipunov/school/biol_240/en/visual_statistics.pdf
(see section 3.8.2 on page 93 and following)

> In addition, from is.integer() documentation I ran
>
> > is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) abs(x
> > - round (x)) < tol
>
> and I'm now trying to have it stop at the 1st decimal content of a
> column.

If you'd like to write idiomatic R code, consider the fact that is.wholenumber is vectorised:

is.wholenumber(c(1,2,3,pi))
# [1] TRUE TRUE TRUE FALSE

Given a vector of numbers, it will return a vector of the same length specifying whether each element can be considered a whole number. Combine it with all() and you can test the whole column in two function calls.

R also has a type.convert function that may be useful in this case:
https://search.r-project.org/R/refmans/utils/html/type.convert.html

-- 
Best regards,
Ivan
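As an illustration of the two ideas above (list-style column access and all(is.wholenumber(...))), here is a minimal sketch. The data frame XLdata and its column names are made up and stand in for the result of read_excel(); a tibble should behave the same way with [[ ]].

```r
is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) abs(x - round(x)) < tol

# Hypothetical stand-in for the imported spreadsheet
XLdata <- data.frame(count = c(1, 2, 3), ratio = c(1, 2.5, 3))

# One logical per column: TRUE when every value is a whole number
whole_cols <- vapply(XLdata, function(col) all(is.wholenumber(col)), logical(1))
print(whole_cols)

# Replace the qualifying columns in place using list-style subsetting
for (nm in names(XLdata)[whole_cols]) {
  XLdata[[nm]] <- as.integer(XLdata[[nm]])
}
str(XLdata)
```

Nothing is written to a file: the columns are reinterpreted in memory and the object is put back together by assignment.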
Re: [R] Problem loading BiodiversityR, Error: package ‘tcltk’ could not be loaded
On Sat, 13 Jul 2024 16:04:17 +0100, Adam Hillier wrote:

> error: X11 library is missing: install XQuartz from www.xquartz.org

Does the problem go away if you install XQuartz from www.xquartz.org?

"R installation and administration" section 4 also documents the requirement to have XQuartz installed in order to use the tcltk package (which is part of R itself) and the x11() device:
https://cran.r-project.org/doc/manuals/R-admin.html#Installing-R-under-macOS

For macOS-specific problems, r-sig-...@r-project.org may give you more precise advice. If installing XQuartz doesn't help, make sure to provide your sessionInfo() output.

-- 
Best regards,
Ivan
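A quick way to see whether the running R session can use Tcl/Tk and X11 at all, before loading BiodiversityR, might be the following base R check. It is only a sketch; the message text is made up.

```r
# TRUE/FALSE for each capability; on CRAN macOS builds both rely on XQuartz
caps <- capabilities(c("tcltk", "X11"))
print(caps)

if (!isTRUE(caps[["tcltk"]])) {
  message("tcltk is not available in this R session")
}
```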
Re: [R] grep
which(grepl()) looks odd. Doesn't grep by itself return the correct vector of indices?

Regards,
Jorgen Harmse.

On Fri, 12 Jul 2024 17:42:05 +0800, Steven Yen wrote:

> Sorry. grepl worked:
>
> which(grepl("very|somewhat",names(goprobit.p$est)))
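The equivalence is easy to check on a small example. The names below are hypothetical and merely stand in for names(goprobit.p$est).

```r
est_names <- c("very likely", "somewhat likely", "unlikely", "very unlikely")

idx_grep  <- grep("very|somewhat", est_names)          # indices directly
idx_which <- which(grepl("very|somewhat", est_names))  # logical vector, then indices

print(idx_grep)
print(identical(idx_grep, idx_which))
```

grepl() remains the better choice when a logical mask is needed, for example to combine several conditions with & or |.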
Re: [R] Weird R Studio behaviour...
On Tue, 9 Jul 2024 13:02:17 +, Levent TERLEMEZ via R-help wrote:

> System is on W 10, and R 3.4.1 is working with R Studio. R Studio is
> updated today

https://posit.co/download/rstudio-desktop/ says "RStudio requires R 3.6.0+", so I'm afraid they don't support this configuration any more.

-- 
Best regards,
Ivan
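Whether an installed R meets that requirement can be checked from R itself. A one-line sketch, with the threshold taken from the download page quoted above:

```r
# Compare the running R version against the minimum quoted by RStudio
r_ok <- getRversion() >= "3.6.0"
print(r_ok)
```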
Re: [R] Weird R Studio behaviour...
I disagree. UTF-8 is far from new. The IDE should not fail to handle a well-known technology without providing further messaging.

What happens when you run the same code within R apart from the IDE?

What trace work have you accomplished with this special character in a separate test script?

It may be how Windows 10 is handling UTF-8 after the latest update of optional packages that Microsoft released last week. Run a test in CMD on UTF-8 and see what you discover.

*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com

On 7/9/24 11:24, Bert Gunter wrote:

I think you should also update R to the latest version, as that *might* be the source of the problem. Others may be able to give you a specific diagnosis, but updating R to a (reasonably, at least) current version is good practice anyway.

Cheers,
Bert

On Tue, Jul 9, 2024 at 8:11 AM Levent TERLEMEZ via R-help <r-help@r-project.org> wrote:

Hi,

Have a nice week. First of all, I know this is not an RStudio forum, but I want to ask here first, if you do not mind. I am away from my computer right now, but I have a strange problem (at least to me). My script worked perfectly for a year, and today it suddenly stopped working because RStudio began to warn me about illegal characters in the script. The system is on Windows 10, and R 3.4.1 is working with RStudio. RStudio was updated today to the latest version in the hope of resolving the problem (but no luck), and both are used with their default installation settings. The problem may not be reproducible right now, but if it is, I can give a detailed example later. The original working code is this (there is no syntax error in the code, because it was working perfectly until today; the error also appeared before the RStudio update, as I mentioned):

legend(c("Kapanış","20 Günlük","50 Günlük"),col=c("black"...
The warning is this:

Error: unexpected symbol inside: "
Encoding(kill3) <- "latin1"
legend(c(kill1,kill2,"50 G�nl�k"),col=c("black"…

It can be solved by converting "ü" to "u". Another solution (at least for me) is this:

kill1<-"Kapanış"
Encoding(kill1) <- "UTF-8"

(These two statements are not needed, but for the sake of code consistency they are applied too. If kill1 is converted to latin1, it breaks.)

kill2<-"22 Günlük MA"
Encoding(kill2) <- "latin1"
kill3<-"50 Günlük MA"
Encoding(kill3) <- "latin1"

Also, the encoding option is set to "Ask", and "UTF-8" is always selected. But I also wonder: why today, and what changed so that RStudio suddenly stopped running the script? I cannot follow the changes as closely as I used to, and if this is a character set problem, it keeps coming back again and again. What is the permanent solution to this? This is like an endless problem…

With my best regards and thanks for your patience,
Levent Terlemez.
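One way to run the kind of test suggested above in plain R, outside the IDE, is to look at the session locale and at the bytes behind a label. This is only a diagnostic sketch: the label is built from Unicode escapes so that the snippet itself is pure ASCII and cannot be misread by an editor.

```r
# "50 Günlük" written with escapes, then forced to a known encoding
label <- enc2utf8("50 G\u00fcnl\u00fck")

print(l10n_info())      # is the session running in UTF-8 or Latin-1?
print(Encoding(label))  # declared encoding of the string

# The same text has different bytes in the two encodings
utf8_bytes   <- charToRaw(label)
latin1_bytes <- charToRaw(iconv(label, from = "UTF-8", to = "latin1"))
print(utf8_bytes)
print(latin1_bytes)
```

If a script file was saved in one encoding and is read as the other, the two-byte UTF-8 sequence for "ü" and the single Latin-1 byte get confused, which is consistent with the garbled label in the error message.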
[R] Weird R Studio behaviour...
Hi,

Have a nice week. First of all, I know this is not an RStudio forum, but I want to ask here first, if you do not mind. I am away from my computer right now, but I have a strange problem (at least to me). My script worked perfectly for a year, and today it suddenly stopped working because RStudio began to warn me about illegal characters in the script. The system is on Windows 10, and R 3.4.1 is working with RStudio. RStudio was updated today to the latest version in the hope of resolving the problem (but no luck), and both are used with their default installation settings. The problem may not be reproducible right now, but if it is, I can give a detailed example later. The original working code is this (there is no syntax error in the code, because it was working perfectly until today; the error also appeared before the RStudio update, as I mentioned):

legend(c("Kapanış","20 Günlük","50 Günlük"),col=c("black"...

The warning is this:

Error: unexpected symbol inside: "
Encoding(kill3) <- "latin1"
legend(c(kill1,kill2,"50 G�nl�k"),col=c("black"…

It can be solved by converting "ü" to "u". Another solution (at least for me) is this:

kill1<-"Kapanış"
Encoding(kill1) <- "UTF-8"

(These two statements are not needed, but for the sake of code consistency they are applied too. If kill1 is converted to latin1, it breaks.)

kill2<-"22 Günlük MA"
Encoding(kill2) <- "latin1"
kill3<-"50 Günlük MA"
Encoding(kill3) <- "latin1"

Also, the encoding option is set to "Ask", and "UTF-8" is always selected. But I also wonder: why today, and what changed so that RStudio suddenly stopped running the script? I cannot follow the changes as closely as I used to, and if this is a character set problem, it keeps coming back again and again. What is the permanent solution to this? This is like an endless problem…

With my best regards and thanks for your patience,
Levent Terlemez.
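A commonly used way to make such a script independent of the file encoding, which nobody in this thread proposed and which is offered here only as a sketch, is to write the non-ASCII letters as Unicode escapes. The three labels mirror the legend() call in the message; note also that "ı" and "ş" do not exist in Latin-1 at all, which may be why marking kill1 as latin1 breaks it.

```r
# Pure-ASCII source: \u0131 is "ı", \u015f is "ş", \u00fc is "ü"
labels <- c("Kapan\u0131\u015f", "20 G\u00fcnl\u00fck", "50 G\u00fcnl\u00fck")

print(labels)
print(Encoding(labels))
label_lengths <- nchar(labels)  # counted in characters, not bytes
print(label_lengths)
```

The vector can then be passed to legend() in place of the literal strings.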
[R] package spline - default value of Boundary.knots of ns
Dear Maintainer,

Thanks for the excellent package splines. I am writing to ask you to consider a suggestion regarding the function ns.

While trying to rework an example from a textbook, I couldn't call ns with appropriate arguments to reproduce the results. The package documentation also couldn't help me find the problem. Finally, I found a Stack Exchange question (https://stats.stackexchange.com/questions/588769/natural-splines-in-r-with-ns) which helped me understand the problem: the default values of the boundary knots are not useful. The problem is described in the Stack Exchange question, which I kindly ask you to read.

My suggestion is to change the default value of the argument Boundary.knots to NULL. If it is NULL, its values would be calculated inside the function body from the extreme values of the argument knots; otherwise, whatever numerical value was assigned in the call would be kept. I think it is more intuitive for the user to specify one set of knots, assuming that its minimum and maximum will be used as the knots beyond which the regression is linear, than to have to know that the function automatically calculates boundary knots which are not appropriate and therefore need to be overridden.

Just so that I am clear, an example. Assume that my variable *alcohol* has values from 1 to 100 and I want to specify natural splines at knots 20, 40 and 60, expecting linearity to hold below 20 and above 60. Currently, I have to specify the knots as 40 and the boundary knots as 20 and 60:

ns(alcohol , knots = c(40), Boundary.knots = c(20,60))

If I incorrectly assume that correct values of the boundary knots are calculated by default, I will specify the knots as 20, 40 and 60:

ns(alcohol , knots = c(20,40,60))

and get incorrect values for the boundary knots, 1 and 100. (I have done this, and the Stack Exchange post shows that I am not alone.)
Hope my suggestion will be considered,

Thanks,
ajith
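The difference between the two calls can be seen by inspecting the "Boundary.knots" attribute of the returned basis. A small sketch, assuming a made-up alcohol variable that simply takes the values 1 to 100:

```r
library(splines)

alcohol <- 1:100  # hypothetical values spanning 1 to 100

# Default: boundary knots are taken from range(x), i.e. 1 and 100
b_default <- ns(alcohol, knots = c(20, 40, 60))
print(attr(b_default, "Boundary.knots"))

# Explicit: linear below 20 and above 60, one interior knot at 40
b_explicit <- ns(alcohol, knots = 40, Boundary.knots = c(20, 60))
print(attr(b_explicit, "Boundary.knots"))

# The two bases also differ in size
print(c(default = ncol(b_default), explicit = ncol(b_explicit)))
```

The first call yields a four-column basis with three interior knots, the second a two-column basis, so the fitted models are not interchangeable.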
Re: [R] Bug? plot.formula does need support plot.first / plot.last param in plot.default
On Fri, 05 Jul 2024 14:35:40 +0300, "Erez Shomron" wrote:

> This works as expected:

> with(mtcars, plot(wt, mpg, plot.first = {
> plot.window(range(wt), range(mpg))
> arrows(3, 15, 4, 30)
> }))

I think you meant panel.first, not plot.first. At least I cannot find any mention of plot.first in the R source code. In this example, plot.first ends up being an argument of an internal call from plot.default() to plot.window(), which evaluates its ellipsis arguments. If your plot.first expression returned a non-NULL value, you would also have received a warning:

plot.window(0:1, 0:1, plot.first = message('hello'))
# hello
plot.window(0:1, 0:1, plot.first = 123)
# Warning message:
# In plot.window(0:1, 0:1, plot.first = 123) :
#   "plot.first" is not a graphical parameter

It is indeed documented that "passing [panel.first] from other ‘plot’ methods may well not work since it may be evaluated too early". The plot.formula method deliberately evaluates the arguments in the ellipsis, and the workaround suggested in https://bugs.r-project.org/show_bug.cgi?id=14591 doesn't help because the expression is then evaluated in an undesired environment (parent frame, not data).

You are correct that plot.formula tries to evaluate all its remaining arguments in the context of the data passed to the method. In order for the lazy evaluation to work, plot.formula would have to

(1) know and skip all such arguments by name on line 6, minding partial matching,
(2) rewrite them into the form evalq(original_argument_expression, model_frame, parent_frame) so that they would be able to access both the data and the variables visible in the frame of the caller, and
(3) give these expressions to do.call() in place of the original ones.

(1) sounds especially brittle since plot.formula() may dispatch to other plot.* methods. Additionally, great care will need to be taken not to break existing code that calls plot.formula, even if it's already full of workarounds for plot.formula's behaviour.
-- 
Best regards,
Ivan
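For comparison, the documented panel.first argument does work when plot.default() is called directly, and the warning for an unknown argument can be captured programmatically. A minimal sketch, using a null PDF device so that nothing is written to disk:

```r
pdf(NULL)  # open a device that produces no file

# panel.first is evaluated lazily by plot.default, after the axes are set up
with(mtcars, plot(wt, mpg, panel.first = arrows(3, 15, 4, 30)))

# An unknown argument such as plot.first reaches plot.window(),
# which warns when the evaluated value is not NULL
msg <- tryCatch(
  {
    plot.new()
    plot.window(0:1, 0:1, plot.first = 123)
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)
print(msg)

invisible(dev.off())
```

The with(data, plot(x, y, ...)) form sidesteps plot.formula entirely, which is why it behaves differently from plot(mpg ~ wt, data = mtcars, ...).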
Re: [R] simple problem with unquoting argument
On Wed, 3 Jul 2024 10:13:59 +0200, Troels Ring wrote:

> Now e looks right - but I have been unable to find out how to get the
> string e converted to the proper argument for sum() - i.e. what is
> function xx?

get(e) will return the value of the variable whose name is stored in the variable e.

A more idiomatic variant will require more changes:

1. Create the "adds" variable as a list, so that it can contain other arbitrary R values:

adds <- list()

2. Instead of assigning adds1 <- something(), adds2 <- something_else(), ..., assign to the elements of the list:

adds[[1]] <- something()
adds[[2]] <- something_else()
...

3. Now you can use the same syntax to access the elements of the list:

SS[i] <- sum(adds[[i]])

As a bonus, you can use the "apply" family of R functions that will perform the loop for you: instead of

SS <- c(); for (i in 1:11) SS[i] <- sum(adds[[i]])

you can write

SS <- vapply(adds, sum, numeric(1))

...and it will perform the same loop inside it, verifying each time that sum(adds[[i]]) returns a single number.

-- 
Best regards,
Ivan

P.S. I'm sorry for letting our project lapse.
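Both variants can be tried side by side on toy data. In this sketch the numeric vectors are invented placeholders for whatever something() and something_else() would return.

```r
# Variant 1: numbered variables looked up by name with get()
adds1 <- c(1, 2, 3)
e <- paste0("adds", 1)
sum_via_get <- sum(get(e))

# Variant 2: a list indexed by position
adds <- list()
adds[[1]] <- c(1, 2, 3)
adds[[2]] <- c(10, 20)
adds[[3]] <- 0.5

SS <- vapply(adds, sum, numeric(1))
print(sum_via_get)
print(SS)
```

The list version needs no string manipulation, and vapply() stops with an error if any element does not reduce to a single number.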
Re: [R] summaryRprof: Unexpected unit for memory profiling
There was a time when people pretty much ignored the distinction between MB and MiB in computer applications, and using the binary version was usually assumed because, well, this _is_ memory we are measuring. I think this is a leftover from that time.

On July 1, 2024 6:33:43 AM PDT, "Sauer, Lukas Daniel" wrote:
>Hello,
>
>I am profiling memory usage using utils::Rprof() and subsequently summarizing
>the profile using utils::summaryRprof(). According to the documentation
>?summaryRprof, the option `memory = "both"` reports "memory consumption in Mb
>in addition to the timings", i.e. the unit is megabytes. However, looking at
>the source code
>(https://github.com/wch/r-source/blob/18652de8890d89563b923ff58b45ccb04d9955fe/src/library/utils/R/summRprof.R#L170)
>suggests that memory is reported in mebibytes (division by 1048576 and not by
>10^6). This is in line with the following minimal example:
>
>use_mb <- function(){a <- runif(10^6)}
>use_mib <- function(){b <- runif(1024^2)}
>Rprof("Rprof.out", memory.profiling=TRUE)
>use_mb()
>use_mib()
>Rprof(NULL)
>summaryRprof("Rprof.out", memory="both")
>
>Do not source this code, but execute it line by line. This example returns the
>output:
>
>$by.self
>        self.time self.pct total.time total.pct mem.total
>"runif"      0.04      100       0.04       100      15.6
>
>$by.total
>          total.time total.pct mem.total self.time self.pct
>"runif"         0.04       100      15.6      0.04      100
>"use_mb"        0.02        50       7.6      0.00        0
>"use_mib"       0.02        50       8.0      0.00        0
>
>$sample.interval
>[1] 0.02
>
>$sampling.time
>[1] 0.04
>
>The example was run under:
>
>R version 4.4.0 (2024-04-24 ucrt)
>Platform: x86_64-w64-mingw32/x64
>Running under: Windows 11 x64 (build 22631)
>
>If the unit were megabytes, I would expect mem.total to be 16.4, 8.0, and 8.4
>-- but rather it is 15.6, 7.6, and 8.0. Do you agree that this behavior is
>unexpected or did I overlook something? If yes, I will file a bug report and
>suggest that the documentation is changed to "memory consumption in MiB in
>addition to the timings".
>
>Best regards,
>
>Lukas D Sauer
>Biometrician
>Institute of Medical Biometry
>
>Heidelberg University Hospital | Im Neuenheimer Feld 130.3 | D-69120 Heidelberg
>Tel. +49 6221 56-35036 | Fax. +49 6221 56-4195 | E-Mail: sa...@imbi.uni-heidelberg.de
>biometrie.uni-heidelberg.de | twitter.com/imbi_heidelberg

-- 
Sent from my phone. Please excuse my brevity.
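The figures in the quoted output can be checked by hand. The sketch below assumes that the two test vectors hold 10^6 and 1024^2 doubles of 8 bytes each and ignores the small per-vector header; under that assumption the reported values match division by 1048576 rather than by 10^6.

```r
bytes <- c(use_mb = 8 * 10^6, use_mib = 8 * 1024^2)  # payload of each vector

in_mb  <- bytes / 10^6      # megabytes  (decimal)
in_mib <- bytes / 1048576   # mebibytes  (binary)

print(round(rbind(MB = in_mb, MiB = in_mib), 1))
print(round(c(total_MB = sum(in_mb), total_MiB = sum(in_mib)), 1))
```

The MiB row reproduces 7.6 and 8.0, and the MiB total reproduces 15.6, while the MB row gives the 8.0, 8.4 and 16.4 that the documentation's wording would suggest.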
Re: [R] Create matrix with variable number of columns AND CREATE NAMES FOR THE COLUMNS
Sorkin, John wrote on 01.07.2024 17:54:

#I am trying to write code that will create a matrix with a variable number of columns where the
#number of columns is 1+Grps
#I can do this:
NSims <- 4
Grps <- 5
DiffMeans <- matrix(nrow=NSims,ncol=1+Grps)
DiffMeans

#I have a problem when I try to name the columns of the matrix. I want the first column to be NSims,
#and the other columns to be something like Value1, Value2, . . . Valuen where N=Grps

# I wrote a function to build a list of length Grps
createValuelist <- function(num_elements) {
  for (i in 1:num_elements) {
    cat("Item", i, "\n", sep = "")
  }
}
createValuelist(Grps)

# When I try to assign column names I receive an error:
#Error in dimnames(DiffMeans) <- list(NULL, c("NSim", createValuelist(Grps))) :
# length of 'dimnames' [2] not equal to array extent
dimnames(DiffMeans) <- list(NULL,c("NSim",createValuelist(Grps)))
DiffMeans

# Thank you for your help!

John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center;
PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382

maybe:

NSims <- 4
Grps <- 5
DiffMeans <- matrix(nrow=NSims,ncol=1+Grps,
  dimnames=list(NULL, c('Nsimn', paste('Item', 1:Grps, sep=''))))
DiffMeans

best,
Heinz
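The original attempt fails because cat() only prints its arguments and returns NULL, so c("NSim", createValuelist(Grps)) has length 1 instead of 1+Grps. A sketch of the same idea with the column names requested in the question ("NSim", "Value1", ...), using paste0() to return the names as a character vector:

```r
NSims <- 4
Grps <- 5

# paste0() returns the names; cat() would only print them
col_names <- c("NSim", paste0("Value", seq_len(Grps)))

DiffMeans <- matrix(NA_real_, nrow = NSims, ncol = 1 + Grps,
                    dimnames = list(NULL, col_names))
print(DiffMeans)
```

Because the names are generated from Grps, the matrix and its column names stay in step when the number of groups changes.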
Re: [R] Create matrix with variable number of columns AND CREATE NAMES FOR THE COLUMNS
I think you should reconsider your goal. Matrices must have all elements of the same type, and in this case you seem to be trying to mix a count of something (integer) with mean values (double). These would normally be stored together in a data frame, or separately in a vector for the counts and a matrix for the means. If you are just thinking about data presentation, a data frame would be a better choice than a single matrix.

On July 1, 2024 8:54:21 AM PDT, "Sorkin, John" wrote:
>#I am trying to write code that will create a matrix with a variable number of
>columns where the #number of columns is 1+Grps
>#I can do this:
>NSims <- 4
>Grps <- 5
>DiffMeans <- matrix(nrow=NSims,ncol=1+Grps)
>DiffMeans
>
>#I have a problem when I try to name the columns of the matrix. I want the
>first column to be NSims, #and the other columns to be something like Value1,
>Value2, . . . Valuen where N=Grps
>
># I wrote a function to build a list of length Grps
>createValuelist <- function(num_elements) {
>  for (i in 1:num_elements) {
>    cat("Item", i, "\n", sep = "")
>  }
>}
>createValuelist(Grps)
>
># When I try to assign column names I receive an error:
>#Error in dimnames(DiffMeans) <- list(NULL, c("NSim", createValuelist(Grps))) :
># length of 'dimnames' [2] not equal to array extent
>dimnames(DiffMeans) <- list(NULL,c("NSim",createValuelist(Grps)))
>DiffMeans
>
># Thank you for your help!
>
>
>John David Sorkin M.D., Ph.D.
>Professor of Medicine, University of Maryland School of Medicine;
>Associate Director for Biostatistics and Informatics, Baltimore VA Medical
>Center Geriatrics Research, Education, and Clinical Center;
>PI Biostatistics and Informatics Core, University of Maryland School of
>Medicine Claude D. Pepper Older Americans Independence Center;
>Senior Statistician University of Maryland Center for Vascular Research;
>
>Division of Gerontology and Palliative Care,
>10 North Greene Street
>GRECC (BT/18/GR)
>Baltimore, MD 21201-1524
>Cell phone 443-418-5382

-- 
Sent from my phone. Please excuse my brevity.
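The data frame layout suggested above might look like the following. This is a sketch under assumptions: the first column is taken to be an integer simulation index, and the means are random placeholders, since the original post does not show real values.

```r
NSims <- 4
Grps <- 5

set.seed(1)  # placeholder values only
means <- matrix(rnorm(NSims * Grps), nrow = NSims,
                dimnames = list(NULL, paste0("Value", seq_len(Grps))))

# Integer column and double columns can coexist in a data frame
results <- data.frame(NSim = seq_len(NSims), means)
str(results)
```

Each column keeps its own type here, whereas a single matrix would coerce the integer column to double.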