from:"R Help"

Re: [R] "And" condition spanning over multiple columns in data frame

2024-09-12 Thread Ivan Krylov via R-help

В Thu, 12 Sep 2024 09:42:57 +0200
Francesca  пишет:

> c10Dt <-  mutate(c10Dt, exit1= ifelse(is.na(cp1) & id!=1, 1, 0))

> So, I create a new variable, called exit1, in which the program
> selects cp1, checks if it is NA, and if it is NA but also the value
> of the column "id" is not 1, then it gives back a 1, otherwise 0.
> So, what I want is that it selects all the cases in which the id=2,3,
> or 4 is not NA in the corresponding values of the matrix.

Since all your columns except the first one are the desired "cp*"
columns, you can obtain your "exit" columns in bulk:

(
 c10Dt$id != 1 & # will be recycled column-wise, as we need
 is.na(c10Dt[-1])
) |>
# ...and then convert back into a data.frame,
as.data.frame() |>
# rename the columns...
(\(x) setNames(x, sub('cp', 'exit', names(x() |>
# ...and finally attach to the original data.frame
cbind(c10Dt)

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Stop or limit to console printing large list and so on

2024-09-11 Thread Fer Arce via R-help


Hello Stephen.
I am not sure of the exact details of your problem, but following the 
second part of your e-mail, if you accidentally print a large object in 
the console and do not want to wait (i.e. you want to stop printing), 
just press C-c C-c and it will stop it (it will stop any process 
happening in the console, the same if you send a loong loop and want to 
abort...)

cheers
F.

On 9/11/24 15:44, stephen sefick wrote:

Stephen Sefick


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] non-conformable arrays

2024-09-11 Thread Ivan Krylov via R-help

В Wed, 11 Sep 2024 17:55:53 +0200
Thierry Onkelinx  пишет:

> For those how like a fully reproducible example:
> the offending line in the code:
> https://github.com/inbo/multimput/blob/e1cd0cdff7d2868e4101c411f7508301c7be7482/R/impute_glmermod.R#L65
> a (failing) unit test for the code:
> https://github.com/inbo/multimput/blob/e1cd0cdff7d2868e4101c411f7508301c7be7482/tests/testthat/test_ccc_hurdle_impute.R#L10

Setting options(error = recover) and evaluating your test expression
directly using
eval(parse('testthat/test_ccc_hurdle_impute.R')[[1]][[3]]), I see:

Browse[1]> names(random)
[1] "Year"
Browse[1]> paste("~0 + ", 'Year') |>
+ as.formula() |>
+ model.matrix(data = data[missing_obs,]) |>
+ str()
 num [1:68, 1] 7 6 9 4 3 5 10 1 6 8 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:68] "37" "56" "59" "114" ...
  ..$ : chr "Year"
 - attr(*, "assign")= int 1
Browse[1]> t(random[['Year']]) |> str()
 num [1:19, 1:10] 0.175 0.181 0.102 0.119 0.158 ...

...and they are indeed non-conformable. Why is random$Year a matrix?

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Stop or limit to console printing large list and so on

2024-09-11 Thread Ivan Krylov via R-help

В Wed, 11 Sep 2024 09:44:05 -0400
stephen sefick  пишет:

> I am having a problem with accidentally typing an object name at the
> console that is a very large list and then having to wait for it to be
> printed until I can resume my work.

Does it help to interrupt the process?

https://www.gnu.org/software/emacs/manual/html_mono/emacs.html#index-C_002dc-C_002dc-_0028Shell-mode_0029
https://ess.r-project.org/Manual/ess.html#index-interrupting-R-commands

I'm afraid that the behaviour of the print() method is very
class-dependent and limiting options(max.print=...) may not help in
your case.

-- 
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Reading a txt file from internet

2024-09-07 Thread Jeff Newmiller via R-help

I tried it on R 4.4.1 on Linux Mint 21.3 just before I posted it, and I just 
tried it on R 3.4.2 on Ubuntu 16.04 and R 4.3.2 on Windows 11 just now and it 
works on all of them.

I don't have a big-endian machine to test on, but the Unicode spec says to 
honor the BOM and if there isn't one to assume that it is big-endian data. But 
in this case there is a BOM so your machine has a buggy decoder?

On September 7, 2024 2:43:24 PM PDT, Duncan Murdoch  
wrote:
>On 2024-09-07 4:52 p.m., Jeff Newmiller via R-help wrote:
>> When you specify LE in the encoding type, you are logically telling the 
>> decoder that you know the two-byte pairs are in little-endian order... which 
>> could override whatever the byte-order-mark was indicating. If the BOM 
>> indicated big-endian then the file decoding would break. If there is a BOM, 
>> don't override it unless you have to (e.g. for a wrong BOM)... leave off the 
>> LE unless you really need it.
>
>That sounds like good advice, but it doesn't work:
>
> > read.delim(
> + 'https://online.stat.psu.edu/onlinecourses/sites/stat501/files 
> /ch15/employee.txt',
> + fileEncoding = "UTF-16"
> + )
> [1] time 
>
>
>
>
>
>
>
>
>
>
>
>
>
> [2] 
> vendor.洀攀琀愀氀㐀㐀㜀.㐀㐀㤀.㐀㐀.㐀..㐀.㐀..㐀..㔀...㜀.㐀..㠀..㘀...㠀.㐀㐀㜀...㔀.㐀㐀.
>
>and so on.
>> 
>> On September 7, 2024 1:22:23 PM PDT, Enrico Schumann 
>>  wrote:
>>> On Sun, 08 Sep 2024, Christofer Bogaso writes:
>>> 
>>>> Hi,
>>>> 
>>>> I am trying to the data from
>>>> https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt
>>>> without any success. Below is the error I am getting:
>>>> 
>>>>> read.delim('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt')
>>>> 
>>>> Error in make.names(col.names, unique = TRUE) :
>>>> 
>>>>invalid multibyte string at 't'
>>>> 
>>>> In addition: Warning messages:
>>>> 
>>>> 1: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>>>> 
>>>>line 1 appears to contain embedded nulls
>>>> 
>>>> 2: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>>>> 
>>>>line 2 appears to contain embedded nulls
>>>> 
>>>> 3: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>>>> 
>>>>line 3 appears to contain embedded nulls
>>>> 
>>>> 4: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>>>> 
>>>>line 4 appears to contain embedded nulls
>>>> 
>>>> 5: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>>>> 
>>>>line 5 appears to contain embedded nulls
>>>> 
>>>> Is there any way to read this data directly onto R?
>>>> 
>>>> Thanks for your time
>>>> 
>>> 
>>> The  looks like a byte-order mark
>>> (https://en.wikipedia.org/wiki/Byte_order_mark).
>>> Try this:
>>> 
>>> fn <- 
>>> file('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt',
>>>encoding = "UTF-16LE")
>>> read.delim(fn)
>>> 
>> 
>

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Reading a txt file from internet

2024-09-07 Thread Jeff Newmiller via R-help

When you specify LE in the encoding type, you are logically telling the decoder 
that you know the two-byte pairs are in little-endian order... which could 
override whatever the byte-order-mark was indicating. If the BOM indicated 
big-endian then the file decoding would break. If there is a BOM, don't 
override it unless you have to (e.g. for a wrong BOM)... leave off the LE 
unless you really need it.

On September 7, 2024 1:22:23 PM PDT, Enrico Schumann  
wrote:
>On Sun, 08 Sep 2024, Christofer Bogaso writes:
>
>> Hi,
>>
>> I am trying to the data from
>> https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt
>> without any success. Below is the error I am getting:
>>
>>> read.delim('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt')
>>
>> Error in make.names(col.names, unique = TRUE) :
>>
>>   invalid multibyte string at 't'
>>
>> In addition: Warning messages:
>>
>> 1: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>>
>>   line 1 appears to contain embedded nulls
>>
>> 2: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>>
>>   line 2 appears to contain embedded nulls
>>
>> 3: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>>
>>   line 3 appears to contain embedded nulls
>>
>> 4: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>>
>>   line 4 appears to contain embedded nulls
>>
>> 5: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>>
>>   line 5 appears to contain embedded nulls
>>
>> Is there any way to read this data directly onto R?
>>
>> Thanks for your time
>>
>
>The  looks like a byte-order mark
>(https://en.wikipedia.org/wiki/Byte_order_mark).
>Try this:
>
>fn <- 
> file('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt',
>   encoding = "UTF-16LE")
>read.delim(fn)
>

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Reading a txt file from internet

2024-09-07 Thread Jeff Newmiller via R-help

Add the

   fileEncoding = "UTF-16"

argument to the read call.

For a human explanation of why this is going on I recommend [1]. For a more 
R-related take, try [2].

For reference, I downloaded your file and used the "file" command line program 
typically available on Linux (and possibly MacOSX) which will tell you about 
what encoding is used in a particular file.

[1] https://www.youtube.com/watch?v=4mRxIgu9R70
[2] https://kevinushey.github.io/blog/2018/02/21/string-encoding-and-r/

On September 7, 2024 12:56:36 PM PDT, Christofer Bogaso 
 wrote:
>Hi,
>
>I am trying to the data from
>https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt
>without any success. Below is the error I am getting:
>
>> read.delim('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt')
>
>Error in make.names(col.names, unique = TRUE) :
>
>  invalid multibyte string at 't'
>
>In addition: Warning messages:
>
>1: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>
>  line 1 appears to contain embedded nulls
>
>2: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>
>  line 2 appears to contain embedded nulls
>
>3: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>
>  line 3 appears to contain embedded nulls
>
>4: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>
>  line 4 appears to contain embedded nulls
>
>5: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>
>  line 5 appears to contain embedded nulls
>
>Is there any way to read this data directly onto R?
>
>Thanks for your time
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?

2024-09-06 Thread Jorgen Harmse via R-help

It seems to me that the documentation of R's complex class & R's atan function 
do not tell us what to expect, so (as others have suggested), some additional 
notes are needed. I think that mathematically atan(1i) should be NA_complex_, 
but R seems not to use any mathematically standard compactification of the 
complex plane (and I'm not sure that IEEE does either).

Incidentally, the signature of the complex constructor is confusing. 
complex(1L) returns zero, but complex(1L, argument=theta) is an element of the 
unit circle. The defaults suggest ambiguous results in case only length.out is 
specified, and you have to read a parenthesis in the details to figure out what 
will happen. Even then, the behaviour in my example is not spelled out 
(although it is suggested by negative inference). Moreover, the real & 
imaginary parts are ignored if either modulus or argument is provided, and I 
don't see that this is explained at all.

R's numeric (& IEEE's floating-point types) seem to approximate a multi-point 
compactification of the real line. +Inf & -Inf fill out the approximation to 
the extended real line, and NaN, NA_real_ & maybe some others handle some cases 
in which the answer does not live in the extended real line. (I'm not digging 
into bit patterns here. I suspect that there are several versions of NaN, but I 
hope that they all behave the same way.) The documentation suggests that a 
complex scalar in R is just a pair of numeric scalars, so we are not dealing 
with the Riemann sphere or any other usually-studied extension of the complex 
plane. Since R distinguishes various complex infinities (and seems to allow any 
combination of numeric values in real & imaginary parts), the usual 
mathematical answer for atan(1i) may no longer be relevant.

The tangent function has an essential singularity at complex infinity (the 
compactification point in the Riemann sphere, which I consider the natural 
extension for the study of meromorphic functions, for example making the 
tangent function well defined on the whole plane), so the usual extension of 
the plane does not give us an answer for atan(1i). However, another possible 
extension is the Cartesian square of the extended real line, and in that 
extension continuity suggests that tan(x + Inf*1i) = 1i and tan(x - Inf*1i) = 
-1i (for x real & finite). That is the result from R's tan function, and it 
explains why atan(1i) in R is not NA or NaN. The specific choice of pi/4 + 
Inf*1i puzzled me at first, but I think it's related to the branch-cut rules 
given in the documentation. The real part of atan((1+.Machine$double.eps)*1i) 
is pi/2, and the real part of atan((1-.Machine$double.eps)*1i) is zero, and 
someone apparently decided to average those for atan(1i).

TL;DR: The documentation needs more details, and I don't really like the 
extended complex plane that R implemented, but within that framework the 
answers for atan(1i) & atan(-1i) make sense.

Regards,
Jorgen Harmse.





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fwd: effects() extractor for a quantile reqression object: error message

2024-09-06 Thread Gregg Powell via R-help

Need to kill some time, so thought I'd Opine.

Given the intent, as I understood it... to extract components from a quantile 
regression (rq) object similar to how one might extract effects from an lm 
object.

Since it seems effects() is not implemented for rq, here are some alternative 
approaches to achieve similar functionality or extract useful information from 
the quantile regression model:

1. Extracting Coefficients. Use the coef() function to extract the coefficients 
of the quantile regression model. This is similar to extracting the effects in 
a linear model.

> coef(qm.9)

This will return the estimated coefficients for the 0.9 quantile.

2. Extracting Fitted Values. To get the fitted values from the quantile 
regression model, use: 

> fitted(qm.9)

This gives the fitted values for the quantile regression model, similar to what 
you'd see in an lm() model.

3. Extracting ResidualsResiduals from the quantile regression can be extracted 
using the resid() function: 

> resid(qm.9)

This gives the residuals for the 0.9 quantile regression model, which can be a 
useful diagnostic for checking model fit.

4. Manually Calculating "Effects".While effects() is not available, you can 
manually calculate effects by examining the design matrix and the coefficients. 
If the term "effects" refers to the influence of different covariates in the 
model, these can be assessed by looking at the coefficients themselves and 
their impact on the fitted values.

5. Using the summary() Function.The summary() function for rq objects 
provides detailed information about the quantile regression fit, including 
coefficient estimates, standard errors, and statistical significance:

> summary(qm.9)

This can give a more comprehensive understanding of how covariates are 
contributing to the model. 

6. Working with Design Matrices.If you need the design matrix for 
further custom calculations, you can extract it using: 

> model.matrix(qm.9) 

 This returns the matrix of predictor variables, which you can then use to 
manually compute the "effects" of changes in the predictors.

7. Explore Partial Effects with predict(). The predict() function can help you 
assess how changes in predictor values affect the outcome. For instance, to 
predict values at specific points in the design space, you can use: 

> predict(qm.9, newdata = data.frame(your_new_data))

8. Bootstrapping to Examine Variability. If you want to assess variability in 
the effects of predictors, you could use bootstrapping. The boot.rq() function 
in the quantreg package allows you to bootstrap the quantile regression 
coefficients, giving insight into the variability of the estimated "effects."

Example:

> boot_results <- boot.rq(y ~ X, tau = 0.9, data = your_data, R = 1000)

> summary(boot_results)

9.Interaction or Additive Effects (If Applicable). If you're trying to capture 
interaction or additive effects, you might need to specify interaction terms 
directly in your formula and then inspect the coefficients for these terms. 
Quantile regression will estimate these coefficients in the same manner as 
linear regression but specific to the quantile of interest.

In conclusion, while effects() is not available for quantile regression, the 
combination of coef(), fitted(), resid(), model.matrix(), and summary() 
provides the main components of a quantile regression model that should provide 
similar insights to what effects() provides for linear models.

Forgive any typos please... I'm on a mobile device.

Kind regards,
Gregg Powell

Sent from Proton Mail Android

 Original Message ----
On 06/09/2024 01:37, Koenker, Roger W  wrote:

>  Apologies,  forgot to copy R-help on this response.
>  
>  Begin forwarded message:
>  
>  
>  From: Roger Koenker 
>  Subject: Re: [R] effects() extractor for a quantile reqression object: error 
> message
>  Date: September 6, 2024 at 8:38:47 AM GMT+1
>  To: "Christopher W. Ryan" 
>  
>  Chris,
>  
>  This was intended to emulate the effects component of lm() fitting, but was 
> never implemented.  Frankly, I don’t quite see on first glance how this works 
> for lm() — it seems to be (mostly) about situations where X is not full rank 
> (see lm.fit) and I also never bothered to implement rq for X that were not 
> full rank.
>  
>  Roger
>  
>  
>  On Sep 6, 2024, at 3:50 AM, Christopher W. Ryan via R-help 
>  wrote:
>  
>  I'm using quantreg package version 5.98 of 24 May 2024, in R 4.4.1 on
>  Linux Mint.
>  
>  The online documentation for quantreg says, in part, under the
>  description of the rq.object, "The coefficients, residuals, and effects
>  may be extracted by the generic functions of the same name, rather than
>  by the $ operator."
>  
>

[R] effects() extractor for a quantile reqression object: error message

2024-09-05 Thread Christopher W. Ryan via R-help

I'm using quantreg package version 5.98 of 24 May 2024, in R 4.4.1 on
Linux Mint.

The online documentation for quantreg says, in part, under the
description of the rq.object, "The coefficients, residuals, and effects
may be extracted by the generic functions of the same name, rather than
by the $ operator."

I create an rq object for the 0.9 quantile, called qm.9

effects(qm.9)

yields, the error message, " effects(qm.9)
Error in UseMethod("effects") :
  no applicable method for 'effects' applied to an object of class "rq"

I'm confused. Appreciate any suggestions. Thanks.

--Chris Ryan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?

2024-09-05 Thread Jeff Newmiller via R-help

atan(1i) -> 0 + Inf i
complex(1/5) -> 0.2 + 0i
atan(1i) -> (0 + Inf i) * (0.2 + 0i)
-> 0*0.2 + 0*0i + Inf i * 0.2 + Inf i * 0i
infinity times zero is undefined
-> 0 + 0i + Inf i + NaN * i^2
-> 0 + 0i + Inf i - NaN
-> NaN + Inf i

I am not sure how complex arithmetic could arrive at another answer.

I advise against messing with infinities... use atan2() if you don't actually 
need complex arithmetic.

On September 5, 2024 3:38:33 PM PDT, Bert Gunter  wrote:
>> complex(real = 0, imaginary = Inf)
>[1] 0+Infi
>
>> Inf*1i
>[1] NaN+Infi
>
>>> complex(real = 0, imaginary = Inf)/5
>[1] NaN+Infi
>
>See the Note in ?complex for the explanation, I think.  Duncan can correct
>if I'm wrong.
>
>-- Bert
>
>On Thu, Sep 5, 2024 at 3:20 PM Leo Mada  wrote:
>
>> Dear Bert,
>>
>> These behave like real divisions/multiplications:
>> complex(re=Inf, im = Inf) * 5
>> # Inf+Infi
>> complex(re=-Inf, im = Inf) * 5
>> # -Inf+Infi
>>
>> The real division / multiplication should be faster and also is well
>> behaved. I was expecting R to do the real division/multiplication on a
>> complex number. Which R actually does for these very particular cases; but
>> not when only Im(x) is Inf.
>>
>> Sincerely,
>>
>> Leonard
>>
>> --
>> *From:* Bert Gunter 
>> *Sent:* Friday, September 6, 2024 1:12 AM
>> *To:* Duncan Murdoch 
>> *Cc:* Leo Mada ; r-help@r-project.org <
>> r-help@r-project.org>
>> *Subject:* Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?
>>
>> Perhaps
>>
>> > Inf*1i
>> [1] NaN+Infi
>>
>> clarifies why it is *not* a bug.
>> (Boy, did that jog some long dusty math memories :-)  )
>>
>> -- Bert
>>
>> On Thu, Sep 5, 2024 at 2:48 PM Duncan Murdoch 
>> wrote:
>>
>> On 2024-09-05 4:23 p.m., Leo Mada via R-help wrote:
>> > Dear R Users,
>> >
>> > Is this desired behaviour?
>> > I presume it's a bug.
>> >
>> > atan(1i)
>> > # 0+Infi
>> >
>> > tan(atan(1i))
>> > # 0+1i
>> >
>> > atan(1i) / 5
>> > # NaN+Infi
>>
>> There's no need to involve atan() and tan() in this:
>>
>>  > (0+Inf*1i)/5
>> [1] NaN+Infi
>>
>> Why do you think this is a bug?
>>
>> Duncan Murdoch
>>
>> ______
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> https://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?

2024-09-05 Thread Leo Mada via R-help

Dear Bert,

These behave like real divisions/multiplications:
complex(re=Inf, im = Inf) * 5
# Inf+Infi
complex(re=-Inf, im = Inf) * 5
# -Inf+Infi

The real division / multiplication should be faster and also is well behaved. I 
was expecting R to do the real division/multiplication on a complex number. 
Which R actually does for these very particular cases; but not when only Im(x) 
is Inf.

Sincerely,

Leonard

From: Bert Gunter 
Sent: Friday, September 6, 2024 1:12 AM
To: Duncan Murdoch 
Cc: Leo Mada ; r-help@r-project.org 
Subject: Re: [R] BUG: atan(1i) / 5 = NaN+Infi ?

Perhaps

> Inf*1i
[1] NaN+Infi

clarifies why it is *not* a bug.
(Boy, did that jog some long dusty math memories :-)  )

-- Bert

On Thu, Sep 5, 2024 at 2:48 PM Duncan Murdoch 
mailto:murdoch.dun...@gmail.com>> wrote:
On 2024-09-05 4:23 p.m., Leo Mada via R-help wrote:
> Dear R Users,
>
> Is this desired behaviour?
> I presume it's a bug.
>
> atan(1i)
> # 0+Infi
>
> tan(atan(1i))
> # 0+1i
>
> atan(1i) / 5
> # NaN+Infi

There's no need to involve atan() and tan() in this:

 > (0+Inf*1i)/5
[1] NaN+Infi

Why do you think this is a bug?

Duncan Murdoch

__
R-help@r-project.org<mailto:R-help@r-project.org> mailing list -- To 
UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] BUG: atan(1i) / 5 = NaN+Infi ?

2024-09-05 Thread Leo Mada via R-help

Dear R Users,

Is this desired behaviour?
I presume it's a bug.

atan(1i)
# 0+Infi

tan(atan(1i))
# 0+1i

atan(1i) / 5
# NaN+Infi

There were some changes in handling of complex numbers. But it looks like a bug.

Sincerely,

Leonard


[[alternative HTML version deleted]]

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding parameters for Benchmark normal distribution in shapiro.test

2024-09-02 Thread Jeff Newmiller via R-help

Wouldn't that be because the sample is not being compared to a specific 
distribution but rather to many possible distributions by MC? [1]

If you think that need not be the case, perhaps you can write your own test... 
but then it will probably be answering a different question?

[1] https://en.m.wikipedia.org/wiki/Shapiro%E2%80%93Wilk_test

On September 2, 2024 4:26:17 PM PDT, Christofer Bogaso 
 wrote:
>Hi,
>
>In ?shapiro.test, there seems to be no option to pass mean and sd
>information of the Normal distribution which I want to compare my
>sample data to.
>
>For example in the code below, I want to test my sample to N(0, 10).
>
>shapiro.test(rnorm(100, mean = 5, sd = 3))
>
>Is there any way to pass the information of the benchmark normal distribution?
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Prediction from Arima model

2024-08-31 Thread David Winsemius via R-help

Sent from my iPhone

> On Aug 31, 2024, at 10:55 AM, Christofer Bogaso  
> wrote:
> 
> Hi,
> 
> I have run following code to obtain one step ahead confidence interval
> from am arima model
> 
> library(forecast)
> 
> set.seed(100)
> 
> forecast(Arima(rnorm(100), order = c(1,0,1), xreg = rt(100, 1)), h =
> 1, xreg = 10)
> 
> However this appear to provide the Prediction interval, however I
> wanted to get the confidence interval for the new value.
> 
> Is there any way to get the confidence interval for the new value?

I’m not sure it makes sense to output a confidence interval when you are 
projecting a time series. If you wanted a confidence interval on the past data 
then you can just use standard descriptive methods. 

— 
David. 
> 
> I also wanted to get the estimate of SE for the new value which is
> used to obtain the confidence interval of the new value. Is there any
> method available to obtain that?
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] aggregating data with quality control

2024-08-31 Thread Ivan Krylov via R-help

В Sat, 31 Aug 2024 11:15:10 +
Stefano Sofia  пишет:

> Evaluating the daily mean indipendently from the status is very easy:
> 
> aggregate(mydf$hs, by=list(format(mydf$data_POSIX, "%Y"),
> format(mydf$data_POSIX, "%m"), format(mydf$data_POSIX, "%d")),
> my.mean)
> 
> 
> Things become more complicated when I need to export also the status:
> this should be "C" when all 48 data have status equal to "C", and
> status "D" when at least one value has status ="D".
> 
> 
> I have no clue on how to do that in an efficient way.

You can make the status into an ordered factor:

# come up with some statuses
status <- sample(c('C', 'D'), 42, TRUE, c(.9, .1))

# convert them into factors, specifying that D is "more than" C
status <- ordered(status, c('C', 'D'))

Since the factor is ordered and can be subject to comparison like
status[1] < status[2], you can now use max() on your groups. If the
sample contains any 'D's, max() will return a 'D', because it's larger
than any 'C's. If the sample contains only 'C's, that's the maximal
value by default.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] lattice panel layout like cross-tabs, like a 2 x 2 table.

2024-08-30 Thread Christopher W. Ryan via R-help

Years ago, I recall creating lattice plots with two binary factors, call
them f1 and f2, as in

xyplot(y ~ x | f1 + f2, data = dd)

and I made it so the rows had strips on the left with the levels of one
factor, and the columns had strips on the top with the levels of the
other factor.  Sort of like strip.left() but for just one of the factors.

I can't remember or find how I did it, what options to set. Can anyone
remind me?

Thanks.

--Chris Ryan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fill NA values in columns with values of another column

2024-08-29 Thread Jeff Newmiller via R-help

Use the ave function.

On August 29, 2024 2:29:16 PM PDT, Bert Gunter  wrote:
>Petr et.al:
>
>I think using merge is a very nice idea! (note that the email omitted the
>last rows of the result, though your code of course produced them)
>
>The only minor problem is that the order of the rows in the result is
>changed from the original. If the OP needs to preserve the original
>ordering, that can be easily done. Here is a complete implementation of
>your idea (I think).
>
>## assume that dat is a data frame with the first two columns as in the
>OP's post, i.e. the first column is the Value with NA's and the second is
>the Group
>
> spl <-  dat |> nrow() |> seq_len() |> split(dat[[2]]) |> unlist() ## for
>reordering
> dat[spl, 1:2] <-
>dat[, 1:2] |>
>na.omit() |>  ## remove rows with NA's
>unique() |>  ## remove duplicate rows
>merge(dat[, 1:2], by.x=2, by.y=2) |> _[, 2:1]  ## and now merge()
>
> Note the final reordering of the first two columns because of the way
>merge() works.
>I suspect there may be a slicker way to do this using unsplit(), but I
>could not figure out how
>
>The result is:
>> dat
>   Value Group
>1  6 8
>2  9 5
>3  2 1
>4  5 6
>5  2 7
>6  7 2
>7  4 4
>8  2 7
>9  2 7
>1010 3
>11 7 2
>12 4 4
>13 5 6
>14 9 5
>15 9 5
>16 5 6
>1710 3
>18 7 2
>19 2 1
>20 2 7
>21 7 2
>22 6 8
>23 4 4
>24 9 5
>25 5 6
>26 2 1
>27 4 4
>28 6 8
>2910 3
>3010 3
>31 6 8
>32 2 1
>
>Cheers,
>Bert
>
>
>
>
>On Wed, Aug 28, 2024 at 10:53 PM Petr Pikal  wrote:
>
>> Hallo Francesca
>>
>> If you had an object with correct setting, something like template
>>
>> > dput(res)
>> structure(list(V1 = c("1", "2", "3", "4", "5", "6", "7", "8"),
>> V2 = c(2, 7, 10, 4, 9, 5, 2, 6)), class = "data.frame", row.names =
>> c("1",
>> "2", "3", "4", "5", "6", "7", "8"))
>>
>> you could merge it with your object where some values are missing
>>
>> > dput(daf)
>> structure(list(X1 = c(6L, 9L, NA, 5L, NA, NA, 4L, 2L, 2L, NA,
>> NA, NA, 5L, 9L, NA, NA, 10L, 7L, 2L, NA, 7L, NA, NA, NA, NA,
>> 2L, 4L, 6L, 10L, NA, NA, NA), X2 = c(8L, 5L, 1L, 6L, 7L, 2L,
>> 4L, 7L, 7L, 3L, 2L, 4L, 6L, 5L, 5L, 6L, 3L, 2L, 1L, 7L, 2L, 8L,
>> 4L, 5L, 6L, 1L, 4L, 8L, 3L, 3L, 8L, 1L)), class = "data.frame", row.names =
>> c(NA,
>> -32L))
>>
>> > merge(daf, res, by.x="X2", by.y="V1")
>>X2 X1 V2
>> 1   1 NA  2
>> 2   1 NA  2
>> 3   1  2  2
>> 4   1  2  2
>> 5   2 NA  7
>> 6   2 NA  7
>> 7   2  7  7
>> 8   2  7  7
>> 9   3 10 10
>> 10  3 NA 10
>> 11  3 10 10
>> 12  3 NA 10
>> 13  4  4  4
>> 14  4 NA  4
>> 15  4  4  4
>> 16  4 NA  4
>> 17  5  9  9
>> 18  5 NA  9
>> 19  5 NA  9
>>
>> Cheers.
>> Petr
>>
>>
>>
>>
>> st 28. 8. 2024 v 0:45 odesílatel Francesca PANCOTTO via R-help <
>> r-help@r-project.org> napsal:
>>
>> > Dear Contributors,
>> > I have a problem with a database composed of many individuals for many
>> > periods, for which I need to perform a manipulation of data as follows.
>> > Here I report the procedure I need to do for the first 32 observations of
>> > the first period.
>> >
>> >
>> > cbind(VB1d[,1],s1id[,1])
>> >   [,1] [,2]
>> >  [1,]68
>> >  [2,]95
>> >  [3,]   NA1
>> >  [4,]56
>> >  [5,]   NA7
>> >  [6,]   NA2
>> >  [7,]44
>> >  [8,]27
>> >  [9,]27
>> > [10,]   NA3
>> > [11,]   NA2
>> > [12,]   NA4
>> > [13,]56
>> > [14,]95
>> > [15,]   NA5
>> > [16,]   NA6
>> > [17,]   103
>> > [18,]72
>> > [19,]21
>> > [20,]   NA7
>> > [21,]72
>> > [22,]   NA8
>> > [23,]   NA4
>> > [24,]   NA5
>> > [25,]   NA6
>> > [26,]21
>> > [27,]4

Re: [R] boxplot of raster and shapefile

2024-08-28 Thread SIBYLLE STÖCKLI via R-help

max   : num 53

  .. .. ..@ band  : int 1

  .. .. ..@ unit  : chr ""

  .. .. ..@ names : chr "Andrena.barbilabris_glo_ensemble"

  ..@ legend  :Formal class '.RasterLegend' [package "raster"] with 5 slots

  .. .. ..@ type  : chr(0) 

  .. .. ..@ values: logi(0) 

  .. .. ..@ color : logi(0) 

  .. .. ..@ names : logi(0) 

  .. .. ..@ colortable: logi(0) 

  ..@ title   : chr(0) 

  ..@ extent  :Formal class 'Extent' [package "raster"] with 4 slots

  .. .. ..@ xmin: num 248

  .. .. ..@ xmax: num 284

  .. .. ..@ ymin: num 107

  .. .. ..@ ymax: num 130

  ..@ rotated : logi FALSE

  ..@ rotation:Formal class '.Rotation' [package "raster"] with 2 slots

  .. .. ..@ geotrans: num(0) 

  .. .. ..@ transfun:function ()  

  ..@ ncols   : int 14400

  ..@ nrows   : int 9200

  ..@ crs :Formal class 'CRS' [package "sp"] with 1 slot

  .. .. ..@ projargs: chr "+proj=somerc +lat_0=46.952405556 
+lon_0=7.439583 +k_0=1 +x_0=260 +y_0=120 +ellps=bessel +units=m 
+no_defs"

  .. .. ..$ comment: chr "PROJCRS[\"unknown\",\nBASEGEOGCRS[\"unknown\",\n  
  DATUM[\"Unknown based on Bessel 1841 ellipsoid\",\n"| __truncated__

  ..@ srs : chr "+proj=somerc +lat_0=46.952405556 
+lon_0=7.4395833333 +k_0=1 +x_0=260 +y_0=120 +ellps=bessel +units=m 
+no_defs"

  ..@ history : list()

  ..@ z   : list()

 

 

> e <- extract(r,v)

Error in (function (classes, fdef, mtable)  : 

  kann keine vererbte Methode finden für Funktion ‘extract’ für Signatur 
‘"RasterLayer", "SpatVector"’

 

 

Kind regards

Sibylle

 

-Original Message-
From: Ivan Krylov  
Sent: Tuesday, August 27, 2024 6:55 PM
To: SIBYLLE STÖCKLI via R-help 
Cc: sibylle.stoec...@gmx.ch
Subject: Re: [R] boxplot of raster and shapefile

 

В Mon, 26 Aug 2024 14:33:02 +0200

SIBYLLE STÖCKLI via R-help < <mailto:r-help@r-project.org> 
r-help@r-project.org> пишет:

 

> > # Extract raster values within the shapefile extracted_values <- 

> > extract(raster_file, shape_file)

 

> > # Assuming the shapefile has multiple polygons and you want to # 

> > create a boxplot for each data_list <- 

> > lapply(1:length(extracted_values), function(i) {

> + data.frame(value = extracted_values[[i]], polygon = i)

> + })

> > data <- do.call(rbind, data_list)

 

> > names(data)

> [1] "value"   "polygon"

> > # Create the boxplot

> > bp<-ggplot(data, aes(x = factor(polygon), y = value)) +  

> + geom_boxplot() +

> + labs(x = "Polygon", y = "Raster Values") +

> + theme_minimal()

> > bp  

> Error in UseMethod("depth") : 

>   no applicable method for 'depth' applied to an object of class

> "NULL"

> In addition: Warning message:

> Removed 452451 rows containing non-finite outside the scale range

> (`stat_boxplot()`). 

 

Thank you for providing a runnable example! Could you please also show

the output of str(extracted_values) and str(data)?

 

-- 

Best regards,

Ivan


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fill NA values in columns with values of another column

2024-08-27 Thread Francesca PANCOTTO via R-help

Dear Contributors,
I have a problem with a database composed of many individuals for many
periods, for which I need to perform a manipulation of data as follows.
Here I report the procedure I need to do for the first 32 observations of
the first period.


cbind(VB1d[,1],s1id[,1])
  [,1] [,2]
 [1,]68
 [2,]95
 [3,]   NA1
 [4,]56
 [5,]   NA7
 [6,]   NA2
 [7,]44
 [8,]27
 [9,]27
[10,]   NA3
[11,]   NA2
[12,]   NA4
[13,]56
[14,]95
[15,]   NA5
[16,]   NA6
[17,]   103
[18,]72
[19,]21
[20,]   NA7
[21,]72
[22,]   NA8
[23,]   NA4
[24,]   NA5
[25,]   NA6
[26,]21
[27,]44
[28,]68
[29,]   103
[30,]   NA3
[31,]   NA8
[32,]   NA1


In column s1id, I have numbers from 1 to 8, which are the id of 8 groups ,
randomly mixed in the larger group of 32.
For each group, I want the value that is reported for only to group
members, to all the four group members.

For example, value 8 in first row , second column, is group 8. The value
for group 8 of the variable VB1d is 6. At row 28, again for s1id equal to
8, I have 6.
But in row 22, the value 8 of the second variable, reports a value NA.
in each group is the same, only two values have the correct number, the
other two are NA.
I need that each group, identified by the values of the variable S1id,
correctly report the number of variable VB1d that is present for just two
group members.

I hope my explanation is acceptable.
The task appears complex to me right now, especially because I will need to
multiply this procedure for x12x14 similar databases.

Anyone has ever encountered a similar problem?
Thanks in advance for any help provided.

--

Francesca Pancotto

Associate Professor Political Economy

University of Modena, Largo Santa Eufemia, 19, Modena

Office Phone: +39 0522 523264

Web: *https://sites.google.com/view/francescapancotto/home
<https://sites.google.com/view/francescapancotto/home>*

 --

[[alternative HTML version deleted]]

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] boxplot of raster and shapefile

2024-08-27 Thread Ivan Krylov via R-help

В Mon, 26 Aug 2024 14:33:02 +0200
SIBYLLE STÖCKLI via R-help  пишет:

> > # Extract raster values within the shapefile
> > extracted_values <- extract(raster_file, shape_file)

> > # Assuming the shapefile has multiple polygons and you want to
> > # create a boxplot for each
> > data_list <- lapply(1:length(extracted_values), function(i) {  
> + data.frame(value = extracted_values[[i]], polygon = i)
> + })
> > data <- do.call(rbind, data_list)  

> > names(data)  
> [1] "value"   "polygon"
 
> > # Create the boxplot
> > bp<-ggplot(data, aes(x = factor(polygon), y = value)) +  
> + geom_boxplot() +
> + labs(x = "Polygon", y = "Raster Values") +
> + theme_minimal()
> > bp  
> Error in UseMethod("depth") : 
>   no applicable method for 'depth' applied to an object of class
> "NULL"
> In addition: Warning message:
> Removed 452451 rows containing non-finite outside the scale range
> (`stat_boxplot()`). 

Thank you for providing a runnable example! Could you please also show
the output of str(extracted_values) and str(data)?

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] boxplot of raster and shapefile

2024-08-26 Thread SIBYLLE STÖCKLI via R-help

Dear community

This example code works

library(raster)
library(sp)
library(rgdal)
library(ggplot2)

# Create some sample raster data
raster_file <- raster(ncol=36, nrow=18)
raster_file[] <- 1:ncell(raster_file)
plot(raster_file)

#Create some sample polygons
cds1 <- rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2 <- rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
shape_file <- SpatialPolygons(list(Polygons(list(Polygon(cds1)), 1), 
  Polygons(list(Polygon(cds2)), 2)))
plot(shape_file)


# Extract raster values within the shapefile
extracted_values <- extract(raster_file, shape_file)


# Assuming the shapefile has multiple polygons and you want to create a
boxplot for each
data_list <- lapply(1:length(extracted_values), function(i) {
  data.frame(value = extracted_values[[i]], polygon = i)
})
data <- do.call(rbind, data_list)

# Create the boxplot
bp<-ggplot(data, aes(x = factor(polygon), y = value)) +
  geom_boxplot() +
  labs(x = "Polygon", y = "Raster Values") +
  theme_minimal()
bp



For my own dataset I encountered problems in reading in the polygons. 
The error message comes in the boxplot function

Here may shape file
> # load shape file including all layers and print layers
> shape_file<-shapefile("C:/Users/.BiogeoRegion.shp")
Warning message:
[vect] Z coordinates ignored 
> names(shape_file)
 [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"
"Version""Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa"
[11] "ITRegionNa" "DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
> str(shape_file)
Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
  ..@ data   :'data.frame': 12 obs. of  14 variables:
  .. ..$ RegionNumm: int [1:12] 1 2 2 2 2 3 3 4 5 5 ...
  .. ..$ RegionName: chr [1:12] "R1" "R2" "R2" "R2" ...
  .. ..$ Unterregio: int [1:12] 11 21 22 23 24 31 32 41 51 52 ...
  .. ..$ Unterreg_1: chr [1:12] "U11" "U21" "U22" "U23" ...
  .. ..$ ObjNummer : chr [1:12] "1" "2" "3" "4" ...
  .. ..$ Version   : chr [1:12] "2020/05/08" "2020/05/08" "2020/05/08"
"2020/05/08" ...
  .. ..$ Shape_Leng: num [1:12] 725117 334364 539746 576810 41 ...
  .. ..$ Shape_Area: num [1:12] 4.17e+09 1.11e+09 1.07e+09 4.64e+09 4.47e+09
...
  .. ..$ DERegionNa: chr [1:12] "Jura" "Mittelland" "Mittelland"
"Mittelland" ...
  .. ..$ FRRegionNa: chr [1:12] "Jura" "Plateau" "Plateau" "Plateau" ...
  .. ..$ ITRegionNa: chr [1:12] "Giura" "Altipiano" "Altipiano" "Altipiano"
...
  .. ..$ DEBioBedeu: chr [1:12] "Jura und Randen" "Genferseegebiet"
"Hochrheingebiet" "Westliches Mittelland" ...
  .. ..$ FRBioBedeu: chr [1:12] "Jura et Randen" "Bassin lémanique" "Bassin
rhénan" "Plateau occidental" ...
  .. ..$ ITBioBedeu: chr [1:12] "Giura e Randen" "Regione del Lemano"
"Regione dellAlto Reno" "Altipiano occidentale" ...
  ..@ polygons   :List of 12


> # select a specific raster file from a list
> raster_file<-allrasters_pres[[1]]
> names(raster_file)
[1] "Andrena.barbilabris_glo_ensemble"
> 
> # Extract raster values within the shapefile
> extracted_values <- extract(raster_file, shape_file)
> 
> 
> 
> # Assuming the shapefile has multiple polygons and you want to create a
boxplot for each
> data_list <- lapply(1:length(extracted_values), function(i) {
+     data.frame(value = extracted_values[[i]], polygon = i)
+ })
> data <- do.call(rbind, data_list)

> names(data)
[1] "value"   "polygon"
> 
> # Create the boxplot
> bp<-ggplot(data, aes(x = factor(polygon), y = value)) +
+ geom_boxplot() +
+ labs(x = "Polygon", y = "Raster Values") +
+ theme_minimal()
> bp
Error in UseMethod("depth") : 
  no applicable method for 'depth' applied to an object of class "NULL"
In addition: Warning message:
Removed 452451 rows containing non-finite outside the scale range
(`stat_boxplot()`). 
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] checking for unstated dependencies in examples ... ERROR

2024-08-25 Thread Ivan Krylov via R-help

В Sun, 25 Aug 2024 20:32:07 +
Søren Højsgaard via R-help  пишет:

> checking for unstated dependencies in examples ... ERROR
>   Warning: parse error in file 'lines':
>   77:20)

Looks like one of your Rd \examples{} has a syntax error in it. If
there is a file gRain.Rcheck/gRain-Ex.R, try to find out which help
page does the line 77 correspond to.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] checking for unstated dependencies in examples ... ERROR

2024-08-25 Thread DAVID WINSEMIUS via R-help



> On 08/25/2024 1:32 PM PDT Søren Højsgaard via R-help  
> wrote:
> 
>  
> Dear all,
> When checking a package (ubuntu) I get
> 
> checking for unstated dependencies in examples ... ERROR
>   Warning: parse error in file 'lines':
>   77:20)
> 
> Can anyone help on this?

I doubt anyone can help with such scanty information about how this problem 
comes about. I don't think there's an "ubuntu" package, but rather that you are 
on a Linux box. Do you actually have a file named "lines"? Have you looked for 
a syntax error near line 77?

-- 
David.

 Thanks in advance.
> 
> Best
> Søren
> 
> 
> ---
> 
> > sessionInfo()
> R version 4.3.3 (2024-02-29)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 24.04 LTS
> 
> Matrix products: default
> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0 
> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
> 
> locale:
>  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C  
>  [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8   
>  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C 
>  [9] LC_ADDRESS=C   LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C   
> 
> time zone: Europe/Copenhagen
> tzcode source: system (glibc)
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base 
> 
> other attached packages:
> [1] gRain_1.4.1.9012 gRbase_2.0.2 Rcpp_1.0.13  shtools_1.0 
> [5] markdown_1.13knitr_1.48   rmarkdown_2.28   devtools_2.4.5  
> [9] usethis_3.0.0   
> 
> loaded via a namespace (and not attached):
>  [1] tidyselect_1.2.1  dplyr_1.1.4   fastmap_1.2.0
>  [4] xopen_1.0.1   promises_1.3.0digest_0.6.37
>  [7] mime_0.12 lifecycle_1.0.4   Deriv_4.1.3  
> [10] ellipsis_0.3.2processx_3.8.4magrittr_2.0.3   
> [13] compiler_4.3.3rlang_1.1.4   tools_4.3.3  
> [16] igraph_2.0.3  utf8_1.2.4prettyunits_1.2.0
> [19] htmlwidgets_1.6.4 pkgbuild_1.4.4curl_5.2.1   
> [22] xml2_1.3.6pkgload_1.4.0 miniUI_0.1.1.1   
> [25] withr_3.0.1   purrr_1.0.2   desc_1.4.3   
> [28] grid_4.3.3stats4_4.3.3  fansi_1.0.6  
> [31] roxygen2_7.3.2urlchecker_1.0.1  profvis_0.3.8
> [34] xtable_1.8-4  colorspace_2.1-1  ggplot2_3.5.1
> [37] scales_1.3.0  MASS_7.3-60.0.1   cli_3.6.3
> [40] generics_0.1.3remotes_2.5.0 rstudioapi_0.16.0
> [43] modelr_0.1.11 commonmark_1.9.1  sessioninfo_1.2.2
> [46] cachem_1.1.0  stringr_1.5.1 vctrs_0.6.5  
> [49] boot_1.3-30   Matrix_1.6-5  callr_3.7.6  
> [52] rcmdcheck_1.4.0   testthat_3.2.1.1  tidyr_1.3.1  
> [55] glue_1.7.0ps_1.7.7  cowplot_1.1.3
> [58] stringi_1.8.4 gtable_0.3.5  later_1.3.2  
> [61] munsell_0.5.1 tibble_3.2.1  pillar_1.9.0 
> [64] htmltools_0.5.8.1 brio_1.1.5doBy_4.6.22  
> [67] R6_2.5.1  microbenchmark_1.4.10 rprojroot_2.0.4  
> [70] evaluate_0.24.0   shiny_1.9.1       lattice_0.22-6   
> [73] backports_1.5.0   memoise_2.0.1 broom_1.0.6  
> [76] httpuv_1.6.15 xfun_0.47 fs_1.6.4 
> [79] pkgconfig_2.0.3  
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] checking for unstated dependencies in examples ... ERROR

2024-08-25 Thread Søren Højsgaard via R-help

Dear all,
When checking a package (ubuntu) I get

checking for unstated dependencies in examples ... ERROR
  Warning: parse error in file 'lines':
  77:20)

Can anyone help on this? Thanks in advance.

Best
Søren


---

> sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 24.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C  
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C 
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C   

time zone: Europe/Copenhagen
tzcode source: system (glibc)

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] gRain_1.4.1.9012 gRbase_2.0.2 Rcpp_1.0.13  shtools_1.0 
[5] markdown_1.13knitr_1.48   rmarkdown_2.28   devtools_2.4.5  
[9] usethis_3.0.0   

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.1  dplyr_1.1.4   fastmap_1.2.0
 [4] xopen_1.0.1   promises_1.3.0digest_0.6.37
 [7] mime_0.12 lifecycle_1.0.4   Deriv_4.1.3  
[10] ellipsis_0.3.2processx_3.8.4magrittr_2.0.3   
[13] compiler_4.3.3rlang_1.1.4   tools_4.3.3  
[16] igraph_2.0.3  utf8_1.2.4prettyunits_1.2.0
[19] htmlwidgets_1.6.4 pkgbuild_1.4.4curl_5.2.1   
[22] xml2_1.3.6pkgload_1.4.0 miniUI_0.1.1.1   
[25] withr_3.0.1   purrr_1.0.2   desc_1.4.3   
[28] grid_4.3.3stats4_4.3.3  fansi_1.0.6  
[31] roxygen2_7.3.2urlchecker_1.0.1  profvis_0.3.8
[34] xtable_1.8-4  colorspace_2.1-1  ggplot2_3.5.1
[37] scales_1.3.0  MASS_7.3-60.0.1   cli_3.6.3
[40] generics_0.1.3remotes_2.5.0 rstudioapi_0.16.0
[43] modelr_0.1.11 commonmark_1.9.1  sessioninfo_1.2.2
[46] cachem_1.1.0  stringr_1.5.1 vctrs_0.6.5  
[49] boot_1.3-30   Matrix_1.6-5  callr_3.7.6  
[52] rcmdcheck_1.4.0   testthat_3.2.1.1  tidyr_1.3.1  
[55] glue_1.7.0ps_1.7.7  cowplot_1.1.3
[58] stringi_1.8.4 gtable_0.3.5  later_1.3.2  
[61] munsell_0.5.1 tibble_3.2.1  pillar_1.9.0 
[64] htmltools_0.5.8.1 brio_1.1.5doBy_4.6.22  
[67] R6_2.5.1  microbenchmark_1.4.10 rprojroot_2.0.4  
[70] evaluate_0.24.0   shiny_1.9.1   lattice_0.22-6   
[73] backports_1.5.0   memoise_2.0.1 broom_1.0.6  
[76] httpuv_1.6.15 xfun_0.47 fs_1.6.4 
[79] pkgconfig_2.0.3  

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] paired raster boxplots

2024-08-25 Thread SIBYLLE STÖCKLI via R-help

I see, the selection of the layers in shape or raster file is tricky.

Here what I found out:
- It seems that my raster file r is fine, but it is definitively my shape file 
s that causes problems reading the right layer: read_sf should be ok, because 
it reads in all layers, names(sf). However, when selecting the third layer it 
reads in number 3 (Unterregio) and number 15 (geometry). Secondly when using 
stack() it does not read in the s layer (probably because there are two layers).

> 
> sf <- read_sf("C:/Users/Sibylle 
> Stöckli/Desktop/NCCS_Impacts_Lot2_2022/InVEST/BAFU_ALLEMA_strata/Grossraume/data/BiogeographischeRegionen/N2020_Revision_BiogeoRegion.shp")
> names(sf)
 [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"  "Version" 
   "Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa" "ITRegionNa" 
"DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
[15] "geometry"  
> 
> s<-sf[3]
> names(s)
[1] "Unterregio" "geometry"  
>

> r<-allrasters_pres[[1]]
> names(r)
[1] "Andrena.barbilabris_glo_ensemble"
> 
> 
> rs <- stack(r, s)
> names(rs) <- c('r', 's')
Error in `names<-`(`*tmp*`, value = c("r", "s")) : 
  incorrect number of layer names
> names(rs)
[1] "Andrena.barbilabris_glo_ensemble"
> 
>



-Original Message-
From: Ivan Krylov  
Sent: Saturday, August 24, 2024 6:04 PM
To: sibylle.stoec...@gmx.ch
Cc: 'SIBYLLE STÖCKLI via R-help' 
Subject: Re: [R] paired raster boxplots

В Sat, 24 Aug 2024 10:24:36 +0200
 пишет:

> Yes indeed my raster "s" (the shape file for the boxplot classes) has 
> several layers.

If 's' contains more than one layer, then this already prevents you from giving 
two names to stack(r, s).

> That's way I tried to select a layer by "
> s<-sf$Unterregio".

'sf' is a data.frame-like object returned by read_sf, not a raster. If 's' is a 
raster, it could still contain multiple layers.

> > sf <- read_sf("C:/Users/._BiogeoRegion.shp")
> > names(sf)
> 
> > names(sf)
>  [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"
> "Version""Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa"
> "ITRegionNa" "DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
>  [15] "geometry"
> 
> > s<-sf$Unterregio
> > r<-allrasters_pres[[1]]

Sorry, that's still not enough information because we don't know what
names(rs) is. Since 'allrasters_pres' is a list of rasters, 'r' could also 
contain more than one layer, resulting in stack(r, s) containing more than two 
layers. 

In order to avoid the error, you need to see names(rs) and either give the same 
number of names to the object instead of two, or additionally extract one layer 
(using raster(r, layer = NUMBER_OR_NAME)) from each of 'r' and 's' before 
stacking them.

--
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] A question on Statistics regarding regression

2024-08-24 Thread Jeff Newmiller via R-help

you say you asked elsewhere, but so many hits come up when I just search for 
"unbalanced sample size" your justification for not following the posting guide 
does not seem honest. 

I also recall that various discussions of statistical power address this in 
basic statistics.

On August 24, 2024 11:05:12 AM PDT, Christofer Bogaso 
 wrote:
>Hi,
>
>I have asked this question elsewhere however failed to get any
>response, so hoping to get some insight from experts and statisticians
>here.
>
>Let say we are fitting a regression equation where one explanatory
>variable is categorical with 2 categories. However in the sample, one
>category has 95% of values but other category has just 5%. Means, the
>categories are highly unbalanced.
>
>Typically SE of estimate may be inflated for such highly unbalanced
>categorical explanatory variable.
>
>Such unbalanced case may come from 2 scenarios 1) there is a flaw in
>sample or it is just by chance that second category has just 5% values
>in the sample or 2) in the population itself, the second category has
>very small number of occurrences which is reflected in the sample.
>
>My question how the SE would be impacted in above 2 cases? Will the
>impact be same i.e. we would get incorrect estimate of SE in both
>cases? If yes, is there any way to prove analytically or may be based
>on simulation?
>
>My apologies as this question is not directly R related. However I
>just wanted to get some insight on above problem related to Statistics
>from some of the great Statisticians in this forum.
>
>Thanks for your time.
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] paired raster boxplots

2024-08-24 Thread Ivan Krylov via R-help

В Sat, 24 Aug 2024 10:24:36 +0200
 пишет:

> Yes indeed my raster "s" (the shape file for the boxplot classes) has
> several layers.

If 's' contains more than one layer, then this already prevents you
from giving two names to stack(r, s).

> That's way I tried to select a layer by "
> s<-sf$Unterregio".

'sf' is a data.frame-like object returned by read_sf, not a raster. If
's' is a raster, it could still contain multiple layers.

> > sf <- read_sf("C:/Users/._BiogeoRegion.shp")
> > names(sf)  
> 
> > names(sf)  
>  [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"
> "Version""Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa"
> "ITRegionNa" "DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
>  [15] "geometry"
> 
> > s<-sf$Unterregio
> > r<-allrasters_pres[[1]]  

Sorry, that's still not enough information because we don't know what
names(rs) is. Since 'allrasters_pres' is a list of rasters, 'r' could
also contain more than one layer, resulting in stack(r, s) containing
more than two layers. 

In order to avoid the error, you need to see names(rs) and either give
the same number of names to the object instead of two, or additionally
extract one layer (using raster(r, layer = NUMBER_OR_NAME)) from each
of 'r' and 's' before stacking them.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] paired raster boxplots

2024-08-24 Thread SIBYLLE STÖCKLI via R-help

Dear Ivan
Dear community

Quite nice book recommendation.
Yes indeed my raster "s" (the shape file for the boxplot classes) has several 
layers. That's way I tried to select a layer by " s<-sf$Unterregio".

> sf <- read_sf("C:/Users/._BiogeoRegion.shp")
> names(sf)

> names(sf)
 [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"  "Version" 
   "Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa" "ITRegionNa" 
"DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
[15] "geometry"

> s<-sf$Unterregio
> r<-allrasters_pres[[1]]

Kind regards
Sibylle

-Original Message-
From: Ivan Krylov  
Sent: Friday, August 23, 2024 5:30 PM
To: sibylle.stoec...@gmx.ch
Cc: 'SIBYLLE STÖCKLI via R-help' 
Subject: Re: [R] paired raster boxplots

В Fri, 23 Aug 2024 10:15:55 +0200
 пишет:

> > s<-sf$Unterregio
> > r<-allrasters_pres[[1]]
> > 
> > 
> > rs <- stack(r, s)
> > names(rs) <- c('r', 's')
> Error in `names<-`(`*tmp*`, value = c("r", "s")) : 
>   incorrect number of layer names

It looks like at least one of the rasters 'r' and 's' has multiple layers. What 
does names(rs) return? I would offer more detailed advice, but I don't know 
'raster' that well.

The "R Inferno" book [1] offers a lot of generic-R troubleshooting advice, 
which should help you progress past errors like this one without waiting for 
someone on R-help to reply.

--
Best regards,
Ivan

[1] https://www.burns-stat.com/documents/books/the-r-inferno/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] paired raster boxplots

2024-08-23 Thread Ivan Krylov via R-help

В Fri, 23 Aug 2024 10:15:55 +0200
 пишет:

> > s<-sf$Unterregio
> > r<-allrasters_pres[[1]]
> > 
> > 
> > rs <- stack(r, s)
> > names(rs) <- c('r', 's')  
> Error in `names<-`(`*tmp*`, value = c("r", "s")) : 
>   incorrect number of layer names

It looks like at least one of the rasters 'r' and 's' has multiple
layers. What does names(rs) return? I would offer more detailed advice,
but I don't know 'raster' that well.

The "R Inferno" book [1] offers a lot of generic-R troubleshooting
advice, which should help you progress past errors like this one
without waiting for someone on R-help to reply.

-- 
Best regards,
Ivan

[1] https://www.burns-stat.com/documents/books/the-r-inferno/

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] paired raster boxplots

2024-08-23 Thread SIBYLLE STÖCKLI via R-help

Dear Ivan

Many thanks.

Using my own dataset, my "s" is a layer shape file.
Somewhere I need to define : snew<-s$Unterregio

I tried it to do it before stack(), but it seems to be the wrong way. Do you 
have any suggestion?

Code:

> #first import all files in a single folder as a list 
> rastlist_pres <- list.files(path ="C:/Users/._bee_presence", 
> pattern='.tif$', all.files= T, full.names= T)
> rastlist_RCP85P2 <- list.files(path ="C:/Users/_bee_RCP85P2", 
> pattern='.tif$', all.files= T, full.names= T)
> 
> 
> #import all raster files in folder using lapply 
> allrasters_pres <- lapply(rastlist_pres, raster)
> allrasters_RCP85P2 <- lapply(rastlist_RCP85P2, raster)

> sf <- read_sf("C:/Users/._BiogeoRegion.shp")
> names(sf)

> names(sf)
 [1] "RegionNumm" "RegionName" "Unterregio" "Unterreg_1" "ObjNummer"  "Version" 
   "Shape_Leng" "Shape_Area" "DERegionNa" "FRRegionNa" "ITRegionNa" 
"DEBioBedeu" "FRBioBedeu" "ITBioBedeu"
[15] "geometry"

> s<-sf$Unterregio
> r<-allrasters_pres[[1]]
> 
> 
> rs <- stack(r, s)
> names(rs) <- c('r', 's')
Error in `names<-`(`*tmp*`, value = c("r", "s")) : 
  incorrect number of layer names


-Original Message-
From: Ivan Krylov  
Sent: Thursday, August 22, 2024 4:50 PM
To: SIBYLLE STÖCKLI via R-help 
Cc: sibylle.stoec...@gmx.ch
Subject: Re: [R] paired raster boxplots

В Thu, 22 Aug 2024 08:46:03 +0200
SIBYLLE STÖCKLI via R-help  пишет:

> rr2s <- stack(r, r2,s)

> > names(rs) <- c('r', 's', 'r2')
> 
> Error in `names<-`(`*tmp*`, value = c("r", "s", "r2")) :
> 
>   incorrect number of layer names

The error must be happening because the variable named 'rs' in your workspace 
originates from some other code you have previously run. The code "works" if 
you replace names(rs) by names(rr2s), but the plot isn't very useful because 
d$s are real numbers in this example and they don't form groups for d$r.

Are you interested in a boxplot where the boxes are grouped by two, one for 'r' 
and one for 'r2'? I'm sure they are not impossible to produce manually using 
the boxplot function, but the 'lattice' or 'ggplot2'
packages would make them much easier. You will need to reshape your data into 
long format, with the "value" column for the r/r2 value, the "kind" column 
saying "r" or "r2", and the "s" column:
https://stackoverflow.com/q/20172560

--
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Linear regression and stand deviation at the Linux command line

2024-08-23 Thread Ivan Krylov via R-help

В Thu, 22 Aug 2024 13:07:37 -0600
Keith Christian  пишет:

> I'm interested in R construct(s) to be entered at the command
> line that would output slope, y-intercept, and r-squared values read
> from a csv or other filename entered at the command line, and the same
> for standard deviation calculations, namely the standard deviation,
> variance, and z-scores for every data point in the file.

If you'd like to script R at the command line, consider the
commandArgs() function (try entering ?commandArgs at the R prompt).
This way you can pass a file path to an R process without unsafely
interpolating it into the R expression itself. These arguments can be
given to R --args or to Rscript (without the --args).

Also consider the 'littler' scripting-oriented R front-end
<https://CRAN.R-project.org/package=littler>, which puts the command
line arguments into the 'argv' variable and has a very convenient -d
option which loads CSV data from the standard input into a variable
named 'X'.

> Are line numbers, commas, etc. needed or no?

Depends on how you read it. By default, the function read.table() will
expect your data to be separated by a mixture of tabs and spaces and
will recognise a header if the first line contains one less column than
the rest of the file. Enter ?read.table at the R prompt to see the
available options (which include read.csv).

Good introductions to R include, well, "An Introduction to R" [1] (also
available by typing RShowDoc('R-intro') into the R prompt) and "Visual
Statistics" by Dr. A. Shipunov [2].

Start with functions read.table(), lm(), scale(), sd(), summary(). Use
str() to look at the structure of a variable: summary(lm(...)) will
return a named list from which you can extract the values you are
interested in (see ?summary.lm). When in doubt, call
help(name_of_the_function).

-- 
Best regards,
Ivan

[1]
https://cran.r-project.org/doc/manuals/R-intro.html

[2]
http://web.archive.org/web/20230106210646/http://ashipunov.info/shipunov/school/biol_240/en/visual_statistics.pdf

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] fcaR not for latest R version

2024-08-22 Thread Ivan Krylov via R-help

Dear Peter,

В Thu, 22 Aug 2024 16:53:01 +0200
Peter van Summeren  пишет:

> Did anyone use fcaR for the current version of R/Rstudio on the Mac?

Maybe someone hasn't used fcaR on a Mac but can guess a solution to
your problem anyway. Maybe someone has, but has no idea how to help
you. On mailing lists, it is much more efficient to state the problem
you are having right in the initial message, without "asking to ask".
(Unless you would like to start an fcaR user group and are searching
for members, in which case this must be the right question.)

If you are having package installation problems on your Apple computer,
the right mailing list might be r-sig-...@r-project.org. Judging by
<https://cran.r-project.org/package=fcaR>, the 'fcaR' package should be
able to install on the latest version of R running on a Mac, but the
problem could also be in one of its dependencies, which I didn't look
at.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] paired raster boxplots

2024-08-22 Thread Ivan Krylov via R-help

В Thu, 22 Aug 2024 08:46:03 +0200
SIBYLLE STÖCKLI via R-help  пишет:

> rr2s <- stack(r, r2,s)

> > names(rs) <- c('r', 's', 'r2')  
> 
> Error in `names<-`(`*tmp*`, value = c("r", "s", "r2")) :
> 
>   incorrect number of layer names

The error must be happening because the variable named 'rs' in your
workspace originates from some other code you have previously run. The
code "works" if you replace names(rs) by names(rr2s), but the plot isn't
very useful because d$s are real numbers in this example and they don't
form groups for d$r.

Are you interested in a boxplot where the boxes are grouped by two, one
for 'r' and one for 'r2'? I'm sure they are not impossible to produce
manually using the boxplot function, but the 'lattice' or 'ggplot2'
packages would make them much easier. You will need to reshape your
data into long format, with the "value" column for the r/r2 value, the
"kind" column saying "r" or "r2", and the "s" column:
https://stackoverflow.com/q/20172560

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Force conversion of (POSIXct) time zone with base R

2024-08-22 Thread Ivan Krylov via R-help

В Thu, 22 Aug 2024 08:59:46 +
Iago Giné Vázquez  пишет:

> How should POSIXct time zone be changed without modifying the
> specified time (so fix the time zone). 

Since POSIXct represents the number of seconds since the specified
"epoch" moment (beginning of 1970 UTC), fixing the time zone while
retaining the local time requires modifying the stored time.

It might be easier to go through POSIXlt, which represents a local time:

(now <- as.POSIXlt(Sys.time()))
attr(now, 'tzone') <- 'UTC'
(now <- as.POSIXct(now))

(Why attr(now, 'tzone') and not now$zone? Historical reasons, I
suppose, but at least both are documented in ?POSIXlt.)

You might also go through the string representation (which effectively
obtains the local time from the epoch time), but that feels brittle,
especially if the POSIXct value has attr(., 'tzone') set:

# time zone information successfully lost and replaced by UTC
Sys.time() |> format() |> as.POSIXct(tz = 'UTC')

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] paired raster boxplots

2024-08-21 Thread SIBYLLE STÖCKLI via R-help

Dear community



I have two raster files (here r and r2: y-axis) and an equal x-axis (here s,
4 classes).

Instead of plotting two boxplots I would like to plot paired boxplots: for
each class the boxplots from r and r2 paired).

I tried to adapt the code, but I am struggling around with the error:



> names(rs) <- c('r', 's', 'r2')

Error in `names<-`(`*tmp*`, value = c("r", "s", "r2")) :

  incorrect number of layer names



Additionally I am not sure how to adapt the function boxplot() with two
rasters, r and r2 and not only r.



Kind regards

Sibylle



Working example:



library(raster)

r <- raster(nc=10, nr=5)

r[] <- runif(ncell(r), min=10, max=20) * 2



r2 <- raster(nc=10, nr=5)

r2[] <- runif(ncell(r2), min=10, max=20) * 2



s <- setValues(r, sample(c(1:4), replace = T, size=50))



rr2s <- stack(r, r2,s)

names(rs) <- c('r', 's', 'r2')



d <- as.data.frame(rr2s)

boxplot(r~s, data= d)




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] allequal diff

2024-08-19 Thread SIBYLLE STÖCKLI via R-help

Dear Ben and Bert

Thanks very much for the interesting discussion

Yes that why I was additionally using compareRaster(r1,r2) and then resample 
(r2, r1) to adapt the extend.

library(raster)
r1 <- raster("")
r2 <- raster("f")
compareRaster(r1, r2)

extent(r1)
extent(r2)

r2_resampled <- resample(r2, r1)
compareRaster(r1, r2_resampled)

CompareRaster:
Evaluate whether a two or more Raster* objects have the same extent, number of 
rows and columns, projection, resolution, and origin (or a subset of these 
comparisons).

Kind regards
Sibylle

-Original Message-
From: R-help  On Behalf Of Bert Gunter
Sent: Sunday, August 18, 2024 11:21 PM
To: Ben Bolker 
Cc: r-help@r-project.org
Subject: Re: [R] allequal diff

Ah...I see.

Perhaps, then, the maintainer should be contacted, as the desired functionality 
seems similar to that provided in other all.equal methods. I realize that this 
may often not elicit a (prompt) response.

-- Bert


On Sun, Aug 18, 2024 at 11:50 AM Ben Bolker  wrote:
>
> The OP's original problem is that the all.equal method for raster 
> objects (raster:::all.equal.raster), which is a wrapper around the
> compareRaster() function, compares a bunch of different properties of 
> rasters (extent, resolution, values, etc.) and only returns a single 
> overall logical (TRUE/FALSE) value. OP wanted to see the magnitude of 
> the difference (as you could get for more typical all.equal methods by 
> using tolerance=0), but in order to do that one has to dig into the 
> code of compareRaster() and pull out code to make the particular 
> comparisons one wants by applying all.equal to specific components of 
> the raster (it would be nice if there were a built-in way to get this 
> information, but I don't know of one)
>
> On 8/18/24 14:40, Bert Gunter wrote:
> > "Is it true that all.equal just compares y values?"
> >
> > The following may be a bit more than you may have wanted, but I hope 
> > it is nevertheless useful.
> >
> > The first place you should go to for questions like this is the Help 
> > system, not here, i.e.
> > ?all.equal
> >
> > When you do this, you will find that all.equal() is a so-called S3 
> > generic function, which, among other things, means that it works 
> > differently (i.e. "dispatches") depending on its (usually) first 
> > argument. So, for example, if the first argument is of (S3) class 
> > "numeric", it will call the default method, all.equal.default(); if 
> > it's a function, it will call all.equal.function().  Help for 
> > all.equal's basic methods is found in the single all.equal (base) 
> > Help page. However, for non-base R packages, there may be other 
> > different methods provided for classed objects, e.g. perhaps of class 
> > "raster"
> > that would be found by ?all.equal.raster . Or maybe not, if the 
> > class "inherits" from another class, such as "matrix" (Warning: I am 
> > completely unfamiliar with the raster package, so these specifics 
> > are very likely wrong).
> >
> > To sort this sort of thing out, It would probably be useful for you 
> > to find a tutorial on R's S3 class  system (which is really a form 
> > of multiple dispatch) and spend some time with it. There are many 
> > good ones out there. This S3 system is widely used within R and many 
> > packages, so doing this sort of homework now should serve you well 
> > in your future R journey.
> >
> > All IMO obviously.
> >
> > Cheers,
> > Bert
> >
> >
> > On Sun, Aug 18, 2024 at 11:00 AM SIBYLLE STÖCKLI via R-help 
> >  wrote:
> >> Dear Ivan
> >>
> >> Thanks a lot for this very nice example.
> >>
> >> Is it true that all.equal just compares y values?
> >> Based on this help here I think so and the value I got is the difference 
> >> for the y-values.
> >> https://www.statology.org/all-equal-function-r/
> >>
> >> However, here I see x and y testing?
> >> https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/
> >> all.equal I am actually interested in the x values (x-y 
> >> coordinates). Test if x-y coordinates of both 25-m-pixel rasters are the 
> >> same. Ther may be a small shift or differences in the number of decimal 
> >> places.
> >>
> >> Kind regards
> >> Sibylle
> >>
> >>
> >>
> >> -Original Message-
> >> From: Ivan Krylov 
> >> Sent: Friday, August 16, 2024 11:45 AM
> >> To: sibylle.stoec...@gmx.ch
> >> Cc: 'SIBYLLE STÖCKLI via

Re: [R] allequal diff

2024-08-18 Thread SIBYLLE STÖCKLI via R-help

Dear Ivan 

Thanks a lot for this very nice example.

Is it true that all.equal just compares y values?
Based on this help here I think so and the value I got is the difference for 
the y-values.
https://www.statology.org/all-equal-function-r/

However, here I see x and y testing?
https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/all.equal
I am actually interested in the x values (x-y coordinates). Test if x-y 
coordinates of both 25-m-pixel rasters are the same. Ther may be a small shift 
or differences in the number of decimal places.

Kind regards
Sibylle



-Original Message-
From: Ivan Krylov  
Sent: Friday, August 16, 2024 11:45 AM
To: sibylle.stoec...@gmx.ch
Cc: 'SIBYLLE STÖCKLI via R-help' 
Subject: Re: [R] allequal diff

В Fri, 16 Aug 2024 11:32:58 +0200
 пишет:

> # values and mask r1
> r1 <- getValues(r1)
> mask1 <- is.na(r1)
> # Do the same for r2
> r2 <- getValues(r2_resampled)
> mask2 <- is.na(r2)
> 
> # Combine the masks
> all.equal(r1[!(mask1 & mask2)], r2[!(mask1 & mask2)])

Let's consider a more tangible example:

# The vectors `x` and `y` start out equal x <- y <- 1:10 # But then their 
different elements are made missing x[c(1,3,4)] <- NA y[c(3,8)] <- NA

Now, `is.na(x) & is.na(y)` gives the third element as the only element missing 
in both x and y:

mask1 <- is.na(x)
mask2 <- is.na(y)
all.equal( # not the comparison you are looking for
 x[!(mask1 & mask2)], # still two more elements missing
 y[!(mask1 & mask2)]  # still one more element missing
)

If you want to ignore all missing elements, you should combine the masks using 
the element-wise "or" operation ("missing in x and/or y"), not the element-wise 
"and" operation ("missing in both x and y at the same time"):

mask1 & mask2 # drops element 3
# [1] FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
mask1 | mask2 # drops elements 1, 3, 4, 8 # [1]  TRUE FALSE  TRUE  TRUE FALSE 
FALSE FALSE  TRUE FALSE FALSE

--
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Terminating a cmd windows from R

2024-08-17 Thread SIMON Nicolas via R-help

Many thanks for your help.
I will consider the processx package.

Best regards
Nicolas


Le 17 août 2024 à 16:07, Ivan Krylov  a écrit :


В Sat, 17 Aug 2024 11: 47: 30 + SIMON Nicolas via R-help  пишет: > nmrun <- paste0("cmd. exe /k ",nm_log,". bat ",nmi,". 
ctl ",nmi,". lst") You are using the /k option that instructs cmd. exe to keep 
the command


В Sat, 17 Aug 2024 11:47:30 +
SIMON Nicolas via R-help  пишет:

> nmrun <- paste0("cmd.exe /k ",nm_log,".bat ",nmi,".ctl ",nmi,".lst")

You are using the /k option that instructs cmd.exe to keep the command
prompt open. Does the batch file contain an explicit "exit" to ensure
that cmd.exe terminates?

> system(nmrun, invisible = F, show.output.on.console = T, wait = T)

With wait = TRUE, it should be possible to interrupt the process by
pressing Ctrl+C in the cmd.exe window, but R itself will not be running
your commands until system() returns (or is interrupted, terminating
the process). You can specify a timeout for a foreground process using
the 'timeout' argument of the system() function.

If you'd like to manage a background process, consider the 'processx'
CRAN package: 
https://urldefense.com/v3/__https://cran.r-project.org/package=processx__;!!JQ5agg!YxmSgAHp4UgpqrHOlG1PHD6BvfvG1NC92n14SwHd7xlxN3PkMlbOrtc2Kw67JPntISAjxOic-ZZA0kEDop7N_A$

--
Très cordialement,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Terminating a cmd windows from R

2024-08-17 Thread Ivan Krylov via R-help

В Sat, 17 Aug 2024 11:47:30 +
SIMON Nicolas via R-help  пишет:

> nmrun <- paste0("cmd.exe /k ",nm_log,".bat ",nmi,".ctl ",nmi,".lst")

You are using the /k option that instructs cmd.exe to keep the command
prompt open. Does the batch file contain an explicit "exit" to ensure
that cmd.exe terminates?

> system(nmrun, invisible = F, show.output.on.console = T, wait = T)

With wait = TRUE, it should be possible to interrupt the process by
pressing Ctrl+C in the cmd.exe window, but R itself will not be running
your commands until system() returns (or is interrupted, terminating
the process). You can specify a timeout for a foreground process using
the 'timeout' argument of the system() function.

If you'd like to manage a background process, consider the 'processx'
CRAN package: https://cran.r-project.org/package=processx

-- 
Très cordialement,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Terminating a cmd windows from R

2024-08-17 Thread SIMON Nicolas via R-help

I execute a program using a batch file from R.
The program may have infinite computation. So I need to be avle to stop it.

nm_log <- "c:/nm74g64/run/nmfe74"
nmi<- "202"
nmrun <- paste0("cmd.exe /k ",nm_log,".bat ",nmi,".ctl ",nmi,".lst")
# run
system(nmrun, invisible = F, show.output.on.console = T, wait = T)

Le 17 août 2024 à 13:31, Duncan Murdoch  a écrit :

On 2024-08-17 6: 21 a. m. , SIMON Nicolas via R-help wrote: > I would like to 
stop a dos shell windows following the cmd (execute) command. Is there a way to 
do that from R? I think you need to give more detail on what you are trying to 
do. 

On 2024-08-17 6:21 a.m., SIMON Nicolas via R-help wrote:
> I would like to stop a dos shell windows following the cmd (execute) command. 
> Is there a way to do that from R?

I think you need to give more detail on what you are trying to do.  In
Windows, "cmd" is supposed to open a shell window.   Could you show us
an example of what you are doing?

Duncan Murdoch

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Terminating a cmd windows from R

2024-08-17 Thread SIMON Nicolas via R-help

I would like to stop a dos shell windows following the cmd (execute) command. 
Is there a way to do that from R?

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] boxplot notch

2024-08-16 Thread Chris Evans via R-help

That's not really a reprex Sibylle.  I did try to use it to see if I 
could work out what you were trying to do and help but there is so much 
in there that I suspect is distraction from the notch issue and its 
error message.

Please can you give us something stripped of all unecessary things and 
tell us what you want?

Something like data that we can read as a tribble() or from a dput() of 
your data and then only the packages you actually need for the plot (I 
think tidyverse alone might do) and then a ggplot() call stripped right 
down to what you need and a clear explanation of what you are trying to 
do in the geom_boxplot() call and how it uses the summarised data tibble.

It may even be that if you do that, you will find what's causing the 
problem!  (I speak from bitter experience!!)

Very best (all),

Chris

On 16/08/2024 17:51, SIBYLLE STÖCKLI via R-help wrote:
> Farm_ID   JahrBio QI_A
> 1 20151   9.5
> 2 20181   15.7
> 3 20201   21.5
> 1 20151   50.5
> 2 20181   12.9
> 3 20201   11.2
> 1 20151   30.6
> 2 20181   28.7
> 3 20201   29.8
> 1 20151   30.1
> 2 20181   NA
> 3 20201   16.9
> 1 20150   6.5
> 2 20180   7.9
> 3 20200   10.2
> 1 20150   11.2
> 2 20180   18.5
> 3 20200   29.5
> 1 20150   25.1
> 2 20180   16.1
> 3 20200   15.9
> 1 20150   10.1
> 2 20180   8.4
> 3 20200   3.5
> 1 20150   NA
> 2 20180   NA
> 3 20200   3.5
>
>
> Code
> setwd("C:/Users/Sibylle Stöckli/Desktop/")
> #.libPaths()
> getwd()
>
> #libraries laden
> library("ggplot2")
> library("gridExtra")
> library(scales)
> library(nlme)
> library(arm)
> library(blmeco)
> library(stats)
> library(dplyr)
> library(ggpubr)
> library(patchwork)
> library(plotrix)
> library(tidyverse)
> library(dplyr)
>
> #read data
> MS = read.delim("Test1.txt", na.strings="NA")
> names(MS)
>
> MS$Jahr<-as.numeric(MS$Jahr)
> MS$Bio<-as.factor(MS$Bio)
> str(MS)
>
> # boxplot BFF QI
>
> MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
> MS1$Jahr<-as.factor(MS1$Jahr)
>
> MS1s <- MS1 %>%
>group_by(MS1$Jahr, MS1$Bio) %>%
>summarise(
>  y0 = quantile(QI_A, 0.05),
>  y25 = quantile(QI_A, 0.25),
>  y50 = mean(QI_A),
>  y75 = quantile(QI_A, 0.75),
>  y100 = quantile(QI_A, 0.95))
>
> MS1s
> colnames(MS1s)[1]<-"Jahr"
> colnames(MS1s)[2]<-"Bio"
> MS1s
>
> p1<-ggplot(MS1s, aes(Jahr,  fill = as.factor(Bio))) +
>geom_boxplot(
>  aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
>  stat = "identity", notch=TRUE
>) +
>theme(panel.background = element_blank())+
>theme(axis.line = element_line(colour = "black"))+
>theme(axis.text=element_text(size=18))+
>theme(axis.title=element_text(size=20))+
>ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
>scale_color_manual(values=c("red","darkgreen"), labels=c("ÖLN", "BIO"))+
>scale_fill_manual(values=c("red","darkgreen"), labels= c("ÖLN", "BIO"))+
>theme(legend.title = element_blank())+
>theme(legend.text=element_text(size=20))
> p1<-p1 + expand_limits(y=c(0, 80))
> p1
-- 
Chris Evans (he/him)
Visiting Professor, UDLA, Quito, Ecuador & Honorary Professor, 
University of Roehampton, London, UK.
Work web site: https://www.psyctc.org/psyctc/
CORE site: http://www.coresystemtrust.org.uk/
Personal site: https://www.psyctc.org/pelerinage2016/
Emeetings (Thursdays): 
https://www.psyctc.org/psyctc/booking-meetings-with-me/
(Beware: French time, generally an hour ahead of UK)
<https://ombook.psyctc.org/book>
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] boxplot notch

2024-08-16 Thread SIBYLLE STÖCKLI via R-help

Thanks Ben,

Here the reproducible example.
It works without notch=TRUE, but provides an error with notch=TURE

Error in `geom_boxplot()`:
! Problem while converting geom to grob.
ℹ Error occurred in the 1st layer.
Caused by error in `ans[ypos] <- rep(yes, length.out = len)[ypos]`:
! replacement has length zero
Run `rlang::last_trace()` to see where the error occurred.
Warning message:
In rep(yes, length.out = len) : 'x' is NULL so the result will be NULL


Data
Farm_ID JahrBio QI_A
1   20151   9.5
2   20181   15.7
3   20201   21.5
1   20151   50.5
2   20181   12.9
3   20201   11.2
1   20151   30.6
2   20181   28.7
3   20201   29.8
1   20151   30.1
2   20181   NA
3   20201   16.9
1   20150   6.5
2   20180   7.9
3   20200   10.2
1   20150   11.2
2   20180   18.5
3   20200   29.5
1   20150   25.1
2   20180   16.1
3   20200   15.9
1   20150   10.1
2   20180   8.4
3   20200   3.5
1   20150   NA
2   20180   NA
3   20200   3.5


Code
setwd("C:/Users/Sibylle Stöckli/Desktop/")
#.libPaths()
getwd()  

#libraries laden
library("ggplot2")
library("gridExtra")  
library(scales)
library(nlme)
library(arm)
library(blmeco)
library(stats)  
library(dplyr)
library(ggpubr)
library(patchwork)
library(plotrix)
library(tidyverse)
library(dplyr)

#read data
MS = read.delim("Test1.txt", na.strings="NA")
names(MS)

MS$Jahr<-as.numeric(MS$Jahr)
MS$Bio<-as.factor(MS$Bio)
str(MS)

# boxplot BFF QI

MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
MS1$Jahr<-as.factor(MS1$Jahr)

MS1s <- MS1 %>%
  group_by(MS1$Jahr, MS1$Bio) %>%
  summarise(
y0 = quantile(QI_A, 0.05),
y25 = quantile(QI_A, 0.25),
y50 = mean(QI_A),
y75 = quantile(QI_A, 0.75),
y100 = quantile(QI_A, 0.95))

MS1s
colnames(MS1s)[1]<-"Jahr"
colnames(MS1s)[2]<-"Bio"
MS1s

p1<-ggplot(MS1s, aes(Jahr,  fill = as.factor(Bio))) +
  geom_boxplot(
aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
stat = "identity", notch=TRUE
  ) +
  theme(panel.background = element_blank())+
  theme(axis.line = element_line(colour = "black"))+
  theme(axis.text=element_text(size=18))+
  theme(axis.title=element_text(size=20))+
  ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
  scale_color_manual(values=c("red","darkgreen"), labels=c("ÖLN", "BIO"))+
  scale_fill_manual(values=c("red","darkgreen"), labels= c("ÖLN", "BIO"))+
  theme(legend.title = element_blank())+
  theme(legend.text=element_text(size=20))
p1<-p1 + expand_limits(y=c(0, 80))
p1

-Original Message-
From: R-help  On Behalf Of Ben Bolker
Sent: Friday, August 16, 2024 3:30 PM
To: r-help@r-project.org
Subject: Re: [R] boxplot notch

   I don't see anything obviously wrong here. There may be something subtle, 
but we probably won't be able to help without a reproducible example ...

On 2024-08-16 9:24 a.m., SIBYLLE STÖCKLI via R-help wrote:
> Dear community
> 
>   
> 
> I tried the following code using geom_boxplot() and notch=TRUE. Does 
> anyone know if the command  notch=TRUE  is at the wrong place in my 
> special code construct?
> 
>   
> 
> Without notch=TRUE the code provides the planned ggplot.
> 
>   
> 
> Kind regards
> 
> Sibylle
> 
>   
> 
> Code:
> 
>   
> 
> MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
> 
> MS1$Jahr<-as.factor(MS1$Jahr)
> 
>   
> 
> MS1s <- MS1 %>%
> 
>group_by(MS1$Jahr, MS1$Bio) %>%
> 
>summarise(
> 
>  y0 = quantile(QI_A, 0.05),
> 
>  y25 = quantile(QI_A, 0.25),
> 
>  y50 = mean(QI_A),
> 
>  y75 = quantile(QI_A, 0.75),
> 
>  y100 = quantile(QI_A, 0.95))
> 
>   
> 
> MS1s
> 
> colnames(MS1s)[1]<-"Jahr"
> 
> colnames(MS1s)[2]<-"Bio"
> 
> MS1s
> 
>   
> 
> p1<-ggplot(MS1s, aes(Jahr,  fill = as.factor(Bio))) +
> 
>geom_boxplot(
> 
>  aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = 
> y100),
> 
>  stat = "identity", notch=TRUE
> 
>) +
> 
>theme(panel.background = element_blank())+
> 
>theme(axis.line = element_line(colour = "black"))+
> 
>    theme(axis.text=element_text(size=18))+
> 
>theme(axis.title=element_text(size=20))+
> 
>    ylab("Anteil BFF an LN [%]") +xlab("J

[R] boxplot notch

2024-08-16 Thread SIBYLLE STÖCKLI via R-help

Dear community

 

I tried the following code using geom_boxplot() and notch=TRUE. Does anyone
know if the command �notch=TRUE� is at the wrong place in my special code
construct?

 

Without notch=TRUE the code provides the planned ggplot.

 

Kind regards

Sibylle

 

Code:

 

MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()

MS1$Jahr<-as.factor(MS1$Jahr)

 

MS1s <- MS1 %>%

  group_by(MS1$Jahr, MS1$Bio) %>%

  summarise(

y0 = quantile(QI_A, 0.05),

y25 = quantile(QI_A, 0.25),

y50 = mean(QI_A),

y75 = quantile(QI_A, 0.75),

y100 = quantile(QI_A, 0.95))

 

MS1s

colnames(MS1s)[1]<-"Jahr"

colnames(MS1s)[2]<-"Bio"

MS1s

 

p1<-ggplot(MS1s, aes(Jahr,  fill = as.factor(Bio))) +

  geom_boxplot(

aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),

stat = "identity", notch=TRUE

  ) +

  theme(panel.background = element_blank())+

  theme(axis.line = element_line(colour = "black"))+

  theme(axis.text=element_text(size=18))+

  theme(axis.title=element_text(size=20))+

  ylab("Anteil BFF an LN [%]") +xlab("Jahr")+

  scale_color_manual(values=c("red","darkgreen"), labels=c("�LN", "BIO"))+

  scale_fill_manual(values=c("red","darkgreen"), labels= c("�LN", "BIO"))+

  theme(legend.title = element_blank())+

  theme(legend.text=element_text(size=20))

p1<-p1 + expand_limits(y=c(0, 80))

p1


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] allequal diff

2024-08-16 Thread Ivan Krylov via R-help

В Fri, 16 Aug 2024 11:32:58 +0200
 пишет:

> # values and mask r1
> r1 <- getValues(r1)
> mask1 <- is.na(r1)
> # Do the same for r2
> r2 <- getValues(r2_resampled)
> mask2 <- is.na(r2)
> 
> # Combine the masks
> all.equal(r1[!(mask1 & mask2)], r2[!(mask1 & mask2)])

Let's consider a more tangible example:

# The vectors `x` and `y` start out equal
x <- y <- 1:10
# But then their different elements are made missing
x[c(1,3,4)] <- NA
y[c(3,8)] <- NA

Now, `is.na(x) & is.na(y)` gives the third element as the only element
missing in both x and y:

mask1 <- is.na(x)
mask2 <- is.na(y)
all.equal( # not the comparison you are looking for
 x[!(mask1 & mask2)], # still two more elements missing
 y[!(mask1 & mask2)]  # still one more element missing
)

If you want to ignore all missing elements, you should combine the
masks using the element-wise "or" operation ("missing in x and/or y"),
not the element-wise "and" operation ("missing in both x and y at the
same time"):

mask1 & mask2 # drops element 3
# [1] FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
mask1 | mask2 # drops elements 1, 3, 4, 8
# [1]  TRUE FALSE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE

-- 
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] allequal diff

2024-08-16 Thread SIBYLLE STÖCKLI via R-help

Cool thanks

# values and mask r1
r1 <- getValues(r1)
mask1 <- is.na(r1)
# Do the same for r2
r2 <- getValues(r2_resampled)
mask2 <- is.na(r2)

# Combine the masks
all.equal(r1[!(mask1 & mask2)], r2[!(mask1 & mask2)])


output
> all.equal(r1[!(mask1 & mask2)], r2[!(mask1 & mask2)])
[1] "'is.NA' value mismatch: 389 in current 56989152 in target"

--> so there is just a mismatch in NA not in the xy pixels, right?

Sibylle 



-Original Message-
From: Ivan Krylov  
Sent: Friday, August 16, 2024 10:51 AM
To: sibylle.stoec...@gmx.ch
Cc: 'SIBYLLE STÖCKLI via R-help' 
Subject: Re: [R] allequal diff

В Fri, 16 Aug 2024 10:35:35 +0200
 пишет:

> what do you mean by use is.na() in getValues(). So I need to call 
> getValues a second time?

Not necessarily, but it's one of the options. I was thinking along the lines of:

values1 <- getValues(r1)
mask1 <- is.na(values1)
# Do the same for r2
# Combine the masks
all.equal(values1[!combined_mask], values2[!combined_mask])

Unlike compareRaster(), this assumes that the coordinate grid of r1 and
r2 is already the same and that only some of the values may differ.

> I suppose you mean to first prepare a mask using is.na without 
> getValues and then in the second step your code?

'raster' documentation says that is.na() works on raster objects, so it should 
work.

Even if it didn't work, since you already access the underlying data using 
getValues() and then compare the resulting vectors using all.equal(), using 
is.na(getValues(...)) should definitely work.

--
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] allequal diff

2024-08-16 Thread SIBYLLE STÖCKLI via R-help

Here my idea including the error:

 

> m1=r1[r1="NA",]

> m2=r2_resampled[r2_resampled="NA",]

> 

> 

> all.equal(getValues(r1)[!m1], getValues(r2_resampled)[!m2])

[1] "Numeric: lengths (80706867, 65806339) differ"

 

-Original Message-
From: R-help  On Behalf Of SIBYLLE STÖCKLI via 
R-help
Sent: Friday, August 16, 2024 10:36 AM
To: 'Ivan Krylov' ; 'SIBYLLE STÖCKLI via R-help' 

Subject: Re: [R] allequal diff

 

Many thanks Ivan

 

 

Use is.na() on getValues() outputs, combine the two masks using the | operator 
to get a mask of values that are missing in either raster, then negate the mask 
to choose the non-missing values:

 

all.equal(getValues(r1)[!mask], getValues(r2)[!mask])

 

--> what do you mean by use is.na() in getValues(). So I need to call getValues 
a second time? I suppose you mean to first prepare a mask using is.na without 
getValues and then in the second step your code?

 

Kind regards

Sibylle

 

-Original Message-

From: Ivan Krylov < <mailto:ikry...@disroot.org> ikry...@disroot.org>

Sent: Friday, August 16, 2024 9:28 AM

To: SIBYLLE STÖCKLI via R-help < <mailto:r-help@r-project.org> 
r-help@r-project.org>

Cc:  <mailto:sibylle.stoec...@gmx.ch> sibylle.stoec...@gmx.ch

Subject: Re: [R] allequal diff

 

В Fri, 16 Aug 2024 07:19:38 +0200

SIBYLLE STÖCKLI via R-help < <mailto:r-help@r-project.org> 
r-help@r-project.org> пишет:

 

> Is it possible to consider na.rm=TRUE?

 

> > all.equal(getValues(r1), getValues(r2_resampled), tolerance = 0)

> 

> [1] "'is.NA' value mismatch: 9544032 in current 66532795 in target"

 

Use is.na() on getValues() outputs, combine the two masks using the | operator 
to get a mask of values that are missing in either raster, then negate the mask 
to choose the non-missing values:

 

all.equal(getValues(r1)[!mask], getValues(r2)[!mask])

 

--

Best regards,

Ivan

 

______

 <mailto:R-help@r-project.org> R-help@r-project.org mailing list -- To 
UNSUBSCRIBE and more, see  <https://stat.ethz.ch/mailman/listinfo/r-help> 
https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide  <http://www.R-project.org/posting-guide.html> 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] allequal diff

2024-08-16 Thread Ivan Krylov via R-help

В Fri, 16 Aug 2024 10:35:35 +0200
 пишет:

> what do you mean by use is.na() in getValues(). So I need to call
> getValues a second time?

Not necessarily, but it's one of the options. I was thinking along the
lines of:

values1 <- getValues(r1)
mask1 <- is.na(values1)
# Do the same for r2
# Combine the masks
all.equal(values1[!combined_mask], values2[!combined_mask])

Unlike compareRaster(), this assumes that the coordinate grid of r1 and
r2 is already the same and that only some of the values may differ.

> I suppose you mean to first prepare a mask using is.na without
> getValues and then in the second step your code?  

'raster' documentation says that is.na() works on raster objects, so it
should work.

Even if it didn't work, since you already access the underlying data
using getValues() and then compare the resulting vectors using
all.equal(), using is.na(getValues(...)) should definitely work.

-- 
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] allequal diff

2024-08-16 Thread SIBYLLE STÖCKLI via R-help

Many thanks Ivan


Use is.na() on getValues() outputs, combine the two masks using the | operator 
to get a mask of values that are missing in either raster, then negate the mask 
to choose the non-missing values:

all.equal(getValues(r1)[!mask], getValues(r2)[!mask])

--> what do you mean by use is.na() in getValues(). So I need to call getValues 
a second time? I suppose you mean to first prepare a mask using is.na without 
getValues and then in the second step your code?

Kind regards
Sibylle

-Original Message-
From: Ivan Krylov  
Sent: Friday, August 16, 2024 9:28 AM
To: SIBYLLE STÖCKLI via R-help 
Cc: sibylle.stoec...@gmx.ch
Subject: Re: [R] allequal diff

В Fri, 16 Aug 2024 07:19:38 +0200
SIBYLLE STÖCKLI via R-help  пишет:

> Is it possible to consider na.rm=TRUE?

> > all.equal(getValues(r1), getValues(r2_resampled), tolerance = 0)
> 
> [1] "'is.NA' value mismatch: 9544032 in current 66532795 in target"

Use is.na() on getValues() outputs, combine the two masks using the | operator 
to get a mask of values that are missing in either raster, then negate the mask 
to choose the non-missing values:

all.equal(getValues(r1)[!mask], getValues(r2)[!mask])

--
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] allequal diff

2024-08-16 Thread Ivan Krylov via R-help

В Fri, 16 Aug 2024 07:19:38 +0200
SIBYLLE STÖCKLI via R-help  пишет:

> Is it possible to consider na.rm=TRUE?

> > all.equal(getValues(r1), getValues(r2_resampled), tolerance = 0)  
> 
> [1] "'is.NA' value mismatch: 9544032 in current 66532795 in target"

Use is.na() on getValues() outputs, combine the two masks using the |
operator to get a mask of values that are missing in either raster,
then negate the mask to choose the non-missing values:

all.equal(getValues(r1)[!mask], getValues(r2)[!mask])

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] allequal diff

2024-08-15 Thread SIBYLLE STÖCKLI via R-help

Dear Ben

 

Many thanks.

I see that a second challenge are NA values. Is it possible to consider 
na.rm=TRUE?

 

> r2_resampled <- resample(r2, r1)

> compareRaster(r1, r2_resampled)

[1] TRUE

> 

> all.equal(getValues(r1), getValues(r2_resampled), tolerance = 0)

[1] "'is.NA' value mismatch: 9544032 in current 66532795 in target"

 

Kind regards

Sibylle 

 

 

-Original Message-
From: R-help  On Behalf Of Ben Bolker
Sent: Friday, August 16, 2024 1:06 AM
To: r-help@r-project.org
Subject: Re: [R] allequal diff

 

 

   Digging into the code for raster::compareRaster():

 

library(raster)

r <- raster(ncol=3, nrow=3)

values(r) <- 1:ncell(r)

r2 <- r

values(r2) <- c(1:8,10)

all.equal(getValues(r), getValues(r2), tolerance = 0) [1] "Mean relative 
difference: 0.111"

 

compareRaster has fancier machinery internally for doing the comparison for 
large rasters a block at a time if everything can't fit in memory at the same 
time ...

 

 

 

 

On 2024-08-15 9:14 a.m., SIBYLLE STÖCKLI via R-help wrote:

> Dear community

> 

> 

> 

> Similar to the example of the rdocumentation, my idea is to use 

> all.equal and to print the difference.

> 

>  <https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/all> 
> https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/all

> .equal

> 

> 

> 

>> d45 <- pi*(1/4 + 1:10)

> 

>> stopifnot(

> 

> + all.equal(tan(d45), rep(1, 10)))  # TRUE, but

> 

>> all  (tan(d45) == rep(1, 10)) # FALSE, since not exactly

> 

> [1] FALSE

> 

>> all.equal(tan(d45), rep(1, 10), tolerance = 0)  # to see difference

> 

> [1] "Mean relative difference: 1.29526e-15"

> 

>> 

> 

> 

> 

> Unfortunately, I just get "FALSE" not the difference.

> 

>> r2_resampled <- resample(r2, r1)

> 

>> compareRaster(r1, r2_resampled)

> 

> [1] TRUE

> 

>> # Compare rasters

> 

>> result <- all.equal(r1, r2_resampled)

> 

> Warning message:

> 

> In compareRaster(target, current, ..., values = values, stopiffalse = 

> stopiffalse,  :

> 

>not all objects have the same values

> 

>> print(result)

> 

> [1] FALSE

> 

>> 

> 

> 

> 

> Kind regards

> 

> Sibylle

> 

> 

> [[alternative HTML version deleted]]

> 

> __

>  <mailto:R-help@r-project.org> R-help@r-project.org mailing list -- To 
> UNSUBSCRIBE and more, see 

>  <https://stat.ethz.ch/mailman/listinfo/r-help> 
> https://stat.ethz.ch/mailman/listinfo/r-help

> PLEASE do read the posting guide 

>  <http://www.R-project.org/posting-guide.html> 
> http://www.R-project.org/posting-guide.html

> and provide commented, minimal, self-contained, reproducible code.

 

--

Dr. Benjamin Bolker

Professor, Mathematics & Statistics and Biology, McMaster University Director, 
School of Computational Science and Engineering  > E-mail is sent at my 
convenience; I don't expect replies outside of working hours.

 

__

 <mailto:R-help@r-project.org> R-help@r-project.org mailing list -- To 
UNSUBSCRIBE and more, see  <https://stat.ethz.ch/mailman/listinfo/r-help> 
https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide  <http://www.R-project.org/posting-guide.html> 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [EXTERNAL] Re: Very strange behavior of 'rep'

2024-08-15 Thread Izmirlian, Grant (NIH/NCI) [E] via R-help

Its where n.per.grp is first calculated. I rounded. Gosh do I feel stupid. 
Thanks to all who weighed in.
Best Regards,
Grant Izmirlian


From: Duncan Murdoch 
Sent: Thursday, August 15, 2024 2:59 PM
To: Izmirlian, Grant (NIH/NCI) [E] ; 
r-help@r-project.org 
Subject: [EXTERNAL] Re: [R] Very strange behavior of 'rep'

I also can't reproduce this.  I'd guess that one of your values of
n.per.grp or n.tt only prints as the values you showed, but is actually
a little smaller.  For example,

n.per.grp <- 108 - 1.e-14
n.per.grp
#> [1] 108
n.tt <- 5
n.per.grp*n.tt
#> [1] 540
length(rep(0:1, each=n.per.grp*n.tt))
#> [1] 1078

Duncan Murdoch


Created on 2024-08-15 with [reprex
v2.1.1](https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Freprex.tidyverse.org%2F&data=05%7C02%7Cizmirlig%40mail.nih.gov%7C39d45a0a10d54c446ce808dcbd5c7518%7C14b77578977342d58507251ca2dc2b06%7C0%7C0%7C638593452006022388%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=ZZDiaJt2vEAejBTdsfyxaFtwYs8WBWgeIXVzUvN4%2BHI%3D&reserved=0)https://reprex.tidyverse.org/>>

On 2024-08-15 2:39 p.m., Izmirlian, Grant (NIH/NCI) [E] via R-help wrote:
> \n<>\n\n \n<<
> This is very weird. I was running a swarm job on the cluster and it bombed
> only for n.per.grp=108, not for the other values. Even though
> n.per.grp*n.tt is 540, so that the length of the call to 'rep'
> should be 1080, I'm getting a vector of length 1078.
>  n.per.grp <- 108
>  n.tt <- 5
>  n.per.grp*n.tt
>  length(rep(0:1, each=n.per.grp*n.tt))
>  length(rep(0:1, each=108*5))
>>> \n<>\n\n\n\n
> --please do not edit the information below--
>
> R Version:
>   platform = x86_64-pc-linux-gnu
>   arch = x86_64
>   os = linux-gnu
>   system = x86_64, linux-gnu
>   status =
>   major = 4
>   minor = 4.1
>   year = 2024
>   month = 06
>   day = 14
>   svn rev = 86737
>   language = R
>   version.string = R version 4.4.1 (2024-06-14)
>   nickname = Race for Your Life
>
> Locale:
>   
> LC_CTYPE=C.UTF-8;LC_NUMERIC=C;LC_TIME=C.UTF-8;LC_COLLATE=C.UTF-8;LC_MONETARY=C.UTF-8;LC_MESSAGES=C.UTF-8;LC_PAPER=C.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=C.UTF-8;LC_IDENTIFICATION=C
>
> Search Path:
>   .GlobalEnv, package:lme4, package:Matrix, package:stats,
>   package:graphics, package:grDevices, package:utils, package:datasets,
>   package:showtext, package:showtextdb, package:sysfonts,
>   package:methods, Autoloads, package:base
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-help&data=05%7C02%7Cizmirlig%40mail.nih.gov%7C39d45a0a10d54c446ce808dcbd5c7518%7C14b77578977342d58507251ca2dc2b06%7C0%7C0%7C638593452006031043%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=LTp63hQ5fGXMg1K3QmkjLq1NRVQv22v7JH5AIn8AFWA%3D&reserved=0<https://stat.ethz.ch/mailman/listinfo/r-help>
> PLEASE do read the posting guide 
> https://gcc02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.r-project.org%2Fposting-guide.html&data=05%7C02%7Cizmirlig%40mail.nih.gov%7C39d45a0a10d54c446ce808dcbd5c7518%7C14b77578977342d58507251ca2dc2b06%7C0%7C0%7C638593452006035637%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=OHU65ku0vmIiRInCbI7ep0QDRO5Xytat61CtBkYYMCM%3D&reserved=0<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.

CAUTION: This email originated from outside of the organization. Do not click 
links or open attachments unless you recognize the sender and are confident the 
content is safe.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] allequal diff

2024-08-15 Thread SIBYLLE STÖCKLI via R-help

Dear community



Similar to the example of the rdocumentation, my idea is to use all.equal
and to print the difference.

https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/all.equal



> d45 <- pi*(1/4 + 1:10)

> stopifnot(

+ all.equal(tan(d45), rep(1, 10)))  # TRUE, but

> all  (tan(d45) == rep(1, 10)) # FALSE, since not exactly

[1] FALSE

> all.equal(tan(d45), rep(1, 10), tolerance = 0)  # to see difference

[1] "Mean relative difference: 1.29526e-15"

>



Unfortunately, I just get "FALSE" not the difference.

> r2_resampled <- resample(r2, r1)

> compareRaster(r1, r2_resampled)

[1] TRUE

> # Compare rasters

> result <- all.equal(r1, r2_resampled)

Warning message:

In compareRaster(target, current, ..., values = values, stopiffalse =
stopiffalse,  :

  not all objects have the same values

> print(result)

[1] FALSE

>



Kind regards

Sibylle


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] geom_boxplot nocht=TRUE

2024-08-15 Thread SIBYLLE STÖCKLI via R-help

Dear community

 

I tried to run my ggplot() +geom_boxplot() code using nocht=TRUE, but probably 
my term noch=TRUE is at the wrong position?

 

Error:

Error in `geom_boxplot()`:

! Problem while converting geom to grob.

ℹ Error occurred in the 1st layer.

Caused by error in `ans[ypos] <- rep(yes, length.out = len)[ypos]`:

! replacement has length zero

Run `rlang::last_trace()` to see where the error occurred.

Warning message:

In rep(yes, length.out = len) : 'x' is NULL so the result will be NULL

> rlang::last_trace()



Error in `geom_boxplot()`:

! Problem while converting geom to grob.

ℹ Error occurred in the 1st layer.

Caused by error in `ans[ypos] <- rep(yes, length.out = len)[ypos]`:

! replacement has length zero

---

Backtrace:

 ▆

  1. ├─base (local) ``(x)

  2. └─ggplot2:::print.ggplot(x)

  3.   ├─ggplot2::ggplot_gtable(data)

  4.   └─ggplot2:::ggplot_gtable.ggplot_built(data)

  5. └─ggplot2:::by_layer(...)

  6.   ├─rlang::try_fetch(...)

  7.   │ ├─base::tryCatch(...)

  8.   │ │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)

  9.   │ │   └─base (local) tryCatchOne(expr, names, parentenv, 
handlers[[1L]])

10.   │ │ └─base (local) doTryCatch(return(expr), name, parentenv, 
handler)

11.   │ └─base::withCallingHandlers(...)

12.   └─ggplot2 (local) f(l = layers[[i]], d = data[[i]])

13. └─l$draw_geom(d, layout)

14.   └─ggplot2 (local) draw_geom(..., self = self)

15. └─self$geom$draw_layer(...)

16.   └─ggplot2 (local) draw_layer(..., self = self)

17. └─base::lapply(...)

18.   └─ggplot2 (local) FUN(X[[i]], ...)

19. ├─rlang::inject(self$draw_panel(data, panel_params, 
coord, !!!params))

20. └─self$draw_panel(...)

21.   └─ggplot2 (local) draw_panel(..., self = self)

22. └─base::lapply(...)

23.   └─ggplot2 (local) FUN(X[[i]], ...)

24. └─self$draw_group(group, panel_params, coord, 
...)

25.   └─ggplot2 (local) draw_group(..., self = self)

26. ├─ggplot2:::data_frame0(...)

27. │ └─vctrs::data_frame(..., .name_repair = 
"minimal")

28. │   └─rlang::list2(...)

29. └─base::ifelse(notch, data$notchlower, NA)

Run rlang::last_trace(drop = FALSE) to see 5 hidden frames.

 

 

 

 

Code:

 

MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()

MS1$Jahr<-as.factor(MS1$Jahr)

 

MS1s <- MS1 %>%

  group_by(MS1$Jahr, MS1$Bio) %>%

  summarise(

y0 = quantile(QI_A, 0.05),

y25 = quantile(QI_A, 0.25),

y50 = mean(QI_A),

y75 = quantile(QI_A, 0.75),

y100 = quantile(QI_A, 0.95))

 

MS1s

colnames(MS1s)[1]<-"Jahr"

colnames(MS1s)[2]<-"Bio"

MS1s

 

p1<-ggplot(MS1s, aes(Jahr,  fill = as.factor(Bio))) +

  geom_boxplot(

aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),

stat = "identity", notch=TRUE

  ) +

  theme(panel.background = element_blank())+

  theme(axis.line = element_line(colour = "black"))+

  theme(axis.text=element_text(size=18))+

  theme(axis.title=element_text(size=20))+

  ylab("Anteil BFF an LN [%]") +xlab("Jahr")+

  scale_color_manual(values=c("red","darkgreen"), labels=c("ÖLN", "BIO"))+

  scale_fill_manual(values=c("red","darkgreen"), labels= c("ÖLN", "BIO"))+

  theme(legend.title = element_blank())+

  theme(legend.text=element_text(size=20))

p1<-p1 + expand_limits(y=c(0, 80))

p1

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [EXTERNAL] Re: Very strange behavior of 'rep'

2024-08-15 Thread Izmirlian, Grant (NIH/NCI) [E] via R-help

Ok � to be fair, it looks like I need to load everything and reproduce exactly 
as its occuring. I suspect its a weird memory hole in one of the loaded 
packages.

Here � this should do it.


"%,%" <- paste0

"factorial.design" <-
function(...)
{
  m <- match.call()
  cc <- m
  cc[[1]] <- as.name("c")
  arg.vals <- eval(cc, sys.parent())
  n.args <- length(arg.vals)

  n.l <- 0
  n.r <- n.args-1
  fwd.cpd <- c(1, cumprod(arg.vals))
  rev.cpd <- c(cumprod(arg.vals[n.args:2])[(n.args-1):1],1)
  "%,%" <- paste0
  main.call <- as.call(expression(cbind))
  for(k in 1:n.args)
  {
  arg.k <- eval(m[[1+k]], sys.parent())
  rep.call <- as.call(expression(rep))
  rep.call$x <- 1:arg.k
  if(n.l>0) rep.call$times <- fwd.cpd[k]
  if(n.r>0) rep.call$each <- rev.cpd[k]
  main.call[["x" %,% k]] <- rep.call
  n.l <- n.l + 1
  n.r <- n.r - 1
  }
  eval(main.call)
}

add.zeros <-
function(x,B)
{
do.one <- function(x,B)
{
  delta <- floor(logb(B, 10)) - floor(logb(x, 10))
  z <- ""
  if(delta > 0) z <- paste(rep("0", delta), collapse="")
  z%,%x
}
if(length(x)==1) ans <- do.one(x,B)
if(length(x)>1) ans <- sapply(x, FUN=do.one, B=B)
ans
}

library(lme4)

rho.ICC <- 0.5
s2.b <- 0.5*rho.ICC/(1-rho.ICC)
s2.e <- 0.5
delta <- 0.27

n.per.grp.lst <- (264/2*(1-450/9900*(5:0)))

## I will different values of 'h' and at these contrasts as well as flat
## I want to compare the AUC approach with just the test of the main effects 
constant arm coefficient

tt.seq <- c(0,3,6,12,18)
n.tt <- length(tt.seq)
cntr <- c(0.5,1,1,1,0.5)


## (delta1 + 2*delta2 + 2*delta3 + 2*delta4 + delta5)/2 = 0.27
## delta1 + 2*delta2 + 2*delta3 + 2*delta4 + delta5 = 2*0.27
## (delta1 + 2*delta2 + 2*delta3 + 2*delta4 + delta5)/8 = 2*0.27/8 = 0.27/4

H <- 0.08+0.005*(0:6)
shapes <- list(INC= (-floor(n.tt/2)):floor(n.tt/2),
   DEC= floor(n.tt/2):(-floor(n.tt/2)),
   FLT= rep(0,n.tt),
   PLT= c(-floor(n.tt/2):0,rep(floor(n.tt/2)+1, 
n.tt-floor(n.tt/2)-1)))
trnds <- names(shapes)

delta0.j <- rep(5, n.tt)

n.n.lst <- length(n.per.grp.lst)
n.trnds <- length(trnds)
n.H <- length(H)

## We're swarming over conditions. It works like this. This is the
## full list of conditions as a factorial design list. You have to
## determine the total number of conditions and then set up that many
## directories in the swarm folder
conds <- factorial.design(n.n.lst, n.trnds, n.H)

## currently with n.n.lst=6, n.trnds=4, n.H=7 that gives
## length(conds)=6x4x7=168

## then one copy of this file sits in each of the subdirectories.
## Parameters for this run are determined by getting the subdirectory
## number and then pulling the parameter indices from that element
## in the factorial design.
node <- 29

n.per.grp <- n.per.grp.lst[conds[node, 1]]
trnd <- trnds[conds[node, 2]]
h <- H[conds[node, 3]]

delta.j <- delta + h*shapes[[trnd]]

## covariates:
ID.i <- c(outer(rep(1:n.per.grp,each=n.tt),c(0,n.per.grp),FUN="+"))
arm.i <- rep(0:1, each=n.per.grp*n.tt)

length(arm.i)

____
From: Rui Barradas 
Sent: Thursday, August 15, 2024 2:51 PM
To: Izmirlian, Grant (NIH/NCI) [E] ; 
r-help@r-project.org 
Subject: [EXTERNAL] Re: [R] Very strange behavior of 'rep'

�s 19:39 de 15/08/2024, Izmirlian, Grant (NIH/NCI) [E] via R-help escreveu:
> \n<>\n\n \n<<
> This is very weird. I was running a swarm job on the cluster and it bombed
> only for n.per.grp=108, not for the other values. Even though
> n.per.grp*n.tt is 540, so that the length of the call to 'rep'
> should be 1080, I'm getting a vector of length 1078.
>  n.per.grp <- 108
>  n.tt <- 5
>  n.per.grp*n.tt
>  length(rep(0:1, each=n.per.grp*n.tt))
>  length(rep(0:1, each=108*5))
>>> \n<>\n\n\n\n
> --please do not edit the information below--
>
> R Version:
>   platform = x86_64-pc-linux-gnu
>   arch = x86_64
>   os = linux-gnu
>   system = x86_64, linux-gnu
>   status =
>   major = 4
>   minor = 4.1
>   year = 2024
>   month = 06
>   day = 14
>   svn rev = 86737
>   language = R
>   version.string = R version 4.4.1 (2024-06-14)
>   nickname = Race for Your Life
>
> Locale:
>   
> LC_CTYPE=C.UTF-8;LC_NUMERIC=C;LC_TIME=C.UTF-8;LC_COLLATE=C.UTF-8;LC_MONETARY=C.UTF-8;LC_MESSAGES=C.UTF-8;LC_PAPER=C.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=C.UTF-8;LC_IDENTIFICATION=C
>
> Search Path:
>   .GlobalEnv, package:lme4, package:Matrix, package:stats,
>   package:graphics, package:grDevices, package:utils, package:datasets,
>   package:showtex

[R] Very strange behavior of 'rep'

2024-08-15 Thread Izmirlian, Grant (NIH/NCI) [E] via R-help

\n<>\n\n \n<<
This is very weird. I was running a swarm job on the cluster and it bombed
only for n.per.grp=108, not for the other values. Even though
n.per.grp*n.tt is 540, so that the length of the call to 'rep'
should be 1080, I'm getting a vector of length 1078.
n.per.grp <- 108
n.tt <- 5
n.per.grp*n.tt
length(rep(0:1, each=n.per.grp*n.tt))
length(rep(0:1, each=108*5))
>> \n<>\n\n\n\n
--please do not edit the information below--

R Version:
 platform = x86_64-pc-linux-gnu
 arch = x86_64
 os = linux-gnu
 system = x86_64, linux-gnu
 status =
 major = 4
 minor = 4.1
 year = 2024
 month = 06
 day = 14
 svn rev = 86737
 language = R
 version.string = R version 4.4.1 (2024-06-14)
 nickname = Race for Your Life

Locale:
 
LC_CTYPE=C.UTF-8;LC_NUMERIC=C;LC_TIME=C.UTF-8;LC_COLLATE=C.UTF-8;LC_MONETARY=C.UTF-8;LC_MESSAGES=C.UTF-8;LC_PAPER=C.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=C.UTF-8;LC_IDENTIFICATION=C

Search Path:
 .GlobalEnv, package:lme4, package:Matrix, package:stats,
 package:graphics, package:grDevices, package:utils, package:datasets,
 package:showtext, package:showtextdb, package:sysfonts,
 package:methods, Autoloads, package:base

[[alternative HTML version deleted]]

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] geom_smooth with sd

2024-08-11 Thread SIBYLLE STÖCKLI via R-help

Thanks Erin

Quite relevant. Yes now +sd and -sd are the same values. However they are about 
+/- 5 and not the values received by the simple code here. I still think it is 
as the length of y differs.

Simple statistics

> mean(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)

[1] 26.81225

> sd(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)

[1] 21.12419

> mean(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)

[1] 15.86196

> sd(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)

[1] 15.00405

Kind regards

Sibylle 

From: Erin Hodgess  
Sent: Sunday, August 11, 2024 6:30 PM
To: sibylle.stoec...@gmx.ch
Cc: R-help@r-project.org
Subject: Re: [R] geom_smooth with sd

Hi!

This is probably completely off base, but your ymin and y max setup lines are 
different.  One uses sqrt(y), while the second uses sqrt(length(y)).

Could that play a part, please?

Thank you

Erin Hodgess, PhD

mailto: erinm.hodg...@gmail.com <mailto:erinm.hodg...@gmail.com> 

On Sun, Aug 11, 2024 at 10:10 AM SIBYLLE STÖCKLI via R-help 
mailto:r-help@r-project.org> > wrote:

Dear community

Using after_stat() I was able to visualise ggplot with standard deviations
instead of a confidence interval as seen in the R help.

p1<-ggplot(data = MS1, aes(x= Jahr, y= QI_A,color=Bio, linetype=Bio)) + 

geom_smooth(aes(fill=Bio,
ymax=after_stat(y+se*sqrt(length(y))), ymin=after_stat(y-se*sqrt(y))) ,
method = "lm" , formula = y ~ x + I(x^2),linewidth=1) +

theme(panel.background = element_blank())+

theme(axis.line = element_line(colour = "black"))+

  theme(axis.text=element_text(size=18))+

  theme(axis.title=element_text(size=20))+

ylab("Anteil BFF an LN [%]") +xlab("Jahr")+

  scale_color_manual(values=c("red","darkgreen"), labels=c("ÖLN", "BIO"))+

  scale_fill_manual(values=c("red","darkgreen"), labels= c("ÖLN", "BIO"))+

theme(legend.title = element_blank())+

  theme(legend.text=element_text(size=20))+

  scale_linetype_manual(values=c("dashed", "solid"), labels=c("ÖLN", "BIO"))

p1<-p1 + expand_limits(y=c(0, 30))

When comparing the plots to the simple statistics the standard deviation do
not match. I assume it is because of the na.rm=TRUE which does not match
length(y) in the  after_stat code. However I was not able to adapt the code
using NA values?

Simple statistics

> mean(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)

[1] 26.81225

> sd(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)

[1] 21.12419

> mean(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)

[1] 15.86196

> sd(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)

[1] 15.00405

Kind regards

Sibylle

[[alternative HTML version deleted]]

__
R-help@r-project.org <mailto:R-help@r-project.org>  mailing list -- To 
UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] geom_smooth with sd

2024-08-11 Thread SIBYLLE STÖCKLI via R-help

Dear community

 

Using after_stat() I was able to visualise ggplot with standard deviations
instead of a confidence interval as seen in the R help.

 

p1<-ggplot(data = MS1, aes(x= Jahr, y= QI_A,color=Bio, linetype=Bio)) + 

geom_smooth(aes(fill=Bio,
ymax=after_stat(y+se*sqrt(length(y))), ymin=after_stat(y-se*sqrt(y))) ,
method = "lm" , formula = y ~ x + I(x^2),linewidth=1) +

theme(panel.background = element_blank())+

theme(axis.line = element_line(colour = "black"))+

  theme(axis.text=element_text(size=18))+

  theme(axis.title=element_text(size=20))+

ylab("Anteil BFF an LN [%]") +xlab("Jahr")+

  scale_color_manual(values=c("red","darkgreen"), labels=c("�LN", "BIO"))+

  scale_fill_manual(values=c("red","darkgreen"), labels= c("�LN", "BIO"))+

theme(legend.title = element_blank())+

  theme(legend.text=element_text(size=20))+

  scale_linetype_manual(values=c("dashed", "solid"), labels=c("�LN", "BIO"))

p1<-p1 + expand_limits(y=c(0, 30))

 

When comparing the plots to the simple statistics the standard deviation do
not match. I assume it is because of the na.rm=TRUE which does not match
length(y) in the  after_stat code. However I was not able to adapt the code
using NA values?

 

Simple statistics

> mean(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)

[1] 26.81225

> sd(MS2020[MS2020$Bio=="1",]$QI_A, na.rm=TRUE)

[1] 21.12419

> mean(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)

[1] 15.86196

> sd(MS2020[MS2020$Bio=="0",]$QI_A, na.rm=TRUE)

[1] 15.00405

 

Kind regards

Sibylle


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Printing

2024-08-11 Thread Ivan Krylov via R-help

В Sun, 11 Aug 2024 22:36:08 +0800
Steven Yen  пишет:

> All I need is printing by returning out (unless I turn it off). And, 
> retrieve ap and vap as needed as shown above. Guess I need to read
> more about invisible.

Perhaps you could print(out) instead of returning it in the if
(printing) branch? Then you would be able to always
return(invisible(list("ei"=ap,"vi"=vap))), whether printing is TRUE or
not.

-- 
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] a fast way to do my job

2024-08-10 Thread Yuan Chun Ding via R-help

Hi Bert and Ben,

Thanks a lot for your suggestion!!.

About the different residuals between lm function and lm.fit,  from online 
search,  lt seems like that I need to add an intercept in the design matrix x;

pur2 <- matrix(gem751be.rpkm$purity2, ncol =1)
  pur2.1 <- cbind(1,gem751be.rpkm$purity2)

then running result2 <- residuals(lm.fit( x= pur2.1, y = dat));
now I am thinking whether an intercept is required or not.

Ding
From: R-help  On Behalf Of Yuan Chun Ding via 
R-help
Sent: Saturday, August 10, 2024 12:30 PM
To: Bert Gunter ; Ben Bolker 
Cc: r-help@r-project.org
Subject: Re: [R] a fast way to do my job

HI Bert and Ben, Yes, running lm. fit using the matrix format is much faster. I 
read a couple of online comments why it is faster. However, the residual values 
for three tested variables or genes from lm function and lm. fit function are 
different,


HI Bert and Ben,



Yes, running lm.fit using the matrix format is much faster. I read a couple of 
online comments why it is faster.



However, the residual values for three tested variables or genes from lm 
function and lm.fit function are different, with Pearson correlation of 0.55, 
0.89, and 0.99.



I have not found the reason.



Thanks,



Ding



From: Bert Gunter mailto:bgunter.4...@gmail.com>>

Sent: Friday, August 9, 2024 7:11 PM

To: Ben Bolker mailto:bbol...@gmail.com>>

Cc: Yuan Chun Ding mailto:ycd...@coh.org>>; 
r-help@r-project.org<mailto:r-help@r-project.org>

Subject: Re: [R] a fast way to do my job



Better idea, Ben! It would work as you might expect it to to produce the same 
results as the above: ##first make sure your regressor is a matrix: pur2 <- 
matrix(purity2, ncol =1) ## convert the data frame variables into a matrix dat 
<-





Better idea, Ben!







It would work as you might expect it to to produce the same results as



the above:







##first make sure your regressor is a matrix:



pur2 <- matrix(purity2, ncol =1)



## convert the data frame variables into a matrix



dat <- as.matrix(gem751be.rpkm[ , 74:35164])



##then



result <- residuals(lm.fit( x= pur2, y = dat))







Cheers,



Bert







On Fri, Aug 9, 2024 at 6:38 PM Ben Bolker 
mailto:bbol...@gmail.com<mailto:bbol...@gmail.com%3cmailto:bbol...@gmail.com>>>
 wrote:



>



> You can also fit a linear model with a matrix-valued response



> variable, which should be even faster (not sure off the top of my head



> how to get the residuals and reshape them to the dimensions you want)



>



> On Fri, Aug 9, 2024 at 9:31 PM Bert Gunter 
> mailto:bgunter.4...@gmail.com<mailto:bgunter.4...@gmail.com%3cmailto:bgunter.4...@gmail.com>>>
>  wrote:



> >



> > See ?lm.fit.



> > I must be missing something, because:



> >



> > results <- sapply(74:35164, \(i) residuals(lm.fit(purity2,



> > gem751be.rpkm[, i] )))



> >



> > would give you a 751 x 35091 matrix of the residuals from each of the



> > regressions.



> > I assume it will be considerably faster than all the overhead you are



> > carrying in your current code, but of course you'll have to try it and



> > see. ... Assuming that I have interpreted your request correctly.



> > Ignore if not.



> >



> > Cheers,



> > Bert



> >



> > On Fri, Aug 9, 2024 at 4:50 PM Yuan Chun Ding via R-help



> > mailto:r-help@r-project.org<mailto:r-help@r-project.org%3cmailto:r-help@r-project.org>>>
> >  wrote:



> > >



> > > Dear R users,



> > >



> > > I am running the following code below,  the gem751be.rpkm is a dataframe 
> > > with dim of 751 samples by 35164 variables,  73 phenotypic variables in 
> > > the furst to 73rd column and 35091 genomic variables or genes in the 74th 
> > > to 35164th columns.  What I need to do is to calculate the residuals for 
> > > each gene using the simple linear regression model of genelist[i] ~ 
> > > purity2;



> > >



> > > The following code is running,  it takes long time, but I have an 
> > > expensive ThinkStation window computer.



> > > Can you provide a fast way to do it?



> > >



> > > Thank you,



> > >



> > > Ding



> > >



> > > -



> > >



> > >



> > > gem751be.rpkm <-merge(gem751be10, as.data.frame(t(rna849.fpkm2)),



> > > +   by.x="id2",by.y=0)



> > > >   row.names(gem751be.rpkm)<-gem751be.rpkm$id3



> > > >   
> > > > colnames(gem751be.rpkm)<-gsub(colnames(gem751be.rpkm),pattern="-",replacemen

Re: [R] a fast way to do my job

2024-08-10 Thread Yuan Chun Ding via R-help

You are right.  I also just thought about that, no intercept is not applicable 
to my case.

Ding

From: Bert Gunter 
Sent: Saturday, August 10, 2024 1:06 PM
To: Yuan Chun Ding 
Cc: Ben Bolker ; r-help@r-project.org
Subject: Re: [R] a fast way to do my job

Ah, messages crossed. A no-intercept model **assumes** the straight line fit 
must pass through the origin. Unless there is a strong justification for such 
an assumption, you should include an intercept. -- Bert On Sat, Aug 10, 2024 at 
1: 02 PM


Ah, messages crossed.

A no-intercept model **assumes** the straight line fit must pass

through the origin. Unless there is a strong justification for such an

assumption, you should include an intercept.



-- Bert



On Sat, Aug 10, 2024 at 1:02 PM Bert Gunter 
mailto:bgunter.4...@gmail.com>> wrote:

>

> Is it because I failed to to add a column of ones for an intercept to

> the x matrix? TRhat would be my bad.

>

> -- Bert

>

>

> On Sat, Aug 10, 2024 at 12:59 PM Bert Gunter 
> mailto:bgunter.4...@gmail.com>> wrote:

> >

> > Probably because you inadvertently ran different models. Without your code, 
> > I haven't a clue.

> >

> >

> > On Sat, Aug 10, 2024, 12:29 Yuan Chun Ding 
> > mailto:ycd...@coh.org>> wrote:

> >>

> >> HI Bert and Ben,

> >>

> >>

> >>

> >> Yes, running lm.fit using the matrix format is much faster. I read a 
> >> couple of online comments why it is faster.

> >>

> >>

> >>

> >> However, the residual values for three tested variables or genes from lm 
> >> function and lm.fit function are different, with Pearson correlation of 
> >> 0.55, 0.89, and 0.99.

> >>

> >>

> >>

> >> I have not found the reason.

> >>

> >>

> >>

> >> Thanks,

> >>

> >>

> >> Ding

> >>

> >>

> >>

> >> From: Bert Gunter mailto:bgunter.4...@gmail.com>>

> >> Sent: Friday, August 9, 2024 7:11 PM

> >> To: Ben Bolker mailto:bbol...@gmail.com>>

> >> Cc: Yuan Chun Ding mailto:ycd...@coh.org>>; 
> >> r-help@r-project.org<mailto:r-help@r-project.org>

> >> Subject: Re: [R] a fast way to do my job

> >>

> >>

> >>

> >> Better idea, Ben! It would work as you might expect it to to produce the 
> >> same results as the above: ##first make sure your regressor is a matrix: 
> >> pur2 <- matrix(purity2, ncol =1) ## convert the data frame variables into 
> >> a matrix dat <-

> >>

> >> Better idea, Ben!

> >>

> >>

> >>

> >> It would work as you might expect it to to produce the same results as

> >>

> >> the above:

> >>

> >>

> >>

> >> ##first make sure your regressor is a matrix:

> >>

> >> pur2 <- matrix(purity2, ncol =1)

> >>

> >> ## convert the data frame variables into a matrix

> >>

> >> dat <- as.matrix(gem751be.rpkm[ , 74:35164])

> >>

> >> ##then

> >>

> >> result <- residuals(lm.fit( x= pur2, y = dat))

> >>

> >>

> >>

> >> Cheers,

> >>

> >> Bert

> >>

> >>

> >>

> >> On Fri, Aug 9, 2024 at 6:38 PM Ben Bolker 
> >> mailto:bbol...@gmail.com>> wrote:

> >>

> >> >

> >>

> >> > You can also fit a linear model with a matrix-valued response

> >>

> >> > variable, which should be even faster (not sure off the top of my head

> >>

> >> > how to get the residuals and reshape them to the dimensions you want)

> >>

> >> >

> >>

> >> > On Fri, Aug 9, 2024 at 9:31 PM Bert Gunter 
> >> > mailto:bgunter.4...@gmail.com>> wrote:

> >>

> >> > >

> >>

> >> > > See ?lm.fit.

> >>

> >> > > I must be missing something, because:

> >>

> >> > >

> >>

> >> > > results <- sapply(74:35164, \(i) residuals(lm.fit(purity2,

> >>

> >> > > gem751be.rpkm[, i] )))

> >>

> >> > >

> >>

> >> > > would give you a 751 x 35091 matrix of the residuals from each of the

> >>

> >> > > regressions.

> >>

> >> > > I assume it will be considerably faster than all the overhead you are

> >>

> >> >

Re: [R] a fast way to do my job

2024-08-10 Thread Yuan Chun Ding via R-help

after add intercept, all residuals are the same from lm or lm.fit.

Ding

From: Bert Gunter 
Sent: Saturday, August 10, 2024 1:00 PM
To: Yuan Chun Ding 
Cc: Ben Bolker ; r-help@r-project.org
Subject: Re: [R] a fast way to do my job

Probably because you inadvertently ran different models. Without your code, I 
haven't a clue. On Sat, Aug 10, 2024, 12: 29 Yuan Chun Ding  
wrote: HI Bert and Ben, Yes, running lm. fit using the matrix format is much 
faster. 


Probably because you inadvertently ran different models. Without your code, I 
haven't a clue.

On Sat, Aug 10, 2024, 12:29 Yuan Chun Ding 
mailto:ycd...@coh.org>> wrote:
HI Bert and Ben,

Yes, running lm.fit using the matrix format is much faster. I read a couple of 
online comments why it is faster.

However, the residual values for three tested variables or genes from lm 
function and lm.fit function are different, with Pearson correlation of 0.55, 
0.89, and 0.99.

I have not found the reason.

Thanks,

Ding

From: Bert Gunter mailto:bgunter.4...@gmail.com>>
Sent: Friday, August 9, 2024 7:11 PM
To: Ben Bolker mailto:bbol...@gmail.com>>
Cc: Yuan Chun Ding mailto:ycd...@coh.org>>; 
r-help@r-project.org<mailto:r-help@r-project.org>
Subject: Re: [R] a fast way to do my job

Better idea, Ben! It would work as you might expect it to to produce the same 
results as the above: ##first make sure your regressor is a matrix: pur2 <- 
matrix(purity2, ncol =1) ## convert the data frame variables into a matrix dat 
<-

Better idea, Ben!



It would work as you might expect it to to produce the same results as

the above:



##first make sure your regressor is a matrix:

pur2 <- matrix(purity2, ncol =1)

## convert the data frame variables into a matrix

dat <- as.matrix(gem751be.rpkm[ , 74:35164])

##then

result <- residuals(lm.fit( x= pur2, y = dat))



Cheers,

Bert



On Fri, Aug 9, 2024 at 6:38 PM Ben Bolker 
mailto:bbol...@gmail.com>> wrote:

>

> You can also fit a linear model with a matrix-valued response

> variable, which should be even faster (not sure off the top of my head

> how to get the residuals and reshape them to the dimensions you want)

>

> On Fri, Aug 9, 2024 at 9:31 PM Bert Gunter 
> mailto:bgunter.4...@gmail.com>> wrote:

> >

> > See ?lm.fit.

> > I must be missing something, because:

> >

> > results <- sapply(74:35164, \(i) residuals(lm.fit(purity2,

> > gem751be.rpkm[, i] )))

> >

> > would give you a 751 x 35091 matrix of the residuals from each of the

> > regressions.

> > I assume it will be considerably faster than all the overhead you are

> > carrying in your current code, but of course you'll have to try it and

> > see. ... Assuming that I have interpreted your request correctly.

> > Ignore if not.

> >

> > Cheers,

> > Bert

> >

> > On Fri, Aug 9, 2024 at 4:50 PM Yuan Chun Ding via R-help

> > mailto:r-help@r-project.org>> wrote:

> > >

> > > Dear R users,

> > >

> > > I am running the following code below,  the gem751be.rpkm is a dataframe 
> > > with dim of 751 samples by 35164 variables,  73 phenotypic variables in 
> > > the furst to 73rd column and 35091 genomic variables or genes in the 74th 
> > > to 35164th columns.  What I need to do is to calculate the residuals for 
> > > each gene using the simple linear regression model of genelist[i] ~ 
> > > purity2;

> > >

> > > The following code is running,  it takes long time, but I have an 
> > > expensive ThinkStation window computer.

> > > Can you provide a fast way to do it?

> > >

> > > Thank you,

> > >

> > > Ding

> > >

> > > -

> > >

> > >

> > > gem751be.rpkm <-merge(gem751be10, as.data.frame(t(rna849.fpkm2)),

> > > +   by.x="id2",by.y=0)

> > > >   row.names(gem751be.rpkm)<-gem751be.rpkm$id3

> > > >   
> > > > colnames(gem751be.rpkm)<-gsub(colnames(gem751be.rpkm),pattern="-",replacement="_")

> > > >   genelist <- gem751be.rpkm %>% dplyr::select(74:35164)

> > > >   residuals <- NULL

> > > >   for (i in 1:length(genelist)) {

> > > + #i=1

> > > + formula <- reformulate("purity2", response=names(genelist)[i])

> > > + model <- lm(formula, data = gem751be.rpkm)

> > > + resi <- as.data.frame(residuals(model))

> > > + colnames(resi)[1]<-names(genelist)[i]

> > > + resi <-as.data.frame(t(resi))

> > >

Re: [R] a fast way to do my job

2024-08-10 Thread Yuan Chun Ding via R-help

HI Bert and Ben,

Yes, running lm.fit using the matrix format is much faster. I read a couple of 
online comments why it is faster.

However, the residual values for three tested variables or genes from lm 
function and lm.fit function are different, with Pearson correlation of 0.55, 
0.89, and 0.99.

I have not found the reason.

Thanks,

Ding

From: Bert Gunter 
Sent: Friday, August 9, 2024 7:11 PM
To: Ben Bolker 
Cc: Yuan Chun Ding ; r-help@r-project.org
Subject: Re: [R] a fast way to do my job

Better idea, Ben! It would work as you might expect it to to produce the same 
results as the above: ##first make sure your regressor is a matrix: pur2 <- 
matrix(purity2, ncol =1) ## convert the data frame variables into a matrix dat 
<-


Better idea, Ben!



It would work as you might expect it to to produce the same results as

the above:



##first make sure your regressor is a matrix:

pur2 <- matrix(purity2, ncol =1)

## convert the data frame variables into a matrix

dat <- as.matrix(gem751be.rpkm[ , 74:35164])

##then

result <- residuals(lm.fit( x= pur2, y = dat))



Cheers,

Bert



On Fri, Aug 9, 2024 at 6:38 PM Ben Bolker 
mailto:bbol...@gmail.com>> wrote:

>

> You can also fit a linear model with a matrix-valued response

> variable, which should be even faster (not sure off the top of my head

> how to get the residuals and reshape them to the dimensions you want)

>

> On Fri, Aug 9, 2024 at 9:31 PM Bert Gunter 
> mailto:bgunter.4...@gmail.com>> wrote:

> >

> > See ?lm.fit.

> > I must be missing something, because:

> >

> > results <- sapply(74:35164, \(i) residuals(lm.fit(purity2,

> > gem751be.rpkm[, i] )))

> >

> > would give you a 751 x 35091 matrix of the residuals from each of the

> > regressions.

> > I assume it will be considerably faster than all the overhead you are

> > carrying in your current code, but of course you'll have to try it and

> > see. ... Assuming that I have interpreted your request correctly.

> > Ignore if not.

> >

> > Cheers,

> > Bert

> >

> > On Fri, Aug 9, 2024 at 4:50 PM Yuan Chun Ding via R-help

> > mailto:r-help@r-project.org>> wrote:

> > >

> > > Dear R users,

> > >

> > > I am running the following code below,  the gem751be.rpkm is a dataframe 
> > > with dim of 751 samples by 35164 variables,  73 phenotypic variables in 
> > > the furst to 73rd column and 35091 genomic variables or genes in the 74th 
> > > to 35164th columns.  What I need to do is to calculate the residuals for 
> > > each gene using the simple linear regression model of genelist[i] ~ 
> > > purity2;

> > >

> > > The following code is running,  it takes long time, but I have an 
> > > expensive ThinkStation window computer.

> > > Can you provide a fast way to do it?

> > >

> > > Thank you,

> > >

> > > Ding

> > >

> > > -

> > >

> > >

> > > gem751be.rpkm <-merge(gem751be10, as.data.frame(t(rna849.fpkm2)),

> > > +   by.x="id2",by.y=0)

> > > >   row.names(gem751be.rpkm)<-gem751be.rpkm$id3

> > > >   
> > > > colnames(gem751be.rpkm)<-gsub(colnames(gem751be.rpkm),pattern="-",replacement="_")

> > > >   genelist <- gem751be.rpkm %>% dplyr::select(74:35164)

> > > >   residuals <- NULL

> > > >   for (i in 1:length(genelist)) {

> > > + #i=1

> > > + formula <- reformulate("purity2", response=names(genelist)[i])

> > > + model <- lm(formula, data = gem751be.rpkm)

> > > + resi <- as.data.frame(residuals(model))

> > > + colnames(resi)[1]<-names(genelist)[i]

> > > + resi <-as.data.frame(t(resi))

> > > + residuals <- rbind(residuals, resi)

> > > +   }

> > >

> > >

> > >

> > > --

> > > 

> > > -SECURITY/CONFIDENTIALITY WARNING-

> > >

> > > This message and any attachments are intended solely for the individual 
> > > or entity to which they are addressed. This communication may contain 
> > > information that is privileged, confidential, or exempt from disclosure 
> > > under applicable law (e.g., personal health information, research data, 
> > > financial information). Because this e-mail has been sent withou

Re: [R] If loop

2024-08-10 Thread martin gregory via R-help

> On 9. Aug 2024, at 10:45, CALUM POLWART  wrote:
> 
> Or use <<- assignment I think.  (I usually return, but return can only
> return one object and I think you want two or more
> 

One can return multiple objects by putting them in a list and returning the 
list.

Martin

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] a fast way to do my job

2024-08-09 Thread Yuan Chun Ding via R-help

Dear R users,

I am running the following code below,  the gem751be.rpkm is a dataframe with 
dim of 751 samples by 35164 variables,  73 phenotypic variables in the furst to 
73rd column and 35091 genomic variables or genes in the 74th to 35164th 
columns.  What I need to do is to calculate the residuals for each gene using 
the simple linear regression model of genelist[i] ~ purity2;

The following code is running,  it takes long time, but I have an expensive 
ThinkStation window computer.
Can you provide a fast way to do it?

Thank you,

Ding

-


gem751be.rpkm <-merge(gem751be10, as.data.frame(t(rna849.fpkm2)),
+   by.x="id2",by.y=0)
>   row.names(gem751be.rpkm)<-gem751be.rpkm$id3
>   
> colnames(gem751be.rpkm)<-gsub(colnames(gem751be.rpkm),pattern="-",replacement="_")
>   genelist <- gem751be.rpkm %>% dplyr::select(74:35164)
>   residuals <- NULL
>   for (i in 1:length(genelist)) {
+ #i=1
+ formula <- reformulate("purity2", response=names(genelist)[i])
+ model <- lm(formula, data = gem751be.rpkm)
+ resi <- as.data.frame(residuals(model))
+ colnames(resi)[1]<-names(genelist)[i]
+ resi <-as.data.frame(t(resi))
+ residuals <- rbind(residuals, resi)
+   }



--

-SECURITY/CONFIDENTIALITY WARNING-  

This message and any attachments are intended solely for the individual or 
entity to which they are addressed. This communication may contain information 
that is privileged, confidential, or exempt from disclosure under applicable 
law (e.g., personal health information, research data, financial information). 
Because this e-mail has been sent without encryption, individuals other than 
the intended recipient may be able to view the information, forward it to 
others or tamper with the information without the knowledge or consent of the 
sender. If you are not the intended recipient, or the employee or person 
responsible for delivering the message to the intended recipient, any 
dissemination, distribution or copying of the communication is strictly 
prohibited. If you received the communication in error, please notify the 
sender immediately by replying to this message and deleting the message and any 
accompanying files from your system. If, due to the security risks, you do not 
wish to rec
 eive further communications via e-mail, please reply to this message and 
inform the sender that you do not wish to receive further e-mail from the 
sender. (LCP301)


    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] WDI package commands timing out and not working

2024-08-09 Thread Ivan Krylov via R-help

В Fri, 9 Aug 2024 20:25:51 +0530
Anupam Tyagi  пишет:

> I am trying this in Bengaluru, India, using R-studio. I tried
> downloading a single variable. It happened fast, in less than 5
> seconds. I tried downloading six variables, it took much longer, but
> less than a minute. Tried eight variables and it did not download
> even in five minutes.

> I checked in the browser using the URL provided in the warning
> messages:  The World Bank web API gave me the indicator in less than
> a second.

I think that the fact that this initially works from R but then only
works in the browser is clear evidence of Cloudflare "protecting" the
World Bank API from you trying to use it in an automated manner. That
this mostly defeats the purpose of an API was probably lost on people
who set it up.

In theory, this should be reported to the people who operate the API.
There seems to be a support form at
<https://datahelpdesk.worldbank.org/knowledgebase/topics/125589-developer-information>.
I don't have a lot of confidence in them fixing anything as a result
of such a request.

Try to cache all access to the API. Consider getting a proxy in a
country less discriminated against by Cloudflare. Maybe petition the
WDI maintainer to export a function that parses a JSON file manually
downloaded with a browser?

-- 
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] WDI package commands timing out and not working

2024-08-09 Thread Ivan Krylov via R-help

В Thu, 8 Aug 2024 12:43:23 +0530
Anupam Tyagi  пишет:

> In open.connection(con, "rb") :
>   URL
> 'https://api.worldbank.org/v2/en/country/OED/indicator/NY.ADJ.NNAT.GN.ZS?format=json&date=1977:2020&per_page=32500&page=1':
> Timeout of 60 seconds was reached

If you try to open the link in the browser, does it work? How long does
it take to download? Try increasing options(timeout=...) to a larger
time (in seconds).

I see there is Cloudflare sitting in front of the API, but it's
relatively non-aggressive. I could only get it to deny my request by
accessing it through Tor.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] An error message with the command fm<-1m

2024-08-06 Thread Jorgen Harmse via R-help

> The function is lm(), not 1m().

Eric Berger is correct (except for the extra parentheses), but it is worth 
pointing out that variable names do not begin with digits. (You can use 
backticks, assign, & other features to create such names (e.g. to write the 
Orwellian assignment `2 + 2` <- 5L), but they are non-standard and you need 
special syntax to use them.) Maybe the font makes some characters hard to 
distinguish, but a vertical line at the start of a standard name must be 
lower-case 'l' or upper-case 'I', not '1' or a pipe symbol. A circle or oval 
must be 'o' or 'O', not the digit '0'. (Digits after the first character are 
standard.)

Regards,
Jorgen Harmse.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Difference between stats.steps() and MuMIn.dredge() to select best fit model

2024-08-01 Thread Ivan Krylov via R-help

В Wed, 31 Jul 2024 11:56:55 +
c.bu...@posteo.jp пишет:

> step() explore the model space with a step wise approach.
> And dredge() try out all possible combinations of the variables.
>
> But isn't that the same? I might have a mental block on this.
> 
> Which model (formula) would dredge() "test" that step() wouldn't?`

Suppose that the predictors a, b, c, d, e, f are arranged in the
descending order of contribution to the model.

Consider a forward stepwise algorithm that is asked to choose three
variables.

It starts by testing a, b, c, d, e, f, and chooses a.

It continues by testing a + b, a + c, a + d, a + e, a + f, and chooses
a + b.

It continues by testing a + b + c, a + b + d, a + b + e, a + b + f, and
chooses a + b + c.

By being greedy, it doesn't consider, for example, the model d + e + f,
because for that it would have to pick d before a.

A greedy algorithm for K variables out N tests N + (N-1) + ... +
(N-K+1) = N*K - K(K-1)/2 models. An exhaustive search would have to test
choose(N,K) = N!/(N-K)!/K! models.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] round and trailing zero

2024-07-30 Thread Jorgen Harmse via R-help

Duncan Murdoch answered your question, but I have another. Are you going to do 
some computation with the rounded numbers, or are they just for display? (One 
thing I like about Excel is that I can change the display format of a cell 
without changing answers that depend on that cell.) In the latter case, why 
stash them in a variable? For more control of the display, consider sprintf (or 
a wrapper that combines sprintf with cat).

Regards,
Jorgen Harmse.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Using optim() function to find MLE

2024-07-28 Thread Ivan Krylov via R-help

В Mon, 29 Jul 2024 09:52:22 +0530
Christofer Bogaso  пишет:

> LL = function(b0, b1)

help(optim) documents that the function to be optimised takes a single
argument, a vector containing the parameters. Here's how your LL
function can be adapted to this interface:

LL <- function(par) {
 b0 <- par[1]
 b1 <- par[2]
 sum(apply(as.matrix(dat[, c('PurchasedProb', 'Age')]), 1,
 function(iROw) iROw['PurchasedProb'] * log( 1 / (1 + exp(-1 * (b0 + b1
 * iROw['Age'] + (1 - iROw['PurchasedProb']) * log(1 - 1 / (1 +
 exp(-1 * (b0 + b1 * iROw['Age']))
}

Furethermore, LL(c(0, 1)) results in -Inf. All the methods supported by
optim() require at least the initial parameters to result in finite
values, and L-BFGS-B requires all evaluations to be finite. You're also
maximising the function, and optim() defaults to minimisation, so you
need an additional parameter to adjust that (or rewrite the LL function
further):

result1 <- optim(
 par = c(0, 0), fn = LL, method = "L-BFGS-B",
 control = list(fnscale = -1)
)

> coef(result1)

help(optim) documents the return value of optim() as not having a
class or a $coefficients field. You can use result1$par to access the
parameters.

> Is there any way to force optim() function to use Newton-CG algorithm?

I'm assuming you mean the method documented in
<https://docs.scipy.org/doc/scipy/reference/optimize.minimize-newtoncg.html>.
optim() doesn't support the truncated (line search) Newton-CG method.
See the 'optimx' and 'nloptr' packages for an implementation of a
truncated Newton method (not necessarily exactly the same one).

-- 
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] please help generate a square correlation matrix

2024-07-27 Thread Yuan Chun Ding via R-help

HI Bert,

Thank you for extra help!!
Yes, exactly, your interpretation is perfectly correct and your R code is what 
I should look for.
after generated all those negative values of correlation,
I thought about the extremely small p values associated with those negative 
correlation, which is not meaningful as I truncated my data.

When examining the exclusiveness of mutation pairs, what I first thought about 
is correlation, so stepped into a more complicated correlation journey.
However, what Richard share is very helpful to explain why I got negative 
correlation values for all pairs.
In my case, we measured all mutations for all 1000 samples using an exactly 
same sequencing method, so no issue of never-reporting.
I am  very grateful for help and comments from Rui, Richard and Bert!!

Ding

From: Bert Gunter 
Sent: Saturday, July 27, 2024 4:50 PM
To: Yuan Chun Ding 
Cc: Richard O'Keefe ; r-help@r-project.org
Subject: Re: [R] please help generate a square correlation matrix

Your expanded explanation helps clarify your intent. Herewith some comments. Of 
course, feel free to ignore and not respond. And, as always, my apologies if I 
have failed to comprehend your intent. 1. I would avoid any notion of 
"statistical

Your expanded explanation helps clarify your intent. Herewith some

comments. Of course, feel free to ignore and not respond. And, as

always, my apologies if I have failed to comprehend your intent.

1. I would avoid any notion of "statistical significance" like the

plague. This is a purely exploratory exercise.

2. My understanding is that you want to know the proportion of rows in

a pair of columns/vectors in which only 1 values of the pair is 1 out

of the number of pairs where 1 or 2 values is 1.  In R syntax, this is

simply:

sum(xor(x, y)) / sum(x | y)  ,

where x and y are two columns of 1's and 0's

Better yet might be to report both this *and* sum(x|y) to help you

judge "meaningfulness".

Here is a simple function that does this

## first, define a function that does above calculation:

assoc <- \(z){

   x <- z[,1]; y <- z[,2]

   n <- sum(x|y)

   c(prop = sum(xor(x, y))/n, N = n)

}

## Now a function that uses it for the various combinations:

somecor <- function(dat, func = assoc){

   dat <- as.matrix(dat)

   indx <- seq_len(ncol(dat))

   rbind(w <- combn(indx,2),

 combn(indx, 2, FUN = \(m)func(dat[,m]) )) |>

 t()  |> round(digits =2) |>

  'dimnames<-'(list(rep.int('',ncol(w)), c("","", "prop","N")))

}

# Now apply it to your example data:

somecor(dat)

## which gives

 prop N

 1 2 0.67 6

 1 3 0.60 5

 1 4 0.57 7

 2 3 0.60 5

 2 4 0.33 6

 3 4 0.71 7

This seems more interpretable and directly useful to me. Bigger values

of prop for bigger N are the more interesting, assuming I have

interpreted you correctly.

Cheers,

Bert

On Sat, Jul 27, 2024 at 12:54 PM Yuan Chun Ding 
mailto:ycd...@coh.org>> wrote:

>

> Hi Richard,

>

>

>

> Nice to know you had similar experience.

>

> Yes, your understanding is right.  all correlations are negative after 
> removing double-zero rows.

>

> It is consistent with a heatmap we generated.

>

> 1 is for a cancer patient with a specific mutation.  0 is no mutation for the 
> same mutation type in a patient.

>

> a pair of mutation type (two different mutations) are exclusive for most of 
> patients in heatmap or oncoplots.

>

>  If we include all 1000 patients, 900 of patients with no mutations in both 
> mutation types, then the correlation is not significant at all.

>

> But eyeball the heatmap (oncoplots) for mutation (row) by patient (column), 
> mutations are exclusive for most of patients,

>

> so I want to measure how strong the exclusiveness between two specific 
> mutation types across those patients with at least one mutation type.

>

> Then put the pair of mutations with strong negative mutations on the top rows 
> by order of negative mutation values.

>

>

>

> Regarding a final application,  maybe there are some usage for my case.

>

>  If one develops two drugs specific to the two negative correlated mutations, 
> the drug treatment for cancer patients is usually only for those patients 
> carrying the specific mutation,

>

> then it is informative to know how strong the negative correlation when 
> considering different combination of treatment strategies.

>

>

>

> Ding

>

>

>

>

>

>

>

>

>

>

>

> From: R-help 
> mailto:r-help-boun...@r-project.org>> On Behalf 
> Of Richard O'Keefe

> Sent: Saturday, July 27, 2024 4:47 AM

> To: Bert Gunter mailto:bgunter.4...@gmail.com>>

> Cc: r-help@r-project.org<m

Re: [R] please help generate a square correlation matrix

2024-07-27 Thread Yuan Chun Ding via R-help

Hi Richard,

Nice to know you had similar experience.
Yes, your understanding is right.  all correlations are negative after removing 
double-zero rows.
It is consistent with a heatmap we generated.
1 is for a cancer patient with a specific mutation.  0 is no mutation for the 
same mutation type in a patient.
a pair of mutation type (two different mutations) are exclusive for most of 
patients in heatmap or oncoplots.
 If we include all 1000 patients, 900 of patients with no mutations in both 
mutation types, then the correlation is not significant at all.
But eyeball the heatmap (oncoplots) for mutation (row) by patient (column), 
mutations are exclusive for most of patients,
so I want to measure how strong the exclusiveness between two specific mutation 
types across those patients with at least one mutation type.
Then put the pair of mutations with strong negative mutations on the top rows 
by order of negative mutation values.

Regarding a final application,  maybe there are some usage for my case.
 If one develops two drugs specific to the two negative correlated mutations, 
the drug treatment for cancer patients is usually only for those patients 
carrying the specific mutation,
then it is informative to know how strong the negative correlation when 
considering different combination of treatment strategies.

Ding

From: R-help  On Behalf Of Richard O'Keefe
Sent: Saturday, July 27, 2024 4:47 AM
To: Bert Gunter 
Cc: r-help@r-project.org
Subject: Re: [R] please help generate a square correlation matrix

Curses, my laptop is hallucinating again. Hope I can get through this. So we're 
talking about correlations between binary variables. Suppose we have two 
0-1-valued variables, x and y. Let A <- sum(x*y) # number of cases where x and 
y are

Curses, my laptop is hallucinating again.  Hope I can get through this.

So we're talking about correlations between binary variables.

Suppose we have two 0-1-valued variables, x and y.

Let A <- sum(x*y)  # number of cases where x and y are both 1.

Let B <- sum(x)-A  # number of cases where x is 1 and y is 0

Let C <- sum(y)-A # number of cases where y is 1 and x is 0

Let D <- sum(!x * !y) # number of cases where x and y are both 0.

(also D = length(x)-A-B-C)

All the information is summarised in the 2-by-2 contingency table.

Some years ago, Nathan Rountree and I supervised Yung-Sing Koh's

data-mining PhD.

She surveyed the data mining literature and found some 37 different

"interestingness measures" for two-variable associations  -- if I

remember correctly; there were a lot of them.  They fell into a much

smaller number of qualitatively similar groups.

At any rate, the Pearson correlation between x and y is

(A*D - B*C)/sqrt((A+B)*(C+D)*(A+C)*(B+D))

So what happens when we delete the rows where x = 0 and y = 0?

Right, it forces D to 0, leaving A B C unchanged.

And looking at the numerator,

  If you delete rows with x = 0 y = 0 you MUST get a negative correlation.

Quite a modest "true" correlation (based on all the data) like -0.2

can masquerade as quite a strong "zero-suppressed" correlation like

-0.6.  Even +0.2 can turn into -0.4.   (These figures are from a

particular simulation run and may not apply in your case.)

Now one of the reasons why Yun-Sing Koh, Nathan Rountree, and I were

interested in interestingness measures is perhaps coincidentally

related to the file drawer/underreporting problem: it's quite common

for rows where x = 0 and y = 0 never to have been reported to you, so

we were hoping there were measures immune to that.  I have argued for

years that "till record analysis" for supermarkets &c is badly flawed

by two facts: (a) it is hard to measure how much of a product people

WOULD have bought if only you had offered it for sale (although you

can make educated guesses) and (b) till records provide no evidence on

what the people who walked out without buying anything wanted (was the

price too high?  could they not find it?).  Problem (a) leads to a

commercial variant of the Signor-Lipps effect: "when x and/or y were

available for purchase" is not the same as "the period for which data

were recorded", thus inflating D, perhaps massively.  Methods

developed for handling the Signor-Lipps effect in paleontology can be

used to estimate when x and y were available helping you to recover a

more realistic N=A+B+C+D.  I really should have published that.

All of which is a long-winded way of saying that

- Pearson correlations on binary columns can be computed very efficiently

- the rows with x=0 and y=0 may be very informative, even essential for analysis

- delete them at your peril.

- really, delete them at your peril.

On Sat, 27 Jul 2024 at 23:07, Richard O'Keefe 
mailto:rao...@gmail.com>> wrote:

>

> Let's go back to the original posting.

>

> > >

> >

Re: [R] plotting nnet function....

2024-07-27 Thread Ivan Krylov via R-help

В Sat, 27 Jul 2024 11:00:34 +
akshay kulkarni  пишет:

> My question is : how to plot the final model on the actual data
> points?

Have you been able to obtain the predictions? What happens if you call
predict() on the model object returned to you by train()?

Once you have both the data and the prediction, it should be as simple
as plot(traindata$predictor_column, traindata$regressor_column);
lines(traindata$predictor_column, previously_returned_predictions). (Or
an equivalent with your favourite plotting system for R.)

Try following the vignette from the 'caret' package:
https://cran.r-project.org/package=caret/vignettes/caret.html

If you do encounter an error on your way or get stuck not knowing how
exactly to continue, please ask a more specific question.

-- 
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Automatic Knot selection in Piecewise linear splines

2024-07-26 Thread Vito Muggeo via R-help

dear all,
I apologize for my delay in replying you. Here my contribution, maybe 
just for completeness:

Similar to "earth", "segmented" also fits piecewise linear relationships 
with the number of breakpoints being selected by the AIC or BIC 
(recommended).

#code (example and code from Martin Maechler previous email)

library(segmented)
o<-selgmented(y, ~x, Kmax=20, type="bic", msg=TRUE)
plot(o, add=TRUE)
lines(o, col=2) #the approx CI for the breakpoints

confint(o) #the estimated breakpoints (with CI's)
slope(o) #the estimated slopes (with CI's)

However segmented appears to be less efficient than earth (although with 
reasonable running times), it does NOT work with multivariate responses 
neither products between piecewise linear terms.

kind regards,
Vito

Il 16/07/2024 11:22, Martin Maechler ha scritto:

Anupam Tyagi
 on Tue, 9 Jul 2024 16:16:43 +0530 writes:

 > How can I do automatic knot selection while fitting piecewise linear
 > splines to two variables x and y? Which package to use to do it simply? I
 > also want to visualize the splines (and the scatter plot) with a graph.

 > Anupam

NB: linear splines, i.e. piecewise linear continuous functions.
Given the knots, use  approx() or approxfun() however, the
automatic knots selection does not happen in the base R packages.

I'm sure there are several R packages doing this.
The best such package in my opinion is "earth" which does a
re-implementation (and extensive  *generalization*) of the
famous  MARS algorithm of Friedman.
==> https://en.wikipedia.org/wiki/Multivariate_adaptive_regression_splines

Note that their strengths and power is that  they do their work
for multivariate x (MARS := Multivariate Adaptive Regression
Splines), but indeed do work for the simple 1D case.

In the following example, we always get 11 final knots,
but I'm sure one can tweak the many tuning paramters of earth()
to get more:

## Can we do  knot-selection  for simple (x,y) splines?  === Yes, via  earth() 
{using MARS}!

x <- (0:800)/8

f <- function(x) 7 * sin(pi/8*x) * abs((x-50)/20)^1.25 - (x-40)*(12-x)/64
curve(f(x), 0, 100, n = 1000, col=2, lwd=2)

set.seed(11)
y <- f(x) + 10*rnorm(x)

m.sspl <- smooth.spline(x,y) # base line "standard smoother"

require(earth)
fm1 <- earth(x, y) # default settings
summary(fm1, style = "pmax") #-- got  10 knots (x = 44 "used twice") below
## Call: earth(x=x, y=y)

## y =
##   175.9612
##   -   10.6744 * pmax(0,  x -  4.625)
##   +  9.928496 * pmax(0,  x - 10.875)
##   -  5.940857 * pmax(0,  x -  20.25)
##   +  3.438948 * pmax(0,  x - 27.125)
##   -  3.828159 * pmax(0, 44 -  x)
##   +  4.207046 * pmax(0,  x - 44)
##   +  2.573822 * pmax(0,  x -   76.5)
##   -  10.99073 * pmax(0,  x - 87.125)
##   +  10.97592 * pmax(0,  x - 90.875)
##   +  9.331949 * pmax(0,  x - 94)
##   -   8.48575 * pmax(0,  x -   96.5)

## Selected 12 of 12 terms, and 1 of 1 predictors
## Termination condition: Reached nk 21
## Importance: x
## Number of terms at each degree of interaction: 1 11 (additive model)
## GCV 108.6592RSS 82109.44GRSq 0.861423RSq 0.86894

fm2 <- earth(x, y, fast.k = 0) # (more extensive forward pass)
summary(fm2)
all.equal(fm1, fm2)# they are identical (apart from 'call'):
fm3 <- earth(x, y, fast.k = 0, pmethod = "none", trace = 3) # extensive forward 
pass; *no* pruning
## still no change: fm3 "==" fm1
all.equal(predict(fm1, xx), predict(fm3, xx))

## BTW: The chosen knots and coefficients are
mat <- with(fm1, cbind(dirs, cuts=c(cuts), coef = c(coefficients)))

## Plots : fine grid for visualization: instead of   xx <- seq(x[1], 
x[length(x)], length.out = 1024)
rnx <- extendrange(x) ## to extrapolate a bit
xx <- do.call(seq.int, c(rnx, list(length.out = 1200)))

cbind(f = f(xx),
   sspl = predict(m.sspl, xx)$y,
   mars = predict(fm1, xx)) -> fits

plot(x,y, xlim=rnx, cex = 1/4, col = adjustcolor(1, 1/2))
cols <- c(adjustcolor(2, 1/3),
   adjustcolor(4, 2/3),
   adjustcolor("orange4", 2/3))
lwds <- c(3, 2, 2)
matlines(xx, fits, col = cols, lwd = lwds, lty=1)
legend("topleft", c("true f(x)", "smooth.spline()", "earth()"),
col=cols, lwd=lwds, bty = "n")
title(paste("earth() linear spline vs. smooth.spline();  n =", length(x)))
mtext(substitute(f(x) == FDEF, list(FDEF = body(f

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
=
Vito M.R. Muggeo, PhD

Re: [R] please help generate a square correlation matrix

2024-07-25 Thread Yuan Chun Ding via R-help

Hi Rui,

You are always very helpful!! Thank you,

I just modified your R codes to remove a row with zero values in both column 
pair as below for my real data.

Ding

dat<-gene22mut.coded
r <- P <- matrix(NA, nrow = 22L, ncol = 22L,
 dimnames = list(names(dat), names(dat)))

for(i in 1:22) {
  #i=1
  x <- dat[[i]]
  for(j in (1:22)) {
#j=2
if(i == j) {
  # there's nothing to test, assign correlation 1
  r[i, j] <- 1
} else {
  tmp <-cbind(x,dat[[j]])
  row0 <-rowSums(tmp)
  tem2 <-tmp[row0!=0,]
  tmp3 <- cor.test(tem2[,1],tem2[,2])
  r[i, j] <- tmp3$estimate
  P[i, j] <- tmp3$p.value
}
  }
}
r<-as.data.frame(r)
P<-as.data.frame(P)

From: R-help  On Behalf Of Yuan Chun Ding via 
R-help
Sent: Thursday, July 25, 2024 11:26 AM
To: Rui Barradas ; r-help@r-project.org
Subject: Re: [R] please help generate a square correlation matrix

HI Rui, Thank you for the help! You did not remove a row if zero values exist 
in both column pair, right? Ding From: Rui Barradas  
Sent: Thursday, July 25, 2024 11: 15 AM To: Yuan Chun Ding ;


HI Rui,



Thank you for the  help!



You did not remove a row if zero values exist in both column pair, right?



Ding



From: Rui Barradas mailto:ruipbarra...@sapo.pt>>

Sent: Thursday, July 25, 2024 11:15 AM

To: Yuan Chun Ding mailto:ycd...@coh.org>>; 
r-help@r-project.org<mailto:r-help@r-project.org>

Subject: Re: [R] please help generate a square correlation matrix



Às 17: 39 de 25/07/2024, Yuan Chun Ding via R-help escreveu: > Hi R users, > > 
I generated a square correlation matrix for the dat dataframe below; > 
dat<-data. frame(g1=c(1,0,0,1,1,1,0,0,0), > g2=c(0,1,0,1,0,1,1,0,0), > 
g3=c(1,1,0,0,0,1,0,0,0),





Às 17:39 de 25/07/2024, Yuan Chun Ding via R-help escreveu:



> Hi R users,



>



> I generated a square correlation matrix for the dat dataframe below;



> dat<-data.frame(g1=c(1,0,0,1,1,1,0,0,0),



>  g2=c(0,1,0,1,0,1,1,0,0),



>  g3=c(1,1,0,0,0,1,0,0,0),



>  g4=c(0,1,0,1,1,1,1,1,0))



> library("Hmisc")



> dat.rcorr = rcorr(as.matrix(dat))



> dat.r <-round(dat.rcorr$r,2)



>



> however, I want to modify this correlation calculation;



> my dat has more than 1000 rows and 22 columns;



> in each column, less than 10% values are 1, most of them are 0;



> so I want to remove a  row with value of zero in both columns when calculate 
> correlation between two columns.



> I just want to check whether those values of 1 are correlated between two 
> columns.



> Please look at my code in the following;



>



> cor.4gene <-matrix(0,nrow=4*4, ncol=4)



> for (i in 1:4){



>#i=1



>for (j in 1:4) {



>  #j=1



>  d <-dat[,c(i,j)]%>%



>filter(eval(as.symbol(colnames(dat)[i]))!=0 |



> eval(as.symbol(colnames(dat)[j]))!=0)



>  c <-cor.test(d[,1],d[,2])



>  cor.4gene[i*j,]<-c(colnames(dat)[i],colnames(dat)[j],



>  c$estimate,c$p.value)



>}



> }



> cor.4gene<-as.data.frame(cor.4gene)%>%filter(V1 !=0)



> colnames(cor.4gene)<-c("gene1","gene2","cor","P")



>



> Can you tell me what mistakes I made?



> first, why cor is NA when calculation of correlation for g1 and g1, I though 
> it should be 1.



>



> cor.4gene$cor[is.na(cor.4gene$cor)]<-1



> cor.4gene$cor[is.na(cor.4gene$P)]<-0



> cor.4gene.sq <-pivot_wider(cor.4gene, names_from = gene1, values_from = cor)



>



> Then this line of code above did not generate a square matrix as what the 
> HMisc library did.



> How to fix my code?



>



> Thank you,



>



> Ding



>



>



> --



> 



> -SECURITY/CONFIDENTIALITY WARNING-



>



> This message and any attachments are intended solely for the individual or 
> entity to which they are addressed. This communication may contain 
> information that is privileged, confidential, or exempt from disclosure under 
> applicable law (e.g., personal health information, research data, financial 
> information). Because this e-mail has been sent without encryption, 
> individuals other than the intended recipient may be able to view the 
> information, forward it to others or tamper with the information without the 
> knowledge or consent of the sender. If you are not the intended recipient, or 
> the employee or person responsible for delivering the message to the intended 
> recipient, any dissemination, distribution or copying of the communicat

Re: [R] please help generate a square correlation matrix

2024-07-25 Thread Yuan Chun Ding via R-help

HI Rui,

Thank you for the  help!

You did not remove a row if zero values exist in both column pair, right?

Ding

From: Rui Barradas 
Sent: Thursday, July 25, 2024 11:15 AM
To: Yuan Chun Ding ; r-help@r-project.org
Subject: Re: [R] please help generate a square correlation matrix

Às 17: 39 de 25/07/2024, Yuan Chun Ding via R-help escreveu: > Hi R users, > > 
I generated a square correlation matrix for the dat dataframe below; > 
dat<-data. frame(g1=c(1,0,0,1,1,1,0,0,0), > g2=c(0,1,0,1,0,1,1,0,0), > 
g3=c(1,1,0,0,0,1,0,0,0),


Às 17:39 de 25/07/2024, Yuan Chun Ding via R-help escreveu:

> Hi R users,

>

> I generated a square correlation matrix for the dat dataframe below;

> dat<-data.frame(g1=c(1,0,0,1,1,1,0,0,0),

>  g2=c(0,1,0,1,0,1,1,0,0),

>  g3=c(1,1,0,0,0,1,0,0,0),

>  g4=c(0,1,0,1,1,1,1,1,0))

> library("Hmisc")

> dat.rcorr = rcorr(as.matrix(dat))

> dat.r <-round(dat.rcorr$r,2)

>

> however, I want to modify this correlation calculation;

> my dat has more than 1000 rows and 22 columns;

> in each column, less than 10% values are 1, most of them are 0;

> so I want to remove a  row with value of zero in both columns when calculate 
> correlation between two columns.

> I just want to check whether those values of 1 are correlated between two 
> columns.

> Please look at my code in the following;

>

> cor.4gene <-matrix(0,nrow=4*4, ncol=4)

> for (i in 1:4){

>#i=1

>for (j in 1:4) {

>  #j=1

>  d <-dat[,c(i,j)]%>%

>filter(eval(as.symbol(colnames(dat)[i]))!=0 |

> eval(as.symbol(colnames(dat)[j]))!=0)

>  c <-cor.test(d[,1],d[,2])

>  cor.4gene[i*j,]<-c(colnames(dat)[i],colnames(dat)[j],

>  c$estimate,c$p.value)

>}

> }

> cor.4gene<-as.data.frame(cor.4gene)%>%filter(V1 !=0)

> colnames(cor.4gene)<-c("gene1","gene2","cor","P")

>

> Can you tell me what mistakes I made?

> first, why cor is NA when calculation of correlation for g1 and g1, I though 
> it should be 1.

>

> cor.4gene$cor[is.na(cor.4gene$cor)]<-1

> cor.4gene$cor[is.na(cor.4gene$P)]<-0

> cor.4gene.sq <-pivot_wider(cor.4gene, names_from = gene1, values_from = cor)

>

> Then this line of code above did not generate a square matrix as what the 
> HMisc library did.

> How to fix my code?

>

> Thank you,

>

> Ding

>

>

> --

> 

> -SECURITY/CONFIDENTIALITY WARNING-

>

> This message and any attachments are intended solely for the individual or 
> entity to which they are addressed. This communication may contain 
> information that is privileged, confidential, or exempt from disclosure under 
> applicable law (e.g., personal health information, research data, financial 
> information). Because this e-mail has been sent without encryption, 
> individuals other than the intended recipient may be able to view the 
> information, forward it to others or tamper with the information without the 
> knowledge or consent of the sender. If you are not the intended recipient, or 
> the employee or person responsible for delivering the message to the intended 
> recipient, any dissemination, distribution or copying of the communication is 
> strictly prohibited. If you received the communication in error, please 
> notify the sender immediately by replying to this message and deleting the 
> message and any accompanying files from your system. If, due to the security 
> risks, you do not wish to rec

>   eive further communications via e-mail, please reply to this message and 
> inform the sender that you do not wish to receive further e-mail from the 
> sender. (LCP301)

> 

>

> [[alternative HTML version deleted]]

>

> __

> R-help@r-project.org<mailto:R-help@r-project.org> mailing list -- To 
> UNSUBSCRIBE and more, see

> https://urldefense.com/v3/__https://stat.ethz.ch/mailman/listinfo/r-help__;!!Fou38LsQmgU!tyykZkQmOKcwoWXEpV2ohbnr02thhHMabAcYLL_-7dteKHAabK-eo4rGDnwgSFjniAy8SO00L6Hb8338TBM$<https://urldefense.com/v3/__https:/stat.ethz.ch/mailman/listinfo/r-help__;!!Fou38LsQmgU!tyykZkQmOKcwoWXEpV2ohbnr02thhHMabAcYLL_-7dteKHAabK-eo4rGDnwgSFjniAy8SO00L6Hb8338TBM$>

> PLEASE do read the posting guide 
> https://urldefense.com/v3/__http://www.R-project.org/posting-guide.html__;!!Fou38LsQmgU!tyykZkQmOKcwoWXEpV2ohbnr02thhHMabAcYLL_-7dteKHAabK-eo4rGDnwgSFjniAy8SO00L6Hb880tLw0$<https://urldefense.co

[R] please help generate a square correlation matrix

2024-07-25 Thread Yuan Chun Ding via R-help

Hi R users,

I generated a square correlation matrix for the dat dataframe below;
dat<-data.frame(g1=c(1,0,0,1,1,1,0,0,0),
g2=c(0,1,0,1,0,1,1,0,0),
g3=c(1,1,0,0,0,1,0,0,0),
g4=c(0,1,0,1,1,1,1,1,0))
library("Hmisc")
dat.rcorr = rcorr(as.matrix(dat))
dat.r <-round(dat.rcorr$r,2)

however, I want to modify this correlation calculation;
my dat has more than 1000 rows and 22 columns;
in each column, less than 10% values are 1, most of them are 0;
so I want to remove a  row with value of zero in both columns when calculate 
correlation between two columns.
I just want to check whether those values of 1 are correlated between two 
columns.
Please look at my code in the following;

cor.4gene <-matrix(0,nrow=4*4, ncol=4)
for (i in 1:4){
  #i=1
  for (j in 1:4) {
#j=1
d <-dat[,c(i,j)]%>%
  filter(eval(as.symbol(colnames(dat)[i]))!=0 |
   eval(as.symbol(colnames(dat)[j]))!=0)
c <-cor.test(d[,1],d[,2])
cor.4gene[i*j,]<-c(colnames(dat)[i],colnames(dat)[j],
c$estimate,c$p.value)
  }
}
cor.4gene<-as.data.frame(cor.4gene)%>%filter(V1 !=0)
colnames(cor.4gene)<-c("gene1","gene2","cor","P")

Can you tell me what mistakes I made?
first, why cor is NA when calculation of correlation for g1 and g1, I though it 
should be 1.

cor.4gene$cor[is.na(cor.4gene$cor)]<-1
cor.4gene$cor[is.na(cor.4gene$P)]<-0
cor.4gene.sq <-pivot_wider(cor.4gene, names_from = gene1, values_from = cor)

Then this line of code above did not generate a square matrix as what the HMisc 
library did.
How to fix my code?

Thank you,

Ding


--

-SECURITY/CONFIDENTIALITY WARNING-  

This message and any attachments are intended solely for the individual or 
entity to which they are addressed. This communication may contain information 
that is privileged, confidential, or exempt from disclosure under applicable 
law (e.g., personal health information, research data, financial information). 
Because this e-mail has been sent without encryption, individuals other than 
the intended recipient may be able to view the information, forward it to 
others or tamper with the information without the knowledge or consent of the 
sender. If you are not the intended recipient, or the employee or person 
responsible for delivering the message to the intended recipient, any 
dissemination, distribution or copying of the communication is strictly 
prohibited. If you received the communication in error, please notify the 
sender immediately by replying to this message and deleting the message and any 
accompanying files from your system. If, due to the security risks, you do not 
wish to rec
 eive further communications via e-mail, please reply to this message and 
inform the sender that you do not wish to receive further e-mail from the 
sender. (LCP301)


    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [External] Using the pipe, |>, syntax with "names<-"

2024-07-20 Thread Jeff Newmiller via R-help

I think that the simplicity of setNames is hard to beat:

z |> setNames( c( "a", "foo" ) )

and if you are determined not to load dplyr then

column_rename <- function( DF, map ) {
  on <- names( DF )
  on[ match( map, on ) ] <- names( map )
  names( DF ) <- on
  DF
}

is more robust to column reorganization than replace():

z |> column_rename( c( foo = "b" ) )


On July 20, 2024 10:07:57 PM PDT, Deepayan Sarkar  
wrote:
>The main challenge in Bert's original problem is that `[` and `[<-` cannot
>be called in a pipeline. The obvious solution is to define named versions,
>e.g.:
>
>elt <- `[`
>`elt<-` <- `[<-`
>
>Then,
>
>> z <- data.frame(a = 1:3, b = letters[1:3])
>> z |> names() |> elt(2)
>[1] "b"
>> z |> names() |> elt(2) <- "foo"
>> z
>  a foo
>1 1   a
>2 2   b
>3 3   c
>
>You could actually also do (using a similar function already defined in
>methods)
>
>z |> names() |> el(2) <- "bar"
>
>Iris's _ trick is of course a nice alternative; and this example in ?pipeOp
>already covers it:
>
># using the placeholder as the head of an extraction chain:
>mtcars |> subset(cyl == 4) |> lm(formula = mpg ~ disp) |> _$coef[[2]]
>
>While the replacement question is a nice exercise, I am not sure about the
>value of emphasizing that you can use pipes to do complex assignments.
>Doesn't that defeat the whole purpose of piping? For one thing, it will
>necessarily terminate the pipe. Also, it will not work if the starting
>value is not a variable. E.g.,
>
>> data.frame(a = 1:3, b = letters[1:3]) |> names() |> _[2] <- "bar"
>Error in names(data.frame(a = 1:3, b = letters[1:3]))[2] <- "bar" :
>  target of assignment expands to non-language object
>
>Duncan's rename() approach, which will just change the column name and
>return the modified object, seems more useful as part of a pipeline.
>
>Best,
>-Deepayan
>
>On Sun, 21 Jul 2024 at 04:46, Bert Gunter  wrote:
>
>> I second Rich's excellent suggestion.
>>
>> As with all elegant solutions, Iris's clicked on the wee light bulb in
>> my brain, and I realized that a slightly more verbose, but perhaps
>> more enlightening, alternative may be:
>>
>> z |>  attr("names") |> _[2] <- "foo"
>>
>> However, I would add this as an example *only with* Iris's solution.
>> Hers should be shown whether or not the above is.
>>
>> Cheers,
>> Bert
>>
>> On Sat, Jul 20, 2024 at 3:35 PM Richard M. Heiberger 
>> wrote:
>> >
>> > I think Iris's solution should be added to the help file: ?|>
>> > there are no examples there now that show assignment or replacement
>> using the "_"
>> >
>> > > On Jul 20, 2024, at 18:21, Duncan Murdoch 
>> wrote:
>> > >
>> > > On 2024-07-20 6:02 p.m., Iris Simmons wrote:
>> > >> z <- data.frame(a = 1:3, b = letters[1:3])
>> > >> z |> names() |> _[2] <- "foo"
>> > >> z
>> > >
>> > > That's a great suggestion!
>> > >
>> > > Duncan Murdoch
>> > >
>> > > __
>> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > > https://stat.ethz.ch/mailman/listinfo/r-help
>> > > PLEASE do read the posting guide
>> http://www.r-project.org/posting-guide.html
>> > > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Extract

2024-07-19 Thread Jeff Newmiller via R-help

Here is another way... for data analysis, the idiomatic result is usually more 
useful, though for presentation in a final result the wide result might be 
desired.

library(dplyr)
library(tidyr)

dat<-read.csv(text=
"Year, Sex,string
2002,F,15 xc Ab
2003,F,14
2004,M,18 xb 25 35 21
2005,M,13 25
2006,M,14 ac 256 AV 35
2007,F,11"
, header=TRUE )

idiomatic <- (
dat
%>% mutate( string = strsplit( string, " " ) )
%>% unnest( cols = string )
%>% group_by( Year, Sex )
%>% mutate( s_name = paste0( "S", seq_along( string ) ) )
%>% ungroup()
)
idiomatic # each row has unique Year, Sex, and s_name

wide <- (
idiomatic
%>% spread( s_name, string )
)
wide


On July 19, 2024 11:23:48 AM PDT, Val  wrote:
>Thank you and sorry for the confusion.
>The desired result should have 8 variables as a comma separated in
>each line.  The string variable  is  considered as one variable.
>The output of your script is wfine for me.  Thank you!
>
>On Fri, Jul 19, 2024 at 1:00 PM Ebert,Timothy Aaron  wrote:
>>
>> The desired result is odd.
>> 1) It looks like the string is duplicated in the desired result. The first 
>> line of data has "15, xc, Ab",  and the desired result has "15, xc, Ab, 15, 
>> xc, Ab"
>> 2) The example has S1 through S5, but the desired result has data for eight 
>> variables in the first line (not five).
>> 3) The desired result has a different number of variables for each line.
>> 4) Are you assuming that all missing data is at the end of the string? If 
>> there are 5 variables (S1  S5), do you know that "15, xc, Ab" is S1 = 
>> 15, S2 = 'xc', and S3 = 'Ab' rather than S2=15, S4='xc' and S5='Ab' ?
>>
>> This isn't exactly what you asked for, but maybe I was confused somewhere. 
>> This approach puts string data into variables in order. In this approach one 
>> mixes string and numeric data. The string is not duplicated.
>>
>> library(tidyr)
>>
>> dat <- read.csv(text="Year,Sex,string
>> 2002,F,15 xc Ab
>> 2003,F,14
>> 2004,M,18 xb 25 35 21
>> 2005,M,13 25
>> 2006,M,14 ac 256 AV 35
>> 2007,F,11", header=TRUE, stringsAsFactors=FALSE)
>>
>> # split the 'string' column based on spaces
>> dat_separated <- dat |>
>>   separate(string, into = paste0("S", 1:5), sep = " ",
>>fill = "right", extra = "merge")
>>
>> Tim
>>
>>
>> -Original Message-
>> From: R-help  On Behalf Of Val
>> Sent: Friday, July 19, 2024 12:52 PM
>> To: r-help@R-project.org (r-help@r-project.org) 
>> Subject: [R] Extract
>>
>> [External Email]
>>
>> Hi All,
>>
>> I want to extract new variables from a string and add it to the dataframe.
>> Sample data is csv file.
>>
>> dat<-read.csv(text="Year, Sex,string
>> 2002,F,15 xc Ab
>> 2003,F,14
>> 2004,M,18 xb 25 35 21
>> 2005,M,13 25
>> 2006,M,14 ac 256 AV 35
>> 2007,F,11",header=TRUE)
>>
>> The string column has  a maximum of five variables. Some rows have all and 
>> others may not have all the five variables. If missing then  fill it with 
>> NA, Desired result is shown below,
>>
>>
>> Year,Sex,string, S1, S2, S3 S4,S5
>> 2002,F,15 xc Ab, 15,xc,Ab, NA, NA
>> 2003,F,14, 14,NA,NA,NA,NA
>> 2004,M,18 xb 25 35 21,18, xb, 25, 35, 21
>> 2005,M,13 25,13, 25,NA,NA,NA
>> 2006,M,14 ac 256 AV 35, 14, ac, 256, AV, 35
>> 2007,F,11, 11,NA,NA,NA,NA
>>
>> Any help?
>> Thank you in advance.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.r-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ggplot two-factor legend

2024-07-18 Thread SIBYLLE STÖCKLI via R-help

Thanks a lot Rui and Jeff

Yes including labels=c() in  scale_linetype_manual() was the hint.

Sibylle

-Original Message-
From: Rui Barradas  
Sent: Thursday, July 18, 2024 6:50 PM
To: sibylle.stoec...@gmx.ch; r-help@r-project.org
Subject: Re: [R] ggplot two-factor legend

Às 17:43 de 18/07/2024, Rui Barradas escreveu:
> Às 16:27 de 18/07/2024, SIBYLLE STÖCKLI via R-help escreveu:
>> Hi
>>
>> I am using ggplot to visualise y for a two-factorial group (Bio: 0 
>> and
>> 1) x
>> = 6 years. I was able to adapt the colour of the lines (green and 
>> red) and the linetype (solid and dashed).
>> Challenge: my code produces now two legends. One with the colors for 
>> the group and one with the linetype for the group. Does somebody have 
>> a hint how to adapt the code to produce one legend? Group 0 = red and 
>> dashed, Group 1 = green and solid?
>>
>>
>> MS1<- MS %>% filter(QI_A!="NA") %>% droplevels() dev.new(width=4, 
>> height=2.75) par(mar = c(0,6,0,0)) p1<-ggplot(data = MS1, aes(x= 
>> Jahr, y= QI_A,group=Bio,color=Bio,
>> linetype=Bio)) +
>>  geom_smooth(aes(fill=Bio) , method = "lm" , formula = y ~ x 
>> +
>> I(x^2),linewidth=1) +
>> theme(panel.background = element_blank())+
>> theme(axis.line = element_line(colour = "black"))+
>>theme(axis.text=element_text(size=18))+
>>theme(axis.title=element_text(size=20))+
>> ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
>> scale_color_manual(values=c("red","dark green"), labels=c("ÖLN", 
>> "BIO"))+
>> scale_fill_manual(values=c("red","dark green"), labels= c("ÖLN", 
>> "BIO"))+
>> theme(legend.title = element_blank())+
>>theme(legend.text=element_text(size=20))+
>>scale_linetype_manual(values=c("dashed", "solid"))
>> p1<-p1 + expand_limits(y=c(0, 30))
>>
>> kind regards
>> Sibylle
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> Hello,
> 
> To have one legend only, the labels must be the same. Try using
> 
> labels=c("ÖLN", "BIO")
> 
> in
> 
> scale_linetype_manual(values=c("dashed", "solid"), labels=c("ÖLN", 
> "BIO"))
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> 
Hello,

Here is a more complete an answer with the built-in data set mtcars.
Note that the group aesthetic is not used. This is because linetype is 
categorical (after mutate) and there's no need to group again by the same 
variable (am).

Remove labels from scale_linetype_manual and there are two legends but with the 
same labels the legends merge.


library(ggplot2)
library(dplyr)

mtcars %>%
   # linetype must be categorical
   mutate(am = factor(am)) %>%
   ggplot(aes(hp, disp, color = am, linetype = am)) +
   geom_line() +
   scale_color_manual(
 values = c("red","dark green"),
 labels = c("ÖLN", "BIO")
   ) +
   scale_linetype_manual(
 values = c("dashed", "solid"),
 labels = c("ÖLN", "BIO")
   ) +
   theme_bw()


Hope this helps,

Rui Barradas



-- 
Este e-mail foi analisado pelo software antivírus AVG para verificar a presença 
de vírus.
www.avg.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ggplot two-factor legend

2024-07-18 Thread SIBYLLE STÖCKLI via R-help

Thanks Jeff

I removed the group parameter in the fp1<-ggplot () line. It doesn't change 
anything.
I suppose I got two legends as in the ggplot () line I have color=Bio & 
linetype=Bio. However, when removing linetype = Bio I just geht red and green. 
For black and white printing I would like the additionally differentiate the 
two lines (groups) in the linetype.

Sibylle

-Original Message-
From: Jeff Newmiller  
Sent: Thursday, July 18, 2024 6:13 PM
To: sibylle.stoec...@gmx.ch; SIBYLLE STÖCKLI via R-help ; 
r-help@r-project.org
Subject: Re: [R] ggplot two-factor legend

If I follow your question, you want redundant aesthetics. Ggplot normally 
notices correlated aesthetic mapping variables and merges the legends, so the 
most likely answer is that your data are not fully correlated in all rows. I 
have also seen this where data are drawn from different dataframes for 
different layers since it is hard to merge factors, but I don't see that here.

You are using the group parameter... try removing that? The group parameter 
overrides the automatic group determination. There might be a syntax for 
specifying correlated grouping, but I don't know it... I normally just verify 
that my data meets the requirements to be automatically identified as 
correlated if that is my goal, since that is a prerequisite anyway.

On July 18, 2024 8:27:05 AM PDT, "SIBYLLE STÖCKLI via R-help" 
 wrote:
>Hi
>
>I am using ggplot to visualise y for a two-factorial group (Bio: 0 and 
>1) x = 6 years. I was able to adapt the colour of the lines (green and 
>red) and the linetype (solid and dashed).
>Challenge: my code produces now two legends. One with the colors for 
>the group and one with the linetype for the group. Does somebody have a 
>hint how to adapt the code to produce one legend? Group 0 = red and 
>dashed, Group 1 = green and solid?
>
>
>MS1<- MS %>% filter(QI_A!="NA") %>% droplevels() dev.new(width=4, 
>height=2.75) par(mar = c(0,6,0,0)) p1<-ggplot(data = MS1, aes(x= Jahr, 
>y= QI_A,group=Bio,color=Bio,
>linetype=Bio)) + 
>   geom_smooth(aes(fill=Bio) , method = "lm" , formula = y ~ x +
>I(x^2),linewidth=1) +
>   theme(panel.background = element_blank())+
>   theme(axis.line = element_line(colour = "black"))+
>  theme(axis.text=element_text(size=18))+
>  theme(axis.title=element_text(size=20))+
>   ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
>   scale_color_manual(values=c("red","dark green"), labels=c("ÖLN", 
>"BIO"))+
>   scale_fill_manual(values=c("red","dark green"), labels= c("ÖLN", 
>"BIO"))+
>       theme(legend.title = element_blank())+
>  theme(legend.text=element_text(size=20))+
>  scale_linetype_manual(values=c("dashed", "solid"))
>p1<-p1 + expand_limits(y=c(0, 30))
>
>kind regards
>Sibylle
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide 
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

--
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ggplot two-factor legend

2024-07-18 Thread Jeff Newmiller via R-help

If I follow your question, you want redundant aesthetics. Ggplot normally 
notices correlated aesthetic mapping variables and merges the legends, so the 
most likely answer is that your data are not fully correlated in all rows. I 
have also seen this where data are drawn from different dataframes for 
different layers since it is hard to merge factors, but I don't see that here.

You are using the group parameter... try removing that? The group parameter 
overrides the automatic group determination. There might be a syntax for 
specifying correlated grouping, but I don't know it... I normally just verify 
that my data meets the requirements to be automatically identified as 
correlated if that is my goal, since that is a prerequisite anyway.

On July 18, 2024 8:27:05 AM PDT, "SIBYLLE STÖCKLI via R-help" 
 wrote:
>Hi
>
>I am using ggplot to visualise y for a two-factorial group (Bio: 0 and 1) x
>= 6 years. I was able to adapt the colour of the lines (green and red) and
>the linetype (solid and dashed).
>Challenge: my code produces now two legends. One with the colors for the
>group and one with the linetype for the group. Does somebody have a hint how
>to adapt the code to produce one legend? Group 0 = red and dashed, Group 1 =
>green and solid?
>
>
>MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
>dev.new(width=4, height=2.75)
>par(mar = c(0,6,0,0))
>p1<-ggplot(data = MS1, aes(x= Jahr, y= QI_A,group=Bio,color=Bio,
>linetype=Bio)) + 
>   geom_smooth(aes(fill=Bio) , method = "lm" , formula = y ~ x +
>I(x^2),linewidth=1) +
>   theme(panel.background = element_blank())+
>   theme(axis.line = element_line(colour = "black"))+
>  theme(axis.text=element_text(size=18))+
>  theme(axis.title=element_text(size=20))+
>   ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
>   scale_color_manual(values=c("red","dark green"), labels=c("ÖLN",
>"BIO"))+
>   scale_fill_manual(values=c("red","dark green"), labels= c("ÖLN",
>"BIO"))+
>   theme(legend.title = element_blank())+
>  theme(legend.text=element_text(size=20))+
>  scale_linetype_manual(values=c("dashed", "solid"))
>p1<-p1 + expand_limits(y=c(0, 30))
>
>kind regards
>Sibylle
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ggplot two-factor legend

2024-07-18 Thread SIBYLLE STÖCKLI via R-help

Hi

I am using ggplot to visualise y for a two-factorial group (Bio: 0 and 1) x
= 6 years. I was able to adapt the colour of the lines (green and red) and
the linetype (solid and dashed).
Challenge: my code produces now two legends. One with the colors for the
group and one with the linetype for the group. Does somebody have a hint how
to adapt the code to produce one legend? Group 0 = red and dashed, Group 1 =
green and solid?


MS1<- MS %>% filter(QI_A!="NA") %>% droplevels()
dev.new(width=4, height=2.75)
par(mar = c(0,6,0,0))
p1<-ggplot(data = MS1, aes(x= Jahr, y= QI_A,group=Bio,color=Bio,
linetype=Bio)) + 
geom_smooth(aes(fill=Bio) , method = "lm" , formula = y ~ x +
I(x^2),linewidth=1) +
theme(panel.background = element_blank())+
theme(axis.line = element_line(colour = "black"))+
  theme(axis.text=element_text(size=18))+
  theme(axis.title=element_text(size=20))+
ylab("Anteil BFF an LN [%]") +xlab("Jahr")+
scale_color_manual(values=c("red","dark green"), labels=c("ÖLN",
"BIO"))+
scale_fill_manual(values=c("red","dark green"), labels= c("ÖLN",
"BIO"))+
theme(legend.title = element_blank())+
  theme(legend.text=element_text(size=20))+
  scale_linetype_manual(values=c("dashed", "solid"))
p1<-p1 + expand_limits(y=c(0, 30))

kind regards
Sibylle

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] grDevices segfault when building R4.4.0 on RHEL 9.1.

2024-07-17 Thread Ivan Krylov via R-help

В Wed, 17 Jul 2024 02:35:22 +
Miguel Esteva  пишет:

> I replaced the "--with-lapack" flag with "--with-lapack='-lflexiblas
> -L/tools/flexiblas/3.4.2/lib64'" and everything built ok.

Glad to see you managed to avoid the crash!

> From a quick check in my emails, seems the RHEL9 system lapack
> packages are broken. Will test a bit more.

Simon Andrews has also shown me how to reproduce the crash on
AlmaLinux: https://stat.ethz.ch/pipermail/r-help/2024-May/479321.html

Looks like an ABI incompatibility between gfortran-11 and blas-devel +
lapack-devel.

-- 
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] grDevices segfault when building R4.4.0 on RHEL 9.1.

2024-07-17 Thread Miguel Esteva via R-help

Hi Ivan,

An apology, I was away for quite a bit.

To reproduce the setup:

I have been using the default GCC in RHEL 9.1.

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-host-pie 
--enable-host-bind-now --enable-languages=c,c++,fortran,lto --prefix=/usr 
--mandir=/usr/share/man --infodir=/usr/share/info 
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared 
--enable-threads=posix --enable-checking=release --with-system-zlib 
--enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object 
--enable-linker-build-id --with-gcc-major-version-only --enable-plugin 
--enable-initfini-array --without-isl --enable-multilib 
--with-linker-hash-style=gnu --enable-offload-targets=nvptx-none 
--without-cuda-driver --enable-gnu-indirect-function --enable-cet 
--with-tune=generic --with-arch_64=x86-64-v2 --with-arch_32=x86-64 
--build=x86_64-redhat-linux --with-build-config=bootstrap-lto 
--enable-link-serialization=1
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.1 20230605 (Red Hat 11.4.1-2) (GCC)

I have been building R 4.4.0 and 4.4.1 with Flexiblas and with the built in R 
BLAS/LAPACK.

R BLAS:
./configure --prefix=/tools/R/$RVER  --enable-R-shlib --enable-memory-profiling 
 --with-pcre2=/tools/pcre2/10.42

Flexiblas:

PKG_CONFIG_PATH=/tools/flexiblas/3.4.2/lib64/pkgconfig ./configure 
--prefix=/tools/R/flexiblas/4.4.1  --enable-R-shlib --enable-memory-profiling 
--with-pcre2=/tools/pcre2/10.42 --with-blas="-lflexiblas 
-L/tools/flexiblas/3.4.2/lib64" --with-lapack

I realised the build fails when "--with-lapack" is left unspecified, even 
though the configure output shows this:

  Source directory:.
  Installation directory:  /tools/R/flexiblas

  C compiler:  gcc  -g -O2
  Fortran fixed-form compiler: gfortran  -g -O2

  Default C++ compiler:g++ -std=gnu++17  -g -O2
  C++11 compiler:  g++ -std=gnu++11  -g -O2
  C++14 compiler:  g++ -std=gnu++14  -g -O2
  C++17 compiler:  g++ -std=gnu++17  -g -O2
  C++20 compiler:  g++ -std=gnu++20  -g -O2
  C++23 compiler:  g++ -std=gnu++23  -g -O2
  Fortran free-form compiler:  gfortran  -g -O2
  Obj-C compiler:

  Interfaces supported:X11, tcltk
  External libraries:  pcre2, readline, BLAS(FlexiBlas), LAPACK(in 
blas), curl, libdeflate
  Additional capabilities: PNG, JPEG, TIFF, NLS, cairo, ICU
  Options enabled: shared R library, R profiling, memory profiling, 
libdeflate for lazyload

  Capabilities skipped:
  Options not enabled: shared BLAS

  Recommended packages:yes

I replaced the "--with-lapack" flag with "--with-lapack='-lflexiblas 
-L/tools/flexiblas/3.4.2/lib64'" and everything built ok.

>From a quick check in my emails, seems the RHEL9 system lapack packages are 
>broken. Will test a bit more.

If you need a singularity container in the future, I can provide one with the 
R-dependencies installed. We setup dependencies similar to: 
https://github.com/rstudio/r-builds/blob/main/builder/Dockerfile.rhel-9

or

subscription-manager repos --enable codeready-builder-for-rhel-9-$(arch)-rpms
dnf install -y yum-utils
dnf install -y 
https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
yum-builddep -y R

Just bringing this to your attention as the issue is not with R by the looks, 
as I was unable to reproduce on Rocky Linux 9.1.

Kind regards and thanks!


Miguel Esteva
Senior ITS Research Systems Engineer


The Walter and Eliza Hall Institute of Medical Research
1G Royal Parade
Parkville VIC 3052
Australia

Phone (03) 9345 2909

Email estev...@wehi.edu.au

Web http://www.wehi.edu.au<http://www.wehi.edu.au/>


From: Ivan Krylov 
Sent: Friday, 3 May 2024 9:40 PM
To: Miguel Esteva via R-help 
Cc: Miguel Esteva 
Subject: Re: [R] grDevices segfault when building R4.4.0 on RHEL 9.1.

Dear Miguel Esteva,

I couldn't get a Red Hat "ubi9" container to install enough
dependencies to build R. Is there a way to reproduce your setup on a
virtual machine somewhere?

On Fri, 3 May 2024 00:42:43 +
Miguel Esteva via R-help  wrote:

>  *** caught segfault ***
>
> address 0x1801fa8f70, cause 'memory not mapped'
>
>
> Traceback:
>
>  1: solve.default(rgb)

This seems to crash inside the BLAS. Which BLAS are you using? Any
custom ./configure arguments? Which compilers are you running?

To find out more information about the crash, try to follow it with a
debugger. Change directory to src/library/grDevices and run:

_R_COMPILE_PKGS_=1 R_COMPILER_SUPPRESS_ALL=1 \
 R_DEFAULT_PACKAGES=NULL LC_ALL=C \
../../../bin/

Re: [R] reticulate + virtual environments

2024-07-15 Thread Ivan Krylov via R-help

В Mon, 15 Jul 2024 07:56:17 +0200
Sigbert Klinke  пишет:

>  > py_config()  

>  > use_virtualenv("mmstat4.hu.data", required = TRUE)  

Does it help _not_ to call py_config() before use_virtualenv()?

help(py_config) says that it forces the initialization of Python. When
you later try to ask for a different virtual environment, no conflict
is detected because
normalizePath('/home/sk/.virtualenvs/r-reticulate/bin/python') is
identical to
normalizePath('/home/sk/.virtualenvs/mmstat4.hu.data/bin/python'): they
must be both symlinks to /usr/bin/python3, so reticulate is likely
thinking that it's the same Python. Thus Python is not initialised
again, but you also don't see an error.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Reinterpret data without saving it to a file 1st? Check for integer stopping at 1st decimal?

2024-07-14 Thread Ivan Krylov via R-help

В Sun, 14 Jul 2024 03:16:56 -0400
DynV Montrealer  пишет:

> Perhaps some way to break the spreadsheet data (eg XLdata <-
> read_excel(...)), then put it back together without any writing to a
> file (eg XLdataReformed <- reform(XLdata)) ?

read_excel() is documented to return objects of class tibble:
https://cran.r-project.org/package=tibble/vignettes/tibble.html

Long story short, tibbles are named lists of columns, so it should be
possible for you to access and replace the individual parts of them
using the standard list subset syntax XLdata[[columnname]].

Lists are described in R Intro chapter 6 and many other books on R:
https://cran.r-project.org/doc/manuals/R-intro.html#Lists-and-data-frames
http://web.archive.org/web/20230415001551if_/http://ashipunov.info/shipunov/school/biol_240/en/visual_statistics.pdf
(see section 3.8.2 on page 93 and following)

> In addition, from is.integer() documentation I ran
> 
> > is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) abs(x
> > - round (x)) < tol  
> 
> and I'm now trying to have it stop at the 1st decimal content of a
> column.

If you'd like to write idiomatic R code, consider the fact that
is.wholenumber is vectorised:

is.wholenumber(c(1,2,3,pi))
# [1]  TRUE  TRUE  TRUE FALSE

Given a vector of numbers, it will return a vector of the same length
specifying whether each element can be considered a whole number.
Combine it with all() and you can test the whole column in two function
calls.

R also has a type.convert function that may be useful in this case:
https://search.r-project.org/R/refmans/utils/html/type.convert.html

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem loading BiodiversityR, Error: package ‘tcltk’ could not be loaded

2024-07-14 Thread Ivan Krylov via R-help

В Sat, 13 Jul 2024 16:04:17 +0100
Adam Hillier  пишет:

>   error: X11 library is missing: install XQuartz from www.xquartz.org

Does the problem go away if you install XQuartz from www.xquartz.org?

"R installation and administration" section 4 also documents the
requirement to have XQuartz installed in order to use the tcltk package
(which is part of R itself) and the x11() device:
https://cran.r-project.org/doc/manuals/R-admin.html#Installing-R-under-macOS

For macOS-specific problems, r-sig-...@r-project.org may give you more
precise advice. If installing XQuartz doesn't help, make sure to
provide your sessionInfo() output.

-- 
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] grep

2024-07-12 Thread Jorgen Harmse via R-help

which(grepl()) looks odd. Doesn't grep by itself return the correct vector 
of indices?

Regards,
Jorgen Harmse.

Message: 5
Date: Fri, 12 Jul 2024 17:42:05 +0800
From: Steven Yen mailto:st...@ntu.edu.tw>>
To: Uwe Ligges mailto:lig...@statistik.tu-dortmund.de>>, R-help Mailing List
mailto:r-help@r-project.org>>
Cc: Steven Yen mailto:sye...@gmail.com>>
Subject: Re: [R] grep
Message-ID: mailto:b73784ce-c018-4587-bcd9-64adbd0dc...@ntu.edu.tw>>
Content-Type: text/plain; charset="utf-8"

Sorry. grepl worked:

which(grepl("very|somewhat",names(goprobit.p$est)))

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Weird R Studio behaviour...

2024-07-09 Thread Ivan Krylov via R-help

В Tue, 9 Jul 2024 13:02:17 +
Levent TERLEMEZ via R-help  пишет:

> System is on W 10, and R 3.4.1 is working with R Studio. R Studio is
> updated today

https://posit.co/download/rstudio-desktop/ says "RStudio requires R
3.6.0+" so I'm afraid they don't support this configuration any more.

-- 
Best regards,
Ivan

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Weird R Studio behaviour...

2024-07-09 Thread Stephen H. Dawson, DSL via R-help


I disagree.

UTF-8 is far from new. The IDE cannot fail at the point of not handling 
a known technology without advancing the argument with further messaging.


What happens when you run the same code within R apart from the IDE?

What trace work have you accomplished with this special character in a 
separate test script?


It may be how Windows 10 is handling UTF-8 after the latest update of 
optional packages that Microsoft released last week. Run a test in CMD 
on UTF-8 and see what you discover.



*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com


On 7/9/24 11:24, Bert Gunter wrote:

I think you should also update R to the latest version, as that *might* be
the source of the problem.

Other may be able to give you a specific diagnosis, but updating R is to a
(reasonably, at least) current version is good practice anyway.

Cheers,
Bert

On Tue, Jul 9, 2024 at 8:11 AM Levent TERLEMEZ via R-help <
r-help@r-project.org> wrote:


Hi,

Have a nice week. First of all, I know this is not R Studio forum but I
want to ask here first, if you all do not mind. Well, I am away from my
computer right now but, I have a strange problem (at least to me). My
script worked perfectly for a year, and today, suddenly stop working
because R Studio begins to warn me about illegal characters in the script.

System is on W 10, and R 3.4.1 is working with R Studio. R Studio is
updated today to the latest one because of this problem with the hope of
resolving the problem (but no luck) and they are used as their default
installation settings. Anyway, the problem example may not be repoducable
right now but if it is, I can give detailed one later.

While the original working code is this (there is no synax error, too in
the code because it was working perfectly until today before updating R
Studio, error also came out before this update as I mensioned before);

legend(c("Kapanış",""20 Günlük,"50 Günlük"),col=c("black"...

The warning is this;

Error: unexpected symbol inside:
" Encoding(kill3) <- "latin1" legend(c(kill1,kill2,"50
G�nl�k"),col=c("black"… and can be solved when converted “ü” to “u”.
Addition to this another solution (at least for me) is this;

kill1<-"Kapanış"
Encoding(kill1) <- "UTF-8" (these two statements are not needed but fort
he sake of code integrity, is applied to it, too. If kill1 is converted to
latin1, this time it is broken)
kill2<-"22 Günlük MA"
Encoding(kill2) <- "latin1"
kill3<-"50 Günlük MA"
Encoding(kill3) <- "latin1"

And also it is set to “ASK” and always “UTF-8” is selected.

But, I also wonder why today and what changed so R Studio stops suddenly
running the script? I can not following up the changes anymore as used to
be and if this is a character set problem, it is coming back again and
again. What is the permenant solution of this? This is like an endless
problem…

With my best regards and thanks for your patience….

Levent Terlemez.




YASAL UYARI: Bu e-postanın içerdiği bilgiler (ekleri de dahil olmak üzere)
gizlidir. Sahibinin onayı olmaksızın içeriği kopyalanamaz, üçüncü kişilere
açıklanamaz veya iletilemez . Bu mesajın gönderilmek istendiği kişi
değilseniz (ya da bu e-postayı yanlışlıkla aldıysanız), lütfen yollayan
kişiyi haberdar ediniz ve mesajı sisteminizden derhal siliniz. Eskişehir
Teknik Üniversitesi, bu mesajın içerdiği bilgilerin doğruluğu veya eksiksiz
olduğu konusunda bir garanti vermemektedir. Bu nedenle, bilgilerin ne
şekilde olursa olsun içeriğinden, iletilmesinden, alınmasından,
saklanmasından Eskişehir Teknik Üniversitesi sorumlu değildir. Bu mesajın
içeriği yazarına ait olup, Eskişehir Teknik Üniversitesi'nin görüşlerini
içermeyebilir. Bu e-posta bizce bilinen tüm bilgisayar virüslerine karşı
taranmıştır.

DISCLAIMER: This e-mail (including any attachments) may contain
confidential and/or privileged information. Copying, disclosure or
distribution of the material in this e-mail without owner authority is
strictly forbidden. If you are not the intended recipient (or have received
this e-mail in error), please notify the sender and delete it from your
system immediately. Eskisehir Technical University makes no warranty as to
the accuracy or completeness of any information contained in this message
and hereby excludes any liability of any kind for the information contained
therein or for the information transmission, reception, storage or use of
such in any way whatsoever. Any opinions expressed in this message are
those of the author and may not necessarily reflect the opinions of
Eskisehir Technical University. This e-mail has been scanned for all
computer viruses known to us.

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing

[R] Weird R Studio behaviour...

2024-07-09 Thread Levent TERLEMEZ via R-help

Hi,

Have a nice week. First of all, I know this is not R Studio forum but I want to 
ask here first, if you all do not mind. Well, I am away from my computer right 
now but, I have a strange problem (at least to me). My script worked perfectly 
for a year, and today, suddenly stop working because R Studio begins to warn me 
about illegal characters in the script.

System is on W 10, and R 3.4.1 is working with R Studio. R Studio is updated 
today to the latest one because of this problem with the hope of resolving the 
problem (but no luck) and they are used as their default installation settings. 
Anyway, the problem example may not be repoducable right now but if it is, I 
can give detailed one later.

While the original working code is this (there is no synax error, too in the 
code because it was working perfectly until today before updating R Studio, 
error also came out before this update as I mensioned before);

legend(c("Kapanış",""20 Günlük,"50 Günlük"),col=c("black"...

The warning is this;

Error: unexpected symbol inside:
" Encoding(kill3) <- "latin1" legend(c(kill1,kill2,"50 G�nl�k"),col=c("black"… 
and can be solved when converted “ü” to “u”. Addition to this another solution 
(at least for me) is this;

kill1<-"Kapanış"
Encoding(kill1) <- "UTF-8" (these two statements are not needed but fort he 
sake of code integrity, is applied to it, too. If kill1 is converted to latin1, 
this time it is broken)
kill2<-"22 Günlük MA"
Encoding(kill2) <- "latin1"
kill3<-"50 Günlük MA"
Encoding(kill3) <- "latin1"

And also it is set to “ASK” and always “UTF-8” is selected.

But, I also wonder why today and what changed so R Studio stops suddenly 
running the script? I can not following up the changes anymore as used to be 
and if this is a character set problem, it is coming back again and again. What 
is the permenant solution of this? This is like an endless problem…

With my best regards and thanks for your patience….

Levent Terlemez.




YASAL UYARI: Bu e-postanın içerdiği bilgiler (ekleri de dahil olmak üzere) 
gizlidir. Sahibinin onayı olmaksızın içeriği kopyalanamaz, üçüncü kişilere 
açıklanamaz veya iletilemez . Bu mesajın gönderilmek istendiği kişi değilseniz 
(ya da bu e-postayı yanlışlıkla aldıysanız), lütfen yollayan kişiyi haberdar 
ediniz ve mesajı sisteminizden derhal siliniz. Eskişehir Teknik Üniversitesi, 
bu mesajın içerdiği bilgilerin doğruluğu veya eksiksiz olduğu konusunda bir 
garanti vermemektedir. Bu nedenle, bilgilerin ne şekilde olursa olsun 
içeriğinden, iletilmesinden, alınmasından, saklanmasından Eskişehir Teknik 
Üniversitesi sorumlu değildir. Bu mesajın içeriği yazarına ait olup, Eskişehir 
Teknik Üniversitesi'nin görüşlerini içermeyebilir. Bu e-posta bizce bilinen tüm 
bilgisayar virüslerine karşı taranmıştır.

DISCLAIMER: This e-mail (including any attachments) may contain confidential 
and/or privileged information. Copying, disclosure or distribution of the 
material in this e-mail without owner authority is strictly forbidden. If you 
are not the intended recipient (or have received this e-mail in error), please 
notify the sender and delete it from your system immediately. Eskisehir 
Technical University makes no warranty as to the accuracy or completeness of 
any information contained in this message and hereby excludes any liability of 
any kind for the information contained therein or for the information 
transmission, reception, storage or use of such in any way whatsoever. Any 
opinions expressed in this message are those of the author and may not 
necessarily reflect the opinions of Eskisehir Technical University. This e-mail 
has been scanned for all computer viruses known to us.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] package spline - default value of Boundary.knots of ns

2024-07-09 Thread Ajith R via R-help

Dear Maintainer,


Thanks for the excellent package splines. I am writing this email to request 
you to consider a suggestion I have with regards to the function ns.


While trying to rework an example from a textbook, I couldn't call ns with 
appropriate arguments to reproduce the results. The package documentation also 
couldn't help me find the problem. Finally, I found a stack exchange question 
(https://stats.stackexchange.com/questions/588769/natural-splines-in-r-with-ns) 
 which helped me understand the problem - the default values of boundary knots 
are not useful. The problem is described in the stack exchange question, which 
I request you to kindly read.


My suggestion is to change the default value of the argument Boundary.knots to 
NULL and calculate its values from  the extreme values of the argument knots 
inside the function body if it is NULL and otherwise to keep whatever numerical 
value it is assigned at call.


I think it is more intuitive to the user to specify one set of knots assuming 
that its minimum and maximum values would be used as the knots beyond which 
regression would be linear rather than to know that the function automatically 
calculates boundary knots which are not appropriate and so he needs to override 
them.


Just so that I am clear, an example. Assume that my variable *alcohol*  has 
values from 1 to 100 and I want to specify natural splines at knots 20,40 and 
60, expecting linearity would hold below 20 and above 60. Currently, I have to 
specify knots as 40 and boundary knots as 20 and 60 as ns(alcohol , knots = 
c(40), Boundary.knots = c(20,60))

If I incorrectly assume that correct values of boundary knots are calculated by 
default, I will specify knots as 20,40 and 60 as ns(alcohol , knots = 
c(20,40,60)) and get incorrect values for boundary knots as 1 and 100. (I have 
done this and the stack exchange post shows that I am not alone).


Hope my suggestion will be considered,


Thanks,
ajith

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Bug? plot.formula does need support plot.first / plot.last param in plot.default

2024-07-06 Thread Ivan Krylov via R-help

В Fri, 05 Jul 2024 14:35:40 +0300
"Erez Shomron"  пишет:

> This works as expected:

> with(mtcars, plot(wt, mpg, plot.first = {
> plot.window(range(wt), range(mpg))
> arrows(3, 15, 4, 30)
> }))

I think you meant panel.first, not plot.first. At least I cannot find
any mention of plot.first in the R source code. In this example,
plot.first ends up being an argument of an internal call from
plot.default() to plot.window(), which evaluates its ellipsis
arguments. If your plot.first expression returned a non-NULL value, you
would also have received a warning:

plot.window(0:1, 0:1, plot.first = message('hello'))
# hello
plot.window(0:1, 0:1, plot.first = 123)
# Warning message:
# In plot.window(0:1, 0:1, plot.first = 123) :
#   "plot.first" is not a graphical parameter

It is indeed documented that "passing [panel.first] from other ‘plot’
methods may well not work since it may be evaluated too early". The
plot.formula method deliberately evaluates the arguments in the
ellipsis, and the workaround suggested in
https://bugs.r-project.org/show_bug.cgi?id=14591 doesn't help because
the expression is then evaluated in an undesired environment (parent
frame, not data).

You are correct that plot.formula tries to evaluate all its remaining
arguments in the context of the data passed to the method. In order for
the lazy evaluation to work, plot.formula would have to (1) know and
skip all such arguments by name on line 6, minding partial matching, (2)
rewrite them into the form evalq(original_argument_expression,
model_frame, parent_frame) so that they would be able to access both
the data and the variables visible in the frame of the caller, and (3)
give these expressions to do.call() in place of the original ones.

(1) sounds especially brittle since plot.formula() may dispatch to
other plot.* methods. Additionally, great care will need to be taken
not to break existing code that calls plot.formula, even if it's
already full of workarounds for plot.formula's behaviour.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] simple problem with unquoting argument

2024-07-03 Thread Ivan Krylov via R-help

В Wed, 3 Jul 2024 10:13:59 +0200
Troels Ring  пишет:

> Now e looks right - but I have been unable to find out how to get the 
> string e converted to the proper argument for sum()  - i.e. what  is 
> function xx?

get(e) will return the value of the variable with the name stored in
the variable e.

A more idiomatic variant will require more changes:

1. Create the "adds" variable as a list, so that it could contain other
arbitrary R values:

adds <- list()

2. Instead of assigning adds1 <- something(), adds2 <-
something_else(), ..., assign to the elements of the list:

adds[[1]] <- something()
adds[[2]] <- something_else()
...

3. Now you can use the same syntax to access the elements of the list:

SS[i] <- sum(adds[[i]])

As a bonus, you can use the "apply" family of R functions that will
perform the loop for you: instead of SS <- c(); for (i in 1:11) SS[i]
<- sum(adds[[i]]) you can write

SS <- vapply(adds, sum, numeric(1))

...and it will perform the same loop inside it, verifying each time
that sum(adds[[i]]) returns a single number.

-- 
Best regards,
Ivan

P.S. I'm sorry for letting our project lapse.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] summaryRprof: Unexpected unit for memory profiling

2024-07-02 Thread Jeff Newmiller via R-help

There was a time when people pretty much ignored the distinction between MB and 
MiB in computer applications, and using the binary version was usually assumed 
because, well, this _is_ memory we are measuring. I think this is a leftover 
from that time.

On July 1, 2024 6:33:43 AM PDT, "Sauer, Lukas Daniel" 
 wrote:
>Hello,
>
>I am profiling memory usage using utils::Rprof() and subsequently summarizing 
>the profile using utils::summaryRprof(). According to the documentation 
>?summaryRprof, the option `memory = "both"` reports "memory consumption in Mb 
>in addition to the timings", i.e. the unit is megabytes. However, looking at 
>the source code 
>(https://github.com/wch/r-source/blob/18652de8890d89563b923ff58b45ccb04d9955fe/src/library/utils/R/summRprof.R#L170)
> suggests that memory is reported in mebibytes (division by 1048576 and not by 
>10). This is in line with the following minimal example:
>
>use_mb <- function(){a <- runif(100)}
>use_mib <- function(){b <- runif(1024^2)}
>Rprof("Rprof.out", memory.profiling=TRUE)
>use_mb()
>use_mib()
>Rprof(NULL)
>summaryRprof("Rprof.out", memory="both")
>
>Do not source this code, but execute it line by line. This example returns the 
>output:
>
>$by.self
>self.time self.pct total.time total.pct mem.total
>"runif"  0.04  100   0.04   100  15.6
>
>$by.total
>  total.time total.pct mem.total self.time self.pct
>"runif" 0.04   100  15.6  0.04  100
>"use_mb"0.0250   7.6  0.000
>"use_mib"   0.0250   8.0  0.000
>
>$sample.interval
>[1] 0.02
>
>$sampling.time
>[1] 0.04
>
>The example was run under:
>
>R version 4.4.0 (2024-04-24 ucrt)
>Platform: x86_64-w64-mingw32/x64
>Running under: Windows 11 x64 (build 22631)
>
>If the unit were megabytes, I would expect mem.total to be 16.4, 8.0, and 8.4 
>-- but rather it is 15.6, 7.6, and 8.0. Do you agree that this behavior is 
>unexpected or did I overlook something? If yes, I will file a bug report and 
>suggest that the documentation is changed to "memory consumption in MiB in 
>addition to the timings".
>
>Best regards,
>
>Lukas D Sauer
>Biometrician
>Institute of Medical Biometry
>
>Heidelberg University Hospital | Im Neuenheimer Feld 130.3 | D-69120 Heidelberg
>Tel. +49 6221 56-35036 | Fax. +49 6221 56-4195 | E-Mail: 
>sa...@imbi.uni-heidelberg.de
>biometrie.uni-heidelberg.de | twitter.com/imbi_heidelberg
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Create matrix with variable number of columns AND CREATE NAMES FOR THE COLUMNS

2024-07-01 Thread Heinz Tuechler via R-help


Sorkin, John wrote/hat geschrieben on/am 01.07.2024 17:54:

#I am trying to write code that will create a matrix with a variable number of 
columns where the #number of columns is 1+Grps
#I can do this:
NSims <- 4
Grps <- 5
DiffMeans <- matrix(nrow=NSims,ncol=1+Grps)
DiffMeans

#I have a problem when I try to name the columns of the matrix. I want the 
first column to be NSims, #and the other columns to be something like Value1, 
Value2, . . . Valuen where N=Grps

# I wrote a function to build a list of length Grps
createValuelist <- function(num_elements) {
  for (i in 1:num_elements) {
cat("Item", i, "\n", sep = "")
  }
}
createValuelist(Grps)

# When I try to assign column names I receive an error:
#Error in dimnames(DiffMeans) <- list(NULL, c("NSim", createValuelist(Grps))) :
# length of 'dimnames' [2] not equal to array extent
dimnames(DiffMeans) <- list(NULL,c("NSim",createValuelist(Grps)))
DiffMeans

# Thank you for your help!


John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center;
PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Paliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382


maybe:
NSims <- 4
Grps <- 5
DiffMeans <-
matrix(nrow=NSims,ncol=1+Grps,
   dimnames=list(NULL, c('Nsimn', paste('Item', 1:Grps, sep=''
DiffMeans

best,
Heinz



______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Create matrix with variable number of columns AND CREATE NAMES FOR THE COLUMNS

2024-07-01 Thread Jeff Newmiller via R-help

I think you should reconsider your goal. Matrices must have all elements of the 
same type, and in this case you seem to be trying to mix a number of something 
(integer) with mean values (double). This would normally be stored together in 
a data frame or separately in a vector for counts and a matrix for means.

If you are just thinking about data presentation, a data frame would be a 
better choice than a single matrix.

On July 1, 2024 8:54:21 AM PDT, "Sorkin, John"  
wrote:
>#I am trying to write code that will create a matrix with a variable number of 
>columns where the #number of columns is 1+Grps
>#I can do this:
>NSims <- 4
>Grps <- 5
>DiffMeans <- matrix(nrow=NSims,ncol=1+Grps)
>DiffMeans
>
>#I have a problem when I try to name the columns of the matrix. I want the 
>first column to be NSims, #and the other columns to be something like Value1, 
>Value2, . . . Valuen where N=Grps
>
># I wrote a function to build a list of length Grps
>createValuelist <- function(num_elements) {
>  for (i in 1:num_elements) {
>cat("Item", i, "\n", sep = "")
>  }
>}
>createValuelist(Grps)
>
># When I try to assign column names I receive an error:
>#Error in dimnames(DiffMeans) <- list(NULL, c("NSim", createValuelist(Grps))) 
>: 
># length of 'dimnames' [2] not equal to array extent
>dimnames(DiffMeans) <- list(NULL,c("NSim",createValuelist(Grps)))
>DiffMeans
>
># Thank you for your help!
>
>
>John David Sorkin M.D., Ph.D.
>Professor of Medicine, University of Maryland School of Medicine;
>Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
>Center Geriatrics Research, Education, and Clinical Center; 
>PI Biostatistics and Informatics Core, University of Maryland School of 
>Medicine Claude D. Pepper Older Americans Independence Center;
>Senior Statistician University of Maryland Center for Vascular Research;
>
>Division of Gerontology and Paliative Care,
>10 North Greene Street
>GRECC (BT/18/GR)
>Baltimore, MD 21201-1524
>Cell phone 443-418-5382
>
>
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 3262 matches

Mail list logo