Re: [R] collided row names in heatmap.2

2021-08-11 Thread Jim Lemon
Hi Elham,
Your image didn't get through, maybe PNG will work. Label crowding is
a common problem, whether horizontal or vertical. One solution is to
set a maximum length on label text (see truncString in the prettyR
package). Others are to stagger labels (staxlab in plotrix) or shift
them apart when crowded (spread.labels in plotrix). These functions
may not work with heatmap.2 (gplots). It will be easier to suggest
something if you can get your image through.

Jim

On Thu, Aug 12, 2021 at 2:07 AM Elham Hooshmand  wrote:
>
> Hi,
>
> I am trying to draw a heatmap for my 45 topvar gene by the use of
> heatmap.2, and when I set a srtRow=45 in my code(below):
>
> heatmap.2( assay(rld)[ topVarGenes, ], srtRow=45, scale="row",trace="none",
> dendrogram="column",col = colorRampPalette( rev(brewer.pal(9, "RdBu"))
> )(255))
>
> row names collided with each other in a messy way.
>
> would you please help me to solve this problem? (also I'm a newbie beginner
> using R Studio) Thank you so much
> (also I attached the output image)
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] collided row names in heatmap.2

2021-08-11 Thread Bert Gunter
1. This is R-help. RStudio is a separate IDE from a private for-profit
company. You should go to their website for help with that:
https://www.rstudio.com/support/

2. I may be wrong, of course, but I believe your information is too
vague for folks to provide useful help: "row names collided with each
other in a messy way,' does not tell us much. What error message did
you get? Please provide a small, reproducible example (perhaps
via head(yourdatset, ...) see e.g. ?head and ?dput and
https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
. You may also wish to download and use the "reprex" package for this
purpose, instead. **See also the posting guide linked below.**

All of us were newbies at one time; take the time to learn these tools
and follow the advice for posting to R-help, and I believe that you
will have much greater success in getting useful help here as you
climb the learning curve.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Wed, Aug 11, 2021 at 9:07 AM Elham Hooshmand  wrote:
>
> Hi,
>
> I am trying to draw a heatmap for my 45 topvar gene by the use of
> heatmap.2, and when I set a srtRow=45 in my code(below):
>
> heatmap.2( assay(rld)[ topVarGenes, ], srtRow=45, scale="row",trace="none",
> dendrogram="column",col = colorRampPalette( rev(brewer.pal(9, "RdBu"))
> )(255))
>
> row names collided with each other in a messy way.
>
> would you please help me to solve this problem? (also I'm a newbie beginner
> using R Studio) Thank you so much
> (also I attached the output image)
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] collided row names in heatmap.2

2021-08-11 Thread Elham Hooshmand
Hi,

I am trying to draw a heatmap for my 45 topvar gene by the use of
heatmap.2, and when I set a srtRow=45 in my code(below):

heatmap.2( assay(rld)[ topVarGenes, ], srtRow=45, scale="row",trace="none",
dendrogram="column",col = colorRampPalette( rev(brewer.pal(9, "RdBu"))
)(255))

row names collided with each other in a messy way.

would you please help me to solve this problem? (also I'm a newbie beginner
using R Studio) Thank you so much
(also I attached the output image)
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Volume 13/1, June 2021 now available

2021-08-11 Thread Dianne Cook
Dear R Community,

The first issue of the R Journal for 2021 is now available at 
https://journal.r-project.org/. 

There is also a dev version of rending articles in html at 
https://journal.r-project.org/dev/. Particularly look at articles by Earo Wang 
for embedded interactive graphics, and by Mike Kane for html rendering. 
Feedback and suggestions are welcome.

Happy reading, and coding!

Regards,
Di

%>%>%>%>%
Professor Dianne Cook
Editor-in-Chief, R Journal
dicook...@gmail.com

___
r-annou...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-announce

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Formula compared to call within model call

2021-08-11 Thread Martin Maechler
> Tim Taylor 
> on Wed, 11 Aug 2021 08:45:48 + writes:

> Manipulating formulas within different models I notice the following:

> m1 <- lm(formula = hp ~ cyl, data = mtcars)
> m2 <- update(m1, formula. = hp ~ cyl)
> all.equal(m1, m2)
> #> [1] TRUE
> identical(m1, m2)
> #> [1] FALSE
> waldo::compare(m1, m2)
> #> `old$call[[2]]` is a call
> #> `new$call[[2]]` is an S3 object of class , a call

> I'm aware formulas are a form of call but what I'm unsure
> of is whether there is meaningful difference between the
> two versions of the models? 

A good question.
In principle, the promise of an update()  method should be to
produce the *same* result as calling the original model-creation
(or more generally object-creation) function call.

So, already with identical(), you've shown that this is not
quite the case for simple lm(),
and indeed that is a bit undesirable.

To answer your question re "meaningful" difference,
given what I say above is:
No, there shouldn't be any relevant difference, and if there is,
that may considered a bug in the respective update() method,
here update.lm.

More about this in the following  R code snippet :

## MM: indeed,
identical(m1$call, m2$call) #> [1] FALSE
noCall <- function(x) x[setdiff(names(x), "call")]
identical(noCall(m1), noCall(m2))# TRUE!
## look closer:
c1 <- m1$call
c2 <- m2$call
str(as.list(c1))
## List of 3
##  $: symbol lm
##  $ formula: language hp ~ cyl
##  $ data   : symbol mtcars

str(as.list(c2))
## List of 3
##  $: symbol lm
##  $ formula:Class 'formula'  language hp ~ cyl
##   .. ..- attr(*, ".Environment")=
##  $ data   : symbol mtcars

identical(c1[-2], c2[-2]) # TRUE ==> so, indeed the difference is *only* in the 
formula ( = [2]) component
f1 <- c1$formula
f2 <- c2$formula
all.equal(f1,f2) # TRUE
identical(f1,f2) # FALSE
## Note that this is typically *not* visible if the user uses the accessor 
functions:
identical(formula(m1), formula(m2)) # TRUE !
## and indeed, the formula() method for 'lm'  does set the environment:
stats:::formula.lm


--
Martin Maechler
ETH Zurich   and  R Core

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sample size Determination to Compare Three Independent Proportions

2021-08-11 Thread AbouEl-Makarim Aboueissa
Hi Marc:


Thank you for your help in this matter.


With thanks
Abou


On Tue, Aug 10, 2021, 9:28 AM Marc Schwartz  wrote:

> Hi,
>
> A search would suggest that there may not be an R function/package that
> provides power/sample size calculations for the specific scenarios that
> you are describing. There may be something that I am missing, and there
> is also other dedicated software such as PASS
> (https://www.ncss.com/software/pass/) which is not free, but provides a
> large library of possibly relevant functions and support.
>
> That being said, you can run Monte Carlo simulations in R to achieve the
> results you want, while providing yourself with options relative to
> study design, intended tests, and adjustments for multiple comparisons
> as apropos. Many prefer this approach, since it gives you specific
> control over this process.
>
> Taking the simple case, where you are going to run a 3 x 2 chi-square as
> your primary endpoint, and want to power for that, here is a possible
> function, with the same sample size in each group:
>
> ThreeGroups <- function(n, p1, p2, p3, R = 1, power = 0.8) {
>
>MCSim <- function(n, p1, p2, p3) {
>  ## Create a binary distribution for each group
>  G1 <- rbinom(n, 1, p1)
>  G2 <- rbinom(n, 1, p2)
>  G3 <- rbinom(n, 1, p3)
>
>  ## Create a 3 x 2 matrix containing the 3 group counts
>  MAT <- cbind(table(G1), table(G2), table(G3))
>
>  ## Perform a chi-square and just return the p value
>  chisq.test(MAT)$p.value
>}
>
>## Replicate the above R times, and get
>## a distribution of p values
>MC <- replicate(R, MCSim(n, p1, p2, p3))
>
>## Get the p value at the desired "power" quantile
>quantile(MC, power)
> }
>
> Essentially, the above internal MCSim() function generates 3 random
> samples of size 'n' from the binomial distribution, at the 3 proportions
> desired. For each run, it will perform a chi-square test of the 3 x 2
> matrix of counts, returning the p value for each run. The main function
> will then return the p value at the quantile (power) within the
> generated distribution of p values.
>
> You can look at the help pages for the various functions that I use
> above, to get a sense for how they work.
>
> You increase the sample size ('n') until you get a p value returned <=
> 0.05, if that is your desired alpha level.
>
> You also want 'R', the number of replications within each run, to be
> large enough so that the returned p value quantile is relatively stable.
> Values for 'R', once you get "close to" the desired p value should be on
> the order of 1,000,000 or higher. Stay with lower values for 'R' until
> you get in the ballpark of your target, since larger values take much
> longer to run.
>
> Thus, using your example proportions of 0.25, 0.25, and 0.35:
>
> ## 250 per group, 750 total - Not enough
>  > ThreeGroups(250, 0.25, 0.25, 0.35, R = 1)
> 80%
> 0.08884723
>
> ## 350 per group, 1050 total - Too high
>  > ThreeGroups(350, 0.25, 0.25, 0.35, R = 1)
>80%
> 0.0270829
>
> ## 300 per group, 900 total - Close!
>  > ThreeGroups(300, 0.25, 0.25, 0.35, R = 1)
> 80%
> 0.04818842
>
>
> So, keep tweaking the sample size until you get a returned p value at
> your target alpha level, with a large enough 'R', so that you get
> consistent sample sizes for multiple runs.
>
> If I run 300 per group again, with 10,000 replicates:
>
>  > ThreeGroups(300, 0.25, 0.25, 0.35, R = 1)
> 80%
> 0.05033933
>
> the returned p value is slightly higher. So, again, increase R to
> improve the stability of the returned p value and run it multiple times
> to be comfortable that the p value change is less than an acceptable
> threshold.
>
> Now, the tricky part is to decide if the 3 x 2 is your primary endpoint,
> and want to power only for that, or, if you also want to power for the
> other two-group comparisons, possibly having to account for p value
> adjustments for the multiple comparisons, resulting in the need to power
> for a lower alpha level for those tests. In that scenario, you would end
> up taking the largest sample size that you identify across the various
> hypotheses, recognizing that while you are powering for one hypothesis,
> you may be overpowering for others.
>
> That is something that you need to decide, and perhaps consider
> consulting with other local statistical expertise, as may be apropos, in
> the prospective study design, possibly influenced by other
> relevant/similar research in your domain.
>
> You can easily modify the above function for the two-group scenario as
> well, and I will leave that to you.
>
> Regards,
>
> Marc
>
>
> AbouEl-Makarim Aboueissa wrote on 8/10/21 6:34 AM:
> > Hi Marc:
> >
> > First, thank you very much for your help in this matter.
> >
> >
> > Will perform an initial omnibus test of all three groups (e.g. 3 x 2
> > chi-square), possibly followed by
> > all possible 2 x 2 pairwise comparisons (e.g. 1 versus 2, 1 

[R] Formula compared to call within model call

2021-08-11 Thread Tim Taylor
Manipulating formulas within different models I notice the following:

m1 <- lm(formula = hp ~ cyl, data = mtcars)
m2 <- update(m1, formula. = hp ~ cyl)
all.equal(m1, m2)
#> [1] TRUE
identical(m1, m2)
#> [1] FALSE
waldo::compare(m1, m2)
#> `old$call[[2]]` is a call
#> `new$call[[2]]` is an S3 object of class , a call

I'm aware formulas are a form of call but what I'm unsure of is whether there 
is meaningful difference between the two versions of the models? Any 
clarification, even just on the relation between formulas and calls would be 
useful.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] assigning suitability index value

2021-08-11 Thread Marna Wagley
Thank you Jeff. I think the code you wrote works. The value I put in the
output was just guessing by looking at the graph.
Thank you once again Jeff.

temp<-structure(list(X = c(1468285.96, 1468476.96, 1468479.96, 1468482.96,
1468485.96, 1468467.96, 1468470.96, 1468473.96, 1468476.96, 1468479.96,
1468482.96, 1468485.96, 1468458.96, 1468461.96, 1468464.96, 1468467.96,
1468470.96, 1468473.96, 1468476.96), Y = c(415099.27, 415096.27, 415096.27,
415096.27, 415096.27, 415093.27, 415093.27, 415093.27, 415093.27, 415093.27,
415093.27, 415093.27, 415090.27, 415090.27, 415090.27, 415090.27, 415090.27,
415090.27, 415090.27), temp = c(1.959473, 15.092773, 15.128174, 14.368896,
9.892578, 15.720215, 15.767822, 15.26001, 14.642334, 14.6521, 13.916016,
10.3479, 16.052246, 16.094971, 15.167236, 15.455322, 15.472412, 24.741211,
14.755859)), class = "data.frame", row.names = c(NA, -19L))


print(temp)


table2<-structure(list(temp = c(0L, 10L, 15L, 17L, 25L, 30L), Index = c(0,
 0.3, 1, 1, 0.5, 0)), class = "data.frame", row.names = c(NA, -6L))

print(table2)


ggplot(data=table2, aes(x=temp, y=Index)) +geom_path()+geom_point()



Output<-structure(list(X = c(1468285.96, 1468476.96, 1468479.96, 1468482.96,
1468485.96, 1468467.96, 1468470.96, 1468473.96, 1468476.96, 1468479.96,
1468482.96, 1468485.96, 1468458.96, 1468461.96, 1468464.96, 1468467.96,
1468470.96, 1468473.96, 1468476.96), Y = c(415099.27, 415096.27, 415096.27,
415096.27, 415096.27, 415093.27, 415093.27, 415093.27, 415093.27, 415093.27,
415093.27, 415093.27, 415090.27, 415090.27, 415090.27, 415090.27, 415090.27,
415090.27, 415090.27), temp = c(1.959473, 0.092773, 15.128174, 14.368896,
9.892578, 15.720215, 15.767822, 15.26001, 14.642334, 14.6521, 13.916016,
10.3479, 16.052246, 16.094971, 15.167236, 15.455322, 15.472412, 24.741211,
14.755859), index = c(0.012, 0.001, 1, 0.9, 0.31, 1, 1, 1, 0.91, 0.921,
0.824, 0.254, 1, 1, 1, 1, 1, 0.652, 0.93)), class="data.frame", row.names =
c(NA, -19L))


print(Output)

On Tue, Aug 10, 2021 at 11:16 PM Jeff Newmiller 
wrote:

> Piecewise linear interpolation is implemented in the ?approx function. It
> does not agree exactly with your Output, I don't know if there is something
> else you are accounting for it if your Output is in error.
>
> temp$index <- approx( table2$temp, table2$Index, temp$temp )$y
>
> BTW your code was usable but messed up... please set your email program to
> send plain text email so your formatting does not mess with your code.
>
>
> On August 10, 2021 10:30:57 PM PDT, Marna Wagley 
> wrote:
> >Hi R Users,
> >I have two tables, one is temperature data (temp) and another table is a
> >suitability index. I wanted to assign the suitability index value in the
> >temperature data (temp) based on Table 2 (or graph, which is a suitability
> >curve), but I could not figure it out.
> >Are there any suggestions for me how I can assign the suitability index
> >value in table1 (temp) based on the suitability graph? I have a very big
> >data set but showing only a few data to illustrate the problem.
> >
> >temp<-structure(list(X = c(1468285.96, 1468476.96, 1468479.96, 1468482.96,
> >
> >1468485.96, 1468467.96, 1468470.96, 1468473.96, 1468476.96, 1468479.96,
> >
> >1468482.96, 1468485.96, 1468458.96, 1468461.96, 1468464.96, 1468467.96,
> >
> >1468470.96, 1468473.96, 1468476.96), Y = c(415099.27, 415096.27,
> >
> >415096.27, 415096.27, 415096.27, 415093.27, 415093.27, 415093.27,
> >
> >415093.27, 415093.27, 415093.27, 415093.27, 415090.27, 415090.27,
> >
> >415090.27, 415090.27, 415090.27, 415090.27, 415090.27), temp = c(1.959473,
> >
> >15.092773, 15.128174, 14.368896, 9.892578, 15.720215, 15.767822,
> >
> >15.26001, 14.642334, 14.6521, 13.916016, 10.3479, 16.052246,
> >
> >16.094971, 15.167236, 15.455322, 15.472412, 24.741211, 14.755859
> >
> >)), class = "data.frame", row.names = c(NA, -19L))
> >
> >
> >print(temp)
> >
> >
> >table2<-structure(list(temp = c(0L, 10L, 15L, 17L, 25L, 30L), Index = c(0,
> >
> >0.3, 1, 1, 0.5, 0)), class = "data.frame", row.names = c(NA,
> >
> >-6L))
> >
> >print(table2)
> >
> >
> >ggplot(data=table2, aes(x=temp, y=Index)) +
> >
> >  geom_path()+
> >
> >  geom_point()
> >
> >
> >
> ># now I would like to assign the index value of table 2 into table 1
> >(temp), and I was looking for the following table as an output. The index
> >value in the output I put manually.
> >
> >
> >Output<-structure(list(X = c(1468285.96, 1468476.96, 1468479.96,
> 1468482.96,
> > 1468485.96, 1468467.96, 1468470.96, 1468473.96, 1468476.96, 1468479.96,
> >
> >1468482.96, 1468485.96, 1468458.96, 1468461.96, 1468464.96, 1468467.96,
> >
> >1468470.96, 1468473.96, 1468476.96), Y = c(415099.27, 415096.27,
> >
> >415096.27, 415096.27, 415096.27, 415093.27, 415093.27, 415093.27,
> >
> >415093.27, 415093.27, 415093.27, 415093.27, 415090.27, 415090.27,
> >
> >415090.27, 415090.27, 415090.27, 415090.27, 415090.27), temp = c(1.959473,
> >
> >0.092773, 15.128174, 14.368896, 9.892578, 15.720215, 15.767822,
> >
> >15.26001, 

Re: [R] assigning suitability index value

2021-08-11 Thread Jeff Newmiller
Piecewise linear interpolation is implemented in the ?approx function. It does 
not agree exactly with your Output, I don't know if there is something else you 
are accounting for it if your Output is in error.

temp$index <- approx( table2$temp, table2$Index, temp$temp )$y

BTW your code was usable but messed up... please set your email program to send 
plain text email so your formatting does not mess with your code.


On August 10, 2021 10:30:57 PM PDT, Marna Wagley  wrote:
>Hi R Users,
>I have two tables, one is temperature data (temp) and another table is a
>suitability index. I wanted to assign the suitability index value in the
>temperature data (temp) based on Table 2 (or graph, which is a suitability
>curve), but I could not figure it out.
>Are there any suggestions for me how I can assign the suitability index
>value in table1 (temp) based on the suitability graph? I have a very big
>data set but showing only a few data to illustrate the problem.
>
>temp<-structure(list(X = c(1468285.96, 1468476.96, 1468479.96, 1468482.96,
>
>1468485.96, 1468467.96, 1468470.96, 1468473.96, 1468476.96, 1468479.96,
>
>1468482.96, 1468485.96, 1468458.96, 1468461.96, 1468464.96, 1468467.96,
>
>1468470.96, 1468473.96, 1468476.96), Y = c(415099.27, 415096.27,
>
>415096.27, 415096.27, 415096.27, 415093.27, 415093.27, 415093.27,
>
>415093.27, 415093.27, 415093.27, 415093.27, 415090.27, 415090.27,
>
>415090.27, 415090.27, 415090.27, 415090.27, 415090.27), temp = c(1.959473,
>
>15.092773, 15.128174, 14.368896, 9.892578, 15.720215, 15.767822,
>
>15.26001, 14.642334, 14.6521, 13.916016, 10.3479, 16.052246,
>
>16.094971, 15.167236, 15.455322, 15.472412, 24.741211, 14.755859
>
>)), class = "data.frame", row.names = c(NA, -19L))
>
>
>print(temp)
>
>
>table2<-structure(list(temp = c(0L, 10L, 15L, 17L, 25L, 30L), Index = c(0,
>
>0.3, 1, 1, 0.5, 0)), class = "data.frame", row.names = c(NA,
>
>-6L))
>
>print(table2)
>
>
>ggplot(data=table2, aes(x=temp, y=Index)) +
>
>  geom_path()+
>
>  geom_point()
>
>
>
># now I would like to assign the index value of table 2 into table 1
>(temp), and I was looking for the following table as an output. The index
>value in the output I put manually.
>
>
>Output<-structure(list(X = c(1468285.96, 1468476.96, 1468479.96, 1468482.96,
> 1468485.96, 1468467.96, 1468470.96, 1468473.96, 1468476.96, 1468479.96,
>
>1468482.96, 1468485.96, 1468458.96, 1468461.96, 1468464.96, 1468467.96,
>
>1468470.96, 1468473.96, 1468476.96), Y = c(415099.27, 415096.27,
>
>415096.27, 415096.27, 415096.27, 415093.27, 415093.27, 415093.27,
>
>415093.27, 415093.27, 415093.27, 415093.27, 415090.27, 415090.27,
>
>415090.27, 415090.27, 415090.27, 415090.27, 415090.27), temp = c(1.959473,
>
>0.092773, 15.128174, 14.368896, 9.892578, 15.720215, 15.767822,
>
>15.26001, 14.642334, 14.6521, 13.916016, 10.3479, 16.052246,
>
>16.094971, 15.167236, 15.455322, 15.472412, 24.741211, 14.755859
>
>), index = c(0.012, 0.001, 1, 0.9, 0.31, 1, 1, 1, 0.91, 0.921,
>
>0.824, 0.254, 1, 1, 1, 1, 1, 0.652, 0.93)), class = "data.frame", row.names
>= c(NA,
>
>-19L))
>
>
>print(Output)
>
>
>Thank you very much for your help.
>
>MW
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.