Re: [R] suggestion for optimal plotting to show significant differences

Richard M. Heiberger Thu, 12 Feb 2015 12:29:50 -0800

## The next step is to think of this in the analysis of covariance setting.


## this example is based on

library(HH)

demo("ancova", package="HH", ask=FALSE)

item<-rep(letters[1:8], each=18)
day<-rep((0:5)*100, 24)
set<-rep(rep(1:3, each=6), 8)
test<-data.frame(item, day, set)
set.seed(111)
test$value<-(test$day/100+1)+rnorm(144)
test$value<-test$value+(as.numeric(test$item)*1.3)
test$dayf <- factor(test$day)


Fitalphabeta <- aov(value ~ day, data=test)
AB <- xyplot(value ~ day | item, data=test, layout=c(8,1),
scales=list(alternating=1),
             main="regression, ignoring item\ncommon slope, common intercept") +
               layer(panel.abline(coef(Fitalphabeta)))
AB

Fitalphabeta0 <- aov(value ~ 0 + item, data=test)
AB0 <- xyplot(value ~ day | item, data=test, layout=c(8,1),
scales=list(alternating=1),
              main="anova, ignoring day\nzero slope, variable intercept") +
                layer(panel.abline(a=coef(Fitalphabeta0)[panel.number()], b=0))
AB0

Fitalphaibeta <- aov(value ~ 0 + item + day, data=test)
AiB <- xyplot(value ~ day | item, data=test, layout=c(8,1),
scales=list(alternating=1),
              main="ancova\ncommon slope, variable intercept") +

layer(panel.abline(a=coef(Fitalphaibeta)[panel.number()],
b=rev(coef(Fitalphaibeta))[1]))
AiB

Fitalphaibetaj <- aov(value ~ 0 + item / day, data=test)
AiBj <- xyplot(value ~ day | item, data=test, layout=c(8,1),
scales=list(alternating=1),
               main="ancova with interaction\nvariable slope, variable
intercept") +

layer(panel.abline(a=coef(Fitalphaibetaj)[panel.number()],
b=coef(Fitalphaibetaj)[8 + panel.number()]))
AiBj

## this is very tight on the standard 7x7 plotting surface.
## it will look ok with 10in wide and 7in high
## pdf("AB2x2.pdf", height=7, width=10)
print(AB,   split=c(1, 1, 2, 2), more=TRUE)
print(AB0,  split=c(1, 2, 2, 2), more=TRUE)
print(AiB,  split=c(2, 1, 2, 2), more=TRUE)
print(AiBj, split=c(2, 2, 2, 2), more=FALSE)
## dev.off()

## this arrangement matches the arrangement in demo("ancova")
## this too is very tight on the standard 7x7 plotting surface.
## it will look ok with 10in wide and 10in high
## pdf("AB2x3.pdf", height=10, width=10)
print(AB,   split=c(1, 2, 2, 3), more=TRUE)
print(AB0,  split=c(2, 3, 2, 3), more=TRUE)
print(AiB,  split=c(2, 2, 2, 3), more=TRUE)
print(AiBj, split=c(2, 1, 2, 3), more=FALSE)
## dev.off()


On Thu, Feb 12, 2015 at 9:48 AM, PIKAL Petr <petr.pi...@precheza.cz> wrote:
> Hi Rich
>
>> -----Original Message-----
>> From: Richard M. Heiberger [mailto:r...@temple.edu]
>> Sent: Wednesday, February 11, 2015 10:53 PM
>> To: PIKAL Petr
>> Cc: r-help@r-project.org
>> Subject: Re: [R] suggestion for optimal plotting to show significant
>> differences
>>
>> Petr,
>>
>> My first attempt is to use the simple=TRUE argument to interaction2wt.
>>
>> Then the bwplots in the item|item panel show the behavior of value over
>> day for each item.  You get a plot similar to this panel with the
>> growth curve plots from nlme, for example,
>>     bwplot(value ~ day | item, data=test, horizontal=FALSE) I am
>> treating set as a replication and each box is cumulated over the three
>> sets.
>
> Yes, that is the point - set is actually replication of result.
>
>>
>> My analysis question is about day.  You have it as numeric.  My
>> inclination would be to make day factor.  Then you could model the
>> interaction of day and item.
>
> Hm. I hoped I can do
>
> fit<-lm(value~day*item, data=test)
> summary(fit)
>
> in which case I can compare differences in intercepts and/or slopes for each 
> item.
>
> However I am rather lost in aov
>
> test$dayf <- factor(test$day)
>
> fit1 <- aov(value~item+dayf, data=test)
> summary(fit)
> fit2 <- aov(value~item/dayf, data=test)
> summary(fit)
> and
> fit3 <- aov(value~item*dayf, data=test)
> summary(fit)
> which gives bascally the same result as fit2
>
>> anova(fit1, fit2)
> Analysis of Variance Table
>
> Model 1: value ~ item + dayf
> Model 2: value ~ item/dayf
>   Res.Df    RSS Df Sum of Sq      F Pr(>F)
> 1    131 160.59
> 2     96 128.60 35    31.993 0.6824 0.8993
>
> TukeyHSD(fit1, which="item")
> TukeyHSD(fit2, which="item")
>
> Both models seems to give quite similar results and I am not sure what 
> actually differs in those models. I believe that model2 tests each item 
> within particular day (but I am not sure about it).
>
> However this discussion is probably deviating more to the statistics issue 
> than to R itself.
>
> I just thought that somebody helps me with a method for comparison of item 
> performance in case that relation of value to day is not simple linear (as in 
> my example) and cannot be expressed by some formula (like examples in nlme).
>
> So far the best options are either your bwplot or my ggplots.
>
> p<-ggplot(test, aes(x=day, y=value, colour=item))
> p+geom_point()+stat_smooth(method="lm", formula= y~poly(x,2))
>
> p<-ggplot(test, aes(x=dayf, y=value, colour=item))
> p+geom_boxplot()
>
> p<-ggplot(test, aes(x=item, y=value))
> p+geom_boxplot()+facet_wrap(~day)
>
> Thank you for your suggestions which directed me to interaction plots which I 
> was not aware of.
>
> Best regards
> Petr
>
>
>>
>> Rich
>>
>> On Mon, Feb 9, 2015 at 6:01 AM, PIKAL Petr <petr.pi...@precheza.cz>
>> wrote:
>> > Hallo Richard.
>> >
>> > I tried your suggestion but it seems to be no better than simple
>> ggplot. Let me extend the example a bit to 8 items which is more
>> realistic.
>> >
>> > item<-rep(letters[1:8], each=18)
>> > day<-rep((0:5)*100, 24)
>> > set<-rep(rep(1:3, each=6), 8)
>> > test<-data.frame(item, day, set)
>> > set.seed(111)
>> > test$value<-(test$day/100+1)+rnorm(144)
>> > test$value<-test$value+(as.numeric(test$item)*1.3)
>> >
>> > Value is increasing during time (day) for each tested subject (item),
>> each item is measured 3 times (set) each day.
>> >
>> > Here is some graph
>> > p<-ggplot(test, aes(x=day, y=value, colour=item))
>> > p+geom_point()+stat_smooth(method="lm", formula= y~poly(x,2))
>> >
>> > I can do lm or aov, however I am not sure about proper formula.
>> >
>> > fit<-lm(value~day, data=test)
>> > summary(fit)
>> > # this shows that value is increasing with day
>> >
>> > fit<-lm(value~day/item, data=test)
>> > summary(fit)
>> > # this suggests that value is decreasing with day (which is wrong)
>> >
>> > fit<-lm(value~day*item, data=test)
>> > summary(fit)
>> > # and this tells me that value is increasing with day and items have
>> different intercepts but the same rate of growth (I hope I got it
>> right).
>> >
>> > I do not have your book available but I went through help pages.
>> >
>> > Your interaction graph is not much better than ggplot.
>> > I can do
>> >
>> > interaction2wt(value ~ item * day, data=test)
>> >
>> > which probably is closer to actual problem.
>> >
>> > The basic problem is that increase of value with days is in fact not
>> linear and actually it can increase in the beginning and then stagnate
>> or it can stagnate in beginning and then increase. I am not aware of
>> any way how to compare time behaviour of different items in such
>> situations if I cannot state some common formula in which case I would
>> use probably nlme.
>> >
>> > Thank for your insight, I try to go through it more deeply.
>> >
>> > Best regards
>> > Petr
>> >
>> >
>> >> -----Original Message-----
>> >> From: Richard M. Heiberger [mailto:r...@temple.edu]
>> >> Sent: Friday, February 06, 2015 6:14 PM
>> >> To: PIKAL Petr
>> >> Cc: r-help@r-project.org
>> >> Subject: Re: [R] suggestion for optimal plotting to show significant
>> >> differences
>> >>
>> >> I would try one of these illustrations for starts.
>> >> interaction2wt (two-way tables) is designed to be used with aov()
>> for
>> >> testing.
>> >> interaction2wt shows all main effects and all two-way interactions
>> >> for many factors.
>> >>
>> >>
>> >>
>> >> test <-
>> >> structure(list(item = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>> 1L,
>> >> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>> >> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class
>> >> = "factor"), day = c(0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L,
>> >> 200L, 300L, 400L, 500L, 0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L,
>> >> 200L, 300L, 400L, 500L, 0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L,
>> >> 200L, 300L, 400L, 500L), set = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
>> >> 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
>> >> 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), value = c(1.08163365169503,
>> >> 2.61998412608805, 3.07820466606394, 4.44993419381934,
>> >> 5.29163171545805, 6.29155990999293, -0.123163011367676,
>> >> 2.07767236834003, 2.32537052874901, 3.09372794501084,
>> >> 6.65273721166635, 5.92304962329131, 1.50504697705548,
>> >> 2.66253728086866, 2.63420157418685, 2.78195098580416,
>> >> 6.47578642973288, 5.89587443775143, 0.848864231485078,
>> >> 1.27549677119713, 2.19573089053609, 2.45659926134292,
>> >> 5.15424403414103, 5.4813151140983, 1.25731482647214,
>> >> 2.09662105167973, 1.75954023316977, 4.81624002288939,
>> >> 4.65029189325307, 6.39946904227214, 0.944996929887344,
>> >> 1.74667265331284, 2.42956264345558, 5.17852980415141,
>> >> 3.5453435965834, 6.9011238437191)), .Names = c("item", "day", "set",
>> >> "value"), row.names = c(NA, -36L), class =
>> >> "data.frame")
>> >>
>> >>
>> >>
>> >> library(HH)
>> >>
>> >> test$set <- factor(test$set)
>> >> test$day <- factor(test$day)
>> >> test$item <- factor(test$item)
>> >>
>> >> interaction2wt(value ~ item * day * set, data=test)
>> >>
>> >> test$item.day <- interaction(test$item, test$day)
>> >> position(test$item.day) <- outer(c(-10,10),
>> >> as.numeric(levels(test$day)), `+`)
>> >>
>> >> xyplot(value ~ as.position(item.day) | set, groups=item,
>> >>         data=test, horizontal=FALSE, pch=c(17,16),
>> >>         xlab="day",
>> >>         scales=list(
>> >>           x=list(
>> >>             alternating=1,
>> >>             at=levels(test$day), ## placement of tick labels and
>> marks
>> >>             tck=1)),
>> >>         key=list(
>> >>           text=list(c("A","B"), col=c("blue","red")),
>> >>           points=list(pch=c(17, 16), col=c("blue","red")),
>> >>        space="top", columns=2, border=TRUE),
>> >>        layout=c(3,1))
>> >>
>> >>
>> >> ## see also the examples in
>> >> demo(package="HH", bwplot.examples)
>> >>
>> >> On Fri, Feb 6, 2015 at 6:09 AM, PIKAL Petr <petr.pi...@precheza.cz>
>> >> wrote:
>> >> > Dear all
>> >> >
>> >> > I would like to ask for your opinion about possible graphical
>> >> representation of such data.
>> >> >
>> >> >> dput(test)
>> >> > structure(list(item = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>> >> > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
>> 2L,
>> >> > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"),
>> >> > class
>> >> =
>> >> > "factor"), day = c(0L, 100L, 200L, 300L, 400L, 500L, 0L, 100L,
>> >> > 200L, 300L, 400L, 500L, 0L, 100L, 200L, 300L, 400L, 500L, 0L,
>> 100L,
>> >> > 200L, 300L, 400L, 500L, 0L, 100L, 200L, 300L, 400L, 500L, 0L,
>> 100L,
>> >> > 200L, 300L, 400L, 500L), set = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
>> >> > 2L, 2L,
>> >> 2L,
>> >> > 2L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
>> 2L,
>> >> > 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), value = c(1.08163365169503,
>> >> > 2.61998412608805, 3.07820466606394, 4.44993419381934,
>> >> > 5.29163171545805, 6.29155990999293, -0.123163011367676,
>> >> > 2.07767236834003, 2.32537052874901, 3.09372794501084,
>> >> > 6.65273721166635, 5.92304962329131, 1.50504697705548,
>> >> > 2.66253728086866, 2.63420157418685, 2.78195098580416,
>> >> > 6.47578642973288, 5.89587443775143, 0.848864231485078,
>> >> > 1.27549677119713, 2.19573089053609, 2.45659926134292,
>> >> > 5.15424403414103, 5.4813151140983, 1.25731482647214,
>> >> 2.09662105167973,
>> >> > 1.75954023316977, 4.81624002288939, 4.65029189325307,
>> >> > 6.39946904227214, 0.944996929887344, 1.74667265331284,
>> >> > 2.42956264345558, 5.17852980415141, 3.5453435965834,
>> >> > 6.9011238437191)), .Names = c("item", "day", "set", "value"),
>> >> > row.names = c(NA, -36L), class = "data.frame")
>> >> >>
>> >> >
>> >> > One option I came with is
>> >> >
>> >> > library(ggplot2)
>> >> > p<-ggplot(test, aes(x=day, y=value, colour=item))
>> >> > p+geom_point()+stat_smooth(method="lm", formula= y~poly(x,2))
>> >> >
>> >> > but -
>> >> > I have more items (around 5-10), and I want to show if the
>> >> > difference
>> >> between items is or is not significant. The actual development of
>> >> value with day is usually not linear nor growing steadily and
>> >> actually I cannot usually evaluate some analytical equation for my
>> >> data to compare equation parameters.
>> >> >
>> >> > I thought about boxplots, but there is not many repetitions and
>> >> actually 5+ boxplots can be quite messy.
>> >> >
>> >> > I can plot only mean for each set and item but in that case I lose
>> >> information if the difference is or is not significant.
>> >> >
>> >> > I appreciate any suggestion.
>> >> >
>> >> > Best regards
>> >> > Petr
>> >> >
>
> ________________________________
> Tento e-mail a jakékoliv k němu připojené dokumenty jsou důvěrné a jsou 
> určeny pouze jeho adresátům.
> Jestliže jste obdržel(a) tento e-mail omylem, informujte laskavě neprodleně 
> jeho odesílatele. Obsah tohoto emailu i s přílohami a jeho kopie vymažte ze 
> svého systému.
> Nejste-li zamýšleným adresátem tohoto emailu, nejste oprávněni tento email 
> jakkoliv užívat, rozšiřovat, kopírovat či zveřejňovat.
> Odesílatel e-mailu neodpovídá za eventuální škodu způsobenou modifikacemi či 
> zpožděním přenosu e-mailu.
>
> V případě, že je tento e-mail součástí obchodního jednání:
> - vyhrazuje si odesílatel právo ukončit kdykoliv jednání o uzavření smlouvy, 
> a to z jakéhokoliv důvodu i bez uvedení důvodu.
> - a obsahuje-li nabídku, je adresát oprávněn nabídku bezodkladně přijmout; 
> Odesílatel tohoto e-mailu (nabídky) vylučuje přijetí nabídky ze strany 
> příjemce s dodatkem či odchylkou.
> - trvá odesílatel na tom, že příslušná smlouva je uzavřena teprve výslovným 
> dosažením shody na všech jejích náležitostech.
> - odesílatel tohoto emailu informuje, že není oprávněn uzavírat za společnost 
> žádné smlouvy s výjimkou případů, kdy k tomu byl písemně zmocněn nebo písemně 
> pověřen a takové pověření nebo plná moc byly adresátovi tohoto emailu 
> případně osobě, kterou adresát zastupuje, předloženy nebo jejich existence je 
> adresátovi či osobě jím zastoupené známá.
>
> This e-mail and any documents attached to it may be confidential and are 
> intended only for its intended recipients.
> If you received this e-mail by mistake, please immediately inform its sender. 
> Delete the contents of this e-mail with all attachments and its copies from 
> your system.
> If you are not the intended recipient of this e-mail, you are not authorized 
> to use, disseminate, copy or disclose this e-mail in any manner.
> The sender of this e-mail shall not be liable for any possible damage caused 
> by modifications of the e-mail or by delay with transfer of the email.
>
> In case that this e-mail forms part of business dealings:
> - the sender reserves the right to end negotiations about entering into a 
> contract in any time, for any reason, and without stating any reasoning.
> - if the e-mail contains an offer, the recipient is entitled to immediately 
> accept such offer; The sender of this e-mail (offer) excludes any acceptance 
> of the offer on the part of the recipient containing any amendment or 
> variation.
> - the sender insists on that the respective contract is concluded only upon 
> an express mutual agreement on all its aspects.
> - the sender of this e-mail informs that he/she is not authorized to enter 
> into any contracts on behalf of the company except for cases in which he/she 
> is expressly authorized to do so in writing, and such authorization or power 
> of attorney is submitted to the recipient or the person represented by the 
> recipient, or the existence of such authorization is known to the recipient 
> of the person represented by the recipient.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] suggestion for optimal plotting to show significant differences

Reply via email to