[R] how to overlay 2d pdf atop scatter plot using ggplot2

2017-10-08 Thread Big Floppy Dog
Note: I have posted this on SO also but while the question has been
upvoted, there has been no answer yet.

https://stackoverflow.com/questions/46622243/ggplot-plot-2d-probability-density-function-on-top-of-points-on-ggplot

Apologies to those who have seen it there also but I thought that this
list of experts may have someone who knows the answer.

I have the following example code:



require(mvtnorm)
require(ggplot2)
set.seed(1234)
xx <- data.frame(rmvt(100, df = c(13, 13)))
ggplot(data = xx,  aes(x = X1, y= X2)) + geom_point() + geom_density2d()



It yields a scatterplot of X2 against X1 and a KDE contour plot of the
density (as it should).

My question is: is it possible to change the contour plot to display
the contours of a two-dimensional density function (say dmvt), using
ggplot2?

The remaining figures in my document are in ggplot2 and therefore I
am looking for a ggplot2 solution.

Thanks in advance!

BFD

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] example of geom_contour() with function argument

2017-10-09 Thread Big Floppy Dog
Hi,

This is not a HW problem, sadly: I was last in a classroom 30 years ago,
and can no longer run off to the instructor :-(

I apologize, but I cut and pasted the wrong snippet earlier and made a typo
in doing so. The result is the same with the more appropriate snippet.

require(mvtnorm)
require(ggplot2)
set.seed(1234)
xx <- data.frame(rmvt(100, df = c(13, 13)))

v <- ggplot(data = xx, aes(x = X1, y = X2, z = dmvt, df = c(13,13)))
v + geom_contour()

Don't know how to automatically pick scale for object of type function.
Defaulting to continuous.
Error: Aesthetics must be either length 1 or the same as the data (100): x,
y, z, df

I do not understand how to put in a function as an argument to
geom_contour(), and the examples in the help file or in the link that Ulrik
sent are not very helpful to me. Hence, I was asking for some examples that
might be helpful.

I guess the answer is to make a second dataset that is regular and make the
function estimate that, but how do I combine this?

TIA.
BFD


On Mon, Oct 9, 2017 at 11:32 AM, David Winsemius <dwinsem...@comcast.net>
wrote:

>
> > On Oct 9, 2017, at 6:03 AM, Big Floppy Dog <bigfloppy...@gmail.com>
> wrote:
> >
> > Hello Ulrik,
> >
> > I apologize, but I can not see how to provide a pdf in place of the
> density
> > function which calculates a KDE (that is, something from the dataset in
> the
> > example). Can you please point to the specific example that might help?
> >
> > Here is what I get:
> >
> > require(mvtnorm)
> > require(ggplot2)
> > set.seed(1234)
> > xx <- data.frame(rmvt(100, df = c(13, 13)))
> >
> >
> > v <- ggplot(faithfuld, aes(waiting, eruptions, z = drmvt, df = c(13,13)))
> > v + geom_contour()
> >
> > Don't know how to automatically pick scale for object of type function.
> > Defaulting to continuous.
> > Error: Aesthetics must be either length 1 or the same as the data (5625):
> > x, y, z, df
> >
>
> That's not what I get:
>
> > v <- ggplot(faithfuld, aes(waiting, eruptions, z = drmvt, df = c(13,13)))
> > v + geom_contour()
> Error in FUN(X[[i]], ...) : object 'drmvt' not found
> >
> > ? faithfuld
> > str(faithfuld)
> Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   5625 obs. of  3 variables:
>  $ eruptions: num  1.6 1.65 1.69 1.74 1.79 ...
>  $ waiting  : num  43 43 43 43 43 43 43 43 43 43 ...
>  $ density  : num  0.00322 0.00384 0.00444 0.00498 0.00542 ...
>
> So you are apparently trying to throw together code and data that you
> don't understand. The data you are using is already a density estimate
> designed to simply be plotted. It is not the original data. Furthermore you
> are passing drmvt that is apparently not in either the mvtnorm nor the
> ggplot2 packages.
>
> You should determine where that function is and then determine how to do a
> 2d estimate on the original data. I'm guessing this is homework so not
> inclined to offer a complete solution.
>
> --
> David.
>

Re: [R] example of geom_contour() with function argument

2017-10-09 Thread Big Floppy Dog
Hello Ulrik,

I apologize, but I can not see how to provide a pdf in place of the density
function which calculates a KDE (that is, something from the dataset in the
example). Can you please point to the specific example that might help?

Here is what I get:

require(mvtnorm)
require(ggplot2)
set.seed(1234)
xx <- data.frame(rmvt(100, df = c(13, 13)))


v <- ggplot(faithfuld, aes(waiting, eruptions, z = drmvt, df = c(13,13)))
v + geom_contour()

Don't know how to automatically pick scale for object of type function.
Defaulting to continuous.
Error: Aesthetics must be either length 1 or the same as the data (5625):
x, y, z, df


Can you please tell me how to use this here? Or is some other example more
appropriate?

TIA,
BFD



On Mon, Oct 9, 2017 at 2:22 AM, Ulrik Stervbo <ulrik.ster...@gmail.com>
wrote:

> Hi BFD,
>
> ?geom_contour() *does* have helpful examples. Your Google-fu is weak:
> Searching for geom_contour brought me:
> http://ggplot2.tidyverse.org/reference/geom_contour.html as the first result.
>
> HTH
> Ulrik
>
> On Mon, 9 Oct 2017 at 08:04 Big Floppy Dog <bigfloppy...@gmail.com> wrote:
>
>> Can someone please point me to an example with geom_contour() that uses a
>> function? The help does not have an example of a function, and also  I did
>> not find anything from online searches.
>>
>> TIA,
>> BFD
>>
>>
>> 
>> ---
>>
>> How about geom_contour()?
>>
>> Am So., 8. Okt. 2017, 20:52 schrieb Ranjan Maitra <mai...@email.com>:
>>
>> > Hi,
>> >
>> > I am no expert on ggplot2 and I do not know the answer to your
>> question. I
>> > looked around a bit but could not find an answer right away. But one
>> > possibility could be, if a direct approach is not possible, to draw
>> > ellipses corresponding to the confidence regions of the multivariate t
>> > density and use geom_polygon to draw this successively?
>> >
>> > I will wait for a couple of days to see if there is a better answer
>> posted
>> > and then write some code, unless you get to it first.
>> >
>> > Thanks,
>> > Ranjan


[R] example of geom_contour() with function argument

2017-10-09 Thread Big Floppy Dog
Can someone please point me to an example with geom_contour() that uses a
function? The help does not have an example of a function, and also  I did
not find anything from online searches.

TIA,
BFD


---

How about geom_contour()?

Am So., 8. Okt. 2017, 20:52 schrieb Ranjan Maitra <mai...@email.com>:

> Hi,
>
> I am no expert on ggplot2 and I do not know the answer to your question. I
> looked around a bit but could not find an answer right away. But one
> possibility could be, if a direct approach is not possible, to draw
> ellipses corresponding to the confidence regions of the multivariate t
> density and use geom_polygon to draw this successively?
>
> I will wait for a couple of days to see if there is a better answer posted
> and then write some code, unless you get to it first.
>
> Thanks,
> Ranjan
>


Re: [R] example of geom_contour() with function argument

2017-10-09 Thread Big Floppy Dog
Thank you very much! So, it appears that a grid has to be created for the
function to be used in stat_contour(). Thanks again for this example! It is
very helpful (and could be a worthwhile addition to geom_contour's help
example).
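
A note on the grid approach, as a minimal sketch rather than a definitive recipe: in the mvtnorm documentation I am aware of, dmvt() defaults to df = 1 and log = TRUE, so calling it with only the grid would return log-densities for one degree of freedom rather than the density of the distribution the points were drawn from. The sketch below passes the parameters explicitly. The scalar df = 13, the zero location, and the identity scale matrix are assumptions about the intent of the example above, and the names nu and dens_grid are illustrative.

```r
library(mvtnorm)
library(ggplot2)

set.seed(1234)
nu <- 13  # assumed degrees of freedom (a single number)

# sample to plot; an unnamed matrix becomes columns X1 and X2
xx <- data.frame(rmvt(100, sigma = diag(2), df = nu))

# regular grid covering the plotting region
dens_grid <- expand.grid(X1 = seq(-5, 5, by = 0.1),
                         X2 = seq(-5, 5, by = 0.1))

# evaluate the theoretical density on the grid;
# log = FALSE asks for the density itself, not its logarithm
dens_grid$d <- dmvt(as.matrix(dens_grid[, c("X1", "X2")]),
                    delta = c(0, 0), sigma = diag(2),
                    df = nu, log = FALSE)

# contours come from the grid, points from the sample;
# x and y are inherited from the top-level aes()
p <- ggplot(xx, aes(x = X1, y = X2)) +
  geom_contour(data = dens_grid, aes(z = d)) +
  geom_point()
print(p)
```

Drawing geom_point() last keeps the points on top of the contour lines.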

Btw, I was also trying to make the contour plot have shaded regions
corresponding to how much mass there is in between each contour, and I seem
to be getting something very ugly (and useless). Any suggestions?

## you were misusing "require"... only use require if you plan to
## test the return value and fail gracefully when the package is missing
library(mvtnorm)
library(ggplot2)
set.seed( 1234 )
xx <- data.frame( rmvt( 100, df = c( 13, 13 ) ) )
## all combinations... could be used to fill a matrix
xx2 <- expand.grid( X1 = seq( -5, 5, 0.1 )
                  , X2 = seq( -5, 5, 0.1 )
                  )
## compute density as a function of the grid of points
## (feels weird not specifying measures of centrality or spread)
xx2$d <- dmvt( as.matrix( xx2[,1:2] ) )


ggplot( data = xx
      , aes( x = X1
           , y = X2
           )
      ) +
  geom_tile(data = xx2, aes(fill = d, alpha = 0.01)) +
  geom_contour(data = xx2, aes( x = X1
                              , y = X2
                              , z = d
                              )
              ) +
  geom_point() + theme_light() +
  theme(legend.position="none")
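
On the shading question, one possible contributor to the result (an assumption, not checked against the actual plot) is that alpha = 0.01 sits inside aes(), where it is treated as a data value to be scaled rather than as a fixed transparency. The sketch below sets alpha as a constant and picks contour levels by enclosed probability instead of by density value. It assumes a scalar df = 13 with zero location and identity scale, uses the fact that for a bivariate t with identity scale the squared radius divided by 2 follows an F(2, df) distribution, and needs ggplot2 3.3.0 or later for geom_contour_filled(). The names probs, radii, levs, and brks are illustrative.

```r
library(mvtnorm)
library(ggplot2)

set.seed(1234)
nu <- 13  # assumed degrees of freedom
xx <- data.frame(rmvt(100, sigma = diag(2), df = nu))

grid <- expand.grid(X1 = seq(-5, 5, by = 0.1),
                    X2 = seq(-5, 5, by = 0.1))
grid$d <- dmvt(as.matrix(grid), df = nu, log = FALSE)

# probability mass enclosed by each contour
probs <- c(0.5, 0.75, 0.9, 0.99)
# radius of the circle enclosing that mass: r^2 / 2 ~ F(2, nu)
radii <- sqrt(2 * qf(probs, df1 = 2, df2 = nu))
# density value on each of those circles
levs <- dmvt(cbind(radii, 0), df = nu, log = FALSE)

# band edges from the outermost region up to just above the peak
brks <- c(0, sort(levs), max(grid$d) * 1.01)

p <- ggplot(xx, aes(x = X1, y = X2)) +
  geom_contour_filled(data = grid, aes(z = d),
                      breaks = brks, alpha = 0.5) +  # constant alpha, outside aes()
  geom_point() +
  theme_light()
print(p)
```

Each filled band then lies between two consecutive probability contours, so the innermost band holds about half of the mass and the next one about a quarter.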


Also, I had not completely appreciated the difference between require() and
library(). I will look into the differences again! Thanks for pointing this
out.
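
For reference, a minimal sketch of the distinction as I understand it: library() stops with an error when a package is missing, while require() returns FALSE (with a warning) so that the caller can decide what to do. The helper name load_or_fail is hypothetical and only illustrates testing the return value.

```r
# attach a package, or stop with a clear message if it is not installed
load_or_fail <- function(pkg) {
  ok <- suppressWarnings(
    require(pkg, character.only = TRUE, quietly = TRUE)
  )
  if (!ok) {
    stop("package '", pkg, "' is needed but not installed", call. = FALSE)
  }
  invisible(ok)
}

# 'stats' ships with R, so this succeeds
load_or_fail("stats")
```

When the return value is not going to be checked, plain library() is the simpler choice, because the failure then happens at the point where the package is attached.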

TIAA.
BFD



On Mon, Oct 9, 2017 at 4:01 PM, jdnewmil <jdnew...@dcn.davis.ca.us> wrote:

> library(mvtnorm) # you were misusing "require"... only use require if you plan to
> library(ggplot2) # test the return value and fail gracefully when the package is missing
> set.seed( 1234 )
> xx <- data.frame( rmvt( 100, df = c( 13, 13 ) ) )
> xx2 <- expand.grid( X1 = seq( -5, 5, 0.1 ) # all combinations... could be used to fill a matrix
>                   , X2 = seq( -5, 5, 0.1 )
>                   )
> # compute density as a function of the grid of points
> xx2$d <- dmvt( as.matrix( xx2[,1:2] ) ) # feels weird not specifying measures of centrality or spread
> ggplot( data = xx
>       , aes( x = X1
>            , y = X2
>            )
>       ) +
>   geom_point() + # might want this line after the geom_contour
>   geom_contour( data = xx2 # may want to consider geom_tile as well
>               , mapping = aes( x = X1
>                              , y = X2
>                              , z = d
>                              )
>               )
> #' ![](https://i.imgur.com/8ExFYtI.png)
> ## generated/tested with the reprex package to double check that it is reproducible

[R] ggplot2: plot grouped/nested split violins

2018-03-06 Thread Big Floppy Dog
Hi,

I posted this on StackOverflow also but did not get a response so I thought
that I would also try my luck here. The post is at:

https://stackoverflow.com/questions/49120060/ggplot2-display-blocks-of-nested-split-violins

Basically, I have the following test example:

--cut-and-paste-from-here-on

df <- data.frame(dens = rnorm(5000),
                 split = as.factor(sample(1:2, 5000, replace = TRUE)),
                 method = as.factor(sample(c("A","B"), 5000, replace = TRUE)),
                 counts = sample(c(1, 10, 100, 1000, 1), 5000, replace = TRUE))

-stop-cut-and-paste-here


What I want is to draw split violin plots for splits 1 and 2 within
methods A and B for each count (the counts would be on a log scale, but
that is not important for this example). There are four groups for each
count, but with a nested structure.

Here is what I have tried:


-start-cut-and-paste-again---

GeomSplitViolin <- ggproto("GeomSplitViolin", GeomViolin,

  draw_group = function(self, data, ..., draw_quantiles = NULL) {
    # By @YAK:
    # https://stackoverflow.com/questions/35717353/split-violin-plot-with-ggplot2
    data <- transform(data, xminv = x - violinwidth * (x - xmin),
                      xmaxv = x + violinwidth * (xmax - x))
    grp <- data[1, 'group']
    newdata <- plyr::arrange(transform(data, x = if (grp %% 2 == 1) xminv
                                                 else xmaxv),
                             if (grp %% 2 == 1) y else -y)
    newdata <- rbind(newdata[1, ], newdata, newdata[nrow(newdata), ],
                     newdata[1, ])
    newdata[c(1, nrow(newdata) - 1, nrow(newdata)), 'x'] <- round(newdata[1, 'x'])
    if (length(draw_quantiles) > 0 & !scales::zero_range(range(data$y))) {
      stopifnot(all(draw_quantiles >= 0), all(draw_quantiles <= 1))
      quantiles <- create_quantile_segment_frame(data, draw_quantiles,
                                                 split = TRUE, grp = grp)
      aesthetics <- data[rep(1, nrow(quantiles)),
                         setdiff(names(data), c("x", "y")), drop = FALSE]
      aesthetics$alpha <- rep(1, nrow(quantiles))
      both <- cbind(quantiles, aesthetics)
      quantile_grob <- GeomPath$draw_panel(both, ...)
      ggplot2:::ggname("geom_split_violin",
                       grid::grobTree(GeomPolygon$draw_panel(newdata, ...),
                                      quantile_grob))
    } else {
      ggplot2:::ggname("geom_split_violin",
                       GeomPolygon$draw_panel(newdata, ...))
    }
  })

create_quantile_segment_frame <- function(data, draw_quantiles,
                                          split = FALSE, grp = NULL) {
  dens <- cumsum(data$density)/sum(data$density)
  ecdf <- stats::approxfun(dens, data$y)
  ys <- ecdf(draw_quantiles)
  violin.xminvs <- (stats::approxfun(data$y, data$xminv))(ys)
  violin.xmaxvs <- (stats::approxfun(data$y, data$xmaxv))(ys)
  violin.xs <- (stats::approxfun(data$y, data$x))(ys)
  if (grp %% 2 == 0) {
    data.frame(x = ggplot2:::interleave(violin.xs, violin.xmaxvs),
               y = rep(ys, each = 2), group = rep(ys, each = 2))
  } else {
    data.frame(x = ggplot2:::interleave(violin.xminvs, violin.xs),
               y = rep(ys, each = 2), group = rep(ys, each = 2))
  }
}



geom_split_violin <- function(mapping = NULL, data = NULL,
                              stat = "ydensity", position = "identity", ...,
                              draw_quantiles = NULL, trim = TRUE,
                              scale = "area", na.rm = FALSE,
                              show.legend = NA, inherit.aes = TRUE) {
  layer(data = data, mapping = mapping, stat = stat,
        geom = GeomSplitViolin, position = position,
        show.legend = show.legend, inherit.aes = inherit.aes,
        params = list(trim = trim, scale = scale,
                      draw_quantiles = draw_quantiles, na.rm = na.rm, ...))
}


ggplot(df, aes(x = factor(counts), y = dens,
               fill = interaction(split, method))) +
  geom_split_violin(draw_quantiles = c(0.25, 0.5, 0.75)) +
  scale_fill_manual(values = RColorBrewer::brewer.pal(name = "Paired", n = 4)) +
  theme_light() + theme(legend.position = "bottom")


--stop-cut-and-paste-again---

Now, I almost get what I want, except that the two split violins for a
given count end up on top of each other. What I want is for them to be next
to each other and separated from the violins for the other counts.

In other words, the light blue and the dark blue should be the two halves
of one split violin, the light green and the dark green should be the two
halves of another, and these two violins should be bunched together.

Let me know if something is not clear, sorry for that.

As I mentioned, I also posted on SO, and I will keep both fora updated if I
get a good answer in either (unless someone else also posts there
directly).

TIA for any suggestions!

BFD
