Re: [R] Is there a hash data structure for R

2021-11-03 Thread Abby Spurdle
Here's an interesting article:
Collections in R: Review and Proposal
Timothy Barry
The R Journal
doi: 10.32614/RJ-2018-037
https://journal.r-project.org/archive/2018/RJ-2018-037/RJ-2018-037.pdf
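
Also, for a quick base-R approach: environments behave like hash
tables, with unique keys.
A minimal sketch, mirroring the example below:

h <- new.env (hash=TRUE)
h [["x"]] <- 1:5
h [["y"]] <- month.name
h [["x"]] <- 3:7   #overwrites the existing "x", keys stay unique
h [["x"]]          #3 4 5 6 7
ls (h)             #"x" "y"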

On Tue, Nov 2, 2021 at 10:48 PM Yonghua Peng  wrote:
>
> I know this is a newbie question. But how do I implement the hash structure
> which is available in other languages (in Python it's dict)?
>
> I know there is the list, but a list's names can be duplicated here.
>
> > x <- list(x=1:5,y=month.name,x=3:7)
>
> > x
>
> $x
>
> [1] 1 2 3 4 5
>
>
> $y
>
>  [1] "January"   "February"  "March" "April" "May"   "June"
>
>  [7] "July"  "August""September" "October"   "November"  "December"
>
>
> $x
>
> [1] 3 4 5 6 7
>
>
>
> Thanks a lot.



Re: [R] Need help to unzip files in Windows

2021-08-23 Thread Abby Spurdle
There are some differences in R between Windows and Linux.
You could try the 'shell' function instead.

#On Windows
?shell
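
And if there's no gunzip binary on the Windows machine, a base-R
sketch like the following avoids external tools altogether.
(Assuming the untarred .gz files are in the directory from the
original post.)

files <- list.files ("~/Desktop/GSE162562_RAW", "\\.gz$", full.names=TRUE)
for (f in files)
{   con.in <- gzfile (f, "rb")
    con.out <- file (sub ("\\.gz$", "", f), "wb")
    repeat
    {   chunk <- readBin (con.in, "raw", n=2^20)   #1 MB at a time
        if (length (chunk) == 0) break
        writeBin (chunk, con.out)
    }
    close (con.in)
    close (con.out)
}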

On Tue, Aug 24, 2021 at 4:53 AM Anas Jamshed  wrote:
>
> I have the file GSE162562_RAW. First I untar it
> with untar("GSE162562_RAW.tar"),
> then I run:
>  system("gunzip ~/Desktop/GSE162562_RAW/*.gz")
>
>
> This runs fine on Linux but not on Windows. What changes should I
> make to run this command on Windows as well?



Re: [R] Testing optimization solvers with equality constraints

2021-05-27 Thread Abby Spurdle
I meant:
x0 = c (1, 1e-3, 0)

Not:
x0 = c (1, 1e6, 0)

So, a large intentional error may work too.
Possibly even better...?

On Thu, May 27, 2021 at 6:00 PM Abby Spurdle  wrote:
>
> If I can re-answer the original post:
> There's a relatively simple solution.
> (For these problems, at least).
>
> #wrong
> x0 = c (1, 0, 0)
> NlcOptim::solnl(x0, objfun = f, confun = conf)$par
> Rdonlp2::donlp2(x0, fn = f, nlin = list(heq), nlin.lower = 0,
> nlin.upper = 0)$par
>
> #right
> x0 = c (1, 1e6, 0)
> NlcOptim::solnl(x0, objfun = f, confun = conf)$par
> Rdonlp2::donlp2(x0, fn = f, nlin = list(heq), nlin.lower = 0,
> nlin.upper = 0)$par
>
> So, problems with the starting point appear to be very *specific*.
> Hence, a small amount of intentional error resolves the problem.
>
> Presumably, there are more efficient solutions, that the package
> maintainers may (or may not) want to address.
>
>
> On Thu, May 27, 2021 at 3:27 PM Abby Spurdle  wrote:
> >
> > I need to retract my previous post.
> > (Except the part that R has extremely good numerical capabilities.)
> >
> > I ran some of the examples, and Hans W was correct.



Re: [R] Testing optimization solvers with equality constraints

2021-05-27 Thread Abby Spurdle
If I can re-answer the original post:
There's a relatively simple solution.
(For these problems, at least).

#wrong
x0 = c (1, 0, 0)
NlcOptim::solnl(x0, objfun = f, confun = conf)$par
Rdonlp2::donlp2(x0, fn = f, nlin = list(heq), nlin.lower = 0,
nlin.upper = 0)$par

#right
x0 = c (1, 1e6, 0)
NlcOptim::solnl(x0, objfun = f, confun = conf)$par
Rdonlp2::donlp2(x0, fn = f, nlin = list(heq), nlin.lower = 0,
nlin.upper = 0)$par

So, problems with the starting point appear to be very *specific*.
Hence, a small amount of intentional error resolves the problem.

Presumably, there are more efficient solutions, that the package
maintainers may (or may not) want to address.


On Thu, May 27, 2021 at 3:27 PM Abby Spurdle  wrote:
>
> I need to retract my previous post.
> (Except the part that R has extremely good numerical capabilities.)
>
> I ran some of the examples, and Hans W was correct.



Re: [R] Testing optimization solvers with equality constraints

2021-05-26 Thread Abby Spurdle
I need to retract my previous post.
(Except the part that R has extremely good numerical capabilities.)

I ran some of the examples, and Hans W was correct.



Re: [R] Testing optimization solvers with equality constraints

2021-05-24 Thread Abby Spurdle
I received an off-list email, questioning the relevance of my post.
So, I thought I should clarify.

If an optimization algorithm is dependent on the starting point (or
other user-selected parameters), and then fails to find the "correct"
solution because the starting point (or other user-selected
parameters) are unsuitable, then that, in itself, does not indicate a
problem with the algorithm.

In other words, the R packages listed in this thread appear to be
working fine.
(Or at least, there's no clear evidence against that.)

One solution is to project the surface (here, the equality constraint)
onto lower dimensions, as already suggested.
Another, much simpler, solution is to use two algorithms, where one
selects one or more starting points.
(These could be the solution to an initial optimization, or chosen at
random, or a combination of both.)
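
As a rough sketch of the second idea (reusing f and conf from the
original post, with random starting points only, and purely as an
illustration):

multistart.solnl <- function (nstarts=20)
{   best <- NULL
    for (i in 1:nstarts)
    {   x0 <- rnorm (3)
        x0 <- x0 / sqrt (sum (x0^2) )   #random point on the unit sphere
        sol <- try (NlcOptim::solnl (x0, objfun=f, confun=conf), silent=TRUE)
        if (! inherits (sol, "try-error") &&
            (is.null (best) || sol$fn < best$fn) )
            best <- sol
    }
    best
}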

Both of these approaches generalize to a broader set of problems.
And I assume that there are other (possibly much better) approaches.
However, that's an off-list discussion...

All in all, I would say R has extremely good numerical capabilities,
which are even more useful still when combined with well-chosen
mathematical and statistical graphics.


On Sun, May 23, 2021 at 5:25 PM Abby Spurdle  wrote:
>
> For a start, there's two local minima.
>
> Add to that floating point errors.
> And possible assumptions by the package authors.
>
> begin code
> f <- function (x, y, sign)
> {   unsign.z <- sqrt (1 - x^2 - y^2)
> 2 * (x^2 - sign * y * unsign.z)
> }
>
> north.f <- function (x, y) f (x, y, +1)
> south.f <- function (x, y) f (x, y, -1)
>
> N <- 100
> p0 <- par (mfrow = c (1, 2) )
> plotf_cfield (north.f, c (-1.1, 1.1),
> main="north",
> ncontours=10, n=N, raster=TRUE, hcv=TRUE)
> plotf_cfield (south.f, c (-1.1, 1.1),
> main="south",
> ncontours=10, n=N, raster=TRUE, hcv=TRUE)
> par (p0)
> end code 
>
> Please ignore R warnings.
> I'm planning to reinvent this package soon.
> And also, it wasn't designed for circular heatmaps.
>
>
> On Sat, May 22, 2021 at 8:02 PM Hans W  wrote:
> >
> > Yes. "*on* the unit sphere" means on the surface, as you can guess
> > from the equality constraint. And 'auglag()' does find the minimum, so
> > no need for a special approach.
> >
> > I was/am interested in why all these other good solvers get stuck,
> > i.e., do not move away from the starting point. And how to avoid this
> > in general, not only for this specific example.
> >
> >
> > On Sat, 22 May 2021 at 09:44, Abby Spurdle  wrote:
> > >
> > > Sorry, this might sound like a poor question:
> > > But by "on the unit sphere", do you mean on the ***surface*** of the 
> > > sphere?
> > > In which case, can't the surface of a sphere be projected onto a pair
> > > of circles?
> > > Where the cost function is reformulated as a function of two (rather
> > > than three) variables.
> > >



Re: [R] Testing optimization solvers with equality constraints

2021-05-22 Thread Abby Spurdle
Sorry, missed the top line of code.
library (barsurf)



Re: [R] Testing optimization solvers with equality constraints

2021-05-22 Thread Abby Spurdle
For a start, there are two local minima.

Add to that floating point errors.
And possible assumptions by the package authors.

begin code
f <- function (x, y, sign)
{   unsign.z <- sqrt (1 - x^2 - y^2)
    2 * (x^2 - sign * y * unsign.z)
}

north.f <- function (x, y) f (x, y, +1)
south.f <- function (x, y) f (x, y, -1)

N <- 100
p0 <- par (mfrow = c (1, 2) )
plotf_cfield (north.f, c (-1.1, 1.1),
    main="north",
    ncontours=10, n=N, raster=TRUE, hcv=TRUE)
plotf_cfield (south.f, c (-1.1, 1.1),
    main="south",
    ncontours=10, n=N, raster=TRUE, hcv=TRUE)
par (p0)
end code 

Please ignore R warnings.
I'm planning to reinvent this package soon.
And also, it wasn't designed for circular heatmaps.


On Sat, May 22, 2021 at 8:02 PM Hans W  wrote:
>
> Yes. "*on* the unit sphere" means on the surface, as you can guess
> from the equality constraint. And 'auglag()' does find the minimum, so
> no need for a special approach.
>
> I was/am interested in why all these other good solvers get stuck,
> i.e., do not move away from the starting point. And how to avoid this
> in general, not only for this specific example.
>
>
> On Sat, 22 May 2021 at 09:44, Abby Spurdle  wrote:
> >
> > Sorry, this might sound like a poor question:
> > But by "on the unit sphere", do you mean on the ***surface*** of the sphere?
> > In which case, can't the surface of a sphere be projected onto a pair
> > of circles?
> > Where the cost function is reformulated as a function of two (rather
> > than three) variables.
> >


Re: [R] Testing optimization solvers with equality constraints

2021-05-22 Thread Abby Spurdle
Sorry, this might sound like a poor question:
But by "on the unit sphere", do you mean on the ***surface*** of the sphere?

In which case, can't the surface of a sphere be projected onto a pair
of circles, where the cost function is reformulated as a function of
two (rather than three) variables?
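
A minimal sketch of that reparameterization, using the cost function
from the original post and ordinary spherical coordinates:

f2 <- function (theta)
{   x <- sin (theta [1]) * cos (theta [2])
    y <- sin (theta [1]) * sin (theta [2])
    z <- cos (theta [1])
    2 * (x^2 - y * z)
}

optim (c (pi / 2, pi / 2), f2)$value   #approximately -1, no constraints needed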


On Sat, May 22, 2021 at 3:01 AM Hans W  wrote:
>
> Just by chance I came across the following example of minimizing
> a simple function
>
> (x,y,z) --> 2 (x^2 - y z)
>
> on the unit sphere, the only constraint present.
> I tried it with two starting points, x1 = (1,0,0) and x2 = (0,0,1).
>
> #-- Problem definition in R
> f = function(x)  2 * (x[1]^2 - x[2]*x[3])   # (x,y,z) |-> 2(x^2 -yz)
> g = function(x)  c(4*x[1], 2*x[3], 2*x[2])  # its gradient
>
> x0 = c(1, 0, 0); x1 = c(0, 0, 1)# starting points
> xmin = c(0, 1/sqrt(2), 1/sqrt(2))   # true minimum -1
>
> heq = function(x)  1-x[1]^2-x[2]^2-x[3]^2   # staying on the sphere
> conf = function(x) {# constraint function
> fun = x[1]^2 + x[2]^2 + x[3]^2 - 1
> return(list(ceq = fun, c = NULL))
> }
>
> I tried all the nonlinear optimization solvers in R packages that
> allow for equality constraints: 'auglag()' in alabama, 'solnl()' in
> NlcOptim, 'auglag()' in nloptr, 'solnp()' in Rsolnp, or even 'donlp2()'
> from the Rdonlp2 package (on R-Forge).
>
> None of them worked from both starting points:
>
> # alabama
> alabama::auglag(x0, fn = f, gr = g, heq = heq)  # right (inaccurate)
> alabama::auglag(x1, fn = f, gr = g, heq = heq)  # wrong
>
> # NlcOptim
> NlcOptim::solnl(x0, objfun = f, confun = conf)  # wrong
> NlcOptim::solnl(x1, objfun = f, confun = conf)  # right
>
> # nloptr
> nloptr::auglag(x0, fn = f, heq = heq)   # wrong
> # nloptr::auglag(x1, fn = f, heq = heq) # not returning
>
> # Rsolnp
> Rsolnp::solnp(x0, fun = f, eqfun = heq) # wrong
> Rsolnp::solnp(x1, fun = f, eqfun = heq) # wrong
>
> # Rdonlp2
> Rdonlp2::donlp2(x0, fn = f, nlin = list(heq),   # wrong
>nlin.lower = 0, nlin.upper = 0)
> Rdonlp2::donlp2(x1, fn = f, nlin = list(heq),   # right
>nlin.lower = 0, nlin.upper = 0)  # (fast and exact)
>
> The problem with starting point x0 appears to be that the gradient at
> that point, projected onto the unit sphere, is zero. Only alabama is
> able to handle this somehow.
>
> I do not know what problem most solvers have with starting point x1.
> The fact that Rdonlp2 is the fastest and most accurate is no surprise.
>
> If anyone with more experience with one or more of these packages can
> give a hint of what I made wrong, or how to change calling the solver
> to make it run correctly, please let me know.
>
> Thanks  -- HW



Re: [R] Spie charts

2021-03-29 Thread Abby Spurdle
I couldn't find a predefined function for this purpose.
However, it wouldn't be too difficult to write a pair of functions.

The big question is how flexible does the rendering function need to be?

#plot from angles, distances, etc
#(angles on arbitrary scale)
spiechart.render <- function (
    angle=1, distance, ...,
    circd = stdd, labd = 1.25 * stdd,
    main="", labs="", line.col="black", area.col="white",
    stdd = mean (distance) )
{   n <- length (distance)
    angle <- rep_len (angle, n)
    angle <- 2 * pi * angle / sum (angle)
    #... (sector rendering omitted in the original post)
}

#compute angles and distances, from data
#(then call rendering function)
spiechart <- function (angle, distance, ...)
{   #... (aggregation of raw data omitted in the original post)
    spiechart.render (angle, distance, ...)
}

Partially off-topic remarks:
I know there's some criticism of this approach.
However, the OP never stated the purpose.
And this approach could be useful in some cases: say, for modelling
certain ecological or weather events, where each event has a
categorical date (such as month) and a magnitude.
And then, in the top-level function from above, the angles and
distances would be the result of aggregation functions.
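
For illustration only, here's a hypothetical bare-bones renderer,
with hard-coded sectors and no labels:

spie.demo <- function (angle, distance, cols)
{   angle <- 2 * pi * c (0, cumsum (angle) / sum (angle) )
    r <- max (distance)
    plot.new ()
    plot.window (c (-r, r), c (-r, r), asp=1)
    for (i in seq_along (distance) )
    {   t <- seq (angle [i], angle [i + 1],, 50)
        polygon (c (0, distance [i] * cos (t) ),
            c (0, distance [i] * sin (t) ), col = cols [i])
    }
}

spie.demo (c (1, 2, 3, 4), c (4, 1, 3, 2), rainbow (4) )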


On Mon, Mar 29, 2021 at 4:59 AM Ferri Leberl  wrote:
>
> Dear ∀,
> Is there a function to plot "spie charts" in R?
> https://en.wikipedia.org/wiki/spie_chart
> (These are a combination of pie charts and radial pie charts, where the angle 
> represents one dimension and the radius of the respective sector another 
> dimension)
> Thank you in advance!



Re: [R] Off topic --- underdispersed (pseudo) binomial data.

2021-03-27 Thread Abby Spurdle
Further to yesterday's posts:
I think the "n" value would be the maximum possible number of jumps,
not the number of students.
In theory, the minimum possible number is zero, so the distributions
are more binomial-like than they look.

Also, there was a mistake in my comments.
The jump is from non-A-grade to A-grade, not non-pass to pass.


On Sat, Mar 27, 2021 at 10:00 PM Abby Spurdle  wrote:
>
> Sorry.
> I just realized, after posting, that the "n" value in the dispersion
> calculation isn't correct.
> I'll have to revisit the simulation, tomorrow.
>
> On Sat, Mar 27, 2021 at 9:11 PM Abby Spurdle  wrote:
> >
> > Hi Rolf,
> >
> > Let's say we have a course called Corgiology 101, with a single moderated 
> > exam.
> > And let's say the moderators transform initial exam scores, such that
> > there are fixed percentages of pass rates and A grades.
> >
> > Rather than count the number of passes, we can count the number of "jumps".
> > That is, the number of people that pass the corgiology exam after
> > moderation, that would not have passed without moderation.
> >
> > I've created a function to test for underdispersion, based on your 
> > expression.
> > (I hope I got it right).
> >
> > Then I've gone on to create simulations, using both constant and
> > nonconstant class sizes.
> > The nonconstant simulations apply an (approx) discrete scaling
> > transformation, referred to previously.
> >
> > We can see from the examples that there are a lot of these jumps.
> > And more importantly, they appear to be underdispersed.
> >
> > code
> > PASS.SCORE <- 0.5
> > A.SCORE <- 0.8
> >
> > #target parameters
> > PASS.RATE <- 0.8
> > A.RATE <- 0.2
> > #unmoderated parameters
> > UNMOD.MEAN.SCORE <- 0.65
> > UNMOD.SD.SCORE <- 0.075
> >
> > NCLASSES <- 2000
> > NSTUD.CONST <- 200
> > NSTUD.NONCONST.LIMS <- c (50, 800)
> >
> > sim.njump <- function (nstud, mean0=UNMOD.MEAN.SCORE, sd0=UNMOD.SD.SCORE,
> > pass.score=PASS.SCORE, a.score=A.SCORE,
> > pass.rate=PASS.RATE, a.rate=A.RATE)
> > {   x <- rnorm (nstud, mean0, sd0)
> > q <- quantile (x, 1 - c (pass.rate, a.rate), names=FALSE)
> > dq <- diff (q)
> > q <- (a.score - pass.score) / dq * q
> > y <- pass.score - q [1] + (a.score - pass.score) / dq * x
> > sum (x < a.score & y >= a.score)
> > }
> >
> > sim.nclasses <- function (nclasses, nstud, nstud.std)
> > {   nstud <- rep_len (nstud, nclasses)
> > njump <- integer (nclasses)
> > for (i in 1:nclasses)
> > njump [i] <- sim.njump (nstud [i])
> > if (missing (nstud.std) )
> > njump
> > else
> > round (nstud.std / nstud * njump)
> > }
> >
> > is.under <- function (x, n)
> > var (x) < mean (x) * (1 - mean (x) / n)
> >
> > njump.hom <- sim.nclasses (NCLASSES, NSTUD.CONST)
> > nstud <- round (runif (NCLASSES, NSTUD.NONCONST.LIMS [1],
> > NSTUD.NONCONST.LIMS [2]) )
> > njump.het <- sim.nclasses (NCLASSES, nstud, NSTUD.CONST)
> >
> > under.hom <- is.under (njump.hom, NSTUD.CONST)
> > under.het <- is.under (njump.het, NSTUD.CONST)
> > main.hom <- paste0 ("const class size (under=", under.hom, ")")
> > main.het <- paste0 ("diff class sizes (under=", under.het, ")")
> >
> > p0 <- par (mfrow = c (2, 1) )
> > hist (njump.hom, main=main.hom)
> > hist (njump.het, main=main.het)
> > par (p0)
> > code
> >
> > best,
> > B.
> >
> >
> > On Thu, Mar 25, 2021 at 2:33 PM Rolf Turner  wrote:
> > >
> > >
> > > I would like a real-life example of a data set which one might think to
> > > model by a binomial distribution, but which is substantially
> > > underdispersed. I.e. a sample X = {X_1, X_2, ..., X_N} where each X_i
> > > is an integer between 0 and n (n known a priori) such that var(X) <<
> > > mean(X)*(1 - mean(X)/n).
> > >
> > > Does anyone know of any such examples?  Do any exist?  I've done
> > > a perfunctory web search, and had a look at "A Handbook of Small
> > > Data Sets" by Hand, Daly, Lunn, et al., and drawn a blank.
> > >
> > > I've seen on the web some references to underdispersed "pseudo-Poisson"
> > > data, but not to underdispersed "pseudo-binomial" data.  And of course
> > > there's lots of *over* dispersed stuff.  But that's not what I want.

Re: [R] Off topic --- underdispersed (pseudo) binomial data.

2021-03-27 Thread Abby Spurdle
Sorry.
I just realized, after posting, that the "n" value in the dispersion
calculation isn't correct.
I'll have to revisit the simulation, tomorrow.

On Sat, Mar 27, 2021 at 9:11 PM Abby Spurdle  wrote:
>
> Hi Rolf,
>
> Let's say we have a course called Corgiology 101, with a single moderated 
> exam.
> And let's say the moderators transform initial exam scores, such that
> there are fixed percentages of pass rates and A grades.
>
> Rather than count the number of passes, we can count the number of "jumps".
> That is, the number of people that pass the corgiology exam after
> moderation, that would not have passed without moderation.
>
> I've created a function to test for underdispersion, based on your expression.
> (I hope I got it right).
>
> Then I've gone on to create simulations, using both constant and
> nonconstant class sizes.
> The nonconstant simulations apply an (approx) discrete scaling
> transformation, referred to previously.
>
> We can see from the examples that there are a lot of these jumps.
> And more importantly, they appear to be underdispersed.
>
> code
> PASS.SCORE <- 0.5
> A.SCORE <- 0.8
>
> #target parameters
> PASS.RATE <- 0.8
> A.RATE <- 0.2
> #unmoderated parameters
> UNMOD.MEAN.SCORE <- 0.65
> UNMOD.SD.SCORE <- 0.075
>
> NCLASSES <- 2000
> NSTUD.CONST <- 200
> NSTUD.NONCONST.LIMS <- c (50, 800)
>
> sim.njump <- function (nstud, mean0=UNMOD.MEAN.SCORE, sd0=UNMOD.SD.SCORE,
> pass.score=PASS.SCORE, a.score=A.SCORE,
> pass.rate=PASS.RATE, a.rate=A.RATE)
> {   x <- rnorm (nstud, mean0, sd0)
> q <- quantile (x, 1 - c (pass.rate, a.rate), names=FALSE)
> dq <- diff (q)
> q <- (a.score - pass.score) / dq * q
> y <- pass.score - q [1] + (a.score - pass.score) / dq * x
> sum (x < a.score & y >= a.score)
> }
>
> sim.nclasses <- function (nclasses, nstud, nstud.std)
> {   nstud <- rep_len (nstud, nclasses)
> njump <- integer (nclasses)
> for (i in 1:nclasses)
> njump [i] <- sim.njump (nstud [i])
> if (missing (nstud.std) )
> njump
> else
> round (nstud.std / nstud * njump)
> }
>
> is.under <- function (x, n)
> var (x) < mean (x) * (1 - mean (x) / n)
>
> njump.hom <- sim.nclasses (NCLASSES, NSTUD.CONST)
> nstud <- round (runif (NCLASSES, NSTUD.NONCONST.LIMS [1],
> NSTUD.NONCONST.LIMS [2]) )
> njump.het <- sim.nclasses (NCLASSES, nstud, NSTUD.CONST)
>
> under.hom <- is.under (njump.hom, NSTUD.CONST)
> under.het <- is.under (njump.het, NSTUD.CONST)
> main.hom <- paste0 ("const class size (under=", under.hom, ")")
> main.het <- paste0 ("diff class sizes (under=", under.het, ")")
>
> p0 <- par (mfrow = c (2, 1) )
> hist (njump.hom, main=main.hom)
> hist (njump.het, main=main.het)
> par (p0)
> code
>
> best,
> B.
>
>
> On Thu, Mar 25, 2021 at 2:33 PM Rolf Turner  wrote:
> >
> >
> > I would like a real-life example of a data set which one might think to
> > model by a binomial distribution, but which is substantially
> > underdispersed. I.e. a sample X = {X_1, X_2, ..., X_N} where each X_i
> > is an integer between 0 and n (n known a priori) such that var(X) <<
> > mean(X)*(1 - mean(X)/n).
> >
> > Does anyone know of any such examples?  Do any exist?  I've done
> > a perfunctory web search, and had a look at "A Handbook of Small
> > Data Sets" by Hand, Daly, Lunn, et al., and drawn a blank.
> >
> > I've seen on the web some references to underdispersed "pseudo-Poisson"
> > data, but not to underdispersed "pseudo-binomial" data.  And of course
> > there's lots of *over* dispersed stuff.  But that's not what I want.
> >
> > I can *simulate* data sets of the sort that I am looking for (so far the
> > only ideas I've had for doing this are pretty simplistic and
> > artificial) but I'd like to get my hands on a *real* example, if
> > possible.
> >
> > Grateful for any pointers/suggestions.
> >
> > cheers,
> >
> > Rolf Turner
> >
> > --
> > Honorary Research Fellow
> > Department of Statistics
> > University of Auckland
> > Phone: +64-9-373-7599 ext. 88276



Re: [R] Off topic --- underdispersed (pseudo) binomial data.

2021-03-27 Thread Abby Spurdle
Hi Rolf,

Let's say we have a course called Corgiology 101, with a single moderated exam.
And let's say the moderators transform initial exam scores, such that
there are fixed percentages of pass rates and A grades.

Rather than count the number of passes, we can count the number of "jumps".
That is, the number of people that pass the corgiology exam after
moderation, that would not have passed without moderation.

I've created a function to test for underdispersion, based on your expression.
(I hope I got it right).

Then I've gone on to create simulations, using both constant and
nonconstant class sizes.
The nonconstant simulations apply an (approx) discrete scaling
transformation, referred to previously.

We can see from the examples that there are a lot of these jumps.
And more importantly, they appear to be underdispersed.

code
PASS.SCORE <- 0.5
A.SCORE <- 0.8

#target parameters
PASS.RATE <- 0.8
A.RATE <- 0.2
#unmoderated parameters
UNMOD.MEAN.SCORE <- 0.65
UNMOD.SD.SCORE <- 0.075

NCLASSES <- 2000
NSTUD.CONST <- 200
NSTUD.NONCONST.LIMS <- c (50, 800)

sim.njump <- function (nstud, mean0=UNMOD.MEAN.SCORE, sd0=UNMOD.SD.SCORE,
    pass.score=PASS.SCORE, a.score=A.SCORE,
    pass.rate=PASS.RATE, a.rate=A.RATE)
{   x <- rnorm (nstud, mean0, sd0)
    q <- quantile (x, 1 - c (pass.rate, a.rate), names=FALSE)
    dq <- diff (q)
    q <- (a.score - pass.score) / dq * q
    y <- pass.score - q [1] + (a.score - pass.score) / dq * x
    sum (x < a.score & y >= a.score)
}

sim.nclasses <- function (nclasses, nstud, nstud.std)
{   nstud <- rep_len (nstud, nclasses)
    njump <- integer (nclasses)
    for (i in 1:nclasses)
        njump [i] <- sim.njump (nstud [i])
    if (missing (nstud.std) )
        njump
    else
        round (nstud.std / nstud * njump)
}

is.under <- function (x, n)
    var (x) < mean (x) * (1 - mean (x) / n)

njump.hom <- sim.nclasses (NCLASSES, NSTUD.CONST)
nstud <- round (runif (NCLASSES, NSTUD.NONCONST.LIMS [1],
    NSTUD.NONCONST.LIMS [2]) )
njump.het <- sim.nclasses (NCLASSES, nstud, NSTUD.CONST)

under.hom <- is.under (njump.hom, NSTUD.CONST)
under.het <- is.under (njump.het, NSTUD.CONST)
main.hom <- paste0 ("const class size (under=", under.hom, ")")
main.het <- paste0 ("diff class sizes (under=", under.het, ")")

p0 <- par (mfrow = c (2, 1) )
hist (njump.hom, main=main.hom)
hist (njump.het, main=main.het)
par (p0)
code

best,
B.


On Thu, Mar 25, 2021 at 2:33 PM Rolf Turner  wrote:
>
>
> I would like a real-life example of a data set which one might think to
> model by a binomial distribution, but which is substantially
> underdispersed. I.e. a sample X = {X_1, X_2, ..., X_N} where each X_i
> is an integer between 0 and n (n known a priori) such that var(X) <<
> mean(X)*(1 - mean(X)/n).
>
> Does anyone know of any such examples?  Do any exist?  I've done
> a perfunctory web search, and had a look at "A Handbook of Small
> Data Sets" by Hand, Daly, Lunn, et al., and drawn a blank.
>
> I've seen on the web some references to underdispersed "pseudo-Poisson"
> data, but not to underdispersed "pseudo-binomial" data.  And of course
> there's lots of *over* dispersed stuff.  But that's not what I want.
>
> I can *simulate* data sets of the sort that I am looking for (so far the
> only ideas I've had for doing this are pretty simplistic and
> artificial) but I'd like to get my hands on a *real* example, if
> possible.
>
> Grateful for any pointers/suggestions.
>
> cheers,
>
> Rolf Turner
>
> --
> Honorary Research Fellow
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276



Re: [R] local maxima positions in a vector with duplicated values

2021-03-26 Thread Abby Spurdle
Hi Stefano,

My package, vectools, is partly designed for this purpose.
(Unfortunately, the package *is* subject to *change*, and some of the
functions may change in the next update).

library (vectools)
which.maxs (x, ret.type="intervals")[,1] # c (8, 10, 13)
which.mins (x, ret.type="intervals")[,2] # c (4, 9, 12, 16)


best,
B.

P.S.
I'll make sure the next version has an option for the lower/upper index.
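
P.P.S.
Since a base-R solution was preferred, here's a sketch built on rle(),
which collapses the duplicate runs first:

local.extrema <- function (x)
{   r <- rle (x)
    ends <- cumsum (r$lengths)              #last index of each run
    starts <- ends - r$lengths + 1          #first index of each run
    s <- diff (sign (diff (r$values) ) )
    list (
        maxima = starts [which (s < 0) + 1],    #first of duplicated maxima
        minima = ends [which (s > 0) + 1] )     #last of duplicated minima
}

x <- c (1,0,0,0,2,2,3,4,0,1,1,0,5,5,5,0,1)
local.extrema (x)   #maxima: 8 10 13; minima: 4 9 12 16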


On Sat, Mar 27, 2021 at 4:36 AM Stefano Sofia
 wrote:
>
> Dear list users,
> I need to find local maxima and local minima positions in a vector where 
> there might be duplicates; in particular, in the case of
> - duplicated local maxima, I should take the position of the first duplicated 
> value;
> - duplicated local minima, I should take the position of the last duplicated 
> value.
>
> Example:
>
> >x <- c(1,0,0,0,2,2,3,4,0,1,1,0,5,5,5,0,1)
> >which(diff(diff(x)>=0)<0)+1
>
> gives me 8 11 15 while I need 8 10 13;
>
> >which(diff(diff(x)>0)>0)+1
>
> gives me 4 6  9 12 16 while I need 4 9 12 16.
>
> Could you please help me in this task? I would be happier not to use 
> additional packages.
>
> Thank you for your precious help
> Stefano
>
>
>  (oo)
> --oOO--( )--OOo--
> Stefano Sofia PhD
> Civil Protection - Marche Region - Italy
> Meteo Section
> Snow Section
> Via del Colle Ameno 5
> 60126 Torrette di Ancona, Ancona (AN)
> Uff: +39 071 806 7743
> E-mail: stefano.so...@regione.marche.it
> ---Oo-oO



Re: [R] Off topic --- underdispersed (pseudo) binomial data.

2021-03-25 Thread Abby Spurdle
I haven't checked this, but I guess that the number of students that
*pass* a particular exam/subject per semester would be like that.

e.g.
Let's say you have a course in maximum likelihood, that's taught once
per year to 3rd-year students and a few postgrads.
You could count the number of passes each year.

If you assume a near-constant probability of passing in each
exam/semester, then I would expect the counts to follow the
distribution that you're requesting.

If there is a significant change in the number of students (let's say
that fewer and fewer students study maximum likelihood, because they
would rather study "advanced" R programming for "data science" with
"large data"), then you might be able to apply some sort of
discrete-scaling transformation to the number of passes each semester.
This would allow you to pretend that the number of people studying
maximum likelihood is the same, and that no one is studying other,
apparently more important, subjects.


On Thu, Mar 25, 2021 at 2:33 PM Rolf Turner  wrote:
>
>
> I would like a real-life example of a data set which one might think to
> model by a binomial distribution, but which is substantially
> underdispersed. I.e. a sample X = {X_1, X_2, ..., X_N} where each X_i
> is an integer between 0 and n (n known a priori) such that var(X) <<
> mean(X)*(1 - mean(X)/n).
>
> Does anyone know of any such examples?  Do any exist?  I've done
> a perfunctory web search, and had a look at "A Handbook of Small
> Data Sets" by Hand, Daly, Lunn, et al., and drawn a blank.
>
> I've seen on the web some references to underdispersed "pseudo-Poisson"
> data, but not to underdispersed "pseudo-binomial" data.  And of course
> there's lots of *over* dispersed stuff.  But that's not what I want.
>
> I can *simulate* data sets of the sort that I am looking for (so far the
> only ideas I've had for doing this are pretty simplistic and
> artificial) but I'd like to get my hands on a *real* example, if
> possible.
>
> Grateful for any pointers/suggestions.
>
> cheers,
>
> Rolf Turner
>
> --
> Honorary Research Fellow
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276



Re: [R] quantile from quantile table calculation without original data

2021-03-12 Thread Abby Spurdle
Hi Petr,

In principle, I like David's approach the best.
However, I note that there's a bug in the squared step.
Furthermore, the variance of the sample quantiles should increase as
they move away from the modal region.

I've built on David's approach, but changed it to a two-stage
optimization algorithm.
The parameter estimates from the first stage are used to compute density values.
Then the second stage is weighted, using the scaled density values.

I tried to create an iteratively reweighted algorithm.
However, it didn't converge.
(But that doesn't necessarily mean it can't be done).

The following code returns the value: 1.648416e-05

qfit.lnorm <- function (p, q, lower.tail=TRUE, ...,
    par0 = c (-0.5, 0.5) )
{   n <- length (p)
    qsample <- q

    objf <- function (par)
    {   qmodel <- qlnorm (p, par [1], par [2], lower.tail)
        sum ( (qmodel - qsample)^2) / n
    }
    objf.w <- function (wpar, w)
    {   qmodel <- qlnorm (p, wpar [1], wpar [2], lower.tail)
        sum (w * (qmodel - qsample)^2)
    }

    wpar0 <- optim (par0, objf)$par
    #first-stage density values, used as weights
    #(note, dlnorm doesn't take a lower.tail argument)
    w <- dlnorm (p, wpar0 [1], wpar0 [2])
    optim (wpar0, objf.w, w=w)
}

par <- qfit.lnorm (temp$percent, temp$size, FALSE)$par
plnorm (0.1, par [1], par [2])


On Tue, Mar 9, 2021 at 2:52 AM PIKAL Petr  wrote:
>
> Hallo David, Abby and Bert
>
> Thank you for your solutions. In the meantime I found package 
> rriskDistributions, which was able to calculate values for lognormal 
> distribution from quantiles.
>
> Abby
> > 1-psolution
> [1] 9.980823e-06
>
> David
> > plnorm(0.1, -.7020649, .4678656)
> [1] 0.0003120744
>
> rriskDistributions
> > plnorm(0.1, -.6937355, .3881209)
> [1] 1.697379e-05
>
> Bert suggested to ask for the original data before quantile calculation, which
> is probably the best but also the most problematic solution. Actually, maybe
> the original data are unavailable, as it is the result of a particle size
> measurement, where the software always twists the original data and spits out
> only descriptive results.
>
> All your results are quite consistent with the available values as they are 
> close to 1, so for me, each approach works.
>
> Thank you again.
>
> Best regards.
> Petr
>
> > -Original Message-
> > From: David Winsemius 
> > Sent: Sunday, March 7, 2021 1:33 AM
> > To: Abby Spurdle ; PIKAL Petr
> > 
> > Cc: r-help@r-project.org
> > Subject: Re: [R] quantile from quantile table calculation without original 
> > data
> >
> >
> > On 3/6/21 1:02 AM, Abby Spurdle wrote:
> > > I came up with a solution.
> > > But not necessarily the best solution.
> > >
> > > I used a spline to approximate the quantile function.
> > > Then use that to generate a large sample.
> > > (I don't see any need for the sample to be random, as such).
> > > Then compute the sample mean and sd, on a log scale.
> > > Finally, plug everything into the plnorm function:
> > >
> > > p <- seq (0.01, 0.99,, 1e6)
> > > Fht <- splinefun (temp$percent, temp$size)
> > > x <- log (Fht (p) )
> > > psolution <- plnorm (0.1, mean (x), sd (x), FALSE)
> > > psolution
> > >
> > > The value of the solution is very close to one.
> > > Which is not a surprise.
> > >
> > > Here's a plot of everything:
> > >
> > > u <- seq (0.01, 1.65,, 200)
> > > v <- plnorm (u, mean (x), sd (x), FALSE)
> > > plot (u, v, type="l", ylim = c (0, 1) )
> > > points (temp$size, temp$percent, pch=16)
> > > points (0.1, psolution, pch=16, col="blue")
> >
> > Here's another approach, which uses minimization of the squared error to
> > get the parameters for a lognormal distribution.
> >
> > temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069, 0.3781,
> > 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 
> > 0.95,
> > 0.99)), .Names = c("size", "percent"
> > ), row.names = c(NA, -9L), class = "data.frame")
> >
> > obj <- function(x) {sum( qlnorm(1-temp$percent, x[[1]], x[[2]])-temp$size
> > )^2}
> >
> > # Note the inversion of the poorly named and flipped "percent" column,
> >
> > optim( list(a=-0.65, b=0.42), obj)
> >
> > #
> >
> > $par
> >   a  b
> > -0.7020649  0.4678656
> >
> > $value
> > [1] 3.110316e-12
> >
> > $counts
> > function gradient
> >51   NA
> >
> > $convergence
> > [1] 0
> >
> > $message
> > NULL
> >

Re: [R] quantile from quantile table calculation without original data

2021-03-06 Thread Abby Spurdle
I came up with a solution.
But not necessarily the best solution.

I used a spline to approximate the quantile function,
then used that to generate a large sample.
(I don't see any need for the sample to be random, as such.)
Then I computed the sample mean and sd, on a log scale.
Finally, plug everything into the plnorm function:

p <- seq (0.01, 0.99,, 1e6)
Fht <- splinefun (temp$percent, temp$size)
x <- log (Fht (p) )
psolution <- plnorm (0.1, mean (x), sd (x), FALSE)
psolution

The value of the solution is very close to one,
which is not a surprise.

Here's a plot of everything:

u <- seq (0.01, 1.65,, 200)
v <- plnorm (u, mean (x), sd (x), FALSE)
plot (u, v, type="l", ylim = c (0, 1) )
points (temp$size, temp$percent, pch=16)
points (0.1, psolution, pch=16, col="blue")


On Sat, Mar 6, 2021 at 8:09 PM Abby Spurdle  wrote:
>
> I'm sorry.
> I misread your example, this morning.
> (I didn't read the code after the line that calls plot).
>
> After looking at this problem again, interpolation doesn't apply, and
> extrapolation would be a last resort.
> If you can assume your data comes from a particular type of
> distribution, such as a lognormal distribution, then a better approach
> would be to find the most likely parameters.
>
> i.e.
> This falls within the broader scope of maximum likelihood.
> (Except that you're dealing with a table of quantile-probability
> pairs, rather than raw observational data).
>
> I suspect that there's a relatively easy way of finding the parameters.
>
> I'll think about it...
> But someone else may come back with an answer first...
>
>
> On Sat, Mar 6, 2021 at 8:17 AM Abby Spurdle  wrote:
> >
> > I note three problems with your data:
> > (1) The name "percent" is misleading, perhaps you want "probability"?
> > (2) There are straight (or near-straight) regions, each of which is
> > equally (or near-equally) spaced, which is not what I would expect in
> > problems involving "quantiles".
> > (3) Your plot (approximating the distribution function) is
> > back-to-front (relative to what is customary).
> >
> >
> > On Fri, Mar 5, 2021 at 10:14 PM PIKAL Petr  wrote:
> > >
> > > Dear all
> > >
> > > I have table of quantiles, probably from lognormal distribution
> > >
> > >  dput(temp)
> > > temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
> > > 0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
> > > 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
> > > ), row.names = c(NA, -9L), class = "data.frame")
> > >
> > > and I need to calculate quantile for size 0.1
> > >
> > > plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
> > > ss <- approxfun(temp$size, temp$percent)
> > > points((0:100)/50, ss((0:100)/50))
> > > abline(v=.1)
> > >
> > > If I had original data it would be quite easy with ecdf/quantile function 
> > > but without it I am lost what function I could use for such task.
> > >
> > > Please, give me some hint where to look.
> > >
> > >
> > > Best regards
> > >
> > > Petr



Re: [R] quantile from quantile table calculation without original data

2021-03-05 Thread Abby Spurdle
I'm sorry.
I misread your example this morning.
(I didn't read the code after the line that calls plot).

After looking at this problem again, interpolation doesn't apply, and
extrapolation would be a last resort.
If you can assume your data comes from a particular type of
distribution, such as a lognormal distribution, then a better approach
would be to find the most likely parameters.

i.e.
This falls within the broader scope of maximum likelihood.
(Except that you're dealing with a table of quantile-probability
pairs, rather than raw observational data).

I suspect that there's a relatively easy way of finding the parameters.

I'll think about it...
But someone else may come back with an answer first...


On Sat, Mar 6, 2021 at 8:17 AM Abby Spurdle  wrote:
>
> I note three problems with your data:
> (1) The name "percent" is misleading, perhaps you want "probability"?
> (2) There are straight (or near-straight) regions, each of which is
> equally (or near-equally) spaced, which is not what I would expect in
> problems involving "quantiles".
> (3) Your plot (approximating the distribution function) is
> back-to-front (relative to what is customary).
>
>
> On Fri, Mar 5, 2021 at 10:14 PM PIKAL Petr  wrote:
> >
> > Dear all
> >
> > I have table of quantiles, probably from lognormal distribution
> >
> >  dput(temp)
> > temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
> > 0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
> > 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
> > ), row.names = c(NA, -9L), class = "data.frame")
> >
> > and I need to calculate quantile for size 0.1
> >
> > plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
> > ss <- approxfun(temp$size, temp$percent)
> > points((0:100)/50, ss((0:100)/50))
> > abline(v=.1)
> >
> > If I had original data it would be quite easy with ecdf/quantile function 
> > but without it I am lost what function I could use for such task.
> >
> > Please, give me some hint where to look.
> >
> >
> > Best regards
> >
> > Petr



Re: [R] problem downloading R

2021-03-05 Thread Abby Spurdle
This was quite some time ago.
Namely, 2019.

At the time, I was using a mix of old Windows 8 and Windows 10 computers.
The problem happened on Windows 8, but not Windows 10.

Note that I was trying to help the OP.
Beyond that, I'm not concerned about it.


On Fri, Mar 5, 2021 at 9:30 PM Uwe Ligges
 wrote:
>
> Sounds like you always got corrupted versions. Either an issue with your
> connection or the mirror. What happens if you try another mirror and
> clear your browser caches?
>
> Best,
> Uwe Ligges
>
> On 05.03.2021 08:58, Abby Spurdle wrote:
> > Does the following sound familiar?
> >
> > The Windows installer starts installing (or decompressing) R, flashing
> > one file name at a time.
> > And then, part way through, says a file is corrupt, and gives you
> > the choice to ignore.
> > And if you click ignore, then the next file does the same thing.
> > And one quickly realizes that every subsequent file will have the same
> > message.
> >
> > Then if you re-download the installation file, the same thing happens.
> > Except that the first file flagged as corrupted, is not necessarily
> > the same file.
> >
> >
> > On Fri, Mar 5, 2021 at 5:55 AM Dick Mathews  
> > wrote:
> >>
> >> I am trying to download the Windows version of R. This computer is Win7;
> >> I do have Win10 computers also.
> >>
> >> Tried R4.0.4, got a message that these files are corrupted.
> >>
> >> Then tried R4.0.3, got the same message as above.
> >>
> >> Tried the next, R4.0.2, got the same message.
> >>
> >> I checked to see if I was downloading the proper version, seems okay.
> >>
> >> What is the problem? Can anybody help me with this?
> >>
> >> Dick Mathews



Re: [R] quantile from quantile table calculation without original data

2021-03-05 Thread Abby Spurdle
I note three problems with your data:
(1) The name "percent" is misleading, perhaps you want "probability"?
(2) There are straight (or near-straight) regions, each of which is
equally (or near-equally) spaced, which is not what I would expect in
problems involving "quantiles".
(3) Your plot (approximating the distribution function) is
back-to-front (relative to what is customary).


On Fri, Mar 5, 2021 at 10:14 PM PIKAL Petr  wrote:
>
> Dear all
>
> I have table of quantiles, probably from lognormal distribution
>
>  dput(temp)
> temp <- structure(list(size = c(1.6, 0.9466, 0.8062, 0.6477, 0.5069,
> 0.3781, 0.3047, 0.2681, 0.1907), percent = c(0.01, 0.05, 0.1,
> 0.25, 0.5, 0.75, 0.9, 0.95, 0.99)), .Names = c("size", "percent"
> ), row.names = c(NA, -9L), class = "data.frame")
>
> and I need to calculate quantile for size 0.1
>
> plot(temp$size, temp$percent, pch=19, xlim=c(0,2))
> ss <- approxfun(temp$size, temp$percent)
> points((0:100)/50, ss((0:100)/50))
> abline(v=.1)
>
> If I had the original data it would be quite easy with the ecdf/quantile
> functions, but without them I am at a loss as to what function I could use
> for such a task.
>
> Please, give me some hint where to look.
>
>
> Best regards
>
> Petr



Re: [R] problem downloading R

2021-03-04 Thread Abby Spurdle
Does the following sound familiar?

The Windows installer starts installing (or decompressing) R, flashing
one file name at a time.
And then, part way through, says a file is corrupt, and gives you
the choice to ignore.
And if you click ignore, then the next file does the same thing.
And one quickly realizes that every subsequent file will have the same message.

Then if you re-download the installation file, the same thing happens.
Except that the first file flagged as corrupted, is not necessarily
the same file.


On Fri, Mar 5, 2021 at 5:55 AM Dick Mathews  wrote:
>
> I am trying to download the Windows version of R. This computer is Win7;
> I do have Win10 computers also.
>
> Tried R4.0.4, got a message that these files are corrupted.
>
> Then tried R4.0.3, got the same message as above.
>
> Tried the next, R4.0.2, got the same message.
>
> I checked to see if I was downloading the proper version, seems okay.
>
> What is the problem? Can anybody help me with this?
>
> Dick Mathews



Re: [R] Image processing in R for BMI calculation

2021-03-01 Thread Abby Spurdle
I can't help but feel that a discussion on the merit of BMI is a
digression from the OP's question, in addition to being of no
relevance to "R Programming".

In relation to Richard's technical comments:
As per my previous post, it is possible to get *relative" measures.
(Assuming the images are not on a standardized scale, which they could be).

In relation to Jim's anatomical comments:
An improved algorithm could evaluate a subject's body type.
In addition to factoring in gender, ethnic characteristics and age.

In any case, I don't see why image-based classification is any less
worthwhile than conclusions based on time series plots and
scatterplots...



Re: [R] Image processing in R for BMI calculation

2021-02-28 Thread Abby Spurdle
Hi Paul,

If the background is relatively uniform, then a simple algorithm
could be used to distinguish foreground from background points,
essentially returning a logical matrix.

Otherwise, I'm assuming that suitable pooling/convolution operations
could be used for this purpose.

Then you could estimate *relative* measures of height and mass from
the logical matrix.
(With more precise measures being possible, if you know the image scales).

There are a number of R packages for reading images.
Currently, I mainly use Simon Urbanek's package, "png".
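
As a rough sketch with "png" (the file name and threshold here are
made up, and a light background is assumed):

library (png)

img <- readPNG ("subject.png")
gray <- if (length (dim (img) ) == 3) rowMeans (img, dims=2) else img
mask <- (gray < 0.5)                    #TRUE = foreground (dark) pixels

rows <- range (which (rowSums (mask) > 0) )
cols <- range (which (colSums (mask) > 0) )
rel.height <- diff (rows) + 1           #subject height, in pixels
rel.width <- diff (cols) + 1            #subject width, in pixels
rel.width / rel.height                  #a crude relative bulk measure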

I'm currently creating a package with support for pooling/convolution
operations, and related utility functions.
However, it's not ready yet.

Another option is Jeroen Ooms' package, "magick".


best,
B.


On Mon, Mar 1, 2021 at 5:39 AM Paul Bernal  wrote:
>
> Hello everyone,
>
> Does anyone know about any package for image processing, for example, to
> calculate body mass index based on a picture, silhouette or image.
>
> Any guidance will be greatly appreciated.
>
> Best regards,
>
> Paul

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it Possible to Create S4 Function Objects?

2021-02-24 Thread Abby Spurdle
I've completed the S4 example.
Thank you, Herve, for your assistance.

begin code
plotf <- function (f)
{   x <- seq (-1, 1,, 200)
plot (x, f (x), type="l")
}

#s4-based function object, with slot
setClass ("Quad.S4", contains="function", slots = list (p="numeric") )
Quad.S4 <- function (p = c (0, 0, 1) )
{   f <- function (x)
{   this <- sys.function ()
p <- this@p
p [1] + p [2] * x + p [3] * x^2
}
new ("Quad.S4", f, p=p)
}

f.s4 <- Quad.S4 ()
plotf (f.s4)
f.s4@p


On Tue, Feb 23, 2021 at 11:23 AM Abby Spurdle (/əˈbi/)
 wrote:
>
> Oh wow.
> Post of the year!
> Where's the like button?
>
> Note, I was able to rewrite it without the deprecated args.
> (Don't want one of those CRAN emails, in 12 months from now, saying
> please change ).
>
> I'll have to come back to this later, to see if I can get the body of
> f() to access the slot
>
>
> On Mon, Feb 22, 2021 at 11:36 AM Hervé Pagès  
> wrote:
> >
> > Hi Abby,
> >
> > Something along the line of:
> >
> >setClass("S4Function",
> >  contains="function",
> >  representation(name="character", more_stuff="ANY")
> >)
> >
> > seems to do what you want:
> >
> >f <- new("S4Function", function(a) a^2, name="square")
> >
> ># 'f' is both an S4 object and a function:
> >is.object(f)
> ># [1] TRUE
> >is.function(f)
> ># [1] TRUE
> >
> >f@name
> ># [1] "square"
> >f(11)
> ># [1] 121
> >
> > Hope this helps,
> >
> > H.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] random numbers with constraints

2021-01-28 Thread Abby Spurdle
I recognize the problems with global data.
And my code could certainly be improved.

However, I also note that the random numbers (ignoring
transformations) need to be constant while computing the rate.
Otherwise, my algorithm wouldn't work well.

As it is, rounding operations can cause "jumps".
Generating random numbers for each value of rate would make those jumps bigger.

I'm thinking a better solution would be to put the entire code inside
a single function.
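
For example, a self-contained sketch (with the random numbers
generated inside the function, from runif (n), as Martin suggested):

constrained.sample <- function (n, total, lim, interval = c (1, 2) )
{   u <- runif (n)
    u <- (u - min (u) ) / diff (range (u) )
    f <- function (rate)
    {   plim <- pexp (lim, rate)
        qexp (plim [1] + diff (plim) * u, rate)
    }
    rate <- uniroot (function (rate) sum (f (rate) ) - total, interval)$root
    f (rate)
}

q <- constrained.sample (410, 4200, c (9.6, 11.6) )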

Although, in saying all of that, the uniroot approach is not the best here.
A more "robust" root finding algorithm (or solver) would be far
better, but I'm not sure what that would mean, exactly...

Given the possible relevance of generating constrained samples, a
discussion on what a robust solver means, could be interesting...
Although, I'll leave it to you to agree or disagree, and if in
agreement, to suggest the best forum for such a discussion.


On Fri, Jan 29, 2021 at 4:42 AM Martin Maechler
 wrote:
>
> >>>>> Abby Spurdle
> >>>>> on Thu, 28 Jan 2021 08:48:06 +1300 writes:
>
> > I note that there's a possibility of floating point errors.
> > If all values have one digit after the decimal point, you could replace:
> > qexp (p, rate) with round (qexp (p, rate), 1).
>
> > However, sometimes uniroot will fail, due to problems with input.
>
> I also think the  constrained.sample() function should not
> depend on the global variable 'u',
> but rather depend on 'n'   and construct 'u' from  runif(n).
>
> Martin
>
> > On Thu, Jan 28, 2021 at 5:02 AM Denis Francisci
> >  wrote:
> >>
> >> Wonderful!
> >> This is exactly what I need!
> >> Thank you very much!!
> >>
> >> Denis
> >>
> >>
> >>
> >> Il giorno mer 27 gen 2021 alle ore 10:58 Abby Spurdle 
>  ha scritto:
> >>>
> >>> u <- runif (410)
> >>> u <- (u - min (u) ) / diff (range (u) )
> >>>
> >>> constrained.sample <- function (rate)
> >>> {   plim <- pexp (c (9.6, 11.6), rate)
> >>> p <- plim [1] + diff (plim) * u
> >>> qexp (p, rate)
> >>> }
> >>>
> >>> diff.sum <- function (rate)
> >>> sum (constrained.sample (rate) ) - 4200
> >>>
> >>> rate <- uniroot (diff.sum, c (1, 2) )$root
> >>> q <- constrained.sample (rate)
> >>>
> >>> length (q)
> >>> range (q)
> >>> sum (q)
> >>>
> >>>
> >>> On Wed, Jan 27, 2021 at 9:03 PM Denis Francisci
> >>>  wrote:
> >>> >
> >>> > Hi,
> >>> > I would like to generate random numbers in R with some constraints:
> >>> > - my vector of numbers must contain 410 values;
> >>> > - min value must be 9.6 and max value must be 11.6;
> >>> > - sum of vector's values must be 4200.
> >>> > Is there a way to do this in R?
> >>> > And is it possible to generate this series in such a way that it 
> follows a
> >>> > specific distribution form (for example exponential)?
> >>> > Thank you in advance,
> >>> >
> >>> > D.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] random numbers with constraints

2021-01-27 Thread Abby Spurdle
I note that there's a possibility of floating point errors.
If all values have one digit after the decimal point, you could replace:
qexp (p, rate) with round (qexp (p, rate), 1).

However, sometimes uniroot will fail, due to problems with input.

On Thu, Jan 28, 2021 at 5:02 AM Denis Francisci
 wrote:
>
> Wonderful!
> This is exactly what I need!
> Thank you very much!!
>
> Denis
>
>
>
> Il giorno mer 27 gen 2021 alle ore 10:58 Abby Spurdle  
> ha scritto:
>>
>> u <- runif (410)
>> u <- (u - min (u) ) / diff (range (u) )
>>
>> constrained.sample <- function (rate)
>> {   plim <- pexp (c (9.6, 11.6), rate)
>> p <- plim [1] + diff (plim) * u
>> qexp (p, rate)
>> }
>>
>> diff.sum <- function (rate)
>> sum (constrained.sample (rate) ) - 4200
>>
>> rate <- uniroot (diff.sum, c (1, 2) )$root
>> q <- constrained.sample (rate)
>>
>> length (q)
>> range (q)
>> sum (q)
>>
>>
>> On Wed, Jan 27, 2021 at 9:03 PM Denis Francisci
>>  wrote:
>> >
>> > Hi,
>> > I would like to generate random numbers in R with some constraints:
>> > - my vector of numbers must contain 410 values;
>> > - min value must be 9.6 and max value must be 11.6;
>> > - sum of vector's values must be 4200.
>> > Is there a way to do this in R?
>> > And is it possible to generate this series in such a way that it follows a
>> > specific distribution form (for example exponential)?
>> > Thank you in advance,
>> >
>> > D.
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide 
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to predict/interpolate new Y given knwon Xs and Ys?

2021-01-27 Thread Abby Spurdle
I got 16.60964.
Your curve is not linear up to the 39th point.
And as your points appear to be deterministic and nonlinear, splines
are likely to be easier to use.

Here's a base-only solution (if you don't like my kubik suggestion):

g <- splinefun (X, Y)
f <- function (x) g (x) - 6
uniroot (f, c (1, 45) )$root


On Wed, Jan 27, 2021 at 10:30 PM Luigi Marongiu
 wrote:
>
> Dear Jeff,
> I am not sure if I understood the procedure properly but it looks like it 
> works:
> ```
> Y <-c(1.301030,  1.602060,  1.903090,  2.204120,  2.505150,  2.806180,
>  3.107210,  3.408240,  3.709270,
> 4.010300,  4.311330,  4.612360,  4.913390,  5.214420,  5.515450,
> 5.816480,  6.117510,  6.418540,
> 6.719570,  7.020599,  7.321629,  7.622658,  7.923686,  8.224713,
> 8.525735,  8.826751,  9.127752,
> 9.428723,  9.729637, 10.030434, 10.330998, 10.631096, 10.930265,
> 11.227580, 11.521213, 11.807577,
> 12.079787, 12.325217, 12.523074, 12.647915, 12.693594, 12.698904,
> 12.698970, 12.698970, 12.698970)
> X <- 1:45
> plot(Y~X)
> raw_value <- predict(lm(X[1:39]~Y[1:39]), newdata = data.frame(Y=6))
> x <- unname(raw_value[!is.na(raw_value)]) # x= 16.62995
> points(x, 6, pch = 16)
> ```
> Here I used the points 1:39 because afterward there is a bend. But I
> am not clear why I need to use `lm(X~Y),` instead of `lm(Y~X)`.
> Thank you
>
> On Tue, Jan 26, 2021 at 10:20 AM Jeff Newmiller
>  wrote:
> >
> > model2 <- lm( x~y )
> > predict(model2, data.frame(y=26))
> >
> > model2 is however not the inverse of model... if you need that then you 
> > need to handle that some other way than using predict, such as an 
> > invertible monotonic spline (or in this case a little algebra).
> >
> > On January 26, 2021 1:11:39 AM PST, Luigi Marongiu 
> >  wrote:
> > >Hello,
> > >I have a series of x/y and a model. I can interpolate a new value of x
> > >using this model, but I get funny results if I give the y and look for
> > >the correspondent x:
> > >```
> > >> x = 1:10
> > >> y = 2*x+15
> > >> model <- lm(y~x)
> > >> predict(model, data.frame(x=7.5))
> > > 1
> > >30
> > >> predict(model, data.frame(y=26))
> > > 1  2  3  4  5  6  7  8  9 10
> > >17 19 21 23 25 27 29 31 33 35
> > >Warning message:
> > >'newdata' had 1 row but variables found have 10 rows
> > >> data.frame(x=7.5)
> > >x
> > >1 7.5
> > >> data.frame(y=26)
> > >   y
> > >1 26
> > >```
> > >what is the correct syntax?
> > >Thank you
> > >Luigi
> > >
> > >__
> > >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >https://stat.ethz.ch/mailman/listinfo/r-help
> > >PLEASE do read the posting guide
> > >http://www.R-project.org/posting-guide.html
> > >and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> > Sent from my phone. Please excuse my brevity.
>
>
>
> --
> Best regards,
> Luigi
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] random numbers with constraints

2021-01-27 Thread Abby Spurdle
u <- runif (410)
u <- (u - min (u) ) / diff (range (u) )

constrained.sample <- function (rate)
{   plim <- pexp (c (9.6, 11.6), rate)
p <- plim [1] + diff (plim) * u
qexp (p, rate)
}

diff.sum <- function (rate)
sum (constrained.sample (rate) ) - 4200

rate <- uniroot (diff.sum, c (1, 2) )$root
q <- constrained.sample (rate)

length (q)
range (q)
sum (q)


On Wed, Jan 27, 2021 at 9:03 PM Denis Francisci
 wrote:
>
> Hi,
> I would like to generate random numbers in R with some constraints:
> - my vector of numbers must contain 410 values;
> - min value must be 9.6 and max value must be 11.6;
> - sum of vector's values must be 4200.
> Is there a way to do this in R?
> And is it possible to generate this series in such a way that it follows a
> specific distribution form (for example exponential)?
> Thank you in advance,
>
> D.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to find when a value is reached given a function?

2021-01-25 Thread Abby Spurdle
You could use a spline to interpolate the points.
(And I'd consider increasing the number of points if possible, say to 200).

Then use a root finder, such as uniroot(), to solve for
f(i) - k
Where k (a constant) would be 1e6, based on your example.

There are a number of variations on this approach.
My kubik package provides a solve method, and can impose some constraints.


library (kubik)
f <- chs (1:45, round (PCR_array),
constraints = chs.constraints (increasing=TRUE) )
plot (f)

sol <- solve (f, 1e6)
abline (v=sol, lty=2)
sol


Note that I had to round the values, in order to impose a
non-decreasing constraint.

Also note that I've just used the 45 points.
But re-iterating, you should increase the number of points, if possible.
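
A base-only sketch of the same idea (assuming PCR_array as computed in
the quoted code, and that the interpolated curve is effectively
monotone over the interval):

g <- splinefun (1:45, PCR_array)
uniroot (function (i) g (i) - 1e6, c (1, 45) )$root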


On Mon, Jan 25, 2021 at 8:58 AM Luigi Marongiu  wrote:
>
> Hello
> I am trying to simulate a PCR by running a logistic equation. So I set
> the function:
> ```
> PCR <- function(initCopy, dupRate, Carry) {
>   ROI_T = initCopy
>   A = array()
>   for (i in 1:45) {
> ROI_TplusOne <- ROI_T * dupRate * (1 - ROI_T/Carry)
> A[i] <- ROI_TplusOne
> ROI_T <- ROI_TplusOne
>   }
>   return(A)
> }
> ```
> Which returns an array that follows the logistic shape, for instance
> ```
> d <- 2
> K <- 10^13
> A_0 <- 1
> PCR_array <- PCR(A_0, d, K)
> plot(PCR_array)
> ```
> Given the formula `ROI_TplusOne <- ROI_T * dupRate * (1 -
> ROI_T/Carry)`, is it possible to determine at what time point `i` a
> given threshold is reached? For instance, what fractional value of i
> returns 1000 000 copies?
> Thank you
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot histogram and lognormal fit on the same time

2021-01-22 Thread Abby Spurdle
Sorry, Bert.
The fitdistr function estimates parameters via maximum likelihood.
(i.e. The "lognormal" part of this, is not a kernel).


On Fri, Jan 22, 2021 at 5:14 AM Bert Gunter  wrote:
>
> In future, you should try to search before posting. I realize that getting
> good search terms can sometimes be tricky, but not in this case: 'plot
> density with histogram in R' or similar on rseek.org or just on your usual
> search platform brought up several ways to do it.
>
> As a (slightly offtopic) side issue, be careful with this: both histograms
> and smooth density estimates are just that: density **estimates** that
> depend on both the data and various parameters, e.g. the location of the
> bins for histograms and kernel for kernel density estimates. You may
> therefore get plots where the two do not appear to "match" when you use the
> defaults for each.
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Thu, Jan 21, 2021 at 3:54 AM Eric Leroy  wrote:
>
> > Hi,
> >
> > I would like to plot the histogram of data and fit it with a lognormal
> > distribution.
> >
> > The ideal, would be to superimpose the fit on the histogram and write
> > the results of the fit on the figure.
> >
> > Right now, I was able to plot the histogram and fit the density with a
> > lognormal, but I can't combine all together.
> >
> > Here is the code I wrote :
> >
> > histdata <- hist(dataframe$data)
> >
> > library(MASS)
> >
> > fitdistr(histdata$density, "lognormal")
> >
> > Can you help me ?
> >
> > Best regards,
> >
> > --
> >
> > *Eric Leroy*
> >
> > /Responsable de la plateforme microscopie électronique/
> >
> > /Correspondant Sécurité des Systèmes Informatiques de l'ICMPE/
> >
> > ICMPE - UMR 7182 - CNRS - UPEC
> >
> > 2/8, rue Henri Dunant
> >
> > 94320 Thiais
> >
> > Tél : 01.49.78.12.09
> >
> > Fax : 01.49.78.12.03
> >
> > courriel : le...@icmpe.cnrs.fr 
> >
> > Page Web : : http://www.icmpe.cnrs.fr 
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] package not available for R version...

2021-01-12 Thread Abby Spurdle
It's not clear from your post why you're trying to contact the maintainer.
But it gives the impression you're trying to contact the maintainer
of an archived package, because you can't install the package.
It's not their responsibility to respond to these kinds of questions.

Also, I note the most obvious choice for GAM-based models in R, is
mgcv, which ships with R.
But there are other packages, if mgcv doesn't meet your needs.
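
That said, archived packages can usually still be installed from the
CRAN archive. A sketch (the version number is a placeholder; check the
listing at https://cran.r-project.org/src/contrib/Archive/rgam/):

install.packages ("remotes")
remotes::install_version ("rgam", version="0.8.1")   #placeholder version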


On Wed, Jan 13, 2021 at 8:44 AM varin sacha via R-help
 wrote:
>
> Dear R-experts,
>
> I have tried to reach the maintainer of the rgam package. Until now, no 
> response.
> Since I'm in a bit of a hurry, I try to reach you because as I try to install 
> the rgam package using the command :
>
> install.packages("rgam")
>
> I get this warning message :
>
> Warning message:
>
> package ‘rgam’ is not available (for R version 3.6.3)
>
> The strange thing is that rgam depends R (>= 3.0.2) , stats
>
> How can I solve this problem ?
>
> Best,
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Does anyone have any use for this?

2021-01-01 Thread Abby Spurdle
And it was supposed to say billions.

plt (main="Monthly NZ GDP (Billions)")


On Sat, Jan 2, 2021 at 2:32 PM Abby Spurdle  wrote:
>
> I'm not enthusiastic about nonstandard evaluation and allowing
> functions to change state data.
> Currently, I use some of this in my own packages, but I'm planning to
> remove most of it.
>
> But I did have some fun with your function.
>
> --
> plt <- memify (plot)
>
> x <- 1:12
> y1 <- seq (0, 18,, 12)
> y2 <- c (
> 16.88, 16.04, 13.23, 13.88, 11.85, 9.61,
>  9.28,  5.81,  7.52,  3.40,  3.37, 0.07)
>
> #test 1
> plt (x, y1, type="l")
> #test 2
> plt (ylim = c (18, 0) )
>
> #important econometric timeseries analysis
> plt (y=y2, main="Monthly NZ GDP (Millions)")
> --
>
> Note:
> This data is not accurate.
>
>
> On Sat, Jan 2, 2021 at 9:20 AM Bert Gunter  wrote:
> >
> > Hi all:
> >
> > In the course of playing around with other issues, I wrote the following
> > little function that allows functions to keep state
> > and easily replay and update their state.(see comments after):
> >
> > memify <- function(f)
> > {
> >if (is.primitive(f)) {
> >   cat("Cannot memify primitive functions.\n")
> >   return(NULL)
> >}
> >if (!inherits(f, "function"))
> >   stop("Argument must inherit from class 'function'")
> >arglist <- list()
> >structure(
> >   function(...) {
> >  m <- tryCatch(
> > as.list(match.call(f)[-1]),
> > error = function(e) {
> >warning("Bad function call; cannot update arguments\n")
> >return(NULL)
> > }
> >  )
> >  nm <- names(m)
> >  hasname <- nm != "" #logical index of named values
> >  if (any(hasname)) {
> > if (anyDuplicated(nm, incomparables = ""))
> >warning("Duplicated names in call; only the first will be
> > used.")
> > arglist <<- modifyList(arglist, m[hasname]) ## this is what
> > does the job
> >  }
> >  do.call(f, modifyList(m, arglist))
> >   },
> >   class = c("memified", class(f)))
> > }
> >
> > Examples:
> >
> >  x <- 1:9; y <- runif(9)
> >  plt <- memify(plot)
> >  x <- 1:9; y <- runif(9)
> >  plt(x,y, col = "blue")  ## plt "remembers" these arguments; i.e. keeps
> > state
> >  plt( type = "b") ## all other arguments as previous
> >  plt(col = "red") ## ditto
> >
> > So my question is: Beyond allowing one to easily change/add argument values
> > and replay when there are lots of arguments floating around, which we often
> > use an IDE's editor to do, is there any real use for this? I confess that,
> > like Pygmalion, I have been charmed by this idea, but it could well be
> > useless, By all means feel free to chastise me if so.
> >
> > 1. I am aware that many functions already have "update" methods to "update"
> > their results without re-entering all arguments -- e.g. lattice graphics,
> > glm, etc.
> > 2. Several packages -- rlang and R6 anyway -- but very likely more, do this
> > sort of thing and way more; the price is added complexity, of course.
> > 3. I realize also that keeping state would be a bad idea in many
> > circumstances, e.g. essentially changing documented defaults.
> >
> > Reply privately to praise or bury if you do not think this is of any
> > interest to readers of this list. Publicly is fine, too. If it's dumb it's
> > dumb.
> >
> > Cheers and best wishes for a better new year for us all,
> >
> > Bert Gunter
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Does anyone have any use for this?

2021-01-01 Thread Abby Spurdle
I'm not enthusiastic about nonstandard evaluation and allowing
functions to change state data.
Currently, I use some of this in my own packages, but I'm planning to
remove most of it.

But I did have some fun with your function.

--
plt <- memify (plot)

x <- 1:12
y1 <- seq (0, 18,, 12)
y2 <- c (
16.88, 16.04, 13.23, 13.88, 11.85, 9.61,
 9.28,  5.81,  7.52,  3.40,  3.37, 0.07)

#test 1
plt (x, y1, type="l")
#test 2
plt (ylim = c (18, 0) )

#important econometric timeseries analysis
plt (y=y2, main="Monthly NZ GDP (Millions)")
--

Note:
This data is not accurate.


On Sat, Jan 2, 2021 at 9:20 AM Bert Gunter  wrote:
>
> Hi all:
>
> In the course of playing around with other issues, I wrote the following
> little function that allows functions to keep state
> and easily replay and update their state.(see comments after):
>
> memify <- function(f)
> {
>if (is.primitive(f)) {
>   cat("Cannot memify primitive functions.\n")
>   return(NULL)
>}
>if (!inherits(f, "function"))
>   stop("Argument must inherit from class 'function'")
>arglist <- list()
>structure(
>   function(...) {
>  m <- tryCatch(
> as.list(match.call(f)[-1]),
> error = function(e) {
>warning("Bad function call; cannot update arguments\n")
>return(NULL)
> }
>  )
>  nm <- names(m)
>  hasname <- nm != "" #logical index of named values
>  if (any(hasname)) {
> if (anyDuplicated(nm, incomparables = ""))
>warning("Duplicated names in call; only the first will be
> used.")
> arglist <<- modifyList(arglist, m[hasname]) ## this is what
> does the job
>  }
>  do.call(f, modifyList(m, arglist))
>   },
>   class = c("memified", class(f)))
> }
>
> Examples:
>
>  x <- 1:9; y <- runif(9)
>  plt <- memify(plot)
>  x <- 1:9; y <- runif(9)
>  plt(x,y, col = "blue")  ## plt "remembers" these arguments; i.e. keeps
> state
>  plt( type = "b") ## all other arguments as previous
>  plt(col = "red") ## ditto
>
> So my question is: Beyond allowing one to easily change/add argument values
> and replay when there are lots of arguments floating around, which we often
> use an IDE's editor to do, is there any real use for this? I confess that,
> like Pygmalion, I have been charmed by this idea, but it could well be
> useless, By all means feel free to chastise me if so.
>
> 1. I am aware that many functions already have "update" methods to "update"
> their results without re-entering all arguments -- e.g. lattice graphics,
> glm, etc.
> 2. Several packages -- rlang and R6 anyway -- but very likely more, do this
> sort of thing and way more; the price is added complexity, of course.
> 3. I realize also that keeping state would be a bad idea in many
> circumstances, e.g. essentially changing documented defaults.
>
> Reply privately to praise or bury if you do not think this is of any
> interest to readers of this list. Publicly is fine, too. If it's dumb it's
> dumb.
>
> Cheers and best wishes for a better new year for us all,
>
> Bert Gunter
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finance & R

2020-12-25 Thread Abby Spurdle
I'm not sure what the official status of Rmetrics is.
However, as far as I can see most of the Rmetrics-based R packages are
on CRAN, and are currently maintained.
(e.g. fBasics, fPortfolio, fAssets, fTrading).

I found a list here:
https://r-forge.r-project.org/R/?group_id=156


On Fri, Dec 25, 2020 at 10:47 PM Eric Berger  wrote:
>
> Hi Ben,
>
> Abby makes a good suggestion on Rmetrics, although it is no longer current in 
> CRAN.
> The www.rmetrics.org site looks quite interesting (although not clear whether 
> there are any recent additions.)
>
> The suggestion did bring to mind another excellent resource: quantlib - a 
> library for computational finance.
> See https://www.quantlib.org
>
> The quantlib library is written in C++ and can be accessed directly via C++.
> It is also possible (and easy) to access quantlib from R, via the package 
> RQuantLib (written by the incredible Dirk Eddelbuettel.)
>
> Best,
> Eric
>
>
>
> On Thu, Dec 24, 2020 at 10:34 PM Abby Spurdle  wrote:
>>
>> Dear All,
>>
>> One of the most significant contributors to open source finance is:
>>
>> Diethelm Würtz
>> https://comp.phys.ethz.ch/news-and-events/nc/2016/08/in-memoriam-diethelm-wuertz.html
>>
>> And on that note, I'd like to wish Merry Christmas to a great
>> mathematician and programmer, and his family.
>>
>> A quick search, revealed the following hits:
>>
>> https://en.wikipedia.org/wiki/Rmetrics
>> https://www.rmetrics.org
>>
>>
>> B.
>>
>>
>> On Thu, Dec 24, 2020 at 6:58 AM Ben van den Anker via R-help
>>  wrote:
>> >
>> >
>> >
>> > Hello everyone,
>> > Could anyonre recommend some good resources for finance applications in R? 
>> > I find the packages quantmod, TTR and PerformanceAnalytics a bit outdated. 
>> > There must be something more recent on the market. Any suggestions will be 
>> > much appreciated!
>> > Cheers,
>> > Ben van den Anker
>> >
>> >
>> >
>> >
>> >
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide 
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finance & R

2020-12-24 Thread Abby Spurdle
Dear All,

One of the most significant contributors to open source finance is:

Diethelm Würtz
https://comp.phys.ethz.ch/news-and-events/nc/2016/08/in-memoriam-diethelm-wuertz.html

And on that note, I'd like to wish Merry Christmas to a great
mathematician and programmer, and his family.

A quick search, revealed the following hits:

https://en.wikipedia.org/wiki/Rmetrics
https://www.rmetrics.org


B.


On Thu, Dec 24, 2020 at 6:58 AM Ben van den Anker via R-help
 wrote:
>
>
>
> Hello everyone,
> Could anyonre recommend some good resources for finance applications in R? I 
> find the packages quantmod, TTR and PerformanceAnalytics a bit outdated. 
> There must be something more recent on the market. Any suggestions will be 
> much appreciated!
> Cheers,
> Ben van den Anker
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with simulation of unbalanced clustered data

2020-12-16 Thread Abby Spurdle
Hi Chao Liu,

I'm having difficulty following your question and examples.
And also, I don't see the motivation for increasing, then decreasing,
the sample sizes.
Intuitively, one would compute the correct sample sizes the first time round...

But I thought I'd add some comments, just in case they're useful.

If the problem relates to memberships (in clusters), then the problem
can be simplified.
All one needs is an integer vector, where each value is the index of
the cluster.

To compute random memberships of 600 observations in 20 clusters, one could run:

m <- sample (1:20, 600, TRUE)

To compute the number of observations per cluster, one could then run:

table (m)

In the above code, the probability of an observation being assigned to
each cluster, is uniform.
Non-uniform sampling can be achieved by supplying a 4th argument to
the sample function, which is a numeric vector of weights.
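
For example (with arbitrary weights; the sample function normalizes
them internally):

w <- runif (20)
m <- sample (1:20, 600, TRUE, prob=w)
table (m)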


On Wed, Dec 16, 2020 at 10:08 PM Chao Liu  wrote:
>
> Dear R experts,
>
> I want to simulate some unbalanced clustered data. The number of clusters
> is 20 and the average number of observations is 30. However, I would like
> to create an unbalanced clustered data per cluster where there are 10% more
> observations than specified (i.e., 33 rather than 30). I then want to
> randomly exclude an appropriate number of observations (i.e., 60) to arrive
> at the specified average number of observations per cluster (i.e., 30). The
> probability of excluding an observation within each cluster was not uniform
> (i.e., some clusters had no cases removed and others had more excluded).
> Therefore in the end I still have 600 observations in total. How to realize
> that in R? Thank you for your help!
>
> Best,
>
> Liu
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] graphics::rasterImage and grid::grid.raster, doubly upside down images

2020-12-05 Thread Abby Spurdle
Dear list,

I've been writing R-based software for image and spatial data. And
I've decided to support a dual implementation with both base graphics
and grid graphics, and the possibility of other graphics systems
later.

Also, I've decided to plot raster images with the vertical axis
flipped. Which means that each image also needs to be flipped.
(In rasterImage, ybottom/ytop are swapped, and in grid.raster, height
is negative).

This appears to work fine.
However, the help files don't explicitly say that height/etc can be
negative. And then go on to warn users about possible problems.

Given that both functions support scaling, I'm assuming that negative
scaling should be fine. But I thought I'd run it through the list, for
feedback/suggestions on any possible problems that one is likely to
encounter plotting raster images like this.

Examples follow:
(Requires the png package).

#required packages
library (png)
library (grid)

#base graphics wrapper function
base.wrapper <- function (x, y, colm, interpolate)
{   rasterImage (colm, x [1], y [2], x [2], y [1],
interpolate=interpolate)
}

#grid graphics wrapper function
grid.wrapper <- function (x, y, colm, interpolate)
{   grid.raster (colm, mean (x), mean (y), x [2] - x [1], y [1] - y [2],
default.units="native", interpolate=interpolate)
}

#create raster object
im <- readPNG (system.file ("img", "Rlogo.png", package="png") )
im <- as.raster (im)

#base graphics example
plot.new ()
plot.window (0:1, 1:0)
base.wrapper (0:1, 0:1, im, FALSE)
axis (2)

#grid graphics example
grid.newpage ()
pushViewport (viewport (xscale=0:1, yscale=1:0, width=0.75, height=0.75) )
grid.wrapper (0:1, 0:1, im, FALSE)
grid.yaxis ()

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] calling r from java

2020-11-27 Thread Abby Spurdle
Hi Eduard,

> Now I developed a service that executes Rscript (Using ProcessBuilder),
> sends text to stdin of the process and reads from stdout of the
> process.

This doesn't answer your question, but may be relevant.
I have a java-based application that works on a similar principle.
(The code is horrendously bad and most of it should be thrown away).

I'm planning to write a terminal emulator, in the near future.
(Currently, second place on my top-level todo list).

The primary objective is to build object-oriented and message passing
APIs on top of the core terminal emulation system, with at least some
cross platform functionality.
Noting that, in my opinion, the cross-platform aspect is more
important for R, than in many other IPC topics.

Hence, I'm interested in hearing "wishlist" items for such APIs.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] the invisible question

2020-11-22 Thread Abby Spurdle
If I copy and paste into Consulus (my Java/Swing-based software), I get a square.
(Screenshot, attached).

But the interesting thing is that there's a different result when
running the code via the source function (with the defaults, anyway),
versus piping it in.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] the invisible question

2020-11-22 Thread Abby Spurdle
I have no idea what everyone's talking about.
What invisible character?

The black triangle with a question mark renders fine in (my) gmail.
And it's a unicode character used when there was a problem reading
(presumably text) data.

https://en.wikipedia.org/wiki/Specials_(Unicode_block)

If I copy and paste it into R, it's replaced by a regular question mark.


On Mon, Nov 23, 2020 at 12:18 PM Jim Lemon  wrote:
>
> Hi again,
> I finally trapped the character (hexadecimal C2 capital A hat) with
> some low trickery.
>
> Jim
>
> On Mon, Nov 23, 2020 at 10:07 AM Jim Lemon  wrote:
> >
> > Hi all,
> > I have encountered a problem in some emails with invisible characters
> > in code snippets that throw an error when pasted into the terminal
> > window:
> >
> > Error: unexpected input in:
> > "star_wars_matrix<-matrix(box_office,nrow=3,byrow=TRUE,
> > �"
> >
> > This morning, after two hours of intense frustration, I found that I
> > could reproduce this by typing code containing the above line into
> > Kwrite, the text editor I use. When I pasted the code from Kwrite, it
> > worked in the R terminal window. When I pasted it into a gmail message
> > (set to Plain Text), copied that and pasted it into the R terminal
> > window, I got the error. I was helping a young friend with one of
> > those introductory R courses, but I have had this happen with other
> > requests on the R help list. I thought that it was the fault of the
> > fancy formatting in some emails, and could grumpily blame that, but
> > this is worse as I can't seem to edit it out. Anybody else had this
> > problem? I have sent a message to Gmail help.
> >
> > Jim
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Inappropriate color name

2020-11-19 Thread Abby Spurdle
> Surely these colors can be changed
> to something less offensive- my suggestion is "blush."
> How can I find out who to contact about making this happen?

Yes, they can.

blush <- "#CD5C5C"
mycols <- function () { #your code here...

I note that:

(1)
Changing existing code (esp in base packages) creates considerable problems.
Because code (not just CRAN packages) may depend on whatever is being changed.

(2)
The word "Blush" may also offend someone.
Blush is a type of makeup, often tested on animals.
Hence, we go around in a circle.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] analyzing results from Tuesday's US elections

2020-11-16 Thread Abby Spurdle
I've come to the conclusion this whole thing was a waste of time.
This is after evaluating much of the relevant information.

The main problem is a large number of red herrings (some in the data,
some in the context), leading to pointless data analysis and pointless
data collection.
It's unlikely that sophisticated software, or sophisticated
statistical modelling tools will make any difference.
Although pretty plots, and pretty web-graphics are achievable.

Sorry list, for encouraging this discussion...

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] analyzing results from Tuesday's US elections

2020-11-15 Thread Abby Spurdle
I've updated the dataset.
(Which now includes turnout and population estimates).

Also, I've found some anomalous features in the data.
(Namely, more "straight lines" than what I would intuitively expect).

The dataset/description are on my website.
(Links at bottom).


#set PATH as required

data <- read.csv (PATH, header=TRUE)
head (data, 3)

I took a subset, where the Dem/Rep margins have reversed between the
2016 and 2020 elections.

rev.results <- (sign (data$RMARGIN_2016) + sign (data$RMARGIN_2020) == 0)
data2 <- data [data$SUBSV1 != 1 & rev.results,]
sc <- paste (data2$STATE, data2$EQCOUNTY, sep=": ")
head (data2, 3)

Then created two plots, attached.
(1) Republican margin vs voter turnout.
(2) Republican margin vs log (number of votes).

In both cases, there are near-straight lines.
Re-iterating, more than what I would intuitively expect.

library (probhat)

plot1 <- function ()
{   x <- with (data2, cbind (x1=RMARGIN_2020, x2=TURNOUT_2020) )
plot (pdfmv.cks (x, smoothness = c (1, 1) ), contours=FALSE,
hcv=TRUE, n=80,
xlim = c (-2.5, 10), ylim = c (40, 52.5),
main="US Counties\n(with reversed results, over 2016/2020
elections)",
xlab="Republican Margin, 2020", ylab="Voter Turnout, 2020")
points (x, pch=16, col="#00")
abline (v=0, h=50, lty=2)

I1 <- (sc == "Colorado: Alamosa" | sc == "Georgia: Burke" | sc
== "Ohio: Lorain")
I2 <- (sc == "South Carolina: Clarendon" | sc == "Ohio: Mahoning")
sc [! (I1 | I2)] <- ""

k <- lm (TURNOUT_2020 ~ RMARGIN_2020, data = data2 [I1,])$coef
abline (a = k [1], b = k [2])

points (x [I1 | I2,], col="white")
text (x [,1] + 0.2, x [,2], sc, adj = c (0, 0.5) )
}

plot2 <- function ()
{   x <- with (data2, cbind (x1=RMARGIN_2020, x2 = log (NVOTES_2020) ) )
plot (pdfmv.cks (x, smoothness = c (1, 1) ), contours=FALSE,
hcv=TRUE, n=80,
xlim = c (-2.5, 35),
main="US Counties\n(with reversed results, over 2016/2020
elections)",
xlab="Republican Margin, 2020", ylab="log (Number of Votes), 2020")
points (x, pch=16, col="#00")
abline (v=0, lty=2)

sc <- paste (data2$STATE, data2$EQCOUNTY, sep=": ")
I1 <- (sc == "Texas: Kenedy")
I2 <- (sc == "Texas: Reeves" | sc == "New York: Rockland")

k <- lm (log (NVOTES_2020) ~ RMARGIN_2020, data = data2 [I1 | I2,])$coef
abline (a = k [1], b = k [2])

points (x [I1 | I2,], col="white")
text (x [I1, 1] - 0.5, x [I1, 2], sc [I1], adj = c (1, 0.5) )
text (x [I2, 1] + 0.5, x [I2, 2], sc [I2], adj = c (0, 0.5) )
}

plot1 ()
plot2 ()

https://sites.google.com/site/spurdlea/us_election_2020
https://sites.google.com/site/spurdlea/exts/election_results_2.txt


On Sun, Nov 15, 2020 at 8:51 AM Rolf Turner  wrote:
>
>
> On Fri, 13 Nov 2020 19:02:19 -0800
> Jeff Newmiller  wrote:
>
> > It was explained in the video... his counts were so small that they
> > spanned the 1-9 and 10-99 ranges.
>
> Sorry, missed that.  I'll have to watch the video again.
>
> Thanks.
>
> cheers,
>
> Rolf
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] analyzing results from Tuesday's US elections

2020-11-09 Thread Abby Spurdle
RESENT
INITIAL EMAIL, TOO BIG
ATTACHMENTS REPLACED WITH LINKS

I created a dataset, linked.
Had to manually copy and paste from the NY Times website.

> head (data, 3)
STATE   EQCOUNTY   RMARGIN_2016 RMARGIN_2020 NVOTERS_2020 SUB_STATEVAL_2016
1 Alabama Mobile       13.3  12 181783 0
2 Alabama Dallas      -37.5 -38  17861 0
3 Alabama Tuscaloosa   19.3  15  89760 0

> tail (data, 3)
   STATE EQCOUNTY RMARGIN_2016 RMARGIN_2020 NVOTERS_2020 SUB_STATEVAL_2016
4248 Wyoming Uinta    58.5   63 9400 0
4249 Wyoming Sublette 63.0   62 4970 0
4250 Wyoming  Johnson 64.3   61 4914 0

> head (data [data [,1] == "Alaska",], 3)
STATE EQCOUNTY RMARGIN_2016 RMARGIN_2020 NVOTERS_2020 SUB_STATEVAL_2016
68 Alaska ED 40 14.7 -24.0  82 1
69 Alaska ED 37 14.7  -1.7 173 1
70 Alaska ED 38 14.7  -0.4 249 1

EQCOUNTY is the county or equivalent.
Several jurisdictions (D.C., Alaska, Connecticut, Maine,
Massachusetts, Rhode Island and Vermont) are different.
RMARGIN(s) are the Republican percentages minus the Democrat
percentages, as signed numbers between -100 and 100.
The last column is 0s or 1s, with 1s for Alaska, Connecticut, Maine,
Massachusetts, Rhode Island and Vermont, where I didn't have the 2016
margins, so the 2016 margins have been replaced with state-level
values.

Then I scaled the margins, based on the number of voters.
i.e.
wx2016 <- 1000 * x2016 * nv / max.nv
(Where x2016 is equal to RMARGIN_2016, and nv is equal to NVOTERS_2020).

There may be a much better way.

And came up the following plots (linked) and output (follows):

---INPUT---
PATH = ""
data = read.csv (PATH, header=TRUE)

#raw data
x2016 <- as.numeric (data$RMARGIN_2016)
x2020 <- as.numeric (data$RMARGIN_2020)
nv <- as.numeric (data$NVOTERS_2020)
subs <- as.logical (data$SUB_STATEVAL)

#computed data
max.nv <- max (nv)
wx2016 <- 1000 * x2016 * nv / max.nv
wx2020 <- 1000 * x2020 * nv / max.nv
diffs <- wx2020 - wx2016

OFFSET <- 500
p0 <- par (mfrow = c (2, 2) )

#plot 1
plot (wx2016, wx2020,
main="All Votes\n(By County, or Equivalent)",
xlab="Scaled Republican Margin, 2016", ylab="Scaled Republican Margin, 2020")
abline (h=0, v=0, lty=2)

#plot 2
OFFSET <- 200
plot (wx2016, wx2020,
xlim = c (-OFFSET, OFFSET), ylim = c (-OFFSET, OFFSET),
main="All Votes\n(Zoomed In)",
xlab="Scaled Republican Margin, 2016", ylab="Scaled Republican Margin, 2020")
abline (h=0, v=0, lty=2)

OFFSET <- 1000

#plot 3
J1 <- order (diffs, decreasing=TRUE)[1:400]
plot (wx2016 [J1], wx2020 [J1],
xlim = c (-OFFSET, OFFSET), ylim = c (-OFFSET, OFFSET),
main="400 Biggest Shifts Towards Republican",
xlab="Scaled Republican Margin, 2016", ylab="Scaled Republican Margin, 2020")
abline (h=0, v=0, lty=2)
abline (a=0, b=1, lty=2)

#plot 4
J2 <- order (diffs)[1:400]
plot (wx2016 [J2], wx2020 [J2],
xlim = c (-OFFSET, OFFSET), ylim = c (-OFFSET, OFFSET),
main="400 Biggest Shifts Towards Democrat",
xlab="Scaled Republican Margin, 2016", ylab="Scaled Republican Margin, 2020")
abline (h=0, v=0, lty=2)
abline (a=0, b=1, lty=2)

par (p0)

#most democrat
I = order (wx2020)[1:30]
cbind (data [I,], scaled.dem.vote = -1 * wx2020 [I])

#biggest move toward democrat
head (cbind (data [J2,], diffs = diffs [J2]), 30)

---OUTPUT---
#most democrat
> cbind (data [I,], scaled.dem.vote = -1 * wx2020 [I])
     STATE        EQCOUNTY      RMARGIN_2016 RMARGIN_2020 NVOTERS_2020 SUB_STATEVAL_2016 scaled.dem.vote
229  California   Los Angeles   -49.3 -44 3674850 0 44000.000
769  Illinois     Cook          -53.1 -47 1897721 0 24271.164
4073 Washington   King          -48.8 -53 1188152 0 17135.953
3092 Pennsylvania Philadelphia  -67.0 -63  701647 0 12028.725
215  California   Alameda       -63.5 -64  625710 0 10897.163
227  California   Santa Clara   -52.1 -49  726186 0  9682.875
238  California   San Diego     -19.7 -23 1546144 0  9676.942
2683 New York     Brooklyn      -62.0 -49  693937 0  9252.871
2162 Minnesota    Hennepin      -34.9 -43  753716 0  8819.350
2074 Michigan     Wayne         -37.1 -37  863382 0  8692.908
2673 New York     Manhattan     -76.9 -70  446861 0  8511.986
221  California   San Francisco -75.2 -73  413642 0  8216.898
3495 Texas

Re: [R] analyzing results from Tuesday's US elections

2020-11-08 Thread Abby Spurdle
> such a repository already exists -- the NY Times, AP, CNN, etc. etc. already 
> have interactive web pages that did this

I've been looking for presidential election results, by ***county***.
I've found historic results, including results for 2016.

However, I can't find such a dataset, for 2020.
(Even though this seems like an obvious thing to publish).

I suspect that the NY Times has the data, but I haven't been able to
work where the data is on their website, or how to access it.

More ***specific*** suggestions would be appreciated...?

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] analyzing results from Tuesday's US elections

2020-11-07 Thread Abby Spurdle
> What can you tell me about plans to analyze data from this year's
> general election, especially to detect possible fraud?

I was wondering if there's any R packages with out-of-the-box
functions for this sort of thing.
Can you please let us know, if you find any.

> I might be able to help with such an effort.  I have NOT done
> much with election data, but I have developed tools for data analysis,
> including web scraping, and included them in R packages available on the
> Comprehensive R Archive Network (CRAN) and GitHub.[1]

Do you have a URL for detailed election results?
Or even better, a nice R-friendly CSV file...

I recognize that the results aren't complete.
And that such a file may need to be updated later.
But that doesn't necessarily prevent modelling now.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R: sim1000G

2020-11-01 Thread Abby Spurdle
Hi Berina,

I'm not an expert on genetics.
I haven't looked at the package.
And I've only glanced at your question.
So, this is probably not the best response.

But as no one else has responded, here's some comments:

(1)

Have you checked if there's a function in the package to do what you want?
The remainder of these questions assume that you have, and the answer is no.

(2)

I'm having some difficulty following the sample size.
The initial code sets an explicit value to 3000.
But if I'm following it correctly, the object that's returned has 2000
rows, containing two 1000 row groups.
But then your question implies you want 48?

Given that the sample already contains two groups, are they relevant
to the sample you're trying to produce?
And are you wanting to take a small sample of 48 from a larger sample
of 2000, or something else?

(3)

Are the observations (not sure if that's the correct term here) in
each 1000 row group, statistically independent?
If they are, then taking a smaller sample should be relatively simple.
If they're not, then this is a much more complex question, that's
probably off-topic.

(4)

What exactly is in the data?
i.e. Could you call the str() or head() functions on the data, and
show us the results?

(5)

Is there a boolean-style variable in the data, indicating whether each
row is causal or non-causal?
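
Regarding (3): if the rows are independent, taking a smaller sample is
one line. A minimal sketch, assuming X is the 2000-row object returned
by data_sim in the quoted code (1000 case rows, then 1000 control rows):

X.small <- X [sample (nrow (X), 48),]

#or, keeping cases and controls balanced (24 from each group)
X.small <- X [c (sample (1:1000, 24), sample (1001:2000, 24) ),]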


B.


On Fri, Oct 30, 2020 at 10:37 PM Berina Zametica UNI
 wrote:
>
> Hi,
>
> I am using the sim1000G R package to simulate data for case/control study.
> I can not figure out how to manipulate this code to be able to generate 10%
> or 50% causal SNPs in R.
>
> This is whole code provided as example on GitHub:
>
> library(sim1000G)
> vcf_file = "region-chr4-357-ANK2.vcf.gz" #nvariants = 442, ss=1000
>
> vcf = readVCF( vcf_file, maxNumberOfVariants = 442  ,min_maf =
> 0.0005,max_maf = 0.01) #lowest MAF
> dim( vcf$gt1 ) #rows represent number of variants, columns represent
> number of individuals
>
> ## Download and use full chromosome genetic map
> downloadGeneticMap(4)
> readGeneticMap(4)
>
> sample.size=3000
>
> startSimulation(vcf, totalNumberOfIndividuals = sample.size)
>
> data_sim = function(seed.num){
>
>   SIM$reset()
>
>   id = generateUnrelatedIndividuals(sample.size)
>
>   gt = retrieveGenotypes(id)
>
>
>   freq = apply(gt,2,sum)/(2*nrow(gt))
>   causal = sample(setdiff(1:ncol(gt),which(freq==0)),45)
>
>   beta.sign = rep(1,45)
>   c.value = 0.402
>   beta.abs = c.value*abs(log10(freq[causal]))
>   beta.val = beta.sign*beta.abs
>   x.bar = apply(gt[,causal],2,mean)
>   x.bar = as.matrix(x.bar)
>   beta.val = t(as.matrix(beta.val))
>   #disease prvalance = 1%
>   #beta0 = -log(99)-beta.val %*% x.bar
>   #disease prvalance = 1.5%
>   beta0 = 0-beta.val %*% x.bar
>
>   eta = beta.val %*% t(gt[,causal])
>   eta = as.vector(eta) + rep(beta0,nrow(gt))
>   prob = exp(eta)/(1+exp(eta))
>
>   genocase = rep(NA, sample.size)
>
>   set.seed(seed.num)
>   for(i in 1:sample.size){
> genocase[i] = rbinom(1, 1, prob[i])
>   }
>   case.idx = sample(which(genocase==1),1000)
>   control.idx = sample(which(genocase==0),1000)
>
>   return(rbind(gt[case.idx,],gt[control.idx,]))
> }
>
> How I can modify code in a way that it will simulate:
> 50 % of causal SNPs** ( exmp. 24 causal variants and 24 non causal SNPs)
> 10 % of causal SNP (exmpl. 5 causal and 43 non causal SNPs)
>
> Thanks a lot for any suggestion.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 3d plot of earth with cut

2020-10-22 Thread aBBy Spurdle , ⍺XY
> It should be a 2D slice/plane embedded into a 3D space.

I was able to come up with the plot, attached.
My intention was to plot national boundaries on the surface of a sphere.
And put the slice inside.
However, I haven't (as yet) worked out how to get the coordinates for
the boundaries.

Let me know if it's of any value.
And I'll post the code.
(But needs to be polished first)
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 3d plot of earth with cut

2020-10-22 Thread aBBy Spurdle , ⍺XY
If you have "value" as a function of latitude and radius, isn't that a
2D (not 3D) scalar field?
Which can be plotted using a regular heatmap.

If you want a curved edge where depth=0 (radius=?), that's not too
difficult to achieve.
Not quite sure what continent boundaries mean in this context, but
they could possibly be added as well.

Or do you want a 2D slice superimposed within a 3D space?

And if so, does it need to be dynamic?
(i.e. Rotate the whole thing, with a trackball/mouse).

Note that the further you go down the list above, the more work is required.


On Fri, Oct 23, 2020 at 7:41 AM Balint Radics  wrote:
>
> Hello,
>
> Could someone suggest a package/way to make a 3D raster plot of the Earth
> (with continent boundaries), and then make a "cut" or "slice" of it such
> that one can also visualize some scalar quantity as a function of the
> Radius/Depth across that given slice ?
>
> Formally, I would have a given, fixed longitude, and a list of vectors
> {latitude, radius, Value}
> that would show the distribution of the quantity "Value" at various depths
> and latitudes in 3D.
>
> Thanks a lot in advance,
> Balint
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting Mixed Distributions in the fitdistrplus package

2020-10-22 Thread aBBy Spurdle , ⍺-Male
#a sketch, completing the skeleton: a two-component mixture of gamma
#and Pareto densities, matching the 'start' list in the quoted code
#(dpareto is assumed to come from the actuar package)
dmixgampar <- function (x, prob, nu, lambda, alpha, theta)
{   prob * dgamma (x, shape=nu, rate=lambda) +
        (1 - prob) * dpareto (x, shape=alpha, scale=theta)
}


On Wed, Oct 21, 2020 at 8:03 PM Charles Thuo  wrote:
>
> Dear Sirs,
>
> The below listed code fits a gamma and a pareto distribution to a data set
> danishuni. However the distributions are not appropriate to fit both  tails
> of the data set hence a mixed distribution is required  which has ben
> defined as "mixgampar"
> as shown below.
>
> library(fitdistrplus)
> x<- danishuni$Loss
>  fgam<- fitdist(x,"gamma",lower=0)
>  fpar<- fitdist(x,"pareto",start = list(shape=2,scale=2),lower=0)
> fmixgampar<- fitdist(x,"mixgampar",start =
> list(prob=1/2,nu=1,lambda=1,alpha=1,theta=1),lower=0)
> Error in fitdist(x, "mixgampar", start = list(prob = 1/2, nu = 1, lambda =
> 1,  :
>
>  The  dmixgampar  function must be defined
>
> Kindly assist to define the dmixgampar
>
> Charkes


Re: [R] 2 D density plot interpretation and manipulating the data

2020-10-09 Thread Abby Spurdle
> SNP$density <- get_density(SNP$mean, SNP$var)
> > summary(SNP$density)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>       0     383     696     738    1170    1789

This doesn't look accurate.
The density values shouldn't all be integers.
And I wouldn't expect the smallest density to be zero, if using a
Gaussian kernel.

Are these values rounded or formatted?

(Recombined Excerpts)
> and keep only entries with density > 400
> a=SNP[SNP$density>400,]
> Any idea how do I interpret data points that are left contained within
the ellipses?

Reiterating, they're contour lines, but they should *not* be ellipses.

You could work out the proportion of "densities" > 400.

d <- SNP$density
p.remain <- length (d [d > 400]) / length (d)
p.remain

Or a more succinct version:

p.remain <- sum (SNP$density > 400) / nrow (SNP)

Then you can say that you've plotted the data with the highest densities.



Re: [R] 2 D density plot interpretation and manipulating the data

2020-10-09 Thread Abby Spurdle
You could assign a density value to each point.
Maybe you've done that already...?

Then trim the lowest n (number of) data points
Or trim the lowest p (proportion of) data points.

e.g.
Remove the data points with the 20 lowest density values.
Or remove the data points with the lowest 5% of density values.
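
For example (a minimal sketch, assuming a data frame SNP with a
precomputed density column, as above):

keep <- (SNP$density > quantile (SNP$density, 0.05) )
SNP.trimmed <- SNP [keep,]    #drop the lowest 5% of density values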

I'll let you decide whether that is a good idea or a bad idea.
And if it's a good idea, then how much to trim.


On Sat, Oct 10, 2020 at 5:47 AM Ana Marija  wrote:
>
> Hi Bert,
>
> Another confrontational response from you...
>
> You might have noticed that I use the word "outlier" carefully in this
> post and only in relation to the plotted ellipses. I do not know the
> underlying algorithm of geom_density_2d() and therefore I am having an
> issue of how to interpret the plot. I was hoping someone here knows
> that and can help me.
>
> Ana
>
> On Fri, Oct 9, 2020 at 11:31 AM Bert Gunter  wrote:
> >
> > I recommend that you consult with a local statistical expert. Much of what 
> > you say (outliers?!?) seems to make little sense, and your statistical 
> > knowledge seems minimal. Perhaps more to the point, none of your questions 
> > can be properly answered without subject matter context, which this list is 
> > not designed to provide. That's why I believe you need local expertise.
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along and 
> > sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Fri, Oct 9, 2020 at 8:25 AM Ana Marija  
> > wrote:
> >>
> >> Hi Abby,
> >>
> >> thank you for getting back to me and for this useful information.
> >>
> >> I'm trying to detect the outliers in my distribution based of mean and
> >> variance. Can I see that from the plot I provided? Would outliers be
> >> outside of ellipses? If so how do I extract those from my data frame,
> >> based on which parameter?
> >>
> >> So I am trying to connect outliers based on what the plot is showing:
> >> s <- ggplot(SNP, mapping = aes(x = mean, y = var))
> >> s <- s +  geom_density_2d() + geom_point() + my.theme + ggtitle("SNPs")
> >>
> >> versus what is in the data:
> >>
> >> > head(SNP)
> >>mean  var sd
> >> FQC.10090295 0.0327 0.002678 0.0517
> >> FQC.10119363 0.0220 0.000978 0.0313
> >> FQC.10132112 0.0275 0.002088 0.0457
> >> FQC.10201128 0.0169 0.000289 0.0170
> >> FQC.10208432 0.0443 0.004081 0.0639
> >> FQC.10218466 0.0116 0.000131 0.0115
> >> ...
> >>
> >> the distribution is not normal, it is right-skewed.
> >>
> >> Cheers,
> >> Ana
> >>
> >> On Fri, Oct 9, 2020 at 2:13 AM Abby Spurdle  wrote:
> >> >
> >> > > My understanding is that this represents bivariate normal
> >> > > approximation of the data which uses the kernel density function to
> >> > > test for inclusion within a level set. (please correct me)
> >> >
> >> > You can fit a bivariate normal distribution by computing five parameters.
> >> > Two means, two standard deviations (or two variances) and one
> >> > correlation (or covariance) coefficient.
> >> > The bivariate normal *has* elliptical contours.
> >> >
> >> > A kernel density estimate is usually regarded as an estimate of an
> >> > unknown density function.
> >> > Often they use a normal (or Gaussian) kernel, but I wouldn't describe
> >> > them as normal approximations.
> >> > In general, bivariate kernel density estimates do *not* have
> >> > elliptical contours.
> >> > But in saying that, if the data is close to normality, then contours
> >> > will be close to elliptical.
> >> >
> >> > Kernel density estimates do not test for inclusion, as such.
> >> > (But technically, there are some exceptions to that).
> >> >
> >> > I'm not sure what you're trying to achieve here.
> >>


Re: [R] 2 D density plot interpretation and manipulating the data

2020-10-09 Thread Abby Spurdle
> My understanding is that this represents bivariate normal
> approximation of the data which uses the kernel density function to
> test for inclusion within a level set. (please correct me)

You can fit a bivariate normal distribution by computing five parameters.
Two means, two standard deviations (or two variances) and one
correlation (or covariance) coefficient.
The bivariate normal *has* elliptical contours.
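
In R, a minimal sketch (assuming numeric vectors x and y):

colMeans (cbind (x, y) )    #the two means
cov (cbind (x, y) )         #the variances and the covariance
cor (x, y)                  #the correlation coefficient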

A kernel density estimate is usually regarded as an estimate of an
unknown density function.
Often they use a normal (or Gaussian) kernel, but I wouldn't describe
them as normal approximations.
In general, bivariate kernel density estimates do *not* have
elliptical contours.
But in saying that, if the data is close to normality, then contours
will be close to elliptical.

Kernel density estimates do not test for inclusion, as such.
(But technically, there are some exceptions to that).

I'm not sure what you're trying to achieve here.



Re: [R] R for mac

2020-09-25 Thread Abby Spurdle
Running R on Chromebook, is contrary to the way Chromebook is designed to work.
In theory, R can be run off a server.
Then people/students can not only access it from Chromebook, but from
their phones.

Unfortunately, this is not a topic I'm familiar with, so can't offer specifics.
But, from an educational perspective, under the presumption of
training students for the (emerging) "real world", the mobile/web
based approach would seem more useful...


On Sat, Sep 26, 2020 at 4:00 AM Michael Johnston
 wrote:
>
> Hi
>
> I am club mentor for a group of high school students learning R. The Vice
> President of Information Technology is hoping to broaden and deepen her
> skill set so she can help others. She is doing well in helping students
> install R on the Windows operating system. However, a few have problems
> installing R on mac and one student is struggling to install R on
> Chromebook. Is there someone who would be willing to provide some pointers?
>
> Thank you for considering this request,
> Michael
>
>
> On Thu, Sep 24, 2020 at 9:10 AM Dirk Eddelbuettel  wrote:
>
> >
> > On 23 September 2020 at 15:32, Michael Johnston wrote:
> > | Hi
> > | I am club mentor for a group of high school students learning R. A few
> > have
> > | problems installing R on mac. The Vp of information technology is hoping
> > | for training so she can help others. Could you recommend someone to help
> > | her?
> > | Thank you for considering this request
> >
> > I have nothing to do with R on macOS -- but this mailing list is the place
> > for mac-focussed discussion.  Maybe try that, and you likely need to
> > subscribe before you can post.
> >
> > Dirk
> >
> > | Michael
> >
> > --
> > https://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
> >
>


Re: [R] Quadratic programming

2020-09-24 Thread Abby Spurdle
Before going any further, I have to check, what is:
MinQuadProg qp

Also, if I'm following the C++ code correctly, H is an identity matrix.
This implies the input to the C++ solver, requires the QP in a
different form to the R solver.
In which case, the C++ inputs and the R inputs, should be different...?

(A)
It may be worthwhile comparing the solver output (for C++ and R) using
a *much* simpler example.
e.g. the examples from the quadprog package.

(B)
If they produce the same output (using a simple example), then that
implies there's a difference in your inputs.
So, you just need to work out which input values are different.
Expanding on my previous post, just print them out.
But check (A) above first.
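
For (A), the small example from the quadprog documentation
(?quadprog::solve.QP) is a convenient test case (reproduced here from
the help page, so verify it there):

library (quadprog)
Dmat <- diag (3)
dvec <- c (0, 5, 0)
Amat <- matrix (c (-4, -3, 0, 2, 1, 0, 0, -2, 1), 3, 3)
bvec <- c (-8, 2, 0)
solve.QP (Dmat, dvec, Amat, bvec=bvec)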

On Thu, Sep 24, 2020 at 11:15 PM Maija Sirkjärvi
 wrote:
>
> Thank you for giving me your time!
>
> The problem is the quadratic optimization part. Something goes wrong along 
> the way. In C++ loops run from 0 and in R they run from 1, and I've tried to 
> take that into account. Still I'm having hard time figuring out the mistake I 
> make, cause I get a result from my R code. It's just not the same that I get 
> with the C++.
>
> Here are the quadratic optimization parts for both codes.
>
> C++
>
>   /* Set Up Quadratic Programing Problem */
> Vector hSmooth(J);
> for(int j=0; j<J; j++) hSmooth[j] = /* ... */ ;
> Vector Q(J);
> for(int j=0; j<J; j++) Q[j] = exp((-0.1) * (Beta * pow(Price[j], Eta + One) - One) / (One + Eta));
> SymmetricMatrix H(J,Zero);
> Vector c(J,Zero);
> Matrix Aeq(0,J);
> Vector beq(0);
> Matrix Aneq(2*J-3,J,Zero);
> Vector bneq(2*J-3);
> Vector lb(J,-Inf);
> Vector ub(J,Inf);
> for(int j=0; j<J; j++) /* ... */ ;
> for(int j=0; j<J; j++) /* ... */ ;
> for(int j=1; j<J; j++)
> {
> Aneq(j-1,j-1) = -One;
> Aneq(j-1,j)   = One;
> bneq[j-1] = Delta1;
> }
> for(int j=2; j<J; j++)
> {
> Aneq(J-1+j-2,j)   = -One / (Q(j) - Q(j-1));
> Aneq(J-1+j-2,j-1) = One / (Q(j) - Q(j-1)) + One / (Q(j-1) - Q(j-2));
> Aneq(J-1+j-2,j-2) = -One / (Q(j-1) - Q(j-2));
> bneq[J-1+j-2] = Delta2;
> }
>
> /* Solve Constrained Optimization Problem Using Quadratic Programming */
> MinQuadProg qp(c,H,Aeq,beq,Aneq,bneq,lb,ub);
> qp.PrintLevel = 0;
> qp.Solve();
>
> And R
>
> J <- length(Price)
> hs <- numeric(J)
> for(j in 1:J){
>   hs[j] <-(-(gEst$KernelRegPartLin..Phi[j]^(-0.1)))
> }
> hs
>
> Q <- rep(0,J)
> for(j in 1:(length(Price))){
>   Q[j] <- exp((-0.1) * (Beta *Price[j]^(Eta + 1) - 1) / (1 + Eta))
> }
> Q
> plot(Q)
> Dmat <- matrix(0,nrow= J, ncol=J)
> diag(Dmat) <- 1
> dvec <- -hs
> Aeq <- 0
> beq <- 0
> Amat <- matrix(0,J,2*J-3)
> bvec <- rep(0,2*J-3)
>
> for(j in 2:nrow(Amat)){
>   Amat[j-1,j-1] = -1
>   Amat[j,j-1] = 1
> }
>
> for(j in 3:nrow(Amat)){
>   Amat[j,J+j-3] = -1/(Q[j]-Q[j-1])
>   Amat[j-1,J+j-3] = 1/(Q[j]-Q[j-1])
>   Amat[j-2,J+j-3] = -1/(Q[j-1]-Q[j-2])
> }
>
> for(j in 2:nrow(bvec)) {
>   bvec[j-1] = Delta1
> }
> for(j in 3:nrow(bvec)) {
>   bvec[J-1+j-2] = Delta2
> }
>
> solution <- solve.QP(Dmat,dvec,Amat,bvec)
>
>
>
>
>
> On Wed, Sep 23, 2020 at 9:12 AM Abby Spurdle (spurdl...@gmail.com) wrote:
>>
>> > I'm trying to replicate a C++ code with R.
>>
>> Notes:
>> (1) I'd recommend you make the code more modular.
>> i.e. One function for initial data prep/modelling, one function for
>> setting up and solving the QP, etc.
>> This should be easier to debug.
>> (However, you would probably have to do it to the C++ code first).
>> (2) Your R code is not completely reproducible.
>> i.e. AEJData
>> (3) For the purposes of a reproducible example, your code can be simplified.
>> i.e. Only one contributed R package should be attached.
>>
>> Regardless of (1) above, you should be able to identify at what point
>> the C++ and R code becomes inconsistent.
>> The simplest approach is to add print-based functions into both the
>> C++ and R code, and print out state data, at each major step.
>> Then all you need to do is compare the output for both.
>>
>> > Is there a better/another package for these types of problems?
>>
>> I'm not sure.
>> After a quick search, this is the best I found:
>>
>> scam::scam
>> scam::shape.constrained.smooth.terms



Re: [R] Quadratic programming

2020-09-23 Thread Abby Spurdle
> I'm trying to replicate a C++ code with R.

Notes:
(1) I'd recommend you make the code more modular.
i.e. One function for initial data prep/modelling, one function for
setting up and solving the QP, etc.
This should be easier to debug.
(However, you would probably have to do it to the C++ code first).
(2) Your R code is not completely reproducible.
i.e. AEJData
(3) For the purposes of a reproducible example, your code can be simplified.
i.e. Only one contributed R package should be attached.

Regardless of (1) above, you should be able to identify at what point
the C++ and R code becomes inconsistent.
The simplest approach is to add print-based functions into both the
C++ and R code, and print out state data, at each major step.
Then all you need to do is compare the output for both.

> Is there a better/another package for these types of problems?

I'm not sure.
After a quick search, this is the best I found:

scam::scam
scam::shape.constrained.smooth.terms



Re: [R] Quadratic programming

2020-09-21 Thread Abby Spurdle
I was wondering if you're trying to fit a curve, subject to
monotonicity/convexity constraints...
If you are, this is a challenging topic, best of luck...


On Tue, Sep 22, 2020 at 8:12 AM Abby Spurdle  wrote:
>
> Hi,
>
> Sorry, for my rushed responses, last night.
> (Shouldn't post when I'm about to log out).
>
> I haven't used the quadprog package for nearly a decade.
> And I was hoping that an expert using optimization in finance and
> economics would reply.
>
> Some comments:
> (1) I don't know why you think bvec should be a matrix. The
> documentation clearly says it should be a vector (implying not a
> matrix).
> The only arguments that should be matrices are Dmat and Amat.
> (2) I'm having some difficulty following your quadratic program, even
> after rendering it.
> Perhaps you could rewrite your expressions, in a form that is
> consistent with the input to solve.QP. That's a math problem, not an R
> programming problem, as such.
> (3) If that fails, then you'll need to produce a minimal reproducible example.
> I strongly recommend that the R code matches the quadratic program, as
> closely as possible.
>
>
> On Mon, Sep 21, 2020 at 9:28 PM Maija Sirkjärvi
>  wrote:
> >
> > Hi!
> >
> > I was wondering if someone could help me out. I'm minimizing a following
> > function:
> >
> $$\min \sum_{j=1}^{J} (m_{j} - \hat{m}_{j})^2,$$
> \text{subject to}
> $$m_{j-1} \leq m_{j} - \delta_{1},$$
> $$\frac{1}{Q_{j-1}-Q_{j-2}} (m_{j-2}-m_{j-1}) \leq \frac{1}{Q_{j}-Q_{j-1}}
> (m_{j-1}-m_{j}) - \delta_{2}$$
> >
> > I have tried quadratic programming, but something is off. Does anyone have
> > an idea how to approach this?
> >
> > Thanks in advance!
> >
> > Q <- rep(0,J)
> > for(j in 1:(length(Price))){
> >   Q[j] <- exp((-0.1) * (Beta *Price[j]^(Eta + 1) - 1) / (1 + Eta))
> > }
> >
> > Dmat <- matrix(0,nrow= J, ncol=J)
> > diag(Dmat) <- 1
> > dvec <- -hs
> > Aeq <- 0
> > beq <- 0
> > Amat <- matrix(0,J,2*J-3)
> > bvec <- matrix(0,2*J-3,1)
> >
> > for(j in 2:nrow(Amat)){
> >   Amat[j-1,j-1] = -1
> >   Amat[j,j-1] = 1
> > }
> > for(j in 3:nrow(Amat)){
> >   Amat[j,J+j-3] = -1/(Q[j]-Q[j-1])
> >   Amat[j-1,J+j-3] = 1/(Q[j]-Q[j-1])
> >   Amat[j-2,J+j-3] = -1/(Q[j-1]-Q[j-2])
> > }
> > for(j in 2:ncol(bvec)) {
> >   bvec[j-1] = Delta1
> > }
> > for(j in 3:ncol(bvec)) {
> >   bvec[J-1+j-2] = Delta2
> > }
> > solution <- solve.QP(Dmat,dvec,Amat,bvec=bvec)


Re: [R] Quadratic programming

2020-09-21 Thread Abby Spurdle
Hi,

Sorry, for my rushed responses, last night.
(Shouldn't post when I'm about to log out).

I haven't used the quadprog package for nearly a decade.
And I was hoping that an expert using optimization in finance and
economics would reply.

Some comments:
(1) I don't know why you think bvec should be a matrix. The
documentation clearly says it should be a vector (implying not a
matrix).
The only arguments that should be matrices are Dmat and Amat.
(2) I'm having some difficulty following your quadratic program, even
after rendering it.
Perhaps you could rewrite your expressions, in a form that is
consistent with the input to solve.QP. That's a math problem, not an R
programming problem, as such.
(3) If that fails, then you'll need to produce a minimal reproducible example.
I strongly recommend that the R code matches the quadratic program, as
closely as possible.


On Mon, Sep 21, 2020 at 9:28 PM Maija Sirkjärvi
 wrote:
>
> Hi!
>
> I was wondering if someone could help me out. I'm minimizing a following
> function:
>
> $$\min \sum_{j=1}^{J} (m_{j} - \hat{m}_{j})^2,$$
> \text{subject to}
> $$m_{j-1} \leq m_{j} - \delta_{1},$$
> $$\frac{1}{Q_{j-1}-Q_{j-2}} (m_{j-2}-m_{j-1}) \leq \frac{1}{Q_{j}-Q_{j-1}}
> (m_{j-1}-m_{j}) - \delta_{2}$$
>
> I have tried quadratic programming, but something is off. Does anyone have
> an idea how to approach this?
>
> Thanks in advance!
>
> Q <- rep(0,J)
> for(j in 1:(length(Price))){
>   Q[j] <- exp((-0.1) * (Beta *Price[j]^(Eta + 1) - 1) / (1 + Eta))
> }
>
> Dmat <- matrix(0,nrow= J, ncol=J)
> diag(Dmat) <- 1
> dvec <- -hs
> Aeq <- 0
> beq <- 0
> Amat <- matrix(0,J,2*J-3)
> bvec <- matrix(0,2*J-3,1)
>
> for(j in 2:nrow(Amat)){
>   Amat[j-1,j-1] = -1
>   Amat[j,j-1] = 1
> }
> for(j in 3:nrow(Amat)){
>   Amat[j,J+j-3] = -1/(Q[j]-Q[j-1])
>   Amat[j-1,J+j-3] = 1/(Q[j]-Q[j-1])
>   Amat[j-2,J+j-3] = -1/(Q[j-1]-Q[j-2])
> }
> for(j in 2:ncol(bvec)) {
>   bvec[j-1] = Delta1
> }
> for(j in 3:ncol(bvec)) {
>   bvec[J-1+j-2] = Delta2
> }
> solution <- solve.QP(Dmat,dvec,Amat,bvec=bvec)


Re: [R] Quadratic programming

2020-09-21 Thread Abby Spurdle
One more thing, is bvec supposed to be a matrix?

Note you may need to provide a reproducible example, for better help...

On Mon, Sep 21, 2020 at 10:09 PM Abby Spurdle  wrote:
>
> Sorry, ignore the last part.
> What I should have said, is the inequality has the opposite sign.
> >= bvec (not <= bvec)
>
>
> On Mon, Sep 21, 2020 at 10:05 PM Abby Spurdle  wrote:
> >
> > Are you using the quadprog package?
> > If I can take a random shot in the dark, should bvec be -bvec?
> >
> >
> > On Mon, Sep 21, 2020 at 9:28 PM Maija Sirkjärvi
> >  wrote:
> > >
> > > Hi!
> > >
> > > I was wondering if someone could help me out. I'm minimizing a following
> > > function:
> > >
> > > $$\min \sum_{j=1}^{J} (m_{j} - \hat{m}_{j})^2,$$
> > > \text{subject to}
> > > $$m_{j-1} \leq m_{j} - \delta_{1},$$
> > > $$\frac{1}{Q_{j-1}-Q_{j-2}} (m_{j-2}-m_{j-1}) \leq \frac{1}{Q_{j}-Q_{j-1}}
> > > (m_{j-1}-m_{j}) - \delta_{2}$$
> > >
> > > I have tried quadratic programming, but something is off. Does anyone have
> > > an idea how to approach this?
> > >
> > > Thanks in advance!
> > >
> > > Q <- rep(0,J)
> > > for(j in 1:(length(Price))){
> > >   Q[j] <- exp((-0.1) * (Beta *Price[j]^(Eta + 1) - 1) / (1 + Eta))
> > > }
> > >
> > > Dmat <- matrix(0,nrow= J, ncol=J)
> > > diag(Dmat) <- 1
> > > dvec <- -hs
> > > Aeq <- 0
> > > beq <- 0
> > > Amat <- matrix(0,J,2*J-3)
> > > bvec <- matrix(0,2*J-3,1)
> > >
> > > for(j in 2:nrow(Amat)){
> > >   Amat[j-1,j-1] = -1
> > >   Amat[j,j-1] = 1
> > > }
> > > for(j in 3:nrow(Amat)){
> > >   Amat[j,J+j-3] = -1/(Q[j]-Q[j-1])
> > >   Amat[j-1,J+j-3] = 1/(Q[j]-Q[j-1])
> > >   Amat[j-2,J+j-3] = -1/(Q[j-1]-Q[j-2])
> > > }
> > > for(j in 2:ncol(bvec)) {
> > >   bvec[j-1] = Delta1
> > > }
> > > for(j in 3:ncol(bvec)) {
> > >   bvec[J-1+j-2] = Delta2
> > > }
> > > solution <- solve.QP(Dmat,dvec,Amat,bvec=bvec)
> > >


Re: [R] Quadratic programming

2020-09-21 Thread Abby Spurdle
Sorry, ignore the last part.
What I should have said, is the inequality has the opposite sign.
>= bvec (not <= bvec)


On Mon, Sep 21, 2020 at 10:05 PM Abby Spurdle  wrote:
>
> Are you using the quadprog package?
> If I can take a random shot in the dark, should bvec be -bvec?
>
>
> On Mon, Sep 21, 2020 at 9:28 PM Maija Sirkjärvi
>  wrote:
> >
> > Hi!
> >
> > I was wondering if someone could help me out. I'm minimizing a following
> > function:
> >
> > $$\min \sum_{j=1}^{J} (m_{j} - \hat{m}_{j})^2,$$
> > \text{subject to}
> > $$m_{j-1} \leq m_{j} - \delta_{1},$$
> > $$\frac{1}{Q_{j-1}-Q_{j-2}} (m_{j-2}-m_{j-1}) \leq \frac{1}{Q_{j}-Q_{j-1}}
> > (m_{j-1}-m_{j}) - \delta_{2}$$
> >
> > I have tried quadratic programming, but something is off. Does anyone have
> > an idea how to approach this?
> >
> > Thanks in advance!
> >
> > Q <- rep(0,J)
> > for(j in 1:(length(Price))){
> >   Q[j] <- exp((-0.1) * (Beta *Price[j]^(Eta + 1) - 1) / (1 + Eta))
> > }
> >
> > Dmat <- matrix(0,nrow= J, ncol=J)
> > diag(Dmat) <- 1
> > dvec <- -hs
> > Aeq <- 0
> > beq <- 0
> > Amat <- matrix(0,J,2*J-3)
> > bvec <- matrix(0,2*J-3,1)
> >
> > for(j in 2:nrow(Amat)){
> >   Amat[j-1,j-1] = -1
> >   Amat[j,j-1] = 1
> > }
> > for(j in 3:nrow(Amat)){
> >   Amat[j,J+j-3] = -1/(Q[j]-Q[j-1])
> >   Amat[j-1,J+j-3] = 1/(Q[j]-Q[j-1])
> >   Amat[j-2,J+j-3] = -1/(Q[j-1]-Q[j-2])
> > }
> > for(j in 2:ncol(bvec)) {
> >   bvec[j-1] = Delta1
> > }
> > for(j in 3:ncol(bvec)) {
> >   bvec[J-1+j-2] = Delta2
> > }
> > solution <- solve.QP(Dmat,dvec,Amat,bvec=bvec)
> >


Re: [R] Quadratic programming

2020-09-21 Thread Abby Spurdle
Are you using the quadprog package?
If I can take a random shot in the dark, should bvec be -bvec?


On Mon, Sep 21, 2020 at 9:28 PM Maija Sirkjärvi
 wrote:
>
> Hi!
>
> I was wondering if someone could help me out. I'm minimizing a following
> function:
>
> $$\min \sum_{j=1}^{J} (m_{j} - \hat{m}_{j})^2,$$
> \text{subject to}
> $$m_{j-1} \leq m_{j} - \delta_{1},$$
> $$\frac{1}{Q_{j-1}-Q_{j-2}} (m_{j-2}-m_{j-1}) \leq \frac{1}{Q_{j}-Q_{j-1}}
> (m_{j-1}-m_{j}) - \delta_{2}$$
>
> I have tried quadratic programming, but something is off. Does anyone have
> an idea how to approach this?
>
> Thanks in advance!
>
> Q <- rep(0,J)
> for(j in 1:(length(Price))){
>   Q[j] <- exp((-0.1) * (Beta *Price[j]^(Eta + 1) - 1) / (1 + Eta))
> }
>
> Dmat <- matrix(0,nrow= J, ncol=J)
> diag(Dmat) <- 1
> dvec <- -hs
> Aeq <- 0
> beq <- 0
> Amat <- matrix(0,J,2*J-3)
> bvec <- matrix(0,2*J-3,1)
>
> for(j in 2:nrow(Amat)){
>   Amat[j-1,j-1] = -1
>   Amat[j,j-1] = 1
> }
> for(j in 3:nrow(Amat)){
>   Amat[j,J+j-3] = -1/(Q[j]-Q[j-1])
>   Amat[j-1,J+j-3] = 1/(Q[j]-Q[j-1])
>   Amat[j-2,J+j-3] = -1/(Q[j-1]-Q[j-2])
> }
> for(j in 2:ncol(bvec)) {
>   bvec[j-1] = Delta1
> }
> for(j in 3:ncol(bvec)) {
>   bvec[J-1+j-2] = Delta2
> }
> solution <- solve.QP(Dmat,dvec,Amat,bvec=bvec)
>


Re: [R] Mapping 2D to 3D

2020-09-21 Thread Abby Spurdle
Hi H,

I probably owe you an apology.
I was just reading the geom_contour documentation.
It's difficult to follow.

Base R functions, my functions, and pretty much everyone's functions,
take a matrix as input.
But as far as I can tell, geom_contour wants a data.frame with three
{x, y and z} coordinates.
It's not clear how the data in the data.frame is mapped onto the {x,
y, and z} coordinate system.

Also, it uses density estimation in *all* its examples.
Making it difficult for inexperienced users to tell the difference
between contour plots (generally) and density visualization.

I would suggest base R functions as a better starting point:

contour
filled.contour
image
heatmap
persp
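
All of these accept a plain matrix, e.g. with the built-in volcano data:

contour (volcano)
filled.contour (volcano, color.palette=terrain.colors)
image (volcano)
persp (volcano)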

I have a package named barsurf, with (almost) density-free examples:
https://cran.r-project.org/web/packages/barsurf/vignettes/barsurf.pdf

Additionally, there are functions in the lattice package, and possibly
the rgl package.



Re: [R] Mapping 2D to 3D

2020-09-20 Thread Abby Spurdle
> I was looking at this example which uses geom_contour():
>
> ggvolcano = volcano %>%
>  reshape2::melt() %>%
>  ggplot() +
>  geom_tile(aes(x=Var1,y=Var2,fill=value)) +
>  geom_contour(aes(x=Var1,y=Var2,z=value),color="black") +
>  scale_x_continuous("X",expand = c(0,0)) +
>  scale_y_continuous("Y",expand = c(0,0)) +
>  scale_fill_gradientn("Z",colours = terrain.colors(10)) +
>  coord_fixed()
> print(ggvolcano)

Yesterday, I watched the movie Tenet.

WARNING: SPOILER ALERT

Everything's going forwards and backwards at the same time.
This code reminds me of that movie.



Re: [R] Mapping 2D to 3D

2020-09-19 Thread Abby Spurdle
> Understood

I'd recommend you try to be more precise.

> I just began looking at the volcano dataset which uses geom_contour.

The volcano dataset does *not* use geom_contour.
However, the help file for the volcano dataset, does use the
filled.contour function, in its example.

> I now realize that the function stat_density_2d really maps a heatmap

If I hadn't read the rest of this thread, I wouldn't know what you
meant by "maps" a heatmap.

The kde2d function returns a list, containing a density matrix.
(As per my previous post).

The plotting functions, compute the density via the above density
estimation function, and then plot that density, in some form.

I suppose you could say the plotting functions map observations to
density estimates, then map the density estimates to contours and/or
other graphic data, and then map the graphic data to a plot, which is
seen by a user...
...but it's probably easier to just say plot the density.

> of a computed variable.

It's rare in probability theory to refer to density as a "variable".
(Which is relevant because density estimates are estimates of
probability distributions).

However, it is common in computer graphics and geometry, to use "z"
for a "third variable".
And in applied statistics and data science, "variable" could mean anything...
So, be careful there...

Based on your posts, I take it you want to plot a function of two
variables (or plot a matrix of values), using a 2d plot.

There are a number of options here.

Contour plots.
Filled contour plots.
Heatmaps.
Plots using hex/other binning.
Maybe others...?

Additionally, there are 3d plots, such as surface plots.

And I note that it's possible to plot contour lines on top of
color-filled contours or heatmaps.



Re: [R] Mapping 2D to 3D

2020-09-18 Thread Abby Spurdle
> But there's no reason for the user to do that when using the plotting function.
> function.

I should amend the above.
There's no reason for the user to do that (compute a third "variable"
representing density), if using a high level plotting function, that's
designed to compute the density for you.

It is possible to do it in two or more steps.
(Compute the density matrix, then plot it).

Again, refer to the help file for kde2d.
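
A minimal sketch, with simulated data:

library (MASS)
x <- rnorm (500)
y <- rnorm (500)
k <- kde2d (x, y, n=50)    #a list with components x, y and z (the density matrix)
contour (k)
image (k)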



Re: [R] Mapping 2D to 3D

2020-09-18 Thread Abby Spurdle
I'm not familiar with the gg graphics system.
However, I am familiar with density estimation, and density visualization.

There is *no* third variable, as such.
But rather, density estimates, which in this context, would usually be a matrix.
(And are computed inside the plotting or density estimation functions).

The documentation for the function you've used, says it uses MASS::kde2d().
This does just that, returns an object, which contains a density matrix.
(Refer to the help file for kde2d).

Of course, there's no reason why one can't create a third variable,
from a mathematical perspective.
e.g. d, z, h, fv, or whatever you prefer...
And then set z = fh (x, y).

But there's no reason for the user to do that when using the plotting function.

Note that there are situations where one might want to set the limits
of the plot.
And set the breaks, colors, and color key.
e.g. Creating two or more plots, and putting them next to each other,
for comparison purposes.


On Fri, Sep 18, 2020 at 2:17 PM H  wrote:
>
> I am trying to understand how to map 2D to 3D using ggplot() and eventually 
> plot_gg(). I am, however, stuck on understanding how to express the third 
> variable to be mapped. This example:
>
> ggdiamonds = ggplot(diamonds, aes(x, depth)) +
> stat_density_2d(aes(fill = stat(nlevel)),
> geom = "polygon", n = 100, bins = 10,contour = TRUE) +
> facet_wrap(clarity~.) +
> scale_fill_viridis_c(option = "A")
>
> uses a variable nlevel that I now understand is calculated during the 
> building of the ggplot but I have not figured out from where it is calculated 
> or how to specify a variable of my choosing.
>
> Does anyone have a good reference for understanding how to specify this 
> variable? Most examples on the 'net seem to use the same dataset but do not 
> specify this particular aspect...
>


Re: [R] How to reduce the sparseness in a TDM to make a cluster plot readable?

2020-09-17 Thread Abby Spurdle
I'm not familiar with these subjects.
And hopefully, someone who is, will offer some better suggestions.

But to get things started, maybe...
(1) What packages are you using (re: tdm)?
(2) Where does the problem happen, in dist, hclust, the plot method
for hclust, or in the package(s) you are using?
(3) Do you think you could produce a small reproducible example,
showing what is wrong, and explaining what you would like it to do instead?

Note that if the problem relates to hclust, or the plot method, then
you should be able to produce a much simpler example.
e.g.

mycount.matrix <- matrix (rpois (25000, 20), ncol=5)
head (mycount.matrix, 3)
tail (mycount.matrix, 3)

plot (hclust (dist (mycount.matrix) ) )
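
And if the "solid black line" is just thousands of overlapping leaf
labels, it may be enough to suppress them (an untested sketch):

plot (hclust (dist (mycount.matrix) ), labels=FALSE)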

On Tue, Sep 15, 2020 at 6:54 AM Andrew  wrote:
>
> Hello all
>
> I am doing some text mining on a set of five plain text files and have
> run into a snag when I run hclust in that there are just too many leaves
> for anything to be read. It returns a solid black line.
>
> The texts have been converted into a TDM which has a dim of 5,292 and 5
> (as per 5 docs).
>
> My code for removing sparsity is as follows:
>
>  > tdm2 <- removeSparseTerms(tdm, sparse=0.9)
>
>  > inspect(tdm2)
>
> <<TermDocumentMatrix (terms: 5292, documents: 5)>>
> Non-/sparse entries: 10415/16045
> Sparsity   : 61%
> Maximal term length: 22
> Weighting  : term frequency (tf)
>
> While the tf-idf weighting returns this when 0.9 sparseness is removed:
>
>  > inspect(tdm.tfidf)
> <<TermDocumentMatrix (terms: 5292, documents: 5)>>
> Non-/sparse entries: 7915/18545
> Sparsity   : 70%
> Maximal term length: 22
> Weighting  : term frequency - inverse document frequency
> (normalized) (tf-idf)
>
> I have experimented by decreasing the value I use for decreasing
> sparseness, and that helps a bit, for example:
>
>  > tdm2 <- removeSparseTerms(tdm, sparse=0.215)
>  > inspect(tdm2)
> <<TermDocumentMatrix (terms: 869, documents: 5)>>
> Non-/sparse entries: 3976/369
> Sparsity   : 8%
> Maximal term length: 14
> Weighting  : term frequency (tf)
>
> But, no matter what I do, the resulting plot is unreadable. The code for
> plotting the cluster is:
>
>  > hc <- hclust(dist(tdm2, method = "euclidean"), method = "complete")
>  > plot(hc, yaxt = 'n', main = "Hierarchical clustering")
>
> Can someone kindly either advise me what I am doing wrong and/ or
> signpost me to some detailed info on how to fix this.
>
> Many thanks in anticipation.
>
> Andy
>
>


Re: [R] How to represent the effect of one covariate on regression results?

2020-09-15 Thread Abby Spurdle
> My question is how do I present/plot the effect of covariate "TD" in
> the example it has "P" equal to 3.32228e-12 for all IDs in the
> resulting file so that I show how much effect covariate "TD" has on
> the analysis. Should I run another regression without covariate "TD"

I'll take a second shot in the dark:

There is R^2, and a number of generalizations.
(The most common of which, is probably adjusted R^2).
And there are various other goodness of fit tests.

https://en.wikipedia.org/wiki/Goodness_of_fit
https://en.wikipedia.org/wiki/Coefficient_of_determination

You could fit two models (one with a particular variable included, and
one without), and compare how the statistic changes.
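
A minimal sketch (assuming a data frame mydata, with a response y, and
hypothetical predictors x1, x2 and TD):

fit.full <- lm (y ~ x1 + x2 + TD, data=mydata)
fit.reduced <- lm (y ~ x1 + x2, data=mydata)
summary (fit.full)$adj.r.squared
summary (fit.reduced)$adj.r.squared
anova (fit.reduced, fit.full)    #F test for the contribution of TD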

However, I'm probably going to get told off, for going off-topic.
So, unless any further questions are specific to R programming, I
don't think I'm going to contribute further.

Also, I'd recommend you read some notes on statistical modelling, or
consult an expert, or both.
And I suspect there are additional considerations modelling genetic data.



Re: [R] How to represent the effect of one covariate on regression results?

2020-09-14 Thread Abby Spurdle
I'm wondering if you want one of these:
(1) Plots of "Main Effects".
(2) "Partial Residual Plots".

Search for them, and you should be able to tell if they're what you want.
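
e.g. a sketch, assuming a fitted lm/glm object named fit, and the
'effects' and 'car' packages:

plot (effects::allEffects (fit) )    #main effect plots
car::crPlots (fit)    #component+residual (partial residual) plots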

But a word of warning:

Many people (including many senior statisticians) misinterpret this
kind of information.
Because, it's always the effect of xj on Y, while holding the other
variables *constant*.
That's not as simple as it sounds, and people have a tendency of
disregarding the importance of the second half of that sentence, in
their final interpretations.


P.S.
John Fox, announced a package with support for Regression Diagnostics,
about 11 days ago:
https://stat.ethz.ch/pipermail/r-help/2020-September/468609.html

I'm not sure how relevant it is to your question, but I just glanced
at the vignette, and it's pretty slick...




On Tue, Sep 15, 2020 at 1:30 AM Ana Marija  wrote:
>
> Hello,
>
> I was running association analysis using --glm genotypic from:
> https://www.cog-genomics.org/plink/2.0/assoc with these covariates:
> sex,age,PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10,TD,array,HBA1C. The
> result looks like this:
>
> #CHROM  POS        ID          REF  ALT  A1  TEST           OBS_CT  BETA         SE          Z_OR_F_STAT  P            ERRCODE
> 10      135434303  rs11101905  G    A    A   ADD            11863   -0.110733    0.0986981   -1.12193     0.261891     .
> 10      135434303  rs11101905  G    A    A   DOMDEV         11863   0.079797     0.111004    0.718868     0.47         .
> 10      135434303  rs11101905  G    A    A   sex=Female     11863   -0.120404    0.0536069   -2.24605     0.0247006    .
> 10      135434303  rs11101905  G    A    A   age            11863   0.00524501   0.00391528  1.33963      0.180367     .
> 10      135434303  rs11101905  G    A    A   PC1            11863   -0.0191779   0.0166868   -1.14928     0.25044      .
> 10      135434303  rs11101905  G    A    A   PC2            11863   -0.0269939   0.0173086   -1.55957     0.118863     .
> 10      135434303  rs11101905  G    A    A   PC3            11863   0.0115207    0.0168076   0.685448     0.493061     .
> 10      135434303  rs11101905  G    A    A   PC4            11863   9.57832e-05  0.0124607   0.0076868    0.993867     .
> 10      135434303  rs11101905  G    A    A   PC5            11863   -0.00191047  0.00543937  -0.35123     0.725416     .
> 10      135434303  rs11101905  G    A    A   PC6            11863   -0.0103309   0.0159879   -0.646172    0.518168     .
> 10      135434303  rs11101905  G    A    A   PC7            11863   0.00790997   0.0144025   0.549207     0.582863     .
> 10      135434303  rs11101905  G    A    A   PC8            11863   -0.00205639  0.0142709   -0.144096    0.885424     .
> 10      135434303  rs11101905  G    A    A   PC9            11863   -0.00873771  0.0057239   -1.52653     0.126878     .
> 10      135434303  rs11101905  G    A    A   PC10           11863   0.0116197    0.0123826   0.938388     0.348045     .
> 10      135434303  rs11101905  G    A    A   TD             11863   -0.670026    0.0962216   -6.96337     3.32228e-12  .
> 10      135434303  rs11101905  G    A    A   array=Biobank  11863   0.160666     0.073631    2.18205      0.0291062    .
> 10      135434303  rs11101905  G    A    A   HBA1C          11863   0.0265933    0.00168758  15.7583      6.0236e-56   .
> 10      135434303  rs11101905  G    A    A   GENO_2DF       11863   NA           NA          0.726514     0.483613     .
>
> This result is shown just for one ID (rs11101905); there are about 2
> million of those in the resulting file.
>
> My question is how do I present/plot the effect of covariate "TD" in
> the example it has "P" equal to 3.32228e-12 for all IDs in the
> resulting file so that I show how much effect covariate "TD" has on
> the analysis. Should I run another regression without covariate "TD"
> and then do a scatter plot of P values with and without the "TD" covariate,
> or is there a better way to do this from the data I already have?
>
> Thanks
> Ana
>


Re: [R] on the growth of standard error

2020-08-22 Thread Abby Spurdle
> The absolute
> value of e grows as L grows, but by how much?  It seems statistical
> theory claims it grow by an order of the square root of L.

Assuming you want the standard deviation for the number of successes,
given p=0.5:

#exact
0.5 * sqrt (n)

#numerical approximation
sd (rbinom (1e6, n, 0.5) )

Note that the variance, n*p*(1-p) = n/4 here, is linear in n; the standard deviation grows like sqrt(n).



Re: [R] Rotation Forest Error Message

2020-08-20 Thread Abby Spurdle
Just re-read your question and realized I misread the error message.
The argument is of zero length.

But the conclusion is the same, either a bug in the package, or a
problem with your input.


On Fri, Aug 21, 2020 at 4:16 PM Abby Spurdle  wrote:
>
> Note that I'm not familiar with this package or the method.
> Also note that you haven't told anyone what function you're using, or
> what your call was.
>
> I'm assuming that you're using the rotationForest() function.
> According to its help page, the default is:
>
> K = round(ncol(x)/3, 0)
>
> There's no reason why the default K value should be higher than the
> number of columns, unless:
> (1) There's a bug with the package; or
> (2) There's a problem with your input.
>
> I note that the package is only version 0.1.3, so a bug is not out of
> the question.
> Also, I'm a little surprised the author didn't use integer division:
>
> K = ncol (x) %/% 3
>
> You could just set K to the above value, and see what happens...
>
>
> On Fri, Aug 21, 2020 at 1:06 PM Sparks, John  wrote:
> >
> > Hi R Helpers,
> >
> > I wanted to try the rotationForest package.
> >
> > I pointed it at my data set and got the error message "Error in if (K >= 
> > ncol(x)) stop("K should not be greater than or equal to the number of 
> > columns in x") :
> >   argument is of length zero'.
> >
> > My dataset has 3688 obs. of  111 variables.
> >
> > Would a quick adjustment to the default value of K resolve this?
> >
> > If anybody with more experience with the package than me has a general 
> > suggestion I would appreciate it.
> >
> > --John Spaarks
> >
> >


Re: [R] Rotation Forest Error Message

2020-08-20 Thread Abby Spurdle
Note that I'm not familiar with this package or the method.
Also note that you haven't told anyone what function you're using, or
what your call was.

I'm assuming that you're using the rotationForest() function.
According to its help page, the default is:

K = round(ncol(x)/3, 0)

There's no reason why the default K value should be higher than the
number of columns, unless:
(1) There's a bug with the package; or
(2) There's a problem with your input.

I note that the package is only version 0.1.3, so a bug is not out of
the question.
Also, I'm a little surprised the author didn't use integer division:

K = ncol (x) %/% 3

You could just set K to the above value, and see what happens...
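
e.g. (a sketch, assuming a data frame of predictors x, and a binary
response y):

library (rotationForest)
fit <- rotationForest (x, y, K = ncol (x) %/% 3)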


On Fri, Aug 21, 2020 at 1:06 PM Sparks, John  wrote:
>
> Hi R Helpers,
>
> I wanted to try the rotationForest package.
>
> I pointed it at my data set and got the error message "Error in if (K >= 
> ncol(x)) stop("K should not be greater than or equal to the number of columns 
> in x") :
>   argument is of length zero'.
>
> My dataset has 3688 obs. of  111 variables.
>
> Would a quick adjustment to the default value of K resolve this?
>
> If anybody with more experience with the package than me has a general 
> suggestion I would appreciate it.
>
> --John Spaarks
>
>


Re: [R] install.packages() R vs RStudio

2020-08-17 Thread Abby Spurdle
Duncan Murdoch  wrote:
> R is designed to be flexible, and to let people change its behaviour.
> Using that flexibility is what all users should do.  Improving the user
> experience is what front-end writers should do.  I don't find it
> inadvisable at all.

Well, that's a big whopping U-turn.

Abby Spurdle wrote:
> There's a work around.
> You can redefine the print function, using something like:
> print = function (...) base::print (...)

Duncan Murdoch replied:
> That's a really, really bad idea.  If there are two generics named the
> same, how are your users going to know which one they are getting when
> they just say print(myobj)?

https://stat.ethz.ch/pipermail/r-devel/2018-August/076581.html

I can't see how redefining a generic function is any different from
redefining a package installation function.
And can't help but suspect, you're making an exception for RStudio...



Re: [R] Best settings for RStudio video recording?

2020-08-16 Thread Abby Spurdle
> a) Read about it yourself. It is a legal definition.

Not quite.
Your statement implies some sort of universalism, which is unrealistic.
Legal definitions vary from one legal system to the next.

I'm not an expert in US company/corporate law.
But as I understand it, the applicable laws vary from state to state.

It's unlikely that you or most readers will interpret the original
statement in a strict legal sense.
But rather, the term is used to imply something.

If the criteria is:
Sacrificing profits (not just theirs, but their *holding/sibling
companies too*), for some public benefit(s)...

...then I would like to see evidence of this.

> b) Don't "correct" me with misinformation you are clearly inventing. RStudio 
> the software does not "introduce people to a modified version of R."

Read this post:
https://stat.ethz.ch/pipermail/r-help/2020-May/466788.html

My information is accurate, and RStudio does modify R, unless of
course something has changed...



Re: [R] Best settings for RStudio video recording?

2020-08-15 Thread Abby Spurdle
On Fri, Aug 14, 2020 at 12:11 PM Jeff Newmiller
 wrote:
>  It is a public benefit corporation

Seriously?

On Fri, Aug 14, 2020 at 12:11 PM Jeff Newmiller
 wrote:
>  used to introduce people to R

Correction, it introduces people to a modified version of R.



Re: [R] Reproducibility Between Local and Remote Computer with R

2020-08-08 Thread Abby Spurdle
Hi Kevin,

Intuitively, the first step would be to ensure that all versions of R,
and all the R packages, are the same.

However, you mention HPC.
And the glmnet package imports the foreach package, which appears
(after a quick glance) to support multi-core and parallel computing.

If your code uses parallel computing (?), you may need to look at how
random numbers, and related results, are handled...
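
A quick first check, run on both machines:

set.seed (123)
RNGkind ()    #all three components should match
print (rnorm (3), digits=17)
sessionInfo ()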


On Sun, Aug 9, 2020 at 1:14 AM Kevin Egan  wrote:
>
> I posted this question:
>
> I am currently using R , RStudio , and a remote computer (using an R script) 
> to run the same code. I start by using set.seed(123) in all three versions of 
> the code, then using glmnet to assess a matrix. Ultimately, I am having 
> trouble reproducing the results between my local and the remote computer's 
> results. I am using R version 4.0.2 locally, and R version 3.6.0 remote.
>
> After running several tests, I'm wondering if there is a difference between 
> the two versions in R which may lead to slightly different coefficients. If 
> anyone has any insight I would appreciate it.
>
> Thanks.
>
> and found that there were slight differences between using rnorm with R-4.0.2 
> and R-3.6.0 but did not find any differences for runif between both systems. 
> In my original code, I am using rnorm and was wondering if this may be the 
> reason I am finding slight differences in coefficients for glmnet and lars 
> testing between using my local computer (R-4.0.2) and my remote computer 
> (R-3.6.0). I am running my code locally on a MacOSX and remote on what I 
> believe is an HPC.
>
> Thanks.


Re: [R] defining group colours in a call to rda

2020-08-04 Thread Abby Spurdle
Hi,

Your example is not reproducible.
However, I suspect that the following is the problem:

c("red","green","blue","aquamarine","magenta")[MI_fish_all.mrt$where]

Here's my version:

where = c (3, 3, 8, 6, 6, 9, 5, 5, 9, 3, 8, 6, 9, 6, 5, 9, 5, 3, 8, 6,
9, 6, 5, 9, 5, 3, 3, 8, 6, 6, 9, 5, 5, 9, 6, 9, 5, 9)
unique (where)

c("red", "green", "blue", "aquamarine", "magenta")[where]

There's five colors.
But only two of the indices are within one to five.
So, the resulting color vector contains missing values.

In the base graphics system, if you set colors to NA, it usually means no color.

I'm not sure exactly what you want to do, but I'm assuming you can fix
it from here.
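
e.g. one possible fix (a sketch, mapping each unique leaf number to one
color):

where.index <- as.integer (factor (MI_fish_all.mrt$where) )
plotcolor <- c ("red", "green", "blue", "aquamarine", "magenta")[where.index]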

On Tue, Aug 4, 2020 at 9:49 PM Andrew Halford  wrote:
>
> Hi,
>
> I've been trying to use the output on group membership of the final leaves
> in a MRT analysis to define my own colours, however I am not getting the
> result I'm after.
>
> Here is the code
> fish.pca <-rda(fish_all.hel,scale=TRUE)
> fish.site <- scores(fish.pca,display="sites",scaling=3)
> fish.spp <-
> scores(fish.pca,display="species",scaling=3)[fish.MRT.indval$pval<=0.05,]
> plot(fish.pca,display=c("sites","species"),type="n",scaling=3)
> points(fish.site,pch=21,bg=MI_fish_all.mrt$where,cex=1.2)
> plotcolor <-
> c("red","green","blue","aquamarine","magenta")[MI_fish_all.mrt$where]
>  fish.pca <-rda(fish_all.hel,scale=TRUE)
> plot(fish.pca,display=c("sites","species"),type="n",scaling=3)
> points(fish.site,pch=21,bg=plotcolor,cex=1.2)
> MI_fish_all.mrt$where
>
> If I run the points command and insert the group membership direct from the
> MRT analysis e.g.  bg=MI_fish_all.mrt$where , then the subsequent points
> plot up correctly with a different colour for each group. However if I try
> to impose my own colour combo with plotcolor. It prints colours for 2
> groups and leaves the rest uncoloured.
>
> The call to  MI_fish_all.mrt$where gives...
>  [1] 3 3 8 6 6 9 5 5 9 3 8 6 9 6 5 9 5 3 8 6 9 6 5 9 5 3 3 8 6 6 9 5 5 9 6
> 9 5 9.
>
> These are the split groupings for all 39 sites in the analysis and there
> are 5 numbers corresponding to 5 final leaves in the tree.
>
> I cant see why my colour scheme isnt being recognised.
>
> All help accepted.
>
> Andy
>
>
> --
> Andrew Halford Ph.D
> Senior Coastal Fisheries Scientist
> Pacific Community | Communauté du Pacifique CPS – B.P. D5 | 98848 Noumea,
> New Caledonia | Nouméa, Nouvelle-Calédonie
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [FORGED] Dependent Variable in Logistic Regression

2020-08-03 Thread Abby Spurdle
> Sorry, Abby, I do disagree here (strongly enough to warrant
> this reply):

Which part are you disagreeing with?
That unambiguous names/references should be used, or that there are
many R functions for GLMs?
The wording of your post suggests (kind of) that there is only one R
function for GLMs.

> We're talking about doing "basic" statistics with R, and these
> functions in the stats package have been part of R even before R
> got a version number.

Remember, not everyone is using the same R packages, as you.
And some people have done university courses, or done online courses,
etc, in R, without ever using one function from the stats package.

I'm reluctant to assume that all R users will have a common understanding.
And what may seem obvious to you or me, may seem quite foreign to some
users, or vice versa.

> So, no,  glm()  {and the stats package} are the default and I still
> think everybody should know and assume that.

But perhaps most importantly, the OP said "the glm".
He never said "glm()", but rather the subsequent posters did.

Rolf suggested his post was bullshit, after removing the lexical peroxide.
How does anyone know that it wasn't a genuine post, but in reference
to something other than stats::glm?

Shouldn't people be innocent until proven guilty?
Otherwise (something I have been guilty of in the past), the mailing
list turns into statistical propaganda...

Even if the OP was referring to stats::glm, I'm still inclined to feel
the post was legitimate, just a bit short on technical details...

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [FORGED] Dependent Variable in Logistic Regression

2020-08-01 Thread Abby Spurdle
That's a bit harsh.
Isn't the best advice here, to post a reproducible example...
Which I believe has been mentioned.

Also, I'd strongly encourage people to use package+function name, for
this sort of thing.

stats::glm

As there are many R functions for GLMs...


On Sun, Aug 2, 2020 at 12:47 PM Rolf Turner  wrote:
>
>
> On 2/08/20 5:39 am, Paul Bernal wrote:
>
> > Dear friends,
> >
> > Hope you are doing great. I want to fit a logistic regression in R, where
> > the dependent variable is the covid status (I used 1 for covid positives,
> > and 0 for covid negatives), but when I ran the glm, R complains that I
> > should make the dependent variable a factor.
> >
> > What would be more advisable, to keep the dependent variable with 1s and
> > 0s, or code it as yes/no and then make it a factor?
> >
> > Any guidance will be greatly appreciated,
>
>
> There have been many responses to this post, the majority of them being
> confusing and off the point.
>
> BOTTOM LINE:  R/glm() does *NOT* complain that one "should make the
> dependent variable a factor".   This is bovine faecal output.
>
> As Rui Barradas has pointed out (alternatively: RTFM!) when you fit a
> Bernoulli model using glm(), your response/dependent variable is allowed
> to be
>
>  * a numeric variable with values 0 or 1
>  * a logical variable
>  * a factor with two levels
>
> The OP presumably fed glm() a *character* vector with values "0" and
> "1".  Doing *this* will cause glm() to whinge.
>
> I reiterate:  RTFM!!!  (And perhaps learn to distinguish between
> character vectors and factors.)
>
> cheers,
>
> Rolf Turner
>
> --
> Honorary Research Fellow
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to create a readable plot in R with 10000+ values in a dataframe

2020-07-29 Thread Abby Spurdle
On Sat, Jul 25, 2020 at 12:40 AM Martin Maechler
 wrote:
> Good answers to this question will depend very much on how many
> 'Machine' and 'Region' levels there are.

I second that.
And unless I missed something, the OP hasn't answered this question, as such.
But "10k+" combinations, does imply around 100 levels each.

Another important question is, are the combinations unique or not?

It would be possible to create an (approx):
100x100 heatmap of boolean values, for unique combinations, or;
100x100 heatmap of counts (or density), for non-unique combinations.

But unless there's some meaningful order to the levels, the resulting
plot may end up looking like a $3 pizza.
I'm unable to comment on possible exploratory value, but I doubt that
this is a good approach, for presentation purposes.

If the goal was some sort of ranking, a textual summary may work better...?
Or you could plot relevant subsets of the data...
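
For the counts case, a rough sketch (assuming a data frame, d, with
factor columns Machine and Region):

#frequency of each Machine-Region combination
counts <- table (d$Machine, d$Region)
#one cell per combination, colored by count
image (seq_len (nrow (counts) ), seq_len (ncol (counts) ),
    unclass (counts), xlab="Machine", ylab="Region")

As above, whether this is readable depends on whether the levels can
be put in some meaningful order first.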

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Character (1a, 1b) to numeric

2020-07-11 Thread Abby Spurdle
On Sat, Jul 11, 2020 at 8:04 AM Fox, John  wrote:
> We've had several solutions, and I was curious about their relative 
> efficiency. Here's a test

Am I the only person on this mailing list who learnt to program with ASCII...?

In theory, the most ***efficient*** solution is to get the
ASCII/UTF8/etc. values.
Then use a simple (math) formula.
No matching, no searching, required ...

Here's one possibility:

xc <-  c ("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")

I <- (nchar (xc) == 2)
xn <- as.integer (substring (xc, 1, 1) )
xn [I] <- xn [I] +
    (utf8ToInt (paste (substring (xc [I], 2, 2), collapse="") ) - 96) / 4
xn

Unfortunately, this makes R look bad.
The corresponding C implementation is simpler and presumably the
performance winner.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bivariate ReLU Distribution

2020-07-10 Thread Abby Spurdle
Last line should use outside = c (0, 1).
But not that important.

On Sat, Jul 11, 2020 at 1:31 PM Abby Spurdle  wrote:
>
> NOTE: LIMITED TESTING
> (You may want to check this carefully, if you're interested in using it).
>
> library (kubik)
> library (mvtnorm)
>
> sim.cdf <- function (mx, my, sdx, sdy, cor, ..., n=2e5)
> sim.cdf.2 (mx, my, sdx^2, sdy^2, sdx * sdy * cor, n=n)
>
> sim.cdf.2 <- function (mx, my, vx, vy, cov, ..., n=2e5)
> {   m <- c (mx, my)
> v <- matrix (c (vx, cov, cov, vy), 2, 2)
> u <- rmvnorm (n, m, v)
> #apply the ReLU, max(0, x), to both coordinates of each bivariate draw
> u <- pmax (u, 0)
> z <- u [,1] + u [,2]
>
> P0 <- sum (z == 0) / n
>
> z2 <- z [z != 0]
> z2 <- c (-z2, z2)
> de <- density (z2)
> xFh <- chs.integral (de$x, de$y)
>
> cx <- seq (0, max (de$x), length.out=60)
> cy <- xFh (cx)
> cy <- cy - cy [1]
> cy <- P0 + cy * (1 - P0) / cy [60]
>
> cs = chs.constraints (increasing=TRUE)
> chs (cx, cy, constraints=cs, outside = c (0, cy [60]) )
> }
>
> #X1, X2 means: 0 and 2
> #X1, X2 sds: 1.5 and 3.5
> #cor (X1, X2): 0.75
> Fh <- sim.cdf (0, 2, 1.5, 3.5, 0.75)
>
> plot (Fh, ylim = c (0, 1.05), yaxs="i")
>
> #prob 1 < U < 2
> Fh (2) - Fh (1)
>
>
> On Sat, Jul 11, 2020 at 1:49 AM Arun Kumar Saha via R-help
>  wrote:
> >
> > Hi,
> > I would rather have a Statistics related question hope experts here can 
> > provide some suggestions. I have posted this request in some other forum 
> > but failed to generate meaningful response
> > I am looking for some technical document on deriving the Distribution 
> > function for the sum of 2 ReLU(X)=max{0,X} distributions, i.e. max{0,X1} + 
> > max{0,X2}, where X1 and X2 jointly follow some bivariate Normal distribution.
> > There are few technical notes available for univariate ReLU distribution, 
> > however I failed to find any spec for bivariate/multivariate setup.
> > Any pointer on above subject will be highly helpful.
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bivariate ReLU Distribution

2020-07-10 Thread Abby Spurdle
NOTE: LIMITED TESTING
(You may want to check this carefully, if you're interested in using it).

library (kubik)
library (mvtnorm)

sim.cdf <- function (mx, my, sdx, sdy, cor, ..., n=2e5)
sim.cdf.2 (mx, my, sdx^2, sdy^2, sdx * sdy * cor, n=n)

sim.cdf.2 <- function (mx, my, vx, vy, cov, ..., n=2e5)
{   m <- c (mx, my)
v <- matrix (c (vx, cov, cov, vy), 2, 2)
u <- rmvnorm (n, m, v)
#apply the ReLU, max(0, x), to both coordinates of each bivariate draw
u <- pmax (u, 0)
z <- u [,1] + u [,2]

P0 <- sum (z == 0) / n

z2 <- z [z != 0]
z2 <- c (-z2, z2)
de <- density (z2)
xFh <- chs.integral (de$x, de$y)

cx <- seq (0, max (de$x), length.out=60)
cy <- xFh (cx)
cy <- cy - cy [1]
cy <- P0 + cy * (1 - P0) / cy [60]

cs = chs.constraints (increasing=TRUE)
chs (cx, cy, constraints=cs, outside = c (0, cy [60]) )
}

#X1, X2 means: 0 and 2
#X1, X2 sds: 1.5 and 3.5
#cor (X1, X2): 0.75
Fh <- sim.cdf (0, 2, 1.5, 3.5, 0.75)

plot (Fh, ylim = c (0, 1.05), yaxs="i")

#prob 1 < U < 2
Fh (2) - Fh (1)
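
As an aside, the point mass at zero, P (Z = 0) = P (X1 <= 0, X2 <= 0),
can be computed exactly, which gives a check on the simulated P0.
(A sketch, assuming mvtnorm's pmvnorm, with the same parameters as above).

library (mvtnorm)

m <- c (0, 2)
v <- matrix (c (1.5^2, 0.75 * 1.5 * 3.5, 0.75 * 1.5 * 3.5, 3.5^2), 2, 2)
#exact probability that both coordinates are non-positive
pmvnorm (upper = c (0, 0), mean = m, sigma = v)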


On Sat, Jul 11, 2020 at 1:49 AM Arun Kumar Saha via R-help
 wrote:
>
> Hi,
> I would rather have a Statistics related question hope experts here can 
> provide some suggestions. I have posted this request in some other forum but 
> failed to generate meaningful response
> I am looking for some technical document on deriving the Distribution 
> function for the sum of 2 ReLU(X)=max{0,X} distributions, i.e. max{0,X1} + 
> max{0,X2}, where X1 and X2 jointly follow some bivariate Normal distribution.
> There are few technical notes available for univariate ReLU distribution, 
> however I failed to find any spec for bivariate/multivariate setup.
> Any pointer on above subject will be highly helpful.
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Opening Another Application in R Then Hangs

2020-07-05 Thread Abby Spurdle
shell ("Notepad", wait=FALSE)

On Mon, Jul 6, 2020 at 10:07 AM Sparks, John  wrote:
>
> Hi R Helpers,
>
> I am trying to open another application from within R and then work with it.
>
> I can get the application to open, but R then hangs at that point (spinning 
> blue circle in the middle of the screen) and my subsequent programming does 
> not execute.
>
> Does anybody know how to get R to unlock?
>
> I am using Windows 10 and R4.0.
>
> The example below freezes R on my machine.
>
> Any guidance appreciated.  Thanks.
> --John Sparks
>
> setwd("C:/Users/JSparks/AppData/Roaming/Microsoft/Windows/Start 
> Menu/Programs/Accessories")
> system2("Notepad",invisible=FALSE)
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to combine uncertainty and weighting in spearman correlation?

2020-06-22 Thread Abby Spurdle
Hi,

I suspect that there's a formula for this.
However, I couldn't find it.
So, here's (most of) the code to produce a simulated data set.

---
sim.data <- function (xmean, ymean, xsd=NULL, ysd=NULL, w, ...,
print=FALSE, plot=FALSE, nsub=1000)
{   N <- length (xmean)

x <- y <- u <- matrix (0, nsub, N)
for (k in 1:N)
{   if (is.null (xsd) ) x [,k] <- xmean [k]
else x [,k] <- rnorm (nsub, xmean [k], xsd [k])
if (is.null (ysd) ) y [,k] <- ymean [k]
else y [,k] <- rnorm (nsub, ymean [k], ysd [k])
u [,k] <- w [k] / nsub
}
x <- as.vector (x)
y <- as.vector (y)
u <- as.vector (u)

if (print)
print (cbind (x, y, u) )
if (plot)
{   plot (x, y)
points (xmean, ymean, pch=16, col="blue")
}

cor (x, y)
}
---

I didn't install the wCorr package, so you'll need to change the
second to last line.
The weights are in the vector, u (not w).
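
For example (assuming wCorr's weightedCorr interface), the cor (x, y)
line could become:

#weighted Spearman correlation over the simulated points
wCorr::weightedCorr (x, y, method="Spearman", weights=u)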

A subsample is generated for each group.
I've computed weights, such that the total weights for each group are
equal to the original weight for that group.
That sounds right, but I'm not completely sure.

Then call it using something like:
sim.data (x1, y1, x1_SD, NULL, corr_weight)

You can remove the print/plot parts if you want.
But if you call it with print=TRUE, then set nsub to a smaller value.
sim.data (x1, y1, x1_SD, NULL, corr_weight, print=TRUE, plot=TRUE, nsub=10)

If there's any problems, let me know.

On Mon, Jun 22, 2020 at 7:55 PM Frederik Feys  wrote:
>
> Thanks Abby, some info on the data:
>
> score   score_SDdeath_count population_size
> x1  x1_SD   y1  corr_weight
> 4.3 2.3 5800900.000
> 5.7 6.1 250 11.000.600
> ..  ..      ..  ..
>
> > Op 22 jun. 2020, om 02:02 heeft Abby Spurdle  het 
> > volgende geschreven:
> >
> > I need to fix my mistakes, from earlier this morning.
> > The sums should be over densities, so:
> >
> > fh (X, Y) = [fh1 (X1, Y1) + fh2 (X2, Y2) + ... + fhn (Xn, Yn)] / n
> >
> > fh (X, Y) = w1*fh1 (X1, Y1) + w2*fh2 (X2, Y2) + ... + wn*fhn (Xn, Yn)
> >
> >assuming the weights sum to 1
> >
> > If simulated data is used, then the expressions above can be replaced
> > with the union of multiple (sub)samples.
> > Then an estimate/inference (say correlation) can be computed from one
> > or more combined samples.
> >
> > Sorry, for triple posting.
> >
> >
> > On Mon, Jun 22, 2020 at 10:00 AM Abby Spurdle  wrote:
> >>
> >> Hi Frederick,
> >>
> >> I glanced at the webpage you've linked.
> >> (But only the top three snippets).
> >>
> >> This is what I would call the sum of random variables.
> >> (X, Y) = (X1, Y1) + (X2, Y2) + ... + (Xn, Yn)
> >>
> >> The example makes the mistake of assuming that the Xs are normally
> >> distributed, and each of the Ys are from exactly the same uniform
> >> distribution.
> >> By "combine"-ing both approaches, are you wanting to weight each pair?
> >>
> >> w1(X1, Y1) + w2(X2, Y2) + ... + wn(Xn, Yn)
> >>
> >> I note that you haven't told us much about your data.
> >> There may be an easier way of doing things...
> >>
> >>
> >> On Mon, Jun 22, 2020 at 1:53 AM Frederik Feys  wrote:
> >>>
> >>> Hello everyone
> >>>
> >>> At the moment I put a lot of attention in the uncertainty of my analyzes. 
> >>> I want to do a spearman correlation that takes into account the 
> >>> uncertainty in my observations and has weighting.
> >>>
> >>> uncertainty of observations: I came across this excellent blog that 
> >>> proposes a bootstrap function: 
> >>> https://www.r-bloggers.com/finding-correlations-in-data-with-uncertainty/
> >>>
> >>> weighted: I do weighted correlations with the wCorr package.
> >>>
> >>> Now I want to combine both approaches in one approach for a final 
> >>> analysis. How would you do that?
> >>>
> >>> Thanks for the help!
> >>>
> >>> Frederik Feys
> >>> PhD Medical Sciences
> >>> Onafhankelijk Methodoloog
> >>> https://www.researchgate.net/profile/Frederik_Feys
> >>> +32488020010
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>[[alternative HTML version deleted]]
> >>>
> >>> __
> >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide 
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to combine uncertainty and weighting in spearman correlation?

2020-06-21 Thread Abby Spurdle
I need to fix my mistakes, from earlier this morning.
The sums should be over densities, so:

fh (X, Y) = [fh1 (X1, Y1) + fh2 (X2, Y2) + ... + fhn (Xn, Yn)] / n

fh (X, Y) = w1*fh1 (X1, Y1) + w2*fh2 (X2, Y2) + ... + wn*fhn (Xn, Yn)

assuming the weights sum to 1

If simulated data is used, then the expressions above can be replaced
with the union of multiple (sub)samples.
Then an estimate/inference (say correlation) can be computed from one
or more combined samples.

Sorry, for triple posting.


On Mon, Jun 22, 2020 at 10:00 AM Abby Spurdle  wrote:
>
> Hi Frederick,
>
> I glanced at the webpage you've linked.
> (But only the top three snippets).
>
> This is what I would call the sum of random variables.
> (X, Y) = (X1, Y1) + (X2, Y2) + ... + (Xn, Yn)
>
> The example makes the mistake of assuming that the Xs are normally
> distributed, and each of the Ys are from exactly the same uniform
> distribution.
> By "combine"-ing both approaches, are you wanting to weight each pair?
>
> w1(X1, Y1) + w2(X2, Y2) + ... + wn(Xn, Yn)
>
> I note that you haven't told us much about your data.
> There may be an easier way of doing things...
>
>
> On Mon, Jun 22, 2020 at 1:53 AM Frederik Feys  wrote:
> >
> > Hello everyone
> >
> > At the moment I put a lot of attention in the uncertainty of my analyzes. I 
> > want to do a spearman correlation that takes into account the uncertainty 
> > in my observations and has weighting.
> >
> > uncertainty of observations: I came across this excellent blog that 
> > proposes a bootstrap function: 
> > https://www.r-bloggers.com/finding-correlations-in-data-with-uncertainty/
> >
> > weighted: I do weighted correlations with the wCorr package.
> >
> > Now I want to combine both approaches in one approach for a final analysis. 
> > How would you do that?
> >
> > Thanks for the help!
> >
> > Frederik Feys
> > PhD Medical Sciences
> > Onafhankelijk Methodoloog
> > https://www.researchgate.net/profile/Frederik_Feys
> > +32488020010
> >
> >
> >
> >
> >
> >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to combine uncertainty and weighting in spearman correlation?

2020-06-21 Thread Abby Spurdle
Just realised the above notation may be a bit misleading.
Because I was thinking in terms of simulated data.


On Mon, Jun 22, 2020 at 10:00 AM Abby Spurdle  wrote:
>
> Hi Frederick,
>
> I glanced at the webpage you've linked.
> (But only the top three snippets).
>
> This is what I would call the sum of random variables.
> (X, Y) = (X1, Y1) + (X2, Y2) + ... + (Xn, Yn)
>
> The example makes the mistake of assuming that the Xs are normally
> distributed, and each of the Ys are from exactly the same uniform
> distribution.
> By "combine"-ing both approaches, are you wanting to weight each pair?
>
> w1(X1, Y1) + w2(X2, Y2) + ... + wn(Xn, Yn)
>
> I note that you haven't told us much about your data.
> There may be an easier way of doing things...
>
>
> On Mon, Jun 22, 2020 at 1:53 AM Frederik Feys  wrote:
> >
> > Hello everyone
> >
> > At the moment I put a lot of attention in the uncertainty of my analyzes. I 
> > want to do a spearman correlation that takes into account the uncertainty 
> > in my observations and has weighting.
> >
> > uncertainty of observations: I came across this excellent blog that 
> > proposes a bootstrap function: 
> > https://www.r-bloggers.com/finding-correlations-in-data-with-uncertainty/
> >
> > weighted: I do weighted correlations with the wCorr package.
> >
> > Now I want to combine both approaches in one approach for a final analysis. 
> > How would you do that?
> >
> > Thanks for the help!
> >
> > Frederik Feys
> > PhD Medical Sciences
> > Onafhankelijk Methodoloog
> > https://www.researchgate.net/profile/Frederik_Feys
> > +32488020010
> >
> >
> >
> >
> >
> >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to combine uncertainty and weighting in spearman correlation?

2020-06-21 Thread Abby Spurdle
Hi Frederick,

I glanced at the webpage you've linked.
(But only the top three snippets).

This is what I would call the sum of random variables.
(X, Y) = (X1, Y1) + (X2, Y2) + ... + (Xn, Yn)

The example makes the mistake of assuming that the Xs are normally
distributed, and each of the Ys are from exactly the same uniform
distribution.
By "combine"-ing both approaches, are you wanting to weight each pair?

w1(X1, Y1) + w2(X2, Y2) + ... + wn(Xn, Yn)

I note that you haven't told us much about your data.
There may be an easier way of doing things...


On Mon, Jun 22, 2020 at 1:53 AM Frederik Feys  wrote:
>
> Hello everyone
>
> At the moment I put a lot of attention in the uncertainty of my analyzes. I 
> want to do a spearman correlation that takes into account the uncertainty in 
> my observations and has weighting.
>
> uncertainty of observations: I came across this excellent blog that proposes 
> a bootstrap function: 
> https://www.r-bloggers.com/finding-correlations-in-data-with-uncertainty/
>
> weighted: I do weighted correlations with the wCorr package.
>
> Now I want to combine both approaches in one approach for a final analysis. 
> How would you do that?
>
> Thanks for the help!
>
> Frederik Feys
> PhD Medical Sciences
> Onafhankelijk Methodoloog
> https://www.researchgate.net/profile/Frederik_Feys
> +32488020010
>
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Load svg, eps or png into graphics device?

2020-06-19 Thread Abby Spurdle
If I understand your question correctly, you're already able to read
an EPS file.
So, essentially, you have an answer to your question.

Paul Murrell published an article on using raster graphics, in 2011.
https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Murrell.pdf

I would assume there's been progress since then.
But I don't see the point here.
Writing what would otherwise be vector graphics to a raster-based
image file, and then reading it back into a vector graphics system,
would create unnecessary problems.


On Sat, Jun 20, 2020 at 2:56 AM Rainer M Krug  wrote:
>
>
> Hi
>
> I have a package, which plots from the plantuml syntax (https://plantuml.com) 
> graphs via a knitr engine, into a file, or into a graphics device 
> (https://github.com/rkrug/plantuml).
>
> I am using at the moment grImport for the eps import, but would like to cut 
> down on dependencies.
>
> Is there a way of bringing graphics files (png, svg, eps, …) into a graphics 
> device in R?
>
> Thanks,
>
> Rainer
>
>
>
>
>
>
> --
> Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, 
> UCT), Dipl. Phys. (Germany)
>
> Orcid ID: -0002-7490-0066
>
> Department of Evolutionary Biology and Environmental Studies
> University of Zürich
> Office Y34-J-74
> Winterthurerstrasse 190
> 8075 Zürich
> Switzerland
>
> Office: +41 (0)44 635 47 64
> Cell:   +41 (0)78 630 66 57
> email:  rainer.k...@uzh.ch
> rai...@krugs.de
> Skype: RMkrug
>
> PGP: 0x0F52F982
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mnormt package for bivariate normal distribution

2020-06-17 Thread Abby Spurdle
I'm not familiar with the mnormt package.
I'm guessing its documentation may answer some (if not all) of your questions.

Note that my package, bivariate, wraps the dmvnorm function, from the
mvtnorm package.

library (bivariate)

f <- nbvpdf (
0, 0, #means X, Y
1, 1, #sds X, Y
0.5)  #cor

#combined contour-heat
plot (f)
#surface
plot (f, TRUE, arrows=FALSE)
#other
plot (f, all=TRUE, arrows=FALSE)

https://cran.r-project.org/web/packages/mnormt/mnormt.pdf
https://cran.r-project.org/web/packages/mvtnorm/mvtnorm.pdf

https://cran.r-project.org/web/packages/bivariate/vignettes/bivariate.pdf
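
And if you'd rather keep your persp/contour code, but swap in the
exact density, here's a sketch using mnormt's dmnorm.
(The covariance matrix below is my reading of your rnorm-based
simulation: var (x) = 225, var (y) = 156.25, cov (x, y) = 112.5).

library (mnormt)

x0 <- seq (-45, 45, length.out=50)
y0 <- seq (-40, 40, length.out=50)
m <- c (0, 0)
v <- matrix (c (225, 112.5, 112.5, 156.25), 2, 2)

#evaluate the exact bivariate normal density over the grid
z <- outer (x0, y0, function (x, y) dmnorm (cbind (x, y), m, v) )

persp (x0, y0, z, theta = 55, phi = 35,
    shade = 0.75, col = "gold", expand = 0.5, r = 2,
    ltheta = 25, ticktype = "detailed")
contour (x0, y0, z)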


On Thu, Jun 18, 2020 at 3:36 AM Vahid Borji  wrote:
>
> Hi, my R friends,
> I am going to plot the surface of a bivariate normal distribution and its
> contours. I have written the below codes:
>
> library(MASS)
>
> set.seed(69)
> n <- 5000
> x <- rnorm(n, 0, 15)
> y <- 0.5 * x  + rnorm(n, 0, 10)
> z <- kde2d(x, y, n = 50)
>
> persp(z, theta = 55, phi = 35,
>   shade = 0.75, col = "gold", expand = 0.5, r = 2,
>   ltheta = 25, ticktype = "detailed")
>
> contour(z)
>
> It seems that I could use mnormt package for bivariate normal distribution.
> How can I use suitable functions of mnormt package in my above codes?
> Indeed I want to change my above codes somewhat and use the mnormt package.
>
> Thank you in advance
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Seeking implementation of my algorithm 'spdspds' - a novel algorithm for solving Linear Programming Problems with O(L^1.5) computational complexity

2020-06-11 Thread Abby Spurdle
> solving Linear Programming Problems with O(L^1.5)
> computational complexity

I'm not an expert on this topic.
However, a quick glance at the topic suggests that these sorts of
algorithms are usually exponential in "n", here the number of
variables/dimensions.
Apparently, "L" is the number of input bits.

Your notation suggests your algorithm depends on the number of input
bits only, and is otherwise constant in the number of
variables/dimensions.
So, we can solve an LP with hundreds of millions of variables,
near-instantaneously...?

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] did bot execute

2020-06-02 Thread Abby Spurdle
(excerpts only)
> Tried this new version but did not execute...
> Error in plot_ds(bat_call, "plot 2", c(25, 28), c(-15, 10), k1 = 1.25,  :
>   object 'bat_call' not found

I've used the bat_call object, from Jim's earlier post.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on contour plots

2020-06-02 Thread Abby Spurdle
> The contour lines are actually useful to see groupings.
> However w/o a legend for density it is not possible to see what is
> presented.

I need to re-iterate, that the diagonal lines, may be important.

Also, I'm not sure I see the point in adding density values.
Unless people have a good knowledge of probability theory and
calculus, I doubt that specific density values will be useful.
i.e. If I said the density was 0.0035, what does that tell you...?

If you really want to add a legend, it's possible.

But this creates at least two problems:
(1) In the base graphics system, the resulting plots can't be nested.
(2) It's difficult to interpret specific color-encoded values.

In my opinion, a better idea, is to label the contour lines.
In my packages, this is possible by using contour.labels=TRUE,
however, the defaults are ugly.
(Something else for my todo list).

Here's a slightly more complex example, with prettier contour labels:

library (barsurf)
library (KernSmooth)
set.bs.theme ("heat")

plot_ds <- function (dataset, main="", xlim, ylim, ...,
ncontours=3, labcex=0.8, ndec=3,
k1=1, k2=1, n=30)
{   names <- names (dataset)
x <- dataset [,1]
y <- dataset [,2]
bw.x <- k1 * bw.nrd (x)
bw.y <- k2 * bw.nrd (y)
if (missing (xlim) )
xlim <- range (x) + c(-1, 1) * bw.x
if (missing (ylim) )
ylim <- range (y) + c(-1, 1) * bw.y

ks <- bkde2D (dataset, c (bw.x, bw.y),
c (n, n), list (xlim, ylim), FALSE)

fb <- seq (min (ks$fhat), max (ks$fhat),
length.out = ncontours + 2)
fb <- fb [2:(ncontours + 1)]
fb <- round (fb, ndec)

plot_cfield (ks$x1, ks$x2, ks$fhat,
contours=FALSE,
main=main, xlab = names [1], ylab = names [2],
xyrel="m")
points (x, y, pch=16, col="#00000040")
contour (ks$x1, ks$x2, ks$fhat, levels=fb, labcex=labcex, add=TRUE)
}

plot_ds (bat_call, "plot 2", c (25, 28), c (-15, 10), k1=1.25, k2=1.25)

If you still want a legend, have a look at:
graphics::filled.contour

And then modify the second half of my code, starting after ks <- ...
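
Roughly, that could look like this (a sketch only, reusing the ks, x
and y objects from inside plot_ds above):

#filled contours, with a color key for the density values
filled.contour (ks$x1, ks$x2, ks$fhat,
    color.palette=heat.colors,
    plot.axes = {axis (1); axis (2); points (x, y, pch=16, col="#00000040")})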

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on contour plots

2020-06-02 Thread Abby Spurdle
>  that extraneous white lines in PDFs are the fault of the PDF
> viewing program rather than of R.

Except it's a PNG file.

I've tried to minimize artifacts viewing PDF files.
But assumed (falsely?) that PNGs and other raster formats, would be fine.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on contour plots

2020-06-02 Thread Abby Spurdle
> Very nice

Jim, thank you.
However, the (deterministic, or near-deterministic) diagonal lines in
the plot, make me question the suitability of this approach.
In my plot, the contour lines could be removed, and brighter colors
could be used.

But perhaps, a better approach would be to model those lines...
And it's not clear from the plot, if all the observations fall on a
diagonal line...


P.S.
I'm not sure why there's a white line on the plot.
Most of my testing was with PDF output, I will need to do some more
testing with PNG output.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on contour plots

2020-06-02 Thread Abby Spurdle
I'm putting this back on the list.

> So how would I set up the code to do this with the data type I have?

> I will need to replicate the same task > 200 times with other data sets.
> What I need to do is plot *Fc *against *Sc* with the third dimension being 
> the *density* of the data points.

Using Jim's bat_call data:

library (bivariate)

plot_ds <- function (dataset, main="", xlim, ylim, ..., k1=1, k2=1)
{   names <- names (dataset)
fh <- kbvpdf (dataset [,1], dataset [,2],
    k1 * bw.nrd (dataset [,1]), k2 * bw.nrd (dataset [,2]) )
plot (fh, main=main, xlab = names [1], ylab = names [2],
xlim=xlim, ylim=ylim,
ncontours=2)
}

plot_ds (bat_call, "plot 1", k1=1.25, k2=1.25)

Note that I've used stats::bw.nrd.
The k1 and k2 values, simply scale the default bandwidth.
(In this case, I've increased the smoothness).

If you want to do it 200+ times:
(1) Create another function, to iterate over each data set.
(2) If you want to save the plots, you will need to add in a call to
pdf/png/etc and close the device, in each iteration.
(3) It may be desirable to have constant xlim/ylim values, ideally
based on the ranges of the combined data:

plot_ds (bat_call, "plot 1", xlim = c (25, 30), ylim = c (-15, 10),
k1=1.25, k2=1.25)

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Query on contour plots

2020-06-01 Thread Abby Spurdle
Hi,

I'm probably biased.

But my package, bivariate, contains a wrapper for KernSmooth::bkde2D,
which can produce both 3D surface plots and (pretty) contour plots of
bivariate kernel density estimates, conveniently.

https://cran.r-project.org/web/packages/bivariate/vignettes/bivariate.pdf
(pages 18 to 19)


On Mon, Jun 1, 2020 at 5:16 AM Neotropical bat risk assessments
 wrote:
>
> Hi all,
>
> While exploring  packages for 3D plots that several folks suggested (Tnx
> all!)
> It seems what I really need is a contour plot.  This is not working int
> he Deducer GUI.
>
> This will be an aid to separating bats by their vocal signatures.
> What I need to do is plot *Fc *against *Sc* with the third dimension
> being the *density* of the data points in the Fc-Sc plot.
>
> Data format is like this abbreviated sample.  Fc is a frequency in kHz
> and Sc is the characteristic slope  (octaves per second) of each call pulse.
>
> Any suggestions, guidance greatly appreciated.
> Bruce
>
> Fc  Sc
> 26.58   -5.95
> 27.03   -8.2
> 27.16   -2.07
> 26.19   -7.68
> 26.62   -3.99
> 26.85   -6.08
> 26.94   0
> 26.1-5.74
> 26.62   -5.96
> 26.85   -4.05
> 26.98   -4.09
> 26.02   -5.69
> 26.53   -7.89
> 26.62   -2
> 26.8-4.04
> 28.73   7
> 25.72   -2.97
> 26.14   -5.76
> 26.32   -3.89
> 26.4    0
> 26.32   5.88
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The best way for making speciall matrix

2020-05-23 Thread Abby Spurdle
This sounds like a homework question...
But... numerical linear algebra rocks...

cbind (diag (1:3), 4:6)


On Sat, May 23, 2020 at 9:46 PM Vahid Borji  wrote:
>
> Hi my friends,
>
> I want to make the below matrix in r:
>
> 1 0 0 4
>
> 0 2 0 5
>
> 0 0 3 6
>
> I used the below code:
>
> matrix(c(1,0,0,0,2,0,0,0,3,4,5,6),nrow=3)
>
> My code works. But I do not like my solution. I am trying to find the
> simplest way of making this matrix. Do you think my code is the simplest
> code for making this matrix? If not, could anyone write a simpler code
> than mine?
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] access for free more than 500 essential Springer Nature textbooks

2020-05-23 Thread Abby Spurdle
> My book is
> Statistical Analysis and Data Display, Richard M. Heiberger, Burt
> Holland, 2nd ed. 2015

In all fairness, I thought I should look at your book.

I was quite impressed by the chapter on multiple comparisons.
And may look again, later.

In my personal opinion (diverging slightly), with more and more people
using extensive exploratory-style modelling, I think some of the
methods for multiple comparisons could (and should) be adapted to the
interpretation of exploratory plots.

And returning to your book...
There's relatively comprehensive chapters on multiple regression.
And I'm happy you used the term "explanatory" rather than "independent".
When people use the term "independent variables" (outside a
theoretical context) they usually go on my how-did-these-people-get-a-job
list.

Some comments:
(1) Expanding on the above point, I couldn't see a section on
interpreting regression coefficients, which is something people often
get wrong.
(2) It had a nice objective/frequentist flavor, but it would have been
nicer to see clearer references to robust, semiparametric and more
general nonparametric methods, even if only brief.

In principle, some of these (but not all) follow the same philosophy
as classical statistics, but allow greater flexibility.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [External] Re: access for free more than 500 essential Springer Nature textbooks

2020-05-22 Thread Abby Spurdle
> The Excel file is what you need.

Well, now I'm in a bad mood.

I went to all the trouble of opening the thing...
And the first two Springer-published books I look for, aren't there.

(1) Programming with Data, John Chambers
(2) Applied Econometrics with R, Z and co.

Next time someone tells me to use an Excel document, I'm adding them
to the spam list.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

