Re: [R] Fwd: dmnorm not meant for 1024-dimensional data?

2007-04-25 Thread Adelchi Azzalini
On Wed, Apr 25, 2007 at 01:25:09PM -0500, Daniel Elliott wrote:
 Hello.  Thank you for your mnormt package.  Below is an email I sent to
 R-help regarding the function dmnorm.
 
 Any help you can provide would be greatly appreciated.
 

Hello.

You have not provided information on what specifically you
have tried for the parameters and evaluation points, so I have
chosen a particularly simple case, namely the 1024-dimensional
density with independent standard normal marginals, using this code:

d <- 1024
x <- 1
const <- 0.5*log(2*pi)
log.pdf <- 0
for (i in 1:d)   log.pdf <- log.pdf - x^2/2 - const
#
S <- diag(d)
mu <- rep(0,d)
X <- rep(x,d)
log.pdf2 <- dmnorm(X, mu, S, log=TRUE)
cat(log.pdf, log.pdf2, abs(log.pdf-log.pdf2), "\n")

and the outcome was this one:
  -1453 -1453 2.137e-11 
which seems quite decent to me. Obviously, you must not take exp()
of this log-density, as otherwise a 0 is indeed produced; but the
reason lies not in the package, rather in the impossibility of
representing that number using standard floating-point arithmetic.
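As an aside, for the mixture-of-Gaussians setting described in the
forwarded message below, the component probabilities can be obtained from
these log-densities without exponentiating them directly, via the
log-sum-exp device. A minimal sketch (the data, means and weights here are
hypothetical, not taken from the original message):

   library(mnormt)
   d  <- 1024;  K <- 2
   X  <- matrix(rnorm(5*d), 5, d)       # 5 hypothetical data points
   mu <- list(rep(0, d), rep(0.1, d))   # hypothetical component means
   w  <- c(0.5, 0.5)                    # mixing weights
   # log(w_k) + log-density under component k, one row per data point
   logdens <- sapply(1:K, function(k)
                 log(w[k]) + dmnorm(X, mu[[k]], diag(d), log=TRUE))
   # log-sum-exp: subtract the row maxima before exponentiating
   m    <- apply(logdens, 1, max)
   resp <- exp(logdens - m) / rowSums(exp(logdens - m))
   rowSums(resp)                        # each row sums to 1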

best regards,

Adelchi Azzalini


 Thank you.
 
 - dan elliott
 
 -- Forwarded message --
 From: Daniel Elliott [EMAIL PROTECTED]
 Date: Apr 25, 2007 1:21 PM
 Subject: dmnorm not meant for 1024-dimensional data?
 To: r-help@stat.math.ethz.ch
 
 Hello,
 
 I have some data generated by a simple mixture of Gaussians (more like
 K-means) and (as a test) am using dmnorm to calculate the probability
 of each data point coming from each Gaussian.  However, I get only
 zero probabilities.
 
 This code works in low dimensions (tried 2 and 3 already).  I have run
 into many implementations that do not work in high dimension, but I
 thought that I was safe with dmnorm because it has an option to
 compute the log of the probability.
 
 So, is dmnorm not intended to be used with data of such high dimensionality?
 
 Thank you,
 
 dan elliott

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bivariate interpolation

2007-04-03 Thread Adelchi Azzalini
On Mon, 2 Apr 2007 22:26:06 -0400, eric lee wrote:

EL Hi.  I'm trying to take a data set with two independent and one dependent
EL variable and enter an x,y value to predict the dependent with a
EL nonparametric technique.  I've been using interpp in the akima package,
EL (windows xp, R 2.4.1), but get values that are orders of magnitude off
EL when the predictors are slightly out of the range of the data set.  Can
EL you recommend a function for me?  I've read that predict.loess has the
EL same problem or just lists NA.  Smooth.spline looks good, but can only do
EL one predictor instead of bivariate.  I've tried help.search("predict"),
EL but couldn't find what I needed.  Thanks.
EL 

one option is to use sm.regression of package sm, along these lines

 x <- cbind(runif(100,-2, 2), runif(100,-2, 2))
 y <- x[,1]^2 + x[,2]^2 + 0.7*x[,1]*x[,2] + rnorm(100)/3
 sm.regression(x, y)
 ev.pt <- rbind(c(-2,-2), c(2,2), c(2,-2), c(-2,2))
 a <- sm.regression(x, y, eval.points=ev.pt, eval.grid=FALSE)
-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bivariate interpolation

2007-04-03 Thread Adelchi Azzalini
On Tue, 3 Apr 2007 11:43:55 +0200, Adelchi Azzalini wrote:

AA On Mon, 2 Apr 2007 22:26:06 -0400, eric lee wrote:
AA 
AA EL Hi.  I'm trying to take a data set with two independent and one
AA EL dependent variable and enter a x,y value to predict the dependent
AA EL with a nonparametric technique.  I've been using interpp in the akima
AA EL package, (windows xp, R 2.4.1), but get values that are orders of
AA EL magnitude off when the predictors are slightly out of the range of
AA EL the data set.  Can you recommend a function for me?  I've read that
AA EL predict.loess has the same problem or just lists NA.  Smooth.spline
AA EL looks good, but can only do one predictor instead of bivariate.  I've
AA EL tried help.search("predict"), but couldn't find what I needed.
AA EL Thanks.
AA EL 
AA 
AA one option is to use sm.regression of package sm, along these lines
AA 
AA  x <- cbind(runif(100,-2, 2), runif(100,-2, 2))
AA  y <- x[,1]^2 + x[,2]^2 + 0.7*x[,1]*x[,2] + rnorm(100)/3
AA  sm.regression(x, y)
AA  ev.pt <- rbind(c(-2,-2), c(2,2), c(2,-2), c(-2,2))
AA  a <- sm.regression(x, y, eval.points=ev.pt, eval.grid=FALSE)

I forgot the last bit

   print(a$estimate)
-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Correlated random effects in lme

2007-03-17 Thread Adelchi Azzalini
On Sat, Mar 17, 2007 at 08:12:35PM +0100, Guillermo Villa wrote:
 Hello,
 
 I am interested in estimating this type of random effects panel:
 
 y_it = x'_it * beta + u_it + e_it
 
 u_it = rho * u_i,t-1 + d_it,    where rho belongs to (-1, 1)
 
 where:
 
 u and e are independently normally zero-mean distributed.
 
 d is also independently normally zero-mean distributed.
 
 So, I want random effects for group i to be correlated in t, following an
 AR(1) process.
 
 Any idea of how to estimate this kind of models using the lme function?
 
 (Now, I know the corAR1 option is not the way...)
 
 G

If you add an AR(1) process, the {u_t} process in this case (here I
drop the index i), and a pure noise effect, the {e_t} process in your
notation, their sum is an ARMA(1,1) process with certain restrictions
on the parameters. That said, I do not know lme() well enough to insert
this constraint on the parameters. As an approximate solution, you could
fit a plain ARMA(1,1) model, ignoring the restriction.
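For the approximate route, a rough sketch with simulated data (the data
frame, variable names and parameter values below are invented purely for
illustration) is to let gls() from package nlme estimate an unrestricted
ARMA(1,1) correlation within each group:

  library(nlme)
  set.seed(2)
  id   <- factor(rep(1:20, each = 10))      # 20 groups, 10 time points each
  time <- rep(1:10, 20)
  x    <- rnorm(200)
  u    <- as.numeric(replicate(20, arima.sim(list(ar = 0.6), 10)))  # AR(1) effect
  y    <- 1 + 2*x + u + rnorm(200, sd = 0.5)                        # plus pure noise
  dat  <- data.frame(y, x, id, time)
  fit  <- gls(y ~ x, data = dat,
              correlation = corARMA(p = 1, q = 1, form = ~ time | id))
  summary(fit)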

However, is it not the case that e_{it} is actually constant over t,
so that we have just an e_i term (a very frequent case in practice)?
That is a much easier problem.


Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] vectorized dmvnorm

2007-03-02 Thread Adelchi Azzalini
On Fri, 2 Mar 2007 12:08:20 +0100, Ingmar Visser wrote:

IV Dear List,
IV Is there an R-function that computes vectorized densities of the  
IV multivariate normal distribution, ie vectorized in x, mean and sigma?
IV 
IV That is, a function that takes eg:
IV 
IV x <- matrix(0,3,2)
IV y <- matrix(1:6,3)-3
IV sig <- array(c(1,0,0,1,1,0.1,0.1,1,1,0.3,0.3,1),c(2,2,3))
IV 
IV  x
IV      [,1] [,2]
IV [1,]    0    0
IV [2,]    0    0
IV [3,]    0    0
IV  y
IV      [,1] [,2]
IV [1,]   -2    1
IV [2,]   -1    2
IV [3,]    0    3
IV  sig
IV , , 1
IV 
IV      [,1] [,2]
IV [1,]    1    0
IV [2,]    0    1
IV 
IV , , 2
IV 
IV      [,1] [,2]
IV [1,]  1.0  0.1
IV [2,]  0.1  1.0
IV 
IV , , 3
IV 
IV      [,1] [,2]
IV [1,]  1.0  0.3
IV [2,]  0.3  1.0
IV 
IV and then returns a vector with elements:
IV 
IV dmvnorm(x[i,], y[i,], sig[,,i])
IV 
IV And/or is there another efficient way of computing these without using  
IV loops:
IV for(i in 1:3)  print(dmvnorm(x[i,],y[i,],sig[,,i]))
IV 

a partial answer:
dmnorm() in package mnormt is vectorized for x, not for mean and covariance;
dmvnorm() in package mvtnorm works similarly
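For the fully general case, a small sketch (my own, not part of either
package) is simply to loop over the third index, e.g. with vapply():

  library(mnormt)
  x   <- matrix(0, 3, 2)
  y   <- matrix(1:6, 3) - 3
  sig <- array(c(1,0,0,1, 1,0.1,0.1,1, 1,0.3,0.3,1), c(2,2,3))
  dens <- vapply(seq_len(nrow(x)),
                 function(i) dmnorm(x[i,], y[i,], sig[,,i]),
                 numeric(1))
  dens    # one density value per row of x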

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fitting the gamma cumulative distribution function

2007-02-27 Thread Adelchi Azzalini
On Tue, 27 Feb 2007 06:59:51 -0800 (PST), Stephen Tucker wrote:

ST Hi Tim,
ST 
ST I believe fitdistr() in the MASS package is the function you are looking
ST for. (example given in help page)...
ST 
ST Best regards,
ST ST
ST 
ST --- Tim Bergsma [EMAIL PROTECTED] wrote:
ST 
ST  Hi.
ST  
ST  I have a vector of quantiles and a vector of probabilites that, when 
ST  plotted, look very like the gamma cumulative distribution function.  I 
ST  can guess some shape and scale parameters that give a similar result, 
ST  but I'd rather let the parameters be estimated.  Is there a direct way 
ST  to do this in R?
ST  
ST  Thanks,
ST  
ST  Tim.
ST  
ST  week <- c(0,5,6,7,9,11,14,19,39)
ST  fraction <- c(0,0.23279,0.41093,0.58198,0.77935,0.88057,0.94231,0.98583,1)
ST  weeks <- 1:160/4
ST  plot(weeks, pgamma(weeks, shape=6, scale=1.15), type="l")
ST  points(week, fraction, col="red")
ST  

you can decide on a distance criterion and select the parameters which
minimize that distance, something like
 
  criterion <- function(param, week, fraction){
     cdf <- pgamma(week, param[1], param[2])
     p   <- diff(cdf)
     sum((diff(fraction)-p)^2/p)  # or some other discrepancy function
  }
  
and then minimize this criterion with respect to the parameters
using optim() or nlminb().
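For completeness, a sketch of the minimisation step, reusing the data from
the quoted message; the starting values and the box constraints below are
illustrative only:

  week     <- c(0,5,6,7,9,11,14,19,39)
  fraction <- c(0,0.23279,0.41093,0.58198,0.77935,0.88057,0.94231,0.98583,1)
  criterion <- function(param, week, fraction){
     cdf <- pgamma(week, param[1], param[2])
     p   <- diff(cdf)
     sum((diff(fraction)-p)^2/p)
  }
  fit <- optim(c(6, 1/1.15), criterion, week = week, fraction = fraction,
               method = "L-BFGS-B", lower = c(1e-6, 1e-6))
  fit$par     # estimated shape and rate of the gamma distribution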

You cannot use fitdistr() because it requires the individual sample values.
In fact you cannot even use MLE for grouped data, or X^2, since the sample
size is not known (at least not reported); hence we do not have the absolute 
frequencies. If the sample size were known, then the problem would change.
  
-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] generate Binomial (not Binary) data

2007-02-09 Thread Adelchi Azzalini
On Fri, 09 Feb 2007 16:21:58 - (GMT), (Ted Harding) wrote:

 
   Let i index a subject and Y_i = (Y_i1, Y_i2,...,Y_iT)
be a vector of binomial variables for subject i such that
Y_it ~ Bin(n,p_t) with t = 1,2,...,T.
A simple correlation structure I would like to have is:
corr(Y_ij, Y_ik) = c for all (j,k)

I have only seen only a portion of this discussion, so I hope that
the following is not unrelated to the question.

Would it be OK to generate a sequence p_1,...,p_t,...,p_T
from a Beta stationary process (which exists, actually more
than one sort), and then sample from Bin(n,p_t)?
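One way to construct such a sequence, given here only as a sketch (the
Beta parameters and the correlation value are arbitrary), is to drive Beta
marginals by a latent Gaussian AR(1) process:

  set.seed(1)
  Tt  <- 20;  n <- 10
  rho <- 0.8                                        # AR(1) correlation
  z   <- as.numeric(arima.sim(list(ar = rho), Tt))  # latent Gaussian AR(1)
  u   <- pnorm(z * sqrt(1 - rho^2))                 # scale to unit variance, then to U(0,1)
  p   <- qbeta(u, shape1 = 2, shape2 = 2)           # Beta(2,2) marginals
  Y   <- rbinom(Tt, size = n, prob = p)             # serially dependent binomial counts
  cbind(p = round(p, 3), Y)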

Alternatively, some 20 years ago, there was a stream of literature
on discrete stationary processes, based on the idea of thinning
a discrete variable. There are certainly stationary processes
with a negative binomial distribution of this sort. I have
not followed that literature, but it is conceivable that
a similar process for the binomial case has been considered.

Best wishes,

Adelchi Azzalini

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Density Estimation

2006-06-10 Thread Adelchi Azzalini
On Thu, Jun 08, 2006 at 08:31:26PM +0200, Pedro Ramirez wrote:
 In mathematical terms the optimal bandwidth for density estimation
 decreases at rate n^{-1/5}, while the one for distribution function
 decreases at rate n^{-1/3}, if n is the sample size. In practical terms,
 one must choose an appreciably smaller bandwidth in the second case
 than in the first one.
 
 Thanks a lot for your remark! I was not aware of the fact that the
 optimal bandwidths for density and distribution do not decrease
 at the same rate.
 
 Besides the computational aspect, there is a statistical one:
 the optimal choice of bandwidth for estimating the density function
 is not optimal (and possibly not even just sensible) for estimating
 the distribution function, and the stated problem is equivalent to
 estimation of the distribution function.
 
 The given interval 0 < x < 3 was only an example, in fact I would
 like to estimate the probability for intervals such as
 
 0 <= x < 1 ,  1 <= x < 2 ,  2 <= x < 3 ,  3 <= x < 4 , ...
 
 and compare it with the estimates of a corresponding histogram.
 In this case the stated problem is no longer equivalent to the
 estimation of the distribution function. What do you think, can

why not? the probabilities you are interested in are of the form

F(1)-F(0), F(2)-F(1), and so on

where F(.) is the cumulative distribution function (and it must
be continuous, since its derivative exists).
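As an illustration only (my own sketch, with an arbitrary bandwidth value),
a kernel estimate of F built from Gaussian kernels can be evaluated at the
interval endpoints and then differenced:

  x <- c(2,1,3,2,3,0,4,5,10,11,12,11,10)    # the data quoted below
  h <- 0.8                                  # illustrative bandwidth only
  F.hat  <- function(t) mean(pnorm((t - x)/h))
  breaks <- 0:4
  diff(sapply(breaks, F.hat))   # estimates of P(0 <= X < 1), ..., P(3 <= X < 4)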

 I go ahead in this case with the optimal bandwidth for the
 density? Thanks a lot for your help!

no

best wishes,

Adelchi

 Best wishes
 Pedro
 
 
 
 
 best wishes,
 
 Adelchi
 
 
 PR
 PR 
 PR --
 PR Gregory (Greg) L. Snow Ph.D.
 PR Statistical Data Center
 PR Intermountain Healthcare
 PR [EMAIL PROTECTED]
 PR (801) 408-8111
 PR 
 PR 
 PR -Original Message-
 PR From: [EMAIL PROTECTED]
 PR [mailto:[EMAIL PROTECTED] On Behalf Of Pedro
 PR Ramirez Sent: Wednesday, June 07, 2006 11:00 AM
 PR To: r-help@stat.math.ethz.ch
 PR Subject: [R] Density Estimation
 PR 
 PR Dear R-list,
 PR 
 PR I have made a simple kernel density estimation by
 PR 
 PR x - c(2,1,3,2,3,0,4,5,10,11,12,11,10)
 PR kde - density(x,n=100)
 PR 
 PR Now I would like to know the estimated probability that a new
 PR observation falls into the interval 0 < x < 3.
 PR 
 PR How can I integrate over the corresponding interval?
 PR In several R-packages for kernel density estimation I did not
 PR found a corresponding function. I could apply Simpson's Rule for
 PR integrating, but perhaps somebody knows a better solution.
 PR 
 PR Thanks a lot for help!
 PR 
 PR Pedro
 PR 
 PR _
 PR 
 PR __
 PR R-help@stat.math.ethz.ch mailing list
 PR https://stat.ethz.ch/mailman/listinfo/r-help
 PR PLEASE do read the posting guide!
 PR http://www.R-project.org/posting-guide.html
 PR 
 PR
 PR __
 PR R-help@stat.math.ethz.ch mailing list
 PR https://stat.ethz.ch/mailman/listinfo/r-help
 PR PLEASE do read the posting guide!
 PR http://www.R-project.org/posting-guide.html
 PR
 

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Density Estimation

2006-06-08 Thread Adelchi Azzalini
On Wed, 07 Jun 2006 19:54:32 +0200, Pedro Ramirez wrote:

PR Not a direct answer to your question, but if you use a logspline
PR density estimate rather than a kernal density estimate then the
PR logspline package will help you and it has built in functions for
PR dlogspline, qlogspline, and plogspline that do the integrals for
PR you.
PR 
PR If you want to stick with the KDE, then you could find the area
PR under each of the kernals for the range you are interested in
PR (need to work out the standard deviation used from the bandwidth,
PR then use pnorm for the default gaussian kernal), then just sum
PR the individual areas.
PR 
PR Hope this helps,
PR 
PR Thanks a lot for your quick help! I think I will follow your first
PR 
PR suggestion (logspline
PR density estimation) instead of summing over the kernel areas
PR because at the boundaries of the range truncated kernel areas can
PR occur, so I think it is easier to do it with logsplines. Thanks
PR again for your help!!
PR 
PR Pedro
PR 
PR 

Besides the computational aspect, there is a statistical one:
the optimal choice of bandwidth for estimating the density function 
is not optimal (and possibly not even just sensible) for estimating
the distribution function, and the stated problem is equivalent to
estimation of the distribution function. 

In mathematical terms the optimal bandwidth for density estimation
decreases at rate n^{-1/5}, while the one for distribution function 
decreases at rate n^{-1/3}, if n is the sample size. In practical terms, 
one must choose an appreciably smaller bandwidth in the second case 
than in the first one.

best wishes,

Adelchi 


PR 
PR 
PR --
PR Gregory (Greg) L. Snow Ph.D.
PR Statistical Data Center
PR Intermountain Healthcare
PR [EMAIL PROTECTED]
PR (801) 408-8111
PR 
PR 
PR -Original Message-
PR From: [EMAIL PROTECTED]
PR [mailto:[EMAIL PROTECTED] On Behalf Of Pedro
PR Ramirez Sent: Wednesday, June 07, 2006 11:00 AM
PR To: r-help@stat.math.ethz.ch
PR Subject: [R] Density Estimation
PR 
PR Dear R-list,
PR 
PR I have made a simple kernel density estimation by
PR 
PR x - c(2,1,3,2,3,0,4,5,10,11,12,11,10)
PR kde - density(x,n=100)
PR 
PR Now I would like to know the estimated probability that a new
PR observation falls into the interval 0 < x < 3.
PR 
PR How can I integrate over the corresponding interval?
PR In several R-packages for kernel density estimation I did not
PR found a corresponding function. I could apply Simpson's Rule for
PR integrating, but perhaps somebody knows a better solution.
PR 
PR Thanks a lot for help!
PR 
PR Pedro
PR 
PR _
PR 
PR __
PR R-help@stat.math.ethz.ch mailing list
PR https://stat.ethz.ch/mailman/listinfo/r-help
PR PLEASE do read the posting guide!
PR http://www.R-project.org/posting-guide.html
PR 
PR 
PR __
PR R-help@stat.math.ethz.ch mailing list
PR https://stat.ethz.ch/mailman/listinfo/r-help
PR PLEASE do read the posting guide!
PR http://www.R-project.org/posting-guide.html
PR

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Multivariate skew-t cdf

2006-06-05 Thread Adelchi Azzalini
On Mon, Jun 05, 2006 at 09:43:24AM -0700, Spencer Graves wrote:


Thanks to Spencer Graves for providing this clarification about pmst.
As the author of pmst, I was really the one expected to answer the
query, but I was travelling during the last two weeks and did not read
it.

As a complement to what was already explained, the reason for the sharp
change from k=19 to k=20 is as follows: pmst (package sn) calls pmt (package
mnormt) with k increased by 1, and pmt calls a Fortran routine (written by
Alan Genz) which issues an error if k > 20 or k < 1. Hence, the effective
maximum number of dimensions allowed by pmst is 19. 

best wishes,

Adelchi Azzalini



 You want to evaluate skewed t probabilities in how many dimensions? 
 If 27 is your maximum, the problem won't be as difficult as if 27 is 
 your minimum.  Also, do you want to compute the multivariate cumulative 
 probability function for arbitrary points, location, covariance, shape 
 and degrees of freedom?  Or are you really only interested in certain 
 special case(s)?  If you have a simpler special case, it might be easier 
 to get a solution.
 
 I was able to replicate your result, even when I reduced the 
 dimensionality down to 20;  with 19 dimensions, the function seemed to 
 return a sensible answer.  If it were my problem, I might first make a 
 local copy of the pmst function and modify it to use the mvtnorm package 
 in place of mnormt.  That might get you answers with somewhat higher 
 dimensionality, though it might not be adequate -- and I wouldn't trust 
 the numbers I got without serious independent testing.  I'd try to think 
 how I could modify the skewness so I could check the accuracy that way.
 
 Have you studied the reference mentioned in the dmst help page, and 
 reviewed some of their sources?  Computing multivariate probabilities 
 like this is still a research project, I believe.  In this regard, I 
 found the following two books helpful:
 
 * Evans and Schwarz (2000) Approximating Integrals via Monte Carlo 
 and Deterministic Methods (Oxford)
 
 * Kythe and Schaeferkotter (2005) Handbook of Computational Methods 
 for Integration (Chapman and Hall).
 
 Also, have you asked about this directly to the maintainers of the 
 sn, mnormt and mvtnorm packages?  They might have other suggestions.
 
 Hope this helps.
 Spencer Graves
 p.s.  Thanks for the self-contained example.  There seems to be a typo 
 in your example:  Omega = diag(0, 27) is the matrix of all zeros, which 
 produces a point mass at the center of the distribution.  I got your 
 answers after changing it to diag(1, 27).
 
 Making the dimension a variable, I found a sharp transition between k 
 = 19 and 20:

 
    k <- 19
    xi <- alpha <- x <- rep(0,k)
    Omega <- diag(1,k)
    (p19 <- pmst(x, xi, Omega, alpha, df = 5))
  [1] 1.907349e-06
  attr(,"error")
  [1] 2.413537e-20
  attr(,"status")
  [1] "normal completion"
    k <- 20
    xi <- alpha <- x <- rep(0,k)
    Omega <- diag(1,k)
    (p20 <- pmst(x, xi, Omega, alpha, df = 5))
  [1] 0
  attr(,"error")
  [1] 1
  attr(,"status")
  [1] "oversize"
 
 Konrad Banachewicz wrote:
  Dear All,
  I am using the pmst function from the sn package (version 0.4-0). After
  inserting the example from the help page, I get non-trivial answers, so
  everything is fine. However, when I try to extend it to higher dimension:
   xi <- alpha <- x <- rep(0,27)
   Omega <- diag(0,27)
   p1 <- pmst(x, xi, Omega, alpha, df = 5)
   
   I get the following result:
   
   p1
   [1] 0
   attr(,"error")
   [1] 1
   attr(,"status")
   [1] "oversize"
  
  So it seems like the dimension is a problem here (and not the syntax or type
  mismatch, as I initially thought - the function is evaluated) - although I
  found no warning about it in the help page.
  
  Can anyone give me a hint as to how to work around this problem and evaluate
  the skew-t cdf in a large-dimensional space? It's pretty crucial to my
  current research. Thanks in advance,
  
  Konrad
  
  [[alternative HTML version deleted]]
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] pnorm2

2006-04-25 Thread Adelchi Azzalini
On Mon, Apr 24, 2006 at 03:26:14PM +0100, Tolga Uzuner wrote:
 Hi,
 
 Has pnorm2 been dropped from sn ? Have some code using it which seems to 
 fail with a new sn update...
 
 Thanks,
 Tolga
 

yes, I have dropped it, since pmnorm() of package mnormt 
provides the same facility in far more general form
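For example (values chosen arbitrarily), a bivariate standard normal
probability with correlation rho is obtained as

  library(mnormt)
  rho <- 0.3
  pmnorm(c(0.5, 1.0), mean = c(0, 0),
         varcov = matrix(c(1, rho, rho, 1), 2, 2))   # P(X1 <= 0.5, X2 <= 1)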

..in fact I never realized pnorm2 had been used by someone :-)

if for some reason you prefer not to load mnormt, 
then I can put back pnorm2 in sn


best wishes,

Adelchi Azzalini
-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] bivariate weighted kernel density estimator

2006-04-24 Thread Adelchi Azzalini
On Sun, 23 Apr 2006 09:13:35 +0200, Erich Neuwirth wrote:

EN Is there code for bivariate kernel density estimation?
EN For bivariate kernels there is
EN kde2d in MASS
EN kde2d.g in GRASS
EN KernSur in GenKern
EN (list probably incomplete)
EN but none of them seems to accept a weight parameter
EN (like density does since R 2.2.0)
EN 

sm.density of package sm allows the use of weights 
(with Gaussian kernel)
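A minimal sketch (the data and weights below are only illustrative;
nbins=0 switches binning off so that the supplied weights are used as given):

  library(sm)
  set.seed(6)
  xy  <- cbind(rnorm(100), rnorm(100))
  w   <- sample(1:3, 100, replace = TRUE)      # case weights
  est <- sm.density(xy, weights = w, nbins = 0, display = "none")
  str(est$estimate)                            # density over the evaluation grid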

best wishes,
Adelchi Azzalini

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Skewed t distribution

2006-03-28 Thread Adelchi Azzalini
On Tue, 28 Mar 2006 11:41:19 +0200, Konrad Banachewicz wrote:

please supply the ingredients needed to reproduce the problem that
you have faced (including the values of the parameters mu,P,alpha,nu,
among the rest)

best wishes,

Adelchi Azzalini


KB Dear All,
KB I am working with skewed-t copula in my research recently, so I
KB needed to write an mle
KB procedure instead of using a standard fit one; I stick to the sn
KB package. On subsamples of the entire population that I deal with,
KB everything is fine. However, on the total sample (difference in
KB cross-sectional dimension: 30 vs 240) things go wrong - the
KB objective function diverges to infinity. I located the rotten
KB line to be
KB 
KB t1 <- dmst(vector, mu, P, alpha, nu)
KB 
KB where vector is the matrix row, on which I evaluate my
KB likelihood and the rest is parametrized in a standard
KB way, just as the help pages give it. In large dimensions, I get a
KB zero value of the density (which is probably due to numerical
KB issues). I tried the following dummy example
KB 
KB t1 <- rmst(1,mu,P,alpha, nu)
KB t2 <- dmst(t1, mu, alpha,nu)
KB 
KB and t2 remains to be zero. Can anyone help me on this one?
KB 
KB thanks in advance,
KB Konrad
KB 
KB --
KB We are what we pretend to be, so we must be careful about what we
KB pretend to be
KB 
KB   Kurt Vonnegut Jr. Mother Night
KB 
KB [[alternative HTML version deleted]]
KB 
KB __
KB R-help@stat.math.ethz.ch mailing list
KB https://stat.ethz.ch/mailman/listinfo/r-help
KB PLEASE do read the posting guide!
KB http://www.R-project.org/posting-guide.html
KB

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Skewed t distribution

2006-03-28 Thread Adelchi Azzalini
On Tue, 28 Mar 2006 13:11:02 +0200, Konrad Banachewicz wrote:

KB P is an identity matrix 240x240, mu and alpha are vectors of zeros
KB 240x1, nu equals 10, so altogether you need:
KB P <- matrix(0,244,244)
KB diag(P) <- 1
KB nu <- 10
KB alpha <- rep(0,244)
KB mu <- rep(0,244)
KB require(sn)
KB t1 <- rmst(1,mu,P, alpha, nu)
KB t2 <- dmst(t1,mu,P,alpha,nu)
KB 
KB 

With such a large dimension, numerical problems are ubiquitous.
At the very least, I suggest that you work on the log scale:

R t2 <- dmst(t1,mu,P,alpha,nu, log=TRUE)
R t2
[1] -250.3
R exp(t2)
[1] 2.002e-109


The next release of the 'sn' package will handle this sort of thing
in a more consistent way (the R code is largely updated, but the
documentation is far behind..)



-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/


KB 
KB 
KB  please supply the ingredients needed to reproduce the problem
KB  that you have faced (including the values of the parameters
KB  mu,P,alpha,nu, among the rest)
KB 
KB  best wishes,
KB 
KB  Adelchi Azzalini
KB 
KB 
KB

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Skewed t distribution

2006-03-28 Thread Adelchi Azzalini
On Tue, 28 Mar 2006 12:59:34 +0200, Konrad Banachewicz wrote:

KB On 3/28/06, Prof Brian Ripley [EMAIL PROTECTED] wrote:
KB 
KB  Try maximizing the log-likelihood and using the log=TRUE
KB  argument to dmst.
KB 
KB 
KB seems like dmst does not support this argument (the way e.g. dt
KB does)
KB 

here I get the following

R library(sn)
Loading required package: mvtnorm
Library 'sn', version  0.3-5 (2005-12-30) , © 1998-2005 A.Azzalini
type 'help(SN)' for summary information
R args(dmst)
function (x, xi = rep(0, d), Omega, alpha, df = Inf, log = FALSE) 
NULL
R 

notice that 0.3-5 is the current version on CRAN

KB 
KB 
KB  (You have told us so little about what you are doing that we can
KB  but guess at what you mean by `write an mle procedure': what is
KB  wrong with st.mle, for example?)
KB 
KB 
KB st.mle assumes skewed-t marginals (for a whole distribution),
KB whereas I am working with a copula so my margins are uniform. The
KB whole point is separating the joint and marginal dynamics.
KB 

I am not a copula expert, but what Brian Ripley suggests makes sense 
to me; and I know of people that have used st.mle to obtain the
marginal components (which is what is needed for the copula mechanism)




-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Remove [1] ... from output

2006-03-28 Thread Adelchi Azzalini
On Tue, 28 Mar 2006 13:21:59 +0200, Gregor Gorjanc wrote:

GG Hello!
GG 
GG I am writing some numbers and character vectors to an ascii file
GG and would like to get rid of [1] ... as shown below (a dummy
GG example)
GG 
GG R runif(20)
GG  [1] 0.653574 0.164053 0.036031 0.127208 0.134274 0.103252 0.506480 0.547759
GG  [9] 0.912421 0.584382 0.987208 0.996846 0.666760 0.053637 0.327590 0.370737
GG [17] 0.505706 0.412316 0.887421 0.812151
GG 
GG I have managed to remove the quotes and all [*] except [1]
GG as shown below.
GG 
GG R print(paste(runif(20), collapse = " "), quote = FALSE)
GG [1] 0.790620362851769 0.45603066496551 0.563822037540376
GG 0.812907998682931 0.726162418723106 0.37031230609864
GG 0.681147597497329 0.29929908295162 0.209858040558174
GG 0.304300333140418 0.105796672869474 0.743657597573474
GG 0.409294542623684 0.825012607965618 0.282235795632005
GG 0.21159387845546 0.620056127430871 0.337449935730547
GG 0.754527133889496 0.280175548279658
GG 
GG Any hints how to solve my task?
GG 

what about 
  cat(format(runif(20)))
?

you can improve it a bit with

 cat(format(runif(20)), "\n")
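For the original goal of writing to an ascii file (the file name is only
an example), either of

  cat(format(runif(20)), "\n", file = "numbers.txt")   # all values on one line
  writeLines(format(runif(20)), "numbers.txt")         # one value per line

avoids both the [1] indices and the quotes.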




-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] color quantization / binning a variable into levels

2006-02-22 Thread Adelchi Azzalini
On Tue, 21 Feb 2006 11:08:38 -0600 (CST), David Forrest wrote:

perhaps binning of package sm is what you want
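For example, as a small sketch (illustrative data and break points):

  library(sm)
  x <- -10:10
  b <- binning(x, breaks = seq(-10.5, 10.5, by = 3))
  str(b)    # inspect the midpoints, frequencies and breaks returned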

best wishes,
Adelchi Azzalini

DF Hi all,
DF 
DF I'd like to quantize a variable to map it into a limited set of
DF integers for use with a colormap.  image and filled.contour  do
DF this mapping inside somewhere, but I'd like to choose the colors
DF for plotting a set of polygons.  Is there a pre-existing function
DF that does something like this well?  i.e., is capable of using
DF 'breaks'?
DF 
DF quantize <- function(x, n=10, breaks=NULL){
DF # bin the variable x into n levels
DF   xmin <- min(x)
DF   xmax <- max(x)
DF   1+floor(n*(x-xmin)/(xmax-xmin)*.999)
DF }
DF 
DF x <- -10:10
DF cbind(x, quantize(x,2), quantize(x), quantize(x,21))
DF 
DF quantize(x,breaks=c(5,7))   #
DF 
DF Thanks for your time,
DF 
DF Dave

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] bivariate normal distribution

2006-02-15 Thread Adelchi Azzalini
On Mon, 13 Feb 2006 17:11:30 -0500, He, Yulei wrote:

HY Hi, there.
HY 
HY  
HY 
HY Does anyone know the R function for calculating the cdf of
HY bivariate normal distribution function?
HY 
HY  

choose among packages mvtnorm and mnormt the one best suited
for your purpose
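For example, with mvtnorm (arbitrary argument values; pmnorm() in mnormt
is used in the same spirit):

  library(mvtnorm)
  rho <- 0.5
  S   <- matrix(c(1, rho, rho, 1), 2, 2)
  pmvnorm(upper = c(0.5, 1), mean = c(0, 0), sigma = S)   # P(X1 <= 0.5, X2 <= 1)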

best wishes,

Adelchi Azzalini

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] mutlivariate normal and t distributions

2006-01-24 Thread Adelchi Azzalini
On Tue, 24 Jan 2006 08:31:16 +0100 (CET), [EMAIL PROTECTED] wrote:

TD 
TD  yes, Adelchi, you should definitely submit the package to CRAN!

thanks for the feedback

I have just now uploaded the 'mnormt' package to CRAN, i.e. 
 ftp.ci.tuwien.ac.at/incoming/
perhaps at a later stage we can find a way to 
merge 'mnormt' and 'mvtnorm'

best wishes

Adelchi

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] mutlivariate normal and t distributions

2006-01-23 Thread Adelchi Azzalini

Dear R-help list members,

I have created a package 'mnormt' with facilities for the multivariate
normal and t distributions. The core part is simply an interface to
Fortran routines by Alan Genz for computing the integral of two
densities over rectangular regions, using an adaptive integration
method. Other R functions compute densities and generate random
numbers.

The starting motivation to write it was the need to have functions
which compute the distribution functions in a non-Monte Carlo form,
since this caused me problems when these probabilities were involved in
a minimization problem. For this reason, I could not make use of the
CRAN package 'mvtnorm'.  Precisely to avoid overlap with the CRAN
package, 'mnormt' is made available somewhere else, in case other
people want to use it.  The package 'mnormt' is at 
   http://azzalini.stat.unipd.it/SN/Pkg-mnormt

As explained, this is not uploaded to CRAN just to avoid clash with
the existing package. However, if it is felt appropriate, I have no
objection to upload it to CRAN.

Best wishes,

Adelchi Azzalini
-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] negative predicted values in poisson glm

2006-01-18 Thread Adelchi Azzalini
On Wed, 18 Jan 2006 14:58:26 +0100, P. Olsson wrote:

PO Dear  R  helpers,
PO running the following code of a glm model of the family poisson,
PO gives predicted values < 0. Why?
PO 
PO library(MASS)
PO library(stats)
PO library(mvtnorm)
PO library(pscl)
PO data(bioChemists)
PO poisson_glm <- glm(art ~ fem + mar + kid5 + phd + ment, data =
PO bioChemists, family = poisson)
PO predicted.values = predict(poisson_glm)
PO range(predicted.values)
PO 


use
 predicted.values = predict(poisson_glm, type="response")

(By default, predict.glm returns values on the scale of the linear
predictor -- the log scale for a Poisson model with log link -- which can
well be negative; type="response" applies the inverse link and returns
the fitted means.)

best wishes,

Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] bivariate kernel density estimates at point locations (rather than at grid locations)

2005-12-22 Thread Adelchi Azzalini
On Wed, 21 Dec 2005 12:21:54 -0500, Strickland, Matthew wrote:

SM Hello Dr. Adelchi Azzalini,
SM 


Dr. Strickland, 

your message was directed to the whole r-help list with no CC to
myself, and sometimes I do not have the chance to browse the
 r-help list for weeks..

SM Thank you for your quick response to my question that I posted on
SM the r-help board about bivariate kernel density estimates.  I have
SM been using your sm package the past few days and have encountered
SM a problem with estimating the density for only 1 point. I am using
SM R 2.2.0, sm version 2.1-0 on a Windows machine.  Example code
SM below:
SM 
SM #The code below creates 3 point locations
SM 
SM x.locs = c(74, 75, 77)
SM y.locs = c(64, 63, 61)
SM points = cbind(x.locs, y.locs)
SM 
SM #If I send this data into sm.density everything works fine.
SM 
SM dens = sm.density(points, h=c(1, 1))
SM 
SM #However, if I only wish to send 1 point location to sm.density,
SM #i.e.,
SM 
SM points.2 =points[1,]
SM dens.2 = sm.density(points.2, h=c(1, 1)) 
SM 
SM R returns to me the error:length(h) != 1

formally, the error is due to this

R is.vector(points.2)
[1] TRUE

sm.density receives a vector of length 2, and it works for that
case: estimation of a one-dimensional density, for which you
supplied two data values. Then it finds a two-dimensional h
and there it complains.

On the statistical side, I cannot follow the logic of estimating
nonparametrically a density function on the basis of only one 
(supposed bivariate) observation.

SM 
SM It appears to me that sm.density thinks that my 1 point is a
SM 1-dimensional location rather than a 2-dimensional location, and I
SM am getting an error when I request a bivariate kernel.  Do you
SM have any suggestions?  
SM 

I am not sure I grasp what you have in mind; is it perhaps that you
want the following?

dens.2 = sm.density(points, h=c(1, 1), eval.points=points, eval.grid=FALSE) 
print(dens.2$estimate[1])

best regards,

Adelchi


SM Best,
SM Matt
SM 
SM 
SM -Original Message-
SM From: Adelchi Azzalini [mailto:[EMAIL PROTECTED] 
SM Sent: Friday, December 16, 2005 2:42 AM
SM To: Strickland, Matthew
SM Cc: r-help@stat.math.ethz.ch; [EMAIL PROTECTED]
SM Subject: Re: [R] bivariate kernel density estimates at point
SM locations (rather than at grid locations)
SM 
SM On Thu, 15 Dec 2005 14:21:17 -0500, Strickland, Matthew wrote:
SM 
SM SM Hi,
SM SM 
SM SM My data consists of a set of point locations (x,y). 
SM SM 
SM SM I would like to know if there is a procedure for bivariate
SM SM kernel  density estimation in R that returns the density
SM SM estimates at the  observed point locations rather than at grid
SM SM locations. I have  looked at a number of different routines
SM SM and they all seem to return  estimates at grid locations.
SM SM 
SM 
SM One option is to use (from package sm),
SM   sm.density(xy, eval.points=xy, eval.grid=FALSE) where xy in a
SM   (n\times 2) matrix.
SM 
SM Best wishes,
SM Adelchi Azzalini
SM 
SM --
SM Adelchi Azzalini  [EMAIL PROTECTED] Dipart.Scienze
SM Statistiche, Università di Padova, Italia tel. +39 049 8274147, 
SM http://azzalini.stat.unipd.it/
SM 
SM __
SM R-help@stat.math.ethz.ch mailing list
SM https://stat.ethz.ch/mailman/listinfo/r-help
SM PLEASE do read the posting guide!
SM http://www.R-project.org/posting-guide.html
SM

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] bivariate kernel density estimates at point locations (rather than at grid locations)

2005-12-15 Thread Adelchi Azzalini
On Thu, 15 Dec 2005 14:21:17 -0500, Strickland, Matthew wrote:

SM Hi,
SM 
SM My data consists of a set of point locations (x,y). 
SM 
SM I would like to know if there is a procedure for bivariate kernel
SM density estimation in R that returns the density estimates at the
SM observed point locations rather than at grid locations. I have
SM looked at a number of different routines and they all seem to
SM return estimates at grid locations.  
SM 

One option is to use (from package sm), 
  sm.density(xy, eval.points=xy, eval.grid=FALSE)
where xy is an (n \times 2) matrix.

Best wishes,
Adelchi Azzalini

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Output glm

2005-11-08 Thread Adelchi Azzalini
On Tue, 08 Nov 2005 12:02:52 +0100, Ciro Marziliano wrote:

CM 
CM  How can I obtain the likelihood ratio of a Poisson regression
CM  model?

likelihood ratio for testing which hypothesis?

however, in GLM the deviance provides the essential ingredient
for the job, since
  deviance = 2(log L(saturated model) - log L(fitted model))
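For instance, the LR test of a nested pair of Poisson models is the
difference of their deviances (simulated data, purely illustrative):

  set.seed(3)
  x <- rnorm(100);  z <- rnorm(100)
  y <- rpois(100, exp(0.5 + 0.8*x))
  fit0 <- glm(y ~ x,     family = poisson)
  fit1 <- glm(y ~ x + z, family = poisson)
  anova(fit0, fit1, test = "Chisq")   # LR statistic = deviance(fit0) - deviance(fit1)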

best wishes

Adelchi Azzalini

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Distribution fitting problem

2005-11-02 Thread Adelchi Azzalini
On Wed, 2 Nov 2005 14:32:52 +0200, Mark Miller wrote:

MM I am using the MASS library function 
MM 
MM fitdistr(x, dpois, list(lambda=2))
MM 
MM but I get 
MM 
MM Error in optim(start, mylogfn, x = x, hessian = TRUE, ...) :
MM Function cannot be evaluated at initial parameters
MM In addition: There were 50 or more warnings (use warnings() to see
MM the first  50)
MM 
MM and all the first 50 warnings say 
MM 
MM 1:  non-integer x = 1.45
MM etc
MM 

are the data integers (as implicit in the assumption of Poisson dist'n)?
the above message seems to say that they are not 

Adelchi Azzalini

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Multivariate Skew Normal distribution

2005-09-05 Thread Adelchi Azzalini
On Thu, 1 Sep 2005 13:09:00 + (GMT), Caio Lucidius Naberezny
Azevedo wrote:

CLNA 
CLNA  Could anyone tell me if there is any package (or function) that
CLNA  generates values from a multivariate skew normal distribution?

try the following

library(sn)
location <- c(20,4) # e.g. for a two-dimensional variable
dispers  <- matrix(c(3,-2,-2,2), 2,2)
skew <- c(10,-6)
rmsn(n=10, xi=location, Omega=dispers, alpha=skew) # for skew-normal data
rmst(n=10, xi=location, Omega=dispers, alpha=skew, df=5) # for skew-t data

see also help(rsn) and help(rst) for the univariate case

for more information, see also (as already pointed out in the list):
  http://azzalini.stat.unipd.it/SN

best wishes,
Adelchi Azzalini

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Off topic -2 Ln Lambda and Chi square

2005-07-11 Thread Adelchi Azzalini
On Mon, 11 Jul 2005 11:18:55 -0400, Liaw, Andy wrote:

LA If you meant the lambda as the likelihood ratio test statistic, the
LA asymptotic chi-squared distribution comes from the asymptotic
LA normality of the MLEs.  The proof is in a paper by Abraham Wald in
LA 1943.  See Stuart  Ord (Kendall's Advanced Statistics) for
LA discussion (e.g., vol. 2, 5th edition).
LA 
LA Andy
LA 
LA  From: Laura Holt
LA  
LA  Dear R :
LA  
LA  Sorry for the off topic question, but does anyone know the 
LA  reference for
LA  the -2 Ln Lambda following a Chi Square distribution, please?
LA  

see

S. S. Wilks (1938).
The large-sample distribution of the likelihood ratio for testing composite
hypotheses.
Ann. Math. Statist., 9, 60-62.

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] plot empirical pdf

2003-08-26 Thread Adelchi Azzalini
On Tuesday 26 August 2003 08:47, Ott Toomet wrote:
 Hi,

 is there any function to plot the empirical probability distribution
 function?  I just don't want to reinvent the wheel...


try this, 

regards, Adelchi 
--
n <- 10
x <- rnorm(n)
plot(c(min(x)-1, sort(x), max(x)+1), c(0:n, n)/n, type="s",
  xlab="data", ylab="ecdf")
rug(x)

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/
(please, no ms-word/ms-excel/alike attachments)

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Method of L-BFGS-B of optim evaluate function outside of boxconstraints

2003-08-21 Thread Adelchi Azzalini
On Wednesday 20 August 2003 15:44, Shengqiao Li wrote:
 Hi, R guys:

 I'm using the L-BFGS-B method of optim for a minimization problem. My function
 calls the besselI function, which needs a non-negative parameter, and besselI
 will overflow if the parameter is too large. So I set the constraint box,
 which is reasonable for my problem. But a point outside the box was
 tested, and I got an error. My program and the error follow. This program
 depends on the CircStats package.


 Anyone has any idea about this?


No idea... I can only say that a similar behaviour of optim
with method="L-BFGS-B" (i.e. evaluation of the function
outside the lower/upper limits) has occurred to me too.
It is a rare but possible behaviour.

Last June, I have sent a full description of the problem to 
[EMAIL PROTECTED]

Regards,

Adelchi Azzalini


-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/
(please, no ms-word/ms-excel/alike attachments)

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Is there a bug in qr(..,LAPACK=T)

2003-07-17 Thread Adelchi Azzalini
On Wednesday 16 July 2003 20:50, Mike Meyer wrote:
 Several people have kindly (and gently) pointed out that the ?qr
 documentation states that rank detection does not work for the LAPACK case.
 It's my fault for assuming that rank detection did work. --Mike

strictly speaking it is your fault;
however, it seems sensible that qr(..) should return the rank value as NULL or NA
when LAPACK=TRUE -- since it does not try to evaluate it -- instead of
always returning `full rank'.

regards, Adelchi
-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Fitting inter-arrival time data

2003-07-01 Thread Adelchi Azzalini
On Tuesday 01 July 2003 05:16, M. Edward Borasky wrote:
 Unfortunately, the data are *non-negative*, not strictly positive. Zero is
 a valid and frequent inter-arrival time. It is, IIRC, the most likely value
 of a (negative) exponential distribution.

Not really. Zero+ is the value with the highest density in a (negative) exponential
distribution, which implies that you should have *no* observed zeros from that
distribution.

If you have a non-negligible fraction of 0 values, then your data are reasonably 
described as  having a mixed distribution: 
  (1) a discrete component at 0, and 
  (2) a continuous positive component.

Kernel (or similar) density estimation is appropriate for the continuous component
only.  Notice that the same remark applies to any procedure (parametric or 
non-parametric, using mixtures, etc.) which is based on continuous components only. 

It *looks like* a wise procedure to separate out the discrete and the continuous
component of your data, and handle them separately.  At the end you can merge
the two parts into
 Y = p * 0 + (1-p) * X
where p is the proportion of 0's, and X represents the  continuous component of
the random variable.
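A rough sketch of this two-part handling, on simulated inter-arrival times
(the zero proportion and the exponential rate are invented for illustration):

  set.seed(4)
  y <- ifelse(runif(500) < 0.2, 0, rexp(500, rate = 0.1))
  p.hat <- mean(y == 0)            # estimate of the discrete component at 0
  x     <- y[y > 0]                # continuous positive component
  library(sm)
  sm.density(x, positive = TRUE)   # estimate the continuous part only
  p.hat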

best wishes,

Adelchi Azzalini

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Fitting inter-arrival time data

2003-07-01 Thread Adelchi Azzalini
 the two parts into
      Y = p * 0 + (1-p) * X
 where p is the proportion of 0's, and X represents the  continuous
 component of the random variable.

I must amend myself... what I should have written is
   Y = I * 0 + (1-I) * X
where I is a Bernoulli random variable with probability p of success (i.e. 1)
and X represents the  continuous component of the random variable.

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] Fitting inter-arrival time data

2003-06-30 Thread Adelchi Azzalini
On Monday 30 June 2003 01:23, M. Edward Borasky wrote:
 I have a collection of data which includes inter-arrival times of requests
 to a server. What I've done so far with it is use sm.density to explore
 the distribution, which found two large peaks. However, the peaks are made
 up of Gaussians, and that's not really correct, because the inter-arrival
 time can never be less than zero. In fact, the leftmost peak is centered at
 somewhere around ten seconds, and quite a bit of it extends into negative
 territory.

if your data are positive, you could use

  sm.density(..., positive=TRUE)

and possibly make use of the additional parameter delta for fine tuning

best wishes,

Adelchi Azzalini

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] mode of a data set

2003-06-23 Thread Adelchi Azzalini
On Monday 23 June 2003 17:50, Erin Hodgess wrote:
 Dear R People:

 Is there a function to find the mode of a data set, please?

 This is the mode as in the value(s) which occur most often.


 names(table(x))[rev(order(table(x)))[1]]

is this what you want?

regards
Adelchi Azzalini

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] optim with contraints

2003-06-21 Thread Adelchi Azzalini

There seems to exist peculiar cases where optim does not take care
of constraints on the parameters to be optimized over.  The call to
optim is of the form

  opt <- optim(cp, fn=sn.dev, gr=sn.dev.gh, method="L-BFGS-B",
   lower=c(-Inf, 1e-10, -0.99527), 
   upper=c( Inf, Inf,    0.99527), 
   control=control, X=X, y=y, hessian=FALSE)

The code has worked fine many times, but I have come across cases
(for suitable data X and y) where the constraint on the last component
is ignored; that means that a call is made to sn.dev with
   cp = -1.3546  0.4645  3.1741
so the third component exceeds 0.99527, and the program stops
because of a check in the function to be optimised.

The call just before was to the gradient function
sn.dev.gh: gradient = 219013  -312643  441647332
which has rather large values.

To make the problem more interesting, it shows up on a Linux
(Debian) installation, but it works fine on MS-windows.
In both cases, R is 1.7.0.  

Perhaps this sort of question should not be directed to the R-help
list, rather to the developer(s) of optim. Please instruct me on this 
point. Also, I appreciate that the above is not a reproducible example;
this would be longish in text, and I thought to ask first to whom
it is appropriate that I direct my question.


Best wishes,

Adelchi Azzalini


-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re: [R] fitting data to exponential distribution with glm

2003-06-10 Thread Adelchi Azzalini
On Tuesday 10 June 2003 17:31, Masayoshi Hayashi wrote:
 I am learning the glm function, but how do you fit data using the exponential
 distribution with glm?

The Gamma family is parametrised in glm() by two parameters: 
mean and dispersion; the "dispersion" regulates the shape. 

So one must fit a GLM with the Gamma family, and then produce a "summary"
with the dispersion parameter set equal to 1, since this value 
corresponds to the exponential distribution in the Gamma family.

In practice:

fit <- glm(formula = ..., family = Gamma)
summary(fit, dispersion=1)

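A runnable toy illustration (simulated data, not taken from the original
question):

 set.seed(5)
 x <- runif(100)
 y <- rexp(100, rate = exp(-(1 + 2*x)))       # exponential with mean exp(1 + 2*x)
 fit <- glm(y ~ x, family = Gamma(link = "log"))
 summary(fit, dispersion = 1)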
best wishes,

Adelchi Azzalini

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

[R] RMySQL

2003-03-19 Thread Adelchi Azzalini

Hi. Could anyone help with the following?

  library(RMySQL)
Loading required package: methods 
  handle <- dbDriver("MySQL")
  con <- dbConnect(handle, dbname="echp", host="tango", user="aa")
Segmentation fault

Some variants of the above scheme invariably end up with the same conclusion.
The DBI and RMySQL packages apparently compiled correctly
when installed.  Here below are the system data.

Best wishes,

Adelchi Azzalini


 R.version
 _
platform i686-pc-linux-gnu
arch i686 
os   linux-gnu
system   i686, linux-gnu  
status
major1
minor6.1  
year 2002 
month11   
day  01   
language R

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] download.file

2003-02-04 Thread Adelchi Azzalini

Dear List-members,

to download a file from the net, the function download.file(..)
does the job.  However, before embarking on the download, I would
like to find out how large the file is.  Is there a way to know it?

Most likely, this question has been asked before, but I am new to 
the list.

Regards, with thanks in advance,

Adelchi Azzalini


Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
http://azzalini.stat.unipd.it/

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help