[R] reading a big file

2007-05-23 Thread Remigijus Lapinskas
Dear All,

I am on WindowsXP with 512 MB of RAM, R 2.4.0, and I want to read in a
big file mln100.txt. The file is 390MB big, it contains a column of 100 
millions integers.

 mln100=scan(mln100.txt)
Error: cannot allocate vector of size 512000 Kb
In addition: Warning messages:
1: Reached total allocation of 511Mb: see help(memory.size)
2: Reached total allocation of 511Mb: see help(memory.size)

In fact, I would be quite happy if I could read, say, every tenth 
integer (line) of the file. Is it possible to do this?

Cheers,
Rem

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multivariate normality test

2006-06-29 Thread Remigijus Lapinskas
Hello,

Could someone help me to explain the VERY big difference in applying two
tests on multivariate normality:

library(mvnormtest)
data(EuStockMarkets)
mshapiro.test(t(EuStockMarkets[15:29,1:4]))


 Shapiro-Wilk normality test
data:  Z
W = 0.8161, p-value = 0.005955

and

library(energy)
mvnorm.etest( EuStockMarkets[15:29,1:4] )

 Energy test of multivariate normality: estimated parameters

data:  x, sample size 15, dimension 4, replicates 999
E-statistic = 1.0041, p-value = 0.2482



Many thanks,
Rem

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] survival with Weibull

2006-01-17 Thread Remigijus Lapinskas
Hello,

I want to test if the Weibull distribution is appropriate for the
failure time. When trying to reproduce an example from MASS (the book,
Ch. 13.2), I type

library(survival)
library(MASS)
leuk.wei - survreg( Surv(time)~ag+log(wbc),data=leuk)
ntimes - leuk$time*exp(-leuk.wei$linear.predictors)
plot(survfit(Surv(ntimes)),log=T)

and get (almost) the same graph as in the book (which means that the
distribution is close to exponential). On the other hand, if I want to
test for the Weibull distribution, the recommended command

 plot(survfit(Surv(ntimes)),fun=cloglog)

ends in an error message:

Error in rep.default(2, n2 - 1) : invalid number of copies in rep()
In addition: Warning message:
2 x values = 0 omitted from logarithmic plot in: xy.coords(x, y,
xlabel, ylabel, log)

I'd appreciate any hints,
Rem

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] priceIts

2005-09-28 Thread Remigijus Lapinskas
Dear All,

There is an example for the priceIts function (the its package) which 
does not work for me as expected.

  ?priceIts
   x1 - priceIts(instrument = c(^ftse), start = 1998-01-01,
+  quote = Close)
Error in validObject(.Object) : invalid class its object: Missing 
values in dates
   x2 - priceIts(instrument = c(^gdax), start = 1998-01-01,
+  quote = Close)
Error in download.file(url, destfile, method = method, quiet = quiet) :
 cannot open URL 
'http://chart.yahoo.com/table.csv?s=^gdaxa=0b=01c=1998d=8e=28f=2005g=dq=qy=0z=^gdaxx=.csv'
In addition: Warning message:
cannot open: HTTP status was '404 Not Found'
   x - union(x1,x2)
Error: Object x1 not found
Error in union(x1, x2) : unable to find the argument 'x' in selecting a 
method for function 'union'
   names(x) - c(FTSE,DAX)
   plot(x,lab=TRUE)

Any help appreciated.

Best wishes,
Remigijus

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] What is median in survfit

2005-06-10 Thread Remigijus Lapinskas
Dear All,

A very simple question:

  library(survival)
  fit - coxph(Surv(time, status) ~ x, data=aml)
  survfit(fit)
Call: survfit.coxph(object = fit)

   n  events  median 0.95LCL 0.95UCL
  23  18  30  18  45

I believe I know what is median here, but how to extract it?

Many thanks,
Rem

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to use lag?

2005-03-05 Thread Remigijus Lapinskas

I use the following two function for a lagged regression:

lm.lag=function(y,lag=1) 
summary(lm(embed(y,lag+1)[,1]~embed(y,lag+1)[,2:(lag+1)]))
lm.lag.x=function(y,x,lag=1) 
summary(lm(embed(y,lag+1)[,1]~embed(x,lag+1)[,2:(lag+1)]))

for, respectively,

y_t=a+b_1*y_t-1+...+b_lag*y_t-lag
y_t=a+b_1*x_t-1+...+b_lag*x_t-lag

I am not quite sure whether this an answer to your question, but here
are two examples:

set.seed(7)
ar1=arima.sim(n=300,list(ar=0.8))
lm.lag(ar1)
lm.lag.x(ar1,ar1)

set.seed(8)
ar3=arima.sim(n = 200, list(ar = c(0.4, -0.5, 0.7)))
lm.lag(ar3,3) 
lm.lag.x(ar3,ar3,3)

Best wishes,
Rem




Saturday, March 5, 2005, 6:14:15 PM, you wrote:

SG   Is it possible to fit a lagged regression, y[t]=b0+b1*x[t-1]+e,
SG using the function lag?  If so, how?  If not, of what use is the
SG function lag?  I get the same answer from y~x as y~lag(x), whether
SG using lm or arima.  I found it using y~c(NA, x[-length(x)])). Consider
SG the following: 

  set.seed(1)
  x - rep(c(rep(0, 4), 9), len=9)
  y - (rep(c(rep(0, 5), 9), len=9)+rnorm(9)) # y[t] = x[t-1]+e
 
  lm(y~x)
SG (Intercept)x 
SG  1.2872  -0.1064 
  lm(y~lag(x))
SG (Intercept)   lag(x) 
SG  1.2872  -0.1064 
  arima(y, xreg=x)
SG   interceptx
SG  1.2872  -0.1064
SG s.e. 0.9009   0.3003
SG sigma^2 estimated as 6.492:  log likelihood = -21.19,  aic = 48.38
  arima(y, xreg=lag(x))
SG   intercept   lag(x)
SG  1.2872  -0.1064
SG s.e. 0.9009   0.3003
  arima(y, xreg=c(NA, x[-9]))
SG   intercept  c(NA, x[-9])
SG  0.43920.8600
SG s.e. 0.23720.0745
SG sigma^2 estimated as 0.3937:  log likelihood = -7.62,  aic = 21.25
  arima(ts(y), xreg=lag(ts(x)))
SG arima(x = ts(y), xreg = lag(ts(x)))
SG   intercept  lag(ts(x))
SG  1.2872 -0.1064
SG s.e. 0.9009  0.3003
SG sigma^2 estimated as 6.492:  log likelihood = -21.19,  aic = 48.38
 
SG   Thanks for your help. 
SG   Spencer Graves

SG __
SG R-help@stat.math.ethz.ch mailing list
SG https://stat.ethz.ch/mailman/listinfo/r-help
SG PLEASE do read the posting guide!
SG http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How can I simulate Pareto distribution in R?

2005-01-09 Thread Remigijus Lapinskas

If you define Pareto density as p(x)=c*x^(-(c+1)) for x1, then

dpareto - function(x,c){
if(c=0)stop(c must be positive) # Diagnostic step
ifelse(x1,0,c/x^(c+1))}

ppareto - function(q,c){
if(c=0)stop(c must be positive  0)
ifelse(q1,0,1-1/q^c)}

qpareto - function(p,c){
if(c=0) stop(c must be positive  0)
if(any(p0)|any(p1)) # Symbol | denotes logical OR
stop(p must be between 0 and 1)
q - (1-p)^(-1/c)
q}

rpareto - function(n,c){
if(c=0) stop(c must be positive)
rp - runif(n)^(-1/c)
rp}

Good luck,
Rem

Sunday, January 9, 2005, 10:21:47 AM, you wrote:

NX Hi, guys,
NXI need to simulate Pareto distribution. But I found
NX 'rpareto' didn't exist in R. And it seems that Pareto distribution
NX don't have mathematical relationships with other distributions.
NX What can I do?
 
NX Thanks a lot.
 
NX Ni

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] R vs EViews - serial correlation

2004-09-23 Thread Remigijus Lapinskas

Dear all,

I met with some problems when dealing with a time series with serial correlation.

FIRST, I generate a series with correlated errors

set.seed(1)
x=1:50
y=x+arima.sim(n = 50, list(ar = c(0.47)))

SECOND, I estimate three constants (a, b and rho) in the model Y=a+b*X+u, where 
u=rho*u(-1)+eps
 
library(nlme)
gls(y~x,correlation = corAR1(0.5)) # Is it the right procedure?

Coefficients:
(Intercept)   x 
  0.1410465   1.0023341 

Correlation Structure: AR(1)
 Formula: ~1 
 Parameter estimate(s):
 Phi 
0.440594 
Degrees of freedom: 50 total; 48 residual
Residual standard error: 0.9835158

THIRD, I do the same procedure with EViews as LS Y C X AR(1) and get
Y = 0.1375 + 1.0024*X + [AR(1)=0.3915]

My problem is actually connected with the fitting procedure. As far as I understand 

gls(y~x,correlation = corAR1(0.5))$fit

is obtained through the linear equation 0.1410+1.0023*X while in EViews through the 
nonlinear equation 

Y=rho*Y(-1) + (1-rho)*a+(X-rho*X(-1))*b

where either dynamic or static fitting procedures are applied. 

X   YYF_DYF_S gls.fit
1   1  1.1592  NA  NA  1.1434
2   2  3.5866  2.1499  2.1499  2.1457
3   3  4.1355  3.1478  3.7103  3.1480
4   4  3.9125  4.1484  4.5352  4.1504
5   5  2.7442  5.1502  5.0578  5.1527
6   6  6.0647  6.1523  5.2103  6.1551
7   7  6.9855  7.1547  7.1203  7.1574
.
47 47 49.4299 47.2521 47.5288 47.2507
48 48 48.7748 48.2545 49.1072 48.2531
49 49 48.3200 49.2570 49.4607 49.2554
50 50 50.2501 50.2594 49.8926 50.2578

All gls.fit values are on a line, YF_D (D for dynamic) soon begin
to follow a line and YF_S (S for static) try to mimic Y.

My question is: do R and EViews estimate the same model? If yes, why
the estimates are different and which of the two (three?) procedures
is correct?

Thanking you in advance,
Rem

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] predict(arima)

2004-08-29 Thread Remigijus Lapinskas
Dear All,

R 1.9.1, Windows

When copying and pasting a few lines from the 'predict.Arima' help, I
get an error message:

 data(lh)
 predict(arima(lh, order = c(3,0,0)), n.ahead = 12)
Error in eval(expr, envir, enclos) : Object xreg not found

On the other hand, the following is OK:

 data(lh)
 predict(arima0(lh, order = c(3,0,0)), n.ahead = 12)
$pred
Time Series:
Start = 49 
End = 60 
Frequency = 1 
 [1] 2.460173 2.270829 2.198597 2.260696 2.346933 2.414479 2.438918 2.431440 2.410223 
2.391645 2.382653 2.382697

$se
Time Series:
Start = 49 
End = 60 
Frequency = 1 
 [1] 0.4226823 0.5029332 0.5245256 0.5247161 0.5305499 0.5369159 0.5388045 0.5388448 
0.5391043 0.5395174 0.5396991 0.5397140

So, what is wrong with arima?

Thanking you in advance,
Rem

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] automate save.image

2003-10-29 Thread Remigijus Lapinskas
Dear all,

Sometimes, during an R session, my computer hangs and I
loose all the objects created during this session.
Is there a way to automatically type save.image() and
savehistory() every 5 minutes?

Best regards,
Remigijus

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re[2]: [R] optim

2003-03-02 Thread Remigijus Lapinskas
Sunday, March 02, 2003, 9:48:26 PM, you wrote:

MK [EMAIL PROTECTED] wrote:
 You are not solving the same problem, which is more than a little `odd'.

MK Why?

MK   want to choose the three parameters so that
MK these three integrals are as close to, resp., 2300, 4600 and 5800 as
MK possible. As I have three equations with three unknowns, I expect the
MK exact fit, i.e., the SS (see below) to be zero.

Good evening to all!

In fact, I want to estimate three unknown parameters TETA[1],TETA[2]
(the scale parameter) and TETA[3]. What I know is 10 numbers
aa - c(2300,4600,5800,
7100,...,37700) and these are the empirical counterparts of
the integrals over [0,0.1],...[0.9,1]. I don't know whether this is
correct but in order to find the parameters I minimize the
sum((integr-aa)^2) (kind of the moment method). As I had problems
with optim, I reduced the problem to three intervals.

I have to admit, I'm still fighting with optim. Following Prof
Ripley's suggestion, I replaced

res - optim(init,LSS,aa=aa,method = L-BFGS-B,lower=c(0,0,0.5))

by

res - optim(init,LSS,aa=aa,method = L-BFGS-B,lower=c(0,0,0.5),
control=list(parscale=c(1,1000,1)))

Now I have

 source(C:/Program Files/R/integral.R)
[1] 2.50  7000.00 0.84   # initial
[1] 2.052587 66734.47682242.110597   # final
[1] 42820.43

instead of what I had earlier

[1]2.50  7000.00 0.84# initial
[1]2.3487221 6999.8230.5623628   # final
[1] 75613.05

Now SS=42820, but it is still far from zero. I agree, I did not
implemented all what I was advised to, but it could take some time.

Many thanks for the help.
Remigijus

***
***


Do read the help page.  It says:

 [...]

`parscale' A vector of scaling values for the parameters.
  Optimization is performed on `par/parscale' and these should
  be comparable in the sense that a unit change in any element
  produces about a unit change in the scaled value.

**
**

Dear all,

I have a function MYFUN which depends on 3 positive parameters TETA[1],
TETA[2], and TETA[3]; x belongs to [0,1].
I integrate the function over [0,0.1], [0.1,0.2] and
[0.2,0.3] and want to choose the three parameters so that
these three integrals are as close to, resp., 2300, 4600 and 5800 as
possible. As I have three equations with three unknowns, I expect the
exact fit, i.e., the SS (see below) to be zero. However, the optim
function never gives me what I expect, the minimal SS value(=res$value)
never comes close to zero, the estimates of the parameters, res$par,
wildly depends on init etc.
I would be grateful for any comments on this miserable situation.

aa - c(2300,4600,5800)
init - c(2.5,8000,0.84) # initial values of parameters
print(init)
###
myfun - function(x,TETA) TETA[2]*(((1-x)^(-1/TETA[3]))-
1)^(1/TETA[1])
###
x - seq(0,0.3,by=0.01)
plot(x,myfun(x,init),type=l)
###
LSS - function(teta,aa)
{
integr - numeric(3)
   for(i in 1:3)
   {integr[i] - 10*integrate(myfun,
   lower=(i-1)/10,upper=i/10,TETA=teta)$value
   }
SS -  # SS=Sum of Squares
SS
}

res - optim(init,LSS,aa=aa,
method = L-BFGS-B,lower=c(0,0,0.5))
print(res$par)
print(res$value)



source(C:/Program Files/R/integral.R)

[1]2.50  7000.00 0.84# initial
[1]2.3487221 6999.8230.5623628   # final
[1] 75613.05 # minSS

source(C:/Program Files/R/integral.R)

[1] 2.5  150000.84   # initial
[1] 2.125804 14999.999747 2.241179   # final
[1] 50066.35 # minSS


__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] optim

2003-02-28 Thread Remigijus Lapinskas
Dear all,

I have a function MYFUN which depends on 3 positive parameters TETA[1],
TETA[2], and TETA[3]; x belongs to [0,1].
I integrate the function over [0,0.1], [0.1,0.2] and
[0.2,0.3] and want to choose the three parameters so that
these three integrals are as close to, resp., 2300, 4600 and 5800 as
possible. As I have three equations with three unknowns, I expect the
exact fit, i.e., the SS (see below) to be zero. However, the optim
function never gives me what I expect, the minimal SS value(=res$value)
never comes close to zero, the estimates of the parameters, res$par,
wildly depends on init etc.
I would be grateful for any comments on this miserable situation.

aa - c(2300,4600,5800)
init - c(2.5,8000,0.84) # initial values of parameters
print(init)
###
myfun - function(x,TETA) TETA[2]*(((1-x)^(-1/TETA[3]))-
1)^(1/TETA[1])
###
x - seq(0,0.3,by=0.01)
plot(x,myfun(x,init),type=l)
###
LSS - function(teta,aa)
{
integr - numeric(3)
   for(i in 1:3)
   {integr[i] - 10*integrate(myfun,
   lower=(i-1)/10,upper=i/10,TETA=teta)$value
   }
SS - sum((integr-aa)^2) # SS=Sum of Squares
SS
}

res - optim(init,LSS,aa=aa,
method = L-BFGS-B,lower=c(0,0,0.5))
print(res$par)
print(res$value)


 source(C:/Program Files/R/integral.R)
[1]2.50  7000.00 0.84# initial
[1]2.3487221 6999.8230.5623628   # final
[1] 75613.05 # minSS
 source(C:/Program Files/R/integral.R)
[1] 2.5  150000.84   # initial
[1] 2.125804 14999.999747 2.241179   # final
[1] 50066.35 # minSS



Best regards,
Remigijus  mailto:[EMAIL PROTECTED]

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help


Re[2]: [R] Interpolation

2003-02-12 Thread Remigijus Lapinskas
Many thanks to all who replied to my e-mail. My problem was that I
had not known about the approx function.

By the way, if I have  x - c(1990,1994,1995,1997), is there an
automated way to fill in the gaps, i.e., to get
c(1991,1992,1993,1996)?

Cheers,
Remigijus


Wednesday, February 12, 2003, 12:09:33 PM, you wrote:

PDB Remigijus Lapinskas [EMAIL PROTECTED] writes:

 Dear all,

 I have two vectors and connect the points by line segments:

 x - c(1990,1994,1995,1997)
 y - c(30,40,80,20)
 plot(x,y,type=l)

 I want to get the values of y's on these lines at missing x's,
 i.e., at 1991,1992,1993,and 1996. Is there a simple way to do this?


PDB approx(x,y,c(1991,1992,1993,1996))
PDB plot(x,y,type=l)
PDB points(approx(x,y,c(1991,1992,1993,1996)),pch=*,col=red,cex=5)

PDB --
PDBO__   Peter Dalgaard Blegdamsvej 3
PDB   c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
PDB  (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
PDB ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help



Re: [R] Plotting coloured histograms...

2003-01-25 Thread Remigijus Lapinskas
Try the H. Bengtsson's function plot.histogram from

http://www.maths.lth.se/matstat/staff/hb/mypackages/R/plot.histogram.R

Remigijus

Saturday, January 25, 2003, 10:21:30 PM, you wrote:

FHFPdRHi, I am having some trouble trying to plot a histogram in more than one
FHFPdR colour. What I want to do is, plot two vectors in the same histogram, but
FHFPdR with different colours, for instance:
FHFPdR x - rnorm(1000,20,4);
FHFPdR y - rnorm(1000,10,2);
FHFPdR Then I'd like to have x and y ploted on the same hist (I can do that
FHFPdR already doing w - c(x,y) then hist(w)) but the bars representing the x's 
should
FHFPdR be in one colour and the bars representing the y should be in another one,
FHFPdR so that I could see the overlaping areas of the two distributions etc.
FHFPdR  Is there any way to do that? I've read through the hist docummentation
(help(hist)) and also googled for R colour histogram but didn't find
FHFPdR anything helpfull.

FHFPdR Thank you for your attention,

FHFPdR --

FHFPdR __
FHFPdR [EMAIL PROTECTED] mailing list
FHFPdR http://www.stat.math.ethz.ch/mailman/listinfo/r-help

__
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help