Re: [R] R2 always increases as variables are added?

2007-05-22 Thread Paul Lynch
On 5/21/07, Alberto Monteiro [EMAIL PROTECTED] wrote:
 Paul Lynch wrote:
 
  I don't think it makes sense to compare models with
  and without an intercept term.  (Also, I don't know what the point of
  using a model without an intercept term would be, but that is
  probably just my ignorance.)
 
 Suppose that you are 100% sure that the intercept term is zero, or
 so insignificantly small as not to matter. For example, if you are
 measuring the density of some material, and you determine a lot
 of pairs (mass, volume), you know that mass = density * volume,
 with intercept zero.


In that case, you are 100% sure that the intercept *should* be zero,
but you aren't 100% sure that the measurements have a best fit with
intercept zero.  There could have been some systematic error that is
throwing things off.  It seems safer to leave the intercept in and let
the data show that the intercept is insignificantly small.  However, I
don't really know enough to know whether that is always the best
approach.  (And given that R provides a facility for excluding the
intercept, I suspect there must be some good reason for doing so in
some circumstances.)
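
A minimal sketch of the "leave the intercept in and let the data decide" approach, using simulated data. The density of 2.7, the noise level, and the variable names are all assumptions made for illustration; they are not from the thread.

```r
set.seed(1)                                    # arbitrary seed, for reproducibility
volume <- runif(30, min = 1, max = 10)         # simulated volumes
mass   <- 2.7 * volume + rnorm(30, sd = 0.2)   # assumed density 2.7, no systematic error

fit.with    <- lm(mass ~ volume)               # intercept estimated from the data
fit.without <- lm(mass ~ volume - 1)           # intercept forced to zero

# Estimate, standard error, t value and p-value of the intercept
coef(summary(fit.with))["(Intercept)", ]

# F test of the zero-intercept model against the model with an intercept
anova(fit.without, fit.with)
```

If the intercept is not distinguishable from zero, the two fits should give nearly the same slope; a systematic offset in the measurements would show up as a significant intercept.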


-- 
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R2 always increases as variables are added?

2007-05-21 Thread Paul Lynch
Junjie,
First, a disclaimer:  I am not a statistician, and have only taken
one statistics class, but I just took it this Spring, so the concepts
of linear regression are relatively fresh in my head and hopefully I
will not be too inaccurate.
According to my statistics textbook, when selecting variables for
a model, the intercept term is always present.  The variables under
consideration do not include the constant 1 that multiplies the
intercept term.  I don't think it makes sense to compare models with
and without an intercept term.  (Also, I don't know what the point of
using a model without an intercept term would be, but that is probably
just my ignorance.)
Similarly, the formula you were using for R**2 seems to only be
useful in the context of a standard linear regression (i.e., one that
includes an intercept term).  As your example shows, it is easy to
construct a fit (e.g. y = 10,000,000*x) so that SSR > SST if one is
not deriving the fit from the regular linear regression process.
  --Paul
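
For question (3), a small sketch of what summary.lm appears to do for a model without an intercept: the total sum of squares is taken about zero rather than about mean(y). The data are simulated and the names fit0, r2.uncentered and r2.centered are only for illustration.

```r
set.seed(1)                       # arbitrary seed
x <- rnorm(10)
y <- runif(10)

fit0 <- lm(y ~ x - 1)             # no intercept
rss  <- sum(residuals(fit0)^2)    # residual sum of squares

# Total sum of squares taken about zero (no-intercept convention)
r2.uncentered <- 1 - rss / sum(y^2)

# Total sum of squares taken about mean(y); this can be negative here
r2.centered <- 1 - rss / sum((y - mean(y))^2)

summary(fit0)$r.squared
all.equal(summary(fit0)$r.squared, r2.uncentered)
```

Because the two models use different baselines, their R-squared values are not directly comparable, which is consistent with the advice above not to compare models with and without an intercept this way.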

On 5/19/07, 李俊杰 [EMAIL PROTECTED] wrote:
 I know that -1 indicates to remove the intercept term. But my question is
 why the intercept term CAN NOT be treated as a variable term when we place a
 column consisting of 1s in the predictor matrix.

 If I stick to make a comparison between a model with intercept and one
 without intercept on adjusted r2 term, now I think the strategy is always to
 use another definition of r-square or adjusted r-square, in which
 r-square=sum(( y.hat)^2)/sum((y)^2).

 Am I  in the right way?

 Thanks

 Li Junjie


 2007/5/19, Paul Lynch [EMAIL PROTECTED]:
  In case you weren't aware, the meaning of the -1 in y ~ x - 1 is to
  remove the intercept term that would otherwise be implied.
  --Paul
 
  On 5/17/07, 李俊杰 [EMAIL PROTECTED] wrote:
   Hi, everybody,
  
   3 questions about R-square:
   -(1)--- Does R2 always increase as variables are added?
   -(2)--- Can R2 be greater than 1?
   -(3)--- How is R2 in summary(lm(y~x-1))$r.squared
   calculated? It is different from (r.square=sum((y.hat-mean
   (y))^2)/sum((y-mean(y))^2))
  
   I will illustrate these problems by the following codes:
   -(1)---  R2  doesn't always increase as
 variables are added
  
x=matrix(rnorm(20),ncol=2)
y=rnorm(10)
   
lm=lm(y~1)
y.hat=rep(1*lm$coefficients,length(y))
(r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
   [1] 2.646815e-33
   
lm=lm(y~x-1)
y.hat=x%*%lm$coefficients
(r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
   [1] 0.4443356
   
 This is the biggest model, but its R2 is not the
 biggest,
   why?
lm=lm(y~x)
y.hat=cbind(rep(1,length(y)),x)%*%lm$coefficients
(r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
   [1] 0.2704789
  
  
   -(2)---  R2  can be greater than 1
  
x=rnorm(10)
y=runif(10)
lm=lm(y~x-1)
y.hat=x*lm$coefficients
(r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
   [1] 3.513865
  
  
-(3)--- How is R2 in summary(lm(y~x-1))$r.squared
   calculated? It is different from (r.square=sum((y.hat-mean
   (y))^2)/sum((y-mean(y))^2))
x=matrix(rnorm(20),ncol=2)
xx=cbind(rep(1,10),x)
y=x%*%c(1,2)+rnorm(10)
### r2 calculated by lm(y~x)
lm=lm(y~x)
summary(lm)$r.squared
   [1] 0.9231062
### r2 calculated by lm(y~xx-1)
lm=lm(y~xx-1)
summary(lm)$r.squared
   [1] 0.9365253
### r2 calculated by me
y.hat=xx%*%lm$coefficients
(r.square=sum((y.hat-mean(y))^2)/sum((y-mean(y))^2))
   [1] 0.9231062
  
  
   Thanks a lot for any cue:)
  
  
  
  
   --
   Junjie Li,  [EMAIL PROTECTED]
   Undergraduate in DEP of Tsinghua University,
  
   [[alternative HTML version deleted]]
  
  
 
 
  --
  Paul Lynch
  Aquilent, Inc.
  National Library of Medicine (Contractor)
 



 --

 Junjie Li,  [EMAIL PROTECTED]
 Undergraduate in DEP of Tsinghua University,


-- 
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)



[R] Plotting data with a fitted curve

2007-04-16 Thread Paul Lynch
Suppose you have a vector of data in x and response values in y.  How
do you plot together both the points (x,y) and the curve that results
from the fitted model, if the model is not y ~ x, but a higher order
polynomial, e.g. y~poly(x,2)?  (In other words, abline doesn't work
for this case.)

Thanks,
 --Paul
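
One common way to do this is to evaluate the fitted model on a fine grid with predict() and draw the result with lines(). The sketch below uses simulated data; the coefficients, seed and grid size are assumptions chosen only for illustration.

```r
set.seed(1)                                      # arbitrary seed
x <- runif(50, -2, 2)
y <- 1 + 2 * x - 1.5 * x^2 + rnorm(50, sd = 0.5) # simulated quadratic response

fit <- lm(y ~ poly(x, 2))

# Evaluate the fitted curve on a fine, sorted grid so the line is smooth
x.grid <- seq(min(x), max(x), length.out = 200)
y.grid <- predict(fit, newdata = data.frame(x = x.grid))

plot(x, y)                          # the data points
lines(x.grid, y.grid, col = "red")  # the fitted polynomial
```

The newdata column must have the same name as the predictor in the formula (here x), otherwise predict() cannot match it.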

-- 
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)



[R] Vector indexing question

2007-03-29 Thread Paul Lynch
Suppose you have 4 related vectors:

a.id <- c(1:25, 1:25, 1:25)
a.vals <- c(101:175)  # same length as a.id (the values for those IDs)
a.id.levels <- c(1:25)
a.id.ratings <- rep(letters[1:5], times=5)  # same length as a.id.levels

What I would like to do is specify a rating from a.id.ratings (e.g. 'e'),
get the vector of corresponding IDs from a.id.levels (via
a.id.levels[a.id.ratings=='e']) and then somehow use those IDs in a.id
to get the corresponding values from a.vals.

I think I can probably write a loop to construct a vector of
ratings of the same length as a.id so that the ratings match the ID,
and then go from there.  Is there a better way?  Perhaps using factors
or levels or something?

Thanks,
  --Paul
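
A loop-free sketch of the lookup described above, using %in% and match(). The vectors are the ones from the question; the names ids.e, vals.e, ratings.by.row and vals.e2 are introduced here only for illustration.

```r
a.id         <- c(1:25, 1:25, 1:25)
a.vals       <- 101:175                        # same length as a.id
a.id.levels  <- 1:25
a.id.ratings <- rep(letters[1:5], times = 5)   # same length as a.id.levels

# Option 1: find the IDs rated "e", then keep the values whose ID is among them
ids.e  <- a.id.levels[a.id.ratings == "e"]
vals.e <- a.vals[a.id %in% ids.e]

# Option 2: build the per-row rating vector without a loop, then filter on it
ratings.by.row <- a.id.ratings[match(a.id, a.id.levels)]
vals.e2 <- a.vals[ratings.by.row == "e"]

identical(vals.e, vals.e2)
```

Neither option relies on a.id being a repeated 1:25 sequence, so both should also work when a.id is an arbitrary ordering of the IDs.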

-- 
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)



Re: [R] Vector indexing question

2007-03-29 Thread Paul Lynch
Adai-- Thanks a lot!  This is just what I was looking for.  I was
almost sure there had to be a neat way of doing this.

Bert--  Thanks for the tip.

Marc-- Not quite, although your solution works fine for the case I
gave.  What I had in mind for a.id was an arbitrary sequence of the
numbers in the range [1,25], of length 75, though I was not savvy
enough with R to express that succinctly.  You spotted a shortcut that
I hadn't realized I was introducing.

Thanks all for your help!
  --Paul

On 3/29/07, Adaikalavan Ramasamy [EMAIL PROTECTED] wrote:
 Sounds like you have two different tables and are trying to mine one
 based on the other. Try

 ref <- data.frame( levels  = 1:25,
                    ratings = rep(letters[1:5], times=5) )

 db <- data.frame( vals=101:175, levels=c(1:25, 1:25, 1:25) )

 levels.of.interest <- ref$levels[ ref$ratings == "a" ]
 db$vals[ which(db$levels %in% levels.of.interest) ]

   [1] 101 106 111 116 121 126 131 136 141 146 151 156 161 166 171


 OR a much more intuitive way is to merge both tables and proceed as follows:

 out <- merge( db, ref, by="levels", all.x=TRUE )
 out <- out[ order(out$vals), ] # little cleanup
 subset( out, ratings=="a" )   # ignore the rownames

 levels vals ratings
 1   1  101   a
 16  6  106   a
 31 11  111   a
 46 16  116   a
 61 21  121   a
 3   1  126   a
 17  6  131   a
 32 11  136   a
 47 16  141   a
 62 21  146   a
 2   1  151   a
 18  6  156   a
 33 11  161   a
 48 16  166   a
 63 21  171   a

 Then you can do cool things using the apply() family like
tapply( out$vals, out$ratings, mean )
  a   b   c   d   e
136 137 138 139 140

 Check out %in%, merge and apply.

 Regards, Adai



 Paul Lynch wrote:
  Suppose you have 4 related vectors:
 
  a.id <- c(1:25, 1:25, 1:25)
  a.vals <- c(101:175)  # same length as a.id (the values for those IDs)
  a.id.levels <- c(1:25)
  a.id.ratings <- rep(letters[1:5], times=5)  # same length as a.id.levels
 
  What I would like to do is specify a rating from a.id.ratings (e.g. 'e'),
  get the vector of corresponding IDs from a.id.levels (via
  a.id.levels[a.id.ratings=='e']) and then somehow use those IDs in a.id
  to get the corresponding values from a.vals.
 
  I think I can probably write a loop to construct a vector of
  ratings of the same length as a.id so that the ratings match the ID,
  and then go from there.  Is there a better way?  Perhaps using factors
  or levels or something?
 
  Thanks,
--Paul
 




-- 
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)



[R] Fitting a line to a qqplot's points?

2007-03-22 Thread Paul Lynch
I've made some normal plots of my data using qqplot, and now
I would like to fit a line to the points on the plot and
check the correlation coefficient to have a more objective measure
of how straight the line is.  Is there a simple way of doing that?
(I'm still pretty new to R.)

Thanks,
   --Paul
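
One possible approach, sketched with simulated data: qqnorm() returns the plotted coordinates, so the correlation between theoretical and sample quantiles can be computed directly, and lm() gives a least-squares line through the points. The sample, seed and variable names are assumptions for illustration.

```r
set.seed(1)                          # arbitrary seed
x <- rnorm(100, mean = 5, sd = 2)    # simulated sample

qq <- qqnorm(x)                      # draws the plot; returns list(x = theoretical, y = sample)
qqline(x)                            # reference line through the first and third quartiles

fit <- lm(qq$y ~ qq$x)               # least-squares line through the plotted points
abline(fit, lty = 2)

cor(qq$x, qq$y)                      # correlation coefficient of the normal plot
```

Note that qqline() is not a least-squares fit, so the two lines can differ slightly. A correlation close to 1 suggests the points are close to a straight line.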

-- 
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)



[R] Connecting R-help and Google Groups?

2007-03-14 Thread Paul Lynch
This morning I tried to see if I could find the r-help mailing list on
Google Groups, which has an interface that I like.  I found three
Google Groups (The R Project for Statistical Computing, rproject,
and rhelp) but none of them are connected to the r-help list.

Is there perhaps some reason why it wouldn't be a good thing for there
to be a connected Google Group?  I think it should be possible to set
things up so that a post to the Google Group goes to the r-help
mailing list, and vice-versa.

Also, does anyone know why the three existing R Google Groups failed
to get connected to r-help?  It might require some action on the part
of the r-help list administrator.

Thanks,
--Paul

-- 
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)



Re: [R] Connecting R-help and Google Groups?

2007-03-14 Thread Paul Lynch
Well, I don't see what danger could arise from the fact that Google
Groups is owned by a company.  Google Groups provides access to all of
usenet, plus many mailing lists (e.g. the ruby-talk mailing list for
Ruby programmers).  They don't control any of the newsgroups or mailing
lists that they provide access to.  It is a free service, supported by
advertising.

As for the issue of whether there might be future access problems
(e.g. if Google goes bankrupt, which currently seems unlikely)  R
users would still have access to the r-help list through the means
that they have now.  I am not recommending replacing any of the
current means of access to the r-help list; I am just asking about
adding an additional means of access.

  --Paul

On 3/14/07, Bert Gunter [EMAIL PROTECTED] wrote:
 I know nothing about Google Groups, but FWIW, I think it would be most
 unwise for R/CRAN to hook up to **any** commercially sponsored web portals.
 Future changes in their policies, interfaces, or access conditions may make
 them inaccessible or unfriendly to R users. So long as we have folks willing
 and able to host and maintain our lists as part of the CRAN infrastructure,
 CRAN maintains control. I think this is wise and prudent.

 I am happy to be educated to the contrary if I misunderstand how this would
 work.

 Bert Gunter
 Genentech Nonclinical Statistics
 South San Francisco, CA 94404
 650-467-7374


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Paul Lynch
 Sent: Wednesday, March 14, 2007 8:48 AM
 To: R-help@stat.math.ethz.ch
 Subject: [R] Connecting R-help and Google Groups?

 This morning I tried to see if I could find the r-help mailing list on
 Google Groups, which has an interface that I like.  I found three
 Google Groups (The R Project for Statistical Computing, rproject,
 and rhelp) but none of them are connected to the r-help list.

 Is there perhaps some reason why it wouldn't be a good thing for there
 to be a connected Google Group?  I think it should be possible to set
 things up so that a post to the Google Group goes to the r-help
 mailing list, and vice-versa.

 Also, does anyone know why the three existing R Google Groups failed
 to get connected to r-help?  It might require some action on the part
 of the r-help list administrator.

 Thanks,
 --Paul

 --
 Paul Lynch
 Aquilent, Inc.
 National Library of Medicine (Contractor)





-- 
Paul Lynch
Aquilent, Inc.
National Library of Medicine (Contractor)



Re: [R] Simplest question ever...

2007-03-01 Thread Paul Lynch
I'm not sure this is the most efficient, but how about:
   diag(m[a,b])
?
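
A likely more efficient alternative, sketched below: indexing a matrix with a two-column matrix of (row, column) pairs picks out exactly those elements, without building the full length(a) x length(b) submatrix first. The matrix m here is made up for illustration.

```r
m <- matrix(1:70, nrow = 10, ncol = 7)   # hypothetical 10 x 7 matrix
a <- c(1, 4, 5)                          # row indices
b <- c(2, 6, 7)                          # column indices

# Each row of cbind(a, b) is one (row, column) pair
picked <- m[cbind(a, b)]                 # m[1, 2], m[4, 6], m[5, 7]

# Same elements as the diag() approach
all(picked == diag(m[a, b]))
```

For long index vectors the diag(m[a, b]) form extracts length(a)^2 elements and discards most of them, which is why the cbind() form is usually preferred.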

On 3/1/07, yoo [EMAIL PROTECTED] wrote:

 Let's say i have

 a = c(1, 4, 5)
 b = c(2, 6, 7)

 and i have matrix m, what's an efficient way of access
 m[1, 2], m[4, 6], m[5, 7]
 Of course m[a, b] is not going to do it, but what's an expression that
 will allow me to have that list?

 Thanks!
 --
 View this message in context: 
 http://www.nabble.com/Simplest-question-ever...-tf3329894.html#a9258932
 Sent from the R help mailing list archive at Nabble.com.





Re: [R] integrate over polygon

2007-02-15 Thread Paul Lynch
I'm still pretty ignorant about R, but I think it might be possible to
work out an algorithm using  cross products.  First you would want to
subdivide the polygon into convex polygons.  I haven't tried to do
that before, but it looks like it might be possible by looking at the
sign of cross products of vectors between vertices.  (In other words,
pick a vertex, and then start working your way around the polygon and
pay attention to the sign of cross products of vectors from the
starting vertex to successive vertices.)  Once you have convex
polygons, you can calculate the area using cross products of vectors
from some point (e.g. the origin) to adjacent vertices of the polygon.
 I think that probably most computer graphics texts would have such an
algorithm.

How to implement that in R is not something I can answer, but it
doesn't sound hard.
   --Paul
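
For the area part, a rough sketch in R of the cross-product idea. It sums the cross products of vectors from the origin to adjacent vertices (the "shoelace" formula). As far as I know this works for any simple (non-self-intersecting) polygon, so the convex subdivision may not be needed for the area alone. The function name polygon.area is made up here.

```r
# Area of a simple polygon from its vertices, given in order around the boundary
polygon.area <- function(x, y) {
  n <- length(x)
  x.next <- x[c(2:n, 1)]    # each vertex's successor, wrapping around
  y.next <- y[c(2:n, 1)]
  # Half the absolute sum of the 2-D cross products of adjacent vertex vectors
  abs(sum(x * y.next - x.next * y)) / 2
}

polygon.area(c(0, 1, 1, 0), c(0, 0, 1, 1))              # unit square
polygon.area(c(0, 2, 2, 1, 1, 0), c(0, 0, 1, 1, 2, 2))  # L-shaped, non-convex
```

This computes only the area, not the integral of a function over the polygon.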

On 2/14/07, Haiyong Xu [EMAIL PROTECTED] wrote:
 Hi there,

 I want to integrate a function over an irregular polygon. Is there
 any function which can implement this easily? Otherwise, I am
 thinking of dividing the polygon into very small rectangles and use
 adapt to approximate it. Do you have any suggestions to get the
 fine division? Any advice is appreciated.

 Haiyong





Re: [R] integrate over polygon

2007-02-15 Thread Paul Lynch
Oops.  I just re-read your message and saw you were trying to
integrate a function over a polygon, not calculate its area.  I'm
sorry I didn't read more carefully.
 --Paul

On 2/15/07, Paul Lynch [EMAIL PROTECTED] wrote:
 I'm still pretty ignorant about R, but I think it might be possible to
 work out an algorithm using  cross products.  First you would want to
 subdivide the polygon into convex polygons.  I haven't tried to do
 that before, but it looks like it might be possible by looking at the
 sign of cross products of vectors between vertices.  (In other words,
 pick a vertex, and then start working your way around the polygon and
 pay attention to the sign of cross products of vectors from the
 starting vertex to successive vertices.)  Once you have convex
 polygons, you can calculate the area using cross products of vectors
 from some point (e.g. the origin) to adjacent vertices of the polygon.
  I think that probably most computer graphics texts would have such an
 algorithm.

 How to implement that in R is not something I can answer, but it
 doesn't sound hard.
--Paul

 On 2/14/07, Haiyong Xu [EMAIL PROTECTED] wrote:
  Hi there,
 
  I want to integrate a function over an irregular polygon. Is there
  any function which can implement this easily? Otherwise, I am
  thinking of dividing the polygon into very small rectangles and use
  adapt to approximate it. Do you have any suggestions to get the
  fine division? Any advice is appreciated.
 
  Haiyong
 
 




[R] R book advice

2007-02-15 Thread Paul Lynch
I'm looking for a book for someone completely ignorant of statistics
who wishes to learn both statistics and R.  I've found three
possibilities, one by Verzani (Using R for Introductory Statistics),
one by Crawley (Statistics: An Introduction using R), and one by
Dalgaard (Introductory Statistics with R).  Do these books have
different emphases, perspectives, or strengths?  Should I just pick
one at random and buy it?

Thanks,
--Paul



Re: [R] make check failure, internet.Rout.fail, Error in strsplit

2007-02-13 Thread Paul Lynch
Thanks for giving it a try.  It is very odd that you got
Content-Length when I am getting Content-length.  I just tried
curl (I had been using telnet to port 80) and I got the same (error
causing) length result:

 curl --head http://www.stats.ox.ac.uk/pub/datasets/csb/ch11b.dat
HTTP/1.1 200 OK
Date: Tue, 13 Feb 2007 16:10:41 GMT
Server: Apache/2.0.40 (Red Hat Linux)
Last-Modified: Fri, 19 May 1995 10:27:04 GMT
ETag: 7bc27-836-39a78e00
Accept-Ranges: bytes
Content-Type: text/plain; charset=ISO-8859-1
Content-length: 2102
Connection: Keep-Alive

Perhaps we are hitting different web servers?  I ran nslookup on
www.stats.ox.ac.uk, and it appears to be an alias for
web2.stats.ox.ac.uk.  Is that the machine you are getting?  What
happens if you run curl against web2, i.e.:
   curl --head http://web2.stats.ox.ac.uk/pub/datasets/csb/ch11b.dat

?
(I get Content-length).
Thanks,
 --Paul

On 2/12/07, Charilaos Skiadas [EMAIL PROTECTED] wrote:
 On Feb 12, 2007, at 6:28 PM, Paul Lynch wrote:

  I'm trying to build R on RedHat EL4.  The compile went fine, but a
  make check ran into a problem and produced a file
  internet.Rout.fail.  Judging by the last part of that file, it was
  trying to run an R routine called httpget to retrieve the URL
  http://www.stats.ox.ac.uk/pub/datasets/csb/ch11b.dat.  The precise
  error it encountered was:
 
  Error in strsplit(grep("Content-Length", b, value = TRUE), ":")[[1]] :
  subscript out of bounds
 
  So, it looks like the data it read from that URL was not what was
  expected.  I tried mimicking the script's request of the header
  information for that URL, and got back the following header lines:
 
  HTTP/1.1 200 OK
  Date: Mon, 12 Feb 2007 23:22:06 GMT
  Server: Apache/2.0.40 (Red Hat Linux)
  Last-Modified: Fri, 19 May 1995 10:27:04 GMT
  ETag: 7bc27-836-39a78e00
  Accept-Ranges: bytes
  Content-Type: text/plain; charset=ISO-8859-1
  Content-length: 2102
  Connection: Keep-Alive
 
  The script appears to be looking for a Content-Length field, but as
  you can see the returned header is Content-length with a lower-case
  l.  I don't know R yet, so I'm not sure if the grep in the test code
  is case-sensitive or not, but if it is, that would seem to be the
  problem.  But then, surely everyone would be hitting this error?

 The grep is indeed case sensitive, as a quick test can show. However,
 the header I got back when I tried the above address had Length in it:
 HTTP/1.1 200 OK
 Date: Tue, 13 Feb 2007 01:40:48 GMT
 Server: Apache/2.0.40 (Red Hat Linux)
 Last-Modified: Fri, 19 May 1995 10:27:04 GMT
 ETag: 7bc27-836-39a78e00
 Accept-Ranges: bytes
 Content-Length: 2102
 Content-Type: text/plain; charset=ISO-8859-1
 X-Pad: avoid browser bug

 ( I used curl for this, if it makes a difference)

 Hope this helps in some way.

--Paul

 Haris






Re: [R] make check failure, internet.Rout.fail, Error in strsplit

2007-02-13 Thread Paul Lynch
I just happen to also have a MacOS 10.4 machine, but when I tried from
there, I still got Content-length.  Anyway, I am fairly certain that
headers received from web servers would not be modified by the
receiving machine or anything in-between.  I suspect that the machine
at www.stats.ox.ac.uk, even though we see it as the same IP, must be
doing some sort of load balancing, probably on the basis of our IP
addresses, and sending our requests to different servers.

If this is correct, then it would seem that the test code for this
part of the make test should be modified to do the grep in a
case-insensitive way, or at the very least to support
Content-length.  I guess the next thing to do would be to submit a
bug report.

Thanks a lot for helping me look into this problem,
   --Paul
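
A sketch of what a case-insensitive version of the check might look like. This is not the actual test code from the R sources; the vector b below just holds a few of the header lines quoted in this thread, and the names hit and len are made up for illustration.

```r
# Header lines as returned by the server in this thread (lower-case "l")
b <- c("HTTP/1.1 200 OK",
       "Content-Type: text/plain; charset=ISO-8859-1",
       "Content-length: 2102",
       "Connection: Keep-Alive")

# Case-sensitive match finds nothing, so [[1]] on the strsplit() result fails
grep("Content-Length", b, value = TRUE)

# Case-insensitive match accepts either spelling
hit <- grep("Content-Length", b, value = TRUE, ignore.case = TRUE)
len <- as.numeric(strsplit(hit, ":")[[1]][2])
len
```

HTTP header field names are case-insensitive, so a server sending Content-length is behaving legitimately and the matching side is the place to be lenient.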

On 2/13/07, Charilaos Skiadas [EMAIL PROTECTED] wrote:
 On Feb 13, 2007, at 11:16 AM, Paul Lynch wrote:

  Thanks for giving it a try.  It is very odd that you got
  Content-Length when I am getting Content-length.  I just tried
  curl (I had been using telnet to port 80) and I got the same (error
  causing) length result:
 
  Perhaps we are hitting different web servers?  I ran nslookup on
  www.stats.ox.ac.uk, and it appears to be an alias for
  web2.stats.ox.ac.uk.  Is that the machine you are getting?

 I get:
 Server: 192.200.129.190
 Address:192.200.129.190#53

 Non-authoritative answer:
 www.stats.ox.ac.uk  canonical name = web2.stats.ox.ac.uk.
 Name:   web2.stats.ox.ac.uk
 Address: 163.1.210.2

  What
  happens if you run curl against web2, i.e.:

curl --head http://web2.stats.ox.ac.uk/pub/datasets/csb/ch11b.dat
 
  ?
  (I get Content-length).

 I get Content-Length.

 I'm on MacOSX 10.4.8, don't know if that makes any difference.

  Thanks,
  --Paul

 Haris






[R] make check failure, internet.Rout.fail, Error in strsplit

2007-02-12 Thread Paul Lynch
I'm trying to build R on RedHat EL4.  The compile went fine, but a
make check ran into a problem and produced a file
internet.Rout.fail.  Judging by the last part of that file, it was
trying to run an R routine called httpget to retrieve the URL
http://www.stats.ox.ac.uk/pub/datasets/csb/ch11b.dat.  The precise
error it encountered was:

Error in strsplit(grep("Content-Length", b, value = TRUE), ":")[[1]] :
subscript out of bounds

So, it looks like the data it read from that URL was not what was
expected.  I tried mimicking the script's request of the header
information for that URL, and got back the following header lines:

HTTP/1.1 200 OK
Date: Mon, 12 Feb 2007 23:22:06 GMT
Server: Apache/2.0.40 (Red Hat Linux)
Last-Modified: Fri, 19 May 1995 10:27:04 GMT
ETag: 7bc27-836-39a78e00
Accept-Ranges: bytes
Content-Type: text/plain; charset=ISO-8859-1
Content-length: 2102
Connection: Keep-Alive

The script appears to be looking for a Content-Length field, but as
you can see the returned header is Content-length with a lower-case
l.  I don't know R yet, so I'm not sure if the grep in the test code
is case-sensitive or not, but if it is, that would seem to be the
problem.  But then, surely everyone would be hitting this error?

Can anyone offer some suggestions as how to proceed from here?  Thanks,
  --Paul
