Re: [R] Padding in lattice plots

2005-07-16 Thread Federico Gherardini
On Friday 15 July 2005 17:00, Deepayan Sarkar wrote:
> On 7/15/05, Federico Gherardini [EMAIL PROTECTED] wrote:
>> On Friday 15 July 2005 14:42, you wrote:
>>> Hi all,
>>> I've used the split argument to print four lattice plots on a single
>>> page. The problem now is that I need to reduce the amount of white
>>> space between the plots. I've read other mails in this list about the
>>> new trellis parameters layout.heights and layout.widths but I haven't
>>> been able to use them properly. I've tried to input values between 0
>>> and 1 as the padding value (both left and right and top and bottom) but
>>> nothing changed. It seems I can only increase the padding by using
>>> values > 1. Any ideas?
>>>
>>> Thanks in advance for your help
>>> Federico Gherardini
>>
>> It seems like I've found an answer myself: you have to use negative
>> values to decrease the padding. I thought it was something like the cex
>> parameter which acts like a multiplier
>
> I thought so too.
>
>> but this is not the case.
>
> Could you post what you used? There are several different padding
> parameters you need to set to 0, did you change them all?
>
> Deepayan

Hi Deepayan
This is what I used. I don't know if I did everything the proper way, but 
at least I got the result I was seeking! :)

trellis.par.set(list(layout.heights = list(top.padding = -1)))

trellis.par.set(list(layout.heights = list(bottom.padding = -1, 
axis.xlab.padding = 1, xlab = -1.2)))

trellis.par.set(list(layout.widths = list(left.padding = -1)))

trellis.par.set(list(layout.widths = list(right.padding = -1, 
ylab.axis.padding = -0.5)))

Do these settings make any sense?

Federico
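
For readers following the thread, here is a minimal sketch of how the padding components can be set per plot and combined with the split argument. The data set (mtcars), the object names and the zero values are assumptions for illustration only: zero is what Deepayan suggests above, while the negative values are what Federico reports using. Passing the list through par.settings, instead of trellis.par.set, is a choice made here so that the global settings stay untouched.

```r
library(lattice)

# Padding components of layout.heights / layout.widths.
# The values (0) are illustrative, not taken from the thread.
tight <- list(
  layout.heights = list(top.padding = 0, bottom.padding = 0),
  layout.widths  = list(left.padding = 0, right.padding = 0)
)

# Four plots on illustrative data; each carries its own settings.
p1 <- xyplot(mpg ~ wt,   data = mtcars, par.settings = tight)
p2 <- xyplot(mpg ~ hp,   data = mtcars, par.settings = tight)
p3 <- xyplot(mpg ~ disp, data = mtcars, par.settings = tight)
p4 <- xyplot(mpg ~ qsec, data = mtcars, par.settings = tight)

# split = c(column, row, number of columns, number of rows)
print(p1, split = c(1, 1, 2, 2), more = TRUE)
print(p2, split = c(2, 1, 2, 2), more = TRUE)
print(p3, split = c(1, 2, 2, 2), more = TRUE)
print(p4, split = c(2, 2, 2, 2))
```

If the gaps are still too wide, the other padding components (for example axis.xlab.padding or ylab.axis.padding, both used in the message above) can be added to the same list.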

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Padding in lattice plots

2005-07-15 Thread Federico Gherardini
Hi all,
I've used the split argument to print four lattice plots on a single page. The 
problem now is that I need to reduce the amount of white space between the 
plots. I've read other mails in this list about the new trellis parameters 
layout.heights and layout.widths but I haven't been able to use them 
properly. I've tried to input values between 0 and 1 as the padding value 
(both left and right and top and bottom) but nothing changed. It seems I can 
only increase the padding by using values > 1. Any ideas?

Thanks in advance for your help
Federico Gherardini



Re: [R] Padding in lattice plots

2005-07-15 Thread Federico Gherardini
On Friday 15 July 2005 14:42, you wrote:
> Hi all,
> I've used the split argument to print four lattice plots on a single page.
> The problem now is that I need to reduce the amount of white space between
> the plots. I've read other mails in this list about the new trellis
> parameters layout.heights and layout.widths but I haven't been able to use
> them properly. I've tried to input values between 0 and 1 as the padding
> value (both left and right and top and bottom) but nothing changed. It
> seems I can only increase the padding by using values > 1. Any ideas?
>
> Thanks in advance for your help
> Federico Gherardini

It seems like I've found an answer myself: you have to use negative values 
to decrease the padding. I thought it was something like the cex parameter 
which acts like a multiplier but this is not the case.

Thanks anyway
Federico



[R] Negative intercept in glm poisson model

2005-03-01 Thread Federico Gherardini
Dear list,
I'm trying to fit a glm model using family=poisson(link = identity). The 
problem is that the glm function fits a model with a negative intercept, 
which sounds like nonsense to me, since the response is a Poisson variable. 
From a previous discussion on this list I've understood that the glm function 
uses IRLS for the fitting without any constraint, so it is possible for it to 
end up in a region of values which are not admissible given the model, and in 
fact sometimes it fails asking for valid starting values. In this case I 
expected it to fail asking to supply starting values, and instead it fits the 
model just fine with this negative intercept. What am I missing?

Thanks,
Cheers
Federico



Re: [R] Negative intercept in glm poisson model

2005-03-01 Thread Federico Gherardini
On Tuesday 01 March 2005 14:31, Douglas Bates wrote:
> Not really.  The negative intercept is on the scale of the linear
> predictor.  The expected response, which is the exponential of the
> linear predictor, is always positive.

Thank you very much for the quick response. Actually I used link = identity, so 
I suppose that the response is on the same scale as the linear predictor, 
isn't it?

Cheers,
Federico
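
A small simulated sketch may help with the question above. With the identity link the linear predictor is the mean itself, so what has to stay positive is the fitted mean at each observed value of the covariate, not the intercept. The intercept is only the extrapolated mean at x = 0, which can lie outside the data. All data, names and starting values below are assumptions for illustration.

```r
set.seed(42)

# Illustrative data: the true mean is -5 + 1.5 * x, which is positive
# over the observed range of x (10 to 60) even though the intercept is not.
x  <- rep(10:60, each = 5)
mu <- -5 + 1.5 * x
y  <- rpois(length(x), mu)

# start = c(1, 1) gives a positive initial mean for every observation
fit <- glm(y ~ x, family = poisson(link = "identity"), start = c(1, 1))

coef(fit)           # the intercept estimate can be negative
range(fitted(fit))  # fitted means at the observed x values
```

In this sketch glm has no reason to complain as long as every fitted mean stays positive, which is consistent with the behaviour described in the first message.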



Re: [R] Exact poisson confidence intervals

2005-01-14 Thread Federico Gherardini
Thanks everybody for their answers!
Cheers,
fede


[R] Exact poisson confidence intervals

2005-01-13 Thread Federico Gherardini
Hi all,
Is there any R function to compute exact confidence limits for a Poisson 
distribution with a given Lambda?

Thanks in advance
Federico
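
One common reading of this question is: given an observed count, what are exact confidence limits for the Poisson mean? Under that assumption, a minimal sketch uses the chi-square relationship, and poisson.test in the stats package reports an interval of the same kind. The count and the confidence level below are hypothetical.

```r
x     <- 10     # hypothetical observed count
alpha <- 0.05   # 1 - confidence level

# Exact limits for the Poisson mean via chi-square quantiles
lower <- if (x == 0) 0 else qchisq(alpha / 2, 2 * x) / 2
upper <- qchisq(1 - alpha / 2, 2 * (x + 1)) / 2
c(lower = lower, upper = upper)

# The same kind of interval from poisson.test()
poisson.test(x, conf.level = 1 - alpha)$conf.int
```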


Re: [R] Error using glm with poisson family and identity link

2004-11-26 Thread Federico Gherardini
Thanks very much to everybody for your replies and comments!
Cheers,
Federico


[R] Error using glm with poisson family and identity link

2004-11-25 Thread Federico Gherardini
Hi all
I'm trying to use the function glm from the MASS package to do the 
following fit.

fit <- glm(FP ~ rand, data = tab, family = poisson(link = identity), 
    subset = rand = 1)
(FP is >= 0)

but I get the following error
Error: no valid set of coefficients has been found:please supply 
starting values
In addition: Warning message:
NaNs produced in: log(x)

in contrast if I fit a model without intercept
fit <- glm(FP ~ rand - 1, data = tab, family = poisson(link = 
    identity), subset = rand = 1)

everything goes fine.
Now my guess is that the points naturally have a negative intercept so 
the error is produced because I'm using the poisson distribution for the 
y and negative values are of course not admitted. Am I right?
Also if this is the cause, shouldn't the function always try to do the 
best fit given the parameters? I mean shouldn't it fit a model with 
intercept 0 anyway and report it as a bad fit?

Thanks
Federico
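
The error message above asks for starting values, which glm accepts through its start argument. One possible way to get them, sketched here on simulated data, is to take the coefficients of an ordinary least-squares fit, provided they give a positive mean for every observation. The data and names are assumptions, and this is not guaranteed to work for every data set.

```r
set.seed(1)

# Illustrative counts whose mean is linear in x
x <- rep(1:40, each = 3)
y <- rpois(length(x), 5 + 0.8 * x)

# Ordinary least squares as a rough first guess
ols <- lm(y ~ x)

# The guess is only usable if it implies positive means everywhere
stopifnot(all(fitted(ols) > 0))

fit <- glm(y ~ x, family = poisson(link = "identity"), start = coef(ols))
coef(fit)
```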


[R] Poisson regression

2004-10-29 Thread Federico Gherardini
Dear all,
First of all sorry if this is a dumb question...
I'm trying to fit a linear model between the logarithm of two numerical 
variables (log(y) ~ log(x)). A log-log plot shows that the variance of 
log(y) is decreasing with the mean of log(x), in other words the points 
are quite dispersed for low values of log(x) and approach a straight 
line as log(x) increases. I have tried the following glm model
fit <- glm(y ~ log(x), data = tab, family = poisson)
The residuals seem very good but I have a doubt: I have used y and not 
log(y) in the model formula because, as far as I understand, the 
Poisson regression assumes a logarithmic transformation of the response. 
Is this correct? I mean, is it correct to draw a y ~ log(x) Poisson 
regression line on a log(y) ~ log(x) xyplot, or am I confusing the two 
things?

Thank you very much
Federico
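
As a hedged illustration of the point in question: with the default log link, the Poisson model describes the log of the expected response, log(E[y]) = a + b * log(x), and does not transform the observed y. The fitted mean curve is therefore exp(a + b * log(x)), which is a straight line on log-log axes. The simulated data and names below are assumptions.

```r
set.seed(7)

# Illustrative data with E[y] = exp(1 + 0.7 * log(x))
x <- runif(500, min = 1, max = 100)
y <- rpois(500, exp(1 + 0.7 * log(x)))

fit <- glm(y ~ log(x), family = poisson)
b   <- coef(fit)

# Fitted mean on the original scale of y, on a grid of x values
xs         <- seq(1, 100, length.out = 50)
mean_curve <- exp(b[1] + b[2] * log(xs))

# On log-log axes this curve is a straight line
plot(log(xs), log(mean_curve), type = "l")
```

Note that the line describes log(E[y]) and not the mean of log(y), so it need not pass through the middle of the points on a log(y) ~ log(x) plot, especially where the counts are small.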


[R] Combining columns of different length

2004-10-26 Thread Federico Gherardini
Hi all,
Simple and direct question
Is it possible to add a shorter column to a data frame or matrix in such 
a way that the missing values are replaced with NAs?
For example suppose I have

3   2
4   2
5   8
and I want to add a column
3
3
to get...
3   2   3
4   2   3
5   8   NA
Thanks
Federico
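
One way to do this, sketched here with the numbers from the example above, is to extend the shorter vector with the length replacement function, which pads with NA, and then bind it on. The object names are made up for illustration.

```r
# The 3 x 2 matrix from the example
m <- cbind(c(3, 4, 5), c(2, 2, 8))

# The shorter column to add
new_col <- c(3, 3)

# Extending a vector with length<- fills the extra positions with NA
length(new_col) <- nrow(m)

out <- cbind(m, new_col)
out
```

The same padding step works before adding a column to a data frame.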


Re: [R] Testing for normality of residuals in a regression model

2004-10-16 Thread Federico Gherardini
Prof Brian Ripley wrote:
> However, stats 901 or some such tells you that if the distributions have 
> even slightly longer tails than the normal you can get much better 
> estimates than OLS, and this happens even before a test of normality 
> rejects on a sample size of thousands.
>
> Robustness of efficiency is much more important than robustness of 
> distribution, and I believe robustness concepts should be in stats 101.
> (I was teaching them yesterday in the third lecture of a basic course, 
> albeit a graduate course.)
   

This is a very interesting discussion. So you are basically saying that 
it's better to use robust regression methods, without having to worry 
too much about the distribution of residuals, instead of using standard 
methods and doing a lot of tests to check for normality? Did I get your 
point?

Cheers,
Federico
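
As a rough sketch of what using a robust regression method can look like in R, the example below compares ordinary least squares with rlm from the MASS package on simulated data with long-tailed errors. The data, the choice of rlm and its default settings are assumptions made for illustration, not a recommendation taken from the thread.

```r
library(MASS)

set.seed(3)

# Illustrative data: a straight line plus long-tailed (t, 2 df) errors
x <- seq(0, 10, length.out = 200)
y <- 2 + 0.5 * x + rt(200, df = 2)

ols <- lm(y ~ x)    # ordinary least squares
rob <- rlm(y ~ x)   # M-estimation with rlm's default settings

# Coefficients side by side
rbind(ols = coef(ols), robust = coef(rob))
```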


[R] Testing for normality of residuals in a regression model

2004-10-15 Thread Federico Gherardini
Hi all,
Is it possible to have a test value for assessing the normality of 
residuals from a linear regression model, instead of simply relying on 
qqplots?
I've tried to use fitdistr to try and fit the residuals with a normal 
distribution, but fitdistr only returns the parameters of the 
distribution and the standard errors, not the p-value. Am I missing 
something?

Cheers,
Federico
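
For reference, a test statistic and p-value of the kind asked about can be obtained with shapiro.test from the stats package, applied to the residuals of the fitted model. The sketch below uses simulated data, and the names are assumptions. Whether such a test is useful is debated in the replies to this message.

```r
set.seed(11)

# Illustrative linear model with normal errors
x   <- runif(100)
y   <- 1 + 2 * x + rnorm(100)
fit <- lm(y ~ x)

# Shapiro-Wilk test on the residuals
sw <- shapiro.test(residuals(fit))
sw$statistic
sw$p.value

# The graphical check mentioned in the question
qqnorm(residuals(fit))
qqline(residuals(fit))
```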


Re: [R] Testing for normality of residuals in a regression model

2004-10-15 Thread Federico Gherardini
Thank you very much for your suggestions! The residuals come from a gls 
model, because I had to correct for heteroscedasticity using a weighted 
regression... can I simply apply one of these tests (like shapiro.test) 
to the standardized residuals from my gls model?

Cheers,
Federico
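
A minimal sketch of the mechanics, assuming the model comes from gls in the nlme package with a variance function as weights: the residuals method for gls fits accepts type = "normalized", and the result can be passed to shapiro.test. The simulated data and the choice of varPower are assumptions, and the sketch says nothing about whether the test is appropriate here.

```r
library(nlme)

set.seed(5)

# Illustrative heteroscedastic data: the error sd grows with x
d   <- data.frame(x = runif(150, 1, 10))
d$y <- 1 + 2 * d$x + rnorm(150, sd = 0.5 * d$x)

# Weighted fit with a power-of-x variance function (an assumption)
fit <- gls(y ~ x, data = d, weights = varPower(form = ~ x))

# Residuals standardized by the estimated variance structure
r <- as.numeric(residuals(fit, type = "normalized"))

shapiro.test(r)
```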


Re: [R] Testing for normality of residuals in a regression model

2004-10-15 Thread Federico Gherardini
Berton Gunter wrote:
> Quite right, John!
> I have 2 additional questions:
> 1) Why test for normality of residuals? Suppose you reject -- then what?
> (residual plots may give information on skewness, multi-modality, data
> anomalies that can affect the data analysis).
 

Because I want to know if my model satisfies the basic assumptions of 
regression theory... in other words I want to know if I can trust my 
model.

Cheers,
Federico
> 2) Why test for normality? Is it EVER useful? Suppose you reject -- then
> what?
> (I am tempted to add a 3rd question -- why test at all? -- but that is
> perhaps too iconoclastic and certainly off topic. Let the hounds remain
> leashed for now.)
> Cheers,
> -- Bert Gunter
> Genentech Non-Clinical Statistics
> South San Francisco, CA
 



Re: [R] Testing for normality of residuals in a regression model

2004-10-15 Thread Federico Gherardini
Berton Gunter wrote:
> Exactly! My point is that normality tests are useless for this purpose for
> reasons that are beyond what I can take up here.

Thanks for your suggestions, I understand that! Could you possibly give 
me some (not too complicated!) links so that I can investigate this 
matter further?

Cheers,
Federico
> Hints: Balanced designs are
> robust to non-normality; independence (especially clustering of subjects
> due to systematic effects), not normality is usually the biggest real
> statistical problem; hypothesis tests will always reject when samples are
> large -- so what!; trust refers to prediction validity which has to do
> with study design and the validity/representativeness of the current data to
> future.
>
> I know that all the stats 101 tests say to test for normality, but they're
> full of baloney!
> Of course, this is free advice -- so caveat emptor!
> Cheers,
> Bert
 



Fw: [R] Another big data size problem

2004-07-28 Thread Federico Gherardini
On Wed, 28 Jul 2004 09:53:08 +0200
Uwe Ligges [EMAIL PROTECTED] wrote:

 
> If your data is numeric, you will need roughly
>
> 1220 * 2 * 8 / 1024 / 1024  ~~ 200 MB
>
> just to store one copy in memory. If you need more than two copies, your 
> machine with its 512MB will start to use swap space.
> Hence either use a machine with more memory, or don't use all the data 
> at once in memory, e.g. by making use of a database.
>
> Uwe Ligges
 
Well, I'd be happy if it used swap space instead of locking itself up! By the way, I 
don't think that the problem is entirely related to memory consumption. I have written 
a little function that reads the data row by row and does a print each time, to 
monitor its functioning. Everything starts to crawl to a horrible slowness long 
before my memory is exhausted, i.e. after about 100 lines. It seems like R has 
problems managing very large objects per se? Anyway, I'll try to upgrade to 1.9 and 
see what happens...

Ernesto Jardim wrote:

> Hi,
>
> It looks like you're running linux !? if so it will be quite easy to
> create a table in MySQL, upload all the data into the database and
> access the data with RMySQL (it's _very_ fast). Probably there will be
> some operations that you can do on MySQL instead of eating memory in
> R.
>
> Regards
>
> EJ

I'll give that a try.

Thanks everybody for their time

fede



Re: [R] Another big data size problem

2004-07-28 Thread Federico Gherardini
On Wed, 28 Jul 2004 13:28:20 +0100
Ernesto Jardim [EMAIL PROTECTED] wrote:


> Hi,
>
> When you're writing a table to MySQL you have to be careful if the
> table is created by RMySQL. The fields definition may not be the most
> adequate and there will be no indexes in your table, which makes the
> queries _very_ slow.
 
So, if I understood correctly, if you want to use SQL you'll have to upload the table 
in SQL, directly from MySQL without using R at all, and then use RMySQL to read the 
elements in R?

Uwe Ligges [EMAIL PROTECTED] wrote:

> Note that it is better to initialize the object to full size before 
> inserting -- rather than using rbind() and friends which is indeed slow
> since it needs to re-allocate much memory for each step.

Do you mean something like this?

tab <- matrix(rep(0, 1227 * 2), 1227, 2, byrow = TRUE)

for (i in 0:num.lines)
    tab[i + 1, ] <- scan(mytab, nlines = 1, what = "PS", skip = i)


The above doesn't get very far either... it seems that, once it has created the table, 
it becomes so slow that it's unusable. I'll have to try this with more RAM by the way.

Cheers,

fede
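
One likely reason the loop above slows down is that calling scan with a file name and skip = i reopens the file and skips i lines on every iteration, so the total work grows with the square of the number of lines. A possible alternative, sketched here on a small temporary file, is to open a connection once and read from it sequentially into the preallocated matrix. The file, its dimensions and the names are assumptions for illustration.

```r
# Illustrative file: 200 rows by 5 columns of numbers
n_row <- 200
n_col <- 5
orig  <- matrix(round(runif(n_row * n_col), 3), n_row, n_col)
path  <- tempfile(fileext = ".txt")
write.table(orig, path, row.names = FALSE, col.names = FALSE)

# Preallocate the full matrix, as suggested in the quoted advice
tab <- matrix(NA_real_, n_row, n_col)

# Open the connection once; each readLines() call continues
# from where the previous one stopped, so nothing is re-read.
con <- file(path, open = "r")
for (i in seq_len(n_row)) {
  line     <- readLines(con, n = 1)
  tab[i, ] <- as.numeric(strsplit(line, " ", fixed = TRUE)[[1]])
}
close(con)
```

Reading several hundred lines per call (a larger n in readLines) would reduce the loop overhead further.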



Re: [R] Another big data size problem

2004-07-28 Thread Federico Gherardini
Thanks for your suggestions,

On Wed, 28 Jul 2004 17:04:57 +0200
Henrik Bengtsson [EMAIL PROTECTED] wrote:

 
> Anyway, I think you also should read the help for scan(). What do you want
> with argument 'what=PS'? PS is not a valid data type; 'what' does not
> specify a name of field/column to be read.

From the scan help page...

   what: the type of 'what' gives the type of data to be read.

It seems to me that I had read somewhere (maybe on the mailing list archives), that 
'what' was supposed to be a sort of example of the kind of data you had to read... so 
I put a character string because I wanted the data to be read as character because I 
had a column of factors. I know this is not a great way to do it (better to have a matrix 
made up of numbers only, instead of having to subsequently convert columns of character 
strings to numbers) but I wanted to do a quick test without having to rearrange my file.

Cheers,

fede
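
To illustrate the behaviour of the what argument discussed above: scan uses the type of what, not its value, so any character string makes it read every field as character, and a list describes one record with a type per field. The input text and field names below are made up for the example.

```r
# Two illustrative records: a label followed by two numbers
txt <- "a 1.5 2\nb 3.0 4"

# Any character value as 'what' reads every field as character
fields <- scan(text = txt, what = character(), quiet = TRUE)
fields

# A list gives one type per field, so mixed columns keep their types
rec <- scan(text = txt, what = list(id = "", v1 = 0, v2 = 0), quiet = TRUE)
rec$id
rec$v1
```

With the list form, the label column stays character and the numeric columns need no later conversion.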



[R] Another big data size problem

2004-07-27 Thread Federico Gherardini
Hi all,

I'm trying to read a 1220 * 2 table in R but I'm having a lot of problems. Basically 
what happens is that R.bin starts eating all my memory until it gets to about 90%. At 
that point it locks itself in an uninterruptible sleep status (at least that's what 
top says) where it just sits there barely using the CPU at all but keeping its tons of 
memory. I've tried read.table and scan but neither of them did the trick. I've also 
tried a horrible hack, reading one line at a time and gradually combining 
everything in a matrix using rbind... nope! It seems I can read up to 500 lines in a 
*decent* time but nothing more. The machine is a 3 GHz P4 with HT and 512 MB RAM 
running R-1.8.1. Will I have to write a little C program myself to handle this thing 
or am I missing something?

Thanks in advance for your help,

fede
