date:20080426

[R] how to modify the histogram's frequencies

2008-04-26 Thread marcelha mukim

Hi:


I have been trying to figure out a simple way to plot an histogram whose
frequencies are modified by a factor (associated with each value).


Actually what I did was plotting each value with its modified
frequency(using plot($values, $frequency, type="h")), but it doesnt take
into account the probability distribution(which is necessary).


It will be very helpful from you any approach to solve this problem.

thank you,

best


m.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] resolution (dpi) problem

2008-04-26 Thread Prof Brian Ripley

On Sat, 26 Apr 2008, [EMAIL PROTECTED] wrote:

> On 26-Apr-08 19:30:35, Chris Walker wrote:
>> I am using R 2.4.1 with Windows XP.
>>
>> I use the plot command in a fairly simple script and I use
>> the right mouse click on the plot and save as a postscript
>> file. I used the resultant file in a paper which was submitted
>> electronically. However, I get the following response from the
>> journal:
>>
>> Your manuscript has been unsubmitted because you failed to meet
>> the submission guidelines as indicated below:
>>
>> -Your figures must be submitted in TIFF or EPS format according
>> to the following minimum resolutions:
>>
>> 1200 dpi for black and white line art (simple bar graphs, charts,
>> etc.) 300dpi for halftones (black and white photographs) 600dpi
>> for combination halftones (Photographs that also contain line art
>> such as labeling or thin lines)
>>
>> Does anyone know how to produce the correct settings for the
>> journal (i.e.1200 dpi)?
>>
>> Thankyou
>> Chris W
>
> I'm about to swim in (for me) murky waters here, since I don't
> use R on Windows, so things may happen on that platform which
> I'm not aware of.
>
> But I just want to make general comments about PostScript and R.
>
> 1. R's postscript() device produces EPS, so that bit should
>   be satisfied.

Only if there is a single page in the file.  The help file says

The postscript produced for a single R plot is EPS (_Encapsulated
PostScript_) compatible, and can be included into other documents,

> 2. Normally, except when a graphic has been converted from a
>   bit-mapped format, a PostScript (or EPS) file does not
>   have any intrinsic resolution, so long as what it represents
>   is vector graphics (which includes the rendering of letters,
>   numerals, symbols, etc.). Any resolution applying to the
>   result when a PS/EPS file is displayed/printed depends on
>   the resolution of the "end device" (screen/printer etc.)
>   which does the rendering. In principle, PS/EPS has "infinite"
>   resolution (for instance, printing from EPS to photograpic film
>   using a laser beam could have resolution as fine as 100,000).

Actually, there is an intrinsic resolution since coordinates in the file 
are recorded to a finite accuracy.  In R's case this is 0.01 bp, that is 
7200 ppi (dpi is incorrect terminology for anything involving 
halftones).

> 3. It is of course possible that the software generating the
>   graphic implements certain things as bit-maps in the first
>   place, in which case what goes in the PS/EPS file will
>   inevitably do the same.

Possible in general, but not in R's postscript() device (which is what 
saving from the windows() menu uses).

> You do not say what sort of graphic you have plotted, so one
> cannot tell whether (3) applies. However, my feeling is that
> either the journal has got the wrong impression of what you
> sent them, or what was intended to be an EPS file in fact
> got created/converted to some other (bit-mapped) format
> before you extracted it and sent it off.
>
> Sorry not to be more specifically helpful, but I especially
> wanted to make points (1) and (2) above, for clarification.
>
> Best wishes,
> Ted.
>
> 
> E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
> Fax-to-email: +44 (0)870 094 0861
> Date: 26-Apr-08   Time: 23:22:36
> -- XFMail --
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] resolution (dpi) problem

2008-04-26 Thread Prof Brian Ripley

On Sat, 26 Apr 2008, Marc Schwartz wrote:

> See comments inline:
>
> Chris Walker wrote:
>> I am using R 2.4.1 with Windows XP.
>
> First, you are using a version of R that is a year and a half and 6
> releases out of date.  Version 2.7.0 was just released this past week.
> You can download it from your nearest CRAN mirror.
>
>> I use the plot command in a fairly simple script and I use the right mouse
>> click on the plot and save as a postscript file. I used the resultant file
>> in a paper which was submitted electronically. However, I get the following
>> response from the journal:
>>
>> Your manuscript has been unsubmitted because you failed to meet the
>> submission guidelines as indicated below:
>>
>> -Your figures must be submitted in TIFF or EPS format according to the
>> following minimum resolutions:
>>
>> 1200 dpi for black and white line art (simple bar graphs, charts, etc.)
>> 300dpi for halftones (black and white photographs) 600dpi for combination
>> halftones (Photographs that also contain line art such as labeling or thin
>> lines)
>>
>> Does anyone know how to produce the correct settings for the journal (i.e.
>> 1200 dpi)?
>
> Their comments about resolution apply to TIFF files and not EPS files,
> which are resolution independent.
>
> It has been several years since I last used the Windows version of R,
> but if the 'File Save As' menu for the plot indicates Postscript and not
> Encapsulated Postscript, that is your problem.

Whatever the menu says, it does produce Encapsulated PostScript (which is 
a special case of PostScript)

R's postscript is always encapsulated, but EPS also requires a single 
frame and a suitable header.  If you save from the menu it is

%!PS-Adobe-3.0 EPSF-3.0
...

> You can use dev.copy2eps(...) after you have plotted the graphic to the
> screen device, or better, plot directly to an EPS file by surrounding
> your plot code with:
>
>   postscript("FileName.eps", width = X, height = Y, paper = "special",
>  horizontal = FALSE, onepage = FALSE)
>
>  YOUR PLOT CODE HERE
>
>   dev.off()
>
>
> See ?dev.copy and/or ?postscript for more help.
>
> And...be sure to install the latest version of R.  :-)

Not least because it can produce a 1200dpi TIFF figure via the tiff() 
device.


>
> HTH,
>
> Marc Schwartz

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] integration error when I use "optim" and "integrate" simultaneously

2008-04-26 Thread Charles C. Berry

On Sat, 26 Apr 2008, kathie wrote:

>
> Dear R users,
>
> When I use two functions, 'optim' and 'integrate', simultaneously, I always
> get an error like this
>
> --
> numint = function(z) {
>   dlnorm(z,mu[1],sqrt(exp(g[1]))) *
> dnorm((z-mu[2])/sqrt(exp(g[2])))/sqrt(exp(g[2]))
> }
>
> integrate(numint,lower=0,upper=Inf)$value
>
>
> Error in integrate(numint, lower = 0, upper = Inf) : non-finite function
> value
> -
>
> Dr. Ripley said for this problem long time ago
>
> "You are trying to use a derivative-based optimization method without
> supplying derivatives.  This will use numerical approximations to the
> derivatives, and your objective function will not be suitable as it is
> internally using adaptive numerical quadrature and hence is probably not
> close enough to a differentiable function (it may well have steps)."
>
>
> So, I am trying to avoid this 'integrate' in R, but I have no idea.
>
> Any suggestion will be greatly appreciated.

The implied advice in the quotation above is to use the derivatives.

The function you have is the product of a lognormal and a normal density. 
You need the derivatives of those w.r.t their means and variances. You can 
probably look these up someplace if the challenge of hacking them out is 
too daunting. (Hint: how do you find the MLEs for these densities?)

You can appreciate why integrate() sometimes complains if you spend some 
time looking at

curve( numint, 0, uplim )

for suitably chosen values of mu and g and uplim. Say

mu <- c(1,1)
g <- c(10,1)
uplim <- 0.01.

But replacing integrate with a better quadrature rule for this particular 
problem probably will not smooth it out enough to substitute for 
derivatives.

HTH,

Chuck

>
> Kathryn Lord
> -- 
> View this message in context: 
> http://www.nabble.com/integration-error-when-I-use-%22optim%22-and-%22integrate%22-simultaneously-tp16918109p16918109.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix from list

2008-04-26 Thread Greg Snow

So you want mylist[[1:2]] to return something (other than an error) when mylist 
is a list.

What if mylist <- list( 1:10, 101:110 , some.other.things) so the first 2 
elements are vectors of length 10.  then mylist[1:2] makes sense as still being 
a list with the 2 vectors.  What should mylist[[1:2]] return in this case?  One 
vector of length 20? or should it return a matrix with 2 columns and 10 rows.  
Both of those make sense, how should the computer decide between them (it may 
be obvious to you knowing the context, but how can the computer decide).  You 
can do either of these in R by giving the computer a bit more information 
(as.matrix or unlist).  What if one of the vectors is character and one is 
numeric, what should the return object be?  What if the first element of mylist 
is the return object from "lm" and the second element is a function, what 
should mylist[[1:2]] return then?

If you can come up with a set of rules that will cover every possible case, 
then someone may be willing to implement those rules.  But while it is not 
obvious what to return without giving extra information, it is better to 
require the extra information through other functions.

From: [EMAIL PROTECTED] on behalf of Olivier Lefevre
Sent: Sat 4/26/2008 9:43 AM
To: [EMAIL PROTECTED]
Subject: Re: [R] matrix from list

Olivier Lefevre wrote:
> Anyway you are right that it would still return the kind of object, only
> subsetted, which is not I want.

I mean [] would do that; I know [[]] doesn't. Yet I still don't see why one
accepts vector arguments but not the other: they are both indexing
operators. It is such inconsistencies that make languages hard to learn.

-- O.L.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] system.time()

2008-04-26 Thread jim holtman

The first two numbers will not add up to the third.  The third is the
elapsed time (wall clock) and the first two are CPU time; even though
they have what appears to be the same units (seconds), they can not
really be compared.  If, for example, was script was taking input from
the console, you will get something like this:

> system.time({
+ x <- scan("")
+ })
1: 12345
2:
Read 1 item
   user  system elapsed
   0.000.067.34
>

Here it took 7 seconds for me to enter the data, on only 0.06 seconds
of CPU (all system time -- probably the 'read' function).

The differences in your numbers were the CPU is larger than the
elapsed might be due to the resolution of the time since it was a very
short time, or if you are running an a multiprocessor, and the
application can run multithreaded, you will see the same thing happen
-- e.g., if you have 2 CPU and the application could multithread and
it was CPU bound, then in 10 seconds of elapsed time you might see 20
seconds of CPU time being used.

HTH

On Sat, Apr 26, 2008 at 8:32 PM, Fernando Saldanha <[EMAIL PROTECTED]> wrote:
> I thought system.time() would return three numbers, the first two
> adding up to the third one. However, this does not hold in my system
> running Windows Vista Home Premium, with an Intel Core 2 Duo
> processor. For example, using the code in the help for system.time()
> (with one zero added), I got:
>
> > system.time(for(i in 1:100) mad(runif(1)))
>   user  system elapsed
>   0.270.000.25
>
> Is this a problem with my system clock or am I missing something?
>
> Thanks.
>
> FS
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] integration error when I use "optim" and "integrate" simultaneously

2008-04-26 Thread kathie


Dear R users,

When I use two functions, 'optim' and 'integrate', simultaneously, I always
get an error like this

--
numint = function(z) {
dlnorm(z,mu[1],sqrt(exp(g[1]))) *
dnorm((z-mu[2])/sqrt(exp(g[2])))/sqrt(exp(g[2]))
}

integrate(numint,lower=0,upper=Inf)$value


Error in integrate(numint, lower = 0, upper = Inf) : non-finite function
value
-

Dr. Ripley said for this problem long time ago

"You are trying to use a derivative-based optimization method without
supplying derivatives.  This will use numerical approximations to the
derivatives, and your objective function will not be suitable as it is 
internally using adaptive numerical quadrature and hence is probably not
close enough to a differentiable function (it may well have steps)." 


So, I am trying to avoid this 'integrate' in R, but I have no idea.  

Any suggestion will be greatly appreciated.

Kathryn Lord 
-- 
View this message in context: 
http://www.nabble.com/integration-error-when-I-use-%22optim%22-and-%22integrate%22-simultaneously-tp16918109p16918109.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lists of data.frame

2008-04-26 Thread jim holtman

I think this is what you want.  You need to look at the help file for
?'[' and ?'[['.  Also understand the differences between a data.frame
and a list.  in the case of defining qn you need to use a list and not
a data.frame because here is what happens with a data.frame:

> q1 <- data.frame(q=d1, n="a")
> str(q1)
'data.frame':   1 obs. of  3 variables:
 $ q.x: num 1
 $ q.y: num 2
 $ n  : Factor w/ 1 level "a": 1

Which is probably not want you want.  So the following should give you
the results you want

>  d1 <- data.frame(x=1, y=2)
>  d2 <- data.frame(x=1, y=3)
>  d3 <- data.frame(x=21, y=3)
>
>
> q1 <- list(q=d1, n="a")
> q2 <- list(q=d2, n="a")
> q3 <- list(q=d3, n="a")
>
>
> v <- list()
> v[[1]] <- q1
> v[[2]] <- q2
> v[[3]] <- q3
> str(v)
List of 3
 $ :List of 2
  ..$ q:'data.frame':   1 obs. of  2 variables:
  .. ..$ x: num 1
  .. ..$ y: num 2
  ..$ n: chr "a"
 $ :List of 2
  ..$ q:'data.frame':   1 obs. of  2 variables:
  .. ..$ x: num 1
  .. ..$ y: num 3
  ..$ n: chr "a"
 $ :List of 2
  ..$ q:'data.frame':   1 obs. of  2 variables:
  .. ..$ x: num 21
  .. ..$ y: num 3
  ..$ n: chr "a"
> #access
> v[[2]]$q$x
[1] 1


On Sat, Apr 26, 2008 at 8:04 PM, Worik R <[EMAIL PROTECTED]> wrote:
> I cannot find out how to build data structures as lists of data structures.
>
> I want to do...
> r-help@r-project.org
>
>  d1 <- data.frame(x=1, y=2)
>  d2 <- data.frame(x=1, y=3)
>  d3 <- data.frame(x=21, y=3)
>
>
> q1 <- data.frame(q=d1, n="a")
> q2 <- data.frame(q=d2, n="a")
> q3 <- data.frame(q=d3, n="a")
>
>
> v <- vector() or list() or whatever
>
> v[1] <- q1
> v[2] <- q2
> v[3] <- q3
>
> then access d2$x as: v[2]$q$x
>
> But I cannot get any thing like this.
>
> I can find no examples in the manuals.
>
>
> cheers
> Worik
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] system.time()

2008-04-26 Thread Fernando Saldanha

I thought system.time() would return three numbers, the first two
adding up to the third one. However, this does not hold in my system
running Windows Vista Home Premium, with an Intel Core 2 Duo
processor. For example, using the code in the help for system.time()
(with one zero added), I got:

> system.time(for(i in 1:100) mad(runif(1)))
   user  system elapsed
   0.270.000.25

Is this a problem with my system clock or am I missing something?

Thanks.

FS

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] reading information

2008-04-26 Thread Manoel Santos

>  p<-function(x){
+  for(i in 1:teste2[k]+1){
+  p2=c+teste2[3+i]*x^(i-1)
+ c=p2}
+ return(c)}
> p3<-p(x)

if i make this i can use a p(x) calling a function in anohter function? no ,
right?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Lists of data.frame

2008-04-26 Thread Worik R

I cannot find out how to build data structures as lists of data structures.

I want to do...
r-help@r-project.org

 d1 <- data.frame(x=1, y=2)
 d2 <- data.frame(x=1, y=3)
 d3 <- data.frame(x=21, y=3)


q1 <- data.frame(q=d1, n="a")
q2 <- data.frame(q=d2, n="a")
q3 <- data.frame(q=d3, n="a")


v <- vector() or list() or whatever

v[1] <- q1
v[2] <- q2
v[3] <- q3

then access d2$x as: v[2]$q$x

But I cannot get any thing like this.

I can find no examples in the manuals.


cheers
Worik

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] reading information

2008-04-26 Thread Manoel Santos

teste2
  theta_chapeu f_estrela k a0 a1 a2 a3
13.21.2  3  2  1  4   5


>  p<-function(k){
+ i<-1
+  for(i in 1:k){
+  p2=teste2[3+i]}
+ return(p2)}

> > p3- p3
  a2
1  4

> p3[1]
  a2
1  4


how i can make an array with the polinomyal function?
lilke p3[1]=1 p3[2]=4 p3[3]=5
tks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Calling a stored model within the predict() function

2008-04-26 Thread Jim_S


Hi all,

First of all, I'm a novice R user (less that a week), so perhaps my code
isn't very efficient.

Using the MBoost package I created a model using the following command and
saved it to a file for later use:

model <- gamboost(fpfm,data=SampleClusterData,baselearner="bbs") # Creating
a model
save(model,file="model.RData") # Saving a model


After this, during a new R session, I want to deploy this model:

setwd("Q:/Program Files/R/library/mboost/data") 
NewData <- read.table("NewData.txt",header=TRUE,sep=,",") #Calling a new
data set
setwd("Q:/Program Files/R/library/mboost")
model <- load("model.rdata") # Loading the model

This won't work. When I check the content of the 'model' object, the only
screen output I see is :

"model
[1]model"

...and that the actual file size from 'model.rdata' is exactly 4,100 kb (is
this a limit?!)


Then I try to call this model object with the predict() function:

CRPredict <- predict(model, newdata= SampleClusterData) # Make the
predictions

This results in the following error message:

"Error in UseMethod("predict") : no applicable method for "predict"  "

Is the a correct way to store and call a model or is it only possible to
create & deploy a model during the same R session?

Thanks!

Jim



-- 
View this message in context: 
http://www.nabble.com/Calling-a-stored-model-within-the-predict%28%29-function-tp16918111p16918111.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] quasi-random sequences

2008-04-26 Thread Stas Kolenikov

You might want to shuffle coordinates independently to get rid of the
diagonals. Otherwise what quasi-random sequence guarantee are upper
boundaries on the coverage errors, but not anything nice-looking and
irregular. Sobol' sequences, even though they are theoretically
superior to some others (e.g., Halton sequences more popular among
economists), are especially nasty in producing bands and bricks on the
low dimensional plots.

 Among statisticians, Art Owen from Stanford is almost the only one
interested in this sort of stuff (referred to as quasi-Monte Carlo, in
his field(s)). You might have better luck on a physics list with a
question like yours.

 On Sat, Apr 26, 2008 at 5:22 AM, baptiste Auguié <[EMAIL PROTECTED]> wrote:
 > Dear list useRs,
 >
 >  I have to generate a random set of coordinates (x,y) in [-1 ; 1]^2
 >  for say, N points. At each of these points is drawn a circle (later
 >  on, an ellipse) of random size, as in:
 >
 >
 >  > N <- 100
 >  >
 >  > positions <- matrix(rnorm(2 * N, mean = 0 , sd= 0.5), nrow=N)
 >  > sizes<-rnorm(N, mean = 0 , sd= 1)
 >  > plot(positions,type="p",cex=sizes)
 >
 >
 >  My problem is to avoid collisions (overlap, really) between the
 >  points. I would like some random pattern, but with a minimum
 >  exclusion distance. In looking up "Numerical recipes in C", I found
 >  out about some Sobol quasi-random sequences, which one can call from
 >  the gsl package,
 >
 >
 >  > library(gsl)
 >  >
 >  > g <- qrng_alloc(type="sobol",dim=2)
 >  > qrng_get(g,n= N) ->xy
 >  >
 >  > plot((xy),t="p",cex=0.5)
 >
 >  but this does not look very random: I clearly see some pattern
 >  (diagonals, etc...), and even the non-overlapping condition is not
 >  impressive.
 >
 >  One (painful) way I can foresee is to check the distance between each
 >  symbol and the others, and move the overlapping ones in a recursive
 >  manner. Before delving into this, I wanted to check I'm not
 >  overlooking something in the rgl quasi-random sequences, or missing a
 >  more obvious way to generate such patterns. Perhaps solving an
 >  electrostatic problem with a potential both attractive at long
 >  distances and repulsive at short distances is a better way? I have a
 >  vague recollection of hearing that somewhere to position points
 >  evenly on a sphere.
 >

 --
 Stas Kolenikov, also found at http://stas.kolenikov.name
 Small print: I don't check Gmail account regularly.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] resolution (dpi) problem

2008-04-26 Thread Ted Harding

On 26-Apr-08 19:30:35, Chris Walker wrote:
> I am using R 2.4.1 with Windows XP.
> 
> I use the plot command in a fairly simple script and I use
> the right mouse click on the plot and save as a postscript
> file. I used the resultant file in a paper which was submitted
> electronically. However, I get the following response from the
> journal:
> 
> Your manuscript has been unsubmitted because you failed to meet
> the submission guidelines as indicated below:
> 
> -Your figures must be submitted in TIFF or EPS format according
> to the following minimum resolutions:
> 
> 1200 dpi for black and white line art (simple bar graphs, charts,
> etc.) 300dpi for halftones (black and white photographs) 600dpi
> for combination halftones (Photographs that also contain line art
> such as labeling or thin lines)
> 
> Does anyone know how to produce the correct settings for the
> journal (i.e.1200 dpi)?
> 
> Thankyou
> Chris W

I'm about to swim in (for me) murky waters here, since I don't
use R on Windows, so things may happen on that platform which
I'm not aware of.

But I just want to make general comments about PostScript and R.

1. R's postscript() device produces EPS, so that bit should
   be satisfied.

2. Normally, except when a graphic has been converted from a
   bit-mapped format, a PostScript (or EPS) file does not
   have any intrinsic resolution, so long as what it represents
   is vector graphics (which includes the rendering of letters,
   numerals, symbols, etc.). Any resolution applying to the
   result when a PS/EPS file is displayed/printed depends on
   the resolution of the "end device" (screen/printer etc.)
   which does the rendering. In principle, PS/EPS has "infinite"
   resolution (for instance, printing from EPS to photograpic film
   using a laser beam could have resolution as fine as 100,000).

3. It is of course possible that the software generating the
   graphic implements certain things as bit-maps in the first
   place, in which case what goes in the PS/EPS file will
   inevitably do the same.

You do not say what sort of graphic you have plotted, so one
cannot tell whether (3) applies. However, my feeling is that
either the journal has got the wrong impression of what you
sent them, or what was intended to be an EPS file in fact
got created/converted to some other (bit-mapped) format
before you extracted it and sent it off.

Sorry not to be more specifically helpful, but I especially
wanted to make points (1) and (2) above, for clarification.

Best wishes,
Ted.

E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861
Date: 26-Apr-08   Time: 23:22:36
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] resolution (dpi) problem

2008-04-26 Thread Marc Schwartz

See comments inline:

Chris Walker wrote:
> I am using R 2.4.1 with Windows XP.

First, you are using a version of R that is a year and a half and 6 
releases out of date.  Version 2.7.0 was just released this past week. 
You can download it from your nearest CRAN mirror.

> I use the plot command in a fairly simple script and I use the right mouse
> click on the plot and save as a postscript file. I used the resultant file
> in a paper which was submitted electronically. However, I get the following
> response from the journal:
> 
> Your manuscript has been unsubmitted because you failed to meet the
> submission guidelines as indicated below:
> 
> -Your figures must be submitted in TIFF or EPS format according to the
> following minimum resolutions:
> 
> 1200 dpi for black and white line art (simple bar graphs, charts, etc.)
> 300dpi for halftones (black and white photographs) 600dpi for combination
> halftones (Photographs that also contain line art such as labeling or thin
> lines)
> 
> Does anyone know how to produce the correct settings for the journal (i.e.
> 1200 dpi)?

Their comments about resolution apply to TIFF files and not EPS files, 
which are resolution independent.

It has been several years since I last used the Windows version of R, 
but if the 'File Save As' menu for the plot indicates Postscript and not 
Encapsulated Postscript, that is your problem.

You can use dev.copy2eps(...) after you have plotted the graphic to the 
screen device, or better, plot directly to an EPS file by surrounding 
your plot code with:

   postscript("FileName.eps", width = X, height = Y, paper = "special",
  horizontal = FALSE, onepage = FALSE)

  YOUR PLOT CODE HERE

   dev.off()

See ?dev.copy and/or ?postscript for more help.

And...be sure to install the latest version of R.  :-)

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] nested anova and multiple comparisons

2008-04-26 Thread Kingsford Jones

Hi Stephen,

On Sat, Apr 26, 2008 at 10:29 AM, Stephen Cole <[EMAIL PROTECTED]> wrote:
..
>
>  I have managed to accomplish my first two goals by analyzing the data
>  as a 3 level nested Anova with the following code
>
>  mod1 <- aov(Count~Region/Location/Site, data=data)
>
>  This allows me to get the MS for a Anova table.   However R does not
>  compute the correct F-statistics (because it uses the residual MS for
>  the denominator in all calculations which it shouldnt) so i manually
>  computed the F-stats and variance components by hand.
>

R's just doing what it was asked it to do.  The fact that it used the
residual MS isn't a surprise given your code.

>  >From reading the help guide i learned about and tried using the
>  Error(Region/Location/Site) command but then i can only get MS and no
>  F-statistics and still hand to compute the rest by hand.

Yes, if you want to use Method-of-Moments estimation to solve for
expected mean squares for the variance components and calculate
F-stats w/ the 'correct' MS, then then specifying error strata is
reasonable for balanced data (assuming you've met model assumptions).
However, I believe most statisticians these days are more inclined to
use (Restricted) Maximum Likelihood estimation (e.g. using lme or
lmer) because of the added fliexibility it provides (also it won't
produce negative variance estimates when the mean squared error is
larger than the mean square between the groups of interest)

As for why you are only getting MS and no F-stats, it's hard to say
without a reproducible example (see the posting guide).  In my
experience this will occur when there are insufficient degrees of
freedom for calculating an F-stat at a given level.

>
>  My problem now is that i would liek to use TukeyHSD for multiple
>  comparisons.  Howeber since R is unable to compute the correct F
>  statistics in this nested design i also think it is using the wrong MS
>  and df in calculating tukeys q. for example when i use
>

Again, it's not that R is unable to, it's that you asked R not to.

>  TukeyHSD(mod1, "Region")  i will get values however i do not think
>  they have been calculated correctly.
>
>  Furthermore when i use the Error(Region/Location/Site) term i can then
>  no longer use TukeyHSD as i get the error message that there is no
>  applicable use of tukey on this object.

Looking at the methods available for TukeyHSD shows that it will work
for an object of class "aov"

> methods(TukeyHSD)
[1] TukeyHSD.aov

But ?aov states:

Value:

 An object of class 'c("aov", "lm")' or for multiple responses of
 class 'c("maov", "aov", "mlm", "lm")' or for multiple error strata
 of class '"aovlist"'.  There are 'print' and 'summary' methods
 available for these.

So your model with error strata is of class "aovlist" not "aov".

>
>  i am just wondering if there is any way to use Multiple comparisons
>  with R in a nested design like mine.
>

I'm not sure but the package 'multcomp' might be of help.

>  I have thought about using lme or lmer but if i understand them right
>  with a balanced design i should be able to get the same result using
>  aov
>

Even with balanced data, you don't get the exact same thing with aov.
As mentioned above lme and lmer use different estimation methods.  One
of the advantages of using  (RE)ML is a lot more flexibility in how
you specifiy the structure of random effects covariance matrices, and
(at least with lme) you have a lot of flexibility in how you structure
the residual covariance matrix as well.  This is particularly useful
when you are not meeting assumptions of constant variance and/or
independence of observations (which seems to be the rule rather than
the exception with ecological data with a spatial component, such as
you appear to have).

The theory behind analyses you want to do are quite complex with many
potential pitfalls.  If at all possible, I highly recommend finding
the help of someone who is an expert in these types of models (known
as mixed, hierarchical, multilevel, random effects, variance
components, nested ANOVA, etc).

best,

Kingsford Jones

>  Thanks
>
>  __
>  R-help@r-project.org mailing list
>  https://stat.ethz.ch/mailman/listinfo/r-help
>  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>  and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R question about prediction from a fitted model

2008-04-26 Thread David Winsemius

Lisa <[EMAIL PROTECTED]> wrote in
news:[EMAIL PROTECTED]: 

> Hi Jorge,
> 
>  ... the new question is,
> is there any prediction for the cox model? I tried
> predict(Surv(y~x),newdata), it only gave the fitted values. Any clue
> on this? thanks a lot! 

I cannot claim great expertise in this area, but my impression is that a 
Surv() objects is not really a Cox model, but rather an object that can 
be an argument to one of the Cox PH functions. Suggestion: construct an 
example and remember to specify which of the libraries the Cox function 
came from.

-- 
David Winsemius

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Use of survreg.distributions

2008-04-26 Thread COREY SPARKS

It would seem to me that your problem is in your data.  For survreg, you have 
to have positive durations, and if your data have y>=-5 as seen in

>Error in survreg(Surv(y, y >= -5, type = "left") ~ x +  :
>  Invalid survival times for this distribution
>In addition: Warning messages:
>1: In log(dlist$dtrans(Y[exactsurv, 1])) : NaNs produced
>2: In log(y) : NaNs produced

then you have negative durations in the data.  I would suggest truncating your 
data at y>0, as that would avoid taking natural logs of negative numbers.

Corey Sparks

Corey S. Sparks, Ph.D.

Assistant Professor 
Department of Demography and Organization Studies
University of Texas San Antonio
One UTSA Circle 
San Antonio, TX 78249
email:[EMAIL PROTECTED]
 


Date: Fri, 25 Apr 2008 10:02:21 -0700 (PDT)
From: Abdus Sattar <[EMAIL PROTECTED]>
Subject: Re: [R] Use of survreg.distributions
To: Terry Therneau <[EMAIL PROTECTED]>
Cc: r-help@R-project.org
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain

Hello Dr. Therneau:
 
Thank you for your response. Let me explain to you want I want. My y 
variable(time) is normal and would like to fit the model on logarithmic 
transformation of y (log(y)). I tried to run codes according to your suggestion:
tfit=survreg(Surv(y, y>=-5, type="left")~x + cluster(id), dist="lognormal", 
data=y.data, scale=0, weights=w)

The following error message is giving:
 
Error in survreg(Surv(y, y >= -5, type = "left") ~ x +  :
  Invalid survival times for this distribution
In addition: Warning messages:
1: In log(dlist$dtrans(Y[exactsurv, 1])) : NaNs produced
2: In log(y) : NaNs produced

Note, the data file y.data does not contain any missing data! Do you know why 
it is giving me such an error message please?
Thank you again for your helpful comment/suggestion.
 
Best Regards,
 
Abdus Sattar
[EMAIL PROTECTED]



- Original Message 
From: Terry Therneau <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: r-help@R-project.org
Sent: Friday, April 25, 2008 8:46:45 AM
Subject: Re: [R] Use of survreg.distributions

--begin included message ---

I am using survreg(Surv()) for fitting a Tobit model of left-censored
longitudinal data. For logarithmic transformation of y data, I am trying use
survreg.distributions in the following way:
tfit=survreg(Surv(y, y>=-5, type="left")~x + cluster(id), dist="gaussian",
data=y.data, scale=0, weights=w)
my.gaussian<-survreg.distributions$gaussian
my.gaussian$name="lognormal"
my.gaussian$dist<-my.gaussian
tfit=survreg(Surv(y, y>=-5, type="left")~x + cluster(id), dist=my.gaussian,
data=y.data, scale=0, weights=w)

If I run these codes then I got the following error message,

Error in survreg(Surv(y, y >= -5, type = "left") ~ x +  :
Invalid distribution object

Does anybody can help me in identifying the error(s) in these code please?

--- end include 

  Can you tell us what you are trying to do? 
  Your first model was a fit of y ~ x + eps, eps ~ Gaussian.  If what you want
is log(y) ~ x + eps, then all that you need do is use dist="loggaussian" in the
survreg call.  (Or 'lognormal'; which is the same distribution.)
 
  Terry Therneau




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Ubuntu vs. Windows

2008-04-26 Thread George N. White III

On Tue, 22 Apr 2008, Doran, Harold wrote:

> Dear List:
>
> I am very much a unix neophyte, but recently had a Ubuntu box installed
> in my office. I commonly use Windows XP with 3 GB RAM on my machine and
> the Ubuntu machine is exactly the same as my windows box (e.g.,
> processor and RAM) as far as I can tell.
>
> Now, I recently had to run a very large lmer analysis using my windows
> machine, but was unable to due to memory limitations, even after
> increasing all the memory limits in R (which I think is a 2gig max
> according to the FAQ for windows). So, to make this computationally
> feasible, I had to sample from my very big data set and then run the
> analysis. Even still, it would take something on the order of 45 mins to
> 1 hr to get parameter estimates. (BTW, SAS Proc nlmixed was even worse
> and kept giving execution errors until the data set was very small and
> then it ran for a long time)
>
> However, I just ran the same analysis on the Ubuntu machine with the
> full, complete data set, which is very big and lmer gave me back
> parameter estimates in less than 5 minutes.
>
> Because I have so little experience with Ubuntu, I am quite pleased and
> would like to understand this a bit better. Does this occur because R is
> a bit friendlier with unix somehow? Or, is this occuring because unix
> somehow has more efficient methods for memory allocation?

On the same hardware the differences between windows and linux
performance are generally minor, but there are many things that
can cause very poor performance on either platform.

> I wish I knew enough to even ask the right questions. So, I welcome any
> enlightenment members may add.

I have seen very big differences in performance on computational 
benchmarks for hardware with similar basic specifications (CPU type and 
clock, RAM, etc).  Often the difference is a symptom of broken hardware or 
some misconfiguration.  Can you see the difference in performance in other 
applications?  Here are some things to consider:

1. anti-virus scanning and other background tasks -- I've seen
systems configured to scan gigabyte network drives.  Windows
task manager and linux "top", etc. can give an idea of what
is using a lot of CPU, but they are not so helpful if the issue
involves I/O bottlenecks.

2. incorrect hardware configuration in the system BIOS.  This happens
far too often, even with big name vendors.  I like to run some
benchmarks on every new system to make sure there aren't some
basic configuration errors, and to have as a reference if I
suspect problems after the systems have been in use.

3. network problems.  Where I work, so PC's (both linux and Windows)
get the ethernet duplex setting wrong when booted.  This can
result in poor performance when using networked disks without
other symptoms.  On windows, the "repair network connection"
button often clears the problem.  On linux, ethtool can display
and change ethernet settings.

4. all sorts of hardware issues -- sometimes useful data appear in
the system logs.  Use "event viewer" on Windows, look at
/var/log/messages and /var/log/dmesg on linux.

5. does the slow system exhibit a lot more disk activity?  Sometimes
this is hard to detect, but most systems do provide some statistics.
Try running some I/O intensive benchmark at the same time your R
job is running.

-- 
George N. White III  <[EMAIL PROTECTED]>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] resolution (dpi) problem

2008-04-26 Thread Chris Walker

I am using R 2.4.1 with Windows XP.

I use the plot command in a fairly simple script and I use the right mouse
click on the plot and save as a postscript file. I used the resultant file
in a paper which was submitted electronically. However, I get the following
response from the journal:

 

Your manuscript has been unsubmitted because you failed to meet the
submission guidelines as indicated below:

 

-Your figures must be submitted in TIFF or EPS format according to the
following minimum resolutions:

1200 dpi for black and white line art (simple bar graphs, charts, etc.)
300dpi for halftones (black and white photographs) 600dpi for combination
halftones (Photographs that also contain line art such as labeling or thin
lines)

 

Does anyone know how to produce the correct settings for the journal (i.e.
1200 dpi)?

 

Thankyou

Chris W

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] quasi-random sequences

2008-04-26 Thread baptiste Auguié

Thank you all for the great suggestions and comments. As two of you  
pointed out, the problem was not well defined (who said a well-posed  
problem is a problem solved?), and also it seems to be a very wide  
topic. I've had an interesting reading discussing the similarities  
between half-toning in black and white printing, and the quasi Monte  
Carlo integration technique [1].

I'm trying the suggested ways plus a few other things, I'll put  
together a summary of my discoveries when I've clarified it all.

Thanks again,

baptiste

[1] "Halftoning and Quasi-Monte Carlo," K. M. Hanson, Sensitivity  
Analysis of Model Output, K. M. Hanson and F. M. Hemez, eds., pp.  
430-442 (Los Alamos Research Library, 2005)


_

Baptiste Auguié

Physics Department
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
http://projects.ex.ac.uk/atto

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] nested anova and multiple comparisons

2008-04-26 Thread Stephen Cole

Hello R List:

My problem is with a nested anova.  I have read the r-help and it has
answered some of my questions but i still need some help on this one.
I have also posted for help on this data set before, so i apologize in
advance for any repetition.

My design is as follows:

response: Quadrat Counts (individuals per quadrat)

Explanatory: Region (3 regions)
 Locations (4 locations nested within each region
for a total of 12)
 Site (4 sites nested within each location within
each region for a total of 48)



I want to analyse this data set as an observational study where i am
interested if 1) There is significant variation at each scale in the
study

   2) partition the variance
components for each scale.

   3) Conduct multiple
comparisons at the highest level in the study (region)



I have managed to accomplish my first two goals by analyzing the data
as a 3 level nested Anova with the following code

mod1 <- aov(Count~Region/Location/Site, data=data)

This allows me to get the MS for a Anova table.   However R does not
compute the correct F-statistics (because it uses the residual MS for
the denominator in all calculations which it shouldnt) so i manually
computed the F-stats and variance components by hand.

>From reading the help guide i learned about and tried using the
Error(Region/Location/Site) command but then i can only get MS and no
F-statistics and still hand to compute the rest by hand.

My problem now is that i would liek to use TukeyHSD for multiple
comparisons.  Howeber since R is unable to compute the correct F
statistics in this nested design i also think it is using the wrong MS
and df in calculating tukeys q. for example when i use

TukeyHSD(mod1, "Region")  i will get values however i do not think
they have been calculated correctly.

Furthermore when i use the Error(Region/Location/Site) term i can then
no longer use TukeyHSD as i get the error message that there is no
applicable use of tukey on this object.

i am just wondering if there is any way to use Multiple comparisons
with R in a nested design like mine.

I have thought about using lme or lmer but if i understand them right
with a balanced design i should be able to get the same result using
aov

Thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix from list

2008-04-26 Thread Olivier Lefevre

Martin Maechler wrote:
>   The difference to as.matrix() is that data.matrix() also
>   produces a numeric matrix in the case the data frame contains
>   factors.

Thanks, that is useful but it is becoming a little rococo: so may ways to 
do this. Also, what if I have a list, not a data frame? read.table returns 
a data frame but a select from a spreadsheet (via RODBC) returns a list.

> - I don't understand why you want to throw away the dimnames of
>   the resulting numeric matrix.  Rather, in general you'd want
>   to keep them.

Merely because I find it confusing. When printing, which you do a lot of 
when developing in interactive languages like R, list and as.matrix(list) 
look absolutely the same, which is irksome and confusing since you have to 
resort to mode to check what you've got, under penalty of strange errors 
farther down. So I stick to the convention of unattributed matrices and 
arrays in my code: that way it's easy to know which is which.

Regards,

-- O.L.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Consistency of m-shapiro.test

2008-04-26 Thread Manuel Rivas

Hello all,

 

I tried several experiments with the mshapiro.test package in R and compared
it with the energy package to test for multivariate normality and find that
the mshapiro.test is not consistent which is a bit concerning and has
suspicious behavior.  On the other hand the energy test seems to be a more
appropriate test for testing multivariate normality in any dimension.  I
looked for the reference literature and either seem to reference the
original literature on the univariate Shapiro Test or implementation
algorithm of the Shapiro test.  I was wondering if the procedure used in the
mshapiro.test is published elsewhere.  

 

For direct comparison we selected three alternative distributions used in
Quiroz and Manzotti (2001):

1) Gaussian mixture-bimodal (GM) - (1=2)N (0; Iq) + (1=2)N (3; Iq),

 2) Uniformdistribution on the unit cube [0; 1]q, and 

3) distribution having i.i.d. coordinates with the

Logistic distribution. 

We evaluate the power of each statistic(1) energy statistic of Szekely and
Rizzo, and 2) Multivariate Shapiro Wilk, on the set of alternatives
considered, from sets of 1000 samples with the alternative distribution at
each dimension q = 2; 3; 4; 5 and each sample size n = 20; 50; 100. We use
the R implementation of the

energy test and multivariate Shapiro-Wilk test.

 

It seems as if the Multivariate Shapiro-Wilk test exhibits good power
against alternatives that are Gaussian Mixture. My numbers do not agree with
the power reportedby Szekely and Rizzo in their paper for the Gaussian
Mixture distribution (but this is of less importance). I find behavior that
are kind of suspect with the Shapiro-Wilk method, such as increase power
with smaller sample size for U[0; 1]q and increase power across all
alternatives for increasing q. On the other hand, the multivariate energy
test exhibits consistent power against all alternatives, i.e. power
increases with sample size which is what is desired from any test.

 

Best Regards,

Manuel Rivas


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix from list

2008-04-26 Thread Olivier Lefevre

Olivier Lefevre wrote:
> Anyway you are right that it would still return the kind of object, only 
> subsetted, which is not I want.

I mean [] would do that; I know [[]] doesn't. Yet I still don't see why one 
accepts vector arguments but not the other: they are both indexing 
operators. It is such inconsistencies that make languages hard to learn.

-- O.L.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] quasi-random sequences

2008-04-26 Thread Hans W. Borchers

baptiste Auguié  exeter.ac.uk> writes:
> 
> Dear list useRs,

You might be interested to apply the Hammersley or Halton point sets that 
are often used in numerical integration or Differential Evolution. These 
pseudo-random distributions are both uniform and irregular, but have a 
kind of minimum resolution

There is an implementation of Halton Sequences in the often overlooked 
'sfsmisc' package, see the 'QUnif()' function there.  The help includes a 
visualization example dispaying the behavior I think you were looking for.

Hans Werner

> I have to generate a random set of coordinates (x,y) in [-1 ; 1]^2  
> for say, N points. At each of these points is drawn a circle (later  
> on, an ellipse) of random size, [...]
> 
> My problem is to avoid collisions (overlap, really) between the  
> points. I would like some random pattern, but with a minimum  
> exclusion distance. In looking up "Numerical recipes in C", I found  
> out about some Sobol quasi-random sequences, which one can call from  
> the gsl package,
> [...]
> but this does not look very random: I clearly see some pattern  
> (diagonals, etc...), and even the non-overlapping condition is not  
> impressive.
> 
> One (painful) way I can foresee is to check the distance between each  
> symbol and the others, and move the overlapping ones in a recursive  
> manner. Before delving into this, I wanted to check I'm not  
> overlooking something in the rgl quasi-random sequences, or missing a  
> more obvious way to generate such patterns. Perhaps solving an  
> electrostatic problem with a potential both attractive at long  
> distances and repulsive at short distances is a better way? I have a  
> vague recollection of hearing that somewhere to position points  
> evenly on a sphere.
> 
> Thanks for any comment / suggestion,
> 
> Baptiste
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to use the optim function if the length of the objectivefunction is greater than 1?

2008-04-26 Thread Charles Annis, P.E.

Help you out with what?  You have not provided a commented, minimal,
self-contained, reproducible example as you were asked to provide in the
bottom text of every note to r-help.  Can you expect help with a problem
only you know?

I *suspect* that you will need to define an objective function, such as the
sum of the squares of deviations, so that the function is then
single-valued.  But that's only a guess, based on insufficient information.

Charles Annis, P.E.

[EMAIL PROTECTED]
phone: 561-352-9699
eFax:  614-455-3265
http://www.StatisticalEngineering.com
 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of kakul modani
Sent: Saturday, April 26, 2008 10:40 AM
To: r-help@r-project.org
Subject: [R] How to use the optim function if the length of the
objectivefunction is greater than 1?

Hi,

Please help me out with this.Im a new user of R.
Thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] quasi-random sequences

2008-04-26 Thread Robert A LaBudde

You seem to have ambiguous requirements.

First, you want equidistribution for a packing 
structure, which would suggest closest packing or 
quasirandom sequences, as you have tried.

But then you are disturbed by the packing 
structure, because it gives a perceivable 
pattern, so you wish to randomize it, but under 
some unspecified condition of equidistribution 
(your "electrostatic repulsion" algorithm).

You obviously have some constraints on selection 
you have not quantified yet. E.g., circles of unspecified radii cannot overlap.

You should also realize that any distribution of 
centers under such constraints will always 
exhibit structure due to the constraints.

Is your problem simply to give an appearance of 
randomness to the casual observer, or something more definite?

You also need to say something about the packing density involved.

On the face of it, you with
At 06:22 AM 4/26/2008, baptiste Auguié wrote:
>Dear list useRs,
>
>I have to generate a random set of coordinates (x,y) in [-1 ; 1]^2
>for say, N points. At each of these points is drawn a circle (later
>on, an ellipse) of random size, as in:
>
>
> > N <- 100
> >
> > positions <- matrix(rnorm(2 * N, mean = 0 , sd= 0.5), nrow=N)
> > sizes<-rnorm(N, mean = 0 , sd= 1)
> > plot(positions,type="p",cex=sizes)
>
>
>My problem is to avoid collisions (overlap, really) between the
>points. I would like some random pattern, but with a minimum
>exclusion distance. In looking up "Numerical recipes in C", I found
>out about some Sobol quasi-random sequences, which one can call from
>the gsl package,
>
>
> > library(gsl)
> >
> > g <- qrng_alloc(type="sobol",dim=2)
> > qrng_get(g,n= N) ->xy
> >
> > plot((xy),t="p",cex=0.5)
>
>but this does not look very random: I clearly see some pattern
>(diagonals, etc...), and even the non-overlapping condition is not
>impressive.
>
>One (painful) way I can foresee is to check the distance between each
>symbol and the others, and move the overlapping ones in a recursive
>manner. Before delving into this, I wanted to check I'm not
>overlooking something in the rgl quasi-random sequences, or missing a
>more obvious way to generate such patterns. Perhaps solving an
>electrostatic problem with a potential both attractive at long
>distances and repulsive at short distances is a better way? I have a
>vague recollection of hearing that somewhere to position points
>evenly on a sphere.
>
>
>Thanks for any comment / suggestion,
>
>Baptiste
>
>
>_
>
>Baptiste Auguié
>
>Physics Department
>University of Exeter
>Stocker Road,
>Exeter, Devon,
>EX4 4QL, UK
>
>Phone: +44 1392 264187
>
>http://newton.ex.ac.uk/research/emag
>http://projects.ex.ac.uk/atto
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [EMAIL PROTECTED]
Least Cost Formulations, Ltd.URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239Fax: 757-467-2947

"Vere scire est per causas scire"

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Ubuntu vs Mac OS X

2008-04-26 Thread Dirk Eddelbuettel


On 26 April 2008 at 12:27, Prof Brian Ripley wrote:
| 3) If you are doing matrix algebra you want an optimized BLAS.  The CRAN 
| build of R for Mac OS by default uses vecLib (which is optimized by 
| Apple).  On other OSes R can be built to use optimized BLAS (see the 
| manual) -- my understanding is that on Debian derivatives it is built to 
| make use of the system libblas, and you can install versions of the latter 
| based on ATLAS. Some optimized BLAS can make use of multiple CPUs.

Yes, on Ubuntu, use 'apt-get install atlas3-base' for basic tuned Atlas, or
'apt-get install atlas3-sse2' assuming that sse2 is the closest fit to the
cpu in question.
 
Dirk

-- 
Three out of two people have difficulties with fractions.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to use the optim function if the length of the objective function is greater than 1?

2008-04-26 Thread kakul modani

Hi,

Please help me out with this.Im a new user of R.
Thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] sys.time question - how to improve R performance

2008-04-26 Thread Prof Brian Ripley


On Sat, 26 Apr 2008, Jiří Voller wrote:


Dear R-users,
I run my program a difference between sys.times is as follows:
user   system  elapsed
60167.53  2848.75 63278.93

I am quite puzzled how it may happen that  system time is so much shorter
than user time, I have 2 core computer most of the time (90%) I was not
doing anything else with the computer. What could be the reason of such a
difference? I switched off file indexing and all other resident
programs.. My program include writing output into a file - could that be
the reason?


First, do you mean system.time()?  Its help page refers you to ?proc.time, 
which says:


 An object of class '"proc_time"' which is a numeric vector of
 length 5, containing the user, system, and total elapsed times for
 the currently running R process, and the cumulative sum of user
 and system times of any child processes spawned by it on which it
 has waited.  (The 'print' method combines the child times with
 those of the main process.)

and it seems you don't know those terms.  At any given time a CPU is 
running a task for a process, it is either in a system call or in user 
code (R itself or a package).  This is recorded every once in a while (a 
few ms) and totalled.  So your process spent 2848s in system calls.
The elapsed time is a little more than the sum of the user and system 
times, as your R session was not running 100% of the time.


You didn't even tell us your OS (see the posting guide) nor your 
background .  On a Unix-alike look at the man pages for time and times.

E.g. mine says

  The tms_utime field contains the CPU time spent executing instructions
  of the calling process.  The tms_stime field contains the CPU time
  spent in the system while executing tasks on behalf of the calling process.



Thanks for your replies.

--

Ji?? Voller
Laboratory of Growth Regulators
Palack? University & Institute of Experimental Botany AS CR
?lechtitel? 11, 783 71 Olomouc
Czech Republic
http://rustreg.upol.cz
landline: +420-585-634-855
cell: +420-737-520-506
fax: +420-585-634-870
--

[[alternative HTML version deleted]]


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] g(x,y) = f(x,y) - e(x)- e(y)?

2008-04-26 Thread William Simpson

Thanks everyone for the help.

Greg, I posted the equation from Prof Brillinger's paper. To me it is
not clear how or why to subtract a vector from a matrix (since they
have diff dimensions). That's why I posted my question to the experts
here!

Anyway, I have some answers now.

Cheers
Bill

On Wed, Apr 23, 2008 at 5:52 PM, Greg Snow <[EMAIL PROTECTED]> wrote:
> It is not clear exactly what you want to do by subtracting a vector from a 
> matrix, but look at the "sweep" function, it may do what you want.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help with simulation of heteroskedasticity

2008-04-26 Thread Leo Correia

Hello guys!
Sorry to bother with such a question

I was trying to generate a monte carlo simulation with heteroskedasticity
errors. but I am not sure if the command line that I had
wrote is quite correct.
the type of heteroskedasticity that I want to create is such as var(e) =
var(x^4)

I began my work with this

x<- rnorm (100, 2,0.4)  # generating an indepedent random variable
e<- rnorm(100,0,x^2) # generating the error term
y.fitted<- 0.5*(x) +3
y<- y.fitted+e

And then I wrote an for loop code which could give me the results of a
t-test.
It seems that the simulation worked good because the main result is that the
t-test was no longer meanfull (as it is expected)

But my main doubt is if this code was capable to generate the type of
heteroskedasticity
that I wanted.
I tried to understand how the rnorm function generates the numbers but i
could not grab much

So could anyone help me telling if this kind of work is okay? or is there
any moree inteligent way to do the job?

thanks!!!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] sys.time question - how to improve R performance

2008-04-26 Thread Jiří Voller

Dear R-users,
I run my program a difference between sys.times is as follows:
 user   system  elapsed
60167.53  2848.75 63278.93

I am quite puzzled how it may happen that  system time is so much shorter
than user time, I have 2 core computer most of the time (90%) I was not
doing anything else with the computer. What could be the reason of such a
difference? I switched off file indexing and all other resident
programs.. My program include writing output into a file - could that be
the reason?

Thanks for your replies.

-- 

Jiøí Voller
Laboratory of Growth Regulators
Palacký University & Institute of Experimental Botany AS CR
©lechtitelù 11, 783 71 Olomouc
Czech Republic
http://rustreg.upol.cz
landline: +420-585-634-855
cell: +420-737-520-506
fax: +420-585-634-870
--

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 3 questions: debug R script, multi-level sorts, and multi-gsub

2008-04-26 Thread David Winsemius

"Ng Stanley" <[EMAIL PROTECTED]> wrote in
news:[EMAIL PROTECTED]: 

> 2) I have looked deeply into R, but can't fine multi-level sorts.

At the top of the help page for sort...

"Sort (or order) a vector or factor (partially) into ascending (or 
descending) order. For ordering along more than one variable, e.g., for 
sorting data frames, see _order_. "

In the examples on the order page, it shows what I suspect you mean by 
a multilevel sort (although you offer no examples).

> (ii <- order(x <- c(1,1,3:1,1:4,3), y <- c(9,9:1), z <-c(2,1:9)))
 [1]  6  5  2  1  7  4 10  8  3  9
> dd <- transform(data.frame(x,y,z),
+ z = factor(z, labels=LETTERS[9:1]))
> dd
   x y z
1  1 9 H
2  1 9 I
3  3 8 H
4  2 7 G
5  1 6 F
6  1 5 E
7  2 4 D
8  3 3 C
9  4 2 B
10 3 1 A

> dd[ order(x, -y, z) ,]
   x y z
2  1 9 I
1  1 9 H
5  1 6 F
6  1 5 E
4  2 7 G
7  2 4 D
3  3 8 H
8  3 3 C
10 3 1 A
9  4 2 B

-- 
David Winsemius


> 
> 3) I have a vector of words, several of which need to be replaced by
> different unique words. I am using multi-lines of gsub, is there any
> elegant alternative ?
> 
>  [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide commented,
> minimal, self-contained, reproducible code. 
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to overlap two density plots?

2008-04-26 Thread Jared O'Connell

My apologies, the previous replies did not show up for me.

On Sat, Apr 26, 2008 at 1:49 PM, Jared O'Connell
<[EMAIL PROTECTED]> wrote:
> Very näively, you could do something like this,
>
>  plot(density(A))
>  lines(density(B),col=2)
>
>  , and tinker your xlim and ylim as suitable.  The Cairo library gives
>  a pretty example,
>
>  data(iris)
>  attach(iris)
>  plot(Petal.Length, rep(-0.03,length(Species)), xlim=c(1,7),
>   ylim=c(0,1.7), xlab="Petal.Length", ylab="Density",
>   pch=21, cex=1.5, col="#0001", main = "Iris (yet again)",
>   bg=c("#ff20","#00ff0020","#ff20")[unclass(Species)])
>  for (i in 1:3) 
> polygon(density(Petal.Length[unclass(Species)==i],bw=0.2),col=c("#ff40","#00ff0040","#ff40")[i])
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to overlap two density plots?

2008-04-26 Thread Jared O'Connell

Very näively, you could do something like this,

plot(density(A))
lines(density(B),col=2)

, and tinker your xlim and ylim as suitable.  The Cairo library gives
a pretty example,

data(iris)
attach(iris)
plot(Petal.Length, rep(-0.03,length(Species)), xlim=c(1,7),
  ylim=c(0,1.7), xlab="Petal.Length", ylab="Density",
  pch=21, cex=1.5, col="#0001", main = "Iris (yet again)",
  bg=c("#ff20","#00ff0020","#ff20")[unclass(Species)])
for (i in 1:3) 
polygon(density(Petal.Length[unclass(Species)==i],bw=0.2),col=c("#ff40","#00ff0040","#ff40")[i])

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Ubuntu vs Mac OS X

2008-04-26 Thread Prof Brian Ripley

Three comments, all from the 'R Installation and Adminstration' manual.

1) In general there is a small performance penalty for using a 64-bit 
version of R (larger pointers to move around).

2) For speed, bulld yourself a non-shared-library version of R.  Those 
binary maintainers are supplying an R-shlib version that suits their 
agendas but maybe not yours.  You may also want to increase the 
optimization level on the compiler.

3) If you are doing matrix algebra you want an optimized BLAS.  The CRAN 
build of R for Mac OS by default uses vecLib (which is optimized by 
Apple).  On other OSes R can be built to use optimized BLAS (see the 
manual) -- my understanding is that on Debian derivatives it is built to 
make use of the system libblas, and you can install versions of the latter 
based on ATLAS. Some optimized BLAS can make use of multiple CPUs.

You have not told us what chip your '3.2 GHz Dual-core' is -- it will 
matter for an optimized BLAS.  On Core 2 Duos I tend to use the (academic 
use only) BLAS by Dr Goto.  On my modest Intel E6600 Core 2 Duo I get

> system.time(coco2<-eigen(coco))
user  system elapsed
  11.984   0.052  12.065
> system.time(coco3<-coco%*%coco)
user  system elapsed
   1.694   0.000   1.694

with vanilla 64-bit R compiled at -O3 and

> system.time(coco2<-eigen(coco))
user  system elapsed
   7.450   0.062   3.912
> system.time(coco3<-coco%*%coco)
user  system elapsed
   0.249   0.007   0.132

with the Goto BLAS.  These are using both cores, as you see from the 
elapsed times.  It is the work of moments to swap the BLAS used, which is 
how I did that test.

On Fri, 25 Apr 2008, Guillaume Blanchet wrote:

> Hi !
>
> I have installed R-2.7.0 64-bits on a computer where the new Ubuntu
> version (8.04) was installed. This computer is a 3.2 GHz Dual-core with
> 2 Gb of RAM.
>
> To test how fast (I though !) this machine is, I compared two simple
> matrix manipulations between Ubuntu and my laptop (a MacBook Pro 2.2 GHz
> with also 2 Gb of RAM under Mac OS X 10.5.2). By the way, I install the
> normal version of R on my MacBook Pro (32-bits)
>
> I decided to use the following matrix manipulation because they are
> typical manipulation that I was planning to do repetitively and with
> much larger matrices for my PhD project.
>
> First matrix manipulation:
>
> coco<-matrix(rnorm(100),1000,1000)
> system.time(coco2<-eigen(coco))
>
> Results on the Mac:
> user system  elapsed
> 7.301   0.974   7.376
>
> Results on Ubuntu:
> user system  elapsed
> 20.573   0.088   20.766
>
> Second matrix manipulation:
>
> coco<-matrix(rnorm(100),1000,1000)
> system.time(coco3<-coco%*%coco)
>
> Results on the Mac:
> user system  elapsed
> 0.360   0.024   0.224
>
> Results on Ubuntu:
> user system  elapsed
> 2.756   0.000   2.758
>
> If I understand these results right, my MacBook Pro is far better than
> this Desk computer. I know Macs are good but that difference of
> performance is surprising me, did I do something wrong considering the
> large difference in processor?
>
> Thanks in advance
>
> Guillaume Blanchet
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] quasi-random sequences

2008-04-26 Thread Prof Brian Ripley


On Sat, 26 Apr 2008, baptiste Auguié wrote:


Dear list useRs,

I have to generate a random set of coordinates (x,y) in [-1 ; 1]^2
for say, N points. At each of these points is drawn a circle (later
on, an ellipse) of random size, as in:



N <- 100

positions <- matrix(rnorm(2 * N, mean = 0 , sd= 0.5), nrow=N)
sizes<-rnorm(N, mean = 0 , sd= 1)
plot(positions,type="p",cex=sizes)



My problem is to avoid collisions (overlap, really) between the
points. I would like some random pattern, but with a minimum
exclusion distance. In looking up "Numerical recipes in C", I found
out about some Sobol quasi-random sequences, which one can call from
the gsl package,


That is a hard-core point process, e.g. a special case of a Strauss 
process.  You will find ways to simulate such a process (there are several 
processes, and several ways for most) in the various spatial packages, 
including 'spatial' itself.


I think you have misunderstood the point of quasi-random sequences (which, 
given the exposition in the edition of NR I have, would be easy to do).




library(gsl)

g <- qrng_alloc(type="sobol",dim=2)
qrng_get(g,n= N) ->xy

plot((xy),t="p",cex=0.5)


but this does not look very random: I clearly see some pattern
(diagonals, etc...), and even the non-overlapping condition is not
impressive.

One (painful) way I can foresee is to check the distance between each
symbol and the others, and move the overlapping ones in a recursive
manner. Before delving into this, I wanted to check I'm not
overlooking something in the rgl quasi-random sequences, or missing a
more obvious way to generate such patterns. Perhaps solving an
electrostatic problem with a potential both attractive at long
distances and repulsive at short distances is a better way? I have a
vague recollection of hearing that somewhere to position points
evenly on a sphere.


Thanks for any comment / suggestion,

Baptiste


_

Baptiste Auguié

Physics Department
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
http://projects.ex.ac.uk/atto

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] quasi-random sequences

2008-04-26 Thread Duncan Murdoch

baptiste Auguié wrote:
> Dear list useRs,
>
> I have to generate a random set of coordinates (x,y) in [-1 ; 1]^2  
> for say, N points. At each of these points is drawn a circle (later  
> on, an ellipse) of random size, as in:
>
>   
The quasi-random sequences are useful for integration, but they're not 
random in the sense of being unpredictable.

Here's an idea for a sequence that might work for you:  divide the 
square into an n x n grid of smaller squares.  Randomly select a smaller 
square, then uniformly select a point within it.  (Or just permute the 
list of all n^2 squares, and proceed through the permuted list.)

This won't guarantee a separation, but it will tend to lead to one.  If 
you want a guarantee, then only select from squares with even coordinates.

Once you've selected all the squares in the grid, go to a 2n x 2n grid, 
and repeat.  (If you are doing the "even coordinates" modification, 
you'll have had to
select non-uniformly for this to work.)

Duncan Murdoch
>   
>> N <- 100
>>
>> positions <- matrix(rnorm(2 * N, mean = 0 , sd= 0.5), nrow=N)
>> sizes<-rnorm(N, mean = 0 , sd= 1)
>> plot(positions,type="p",cex=sizes)
>> 
>
>
> My problem is to avoid collisions (overlap, really) between the  
> points. I would like some random pattern, but with a minimum  
> exclusion distance. In looking up "Numerical recipes in C", I found  
> out about some Sobol quasi-random sequences, which one can call from  
> the gsl package,
>
>
>   
>> library(gsl)
>>
>> g <- qrng_alloc(type="sobol",dim=2)
>> qrng_get(g,n= N) ->xy
>>
>> plot((xy),t="p",cex=0.5)
>> 
>
> but this does not look very random: I clearly see some pattern  
> (diagonals, etc...), and even the non-overlapping condition is not  
> impressive.
>
> One (painful) way I can foresee is to check the distance between each  
> symbol and the others, and move the overlapping ones in a recursive  
> manner. Before delving into this, I wanted to check I'm not  
> overlooking something in the rgl quasi-random sequences, or missing a  
> more obvious way to generate such patterns. Perhaps solving an  
> electrostatic problem with a potential both attractive at long  
> distances and repulsive at short distances is a better way? I have a  
> vague recollection of hearing that somewhere to position points  
> evenly on a sphere.
>
>
> Thanks for any comment / suggestion,
>
> Baptiste
>
>
> _
>
> Baptiste Auguié
>
> Physics Department
> University of Exeter
> Stocker Road,
> Exeter, Devon,
> EX4 4QL, UK
>
> Phone: +44 1392 264187
>
> http://newton.ex.ac.uk/research/emag
> http://projects.ex.ac.uk/atto
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] quasi-random sequences

2008-04-26 Thread baptiste Auguié

Dear list useRs,

I have to generate a random set of coordinates (x,y) in [-1 ; 1]^2  
for say, N points. At each of these points is drawn a circle (later  
on, an ellipse) of random size, as in:


> N <- 100
>
> positions <- matrix(rnorm(2 * N, mean = 0 , sd= 0.5), nrow=N)
> sizes<-rnorm(N, mean = 0 , sd= 1)
> plot(positions,type="p",cex=sizes)


My problem is to avoid collisions (overlap, really) between the  
points. I would like some random pattern, but with a minimum  
exclusion distance. In looking up "Numerical recipes in C", I found  
out about some Sobol quasi-random sequences, which one can call from  
the gsl package,


> library(gsl)
>
> g <- qrng_alloc(type="sobol",dim=2)
> qrng_get(g,n= N) ->xy
>
> plot((xy),t="p",cex=0.5)

but this does not look very random: I clearly see some pattern  
(diagonals, etc...), and even the non-overlapping condition is not  
impressive.

One (painful) way I can foresee is to check the distance between each  
symbol and the others, and move the overlapping ones in a recursive  
manner. Before delving into this, I wanted to check I'm not  
overlooking something in the rgl quasi-random sequences, or missing a  
more obvious way to generate such patterns. Perhaps solving an  
electrostatic problem with a potential both attractive at long  
distances and repulsive at short distances is a better way? I have a  
vague recollection of hearing that somewhere to position points  
evenly on a sphere.


Thanks for any comment / suggestion,

Baptiste


_

Baptiste Auguié

Physics Department
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
http://projects.ex.ac.uk/atto

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Variables selection in Neural Networks

2008-04-26 Thread _Fede_



Hi folks,

I want to apply a neural network to a data set to classify the observations
in the different classes from a concrete response variable. The idea is to
prove different models from network modifying the number of neurons of the
hidden layer to control overfitting. But, to select the best model how I can
choose the relevant variables? How I can eliminate those that are not
significant for the model of neural networks? How I can do this in R?

dataset.nn=nnet(response.variable~., dataset, subset = training, size=1,
decay=0.001, linout=F, skip=T, maxit=200, Hess=T)

What I am doing is to vary size between 0 and 1 since with a single layer it
can learn any type of function or continuous relation between a group of
input and output variables. But this only would give two different models.
The ideal would be to be reducing nonsignifictive variables. How I can prove
other different models?

Regards.

_Fede_
-- 
View this message in context: 
http://www.nabble.com/Variables-selection-in-Neural-Networks-tp16911299p16911299.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] matrix from list

2008-04-26 Thread Martin Maechler

> "OL" == Olivier Lefevre <[EMAIL PROTECTED]>
> on Sat, 26 Apr 2008 01:31:16 +0200 writes:

OL> Greg Snow wrote:
>> The '[[' only returns a single element from a data structure

OL> I know but that is precisely what I find arbitrary.

OL> Anyway you are right that it would still return the kind of object, 
only 
OL> subsetted, which is not I want. As someone kindly pointed out to me 
OL> offline, besides the unlist trick posted before, you can also achieve 
the 
OL> desired result with

OL> m <- as.matrix(l)
OL> dimnames(m) <- NULL

Two remarks on these two lines:

- Slightly better for the 99% user- case of coercing a
  'data.frame' (the result of read.table() or similar) ato a matrix is

   m <- data.matrix(.)

  The difference to as.matrix() is that data.matrix() also
  produces a numeric matrix in the case the data frame contains
  factors.

  Consider

  (dd <- data.frame(x= rnorm(10), f = gl(2,5))

  as.matrix(dd)   ## not what you want
  data.matrix(dd)   ## what you want

- I don't understand why you want to throw away the dimnames of
  the resulting numeric matrix.  Rather, in general you'd want
  to keep them.

Martin Maechler, ETH Zurich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Another Question Regarding the nnet package

2008-04-26 Thread Colin Blake Rickert

I have one other question regarding the nnet package in R:

What is the value of the step size (often referred to as "eta") for  
the gradient descent function?

I dont see anywhere in the API where this value can be modified.

Thanks,

-Colin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Equivalent of Excel pivot tables in R

2008-04-26 Thread Simon Blomberg

See also the reshape package.

Cheers,

Simon.

Simon Blomberg, BSc (Hons), PhD, MAppStat. 
Lecturer and Consultant Statistician 
Faculty of Biological and Chemical Sciences 
The University of Queensland 
St. Lucia Queensland 4072 
Australia 
T: +61 7 3365 2506 
email: S.Blomberg1_at_uq.edu.au

Policies:
1.  I will NOT analyse your data for you.
2.  Your deadline is your problem.

The combination of some data and an aching desire for 
an answer does not ensure that a reasonable answer can 
be extracted from a given body of data. - John Tukey.



-Original Message-
From: [EMAIL PROTECTED] on behalf of ppaarrkk
Sent: Sat 26/04/2008 7:54 AM
To: r-help@r-project.org
Subject: [R]  Equivalent of Excel pivot tables in R
 

Can somebody tell me how to do the equivalent of a pivot table in R ?


For example, if I have :

var1var2   var3
a x10
b y20
a z10
b z20
a z10
b z20

I could have :

  x   y   z
a1   0   2 
b0   1   2

where entries in the table are counts of var3.

  x   y   z
a10   0   20 
b0   20   40

where entries are sums of var3.



I would expect it to be tapply(), but I can't see how it would be done.


Any suggestions please.


-- 
View this message in context: 
http://www.nabble.com/Equivalent-of-Excel-pivot-tables-in-R-tp16906289p16906289.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

46 matches

Mail list logo