Re: [R] bug in closing gzfile-opened connections?

2007-07-04 Thread Prof Brian Ripley
Note that the use of read.table() does make a difference.  If you did

x <- scan(gzfile("xxx.gz"), list("", "", ""))

you would leave an unused connection, and showConnections(all=TRUE) would 
show this.  There is a finite pool of connections, and in general the 
correct way to use them is

con <- gzfile("xxx.gz")
x <- scan(con, list("", "", ""))
close(con)

read.table() is the exception, so I suspect it is other things that have 
been done in the session that have used up the pool of connections.
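
A minimal sketch of the open/use/close pattern above, wrapped in a helper so that the connection is released even if scan() fails. The helper name read_gz and the temporary file are assumptions made for illustration; they are not part of the original report.

```r
# Write a small gzipped file to read back (temporary file, illustration only).
tmp <- tempfile(fileext = ".gz")
out <- gzfile(tmp, "w")
writeLines(c("a 1", "b 2"), out)
close(out)

# Hypothetical helper: create the connection, use it, always release it.
read_gz <- function(path) {
  con <- gzfile(path)
  on.exit(close(con))          # runs even if scan() signals an error
  scan(con, what = list("", 0), quiet = TRUE)
}

# Count connections before and after to confirm none is left behind.
n_before <- nrow(showConnections(all = TRUE))
x <- read_gz(tmp)
n_after <- nrow(showConnections(all = TRUE))
```

If n_after were larger than n_before, the helper would be leaking connections, which is the situation described above for scan(gzfile(...)) without an explicit close().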

On Tue, 3 Jul 2007, Duncan Murdoch wrote:

 On 03/07/2007 1:37 PM, David Reiss wrote:
 Hi,
 I am making multiple calls to gzfile() via read.table(), e.g.

 x <- read.table( gzfile( "xxx.gz" ) )

 After I do this many times (I haven't counted, but probably between 50 and
 100 times) I get the error message:

 Error in open.connection(file, "r") : unable to open connection
 In addition: Warning message:
 cannot open compressed file 'xxx.gz'

 however, I also find that:

 showConnections()
  description class mode text isopen can read can write

 so there are no (apparently) open connections. Calling closeAllConnections()
 does not fix the problem. I have to quit and re-start R.
 I am using R 2.5.0 on a Mac (OSX 10.4.9).

 Anyone know if this is a bug or a 'feature'? I see from the gzfile help
 that:

  In general functions using connections
  will open them if they are not open, but then close them again, so
  to leave a connection open call 'open' explicitly.

 You didn't give a reproducible example, so I couldn't say.  When I
 create a gzipped version of a write.table output and run

 for(i in 1:1000) read.table(gzfile(f))

 in R 2.5.0 I don't see a problem.  This is on Windows, but I doubt that
 makes a difference.

 Duncan Murdoch

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] possible bug in ggplot2 v0.5.2???

2007-07-04 Thread Prof Brian Ripley
On Tue, 3 Jul 2007, hadley wickham wrote:

 Hi Stephane,

 The problem is that the windows graphics device doesn't support
 transparent colours.  You can get around this in two ways:

It certainly does!  Try col="transparent" (and perhaps consult your 
dictionary).  It was news to me that the windows() graphics device worked 
on Linux i586.

What it does not support as yet is translucent colours, and that is a 
restriction imposed by Windows (translucency support was introduced for 
Windows XP, and we still try to support older versions of Windows, unlike 
the MacOS people).  I have been working on a workaround, so translucency 
support is likely to be implemented in R 2.6.0 for users of XP or later.

Given that neither of the two main screen devices and neither of the 
standard print devices support translucency, the subject line looks 
correct to me: the problem surely lies in the assumptions made in ggplot2.

 * export to a device that does support transparency (eg. pdf)
 * use a solid fill colour : + stat_smooth(method="lm", fill="grey50")

 Hadley

 On 7/3/07, Stephane Cruveiller [EMAIL PROTECTED] wrote:
 Dear R-Users,

 I recently gave the nice package ggplot2 a try. Everything went
 well until I tried to add a smoother (using the lm method, for instance).
 On the graphics device the regression line is displayed, but not the
 confidence intervals as there should be (at least according to the ggplot
 website). I tried this on both MS WinXP and Linux i586: same result. Has
 anyone encountered this problem? Did I miss something?


 My R version is 2.4.1.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] retrieving stats from bwplot

2007-07-04 Thread deepayan . sarkar
On 7/3/07, Héctor Villalobos [EMAIL PROTECTED] wrote:
 Hi all,

 I want to retrieve the stats from a 'bwplot' with one factor. I have read
 the help for the 'panel' function and I'm aware of the option 'stats',
 which defaults to 'boxplot.stats', but I didn't understand it well and
 therefore I am unable to get what I need.

I'm not sure what bwplot has to do with this. Perhaps this will help:

> foo <- with(OrchardSprays, split(decrease, treatment))
> str(foo)
List of 8
 $ A: num [1:8] 2 2 5 4 5 12 4 3
 $ B: num [1:8] 8 6 4 10 7 4 8 14
 $ C: num [1:8] 15 84 16 9 17 29 13 19
 $ D: num [1:8] 57 36 22 51 28 27 20 39
 $ E: num [1:8] 95 51 39 114 43 47 61 55
 $ F: num [1:8] 90 69 87 20 71 44 57 114
 $ G: num [1:8] 92 71 72 24 60 77 72 80
 $ H: num [1:8] 69 127 72 130 81 76 81 86
> boxplot.stats(foo$A)
$stats
[1] 2.0 2.5 4.0 5.0 5.0

$n
[1] 8

$conf
[1] 2.603464 5.396536

$out
[1] 12

> bxp.stats <- lapply(foo, boxplot.stats)
> str(bxp.stats)
List of 8
 $ A:List of 4
  ..$ stats: num [1:5] 2 2.5 4 5 5
  ..$ n: int 8
  ..$ conf : num [1:2] 2.60 5.40
  ..$ out  : num 12
 $ B:List of 4
  ..$ stats: num [1:5] 4 5 7.5 9 14
  ..$ n: int 8
  ..$ conf : num [1:2] 5.27 9.73
  ..$ out  : num(0)
 $ C:List of 4
  ..$ stats: num [1:5] 9 14 16.5 24 29
  ..$ n: int 8
  ..$ conf : num [1:2] 10.9 22.1
  ..$ out  : num 84
 $ D:List of 4
  ..$ stats: num [1:5] 20 24.5 32 45 57
  ..$ n: int 8
  ..$ conf : num [1:2] 20.5 43.5
  ..$ out  : num(0)
 $ E:List of 4
  ..$ stats: num [1:5] 39 45 53 78 114
  ..$ n: int 8
  ..$ conf : num [1:2] 34.6 71.4
  ..$ out  : num(0)
 $ F:List of 4
  ..$ stats: num [1:5] 20 50.5 70 88.5 114
  ..$ n: int 8
  ..$ conf : num [1:2] 48.8 91.2
  ..$ out  : num(0)
 $ G:List of 4
  ..$ stats: num [1:5] 60 65.5 72 78.5 92
  ..$ n: int 8
  ..$ conf : num [1:2] 64.7 79.3
  ..$ out  : num 24
 $ H:List of 4
  ..$ stats: num [1:5]  69  74  81 106 130
  ..$ n: int 8
  ..$ conf : num [1:2] 62.8 99.2
  ..$ out  : num(0)


If you want combinations defined by more than one factor, you could
use something like

with(OrchardSprays, split(decrease, interaction(treatment, colpos)))

(although this is a bad example, since there is only one observation
per combination)
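
If a single table is more convenient than a nested list, the same statistics can be collected into a matrix. This is only a sketch building on the example above; the row labels are my own naming of the five values that boxplot.stats() returns in $stats.

```r
# OrchardSprays ships with R's datasets package.
foo <- with(OrchardSprays, split(decrease, treatment))

# One column per treatment, one row per boxplot statistic.
stats <- sapply(foo, function(x) boxplot.stats(x)$stats)
rownames(stats) <- c("lower whisker", "lower hinge", "median",
                     "upper hinge", "upper whisker")

# Outliers differ in number per group, so keep them as a list.
outliers <- lapply(foo, function(x) boxplot.stats(x)$out)
```

Here stats[, "A"] holds the same five numbers as the $stats component printed for foo$A above.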

-Deepayan



[R] read items

2007-07-04 Thread elyakhlifi mustapha
Hello,
I am writing to ask whether it is possible to suppress the

read item

message when I execute scan(textConnection()).
Thanks
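
One way to do this, sketched under the assumption that the message in question is the "Read N items" line printed by scan(): the quiet argument of scan() turns it off. The input string is made up for illustration.

```r
# textConnection() returns an already-open connection, so close it afterwards.
con <- textConnection("1 2 3")
x <- scan(con, quiet = TRUE)   # quiet = TRUE suppresses the "Read 3 items" line
close(con)
```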


  


[R] probabilty plot

2007-07-04 Thread along zeng
Hi all,
   I am new to R, but I am interested in it! These days I am
working through the NIST handbook pages at
http://www.itl.nist.gov/div898/handbook/eda/section3/probplot.htm
and I have a problem with probability plots: I don't know how to
draw one for a data set with R.
Could somebody tell me how? An example would be best. I
look forward to your answer.
 Thank you very much.
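
A possible starting point, offered as a sketch rather than a reproduction of the NIST procedure: qqnorm() draws a normal probability plot, and ppoints() together with a quantile function gives the same kind of plot for other distributions. The data are simulated here, and ppoints() uses plotting positions that may differ slightly from the ones in the handbook.

```r
set.seed(1)
y <- rnorm(50, mean = 10, sd = 2)          # simulated data, illustration only

# Normal probability plot: ordered data against normal quantiles.
qq <- qqnorm(y, main = "Normal probability plot")
qqline(y)                                  # reference line through the quartiles
r <- cor(qq$x, qq$y)                       # correlation of the plotted points

# The same idea for another distribution, here the exponential.
p <- ppoints(length(y))                    # plotting positions in (0, 1)
plot(qexp(p), sort(y),
     xlab = "Exponential quantiles", ylab = "Ordered data")
```

Points falling close to a straight line suggest that the chosen distribution is a reasonable fit; for these simulated normal data the first plot should look much straighter than the second.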



Re: [R] possible bug in ggplot2 v0.5.2???

2007-07-04 Thread hadley wickham
On 7/4/07, Prof Brian Ripley [EMAIL PROTECTED] wrote:
 On Tue, 3 Jul 2007, hadley wickham wrote:

  Hi Stephane,
 
  The problem is that the windows graphics device doesn't support
  transparent colours.  You can get around this in two ways:

 It certainly does!  Try col="transparent" (and perhaps consult your
 dictionary).  It was news to me that the windows() graphics device worked
 on Linux i586.

Well my dictionary defines transparent as "allowing light to pass
through so that objects behind can be distinctly seen", which I believe
applies here (i.e. stained glass windows and blue points with alpha 0.5
are both transparent).  What does your dictionary say?

 What it does not support as yet is translucent colours, and that is a
 restriction imposed by Windows (translucency support was introduced for
 Windows XP, and we still try to support older versions of Windows, unlike
 the MacOS people).  I have been working on a workaround, so translucency
 support is likely to be implemented in R 2.6.0 for users of XP or later.

I am confused by your implication that Windows (prior to XP) does not
support translucency.  Perhaps it is not supported at the operating
system level, but it has certainly been available at the application
level for a very long time.

 Given that neither of the two main screen devices and neither of the
 standard print devices support translucency, the subject line looks
 correct to me: the problem surely lies in the assumptions made in ggplot2.

The features of the windows and X11 devices clearly lag behind the
quartz and pdf devices.  I can program for the lowest common
denominator or I can use modern features that support the tasks I am
working on.  I choose the latter, and it is certainly your prerogative
to declare that a bug in me.

Hadley



Re: [R] How to install R 2.5 with Synaptic in Ubuntu?

2007-07-04 Thread Grimbough

Hi,
I can't get this approach to work.  When I first added the repository line to
/etc/apt/sources.list, Synaptic complained that it was a malformed line.  I
fixed this by adding main to the end of the entry, making it:

 deb http://my.favorite.cran.mirror/bin/linux/ubuntu feisty main

However, after this it still complains that it can't find Packages.gz.

It appears to be looking in 
http://my.favorite.cran.mirror/bin/linux/ubuntu/dists/feisty
which isn't the directory structure of the CRAN repository, but 
I can't see any way to modify this behaviour.  Every other Ubuntu repository
I have looked at contains the dists directory.

Any suggestions for modifying this behaviour are gratefully received.
Many thanks

Mike Smith


"Stefan Große" wrote:
 
I'm using Ubuntu dapper, which only have R of Version 2.2.1.
 
Would anybody tell me how to install the latest version of R with 
 
 
 from the CRAN Ubuntu readme- works for synaptic as well:
 
 * UBUNTU
 
 R packages for Ubuntu on i386 are available. The plans are to support at
 least the latest Ubuntu release and the latest LTS release. Currently
 (April 2007), these are Feisty Fawn (7.04) and Dapper Drake (6.06),
 respectively. Since Feisty was released very shortly before R 2.5.0,
 binary packages *for this release of R* are also available for Edgy
 Eft (6.10).
 
 To obtain the latest R packages, add an entry like
 
   deb http://my.favorite.cran.mirror/bin/linux/ubuntu feisty/
 
 or
 
   deb http://my.favorite.cran.mirror/bin/linux/ubuntu edgy/
 
 or
 
   deb http://my.favorite.cran.mirror/bin/linux/ubuntu dapper/
 
 in your /etc/apt/sources.list file. See 
 http://cran.r-project.org/mirrors.html
 for the list of CRAN mirrors. To install the complete R system, use
 
   sudo apt-get update
   sudo apt-get install r-base
 
 Users who need to compile packages should also install the r-base-dev
 package:
 
   sudo apt-get install r-base-dev
 
 The R packages for Ubuntu should otherwise behave like the Debian ones.
 For
 more information, see the README file in
 
   http://cran.R-project.org/bin/linux/debian/
 
 * SECURE APT
 
 The Ubuntu archives on CRAN are signed with the key of Vincent Goulet
 [EMAIL PROTECTED] with key ID E2A11821. You can fetch
 this with
 
   gpg --keyserver subkeys.pgp.net --recv-key E2A11821
 
 and then you feed the key to apt-key with
 
   gpg -a --export E2A11821 | sudo apt-key add -
 
 Some people have reported difficulties using this approach. The issue
 was usually related to a firewall blocking port 11371. An alternative
 approach is to search for the key at http://keyserver.noreply.org/ and
 copy the key to a plain text file, say key.txt. Then, feed the key to
 apt-key with
 
   sudo apt-key add key.txt
 
 
 * ACKNOWLEDGEMENT
 
 The Debian R packages are maintained by Dirk Eddelbuettel and Doug Bates.
 The Ubuntu packages are compiled for i386 by Vincent Goulet.
 
 
 
 



[R] Adding data to existing plot with new=TRUE does not appear to work

2007-07-04 Thread Paul Lemmens
Dear all,

I am trying to shove a number of cmdscale() results into a single plot
(k=1 so I'm trying to get multiple columns in the plot).  From ?par I
learned that I can/should set new=TRUE in either par() or the plot
function itself. However with the following reduced code, I get only a
plot with a column of data points with x==2.

plot(1,10, xlim=range(0,3), ylim=range(0,10), type='n')
aa <- rep(1,10)
bb <- 1:10
plot(aa,bb, xlim=range(0,3), ylim=range(0,10), new=TRUE)
aa <- rep(2,10)
plot(aa,bb, xlim=range(0,3), ylim=range(0,10), new=TRUE)

Also, when I insert a op <- par(new=TRUE) either before or immediately
after the first plot statement (the type='n' one) in the above code
fragment, the resulting graph still only shows one column of data.

Have I misinterpreted the instructions or the functionality of new=TRUE?

Thank you,
Paul Lemmens



[R] how to plot a monthplot from a ts object where all individual years are shown (e.g. as lines) and can be compared with a average or median year?

2007-07-04 Thread Jan.Verbesselt
Dear R help,

 

I'm working with regular 8-daily time-series from 2000 up till now and
would like to be able to compare years with each other. E.g. by creating
a monthplot via the result of the stl() method it looks ok but I was
wondering whether there exist other methods to plot the different years
as lines on top of each other such that years can be compared with each
other (temporal change detection)?

 

This is the type of time-series we are working with:

forest <- ts(data, frequency=46, start=c(2000,8), end=c(2006,46))

I tried it as follows but this is not very clear.

year0 <- window(forest, start=2000,end=c(2000,46))
year1 <- window(forest, start=2001,end=c(2001,46))
year2 <- window(forest, start=2002,end=c(2002,46))
year3 <- window(forest, start=2003,end=c(2003,46))
year4 <- window(forest, start=2004,end=c(2004,46))
year5 <- window(forest, start=2005,end=c(2005,46))
year6 <- window(forest, start=2006,end=c(2006,46))

plot(1:46,years[,1], col=2)
lines(1:46,years[,2], col=3)
lines(1:46,years[,3], col=4)
lines(1:46,years[,4], col=5)
lines(1:46,years[,5], col=6)
lines(1:46,years[,6], col=7)
lines(1:46,years[,7], col=8)

 

Are there other options to be able to compare years with each other in
order to detect change (e.g., per month)?
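
One possible approach, sketched with simulated data since the original series is not available: reshape the series into a matrix with one column per year, draw all years with matplot(), and overlay the per-period median. The padding step assumes, as in the ts() call above, that the first year starts at period 8; the names pad, years and med are my own.

```r
set.seed(42)
n <- 39 + 6 * 46                           # 2000 starts at period 8, then 6 full years
data <- sin(2 * pi * (seq_len(n) + 7) / 46) + rnorm(n, sd = 0.2)   # made-up signal
forest <- ts(data, frequency = 46, start = c(2000, 8), end = c(2006, 46))

# Pad the missing periods of the first year so every column has 46 rows.
pad <- start(forest)[2] - 1
years <- matrix(c(rep(NA, pad), as.numeric(forest)), nrow = 46)
colnames(years) <- 2000:2006

# Median over years for each 8-day period, ignoring the padded NAs.
med <- apply(years, 1, median, na.rm = TRUE)

matplot(1:46, years, type = "l", lty = 1, col = 2:8,
        xlab = "8-day period", ylab = "value")
lines(1:46, med, lwd = 3)                  # thick black line: median year
legend("topright", legend = c(colnames(years), "median"),
       col = c(2:8, 1), lty = 1, lwd = c(rep(1, 7), 3), cex = 0.7)
```

Replacing median with mean gives an average year instead, and years - med shows each year's departure from the typical seasonal course.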

 

Thanks a lot,

J

 

R 2.5, Win XP 




Re: [R] Fine tunning rgenoud

2007-07-04 Thread Patrick Burns
I think fine tuning the function might be in order.

The function has just a single penalty for not meeting
the constraints no matter how close it is to meeting
them.  A better approach is to have a penalty that
depends on the amount by which all of the constraints
are breached.
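
A toy sketch of such a graded penalty, not the poster's actual problem: the objective, the single constraint and the weight of 1000 are all assumptions chosen so that the answer is easy to verify by hand.

```r
# Toy problem: maximise -(x - 2)^2 subject to x <= 1.
objective <- function(x) -(x - 2)^2

# Constraints written so that a point is feasible when every element is >= 0.
constraints <- function(x) c(1 - x)

# The penalty grows with the total amount by which the constraints are breached.
penalised <- function(x, weight = 1000) {
  violation <- sum(pmax(0, -constraints(x)))
  objective(x) - weight * violation
}

opt <- optimize(penalised, interval = c(-5, 5), maximum = TRUE)
```

Because slightly infeasible points now score better than badly infeasible ones, the search is guided back towards the feasible region instead of seeing one flat plateau. An objective that is undefined outside the region (sqrt() of a negative number, say) would still need guarding before the penalty is applied.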


Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)


Paul Smith wrote:

Dear All,

I am trying to solve the following maximization problem, but I cannot
have rgenoud giving me a reliable solution.

Any ideas?

Thanks in advance,

Paul


library(rgenoud)

v <- 0.90
O1 <- 10
O2 <- 20
O0 <- v*O1+(1-v)*O2

myfunc <- function(x) {
  U0 <- x[1]
  U1 <- x[2]
  U2 <- x[3]
  q0 <- x[4]
  q1 <- x[5]
  q2 <- x[6]
  p <- x[7]

  if (U0 < 0)
    return(-1e+200)
  else if (U1 < 0)
    return(-1e+200)
  else if (U2 < 0)
    return(-1e+200)
  else if ((U0-(U1+(O1-O0)*q1)) < 0)
    return(-1e+200)
  else if ((U0-(U2+(O2-O0)*q2)) < 0)
    return(-1e+200)
  else if ((U1-(U0+(O0-O1)*q0)) < 0)
    return(-1e+200)
  else if ((U1-(U2+(O2-O1)*q2)) < 0)
    return(-1e+200)
  else if((U2-(U0+(O0-O2)*q0)) < 0)
    return(-1e+200)
  else if((U2-(U1+(O1-O2)*q1)) < 0)
    return(-1e+200)
  else if(p < 0)
    return(-1e+200)
  else if(p > 1)
    return(-1e+200)
  else if(q0 < 0)
    return(-1e+200)
  else if(q1 < 0)
    return(-1e+200)
  else if(q2 < 0)
    return(-1e+200)
  else
    return(p*(sqrt(q0)-(O0*q0+U0))+(1-p)*(v*(sqrt(q1)-(O1*q1+U1))+(1-v)*(sqrt(q2)-(O2*q2+U2))))
}
genoud(myfunc,nvars=7,max=T,pop.size=6000,starting.values=runif(7),wait.generations=150,max.generations=300,boundary.enforcement=2)



Re: [R] How to install R 2.5 with Synaptic in Ubuntu?

2007-07-04 Thread Stefan Grosse

  to end of the entry making it:

  deb http://my.favorite.cran.mirror/bin/linux/ubuntu feisty main

 However after this it still complains that it can't find packages.gz

   

Just a guess: have you replaced the my.favorite.cran.mirror by a mirror
which is close to you? If you're in UK it would be for example

deb http://www.stats.bris.ac.uk/R/bin/linux/ubuntu feisty main

;o)
Stefan

 It appears to be looking in 
 http://my.favorite.cran.mirror/bin/linux/ubuntu/dists/feisty
 which isn't the directory structure of the CRAN repository, but 
 I can't see any way to modify this behaviour.  Every other Ubuntu repository
 I have looked at contains the dists directory.

 Any suggestions for modifying this behaviour are gratefully received.
 Many thanks

 Mike Smith


   




-=-=-
... The simple truth is that interstellar distances will not fit into
the human imagination - (The Hitchhiker's Guide to the Galaxy)



[R] Odp: Adding data to existing plot with new=TRUE does not appear to work

2007-07-04 Thread Petr PIKAL
Hi

if you change your code

plot(1,10, xlim=range(0,3), ylim=range(0,10), type='n')
aa <- rep(1,10)
bb <- 1:10
plot(aa,bb, xlim=range(0,3), ylim=range(0,10), new=TRUE)
aa <- rep(2,10)
par(new=TRUE)
plot(aa,bb, xlim=range(0,3), ylim=range(0,10), new=TRUE)

you will get both columns plotted.

However, you can get a similar result using points():

plot(1,10, xlim=range(0,3), ylim=range(0,10), type='n')
aa <- rep(1,10)
bb <- 1:10
points(aa,bb)
aa <- rep(2,10)
points(aa,bb)
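
To spell out what happens, as far as I understand it: par(new=TRUE) only affects the next high-level plot and is then reset, so it has to be set again immediately before each plot() that should overlay the existing one. A small sketch (new_after is just an illustrative name), followed by a loop version of the points() variant that needs no if():

```r
bb <- 1:10

# Overlay approach: set par(new = TRUE) right before each additional plot().
plot(rep(1, 10), bb, xlim = c(0, 3), ylim = c(0, 10))
par(new = TRUE)                        # applies to the next plot() only
plot(rep(2, 10), bb, xlim = c(0, 3), ylim = c(0, 10),
     axes = FALSE, xlab = "", ylab = "")
new_after <- par("new")                # reset to FALSE by the plot() call

# points() approach: an empty frame first, then every column the same way.
plot(0, 0, type = "n", xlim = c(0, 3), ylim = c(0, 10),
     xlab = "column", ylab = "value")
for (k in 1:2) points(rep(k, 10), bb)
```

With the overlay approach both plots must use identical xlim and ylim, otherwise the two point sets are drawn on different scales without any warning.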

Regards

Petr
[EMAIL PROTECTED]

[EMAIL PROTECTED] wrote on 04.07.2007 09:48:15:

 Dear all,
 
 I am trying to shove a number of cmdscale() results into a single plot
 (k=1 so I'm trying to get multiple columns in the plot).  From ?par I
 learned that I can/should set new=TRUE in either par() or the plot
 function itself. However with the following reduced code, I get only a
 plot with a column of data points with x==2.
 
 plot(1,10, xlim=range(0,3), ylim=range(0,10), type='n')
 aa <- rep(1,10)
 bb <- 1:10
 plot(aa,bb, xlim=range(0,3), ylim=range(0,10), new=TRUE)
 aa <- rep(2,10)
 plot(aa,bb, xlim=range(0,3), ylim=range(0,10), new=TRUE)
 
 Also, when I insert a op <- par(new=TRUE) either before or immediately
 after the first plot statement (the type='n' one) in the above code
 fragment, the resulting graph still only shows one column of data.
 
 Have I misinterpreted the instructions or the functionality of new=TRUE?
 
 Thank you,
 Paul Lemmens
 



Re: [R] How to install R 2.5 with Synaptic in Ubuntu?

2007-07-04 Thread msmith

Hi,

Thanks for the suggestion and I wish the solution was that obvious, but I
have changed it to really point at my favourite mirror.

Using your example Synaptic reports the following error when I try to update
the repositories:

http://www.stats.bris.ac.uk/R/bin/linux/ubuntu/dists/feisty/main/binary-i386/Packages.gz:
404 Not Found

This is understandable since that location doesn't exist, but it makes me
think that the directory structure of the R mirrors is not compatible with
Ubuntu and Synaptic, since it automatically seeks /dists/feisty/ rather than
just /feisty/ as it is on the CRAN mirrors.

Thanks again
Mike Smith


Stefan Grosse-2 wrote:
 
 
  to end of the entry making it:

  deb http://my.favorite.cran.mirror/bin/linux/ubuntu feisty main

 However after this it still complains that it can't find packages.gz

   
 
 Just a guess: have you replaced the my.favorite.cran.mirror by a mirror
 which is close to you? If you're in UK it would be for example
 
 deb http://www.stats.bris.ac.uk/R/bin/linux/ubuntu feisty main
 
 ;o)
 Stefan
 
 It appears to be looking in 
 http://my.favorite.cran.mirror/bin/linux/ubuntu/dists/feisty
 which isn't the directory structure of the CRAN repository, but 
 I can't see any way to modify this behaviour.  Every other Ubuntu repository
 I have looked at contains the dists directory.

 Any suggestions for modifying this behaviour are gratefully received.
 Many thanks

 Mike Smith


   
 
 
 
 
 -=-=-
 ... The simple truth is that interstellar distances will not fit into
 the human imagination - (The Hitchhiker's Guide to the Galaxy)
 
 
 



Re: [R] QP for solving Support Vector Regression

2007-07-04 Thread Gorden T Jemwa
The ksvm object is probably what you need to use.

[quote]
Dear R users,
   I'm trying to run the Support Vector Regression by a general 
quadratic programming function like ipop ( ) in kernlab or solve.QP ( ) 
in quadprog packages.

   Since they are general, their application in Support Vector 
Regression can lead to misunderstanding, particularly when constructing 
matrices. Even their examples are general and applied in Support Vector 
Classification.

   Could anybody please introduce an example code for regression case.
[\quote]



Re: [R] Fine tunning rgenoud

2007-07-04 Thread Paul Smith
On 7/4/07, RAVI VARADHAN [EMAIL PROTECTED] wrote:
 Here is another approach: I wrote an R function that would generate interior 
 points as starting values for constrOptim.  This might work better than the 
 LP approach, since the LP approach gives you a starting value that is on the 
 boundary of the feasible region, i.e a vertex of the polyhedron, whereas this 
 new approach gives you points on the interior.  You can generate as many 
 points as you wish, but the approach is brute-force and is very inefficient: 
 it takes on the order of 1000 tries to find one feasible point.

Thanks again, Ravi. Actually, the LP approach also works here. Let
g(X) >= k be the constraints. Then, solving an LP problem with the
constraints

g(X) >= (k+0.2)

returns an interior starting value for constrOptim. I am aware that
the new set of constraints may correspond to an infeasible linear
system, but it works in many cases.
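
For readers without an LP solver at hand, the brute-force idea quoted above can be sketched in base R: draw random points in a box and keep the first one that satisfies every constraint with some margin. Everything in this sketch (the triangle constraints, the objective, the margin of 0.05 and the helper name find_interior) is an assumption made for illustration, not the problem from this thread.

```r
# constrOptim() expects constraints in the form ui %*% theta - ci >= 0.
# Toy region: x >= 0, y >= 0, x + y <= 1.
ui <- rbind(c(1, 0), c(0, 1), c(-1, -1))
ci <- c(0, 0, -1)

# Hypothetical helper: rejection sampling for a strictly interior point.
find_interior <- function(ui, ci, lower, upper, margin = 0.05,
                          max_tries = 10000) {
  for (i in seq_len(max_tries)) {
    x <- runif(length(lower), lower, upper)
    if (all(ui %*% x - ci > margin)) return(x)   # every constraint has slack
  }
  stop("no interior point found")
}

set.seed(1)
start <- find_interior(ui, ci, lower = c(0, 0), upper = c(1, 1))

# Minimise the squared distance to (1, 1); the constrained optimum is (0.5, 0.5).
f <- function(x) sum((x - 1)^2)
fit <- constrOptim(start, f, grad = NULL, ui = ui, ci = ci)
```

The margin plays the same role as the 0.2 added to k above: it keeps the starting value away from the boundary, which constrOptim() requires.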

Paul

 - Original Message -
 From: Paul Smith [EMAIL PROTECTED]
 Date: Tuesday, July 3, 2007 7:32 pm
 Subject: Re: [R] Fine tunning rgenoud
 To: R-help r-help@stat.math.ethz.ch


  On 7/4/07, Ravi Varadhan [EMAIL PROTECTED] wrote:
    It should be easy enough to check that your solution is valid (i.e. a
    local minimum): first, check to see if the solution satisfies all the
    constraints; secondly, check to see if it is an interior point (i.e.
    none of the constraints become equality); and finally, if the solution
    is an interior point, check to see whether the gradient there is close
    to zero.  Note that if the solution is one of the vertices of the
    polyhedron, then the gradient may not be zero.
 
   I am having bad luck: all constraints are satisfied, but the solution
   given by constrOptim is not interior; the gradient is not equal to
   zero.
 
   Paul
 
 
-Original Message-
    From: [EMAIL PROTECTED] On Behalf Of Paul Smith
Sent: Tuesday, July 03, 2007 5:10 PM
To: R-help
Subject: Re: [R] Fine tunning rgenoud
   
On 7/3/07, Ravi Varadhan [EMAIL PROTECTED] wrote:
 You had indicated in your previous email that you are having trouble
 finding a feasible starting value for constrOptim().  So, you basically
 need to solve a system of linear inequalities to obtain a starting point.
 Have you considered using linear programming? Either simplex() in the boot
 package or solveLP() in linprog would work.  It seems to me that you could
 use any linear objective function in solveLP to obtain a feasible starting
 point.  This is not the most efficient solution, but it might be worth a
 try.

 I am aware of other methods for generating n-tuples that satisfy linear
 inequality constraints, but AFAIK those are not available in R.
   
Thanks, Ravi. I had already conceived the solution that you suggest,
actually using lpSolve. I am able to get a solution for my problem
with constrOptim, but I am not confident enough that the solution is
right. That is why I am trying to get a solution with rgenoud, but
unsuccessfully until now.
   
Paul
   
   
   
 -Original Message-
 From: [EMAIL PROTECTED] On Behalf Of Paul Smith
 Sent: Tuesday, July 03, 2007 4:10 PM
 To: R-help
 Subject: [R] Fine tunning rgenoud

 Dear All,

 I am trying to solve the following maximization problem, but I cannot
 have rgenoud giving me a reliable solution.

 Any ideas?

 Thanks in advance,

 Paul

 
 library(rgenoud)

 v <- 0.90
 O1 <- 10
 O2 <- 20
 O0 <- v*O1+(1-v)*O2

 myfunc <- function(x) {
   U0 <- x[1]
   U1 <- x[2]
   U2 <- x[3]
   q0 <- x[4]
   q1 <- x[5]
   q2 <- x[6]
   p <- x[7]

   if (U0 < 0)
     return(-1e+200)
   else if (U1 < 0)
     return(-1e+200)
   else if (U2 < 0)
     return(-1e+200)
   else if ((U0-(U1+(O1-O0)*q1)) < 0)
     return(-1e+200)
   else if ((U0-(U2+(O2-O0)*q2)) < 0)
     return(-1e+200)
   else if ((U1-(U0+(O0-O1)*q0)) < 0)
     return(-1e+200)
   else if ((U1-(U2+(O2-O1)*q2)) < 0)
     return(-1e+200)
   else if((U2-(U0+(O0-O2)*q0)) < 0)
     return(-1e+200)
   else if((U2-(U1+(O1-O2)*q1)) < 0)
     return(-1e+200)
   else if(p < 0)
     return(-1e+200)
   else if(p > 1)
     return(-1e+200)
   else if(q0 < 0)
     return(-1e+200)
   else if(q1 < 0)
     return(-1e+200)
   else if(q2 < 0)
     return(-1e+200)
   else
     return(p*(sqrt(q0)-(O0*q0+U0))+(1-p)*(v*(sqrt(q1)-(O1*q1+U1))+(1-v)*(sqrt(q2)-(O2*q2+U2))))
 }

 genoud(myfunc,nvars=7,max=T,pop.size=6000,starting.values=runif(7),wait.generations=150,max.generations=300,boundary.enforcement=2)



Re: [R] sequences

2007-07-04 Thread livia

Hi all, thank you very much.

livia wrote:
 
 Hi, I would like to generate a series in the following form (0.8^1, 0.8^2,
 ..., 0.8^600)
 Could anyone tell me how can I achieve that? I am really new to R.
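
For the archive, one way to build such a series is to rely on the fact that ^ is vectorised in R; the variable name s is arbitrary.

```r
# 0.8^1, 0.8^2, ..., 0.8^600 in one vectorised expression.
s <- 0.8^(1:600)

head(s, 3)   # the first three terms
```

The parentheses around 1:600 matter: ^ binds more tightly than the colon operator, so 0.8^1:600 would be read as (0.8^1):600 and give a different sequence.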
 



Re: [R] Lattice: shifting strips to left of axes

2007-07-04 Thread Michael Hoffman
[EMAIL PROTECTED] wrote:

 library(grid)  # textGrob(), unit(), grid.layout(), frameGrob(), placeGrob()

 myYlabGrob <-
     function(..., main.ylab = "") ## ...is lab1, lab2, etc
 {
     ## you can add arguments to textGrob for more control
     ## in the next line
     labs <- lapply(list(...), textGrob, rot=90)
     main.ylab <- textGrob(main.ylab, rot = 90)
     nlabs <- length(labs)
     lab.heights <-
         lapply(labs,
                function(lab) unit(1, "grobheight",
                                   data=list(lab)))
     unit1 <- unit(1.2, "grobheight", data = list(main.ylab))
     unit2 <- do.call(max, lab.heights)
     lab.layout <-
         grid.layout(ncol = 2, nrow = nlabs,
                     heights = unit(1, "null"),
                     widths = unit.c(unit1, unit2),
                     respect = TRUE)
     lab.gf <- frameGrob(layout=lab.layout)
     for (i in seq_len(nlabs))
     {
         lab.gf <- placeGrob(lab.gf, labs[[i]], row = i, col = 2)
     }
     lab.gf <- placeGrob(lab.gf, main.ylab, col = 1)
     lab.gf
 }

Wow. I don't think I would have been able to come up with that on my 
own. Thank you!
-- 
Michael Hoffman



Re: [R] Adding data to existing plot with new=TRUE does not appear to work

2007-07-04 Thread Paul Lemmens
Hi Petr,

On 7/4/07, Petr PIKAL [EMAIL PROTECTED] wrote:
 par(new=T)
 plot(aa,bb, xlim=range(0,3), ylim=range(0,10), new=TRUE)

So I need to activate the par(new=T) really just ahead of time when I
need it, not as sort of a general clause at the beginning of my
script?


 However you can get similar result with using points

Yes, I knew that, but I wanted to try to do without an if() for
deciding between the first and subsequent columns.
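
A minimal sketch of one way to avoid the if(), under the assumption that the data are columns of a matrix sharing the same x values (the names `x` and `y` are hypothetical). As far as I know, par(new=TRUE) only affects the next high-level plotting call, which is why it has to be set just before each plot() rather than once at the top of a script. Setting up empty axes first means every column, including the first, can be added with the same low-level call.

```r
# Hypothetical data: three series (columns) observed at the same x values.
x <- 1:10
y <- cbind(a = x, b = x^1.5, c = x^2)

# Set up empty axes once, sized to hold every column ...
plot(range(x), range(y), type = "n", xlab = "x", ylab = "y")

# ... then add every column, the first included, in the same way.
for (j in seq_len(ncol(y))) {
  lines(x, y[, j], lty = j)
}

# matplot() draws all columns in a single call.
matplot(x, y, type = "l", lty = 1:3, col = 1)
```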

Thanks for helping out!
Paul Lemmens



Re: [R] The R Book by M. J. Crawley

2007-07-04 Thread Berwin A Turlach
G'day Uwe,

On Tue, 03 Jul 2007 14:33:05 +0200
Uwe Ligges [EMAIL PROTECTED] wrote:

 Pietrzykowski, Matthew (GE, Research) wrote:
   I saw the new book,
  "The R Book", by Michael J. Crawley and wanted to know what R users
  thought of it.
 
 The author seems to be an expert in (almost?) all available
 statistical programming languages 

I would have thought that honour would go to Brian S. Everitt. :-)

M.J. Crawley only seems to write books using S-Plus and R.  His book on
"Statistical Computing" (using S-Plus) is about 750 pages and his book
"Statistics: An introduction using R" is about 320 pages.  

I do not know and have not seen "The R Book" yet, so I cannot comment
on it.  The statistical material presented in the other two books is
pretty sound and well explained.  M.J. Crawley definitely has some
strong opinions on how certain data should be analysed and how
statistics should be used. The S-Plus code or R code he uses can, on
occasions, be somewhat improved.

Cheers,

Berwin



[R] Calling C Code from R

2007-07-04 Thread Deb Midya
Hi R Users,
   
  Thanks in advance.
   
  I am using R-2.5.1 on Windows XP.
   
  I am trying to call C code (testCX1.c) from R. testCX1.c calls a function in
another C file (funcC1.c), which returns a value to testCX1.c. I would like to
have this value in R.

  My steps are below:

  1. R CMD SHLIB testCX1.c funcC1.c (at the command prompt)

  2. This creates testCX1.dll with a warning (but testCX1.dll works):

  testCX1.c:38: warning: implicit declaration of function 'func'

  How can I get rid of this warning?

  What is the best way to call funcC1.c from testCX1.c?

  I have provided the code below:
   
  Once again thank you very much for the time you have given.
   
  Regards,
   
  Debabrata Midya (Deb)
  Statistician
  NSW Department of Commerce
  Sydney, Australia
   
  testCX1.c
  ---------

  /*
   * testCX1.c
   */
  #include <R.h>
  #include <Rdefines.h>
  #include <Rmath.h>

  SEXP testC1(SEXP a)
  {
     int i, nr;
     double *xa, *xw, tmp;
     SEXP w;

     PROTECT(a = coerceVector(a, REALSXP));

     nr = length(a);
     printf(" Length : %d \n", nr);

     PROTECT(w = allocVector(REALSXP, 1));

     xa = REAL(a);
     xw = REAL(w);

     tmp = 0.0;
     for (i = 0; i < nr; i++)
     {
        tmp += xa[i];
     }
     // tmp = 0.0;
     xw[0] = func(xa);
     UNPROTECT(2);
     return(w);
  }

   
  funcC1.c
  --------

  /*
   * funcC1.c
   */

  #include <R.h>
  #include <Rdefines.h>
  #include <Rmath.h>

  SEXP func(SEXP a)
  {
     int i, nr = 3;
     double *xa, *xw, tmp;
     SEXP w;

     PROTECT(a = coerceVector(a, REALSXP));

     PROTECT(w = allocVector(REALSXP, 1));

     xa = REAL(a);
     xw = REAL(w);

     tmp = 0.0;
     for (i = 0; i < nr; i++)
     {
        tmp += xa[i] * xa[i];
     }
     xw[0] = tmp;
     UNPROTECT(2);
     return(w);
  }

  R script
  --------

  dyn.load("testCX1.dll")
  xL <- 1:5
  xL <- as.double(as.vector(t(xL)))
  .Call("testC1", xL)

  [1] 55
   
   

   


Re: [R] Calling C Code from R

2007-07-04 Thread Gabor Csardi
On Wed, Jul 04, 2007 at 04:39:18AM -0700, Deb Midya wrote:
 Hi R Users,

   Thanks in advance.

   I am using R-2.5.1 on Windows XP.

   I am trying to call C code (testCX1.c) from R. testCX1.c calls a function in
 another C file (funcC1.c), which returns a value to testCX1.c. I would like to
 have this value in R.

   My steps are below:

   1. R CMD SHLIB testCX1.c funcC1.c (at the command prompt)

   2. This creates testCX1.dll with a warning (but testCX1.dll works):

   testCX1.c:38: warning: implicit declaration of function 'func'

   How can I get rid of this warning?

By adding the prototype of 'func' to testCX1.c:

SEXP func(SEXP a);

Probably it is simplest to collect all prototypes in a single header file
and include that from all .c files.

   What is the best way to call funcC1.c from testCX1.c?

See ?.C and ?.Call, and in particular section 5, 'System and foreign
language interfaces', of the 'Writing R Extensions' manual.

Gabor

[...]

-- 
Csardi Gabor [EMAIL PROTECTED]    MTA RMKI, ELTE TTK



Re: [R] How to install R 2.5 with Synaptic in Ubuntu?

2007-07-04 Thread Stefan Grosse
msmith wrote:
 Hi,

 Thanks for the suggestion and I wish the solution was that obvious, but I
 have changed it to really point at my favourite mirror.

 Using your example Synaptic reports the following error when I try to update
 the repositories:

 http://www.stats.bris.ac.uk/R/bin/linux/ubuntu/dists/feisty/main/binary-i386/Packages.gz:
 404 Not Found

 This is understandable since that location doesn't exist, but it makes me
 think that the directory structure of the R mirrors is not compatible with
 Ubuntu and Synaptic, since it automatically seeks /dists/feisty/ rather than
 just /feisty/ as it is on the CRAN mirrors.

   

Hm. I have Fedora 7, so I cannot really check what my entry would look
like. I recently installed Kubuntu on a friend's notebook, and there the
readme from http://www.stats.bris.ac.uk/R/bin/linux/ubuntu/README.html
did actually work. I just had a second look at your mail: could it be
that there is a missing slash after feisty? If that does not work, try
another mirror, such as

deb http://stat.ethz.ch/CRAN/bin/linux/ubuntu/ feisty/

The readme also has no "main" component. Sorry, I copied that
from your mail, so that obviously does not follow the readme.

I hope this works now.

Stefan




-=-=-
... Satisfaction does not come with achievement, but with effort. Full
effort is full victory. (M.Gandhi)



Re: [R] How to install R 2.5 with Synaptic in Ubuntu?

2007-07-04 Thread Stefan Grosse
Stefan Grosse wrote:
 deb http://www.stats.bris.ac.uk/R/bin/linux/ubuntu feisty main
   

Sorry, I copied a mistake there, it should be:

deb http://www.stats.bris.ac.uk/R/bin/linux/ubuntu feisty/

Stefan



-=-=-
... The simple truth is that interstellar distances will not fit into
the human imagination - (The Hitchhiker's Guide to the Galaxy)



Re: [R] probabilty plot

2007-07-04 Thread John Kane

Is this what you mean ? 

---
mydata <- c(1,2,3,4,5,7,5,4,3)

plot(mydata)
---
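
If the NIST page is about a normal probability plot, which is my reading of it, a minimal sketch with base R could look like the following. It reuses the `mydata` values above; `qq` is just an illustrative name for the returned coordinates.

```r
mydata <- c(1, 2, 3, 4, 5, 7, 5, 4, 3)

# Normal probability (Q-Q) plot: sample quantiles against theoretical
# normal quantiles, with a reference line through the quartiles.
qq <- qqnorm(mydata, main = "Normal probability plot")
qqline(mydata)

# qqnorm() also returns the plotted coordinates invisibly.
str(qq)
```

Points lying close to the reference line suggest the data are roughly normal; for other reference distributions, qqplot() with quantiles from the distribution of interest follows the same idea.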


--- along zeng [EMAIL PROTECTED] wrote:

 Hi all,
 I am new to R, but I am interested in it! These days I am working
 through the NIST pages at
 http://www.itl.nist.gov/div898/handbook/eda/section3/probplot.htm
 and I have run into a problem with probability plots: I don't know
 how to plot a data set with R.
 Could somebody tell me the answer? An example would be best! I look
 forward to your answer.
  Thank you very much.
 


Re: [R] Calling C Code from R

2007-07-04 Thread Deb Midya
Gabor,
   
  Thank you very much for such a quick response.
   
  As I am new to this area, could you please explain where I should put
SEXP func(SEXP a);
in my program?
   
  Once again, thank you very much for your quick response.
   
  Regards,
   
  Deb
  

Gabor Csardi [EMAIL PROTECTED] wrote:
  On Wed, Jul 04, 2007 at 04:39:18AM -0700, Deb Midya wrote:
 Hi R Users,
 
 Thanks in advance.
 
 I am using R-2.5.1 on Windows XP.
 
 I am trying to call C code (testCX1.c) from R. testCX1.c calls a function in
 another C file (funcC1.c), which returns a value to testCX1.c. I would like to
 have this value in R.
 
 My steps are below:
 
 1. R CMD SHLIB testCX1.c funcC1.c (at the command prompt)
 
 2. This creates testCX1.dll with a warning (but testCX1.dll works):
 
 testCX1.c:38: warning: implicit declaration of function 'func'
 
 How can I get rid of this warning?

By adding the prototype of 'func' to testCX1.c:

SEXP func(SEXP a);

Probably it is simplest to collect all prototypes in a single header file
and include that from all .c files.

 What is the best way to call funcC1.c from testCX1.c?

See ?.C and ?.Call, and in particular section 5, 'System and foreign
language interfaces', of the 'Writing R Extensions' manual.

Gabor

[...]

-- 
Csardi Gabor MTA RMKI, ELTE TTK


   


Re: [R] Calling C Code from R

2007-07-04 Thread Gabor Csardi

On Wed, Jul 04, 2007 at 05:15:15AM -0700, Deb Midya wrote:
Gabor,
 
Thank you very much for such a quick response.
 
As I am new to this area, will you please explain where can I put SEXP
func(SEXP a);
in my program.

Deb, anywhere before it is called (and outside any function definition).
Typically right after the #include lines.
Or put all these prototypes into a header file called myfuncs.h
and add the line

#include "myfuncs.h"

just after the other #include lines at the beginning of the file.

Gabor

Once again, thank you very much for your quick response.
 
Regards,
 
Deb
 

-- 
Csardi Gabor [EMAIL PROTECTED]    MTA RMKI, ELTE TTK



[R] Loop and cbind

2007-07-04 Thread livia

Hi, I would like to apply the following function for i between 1 and 12, and
then construct a list of the return series.

for (i in 1:12){
ewma[i] <- emaTA(calm[[i]]^2,0.03)
standard[i] <- calm[[i]]/sqrt(ewma[i])
standard <- cbind(standard[i])
}

But it does not work. Could anyone give me some advice how can I achieve
this? Many thanks


[R] Problem/bug with smooth.spline and all.knots=T

2007-07-04 Thread Hubertus
Dear list,
if I do
  smooth.spline(tmpSec, tmpT, all.knots=T)
with the attached data, I get this error-message:
  Error in smooth.spline(tmpSec, tmpT, all.knots = T) :
smoothing parameter value too small
If I do
  smooth.spline(tmpSec[-<single arbitrary number>], tmpT[-<single arbitrary
number>], all.knots=T)
it works!

I just don't see it. It works for hundreds of other datasets, but not for this one.
I would be glad if anyone could help!

Hubertus


Re: [R] Loop and cbind

2007-07-04 Thread john seers (IFR)



Hi 

In what way does it not work?

My guess is that you have not declared your values outside the for loop.
As they are local they will be lost on exit.

You need to declare them before:

ewma <- vector(length=12)
standard <- vector(length=12)

for ... {

}

John Seers
 


 
---

Hi, I would like to apply the following function for i between 1 and 12,
and then construct a list of the return series.

for (i in 1:12){
ewma[i] <- emaTA(calm[[i]]^2,0.03)
standard[i] <- calm[[i]]/sqrt(ewma[i])
standard <- cbind(standard[i])
}

But it does not work. Could anyone give me some advice how can I achieve
this? Many thanks


Re: [R] Loop and cbind

2007-07-04 Thread ONKELINX, Thierry
A more elegant way to do this is

standard <- sapply(calm, function(calmi){calmi / sqrt(emaTA(calmi ^ 2,
0.03))})
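
A self-contained sketch of the same idea, for readers who want to try it without the package that provides emaTA(). The function `ema()` below is a hypothetical stand-in (a plain exponentially weighted moving average), not the actual emaTA() implementation, and `calm` is simulated here as a list of 12 series. Because the moving average returns a whole series rather than one number, the results are kept in a list and bound into a matrix only at the end.

```r
# Stand-in for emaTA() (an assumption, NOT the original implementation):
# a plain exponentially weighted moving average with weight lambda.
ema <- function(x, lambda) {
  out <- numeric(length(x))
  out[1] <- x[1]
  for (t in seq_along(x)[-1]) {
    out[t] <- lambda * x[t] + (1 - lambda) * out[t - 1]
  }
  out
}

# Hypothetical input: a list of 12 return series of equal length.
set.seed(1)
calm <- replicate(12, rnorm(50), simplify = FALSE)

# One standardised series per list element; lapply() keeps them as a list.
standard_list <- lapply(calm, function(r) r / sqrt(ema(r^2, 0.03)))

# Series of equal length can then be bound into a 50 x 12 matrix.
standard <- do.call(cbind, standard_list)
dim(standard)
```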

Cheers,

Thierry


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be 

Do not put your faith in what statistics say until you have carefully
considered what they do not say.  ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of
uncertainties, a surgery of suppositions. ~M.J.Moroney

 

 -----Original Message-----
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of john seers (IFR)
 Sent: Wednesday 4 July 2007 15:19
 To: livia; r-help@stat.math.ethz.ch
 Subject: Re: [R] Loop and cbind
 
 
 
 
 Hi 
 
 In what way does it not work?
 
 My guess is that you have not declared your values outside 
 the for loop.
 As they are local they will be lost on exit.
 
 You need to declare them before:
 
 ewma <- vector(length=12)
 standard <- vector(length=12)
 
 for ... {
   
 }
 
 John Seers
  
 
 
  
 ---
 
 Hi, I would like to apply the following function for i 
 between 1 and 12, and then construct a list of the return series.
 
 for (i in 1:12){
 ewma[i] <- emaTA(calm[[i]]^2,0.03)
 standard[i] <- calm[[i]]/sqrt(ewma[i])
 standard <- cbind(standard[i])
 }
 
 But it does not work. Could anyone give me some advice how 
 can I achieve this? Many thanks


Re: [R] Fine tunning rgenoud

2007-07-04 Thread Paul Smith
On 7/4/07, RAVI VARADHAN [EMAIL PROTECTED] wrote:
 My point is that it might be better to try multiple (feasible) starting 
 values for constrOptim to ensure that you have a good local minimum, since it 
 appears that constrOptim converges to a boundary solution where the gradient 
 is non-zero.  That is why my code could be useful.

Thanks, Ravi. I have used your function, which works fine.
However, constrOptim returns markedly different solutions depending
on the starting values. It is true that I am expecting a solution on
the boundary, but shouldn't constrOptim find boundary solutions
correctly? The set of solutions that I got is below.

Paul



2.67682495728743e-08  0.676401684216637  5.18627076390355e-09  0.00206463986063195  0.87185968612836  4.32039325909089e-11  0.9996234
3.71711020733097e-08  0.539853580957444  1.82592937615235e-08  0.00206941041763503  0.93305250393447  2.08076621230984e-11  0.9995774
1.55648443014316e-08  0.356047772992972  8.61341165816411e-09  0.00207149128044574  0.939531540703735  2.55211186629222e-12  0.424
2.20685747493755e-07  0.575689534431218  5.30976753476747e-08  0.00210500604605837  0.588947341576757  3.1310360048386e-10  0.9998789
1.92961662926727e-08  0.773588030510204  1.04841835042200e-08  0.00206723852358352  0.816755014708394  3.89478290348532e-11  0.9997794
0.000279824051289082  0.000310992385522886  1.01467522935252e-06  3.11645639181419e-05  0.00249801538651552  3.0978819115532e-05  7.11821104872585e-06
2.81901448690893e-07  0.381718731525906  4.72860507882539e-08  0.00206807672109157  0.769178513763055  1.39278079797628e-09  0.9967123
5.58938545019597e-05  0.00171253668169328  4.54005998518212e-09  0.00165663757292733  0.00247994862102590  6.20992250482468e-06  0.419169641865998
1.03300938985890e-08  0.438357835603591  6.89854079723234e-09  0.00206693286138396  0.977554885433201  1.17209206267609e-10  0.996921
7.63336821363444e-05  0.00177141538041517  1.88050423143828e-10  0.00169507950991094  0.00249739505142207  8.4814984916537e-06  0.470929220605509
9.16005846107533e-09  0.682179815036755  1.63255733785783e-09  0.00206922107327189  0.919323193130209  5.71436138398897e-11  0.999629
1.40968913167328e-08  0.343606628343661  1.33227447885302e-08  0.00206789984370423  0.343671264496824  1.11679312116211e-11  0.822
4.76054734844857e-09  0.593022549313178  2.28102966623129e-09  0.00206625165098398  0.947562121256448  8.9437610753173e-11  0.992
1.96950784184139e-07  0.579488113726155  1.61915231214025e-07  0.00208000350528798  1.00891340595040  1.22248906754713e-10  0.9996493
8.1448937742933e-09  0.441088618716555  4.54846390087941e-09  0.00207634940425852  0.446155700100820  4.81439647816238e-12  0.39
4.82439218405912e-08  0.557771049256698  3.53737879481732e-08  0.0020663035737319  0.588137767965923  2.6568947800491e-11  0.9988615
2.43086751126363e-08  0.522927598354163  2.26886829089137e-08  0.00206533531066324  0.611696593543814  4.51226610050184e-11  0.087
3.05498959434100e-08  0.465522202845817  1.09246302124670e-08  0.00207004066920179  0.465583376966915  3.24213847202457e-11  0.9997366
1.88687179088788e-07  0.783614197203923  4.51346471059839e-08  0.00222403775221293  0.786422171740329  8.17865794171933e-10  0.9986103
1.0154423824979e-08  0.30265579883  9.06923080122203e-09  0.00206615353968094  0.359722316646974  8.27866320956902e-12  0.998461
8.91008717665837e-08  0.0020661526864997  3.08619455858999e-09  0.00206579199039568  0.00275523149199496  9.55650084108725e-09  0.985185595958656
1.25320647920029e-07  0.635217955401437  7.44627883600107e-08  0.00206656250455391  0.855937507707323  3.70326032870889e-10  0.9998375
2.57618374406559e-08  0.636499151952225  1.09822023878715e-08  0.00206677354204888  0.772636071860102  8.99370944431481e-11  0.9978744
1.09474196877990e-08  0.501469973722704  1.19992915868609e-10  0.00206117941606503  0.501594064757161  1.34320044786225e-11  0.9991232
5.24203710193977e-05  0.000127998340144109  3.33258623630601e-09  7.55779680724378e-05  0.00248898574263025  5.82411313482383e-06  0.0221497278110802
3.80217498132259e-07  0.57664568703189  1.01755510162620e-08  0.00207232950382402  0.944031557945531  5.30703662426069e-10  0.9995957
1.45159816281038e-09

Re: [R] Fine tunning rgenoud

2007-07-04 Thread Paul Smith
On 7/4/07, Paul Smith [EMAIL PROTECTED] wrote:
 On 7/4/07, RAVI VARADHAN [EMAIL PROTECTED] wrote:
  My point is that it might be better to try multiple (feasible) starting 
  values for constrOptim to ensure that you have a good local minimum, since 
  it appears that constrOptim converges to a boundary solution where the 
  gradient is non-zero.  That is why my code could be useful.

 Thanks, Ravi. I have used your function, which works pretty fine.
 However, constrOptim returns solutions markedly different, depending
 on the starting values. That is true that I am expecting a solution in
 the boundary, but should not constrOptim find boundary solutions
 correctly? The set of solution that I got is below.

Unless there are many locally optimal solutions...
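
A minimal sketch of the multi-start check Ravi describes, on a toy problem rather than the real one: the objective `f`, its gradient, the constraint matrix `ui`/`ci` and the starting values are all hypothetical. The toy objective is convex, so every start should end up at the same boundary point; if the starts disagree on a real problem, that hints at several local optima (or at convergence trouble).

```r
# Toy objective (hypothetical): squared distance to the point (2, 2).
f    <- function(x) sum((x - 2)^2)
grad <- function(x) 2 * (x - 2)

# Linear constraints in the form ui %*% x - ci >= 0:
#   x1 >= 0, x2 >= 0, x1 + x2 <= 1
ui <- rbind(c(1, 0), c(0, 1), c(-1, -1))
ci <- c(0, 0, -1)

# Several strictly feasible starting values.
starts <- list(c(0.1, 0.1), c(0.8, 0.1), c(0.1, 0.8), c(0.3, 0.3))

fits <- lapply(starts, function(s) constrOptim(s, f, grad, ui = ui, ci = ci))

# Compare the solutions and objective values across starts.
sol <- t(sapply(fits, function(fit) c(fit$par, value = fit$value)))
print(round(sol, 4))
```

Here the constrained minimum lies on the boundary x1 + x2 = 1, at (0.5, 0.5).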

Paul


 

 [...]
 

[R] copula estimation wih time series marginals

2007-07-04 Thread [EMAIL PROTECTED]
I am using R 2.5.1 for Windows and my aim is to estimate a Clayton copula.
I have two time series as marginals, and I found that the most appropriate model
for both was ARMA(1,0)+GARCH(1,1) with sstd as the conditional
distribution. Can anyone give me some tips about the code to estimate the
copula?
Thanks in advance

Gaetano Rossi
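
One possible two-step approach, sketched in base R under stated assumptions: the ARMA-GARCH models are fitted first, their standardised residuals are turned into pseudo-observations via ranks, and the Clayton parameter is then estimated by maximum pseudo-likelihood. The residuals below are simulated stand-ins (true theta = 2); the function names `clayton_logdens`, `pseudo_obs` and `fit_clayton` are hypothetical, and extracting residuals from the fitted marginal models is not shown.

```r
# Clayton copula log-density for theta > 0.
clayton_logdens <- function(theta, u, v) {
  log1p(theta) - (1 + theta) * (log(u) + log(v)) -
    (2 + 1 / theta) * log(u^(-theta) + v^(-theta) - 1)
}

# Pseudo-observations: ranks rescaled into (0, 1).
pseudo_obs <- function(x) rank(x) / (length(x) + 1)

# Maximum pseudo-likelihood estimate of theta from two residual series.
fit_clayton <- function(z1, z2, interval = c(0.01, 20)) {
  u <- pseudo_obs(z1)
  v <- pseudo_obs(z2)
  nll <- function(theta) -sum(clayton_logdens(theta, u, v))
  optimize(nll, interval)$minimum
}

# Simulated stand-in for the standardised residuals (true theta = 2),
# drawn with the conditional-distribution method.
set.seed(42)
n <- 2000
theta0 <- 2
u <- runif(n)
w <- runif(n)
v <- ((w^(-theta0 / (1 + theta0)) - 1) * u^(-theta0) + 1)^(-1 / theta0)

theta_hat <- fit_clayton(qnorm(u), qnorm(v))
theta_hat
```

Using ranks makes the copula step independent of the exact marginal distribution, at the cost of ignoring estimation error from the first step.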




[R] Please help with legend command

2007-07-04 Thread Smith, Phil (CDC/CCID/NCIRD)
Hi R-ers:

I'm drawing a plot and have used different line types (lty) for
different race/ethnicity groups. I want a legend that explains what line
types correspond to the different race/ethnicity groups. I used the
following code:


legend( 1992 , 42 , c("Hispanic" , "non-Hispanic white (NHW)" ,
"non-Hispanic black" , "AI/AN" , "Asian" ) , lty=1:5 , cex = .6 , bty='n'
)

Guess what? The legend box was so narrow that the line types that show
up in that legend box look essentially the same, because they are short.
I.e, although a line type might be a long dash followed by a short dash,
only the long dash shows up in the box. The consequence of this is that
the race/ethnic group that corresponds to the line type that is only a
long dash cannot be distinguished from the legend.

How do I stretch that legend box out so as to allow lty to draw longer
line segments?
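
A minimal sketch of one option, assuming a version of R whose legend() has the seg.len argument (current versions do; I have not checked R 2.5). seg.len sets the length of the sample line segments in units of character widths, so a larger value leaves room for the full dash pattern. The plot region and axis labels below are made up for illustration.

```r
labels <- c("Hispanic", "non-Hispanic white (NHW)",
            "non-Hispanic black", "AI/AN", "Asian")

# Empty plot whose coordinate region contains the legend position used above.
plot(c(1990, 2005), c(0, 50), type = "n", xlab = "Year", ylab = "Percent")

# Longer sample segments (default is 2) so each line type is distinguishable.
lg <- legend(1992, 42, labels, lty = 1:5, cex = 0.6, bty = "n", seg.len = 6)
```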

Please reply to: [EMAIL PROTECTED]

Many thanks!
Phil Smith
Centers for Disease Control and Prevention



Re: [R] A More efficient method?

2007-07-04 Thread Ken Knoblauch
Keith Alan Chamberlain Keith.Chamberlain at Colorado.EDU writes:
 Cat=c('a','a','a','b','b','b','a','a','b')# Categorical variable
 C1=vector(length=length(Cat)) # New vector for numeric values

 for(i in 1:length(C1)){
   if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
 }
 
 C1
 [1] -1 -1 -1  1  1  1 -1 -1  1
 Cat
 [1] "a" "a" "a" "b" "b" "b" "a" "a" "b"

 ifelse(Cat == "a", -1, 1)
[1] -1 -1 -1  1  1  1 -1 -1  1

HTH



[R] A More efficient method?

2007-07-04 Thread Keith Alan Chamberlain
Dear Rhelpers,

Is there a faster way than below to set a vector based on values from
another vector? I'd like to call a pre-existing function for this, but one
which can also handle an arbitrarily large number of categories. Any ideas?

Cat=c('a','a','a','b','b','b','a','a','b')  # Categorical variable
C1=vector(length=length(Cat))   # New vector for numeric values

# Cycle through each column and set C1 to corresponding value of Cat.
for(i in 1:length(C1)){
if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
}

C1
[1] -1 -1 -1  1  1  1 -1 -1  1
Cat
[1] "a" "a" "a" "b" "b" "b" "a" "a" "b"

Sincerely,
KeithC.
Psych Undergrad, CU Boulder (US)
RE McNair Scholar



Re: [R] A More efficient method?

2007-07-04 Thread ONKELINX, Thierry
Cat <- c('a','a','a','b','b','b','a','a','b')
C1 <- ifelse(Cat == 'a', -1, 1)
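
For more than two categories, a named lookup vector may scale better than nested ifelse() calls. The sketch below adds a hypothetical third category 'c' and an arbitrary code for it, purely for illustration.

```r
Cat <- c('a','a','a','b','b','b','a','a','b','c')

# One named entry per category; add more entries as needed.
codes <- c(a = -1, b = 1, c = 0)

# Indexing by name maps every element of Cat in one vectorised step.
C1 <- unname(codes[Cat])
C1
```

Any value of Cat without an entry in `codes` comes back as NA, which makes unexpected categories easy to spot.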



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be 

Do not put your faith in what statistics say until you have carefully
considered what they do not say.  ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of
uncertainties, a surgery of suppositions. ~M.J.Moroney

 

 -----Original Message-----
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Keith Alan 
 Chamberlain
 Sent: Wednesday 4 July 2007 15:45
 To: r-help@stat.math.ethz.ch
 Subject: [R] A More efficient method?
 
 Dear Rhelpers,
 
 Is there a faster way than below to set a vector based on 
 values from another vector? I'd like to call a pre-existing 
 function for this, but one which can also handle an 
 arbitrarily large number of categories. Any ideas?
 
 Cat=c('a','a','a','b','b','b','a','a','b')# Categorical variable
 C1=vector(length=length(Cat)) # New vector for numeric values
 
 # Cycle through each column and set C1 to corresponding value of Cat.
 for(i in 1:length(C1)){
   if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
 }
 
 C1
 [1] -1 -1 -1  1  1  1 -1 -1  1
 Cat
 [1] "a" "a" "a" "b" "b" "b" "a" "a" "b"
 
 Sincerely,
 KeithC.
 Psych Undergrad, CU Boulder (US)
 RE McNair Scholar
 


Re: [R] how to solve a min problem

2007-07-04 Thread domenico pestalozzi
S is a one-dimensional array, for example 1 X 10, and mean(S) is the mean of
these 10 elements.

So, I want to:

minimize mean(S) subject to 0 < b_func(S) < 800.
That is, b_func places bounds on S.

The function applies an iterative convergence criterion:

f_1 = g(S), f_2 = g(f_1), f_3 = g(f_2), etc.
The iteration stops when
f_n - f_(n-1) <= 0.1e-09,
where g(S) is a non-linear function of S and convergence is mathematically
assured.

Is it possible to use 'optimize'?

thanks

domenico


2007/7/3, Spencer Graves [EMAIL PROTECTED]:

  Do you mean

  minimize mu with 0 < b_func(S+mu) < 800?

  For this kind of problem, I'd first want to know the nature of
 b_func.  Without knowing more, I might try to plot b_func(S+mu) vs.
 mu, then maybe use 'optimize'.

  If this is not what you mean, please be more specific:  I'm
 confused.

  Hope this helps.
  Spencer Graves

 domenico pestalozzi wrote:
  I know it's possible to solve max and min problems by using these
 functions:
 
  nlm, optimize, optim
 
  but I don't know how to use them (...if possible...) to solve this
 problem.
 
  I have my own function called b_func(S), where S is an input array
 (1 X
  n), and I'd like to:
 
  minimize mean(S) with 0 < b_func(S) < 800.
 
  I know that the solution exists, but is it possible to calculate it in R?
  b_func is non-linear and it calculates a particular value using S as
  input and applying a convergent iterative algorithm.
 
  thanks
 
 
  domenico
 
 




Re: [R] A More efficient method?

2007-07-04 Thread Benilton Carvalho
C1 <- rep(-1, length(Cat))
C1[Cat == "b"] <- 1

b

On Jul 4, 2007, at 9:44 AM, Keith Alan Chamberlain wrote:

 Dear Rhelpers,

 Is there a faster way than below to set a vector based on values from
 another vector? I'd like to call a pre-existing function for this,  
 but one
 which can also handle an arbitrarily large number of categories.  
 Any ideas?

 Cat=c('a','a','a','b','b','b','a','a','b')# Categorical variable
 C1=vector(length=length(Cat)) # New vector for numeric values

 # Cycle through each column and set C1 to corresponding value of Cat.
 for(i in 1:length(C1)){
   if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
 }

 C1
 [1] -1 -1 -1  1  1  1 -1 -1  1
 Cat
 [1] a a a b b b a a b

 Sincerely,
 KeithC.
 Psych Undergrad, CU Boulder (US)
 RE McNair Scholar

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] A More efficient method?

2007-07-04 Thread Stefan Grosse

 Cat=c('a','a','a','b','b','b','a','a','b')# Categorical variable
 C1=vector(length=length(Cat)) # New vector for numeric values

 # Cycle through each column and set C1 to corresponding value of Cat.
 for(i in 1:length(C1)){
   if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
 }

 C1
 [1] -1 -1 -1  1  1  1 -1 -1  1
 Cat
 [1] a a a b b b a a b

   
how about:

Cat <- c('a','a','a','b','b','b','a','a','b')
c1 <- -2*(Cat=="a")+1



-=-=-
... Time is an illusion, lunchtime doubly so. (Ford Prefect)



Re: [R] A More efficient method?

2007-07-04 Thread Ted Harding
On 04-Jul-07 13:44:44, Keith Alan Chamberlain wrote:
 Dear Rhelpers,
 
 Is there a faster way than below to set a vector based on values
 from another vector? I'd like to call a pre-existing function for
 this, but one which can also handle an arbitrarily large number
 of categories. Any ideas?
 
 Cat=c('a','a','a','b','b','b','a','a','b')# Categorical variable
 C1=vector(length=length(Cat)) # New vector for numeric values
 
# Cycle through each column and set C1 to corresponding value of Cat.
 for(i in 1:length(C1)){
   if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
 }
 
 C1
 [1] -1 -1 -1  1  1  1 -1 -1  1
 Cat
 [1] a a a b b b a a b

 Cat=c('a','a','a','b','b','b','a','a','b')

 Cat=="b"
[1] FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE

 (Cat=="b") - 0.5
[1] -0.5 -0.5 -0.5  0.5  0.5  0.5 -0.5 -0.5  0.5

 2*((Cat=="b") - 0.5)
[1] -1 -1 -1  1  1  1 -1 -1  1

to give one example of a way to do it. But you don't say why you
really want to do this. You may really want factors. And what do
you want to see if there is an arbitrarily large number of
categories?

For instance:

 factor(Cat,labels=c(-1,1))
[1] -1 -1 -1 1  1  1  -1 -1 1 

but this is not a vector, but a factor object. To get the vector,
you need to convert Cat to an integer:

 as.integer(factor(Cat))
[1] 1 1 1 2 2 2 1 1 2

where (unless you've specified otherwise in factor()) the values
will correspond to the elements of Cat in natural order, in this
case first a (-> 1), then b (-> 2).

E.g.

 Cat2<-c("a","a","c","b","a","b")
 as.integer(factor(Cat2))
[1] 1 1 3 2 1 2

so, with C2<-as.integer(factor(Cat2)), you get a vector of distinct
integers (1,2,3) for the distinct levels (a,b,c) of Cat2.
If you want integer values for these levels, you can write a function
to change them.
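
The integer codes of a factor can themselves serve as an index into a vector of replacement values, which is one possible way to write such a function. This is only a minimal sketch: `recode_levels` and its `values` argument are hypothetical names, not something from the thread, and the values are assumed to be given in the sorted order of the levels.

```r
# Hypothetical helper: map each level of a categorical vector to a chosen
# numeric value, using the integer codes of the factor as an index.
recode_levels <- function(x, values) {
  f <- factor(x)                      # levels are sorted: "a", "b", "c", ...
  stopifnot(length(values) == nlevels(f))
  values[as.integer(f)]               # look up one value per element
}

Cat2 <- c("a", "a", "c", "b", "a", "b")
C2 <- recode_levels(Cat2, values = c(-1, 0, 1))
print(C2)
```

The same call works for any number of levels, as long as `values` has one entry per level.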

Hoping this helps to break the ice!
Ted.



E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 04-Jul-07   Time: 16:44:20
-- XFMail --



Re: [R] A More efficient method?

2007-07-04 Thread Gabor Grothendieck
Here are two ways.  The second way is more than 10x faster.

> set.seed(1)
> C <- sample(c("a", "b"), 10, replace = TRUE)
> system.time(s1 <- ifelse(C == "a", 1, -1))
   user  system elapsed
   0.37    0.01    0.38
> system.time(s2 <- 2 * (C == "a") - 1)
   user  system elapsed
   0.02    0.00    0.02
> identical(s1, s2)
[1] TRUE

On 7/4/07, Keith Alan Chamberlain [EMAIL PROTECTED] wrote:
 Dear Rhelpers,

 Is there a faster way than below to set a vector based on values from
 another vector? I'd like to call a pre-existing function for this, but one
 which can also handle an arbitrarily large number of categories. Any ideas?

 Cat=c('a','a','a','b','b','b','a','a','b')  # Categorical variable
 C1=vector(length=length(Cat))   # New vector for numeric values

 # Cycle through each column and set C1 to corresponding value of Cat.
 for(i in 1:length(C1)){
if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
 }

 C1
 [1] -1 -1 -1  1  1  1 -1 -1  1
 Cat
 [1] a a a b b b a a b

 Sincerely,
 KeithC.
 Psych Undergrad, CU Boulder (US)
 RE McNair Scholar

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] A More efficient method?

2007-07-04 Thread joris . dewolf


or
Cat <- c('a','a','a','b','b','b','a','a','b')
C1 <- (Cat=='a')*1







   
From:    ONKELINX, Thierry (Thierry.ONKELINX@inbo.be)
Sent by: [EMAIL PROTECTED]
To:      Keith Alan Chamberlain [EMAIL PROTECTED]
cc:      r-help@stat.math.ethz.ch
Date:    04/07/2007 17:17
Subject: Re: [R] A More efficient method?
   
   
   
   
   
   




Cat <- c('a','a','a','b','b','b','a','a','b')
C1 <- ifelse(Cat == 'a', -1, 1)



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be

Do not put your faith in what statistics say until you have carefully
considered what they do not say.  ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of
uncertainties, a surgery of suppositions. ~M.J.Moroney



 -Original message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On behalf of Keith Alan
 Chamberlain
 Sent: Wednesday 4 July 2007 15:45
 To: r-help@stat.math.ethz.ch
 Subject: [R] A More efficient method?

 Dear Rhelpers,

 Is there a faster way than below to set a vector based on
 values from another vector? I'd like to call a pre-existing
 function for this, but one which can also handle an
 arbitrarily large number of categories. Any ideas?

 Cat=c('a','a','a','b','b','b','a','a','b')   # Categorical
variable
 C1=vector(length=length(Cat))# New vector for numeric values

 # Cycle through each column and set C1 to corresponding value of Cat.
 for(i in 1:length(C1)){
if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
 }

 C1
 [1] -1 -1 -1  1  1  1 -1 -1  1
 Cat
 [1] a a a b b b a a b

 Sincerely,
 KeithC.
 Psych Undergrad, CU Boulder (US)
 RE McNair Scholar

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] A More efficient method?

2007-07-04 Thread Stefan Grosse
Gabor Grothendieck wrote:
> > set.seed(1)
> > C <- sample(c("a", "b"), 10, replace = TRUE)
> > system.time(s1 <- ifelse(C == "a", 1, -1))
>    user  system elapsed
>    0.37    0.01    0.38
> > system.time(s2 <- 2 * (C == "a") - 1)
>    user  system elapsed
>    0.02    0.00    0.02

> system.time(s1 <- ifelse(C == "a", 1, -1))
   user  system elapsed
   0.04    0.01    0.08
> system.time(s2 <- 2 * (C == "a") - 1)
   user  system elapsed
      0       0       0


I am just wondering: why do the user and system times add up to 0.05 while
elapsed shows 0.08 on my system? (Vista + R 2.5.1)

Stefan


-=-=-
... Time is an illusion, lunchtime doubly so. (Ford Prefect)



Re: [R] A More efficient method?

2007-07-04 Thread S Ellison
#Given
Cat=c('a','a','a','b','b','b','a','a','b')  # Categorical variable

#and defining 
coding <- array(c(-1,1), dimnames=list(unique(Cat)))

#(ie an array of values corresponding to your character array levels, and with 
names set to those levels)

coding[Cat]

#does what you want.

 Keith Alan Chamberlain [EMAIL PROTECTED] 04/07/2007 14:44:44 
Dear Rhelpers,

Is there a faster way than below to set a vector based on values from
another vector? I'd like to call a pre-existing function for this, but one
which can also handle an arbitrarily large number of categories. Any ideas?

Cat=c('a','a','a','b','b','b','a','a','b')  # Categorical variable
C1=vector(length=length(Cat))   # New vector for numeric values

# Cycle through each column and set C1 to corresponding value of Cat.
for(i in 1:length(C1)){
if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
}

C1
[1] -1 -1 -1  1  1  1 -1 -1  1
Cat
[1] a a a b b b a a b

Sincerely,
KeithC.
Psych Undergrad, CU Boulder (US)
RE McNair Scholar



[R] working with matrix

2007-07-04 Thread Yemi Oyeyemi
I am new to R and I want to solve this problem:
  I have a matrix X (with n rows and p columns). I want to compute the
product of the elements of each row, print only the maximum of these
products, and identify the row that gives it. Thanks
  Oyeyemi, G.M

 


Re: [R] working with matrix

2007-07-04 Thread Gabor Csardi
which.max(apply(mat, 1, prod))
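
As a small worked illustration of the one-liner above, the sketch below also reports the maximum product itself and not just the row index. The matrix `X` is made up for the example, and the variable names are mine.

```r
# Small made-up matrix: 3 rows, 2 columns.
X <- matrix(c(1, 2,
              3, 4,
              2, 5), nrow = 3, byrow = TRUE)

row_products <- apply(X, 1, prod)     # product of the entries in each row
best_row <- which.max(row_products)   # index of the first row with the largest product
max_product <- row_products[best_row]

cat("row", best_row, "has the maximum product", max_product, "\n")
```

Note that which.max() returns only the first maximum if several rows tie.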

Gabor

On Wed, Jul 04, 2007 at 09:18:41AM -0700, Yemi Oyeyemi wrote:
 I am new in R and I want to solve this problem;
   I have a matrix X (with n-rows and p-colums) my problem is to obtain the 
 products of the vectors of rows and print out only the maximum value and 
 identify the row that gives the maximum value. Thanks
   Oyeyemi, G.M
 
  

-- 
Csardi Gabor [EMAIL PROTECTED]MTA RMKI, ELTE TTK



Re: [R] A More efficient method?

2007-07-04 Thread Keith Alan Chamberlain
Dear Ted,

You are correct in that factors are probably what I had in mind since I
would be using them as predictors in a regression. I didn't know the syntax
to get R to do the arithmetic.

Many thanks to everyone who replied! 

Sincerely,
KeithC.
Psych Undergrad, CU Boulder (US)
RE McNair Scholar



Re: [R] A More efficient method?

2007-07-04 Thread Gabor Grothendieck
In thinking about this a bit more I have found a slightly faster one still.
See s3.  Also I have added s0, the original solution, to the timings.

> set.seed(1)
> C <- sample(c("a", "b"), 100, replace = TRUE)
> system.time({
+ s0 <- vector(length = length(C))
+ for(i in seq_along(C)) s0[i] <- if (C[i] == "a") 1 else -1
+ s0
+ })
   user  system elapsed
  21.75    0.02   25.99
> system.time(s1 <- ifelse(C == "a", 1, -1))
   user  system elapsed
   2.32    0.17    2.54
> system.time(s2 <- 2 * (C == "a") - 1)
   user  system elapsed
   0.29    0.02    0.32
> system.time({tmp <- C == "a"; tmp - !tmp})
   user  system elapsed
   0.21    0.00    0.21
> identical(s0, s1)
[1] TRUE
> identical(s0, s2)
[1] TRUE
> identical(s0, s3)
[1] TRUE

 




Re: [R] working with matrix

2007-07-04 Thread S Ellison
Yemi,

Try

which.max(apply(X,1,prod))

(or possibly which.max(abs(apply(X,1,prod))) if you're only interested in the
largest product in absolute value).

S

 Yemi Oyeyemi [EMAIL PROTECTED] 04/07/2007 17:18:41 
I am new in R and I want to solve this problem;
  I have a matrix X (with n-rows and p-colums) my problem is to obtain the 
products of the vectors of rows and print out only the maximum value and 
identify the row that gives the maximum value. Thanks
  Oyeyemi, G.M

 


Re: [R] A More efficient method?

2007-07-04 Thread Gabor Grothendieck
This was in error since s3 was never assigned.  The as.numeric in the calculation
of s3 can be omitted if it's OK to have an integer rather than a numeric result,
and in that case it is faster still.

> set.seed(1)
> C <- sample(c("a", "b"), 100, replace = TRUE)
> system.time({
+ s0 <- vector(length = length(C))
+ for(i in seq_along(C)) s0[i] <- if (C[i] == "a") 1 else -1
+ s0
+ })
   user  system elapsed
  21.32    0.02   26.10
> system.time(s1 <- ifelse(C == "a", 1, -1))
   user  system elapsed
   2.37    0.26    2.64
> system.time(s2 <- 2 * (C == "a") - 1)
   user  system elapsed
   0.32    0.02    0.35
> system.time({tmp <- C == "a"; s3 <- as.numeric(tmp - !tmp)})
   user  system elapsed
   0.28    0.02    0.31
> identical(s0, s1)
[1] TRUE
> identical(s0, s2)
[1] TRUE
> identical(s0, s3)
[1] TRUE
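
For readers who want to try the three vectorised variants without the timing calls, here is a minimal sketch. The sample size of 1000 is an arbitrary choice of mine, and `s3_int` is a name I introduce to show the integer result obtained when as.numeric() is dropped.

```r
set.seed(1)
C <- sample(c("a", "b"), 1000, replace = TRUE)  # sample size chosen arbitrarily

s1 <- ifelse(C == "a", 1, -1)      # element-wise choice
s2 <- 2 * (C == "a") - 1           # TRUE/FALSE treated as 1/0
tmp <- C == "a"
s3 <- as.numeric(tmp - !tmp)       # TRUE - FALSE = 1, FALSE - TRUE = -1

s3_int <- tmp - !tmp               # integer result if as.numeric() is dropped
print(c(identical(s1, s2), identical(s1, s3)))
print(typeof(s3_int))
```

The integer version holds the same values as the others, but identical() would report a difference because the storage type differs.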



 




[R] About dataset

2007-07-04 Thread Nitish Kumar Mishra
Hi R-help group members,
I would like to know how to load my data into R for the princomp function.
I want to compute a PCA of 200 descriptors for 4000 molecules (I am using
Linux). How can I do this in R?
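
To make the question concrete, the following is a rough sketch of the kind of workflow being asked about, not a definitive recipe. It assumes the descriptors sit in a comma-separated text file with one row per molecule and one column per descriptor; the file here is a temporary one filled with random numbers, and the sizes (400 by 20) are scaled down for speed.

```r
# Stand-in for the real data: a made-up table with one row per molecule
# and one column per descriptor, written to a temporary CSV file.
set.seed(42)
desc <- as.data.frame(matrix(rnorm(400 * 20), nrow = 400))
path <- tempfile(fileext = ".csv")     # hypothetical file location
write.csv(desc, path, row.names = FALSE)

# Read the table back in; adjust sep/header to match the real file.
dat <- read.csv(path, header = TRUE)

# princomp() needs more rows than columns; cor = TRUE standardises the
# descriptors, which is often sensible when they are on different scales.
pca <- princomp(dat, cor = TRUE)
summary(pca)             # variance explained by each component
head(pca$scores[, 1:2])  # coordinates of the first molecules on PC1 and PC2
```

With 4000 molecules and 200 descriptors the same calls should apply, since there are more rows than columns.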
Thanking you.


-- 
Nitish Kumar Mishra
Junior Research Fellow
BIC, IMTECH, Chandigarh, India
E-Mail Address:
[EMAIL PROTECTED]
[EMAIL PROTECTED]



[R] Working with matrix

2007-07-04 Thread Yemi Oyeyemi
Dear all,
Thanks for your prompt assistance. The problem I posted is
just part of my problem. The whole problem is this:
  1. I simulate a multivariate data set X (n by p).
  2. I draw a sample of k rows (k less than n) without replacement from X.
  3. I compute the eigenvalues of the variance-covariance matrix of the
sample drawn.
  4. I store the eigenvalues in the matrix eigenvals.
  5. I store the samples drawn (that is, the row identifiers) in rowvals.
  6. I bind the two matrices using rbind.
  The problem is:
  7. I now want to print the maximum value of the product of the eigenvalues
and the sample that gives this maximum product.
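
The steps above might be sketched as follows. All sizes (n, p, k, the number of draws) are made up, and the two matrices are kept separate with one row per draw instead of being bound with rbind, since they have different numbers of columns; that layout is an assumption on my part.

```r
set.seed(123)
n <- 50; p <- 3; k <- 10; n_draws <- 200   # all sizes are made up

X <- matrix(rnorm(n * p), nrow = n)        # step 1: simulated data (n by p)

eigenvals <- matrix(NA_real_, nrow = n_draws, ncol = p)
rowvals   <- matrix(NA_integer_, nrow = n_draws, ncol = k)

for (d in seq_len(n_draws)) {
  idx <- sample(n, k)                      # step 2: k rows without replacement
  ev  <- eigen(cov(X[idx, ]), symmetric = TRUE,
               only.values = TRUE)$values  # step 3: eigenvalues of the covariance
  eigenvals[d, ] <- ev                     # step 4
  rowvals[d, ]   <- idx                    # step 5
}

prods <- apply(eigenvals, 1, prod)         # product of eigenvalues per draw
best  <- which.max(prods)                  # step 7: draw with the largest product

cat("maximum product:", prods[best], "\n")
cat("rows in that sample:", sort(rowvals[best, ]), "\n")
```

Since the product of the eigenvalues equals the determinant of the covariance matrix, det(cov(...)) could be used as a cross-check.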
  Thanks
  Oyeyemi Gafar M.


   


[R] Long-tail model in R ... anyone?

2007-07-04 Thread ocelma
Dear all,

first I would like to tell you that I've been using R for two days... (so,
you can predict my knowledge of the language!).

Yet, I managed to implement some stuff related with the Long-Tail model [1].
I did some tests with the data in table 1 (from [1]), and plotted figure 2
(from [1]). (See R code and CSV file at the end of the email)

Now, I'm stuck in the nonlinear regression model of F(x). I got a nice error:

Error in nls(~F(r, N50, beta, alfa), data = dataset, start = list(N50 =
N50,  : singular gradient


And, yes, I've been looking for how to solve this (via this mailing list and
some googling), and I could not find a proper solution. That's why
I am asking the experts to help me! :-)

So, any help would be much appreciated...

Cheers, Oscar
[1] http://www.firstmonday.org/issues/issue12_5/kilkki/

PS: R code and CSV file

FILE: data.R (data taken from [1] Table 1, columns 1 and 2)
--8=---
rank,cum_value
10, 17396510
32, 31194809
96, 53447300
420,100379331
1187,   152238166
24234,  432238757
91242,  581332371
294180, 650880870
1242185,665227287
--=8---

R CODE:

#
# F(x). The long-tail model
# Reference: http://www.firstmonday.org/issues/issue12_5/kilkki/
# Params:
#   x   :   Rank (either an integer or a list)
#   N50 :   the number of objects that cover half of the whole volume
#   beta:   total volume
#   alfa:   the factor that defines the form of the function
F <- function (x, N50, beta=1.0, alfa=0.49)
{
    xx <- as.numeric(x) # as.numeric() prevents overflow
    Fx = beta / ( (N50/xx)^alfa + 1 )
    Fx
}

# Read CSV file (rank, cum_value)
lt <- read.csv(file="data.R", head=TRUE, sep=",")

r <- lt$rank
v <- lt$cum_value
pcnt <- v/v[length(v)] *100 # get cumulative percentage
plot(r, pcnt, log="x", type='l', xlab='Ranking',
     ylab='Cumulative percentage of sales', main="Books Popularity",
     sub="The long-tail effect", col='blue')

# Set some default values to be used by F(x)...
alfa = 0.49
beta = 1.38
N50 = 30714

# Start using F(x). Results are in 'f' ...
f <- c(0) # oops! is this the best initialization for 'f'?
for (i in 1:24234) f[i] <- F(i, N50, beta, alfa)*100

# Plot some estimated values from F(x) (N50, beta, and alfa values come
# from the paper. See ref. [1])
plot(f, log="x", type='l', xlab='Ranking',
     ylab='Cumulative percentage of sales', main="Books Popularity",
     sub="Plotting first values of F(x) and some real points")
points(r, pcnt, col="blue") # adding the real points

# Create a dataset to be used by nls()
dataset <- data.frame(r, pcnt)

# Verifying that F(x) works fine... (comparing with the real values
# contained in the dataset)

dataset
F(10, N50, beta, alfa) * 100
F(32, N50, beta, alfa) * 100
F(96, N50, beta, alfa) * 100
F(420, N50, beta, alfa) * 100
F(1187, N50, beta, alfa) * 100
F(24234, N50, beta, alfa) * 100
F(91242, N50, beta, alfa) * 100
F(294180, N50, beta, alfa) * 100
F(1242185, N50, beta, alfa) * 100

#dataset <- data.frame(pcnt) # which dataset should I use? Should I
# include the ranks in it?
nls( ~ F(r, N50, beta, alfa), data = dataset, start = list(N50=N50,
beta=beta, alfa=alfa), trace = TRUE )
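
Two things in the call above look worth checking, though I cannot say they are the cause of the error: the formula has no response on the left of '~', and F() returns a fraction while pcnt is a percentage. The sketch below restates the model with both on the same scale and evaluates F() on a whole vector without a loop. It is an assumption-laden illustration, and the fit is wrapped in try() because convergence from these starting values is not guaranteed.

```r
# Long-tail model, vectorised over x.
F <- function(x, N50, beta = 1.0, alfa = 0.49) {
  beta / ((N50 / as.numeric(x))^alfa + 1)
}

# Data copied from the table in the post.
dataset <- data.frame(
  r = c(10, 32, 96, 420, 1187, 24234, 91242, 294180, 1242185),
  v = c(17396510, 31194809, 53447300, 100379331, 152238166,
        432238757, 581332371, 650880870, 665227287)
)
dataset$pcnt <- dataset$v / max(dataset$v) * 100

# F() accepts a whole vector, so no loop is needed to evaluate it.
f <- F(1:24234, N50 = 30714, beta = 1.38, alfa = 0.49) * 100

# Response on the left of '~', model on the right, both in percent.
# try() keeps the script running if the fit does not converge.
fit <- try(
  nls(pcnt ~ 100 * F(r, N50, beta, alfa), data = dataset,
      start = list(N50 = 30714, beta = 1.38, alfa = 0.49)),
  silent = TRUE
)
if (inherits(fit, "try-error")) {
  cat("nls did not converge from these starting values\n")
} else {
  print(coef(fit))
}
```

If the fit still fails, rescaling N50 (for example fitting its logarithm) may be worth trying, since its magnitude differs greatly from the other two parameters.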



Re: [R] A More efficient method?

2007-07-04 Thread François Pinard
[Keith Alan Chamberlain]

Is there a faster way than below to set a vector based on values
from another vector? I'd like to call a pre-existing function for
this, but one which can also handle an arbitrarily large number of
categories. Any ideas?

Cat=c('a','a','a','b','b','b','a','a','b') # Categorical variable
C1=vector(length=length(Cat))  # New vector for numeric values

# Cycle through each column and set C1 to corresponding value of Cat.
for(i in 1:length(C1)){
   if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
}

C1
[1] -1 -1 -1  1  1  1 -1 -1  1
Cat
[1] a a a b b b a a b

For handling an arbitrarily large number of categories, one may go
through a recoding vector, like this for the example above:

> Cat <- c('a', 'a', 'a', 'b', 'b', 'b', 'a', 'a', 'b')
> C1 <- c(a=-1, b=1)[Cat]
> C1
 a  a  a  b  b  b  a  a  b
-1 -1 -1  1  1  1 -1 -1  1
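
The same lookup extends to more categories by adding entries to the recoding vector. In this small sketch the third category "c" and its code 0 are made up for illustration, as is the unmatched value "d".

```r
Cat <- c("a", "c", "b", "a", "d")

# Named recoding vector: names are the categories, values the codes.
# The third category and its code are made up for illustration.
codes <- c(a = -1, b = 1, c = 0)

C1 <- unname(codes[Cat])   # unname() drops the category labels from the result
print(C1)                  # "d" has no entry in codes, so it becomes NA
```

A category missing from the recoding vector yields NA, which makes unexpected values easy to spot.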

-- 
François Pinard   http://pinard.progiciels-bpi.ca



Re: [R] A More efficient method?

2007-07-04 Thread jim holtman
User and System are measures of the CPU time that was consumed.  Elapsed
time is the wall clock, and even though all three are measured in seconds,
they do not measure the same thing.  The reason for the difference is any
idle time that the system spends waiting for I/O to complete, which does
not consume CPU time for your process but does consume Elapsed time.

For some instances of CPU-intensive code (with no I/O or competing tasks),
User + System ~= Elapsed.  You also have to take into account the
granularity of the clock when looking at numbers like 0.04.  For serious
comparisons of timing, you want runs of at least tens of seconds.
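
A quick way to see the distinction is to time something that waits without computing. This is only an illustrative sketch; the sleep length and the loop size are arbitrary, and the exact numbers will differ between machines.

```r
# Sleeping uses almost no CPU, so elapsed time exceeds user + system.
t_sleep <- system.time(Sys.sleep(0.5))

# A CPU-bound computation: user + system should be close to elapsed
# on an otherwise idle machine.
t_cpu <- system.time(for (i in 1:200) sum(sqrt(seq_len(1e5))))

print(t_sleep[c("user.self", "sys.self", "elapsed")])
print(t_cpu[c("user.self", "sys.self", "elapsed")])
```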


On 7/4/07, Stefan Grosse [EMAIL PROTECTED] wrote:

> > system.time(s1 <- ifelse(C == "a", 1, -1))
>    user  system elapsed
>    0.04    0.01    0.08
> > system.time(s2 <- 2 * (C == "a") - 1)
>    user  system elapsed
>       0       0       0
>
> I am just wondering: how comes the time does add up to 0.05 while
> elapsed states 0.08 on my system? (Vista+R2.5.1)
>
> Stefan




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



Re: [R] how to solve a min problem

2007-07-04 Thread RAVI VARADHAN
Whether you can use optim or not depends on the nature of the constraints on 
S.  If you have simple box constraints, you can use the L-BFGS-B method in 
optim.  If not, optim may not be directly applicable, unless you can somehow 
transform your problem into an unconstrained minimization problem. 
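
As a minimal sketch of the box-constrained case only: the bounds of 1 and 10 on each element of S below are hypothetical and are not derived from b_func, whose form is not given in the thread.

```r
# Objective from the thread: the mean of the input vector.
objective <- function(S) mean(S)

# Hypothetical box constraints: every element of S between 1 and 10.
n <- 5
fit <- optim(par = rep(5, n), fn = objective, method = "L-BFGS-B",
             lower = rep(1, n), upper = rep(10, n))

print(fit$par)     # each element is pushed to its lower bound
print(fit$value)
```

With a constraint expressed through a non-linear function of S instead of simple bounds, this approach would not apply directly.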

Ravi.




Re: [R] A More efficient method?

2007-07-04 Thread jim holtman
One other thing: in a multiprocessor configuration, if your application is
making use of the additional CPUs, then

User + System > Elapsed

in some cases.


On 7/4/07, jim holtman [EMAIL PROTECTED] wrote:

> User and System are a measure of the CPU time that was consumed.  Elapsed
> time is wall-clock time, and even though both are measured in seconds,
> they do not measure the same thing.  The difference is any idle time
> that the system spends waiting for I/O to complete, which does not
> consume CPU time for your process but does consume elapsed time.
>
> For some instances of CPU-intensive code (with no I/O or competing tasks),
> User + System ~= Elapsed.  Also, you have to take into account the
> granularity of the clock when looking at numbers like 0.04.  For serious
> timing comparisons, you want runs of at least tens of seconds.
>
> On 7/4/07, Stefan Grosse [EMAIL PROTECTED] wrote:
> >
> > Gabor Grothendieck wrote:
> > > > set.seed(1)
> > > > C <- sample(c("a", "b"), 10, replace = TRUE)
> > > > system.time(s1 <- ifelse(C == "a", 1, -1))
> > >    user  system elapsed
> > >    0.37    0.01    0.38
> > > > system.time(s2 <- 2 * (C == "a") - 1)
> > >    user  system elapsed
> > >    0.02    0.00    0.02
> >
> > > system.time(s1 <- ifelse(C == "a", 1, -1))
> >    user  system elapsed
> >    0.04    0.01    0.08
> > > system.time(s2 <- 2 * (C == "a") - 1)
> >    user  system elapsed
> >       0       0       0
> >
> > I am just wondering: how come the times add up to 0.05 while
> > elapsed states 0.08 on my system? (Vista + R 2.5.1)
> >
> > Stefan
 
 




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



Re: [R] how to solve a min problem

2007-07-04 Thread RAVI VARADHAN
If the constraints on S are linear inequalities, then linear programming 
methods would work.  See function solveLP in package linprog or simplex in 
boot or package lpSolve.

Ravi.

----- Original Message -----
From: domenico pestalozzi [EMAIL PROTECTED]
Date: Wednesday, July 4, 2007 11:26 am
Subject: Re: [R] how to solve a min problem
To: R-help r-help@stat.math.ethz.ch

> S is a one-dimensional array, for example 1 x 10, and mean(S) is the
> mean of these 10 elements.
>
> So, I want to:
>
> minimize mean(S) with 0 < b_func(S) < 800.
> That is, b_func imposes bounds on S.
>
> The function applies an iterative convergence criterion:
>
> f_1 = g(S), f_2 = g(f_1), f_3 = g(f_2), etc.
> The function stops when
> f_n - f_(n-1) <= 0.1e-09
> and g(S) is a nonlinear function of S; convergence is mathematically
> assured.
>
> Is it possible to use 'optimize'?
>
> thanks
>
> domenico
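
Spencer Graves's earlier suggestion in this thread (look at b_func(S+mu) as a function of mu) can be sketched roughly as follows. The real b_func is not shown anywhere in the thread, so the b_func below is a purely hypothetical stand-in that only mimics the described behaviour (iterate f <- g(f) until successive values agree); S and the grid of mu values are likewise made up for illustration:

```r
# Hypothetical stand-in for b_func: iterate f <- g(f) until successive
# values differ by at most tol. This g is a contraction whose fixed
# point is 2 * sum(S^2); the poster's real b_func is unknown.
b_func <- function(S, tol = 1e-10) {
  f <- mean(S)
  repeat {
    f_new <- 0.5 * f + sum(S^2)
    if (abs(f_new - f) <= tol) return(f_new)
    f <- f_new
  }
}

S <- c(1, 2, 3, 4)                      # illustrative data
mu_grid <- seq(-20, 20, by = 0.01)      # candidate shifts
b_vals <- sapply(mu_grid, function(mu) b_func(S + mu))

feasible <- b_vals > 0 & b_vals < 800   # the constraint 0 < b_func < 800
mu_min <- min(mu_grid[feasible])        # smallest feasible shift on the grid
mu_min
```

Calling plot(mu_grid, b_vals, type = "l") on the same vectors gives the picture Spencer asks for. A grid scan only handles the one-parameter version of the problem; the general case over all elements of S would need a constrained optimizer.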
  
  


Re: [R] read items

2007-07-04 Thread jim holtman
?scan

Help page says to set   quiet=TRUE
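
A minimal sketch of that setting, assuming a small made-up input string:

```r
con <- textConnection("1 2 3 4")
x <- scan(con, quiet = TRUE)   # quiet = TRUE suppresses the "Read 4 items" message
close(con)                     # release the connection when done
x
```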


On 7/4/07, elyakhlifi mustapha [EMAIL PROTECTED] wrote:

> Hello,
> I am writing because I want to know whether it is possible to suppress the
>
> read item
>
> message when I execute scan(textConnection())
> thanks







-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?



[R] Kmeans performance difference

2007-07-04 Thread Moisan Yves
Hi All,

A question from a newbie using R 2.5.0 on Windows XP.  Why is it that
kmeans clustering with apparently the exact same parameters behaves so
differently between the two following examples:

> cl1 <- kmeans(subset(pointsUXO1, select = c(2:4)), 10)

Takes about 2 seconds to deliver a result.

> cl1 <- clust(subset(pointsUXO1, select = c(2:4)), k=10,
+ method="kmeansHartigan")

Dies after about 10 minutes and fills up RAM:

*** running kmeansHartigan cluster algorithm...

 *** calculating validity measure...
Error: cannot allocate vector of size 922.9 Mb
In addition: Warning messages:
1: Reached total allocation of 1023Mb: see help(memory.size)
2: Reached total allocation of 1023Mb: see help(memory.size)
3: Reached total allocation of 1023Mb: see help(memory.size)
4: Reached total allocation of 1023Mb: see help(memory.size)

If I understand correctly, both methods should give roughly the same results
(modulo the initial random locations) since the default in kmeans is
Hartigan-Wong.  My data frame is 3 columns X 1 lines.  Presumably
kmeans is a core R function whereas clust is from the
clustTool package, but isn't clustTool simply wrapping the core kmeans
method?  Why such a difference?

TIA,

Yves Moisan



[R] Lookups in R

2007-07-04 Thread mfrumin

Hey all; I'm a beginner++ user of R, trying to use it to do some processing
of data sets of over 1M rows, and running into a snafu.  Imagine that my
input is a huge table of transactions, each linked to a specific user id.  As
I run through the transactions, I need to update a separate table for the
users, but I am finding that the traditional ways of doing a table lookup
are way too slow to support this kind of operation.

i.e.:

for (i in 1:100) {
   userid <- transactions$userid[i]
   amt <- transactions$amounts[i]
   # R has no += operator, so the update is written out in full
   row <- users$id == userid
   users[row, 'amt'] <- users[row, 'amt'] + amt
}

I assume this is a linear lookup through the users table (in which there are
tens of thousands of rows), when really what I need is O(constant time), or
at worst O(log(# users)).

Is there any way to manage a list of IDs (be they numeric, string, etc.) and
have them efficiently mapped to some other table index?

I see the CRAN package for SQLite hashes, but that seems to be going a bit
too far.

thanks,
Mike

Intern, Oyster Card Group, Transport for London
(feel free to email back to this address, I'm posting through Nabble so I
hope it works).


Re: [R] Long-tail model in R ... anyone?

2007-07-04 Thread Dirk Eddelbuettel

I think you simply had your nls() syntax wrong.  Works here:


## first a neat trick to read the data from embedded text
> fmdata <- read.csv(textConnection("
+ rank,cum_value
+ 10, 17396510
+ 32, 31194809
+ 96, 53447300
+ 420,100379331
+ 1187,   152238166
+ 24234,  432238757
+ 91242,  581332371
+ 294180, 650880870
+ 1242185,665227287"))
>


## then compute cumulative share
> fmdata[,"cumshare"] <- fmdata[,"cum_value"] / fmdata[nrow(fmdata),"cum_value"]
>


## then check the data, just in case
> summary(fmdata)
      rank           cum_value            cumshare
 Min.   :     10   Min.   : 17396510   Min.   :0.02615
 1st Qu.:     96   1st Qu.: 53447300   1st Qu.:0.08034
 Median :   1187   Median :152238166   Median :0.22885
 Mean   : 183732   Mean   :298259489   Mean   :0.44836
 3rd Qu.:  91242   3rd Qu.:581332371   3rd Qu.:0.87389
 Max.   :1242185   Max.   :665227287   Max.   :1.00000
>

## finally estimate the model, using only the first seven rows of data
## using the parametric form from the paper and some wild guesses as
## starting values:
> fit <- nls(cumshare ~ Beta / ((N50 / rank)^Alpha + 1), data=fmdata[1:7,],
+            start=list(Alpha=1, Beta=1, N50=1e4))
> summary(fit)

Formula: cumshare ~ Beta/((N50/rank)^Alpha + 1)

Parameters:
       Estimate Std. Error t value Pr(>|t|)
Alpha 4.829e-01  5.374e-03   89.86 9.20e-08 ***
Beta  1.429e+00  2.745e-02   52.07 8.14e-07 ***
N50   3.560e+04  3.045e+03   11.69 0.000306 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.002193 on 4 degrees of freedom

Number of iterations to convergence: 8
Achieved convergence tolerance: 1.297e-06

>

which is reasonably close to the quoted 
N50 = 30714, α = 0.49, and β = 1.38.

You can probably play a little with the nls options to see what effect this
has. 

That said, seven observations for three parameters in a non-linear model may be
a little hazardous.  One indication is that the estimated parameter values
are not too stable once you add the eighth and ninth rows of data.
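
One rough way to see why the extrapolation is delicate is to evaluate the fitted curve at the two ranks that were left out. The sketch below does not refit anything: it copies the point estimates from the summary output above into a helper (the name long_tail is made up for this illustration) and compares predicted and observed cumulative shares:

```r
# Model form used in the fit: share(rank) = Beta / ((N50 / rank)^Alpha + 1)
long_tail <- function(rank, Alpha, Beta, N50) Beta / ((N50 / rank)^Alpha + 1)

# Point estimates copied from the summary(fit) output above
est <- list(Alpha = 4.829e-01, Beta = 1.429e+00, N50 = 3.560e+04)

# The two rows left out of the fit, with their observed cumulative shares
held_out <- data.frame(rank      = c(294180, 1242185),
                       cum_value = c(650880870, 665227287))
held_out$observed  <- held_out$cum_value / 665227287
held_out$predicted <- long_tail(held_out$rank, est$Alpha, est$Beta, est$N50)
print(held_out)
```

With these estimates the predicted shares at both held-out ranks come out above 1, which a cumulative share cannot be, so the seven-point fit should not be trusted far beyond the ranks it was estimated on.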

Dirk

-- 
Hell, there are no rules here - we're trying to accomplish something. 
  -- Thomas A. Edison



Re: [R] Lookups in R

2007-07-04 Thread Peter Dalgaard
Sometimes you need a bit of lateral thinking. I suspect that you could 
do it like this:

tbl <- with(transactions, tapply(amount, userid, sum))
users$amt <- users$amt + tbl[as.character(users$id)]

one catch is that there could be users with no transactions, in which 
case you may need to replace userid by factor(userid, levels=users$id). 
None of this is tested, of course.
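
A small worked version of this idea, on made-up data, might look as follows. The column names follow the post; the NA handling and the as.character() indexing are additions for this sketch, on the assumption that ids may be numeric and that some users have no transactions:

```r
transactions <- data.frame(userid = c(3, 1, 3, 2, 1),
                           amount = c(10, 5, 2.5, 7, 1))
users <- data.frame(id = c(1, 2, 3, 4), amt = c(100, 0, 0, 50))

# Use the full set of user ids as factor levels so every user gets a cell
uid <- factor(transactions$userid, levels = users$id)
tbl <- tapply(transactions$amount, uid, sum)
tbl[is.na(tbl)] <- 0     # users without transactions (here id 4) contribute 0

# Index by name, not position, so numeric ids are matched correctly
users$amt <- users$amt + as.vector(tbl[as.character(users$id)])
users
```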



[R] New function combinations using tree structures?

2007-07-04 Thread Atte Tenkanen
Hi everybody,

I'm still interested in the possibility of using R for genetic programming (see
https://stat.ethz.ch/pipermail/r-help/2007-April/128782.html)
and I'd like to know how to express, for instance, functions of this kind
(x^2+3x+1 etc.; see the picture) using some kind of tree structure, and what is
needed to perform recombination operations on them like those shown in the
picture:

http://users.utu.fi/attenka/Kuva1.png

I know the basic principles of mutation and mating etc., as far as genetic
algorithms are concerned, but now I'd like to know the practical issues
concerning R and the mating of these functions using trees.
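
One possible starting point, offered only as a sketch: unevaluated R expressions created with quote() are already trees (a call holds the function in element 1 and the argument subtrees after it), so a crossover can be written as swapping sub-elements. The second parent f2 and the choice of crossover points below are arbitrary assumptions for illustration:

```r
f1 <- quote(x^2 + 3 * x + 1)
f2 <- quote(sin(x) * (x - 2))   # hypothetical second parent

# Element 1 of a call is the function; the rest are argument subtrees
f1[[1]]   # the root operator of f1
f1[[3]]   # right subtree of f1: the constant 1
f2[[3]]   # right subtree of f2: (x - 2)

# Crossover: replace the right subtree of f1 with the right subtree of f2
child <- f1
child[[3]] <- f2[[3]]

# Turn the resulting tree back into a callable function
g <- function(x) NULL
body(g) <- child
g(2)
```

Choosing crossover points at random and keeping the trees well-formed is where the real work lies; this only shows that the tree surgery itself needs nothing beyond base R.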

All suggestions are warmly appreciated.

With best wishes,

Atte



Re: [R] How to install R 2.5 with Synaptic in Ubuntu?

2007-07-04 Thread Thomas Harte
mike,

try installing directly using apt-get instead of Synaptic. 

in my /etc/apt/sources.list i added the line:

deb http://cran.R-project.org/bin/linux/ubuntu/ dapper/ 

and then i did:

bash$ sudo apt-get install r-base r-doc-info r-doc-pdf r-doc-html \
    r-mathlib r-base-html r-base-latex r-base-dev r-gnome

recently to update to R 2.5.1 on my version of Ubuntu (6.06).

cheers,

thomas.


> Message: 98
> Date: Wed, 4 Jul 2007 02:34:37 -0700 (PDT)
> From: msmith [EMAIL PROTECTED]
> Subject: Re: [R] How to install R 2.5 with Synaptic in Ubuntu?
> To: r-help@stat.math.ethz.ch
> Message-ID: [EMAIL PROTECTED]
> Content-Type: text/plain; charset=us-ascii
>
> Hi,
>
> Thanks for the suggestion and I wish the solution was that obvious, but I
> have changed it to really point at my favourite mirror.
>
> Using your example Synaptic reports the following error when I try to update
> the repositories:
>
> http://www.stats.bris.ac.uk/R/bin/linux/ubuntu/dists/feisty/main/binary-i386/Packages.gz:
> 404 Not Found
>
> This is understandable since that location doesn't exist, but it makes me
> think that the directory structure of the R mirrors is not compatible with
> Ubuntu and Synaptic, since it automatically seeks /dists/feisty/ rather than
> just /feisty/ as it is on the CRAN mirrors.
>
> Thanks again
> Mike Smith
>
> Stefan Grosse-2 wrote:
> >
> > > to end of the entry making it:
> > >
> > > deb http://my.favorite.cran.mirror/bin/linux/ubuntu feisty main
> > >
> > > However after this it still complains that it can't find packages.gz
> >
> > Just a guess: have you replaced the my.favorite.cran.mirror by a mirror
> > which is close to you? If you're in the UK it would be for example
> >
> > deb http://www.stats.bris.ac.uk/R/bin/linux/ubuntu feisty main
> >
> > ;o)
> > Stefan
> >
> > > It appears to be looking in
> > > http://my.favorite.cran.mirror/bin/linux/ubuntu/dists/feisty
> > > which isn't the directory structure of the CRAN repository, but
> > > I can't see any way to modify this behaviour.  Every other Ubuntu
> > > repository I have looked at contains the dists directory.
> > >
> > > Any suggestions for modifying this behaviour are gratefully received.
> > > Many thanks
> > >
> > > Mike Smith
  


Re: [R] Lookups in R

2007-07-04 Thread Martin Morgan
Michael,

A hash provides constant-time access, though the resulting perl-esque
data structures (a hash of lists, e.g.) are not convenient for other
manipulations

> n_accts <- 10^3
> n_trans <- 10^4
> t <- list()
> t$amt <- runif(n_trans)
> t$acct <- as.character(round(runif(n_trans, 1, n_accts)))
>
> uhash <- new.env(hash=TRUE, parent=emptyenv(), size=n_accts)
> ## keys, presumably account ids
> for (acct in as.character(1:n_accts)) uhash[[acct]] <- list(amt=0, n=0)
>
> system.time(for (i in seq_along(t$amt)) {
+     acct <- t$acct[i]
+     x <- uhash[[acct]]
+     uhash[[acct]] <- list(amt=x$amt + t$amt[i], n=x$n + 1)
+ })
   user  system elapsed
  0.264   0.000   0.262
> udf <- data.frame(amt=0, n=rep(0L, n_accts),
+                   row.names=as.character(1:n_accts))
> system.time(for (i in seq_along(t$amt)) {
+     idx <- row.names(udf)==t$acct[i]
+     udf[idx, ] <- c(udf[idx,"amt"], udf[idx, "n"]) + c(t$amt[i], 1)
+ })
   user  system elapsed
 18.398   0.000  18.394


-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org



Re: [R] Lookups in R

2007-07-04 Thread Michael Frumin
i wish it were that simple.  unfortunately the logic i have to do on 
each transaction is substantially more complicated, and involves 
referencing the existing values of the user table through a number of 
conditions.

any other thoughts on how to get better-than-linear performance time?  
is there a recommended binary searching/sorting (i.e. BTree) module that 
I could use to maintain my own index?

thanks,
mike




Re: [R] Empirical copula in R

2007-07-04 Thread Spencer Graves
      I just got 203 hits from RSiteSearch("copula") and 60 from
RSiteSearch("copula", "fun").  Most if not all of the first 24 of the
hits in the latter referred to the 'copula' package.  Have you reviewed
these?  The 25th hit in the latter referred to an 'fgac' package for
'Generalized Archimedean Copula', with a Brazilian author and
maintainer, who presumably is not related to the 'copula' author and
maintainer at U. Iowa.

      Also, have you tried contacting the official maintainers of
these packages?  You can get an email address for them from
help(package="copula") and help(package="fgac").

  Hope this helps. 
  Spencer Graves

GWeiss wrote:
> Hi,
>
> I would like to implement the empirical copula in R, does anyone know if it
> is included in a package? I know it is not in the Copula package. This one
> only includes a gof-test based on the empirical copula process.
>
> Thanks for your help!
> Gregor




[R] Newbie creating package with compiled code

2007-07-04 Thread Edna Bell
Hi R Gurus!

I'm trying to create a test package using the package.skeleton function.
I wanted to add some compiled code too.
In the src directory, I put together a baby subroutine, compiled it and created
a test.dll

When I use R CMD build, it works fine.  But I get into trouble
with the R CMD check step.


/home/Desktop/R-2.5.1/bin # ./R CMD check mypkg
* checking for working latex ... OK
* using log directory '/home/Desktop/R-2.5.1/bin/mypkg.Rcheck'
* using R version 2.5.1 (2007-06-27)
* checking for file 'mypkg/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'mypkg' version '1.0'
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking whether package 'mypkg' can be installed ... OK
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... ERROR
Error in .find.package(package, lib.loc, verbose = verbose) :
there are no packages called 'mypkg', 'stats', 'graphics',
'grDevices', 'utils', 'datasets', 'methods', 'base'
Error in library(mypkg) : .First.lib failed for 'mypkg'
Execution halted

It looks like this package has a loading problem: see the messages for
details.

Here is the mypkg.R file:

sss <- "/home/hodgesse/Desktop/R-2.5.1"
.First.lib <- function(lib=sss,pkg="mypkg")
   library.dynam("mypkg.so",pkg="mypkg",lib=sss)


f <- function(x,y) x+y

g <- function(x,y) x-y


Thanks for any help

Edna
mailto: [EMAIL PROTECTED]



Re: [R] Newbie creating package with compiled code

2007-07-04 Thread Duncan Murdoch
On 04/07/2007 6:43 PM, Edna Bell wrote:
> Here is the mypkg.R file
> sss <- "/home/hodgesse/Desktop/R-2.5.1"
> .First.lib <- function(lib=sss,pkg="mypkg")
>    library.dynam("mypkg.so",pkg="mypkg",lib=sss)

That's a very strange .First.lib.  I think you'll have more success with 
a simpler one:

.First.lib <- function(libname, pkgname)
   library.dynam("mypkg", package=pkgname, lib.loc=libname)

(and sss is not needed at all).

Duncan Murdoch

 
 


Re: [R] Lookups in R

2007-07-04 Thread Peter Dalgaard
Michael Frumin wrote:
> i wish it were that simple.  unfortunately the logic i have to do on
> each transaction is substantially more complicated, and involves
> referencing the existing values of the user table through a number of
> conditions.
>
> any other thoughts on how to get better-than-linear performance time?
> is there a recommended binary searching/sorting (i.e. BTree) module that
> I could use to maintain my own index?
   
The point remains: To do anything efficient in R, you need to get rid of 
that for loop and use something vectorized. Notice that you can expand 
values from the user table into the transaction table by indexing with 
transactions$userid, or you can use a merge operation.
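
Both routes can be sketched on toy data. The column names limit and over, and the rule being checked, are made up here to stand in for the "number of conditions" mentioned in the question:

```r
users <- data.frame(id = c("a", "b", "c"), limit = c(10, 20, 30),
                    stringsAsFactors = FALSE)
transactions <- data.frame(userid = c("b", "a", "b", "c"),
                           amount = c(5, 12, 25, 1),
                           stringsAsFactors = FALSE)

# Route 1: one vectorized lookup with match(), no per-row search
idx <- match(transactions$userid, users$id)     # row of users for each transaction
over <- transactions$amount > users$limit[idx]  # condition evaluated for all rows at once

# Route 2: the same expansion as a merge (join) of the two tables
merged <- merge(transactions, users, by.x = "userid", by.y = "id")
merged$over <- merged$amount > merged$limit
```

Note that merge() sorts its result by the key by default, so the merged rows are not in the original transaction order.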



[R] nls() lower/upper bound specification

2007-07-04 Thread Stephen Tucker
Dear all,

In optim() all parameters of the function to be adjusted are stored in a single
vector, and lower/upper bounds can be specified by vectors of the same
length.
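
For reference, a minimal sketch of that optim() convention, using the same quadratic model as the nls() examples in this post. The residual-sum-of-squares objective rss and the seed are assumptions added for the illustration:

```r
set.seed(42)
x <- 1:10
y <- 3 * x + 4 * x^2 + rnorm(10, 250)

# All three coefficients live in one parameter vector, beta
rss <- function(beta) sum((y - (beta[1] + beta[2] * x + beta[3] * x^2))^2)

# Box constraints are vectors of the same length as par;
# method "L-BFGS-B" is the optim() method that accepts them
fit <- optim(par = c(1, 1, 1), fn = rss, method = "L-BFGS-B",
             lower = rep(0, 3), upper = rep(Inf, 3))
fit$par
```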

In nls(), is it true that if I want to specify lower/upper bounds, functions
must be re-written so that each parameter is contained in a single-valued
vector?

## data input
x <- 1:10
y <- 3*x+4*x^2+rnorm(10,250)

## this one does not work
f <- function(x)
  function(beta)
  beta[1]+ beta[2]*x+beta[3]*x^2

out <- nls(y~f(x)(beta),data=data.frame(x,y),
           alg="port",
           start=list(beta=1:3),
           lower=list(beta=rep(0,3)))

(However, this works if I do not specify a lower bound)

## this one works
g <- function(x)
  function(beta1,beta2,beta3)
  beta1+ beta2*x+beta3*x^2

out <- nls(y~g(x)(beta1,beta2,beta3),data=data.frame(x,y),
           alg="port",
           start=list(beta1=1,beta2=1,beta3=1),
           lower=list(beta1=1,beta2=1,beta3=1))

Thanks in advance!

Stephen



Re: [R] Lookups in R

2007-07-04 Thread deepayan . sarkar
On 7/4/07, Martin Morgan [EMAIL PROTECTED] wrote:
 Michael,

 A hash provides constant-time access, though the resulting perl-esque
 data structures (a hash of lists, e.g.) are not convenient for other
 manipulations

 > n_accts <- 10^3
 > n_trans <- 10^4
 > t <- list()
 > t$amt <- runif(n_trans)
 > t$acct <- as.character(round(runif(n_trans, 1, n_accts)))
 >
 > uhash <- new.env(hash=TRUE, parent=emptyenv(), size=n_accts)
 > ## keys, presumably account ids
 > for (acct in as.character(1:n_accts)) uhash[[acct]] <- list(amt=0, n=0)
 >
 > system.time(for (i in seq_along(t$amt)) {
 +     acct <- t$acct[i]
 +     x <- uhash[[acct]]
 +     uhash[[acct]] <- list(amt=x$amt + t$amt[i], n=x$n + 1)
 + })
    user  system elapsed
   0.264   0.000   0.262
 > udf <- data.frame(amt=0, n=rep(0L, n_accts),
 +                   row.names=as.character(1:n_accts))
 > system.time(for (i in seq_along(t$amt)) {
 +     idx <- row.names(udf)==t$acct[i]
 +     udf[idx, ] <- c(udf[idx,"amt"], udf[idx, "n"]) + c(t$amt[i], 1)
 + })
    user  system elapsed
  18.398   0.000  18.394

I don't think that's a fair comparison--- much of the overhead comes
from the use of data frames and the creation of the indexing vector. I
get

> n_accts <- 10^3
> n_trans <- 10^4
> t <- list()
> t$amt <- runif(n_trans)
> t$acct <- as.character(round(runif(n_trans, 1, n_accts)))
> uhash <- new.env(hash=TRUE, parent=emptyenv(), size=n_accts)
> for (acct in as.character(1:n_accts)) uhash[[acct]] <- list(amt=0, n=0)
> system.time(for (i in seq_along(t$amt)) {
+     acct <- t$acct[i]
+     x <- uhash[[acct]]
+     uhash[[acct]] <- list(amt=x$amt + t$amt[i], n=x$n + 1)
+ }, gcFirst = TRUE)
   user  system elapsed
  0.508   0.008   0.517
> udf <- matrix(0, nrow = n_accts, ncol = 2)
> rownames(udf) <- as.character(1:n_accts)
> colnames(udf) <- c("amt", "n")
> system.time(for (i in seq_along(t$amt)) {
+     idx <- t$acct[i]
+     udf[idx, ] <- udf[idx, ] + c(t$amt[i], 1)
+ }, gcFirst = TRUE)
   user  system elapsed
  1.872   0.008   1.883

The loop is still going to be the problem for realistic examples.
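
As a rough illustration of avoiding the loop altogether, the per-account totals and counts can be computed in one vectorized pass. This is only a sketch: the names amt, acct, totals and counts are illustrative, the seed is an assumption added for reproducibility, and sample() is used here instead of round(runif()) to draw account ids.

```r
set.seed(42)
n_accts <- 10^3
n_trans <- 10^4
amt  <- runif(n_trans)
## A factor with all account ids as levels, so that accounts
## without transactions still get an entry in the result
acct <- factor(sample(n_accts, n_trans, replace = TRUE),
               levels = seq_len(n_accts))

totals <- tapply(amt, acct, sum)   # NA for accounts with no transactions
totals[is.na(totals)] <- 0
counts <- as.vector(table(acct))   # number of transactions per account
```

Fixing the factor levels up front is what handles accounts with no transactions: they get a total and a count of zero instead of being dropped.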

-Deepayan



Re: [R] questions on lme function

2007-07-04 Thread Dr Ross Darnell
Ana

You are estimating a random coefficient model on 5 individuals (mean 
and variance). Are you sure this is wise?

Ross Darnell 

- Original Message -
From: Ana Conesa [EMAIL PROTECTED]
Date: Thursday, July 5, 2007 1:21 am
Subject: [R] questions on lme function
 
 Dear list,
 
 I am using the lme function to fit a mixed model for the time response
 of a number of physiological variables. The random factor would be
 the individual on which the physiological variables are measured at
 different time points. I have 4 time points, 5 individuals and 3
 replicates per condition (time/individual), and I would like to fit
 a quadratic model in time. The model I am using is
 
  mm <- lme(myvar ~ time + time2, random = ~ time | individual,
            data = clinical)
 
 where time2 = time*time.
 
 I have a number of questions:
 
 1) I am not sure the random effect is modeled correctly. Would I
 need to include the time2 variable as well?
 
 2) I would like to extract the F statistics of the model, but I cannot
 find a function for this. Is this possible?
 
 3) Depending on the variable I take, I frequently get a
 convergence error from the lme function. Any ideas on what
 to do to improve convergence?
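
For questions 2 and 3, a minimal sketch using the Orthodont data shipped with nlme as a stand-in for the poster's data (the clinical data frame is not available, so the data set and variable names here are assumptions). anova() on a single lme fit returns a table with an F-value column for the fixed-effect terms, and lmeControl() allows the iteration limits to be raised, which sometimes helps with convergence errors.

```r
library(nlme)

## Orthodont (from nlme) stands in for the poster's 'clinical' data:
## distance ~ quadratic in age, random intercept and slope per Subject
fit <- lme(distance ~ age + I(age^2),
           random = ~ age | Subject,
           data = Orthodont,
           control = lmeControl(maxIter = 200, msMaxIter = 200))

## F statistics for the fixed-effect terms
tab <- anova(fit)
print(tab)
```

Raising the iteration limits is only one thing to try; with very few individuals, a simpler random-effects structure may be the more realistic remedy.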
 
 Thank you
 
 Ana Conesa, PhD
 Centro de Investigacion Principe Felipe
 Avda. Autopista Saler 16 46013 Valencia
 http://bioinfo.cipf.es
 


[R] (Statistics question) - Nonlinear regression and simultaneous equation

2007-07-04 Thread adschai
Hi,

I have a fundamental question that I'm a bit confused about. If any guru from 
this circle could help me out, I would really appreciate it.

I have a system of equations in which some of the endogenous variables appear 
on the right-hand sides of some equations. To estimate it, one needs a 
technique like 2SLS or FIML to avoid inconsistency of the estimated 
coefficients. My question is: if I apply a nonlinear regression method such as 
SVM regression, do I still need to worry about endogeneity? That is, is the 
first stage of 2SLS the only thing I need to care about? That would mean that I 
only need to carry out the SVM regression on all the exogenous variables. Am I 
missing anything here? Thank you so much.

Regards,
- adschai
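
To make the two stages mentioned above concrete, here is a small simulated sketch of linear 2SLS in base R. All variable names, coefficients and the seed are illustrative assumptions; it shows only the linear case and does not settle the question about SVM regression.

```r
set.seed(123)
n <- 5000
z <- rnorm(n)                      # instrument (exogenous)
u <- rnorm(n)                      # structural error
x <- 0.8 * z + 0.6 * u + rnorm(n)  # endogenous regressor: correlated with u
y <- 1 + 2 * x + u                 # true slope is 2

## Naive OLS: inconsistent because x is correlated with u
ols <- coef(lm(y ~ x))[["x"]]

## Stage 1: project the endogenous regressor on the instrument
xhat <- fitted(lm(x ~ z))
## Stage 2: regress y on the fitted values from stage 1
tsls <- coef(lm(y ~ xhat))[["xhat"]]

c(ols = ols, tsls = tsls)
```

The standard errors reported by the second-stage lm() are not the correct 2SLS standard errors; only the point estimate is illustrated here.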



[R] Question about framework to weighting different classes in SVM

2007-07-04 Thread adschai
Hi gurus,

I have a question about multiclass SVM classification. My data include a couple 
of class labels that make up a relatively small proportion of the entire 
population compared to the other classes. I would like the SVM to pay more 
attention to these classes. My first question is whether there is any 
systematic/theoretical framework for determining the weights for each class.

My second question is directly related to R. I would like to use the 
class.weights argument of the svm function. However, I'm a bit confused about 
how to use it from the description I got from ?svm. Below is the quote.

'a named vector of weights for the different classes, used for asymetric class 
sizes. Not all factor levels have to be supplied (default weight: 1). All 
components have to be named.'

Do the names of the vector have to match the levels of the factor used as 
target labels for my classification? Any simple example would be really 
appreciated. Thank you!
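
A minimal sketch of how such a named weight vector might be built, assuming inverse class frequency as the weighting rule (a common heuristic, not a theoretical framework) and a deliberately imbalanced subset of iris as toy data. The names dat, counts and wts are illustrative; the svm() call runs only if e1071 is installed.

```r
## Toy imbalanced data: keep only 10 of the 50 versicolor rows
dat <- iris[c(1:50, 51:60, 101:150), ]
dat$Species <- factor(dat$Species)

## Inverse-frequency weights, named by the factor levels of the target
counts <- table(dat$Species)
wts <- as.numeric(max(counts) / counts)
names(wts) <- names(counts)
print(wts)

## The names of 'wts' are the levels of the response factor
if (requireNamespace("e1071", quietly = TRUE)) {
  fit <- e1071::svm(Species ~ ., data = dat, class.weights = wts)
  print(table(predicted = predict(fit, dat), actual = dat$Species))
}
```

Here the rare class gets weight 5 and the two common classes get weight 1.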

- adschai



[R] speed up crr function in cmprsk package

2007-07-04 Thread sj
I am trying to use the crr function in the cmprsk package to analyze a large
patient dataset (45,000+ patients). The model has 100+ covariates and 5
competing risks. I am finding that R seems to get bogged down, and even if I
let it run for several hours I don't get anything back. Am I expecting too
much, or are there ways to speed up the process? Any help is appreciated.

Best,

Spencer



[R] Question for svm function in e1071

2007-07-04 Thread adschai
Hi,

Sorry that I have so many questions today. I am using the svm function on a 
training set of about 180,000 points. It takes a very long time to run. I 
would like it to print something along the way so I can be sure that the run 
has not died. Could you please suggest a way to do this? 

And is there any way to speed up this svm function? Thank you.
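
One way to get a feel for the running time before committing to the full data set is to time the fit on growing random subsamples. The sketch below does not produce progress output from within svm itself. The helper time_on_subsamples, its arguments and the use of iris are illustrative assumptions, and the svm() call runs only if e1071 is installed.

```r
## Time a fitting function on random subsamples of increasing size,
## printing the elapsed time after each fit
time_on_subsamples <- function(fit_fun, data, sizes) {
  sapply(sizes, function(n) {
    idx <- sample(nrow(data), n)
    elapsed <- system.time(fit_fun(data[idx, , drop = FALSE]))[["elapsed"]]
    message(sprintf("n = %d: %.2f s elapsed", n, elapsed))
    elapsed
  })
}

## Example with e1071::svm on a small built-in data set
if (requireNamespace("e1071", quietly = TRUE)) {
  time_on_subsamples(function(d) e1071::svm(Species ~ ., data = d),
                     iris, sizes = c(50, 100, 150))
}
```

Plotting the elapsed times against the subsample sizes gives a rough idea of how the cost grows toward the full training set.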

- adschai



[R] about stableFit() and hypFit() of fBasics package

2007-07-04 Thread 김준희
Dear R users,

I'm trying to fit a stable distribution and a hyperbolic distribution to my 
data using stableFit() and hypFit() from fBasics.
However, there are some problems.

This is the result
==
> stableFit(lm, alpha = 1, beta = 0, gamma = 1, delta = 0, doplot = TRUE,
+           trace = FALSE, title = NULL, description = NULL)
Title:
 Stable Parameter Estimation 

Call:
 .qStableFit(x = x, doplot = doplot, title = title, description = description)

Model:
 Student-t Distribution

Estimated Parameter(s):
alpha  beta gamma delta 
   NA    NA    NA    NA 

==

First, I am fitting a stable distribution, but the Model field always says 
Student-t Distribution.
Second, every time I run stableFit(), the Estimated Parameter(s) are all 
NA. I can't really find what's wrong in my code.

In the case of hyperbolic distribution, this is the result
==

Model:
 Hyperbolic Distribution

Estimated Parameter(s):
    alpha      beta     delta        mu 
63.201132  1.991194 11.165716  2.921906 

There were 41 warnings (use warnings() to see them)

Warning messages:
1: NA/Inf replaced by maximum positive value
2: NA/Inf replaced by maximum positive value
...

==
I don't know what the warning messages mean or why they appeared.

Many thanks in advance

Junhee